Statistical Power in Meta-Analysis
ERIC Educational Resources Information Center
Liu, Jin
2015-01-01
Statistical power is important in meta-analysis, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation for the two-sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
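The gap between analytical and simulated power that this abstract examines is easy to reproduce; the sketch below is my own illustration (per-group size, effect size, and replication count are arbitrary assumptions, not values from the study) comparing the two estimates for a two-sample mean difference test.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, d, alpha, reps = 30, 0.5, 0.05, 10000   # assumed per-group n, Cohen's d

# Analytical power from the noncentral t distribution
df = 2 * n - 2
nc = d * np.sqrt(n / 2)                    # noncentrality parameter
tcrit = stats.t.ppf(1 - alpha / 2, df)
analytic = 1 - stats.nct.cdf(tcrit, df, nc) + stats.nct.cdf(-tcrit, df, nc)

# Simulated power: share of replicates in which the t-test rejects H0
rej = sum(stats.ttest_ind(rng.normal(0, 1, n), rng.normal(d, 1, n)).pvalue < alpha
          for _ in range(reps))
print(f"analytical: {analytic:.3f}  simulated: {rej / reps:.3f}")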
Austin, Peter C; Schuster, Tibor; Platt, Robert W
2015-10-15
Estimating statistical power is an important component of the design of both randomized controlled trials (RCTs) and observational studies. Methods for estimating statistical power in RCTs have been well described and can be implemented simply. In observational studies, statistical methods must be used to remove the effects of confounding that can occur due to non-random treatment assignment. Inverse probability of treatment weighting (IPTW) using the propensity score is an attractive method for estimating the effects of treatment using observational data. However, sample size and power calculations have not been adequately described for these methods. We used an extensive series of Monte Carlo simulations to compare the statistical power of an IPTW analysis of an observational study with time-to-event outcomes with that of an analysis of a similarly-structured RCT. We examined the impact of four factors on the statistical power function: number of observed events, prevalence of treatment, the marginal hazard ratio, and the strength of the treatment-selection process. We found that, on average, an IPTW analysis had lower statistical power compared to an analysis of a similarly-structured RCT. The difference in statistical power increased as the magnitude of the treatment-selection model increased. The statistical power of an IPTW analysis tended to be lower than the statistical power of a similarly-structured RCT.
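As a rough illustration of why IPTW loses power as treatment selection strengthens, here is a simplified Monte Carlo sketch of my own; it uses a continuous outcome and a weighted z-test instead of the paper's time-to-event setting, and every parameter is an assumption. Selection strength 0 mimics a randomized trial.

import numpy as np

rng = np.random.default_rng(1)

def wmean_var(y, w):
    mu = np.average(y, weights=w)
    var = np.sum(w**2 * (y - mu) ** 2) / np.sum(w) ** 2  # weighted-mean variance
    return mu, var

def rejects(n=500, sel=1.0, effect=0.3):
    x = rng.normal(size=n)                        # confounder
    ps = 1 / (1 + np.exp(-sel * x))               # true propensity score
    t = rng.binomial(1, ps)
    y = effect * t + 0.5 * x + rng.normal(size=n)
    w = np.where(t == 1, 1 / ps, 1 / (1 - ps))    # IPTW weights
    mu1, v1 = wmean_var(y[t == 1], w[t == 1])
    mu0, v0 = wmean_var(y[t == 0], w[t == 0])
    return abs(mu1 - mu0) / np.sqrt(v1 + v0) > 1.96

for sel in (0.0, 1.0, 2.0):                       # sel = 0 mimics randomization
    power = np.mean([rejects(sel=sel) for _ in range(2000)])
    print(f"selection strength {sel:.0f}: power {power:.2f}")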
How Many Studies Do You Need? A Primer on Statistical Power for Meta-Analysis
ERIC Educational Resources Information Center
Valentine, Jeffrey C.; Pigott, Therese D.; Rothstein, Hannah R.
2010-01-01
In this article, the authors outline methods for using fixed and random effects power analysis in the context of meta-analysis. Like statistical power analysis for primary studies, power analysis for meta-analysis can be done either prospectively or retrospectively and requires assumptions about parameters that are unknown. The authors provide…
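For the fixed-effect case, prospective power reduces to a short normal-approximation calculation; the sketch below uses the standard formula with invented study variances (not the authors' examples). For a random-effects version, add an assumed between-study variance tau^2 to each study variance before inverting.

import numpy as np
from scipy.stats import norm

def fixed_effect_power(delta, variances, alpha=0.05):
    """Power of the pooled z-test in a fixed-effect meta-analysis:
    the pooled estimate has variance 1 / sum(1 / v_i)."""
    w = 1.0 / np.asarray(variances)
    lam = delta * np.sqrt(w.sum())        # noncentrality of the pooled z
    zc = norm.ppf(1 - alpha / 2)
    return (1 - norm.cdf(zc - lam)) + norm.cdf(-zc - lam)

# e.g. true standardized effect 0.2 across 10 studies with variance 0.1 each
print(fixed_effect_power(0.2, [0.1] * 10))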
Evaluating and Reporting Statistical Power in Counseling Research
ERIC Educational Resources Information Center
Balkin, Richard S.; Sheperis, Carl J.
2011-01-01
Despite recommendations from the "Publication Manual of the American Psychological Association" (6th ed.) to include information on statistical power when publishing quantitative results, authors seldom include analysis or discussion of statistical power. The rationale for discussing statistical power is addressed, approaches to using "G*Power" to…
A new u-statistic with superior design sensitivity in matched observational studies.
Rosenbaum, Paul R
2011-09-01
In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is "no departure" then this is the same as the power of a randomization test in a randomized experiment. A new family of u-statistics is proposed that includes Wilcoxon's signed rank statistic but also includes other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments-that is, it often has good Pitman efficiency-but small effects are invariably sensitive to small unobserved biases. Members of this family of u-statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08 while the power of another member of the family of u-statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology. © 2010, The International Biometric Society.
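The no-bias (Gamma = 1) ingredient of the abstract's numerical example can be checked by simulation; the sketch below estimates randomization-test power for Wilcoxon's signed-rank statistic with 250 pair differences from N(0.5, 1). Note that it does not reproduce the sensitivity-analysis power of 0.08, which requires allowing for a specified unobserved bias Gamma > 1.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
reps, n = 2000, 250
rej = 0
for _ in range(reps):
    d = rng.normal(0.5, 1.0, n)           # pair differences, as in the example
    rej += stats.wilcoxon(d).pvalue < 0.05
print(f"Wilcoxon power at Gamma = 1: {rej / reps:.2f}")   # essentially 1.0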
Statistical power analysis of cardiovascular safety pharmacology studies in conscious rats.
Bhatt, Siddhartha; Li, Dingzhou; Flynn, Declan; Wisialowski, Todd; Hemkens, Michelle; Steidl-Nichols, Jill
2016-01-01
Cardiovascular (CV) toxicity and related attrition are a major challenge for novel therapeutic entities and identifying CV liability early is critical for effective derisking. CV safety pharmacology studies in rats are a valuable tool for early investigation of CV risk. Thorough understanding of data analysis techniques and statistical power of these studies is currently lacking and is imperative for enabling sound decision-making. Data from 24 crossover and 12 parallel design CV telemetry rat studies were used for statistical power calculations. Average values of telemetry parameters (heart rate, blood pressure, body temperature, and activity) were logged every 60 s (from 1 h predose to 24 h post-dose) and reduced to 15-min mean values. These data were subsequently binned into super intervals for statistical analysis. A repeated measure analysis of variance was used for statistical analysis of crossover studies and a repeated measure analysis of covariance was used for parallel studies. Statistical power analysis was performed to generate power curves and establish relationships between detectable CV (blood pressure and heart rate) changes and statistical power. Additionally, data from a crossover CV study with phentolamine at 4, 20 and 100 mg/kg are reported as a representative example of data analysis methods. Phentolamine produced a CV profile characteristic of alpha adrenergic receptor antagonism, evidenced by a dose-dependent decrease in blood pressure and reflex tachycardia. Detectable blood pressure changes at 80% statistical power for crossover studies (n=8) were 4-5 mmHg. For parallel studies (n=8), detectable changes at 80% power were 6-7 mmHg. Detectable heart rate changes for both study designs were 20-22 bpm. Based on our results, the conscious rat CV model is a sensitive tool to detect and mitigate CV risk in early safety studies. Furthermore, these results will enable informed selection of appropriate models and study design for early stage CV studies. Copyright © 2016 Elsevier Inc. All rights reserved.
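A quick way to reproduce this kind of "detectable change at 80% power" figure is statsmodels' paired-t power solver; the sketch below is illustrative only, and the within-animal SD is a made-up placeholder to be replaced by the SD estimated from the telemetry data.

from statsmodels.stats.power import TTestPower

# Crossover design, n = 8: smallest detectable standardized effect at 80% power
d = TTestPower().solve_power(nobs=8, alpha=0.05, power=0.8)
sd_mmHg = 3.5   # assumed SD of within-animal blood-pressure change (placeholder)
print(f"detectable change: {d:.2f} SD units = {d * sd_mmHg:.1f} mmHg")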
Monte Carlo based statistical power analysis for mediation models: methods and software.
Zhang, Zhiyong
2014-12-01
The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.
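The proposed procedure is implemented in the authors' R package bmem; as a language-neutral illustration of the idea (Monte Carlo outer loop, percentile bootstrap inner loop), here is a small Python sketch of my own for the simple mediation model, with all parameter values assumed.

import numpy as np

rng = np.random.default_rng(3)

def mediation_power(n=100, a=0.3, b=0.3, sims=200, boots=300, alpha=0.05):
    """Power of the percentile-bootstrap test of the mediated effect a*b
    in X -> M -> Y; increase sims/boots for stable estimates."""
    hits = 0
    for _ in range(sims):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = b * m + rng.normal(size=n)
        ab = np.empty(boots)
        for i in range(boots):
            idx = rng.integers(0, n, n)                # resample cases
            xb, mb, yb = x[idx], m[idx], y[idx]
            a_hat = np.polyfit(xb, mb, 1)[0]           # a path
            X = np.column_stack([mb, xb, np.ones(n)])
            b_hat = np.linalg.lstsq(X, yb, rcond=None)[0][0]   # b path
            ab[i] = a_hat * b_hat
        lo, hi = np.percentile(ab, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        hits += (lo > 0) or (hi < 0)                   # CI excludes zero
    return hits / sims

print(mediation_power())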
Egbewale, Bolaji E; Lewis, Martyn; Sim, Julius
2014-04-09
Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. 126 hypothetical trial scenarios were evaluated (126,000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power.
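A stripped-down version of such a simulation takes a few lines; the sketch below (my own, balanced-at-baseline case only, with assumed effect size and pretest-posttest correlation) compares the power of the three analyses, and shifting the pretest mean of one arm reproduces the bias the authors describe.

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def simulate(n=50, rho=0.5, effect=0.4, reps=2000, alpha=0.05):
    rej = {"ANOVA": 0, "CSA": 0, "ANCOVA": 0}
    g = np.repeat([0, 1], n)
    for _ in range(reps):
        pre = rng.normal(size=2 * n)
        post = effect * g + rho * pre + rng.normal(
            scale=np.sqrt(1 - rho**2), size=2 * n)
        rej["ANOVA"] += stats.ttest_ind(post[g == 1], post[g == 0]).pvalue < alpha
        chg = post - pre                                 # change scores
        rej["CSA"] += stats.ttest_ind(chg[g == 1], chg[g == 0]).pvalue < alpha
        X = np.column_stack([g, pre, np.ones(2 * n)])    # ANCOVA design matrix
        beta, *_ = np.linalg.lstsq(X, post, rcond=None)
        resid = post - X @ beta
        covb = (resid @ resid / (2 * n - 3)) * np.linalg.inv(X.T @ X)
        tval = beta[0] / np.sqrt(covb[0, 0])
        rej["ANCOVA"] += 2 * stats.t.sf(abs(tval), 2 * n - 3) < alpha
    return {k: v / reps for k, v in rej.items()}

print(simulate())   # ANCOVA has the highest power at rho = 0.5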
The Statistical Power of Planned Comparisons.
ERIC Educational Resources Information Center
Benton, Roberta L.
Basic principles underlying statistical power are examined; and issues pertaining to effect size, sample size, error variance, and significance level are highlighted via the use of specific hypothetical examples. Analysis of variance (ANOVA) and related methods remain popular, although other procedures sometimes have more statistical power against…
NASA Astrophysics Data System (ADS)
Asal, F. F.
2012-07-01
Digital elevation data obtained from different engineering surveying techniques are utilized in generating the Digital Elevation Model (DEM), which is employed in many engineering and environmental applications. These data are usually in discrete point format, making it necessary to utilize an interpolation approach for creation of the DEM. Quality assessment of the DEM is a vital issue controlling its use in different applications; however, this assessment relies heavily on statistical methods while neglecting visual methods. This research applies visual analysis to DEMs generated using the IDW interpolator at varying powers in order to examine its potential for assessing the effects of variation in the IDW power on DEM quality. Real elevation data were collected in the field using a total station instrument in corrugated terrain. DEMs were generated from the data at a unified cell size using the IDW interpolator with power values ranging from one to ten. Visual analysis was undertaken using 2D and 3D views of the DEMs; in addition, statistical analysis was performed to assess the validity of the visual techniques. Visual analysis showed that smoothing of the DEM decreases as the power value increases up to a power of four; increasing the power beyond four, however, leaves no noticeable changes in the 2D and 3D views of the DEM. The statistical analysis supported these results, in that the standard deviation (SD) of the DEM increased with increasing power. More specifically, changing the power from one to two produced 36% of the total increase in SD (the increase due to changing the power from one to ten), and changing to powers of three and four gave 60% and 75%, respectively. This reflects a decrease in DEM smoothing as the IDW power increases. The study also showed that visual methods supported by statistical analysis have good potential for DEM quality assessment.
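For readers unfamiliar with the interpolator itself, a compact IDW implementation follows (a generic sketch with synthetic terrain, not the study's GIS workflow); raising the power concentrates weight on the nearest samples, which is the decrease in smoothing examined above.

import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0, eps=1e-12):
    """Inverse Distance Weighting: each query point receives a weighted
    average of sample elevations with weights 1 / distance**power."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power
    return (w * z_known).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(5)
pts = rng.uniform(0, 100, (200, 2))                       # surveyed points
z = 5 * np.sin(pts[:, 0] / 15) + rng.normal(0, 0.3, 200)  # synthetic elevations
grid = np.column_stack([np.linspace(0, 100, 50), np.full(50, 50.0)])
for p in (1, 2, 4, 10):
    print(p, round(idw(pts, z, grid, power=p).std(), 3))  # SD grows with power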
ERIC Educational Resources Information Center
Texeira, Antonio; Rosa, Alvaro; Calapez, Teresa
2009-01-01
This article presents statistical power analysis (SPA) based on the normal distribution using Excel, adopting textbook and SPA approaches. The objective is to present the latter in a comparative way within a framework that is familiar to textbook level readers, as a first step to understand SPA with other distributions. The analysis focuses on the…
Jiang, Wei; Yu, Weichuan
2017-02-15
In genome-wide association studies (GWASs) of common diseases/traits, we often analyze multiple GWASs with the same phenotype together to discover associated genetic variants with higher power. Since it is difficult to access data with detailed individual measurements, summary-statistics-based meta-analysis methods have become popular to jointly analyze datasets from multiple GWASs. In this paper, we propose a novel summary-statistics-based joint analysis method based on controlling the joint local false discovery rate (Jlfdr). We prove that our method is the most powerful summary-statistics-based joint analysis method when controlling the false discovery rate at a certain level. In particular, the Jlfdr-based method achieves higher power than commonly used meta-analysis methods when analyzing heterogeneous datasets from multiple GWASs. Simulation experiments demonstrate the superior power of our method over meta-analysis methods. Also, our method discovers more associations than meta-analysis methods from empirical datasets of four phenotypes. The R-package is available at: http://bioinformatics.ust.hk/Jlfdr.html. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Halligan, Matthew
Radiated power calculation approaches for practical scenarios of incomplete high-density interface characterization information and incomplete incident power information are presented. The suggested approaches build upon a method that characterizes power losses through the definition of power loss constant matrices. Potential radiated power estimates include using total power loss information, partial radiated power loss information, worst case analysis, and statistical bounding analysis. A method is also proposed to calculate radiated power when incident power information is not fully known for non-periodic signals at the interface. Incident data signals are modeled from a two-state Markov chain where bit state probabilities are derived. The total spectrum for windowed signals is postulated as the superposition of spectra from individual pulses in a data sequence. Statistical bounding methods are proposed as a basis for the radiated power calculation due to the statistical calculation complexity to find a radiated power probability density function.
The Importance of Teaching Power in Statistical Hypothesis Testing
ERIC Educational Resources Information Center
Olinsky, Alan; Schumacher, Phyllis; Quinn, John
2012-01-01
In this paper, we discuss the importance of teaching power considerations in statistical hypothesis testing. Statistical power analysis determines the ability of a study to detect a meaningful effect size, where the effect size is the difference between the hypothesized value of the population parameter under the null hypothesis and the true value…
Westfall, Jacob; Kenny, David A; Judd, Charles M
2014-10-01
Researchers designing experiments in which a sample of participants responds to a sample of stimuli are faced with difficult questions about optimal study design. The conventional procedures of statistical power analysis fail to provide appropriate answers to these questions because they are based on statistical models in which stimuli are not assumed to be a source of random variation in the data, models that are inappropriate for experiments involving crossed random factors of participants and stimuli. In this article, we present new methods of power analysis for designs with crossed random factors, and we give detailed, practical guidance to psychology researchers planning experiments in which a sample of participants responds to a sample of stimuli. We extensively examine 5 commonly used experimental designs, describe how to estimate statistical power in each, and provide power analysis results based on a reasonable set of default parameter values. We then develop general conclusions and formulate rules of thumb concerning the optimal design of experiments in which a sample of participants responds to a sample of stimuli. We show that in crossed designs, statistical power typically does not approach unity as the number of participants goes to infinity but instead approaches a maximum attainable power value that is possibly small, depending on the stimulus sample. We also consider the statistical merits of designs involving multiple stimulus blocks. Finally, we provide a simple and flexible Web-based power application to aid researchers in planning studies with samples of stimuli.
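The article's central point, that power plateaus as the participant sample grows while the stimulus sample stays fixed, can be seen from a crude normal-approximation sketch; the formula and variance components below are my own simplification with invented values, not the authors' models or default parameters.

import numpy as np
from scipy.stats import norm

def approx_power(p, q, d=0.4, var_pc=0.2, var_sc=0.2, var_e=1.0, alpha=0.05):
    """Rough power approximation for a condition effect d in a fully
    crossed participants (p) x stimuli (q) design: the standard error
    has a stimulus-driven floor that adding participants cannot remove."""
    se = np.sqrt(var_pc / p + var_sc / q + var_e / (p * q))
    return norm.cdf(abs(d) / se - norm.ppf(1 - alpha / 2))

for p in (25, 100, 1000, 10**6):
    print(p, round(approx_power(p, q=20), 3))
# As p -> infinity, power approaches the ceiling set by var_sc / q.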
"Using Power Tables to Compute Statistical Power in Multilevel Experimental Designs"
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2009-01-01
Power computations for one-level experimental designs that assume simple random samples are greatly facilitated by power tables such as those presented in Cohen's book about statistical power analysis. However, in education and the social sciences experimental designs have naturally nested structures and multilevel models are needed to compute the…
Powerlaw: a Python package for analysis of heavy-tailed distributions.
Alstott, Jeff; Bullmore, Ed; Plenz, Dietmar
2014-01-01
Power laws are theoretically interesting probability distributions that are also frequently used to describe empirical data. In recent years, effective statistical methods for fitting power laws have been developed, but appropriate use of these techniques requires significant programming and statistical insight. In order to greatly decrease the barriers to using good statistical methods for fitting power law distributions, we developed the powerlaw Python package. This software package provides easy commands for basic fitting and statistical analysis of distributions. Notably, it also seeks to support a variety of user needs by being exhaustive in the options available to the user. The source code is publicly available and easily extensible.
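Typical usage of the package, following its documented interface (Fit, power_law.alpha, power_law.xmin, distribution_compare); the synthetic Pareto sample here is mine.

import numpy as np
import powerlaw

# Synthetic heavy-tailed data: Pareto with exponent alpha = 2.5, xmin = 1
data = (1 - np.random.rand(10000)) ** (-1 / 1.5)

fit = powerlaw.Fit(data)                  # estimates xmin and alpha jointly
print(fit.power_law.alpha, fit.power_law.xmin)

# Log-likelihood-ratio comparison with a lognormal alternative
R, p = fit.distribution_compare('power_law', 'lognormal')
print(R, p)                               # R > 0 favors the power law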
1990-03-01
[Fragment of a 1990 report on statistical energy analysis (SEA); only scattered text on the SEA power-balance equation and reference-list excerpts are recoverable, including: CUSCHIERI, J.M., 'Power flow as a complement to statistical energy analysis', Congress on Acoustics, July 24-31, 1986, Toronto, Canada, Paper D6-1; 'Random response of identical one-dimensional subsystems', Journal of Sound and Vibration, 1980, Vol. 70, pp. 343-353; LYON, R.H., Statistical Energy Analysis.]
The 1993 Mississippi river flood: A one hundred or a one thousand year event?
Malamud, B.D.; Turcotte, D.L.; Barton, C.C.
1996-01-01
Power-law (fractal) extreme-value statistics are applicable to many natural phenomena under a wide variety of circumstances. Data from a hydrologic station in Keokuk, Iowa, show the great flood of the Mississippi River in 1993 has a recurrence interval on the order of 100 years using power-law statistics applied to a partial-duration flood series and on the order of 1,000 years using a log-Pearson type 3 (LP3) distribution applied to an annual series. The LP3 analysis is the federally adopted probability distribution for flood-frequency estimation of extreme events. We suggest that power-law statistics are preferable to LP3 analysis. As a further test of the power-law approach we consider paleoflood data from the Colorado River. We compare power-law and LP3 extrapolations of historical data with these paleofloods. The results are remarkably similar to those obtained for the Mississippi River: recurrence intervals from power-law statistics applied to Lees Ferry discharge data are generally consistent with inferred 100- and 1,000-year paleofloods, whereas LP3 analysis gives recurrence intervals that are orders of magnitude longer. For both the Keokuk and Lees Ferry gauges, the use of an annual series introduces an artificial curvature in log-log space that leads to an underestimate of severe floods. Power-law statistics predict much shorter recurrence intervals than the federally adopted LP3 statistics. We suggest that if power-law behavior is applicable, then the likelihood of severe floods is much higher. More conservative dam designs and land-use restrictions may be required.
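The partial-duration power-law procedure amounts to a straight-line fit in log-log space of discharge against recurrence interval; here is a minimal sketch with synthetic data (the Weibull plotting position and all numbers are standard assumptions of mine, not the paper's exact estimator).

import numpy as np

rng = np.random.default_rng(6)
q = 1000 * (1 - rng.random(80)) ** (-0.4)      # stand-in partial-duration series

q_sorted = np.sort(q)[::-1]                    # largest discharge first
T = (len(q) + 1) / np.arange(1, len(q) + 1)    # Weibull plotting position, years

# Power law Q = C * T**beta is a straight line in log-log space
beta, logC = np.polyfit(np.log10(T), np.log10(q_sorted), 1)
print(f"estimated 100-yr discharge: {10**logC * 100**beta:.0f}")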
On the analysis of very small samples of Gaussian repeated measurements: an alternative approach.
Westgate, Philip M; Burchett, Woodrow W
2017-03-15
The analysis of very small samples of Gaussian repeated measurements can be challenging. First, due to a very small number of independent subjects contributing outcomes over time, statistical power can be quite small. Second, nuisance covariance parameters must be appropriately accounted for in the analysis in order to maintain the nominal test size. However, available statistical strategies that ensure valid statistical inference may lack power, whereas more powerful methods may have the potential for inflated test sizes. Therefore, we explore an alternative approach to the analysis of very small samples of Gaussian repeated measurements, with the goal of maintaining valid inference while also improving statistical power relative to other valid methods. This approach uses generalized estimating equations with a bias-corrected empirical covariance matrix that accounts for all small-sample aspects of nuisance correlation parameter estimation in order to maintain valid inference. Furthermore, the approach utilizes correlation selection strategies with the goal of choosing the working structure that will result in the greatest power. In our study, we show that when accurate modeling of the nuisance correlation structure impacts the efficiency of regression parameter estimation, this method can improve power relative to existing methods that yield valid inference. Copyright © 2017 John Wiley & Sons, Ltd.
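In Python, a closely related estimator (the Mancl-DeRouen bias-corrected empirical covariance available in statsmodels GEE, which is related in spirit but not necessarily the authors' exact correction or selection strategy) can be requested in one line; a small sketch with simulated repeated measures:

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n_subj, n_time = 10, 4                          # deliberately very small sample
subj = np.repeat(np.arange(n_subj), n_time)
t = np.tile(np.arange(n_time), n_subj)
y = 0.5 * t + rng.normal(0, 0.7, n_subj)[subj] + rng.normal(size=n_subj * n_time)
df = pd.DataFrame({"y": y, "t": t, "subj": subj})

model = sm.GEE.from_formula("y ~ t", groups="subj", data=df,
                            cov_struct=sm.cov_struct.Exchangeable())
res = model.fit(cov_type="bias_reduced")   # small-sample-corrected empirical SEs
print(res.params, res.bse)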
Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz
2015-03-01
FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or only selected one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10 mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
Robust inference for group sequential trials.
Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei
2017-03-01
For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of a statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of 2 P value combining methods for group sequential trials. The emphasis is on time to event trials although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analysis at different times. The robustness of results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.
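The mechanics of P value combination are illustrated below with scipy; note that Fisher's method as shown assumes independent P values, whereas the paper's contribution is precisely to control the type I error rate when the statistics are computed on the same data. The data here are invented.

import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x, y = rng.normal(0.3, 1, 60), rng.normal(0, 1, 60)

# P values from two different test statistics on the same endpoint
p_t = stats.ttest_ind(x, y).pvalue
p_w = stats.mannwhitneyu(x, y, alternative="two-sided").pvalue

# Fisher combination (scipy also offers 'stouffer', 'tippett', ...);
# with dependent P values this is anticonservative without adjustment
stat, p_comb = stats.combine_pvalues([p_t, p_w], method="fisher")
print(p_t, p_w, p_comb)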
On the structure and phase transitions of power-law Poissonian ensembles
NASA Astrophysics Data System (ADS)
Eliazar, Iddo; Oshanin, Gleb
2012-10-01
Power-law Poissonian ensembles are Poisson processes that are defined on the positive half-line, and that are governed by power-law intensities. Power-law Poissonian ensembles are stochastic objects of fundamental significance; they uniquely display an array of fractal features and they uniquely generate a span of important applications. In this paper we apply three different methods—oligarchic analysis, Lorenzian analysis and heterogeneity analysis—to explore power-law Poissonian ensembles. The amalgamation of these analyses, combined with the topology of power-law Poissonian ensembles, establishes a detailed and multi-faceted picture of the statistical structure and the statistical phase transitions of these elemental ensembles.
Targeted On-Demand Team Performance App Development
2016-10-01
from three sites; 6) Preliminary analysis indicates a larger-than-estimated effect size and the study is sufficiently powered for generalizable outcomes. ...statistical analyses, and examine any resulting qualitative data for trends or connections to statistical outcomes. On Schedule.
Teaching Principles of Linkage and Gene Mapping with the Tomato.
ERIC Educational Resources Information Center
Hawk, James A.; And Others
1980-01-01
A three-point linkage system in tomatoes is used to explain concepts of gene mapping, linkage, and statistical analysis. The system is designed for teaching the effective use of statistics and the power of genetic analysis derived from statistical analysis of phenotypic ratios. (Author/SA)
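The statistical core of such an exercise is a chi-square goodness-of-fit test on the eight testcross phenotype classes; a small sketch with invented counts (not data from the article):

import numpy as np
from scipy import stats

# Three-point testcross: 8 phenotype classes, expected 1:1:...:1 under
# independent assortment; unequal recombinant classes signal linkage.
observed = np.array([310, 287, 62, 58, 29, 33, 111, 110])   # invented counts
expected = np.full(8, observed.sum() / 8)

chi2, p = stats.chisquare(observed, expected)
print(f"chi2 = {chi2:.1f}, p = {p:.3g}")   # tiny p => reject free assortment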
APPLICATION OF STATISTICAL ENERGY ANALYSIS TO VIBRATIONS OF MULTI-PANEL STRUCTURES.
cylindrical shell are compared with predictions obtained from statistical energy analysis. Generally good agreement is observed. The flow of mechanical...the coefficients of proportionality between power flow and average modal energy difference, which one must know in order to apply statistical energy analysis. No
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variant tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
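The burden-test version of this combination scheme is compact enough to sketch (a simplified illustration of the score-aggregation idea with invented numbers, not the authors' full framework, which also covers variance-component tests and heterogeneity):

import numpy as np
from scipy import stats

def meta_burden(scores, covs, weights=None):
    """Pool per-study score vectors U_k and covariance matrices V_k
    (one entry per variant), then test the weighted burden w'U."""
    U = np.sum(scores, axis=0)
    V = np.sum(covs, axis=0)
    w = np.ones(len(U)) if weights is None else np.asarray(weights)
    z = w @ U / np.sqrt(w @ V @ w)
    return 2 * stats.norm.sf(abs(z))

# Two toy studies, three rare variants in one gene (numbers invented)
U1, U2 = np.array([2.0, 1.1, 0.4]), np.array([1.4, 0.2, 0.9])
V1, V2 = np.diag([1.0, 0.8, 0.5]), np.diag([0.9, 0.7, 0.6])
print(meta_burden([U1, U2], [V1, V2]))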
Low statistical power in biomedical science: a review of three human research domains.
Dumas-Mallet, Estelle; Button, Katherine S; Boraud, Thomas; Gonon, Francois; Munafò, Marcus R
2017-02-01
Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0-10% or 11-20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation.
Power Analysis Software for Educational Researchers
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; Long, Haiying; Abaci, Serdar
2012-01-01
Given the importance of statistical power analysis in quantitative research and the repeated emphasis on it by American Educational Research Association/American Psychological Association journals, the authors examined the reporting practice of power analysis by the quantitative studies published in 12 education/psychology journals between 2005…
Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling.
Nord, Camilla L; Valton, Vincent; Wood, John; Roiser, Jonathan P
2017-08-23
Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered-some very seriously so-but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. Copyright © 2017 Nord, Valton et al.
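The modeling step itself is routine with scikit-learn; the sketch below uses synthetic two-component data standing in for the 730 study-level power values (my stand-in, not the Button et al. data) and selects the number of Gaussian components by BIC, as one might when asking whether a single summary statistic suffices.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(9)
# Stand-in for per-study power estimates: low-power and high-power clusters
power = np.concatenate([rng.beta(2, 12, 500), rng.beta(9, 3, 230)])
X = power.reshape(-1, 1)

fits = [GaussianMixture(k, random_state=0).fit(X) for k in range(1, 5)]
best = min(fits, key=lambda m: m.bic(X))
print(best.n_components, best.means_.ravel())   # two components recovered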
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.
Improved score statistics for meta-analysis in single-variant and gene-level association studies.
Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo
2018-06-01
Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss problem by the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) incidence in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and age-specific incidence trends (Cochran-Armitage trend P value
Asking Sensitive Questions: A Statistical Power Analysis of Randomized Response Models
ERIC Educational Resources Information Center
Ulrich, Rolf; Schroter, Hannes; Striegel, Heiko; Simon, Perikles
2012-01-01
This article derives the power curves for a Wald test that can be applied to randomized response models when small prevalence rates must be assessed (e.g., detecting doping behavior among elite athletes). These curves enable the assessment of the statistical power that is associated with each model (e.g., Warner's model, crosswise model, unrelated…
Sample size and power considerations in network meta-analysis
2012-01-01
Background Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327
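As a sketch of the idea, a commonly cited heuristic sets the effective sample size of an indirect comparison A vs C through common comparator B at a harmonic-mean-style combination of the two direct sample sizes; both the formula and the normal-approximation power step below are my reading of that approach and should be checked against the paper.

import numpy as np
from scipy.stats import norm

def effective_n(n_ab, n_bc):
    """Heuristic effective sample size of the indirect comparison A vs C
    formed from direct comparisons A-B (n_ab patients) and B-C (n_bc)."""
    return n_ab * n_bc / (n_ab + n_bc)

def pairwise_power(n_total, d, alpha=0.05):
    """Normal-approximation power of a 1:1 pairwise comparison with
    standardized effect d and unit outcome SD."""
    se = 2 / np.sqrt(n_total)
    return norm.cdf(abs(d) / se - norm.ppf(1 - alpha / 2))

n_eff = effective_n(800, 600)
print(round(n_eff), round(pairwise_power(n_eff, d=0.2), 3))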
Zhu, Yun; Fan, Ruzong; Xiong, Momiao
2017-01-01
Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information to achieve deep understanding of the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple phenotype association analysis paradigm lacks breadth (number of phenotypes and genetic variants jointly analyzed at the same time) and depth (hierarchical structure of phenotype and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representation and features from high dimensional genotype and phenotype data. To explore correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method, referred to as quadratically regularized functional CCA (QRFCCA), for association analysis, which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis and (3) canonical correlation analysis (CCA). Large-scale simulations show that the QRFCCA has a much higher power than that of the ten competing statistics while retaining the appropriate type 1 errors. To further evaluate performance, the QRFCCA and ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that the QRFCCA substantially outperforms the ten other statistics. PMID:29040274
Reliable and More Powerful Methods for Power Analysis in Structural Equation Modeling
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Zhang, Zhiyong; Zhao, Yanyun
2017-01-01
The normal-distribution-based likelihood ratio statistic T[subscript ml] = nF[subscript ml] is widely used for power analysis in structural equation modeling (SEM). In such an analysis, power and sample size are computed by assuming that T[subscript ml] follows a central chi-square distribution under H[subscript 0] and a noncentral chi-square…
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
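A minimal example of "estimation with quantified uncertainty" in the Bayesian sense is the conjugate normal model below (a textbook sketch of my own, not the authors' software or examples): the posterior and its credible interval come in closed form.

import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
y = rng.normal(0.4, 1.0, 50)                    # observed data

# Conjugate model: y_i ~ N(mu, sigma^2) with sigma known, prior mu ~ N(0, 1)
sigma, m0, s0 = 1.0, 0.0, 1.0
n = len(y)
post_var = 1 / (1 / s0**2 + n / sigma**2)
post_mean = post_var * (m0 / s0**2 + y.sum() / sigma**2)

lo, hi = stats.norm.ppf([0.025, 0.975], post_mean, np.sqrt(post_var))
print(f"posterior mean {post_mean:.2f}, 95% credible interval ({lo:.2f}, {hi:.2f})")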
ERIC Educational Resources Information Center
Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.
2010-01-01
This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…
ERIC Educational Resources Information Center
Dong, Nianbo; Spybrook, Jessaca; Kelcey, Ben
2016-01-01
The purpose of this study is to propose a general framework for power analyses to detect the moderator effects in two- and three-level cluster randomized trials (CRTs). The study specifically aims to: (1) develop the statistical formulations for calculating statistical power, minimum detectable effect size (MDES) and its confidence interval to…
Statistical power analysis in wildlife research
Steidl, R.J.; Hayes, J.P.
1997-01-01
Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to 'accept' a null hypothesis as true. We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
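The claim that observed-effect retrospective power is uninformative for non-rejected tests can be demonstrated in a few lines; in the sketch below (assumed true effect size and sample size), every retrospective estimate for a non-significant t-test falls below 0.5, whatever the true power happens to be.

import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

rng = np.random.default_rng(11)
true_d, n, pw = 0.3, 30, TTestIndPower()

retro = []
for _ in range(2000):
    x, y = rng.normal(0, 1, n), rng.normal(true_d, 1, n)
    if stats.ttest_ind(y, x).pvalue >= 0.05:          # test not rejected
        d_obs = (y.mean() - x.mean()) / np.sqrt(
            (x.var(ddof=1) + y.var(ddof=1)) / 2)      # observed effect size
        retro.append(pw.power(effect_size=abs(d_obs), nobs1=n, alpha=0.05))
print(f"mean retrospective power: {np.mean(retro):.2f}  (max {max(retro):.2f})")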
Can power-law scaling and neuronal avalanches arise from stochastic dynamics?
Touboul, Jonathan; Destexhe, Alain
2010-02-11
The presence of self-organized criticality in biology is often evidenced by a power-law scaling of event size distributions, which can be measured by linear regression on logarithmic axes. We show here that such a procedure does not necessarily mean that the system exhibits self-organized criticality. We first provide an analysis of multisite local field potential (LFP) recordings of brain activity and show that event size distributions defined as negative LFP peaks can be close to power-law distributions. However, this result is not robust to change in detection threshold, or when tested using more rigorous statistical analyses such as the Kolmogorov-Smirnov test. Similar power-law scaling is observed for surrogate signals, suggesting that power-law scaling may be a generic property of thresholded stochastic processes. We next investigate this problem analytically, and show that, indeed, stochastic processes can produce spurious power-law scaling without the presence of underlying self-organized criticality. However, this power-law is only apparent in logarithmic representations, and does not survive more rigorous analysis such as the Kolmogorov-Smirnov test. The same analysis was also performed on an artificial network known to display self-organized criticality. In this case, both the graphical representations and the rigorous statistical analysis reveal with no ambiguity that the avalanche size is distributed as a power-law. We conclude that logarithmic representations can lead to spurious power-law scaling induced by the stochastic nature of the phenomenon. This apparent power-law scaling does not constitute a proof of self-organized criticality, which should be demonstrated by more stringent statistical tests.
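The paper's two-step diagnosis, a log-log regression that looks linear versus a Kolmogorov-Smirnov test that rejects, can be imitated on synthetic data; the sketch below thresholds the peaks of an AR(1) noise process (a crude stand-in for LFP peaks; threshold and parameters are arbitrary assumptions) and checks the fitted Pareto with a KS test. KS P values computed after fitting parameters are themselves only approximate.

import numpy as np
from scipy import stats, signal

rng = np.random.default_rng(12)
x = signal.lfilter([1.0], [1.0, -0.98], rng.standard_normal(200_000))  # AR(1)
is_peak = (x > np.roll(x, 1)) & (x > np.roll(x, -1))
peaks = x[is_peak & (x > 2 * x.std())]          # thresholded event sizes

# Step 1: the log-log histogram regression looks deceptively straight
h, edges = np.histogram(peaks, bins=25)
c = 0.5 * (edges[:-1] + edges[1:])
slope = np.polyfit(np.log10(c[h > 0]), np.log10(h[h > 0]), 1)[0]

# Step 2: KS test against the maximum-likelihood Pareto fit rejects
xmin = peaks.min()
b_hat = len(peaks) / np.log(peaks / xmin).sum()
D, p = stats.kstest(peaks, "pareto", args=(b_hat, 0, xmin))
print(f"log-log slope {slope:.2f}; KS p = {p:.2g}")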
Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity
Beasley, T. Mark
2013-01-01
Increasing the correlation between the independent variable and the mediator (the a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation due to increases in a at some point outweighs the increase of the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed. PMID:24954952
A κ-generalized statistical mechanics approach to income analysis
NASA Astrophysics Data System (ADS)
Clementi, F.; Gallegati, M.; Kaniadakis, G.
2009-02-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
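For concreteness, the κ-exponential and the resulting complementary CDF can be written down directly; the sketch below uses the standard definitions from κ-generalized statistics with illustrative parameter values, not the US income estimates reported in the paper.

import numpy as np

def exp_kappa(x, kappa):
    """kappa-exponential (sqrt(1 + k^2 x^2) + k x)**(1/k); -> exp(x) as k -> 0."""
    x = np.asarray(x, dtype=float)
    if kappa == 0:
        return np.exp(x)
    return (np.sqrt(1 + kappa**2 * x**2) + kappa * x) ** (1 / kappa)

def income_ccdf(x, alpha, beta, kappa):
    """P(X > x) = exp_kappa(-beta * x**alpha): exponential-like body,
    Pareto power-law tail with exponent alpha / kappa."""
    return exp_kappa(-beta * np.asarray(x, dtype=float) ** alpha, kappa)

x = np.logspace(-1, 2, 7)
print(income_ccdf(x, alpha=2.0, beta=0.5, kappa=0.7))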
Experimental design, power and sample size for animal reproduction experiments.
Chapman, Phillip L; Seidel, George E
2008-01-01
The present paper concerns statistical issues in the design of animal reproduction experiments, with emphasis on the problems of sample size determination and power calculations. We include examples and non-technical discussions aimed at helping researchers avoid serious errors that may invalidate or seriously impair the validity of conclusions from experiments. Screen shots from interactive power calculation programs and basic SAS power calculation programs are presented to aid in understanding statistical power and computing power in some common experimental situations. Practical issues that are common to most statistical design problems are briefly discussed. These include one-sided hypothesis tests, power level criteria, equality of within-group variances, transformations of response variables to achieve variance equality, optimal specification of treatment group sizes, 'post hoc' power analysis and arguments for the increased use of confidence intervals in place of hypothesis tests.
The Empirical Review of Meta-Analysis Published in Korea
ERIC Educational Resources Information Center
Park, Sunyoung; Hong, Sehee
2016-01-01
Meta-analysis is a statistical method that is increasingly utilized to combine and compare the results of previous primary studies. However, because of the lack of comprehensive guidelines for how to use meta-analysis, many meta-analysis studies have failed to consider important aspects, such as statistical programs, power analysis, publication…
Pounds, Stan; Cao, Xueyuan; Cheng, Cheng; Yang, Jun; Campana, Dario; Evans, William E.; Pui, Ching-Hon; Relling, Mary V.
2010-01-01
Powerful methods for integrated analysis of multiple biological data sets are needed to maximize interpretation capacity and acquire meaningful knowledge. We recently developed Projection Onto the Most Interesting Statistical Evidence (PROMISE). PROMISE is a statistical procedure that incorporates prior knowledge about the biological relationships among endpoint variables into an integrated analysis of microarray gene expression data with multiple biological and clinical endpoints. Here, PROMISE is adapted to the integrated analysis of pharmacologic, clinical, and genome-wide genotype data, incorporating knowledge about the biological relationships among pharmacologic and clinical response data. An efficient permutation-testing algorithm is introduced so that statistical calculations are computationally feasible in this higher-dimension setting. The new method is applied to a pediatric leukemia data set. The results clearly indicate that PROMISE is a powerful statistical tool for identifying genomic features that exhibit a biologically meaningful pattern of association with multiple endpoint variables. PMID:21516175
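The permutation-testing idea the abstract relies on is easy to sketch in its generic, non-optimized form; the code below is a plain association permutation test, not the PROMISE algorithm itself, and all names are hypothetical.

```python
import numpy as np

def perm_pvalue(x, y, n_perm=10_000, seed=1):
    # Permute y to break any x-y association, recompute the statistic,
    # and compare with the observed value (two-sided).
    rng = np.random.default_rng(seed)
    stat = lambda a, b: np.corrcoef(a, b)[0, 1]
    observed = stat(x, y)
    null = np.array([stat(x, rng.permutation(y)) for _ in range(n_perm)])
    return (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 0.3 * x + rng.normal(size=100)
print(perm_pvalue(x, y))
```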
EEG Correlates of Fluctuation in Cognitive Performance in an Air Traffic Control Task
2014-11-01
Non-parametric statistical analysis was used to identify neurophysiological patterns due to the time-on-task effect. Significant changes in EEG power... Keywords: EEG, Cognitive Performance, Power Spectral Analysis, Non-Parametric Analysis. Document is available to the public through the Internet.
A new statistic for the analysis of circular data in gamma-ray astronomy
NASA Technical Reports Server (NTRS)
Protheroe, R. J.
1985-01-01
A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.
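For context, the classical baseline such proposals are measured against is the Rayleigh test of uniformity; the sketch below implements that standard test (not the new statistic proposed above), using a common finite-sample p-value approximation. The Rayleigh test is known to be weak against alternatives in which only a small fraction of events cluster in phase, which is exactly the gap the proposed statistic targets.

```python
import numpy as np

def rayleigh_test(phases):
    # phases in radians; R is the mean resultant length.
    n = len(phases)
    R = np.hypot(np.cos(phases).sum(), np.sin(phases).sum()) / n
    z = n * R**2
    # First-order finite-sample correction to p = exp(-z).
    p = np.exp(-z) * (1 + (2 * z - z**2) / (4 * n))
    return z, float(np.clip(p, 0.0, 1.0))

rng = np.random.default_rng(3)
print(rayleigh_test(rng.uniform(0, 2 * np.pi, 200)))  # uniform: large p
```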
ERIC Educational Resources Information Center
Madhere, Serge
An analytic procedure, efficiency analysis, is proposed for improving the utility of quantitative program evaluation for decision making. The three features of the procedure are explained: (1) for statistical control, it adopts and extends the regression-discontinuity design; (2) for statistical inferences, it de-emphasizes hypothesis testing in…
Power-law statistics of neurophysiological processes analyzed using short signals
NASA Astrophysics Data System (ADS)
Pavlova, Olga N.; Runnova, Anastasiya E.; Pavlov, Alexey N.
2018-04-01
We discuss the problem of quantifying power-law statistics of complex processes from short signals. Based on the analysis of electroencephalograms (EEG), we compare three interrelated approaches which enable characterization of the power spectral density (PSD) and show that application of detrended fluctuation analysis (DFA) or the wavelet-transform modulus maxima (WTMM) method represents a useful way of indirectly characterizing the PSD features from short data sets. We conclude that although DFA- and WTMM-based measures can be obtained from the estimated PSD, these tools outperform standard spectral analysis when the analyzed regime must be characterized from a very limited amount of data.
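A minimal order-1 DFA, one of the interrelated tools mentioned, can be written in a few lines; the sketch assumes NumPy, non-overlapping windows, linear detrending, and white-noise input for which the scaling exponent should come out near 0.5 (alpha ~ (beta + 1)/2 for PSD ~ 1/f^beta).

```python
import numpy as np

def dfa(signal, scales):
    # Integrate the mean-removed signal, then measure RMS residuals
    # around a linear fit in non-overlapping windows of each scale s.
    profile = np.cumsum(signal - signal.mean())
    fluctuations = []
    for s in scales:
        n_seg = len(profile) // s
        segments = profile[: n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        resid = [seg - np.polyval(np.polyfit(t, seg, 1), t) for seg in segments]
        fluctuations.append(np.sqrt(np.mean(np.square(resid))))
    return np.array(fluctuations)

scales = 2 ** np.arange(4, 10)
x = np.random.default_rng(4).standard_normal(4096)
F = dfa(x, scales)
alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]
print(round(alpha, 2))  # ~0.5 for white noise
```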
Statistical issues on the analysis of change in follow-up studies in dental research.
Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S
2007-12-01
To provide an overview of the problems in study design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and nonrandomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ANCOVA, thereby increasing statistical power. However, an important requirement for ANCOVA is that there be no interaction between the groups and baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated for nonrandomized (observational) studies and the use of ANCOVA is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by the use of statistical methods not widely practiced in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ANCOVA is generally inappropriate for nonrandomized studies, and causal inferences from observational data should be avoided.
A global estimate of the Earth's magnetic crustal thickness
NASA Astrophysics Data System (ADS)
Vervelidou, Foteini; Thébault, Erwan
2014-05-01
The Earth's lithosphere is considered to be magnetic only down to the Curie isotherm. Therefore the Curie isotherm can, in principle, be estimated by analysis of magnetic data. Here, we propose such an analysis in the spectral domain by means of a newly introduced regional spatial power spectrum. This spectrum is based on the Revised Spherical Cap Harmonic Analysis (R-SCHA) formalism (Thébault et al., 2006). We briefly discuss its properties and its relationship with the Spherical Harmonic spatial power spectrum. This relationship allows us to adapt any theoretical expression of the lithospheric field power spectrum expressed in Spherical Harmonic degrees to the regional formulation. We compared previously published statistical expressions (Jackson, 1994; Voorhies et al., 2002) to the recent lithospheric field models derived from the CHAMP and airborne measurements, and we finally developed a new statistical form for the power spectrum of the Earth's magnetic lithosphere that we think provides more consistent results. This expression depends on the mean magnetization, the mean crustal thickness and a power-law value that describes the amount of spatial correlation of the sources. In this study, we make combined use of the R-SCHA surface power spectrum and this statistical form. We conduct a series of regional spectral analyses for the entire Earth. For each region, we estimate the R-SCHA surface power spectrum of the NGDC-720 Spherical Harmonic model (Maus, 2010). We then fit each of these observational spectra to the statistical expression of the power spectrum of the Earth's lithosphere. By doing so, we estimate the large wavelengths of the magnetic crustal thickness on a global scale, which are not directly accessible from the magnetic measurements because of the masking core field. We then discuss these results and compare them to the results we obtained by conducting a similar spectral analysis in Cartesian coordinates, by means of a published statistical expression (Maus et al., 1997). We also compare our results to crustal thickness global maps derived by means of additional geophysical data (Purucker et al., 2002).
A study of the feasibility of statistical analysis of airport performance simulation
NASA Technical Reports Server (NTRS)
Myers, R. H.
1982-01-01
The feasibility of conducting a statistical analysis of simulation experiments to study airport capacity is investigated. First, the form of the distribution of airport capacity is studied. Since the distribution is non-Gaussian, it is important to determine the effect of this distribution on standard analysis of variance techniques and power calculations. Next, power computations are made in order to determine how economical simulation experiments would be if they were designed to detect capacity changes from condition to condition. Many of the conclusions drawn are results of Monte Carlo techniques.
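The Monte Carlo side of such a study is straightforward to sketch: simulate capacity under a skewed (non-Gaussian) distribution and estimate the power of a standard test by repetition. The gamma distribution, sample sizes and shift below are arbitrary stand-ins, not the study's actual simulation model.

```python
import numpy as np
from scipy import stats

def mc_power(shift, n, reps=2000, alpha=0.05, seed=5):
    # Fraction of replications in which a two-sample t-test detects a
    # capacity shift when the underlying distribution is skewed (gamma).
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        a = rng.gamma(shape=4.0, scale=1.0, size=n)
        b = rng.gamma(shape=4.0, scale=1.0, size=n) + shift
        hits += stats.ttest_ind(a, b).pvalue < alpha
    return hits / reps

print(mc_power(shift=0.5, n=50))
```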
Power Analysis in Two-Level Unbalanced Designs
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2010-01-01
Previous work on statistical power has discussed mainly single-level designs or 2-level balanced designs with random effects. Although balanced experiments are common, in practice balance cannot always be achieved. Work on class size is one example of unbalanced designs. This study provides methods for power analysis in 2-level unbalanced designs…
Power flow as a complement to statistical energy analysis and finite element analysis
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1987-01-01
Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies, where the number of resonances involved in the analysis is rather small. On the other hand, SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies, where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist, which makes the finite element method too costly; SEA methods, meanwhile, can predict only an average response level. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and can be integrated with established FE models at low frequencies and SEA models at high frequencies to provide verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.
The Shock and Vibration Digest. Volume 15. Number 1
1983-01-01
Statistical energy analysis (SEA) is highlighted as an analytical tool for the engineer. The concept is also used for vibrating systems with nonlinear elements; however, for such systems the resulting nonlinear algebraic equations can be difficult, and statistical energy analysis and power flow are applied to analyze the random response of two identical subsystems coupled at an end.
The Shock and Vibration Digest, Volume 17, Number 8
1985-08-01
Structures that generate, transmit, and radiate audible sound are covered; procedures are based on acoustic power flow, statistical energy analysis (SEA), and modal methods [22-28]. Entries include "Statistical Energy Analysis, Structural Resonances, and Beam Networks" (L.J. Lee, Heriot-Watt Univ., Chambers St., Edinburgh EH1 1HX, Scotland).
Statistical modeling of an integrated boiler for coal fired thermal power plant.
Chandrasekharan, Sreepradha; Panda, Rames Chandra; Swaminathan, Bhuvaneswari Natrajan
2017-06-01
Coal-fired thermal power plants play a major role in power production worldwide, as coal is available in abundance. Many of the existing power plants are based on subcritical technology, which can produce power with an efficiency of around 33%, whereas the newer plants are built on either supercritical or ultra-supercritical technology, whose efficiency can be up to 50%. The main objective of the work is to enhance the efficiency of the existing subcritical power plants to compensate for the increasing demand. For achieving the objective, statistical modeling of the boiler units, namely the economizer, drum and superheater, is initially carried out. The effectiveness of the developed models is tested using analysis methods like R² analysis and ANOVA (Analysis of Variance). The dependence of the process variable (temperature) on different manipulated variables is analyzed in the paper. Validation of the models is provided along with error analysis. Response surface methodology (RSM) supported by DOE (design of experiments) is implemented to optimize the operating parameters. Individual models along with the integrated model are used to study and design the predictive control of the coal-fired thermal power plant.
Robust Fixed-Structure Control
1994-10-30
Cited works include "Deterministic Foundation for Statistical Energy Analysis," J. Sound Vibr., to appear; D. S. Bernstein and S. P. Bhat, "Lyapunov Stability, Semistability..."; and D. S. Bernstein, "Power Flow, Energy Balance, and Statistical Energy Analysis for Large Scale, Interconnected Systems," Proc. Amer. Contr. Conf.
Seven ways to increase power without increasing N.
Hansen, W B; Collins, L M
1994-01-01
Many readers of this monograph may wonder why a chapter on statistical power was included. After all, by now the issue of statistical power is in many respects mundane. Everyone knows that statistical power is a central research consideration, and certainly most National Institute on Drug Abuse grantees or prospective grantees understand the importance of including a power analysis in research proposals. However, there is ample evidence that, in practice, prevention researchers are not paying sufficient attention to statistical power. If they were, the findings observed by Hansen (1992) in a recent review of the prevention literature would not have emerged. Hansen (1992) examined statistical power based on 46 cohorts followed longitudinally, using nonparametric assumptions given the subjects' age at posttest and the numbers of subjects. Results of this analysis indicated that, in order for a study to attain 80-percent power for detecting differences between treatment and control groups, the difference between groups at posttest would need to be at least 8 percent (in the best studies) and as much as 16 percent (in the weakest studies). In order for a study to attain 80-percent power for detecting group differences in pre-post change, 22 of the 46 cohorts would have needed relative pre-post reductions of greater than 100 percent. Thirty-three of the 46 cohorts had less than 50-percent power to detect a 50-percent relative reduction in substance use. These results are consistent with other review findings (e.g., Lipsey 1990) that have shown a similar lack of power in a broad range of research topics. Thus, it seems that, although researchers are aware of the importance of statistical power (particularly of the necessity for calculating it when proposing research), they somehow are failing to end up with adequate power in their completed studies. This chapter argues that the failure of many prevention studies to maintain adequate statistical power is due to an overemphasis on sample size (N) as the only, or even the best, way to increase statistical power. It is easy to see how this overemphasis has come about. Sample size is easy to manipulate, has the advantage of being related to power in a straightforward way, and usually is under the direct control of the researcher, except for limitations imposed by finances or subject availability. Another option for increasing power is to increase the alpha used for hypothesis-testing but, as very few researchers seriously consider significance levels much larger than the traditional .05, this strategy seldom is used. Of course, sample size is important, and the authors of this chapter are not recommending that researchers cease choosing sample sizes carefully. Rather, they argue that researchers should not confine themselves to increasing N to enhance power. It is important to take additional measures to maintain and improve power over and above making sure the initial sample size is sufficient. The authors recommend two general strategies. One strategy involves attempting to maintain the effective initial sample size so that power is not lost needlessly. The other strategy is to take measures to maximize the third factor that determines statistical power: effect size.
Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.
Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei
2016-02-01
Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and the sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power to Cox SKAT LRT, except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.
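Of the comparators named, the burden test in a Cox model is the simplest to sketch. The code below simulates rare-variant genotypes, collapses them into a burden score, and fits a Cox proportional hazards model with the lifelines package; every parameter value is a hypothetical illustration, and this is the BT comparator, not the authors' functional-regression LRT.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(6)
n, n_variants = 500, 10
genotypes = rng.binomial(2, 0.02, size=(n, n_variants))  # rare variants
burden = genotypes.sum(axis=1)                           # collapsed score

# Exponential survival times with hazard increasing in the burden score,
# administratively censored at t = 5.
times = rng.exponential(1.0 / (0.1 * np.exp(0.4 * burden)))
df = pd.DataFrame({"time": np.minimum(times, 5.0),
                   "event": (times < 5.0).astype(int),
                   "burden": burden})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
print(cph.summary[["coef", "p"]])
```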
A Mechanical Power Flow Capability for the Finite Element Code NASTRAN
1989-07-01
Approaches discussed include experimental methods, statistical energy analysis, the finite element method, and a finite element analogy using heat conduction equations. In experimental work, the weights and inertias of the transducers attached to an experimental structure may produce accuracy problems. References include Lyon, R.L., Statistical Energy Analysis of Dynamical Systems, The M.I.T. Press (1975), and Mickol, J.D., and R.J. Bernhard.
Venter, Anre; Maxwell, Scott E; Bolig, Erika
2002-06-01
Adding a pretest as a covariate to a randomized posttest-only design increases statistical power, as does the addition of intermediate time points to a randomized pretest-posttest design. Although typically 5 waves of data are required in this instance to produce meaningful gains in power, a 3-wave intensive design allows the evaluation of the straight-line growth model and may reduce the effect of missing data. The authors identify the statistically most powerful method of data analysis in the 3-wave intensive design. If straight-line growth is assumed, the pretest-posttest slope must assume fairly extreme values for the intermediate time point to increase power beyond the standard analysis of covariance on the posttest with the pretest as covariate, ignoring the intermediate time point.
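The power gain from a pretest covariate has a convenient textbook approximation: ANCOVA shrinks the error variance by (1 - rho^2), inflating the standardized effect by 1/sqrt(1 - rho^2). The sketch below uses that approximation with arbitrary illustrative numbers; it is not the authors' derivation for the three-wave design.

```python
import numpy as np
from statsmodels.stats.power import TTestIndPower

d, rho, n = 0.4, 0.6, 60            # effect size, pretest-posttest r, n/group
analysis = TTestIndPower()
posttest_only = analysis.power(effect_size=d, nobs1=n, alpha=0.05)
# Approximate ANCOVA power: the covariate removes rho^2 of the error variance.
ancova = analysis.power(effect_size=d / np.sqrt(1 - rho**2), nobs1=n, alpha=0.05)
print(round(posttest_only, 2), round(ancova, 2))
```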
1989-03-01
Approaches include statistical energy analysis, the finite element method, and the power flow method. Experimental solutions are the most common in the literature, though accuracy suffers from the added weights and inertias of the transducers attached to an experimental structure. Statistical energy analysis (SEA) is a computational method. References include "Analysis and Diagnosis," Journal of Sound and Vibration, Vol. 115, No. 3, pp. 405-422 (1987), and Lyon, R.L., Statistical Energy Analysis of Dynamical Systems.
A shift from significance test to hypothesis test through power analysis in medical research.
Singh, G
2006-01-01
Medical research literature, until recently, exhibited substantial dominance of Fisher's significance test approach to statistical inference, which concentrates on the probability of type I error, over Neyman-Pearson's hypothesis test, which considers the probabilities of both type I and type II errors. Fisher's approach dichotomises results into significant or non-significant with a P value. The Neyman-Pearson approach talks of acceptance or rejection of the null hypothesis. Based on the same theory, the two approaches address the same objective and reach conclusions in their own ways. The advancement in computing techniques and the availability of statistical software have resulted in increasing application of power calculations in medical research, and thereby in reporting the results of significance tests in the light of the power of the test as well. The significance test approach, when it incorporates power analysis, contains the essence of the hypothesis test approach. It may be safely argued that the rising application of power analysis in medical research may have initiated a shift from Fisher's significance test to Neyman-Pearson's hypothesis test procedure.
Statistical analysis of cascading failures in power grids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chertkov, Michael; Pfitzner, Rene; Turitsyn, Konstantin
2010-12-01
We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete time and sequential resolution of individual failures. The model, in its simplest realization based on the Direct Current (DC) description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.
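The Direct Current (DC) approximation that the model's simplest realization rests on reduces power flow to a linear solve. Below is a toy three-bus sketch of that approximation (all susceptances and injections invented for illustration); the paper's cascading logic is not reproduced.

```python
import numpy as np

# Toy 3-bus DC power flow: bus 0 is the slack; B is the reduced
# susceptance-weighted Laplacian for buses 1 and 2 (per-unit, invented).
# Lines: 0-1 (b=10), 0-2 (b=15), 1-2 (b=10).
B = np.array([[20.0, -10.0],
              [-10.0, 25.0]])
P = np.array([1.5, -2.0])               # net injections at buses 1 and 2
theta = np.linalg.solve(B, P)           # bus voltage angles (radians)
flow_12 = 10.0 * (theta[0] - theta[1])  # line flow = b_ij * (theta_i - theta_j)
print(theta, round(flow_12, 3))
```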
Wu, Zheyang; Zhao, Hongyu
2012-01-01
For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.
Analysis of Loss-of-Offsite-Power Events 1997-2015
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Nancy Ellen; Schroeder, John Alton
2016-07-01
Loss of offsite power (LOOP) can have a major negative impact on a power plant’s ability to achieve and maintain safe shutdown conditions. LOOP event frequencies and times required for subsequent restoration of offsite power are important inputs to plant probabilistic risk assessments. This report presents a statistical and engineering analysis of LOOP frequencies and durations at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience during calendar years 1997 through 2015. LOOP events during critical operation that do not result in a reactor trip are not included. Frequencies and durations were determined for four event categories: plant-centered, switchyard-centered, grid-related, and weather-related. Emergency diesel generator reliability is also considered (failure to start, failure to load and run, and failure to run more than 1 hour). There is an adverse trend in LOOP durations. The previously reported adverse trend in LOOP frequency was not statistically significant for 2006-2015. Grid-related LOOPs happen predominantly in the summer. Switchyard-centered LOOPs happen predominantly in winter and spring. Plant-centered and weather-related LOOPs do not show statistically significant seasonality. The engineering analysis of LOOP data shows that human errors have been much less frequent since 1997 than in the 1986-1996 time period.
Wagner, Tyler; Irwin, Brian J.; James R. Bence,; Daniel B. Hayes,
2016-01-01
Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
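The low power of long-term trend detection referred to here is easy to demonstrate by simulation; the sketch below estimates the power of a simple linear regression to detect a modest monitoring trend, with all effect and noise values chosen arbitrarily.

```python
import numpy as np
from scipy import stats

def trend_power(slope, sigma, years, reps=2000, alpha=0.05, seed=7):
    # Fraction of simulated series in which the regression slope
    # is declared significant.
    rng = np.random.default_rng(seed)
    t = np.arange(years)
    hits = 0
    for _ in range(reps):
        y = slope * t + rng.normal(0.0, sigma, years)
        hits += stats.linregress(t, y).pvalue < alpha
    return hits / reps

for years in (5, 10, 20):
    print(years, trend_power(slope=0.05, sigma=0.5, years=years))
```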
Statistical Analysis of Large-Scale Structure of Universe
NASA Astrophysics Data System (ADS)
Tugay, A. V.
While galaxy cluster catalogs were compiled many decades ago, other structural elements of the cosmic web have been detected at a definite level only in the newest works. For example, extragalactic filaments have been described by velocity fields and the SDSS galaxy distribution in recent years. The large-scale structure of the Universe could also be mapped in the future using ATHENA observations in X-rays and SKA in the radio band. Until detailed observations become available for most of the volume of the Universe, some integral statistical parameters can be used for its description. Methods such as the galaxy correlation function, power spectrum, statistical moments and peak statistics are commonly used with this aim. The parameters of the power spectrum and other statistics are important for constraining models of dark matter, dark energy, inflation and brane cosmology. In the present work we describe the growth of large-scale density fluctuations in the one- and three-dimensional cases with Fourier harmonics of hydrodynamical parameters. As a result, we obtain a power-law relation for the matter power spectrum.
Colegrave, Nick
2017-01-01
A common approach to the analysis of experimental data across much of the biological sciences is test-qualified pooling. Here non-significant terms are dropped from a statistical model, effectively pooling the variation associated with each removed term with the error term used to test hypotheses (or estimate effect sizes). This pooling is only carried out if statistical testing on the basis of applying that data to a previous more complicated model provides motivation for this model simplification; hence the pooling is test-qualified. In pooling, the researcher increases the degrees of freedom of the error term with the aim of increasing statistical power to test their hypotheses of interest. Despite this approach being widely adopted and explicitly recommended by some of the most widely cited statistical textbooks aimed at biologists, here we argue that (except in highly specialized circumstances that we can identify) the hoped-for improvement in statistical power will be small or non-existent, and there is likely to be much reduced reliability of the statistical procedures through deviation of type I error rates from nominal levels. We thus call for greatly reduced use of test-qualified pooling across experimental biology, more careful justification of any use that continues, and a different philosophy for initial selection of statistical models in the light of this change in procedure. PMID:28330912
The Shock and Vibration Digest. Volume 15, Number 7
1983-07-01
An important analytical tool, the statistical energy analysis method, has been the subject of numerous studies. Entries include "Experimental Determination of Vibration Parameters Required in the Statistical Energy Analysis Method" (Dubowsky, S. and Morris, T.L.), "Coupling Loss Factors for Statistical Energy Analysis of Sound Transmission," and Upton, R., "Sound Intensity - A Powerful New Measurement Tool," S/V, Sound and Vibration.
Climate Considerations Of The Electricity Supply Systems In Industries
NASA Astrophysics Data System (ADS)
Asset, Khabdullin; Zauresh, Khabdullina
2014-12-01
The study focuses on the climate considerations of electricity supply systems in a pellet industry. The developed analysis model consists of two modules: a module for statistical evaluation of active power losses and a module for evaluation of climate aspects. The statistical data module is presented as a universal mathematical model of electrical systems and components of industrial load. It forms a basis for detailed accounting of power losses by voltage level. On the basis of the universal model, a set of programs is designed to perform the calculation and experimental research. It helps to obtain the statistical characteristics of the power losses and loads of the electricity supply systems and to define the nature of changes in these characteristics. Within the module, several methods and algorithms are developed for calculating parameters of equivalent circuits of low- and high-voltage ADC and SD with a massive smooth rotor with laminated poles. The climate aspects module includes an analysis of the experimental data of the power supply system in pellet production. It allows identification of GHG emission reduction parameters: operation hours, type of electrical motors, values of load factor, and deviation from the standard voltage value.
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
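One of the additions listed, power for a correlation test, reduces to the Fisher z-transform; the sketch below reproduces that textbook calculation in Python (it mimics the kind of computation G*Power performs, but is not G*Power's code).

```python
import numpy as np
from scipy.stats import norm

def corr_power(r, n, alpha=0.05):
    # Fisher z: atanh(r_hat) is approximately Normal(atanh(r), 1/(n - 3)).
    z_shift = np.arctanh(r) * np.sqrt(n - 3)
    zc = norm.ppf(1 - alpha / 2)
    return norm.sf(zc - z_shift) + norm.cdf(-zc - z_shift)

print(round(corr_power(r=0.3, n=84), 2))  # ~0.80
```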
Minică, Camelia C; Dolan, Conor V; Hottenga, Jouke-Jan; Willemsen, Gonneke; Vink, Jacqueline M; Boomsma, Dorret I
2013-05-01
When phenotypic, but no genotypic, data are available for relatives of participants in genetic association studies, previous research has shown that family-based imputed genotypes can boost the statistical power when included in such studies. Here, using simulations, we compared the performance of two statistical approaches suitable to model imputed genotype data: the mixture approach, which involves the full distribution of the imputed genotypes, and the dosage approach, where the mean of the conditional distribution features as the imputed genotype. Simulations were run by varying sibship size, the size of the phenotypic correlations among siblings, imputation accuracy and the minor allele frequency of the causal SNP. Furthermore, as imputing sibling data and extending the model to include sibships of size two or greater requires modeling the familial covariance matrix, we inquired whether model misspecification affects power. Finally, the results obtained via simulations were empirically verified in two datasets with continuous phenotype data (height) and with a dichotomous phenotype (smoking initiation). Across the settings considered, the mixture and the dosage approach are equally powerful and both produce unbiased parameter estimates. In addition, the likelihood-ratio test in the linear mixed model appears to be robust to the considered misspecification in the background covariance structure, given low to moderate phenotypic correlations among siblings. Empirical results show that the inclusion in association analysis of imputed sibling genotypes does not always result in a larger test statistic. The actual test statistic may drop in value due to small effect sizes. That is, if the power benefit is small, so that the shift in the distribution of the test statistic under the alternative is relatively small, the probability of obtaining a smaller test statistic is greater. As the genetic effects are typically hypothesized to be small, in practice, the decision on whether family-based imputation could be used as a means to increase power should be informed by prior power calculations and by consideration of the background correlation.
el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J
2007-09-24
In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi-square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has a frequency between 0.1 and 0.4 and a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has a frequency above 0.2 and the number of variants is above five.
Libiger, Ondrej; Schork, Nicholas J.
2015-01-01
It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061
Parametric and experimental analysis using a power flow approach
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1990-01-01
A structural power flow approach for the analysis of structure-borne transmission of vibrations is used to analyze the influence of structural parameters on transmitted power. The parametric analysis is also performed using the Statistical Energy Analysis approach, and the results are compared with those obtained using the power flow approach. The advantages of structural power flow analysis are demonstrated by comparing the type of results that are obtained by the two analytical methods. Also, to demonstrate that the power flow results represent a direct physical parameter that can be measured on a typical structure, an experimental study of structural power flow is presented. This experimental study presents results for an L-shaped beam for which a solution was already available. Various methods to measure vibrational power flow are compared to study their advantages and disadvantages.
Biotic indices have been used to assess biological condition by dividing index scores into condition categories. Historically the number of categories has been based on professional judgement. Alternatively, statistical methods such as power analysis can be used to determine the ...
Robustness of Type I Error and Power in Set Correlation Analysis of Contingency Tables.
ERIC Educational Resources Information Center
Cohen, Jacob; Nee, John C. M.
1990-01-01
The analysis of contingency tables via set correlation allows the assessment of subhypotheses involving contrast functions of the categories of the nominal scales. The robustness of such methods with regard to Type I error and statistical power was studied via a Monte Carlo experiment. (TJH)
Advanced statistical energy analysis
NASA Astrophysics Data System (ADS)
Heron, K. H.
1994-09-01
A high-frequency theory (advanced statistical energy analysis (ASEA)) is developed which takes account of the mechanism of tunnelling and uses a ray theory approach to track the power flowing around a plate or a beam network, and then uses statistical energy analysis (SEA) to take care of any residual power. ASEA divides the energy of each sub-system into energy that is freely available for transfer to other sub-systems and energy that is fixed within the sub-system. ASEA can handle sub-systems that are physically separate, and can be interpreted as a series of mathematical models, the first of which is identical to standard SEA; subsequent higher-order models converge on an accurate prediction. Using a structural assembly of six rods as an example, ASEA is shown to converge onto the exact results while SEA is shown to overpredict by up to 60 dB.
A statistical spatial power spectrum of the Earth's lithospheric magnetic field
NASA Astrophysics Data System (ADS)
Thébault, E.; Vervelidou, F.
2015-05-01
The magnetic field of the Earth's lithosphere arises from rock magnetization contrasts that were shaped over geological times. The field can be described mathematically in spherical harmonics or with distributions of magnetization. We exploit this dual representation and assume that the lithospheric field is induced by spatially varying susceptibility values within a shell of constant thickness. By introducing a statistical assumption about the power spectrum of the susceptibility, we then derive a statistical expression for the spatial power spectrum of the crustal magnetic field for spatial scales ranging from 60 to 2500 km. This expression depends on the mean induced magnetization, the thickness of the shell, and a power-law exponent for the power spectrum of the susceptibility. We test the relevance of this form with a misfit analysis against the observational NGDC-720 lithospheric magnetic field model power spectrum. This allows us to estimate a mean global apparent induced magnetization value between 0.3 and 0.6 A m⁻¹, a mean magnetic crustal thickness value between 23 and 30 km, and a root mean square field value between 190 and 205 nT, at the 95 per cent confidence level. These estimates are in good agreement with independent models of the crustal magnetization and of the seismic crustal thickness. We carry out the same analysis in the continental and oceanic domains separately. We complement the misfit analyses with a Kolmogorov-Smirnov goodness-of-fit test, and we conclude that the observed power spectrum can in each case be regarded as a sample of the statistical one.
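The observational spectra being fitted here are Lowes-Mauersberger spectra; the sketch below computes that standard quantity from Gauss coefficients (the input layout and the degree-1 values, of roughly IGRF magnitude, are illustrative assumptions).

```python
import numpy as np

def lowes_mauersberger(gauss_coeffs, a_over_r=1.0):
    # R_n = (n + 1) * (a/r)**(2n + 4) * sum_m (g_nm**2 + h_nm**2), in nT^2.
    # gauss_coeffs maps degree n to an array of all g_nm and h_nm of degree n.
    return {n: (n + 1) * a_over_r ** (2 * n + 4) * float(np.sum(np.square(c)))
            for n, c in gauss_coeffs.items()}

dipole = {1: np.array([-29404.0, -1451.0, 4653.0])}  # g10, g11, h11 (nT)
print(lowes_mauersberger(dipole))
```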
Webster, R J; Williams, A; Marchetti, F; Yauk, C L
2018-07-01
Mutations in germ cells pose potential genetic risks to offspring. However, de novo mutations are rare events that are spread across the genome and are difficult to detect. Thus, studies in this area have generally been under-powered, and no human germ cell mutagen has been identified. Whole Genome Sequencing (WGS) of human pedigrees has been proposed as an approach to overcome these technical and statistical challenges. WGS enables analysis of a much wider breadth of the genome than traditional approaches. Here, we performed power analyses to determine the feasibility of using WGS in human families to identify germ cell mutagens. Different statistical models were compared in the power analyses (ANOVA and multiple regression for one-child families, and mixed effect model sampling between two to four siblings per family). Assumptions were made based on parameters from the existing literature, such as the mutation-by-paternal age effect. We explored two scenarios: a constant effect due to an exposure that occurred in the past, and an accumulating effect where the exposure is continuing. Our analysis revealed the importance of modeling inter-family variability of the mutation-by-paternal age effect. Statistical power was improved by models accounting for the family-to-family variability. Our power analyses suggest that sufficient statistical power can be attained with 4-28 four-sibling families per treatment group, when the increase in mutations ranges from 40 to 10% respectively. Modeling family variability using mixed effect models provided a reduction in sample size compared to a multiple regression approach. Much larger sample sizes were required to detect an interaction effect between environmental exposures and paternal age. These findings inform study design and statistical modeling approaches to improve power and reduce sequencing costs for future studies in this area. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
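A stripped-down version of the mixed-model power analysis described can be run with statsmodels; the sketch below uses a Gaussian outcome, a random family intercept for the family-to-family variability, and wholly invented parameter values, so it illustrates the approach rather than the paper's actual models.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def family_power(effect, n_fam=20, sibs=4, sd_fam=1.0, sd_res=1.0,
                 reps=200, alpha=0.05, seed=8):
    # Monte Carlo power for an exposure effect on per-offspring mutation
    # counts, with exposed vs. control families and a family random intercept.
    rng = np.random.default_rng(seed)
    fam = np.repeat(np.arange(2 * n_fam), sibs)
    exposed = (fam < n_fam).astype(float)
    hits = 0
    for _ in range(reps):
        u = rng.normal(0.0, sd_fam, 2 * n_fam)[fam]   # family random effect
        y = 60.0 + effect * exposed + u + rng.normal(0.0, sd_res, len(fam))
        df = pd.DataFrame({"y": y, "exposed": exposed, "fam": fam})
        fit = smf.mixedlm("y ~ exposed", df, groups=df["fam"]).fit()
        hits += fit.pvalues["exposed"] < alpha
    return hits / reps

print(family_power(effect=1.0))
```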
NASA Astrophysics Data System (ADS)
Woolley, Thomas W.; Dawson, George O.
It has been two decades since the first power analysis of a psychological journal and 10 years since the Journal of Research in Science Teaching made its contribution to this debate. One purpose of this article is to investigate what power-related changes, if any, have occurred in science education research over the past decade as a result of the earlier survey. In addition, previous recommendations are expanded and expounded upon within the context of more recent work in this area. The absence of any consistent mode of presenting statistical results, as well as little change with regard to power-related issues are reported. Guidelines for reporting the minimal amount of information demanded for clear and independent evaluation of research results by readers are also proposed.
The influence of control group reproduction on the statistical ...
Because of various Congressional mandates to protect the environment from endocrine disrupting chemicals (EDCs), the United States Environmental Protection Agency (USEPA) initiated the Endocrine Disruptor Screening Program. In the context of this framework, the Office of Research and Development within the USEPA developed the Medaka Extended One Generation Reproduction Test (MEOGRT) to characterize the endocrine action of a suspected EDC. One important endpoint of the MEOGRT is the fecundity of breeding pairs of medaka. Power analyses were conducted to determine the number of replicates needed in proposed test designs and to determine the effects that varying reproductive parameters (e.g. mean fecundity, variance, and days with no egg production) will have on the statistical power of the test. A software tool, the MEOGRT Reproduction Power Analysis Tool, was developed to expedite these power analyses by both calculating estimates of the needed reproductive parameters (e.g. population mean and variance) and performing the power analysis under user-specified scenarios. The manuscript illustrates how the reproductive performance of the control medaka that are used in a MEOGRT influences statistical power, and therefore the successful implementation of the protocol. Example scenarios, based upon medaka reproduction data collected at MED, are discussed that bolster the recommendation that facilities planning to implement the MEOGRT should have a culture of medaka with hi
Automated Box-Cox Transformations for Improved Visual Encoding.
Maciejewski, Ross; Pattath, Avin; Ko, Sungahn; Hafen, Ryan; Cleveland, William S; Ebert, David S
2013-01-01
The concept of preconditioning data (utilizing a power transformation as an initial step) for analysis and visualization is well established within the statistical community and is employed as part of statistical modeling and analysis. Such transformations condition the data to various inherent assumptions of statistical inference procedures, as well as making the data more symmetric and easier to visualize and interpret. In this paper, we explore the use of the Box-Cox family of power transformations to semiautomatically adjust visual parameters. We focus on time-series scaling, axis transformations, and color binning for choropleth maps. We illustrate the usage of this transformation through various examples, and discuss the value and some issues in semiautomatically using these transformations for more effective data visualization.
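A Box-Cox preconditioning step of the kind described takes two lines with SciPy; the lognormal input below is an arbitrary stand-in for skewed data (for lognormal data the maximum-likelihood lambda lands near 0, i.e., a log transform).

```python
import numpy as np
from scipy import stats

skewed = np.random.default_rng(9).lognormal(mean=0.0, sigma=1.0, size=1000)
transformed, lam = stats.boxcox(skewed)  # lambda chosen by maximum likelihood
print(round(lam, 2),
      round(float(stats.skew(skewed)), 2),       # strongly right-skewed
      round(float(stats.skew(transformed)), 2))  # near-symmetric after transform
```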
Biostatistical analysis of quantitative immunofluorescence microscopy images.
Giles, C; Albrecht, M A; Lam, V; Takechi, R; Mamo, J C
2016-12-01
Semiquantitative immunofluorescence microscopy has become a key methodology in biomedical research. Typical statistical workflows are considered in the context of avoiding pseudo-replication and marginalising experimental error. However, immunofluorescence microscopy naturally generates hierarchically structured data that can be leveraged to improve statistical power and enrich biological interpretation. Herein, we describe a robust distribution fitting procedure and compare several statistical tests, outlining their potential advantages/disadvantages in the context of biological interpretation. Further, we describe tractable procedures for power analysis that incorporates the underlying distribution, sample size and number of images captured per sample. The procedures outlined have significant potential for increasing understanding of biological processes and decreasing both ethical and financial burden through experimental optimization. © 2016 The Authors Journal of Microscopy © 2016 Royal Microscopical Society.
Coupling strength assumption in statistical energy analysis
Lafont, T.; Totaro, N.
2017-01-01
This paper is a discussion of the hypothesis of weak coupling in statistical energy analysis (SEA). The examples of coupled oscillators and statistical ensembles of coupled plates excited by broadband random forces are discussed. In each case, a reference calculation is compared with the SEA calculation. First, it is shown that the main SEA relation, the coupling power proportionality, is always valid for two oscillators irrespective of the coupling strength. But the case of three subsystems, consisting of oscillators or ensembles of plates, indicates that the coupling power proportionality fails when the coupling is strong. Strong coupling leads to non-zero indirect coupling loss factors and, sometimes, even to a reversal of the energy flow direction from low to high vibrational temperature. PMID:28484335
Comparative efficacy of two battery-powered toothbrushes on dental plaque removal.
Ruhlman, C Douglas; Bartizek, Robert D; Biesbrock, Aaron R
2002-01-01
A number of clinical studies have consistently demonstrated that power toothbrushes deliver superior plaque removal compared to manual toothbrushes. Recently, a new power toothbrush (Crest SpinBrush) has been marketed with a design that fundamentally differs from other marketed power toothbrushes. Other power toothbrushes feature a small, round head designed to oscillate for enhanced cleaning between the teeth and below the gumline. The new power toothbrush incorporates a similar round oscillating head in conjunction with fixed bristles, which allows the user to brush with optimal manual brushing technique. The objective of this randomized, examiner-blind, parallel design study was to compare the plaque removal efficacy of a positive control power toothbrush (Colgate Actibrush) to an experimental toothbrush (Crest SpinBrush) following a single use among 59 subjects. Baseline plaque scores were 1.64 and 1.40 for the experimental toothbrush and control toothbrush treatment groups, respectively. With regard to all surfaces examined, the experimental toothbrush delivered an adjusted (via analysis of covariance) mean difference between baseline and post-brushing plaque scores of 0.47, while the control toothbrush delivered an adjusted mean difference of 0.33. On average, the difference between toothbrushes was statistically significant (p = 0.013). Because the covariate slope for the experimental group was statistically significantly greater (p = 0.001) than the slope for the control group, a separate slope model was used. Further analysis demonstrated that the experimental group had statistically significantly greater plaque removal than the control group for baseline plaque scores above 1.43. With respect to buccal surfaces, using a separate slope analysis of covariance, the experimental toothbrush delivered an adjusted mean difference between baseline and post-brushing plaque scores of 0.61, while the control toothbrush delivered an adjusted mean difference of 0.39. This difference between toothbrushes was also statistically significant (p = 0.002). On average, the results on lingual surfaces demonstrated similar directional scores favoring the experimental toothbrush; however these results did not achieve statistical significance. In conclusion, the experimental Crest SpinBrush, with its novel fixed and oscillating bristle design, was found to be more effective than the positive control Colgate Actibrush, which is designed with a small round oscillating cluster of bristles.
Separate-channel analysis of two-channel microarrays: recovering inter-spot information.
Smyth, Gordon K; Altman, Naomi S
2013-05-26
Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intra-spot correlation. A new separate-channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate-channel analyses that borrow strength between genes are more powerful than log-ratio analyses. The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
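As a concrete illustration of the M-value/A-value transformation the article builds on, here is a small base-R sketch; R and G are simulated background-corrected channel intensities (the authors' own implementation lives in the limma package, not reproduced here):

  set.seed(1)
  G <- 2^rnorm(1000, mean = 8)        # green-channel intensities (simulated)
  R <- G * 2^rnorm(1000, sd = 0.5)    # red channel: log-ratio noise around zero
  M <- log2(R) - log2(G)              # within-spot log-ratio
  A <- (log2(R) + log2(G)) / 2        # within-spot average log-expression
  # The traditional analysis models M only; separate-channel methods also use A,
  # whose usefulness depends on the intra-spot correlation:
  cor(log2(R), log2(G))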
Hou, Lin; Sun, Ning; Mane, Shrikant; Sayward, Fred; Rajeevan, Nallakkandi; Cheung, Kei-Hoi; Cho, Kelly; Pyarajan, Saiju; Aslan, Mihaela; Miller, Perry; Harvey, Philip D.; Gaziano, J. Michael; Concato, John; Zhao, Hongyu
2017-01-01
A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant’s DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/). PMID:28019059
The Use of Meta-Analytic Statistical Significance Testing
ERIC Educational Resources Information Center
Polanin, Joshua R.; Pigott, Terri D.
2015-01-01
Meta-analysis multiplicity, the concept of conducting multiple tests of statistical significance within one review, is an underdeveloped literature. We address this issue by considering how Type I errors can impact meta-analytic results, suggest how statistical power may be affected through the use of multiplicity corrections, and propose how…
Wave energy resource of Brazil: An analysis from 35 years of ERA-Interim reanalysis data
Espindola, Rafael Luz; Araújo, Alex Maurício
2017-01-01
This paper presents a characterization of the wave power resource and an analysis of the wave power output for three different wave energy converters (WEC) (AquaBuoy, Pelamis and Wave Dragon) over the Brazilian offshore. To do so, it used a 35-year reanalysis database from the ERA-Interim project. Annual and seasonal statistical analyses of significant height and energy period were performed, and the directional variability of the incident waves was evaluated. The wave power resource was characterized in terms of the statistical parameters of mean, maximum, 95th percentile and standard deviation, and in terms of the temporal variability coefficients COV, SV and MV. From these analyses, the total annual wave power resource available over the Brazilian offshore was estimated at 89.97 GW, with the largest mean wave power of 20.63 kW/m in the southernmost part of the study area. The analysis of the three WEC was based on the annual wave energy output and on the capacity factor. The highest capacity factor was 21.85%, for the Pelamis device at the southern region of the study area. PMID:28817731
Statistical testing and power analysis for brain-wide association study.
Gong, Weikang; Wan, Lin; Lu, Wenlian; Ma, Liang; Cheng, Fan; Cheng, Wei; Grünewald, Stefan; Feng, Jianfeng
2018-04-05
The identification of connexel-wise associations, which involves examining functional connectivities between pairwise voxels across the whole brain, is both statistically and computationally challenging. Although such a connexel-wise methodology has recently been adopted by brain-wide association studies (BWAS) to identify connectivity changes in several mental disorders, such as schizophrenia, autism and depression, multiple-comparison correction and power analysis methods designed specifically for connexel-wise analysis are still lacking. Therefore, we herein report the development of a rigorous statistical framework for connexel-wise significance testing based on Gaussian random field theory. It includes controlling the family-wise error rate (FWER) of multiple hypothesis tests using topological inference methods, and calculating power and sample size for a connexel-wise study. Our theoretical framework can control the false-positive rate accurately, as validated empirically using two resting-state fMRI datasets. Compared with Bonferroni correction and false discovery rate (FDR), it can reduce the false-positive rate and increase statistical power by appropriately utilizing the spatial information of fMRI data. Importantly, our method bypasses the need for non-parametric permutation to correct for multiple comparisons; thus, it can efficiently tackle large datasets with high-resolution fMRI images. The utility of our method is shown in a case-control study. Our approach can identify altered functional connectivities in a major depression disorder dataset, whereas existing methods fail. A software package is available at https://github.com/weikanggong/BWAS. Copyright © 2018 Elsevier B.V. All rights reserved.
Design and analysis of multiple diseases genome-wide association studies without controls.
Chen, Zhongxue; Huang, Hanwen; Ng, Hon Keung Tony
2012-11-15
In genome-wide association studies (GWAS), multiple diseases with shared controls is one of the case-control study designs. If data obtained from these studies are appropriately analyzed, this design can have several advantages, such as improving statistical power in detecting associations and reducing the time and cost of data collection. In this paper, we propose a study design for GWAS that involves multiple diseases but no controls, together with a corresponding statistical analysis strategy. Through a simulation study, we show that the statistical association test under the proposed design is more powerful than the test for a single disease sharing common controls, and it has comparable power to the overall test based on the whole dataset including the controls. We also apply the proposed method to a real GWAS dataset to illustrate the methodology and the advantages of the proposed design. Some possible limitations of this study design and testing method, and their solutions, are also discussed. Our findings indicate that the proposed study design and statistical analysis strategy could be more efficient than the usual case-control GWAS as well as those with shared controls. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Slaski, G.; Ohde, B.
2016-09-01
The article presents the results of a statistical dispersion analysis of the energy and power demand for tractive purposes of a battery electric vehicle. The authors compare data distributions for different values of average speed in two approaches, namely a short and a long period of observation. The short period of observation (generally around several hundred meters) results from a previously proposed macroscopic energy consumption model based on average speed per road section. This approach yielded high values of the standard deviation and of the coefficient of variation (the ratio between the standard deviation and the mean), around 0.7-1.2. The long period of observation (several kilometers long) is similar in length to the standardized speed cycles used in testing a vehicle's energy consumption and available range. The data were analysed to determine the impact of observation length on the variation of the energy and power demand. The analysis was based on a simulation of electric power and energy consumption performed with speed profile data recorded in the Poznan agglomeration.
Shadish, William R; Hedges, Larry V; Pustejovsky, James E
2014-04-01
This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Willis, Brian H; Riley, Richard D
2017-09-20
An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice; that is, does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity, where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
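The exact form of Vn is derived in the paper; the base-R sketch below only illustrates the leave-one-out idea on a fixed-effect inverse-variance meta-analysis, with simulated effect estimates yi and within-study variances vi (all values invented):

  set.seed(1)
  k  <- 12
  vi <- runif(k, 0.02, 0.1)                  # within-study variances (simulated)
  yi <- rnorm(k, mean = 0.3, sd = sqrt(vi))  # study effect estimates (simulated)
  loo <- sapply(seq_len(k), function(i) {
    w  <- 1 / vi[-i]
    theta_minus_i <- sum(w * yi[-i]) / sum(w)  # pooled estimate without study i
    se <- sqrt(1 / sum(w) + vi[i])             # predictive SE for the left-out study
    (yi[i] - theta_minus_i) / se               # standardized discrepancy
  })
  loo   # roughly N(0,1) under homogeneity; large values flag possible invalidity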
Improved Statistics for Genome-Wide Interaction Analysis
Ueki, Masao; Cordell, Heather J.
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670
NASA Technical Reports Server (NTRS)
Perry, Boyd, III; Pototzky, Anthony S.; Woods, Jessica A.
1989-01-01
The results of a NASA investigation of a claimed 'overlap' between two gust response analysis methods, the Statistical Discrete Gust (SDG) method and the Power Spectral Density (PSD) method, are presented. The claim is that the ratio of an SDG response to the corresponding PSD response is 10.4. Analytical results presented for several different airplanes at several different flight conditions indicate that such an 'overlap' does appear to exist. However, the claim was not met precisely: a scatter of up to about 10 percent about the 10.4 factor can be expected.
Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben
2017-09-15
Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power for identifying these variants with small effects. However, it is often the case that a research group can only get approval for access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available, and the sample sizes associated with these summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increase the statistical power of identifying risk variants and improve the accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over methods which take either individual-level data or summary statistics as input. We applied IGESS to perform an integrative analysis of Crohn's disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and to improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240,000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS. Contact: zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
NIRS-SPM: statistical parametric mapping for near infrared spectroscopy
NASA Astrophysics Data System (ADS)
Tak, Sungho; Jang, Kwang Eun; Jung, Jinwook; Jang, Jaeduck; Jeong, Yong; Ye, Jong Chul
2008-02-01
Even though there exists a powerful statistical parametric mapping (SPM) tool for fMRI, similar public domain tools are not available for near infrared spectroscopy (NIRS). In this paper, we describe a new public domain statistical toolbox called NIRS-SPM for the quantitative analysis of NIRS signals. Specifically, NIRS-SPM statistically analyzes the NIRS data using the GLM and makes inference based on the excursion probability of random fields interpolated from the sparse measurements. In order to obtain correct inference, NIRS-SPM offers the pre-coloring and pre-whitening methods for temporal correlation estimation. For simultaneous recording of NIRS signals with fMRI, the spatial mapping between the fMRI image and real coordinates from a 3-D digitizer is estimated using Horn's algorithm. These powerful tools allow super-resolution localization of brain activation, which is not possible using conventional NIRS analysis tools.
An overview of data acquisition, signal coding and data analysis techniques for MST radars
NASA Technical Reports Server (NTRS)
Rastogi, P. K.
1986-01-01
An overview is given of the data acquisition, signal processing, and data analysis techniques that are currently in use with high power MST/ST (mesosphere stratosphere troposphere/stratosphere troposphere) radars. This review supplements the works of Rastogi (1983) and Farley (1984) presented at previous MAP workshops. A general description is given of data acquisition and signal processing operations, which are characterized on the basis of their disparate time scales. Signal coding is then considered, with a brief description of frequently used codes and their limitations. Finally, several aspects of statistical data processing are discussed, such as signal statistics, power spectrum and autocovariance analysis, and outlier removal techniques.
Escalante, Yolanda; Saavedra, Jose M.; Tella, Victor; Mansilla, Mirella; García-Hermoso, Antonio; Dominguez, Ana M.
2012-01-01
The aims of this study were (i) to compare women's water polo game-related statistics by match outcome (winning and losing teams) and phase (preliminary, classificatory, and semi-final/bronze medal/gold medal), and (ii) to identify characteristics that discriminate performances for each phase. The game-related statistics of the 124 women's matches played in five International Championships (World and European Championships) were analyzed. Differences between winning and losing teams in each phase were determined using the chi-squared statistic. A discriminant analysis was then performed according to context in each of the three phases. It was found that the game-related statistics differentiate the winning from the losing teams in each phase of an international championship. The differentiating variables were both offensive (centre goals, power-play goals, counterattack goals, assists, offensive fouls, steals, blocked shots, and won sprints) and defensive (goalkeeper-blocked shots, goalkeeper-blocked inferiority shots, and goalkeeper-blocked 5-m shots). The discriminant analysis showed the game-related statistics to discriminate performance in all phases: preliminary, classificatory, and final phases (92%, 90%, and 83%, respectively). Two variables were discriminatory by match outcome (winning or losing teams) in all three phases: goals and goalkeeper-blocked shots. Key points: In the preliminary phase, more than one variable was involved in this differentiation, including both offensive and defensive aspects of the game. The game-related statistics were found to have a high discriminatory power in predicting the result of matches, with shots and goalkeeper-blocked shots being discriminatory variables in all three phases. Knowledge of the characteristics of the game-related statistics of winning women's water polo teams, and of their power to predict match outcomes, will allow coaches to take these characteristics into account when planning training and match preparation. PMID:24149356
1987-07-01
Measurement of vibrational power flow has been considered experimentally in the area of statistical energy analysis (SEA) using other measurement approaches. (Only reference fragments survive from the remainder of this entry, citing "...Constants in Statistical Energy Analysis of Structure," J. Acoust. Soc. Am., Vol. 52, No. 2, pp. 516-524, 1973, and Fahy, F. and R. Pierri, "Application of...".)
Ensor, Joie; Burke, Danielle L; Snell, Kym I E; Hemming, Karla; Riley, Richard D
2018-05-18
Researchers and funders should consider the statistical power of planned Individual Participant Data (IPD) meta-analysis projects, as they are often time-consuming and costly. We propose simulation-based power calculations utilising a two-stage framework, and illustrate the approach for a planned IPD meta-analysis of randomised trials with continuous outcomes where the aim is to identify treatment-covariate interactions. The simulation approach has four steps: (i) specify an underlying (data generating) statistical model for trials in the IPD meta-analysis; (ii) use readily available information (e.g. from publications) and prior knowledge (e.g. number of studies promising IPD) to specify model parameter values (e.g. control group mean, intervention effect, treatment-covariate interaction); (iii) simulate an IPD meta-analysis dataset of a particular size from the model, and apply a two-stage IPD meta-analysis to obtain the summary estimate of interest (e.g. interaction effect) and its associated p-value; (iv) repeat the previous step (e.g. thousands of times), then estimate the power to detect a genuine effect by the proportion of summary estimates with a significant p-value. In a planned IPD meta-analysis of lifestyle interventions to reduce weight gain in pregnancy, 14 trials (1183 patients) promised their IPD to examine a treatment-BMI interaction (i.e. whether baseline BMI modifies intervention effect on weight gain). Using our simulation-based approach, a two-stage IPD meta-analysis has < 60% power to detect a reduction of 1 kg weight gain for a 10-unit increase in BMI. Additional IPD from ten other published trials (containing 1761 patients) would improve power to over 80%, but only if a fixed-effect meta-analysis was appropriate. Pre-specified adjustment for prognostic factors would increase power further. Incorrect dichotomisation of BMI would reduce power by over 20%, similar to immediately throwing away IPD from ten trials. Simulation-based power calculations could inform the planning and funding of IPD projects, and should be used routinely.
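A compact R sketch of the four-step simulation approach described above, for a deliberately simplified case (5 trials, continuous outcome, fixed-effect second stage); all parameter values and names (n_trials, interaction, etc.) are invented for illustration, not taken from the cited project:

  set.seed(1)
  power_sim <- function(nsim = 500, n_trials = 5, n_per = 100, interaction = -0.1) {
    sig <- logical(nsim)
    for (s in seq_len(nsim)) {
      est <- se <- numeric(n_trials)
      for (j in seq_len(n_trials)) {               # step (iii), stage 1: per-trial fits
        bmi <- rnorm(n_per, 30, 4)
        trt <- rbinom(n_per, 1, 0.5)
        y   <- 10 - 1 * trt + interaction * trt * (bmi - 30) + rnorm(n_per, 0, 4)
        fit <- lm(y ~ trt * I(bmi - 30))
        est[j] <- coef(fit)["trt:I(bmi - 30)"]
        se[j]  <- sqrt(vcov(fit)["trt:I(bmi - 30)", "trt:I(bmi - 30)"])
      }
      w <- 1 / se^2                                # stage 2: fixed-effect pooling
      z <- (sum(w * est) / sum(w)) / sqrt(1 / sum(w))
      sig[s] <- abs(z) > 1.96
    }
    mean(sig)                                      # step (iv): estimated power
  }
  power_sim()

Swapping the pooling step for a random-effects estimator, or dichotomising the covariate, lets one reproduce the kinds of power losses the abstract reports.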
Multi-reader ROC studies with split-plot designs: a comparison of statistical methods.
Obuchowski, Nancy A; Gallas, Brandon D; Hillis, Stephen L
2012-12-01
Multireader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of this design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper, the authors compare three methods of analysis for the split-plot design. Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean analysis-of-variance approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power, and confidence interval coverage of the three test statistics. The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% confidence intervals falls close to the nominal coverage for small and large sample sizes. The split-plot multireader, multicase study design can be statistically efficient compared to the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rates, similar power, and nominal confidence interval coverage, are available for this study design. Copyright © 2012 AUR. All rights reserved.
Chládek, J; Brázdil, M; Halámek, J; Plešinger, F; Jurák, P
2013-01-01
We present an off-line analysis procedure for exploring brain activity recorded from intra-cerebral electroencephalographic data (SEEG). The objective is to determine the statistical differences between different types of stimulations in the time-frequency domain. The procedure is based on computing relative signal power change and subsequent statistical analysis. An example of characteristic statistically significant event-related de/synchronization (ERD/ERS) detected across different frequency bands following different oddball stimuli is presented. The method is used for off-line functional classification of different brain areas.
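The relative power change at the heart of such an ERD/ERS analysis is simple to state; here is a base-R sketch on a simulated single-channel signal (the sampling rate, band choice, and all names are ours, not the authors'):

  set.seed(1)
  fs <- 512                                  # sampling rate in Hz (assumed)
  x  <- rnorm(2 * fs)                        # 1 s baseline + 1 s post-stimulus (simulated)
  baseline <- x[1:fs]; post <- x[(fs + 1):(2 * fs)]
  bandpower <- function(sig, lo, hi) {
    sp <- spec.pgram(ts(sig, frequency = fs), taper = 0, plot = FALSE)
    sum(sp$spec[sp$freq >= lo & sp$freq <= hi])
  }
  p0 <- bandpower(baseline, 8, 12)           # alpha-band power, baseline
  p1 <- bandpower(post, 8, 12)               # alpha-band power, post-stimulus
  100 * (p1 - p0) / p0                       # relative change in percent (negative = ERD)

The statistical step then compares such relative changes across trials and stimulus types, band by band.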
Beneath the Skin: Statistics, Trust, and Status
ERIC Educational Resources Information Center
Smith, Richard
2011-01-01
Overreliance on statistics, and even faith in them--which Richard Smith in this essay calls a branch of "metricophilia"--is a common feature of research in education and in the social sciences more generally. Of course accurate statistics are important, but they often constitute essentially a powerful form of rhetoric. For purposes of analysis and…
Using R-Project for Free Statistical Analysis in Extension Research
ERIC Educational Resources Information Center
Mangiafico, Salvatore S.
2013-01-01
One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…
Irvine, Kathryn M.; Manlove, Kezia; Hollimon, Cynthia
2012-01-01
An important consideration for long term monitoring programs is determining the required sampling effort to detect trends in specific ecological indicators of interest. To enhance the Greater Yellowstone Inventory and Monitoring Network’s water resources protocol(s) (O’Ney 2006 and O’Ney et al. 2009 [under review]), we developed a set of tools to: (1) determine the statistical power for detecting trends of varying magnitude in a specified water quality parameter over different lengths of sampling (years) and different within-year collection frequencies (monthly or seasonal sampling) at particular locations using historical data, and (2) perform periodic trend analyses for water quality parameters while addressing seasonality and flow weighting. A power analysis for trend detection is a statistical procedure used to estimate the probability of rejecting the hypothesis of no trend when in fact there is a trend, within a specific modeling framework. In this report, we base our power estimates on using the seasonal Kendall test (Helsel and Hirsch 2002) for detecting trend in water quality parameters measured at fixed locations over multiple years. We also present procedures (R-scripts) for conducting a periodic trend analysis using the seasonal Kendall test with and without flow adjustment. This report provides the R-scripts developed for power and trend analysis, tutorials, and the associated tables and graphs. The purpose of this report is to provide practical information for monitoring network staff on how to use these statistical tools for water quality monitoring data sets.
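We do not have the report's R scripts; the base-R sketch below only illustrates the power calculation it describes: simulate monthly data with a known trend, apply a hand-rolled seasonal Kendall test (normal approximation, no ties, no flow adjustment), and count rejections:

  set.seed(1)
  sk_test <- function(y, year, season) {
    S <- V <- 0
    for (m in unique(season)) {              # Mann-Kendall S within each season
      ys <- y[season == m][order(year[season == m])]
      n  <- length(ys)
      S  <- S + sum(sign(outer(ys, ys, "-"))[lower.tri(diag(n))])
      V  <- V + n * (n - 1) * (2 * n + 5) / 18
    }
    z <- (S - sign(S)) / sqrt(V)             # continuity-corrected z statistic
    2 * pnorm(-abs(z))
  }
  power <- mean(replicate(1000, {
    yrs <- rep(1:10, each = 12); mon <- rep(1:12, times = 10)
    y <- 0.05 * yrs + sin(2 * pi * mon / 12) + rnorm(120)  # trend + seasonality
    sk_test(y, yrs, mon) < 0.05
  }))
  power   # estimated probability of detecting the 0.05-per-year trend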
Ma, Junshui; Wang, Shubing; Raubertas, Richard; Svetnik, Vladimir
2010-07-15
With the increasing popularity of using electroencephalography (EEG) to reveal the treatment effect in drug development clinical trials, the vast volume and complex nature of EEG data compose an intriguing, but challenging, topic. In this paper the statistical analysis methods recommended by the EEG community, along with methods frequently used in the published literature, are first reviewed. A straightforward adjustment of the existing methods to handle multichannel EEG data is then introduced. In addition, based on the spatial smoothness property of EEG data, a new category of statistical methods is proposed. The new methods use a linear combination of low-degree spherical harmonic (SPHARM) basis functions to represent a spatially smoothed version of the EEG data on the scalp, which is close to a sphere in shape. In total, seven statistical methods, including both the existing and the newly proposed methods, are applied to two clinical datasets to compare their power to detect a drug effect. Contrary to the EEG community's recommendation, our results suggest that (1) the nonparametric method does not outperform its parametric counterpart; and (2) including baseline data in the analysis does not always improve the statistical power. In addition, our results recommend that (3) simple paired statistical tests should be avoided due to their poor power; and (4) the proposed spatially smoothed methods perform better than their unsmoothed versions. Copyright 2010 Elsevier B.V. All rights reserved.
Multiplicative point process as a model of trading activity
NASA Astrophysics Data System (ADS)
Gontis, V.; Kaulakys, B.
2004-11-01
Signals consisting of a sequence of pulses show that the inherent origin of 1/f noise is a Brownian fluctuation of the average interevent time between subsequent pulses of the pulse sequence. In this paper, we generalize the model of interevent time to reproduce a variety of self-affine time series exhibiting power spectral density S(f) scaling as a power of the frequency f. Furthermore, we analyze the relation between the power-law correlations and the origin of the power-law probability distribution of the signal intensity. We introduce a stochastic multiplicative model for the time intervals between point events and analyze the statistical properties of the signal analytically and numerically. Such a model system exhibits power-law spectral density S(f) ∼ 1/f^β for various values of β, including β = 1/2, 1 and 3/2. Explicit expressions for the power spectra in the low-frequency limit and for the distribution density of the interevent time are obtained. The counting statistics of the events are analyzed analytically and numerically as well. The specific interest of our analysis relates to the financial markets, where long-range correlations of price fluctuations largely depend on the number of transactions. We analyze the spectral density and counting statistics of the number of transactions. The model reproduces the spectral properties of real markets and explains the mechanism behind the power-law distribution of trading activity. The study provides evidence that the statistical properties of the financial markets are embodied in the statistics of the time interval between trades. A multiplicative point process serves as a consistent model generating these statistics.
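A rough numerical sketch of a multiplicative interevent-time recursion of this general type; the parameter values, bounds, and binning are our own choices for illustration, not the authors' specification:

  set.seed(1)
  n <- 2e5; mu <- 0.5; gamma <- 1e-4; sigma <- 0.02   # illustrative parameters
  tau <- numeric(n); tau[1] <- 1
  for (k in 1:(n - 1)) {                    # multiplicative stochastic recursion
    tau[k + 1] <- tau[k] + gamma * tau[k]^(2 * mu - 1) + sigma * tau[k]^mu * rnorm(1)
    tau[k + 1] <- min(max(tau[k + 1], 1e-3), 10)      # crude reflecting bounds
  }
  t_events <- cumsum(tau)                   # event (e.g. trade) times
  counts <- tabulate(findInterval(t_events, seq(0, max(t_events), by = 10)))
  sp <- spectrum(counts, plot = FALSE)      # spectrum of the event-count series
  coef(lm(log(sp$spec) ~ log(sp$freq)))     # log-log slope approximates -beta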
Modeling and replicating statistical topology and evidence for CMB nonhomogeneity
Agami, Sarit
2017-01-01
Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301
NASA Technical Reports Server (NTRS)
Zimmerman, G. A.; Olsen, E. T.
1992-01-01
Noise power estimation in the High-Resolution Microwave Survey (HRMS) sky survey element is considered as an example of a constant false alarm rate (CFAR) signal detection problem. Order-statistic-based noise power estimators for CFAR detection are considered in terms of required estimator accuracy and estimator dynamic range. By limiting the dynamic range of the value to be estimated, the performance of an order-statistic estimator can be achieved by simpler techniques requiring only a single pass of the data. Simple threshold-and-count techniques are examined, and it is shown how several parallel threshold-and-count estimation devices can be used to expand the dynamic range to meet HRMS system requirements with minimal hardware complexity. An input/output (I/O) efficient limited-precision order-statistic estimator with wide but limited dynamic range is also examined.
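A small base-R illustration of the contrast drawn above, on simulated exponentially distributed spectral power samples: a quantile (order-statistic) noise estimate versus a single-pass threshold-and-count estimate. The candidate levels and contamination are our inventions, not HRMS parameters:

  set.seed(1)
  x <- rexp(4096)                      # simulated noise-power bins (true mean 1)
  x[sample(4096, 5)] <- 50             # a few strong "signal" bins
  noise_os <- median(x) / log(2)       # order statistic: exponential median -> mean
  # Single-pass threshold-and-count: pick the candidate level whose exceedance
  # count best matches the expected 50% exceedance of the exponential median.
  levels <- c(0.5, 0.7, 1.0, 1.4, 2.0)            # candidate noise-power levels
  cnt <- sapply(levels, function(L) mean(x > log(2) * L))
  noise_tc <- levels[which.min(abs(cnt - 0.5))]
  c(order_statistic = noise_os, threshold_count = noise_tc)

Running several such threshold-and-count devices in parallel, each covering a different range of candidate levels, is one way to widen the dynamic range cheaply, as the abstract suggests.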
Power Analysis for Complex Mediational Designs Using Monte Carlo Methods
ERIC Educational Resources Information Center
Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.
2010-01-01
Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex…
Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beggs, W.J.
1981-02-01
This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; the analysis of variance; quality control procedures; and linear regression analysis.
Properties of different selection signature statistics and a new strategy for combining them.
Ma, Y; Ding, X; Qanbari, S; Weigend, S; Zhang, Q; Simianer, H
2015-11-01
Identifying signatures of recent or ongoing selection is of high relevance in livestock population genomics. From a statistical perspective, determining a proper testing procedure and combining various test statistics is challenging. On the basis of extensive simulations in this study, we discuss the statistical properties of eight different established selection signature statistics. In the considered scenario, we show that a reasonable power to detect selection signatures is achieved with high marker density (>1 SNP/kb) as obtained from sequencing, while rather small sample sizes (~15 diploid individuals) appear to be sufficient. Most selection signature statistics, such as the composite likelihood ratio and cross-population extended haplotype homozygosity, have the highest power when fixation of the selected allele is reached, while the integrated haplotype score has the highest power when selection is ongoing. We suggest a novel strategy, called de-correlated composite of multiple signals (DCMS), to combine different statistics for detecting selection signatures while accounting for the correlation between the different selection signature statistics. When examined with simulated data, DCMS consistently has a higher power than most of the single statistics and shows a reliable positional resolution. We illustrate the new statistic on the established selective sweep around the lactase gene in human HapMap data, providing further evidence of the reliability of this new statistic. We then apply it to scan for selection signatures in two chicken samples with diverse skin color. Our analysis suggests that a set of well-known genes such as BCO2, MC1R, ASIP and TYR were involved in the divergent selection for this trait.
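The following base-R sketch shows the de-correlation idea as we read it: each statistic's evidence is down-weighted by its total correlation with the other statistics. The data are simulated and the exact DCMS weighting in the paper may differ in detail:

  set.seed(1)
  p <- matrix(runif(5000), ncol = 5)       # p-values of 5 signature statistics (simulated)
  colnames(p) <- c("CLR", "XPEHH", "iHS", "Fst", "Tajima")
  r <- cor(qnorm(p))                       # correlation among the statistics
  w <- 1 / rowSums(abs(r))                 # de-correlation weights
  dcms <- as.vector(log10((1 - p) / p) %*% w)   # weighted composite per locus
  head(order(dcms, decreasing = TRUE))     # loci with the strongest combined signal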
A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.
Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L
2014-01-01
We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.
BrightStat.com: free statistics online.
Stricker, Daniel
2008-10-01
Powerful software for statistical analysis is expensive. Here I present BrightStat, a statistical software application running on the Internet which is free of charge. BrightStat's goals and its main capabilities and functionalities are outlined. Three different sample runs, a Friedman test, a chi-square test, and a step-wise multiple regression, are presented. The results obtained by BrightStat are compared with results computed by SPSS, one of the global leaders in statistical software, and VassarStats, a collection of scripts for data analysis running on the Internet. Elementary statistics is an inherent part of academic education, and BrightStat is an alternative to commercial products.
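For comparison, the three sample runs reported above have direct counterparts in base R; the data here are invented placeholders:

  set.seed(1)
  scores <- matrix(rnorm(30), ncol = 3)     # 10 subjects x 3 conditions (simulated)
  friedman.test(scores)                      # Friedman test
  tab <- matrix(c(20, 10, 15, 25), nrow = 2) # 2 x 2 contingency table (invented)
  chisq.test(tab)                            # chi-square test
  d <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
  step(lm(y ~ 1, data = d),                  # step-wise multiple regression
       scope = ~ x1 + x2 + x3, direction = "both")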
Parametric and experimental analysis using a power flow approach
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1988-01-01
Having defined and developed a structural power flow approach for the analysis of structure-borne transmission of structural vibrations, we use the technique to analyze the influence of structural parameters on the transmitted energy. As a basis for comparison, the parametric analysis is first performed using a Statistical Energy Analysis approach and the results are compared with those obtained using the power flow approach. The advantages of using structural power flow are thus demonstrated by comparing the type of results obtained by the two methods. Additionally, to demonstrate the advantages of using the power flow method and to show that the power flow results represent a direct physical parameter that can be measured on a typical structure, an experimental investigation of structural power flow is also presented. Results are presented for an L-shaped beam for which an analytical solution has already been obtained. Furthermore, the various methods available to measure vibrational power flow are compared to investigate the advantages and disadvantages of each method.
Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.
Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao
2015-08-01
Meta-analysis of genetic data must account for differences among studies, including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis combining the data of multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and the power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT, and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.
Vajawat, Mayuri; Deepika, P. C.; Kumar, Vijay; Rajeshwari, P.
2015-01-01
Aim: To compare the efficacy of powered toothbrushes in improving gingival health and reducing salivary red complex counts, as compared to manual toothbrushes, among autistic individuals. Materials and Methods: Forty autistic individuals were selected. The test group received powered toothbrushes, and the control group received manual toothbrushes. The plaque index and gingival index were recorded. Unstimulated saliva was collected for analysis of red complex organisms using polymerase chain reaction. Results: A statistically significant reduction in plaque scores was seen over a period of 12 weeks in both groups (P < 0.001 for tests and P = 0.002 for controls). This reduction was more significant in the test group (P = 0.024). A statistically significant reduction in gingival scores was seen over a period of 12 weeks in both groups (P < 0.001 for tests and P = 0.001 for controls). This reduction was more significant in the test group (P = 0.042). No statistically significant reduction in the detection rate of red complex organisms was seen at 4 weeks in either group. Conclusion: Powered toothbrushes result in a significant overall improvement in gingival health when constant reinforcement of oral hygiene instructions is given. PMID:26681855
A powerful approach for association analysis incorporating imprinting effects
Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam
2011-01-01
Motivation: For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to the 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Currently, increasing interest in the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. Results: In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics enjoy the nature of family-based designs in that they need no assumption of Hardy-Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by applying the proposed test statistics to a rheumatoid arthritis dataset. Contact: wingfung@hku.hk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21798962
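For orientation, the classical TDT that these statistics extend is a McNemar-type test on transmission counts; a base-R sketch with invented counts (the paper's imprinting-aware statistics are considerably more involved):

  trans_A <- 48   # heterozygous parents transmitting allele A to the affected child
  trans_a <- 26   # heterozygous parents transmitting the other allele
  tdt <- (trans_A - trans_a)^2 / (trans_A + trans_a)   # TDT chi-square, 1 df
  pchisq(tdt, df = 1, lower.tail = FALSE)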
Conducting Simulation Studies in the R Programming Environment.
Hallgren, Kevin A
2013-10-12
Simulation studies allow researchers to answer specific questions about data analysis, statistical power, and best-practices for obtaining accurate results in empirical research. Despite the benefits that simulation research can provide, many researchers are unfamiliar with available tools for conducting their own simulation studies. The use of simulation studies need not be restricted to researchers with advanced skills in statistics and computer programming, and such methods can be implemented by researchers with a variety of abilities and interests. The present paper provides an introduction to methods used for running simulation studies using the R statistical programming environment and is written for individuals with minimal experience running simulation studies or using R. The paper describes the rationale and benefits of using simulations and introduces R functions relevant for many simulation studies. Three examples illustrate different applications for simulation studies, including (a) the use of simulations to answer a novel question about statistical analysis, (b) the use of simulations to estimate statistical power, and (c) the use of simulations to obtain confidence intervals of parameter estimates through bootstrapping. Results and fully annotated syntax from these examples are provided.
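In the spirit of the paper's third example, here is a minimal base-R bootstrap of a confidence interval for a correlation coefficient, with simulated data:

  set.seed(1)
  n <- 50
  x <- rnorm(n); y <- 0.4 * x + rnorm(n)     # simulated paired data
  boot_r <- replicate(2000, {
    i <- sample(n, replace = TRUE)           # resample cases with replacement
    cor(x[i], y[i])
  })
  quantile(boot_r, c(0.025, 0.975))          # percentile bootstrap 95% CI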
NASA Astrophysics Data System (ADS)
Calì, M.; Santarelli, M. G. L.; Leone, P.
Gas Turbine Technologies (GTT) and Politecnico di Torino, both located in Torino (Italy), have been involved in the design and installation of a SOFC laboratory in order to analyse the operation, in cogenerative configuration, of the CHP 100 kWe SOFC Field Unit, built by Siemens-Westinghouse Power Corporation (SWPC), which is at present (May 2005) starting its operation and which will supply electric and thermal power to the GTT factory. In order to take best advantage of the analysis of the on-site operation, and especially to correctly design the scheduled experimental tests on the system, we developed a mathematical model and ran a simulated experimental campaign, applying a rigorous statistical approach to the analysis of the results. The aim of this work is the computer experimental analysis, through a statistical methodology (2^k factorial experiments), of the CHP 100 performance. First, the mathematical model was calibrated with the results acquired during the first CHP 100 demonstration at EDB/ELSAM in Westerwoort. Then, the simulated tests were performed in the form of a computer experimental session, and the measurement uncertainties were simulated with perturbations imposed on the model's independent variables. The statistical methodology used for the computer experimental analysis is factorial design (Yates' technique): using the ANOVA technique, the effect of the main independent variables (air utilization factor Uox, fuel utilization factor UF, internal fuel and air preheating, and anodic recycling flow rate) has been investigated in a rigorous manner. The analysis accounts for the effects of the parameters on stack electric power, recovered thermal power, single-cell voltage, cell operating temperature, consumed fuel flow and steam-to-carbon ratio. Each main effect and interaction effect of the parameters is shown, with particular attention to the generated electric power and the stack heat recovered.
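A sketch of the Yates-style 2^k analysis in R, with the study's four factors coded at two levels each; the response values are simulated stand-ins for the model outputs, and Uox, UF, preheat, recycle are our shorthand names:

  set.seed(1)
  d <- expand.grid(Uox = c(-1, 1), UF = c(-1, 1),
                   preheat = c(-1, 1), recycle = c(-1, 1))   # 2^4 design
  d <- d[rep(1:16, each = 2), ]               # two replicated runs per cell
  d$power_kW <- 100 + 5 * d$UF - 3 * d$Uox +
                2 * d$UF * d$Uox + rnorm(nrow(d), 0, 1)      # simulated response
  fit <- aov(power_kW ~ Uox * UF * preheat * recycle, data = d)
  summary(fit)                                # ANOVA: main and interaction effects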
Power analysis on the time effect for the longitudinal Rasch model.
Feddag, M L; Blanchin, M; Hardouin, J B; Sebille, V
2014-01-01
The statistics literature in the social, behavioral, and biomedical sciences typically stresses the importance of power analysis. Patient Reported Outcomes (PRO) such as quality of life and other perceived health measures (pain, fatigue, stress, ...) are increasingly used as important health outcomes in clinical trials and in epidemiological studies. They cannot be directly observed or measured like other clinical or biological data, and they are often collected through questionnaires with binary or polytomous items. The Rasch model is the well-known model in item response theory (IRT) for binary data. This article proposes an approach to evaluate the statistical power of the time effect for the longitudinal Rasch model with two time points. The performance of this method is compared to that obtained by a simulation study. Finally, the proposed approach is illustrated on one subscale of the SF-36 questionnaire.
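A brute-force check of such a power computation can be run by simulation. The sketch below assumes the lme4 package and uses a GLMM with a person random effect, which is only an approximation to the approach the article develops; all parameter values are invented:

  library(lme4)
  set.seed(1)
  sim_once <- function(n = 100, J = 5, delta = 0.4) {
    beta  <- seq(-1, 1, length.out = J)            # item difficulties (assumed)
    theta <- rnorm(n)                              # person abilities
    d <- expand.grid(id = 1:n, item = 1:J, time = 0:1)
    eta <- theta[d$id] - beta[d$item] + delta * d$time
    d$y <- rbinom(nrow(d), 1, plogis(eta))         # Rasch-type binary responses
    fit <- glmer(y ~ time + factor(item) + (1 | id), data = d, family = binomial)
    coef(summary(fit))["time", "Pr(>|z|)"]
  }
  # Empirical power for a time effect of delta = 0.4 (slow; reduce reps for a quick check)
  mean(replicate(100, sim_once()) < 0.05)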
A peaking-regulation-balance-based method for wind & PV power integrated accommodation
NASA Astrophysics Data System (ADS)
Zhang, Jinfang; Li, Nan; Liu, Jun
2018-02-01
The rapid current and future development of new energy in China should focus on the coordinated use of wind and PV power. Based on an analysis of the system peaking balance, combined with the statistical features of wind and PV power output characteristics, a method for the comprehensive integrated accommodation analysis of wind and PV power is put forward. Wind power installed capacity is first determined from the electric power balance during the night peak-load period of a typical day; PV installed capacity is then derived from the midday peak-load hours. This effectively resolves the uncertainty that makes it hard for traditional methods to determine the wind and solar capacity combination simultaneously. Simulation results validate the effectiveness of the proposed method.
Chung, Dongjun; Kim, Hang J; Zhao, Hongyu
2017-02-01
Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, providing clinical and medical benefits to patients through novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging, as these diseases are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper, we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on a smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.
USDA-ARS?s Scientific Manuscript database
Characterizing population genetic structure across geographic space is a fundamental challenge in population genetics. Multivariate statistical analyses are powerful tools for summarizing genetic variability, but geographic information and accompanying metadata is not always easily integrated into t...
NASA Astrophysics Data System (ADS)
Simatos, N.; Perivolaropoulos, L.
2001-01-01
We use the publicly available code CMBFAST, as modified by Pogosian and Vachaspati, to simulate the effects of wiggly cosmic strings on the cosmic microwave background (CMB). Using the modified CMBFAST code, which takes into account vector modes and models wiggly cosmic strings by the one-scale model, we go beyond the angular power spectrum to construct CMB temperature maps with a resolution of a few degrees. The statistics of these maps are then studied using conventional and recently proposed statistical tests optimized for the detection of hidden temperature discontinuities induced by the Gott-Kaiser-Stebbins effect. We show, however, that these realistic maps cannot be distinguished in a statistically significant way from purely Gaussian maps with an identical power spectrum.
Statistical Performances of Resistive Active Power Splitter
NASA Astrophysics Data System (ADS)
Lalléchère, Sébastien; Ravelo, Blaise; Thakur, Atul
2016-03-01
In this paper, the synthesis and sensitivity analysis of an active power splitter (PWS) are proposed. The splitter is based on an active cell composed of a Field Effect Transistor in cascade with a shunted resistor at the input and the output (resistive amplifier topology). The PWS uncertainty with respect to resistance tolerances is assessed using a stochastic method. Furthermore, with the proposed topology, the device gain can be controlled easily by varying a resistance. This provides a useful tool for analysing the statistical sensitivity of the system in an uncertain environment.
GLIMMPSE Lite: Calculating Power and Sample Size on Smartphone Devices
Munjal, Aarti; Sakhadeo, Uttara R.; Muller, Keith E.; Glueck, Deborah H.; Kreidler, Sarah M.
2014-01-01
Researchers seeking to develop complex statistical applications for mobile devices face a common set of difficult implementation issues. In this work, we discuss general solutions to the design challenges. We demonstrate the utility of the solutions for a free mobile application designed to provide power and sample size calculations for univariate, one-way analysis of variance (ANOVA), GLIMMPSE Lite. Our design decisions provide a guide for other scientists seeking to produce statistical software for mobile platforms. PMID:25541688
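Base R already exposes the kind of univariate one-way ANOVA power calculation that GLIMMPSE Lite provides; a minimal sketch, with group count and variance components chosen purely for illustration:

# Sample size per group for one-way ANOVA at 80% power (illustrative inputs)
power.anova.test(groups = 4,        # number of groups (assumed)
                 between.var = 1,   # between-group variance (assumed)
                 within.var = 9,    # within-group variance (assumed)
                 power = 0.80)      # target power; the missing n is solved for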
The power and promise of RNA-seq in ecology and evolution.
Todd, Erica V; Black, Michael A; Gemmell, Neil J
2016-03-01
Reference is regularly made to the power of new genomic sequencing approaches. Using powerful technology, however, is not the same as having the necessary power to address a research question with statistical robustness. In the rush to adopt new and improved genomic research methods, limitations of technology and experimental design may be initially neglected. Here, we review these issues with regard to RNA sequencing (RNA-seq). RNA-seq adds large-scale transcriptomics to the toolkit of ecological and evolutionary biologists, enabling differential gene expression (DE) studies in nonmodel species without the need for prior genomic resources. High biological variance is typical of field-based gene expression studies and means that larger sample sizes are often needed to achieve the same degree of statistical power as clinical studies based on data from cell lines or inbred animal models. Sequencing costs have plummeted, yet RNA-seq studies still underutilize biological replication. Finite research budgets force a trade-off between sequencing effort and replication in RNA-seq experimental design. However, clear guidelines for negotiating this trade-off, while taking into account study-specific factors affecting power, are currently lacking. Study designs that prioritize sequencing depth over replication fail to capitalize on the power of RNA-seq technology for DE inference. Significant recent research effort has gone into developing statistical frameworks and software tools for power analysis and sample size calculation in the context of RNA-seq DE analysis. We synthesize progress in this area and derive an accessible rule-of-thumb guide for designing powerful RNA-seq experiments relevant in eco-evolutionary and clinical settings alike. © 2016 John Wiley & Sons Ltd.
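The replication-versus-depth trade-off can be explored with a crude simulation; the R sketch below is not the authors' framework, just an assumed negative binomial model for a single gene with a two-fold expression change:

# Crude power simulation for one gene under a negative binomial model
set.seed(7)
power_for_n <- function(n, mu = 100, fold = 2, cv = 0.4, n_sims = 2000) {
  size <- 1 / cv^2                 # NB size: biological CV is roughly 1/sqrt(size)
  p <- replicate(n_sims, {
    a <- rnbinom(n, mu = mu,        size = size)   # control replicates
    b <- rnbinom(n, mu = mu * fold, size = size)   # treatment replicates
    t.test(log2(a + 1), log2(b + 1))$p.value
  })
  mean(p < 0.05)
}
sapply(c(3, 5, 10), power_for_n)   # power gained by adding biological replicates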
Flynn, Kevin; Swintek, Joe; Johnson, Rodney
2017-02-01
Because of various Congressional mandates to protect the environment from endocrine disrupting chemicals (EDCs), the United States Environmental Protection Agency (USEPA) initiated the Endocrine Disruptor Screening Program. In the context of this framework, the Office of Research and Development within the USEPA developed the Medaka Extended One Generation Reproduction Test (MEOGRT) to characterize the endocrine action of a suspected EDC. One important endpoint of the MEOGRT is fecundity of medaka breeding pairs. Power analyses were conducted to determine the number of replicates needed in proposed test designs and to determine the effects that varying reproductive parameters (e.g., mean fecundity, variance, and days with no egg production) would have on the statistical power of the test. The MEOGRT Reproduction Power Analysis Tool (MRPAT) is a software tool developed to expedite these power analyses by both calculating estimates of the needed reproductive parameters (e.g., population mean and variance) and performing the power analysis under user-specified scenarios. Example scenarios are detailed that highlight the importance of the reproductive parameters for statistical power. When control fecundity is increased from 21 to 38 eggs per pair per day and the variance decreased from 49 to 20, the gain in power is equivalent to increasing replication by 2.5 times. On the other hand, if 10% of the breeding pairs, including controls, do not spawn, the power to detect a 40% decrease in fecundity drops to 0.54 from nearly 0.98 when all pairs have some level of egg production. Perhaps most importantly, MRPAT was used to inform the decision-making process that led to the final recommendation of the MEOGRT to have 24 control breeding pairs and 12 breeding pairs in each exposure group. Published by Elsevier Inc.
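MRPAT itself is a dedicated tool, but the qualitative effect of non-spawning pairs on power is easy to reproduce with a toy simulation; in the R sketch below the means and variances loosely follow the abstract, while the normal approximation and the simple t-test are simplifying assumptions:

# Toy simulation: power to detect a 40% fecundity decrease
set.seed(99)
sim_power <- function(p_zero, n_ctrl = 24, n_trt = 12, n_sims = 2000) {
  p <- replicate(n_sims, {
    ctrl <- rnorm(n_ctrl, 38, sqrt(20)) * rbinom(n_ctrl, 1, 1 - p_zero)
    trt  <- rnorm(n_trt, 38 * 0.6, sqrt(20)) * rbinom(n_trt, 1, 1 - p_zero)
    t.test(ctrl, trt)$p.value
  })
  mean(p < 0.05)
}
sim_power(p_zero = 0)     # every pair produces some eggs
sim_power(p_zero = 0.1)   # 10% of pairs never spawn: power drops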
A global goodness-of-fit statistic for Cox regression models.
Parzen, M; Lipsitz, S R
1999-06-01
In this paper, a global goodness-of-fit test statistic for a Cox regression model, which has an approximate chi-squared distribution when the model has been correctly specified, is proposed. Our goodness-of-fit statistic is global and has power to detect whether interactions or higher-order powers of covariates in the model are needed. The proposed statistic is similar to the Hosmer and Lemeshow (1980, Communications in Statistics A10, 1043-1069) goodness-of-fit statistic for binary data as well as Schoenfeld's (1980, Biometrika 67, 145-153) statistic for the Cox model. The methods are illustrated using data from a Mayo Clinic trial in primary biliary cirrhosis of the liver (Fleming and Harrington, 1991, Counting Processes and Survival Analysis), in which the outcome is the time until liver transplantation or death. There are 17 possible covariates. Two Cox proportional hazards models are fit to the data, and the proposed goodness-of-fit statistic is applied to the fitted models.
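The Mayo PBC data ship with the R survival package, so a comparable Cox model can be fit directly; the covariate choice below is an arbitrary illustration, and since the paper's global statistic is not implemented in survival, the standard cox.zph check is shown instead:

# Fit a Cox model to the Mayo PBC data (survival package)
library(survival)
# In this coding of pbc, status is 0 = censored, 1 = transplant, 2 = dead,
# so status > 0 marks time until liver transplantation or death
fit <- coxph(Surv(time, status > 0) ~ age + log(bili) + albumin + protime,
             data = pbc)
summary(fit)
cox.zph(fit)   # routine proportional-hazards check, not the paper's statistic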
Nilsson, Björn; Håkansson, Petra; Johansson, Mikael; Nelander, Sven; Fioretos, Thoas
2007-01-01
Ontological analysis facilitates the interpretation of microarray data. Here we describe new ontological analysis methods which, unlike existing approaches, are threshold-free and statistically powerful. We perform extensive evaluations and introduce a new concept, detection spectra, to characterize methods. We show that different ontological analysis methods exhibit distinct detection spectra, and that it is critical to account for this diversity. Our results argue strongly against the continued use of existing methods, and provide directions towards an enhanced approach. PMID:17488501
Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan
2015-01-08
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10^-8) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10^-7) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study cross-phenotype (CP) associations by using summary statistics from GWASs of multiple phenotypes. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Harris, Alex; Reeder, Rachelle; Hyun, Jenny
2011-01-01
The authors surveyed 21 editors and reviewers from major psychology journals to identify and describe the statistical and design errors they encounter most often and to get their advice regarding prevention of these problems. Content analysis of the text responses revealed themes in 3 major areas: (a) problems with research design and reporting (e.g., lack of an a priori power analysis, lack of congruence between research questions and study design/analysis, failure to adequately describe statistical procedures); (b) inappropriate data analysis (e.g., improper use of analysis of variance, too many statistical tests without adjustments, inadequate strategy for addressing missing data); and (c) misinterpretation of results. If researchers attended to these common methodological and analytic issues, the scientific quality of manuscripts submitted to high-impact psychology journals might be significantly improved.
Vibroacoustic optimization using a statistical energy analysis model
NASA Astrophysics Data System (ADS)
Culla, Antonio; D`Ambrogio, Walter; Fregolent, Annalisa; Milana, Silvia
2016-08-01
In this paper, an optimization technique for medium-high frequency dynamic problems based on the Statistical Energy Analysis (SEA) method is presented. In a SEA model, the subsystem energies are controlled by internal loss factors (ILFs) and coupling loss factors (CLFs), which in turn depend on the physical parameters of the subsystems. A preliminary sensitivity analysis of subsystem energy to the CLFs is performed to select the CLFs that are most effective on subsystem energies. Since the injected power depends not only on the external loads but on the physical parameters of the subsystems as well, it must be taken into account under certain conditions. This is accomplished in the optimization procedure, where approximate relationships between CLFs, injected power and physical parameters are derived. The approach is applied to a typical aeronautical structure: the cabin of a helicopter.
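The SEA energy balance underlying this procedure is a small linear system; a two-subsystem R sketch (all loss factors and powers are made-up values) shows how subsystem energies follow from the ILFs and CLFs:

# Two-subsystem SEA power balance: P = omega * L %*% E (illustrative values)
omega <- 2 * pi * 1000            # band centre frequency [rad/s] (assumed)
eta1  <- 0.01; eta2  <- 0.02      # internal loss factors (assumed)
eta12 <- 0.005; eta21 <- 0.003    # coupling loss factors (assumed)
L <- matrix(c(eta1 + eta12, -eta21,
              -eta12,       eta2 + eta21), nrow = 2, byrow = TRUE)
P <- c(1, 0)                      # 1 W injected into subsystem 1 only
E <- solve(omega * L, P)          # subsystem energies [J]
E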
Luo, Li; Zhu, Yun; Xiong, Momiao
2012-06-01
The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T² statistic, the collapsing method, the combined multivariate and collapsing (CMC) method, the individual χ² test, the weighted-sum statistic, and the variable threshold statistic. Finally, we apply the seven statistics to the published resequencing dataset from the ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.
Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi
2017-01-01
High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens, but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information. Contact: sltaylor@ucdavis.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
The use and misuse of statistical analyses. [in geophysics and space physics
NASA Technical Reports Server (NTRS)
Reiff, P. H.
1983-01-01
The statistical techniques most often used in space physics include Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superposed epoch analysis. Tests are presented which can evaluate the significance of the results obtained through each of these. Data presented without some form of error analysis are frequently useless, since they offer no way of assessing whether a bump on a spectrum or on a superposed epoch analysis is real or merely a statistical fluctuation. Among many of the published linear correlations, for instance, the uncertainty in the intercept and slope is not given, so that the significance of the fitted parameters cannot be assessed.
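The specific complaint about unreported uncertainties in slope and intercept is easy to address in practice: for a linear correlation, standard least-squares software reports them directly, as in this generic R sketch on synthetic data:

# Linear fit with uncertainties on intercept and slope (synthetic data)
set.seed(3)
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x + rnorm(50, sd = 1)
fit <- lm(y ~ x)
summary(fit)$coefficients   # estimates, standard errors, and t statistics
confint(fit)                # 95% confidence intervals for both parameters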
Statistical Considerations of Food Allergy Prevention Studies.
Bahnson, Henry T; du Toit, George; Lack, Gideon
Clinical studies to prevent the development of food allergy have recently helped reshape public policy recommendations on the early introduction of allergenic foods. These trials are also prompting new research, and it is therefore important to address the unique design and analysis challenges of prevention trials. We highlight statistical concepts and give recommendations that clinical researchers may wish to adopt when designing future study protocols and analysis plans for prevention studies. Topics include selecting a study sample, addressing internal and external validity, improving statistical power, choosing alpha and beta, analysis innovations to address dilution effects, and analysis methods to deal with poor compliance, dropout, and missing data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Shrout, Patrick E; Rodgers, Joseph L
2018-01-04
Psychology advances knowledge by testing statistical hypotheses using empirical observations and data. The expectation is that most statistically significant findings can be replicated in new data and in new laboratories, but in practice many findings have replicated less often than expected, leading to claims of a replication crisis. We review recent methodological literature on questionable research practices, meta-analysis, and power analysis to explain the apparently high rates of failure to replicate. Psychologists can improve research practices to advance knowledge in ways that improve replicability. We recommend that researchers adopt open science conventions of preregistration and full disclosure and that replication efforts be based on multiple studies rather than on a single replication attempt. We call for more sophisticated power analyses, careful consideration of the various influences on effect sizes, and more complete disclosure of nonsignificant as well as statistically significant findings.
Got Power? A Systematic Review of Sample Size Adequacy in Health Professions Education Research
ERIC Educational Resources Information Center
Cook, David A.; Hatala, Rose
2015-01-01
Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011,…
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit each May from 1855 to 1895 and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that by a totally different criterion - labile trace element contents, hence thermal histories - 13 Cluster 1 meteorites are distinguishable from 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
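For readers who want to see the mechanics, linear discriminant analysis is available in the R MASS package; this non-meteoritic sketch on the built-in iris data illustrates the kind of group discrimination described, and is not the authors' meteorite analysis:

# Linear discriminant analysis on a non-meteoritic example (iris data)
library(MASS)
fit  <- lda(Species ~ ., data = iris)   # discriminant functions from 4 variables
pred <- predict(fit)
table(observed = iris$Species, predicted = pred$class)   # confusion matrix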
An Independent Filter for Gene Set Testing Based on Spectral Enrichment.
Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H
2015-01-01
Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.
Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks
2016-01-01
Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
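Once the pooled coefficients and their standard errors are in hand, the Monte Carlo confidence interval for the combined mediated effect a*b takes only a few lines of R; the numbers below are placeholders, not values from the paper:

# Monte Carlo confidence interval for a mediated effect a*b
set.seed(11)
a_hat <- 0.40; se_a <- 0.10    # pooled path a estimate and SE (placeholders)
b_hat <- 0.30; se_b <- 0.12    # pooled path b estimate and SE (placeholders)
ab <- rnorm(1e5, a_hat, se_a) * rnorm(1e5, b_hat, se_b)
quantile(ab, c(0.025, 0.975))  # 95% Monte Carlo interval for the mediated effect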
The Role of Margin in Link Design and Optimization
NASA Technical Reports Server (NTRS)
Cheung, K.
2015-01-01
Link analysis is a system engineering process in the design, development, and operation of communication systems and networks. Link models are mathematical abstractions representing the useful signal power and the undesirable noise and attenuation effects (including weather effects if the signal path traverses the atmosphere); they are integrated into the link budget calculation, which provides the estimates of signal power and noise power at the receiver. A link margin is then applied to counteract the fluctuations of the signal and noise power and ensure reliable data delivery from transmitter to receiver. (Link margin is dictated by the link margin policy or requirements.) A simple link budgeting approach treats link parameters as deterministic values and typically adopts a rule-of-thumb policy of 3 dB link margin. This policy works for most S- and X-band links due to their insensitivity to weather effects. But for higher frequency links like Ka-band, Ku-band, and optical communication links, it is unclear if a 3 dB link margin would guarantee link closure. Statistical link analysis, which adopts a 2-sigma or 3-sigma link margin, incorporates link uncertainties in the sigma calculation. (The Deep Space Network (DSN) link margin policies are 2-sigma for downlink and 3-sigma for uplink.) The link reliability can therefore be quantified statistically even for higher frequency links. However, in the current statistical link analysis approach, link reliability is only expressed as the likelihood of exceeding the signal-to-noise ratio (SNR) threshold that corresponds to a given bit-error-rate (BER) or frame-error-rate (FER) requirement. The method does not provide the true BER or FER estimate of the link with margin, or the required SNR that would meet the BER or FER requirement in the statistical sense. In this paper, we perform an in-depth analysis of the relationship between the BER/FER requirement, the operating SNR, and the coding performance curve, in the case when the channel coherence time of link fluctuation is comparable to or larger than the time duration of a codeword. We compute the "true" SNR design point that would meet the BER/FER requirement by taking into account the fluctuation of signal power and noise power at the receiver, and the shape of the coding performance curve. This analysis yields a number of valuable insights on the design choices of coding scheme and link margin for the reliable data delivery of a communication system - space and ground. We illustrate the aforementioned analysis using a number of standard NASA error-correcting codes.
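The k-sigma idea reduces to simple arithmetic on the link budget; the R sketch below combines assumed, independent dB uncertainties by root-sum-square and applies a 2-sigma margin (all numbers are invented):

# Schematic statistical link margin with a 2-sigma policy
snr_mean   <- 6.0                  # predicted SNR at the receiver [dB] (assumed)
sigmas     <- c(0.4, 0.6, 0.3)     # individual parameter uncertainties [dB] (assumed)
sigma_link <- sqrt(sum(sigmas^2))  # root-sum-square combination
snr_mean - 2 * sigma_link          # 2-sigma design SNR
# The link closes statistically if this value still exceeds the SNR threshold
# implied by the BER/FER requirement and the coding performance curve.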
The Design and Analysis of Transposon-Insertion Sequencing Experiments
Chao, Michael C.; Abel, Sören; Davis, Brigid M.; Waldor, Matthew K.
2016-01-01
Transposon-insertion sequencing (TIS) is a powerful approach that can be widely applied to genome-wide definition of loci that are required for growth in diverse conditions. However, experimental design choices and stochastic biological processes can heavily influence the results of TIS experiments and affect downstream statistical analysis. Here, we discuss TIS experimental parameters and how these factors relate to the benefits and limitations of the various statistical frameworks that can be applied to computational analysis of TIS data. PMID:26775926
Application of Transformations in Parametric Inference
ERIC Educational Resources Information Center
Brownstein, Naomi; Pensky, Marianna
2008-01-01
The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate performance of this powerful tool on examples of constructions of various estimation procedures, hypothesis testing, Bayes analysis and statistical inference for the stress-strength systems.…
75 FR 16202 - Notice of Issuance of Regulatory Guide
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-31
..., Revision 2, ``An Acceptable Model and Related Statistical Methods for the Analysis of Fuel Densification.... Introduction The U.S. Nuclear Regulatory Commission (NRC) is issuing a revision to an existing guide in the... nuclear power reactors. To meet these objectives, the guide describes statistical methods related to...
Modeling of a Robust Confidence Band for the Power Curve of a Wind Turbine.
Hernandez, Wilmar; Méndez, Alfredo; Maldonado-Correa, Jorge L; Balleteros, Francisco
2016-12-07
Having an accurate model of the power curve of a wind turbine allows us to better monitor its operation and plan storage capacity. Since wind speed and direction are highly stochastic, the forecast of the power generated by the wind turbine is of the same nature. In this paper, a method for obtaining a robust confidence band containing the power curve of a wind turbine under test conditions is presented. Here, the confidence band is bounded by two curves which are estimated using parametric statistical inference techniques. However, the observations that are used for carrying out the statistical analysis are obtained by using the binning method, and in each bin the outliers are eliminated by a censorship process based on robust statistical techniques. The observations that are not outliers are then divided into observation sets. Finally, both the power curve of the wind turbine and the two curves that define the robust confidence band are estimated using each of these observation sets.
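The binning step is easy to reproduce; the following R sketch builds a crude robust band per wind-speed bin from medians and MADs on synthetic data, and does not reproduce the paper's censorship and inference details:

# Binning method with a crude robust band per wind-speed bin (synthetic data)
set.seed(5)
speed <- runif(2000, 0, 25)                            # wind speed [m/s]
power <- pmin(2000, 150 * pmax(0, speed - 3)^1.5) +    # toy power curve [kW]
  rnorm(2000, sd = 80)
bins   <- cut(speed, breaks = seq(0, 25, by = 0.5))    # 0.5 m/s bins
centre <- tapply(power, bins, median)                  # bin-wise robust centre
spread <- tapply(power, bins, mad)                     # bin-wise robust spread
upper  <- centre + 3 * spread                          # crude band limits
lower  <- centre - 3 * spread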
A power analysis for multivariate tests of temporal trend in species composition.
Irvine, Kathryn M; Dinger, Eric C; Sarr, Daniel
2011-10-01
Long-term monitoring programs emphasize power analysis as a tool to determine the sampling effort necessary to effectively document ecologically significant changes in ecosystems. Programs that monitor entire multispecies assemblages require a method for determining the power of multivariate statistical models to detect trend. We provide a method to simulate presence-absence species assemblage data that are consistent with increasing or decreasing directional change in species composition within multiple sites. This step is the foundation for using Monte Carlo methods to approximate the power of any multivariate method for detecting temporal trends. We focus on comparing the power of the Mantel test, permutational multivariate analysis of variance, and constrained analysis of principal coordinates. We find that the power of the various methods we investigate is sensitive to the number of species in the community, univariate species patterns, and the number of sites sampled over time. For increasing directional change scenarios, constrained analysis of principal coordinates was as or more powerful than permutational multivariate analysis of variance, the Mantel test was the least powerful. However, in our investigation of decreasing directional change, the Mantel test was typically as or more powerful than the other models.
Power flow analysis of two coupled plates with arbitrary characteristics
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1990-01-01
In the last progress report (February 1988), some results were presented for a parametric analysis of the vibrational power flow between two coupled plate structures using the mobility power flow approach. The results reported then were for changes in the structural parameters of the two plates, but with the two plates identical in their structural characteristics. Here, this limitation is removed. The vibrational power input and output are evaluated for different values of the structural damping loss factor for the source and receiver plates. In performing this parametric analysis, the source plate characteristics are kept constant. The purpose of this parametric analysis is to determine the most critical parameters that influence the flow of vibrational power from the source plate to the receiver plate. In the case of the structural damping parametric analysis, the influence of changes in the source plate damping is also investigated. The results obtained from the mobility power flow approach are compared to results obtained using a statistical energy analysis (SEA) approach. The significance of the power flow results is discussed, together with a comparison between the SEA results and the mobility power flow results. Furthermore, the benefits derived from using the mobility power flow approach are examined.
Mager, P P; Rothe, H
1990-10-01
Multicollinearity of physicochemical descriptors leads to serious consequences in quantitative structure-activity relationship (QSAR) analysis, such as incorrect estimators and test statistics of regression coefficients in the ordinary least-squares (OLS) model usually applied to QSARs. Besides the diagnosis of simple collinearity, principal component regression analysis (PCRA) also allows the diagnosis of various types of multicollinearity. Only if the absolute values of the PCRA estimators are order statistics that decrease monotonically can the effects of multicollinearity be circumvented. Otherwise, obscure phenomena may be observed, such as good data recognition but low predictive power of a QSAR model.
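The mechanics of principal component regression can be sketched in base R; the example below uses synthetic, deliberately collinear descriptors and is only meant to show the idea, not a QSAR workflow:

# Principal component regression with collinear predictors (synthetic data)
set.seed(8)
n  <- 60
x1 <- rnorm(n); x2 <- x1 + rnorm(n, sd = 0.05)   # nearly collinear descriptor pair
x3 <- rnorm(n)
y  <- 1 + 2 * x1 + 0.5 * x3 + rnorm(n)
X  <- cbind(x1, x2, x3)
pc <- prcomp(X, scale. = TRUE)        # orthogonal principal components
fit <- lm(y ~ pc$x[, 1:2])            # regress on the leading components
summary(fit)                          # estimators no longer inflated by collinearity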
Kobayashi, Katsuhiro; Jacobs, Julia; Gotman, Jean
2013-01-01
Objective: A novel type of statistical time-frequency analysis was developed to elucidate changes of high-frequency EEG activity associated with epileptic spikes. Methods: The method uses the Gabor Transform and detects changes of power in comparison to background activity using t-statistics that are controlled by the false discovery rate (FDR) to correct type I error of multiple testing. The analysis was applied to EEGs recorded at 2000 Hz from three patients with mesial temporal lobe epilepsy. Results: Spike-related increase of high-frequency oscillations (HFOs) was clearly shown in the FDR-controlled t-spectra: it was most dramatic in spikes recorded from the hippocampus when the hippocampus was the seizure onset zone (SOZ). Depression of fast activity was observed immediately after the spikes, especially consistently in the discharges from the hippocampal SOZ. It corresponded to the slow wave part in case of spike-and-slow-wave complexes, but it was noted even in spikes without apparent slow waves. In one patient, a gradual increase of power above 200 Hz preceded spikes. Conclusions: FDR-controlled t-spectra clearly detected the spike-related changes of HFOs that were unclear in standard power spectra. Significance: We developed a promising tool to study the HFOs that may be closely linked to the pathophysiology of epileptogenesis. PMID:19394892
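The FDR-control step generalizes beyond EEG: in R, Benjamini-Hochberg correction of t-tests across many time-frequency bins reduces to p.adjust, as this toy sketch (unrelated to the patients' data) shows:

# FDR-controlled mass testing across time-frequency bins (toy example)
set.seed(21)
n_bins <- 1000
p <- replicate(n_bins, t.test(rnorm(20), rnorm(20))$p.value)        # null bins
p[1:50] <- replicate(50, t.test(rnorm(20, 1), rnorm(20))$p.value)   # true effects
q <- p.adjust(p, method = "BH")   # Benjamini-Hochberg false discovery rate
sum(q < 0.05)                     # bins still declared significant after control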
Zhu, Zhaozhong; Anttila, Verneri; Smoller, Jordan W; Lee, Phil H
2018-01-01
Advances in recent genome-wide association studies (GWAS) suggest that pleiotropic effects on human complex traits are widespread. A number of classic and recent meta-analysis methods have been used to identify genetic loci with pleiotropic effects, but the overall performance of these methods is not well understood. In this work, we use extensive simulations and case studies of GWAS datasets to investigate the power and type-I error rates of ten meta-analysis methods. We specifically focus on three conditions commonly encountered in studies of multiple traits: (1) extensive heterogeneity of genetic effects; (2) characterization of trait-specific association; and (3) inflated correlation of GWAS due to overlapping samples. Although statistical power is highly variable under distinct study conditions, several methods showed superior power under diverse heterogeneity. In particular, the classic fixed-effects model showed surprisingly good performance when a variant is associated with more than half of the study traits. As the number of traits with null effects increases, ASSET performed best, with competitive specificity and sensitivity. With opposite directional effects, CPASSOC featured first-rate power. However, caution is advised when using CPASSOC for studying genetically correlated traits with overlapping samples. We conclude with a discussion of unresolved issues and directions for future research.
Fractal analysis of the short time series in a visibility graph method
NASA Astrophysics Data System (ADS)
Li, Ruixue; Wang, Jiang; Yu, Haitao; Deng, Bin; Wei, Xile; Chen, Yingyuan
2016-05-01
The aim of this study is to evaluate the performance of the visibility graph (VG) method on short fractal time series. In this paper, time series of fractional Brownian motion (fBm), characterized by different Hurst exponents H, are simulated and then mapped into scale-free visibility graphs, whose degree distributions show the power-law form. Maximum likelihood estimation (MLE) is applied to estimate the power-law index of the degree distribution, and in this process the Kolmogorov-Smirnov (KS) statistic is used to test the quality of the estimate, aiming to avoid the influence of the drooping head and heavy tail of the degree distribution. As a result, we find that the MLE gives an optimal estimate of the power-law index when the KS statistic reaches its first local minimum. Based on the results from the KS statistic, the relationship between the power-law index and the Hurst exponent is re-examined and then amended to suit short time series. Thus, a method combining VG, MLE and the KS statistic is proposed to estimate Hurst exponents from short time series. Lastly, the paper offers an example to verify the effectiveness of the combined method; the corresponding results show that the VG can provide a reliable estimate of Hurst exponents.
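The MLE and KS ingredients are compact enough to sketch; for a continuous power law with lower cut-off xmin the R code below gives a Hill/Clauset-type estimate (a simplification of the discrete degree-distribution case treated in the paper):

# MLE of a continuous power-law exponent with a KS goodness measure
powerlaw_fit <- function(x, xmin) {
  x <- sort(x[x >= xmin])
  n <- length(x)
  alpha   <- 1 + n / sum(log(x / xmin))          # maximum likelihood estimate
  cdf_fit <- 1 - (x / xmin)^(1 - alpha)          # fitted cumulative distribution
  ks      <- max(abs(seq_len(n) / n - cdf_fit))  # KS distance to empirical CDF
  c(alpha = alpha, ks = ks)
}
set.seed(2)
x <- (1 - runif(5000))^(-1 / 1.5)   # synthetic power law with alpha = 2.5, xmin = 1
powerlaw_fit(x, xmin = 1)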
Assessment and statistics of surgically induced astigmatism.
Naeser, Kristian
2008-05-01
The aim of the thesis was to develop methods for assessment of surgically induced astigmatism (SIA) in individual eyes and in groups of eyes. The thesis is based on 12 peer-reviewed publications, published over a period of 16 years. In these publications older and contemporary literature was reviewed(1). A new method (the polar system) for analysis of SIA was developed. Multivariate statistical analysis of refractive data was described(2-4). Clinical validation studies were performed. Descriptions of a cylinder surface with polar values and with differential geometry were compared. The main results were: refractive data in the form of sphere, cylinder and axis may define an individual patient or data set, but are unsuited for mathematical and statistical analyses(1). The polar value system converts net astigmatisms to orthonormal components in dioptric space. A polar value is the difference in meridional power between two orthogonal meridians(5,6). Any pair of polar values, separated by an arc of 45 degrees, characterizes a net astigmatism completely(7). The two polar values represent the net curvital and net torsional power over the chosen meridian(8). The spherical component is described by the spherical equivalent power. Several clinical studies demonstrated the efficiency of multivariate statistical analysis of refractive data(4,9-11). Polar values and formal differential geometry describe astigmatic surfaces with similar concepts and mathematical functions(8). Other contemporary methods, such as Long's power matrix, Holladay's and Alpins' methods, Zernike(12) and Fourier analyses(8), are correlated to the polar value system. In conclusion, analysis of SIA should be performed with polar values or other contemporary component systems. The study was supported by Statens Sundhedsvidenskabeligt Forskningsråd, Cykelhandler P. Th. Rasmussen og Hustrus Mindelegat, Hotelejer Carl Larsen og Hustru Nicoline Larsens Mindelegat, Landsforeningen til Vaern om Synet, Forskningsinitiativet for Arhus Amt, Alcon Denmark, and Desirée and Niels Ydes Fond.
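For orientation, the general idea of orthonormal astigmatic components can be written compactly. The Thibos-style power-vector decomposition, one of the contemporary component systems correlated with polar values, maps a refraction (sphere S, cylinder C, axis alpha) to

\[
M = S + \frac{C}{2}, \qquad
J_0 = -\frac{C}{2}\cos 2\alpha, \qquad
J_{45} = -\frac{C}{2}\sin 2\alpha,
\]

where J_0 and J_45 are orthogonal astigmatic components whose meridians are separated by 45 degrees, directly analogous to a pair of polar values; the exact sign and scaling conventions of Naeser's polar values differ and should be taken from the cited publications.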
Extension of vibrational power flow techniques to two-dimensional structures
NASA Technical Reports Server (NTRS)
Cuschieri, Joseph M.
1988-01-01
In the analysis of the vibration response and structure-borne vibration transmission between elements of a complex structure, statistical energy analysis (SEA) or finite element analysis (FEA) are generally used. However, an alternative method is using vibrational power flow techniques which can be especially useful in the mid frequencies between the optimum frequency regimes for SEA and FEA. Power flow analysis has in general been used on 1-D beam-like structures or between structures with point joints. In this paper, the power flow technique is extended to 2-D plate-like structures joined along a common edge without frequency or spatial averaging the results, such that the resonant response of the structure is determined. The power flow results are compared to results obtained using FEA results at low frequencies and SEA at high frequencies. The agreement with FEA results is good but the power flow technique has an improved computational efficiency. Compared to the SEA results the power flow results show a closer representation of the actual response of the structure.
Extension of vibrational power flow techniques to two-dimensional structures
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1987-01-01
In the analysis of the vibration response and structure-borne vibration transmission between elements of a complex structure, statistical energy analysis (SEA) or finite element analysis (FEA) are generally used. However, an alternative method is using vibrational power flow techniques, which can be especially useful in the mid-frequencies between the optimum frequency regimes for FEA and SEA. Power flow analysis has in general been used on one-dimensional beam-like structures or between structures with point joints. In this paper, the power flow technique is extended to two-dimensional plate-like structures joined along a common edge without frequency or spatial averaging the results, such that the resonant response of the structure is determined. The power flow results are compared to results obtained using FEA at low frequencies and SEA at high frequencies. The agreement with FEA results is good but the power flow technique has an improved computational efficiency. Compared to the SEA results the power flow results show a closer representation of the actual response of the structure.
Multi-Reader ROC studies with Split-Plot Designs: A Comparison of Statistical Methods
Obuchowski, Nancy A.; Gallas, Brandon D.; Hillis, Stephen L.
2012-01-01
Rationale and Objectives: Multi-reader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of the design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper we compare three methods of analysis for the split-plot design. Materials and Methods: Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean ANOVA approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power, and confidence interval coverage of the three test statistics. Results: The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% CIs falls close to the nominal coverage for small and large sample sizes. Conclusions: The split-plot MRMC study design can be statistically efficient compared with the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rates, similar power, and nominal CI coverage, are available for this study design. PMID:23122570
Effect of Different Phases of Menstrual Cycle on Heart Rate Variability (HRV).
Brar, Tejinder Kaur; Singh, K D; Kumar, Avnish
2015-10-01
Heart Rate Variability (HRV), which is a measure of the cardiac autonomic tone, displays physiological changes throughout the menstrual cycle. The functions of the ANS in various phases of the menstrual cycle were examined in some studies. The aim of our study was to observe the effect of the menstrual cycle on cardiac autonomic function parameters in healthy females. A cross-sectional (observational) study was conducted on 50 healthy females in the age group of 18-25 years. Heart Rate Variability (HRV) was recorded by Physio Pac (PC-2004). The data consisted of time domain analysis and frequency domain analysis in the menstrual, proliferative and secretory phases of the menstrual cycle. Collected data were analysed statistically using Student's paired t-test. The difference in mean heart rate, LF power%, LFnu and HFnu in the menstrual and proliferative phases was found to be statistically significant. The difference in mean RR, mean HR, RMSSD (the square root of the mean of the squares of the successive differences between adjacent NNs), NN50 (the number of pairs of successive NNs that differ by more than 50 ms), pNN50 (the proportion of NN50 divided by the total number of NNs), VLF (very low frequency) power, LF (low frequency) power, LF power%, HF power%, LF/HF ratio, LFnu and HFnu was found to be statistically significant in the proliferative and secretory phases. The difference in mean RR, mean HR, LFnu and HFnu was found to be statistically significant in the secretory and menstrual phases. From the study it can be concluded that sympathetic nervous activity in the secretory phase is greater than in the proliferative phase, whereas parasympathetic nervous activity is predominant in the proliferative phase.
Pataky, Todd C; Robinson, Mark A; Vanrenterghem, Jos
2018-01-03
Statistical power assessment is an important component of hypothesis-driven research but until relatively recently (mid-1990s) no methods were available for assessing power in experiments involving continuum data and in particular those involving one-dimensional (1D) time series. The purpose of this study was to describe how continuum-level power analyses can be used to plan hypothesis-driven biomechanics experiments involving 1D data. In particular, we demonstrate how theory- and pilot-driven 1D effect modeling can be used for sample-size calculations for both single- and multi-subject experiments. For theory-driven power analysis we use the minimum jerk hypothesis and single-subject experiments involving straight-line, planar reaching. For pilot-driven power analysis we use a previously published knee kinematics dataset. Results show that powers on the order of 0.8 can be achieved with relatively small sample sizes, five and ten for within-subject minimum jerk analysis and between-subject knee kinematics, respectively. However, the appropriate sample size depends on a priori justifications of biomechanical meaning and effect size. The main advantage of the proposed technique is that it encourages a priori justification regarding the clinical and/or scientific meaning of particular 1D effects, thereby robustly structuring subsequent experimental inquiry. In short, it shifts focus from a search for significance to a search for non-rejectable hypotheses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Statistical description of tectonic motions
NASA Technical Reports Server (NTRS)
Agnew, Duncan Carr
1991-01-01
The behavior of stochastic processes whose power spectra are described by power-law behavior was studied. The details of the analysis and the conclusions that were reached are presented. This analysis was extended to compare the detection capabilities of different measurement techniques (e.g., gravimetry and GPS for the vertical, and seismometers and GPS for the horizontal), both in general and for the specific case of the deformations produced by a dislocation in a half-space (which applies to seismic or preseismic sources). The time-domain behavior of power-law noises was also investigated.
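One common way to study such processes numerically is spectral synthesis; the R sketch below generates a realization of noise whose power spectrum is proportional to f^(-beta). The method is generic, not the specific analysis of this report:

# Spectral synthesis of power-law noise with spectrum ~ 1/f^beta
set.seed(4)
n <- 1024; beta <- 2               # beta = 0 gives white noise, beta = 2 red noise
f    <- 1:(n / 2)
amp  <- f^(-beta / 2)              # amplitude is the square root of the power
ph   <- runif(n / 2, 0, 2 * pi)    # random phases
half <- complex(modulus = amp, argument = ph)
# Assemble a Hermitian-symmetric spectrum so the inverse FFT is (nearly) real
spec <- c(0, half, Conj(rev(half[-(n / 2)])))
x <- Re(fft(spec, inverse = TRUE)) / n   # time-domain realization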
Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore
2014-04-01
Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials comes in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions, possibly with excess zeros. In addition, the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype-by-environment interaction by adding random variety effects, and includes repeated measures in time following a constant, linear or quadratic pattern, possibly with some form of autocorrelation. The model also allows a set of reference varieties to be added to the GM plant and its comparator to assess the natural variation, which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail, and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided.
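The excess-zero count distributions at the heart of such a simulation model are easy to mimic; a zero-inflated negative binomial draw in R (with arbitrary parameter values, not those of the framework) looks like:

# Zero-inflated negative binomial counts (illustrative parameters)
set.seed(6)
n      <- 200
p_zero <- 0.3                    # probability of a structural zero (assumed)
mu     <- 5; size <- 1.2         # negative binomial mean and dispersion (assumed)
y <- rbinom(n, 1, 1 - p_zero) * rnbinom(n, mu = mu, size = size)
mean(y == 0)                     # zero fraction: structural plus sampling zeros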
Analysis of broadcasting satellite service feeder link power control and polarization
NASA Technical Reports Server (NTRS)
Sullivan, T. M.
1982-01-01
Statistical analyses of carrier-to-interference power ratios (C/Is) were performed in assessing 17.5 GHz feeder links using (1) fixed power and power control, and (2) orthogonal linear and orthogonal circular polarizations. The analysis methods and attenuation/depolarization data base were based on CCIR findings to the greatest possible extent. Feeder links using adaptive power control were found to neither cause nor suffer significant C/I degradation relative to that for fixed-power feeder links having similar or less stringent availability objectives. The C/Is for sharing between orthogonal linearly polarized feeder links were found to be significantly higher than those for circular polarization only in links to nominally colocated satellites from nominally colocated Earth stations in high-attenuation environments.
Got power? A systematic review of sample size adequacy in health professions education research.
Cook, David A; Hatala, Rose
2015-03-01
Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011, and included all studies evaluating simulation-based education for health professionals in comparison with no intervention or another simulation intervention. Reviewers working in duplicate abstracted information to calculate standardized mean differences (SMDs). We included 897 original research studies. Among the 627 no-intervention-comparison studies the median sample size was 25. Only two studies (0.3%) had ≥80% power to detect a small difference (SMD > 0.2 standard deviations) and 136 (22%) had power to detect a large difference (SMD > 0.8). A total of 110 no-intervention-comparison studies failed to find a statistically significant difference, but none excluded a small difference and only 47 (43%) excluded a large difference. Among 297 studies comparing alternate simulation approaches the median sample size was 30. Only one study (0.3%) had ≥80% power to detect a small difference and 79 (27%) had power to detect a large difference. Of the 128 studies that did not detect a statistically significant effect, 4 (3%) excluded a small difference and 91 (71%) excluded a large difference. In conclusion, most education research studies are powered only to detect effects of large magnitude. For most studies that do not reach statistical significance, the possibility of large and important differences still exists.
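The review's power arithmetic is easy to reproduce for a single hypothetical study; assuming a two-sided two-sample t-test and roughly the median sample size reported (25 in total, so about 13 per group):

    from statsmodels.stats.power import TTestIndPower

    solver = TTestIndPower()
    n_per_group = 13   # roughly the review's median total sample of 25, split in two
    for smd in (0.2, 0.8):                      # Cohen's small and large effects
        pw = solver.power(effect_size=smd, nobs1=n_per_group, alpha=0.05, ratio=1.0)
        print(f"SMD = {smd}: power = {pw:.2f}")

    # Inverse problem: sample size per group needed for 80% power at SMD = 0.2
    print(solver.solve_power(effect_size=0.2, power=0.8, alpha=0.05))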
ERIC Educational Resources Information Center
Martuza, Victor R.; Engel, John D.
Results from classical power analysis (Brewer, 1972) suggest that a researcher should not set α = p (when p is less than α) in a posteriori fashion when a study yields statistically significant results because of a resulting decrease in power. The purpose of the present report is to use Bayesian theory in examining the validity of this…
Multiple phenotype association tests using summary statistics in genome-wide association studies.
Liu, Zhonghua; Lin, Xihong
2018-03-01
We study in this article jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. © 2017, The International Biometric Society.
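As a rough, hypothetical illustration of the core idea — combining per-phenotype summary z-statistics while accounting for their correlation — the simplest member of this family of tests is the omnibus chi-square; the authors' mixed-model tests are more elaborate, and every number below is invented:

    import numpy as np
    from scipy import stats

    # z-scores of one SNP against K = 3 correlated phenotypes (hypothetical values)
    z = np.array([2.1, 1.8, -0.4])
    # Between-phenotype correlation, estimable genome-wide from the summary statistics
    R = np.array([[1.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])

    # Omnibus test: z' R^{-1} z follows a chi-square with K degrees of freedom under the null
    q = z @ np.linalg.solve(R, z)
    pval = stats.chi2.sf(q, df=len(z))
    print(q, pval)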
Statistical power comparisons at 3T and 7T with a GO / NOGO task.
Torrisi, Salvatore; Chen, Gang; Glen, Daniel; Bandettini, Peter A; Baker, Chris I; Reynolds, Richard; Yen-Ting Liu, Jeffrey; Leshin, Joseph; Balderston, Nicholas; Grillon, Christian; Ernst, Monique
2018-07-15
The field of cognitive neuroscience is weighing evidence about whether to move from standard field strength to ultra-high field (UHF). The present study contributes to the evidence by comparing a cognitive neuroscience paradigm at 3 Tesla (3T) and 7 Tesla (7T). The goal was to test and demonstrate the practical effects of field strength on a standard GO/NOGO task using accessible preprocessing and analysis tools. Two independent matched healthy samples (N = 31 each) were analyzed at 3T and 7T. Results show gains at 7T in statistical strength, the detection of smaller effects and group-level power. With an increased availability of UHF scanners, these gains may be exploited by cognitive neuroscientists and other neuroimaging researchers to develop more efficient or comprehensive experimental designs and, given the same sample size, achieve greater statistical power at 7T. Published by Elsevier Inc.
Power flow analysis of two coupled plates with arbitrary characteristics
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1988-01-01
The limitation of keeping two plates identical is removed, and the vibrational power input and output are evaluated for different area ratios, plate thickness ratios, and for different values of the structural damping loss factor for the source plate (the plate with excitation) and the receiver plate. In performing this parametric analysis, the source plate characteristics are kept constant. The purpose of this parametric analysis is to determine the most critical parameters that influence the flow of vibrational power from the source plate to the receiver plate. In the case of the structural damping parametric analysis, the influence of changes in the source plate damping is also investigated. As was done previously, results obtained from the mobility power flow approach are compared to results obtained using a statistical energy analysis (SEA) approach. The significance of the power flow results is discussed, together with a comparison between the SEA results and the mobility power flow results. Furthermore, the benefits that can be derived from using the mobility power flow approach are also examined.
Van Wynsberge, Simon; Gilbert, Antoine; Guillemot, Nicolas; Heintz, Tom; Tremblay-Boyer, Laura
2017-07-01
Extensive biological field surveys are costly and time consuming. To optimize sampling and ensure regular long-term monitoring, identifying informative indicators of anthropogenic disturbances is a priority. In this study, we generated 1800 candidate indicators by combining metrics measured from coral, fish, and macro-invertebrate assemblages surveyed from 2006 to 2012 in the vicinity of an ongoing mining project in the Voh-Koné-Pouembout lagoon, New Caledonia. We performed a power analysis to identify a subset of indicators which would best discriminate temporal changes due to a simulated chronic anthropogenic impact. Only 4% of tested indicators were likely to detect a 10% annual decrease of values with sufficient power (>0.80). Corals generally yielded higher statistical power than macro-invertebrates and fishes because of lower natural variability and higher occurrence. For the same reasons, higher taxonomic ranks provided higher power than lower taxonomic ranks. Nevertheless, a number of families of common sedentary or sessile macro-invertebrates and fishes also performed well in detecting changes: Echinometridae, Isognomidae, Muricidae, Tridacninae, Arcidae, and Turbinidae for macro-invertebrates, and Pomacentridae, Labridae, and Chaetodontidae for fishes. Interestingly, these families did not provide high power in all geomorphological strata, suggesting that the ability of indicators to detect anthropogenic impacts was closely linked to reef geomorphology. This study provides a first operational step toward identifying statistically relevant indicators of anthropogenic disturbances in New Caledonia's coral reefs, which can be useful in similar tropical reef ecosystems where little information is available regarding the responses of ecological indicators to anthropogenic disturbances.
Précis of statistical significance: rationale, validity, and utility.
Chow, S L
1998-04-01
The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.
Gaskin, Cadeyrn J; Happell, Brenda
2013-02-01
Having sufficient power to detect effect sizes of an expected magnitude is a core consideration when designing studies in which inferential statistics will be used. The main aim of this study was to investigate the statistical power in studies published in the International Journal of Mental Health Nursing. From volumes 19 (2010) and 20 (2011) of the journal, studies were analysed for their power to detect small, medium, and large effect sizes, according to Cohen's guidelines. The power of the 23 studies included in this review to detect small, medium, and large effects was 0.34, 0.79, and 0.94, respectively. In 90% of papers, no adjustments for experiment-wise error were reported. With a median of nine inferential tests per paper, the mean experiment-wise error rate was 0.51. A priori power analyses were only reported in 17% of studies. Although effect sizes for correlations and regressions were routinely reported, effect sizes for other tests (χ²-tests, t-tests, ANOVA/MANOVA) were largely absent from the papers. All types of effect sizes were infrequently interpreted. Researchers are strongly encouraged to conduct power analyses when designing studies, and to avoid scattergun approaches to data analysis (i.e. undertaking large numbers of tests in the hope of finding 'significant' results). Because reviewing effect sizes is essential for determining the clinical significance of study findings, researchers would better serve the field of mental health nursing if they reported and interpreted effect sizes. © 2012 The Authors. International Journal of Mental Health Nursing © 2012 Australian College of Mental Health Nurses Inc.
The Power of 'Evidence': Reliable Science or a Set of Blunt Tools?
ERIC Educational Resources Information Center
Wrigley, Terry
2018-01-01
In response to the increasing emphasis on 'evidence-based teaching', this article examines the privileging of randomised controlled trials and their statistical synthesis (meta-analysis). It also pays particular attention to two third-level statistical syntheses: John Hattie's "Visible learning" project and the EEF's "Teaching and…
Wong, Wing-Cheong; Ng, Hong-Kiat; Tantoso, Erwin; Soong, Richie; Eisenhaber, Frank
2018-02-12
Though earlier works on modelling transcript abundance from vertebrates to lower eukaryotes have specifically singled out Zipf's law, the observed distributions often deviate from a single power-law slope. In hindsight, while power-laws of critical phenomena are derived asymptotically under the conditions of infinite observations, real-world observations are finite, where finite-size effects set in to force a power-law distribution into an exponential decay and, consequently, manifest as a curvature (i.e., varying exponent values) in a log-log plot. If transcript abundance is truly power-law distributed, the varying exponent signifies changing mathematical moments (e.g., mean, variance) and creates heteroskedasticity, which compromises statistical rigor in analysis. The impact of this deviation from the asymptotic power-law on sequencing count data has never truly been examined and quantified. The anecdotal description of transcript abundance as almost Zipf's-law-like distributed can be conceptualized as the imperfect mathematical rendition of the Pareto power-law distribution when subjected to finite-size effects in the real world; this holds regardless of advances in sequencing technology, since sampling is finite in practice. Our conceptualization agrees well with our empirical analysis of two modern-day NGS (next-generation sequencing) datasets: an in-house generated dilution miRNA study of two gastric cancer cell lines (NUGC3 and AGS) and a publicly available spike-in miRNA dataset. First, the finite-size effects cause the deviations of sequencing count data from Zipf's law and issues of reproducibility in sequencing experiments. Second, they manifest as heteroskedasticity among experimental replicates, bringing about statistical woes. Surprisingly, a straightforward power-law correction that restores the distribution distortion to a single exponent value can dramatically reduce data heteroskedasticity, invoking an instant increase in signal-to-noise ratio of 50% and in statistical/detection sensitivity of as much as 30%, regardless of the downstream mapping and normalization methods. Most importantly, the power-law correction improves concordance in significant calls among different normalization methods of a data series by 22% on average. When presented with a higher sequencing depth (4-fold difference), the improvement in concordance is asymmetrical (32% for the higher sequencing depth versus 13% for the lower) and demonstrates that the simple power-law correction can increase significant detection at higher sequencing depths. Finally, the correction dramatically enhances the statistical conclusions and elucidates the metastasis potential of the NUGC3 cell line relative to AGS in our dilution analysis. The finite-size effects due to undersampling generally plague transcript count data with reproducibility issues but can be minimized through a simple power-law correction of the count distribution. This distribution correction has direct implications for the biological interpretation of the study and the rigor of the scientific findings. This article was reviewed by Oliviero Carugo, Thomas Dandekar and Sandor Pongor.
The power to detect linkage in complex disease by means of simple LOD-score analyses.
Greenberg, D A; Abreu, P; Hodge, S E
1998-09-01
Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are >=80% of the ELOD under the true model when the ELOD for the true model is >=3. Similarly, the power to reach a given LOD score was usually >=80% that of the true model, when the power under the true model was >=60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage.
[A comparison of convenience sampling and purposive sampling].
Suen, Lee-Jen Wu; Huang, Hui-Man; Lee, Hao-Hsien
2014-06-01
Convenience sampling and purposive sampling are two different sampling methods. This article first explains sampling terms such as target population, accessible population, simple random sampling, intended sample, actual sample, and statistical power analysis. These terms are then used to explain the difference between "convenience sampling" and "purposive sampling". Convenience sampling is a non-probabilistic sampling technique applicable to qualitative or quantitative studies, although it is most frequently used in quantitative studies. In convenience samples, subjects more readily accessible to the researcher are more likely to be included. Thus, in quantitative studies, opportunity to participate is not equal for all qualified individuals in the target population and study results are not necessarily generalizable to this population. As in all quantitative studies, increasing the sample size increases the statistical power of the convenience sample. In contrast, purposive sampling is typically used in qualitative studies. Researchers who use this technique carefully select subjects based on study purpose with the expectation that each participant will provide unique and rich information of value to the study. As a result, members of the accessible population are not interchangeable and sample size is determined by data saturation not by statistical power analysis.
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrices for understanding correlation structure, inverse covariance matrices for network modeling, large-scale simultaneous tests for selecting differentially expressed genes and proteins and genetic markers for complex diseases, and high-dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big Data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
Robbins, L G
2000-01-01
Graduate school programs in genetics have become so full that courses in statistics have often been eliminated. In addition, typical introductory statistics courses for the "statistics user" rather than the nascent statistician are laden with methods for analysis of measured variables while genetic data are most often discrete numbers. These courses are often seen by students and genetics professors alike as largely irrelevant cookbook courses. The powerful methods of likelihood analysis, although commonly employed in human genetics, are much less often used in other areas of genetics, even though current computational tools make this approach readily accessible. This article introduces the MLIKELY.PAS computer program and the logic of do-it-yourself maximum-likelihood statistics. The program itself, course materials, and expanded discussions of some examples that are only summarized here are available at http://www.unisi.it/ricerca/dip/bio_evol/sitomlikely/mlikely.html. PMID:10628965
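As a small taste of the do-it-yourself likelihood analysis the article advocates — ours, not an excerpt from MLIKELY.PAS — here is a likelihood-ratio (G) test of a Mendelian 3:1 segregation ratio on hypothetical counts:

    import numpy as np
    from scipy import stats

    k, n = 110, 420            # recessive offspring out of total (hypothetical counts)
    p0 = 0.25                  # Mendelian 3:1 expectation
    p_hat = k / n              # maximum-likelihood estimate

    def loglik(p):
        """Binomial log-likelihood of k successes in n trials (constant term dropped)."""
        return k * np.log(p) + (n - k) * np.log(1 - p)

    G = 2 * (loglik(p_hat) - loglik(p0))          # likelihood-ratio statistic
    print(G, stats.chi2.sf(G, df=1))              # asymptotic chi-square p-value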
Influence of nonlinear effects on statistical properties of the radiation from SASE FEL
NASA Astrophysics Data System (ADS)
Saldin, E. L.; Schneidmiller, E. A.; Yurkov, M. V.
1998-02-01
The paper presents an analysis of statistical properties of the radiation from a self-amplified spontaneous emission (SASE) free-electron laser operating in the nonlinear mode. The present approach allows one to calculate the following statistical properties of the SASE FEL radiation: time and spectral field correlation functions, distribution of the fluctuations of the instantaneous radiation power, distribution of the energy in the electron bunch, distribution of the radiation energy after a monochromator installed at the FEL amplifier exit, and the radiation spectrum. It has been observed that the statistics of the instantaneous radiation power from a SASE FEL operating in the nonlinear regime changes significantly with respect to the linear regime. All numerical results presented in the paper have been calculated for the 70 nm SASE FEL at the TESLA Test Facility under construction at DESY.
Association analysis of multiple traits by an approach of combining P values.
Chen, Lili; Wang, Yong; Zhou, Yajing
2018-03-01
Increasing evidence shows that one variant can affect multiple traits, a widespread phenomenon in complex diseases. Joint analysis of multiple traits can increase the statistical power of association analysis and uncover the underlying genetic mechanism. Although there are many statistical methods to analyse multiple traits, most of these methods are usually suitable for detecting common variants associated with multiple traits. However, because of the low minor allele frequency of rare variants, these methods are not optimal for rare variant association analysis. In this paper, we extend an adaptive combination of P values method (termed ADA) for a single trait to test association between multiple traits and rare variants in a given region. For a given region, we use a reverse regression model to test each rare variant for association with multiple traits and obtain the P value of the single-variant test. Further, we take the weighted combination of these P values as the test statistic. Extensive simulation studies show that our approach is more powerful than several other comparison methods in most cases and is robust to the inclusion of a high proportion of neutral variants and to the different directions of effects of causal variants.
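The combination step can be sketched generically; this is a plain truncated-product illustration in the spirit of ADA, not the authors' exact statistic (threshold and weights are placeholders, and in practice the significance of the combined statistic is assessed by permutation):

    import numpy as np

    def combined_stat(pvals, weights, threshold=0.1):
        """Weighted combination of the per-variant p-values below a truncation threshold."""
        pvals = np.asarray(pvals, dtype=float)
        keep = pvals <= threshold
        if not keep.any():
            return 0.0
        # Sum of weighted -log p over the truncated set; larger = stronger evidence
        return float(np.sum(weights[keep] * -np.log(pvals[keep])))

    p = np.array([0.004, 0.31, 0.08, 0.72])      # single-variant p-values in the region
    w = np.ones_like(p)                          # placeholder weights (e.g., frequency-based)
    print(combined_stat(p, w))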
Use of power analysis to develop detectable significance criteria for sea urchin toxicity tests
Carr, R.S.; Biedenbach, J.M.
1999-01-01
When sufficient data are available, the statistical power of a test can be determined using power analysis procedures. The term “detectable significance” has been coined to refer to this criterion based on power analysis and past performance of a test. This power analysis procedure has been performed with sea urchin (Arbacia punctulata) fertilization and embryological development data from sediment porewater toxicity tests. Data from 3100 and 2295 tests for the fertilization and embryological development tests, respectively, were used to calculate the criteria and regression equations describing the power curves. Using Dunnett's test, minimum significant differences (MSDs) (β = 0.05) of 15.5% and 19% for the fertilization test, and 16.4% and 20.6% for the embryological development test, for α ≤ 0.05 and α ≤ 0.01, respectively, were determined. The use of this second criterion reduces type I (false positive) errors and helps to establish a critical level of difference based on the past performance of the test.
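For a simple two-group comparison, the same detectable-difference idea can be approximated in closed form; note that Dunnett's multiple-comparison critical values would replace the plain t quantiles in the actual procedure, and the numbers below are illustrative:

    import numpy as np
    from scipy import stats

    def detectable_difference(sd, n_per_group, alpha=0.05, beta=0.05):
        """Approximate minimum detectable difference for a one-sided two-sample comparison."""
        df = 2 * (n_per_group - 1)
        t_a = stats.t.ppf(1 - alpha, df)
        t_b = stats.t.ppf(1 - beta, df)
        return (t_a + t_b) * sd * np.sqrt(2.0 / n_per_group)

    # e.g., 5 replicates per treatment, 12% SD in percent fertilization (made-up values)
    print(detectable_difference(sd=12.0, n_per_group=5))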
Low power and type II errors in recent ophthalmology research.
Khan, Zainab; Milko, Jordan; Iqbal, Munir; Masri, Moness; Almeida, David R P
2016-10-01
To investigate the power of unpaired t tests in prospective, randomized controlled trials when these tests failed to detect a statistically significant difference, and to determine the frequency of type II errors. Systematic review and meta-analysis. We examined all prospective, randomized controlled trials published between 2010 and 2012 in 4 major ophthalmology journals (Archives of Ophthalmology, British Journal of Ophthalmology, Ophthalmology, and American Journal of Ophthalmology). Studies that used unpaired t tests were included. Power was calculated using the number of subjects in each group, standard deviations, and α = 0.05. The difference between control and experimental means was set to be (1) 20% and (2) 50% of the absolute value of the control's initial conditions. Power and Precision version 4.0 software was used to carry out calculations. Finally, the proportion of articles with type II errors was calculated; β = 0.3 was set as the largest acceptable value for the probability of type II errors. In total, 280 articles were screened. The final analysis included 50 prospective, randomized controlled trials using unpaired t tests. The median power of tests to detect a 50% difference between means was 0.9 and was the same for all 4 journals regardless of the statistical significance of the test. The median power of tests to detect a 20% difference between means ranged from 0.26 to 0.9 across the 4 journals. The median power of these tests to detect a 50% and 20% difference between means was 0.9 and 0.5, respectively, for tests that did not achieve statistical significance. A total of 14% and 57% of articles with negative unpaired t tests contained results with β > 0.3 when power was calculated for differences between means of 50% and 20%, respectively. A large portion of studies demonstrate high probabilities of type II errors when detecting small differences between means. The power to detect small differences between means varies across journals. It is, therefore, worthwhile for authors to mention the minimum clinically important difference for individual studies, and journals can consider publishing statistical guidelines for authors to use. Day-to-day clinical decisions rely heavily on the evidence base formed by the plethora of studies available to clinicians. Prospective, randomized controlled clinical trials are highly regarded as robust studies and are used to make important clinical decisions that directly affect patient care. The quality of study designs and statistical methods in major clinical journals is improving over time, and researchers and journals are becoming more attentive to the statistical methodologies incorporated by studies. The results of well-designed ophthalmic studies with robust methodologies, therefore, have the ability to modify the ways in which diseases are managed. Copyright © 2016 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
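The power computation described (group sizes, a common standard deviation, α = 0.05, and a difference set to 20% or 50% of the control mean) can be reproduced with standard tools; here is a sketch with invented trial values:

    from statsmodels.stats.power import TTestIndPower

    control_mean, sd, n1, n2 = 30.0, 12.0, 40, 40   # hypothetical trial values
    solver = TTestIndPower()
    for frac in (0.20, 0.50):
        d = frac * control_mean / sd                # standardized effect size
        pw = solver.power(effect_size=d, nobs1=n1, alpha=0.05, ratio=n2 / n1)
        # beta > 0.3 would flag an unacceptably high type II error risk
        print(f"{int(frac * 100)}% difference: power = {pw:.2f}, beta = {1 - pw:.2f}")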
The Application of Bayesian Analysis to Issues in Developmental Research
ERIC Educational Resources Information Center
Walker, Lawrence J.; Gustafson, Paul; Frimer, Jeremy A.
2007-01-01
This article reviews the concepts and methods of Bayesian statistical analysis, which can offer innovative and powerful solutions to some challenging analytical problems that characterize developmental research. In this article, we demonstrate the utility of Bayesian analysis, explain its unique adeptness in some circumstances, address some…
Statistical Analysis of Solar PV Power Frequency Spectrum for Optimal Employment of Building Loads
DOE Office of Scientific and Technical Information (OSTI.GOV)
Olama, Mohammed M; Sharma, Isha; Kuruganti, Teja
In this paper, a statistical analysis of the frequency spectrum of solar photovoltaic (PV) power output is conducted. This analysis quantifies the frequency content that can be used for purposes such as developing optimal employment of building loads and distributed energy resources. One year of solar PV power output data was collected and analyzed using one-second resolution to find ideal bounds and levels for the different frequency components. The annual, seasonal, and monthly statistics of the PV frequency content are computed and illustrated in boxplot format. To examine the compatibility of building loads for PV consumption, a spectral analysis of building loads such as Heating, Ventilation and Air-Conditioning (HVAC) units and water heaters was performed. This defined the bandwidth over which these devices can operate. Results show that nearly all of the PV output (about 98%) is contained within frequencies lower than 1 mHz (equivalent to ~15 min), which is compatible for consumption with local building loads such as HVAC units and water heaters. Medium frequencies in the range of ~15 min to ~1 min are likely to be suitable for consumption by fan equipment of variable air volume HVAC systems that have time constants in the range of few seconds to few minutes. This study indicates that most of the PV generation can be consumed by building loads with the help of proper control strategies, thereby reducing impact on the grid and the size of storage systems.
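The headline figure — the share of PV output variability below 1 mHz — corresponds to a straightforward spectral computation; a minimal sketch on synthetic one-second data follows (the study itself used a year of measurements):

    import numpy as np

    rng = np.random.default_rng(0)
    fs = 1.0                                   # 1 Hz sampling = one-second resolution
    t = np.arange(0.0, 86400.0, 1.0 / fs)      # one synthetic day of data
    # Synthetic PV output: a slow diurnal bump plus fast cloud-induced noise
    pv = np.clip(np.sin(np.pi * t / 86400.0), 0.0, None) + 0.05 * rng.standard_normal(t.size)

    spectrum = np.abs(np.fft.rfft(pv - pv.mean())) ** 2
    freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
    fraction = spectrum[freqs <= 1e-3].sum() / spectrum.sum()
    print(f"Share of spectral energy below 1 mHz: {fraction:.3f}")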
More Powerful Tests of Simple Interaction Contrasts in the Two-Way Factorial Design
ERIC Educational Resources Information Center
Hancock, Gregory R.; McNeish, Daniel M.
2017-01-01
For the two-way factorial design in analysis of variance, the current article explicates and compares three methods for controlling the Type I error rate for all possible simple interaction contrasts following a statistically significant interaction, including a proposed modification to the Bonferroni procedure that increases the power of…
PV System Component Fault and Failure Compilation and Analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klise, Geoffrey Taylor; Lavrova, Olga; Gooding, Renee Lynne
This report describes data collection and analysis of solar photovoltaic (PV) equipment events, which consist of faults and failures that occur during the normal operation of a distributed PV system or PV power plant. We present summary statistics from locations where maintenance data is being collected at various intervals, as well as reliability statistics gathered from that data, consisting of fault/failure distributions and repair distributions for a wide range of PV equipment types.
Statistical analysis of oil percolation through pressboard measured by optical recording
NASA Astrophysics Data System (ADS)
Rogalski, Przemysław; Kozak, Czesław
2017-08-01
The paper presents a measuring station used to measure the percolation of transformer oil through electrotechnical pressboard. The percolation rate of Nytro Taurus insulating oil (manufactured by Nynas) through pressboard made by Pucaro was investigated. Approximately 60 samples of Pucaro pressboard, widely used for insulation of power transformers, were measured. Statistical analysis of the oil percolation times was performed. The measurements made it possible to determine the distribution of capillary diameters occurring in the pressboard.
Advances in Statistical Methods for Substance Abuse Prevention Research
MacKinnon, David P.; Lockwood, Chondra M.
2010-01-01
The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467
Mechanistic Considerations Used in the Development of the PROFIT PCI Failure Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pankaskie, P. J.
A fuel Pellet-Zircaloy Cladding (thermo-mechanical-chemical) Interaction (PCI) failure model for estimating the probability of failure in transient increases in power (PROFIT) was developed. PROFIT is based on (1) standard statistical methods applied to available PCI fuel failure data and (2) a mechanistic analysis of the environmental and strain-rate-dependent stress-versus-strain characteristics of Zircaloy cladding. The statistical analysis of fuel failures attributable to PCI suggested that parameters in addition to power, transient increase in power, and burnup are needed to define PCI fuel failures in terms of probability estimates with known confidence limits. The PROFIT model, therefore, introduces an environmental and strain-rate-dependent strain energy absorption to failure (SEAF) concept to account for the stress-versus-strain anomalies attributable to interstitial-dislocation interaction effects in the Zircaloy cladding. Assuming that the power ramping rate is the operating corollary of strain rate in the Zircaloy cladding, the variables of first-order importance in the PCI fuel failure phenomenon are postulated to be: (1) pre-transient fuel rod power, P_I; (2) transient increase in fuel rod power, ΔP; (3) fuel burnup, Bu; and (4) the constitutive material property of the Zircaloy cladding, SEAF.
Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques
NASA Technical Reports Server (NTRS)
Kuan, Gary M
2008-01-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arcsecond resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss, with each leg monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so that an optical power margin can be budgeted. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
MetaGenyo: a web tool for meta-analysis of genetic association studies.
Martorell-Marugan, Jordi; Toro-Dominguez, Daniel; Alarcon-Riquelme, Marta E; Carmona-Saez, Pedro
2017-12-16
Genetic association studies (GAS) aim to evaluate the association between genetic variants and phenotypes. In the last few years, the number of studies of this type has increased exponentially, but the results are not always reproducible due to experimental designs, low sample sizes, and other methodological errors. In this field, meta-analysis techniques are becoming very popular tools to combine results across studies in order to increase statistical power and to resolve discrepancies in genetic association studies. A meta-analysis summarizes research findings, increases statistical power, and enables the identification of genuine associations between genotypes and phenotypes. Meta-analysis techniques are increasingly used in GAS, but the number of published meta-analyses containing errors is also increasing. Although there are several software packages that implement meta-analysis, none of them is specifically designed for genetic association studies, and in most cases their use requires advanced programming or scripting expertise. We have developed MetaGenyo, a web tool for meta-analysis in GAS. MetaGenyo implements a complete and comprehensive workflow that can be executed in an easy-to-use environment without programming knowledge. MetaGenyo has been developed to guide users through the main steps of a GAS meta-analysis, covering the Hardy-Weinberg test, statistical association for different genetic models, analysis of heterogeneity, testing for publication bias, subgroup analysis, and robustness testing of the results. MetaGenyo is a useful tool to conduct comprehensive genetic association meta-analyses. The application is freely available at http://bioinfo.genyo.es/metagenyo/.
Statistical power and effect sizes of depression research in Japan.
Okumura, Yasuyuki; Sakamoto, Shinji
2011-06-01
Few studies have been conducted on the rationales for using interpretive guidelines for effect size, and most of the previous statistical power surveys have covered broad research domains. The present study aimed to estimate the statistical power and to obtain realistic target effect sizes of depression research in Japan. We systematically reviewed 18 leading journals of psychiatry and psychology in Japan and identified 974 depression studies that were mentioned in 935 articles published between 1990 and 2006. In 392 studies, logistic regression analyses revealed that using clinical populations was independently associated with being a statistical power of <0.80 (odds ratio 5.9, 95% confidence interval 2.9-12.0) and of <0.50 (odds ratio 4.9, 95% confidence interval 2.3-10.5). Of the studies using clinical populations, 80% did not achieve a power of 0.80 or more, and 44% did not achieve a power of 0.50 or more to detect the medium population effect sizes. A predictive model for the proportion of variance explained was developed using a linear mixed-effects model. The model was then used to obtain realistic target effect sizes in defined study characteristics. In the face of a real difference or correlation in population, many depression researchers are less likely to give a valid result than simply tossing a coin. It is important to educate depression researchers in order to enable them to conduct an a priori power analysis. © 2011 The Authors. Psychiatry and Clinical Neurosciences © 2011 Japanese Society of Psychiatry and Neurology.
An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics.
Kim, Junghi; Bai, Yun; Pan, Wei
2015-12-01
We study the problem of testing for single marker-multiple phenotype associations based on genome-wide association study (GWAS) summary statistics without access to individual-level genotype and phenotype data. For most published GWASs, because obtaining summary data is substantially easier than accessing individual-level phenotype and genotype data, while often multiple correlated traits have been collected, the problem studied here has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta-analyzed GWAS dataset with three blood lipid traits and another with sex-stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta-analyzed) genome-wide summary statistics, then extend the method to meta-analysis of multiple sets of genome-wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice as more powerful than or complementary to existing methods. © 2015 WILEY PERIODICALS, INC.
Sex differences in discriminative power of volleyball game-related statistics.
João, Paulo Vicente; Leite, Nuno; Mesquita, Isabel; Sampaio, Jaime
2010-12-01
To identify sex differences in volleyball game-related statistics, the game-related statistics of several World Championships in 2007 (N=132) were analyzed using the software VIS from the International Volleyball Federation. Discriminant analysis was used to identify the game-related statistics which better discriminated performances by sex. The analysis yielded an emphasis on fault serves (SC = -.40), shot spikes (SC = .40), and reception digs (SC = .31). These coefficients indicate that considerable variability was evident in the game-related statistics profiles: men's volleyball games were better associated with terminal actions (errors of service), whereas women's volleyball games were characterized by continuous actions (in defense and attack). These differences may be related to the anthropometric and physiological differences between women and men and their influence on performance profiles.
Experimental Design and Power Calculation for RNA-seq Experiments.
Wu, Zhijin; Wu, Hao
2016-01-01
Power calculation is a critical component of RNA-seq experimental design. The flexibility of RNA-seq experiment and the wide dynamic range of transcription it measures make it an attractive technology for whole transcriptome analysis. These features, in addition to the high dimensionality of RNA-seq data, bring complexity in experimental design, making an analytical power calculation no longer realistic. In this chapter we review the major factors that influence the statistical power of detecting differential expression, and give examples of power assessment using the R package PROPER.
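PROPER is an R package, so as a language-neutral sketch of the simulation logic it embodies — generate negative binomial counts for two groups, test each replicate, and count rejections — the following Python fragment uses placeholder values for depth, dispersion, and fold change:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)

    def nb_counts(n, mean, disp):
        """Negative binomial counts via the gamma-Poisson mixture (var = mean + disp*mean^2)."""
        lam = rng.gamma(shape=1.0 / disp, scale=mean * disp, size=n)
        return rng.poisson(lam)

    def de_power(n_per_group=5, base_mean=100, fold=2.0, disp=0.1, reps=2000, alpha=0.05):
        """Monte Carlo power to detect a fold-change between two groups for one gene."""
        hits = 0
        for _ in range(reps):
            a = nb_counts(n_per_group, base_mean, disp)
            b = nb_counts(n_per_group, base_mean * fold, disp)
            # Crude per-gene test on log counts; real tools fit NB GLMs instead
            if stats.ttest_ind(np.log1p(a), np.log1p(b)).pvalue < alpha:
                hits += 1
        return hits / reps

    print(de_power())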
Bispectral analysis of equatorial spread F density irregularities
NASA Technical Reports Server (NTRS)
Labelle, J.; Lund, E. J.
1992-01-01
Bispectral analysis has been applied to density irregularities at frequencies 5-30 Hz observed with a sounding rocket launched from Peru in March 1983. Unlike the power spectrum, the bispectrum contains statistical information about the phase relations between the Fourier components which make up the waveform. In the case of spread F data from 475 km, the 5-30 Hz portion of the spectrum displays overall enhanced bicoherence relative to that of the background instrumental noise and to that expected due to statistical considerations, implying that the observed f^(-2.5) power-law spectrum has a significant non-Gaussian component. This is consistent with previous qualitative analyses. The bicoherence has also been calculated for simulated equatorial spread F density irregularities in approximately the same wavelength regime, and the resulting bispectrum has some features in common with that of the rocket data. The implications of this analysis for equatorial spread F are discussed, and some future investigations are suggested.
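Bicoherence, the normalized bispectrum used here, can be estimated by segmenting the series, averaging the triple product of Fourier coefficients, and normalizing; a minimal sketch for a single (f1, f2) bin pair (segment length and bin indices are arbitrary):

    import numpy as np

    def bicoherence(x, seglen, i, j):
        """Normalized bispectrum estimate at frequency bins (i, j) from segment averages."""
        segs = x[: len(x) // seglen * seglen].reshape(-1, seglen)
        F = np.fft.rfft(segs * np.hanning(seglen), axis=1)
        triple = F[:, i] * F[:, j] * np.conj(F[:, i + j])
        num = np.abs(triple.mean()) ** 2
        den = (np.abs(F[:, i] * F[:, j]) ** 2).mean() * (np.abs(F[:, i + j]) ** 2).mean()
        return num / den   # near 1 = strong phase coupling, near 0 = Gaussian-like phases

    x = np.random.default_rng(3).standard_normal(32768)   # phase-random noise example
    print(bicoherence(x, seglen=512, i=20, j=35))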
Evolutionary computing for the design search and optimization of space vehicle power subsystems
NASA Technical Reports Server (NTRS)
Kordon, Mark; Klimeck, Gerhard; Hanks, David; Hua, Hook
2004-01-01
Evolutionary computing has proven to be a straightforward and robust approach for optimizing a wide range of difficult analysis and design problems. This paper discusses the application of these techniques to an existing space vehicle power subsystem resource and performance analysis simulation in a parallel processing environment. Our preliminary results demonstrate that this approach has the potential to improve the space system trade study process by allowing engineers to statistically weight subsystem goals of mass, cost, and performance and then automatically size power elements based on anticipated performance of the subsystem rather than on worst-case estimates.
Appropriate Statistical Analysis for Two Independent Groups of Likert-Type Data
ERIC Educational Resources Information Center
Warachan, Boonyasit
2011-01-01
The objective of this research was to determine the robustness and statistical power of three different methods for testing the hypothesis that ordinal samples of five and seven Likert categories come from equal populations. The three methods are the two sample t-test with equal variances, the Mann-Whitney test, and the Kolmogorov-Smirnov test. In…
Generalized Majority Logic Criterion to Analyze the Statistical Strength of S-Boxes
NASA Astrophysics Data System (ADS)
Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan
2012-05-01
The majority logic criterion is applicable in the evaluation process of substitution boxes used in the Advanced Encryption Standard (AES). The performance of modified or advanced substitution boxes is predicted by processing the results of statistical analysis with the majority logic criterion. In this paper, we use the majority logic criterion to analyze some popular and prevailing substitution boxes used in encryption processes. In particular, the majority logic criterion is applied to AES, affine power affine (APA), Gray, Lui J, residue prime, S8 AES, Skipjack, and Xyi substitution boxes. The majority logic criterion is further extended into a generalized majority logic criterion, which has a broader spectrum for analyzing the effectiveness of substitution boxes in image encryption applications. The integral components of the statistical analyses used for the generalized majority logic criterion are derived from results of entropy analysis, contrast analysis, correlation analysis, homogeneity analysis, energy analysis, and mean of absolute deviation (MAD) analysis.
ERIC Educational Resources Information Center
Sinharay, Sandip
2017-01-01
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H[superscript T]" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses
Liu, Ruijie; Holik, Aliaksei Z.; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E.; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.; Ritchie, Matthew E.
2015-01-01
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package. PMID:25925576
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
Zhang, Han; Wheeler, William; Hyland, Paula L; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai
2016-06-01
Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed that 7 out of the 43 pathways identified in European populations remained significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs.
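A sketch of the adaptive rank truncated product idea underlying sARTP, under a simplifying assumption of independent SNP p-values; the published method instead simulates null statistics using genotype correlations estimated from a reference panel, which is not reproduced here.

```python
import numpy as np

def artp_pvalue(pvals, ks=(1, 2, 4), n_null=10000, seed=0):
    """Adaptive rank truncated product over truncation points ks.

    Sketch under an independence assumption; sARTP instead simulates
    null SNP statistics using genotype correlations."""
    rng = np.random.default_rng(seed)
    m = len(pvals)

    def rtp(p):  # log product of the k smallest p-values, for each k
        s = np.sort(p)
        return np.array([np.sum(np.log(s[:k])) for k in ks])

    obs = rtp(np.asarray(pvals))
    null = np.array([rtp(rng.uniform(size=m)) for _ in range(n_null)])

    # per-k p-values for the observed and null replicates (smaller RTP = stronger)
    obs_p = (null <= obs).mean(axis=0)
    null_rank = np.argsort(np.argsort(null, axis=0), axis=0) + 1
    null_p = null_rank / n_null

    # adaptive step: minimum p over ks, calibrated against the null minima
    return (null_p.min(axis=1) <= obs_p.min()).mean()

print(artp_pvalue([0.001, 0.02, 0.3, 0.5, 0.7, 0.9, 0.04, 0.6]))
```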
Optimized design and analysis of preclinical intervention studies in vivo
Laajala, Teemu D.; Jumppanen, Mikael; Huhtaniemi, Riikka; Fey, Vidal; Kaur, Amanpreet; Knuuttila, Matias; Aho, Eija; Oksala, Riikka; Westermarck, Jukka; Mäkelä, Sari; Poutanen, Matti; Aittokallio, Tero
2016-01-01
Recent reports have called into question the reproducibility, validity and translatability of preclinical animal studies due to limitations in their experimental design and statistical analysis. To this end, we implemented a matching-based modelling approach for optimal intervention group allocation, randomization and power calculations, which takes full account of the complex animal characteristics at baseline prior to interventions. In prostate cancer xenograft studies, the method effectively normalized the confounding baseline variability, and resulted in animal allocations which were supported by RNA-seq profiling of the individual tumours. The matching information increased the statistical power to detect true treatment effects at smaller sample sizes in two castration-resistant prostate cancer models, thereby saving both animal lives and research costs. The novel modelling approach and its open-source and web-based software implementations enable researchers to conduct adequately-powered and fully-blinded preclinical intervention studies, with the aim to accelerate the discovery of new therapeutic interventions. PMID:27480578
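A minimal sketch of matched allocation in this spirit, assuming a single baseline covariate: animals are blocked on baseline tumour volume and group labels are randomised within blocks. The published method solves a more general optimal multigroup matching problem over many baseline characteristics, so this is only the simplest special case.

```python
import numpy as np

def matched_allocation(baseline, n_groups=2, seed=0):
    """Sort animals by a baseline covariate, form blocks of size n_groups,
    and randomise group labels within each block."""
    rng = np.random.default_rng(seed)
    order = np.argsort(baseline)            # similar animals end up adjacent
    groups = np.empty(len(baseline), dtype=int)
    for block in np.array_split(order, len(baseline) // n_groups):
        groups[block] = rng.permutation(n_groups)[:len(block)]
    return groups

baseline_volume = np.array([110., 95., 240., 130., 98., 250., 120., 230.])
print(matched_allocation(baseline_volume))
```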
Higher order statistical moment application for solar PV potential analysis
NASA Astrophysics Data System (ADS)
Basri, Mohd Juhari Mat; Abdullah, Samizee; Azrulhisham, Engku Ahmad; Harun, Khairulezuan
2016-10-01
Solar photovoltaic energy is an alternative to fossil fuels, which are being depleted and contribute to global warming. However, this renewable resource is too variable and intermittent to be relied on without careful characterization, so knowledge of the energy potential of a site is essential before building a solar photovoltaic power generation system there. Here, a higher-order statistical moment model is analyzed using data collected from a 5 MW grid-connected photovoltaic system. Because the skewness and kurtosis of the AC power and solar irradiance distributions of the solar farm change dynamically, the Pearson system, in which a probability distribution is selected by matching its theoretical moments to the empirical moments of the data, is well suited for this purpose. Building on the Pearson system implementation in MATLAB, software was developed to support data processing, distribution fitting, and potential analysis for future projections of AC power and solar irradiance availability.
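For concreteness, the four moments that drive a Pearson-system fit can be computed as follows; the data here are a synthetic stand-in for measured AC power, and the fitting step itself (done in MATLAB in the study) is only indicated in a comment.

```python
import numpy as np
from scipy import stats

# toy stand-in for measured AC power (MW) from a PV farm
rng = np.random.default_rng(7)
ac_power = np.clip(rng.gamma(shape=2.0, scale=1.2, size=5000), 0, 5.0)

mean = ac_power.mean()
var = ac_power.var(ddof=1)
skew = stats.skew(ac_power)
kurt = stats.kurtosis(ac_power, fisher=False)  # Pearson kurtosis (normal = 3)

print(f"mean={mean:.3f} var={var:.3f} skewness={skew:.3f} kurtosis={kurt:.3f}")
# In the Pearson system these four moments select the distribution family,
# e.g. via MATLAB's pearsrnd or an equivalent moment-matching routine.
```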
NASA Astrophysics Data System (ADS)
Chung, Moo K.; Kim, Seung-Goo; Schaefer, Stacey M.; van Reekum, Carien M.; Peschke-Schmitz, Lara; Sutterer, Matthew J.; Davidson, Richard J.
2014-03-01
The sparse regression framework has been widely used in medical image processing and analysis. However, it has rarely been used in anatomical studies. We present a sparse shape modeling framework using the Laplace-Beltrami (LB) eigenfunctions of the underlying shape and show its improvement of statistical power. Traditionally, the LB eigenfunctions are used as a basis for intrinsically representing surface shapes as a form of Fourier descriptors. To reduce high-frequency noise, only the first few terms are used in the expansion and higher-frequency terms are simply thrown away. However, some lower-frequency terms may not necessarily contribute significantly to reconstructing the surfaces. Motivated by this idea, we present an LB-based method that filters out only the significant eigenfunctions by imposing a sparse penalty. For dense anatomical data such as deformation fields on a surface mesh, the sparse regression behaves like a smoothing process, which reduces the rate of false negatives and hence improves statistical power. The sparse shape model is then applied to investigate the influence of age on amygdala and hippocampus shapes in the normal population. The advantage of the LB sparse framework is demonstrated by showing the increased statistical power.
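A schematic sketch of the selection idea, with a cosine basis standing in for the LB eigenfunctions (which in practice come from the mesh Laplacian) and an L1 penalty keeping only the significant terms; all names and numbers are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

# 1-D stand-in: a cosine basis plays the role of Laplace-Beltrami eigenfunctions
n, n_basis = 200, 50
t = np.linspace(0, 1, n)
basis = np.cos(np.pi * np.outer(t, np.arange(n_basis)))

rng = np.random.default_rng(0)
true_coef = np.zeros(n_basis)
true_coef[[1, 3, 12]] = [2.0, -1.0, 0.5]      # only a few terms matter
signal = basis @ true_coef + rng.normal(0, 0.3, n)

# the sparse penalty selects significant eigenfunctions instead of
# keeping all low-frequency terms and discarding the rest
model = Lasso(alpha=0.02).fit(basis, signal)
kept = np.flatnonzero(model.coef_)
print("selected eigenfunction indices:", kept)
```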
Common pitfalls in statistical analysis: “No evidence of effect” versus “evidence of no effect”
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
This article is the first in a series exploring common pitfalls in statistical analysis in biomedical research. The power of a clinical trial is its ability to find a difference between treatments when such a difference exists. At the end of the study, the lack of a difference between treatments does not mean that the treatments can be considered equivalent. The distinction between “no evidence of effect” and “evidence of no effect” needs to be understood. PMID:25657905
Statistics used in current nursing research.
Zellner, Kathleen; Boerst, Connie J; Tabb, Wil
2007-02-01
Undergraduate nursing research courses should emphasize the statistics most commonly used in the nursing literature to strengthen students' and beginning researchers' understanding of them. To determine the most commonly used statistics, we reviewed all quantitative research articles published in 13 nursing journals in 2000. The findings supported Beitz's categorization of kinds of statistics. Ten primary statistics used in 80% of nursing research published in 2000 were identified. We recommend that the appropriate use of those top 10 statistics be emphasized in undergraduate nursing education and that the nursing profession continue to advocate for the use of methods (e.g., power analysis, odds ratio) that may contribute to the advancement of nursing research.
Pan-Cancer Analysis of Mutation Hotspots in Protein Domains.
Miller, Martin L; Reznik, Ed; Gauthier, Nicholas P; Aksoy, Bülent Arman; Korkut, Anil; Gao, Jianjiong; Ciriello, Giovanni; Schultz, Nikolaus; Sander, Chris
2015-09-23
In cancer genomics, recurrence of mutations in independent tumor samples is a strong indicator of functional impact. However, rare functional mutations can escape detection by recurrence analysis owing to lack of statistical power. We enhance statistical power by extending the notion of recurrence of mutations from single genes to gene families that share homologous protein domains. Domain mutation analysis also sharpens the functional interpretation of the impact of mutations, as domains more succinctly embody function than entire genes. By mapping mutations in 22 different tumor types to equivalent positions in multiple sequence alignments of domains, we confirm well-known functional mutation hotspots, identify uncharacterized rare variants in one gene that are equivalent to well-characterized mutations in another gene, detect previously unknown mutation hotspots, and provide hypotheses about molecular mechanisms and downstream effects of domain mutations. With the rapid expansion of cancer genomics projects, protein domain hotspot analysis will likely provide many more leads linking mutations in proteins to the cancer phenotype. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Ducariu, A.; Constantin, G. C.; Puscas, N. N.
2005-08-01
In this paper, working in the small-gain approximation and the unsaturated regime, we report original results on the dependence of the Fano factor, the statistical fluctuation, and the spontaneous emission factor, which characterize the photon statistics, on the number of excited modes, the dopant concentration, and the pump power in single- and double-pass Er3+-doped LiNbO3 straight waveguide amplifiers pumped near 1484 nm, using erfc, Gaussian, and constant profiles of the Er3+ ions in the LiNbO3 crystal. We demonstrate that for a 50 mW input pump power the Poisson photon statistics are maintained in these amplifiers for Er ion concentrations below 10^26 m^-3, and that high gains and low noise figures are achievable. The results can be used in the design of optoelectronic integrated circuits.
HAZARDS SUMMARY REPORT FOR A TWO WATT PROMETHIUM-147 FUELED THERMOELECTRIC GENERATOR
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
1959-06-01
Discussions are included of the APU design, vehicle integration, Pm-147 properties, shielding requirements, hazards design criteria, statistical analysis for impact, and radiation protection. The use of Pm-147 makes possible the fabrication of an auxiliary power unit which has applications for low power space missions of <10 watts (electrical). (B.O.G.)
New Statistics for Testing Differential Expression of Pathways from Microarray Data
NASA Astrophysics Data System (ADS)
Siu, Hoicheong; Dong, Hua; Jin, Li; Xiong, Momiao
Exploring biological meaning in microarray data is very important but remains a great challenge. Here, we developed three new statistics: a linear combination test, a quadratic test and a de-correlation test to identify differentially expressed pathways from gene expression profiles. We apply our statistics to two rheumatoid arthritis datasets. Notably, our results reveal three significant pathways and 275 genes in common between the two datasets. The pathways we found are meaningful for uncovering the disease mechanisms of rheumatoid arthritis, which implies that our statistics are a powerful tool in the functional analysis of gene expression data.
Alignments of parity even/odd-only multipoles in CMB
NASA Astrophysics Data System (ADS)
Aluri, Pavan K.; Ralston, John P.; Weltman, Amanda
2017-12-01
We compare the statistics of parity even and odd multipoles of the cosmic microwave background (CMB) sky from Planck full mission temperature measurements. An excess power in odd multipoles compared to even multipoles has previously been found on large angular scales. Motivated by this apparent parity asymmetry, we evaluate directional statistics associated with even compared to odd multipoles, along with their significances. Primary tools are the Power tensor and Alignment tensor statistics. We limit our analysis to the first 60 multipoles i.e. l = [2, 61]. We find no evidence for statistically unusual alignments of even parity multipoles. More than one independent statistic finds evidence for alignments of anisotropy axes of odd multipoles, with a significance equivalent to ∼2σ or more. The robustness of alignment axes is tested by making Galactic cuts and varying the multipole range. Very interestingly, the region spanned by the (a)symmetry axes is found to broadly contain other parity (a)symmetry axes previously observed in the literature.
Smith, Paul F.
2017-01-01
Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins. PMID:29371855
Applications of the DOE/NASA wind turbine engineering information system
NASA Technical Reports Server (NTRS)
Neustadter, H. E.; Spera, D. A.
1981-01-01
A statistical analysis of data obtained from the Technology and Engineering Information Systems was made. The systems analyzed consist of the following elements: (1) sensors which measure critical parameters (e.g., wind speed and direction, output power, blade loads and component vibrations); (2) remote multiplexing units (RMUs) on each wind turbine which frequency-modulate, multiplex and transmit sensor outputs; (3) on-site instrumentation to record, process and display the sensor output; and (4) statistical analysis of data. Two examples of the capabilities of these systems are presented. The first illustrates the standardized format for application of statistical analysis to each directly measured parameter. The second shows the use of a model to estimate the variability of the rotor thrust loading, which is a derived parameter.
Wavelet analysis of polarization maps of polycrystalline biological fluids networks
NASA Astrophysics Data System (ADS)
Ushenko, Y. A.
2011-12-01
An optical model of human joint synovial fluid is proposed. The statistical (statistical moments), correlation (autocorrelation function) and self-similar (log-log dependencies of the power spectrum) structure of two-dimensional polarization distributions (polarization maps) of synovial fluid has been analyzed. It has been shown that differentiating polarization maps of joint synovial fluid samples in different physiological states requires scale-discriminative analysis. To isolate the small-scale domain structure of synovial fluid polarization maps, wavelet analysis has been used. The set of parameters characterizing the statistical, correlation and self-similar structure of the wavelet coefficient distributions at different scales of the polarization domains has been determined for the diagnosis and differentiation of polycrystalline network transformations connected with pathological processes.
Statistical analysis of RHIC beam position monitors performance
NASA Astrophysics Data System (ADS)
Calaga, R.; Tomás, R.
2004-04-01
A detailed statistical analysis of beam position monitors (BPM) performance at RHIC is a critical factor in improving regular operations and future runs. Robust identification of malfunctioning BPMs plays an important role in any orbit or turn-by-turn analysis. Singular value decomposition and Fourier transform methods, which have evolved as powerful numerical techniques in signal processing, will aid in such identification from BPM data. This is the first attempt at RHIC to use a large set of data to statistically enhance the capability of these two techniques and determine BPM performance. A comparison from run 2003 data shows striking agreement between the two methods and hence can be used to improve BPM functioning at RHIC and possibly other accelerators.
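A toy sketch of the SVD side of such an analysis, not the RHIC procedure itself: coherent beam motion concentrates in a few dominant singular modes, so a BPM whose signal is poorly explained by those modes is suspect. The residual threshold is an illustrative heuristic.

```python
import numpy as np

def flag_bad_bpms(data, n_modes=4, thresh=3.0):
    """data: turns x BPMs matrix of orbit readings.
    Reconstruct with the dominant SVD modes and flag BPMs whose
    residual RMS is an outlier (illustrative threshold)."""
    d = data - data.mean(axis=0)
    U, s, Vt = np.linalg.svd(d, full_matrices=False)
    recon = U[:, :n_modes] * s[:n_modes] @ Vt[:n_modes]
    resid_rms = np.sqrt(((d - recon) ** 2).mean(axis=0))
    z = (resid_rms - resid_rms.mean()) / resid_rms.std()
    return np.flatnonzero(z > thresh)

# toy data: coherent betatron-like motion on 100 BPMs, one malfunctioning
rng = np.random.default_rng(2)
turns = np.arange(1000)
phase = rng.uniform(0, 2 * np.pi, 100)
data = np.sin(0.31 * 2 * np.pi * turns[:, None] + phase) + rng.normal(0, 0.05, (1000, 100))
data[:, 17] = rng.normal(0, 2.0, 1000)   # pure-noise channel
print(flag_bad_bpms(data))               # expected to include BPM 17
```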
The Power Prior: Theory and Applications
Ibrahim, Joseph G.; Chen, Ming-Hui; Gwon, Yeongjin; Chen, Fang
2015-01-01
The power prior has been widely used in many applications covering a large number of disciplines. The power prior is intended to be an informative prior constructed from historical data. It has been used in clinical trials, genetics, health care, psychology, environmental health, engineering, economics, and business. It has also been applied for a wide variety of models and settings, both in the experimental design and analysis contexts. In this review article, we give an A to Z exposition of the power prior and its applications to date. We review its theoretical properties, variations in its formulation, statistical contexts for which it has been used, applications, and its advantages over other informative priors. We review models for which it has been used, including generalized linear models, survival models, and random effects models. Statistical areas where the power prior has been used include model selection, experimental design, hierarchical modeling, and conjugate priors. Frequentist properties of power priors in posterior inference are established and a simulation study is conducted to further examine the empirical performance of the posterior estimates with power priors. Real data analyses are given illustrating the power prior as well as the use of the power prior in the Bayesian design of clinical trials. PMID:26346180
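In conjugate settings the power prior has a closed form. A minimal sketch for Bernoulli data with a uniform initial prior, where the historical likelihood is raised to a discounting power a0; all numbers are toy values.

```python
from scipy import stats

# historical data: y0 successes out of n0; current data: y out of n
y0, n0 = 30, 100
y, n = 42, 120
a0 = 0.5   # power prior discounting parameter in [0, 1]

# Beta(1, 1) initial prior; the power prior scales the historical
# likelihood by a0, so the posterior stays conjugate:
post = stats.beta(1 + y + a0 * y0, 1 + (n - y) + a0 * (n0 - y0))
print(f"posterior mean {post.mean():.3f}, 95% interval {post.interval(0.95)}")
```

Setting a0 = 0 discards the historical data entirely, while a0 = 1 pools it at full weight; intermediate values discount it.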
Statistical power for detecting trends with applications to seabird monitoring
Hatch, Shyla A.
2003-01-01
Power analysis is helpful in defining goals for ecological monitoring and evaluating the performance of ongoing efforts. I examined detection standards proposed for population monitoring of seabirds using two programs (MONITOR and TRENDS) specially designed for power analysis of trend data. Neither program models within- and among-years components of variance explicitly and independently, thus an error term that incorporates both components is an essential input. Residual variation in seabird counts consisted of day-to-day variation within years and unexplained variation among years in approximately equal parts. The appropriate measure of error for power analysis is the standard error of estimation (S.E.est) from a regression of annual means against year. Replicate counts within years are helpful in minimizing S.E.est but should not be treated as independent samples for estimating power to detect trends. Other issues include a choice of assumptions about variance structure and selection of an exponential or linear model of population change. Seabird count data are characterized by strong correlations between S.D. and mean, thus a constant CV model is appropriate for power calculations. Time series were fit about equally well with exponential or linear models, but log transformation ensures equal variances over time, a basic assumption of regression analysis. Using sample data from seabird monitoring in Alaska, I computed the number of years required (with annual censusing) to detect trends of -1.4% per year (50% decline in 50 years) and -2.7% per year (50% decline in 25 years). At α = 0.05 and a desired power of 0.9, estimated study intervals ranged from 11 to 69 years depending on species, trend, software, and study design. Power to detect a negative trend of 6.7% per year (50% decline in 10 years) is suggested as an alternative standard for seabird monitoring that achieves a reasonable match between statistical and biological significance.
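A simulation sketch of this kind of power calculation, with illustrative numbers: lognormal errors give a constant CV, and the trend is tested by regressing log counts on year, matching the recommendations above.

```python
import numpy as np
from scipy import stats

def trend_power(years, trend=-0.014, cv=0.3, n_sim=2000, alpha=0.05, seed=0):
    """Power to detect an exponential trend from annual counts with constant CV."""
    rng = np.random.default_rng(seed)
    t = np.arange(years)
    sigma = np.sqrt(np.log(1 + cv ** 2))   # lognormal sigma for a given CV
    hits = 0
    for _ in range(n_sim):
        counts = 1000 * np.exp(trend * t) * rng.lognormal(0, sigma, years)
        res = stats.linregress(t, np.log(counts))
        hits += res.pvalue < alpha
    return hits / n_sim

for years in (10, 20, 30):
    print(years, "years:", trend_power(years))
```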
[Statistical analysis of German radiologic periodicals: developmental trends in the last 10 years].
Golder, W
1999-09-01
To identify which statistical tests are applied in German radiological publications, to what extent their use has changed during the last decade, and which factors might be responsible for this development. The major articles published in "ROFO" and "DER RADIOLOGE" during 1988, 1993 and 1998 were reviewed for statistical content. The contributions were classified by principal focus and radiological subspecialty. The methods used were assigned to descriptive, basal and advanced statistics. Sample size, significance level and power were established. The use of experts' assistance was monitored. Finally, we calculated the so-called cumulative accessibility of the publications. 525 contributions were found to be eligible. In 1988, 87% used descriptive statistics only, 12.5% basal, and 0.5% advanced statistics. The corresponding figures in 1993 and 1998 are 62 and 49%, 32 and 41%, and 6 and 10%, respectively. Statistical techniques were most likely to be used in research on musculoskeletal imaging and articles dedicated to MRI. Six basic categories of statistical methods account for the complete statistical analysis appearing in 90% of the articles. ROC analysis is the single most common advanced technique. Authors make increasingly use of statistical experts' opinion and programs. During the last decade, the use of statistical methods in German radiological journals has fundamentally improved, both quantitatively and qualitatively. Presently, advanced techniques account for 20% of the pertinent statistical tests. This development seems to be promoted by the increasing availability of statistical analysis software.
Moving beyond the Bar Plot and the Line Graph to Create Informative and Attractive Graphics
ERIC Educational Resources Information Center
Larson-Hall, Jenifer
2017-01-01
Graphics are often mistaken for a mere frill in the methodological arsenal of data analysis when in fact they can be one of the simplest and at the same time most powerful methods of communicating statistical information (Tufte, 2001). The first section of the article argues for the statistical necessity of graphs, echoing and amplifying similar…
A statistical power analysis of woody carbon flux from forest inventory data
James A. Westfall; Christopher W. Woodall; Mark A. Hatfield
2013-01-01
At a national scale, the carbon (C) balance of numerous forest ecosystem C pools can be monitored using a stock change approach based on national forest inventory data. Given the potential influence of disturbance events and/or climate change processes, the statistical detection of changes in forest C stocks is paramount to maintaining the net sequestration status of...
Analysis of Emergency Diesel Generators Failure Incidents in Nuclear Power Plants
NASA Astrophysics Data System (ADS)
Hunt, Ronderio LaDavis
In their early years of operation, emergency diesel generators (EDGs) had a minimal rate of demand failures. EDGs are designed to operate as a backup when the main source of electricity has been disrupted. More recently, EDGs have been failing at nuclear power plants (NPPs) around the United States, causing either station blackouts or loss of onsite and offsite power. These failures were of a specific type called demand failures. This thesis evaluated a problem that raised concern in the nuclear industry: the rate of EDG demand failures grew from an average of 1 per year in 1997 to an excessive event of 4 in 2011. To determine when the next such excessive event might occur and its possible causes, two analyses were conducted: a statistical analysis and a root cause analysis. The statistical analysis applied an extreme event probability approach to estimate the year of the next excessive event as well as the probability of that event occurring. The root cause analysis examined potential causes of the excessive event by evaluating EDG manufacturers, aging, policy changes and maintenance practices, and failure components, and investigated the correlation between the demand failure data and historical data. The statistical analysis indicated that an excessive event could be expected within a fixed range of probability, with a wider range of probability from the extreme event probability approach. The demand failure data followed historical statistics for EDG manufacturer, aging, and policy changes and maintenance practices, but pointed to the failure components as a possible cause of the excessive event. Predicting the year and probability of the next excessive demand failure with an acceptable confidence level proved difficult, but it is likely that this type of failure will not be a 100-year event. Notably, the majority of EDG demand failures since 2005 occurred within the main components. Overall, the percentages from this study support the statement that the excessive event was caused by the overall age (wear and tear) of the emergency diesel generators in nuclear power plants. Future work will aim to better determine the return period of the excessive event, once it has occurred a second time, by applying the extreme event probability approach.
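As a back-of-envelope check of this framing (an illustrative assumption, not the thesis's extreme event model): if demand failures arrive as a Poisson process at roughly 1 per year, the chance of 4 or more in a single year implies a return period of around 50 years, consistent with the conclusion that the event is likely not a 100-year event.

```python
from scipy import stats

rate = 1.0                              # average EDG demand failures per year
p_extreme = stats.poisson.sf(3, rate)   # P(X >= 4) in one year, about 0.019
print(f"P(>=4 failures in a year) = {p_extreme:.4f}")
print(f"implied return period ~ {1 / p_extreme:.0f} years")
```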
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, the Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
Williams, L. Keoki; Buu, Anne
2017-01-01
We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher's combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains power at a level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches (dichotomizing all observed phenotypes or treating them as continuous variables) could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis of the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not otherwise be chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
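For reference, the Fisher's combination statistic itself is straightforward; the sketch below uses the chi-squared reference distribution that is valid for independent phenotypes, whereas the paper estimates an empirical null to account for correlation between traits.

```python
import numpy as np
from scipy import stats

pvals = np.array([0.04, 0.20, 0.011])      # per-phenotype association p-values
fisher_stat = -2 * np.sum(np.log(pvals))

# chi-square reference valid only for independent phenotypes;
# with correlated traits the null distribution must be estimated empirically
p_combined = stats.chi2.sf(fisher_stat, df=2 * len(pvals))
print(f"X2 = {fisher_stat:.2f}, combined p = {p_combined:.4f}")
```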
Cheng, Dunlei; Branscum, Adam J; Stamey, James D
2010-07-01
To quantify the impact of ignoring misclassification of a response variable and measurement error in a covariate on statistical power, and to develop software for sample size and power analysis that accounts for these flaws in epidemiologic data. A Monte Carlo simulation-based procedure is developed to illustrate the differences in design requirements and inferences between analytic methods that properly account for misclassification and measurement error to those that do not in regression models for cross-sectional and cohort data. We found that failure to account for these flaws in epidemiologic data can lead to a substantial reduction in statistical power, over 25% in some cases. The proposed method substantially reduced bias by up to a ten-fold margin compared to naive estimates obtained by ignoring misclassification and mismeasurement. We recommend as routine practice that researchers account for errors in measurement of both response and covariate data when determining sample size, performing power calculations, or analyzing data from epidemiological studies. 2010 Elsevier Inc. All rights reserved.
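A minimal Monte Carlo sketch in the spirit of the proposed procedure, with illustrative parameters: the binary response is misclassified with a given sensitivity and specificity, and the power of a naive logistic regression is compared with and without the misclassification.

```python
import numpy as np
import statsmodels.api as sm

def power_sim(n=500, beta=0.5, sens=0.9, spec=0.9, n_sim=500, seed=0):
    rng = np.random.default_rng(seed)
    hits_true = hits_obs = 0
    for _ in range(n_sim):
        x = rng.normal(size=n)
        p = 1 / (1 + np.exp(-(-0.5 + beta * x)))
        y = rng.binomial(1, p)
        # misclassify the response with the given sensitivity/specificity
        flip1 = rng.random(n) > sens
        flip0 = rng.random(n) > spec
        y_obs = np.where(y == 1, np.where(flip1, 0, 1), np.where(flip0, 1, 0))
        X = sm.add_constant(x)
        hits_true += sm.Logit(y, X).fit(disp=0).pvalues[1] < 0.05
        hits_obs += sm.Logit(y_obs, X).fit(disp=0).pvalues[1] < 0.05
    return hits_true / n_sim, hits_obs / n_sim

print(power_sim())  # power with correctly classified vs misclassified outcome
```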
Interim analyses in 2 x 2 crossover trials.
Cook, R J
1995-09-01
A method is presented for performing interim analyses in long term 2 x 2 crossover trials with serial patient entry. The analyses are based on a linear statistic that combines data from individuals observed for one treatment period with data from individuals observed for both periods. The coefficients in this linear combination can be chosen quite arbitrarily, but we focus on variance-based weights to maximize power for tests regarding direct treatment effects. The type I error rate of this procedure is controlled by utilizing the joint distribution of the linear statistics over analysis stages. Methods for performing power and sample size calculations are indicated. A two-stage sequential design involving simultaneous patient entry and a single between-period interim analysis is considered in detail. The power and average number of measurements required for this design are compared to those of the usual crossover trial. The results indicate that, while there is minimal loss in power relative to the usual crossover design in the absence of differential carry-over effects, the proposed design can have substantially greater power when differential carry-over effects are present. The two-stage crossover design can also lead to more economical studies in terms of the expected number of measurements required, due to the potential for early stopping. Attention is directed toward normally distributed responses.
An overview of meta-analysis for clinicians.
Lee, Young Ho
2018-03-01
The number of medical studies being published is increasing exponentially, and clinicians must routinely process large amounts of new information. Moreover, the results of individual studies are often insufficient to provide confident answers, as their results are not consistently reproducible. A meta-analysis is a statistical method for combining the results of different studies on the same topic and it may resolve conflicts among studies. Meta-analysis is being used increasingly and plays an important role in medical research. This review introduces the basic concepts, steps, advantages, and caveats of meta-analysis, to help clinicians understand it in clinical practice and research. A major advantage of a meta-analysis is that it produces a precise estimate of the effect size, with considerably increased statistical power, which is important when the power of the primary study is limited because of a small sample size. A meta-analysis may yield conclusive results when individual studies are inconclusive. Furthermore, meta-analyses investigate the source of variation and different effects among subgroups. In summary, a meta-analysis is an objective, quantitative method that provides less biased estimates on a specific topic. Understanding how to conduct a meta-analysis aids clinicians in the process of making clinical decisions.
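The core arithmetic behind the increased statistical power of a pooled estimate is inverse-variance weighting. A fixed-effect sketch with toy numbers:

```python
import numpy as np
from scipy import stats

# effect sizes and their variances from individual studies (toy numbers)
effects = np.array([0.30, 0.10, 0.45, 0.22])
variances = np.array([0.04, 0.09, 0.06, 0.02])

w = 1 / variances                       # more precise studies weigh more
pooled = np.sum(w * effects) / np.sum(w)
se = np.sqrt(1 / np.sum(w))             # smaller than any single-study SE
z = pooled / se
print(f"pooled effect {pooled:.3f} (SE {se:.3f}), "
      f"p = {2 * stats.norm.sf(abs(z)):.4f}")
```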
Data series embedding and scale invariant statistics.
Michieli, I; Medved, B; Ristov, S
2010-06-01
Data sequences acquired from bio-systems such as human gait data, heart rate interbeat data, or DNA sequences exhibit complex dynamics that is frequently described by a long-memory or power-law decay of the autocorrelation function. One way of characterizing those dynamics is through scale invariant statistics or "fractal-like" behavior. Several methods have been proposed for quantifying the scale invariant parameters of physiological signals. Among them the most common are detrended fluctuation analysis, sample mean variance analyses, power spectral density analysis, R/S analysis, and recently, in the realm of the multifractal approach, wavelet analysis. In this paper it is demonstrated that embedding the time series data in a high-dimensional pseudo-phase space reveals scale invariant statistics in a simple fashion. The procedure is applied to different stride interval data sets from human gait measurement time series (PhysioBank data library). Results show that the introduced mapping adequately separates long-memory from random behavior. Smaller gait data sets were analyzed and scale-free trends for limited scale intervals were successfully detected. The method was verified on artificially produced time series with known scaling behavior and with varying content of noise. The possibility for the method to falsely detect long-range dependence in artificially generated short-range-dependence series was investigated. (c) 2009 Elsevier B.V. All rights reserved.
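Of the methods listed, detrended fluctuation analysis is among the most common; a compact reference implementation follows (the paper's own pseudo-phase-space embedding is not reproduced here).

```python
import numpy as np

def dfa(x, scales):
    """Detrended fluctuation analysis: returns the scaling exponent alpha."""
    y = np.cumsum(x - np.mean(x))          # integrated profile
    flucts = []
    for s in scales:
        n_seg = len(y) // s
        segs = y[:n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        f2 = 0.0
        for seg in segs:                   # linear detrend per window
            coef = np.polyfit(t, seg, 1)
            f2 += np.mean((seg - np.polyval(coef, t)) ** 2)
        flucts.append(np.sqrt(f2 / n_seg))
    alpha = np.polyfit(np.log(scales), np.log(flucts), 1)[0]
    return alpha

rng = np.random.default_rng(3)
white = rng.normal(size=4096)
print("white noise alpha ~ 0.5:", round(dfa(white, [16, 32, 64, 128, 256]), 2))
```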
Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.
Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao
2016-04-01
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single-trait interaction analysis by a single-variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.
NASA Astrophysics Data System (ADS)
Shirasaki, Masato; Nishimichi, Takahiro; Li, Baojiu; Higuchi, Yuichi
2017-04-01
We investigate the information content of various cosmic shear statistics on the theory of gravity. Focusing on the Hu-Sawicki-type f(R) model, we perform a set of ray-tracing simulations and measure the convergence bispectrum, peak counts and Minkowski functionals. We first show that while the convergence power spectrum does have sensitivity to the current value of extra scalar degree of freedom |fR0|, it is largely compensated by a change in the present density amplitude parameter σ8 and the matter density parameter Ωm0. With accurate covariance matrices obtained from 1000 lensing simulations, we then examine the constraining power of the three additional statistics. We find that these probes are indeed helpful to break the parameter degeneracy, which cannot be resolved from the power spectrum alone. We show that especially the peak counts and Minkowski functionals have the potential to rigorously (marginally) detect the signature of modified gravity with the parameter |fR0| as small as 10-5 (10-6) if we can properly model them on small (˜1 arcmin) scale in a future survey with a sky coverage of 1500 deg2. We also show that the signal level is similar among the additional three statistics and all of them provide complementary information to the power spectrum. These findings indicate the importance of combining multiple probes beyond the standard power spectrum analysis to detect possible modifications to general relativity.
In vivo Comet assay--statistical analysis and power calculations of mice testicular cells.
Hansen, Merete Kjær; Sharma, Anoop Kumar; Dybdahl, Marianne; Boberg, Julie; Kulahci, Murat
2014-11-01
The in vivo Comet assay is a sensitive method for evaluating DNA damage. A recurrent concern is how to analyze the data appropriately and efficiently. A popular approach is to summarize the raw data into a summary statistic prior to the statistical analysis. However, consensus on which summary statistic to use has yet to be reached. Another important consideration concerns the assessment of proper sample sizes in the design of Comet assay studies. This study aims to identify a statistic suitably summarizing the % tail DNA of mice testicular samples in Comet assay studies. A second aim is to provide curves for this statistic outlining the number of animals and gels to use. The current study was based on 11 compounds administered via oral gavage in three doses to male mice: CAS no. 110-26-9, CAS no. 512-56-1, CAS no. 111873-33-7, CAS no. 79-94-7, CAS no. 115-96-8, CAS no. 598-55-0, CAS no. 636-97-5, CAS no. 85-28-9, CAS no. 13674-87-8, CAS no. 43100-38-5 and CAS no. 60965-26-6. Testicular cells were examined using the alkaline version of the Comet assay and the DNA damage was quantified as % tail DNA using a fully automatic scoring system. From the raw data 23 summary statistics were examined. A linear mixed-effects model was fitted to the summarized data and the estimated variance components were used to generate power curves as a function of sample size. The statistic that most appropriately summarized the within-sample distributions was the median of the log-transformed data, as it most consistently conformed to the assumptions of the statistical model. Power curves for 1.5-, 2-, and 2.5-fold changes of the highest dose group compared to the control group when 50 and 100 cells were scored per gel are provided to aid in the design of future Comet assay studies on testicular cells. Copyright © 2014 Elsevier B.V. All rights reserved.
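The recommended summary statistic is a one-liner; the small offset below is an assumption to guard against zero values, since the paper's exact zero-handling is not reproduced here.

```python
import numpy as np

# % tail DNA for the cells scored on one gel (toy values)
tail_dna = np.array([0.5, 1.2, 3.4, 0.1, 7.8, 2.2, 0.9])

# small offset before the log to guard against zero % tail DNA values
# (an assumption; the paper's exact zero-handling is not reproduced here)
summary = np.median(np.log(tail_dna + 0.1))
print(f"median of log % tail DNA: {summary:.3f}")
```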
Publication Bias in "Red, Rank, and Romance in Women Viewing Men," by Elliot et al. (2010)
ERIC Educational Resources Information Center
Francis, Gregory
2013-01-01
Elliot et al. (2010) reported multiple experimental findings that the color red modified women's ratings of attractiveness, sexual desirability, and status of a photographed man. An analysis of the reported statistics of these studies indicates that the experiments lack sufficient power to support these claims. Given the power of the experiments,…
Rolland, Y; Bézy-Wendling, J; Duvauferrier, R; Coatrieux, J L
1999-03-01
To demonstrate the usefulness of a model of parenchymal vascularization for evaluating texture analysis methods. Slices with thickness varying from 1 to 4 mm were reformatted from a 3D vascular model corresponding to either normal tissue perfusion or local hypervascularization. Parameters of statistical methods were measured on sixteen 128 x 128 regions of interest, and mean values and standard deviations were calculated. For each parameter, the performances (discriminating power and stability) were evaluated. Among the 11 calculated statistical parameters, three (homogeneity, entropy, mean of gradients) were found to have good discriminating power to differentiate normal perfusion from hypervascularization, but only the gradient mean was found to have good stability with respect to slice thickness. Five parameters (run percentage, run length distribution, long run emphasis, contrast, and gray level distribution) showed intermediate results. Of the remaining three, kurtosis and correlation were found to have little discriminating power, and skewness none. This 3D vascular model, which allows the generation of various examples of vascular textures, is a powerful tool for assessing the performance of texture analysis methods. It improves our knowledge of the methods and should contribute to their a priori choice when designing clinical studies.
Injuries Associated With Hazards Involving Motor Vehicle Power Windows
DOT National Transportation Integrated Search
1997-05-01
National Highway Traffic Safety Administration's (NHTSA) National Center for Statistics and Analysis (NCSA) recently completed a study of data from the Consumer Product Safety Commission's (CPSC) National Electronic Injury Surveillance System (...
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.
Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R
2012-08-01
Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival model and accelerated failure time-model with log-normal, log-logistic and Weibull distributions were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening with increasing missingness in the proportions. The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials. ctekwe@stat.tamu.edu.
Effect of the absolute statistic on gene-sampling gene-set analysis methods.
Nam, Dougu
2017-06-01
Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
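A sketch of a gene-sampling gene-set test built on the absolute gene statistic, with toy data: a set is scored by the mean |t| of its genes and calibrated against randomly sampled gene sets of the same size.

```python
import numpy as np
from scipy import stats

def gene_sampling_pvalue(expr_a, expr_b, set_idx, n_samples=5000, seed=0):
    """expr_a, expr_b: genes x replicates matrices for two conditions.
    Uses the absolute per-gene t-statistic, which curbs the inflated
    false-positive rate of the signed version."""
    rng = np.random.default_rng(seed)
    t = np.abs(stats.ttest_ind(expr_a, expr_b, axis=1).statistic)
    obs = t[set_idx].mean()
    k = len(set_idx)
    null = np.array([t[rng.choice(len(t), k, replace=False)].mean()
                     for _ in range(n_samples)])
    return (null >= obs).mean()

rng = np.random.default_rng(4)
a = rng.normal(0, 1, (1000, 3))
b = rng.normal(0, 1, (1000, 3))
b[:20] += 1.5                            # genes 0..19 are truly altered
print(gene_sampling_pvalue(a, b, np.arange(20)))
```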
Effective Analysis of Reaction Time Data
ERIC Educational Resources Information Center
Whelan, Robert
2008-01-01
Most analyses of reaction time (RT) data are conducted by using the statistical techniques with which psychologists are most familiar, such as analysis of variance on the sample mean. Unfortunately, these methods are usually inappropriate for RT data, because they have little power to detect genuine differences in RT between conditions. In…
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Joint resonant CMB power spectrum and bispectrum estimation
NASA Astrophysics Data System (ADS)
Meerburg, P. Daniel; Münchmeyer, Moritz; Wandelt, Benjamin
2016-02-01
We develop the tools necessary to assess the statistical significance of resonant features in the CMB correlation functions, combining power spectrum and bispectrum measurements. This significance is typically addressed by running a large number of simulations to derive the probability density function (PDF) of the feature-amplitude in the Gaussian case. Although these simulations are tractable for the power spectrum, for the bispectrum they require significant computational resources. We show that, by assuming that the PDF is given by a multivariate Gaussian where the covariance is determined by the Fisher matrix of the sine and cosine terms, we can efficiently produce spectra that are statistically close to those derived from full simulations. By drawing a large number of spectra from this PDF, both for the power spectrum and the bispectrum, we can quickly determine the statistical significance of candidate signatures in the CMB, considering both single frequency and multifrequency estimators. We show that for resonance models, cosmology and foreground parameters have little influence on the estimated amplitude, which allows us to simplify the analysis considerably. A more precise likelihood treatment can then be applied to candidate signatures only. We also discuss a modal expansion approach for the power spectrum, aimed at quickly scanning through large families of oscillating models.
Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie
2013-01-01
Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is not much. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
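A sketch of the procedure with two candidate statistics, the t-test and the Wilcoxon-Mann-Whitney test, calibrated by permutation of group labels as described above; data and replicate counts are illustrative.

```python
import numpy as np
from scipy import stats

def min_p_test(x, y, n_perm=2000, seed=0):
    """Permutation test based on the minimum of t-test and Wilcoxon p-values."""
    rng = np.random.default_rng(seed)

    def min_p(a, b):
        p1 = stats.ttest_ind(a, b).pvalue
        p2 = stats.mannwhitneyu(a, b, alternative='two-sided').pvalue
        return min(p1, p2)

    obs = min_p(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        count += min_p(perm[:len(x)], perm[len(x):]) <= obs
    return count / n_perm   # type I error controlled by the permutation null

rng = np.random.default_rng(5)
print(min_p_test(rng.normal(0, 1, 30), rng.normal(0.8, 1, 30)))
```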
Hurricane Isaac: A Longitudinal Analysis of Storm Characteristics and Power Outage Risk.
Tonn, Gina L; Guikema, Seth D; Ferreira, Celso M; Quiring, Steven M
2016-10-01
In August 2012, Hurricane Isaac, a Category 1 hurricane at landfall, caused extensive power outages in Louisiana. The storm brought high winds, storm surge, and flooding to Louisiana, and power outages were widespread and prolonged. Hourly power outage data for the state of Louisiana were collected during the storm and analyzed. This analysis included correlation of hourly power outage figures by zip code with storm conditions including wind, rainfall, and storm surge using a nonparametric ensemble data mining approach. Results were analyzed to understand how correlation of power outages with storm conditions differed geographically within the state. This analysis provided insight on how rainfall and storm surge, along with wind, contribute to power outages in hurricanes. By conducting a longitudinal study of outages at the zip code level, we were able to gain insight into the causal drivers of power outages during hurricanes. Our analysis showed that the statistical importance of storm characteristic covariates to power outages varies geographically. For Hurricane Isaac, wind speed, precipitation, and previous outages generally had high importance, whereas storm surge had lower importance, even in zip codes that experienced significant surge. The results of this analysis can inform the development of power outage forecasting models, which often focus strictly on wind-related covariates. Our study of Hurricane Isaac indicates that inclusion of other covariates, particularly precipitation, may improve model accuracy and robustness across a range of storm conditions and geography. © 2016 Society for Risk Analysis.
KS and AD Statistical Power via Monte Carlo Simulation
2016-12-01
Statistical power is the probability of correctly rejecting the null hypothesis when the null hypothesis is false. This work determines the statistical power of the Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) tests via Monte Carlo simulation, using real-world data to test the accuracy of the simulation; statistical comparison of these metrics can be necessary when making such a determination.
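The Monte Carlo power computation can be sketched as follows for the KS test (a hypothetical shifted-normal alternative is assumed; the AD case is analogous given an Anderson-Darling implementation that returns p-values):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)

    n, alpha, reps = 50, 0.05, 5000
    rejections = 0
    for _ in range(reps):
        sample = rng.normal(0.3, 1.0, n)      # hypothetical true distribution
        if stats.kstest(sample, "norm").pvalue < alpha:
            rejections += 1
    print(f"estimated KS power = {rejections / reps:.3f}")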
NASA Astrophysics Data System (ADS)
Donges, J. F.; Schleussner, C.-F.; Siegmund, J. F.; Donner, R. V.
2016-05-01
Studying event time series is a powerful approach for analyzing the dynamics of complex systems in many fields of science. In this paper, we describe the method of event coincidence analysis, which provides a framework for quantifying the strength, directionality and time lag of statistical interrelationships between event series. Event coincidence analysis allows one to formulate and test null hypotheses on the origin of the observed interrelationships, including tests based on Poisson processes or, more generally, stochastic point processes with a prescribed inter-event time distribution and other higher-order properties. Applying the framework to country-level observational data yields evidence that flood events have acted as triggers of epidemic outbreaks globally since the 1950s. Facing projected future changes in the statistics of climatic extreme events, statistical techniques such as event coincidence analysis will be relevant for investigating the impacts of anthropogenic climate change on human societies and ecosystems worldwide.
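A toy sketch of the coincidence count and its Poisson-surrogate null (event times and window are invented; the published method also offers analytic null distributions and directional variants):

    import numpy as np

    rng = np.random.default_rng(3)

    def coincidence_rate(a, b, delta_t):
        # Fraction of events in `a` followed by >= 1 event in `b`
        # within the window (t, t + delta_t].
        return np.mean([np.any((b > t) & (b <= t + delta_t)) for t in a])

    floods = np.sort(rng.uniform(0, 1000, 30))          # hypothetical series A
    outbreaks = np.sort(np.concatenate(
        [floods[:10] + 2.0, rng.uniform(0, 1000, 20)])) # hypothetical series B

    obs = coincidence_rate(floods, outbreaks, delta_t=5.0)

    # Null: series B replaced by homogeneous Poisson surrogates.
    null = [coincidence_rate(floods,
                             np.sort(rng.uniform(0, 1000, len(outbreaks))), 5.0)
            for _ in range(2000)]
    print(f"rate = {obs:.2f}, p = {np.mean(np.array(null) >= obs):.4f}")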
[A Review on the Use of Effect Size in Nursing Research].
Kang, Hyuncheol; Yeon, Kyupil; Han, Sang Tae
2015-10-01
The purpose of this study was to introduce the main concepts of statistical testing and effect size and to provide researchers in nursing science with guidance on how to calculate the effect size for the statistical analysis methods mainly used in nursing. For the t-test, analysis of variance, correlation analysis, and regression analysis, which are used frequently in nursing research, the generally accepted definitions of effect size are explained. Some formulae for calculating the effect size are described with several examples from nursing research. Furthermore, the authors present the required minimum sample size for each example utilizing G*Power 3, the most widely used software for calculating sample size. It is noted that statistical significance testing and effect size measurement serve different purposes, and reliance on only one of them may be misleading. Some practical guidelines are recommended for combining statistical significance testing and effect size measures in order to make more balanced decisions in quantitative analyses.
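For instance, Cohen's d for a two-group comparison and the corresponding minimum sample size can be computed in code as well as in G*Power (the group summaries below are invented, and statsmodels stands in for G*Power):

    import numpy as np
    from statsmodels.stats.power import TTestIndPower

    # Cohen's d from two hypothetical group summaries (pooled-SD version).
    m1, m2, s1, s2, n1, n2 = 52.0, 48.0, 10.0, 9.0, 30, 30
    sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp

    # Minimum n per group for 80% power at alpha = .05.
    n_needed = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"d = {d:.2f}, n per group = {int(np.ceil(n_needed))}")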
Power calculator for instrumental variable analysis in pharmacoepidemiology
Walker, Venexia M; Davies, Neil M; Windmeijer, Frank; Burgess, Stephen; Martin, Richard M
2017-01-01
Background: Instrumental variable analysis, for example with physicians' prescribing preferences as an instrument for medications issued in primary care, is an increasingly popular method in the field of pharmacoepidemiology. Existing power calculators for studies using instrumental variable analysis, such as Mendelian randomization power calculators, do not allow for the structure of research questions in this field. This is because the analysis in pharmacoepidemiology will typically have stronger instruments and detect larger causal effects than in other fields. Consequently, there is a need for dedicated power calculators for pharmacoepidemiological research. Methods and Results: The formula for calculating the power of a study using instrumental variable analysis in the context of pharmacoepidemiology is derived and then validated by a simulation study. The formula is applicable for studies using a single binary instrument to analyse the causal effect of a binary exposure on a continuous outcome. An online calculator, as well as packages in both R and Stata, are provided for the implementation of the formula by others. Conclusions: The statistical power of instrumental variable analysis in pharmacoepidemiological studies to detect a clinically meaningful treatment effect is an important consideration. Research questions in this field have distinct structures that must be accounted for when calculating power. The formula presented differs from existing instrumental variable power formulae due to its parametrization, which is designed specifically for ease of use by pharmacoepidemiologists. PMID:28575313
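The exact formula is given in the paper and its calculators; a generic first-order approximation of IV power for a continuous outcome (not the authors' parametrization, and all inputs below are invented) looks like:

    import numpy as np
    from scipy.stats import norm

    def iv_power(n, beta, rho_zx, sd_x, sd_y, alpha=0.05):
        # First-order asymptotics: se(beta_hat) ~ sd_y / (sqrt(n) * rho_zx * sd_x),
        # where rho_zx is the instrument-exposure correlation.
        se = sd_y / (np.sqrt(n) * rho_zx * sd_x)
        return norm.cdf(abs(beta) / se - norm.ppf(1 - alpha / 2))

    # Hypothetical scenario: strong binary instrument (prescribing preference),
    # binary exposure, continuous outcome.
    print(f"power = {iv_power(n=5000, beta=0.2, rho_zx=0.4, sd_x=0.5, sd_y=1.0):.3f}")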
Methods for trend analysis: Examples with problem/failure data
NASA Technical Reports Server (NTRS)
Church, Curtis K.
1989-01-01
Statistics play an important role in quality control and reliability. Accordingly, the NASA standard on trend analysis techniques recommends a variety of statistical methodologies that can be applied to time series data. The major goal of the working handbook, using data from the MSFC Problem Assessment System, is to illustrate some of the techniques in the NASA standard and some additional techniques, and to identify patterns in the data. The techniques used for trend estimation are regression (exponential, power, reciprocal, straight line) and Kendall's rank correlation coefficient. The important details of a statistical strategy for estimating a trend component are covered in the examples. However, careful analysis and interpretation are necessary because of small samples and frequent zero problem reports in a given time period. Further investigations to deal with these issues are being conducted.
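Both recommended techniques are one-liners in modern tooling; a sketch on invented monthly problem-report counts (the real MSFC data are not reproduced here):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)

    months = np.arange(36)
    # Hypothetical counts with a weak downward trend and frequent zeros.
    reports = rng.poisson(np.clip(3.0 - 0.05 * months, 0.2, None))

    tau, p = stats.kendalltau(months, reports)
    print(f"Kendall tau = {tau:.2f}, p = {p:.3f}")

    # A power-curve fit on the log scale, restricted to nonzero counts.
    pos = reports > 0
    slope = np.polyfit(np.log(months[pos] + 1), np.log(reports[pos]), 1)[0]
    print(f"power-law exponent estimate = {slope:.2f}")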
MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.
Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin
2015-04-01
Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power, and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression) and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
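The extended Simes idea builds on the plain Simes combination sketched below (per-SNP p-values are invented; GATES/MGAS additionally replace the counts with effective numbers of tests to account for linkage disequilibrium):

    import numpy as np

    def simes(p_values):
        # Simes-combined gene-level p-value from per-SNP p-values.
        p = np.sort(np.asarray(p_values))
        m = len(p)
        return np.min(m * p / np.arange(1, m + 1))

    print(simes([0.001, 0.04, 0.20, 0.55, 0.90]))  # prints 0.005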
Zipkin, Elise F.; Kinlan, Brian P.; Sussman, Allison; Rypkema, Diana; Wimer, Mark; O'Connell, Allan F.
2015-01-01
Estimating patterns of habitat use is challenging for marine avian species because seabirds tend to aggregate in large groups and it can be difficult to locate both individuals and groups in vast marine environments. We developed an approach to estimate the statistical power of discrete survey events to identify species-specific hotspots and coldspots of long-term seabird abundance in marine environments. We illustrate our approach using historical seabird data from survey transects in the U.S. Atlantic Ocean Outer Continental Shelf (OCS), an area that has been divided into “lease blocks” for proposed offshore wind energy development. For our power analysis, we examined whether discrete lease blocks within the region could be defined as hotspots (3 × mean abundance in the OCS) or coldspots (1/3 ×) for individual species within a given season. For each of 74 species/season combinations, we determined which of eight candidate statistical distributions (ranging in their degree of skewness) best fit the count data. We then used the selected distribution and estimates of regional prevalence to calculate and map statistical power to detect hotspots and coldspots, and estimate the p-value from Monte Carlo significance tests that specific lease blocks are in fact hotspots or coldspots relative to regional average abundance. The power to detect species-specific hotspots was higher than that of coldspots for most species because species-specific prevalence was relatively low (mean: 0.111; SD: 0.110). The number of surveys required for adequate power (> 0.6) was large for most species (tens to hundreds) using this hotspot definition. Regulators may need to accept higher proportional effect sizes, combine species into groups, and/or broaden the spatial scale by combining lease blocks in order to determine optimal placement of wind farms. Our power analysis approach provides a general framework for both retrospective analyses and future avian survey design and is applicable to a broad range of research and conservation problems.
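A stripped-down version of such a power calculation, assuming negative binomial counts with invented mean and dispersion (the study instead selects among eight candidate distributions per species/season):

    import numpy as np

    rng = np.random.default_rng(5)

    regional_mean, k = 2.0, 0.5            # hypothetical mean and dispersion
    n_surveys, alpha, reps = 30, 0.05, 2000

    def draw(mean, size):
        return rng.negative_binomial(k, k / (k + mean), size)

    # Null distribution of the survey mean at regional abundance.
    null_means = draw(regional_mean, (reps, n_surveys)).mean(axis=1)
    crit = np.quantile(null_means, 1 - alpha)

    # Power: how often a true 3x hotspot exceeds the critical value.
    hot_means = draw(3 * regional_mean, (reps, n_surveys)).mean(axis=1)
    print(f"power = {np.mean(hot_means > crit):.2f}")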
ERIC Educational Resources Information Center
Maydeu-Olivares, Alberto; Montano, Rosa
2013-01-01
We investigate the performance of three statistics, R1, R2 (Glas, Psychometrika 53:525-546, 1988), and M2 (Maydeu-Olivares & Joe, J. Am. Stat. Assoc. 100:1009-1020, 2005; Psychometrika 71:713-732, 2006), to assess the overall fit of a one-parameter logistic model…
THE MURCHISON WIDEFIELD ARRAY 21 cm POWER SPECTRUM ANALYSIS METHODOLOGY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobs, Daniel C.; Beardsley, A. P.; Bowman, Judd D.
2016-07-10
We present the 21 cm power spectrum analysis approach of the Murchison Widefield Array Epoch of Reionization project. In this paper, we compare the outputs of multiple pipelines for the purpose of validating statistical limits on emission from cosmological hydrogen at redshifts between 6 and 12. Multiple independent data calibration and reduction pipelines are used to make power spectrum limits on a fiducial night of data. Comparing the outputs of imaging and power spectrum stages highlights differences in calibration, foreground subtraction, and power spectrum calculation. The power spectra found using these different methods span a space defined by the various tradeoffs between speed, accuracy, and systematic control. Lessons learned from comparing the pipelines range from the algorithmic to the prosaically mundane; all demonstrate the many pitfalls of neglecting reproducibility. We briefly discuss the way these different methods attempt to handle the question of evaluating a significant detection in the presence of foregrounds.
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.
Liu, Ruijie; Holik, Aliaksei Z; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E; Asselin-Labat, Marie-Liesse; Smyth, Gordon K; Ritchie, Matthew E
2015-09-03
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
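The weighting idea can be illustrated outside limma with a toy weighted regression (one gene, one degraded sample with an assumed known noise inflation; limma/voom instead estimates all weights from the data):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)

    group = np.repeat([0, 1], 4)
    noise_sd = np.array([1, 1, 1, 1, 1, 1, 1, 3.0])  # last sample degraded
    y = 5 + 0.8 * group + rng.normal(0, noise_sd)

    X = sm.add_constant(group)
    ols = sm.OLS(y, X).fit()
    wls = sm.WLS(y, X, weights=1 / noise_sd**2).fit()  # inverse-variance weights
    print(f"OLS p = {ols.pvalues[1]:.3f}, WLS p = {wls.pvalues[1]:.3f}")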
Matysiak, W; Królikowska-Prasał, I; Staszyc, J; Kifer, E; Romanowska-Sarlej, J
1989-01-01
The studies were performed on 44 white female Wistar rats intratracheally administered suspensions of soil dust and electro-energetic ashes. The electro-energetic ashes were collected from 6 different local heat and power generating plants, while the soil dust came from several random places in our country. A statistical analysis of the body and lung mass of the animals subjected to a single dust or ash insufflation was performed. The analyses revealed statistically significant differences in body and lung mass. The observed differences are connected with the kinds of dust and ash used in the experiment.
Statistical scaling of pore-scale Lagrangian velocities in natural porous media.
Siena, M; Guadagnini, A; Riva, M; Bijeljic, B; Pereira Nunes, J P; Blunt, M J
2014-08-01
We investigate the scaling behavior of sample statistics of pore-scale Lagrangian velocities in two different rock samples, Bentheimer sandstone and Estaillades limestone. The samples are imaged using x-ray computer tomography with micron-scale resolution. The scaling analysis relies on the study of the way qth-order sample structure functions (statistical moments of order q of absolute increments) of Lagrangian velocities depend on separation distances, or lags, traveled along the mean flow direction. In the sandstone block, sample structure functions of all orders exhibit a power-law scaling within a clearly identifiable intermediate range of lags. Sample structure functions associated with the limestone block display two diverse power-law regimes, which we infer to be related to two overlapping spatially correlated structures. In both rocks and for all orders q, we observe linear relationships between logarithmic structure functions of successive orders at all lags (a phenomenon that is typically known as extended power scaling, or extended self-similarity). The scaling behavior of Lagrangian velocities is compared with the one exhibited by porosity and specific surface area, which constitute two key pore-scale geometric observables. The statistical scaling of the local velocity field reflects the behavior of these geometric observables, with the occurrence of power-law-scaling regimes within the same range of lags for sample structure functions of Lagrangian velocity, porosity, and specific surface area.
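The estimator itself is compact; a sketch on a synthetic correlated series (a Brownian stand-in, for which the exponents should come out near q/2, rather than real pore-scale velocities):

    import numpy as np

    rng = np.random.default_rng(7)

    v = np.cumsum(rng.normal(size=4096))      # synthetic 'velocity' series
    lags = np.unique(np.logspace(0, 3, 20).astype(int))

    def structure_function(v, lag, q):
        # S_q(lag) = mean of |v(x + lag) - v(x)|^q
        return np.mean(np.abs(v[lag:] - v[:-lag]) ** q)

    for q in (1, 2, 3):
        S = [structure_function(v, l, q) for l in lags]
        zeta = np.polyfit(np.log(lags), np.log(S), 1)[0]
        print(f"q = {q}: scaling exponent ~ {zeta:.2f}")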
Brandmaier, Andreas M.; von Oertzen, Timo; Ghisletta, Paolo; Lindenberger, Ulman; Hertzog, Christopher
2018-01-01
Latent Growth Curve Models (LGCM) have become a standard technique to model change over time. Prediction and explanation of inter-individual differences in change are major goals in lifespan research. The major determinants of statistical power to detect individual differences in change are the magnitude of true inter-individual differences in linear change (LGCM slope variance), design precision, alpha level, and sample size. Here, we show that design precision can be expressed as the inverse of effective error. Effective error is determined by instrument reliability and the temporal arrangement of measurement occasions. However, it also depends on another central LGCM component, the variance of the latent intercept and its covariance with the latent slope. We derive a new reliability index for LGCM slope variance—effective curve reliability (ECR)—by scaling slope variance against effective error. ECR is interpretable as a standardized effect size index. We demonstrate how effective error, ECR, and statistical power for a likelihood ratio test of zero slope variance formally relate to each other and how they function as indices of statistical power. We also provide a computational approach to derive ECR for arbitrary intercept-slope covariance. With practical use cases, we argue for the complementary utility of the proposed indices of a study's sensitivity to detect slope variance when making a priori longitudinal design decisions or communicating study designs. PMID:29755377
Statistical analyses support power law distributions found in neuronal avalanches.
Klaus, Andreas; Yu, Shan; Plenz, Dietmar
2011-01-01
The size distribution of neuronal avalanches in cortical networks has been reported to follow a power law distribution with exponent close to -1.5, which is a reflection of long-range spatial correlations in spontaneous neuronal activity. However, identifying power law scaling in empirical data can be difficult and sometimes controversial. In the present study, we tested the power law hypothesis for neuronal avalanches by using more stringent statistical analyses. In particular, we performed the following steps: (i) analysis of finite-size scaling to identify scale-free dynamics in neuronal avalanches, (ii) model parameter estimation to determine the specific exponent of the power law, and (iii) comparison of the power law to alternative model distributions. Consistent with critical state dynamics, avalanche size distributions exhibited robust scaling behavior in which the maximum avalanche size was limited only by the spatial extent of sampling ("finite size" effect). This scale-free dynamics suggests the power law as a model for the distribution of avalanche sizes. Using both the Kolmogorov-Smirnov statistic and a maximum likelihood approach, we found the slope to be close to -1.5, which is in line with previous reports. Finally, the power law model for neuronal avalanches was compared to the exponential and to various heavy-tail distributions based on the Kolmogorov-Smirnov distance and by using a log-likelihood ratio test. Both the power law distribution without and with exponential cut-off provided significantly better fits to the cluster size distributions in neuronal avalanches than the exponential, the lognormal and the gamma distribution. In summary, our findings strongly support the power law scaling in neuronal avalanches, providing further evidence for critical state dynamics in superficial layers of cortex.
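The two fitting ingredients, a maximum-likelihood exponent and a KS distance to the fitted law, can be sketched on synthetic data with a known exponent (the avalanche data themselves are not reproduced; the Clauset-style continuous-power-law estimator is used):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)

    # Inverse-CDF sampling from a continuous power law with alpha = 1.5.
    x_min = 1.0
    sizes = x_min * (1 - rng.uniform(size=5000)) ** (-1 / (1.5 - 1))

    # Maximum-likelihood exponent for a continuous power law.
    alpha_hat = 1 + len(sizes) / np.sum(np.log(sizes / x_min))

    # KS distance between the data and the fitted power-law CDF.
    cdf = lambda x: 1 - (x / x_min) ** (1 - alpha_hat)
    ks = stats.kstest(sizes, cdf).statistic
    print(f"alpha_hat = {alpha_hat:.3f}, KS distance = {ks:.4f}")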
Nicol, Samuel; Roach, Jennifer K.; Griffith, Brad
2013-01-01
Over the past 50 years, the number and size of high-latitude lakes have decreased throughout many regions; however, individual lake trends have been variable in direction and magnitude. This spatial heterogeneity in lake change makes statistical detection of temporal trends challenging, particularly in small analysis areas where weak trends are difficult to separate from inter- and intra-annual variability. Factors affecting trend detection include inherent variability, trend magnitude, and sample size. In this paper, we investigated how the statistical power to detect average linear trends in lake size of 0.5, 1.0 and 2.0 %/year was affected by the size of the analysis area and the number of years of monitoring in National Wildlife Refuges in Alaska. We estimated power for large (930–4,560 sq km) study areas within refuges and for 2.6, 12.9, and 25.9 sq km cells nested within study areas over temporal extents of 4–50 years. We found that: (1) trends in study areas could be detected within 5–15 years, (2) trends smaller than 2.0 %/year would take >50 years to detect in cells within study areas, and (3) there was substantial spatial variation in the time required to detect change among cells. Power was particularly low in the smallest cells which typically had the fewest lakes. Because small but ecologically meaningful trends may take decades to detect, early establishment of long-term monitoring will enhance power to detect change. Our results have broad applicability and our method is useful for any study involving change detection among variable spatial and temporal extents.
Handwriting Examination: Moving from Art to Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jarman, K.H.; Hanlen, R.C.; Manzolillo, P.A.
In this document, we present a method for validating the premises and methodology of forensic handwriting examination. This method is intuitively appealing because it relies on quantitative measurements currently used qualitatively by forensic document examiners (FDEs) in making comparisons, and it is scientifically rigorous because it exploits the power of multivariate statistical analysis. This approach uses measures of both central tendency and variation to construct a profile for a given individual. (Central tendency and variation are important for characterizing an individual's writing, and both are currently used by FDEs in comparative analyses.) Once constructed, different profiles are then compared for individuality using cluster analysis; they are grouped so that profiles within a group cannot be differentiated from one another based on the measured characteristics, whereas profiles between groups can. The cluster analysis procedure used here exploits the power of multivariate hypothesis testing. The result is not only a profile grouping but also an indication of the statistical significance of the groups generated.
Proceedings: USACERL/ASCE First Joint Conference on Expert Systems, 29-30 June 1988
1989-01-01
KNOWLEDGE-BASED GRAPHIC DIALOGUES ... ABSTRACTS ACCEPTED FOR PUBLICATION: MAD, AN EXPERT ... A methodology of inductive shallow modeling was developed. Inductive systems may become powerful shallow modeling tools applicable to a large class of ... analysis was conducted using a statistical package, Trajectories. Four different types of relationships were analyzed: linear, logarithmic, power, and ...
Linear and Nonlinear Analysis of Brain Dynamics in Children with Cerebral Palsy
ERIC Educational Resources Information Center
Sajedi, Firoozeh; Ahmadlou, Mehran; Vameghi, Roshanak; Gharib, Masoud; Hemmati, Sahel
2013-01-01
This study was carried out to determine linear and nonlinear changes in brain dynamics and their relationships with motor dysfunction in children with cerebral palsy (CP). For this purpose, the power of EEG frequency bands (as a linear analysis) and EEG fractality (as a nonlinear analysis) were computed in the eyes-closed resting state and statistically compared between 26…
ERIC Educational Resources Information Center
Aragón, Sonia; Lapresa, Daniel; Arana, Javier; Anguera, M. Teresa; Garzón, Belén
2017-01-01
Polar coordinate analysis is a powerful data reduction technique based on the Zsum statistic, which is calculated from adjusted residuals obtained by lag sequential analysis. Its use has been greatly simplified since the addition of a module in the free software program HOISAN for performing the necessary computations and producing…
The power prior: theory and applications.
Ibrahim, Joseph G; Chen, Ming-Hui; Gwon, Yeongjin; Chen, Fang
2015-12-10
The power prior has been widely used in many applications covering a large number of disciplines. The power prior is intended to be an informative prior constructed from historical data. It has been used in clinical trials, genetics, health care, psychology, environmental health, engineering, economics, and business. It has also been applied for a wide variety of models and settings, both in the experimental design and analysis contexts. In this review article, we give an A-to-Z exposition of the power prior and its applications to date. We review its theoretical properties, variations in its formulation, statistical contexts for which it has been used, applications, and its advantages over other informative priors. We review models for which it has been used, including generalized linear models, survival models, and random effects models. Statistical areas where the power prior has been used include model selection, experimental design, hierarchical modeling, and conjugate priors. Frequentist properties of power priors in posterior inference are established, and a simulation study is conducted to further examine the empirical performance of the posterior estimates with power priors. Real data analyses are given illustrating the power prior as well as the use of the power prior in the Bayesian design of clinical trials. Copyright © 2015 John Wiley & Sons, Ltd.
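For reference, the canonical power prior raises the historical-data likelihood to a discounting power a_0, with D_0 the historical data and \pi_0(\theta) the initial prior:

    \pi(\theta \mid D_0, a_0) \propto L(\theta \mid D_0)^{a_0}\, \pi_0(\theta),
    \qquad 0 \le a_0 \le 1,

so a_0 = 0 discards the historical data entirely, while a_0 = 1 pools them with full weight.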
Solar activity and economic fundamentals: Evidence from 12 geographically disparate power grids
NASA Astrophysics Data System (ADS)
Forbes, Kevin F.; St. Cyr, O. C.
2008-10-01
This study uses local (ground-based) magnetometer data as a proxy for geomagnetically induced currents (GICs) to address whether there is a space weather/electricity market relationship in 12 geographically disparate power grids: Eirgrid, the power grid that serves the Republic of Ireland; Scottish and Southern Electricity, the power grid that served northern Scotland until April 2005; Scottish Power, the power grid that served southern Scotland until April 2005; the power grid that serves the Czech Republic; E.ON Netz, the transmission system operator in central Germany; the power grid in England and Wales; the power grid in New Zealand; the power grid that serves the vast proportion of the population in Australia; ISO New England, the power grid that serves New England; PJM, a power grid that over the sample period served all or parts of Delaware, Maryland, New Jersey, Ohio, Pennsylvania, Virginia, West Virginia, and the District of Columbia; NYISO, the power grid that serves New York State; and the power grid in the Netherlands. This study tests the hypothesis that GIC levels (proxied by the time variation of local magnetic field measurements (dH/dt)) and electricity grid conditions are related using Pearson's chi-squared statistic. The metrics of power grid conditions include measures of electricity market imbalances, energy losses, congestion costs, and actions by system operators to restore grid stability. The results of the analysis indicate that real-time market conditions in these power grids are statistically related with the GIC proxy.
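The core test is a standard contingency-table chi-squared; a sketch with an invented 2x2 cross-tabulation of hours (the study's actual metrics and thresholds are richer):

    import numpy as np
    from scipy.stats import chi2_contingency

    # Rows: quiet vs disturbed geomagnetic hours (high dH/dt as GIC proxy);
    # columns: normal vs stressed grid conditions. Counts are hypothetical.
    table = np.array([[120, 60],
                      [45, 55]])
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")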
Spectral Analysis of B Stars: An Application of Bayesian Statistics
NASA Astrophysics Data System (ADS)
Mugnes, J.-M.; Robert, C.
2012-12-01
To better understand the processes involved in stellar physics, it is necessary to obtain accurate stellar parameters (effective temperature, surface gravity, abundances…). Spectral analysis is a powerful tool for investigating stars, but it is also vital to reduce uncertainties at a decent computational cost. Here we present a spectral analysis method based on a combination of Bayesian statistics and grids of synthetic spectra obtained with TLUSTY. This method simultaneously constrains the stellar parameters by using all the lines accessible in observed spectra and thus greatly reduces uncertainties and improves the overall spectrum fitting. Preliminary results are shown using spectra from the Observatoire du Mont-Mégantic.
Computerized EEG analysis for studying the effect of drugs on the central nervous system.
Rosadini, G; Cavazza, B; Rodriguez, G; Sannita, W G; Siccardi, A
1977-11-01
Samples of our experience in quantitative pharmaco-EEG are reviewed to discuss and define its applicability and limits. Simple processing systems, such as the computation of Hjorth's descriptors, are useful for on-line monitoring of drug-induced EEG modifications which are evident also on visual analysis. Power spectral analysis is suitable for identifying and quantifying EEG effects not evident on visual inspection. It demonstrated how the EEG effects of compounds in a long-acting formulation vary according to the sampling time and the explored cerebral area. EEG modifications not detected by power spectral analysis can be defined by comparing statistically (F test) the spectral values of the EEG from a single lead at the different samples (longitudinal comparison), or the spectral values from different leads at any sample (intrahemispheric comparison). The presently available procedures of quantitative pharmaco-EEG are effective when applied to the study of multilead EEG recordings in a statistically significant sample of the population. They do not seem reliable for monitoring the direction of neuropsychiatric therapies in single patients, due to individual variability of drug effects.
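Band-limited spectral power of the kind described is routinely computed with Welch's method; a sketch on synthetic data (sampling rate, duration, and band edges are assumptions):

    import numpy as np
    from scipy.signal import welch

    rng = np.random.default_rng(9)

    fs = 256                                  # hypothetical sampling rate (Hz)
    eeg = rng.normal(size=fs * 60)            # one minute of synthetic 'EEG'

    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)

    bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
    for name, (lo, hi) in bands.items():
        mask = (freqs >= lo) & (freqs < hi)
        print(f"{name}: {np.trapz(psd[mask], freqs[mask]):.3f}")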
Spectral analysis of groove spacing on Ganymede
NASA Technical Reports Server (NTRS)
Grimm, R. E.
1984-01-01
The technique used to analyze groove spacing on Ganymede is presented. Data from Voyager images are used to determine the surface topography and position of the grooves. Power spectral estimates are statistically analyzed, and sample data are included.
Kleikers, Pamela W M; Hooijmans, Carlijn; Göb, Eva; Langhauser, Friederike; Rewell, Sarah S J; Radermacher, Kim; Ritskes-Hoitinga, Merel; Howells, David W; Kleinschnitz, Christoph; Schmidt, Harald H H W
2015-08-27
Biomedical research suffers from dramatically poor translational success. For example, in ischemic stroke, a condition with a high medical need, over a thousand experimental drug targets were unsuccessful. Here, we adopt methods from clinical research for a late-stage pre-clinical meta-analysis (MA) and randomized confirmatory trial (pRCT) approach. A profound body of literature suggests NOX2 to be a major therapeutic target in stroke. Systematic review and MA of all available NOX2(-/y) studies revealed a positive publication bias and a lack of statistical power to detect a relevant reduction in infarct size. A fully powered multi-center pRCT rejects NOX2 as a target to improve neurofunctional outcomes or achieve a translationally relevant infarct size reduction. Thus stringent statistical thresholds, reporting of negative data, and a MA-pRCT approach can ensure biomedical data validity and overcome risks of bias.
Enhanced Higgs boson to τ(+)τ(-) search with deep learning.
Baldi, P; Sadowski, P; Whiteson, D
2015-03-20
The Higgs boson is thought to provide the interaction that imparts mass to the fundamental fermions, but while measurements at the Large Hadron Collider (LHC) are consistent with this hypothesis, current analysis techniques lack the statistical power to cross the traditional 5σ significance barrier without more data. Deep learning techniques have the potential to increase the statistical power of this analysis by automatically learning complex, high-level data representations. In this work, deep neural networks are used to detect the decay of the Higgs boson to a pair of tau leptons. A Bayesian optimization algorithm is used to tune the network architecture and training algorithm hyperparameters, resulting in a deep network of eight nonlinear processing layers that improves upon the performance of shallow classifiers even without the use of features specifically engineered by physicists for this application. The improvement in discovery significance is equivalent to an increase in the accumulated data set of 25%.
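A minimal stand-in for such a classifier (synthetic features via scikit-learn, and a far smaller network and sample than the published eight-layer analysis):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import roc_auc_score

    # Synthetic stand-in for low-level collider features.
    X, y = make_classification(n_samples=20000, n_features=25,
                               n_informative=15, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    clf = MLPClassifier(hidden_layer_sizes=(300, 300, 300), max_iter=50,
                        random_state=0).fit(X_tr, y_tr)   # truncated training
    print(f"AUC = {roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]):.3f}")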
Dark Matter interpretation of low energy IceCube MESE excess
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chianese, M.; Miele, G.; Morisi, S., E-mail: chianese@na.infn.it, E-mail: miele@na.infn.it, E-mail: stefano.morisi@na.infn.it
2017-01-01
The 2-year MESE IceCube events show a slight excess in the energy range 10–100 TeV with a maximum local statistical significance of 2.3σ, once a hard astrophysical power law is assumed. A spectral index smaller than 2.2 is indeed suggested by multi-messenger studies related to p-p sources and by the recent IceCube analysis of 6-year up-going muon neutrinos. In the present paper, we propose a two-component scenario where the extraterrestrial neutrinos are explained in terms of an astrophysical power law and a Dark Matter signal. We consider both decaying and annihilating Dark Matter candidates with different final states (quarks and leptons) and different halo density profiles. We perform a likelihood-ratio analysis that provides a statistical significance up to 3.9σ for a Dark Matter interpretation of the IceCube low energy excess.
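The quoted sigmas come from a likelihood-ratio test statistic; a generic Wilks-style conversion (the log-likelihoods and degrees of freedom below are invented):

    from scipy.stats import chi2, norm

    lnL_null = -1234.5    # hypothetical fit: astrophysical power law only
    lnL_alt = -1226.9     # hypothetical fit: power law + Dark Matter
    ts = 2 * (lnL_alt - lnL_null)

    dof = 2               # hypothetical extra parameters (e.g., mass, lifetime)
    p = chi2.sf(ts, dof)
    print(f"TS = {ts:.1f}, p = {p:.2e}, significance = {norm.isf(p):.1f} sigma")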
Lin, Yuan-Pin; Duann, Jeng-Ren; Chen, Jyh-Horng; Jung, Tzyy-Ping
2010-04-21
This study explores the electroencephalographic (EEG) correlates of emotional experience during music listening. Independent component analysis and analysis of variance were used to separate statistically independent spectral changes of the EEG in response to music-induced emotional processes. An independent brain process with an equivalent dipole located in the fronto-central region exhibited distinct δ-band and θ-band power changes associated with self-reported emotional states. Specifically, emotional valence was associated with δ-power decreases and θ-power increases in the frontal-central area, whereas emotional arousal was accompanied by increases in both δ and θ power. The resulting emotion-related component activations, being less contaminated by activity from other brain processes, complement previous EEG studies of emotion perception during music listening.
Statistical Analysis of Zebrafish Locomotor Response.
Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai
2015-01-01
Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure.
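The central test is easily written out; a two-sample Hotelling's T-squared sketch on invented five-bin activity profiles for two strains:

    import numpy as np
    from scipy.stats import f

    def hotelling_two_sample(X, Y):
        # Two-sample Hotelling's T2 for equality of multivariate means.
        n1, p = X.shape
        n2 = Y.shape[0]
        d = X.mean(axis=0) - Y.mean(axis=0)
        S = ((n1 - 1) * np.cov(X.T) + (n2 - 1) * np.cov(Y.T)) / (n1 + n2 - 2)
        t2 = n1 * n2 / (n1 + n2) * d @ np.linalg.solve(S, d)
        F = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
        return t2, f.sf(F, p, n1 + n2 - p - 1)

    rng = np.random.default_rng(10)
    strain_a = rng.normal(0.0, 1.0, (24, 5))   # hypothetical activity profiles
    strain_b = rng.normal(0.5, 1.0, (24, 5))
    t2, p_val = hotelling_two_sample(strain_a, strain_b)
    print(f"T2 = {t2:.2f}, p = {p_val:.4f}")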
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cheung, WanYin; Zhang, Jie; Florita, Anthony
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output, as opposed to 7.44% from the forecasted solar irradiance.
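The headline metric is simple to reproduce; a sketch of NRMSE normalized by plant capacity (one common convention; normalization by the mean or range is also used, and the values below are invented):

    import numpy as np

    def nrmse(forecast, actual, capacity=51.0):    # 51 kW per the study
        return np.sqrt(np.mean((forecast - actual) ** 2)) / capacity

    actual = np.array([10.0, 25.0, 40.0, 32.0])    # hypothetical power (kW)
    forecast = np.array([12.0, 22.0, 44.0, 30.0])
    print(f"NRMSE = {100 * nrmse(forecast, actual):.1f}%")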
Prasifka, J R; Hellmich, R L; Dively, G P; Higgins, L S; Dixon, P M; Duan, J J
2008-02-01
One of the possible adverse effects of transgenic insecticidal crops is the unintended decline in the abundance of nontarget arthropods. Field trials designed to evaluate potential nontarget effects can be more complex than expected because decisions to conduct field trials and the selection of taxa to include are not always guided by the results of laboratory tests. Also, recent studies emphasize the potential for indirect effects (adverse impacts on nontarget arthropods that do not feed directly on plant tissues), which are difficult to predict because of interactions among nontarget arthropods, target pests, and transgenic crops. As a consequence, field studies may attempt to monitor expansive lists of arthropod taxa, making the design of such broad studies more difficult and reducing the likelihood of detecting any negative effects that might be present. To improve the taxonomic focus and statistical rigor of future studies, existing field data and corresponding power analyses may provide useful guidance. Analysis of control data from several nontarget field trials using repeated-measures designs suggests that while detection of small effects may require considerable increases in replication, there are taxa from different ecological roles that are sampled effectively using standard methods. The use of statistical power to guide the selection of taxa for nontarget trials reflects scientists' inability to predict the complex interactions among arthropod taxa, particularly when laboratory trials fail to provide guidance on which groups are more likely to be affected. However, scientists may still exercise judgment, including taxa that are not included in or supported by power analyses.
Statistical modeling to support power system planning
NASA Astrophysics Data System (ADS)
Staid, Andrea
This dissertation focuses on data-analytic approaches that improve our understanding of power system applications to promote better decision-making. It tackles issues of risk analysis, uncertainty management, resource estimation, and the impacts of climate change. Tools of data mining and statistical modeling are used to bring new insight to a variety of complex problems facing today's power system. The overarching goal of this research is to improve the understanding of the power system risk environment for improved operation, investment, and planning decisions. The first chapter introduces some challenges faced in planning for a sustainable power system. Chapter 2 analyzes the driving factors behind the disparity in wind energy investments among states with a goal of determining the impact that state-level policies have on incentivizing wind energy. Findings show that policy differences do not explain the disparities; physical and geographical factors are more important. Chapter 3 extends conventional wind forecasting to a risk-based focus of predicting maximum wind speeds, which are dangerous for offshore operations. Statistical models are presented that issue probabilistic predictions for the highest wind speed expected in a three-hour interval. These models achieve a high degree of accuracy and their use can improve safety and reliability in practice. Chapter 4 examines the challenges of wind power estimation for onshore wind farms. Several methods for wind power resource assessment are compared, and the weaknesses of the Jensen model are demonstrated. For two onshore farms, statistical models outperform other methods, even when very little information is known about the wind farm. Lastly, chapter 5 focuses on the power system more broadly in the context of the risks expected from tropical cyclones in a changing climate. Risks to U.S. power system infrastructure are simulated under different scenarios of tropical cyclone behavior that may result from climate change. The scenario-based approach allows me to address the deep uncertainty present by quantifying the range of impacts, identifying the most critical parameters, and assessing the sensitivity of local areas to a changing risk. Overall, this body of work quantifies the uncertainties present in several operational and planning decisions for power system applications.
A Guided Tour of Mathematical Methods for the Physical Sciences
NASA Astrophysics Data System (ADS)
Snieder, Roel; van Wijk, Kasper
2015-05-01
1. Introduction; 2. Dimensional analysis; 3. Power series; 4. Spherical and cylindrical coordinates; 5. Gradient; 6. Divergence of a vector field; 7. Curl of a vector field; 8. Theorem of Gauss; 9. Theorem of Stokes; 10. The Laplacian; 11. Scale analysis; 12. Linear algebra; 13. Dirac delta function; 14. Fourier analysis; 15. Analytic functions; 16. Complex integration; 17. Green's functions: principles; 18. Green's functions: examples; 19. Normal modes; 20. Potential-field theory; 21. Probability and statistics; 22. Inverse problems; 23. Perturbation theory; 24. Asymptotic evaluation of integrals; 25. Conservation laws; 26. Cartesian tensors; 27. Variational calculus; 28. Epilogue on power and knowledge.
Automated spectral and timing analysis of AGNs
NASA Astrophysics Data System (ADS)
Munz, F.; Karas, V.; Guainazzi, M.
2006-12-01
We have developed an autonomous script that helps the user to automate the XMM-Newton data analysis for the purposes of extensive statistical investigations. We test this approach by examining X-ray spectra of bright AGNs pre-selected from the public database. The event lists extracted in this process were studied further by constructing their energy-resolved Fourier power-spectrum density. This analysis combines energy distributions, light curves, and their power spectra, and it proves useful for assessing the variability patterns present in the data. As another example, an automated search based on the XSPEC package was used to reveal the emission features in the 2-8 keV range.
Toward "Constructing" the Concept of Statistical Power: An Optical Analogy.
ERIC Educational Resources Information Center
Rogers, Bruce G.
This paper presents a visual analogy that may be used by instructors to teach the concept of statistical power in statistical courses. Statistical power is mathematically defined as the probability of rejecting a null hypothesis when that null is false, or, equivalently, the probability of detecting a relationship when it exists. The analogy…
Two-way DF relaying assisted D2D communication: ergodic rate and power allocation
NASA Astrophysics Data System (ADS)
Ni, Yiyang; Wang, Yuxi; Jin, Shi; Wong, Kai-Kit; Zhu, Hongbo
2017-12-01
In this paper, we investigate the ergodic rate for a device-to-device (D2D) communication system aided by a two-way decode-and-forward (DF) relay node. We first derive closed-form expressions for the ergodic rate of the D2D link under asymmetric and symmetric cases, respectively. We subsequently discuss two special scenarios: the weak-interference case and the high signal-to-noise-ratio case. We then derive tight approximations for each of the considered scenarios. Assuming that each transmitter only has access to its own statistical channel state information (CSI), we further derive a closed-form power allocation strategy to improve the system performance according to the analytical results for the ergodic rate. Furthermore, some insights are provided for the power allocation strategy based on the analytical results. The strategies are easy to compute and require knowledge only of the channel statistics. Numerical results show the accuracy of the analysis under various conditions and validate the power allocation strategy.
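Closed-form ergodic-rate results of this kind are typically checked by Monte Carlo; a single-link Rayleigh-fading sketch (the paper's two-way DF rate additionally involves both hops and interference, which this toy omits):

    import numpy as np

    rng = np.random.default_rng(11)

    snr = 10 ** (10.0 / 10)                       # 10 dB average SNR
    h = (rng.normal(size=10**6) + 1j * rng.normal(size=10**6)) / np.sqrt(2)
    rate = np.mean(np.log2(1 + snr * np.abs(h) ** 2))
    print(f"ergodic rate ~ {rate:.3f} bit/s/Hz")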
A powerful score-based test statistic for detecting gene-gene co-association.
Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun
2016-01-29
The genetic variants identified by genome-wide association studies (GWAS) can only account for a small proportion of the total heritability of complex disease. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one possible explanation for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, not only due to the traditional interaction under nearly independent conditions but also due to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and specific disease-associated loci will often be highly correlated (e.g., single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs, and co-association levels, the proposed SBS performs better than existing methods, including single-SNP-based and principal component analysis (PCA)-based logistic regression models, statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM), and the delta-square (δ²) statistic. The real-data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.
NASA Technical Reports Server (NTRS)
Djorgovski, George
1993-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
NASA Technical Reports Server (NTRS)
Djorgovski, Stanislav
1992-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
Anota, Amélie; Barbieri, Antoine; Savina, Marion; Pam, Alhousseiny; Gourgou-Bourgade, Sophie; Bonnetain, Franck; Bascoul-Mollevi, Caroline
2014-12-31
Health-Related Quality of Life (HRQoL) is an important endpoint in oncology clinical trials aiming to investigate the clinical benefit of new therapeutic strategies for the patient. However, the longitudinal analysis of HRQoL remains complex and unstandardized. There is clearly a need to propose accessible statistical methods and meaningful results for clinicians. The objective of this study was to compare three strategies for longitudinal analyses of HRQoL data in oncology clinical trials through a simulation study. The methods proposed were: the score and mixed model (SM); a survival analysis approach based on the time to HRQoL score deterioration (TTD); and the longitudinal partial credit model (LPCM). Simulations compared the methods in terms of type I error and the statistical power of the test of an interaction effect between treatment arm and time. Several simulation scenarios were explored based on the EORTC HRQoL questionnaires, varying the number of patients (100, 200 or 300), items (1, 2 or 4) and response categories per item (4 or 7). Five or 10 measurement times were considered, with correlations ranging from low to high between each measure. The impact of informative missing data on these methods was also studied to reflect the reality of most clinical trials. With complete data, the type I error rate was close to the expected value (5%) for all methods, while the SM method was the most powerful, followed by the LPCM. The power of the TTD is low for single-item dimensions, because only four possible values exist for the score. When the number of items increases, the power of the SM approach remains stable, that of the TTD method increases, and the power of the LPCM remains stable. With 10 measurement times, the LPCM was less efficient. With informative missing data, the statistical power of the SM and TTD tended to decrease, while that of the LPCM tended to increase. To conclude, the SM model was the most powerful model, irrespective of the scenario considered and of whether data were missing. The TTD method should be avoided for single-item dimensions of the EORTC questionnaire. While the LPCM model is more adapted to this kind of data, it was less efficient than the SM model. These results warrant validation through comparisons on real data.
Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.
Mi, Gu; Di, Yanming; Schafer, Daniel W
2015-01-01
This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
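One simulation-based check of the NB assumption is a parametric bootstrap of a discrepancy statistic (sketched below on invented covariates and counts, and simplified in that the bootstrap replicates are scored at the original fit rather than refit):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(12)

    x = rng.uniform(0, 1, 200)
    X = sm.add_constant(x)
    y = rng.negative_binomial(2, 2 / (2 + np.exp(1 + x)))  # hypothetical counts

    fit = sm.NegativeBinomial(y, X).fit(disp=0)
    mu = np.exp(X @ fit.params[:-1])
    alpha = fit.params[-1]                    # NB2: var = mu + alpha * mu^2

    def pearson(y_obs):
        return np.sum((y_obs - mu) ** 2 / (mu + alpha * mu ** 2))

    n_nb = 1 / alpha                          # numpy's NB parameterization
    boot = [pearson(rng.negative_binomial(n_nb, n_nb / (n_nb + mu)))
            for _ in range(500)]
    print(f"bootstrap p = {np.mean(np.array(boot) >= pearson(y)):.3f}")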
ERIC Educational Resources Information Center
Kennedy, Kate; Peters, Mary; Thomas, Mike
2012-01-01
Value-added analysis is the most robust, statistically significant method available for helping educators quantify student progress over time. This powerful tool also reveals tangible strategies for improving instruction. Built around the work of Battelle for Kids, this book provides a field-tested continuous improvement model for using…
Buu, Anne; Williams, L Keoki; Yang, James J
2018-03-01
We propose a new genome-wide association test for mixed binary and continuous phenotypes that uses an efficient numerical method to estimate the empirical distribution of Fisher's combination statistic under the null hypothesis. Our simulation study shows that the proposed method controls the type I error rate and also maintains its power at the level of the permutation method. More importantly, the computational efficiency of the proposed method is much higher than that of the permutation method. The simulation results also indicate that the power of the test increases when the genetic effect increases, the minor allele frequency increases, and the correlation between responses decreases. The statistical analysis of the database of the Study of Addiction: Genetics and Environment demonstrates that the proposed method, combining multiple phenotypes, can increase the power of identifying markers that may not otherwise be chosen using marginal tests.
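The combination statistic itself is classical; a sketch with two invented p-values (the paper's contribution is estimating the null distribution of T empirically, since correlated phenotypes break the chi-squared reference used below):

    import numpy as np
    from scipy.stats import chi2, combine_pvalues

    p_values = np.array([0.03, 0.20])   # hypothetical binary/continuous tests
    T = -2 * np.sum(np.log(p_values))   # Fisher: T ~ chi2(2k) if independent
    print(f"T = {T:.2f}, naive p = {chi2.sf(T, 2 * len(p_values)):.4f}")
    print(combine_pvalues(p_values, method="fisher"))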
NASA Astrophysics Data System (ADS)
Wang, Fang; Liao, Gui-ping; Li, Jian-hui; Zou, Rui-biao; Shi, Wen
2013-03-01
A novel method, which we call the analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic ρAMF-XA, which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine or not. Our analysis finds that both the price and load time series in the California electricity market express multifractal nature. Moreover, as indicated by the ρAMF-XA statistical test, there is a marked difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market.
Strategies for mapping heterogeneous recessive traits by allele-sharing methods.
Feingold, E; Siegmund, D O
1997-01-01
We investigate strategies for detecting linkage of recessive and partially recessive traits, using sibling pairs and inbred individuals. We assume that a genomewide search is being conducted and that locus heterogeneity of the trait is likely. For sibling pairs, we evaluate the efficiency of different statistics under the assumption that one does not know the true degree of recessiveness of the trait. We recommend a sibling-pair statistic that is a linear compromise between two previously suggested statistics. We also compare the power of sibling pairs to that of more distant relatives, such as cousins. For inbred individuals, we evaluate the power of offspring of different types of matings and compare them to sibling pairs. Over a broad range of trait etiologies, sibling pairs are more powerful than inbred individuals, but for traits caused by very rare alleles, particularly in the case of heterogeneity, inbred individuals can be much more powerful. The models we develop can also be used to examine specific situations other than those we look at. We present this analysis in the idealized context of a dense set of highly polymorphic markers. In general, incorporation of real-world complexities makes inbred individuals, particularly offspring of distant relatives, look slightly less useful than our results imply. PMID:9106544
STR data for 15 autosomal STR markers from Paraná (Southern Brazil).
Alves, Hemerson B; Leite, Fábio P N; Sotomaior, Vanessa S; Rueda, Fábio F; Silva, Rosane; Moura-Neto, Rodrigo S
2014-03-01
Allelic frequencies for 15 autosomal STR loci, together with forensic and statistical parameters, were calculated using the AmpFℓSTR® Identifiler™ kit. All loci were in Hardy-Weinberg equilibrium. The combined power of discrimination and mean power of exclusion were 0.999999999999999999 and 0.9999993, respectively. The MDS plot and NJ tree analysis, generated from the FST matrix, corroborated the notion that the Paraná population is mainly European-derived. The combination of these 15 STR loci represents a powerful strategy for individual identification and parentage analysis in the Paraná population.
Cysique, Lucette A; Waters, Edward K; Brew, Bruce J
2011-11-22
There is conflicting information as to whether antiretroviral drugs with better central nervous system (CNS) penetration (neuroHAART) assist in improving neurocognitive function and suppressing cerebrospinal fluid (CSF) HIV RNA. The current review aims to better synthesise the existing literature by using an innovative two-phase review approach (qualitative and quantitative) to overcome methodological differences between studies. Sixteen studies, all observational, were identified using a standard citation search. They fulfilled the following inclusion criteria: conducted in the HAART era; sample size > 10; treatment effect involving more than one antiretroviral; and no retrospective design. The qualitative phase of the review consisted of (i) a blind assessment rating studies on features such as sample size, statistical methods and definitions of neuroHAART, and (ii) a non-blind assessment of the sensitivity of the neuropsychological methods to HIV-associated neurocognitive disorder (HAND). In the quantitative phase we assessed the statistical power of the studies that achieved a high rating in the qualitative analysis, the objective being to determine the studies' ability to address their proposed research aims. After studies with at least three limitations were excluded in the qualitative phase, six studies remained. All six found a positive effect of neuroHAART on neurocognitive function or CSF HIV suppression, but only two had statistical power of at least 80%. In short, studies assessed as using more rigorous methods found that neuroHAART was effective in improving neurocognitive function and decreasing CSF viral load, but only two of those studies were adequately statistically powered. Because all of these studies were observational, they represent a less compelling evidence base than randomised controlled trials for assessing treatment effects. Therefore, large randomised trials are needed to determine the robustness of any neuroHAART effect. Such trials must be longitudinal, include the full spectrum of HAND, ideally control carefully for co-morbidities, and be based on optimal neuropsychological methods.
NASA Astrophysics Data System (ADS)
Fan, Qingju; Wu, Yonghong
2015-08-01
In this paper, we develop a new method for the multifractal characterization of two-dimensional nonstationary signals, based on detrended fluctuation analysis (DFA). By applying it to two artificially generated signals, a two-component ARFIMA process and a binomial multifractal model, we show that the new method can reliably determine the multifractal scaling behavior of two-dimensional signals. We also illustrate applications of the method in finance and physiology. The results show that the two-dimensional signals under investigation exhibit power-law correlations: the electricity market series, consisting of electricity price and trading volume, is multifractal, while the two-dimensional EEG signal recorded during sleep for a single patient is weakly multifractal. The new method based on detrended fluctuation analysis may add diagnostic power to existing statistical methods.
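For reference, the sketch below implements ordinary one-dimensional DFA, the building block that the paper extends to two-dimensional signals (detrending square patches instead of windows). The scales and polynomial order are arbitrary choices for illustration.

```python
import numpy as np

def dfa_exponent(x, scales=(8, 16, 32, 64, 128), order=1):
    """One-dimensional detrended fluctuation analysis.

    Returns the scaling exponent alpha from F(s) ~ s^alpha.
    """
    y = np.cumsum(x - np.mean(x))              # integrated profile
    F = []
    for s in scales:
        n_seg = len(y) // s
        rms = []
        for i in range(n_seg):
            seg = y[i * s:(i + 1) * s]
            t = np.arange(s)
            # Remove the local polynomial trend within the window.
            trend = np.polyval(np.polyfit(t, seg, order), t)
            rms.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(rms)))
    # Slope of log F(s) versus log s is the scaling exponent.
    alpha, _ = np.polyfit(np.log(scales), np.log(F), 1)
    return alpha

# White noise should give alpha ~ 0.5; persistent (correlated)
# signals give alpha > 0.5.
rng = np.random.default_rng(0)
print(dfa_exponent(rng.standard_normal(4096)))
```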
Heidel, R Eric
2016-01-01
Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.
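As a minimal, hedged example of the a priori calculation described above, for the simplest case of two independent means: the outcome scale and design are fixed by the choice of test, the effect magnitude and variance are folded into a standardized effect size, and the sample size is solved for. The numerical values are placeholders.

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group n needed to detect a medium standardized
# effect (Cohen's d = 0.5) at alpha = 0.05 with 80% power.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,
                                   alpha=0.05,
                                   power=0.80,
                                   alternative='two-sided')
print(round(n_per_group))   # ~64 subjects per group
```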
Uncertainty Analysis of Seebeck Coefficient and Electrical Resistivity Characterization
NASA Technical Reports Server (NTRS)
Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred
2014-01-01
In order to provide a complete description of a material's thermoelectric power factor, an uncertainty interval is required in addition to the measured nominal value. The uncertainty may contain sources of measurement error including systematic bias error and precision error of a statistical nature. This work focuses specifically on the popular ZEM-3 (Ulvac Technologies) measurement system, but the methods apply to any measurement system. The analysis accounts for sources of systematic error including sample preparation tolerance, measurement probe placement, thermocouple cold-finger effect, and measurement parameters, in addition to uncertainty of a statistical nature. Complete uncertainty analysis of a measurement system allows for more reliable comparison of measurement data between laboratories.
Quality of statistical reporting in developmental disability journals.
Namasivayam, Aravind K; Yan, Tina; Wong, Wing Yiu Stephanie; van Lieshout, Pascal
2015-12-01
Null hypothesis significance testing (NHST) dominates quantitative data analysis, but its use is controversial and has been heavily criticized. The American Psychological Association has advocated the reporting of effect sizes (ES), confidence intervals (CIs), and statistical power analysis to complement NHST results and provide a more comprehensive understanding of research findings. The aim of this paper is to carry out a sample survey of statistical reporting practices in the two journals with the highest h5-index scores in the areas of developmental disability and rehabilitation. Using a checklist that includes critical recommendations of the American Psychological Association, we examined 100 articles randomly selected from the 456 articles reporting inferential statistics in 2013 in the Journal of Autism and Developmental Disorders (JADD) and Research in Developmental Disabilities (RDD). The results showed that, for both journals, ES were reported only about half the time (JADD 59.3%; RDD 55.87%). These findings are similar to psychology journals, but in stark contrast to ES reporting in education journals (73%). Furthermore, a priori power and sample size determination (JADD 10%; RDD 6%), along with reporting and interpretation of precision measures (CI: JADD 13.33%; RDD 16.67%), were the least reported metrics in these journals, though not dissimilar to journals in other disciplines. To advance the science of developmental disability and rehabilitation and to bridge the research-to-practice divide, reforms in statistical reporting, such as providing supplemental measures to NHST, are clearly needed.
Dzul, Maria C.; Dixon, Philip M.; Quist, Michael C.; Dinsmore, Stephen J.; Bower, Michael R.; Wilson, Kevin P.; Gaines, D. Bailey
2013-01-01
We used variance components to assess the allocation of sampling effort in a hierarchically nested sampling design for ongoing monitoring of early life history stages of the federally endangered Devils Hole pupfish (DHP) (Cyprinodon diabolis). The sampling design for larval DHP comprised surveys (5 days each spring, 2007-2009), events, and plots: each survey consisted of three counting events, in which DHP larvae on nine plots were counted plot by plot. Statistical analysis of larval abundance included three components: (1) evaluation of power for various sample size combinations, (2) comparison of power between fixed and random plot designs, and (3) assessment of yearly differences in the power of the survey. Results indicated that increasing the sample size at the lowest level of sampling was the most realistic option for increasing the survey's power, that fixed plot designs had greater power than random plot designs, and that the power of the larval survey varied by year. This study provides an example of how monitoring efforts may benefit from coupling variance components estimation with power analysis to assess sampling design.
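A minimal Monte Carlo power calculation for a nested design of this kind is sketched below. The variance components and sample sizes are invented, and analyzing event-level means with a t-test is a simplification of the study's variance-components approach; varying the arguments reproduces the kind of allocation trade-off the study assessed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def power_nested(delta, sd_event, sd_count, n_events, n_plots,
                 n_sim=1000, alpha=0.05, rng=rng):
    """Monte Carlo power to detect a mean shift `delta` between two
    groups of surveys in a nested design (counting events, plots
    within events), analyzing event-level means."""
    hits = 0
    for _ in range(n_sim):
        def survey(mean):
            ev = mean + rng.normal(0.0, sd_event, n_events)  # event effects
            counts = ev[:, None] + rng.normal(0.0, sd_count,
                                              (n_events, n_plots))
            return counts.mean(axis=1)
        if stats.ttest_ind(survey(0.0), survey(delta))[1] < alpha:
            hits += 1
    return hits / n_sim

# Adding plots shrinks only the within-event term (sd_count^2 / n_plots);
# the relative sizes of sd_event and sd_count determine whether extra
# events or extra plots per event raise power more.
print(power_nested(delta=1.0, sd_event=0.8, sd_count=2.0,
                   n_events=3, n_plots=9))
```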
Analysis and meta-analysis of single-case designs: an introduction.
Shadish, William R
2014-04-01
The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Load- and skill-related changes in segmental contributions to a weightlifting movement.
Enoka, R M
1988-04-01
A short-duration, high-power weightlifting event was examined to determine whether the ability to lift heavier loads, and variations in the level of skill, were accompanied by quantitative changes in selected aspects of lower extremity joint power-time histories. Six experienced weightlifters, three skilled and three less skilled, performed the double-knee-bend execution of the pull in Olympic weightlifting, a movement lasting almost 1 s. Analysis-of-variance statistics were computed on selected peak and average values of power generated by the three skilled subjects as they lifted three loads (69, 77, and 86% of their competition maximum). The results indicated that the skilled subjects lifted heavier loads by increasing the average power, but not the peak power, about the knee and ankle joints. In addition, the changes with load were more subtle than a mere quantitative scaling and appeared to involve a skill element in the form of variation in the duration of the phases of power production and absorption. Similarly, statistical differences (independent t-test) due to skill did not involve changes in the magnitude of power but rather the temporal organization of the movement. Thus, the ability to successfully execute the double-knee-bend movement depends both on an athlete's ability to generate a sufficient magnitude of joint power and on the ability to organize the phases of power production and absorption into an appropriate temporal sequence.
NASA Technical Reports Server (NTRS)
Jeganathan, M.; Wilson, K. E.; Lesh, J. R.
1996-01-01
Uplink data from recent free-space optical communication experiments carried out between the Table Mountain Facility and the Japanese Engineering Test Satellite are used to study fluctuations caused by beam propagation through the atmosphere. The influence of atmospheric scintillation, beam wander and jitter, and multiple uplink beams on the statistics of power received by the satellite is analyzed and compared to experimental data. Preliminary analysis indicates the received signal obeys an approximate lognormal distribution, as predicted by the weak-turbulence model, but further characterization of other sources of fluctuations is necessary for accurate link predictions.
SPS market analysis. [small solar thermal power systems
NASA Technical Reports Server (NTRS)
Goff, H. C.
1980-01-01
A market analysis task included personal interviews by GE personnel and supplemental mail surveys to acquire statistical data and to identify and measure attitudes, reactions and intentions of prospective small solar thermal power systems (SPS) users. Over 500 firms were contacted, including three ownership classes of electric utilities, industrial firms in the top SIC codes for energy consumption, and design engineering firms. A market demand model was developed which utilizes the data base developed by personal interviews and surveys, and projected energy price and consumption data to perform sensitivity analyses and estimate potential markets for SPS.
MAGMA: Generalized Gene-Set Analysis of GWAS Data
de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle
2015-01-01
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710
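The heart of MAGMA's gene analysis is a joint regression of the phenotype on the SNPs in a gene, so that the gene-level test aggregates multi-marker signal and handles linkage disequilibrium through the joint fit. The sketch below shows that idea on toy data; it omits MAGMA's principal-components projection of the genotype matrix and any covariates, and is not the tool's implementation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Toy data: 500 individuals, 10 SNPs in one gene, one causal variant.
n, m = 500, 10
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
y = 0.3 * G[:, 4] + rng.standard_normal(n)

# Gene-level test: regress the phenotype on all SNPs in the gene at
# once; the overall F-test gives a single gene-based p-value.
fit = sm.OLS(y, sm.add_constant(G)).fit()
print(fit.f_pvalue)
```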
Vibration Response Models of a Stiffened Aluminum Plate Excited by a Shaker
NASA Technical Reports Server (NTRS)
Cabell, Randolph H.
2008-01-01
Numerical models of structural-acoustic interactions are of interest to aircraft designers and the space program. This paper describes a comparison between two energy finite element codes, a statistical energy analysis code, a structural finite element code, and the experimentally measured response of a stiffened aluminum plate excited by a shaker. Different methods for modeling the stiffeners and the power input from the shaker are discussed. The results show that the energy codes (energy finite element and statistical energy analysis) accurately predicted the measured mean square velocity of the plate. In addition, predictions from an energy finite element code had the best spatial correlation with measured velocities. However, predictions from a considerably simpler, single subsystem, statistical energy analysis model also correlated well with the spatial velocity distribution. The results highlight a need for further work to understand the relationship between modeling assumptions and the prediction results.
A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants
Broadaway, K. Alaine; Cutler, David J.; Duncan, Richard; Moore, Jacob L.; Ware, Erin B.; Jhun, Min A.; Bielak, Lawrence F.; Zhao, Wei; Smith, Jennifer A.; Peyser, Patricia A.; Kardia, Sharon L.R.; Ghosh, Debashis; Epstein, Michael P.
2016-01-01
Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy. PMID:26942286
Chiu, Chi-yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-ling; Xiong, Momiao; Fan, Ruzong
2017-01-01
To analyze next-generation sequencing data, multivariate functional linear models are developed for meta-analysis of multiple studies, connecting genetic variant data to multiple quantitative traits while adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropy analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions, based on the Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda, are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and the power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is generally more advantageous to perform multivariate analysis than univariate analysis, and more advantageous to perform meta-analysis of multiple studies than to analyze the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen for at least two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can serve as a benchmark for future work that uses summary statistics to build test statistics for meta-analyzing the data. PMID:28000696
Genetic Alterations in Familial Breast Cancer: Mapping and Cloning Genes Other Than BRCAl
1997-09-01
…predisposition to breast cancer in families. The gene PTEN was successfully cloned by this project, and simultaneously by others (for a different …). …with germline translocations and breast cancer for the identification of tumor suppressor genes. …would limit the statistical power of linkage analysis. Therefore, we decided to integrate linkage analysis with the analysis of germline chromosomal…
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contacts: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
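metaCCA generalizes canonical correlation analysis (CCA) to summary statistics. For orientation, the sketch below computes classical CCA from individual-level data via a whitened cross-covariance SVD; the small ridge term stands in, loosely, for the shrinkage that metaCCA applies, and the toy genotype/phenotype matrices are invented.

```python
import numpy as np

def canonical_correlations(X, Y, ridge=1e-3):
    """Classical CCA: singular values of the whitened cross-covariance
    are the canonical correlations."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Sxx = X.T @ X / n + ridge * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + ridge * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    # Whiten each block with the inverse Cholesky factor.
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx))
    Wy = np.linalg.inv(np.linalg.cholesky(Syy))
    K = Wx @ Sxy @ Wy.T
    return np.linalg.svd(K, compute_uv=False)

rng = np.random.default_rng(5)
z = rng.standard_normal((1000, 1))                       # shared signal
X = np.hstack([z + 0.5 * rng.standard_normal((1000, 1)),
               rng.standard_normal((1000, 2))])          # "genotypes"
Y = np.hstack([z + 0.5 * rng.standard_normal((1000, 1)),
               rng.standard_normal((1000, 1))])          # "phenotypes"
print(canonical_correlations(X, Y)[0])   # leading canonical correlation
```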
Angeler, David G; Viedma, Olga; Moreno, José M
2009-11-01
Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills that gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinates of neighbour matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLA, the Mantel test, and RDA based on linear time trends) on all real data. In addition, RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern could be evaluated in terms of adjusted variance explained. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulations also revealed that a strong emphasis on uniform deterministic change, combined with low variability at other temporal scales, is needed to reduce the statistical power of the RDA-PCNM approach relative to the other methods. Based on its statistical performance and the information content it provides, RDA-PCNM serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-)communities.
An analysis of science versus pseudoscience
NASA Astrophysics Data System (ADS)
Hooten, James T.
2011-12-01
This quantitative study identified distinctive features in archival datasets commissioned by the National Science Foundation (NSF) for Science and Engineering Indicators reports. The dependent variables included education level and scores for science fact knowledge, science process knowledge, and pseudoscience beliefs. The dependent variables were aggregated into nine NSF-defined geographic regions and examined for the years 2004 and 2006; the variables were also examined over all years available in the dataset. Descriptive statistics were computed, and tests for normality and homogeneity of variances were performed using the Statistical Package for the Social Sciences. Analysis of Variance was used to test for statistically significant differences between the nine geographic regions for each of the four dependent variables, with a significance level of 0.05. Tukey post-hoc analysis was used to assess the practical significance of differences between regions, and post-hoc power analysis using G*Power was used to calculate the probability of Type II errors. Tests for correlations among the dependent variables across all years were also performed, with Pearson's r used to indicate the strength of the relationships. Small to medium differences in science literacy and education level were observed between many of the nine U.S. geographic regions. The most significant differences occurred when the West South Central region was compared to the New England and the Pacific regions. Belief in pseudoscience appeared to be distributed evenly across all U.S. geographic regions. Education level was a strong indicator of science literacy regardless of a respondent's region of residence. Recommendations for further study include more in-depth investigation to uncover the nature of the relationship between education level and belief in pseudoscience.
Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.
Festing, M F
2001-01-01
In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.
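As a small illustration of the randomised-block analysis the paper recommends (its own worked examples are in the appendices it cites, which are not reproduced here), the sketch below fits a block-plus-treatment ANOVA to simulated data; the factor sizes and variances are arbitrary.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(11)

# Toy randomised-block experiment: 4 treatments x 5 blocks
# (e.g. runs replicated in time), one observation per cell.
treatments = np.repeat(np.arange(4), 5)
blocks = np.tile(np.arange(5), 4)
effect = np.array([0.0, 0.0, 0.5, 1.0])[treatments]
y = effect + rng.normal(0, 0.3, 5)[blocks] + rng.normal(0, 0.5, 20)

df = pd.DataFrame({"y": y, "treatment": treatments, "block": blocks})
# Blocks enter as a factor, removing between-run variation from the
# error term and thereby increasing the power of the treatment F-test.
fit = smf.ols("y ~ C(treatment) + C(block)", data=df).fit()
print(anova_lm(fit, typ=2))
```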
The resolving power of in vitro genotoxicity assays for cigarette smoke particulate matter.
Scott, K; Saul, J; Crooks, I; Camacho, O M; Dillon, D; Meredith, C
2013-06-01
In vitro genotoxicity assays are often used to compare tobacco smoke particulate matter (PM) from different cigarettes. The quantitative aspect of the comparisons requires appropriate statistical methods and replication levels, to support the interpretation in terms of power and significance. This paper recommends a uniform statistical analysis for the Ames test, mouse lymphoma mammalian cell mutation assay (MLA) and the in vitro micronucleus test (IVMNT); involving a hierarchical decision process with respect to slope, fixed effect and single dose comparisons. With these methods, replication levels of 5 (Ames test TA98), 4 (Ames test TA100), 10 (Ames test TA1537), 6 (MLA) and 4 (IVMNT) resolved a 30% difference in PM genotoxicity. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Hu, Chongqing; Li, Aihua; Zhao, Xingyang
2011-02-01
This paper proposes a multivariate statistical analysis approach to processing the instantaneous engine speed signal for the purpose of locating multiple misfire events in internal combustion engines. The state of each cylinder is described by a characteristic vector extracted from the instantaneous engine speed signal following a three-step procedure. These characteristic vectors are treated as the values of process parameters over an engine cycle, so that the detection of misfire events and the identification of misfiring cylinders can be accomplished by a principal component analysis (PCA) based pattern recognition methodology. The proposed algorithm can be implemented easily in practice because the threshold can be defined adaptively, without information about operating conditions. In addition, the effect of torsional vibration on the engine speed waveform is interpreted as the presence of a super-powerful cylinder, which is also isolated by the algorithm. The misfiring cylinder and the super-powerful cylinder are often adjacent in the firing sequence, so missed detections and false alarms can be avoided effectively by checking the relationship between the cylinders.
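A toy version of the PCA-based recognition step is sketched below, assuming the three-step feature extraction has already produced a characteristic vector per cylinder per cycle (here simply simulated). The adaptive threshold shown is a generic robust rule, not the authors' own, and all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated feature vectors: rows ordered cycle-major by cylinder,
# i.e. row index = cycle * n_cyl + cylinder.
n_cycles, n_cyl = 200, 6
X = rng.normal(0, 1, (n_cycles * n_cyl, 3))
X[::n_cyl] += np.array([4.0, -3.0, 2.0])   # cylinder 0 "misfires"

# PCA via SVD of the centered feature matrix.
Xc = X - X.mean(0)
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T                     # first two principal scores

# Adaptive threshold from the bulk of the data itself (no map of
# operating conditions needed): flag points far from the score-space
# median, using a median-absolute-deviation cutoff.
d = np.linalg.norm(scores - np.median(scores, axis=0), axis=1)
cutoff = np.median(d) + 3 * np.median(np.abs(d - np.median(d)))
flagged = np.where(d > cutoff)[0] % n_cyl  # map rows back to cylinders
print(np.bincount(flagged, minlength=n_cyl))   # flags per cylinder
```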
A Statistical Analysis Plan to Support the Joint Forward Area Air Defense Test.
1984-08-02
…by establishing a specific significance level prior to performing the statistical test (traditionally α levels are set at .01 or .05). What is often … undesirable increase in β. For constant α levels, the power (1 − β) of a statistical test can be increased by increasing the sample size of the test. (Ref. …) [Flowchart fragment: perform a k-sample comparison (ANOVA) test on the MOP; do differences exist among the MOP's factor "A" levels?]
NASA Astrophysics Data System (ADS)
Obozov, A. A.; Serpik, I. N.; Mihalchenko, G. S.; Fedyaeva, G. A.
2017-01-01
This article examines the application of pattern recognition (a relatively young area of engineering cybernetics) to the analysis of complicated technical systems. It is shown that a statistical approach can be the most effective for situations that are hard to distinguish. The various recognition algorithms are based on the Bayes approach, which estimates the posterior probability of a given event and the associated error. The statistical approach to pattern recognition can be applied to the technical diagnosis of complicated systems, in particular large marine diesel engines.
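A minimal Bayes decision rule of the kind referred to above is sketched here, under assumed Gaussian class-conditional densities of a single diagnostic feature; the prior, the densities, and the feature itself are invented for illustration.

```python
import numpy as np
from scipy import stats

# Bayes' rule: P(fault | x) is proportional to prior * likelihood.
def posterior_fault(x, prior_fault=0.1,
                    normal=stats.norm(0.0, 1.0),
                    fault=stats.norm(2.5, 1.5)):
    pf = prior_fault * fault.pdf(x)
    pn = (1 - prior_fault) * normal.pdf(x)
    return pf / (pf + pn)

x = 2.0                       # observed feature, e.g. a vibration index
p = posterior_fault(x)
print(p, "fault" if p > 0.5 else "normal")
```

Classifying by the larger posterior minimizes the expected error rate; the estimated error is simply the posterior probability of the rejected class.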
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sathaye, Jayant A.
2000-04-01
Integrated assessment (IA) modeling of climate policy is increasingly global in nature, with models incorporating regional disaggregation. The existing empirical basis for IA modeling, however, largely arises from research on industrialized economies. Given the growing importance of developing countries in determining long-term global energy and carbon emissions trends, filling this gap with improved statistical information on developing countries' energy and carbon-emissions characteristics is an important priority for enhancing IA modeling. Earlier research at LBNL on this topic has focused on assembling and analyzing statistical data on productivity trends and technological change in the energy-intensive manufacturing sectors of five developing countries: India, Brazil, Mexico, Indonesia, and South Korea. The proposed work will extend this analysis to the agriculture and electric power sectors in India, South Korea, and two other developing countries. They will also examine the impact of alternative model specifications on estimates of productivity growth and technological change for each of the three sectors, and estimate the contribution of various capital inputs--imported vs. indigenous, rigid vs. malleable--in contributing to productivity growth and technological change. The project has already produced a data resource on the manufacturing sector which is being shared with IA modelers. This will be extended to the agriculture and electric power sectors, which would also be made accessible to IA modeling groups seeking to enhance the empirical descriptions of developing country characteristics. The project will entail basic statistical and econometric analysis of productivity and energy trends in these developing country sectors, with parameter estimates also made available to modeling groups. The parameter estimates will be developed using alternative model specifications that could be directly utilized by the existing IAMs for the manufacturing, agriculture, and electric power sectors.
Austin, Peter C
2018-01-01
The use of the Cox proportional hazards regression model is widespread. A key assumption of the model is that of proportional hazards. Analysts frequently test the validity of this assumption using statistical significance testing. However, the statistical power of such assessments is frequently unknown. We used Monte Carlo simulations to estimate the statistical power of two different methods for detecting violations of this assumption. When the covariate was binary, we found that a model-based method had greater power than a method based on cumulative sums of martingale residuals. Furthermore, the parametric nature of the distribution of event times had an impact on power when the covariate was binary. Statistical power to detect a strong violation of the proportional hazards assumption was low to moderate even when the number of observed events was high. In many data sets, power to detect a violation of this assumption is likely to be low to modest.
NASA Astrophysics Data System (ADS)
Hennig, R. J.; Friedrich, J.; Malaguzzi Valeri, L.; McCormick, C.; Lebling, K.; Kressig, A.
2016-12-01
The Power Watch project will offer open data on the global electricity sector, starting with power plants and their impacts on climate and water systems; it will also offer visualizations and decision-making tools. Power Watch will create the first comprehensive, open database of power plants globally by compiling data from national governments, public and private utilities, transmission grid operators, and other data providers to create a core dataset covering over 80% of global installed capacity for electrical generation. Power plant data will at a minimum include latitude and longitude, capacity, fuel type, emissions, water usage, ownership, and annual generation. By providing data that is both comprehensive and publicly available, this project will support decision making and analysis by actors across the economy and in the research community. The Power Watch research effort focuses on creating a global standard for power plant information, gathering and standardizing data from multiple sources, matching information from multiple sources at the plant level, testing cross-validation approaches (regional statistics, crowdsourcing, satellite data, and others), and developing estimation methodologies for generation, emissions, and water usage. When not available from official reports, emissions, annual generation, and water usage will be estimated. Water use estimates for power plants will be based on capacity, fuel type and satellite imagery to identify cooling types. This analysis is being piloted in several states in India and will then be scaled up to a global level. Other planned applications of the Power Watch data include improving understanding of energy access, air pollution, emissions estimation, stranded asset analysis, life cycle analysis, tracking of proposed plants and curtailment analysis.
History and Development of the Schmidt-Hunter Meta-Analysis Methods
ERIC Educational Resources Information Center
Schmidt, Frank L.
2015-01-01
In this article, I provide answers to the questions posed by Will Shadish about the history and development of the Schmidt-Hunter methods of meta-analysis. In the 1970s, I headed a research program on personnel selection at the US Office of Personnel Management (OPM). After our research showed that validity studies have low statistical power, OPM…
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-09-14
This package contains statistical routines for extracting features from multivariate time-series data, which can then be used in subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression fits over moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain-agnostic, but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
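The sketch below shows the moving-window regression idea the package description outlines; the window length, step, polynomial order, and summary statistics are placeholders, and the routine is our own illustration rather than the package's actual API.

```python
import numpy as np

def window_features(x, window=50, step=50, order=2):
    """Fit a local polynomial to each moving window of a series and
    return the fitted coefficients as features (one row per window)."""
    t = np.arange(window)
    feats = []
    for start in range(0, len(x) - window + 1, step):
        seg = x[start:start + window]
        feats.append(np.polyfit(t, seg, order))   # quadratic-fit coeffs
    return np.array(feats)

# Summarize coefficients over a user-defined interval (here: mean and
# standard deviation across windows) for downstream multivariate
# analysis such as anomaly detection.
rng = np.random.default_rng(4)
x = np.sin(np.linspace(0, 20, 1000)) + 0.1 * rng.standard_normal(1000)
F = window_features(x)
summary = np.hstack([F.mean(0), F.std(0)])
print(F.shape, summary.shape)
```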
Explorations in Statistics: Power
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2010-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fifth installment of "Explorations in Statistics" revisits power, a concept fundamental to the test of a null hypothesis. Power is the probability that we reject the null hypothesis when it is false. Four…
An Examination of Statistical Power in Multigroup Dynamic Structural Equation Models
ERIC Educational Resources Information Center
Prindle, John J.; McArdle, John J.
2012-01-01
This study used statistical simulation to calculate differential statistical power in dynamic structural equation models with groups (as in McArdle & Prindle, 2008). Patterns of between-group differences were simulated to provide insight into how model parameters influence power approximations. Chi-square and root mean square error of…
Inference from the small scales of cosmic shear with current and future Dark Energy Survey data
MacCrann, N.; Aleksić, J.; Amara, A.; ...
2016-11-05
Cosmic shear is sensitive to fluctuations in the cosmological matter density field, including on small physical scales, where matter clustering is affected by baryonic physics in galaxies and galaxy clusters, such as star formation, supernovae feedback and AGN feedback. While muddying any cosmological information that is contained in small scale cosmic shear measurements, this does mean that cosmic shear has the potential to constrain baryonic physics and galaxy formation. We perform an analysis of the Dark Energy Survey (DES) Science Verification (SV) cosmic shear measurements, now extended to smaller scales, and using the Mead et al. 2015 halo model to account for baryonic feedback. While the SV data has limited statistical power, we demonstrate using a simulated likelihood analysis that the final DES data will have the statistical power to differentiate among baryonic feedback scenarios. We also explore some of the difficulties in interpreting the small scales in cosmic shear measurements, presenting estimates of the size of several other systematic effects that make inference from small scales difficult, including uncertainty in the modelling of intrinsic alignment on nonlinear scales, 'lensing bias', and shape measurement selection effects. For the latter two, we make use of novel image simulations. While future cosmic shear datasets have the statistical power to constrain baryonic feedback scenarios, there are several systematic effects that require improved treatments, in order to make robust conclusions about baryonic feedback.
An empirical-statistical model for laser cladding of Ti-6Al-4V powder on Ti-6Al-4V substrate
NASA Astrophysics Data System (ADS)
Nabhani, Mohammad; Razavi, Reza Shoja; Barekat, Masoud
2018-03-01
In this article, Ti-6Al-4V powder alloy was directly deposited on a Ti-6Al-4V substrate using the laser cladding process. In this process, key parameters such as laser power (P), laser scanning rate (V) and powder feeding rate (F) play important roles. Using linear regression analysis, this paper develops empirical-statistical relations between these key parameters and the geometrical characteristics of single clad tracks (i.e. clad height, clad width, penetration depth, wetting angle, and dilution) in the form of a combined parameter P^α V^β F^γ. The results indicated that the clad width depended linearly on PV^(-1/3), with the powder feeding rate having no effect on it. The dilution was controlled by the combined parameter VF^(-1/2), with laser power a dispensable factor. Laser power, however, was the dominant factor for the clad height, penetration depth, and wetting angle, which were proportional to PV^(-1)F^(1/4), PVF^(-1/8), and P^(3/4)V^(-1)F^(-1/4), respectively. Based on the correlation coefficients (R > 0.9) and analysis of residuals, it was confirmed that these empirical-statistical relations were in good agreement with the measured values of the single clad tracks. Finally, these relations led to the design of a processing map that can predict the geometrical characteristics of single clad tracks from the key parameters.
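Relations of the form P^α V^β F^γ are linear in the logarithms, so the exponents can be recovered by ordinary least squares. The sketch below demonstrates this on synthetic tracks, taking the abstract's reported clad-height relation PV^(-1)F^(1/4) as an assumed "true" model; the parameter ranges and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(9)

# Synthetic single-track data: laser power (W), scan rate (mm/s),
# powder feed rate (g/min).
P = rng.uniform(200, 400, 30)
V = rng.uniform(2, 10, 30)
F = rng.uniform(5, 20, 30)
# Assumed "true" relation for clad height, h ~ P V^-1 F^(1/4),
# with multiplicative noise.
h = 1e-3 * P * V**-1 * F**0.25 * np.exp(rng.normal(0, 0.05, 30))

# Fit log h = c + alpha log P + beta log V + gamma log F by least
# squares; the slopes recover the exponents of P^alpha V^beta F^gamma.
A = np.column_stack([np.ones_like(P), np.log(P), np.log(V), np.log(F)])
coef, *_ = np.linalg.lstsq(A, np.log(h), rcond=None)
print("alpha, beta, gamma:", np.round(coef[1:], 2))   # ~ (1, -1, 0.25)
```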
Mathur, Sunil; Sadana, Ajit
2015-12-01
We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume the distribution of parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as paired t-test, Wilcoxon signed rank test, and significance analysis of microarray (SAM) under certain non-normal distributions. The asymptotic distribution of the test statistic, and the p-value function are discussed. The application of proposed method is shown using a real-life data set. © The Author(s) 2011.
Testing statistical isotropy in cosmic microwave background polarization maps
NASA Astrophysics Data System (ADS)
Rath, Pranati K.; Samal, Pramoda Kumar; Panda, Srikanta; Mishra, Debesh D.; Aluri, Pavan K.
2018-04-01
We apply our symmetry-based Power tensor technique to test the conformity of PLANCK polarization maps with statistical isotropy. On a wide range of angular scales (l = 40-150), our preliminary analysis detects many statistically anisotropic multipoles in the foreground-cleaned full-sky PLANCK polarization maps, viz., COMMANDER and NILC. We also study the effect of residual foregrounds that may still be present in the Galactic plane, using both the common UPB77 polarization mask and the polarization masks specific to each component separation method. Some of the statistically anisotropic modes nevertheless persist, most significantly in the NILC map. We further probed the data for any coherent alignments across multipoles in several bins within the chosen multipole range.
Dexter, Franklin; Shafer, Steven L
2017-03-01
Considerable attention has been drawn to poor reproducibility in the biomedical literature. One explanation is inadequate reporting of statistical methods by authors and inadequate assessment of statistical reporting and methods during peer review. In this narrative review, we examine scientific studies of several well-publicized efforts to improve statistical reporting. We also review several retrospective assessments of the impact of these efforts. These studies show that instructions to authors and statistical checklists are not sufficient; no findings suggested that either improves the quality of statistical methods and reporting. Second, even basic statistics, such as power analyses, are frequently missing or incorrectly performed. Third, statistical review is needed for all papers that involve data analysis. A consistent finding in the studies was that nonstatistical reviewers (eg, "scientific reviewers") and journal editors generally poorly assess statistical quality. We finish by discussing our experience with statistical review at Anesthesia & Analgesia from 2006 to 2016.
Sb2Te3 and Its Superlattices: Optimization by Statistical Design.
Behera, Jitendra K; Zhou, Xilin; Ranjan, Alok; Simpson, Robert E
2018-05-02
The objective of this work is to demonstrate the usefulness of fractional factorial design for optimizing the crystal quality of chalcogenide van der Waals (vdW) crystals. We statistically analyze the growth parameters of highly c-axis-oriented Sb2Te3 crystals and Sb2Te3-GeTe phase change vdW heterostructured superlattices. The statistical significance of the growth parameters of temperature, pressure, power, buffer materials, and buffer layer thickness was found by fractional factorial design and response surface analysis. Temperature, pressure, power, and their second-order interactions are the major factors that significantly influence the quality of the crystals. Additionally, using tungsten rather than molybdenum as a buffer layer significantly enhances the crystal quality. Fractional factorial design minimizes the number of experiments that are necessary to find the optimal growth conditions, resulting in an order of magnitude improvement in the crystal quality. We highlight that statistical design of experiment methods, which is more commonly used in product design, should be considered more broadly by those designing and optimizing materials.
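To illustrate the design idea, the sketch below generates a half-fraction 2^(5-1) design for five two-level factors, labeled after the growth parameters named above, using the standard generator E = ABCD. The mapping of coded levels to actual settings is hypothetical, and this is a generic construction rather than the authors' exact design.

```python
import itertools
import numpy as np

# Half-fraction 2^(5-1) design: the fifth factor's levels are set by
# the generator E = ABCD, halving the number of runs (16 instead of
# 32) at the cost of aliasing with high-order interactions.
factors = ["temperature", "pressure", "power", "buffer", "thickness"]
runs = [[a, b, c, d, a * b * c * d]
        for a, b, c, d in itertools.product([-1, 1], repeat=4)]
design = np.array(runs)
print(design.shape)   # (16, 5) coded design matrix, levels -1 / +1

# Given a response vector y of length 16, the main-effect estimate of
# factor j is the contrast design[:, j] @ y / 8, since 8 runs sit at
# each level.
```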
NASA Technical Reports Server (NTRS)
Rino, C. L.; Livingston, R. C.; Whitney, H. E.
1976-01-01
This paper presents an analysis of ionospheric scintillation data which shows that the underlying statistical structure of the signal can be accurately modeled by the additive complex Gaussian perturbation predicted by the Born approximation in conjunction with an application of the central limit theorem. By making use of this fact, it is possible to estimate the in-phase, phase quadrature, and cophased scattered power by curve fitting to measured intensity histograms. By using this procedure, it is found that typically more than 80% of the scattered power is in phase quadrature with the undeviated signal component. Thus, the signal is modeled by a Gaussian, but highly non-Rician process. From simultaneous UHF and VHF data, only a weak dependence of this statistical structure on changes in the Fresnel radius is deduced. The signal variance is found to have a nonquadratic wavelength dependence. It is hypothesized that this latter effect is a subtle manifestation of locally homogeneous irregularity structures, a mathematical model proposed by Kolmogorov (1941) in his early studies of incompressible fluid turbulence.
da Costa Lobato, Tarcísio; Hauser-Davis, Rachel Ann; de Oliveira, Terezinha Ferreira; Maciel, Marinalva Cardoso; Tavares, Maria Regina Madruga; da Silveira, Antônio Morais; Saraiva, Augusto Cesar Fonseca
2015-02-15
The Amazon area has been increasingly suffering from anthropogenic impacts, especially due to the construction of hydroelectric power plant reservoirs. The analysis and categorization of the trophic status of these reservoirs are of interest to indicate man-made changes in the environment. In this context, the present study aimed to categorize the trophic status of a hydroelectric power plant reservoir located in the Brazilian Amazon by constructing a novel Water Quality Index (WQI) and Trophic State Index (TSI) for the reservoir using major ion concentrations and physico-chemical water parameters determined in the area, taking into account the sampling locations and the local hydrological regimes. After applying statistical analyses (factor analysis and cluster analysis) and establishing a rule base for a fuzzy system from these indicators, the results obtained by the proposed method were compared to the generally applied Carlson TSI and a modified Lamparelli TSI specific for tropical regions. The categorization of the trophic status by the proposed fuzzy method was shown to be more reliable, since it takes into account the specificities of the study area, whereas the Carlson and Lamparelli TSI do not and thus tend to over- or underestimate the trophic status of these ecosystems. The statistical techniques proposed and applied in the present study are therefore relevant to environmental management and policy decision-making processes, aiding in the identification of the ecological status of water bodies. With this, it is possible to identify which factors should be further investigated and/or adjusted in order to attempt the recovery of degraded water bodies. Copyright © 2014 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
The Electric Power Annual presents a summary of electric utility statistics at national, regional, and State levels. The objective of the publication is to provide industry decisionmakers, government policymakers, analysts, and the general public with historical data that may be used in understanding US electricity markets. The Electric Power Annual is prepared by the Survey Management Division; Office of Coal, Nuclear, Electric and Alternate Fuels; Energy Information Administration (EIA); US Department of Energy. "The US Electric Power Industry at a Glance" section presents a profile of the electric power industry ownership and performance, and a review of key statistics for the year. Subsequent sections present data on generating capability, including proposed capability additions; net generation; fossil-fuel statistics; retail sales; revenue; financial statistics; environmental statistics; electric power transactions; demand-side management; and nonutility power producers. In addition, the appendices provide supplemental data on major disturbances and unusual occurrences in US electric power systems. Each section contains related text and tables and refers the reader to the appropriate publication that contains more detailed data on the subject matter. Monetary values in this publication are expressed in nominal terms.
A functional U-statistic method for association analysis of sequencing data.
Jadhav, Sneha; Tong, Xiaoran; Lu, Qing
2017-11-01
Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanisms. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, the functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we applied our method to the sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.
Niederman, Richard
2014-09-01
The following sources were searched: the Cochrane Oral Health Group's Trials Register, the Cochrane Central Register of Controlled Trials (CENTRAL), Medline, Embase, CINAHL, the National Institutes of Health Trials Register and the WHO Clinical Trials Registry Platform for ongoing trials. Reference lists of identified articles were also scanned for relevant papers. Identified manufacturers were contacted for additional information. Only randomised controlled trials comparing manual and powered toothbrushes were considered. Crossover trials were eligible for inclusion if the wash-out period was longer than two weeks. Study assessment and data extraction were carried out independently by at least two reviewers. The primary outcome measures were quantified levels of plaque or gingivitis. Risk of bias assessment was undertaken. Standard Cochrane methodological approaches were taken. Random-effects models were used provided there were four or more studies included in the meta-analysis; otherwise fixed-effect models were used. Data were classed as short term (one to three months) and long term (greater than three months). Fifty-six trials were included, with 51 (4624 patients) providing data for meta-analysis. The majority (46) were at unclear risk of bias, five at high risk of bias and five at low risk. There was moderate-quality evidence that powered toothbrushes provide a statistically significant benefit compared with manual toothbrushes with regard to the reduction of plaque in both the short and long term. This corresponds to an 11% reduction in plaque for the Quigley Hein index (Turesky) in the short term and a 21% reduction in the long term. There was a high degree of heterogeneity that was not explained by the different powered toothbrush type subgroups. There was also moderate-quality evidence that powered toothbrushes provide a statistically significant reduction in gingivitis when compared with manual toothbrushes, both in the short and long term. This corresponds to a 6% and 11% reduction in gingivitis for the Löe and Silness indices respectively. Again, there was a high degree of heterogeneity that was not explained by the different powered toothbrush type subgroups. The greatest body of evidence was for rotation-oscillation brushes, which demonstrated a statistically significant reduction in plaque and gingivitis at both time points. Powered toothbrushes reduce plaque and gingivitis more than manual toothbrushing in the short and long term. The clinical importance of these findings remains unclear. Observation of methodological guidelines and greater standardisation of design would benefit both future trials and meta-analyses. Cost, reliability and side effects were inconsistently reported. Any reported side effects were localised and only temporary.
Statistical models for the analysis and design of digital polymerase chain reaction (dPCR) experiments
Dorazio, Robert; Hunter, Margaret
2015-01-01
Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
Dorazio, Robert M; Hunter, Margaret E
2015-11-03
Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
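The binomial/cloglog/offset model in the two records above lends itself to a compact illustration. The sketch below, using simulated partition counts, an assumed partition volume, and the statsmodels GLM interface, recovers the nucleic acid concentration from the fraction of positive partitions; it illustrates the model class described, not code from the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_partitions = 20000          # partitions in one dPCR reaction (assumed)
volume_nl = 0.85              # partition volume in nanolitres (assumed)
concentration = 1.2           # true copies per nanolitre (assumed)

# Poisson occupancy: P(partition is positive) = 1 - exp(-concentration * volume)
p_positive = 1.0 - np.exp(-concentration * volume_nl)
y = rng.binomial(1, p_positive, size=n_partitions)

# Binomial GLM with complementary log-log link and log(volume) offset:
# cloglog(p) = log(concentration) + log(volume)
X = np.ones((n_partitions, 1))             # intercept estimates log(concentration)
offset = np.full(n_partitions, np.log(volume_nl))
fit = sm.GLM(y, X,
             family=sm.families.Binomial(link=sm.families.links.CLogLog()),
             offset=offset).fit()
print("estimated copies per nanolitre:", float(np.exp(fit.params[0])))
```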
Yu, Ningle; Zhang, Yimei; Wang, Jin; Cao, Xingjiang; Fan, Xiangyong; Xu, Xiaosan; Wang, Furu
2012-01-01
Aims: The aims of this paper were to determine the level of knowledge of and attitude to nuclear power among residents around the Tianwan nuclear power plant in Jiangsu, China. Design: A descriptive, cross-sectional design was adopted. Participants: 1,616 eligible participants, at least 18 years old and living within a 30 km radius of the Tianwan nuclear power plant, were recruited into our study and completed an epidemiological survey. Methods: Data were collected through self-administered questionnaires that included a socio-demographic sheet. Inferential statistics (t-test, ANOVA and multivariate regression analysis) were used to compare the differences between subgroups, and correlation analysis was conducted to understand the relationship between different factors and the dependent variables. Results: Our investigation found that the level of awareness and acceptance of nuclear power was generally not high. Respondents' gender, age, marital status, residence, educational level, family income, and distance from the nuclear power plant were important factors affecting knowledge of and attitude to nuclear power. Conclusions: Public concern about nuclear energy's impact is widespread. The level of awareness and acceptance of nuclear power urgently needs to be improved. PMID:22811610
Bonneville Power Administration Communication Alarm Processor expert system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goeltz, R.; Purucker, S.; Tonn, B.
This report describes the Communications Alarm Processor (CAP), a prototype expert system developed for the Bonneville Power Administration by Oak Ridge National Laboratory. The system is designed to receive and diagnose alarms from Bonneville's Microwave Communications System (MCS). The prototype encompasses one of seven branches of the communications network and a subset of alarm systems and alarm types from each system. The expert system employs a backward chaining approach to diagnosing alarms. Alarms are fed into the expert system directly from the communication system via RS232 ports and sophisticated alarm filtering and mailbox software. Alarm diagnoses are presented to operators for their review and concurrence before the diagnoses are archived. Statistical software is incorporated to allow analysis of archived data for report generation and maintenance studies. The delivered system resides on a Digital Equipment Corporation VAX 3200 workstation and utilizes Nexpert Object and SAS for the expert system and statistical analysis, respectively. 11 refs., 23 figs., 7 tabs.
Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power
ERIC Educational Resources Information Center
Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M.; Vaughn, Sharon
2016-01-01
An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated…
Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power
Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M.; Vaughn, Sharon
2016-01-01
An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated measures. This can result in attenuated pretest-posttest correlations, reducing the variance explained by the pretest covariate. We investigated the implications of two potential range restriction scenarios: direct truncation on a selection measure and indirect range restriction on correlated measures. Empirical and simulated data indicated direct range restriction on the pretest covariate greatly reduced statistical power and necessitated sample size increases of 82%–155% (dependent on selection criteria) to achieve equivalent statistical power to parameters with unrestricted samples. However, measures demonstrating indirect range restriction required much smaller sample size increases (32%–71%) under equivalent scenarios. Additional analyses manipulated the correlations between measures and pretest-posttest correlations to guide planning experiments. Results highlight the need to differentiate between selection measures and potential covariates and to investigate range restriction as a factor impacting statistical power. PMID:28479943
Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power.
Miciak, Jeremy; Taylor, W Pat; Stuebing, Karla K; Fletcher, Jack M; Vaughn, Sharon
2016-01-01
An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated measures. This can result in attenuated pretest-posttest correlations, reducing the variance explained by the pretest covariate. We investigated the implications of two potential range restriction scenarios: direct truncation on a selection measure and indirect range restriction on correlated measures. Empirical and simulated data indicated direct range restriction on the pretest covariate greatly reduced statistical power and necessitated sample size increases of 82%-155% (dependent on selection criteria) to achieve equivalent statistical power to parameters with unrestricted samples. However, measures demonstrating indirect range restriction required much smaller sample size increases (32%-71%) under equivalent scenarios. Additional analyses manipulated the correlations between measures and pretest-posttest correlations to guide planning experiments. Results highlight the need to differentiate between selection measures and potential covariates and to investigate range restriction as a factor impacting statistical power.
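The direct-versus-indirect contrast in the Miciak et al. records can be seen in a few lines of simulation: truncating directly on the pretest attenuates the pretest-posttest correlation more than truncating on a correlated selection measure. The correlations, the 30% selection fraction, and the sample size below are invented for illustration and are not the study's parameters.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200000
selection = rng.normal(size=n)                         # correlated selection measure
pretest = 0.8 * selection + rng.normal(scale=0.6, size=n)
posttest = 0.7 * pretest + rng.normal(scale=0.71, size=n)

direct = pretest <= np.quantile(pretest, 0.3)          # truncate on the pretest itself
indirect = selection <= np.quantile(selection, 0.3)    # truncate on the correlated measure

r = lambda a, b: np.corrcoef(a, b)[0, 1]
print("full sample r(pre, post):", round(r(pretest, posttest), 3))
print("direct truncation       :", round(r(pretest[direct], posttest[direct]), 3))
print("indirect truncation     :", round(r(pretest[indirect], posttest[indirect]), 3))
```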
Vallejo, Guillermo; Ato, Manuel; Fernández García, Paula; Livacic Rojas, Pablo E; Tuero Herrero, Ellián
2016-08-01
S. Usami (2014) describes a method to realistically determine sample size in longitudinal research using a multilevel model. The present research extends the aforementioned work to situations where the assumption of homogeneity of errors across groups is likely not met and the error term does not follow a scaled identity covariance structure. For this purpose, we followed a procedure based on transforming the variance components of the linear growth model and the parameter related to the treatment effect into specific and easily understandable indices. At the same time, we provide the appropriate statistical machinery for researchers to use when data loss is unavoidable and changes in the expected value of the observed responses are not linear. The empirical powers based on unknown variance components were virtually the same as the theoretical powers derived from the statistically transformed indices. The main conclusion of the study is the accuracy of the proposed method for calculating sample size in the described situations with the stipulated power criteria.
Planck 2015 results. XVI. Isotropy and statistics of the CMB
NASA Astrophysics Data System (ADS)
Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Akrami, Y.; Aluri, P. K.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Basak, S.; Battaner, E.; Benabed, K.; Benoît, A.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Casaponsa, B.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Church, S.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Contreras, D.; Couchot, F.; Coulais, A.; Crill, B. P.; Cruz, M.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Désert, F.-X.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Fantaye, Y.; Fergusson, J.; Fernandez-Cobos, R.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frejsel, A.; Frolov, A.; Galeotta, S.; Galli, S.; Ganga, K.; Gauthier, C.; Ghosh, T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Hanson, D.; Harrison, D. L.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huang, Z.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Juvela, M.; Keihänen, E.; Keskitalo, R.; Kim, J.; Kisner, T. S.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Lesgourgues, J.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; Liu, H.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maggio, G.; Maino, D.; Mandolesi, N.; Mangilli, A.; Marinucci, D.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; McGehee, P.; Meinhold, P. R.; Melchiorri, A.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mikkelsen, K.; Mitra, S.; Miville-Deschênes, M.-A.; Molinari, D.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C. B.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Pant, N.; Paoletti, D.; Pasian, F.; Patanchon, G.; Pearson, T. J.; Perdereau, O.; Perotto, L.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Popa, L.; Pratt, G. W.; Prézeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Rotti, A.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Santos, D.; Savelainen, M.; Savini, G.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Souradeep, T.; Spencer, L. D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Trombetti, T.; Tucci, M.; Tuovinen, J.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; Yvon, D.; Zacchei, A.; Zibin, J. P.; Zonca, A.
2016-09-01
We test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The "Cold Spot" is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.
Planck 2015 results: XVI. Isotropy and statistics of the CMB
Ade, P. A. R.; Aghanim, N.; Akrami, Y.; ...
2016-09-20
In this paper, we test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The “Cold Spot” is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Finally, where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.
Soil carbon inventories under a bioenergy crop (switchgrass): Measurement limitations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garten, C.T. Jr.; Wullschleger, S.D.
Approximately 5 yr after planting, coarse root carbon (C) and soil organic C (SOC) inventories were compared under different types of plant cover at four switchgrass (Panicum virgatum L.) production field trials in the southeastern USA. There was significantly more coarse root C under switchgrass (Alamo variety) and forest cover than tall fescue (Festuca arundinacea Schreb.), corn (Zea mays L.), or native pastures of mixed grasses. Inventories of SOC under switchgrass were not significantly greater than SOC inventories under other plant covers. At some locations the statistical power associated with ANOVA of SOC inventories was low, which raised questions about whether differences in SOC could be detected statistically. A minimum detectable difference (MDD) for SOC inventories was calculated. The MDD is the smallest detectable difference between treatment means once the variation, significance level, statistical power, and sample size are specified. The analysis indicated that a difference of ≈50 mg SOC/cm² or 5 Mg SOC/ha, which is ≈10 to 15% of existing SOC, could be detected with reasonable sample sizes and good statistical power. The smallest difference in SOC inventories that can be detected, and only with exceedingly large sample sizes, is ≈2 to 3%. These measurement limitations have implications for monitoring and verification of proposals to ameliorate increasing global atmospheric CO2 concentrations by sequestering C in soils.
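The minimum detectable difference described above follows from the standard two-sample t-test power relation, MDD = (t_alpha + t_beta) x SE. A minimal sketch, assuming made-up values for the SOC standard deviation and plot counts rather than the study's data:

```python
import numpy as np
from scipy import stats

def mdd_two_sample(sd, n_per_group, alpha=0.05, power=0.80):
    """Smallest difference between two treatment means detectable by a
    two-sided t-test at the given significance level and power; a standard
    textbook formula, not code taken from the study."""
    df = 2 * (n_per_group - 1)
    se = sd * np.sqrt(2.0 / n_per_group)
    return (stats.t.ppf(1 - alpha / 2, df) + stats.t.ppf(power, df)) * se

# e.g. an assumed SOC inventory SD of 8 Mg C/ha and 10 plots per cover type
print(round(mdd_two_sample(sd=8.0, n_per_group=10), 1), "Mg SOC/ha")
```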
Indoor Soiling Method and Outdoor Statistical Risk Analysis of Photovoltaic Power Plants
NASA Astrophysics Data System (ADS)
Rajasekar, Vidyashree
This is a two-part thesis. Part 1 presents an approach for working towards the development of a standardized artificial soiling method for laminated photovoltaic (PV) cells or mini-modules. Construction of an artificial chamber to maintain controlled environmental conditions and the components/chemicals used in artificial soil formulation are briefly explained. Both poly-Si mini-modules and single-cell mono-Si coupons were soiled, and characterization tests such as I-V, reflectance and quantum efficiency (QE) were carried out on both soiled and cleaned coupons. From the results obtained, poly-Si mini-modules proved to be a good measure of soil uniformity, as any non-uniformity present would not result in a smooth curve during I-V measurements. The challenges faced while executing reflectance and QE characterization tests on poly-Si due to the smaller cell size were eliminated on the mono-Si coupons with large cells, yielding highly repeatable measurements. This study indicates that reflectance measurements between 600-700 nm wavelengths can be used as a direct measure of soil density on the modules. Part 2 determines the most dominant failure modes of field-aged PV modules using experimental data obtained in the field and a statistical analysis, FMECA (Failure Mode, Effect, and Criticality Analysis). The failure and degradation modes of about 744 poly-Si glass/polymer frameless modules fielded for 18 years in the cold-dry climate of New York were evaluated. A defect chart, degradation rates (at both string and module levels) and a safety map were generated using the field-measured data. A statistical reliability tool, FMECA, which uses the Risk Priority Number (RPN), is used to determine the dominant failure or degradation modes in the strings and modules by ranking and prioritizing the modes. This study of PV power plants considers all the failure and degradation modes from both safety and performance perspectives. The indoor and outdoor soiling studies were jointly performed by two Master's students, Sravanthi Boppana and Vidyashree Rajasekar. This thesis presents the indoor soiling study, whereas the other thesis presents the outdoor soiling study. Similarly, the statistical risk analyses of two power plants (model J and model JVA) were jointly performed by these two Master's students. Both power plants are located in the same cold-dry climate, but one power plant carries framed modules and the other carries frameless modules. This thesis presents the results obtained on the frameless modules.
Increasing power-law range in avalanche amplitude and energy distributions
NASA Astrophysics Data System (ADS)
Navas-Portella, Víctor; Serra, Isabel; Corral, Álvaro; Vives, Eduard
2018-02-01
Power-law-type probability density functions spanning several orders of magnitude are found for different avalanche properties. We propose a methodology to overcome empirical constraints that limit the range of truncated power-law distributions. By considering catalogs of events that cover different observation windows, the maximum likelihood estimation of a global power-law exponent is computed. This methodology is applied to amplitude and energy distributions of acoustic emission avalanches in failure-under-compression experiments of a nanoporous silica glass, finding in some cases global exponents in an unprecedented broad range: 4.5 decades for amplitudes and 9.5 decades for energies. In the latter case, however, strict statistical analysis suggests experimental limitations might alter the power-law behavior.
Increasing power-law range in avalanche amplitude and energy distributions.
Navas-Portella, Víctor; Serra, Isabel; Corral, Álvaro; Vives, Eduard
2018-02-01
Power-law-type probability density functions spanning several orders of magnitude are found for different avalanche properties. We propose a methodology to overcome empirical constraints that limit the range of truncated power-law distributions. By considering catalogs of events that cover different observation windows, the maximum likelihood estimation of a global power-law exponent is computed. This methodology is applied to amplitude and energy distributions of acoustic emission avalanches in failure-under-compression experiments of a nanoporous silica glass, finding in some cases global exponents in an unprecedented broad range: 4.5 decades for amplitudes and 9.5 decades for energies. In the latter case, however, strict statistical analysis suggests experimental limitations might alter the power-law behavior.
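The basic estimator underlying fits like the one described above can be sketched briefly. Assuming a continuous power law with a lower cutoff xmin and no upper truncation (the study's global fit across observation windows is more elaborate), the maximum likelihood exponent has a closed form:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_power_law(alpha, xmin, size):
    """Inverse-transform sampling from p(x) proportional to x^(-alpha), x >= xmin."""
    u = rng.uniform(size=size)
    return xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0))

def mle_exponent(x, xmin):
    """Continuous power-law maximum likelihood estimator above xmin
    (Hill/Clauset-style); only the basic building block of a global fit."""
    x = x[x >= xmin]
    return 1.0 + x.size / np.sum(np.log(x / xmin))

energies = sample_power_law(alpha=1.67, xmin=1.0, size=50000)
print("alpha_hat =", round(mle_exponent(energies, xmin=1.0), 3))
```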
2013-01-01
Background: The theoretical basis of genome-wide association studies (GWAS) is statistical inference of linkage disequilibrium (LD) between any polymorphic marker and a putative disease locus. Most methods widely implemented for such analyses are vulnerable to several key demographic factors, deliver poor statistical power for detecting genuine associations, and also yield a high false-positive rate. Here, we present a likelihood-based statistical approach that accounts properly for the non-random nature of case–control samples with regard to genotypic distribution at the loci in the populations under study and confers flexibility to test for genetic association in the presence of different confounding factors such as population structure and non-randomness of samples. Results: We implemented this novel method, together with several popular methods from the GWAS literature, to re-analyze recently published Parkinson's disease (PD) case–control samples. The real data analysis and computer simulation show that the new method confers not only significantly improved statistical power for detecting associations but also robustness to difficulties stemming from non-random sampling and genetic structure when compared to its rivals. In particular, the new method detected 44 significant SNPs within 25 chromosomal regions of size < 1 Mb, but only 6 SNPs in two of these regions were previously detected by trend-test-based methods. It discovered two SNPs located 1.18 Mb and 0.18 Mb from the PD candidate genes FGF20 and PARK8, respectively, without invoking false-positive risk. Conclusions: We developed a novel likelihood-based method which provides adequate estimation of LD and other population model parameters from case and control samples, eases the integration of samples from multiple genetically divergent populations, and thus confers statistically robust and powerful analyses of GWAS. On the basis of simulation studies and analysis of real datasets, we demonstrated significant improvement of the new method over the non-parametric trend test, which is the most widely used in the GWAS literature. PMID:23394771
HYPOTHESIS SETTING AND ORDER STATISTIC FOR ROBUST GENOMIC META-ANALYSIS.
Song, Chi; Tseng, George C
2014-01-01
Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values ( r th ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.
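The distributional idea behind rOP is simple enough to sketch: under the global null hypothesis, the r-th smallest of K independent uniform p-values follows a Beta(r, K - r + 1) distribution. The sketch below uses invented p-values; the paper's data-driven choice of r is not reproduced.

```python
import numpy as np
from scipy import stats

def rop_combined_p(pvals, r):
    """r-th ordered p-value (rOP) statistic: under the global null the r-th
    smallest of K independent uniform p-values follows Beta(r, K - r + 1),
    so its CDF value serves as the combined p-value."""
    p = np.sort(np.asarray(pvals, dtype=float))
    k = p.size
    return stats.beta.cdf(p[r - 1], r, k - r + 1)

# a gene tested in 7 studies, required DE "in the majority" (r = 4):
print(round(rop_combined_p([0.001, 0.004, 0.010, 0.030, 0.20, 0.50, 0.80], r=4), 4))
```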
Improving accuracy and power with transfer learning using a meta-analytic database.
Schwartz, Yannick; Varoquaux, Gaël; Pallier, Christophe; Pinel, Philippe; Poline, Jean-Baptiste; Thirion, Bertrand
2012-01-01
Typical cohorts in brain imaging studies are not large enough for systematic testing of all the information contained in the images. To build testable working hypotheses, investigators thus rely on analysis of previous work, sometimes formalized in a so-called meta-analysis. In brain imaging, this approach underlies the specification of regions of interest (ROIs) that are usually selected on the basis of the coordinates of previously detected effects. In this paper, we propose to use a database of images, rather than coordinates, and frame the problem as transfer learning: learning a discriminant model on a reference task to apply it to a different but related new task. To facilitate statistical analysis of small cohorts, we use a sparse discriminant model that selects predictive voxels on the reference task and thus provides a principled procedure to define ROIs. The benefits of our approach are twofold. First it uses the reference database for prediction, i.e., to provide potential biomarkers in a clinical setting. Second it increases statistical power on the new task. We demonstrate on a set of 18 pairs of functional MRI experimental conditions that our approach gives good prediction. In addition, on a specific transfer situation involving different scanners at different locations, we show that voxel selection based on transfer learning leads to higher detection power on small cohorts.
New powerful statistics for alignment-free sequence comparison under a pattern transfer model.
Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu
2011-09-07
Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2* and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2* and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of a certain length. We show that the new statistics are much more powerful than the corresponding statistics and that their power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model
Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu
2011-01-01
Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2∗ and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2∗ and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298
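For readers unfamiliar with the D2 family, the base statistic is just the inner product of k-mer count vectors, as sketched below with toy sequences; the centred variants D2* and D2s and the paper's local-pair summation build on this quantity.

```python
from collections import Counter

def d2_statistic(seq_a, seq_b, k=3):
    """Basic D2 statistic: the inner product of the k-mer count vectors of
    two sequences (a Counter returns 0 for absent k-mers)."""
    counts_a = Counter(seq_a[i:i + k] for i in range(len(seq_a) - k + 1))
    counts_b = Counter(seq_b[i:i + k] for i in range(len(seq_b) - k + 1))
    return sum(n * counts_b[word] for word, n in counts_a.items())

print(d2_statistic("ACGTACGTGGC", "TTACGTACGAA", k=3))
```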
Statistical Learning Analysis in Neuroscience: Aiming for Transparency
Hanke, Michael; Halchenko, Yaroslav O.; Haxby, James V.; Pollmann, Stefan
2009-01-01
Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires “neuroscience-aware” technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270
Seo, Jong-Geun; Kang, Kyunghun; Jung, Ji-Young; Park, Sung-Pa; Lee, Maan-Gee; Lee, Ho-Won
2014-12-01
In this pilot study, we analyzed relationships between quantitative EEG measurements and clinical parameters in idiopathic normal pressure hydrocephalus patients, along with differences in these quantitative EEG markers between cerebrospinal fluid tap test responders and nonresponders. Twenty-six idiopathic normal pressure hydrocephalus patients (9 cerebrospinal fluid tap test responders and 17 cerebrospinal fluid tap test nonresponders) constituted the final group for analysis. The resting EEG was recorded and relative powers were computed for seven frequency bands. Cerebrospinal fluid tap test nonresponders, when compared with responders, showed a statistically significant increase in alpha2 band power at the right frontal and centrotemporal regions. Higher delta2 band powers in the frontal, central, parietal, and occipital regions and lower alpha1 band powers in the right temporal region significantly correlated with poorer cognitive performance. Higher theta1 band powers in the left parietal and occipital regions significantly correlated with gait dysfunction. And higher delta1 band powers in the right frontal regions significantly correlated with urinary disturbance. Our findings may encourage further research using quantitative EEG in patients with ventriculomegaly as a potential electrophysiological marker for predicting cerebrospinal fluid tap test responders. This study additionally suggests that the delta, theta, and alpha bands are statistically correlated with the severity of symptoms in idiopathic normal pressure hydrocephalus patients.
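Relative band power of the kind analyzed above is conventionally computed from a Welch periodogram. The following sketch uses a synthetic 10 Hz signal, an assumed sampling rate, and common band limits rather than the study's exact seven-band definitions.

```python
import numpy as np
from scipy import signal

def relative_band_power(eeg, fs, band, total=(1.0, 30.0)):
    """Relative power of one EEG frequency band from a Welch periodogram;
    the ratio of in-band to total spectral power."""
    freqs, psd = signal.welch(eeg, fs=fs, nperseg=4 * fs)
    in_band = (freqs >= band[0]) & (freqs < band[1])
    in_total = (freqs >= total[0]) & (freqs < total[1])
    return psd[in_band].sum() / psd[in_total].sum()

fs = 250                                   # sampling rate in Hz (assumed)
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(0)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.normal(size=t.size)  # 10 Hz "alpha" + noise
print("relative alpha (8-13 Hz):", round(relative_band_power(eeg, fs, (8, 13)), 3))
```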
Uddameri, Venkatesh; Singaraju, Sreeram; Hernandez, E Annette
2018-02-21
Seasonal and cyclic trends in nutrient concentrations at four agricultural drainage ditches were assessed using a dataset generated from a multivariate, multiscale, multiyear water quality monitoring effort in the agriculturally dominant Lower Rio Grande Valley (LRGV) River Watershed in South Texas. An innovative bootstrap sampling-based power analysis procedure was developed to evaluate the ability of Mann-Whitney and Noether tests to discern trends and to guide future monitoring efforts. The Mann-Whitney U test was able to detect significant changes between summer and winter nutrient concentrations at sites with lower depths and unimpeded flows. Pollutant dilution, non-agricultural loadings, and in-channel flow structures (weirs) masked the effects of seasonality. The detection of cyclical trends using the Noether test was highest in the presence of vegetation mainly for total phosphorus and oxidized nitrogen (nitrite + nitrate) compared to dissolved phosphorus and reduced nitrogen (total Kjeldahl nitrogen-TKN). Prospective power analysis indicated that while increased monitoring can lead to higher statistical power, the effect size (i.e., the total number of trend sequences within a time-series) had a greater influence on the Noether test. Both Mann-Whitney and Noether tests provide complementary information on seasonal and cyclic behavior of pollutant concentrations and are affected by different processes. The results from these statistical tests when evaluated in the context of flow, vegetation, and in-channel hydraulic alterations can help guide future data collection and monitoring efforts. The study highlights the need for long-term monitoring of agricultural drainage ditches to properly discern seasonal and cyclical trends.
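The sampling-based power analysis can be illustrated generically: resample the observed seasonal data at a candidate monitoring effort and count how often a Mann-Whitney test rejects. The concentrations below are hypothetical and the procedure is a generic sketch of the bootstrap idea, not the authors' exact method.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

def bootstrap_power(summer, winter, n_future, alpha=0.05, reps=2000):
    """Estimate the power of a future Mann-Whitney seasonal comparison by
    resampling the observed concentrations at n_future samples per season."""
    rejections = 0
    for _ in range(reps):
        s = rng.choice(summer, size=n_future, replace=True)
        w = rng.choice(winter, size=n_future, replace=True)
        rejections += stats.mannwhitneyu(s, w, alternative="two-sided").pvalue < alpha
    return rejections / reps

summer_tp = np.array([0.42, 0.55, 0.38, 0.61, 0.49, 0.52])  # hypothetical mg/L values
winter_tp = np.array([0.30, 0.25, 0.41, 0.28, 0.33, 0.36])
for n in (6, 12, 24):
    print(f"{n:2d} samples/season -> power ~ {bootstrap_power(summer_tp, winter_tp, n):.2f}")
```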
A novel bi-level meta-analysis approach: applied to biological pathway analysis.
Nguyen, Tin; Tagett, Rebecca; Donato, Michele; Mitrea, Cristina; Draghici, Sorin
2016-02-01
The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of multiple datasets from independent experiments. However, study heterogeneity, study bias, outliers and the lack of power of available methods present real challenge in integrating genomic data. One practical drawback of many P-value-based meta-analysis methods, including Fisher's, Stouffer's, minP and maxP, is that they are sensitive to outliers. Another drawback is that, because they perform just one statistical test for each individual experiment, they may not fully exploit the potentially large number of samples within each study. We propose a novel bi-level meta-analysis approach that employs the additive method and the Central Limit Theorem within each individual experiment and also across multiple experiments. We prove that the bi-level framework is robust against bias, less sensitive to outliers than other methods, and more sensitive to small changes in signal. For comparative analysis, we demonstrate that the intra-experiment analysis has more power than the equivalent statistical test performed on a single large experiment. For pathway analysis, we compare the proposed framework versus classical meta-analysis approaches (Fisher's, Stouffer's and the additive method) as well as against a dedicated pathway meta-analysis package (MetaPath), using 1252 samples from 21 datasets related to three human diseases, acute myeloid leukemia (9 datasets), type II diabetes (5 datasets) and Alzheimer's disease (7 datasets). Our framework outperforms its competitors to correctly identify pathways relevant to the phenotypes. The framework is sufficiently general to be applied to any type of statistical meta-analysis. The R scripts are available on demand from the authors. sorin@wayne.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
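The additive building block of the bi-level framework is easy to state: under the global null, independent p-values are uniform, so their mean is approximately normal by the Central Limit Theorem. A minimal sketch follows (the paper applies this within and then across experiments; only the single-level step is shown).

```python
import numpy as np
from scipy import stats

def additive_combine(pvals):
    """Additive p-value combination via the Central Limit Theorem: under the
    global null, K independent p-values are Uniform(0, 1), so their mean is
    approximately Normal(0.5, 1 / (12 K))."""
    p = np.asarray(pvals, dtype=float)
    z = (p.mean() - 0.5) / np.sqrt(1.0 / (12.0 * p.size))
    return stats.norm.cdf(z)    # small mean p-values give a small combined p

print(round(additive_combine([0.01, 0.03, 0.04, 0.20, 0.35]), 4))
```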
Houel, Julien; Doan, Quang T; Cajgfinger, Thomas; Ledoux, Gilles; Amans, David; Aubret, Antoine; Dominjon, Agnès; Ferriol, Sylvain; Barbier, Rémi; Nasilowski, Michel; Lhuillier, Emmanuel; Dubertret, Benoît; Dujardin, Christophe; Kulzer, Florian
2015-01-27
We present an unbiased and robust analysis method for power-law blinking statistics in the photoluminescence of single nanoemitters, allowing us to extract both the bright- and dark-state power-law exponents from the emitters' intensity autocorrelation functions. As opposed to the widely used threshold method, our technique therefore does not require discriminating the emission levels of bright and dark states in the experimental intensity timetraces. We rely on the simultaneous recording of 450 emission timetraces of single CdSe/CdS core/shell quantum dots at a frame rate of 250 Hz with single photon sensitivity. Under these conditions, our approach can determine ON and OFF power-law exponents with a precision of 3% from a comparison to numerical simulations, even for shot-noise-dominated emission signals with an average intensity below 1 photon per frame and per quantum dot. These capabilities pave the way for the unbiased, threshold-free determination of blinking power-law exponents at the microsecond time scale.
The power and limits of a rule-based morpho-semantic parser.
Baud, R. H.; Rassinoux, A. M.; Ruch, P.; Lovis, C.; Scherrer, J. R.
1999-01-01
The advent of the Electronic Patient Record (EPR) implies an increasing amount of medical texts readily available for processing, as soon as convenient tools are made available. The chief application is text analysis, from which one can derive other disciplines like indexing for retrieval, knowledge representation, translation and inferencing for medical intelligent systems. Prerequisites for a convenient analyzer of medical texts are: building the lexicon, developing a semantic representation of the domain, having a large corpus of texts available for statistical analysis, and finally mastering robust and powerful parsing techniques in order to satisfy the constraints of the medical domain. This article aims at presenting an easy-to-use parser ready to be adapted to different settings. It describes its power together with its practical limitations as experienced by the authors. PMID:10566313
The power and limits of a rule-based morpho-semantic parser.
Baud, R H; Rassinoux, A M; Ruch, P; Lovis, C; Scherrer, J R
1999-01-01
The advent of the Electronic Patient Record (EPR) implies an increasing amount of medical texts readily available for processing, as soon as convenient tools are made available. The chief application is text analysis, from which one can derive other disciplines like indexing for retrieval, knowledge representation, translation and inferencing for medical intelligent systems. Prerequisites for a convenient analyzer of medical texts are: building the lexicon, developing a semantic representation of the domain, having a large corpus of texts available for statistical analysis, and finally mastering robust and powerful parsing techniques in order to satisfy the constraints of the medical domain. This article aims at presenting an easy-to-use parser ready to be adapted to different settings. It describes its power together with its practical limitations as experienced by the authors.
Liew, Bernard X W; Drovandi, Christopher C; Clifford, Samuel; Keogh, Justin W L; Morris, Susan; Netto, Kevin
2018-01-01
There is convincing evidence for the benefits of resistance training on vertical jump improvements, but little evidence to guide optimal training prescription. The inability to detect small between-modality effects may partially reflect the use of ANOVA statistics. This study represents the results of a sub-study from a larger project investigating the effects of two resistance training methods on load carriage running energetics. Bayesian statistics were used to compare the effectiveness of isoinertial resistance training against speed-power training in changing countermovement jump (CMJ) and squat jump (SJ) height and joint energetics. Active adults were randomly allocated to either a six-week isoinertial program (n = 16; calf raises, leg press, and lunge) or a speed-power training program (n = 14; countermovement jumps, hopping, with hip flexor training to target pre-swing running energetics). Primary outcome variables included jump height and joint power. Bayesian mixed modelling and Functional Data Analysis were used, where significance was determined by a non-zero crossing of the 95% Bayesian Credible Interval (CrI). The gain in CMJ height after isoinertial training was 1.95 cm (95% CrI [0.85-3.04] cm) greater than the gain after speed-power training, but the gain in SJ height was similar between groups. In the CMJ, isoinertial training produced a larger increase in power absorption at the hip by a mean 0.018% (equivalent to 35 W) (95% CrI [0.007-0.03]), knee by 0.014% (equivalent to 27 W) (95% CrI [0.006-0.02]) and foot by 0.011% (equivalent to 21 W) (95% CrI [0.005-0.02]) compared to speed-power training. Short-term isoinertial training improved CMJ height more than speed-power training. The principal adaptive difference between training modalities was at the level of hip, knee and foot power absorption.
van Gelder, P.H.A.J.M.; Nijs, M.
2011-01-01
Decisions about pharmacotherapy are taken by medical doctors and authorities based on comparative studies of the use of medications. In studies on fertility treatments in particular, methodological quality is of utmost importance in the application of evidence-based medicine and systematic reviews. Nevertheless, flaws and omissions appear quite regularly in these types of studies. The current study aims to present an overview of some typical statistical flaws, illustrated by a number of example studies published in peer-reviewed journals. Based on an investigation of eleven randomly selected studies on fertility treatments with cryopreservation, it appeared that the methodological quality of these studies often did not fulfil the required statistical criteria. The following statistical flaws were identified: flaws in study design, patient selection, and units of analysis or in the definition of the primary endpoints. Other errors could be found in p-value and power calculations or in critical p-value definitions. Proper interpretation of the results and/or use of these study results in a meta-analysis should therefore be conducted with care. PMID:24753877
van Gelder, P H A J M; Nijs, M
2011-01-01
Decisions about pharmacotherapy are taken by medical doctors and authorities based on comparative studies of the use of medications. In studies on fertility treatments in particular, methodological quality is of utmost importance in the application of evidence-based medicine and systematic reviews. Nevertheless, flaws and omissions appear quite regularly in these types of studies. The current study aims to present an overview of some typical statistical flaws, illustrated by a number of example studies published in peer-reviewed journals. Based on an investigation of eleven randomly selected studies on fertility treatments with cryopreservation, it appeared that the methodological quality of these studies often did not fulfil the required statistical criteria. The following statistical flaws were identified: flaws in study design, patient selection, and units of analysis or in the definition of the primary endpoints. Other errors could be found in p-value and power calculations or in critical p-value definitions. Proper interpretation of the results and/or use of these study results in a meta-analysis should therefore be conducted with care.
Replication Unreliability in Psychology: Elusive Phenomena or “Elusive” Statistical Power?
Tressoldi, Patrizio E.
2012-01-01
The focus of this paper is to analyze whether the unreliability of results related to certain controversial psychological phenomena may be a consequence of their low statistical power. Under Null Hypothesis Statistical Testing (NHST), still the most widely used statistical approach, unreliability derives from the failure to reject the null hypothesis, in particular when exact or quasi-exact replications of experiments are carried out. Taking as examples the results of meta-analyses related to four controversial phenomena (subliminal semantic priming, the incubation effect in problem solving, unconscious thought theory, and non-local perception), we found that, except for semantic priming on categorization, the statistical power to detect the expected effect size (ES) of the typical study is low or very low. The low power in most studies undermines the use of NHST to study phenomena with moderate or low ESs. We conclude by providing some suggestions on how to increase statistical power or use different statistical approaches to help discriminate whether the results obtained may or may not be used to support or refute the reality of a phenomenon with a small ES. PMID:22783215
Brautigam, Chad A; Zhao, Huaying; Vargas, Carolyn; Keller, Sandro; Schuck, Peter
2016-05-01
Isothermal titration calorimetry (ITC) is a powerful and widely used method to measure the energetics of macromolecular interactions by recording a thermogram of differential heating power during a titration. However, traditional ITC analysis is limited by stochastic thermogram noise and by the limited information content of a single titration experiment. Here we present a protocol for bias-free thermogram integration based on automated shape analysis of the injection peaks, followed by combination of isotherms from different calorimetric titration experiments into a global analysis, statistical analysis of binding parameters and graphical presentation of the results. This is performed using the integrated public-domain software packages NITPIC, SEDPHAT and GUSSI. The recently developed low-noise thermogram integration approach and global analysis allow for more precise parameter estimates and more reliable quantification of multisite and multicomponent cooperative and competitive interactions. Titration experiments typically take 1-2.5 h each, and global analysis usually takes 10-20 min.
Separate enrichment analysis of pathways for up- and downregulated genes.
Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng
2014-03-06
Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways truly pertinent to phenotypic differences. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
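To make the contrast concrete, here is a minimal sketch of the separate-list strategy using a one-sided hypergeometric enrichment test; the gene universe, pathway, and DE lists below are hypothetical placeholders, not data from the study:

```python
# Illustrative sketch: pathway enrichment tested separately for up- and
# downregulated genes via the hypergeometric test. All gene sets are invented.
from scipy.stats import hypergeom

def enrichment_p(de_genes, pathway_genes, universe):
    """One-sided hypergeometric P(X >= k) for the overlap of a DE list with a pathway."""
    N = len(universe)                    # universe size
    K = len(pathway_genes & universe)    # pathway genes in the universe
    n = len(de_genes)                    # DE list size
    k = len(de_genes & pathway_genes)    # observed overlap
    return hypergeom.sf(k - 1, N, K, n)

universe = {f"g{i}" for i in range(1000)}
pathway = {f"g{i}" for i in range(50)}            # hypothetical pathway
up_genes = {f"g{i}" for i in range(30)}           # hypothetical upregulated list
down_genes = {f"g{i}" for i in range(500, 520)}   # hypothetical downregulated list

# Separate analysis detects the pathway through its upregulated members,
# whereas pooling up- and downregulated genes dilutes the signal.
print(enrichment_p(up_genes, pathway, universe))
print(enrichment_p(down_genes, pathway, universe))
print(enrichment_p(up_genes | down_genes, pathway, universe))
```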
Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.
Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew
2012-08-08
Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
NASA Astrophysics Data System (ADS)
Maolikul, S.; Kiatgamolchai, S.; Chavarnakul, T.
2012-06-01
In the context of the information and communication technology (ICT) trend for individuals worldwide, social life is becoming digital, and portable consumer electronic devices (PCED) powered by conventional battery supplies have been evolving through miniaturization and the integration of various functions. Thermoelectric generators (TEG) were hypothesized to have a potential role as battery chargers serving the growing PCED market. Hence, this paper, mainly focusing on the metropolitan market in Thailand, aimed to conduct architectural innovation foresight and to develop scenarios on potential approaches to exploiting PCED battery power supplies with TEG chargers converting power from ambient heat sources adjacent to individuals' daily lives. After a technical review and assessment of TEG potential and battery aspects, business research was conducted to analyze PCED consumer behavior with respect to PCED utilization patterns, power supply problems, and heat sources/sinks encountered in three modes: daily life, work, and leisure hobbies. Based on secondary data analysis from the literature and the National Statistical Office of Thailand, quantitative analysis was applied using cluster probability sampling with a sample size of 400 at the 0.05 level of significance. In addition, qualitative analysis was conducted to examine the rationale of consumer behavior using in-depth interviews. The scenario planning technique was also used to generate technological and market trend foresight. An innovation field and a potential scenario for matching the technology with the market are proposed in this paper. The ingredients for successful commercialization of battery power supplies with TEG chargers for the PCED market consist of five factors: (1) PCED characteristics, (2) potential ambient heat sources/sinks, (3) the battery module, (4) the power management module, and the final piece, (5) the characteristics and adequate arrangement of the TEG modules. The foresight outcome for the potential innovations represents a case study in the pilot commercialization of TEG technology for some interesting niche markets in the metropolitan area of Thailand and can thus guide product development related to TEG for market-driven applications in other, similar conditions and contexts as well.
Conroy, M.J.; Samuel, M.D.; White, Joanne C.
1995-01-01
Statistical power (and conversely, Type II error) is often ignored by biologists. Power is important to consider in the design of studies, to ensure that sufficient resources are allocated to address a hypothesis under examination. Determining appropriate sample size when designing experiments or calculating power for a statistical test requires an investigator to consider the importance of making incorrect conclusions about the experimental hypothesis and the biological importance of the alternative hypothesis (or the biological effect size researchers are attempting to measure). Poorly designed studies frequently provide results that are at best equivocal, and do little to advance science or assist in decision making. Completed studies that fail to reject H0 should consider power and the related probability of a Type II error in the interpretation of results, particularly when implicit or explicit acceptance of H0 is used to support a biological hypothesis or management decision. Investigators must consider the biological question they wish to answer (Tacha et al. 1982) and assess power on the basis of biologically significant differences (Taylor and Gerrodette 1993). Power calculations are somewhat subjective, because the author must specify either f or the minimum difference that is biologically important. Biologists may have different ideas about what values are appropriate. While determining biological significance is of central importance in power analysis, it is also an issue of importance in wildlife science. Procedures, references, and computer software to compute power are accessible; therefore, authors should consider power. We welcome comments or suggestions on this subject.
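As a worked illustration of the prospective power analysis this abstract advocates, here is a short sketch using statsmodels; the 0.5 SD effect size stands in for a "biologically significant difference" and is an assumption for illustration, not a value from the paper:

```python
# A minimal sketch of prospective power analysis for a two-sample comparison.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed to detect d = 0.5 with 80% power at alpha = 0.05
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"n per group: {n:.1f}")   # ~63.8

# Conversely: power actually achieved by a completed study with n = 20 per group
power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=20)
print(f"achieved power: {power:.2f}")  # ~0.34, far below conventional targets
```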
ERIC Educational Resources Information Center
Warne, Russell T.
2016-01-01
Recently Kim (2016) published a meta-analysis on the effects of enrichment programs for gifted students. She found that these programs produced substantial effects for academic achievement (g = 0.96) and socioemotional outcomes (g = 0.55). However, given current theory and empirical research, these estimates of the benefits of enrichment programs…
Between disorder and order: A case study of power law
NASA Astrophysics Data System (ADS)
Cao, Yong; Zhao, Youjie; Yue, Xiaoguang; Xiong, Fei; Sun, Yongke; He, Xin; Wang, Lichao
2016-08-01
Power laws are an important feature of phenomena exhibiting long-memory behavior. Zipf famously found a power law in the distribution of word frequencies. In physics, the terms order and disorder originate in thermodynamics and statistical physics, and much research has focused on the self-organization of the disordered ingredients of simple physical systems. It is interesting to ask what drives the disorder-order transition. We devise an experiment-based method using random symbolic sequences to investigate regular patterns in the transition between disorder and order. The experimental results reveal that the power law is indeed an important regularity in the transition from disorder to order. A preliminary study and analysis of these results has been carried out to explain the underlying reasons.
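For readers who want to check for power-law behavior in their own data, here is a hedged sketch of the standard continuous maximum-likelihood estimator of the exponent (Hill/Clauset-style); this is a generic tool, not the authors' symbolic-sequence experiment:

```python
# Sketch: simulate a power law and recover its exponent by maximum likelihood.
import numpy as np

rng = np.random.default_rng(0)

# Draw from p(x) ~ x^(-alpha) for x >= x_min via inverse-transform sampling
alpha_true, x_min, n = 2.5, 1.0, 10_000
x = x_min * (1.0 - rng.random(n)) ** (-1.0 / (alpha_true - 1.0))

# MLE of the exponent: alpha_hat = 1 + n / sum(log(x / x_min))
alpha_hat = 1.0 + n / np.log(x / x_min).sum()
se = (alpha_hat - 1.0) / np.sqrt(n)   # asymptotic standard error
print(f"alpha_hat = {alpha_hat:.3f} +/- {se:.3f}")
```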
Hyatt, M.W.; Hubert, W.A.
2001-01-01
We assessed relative weight (Wr) distributions among 291 samples of stock-to-quality-length brook trout Salvelinus fontinalis, brown trout Salmo trutta, rainbow trout Oncorhynchus mykiss, and cutthroat trout O. clarki from lentic and lotic habitats. Statistics describing Wr sample distributions varied slightly among species and habitat types. The average sample was leptokurtic and slightly skewed to the right with a standard deviation of about 10, but the shapes of Wr distributions varied widely among samples. Twenty-two percent of the samples had nonnormal distributions, suggesting the need to evaluate sample distributions before applying statistical tests to determine whether assumptions are met. In general, our findings indicate that samples of about 100 stock-to-quality-length fish are needed to obtain confidence interval widths of four Wr units around the mean. Power analysis revealed that samples of about 50 stock-to-quality-length fish are needed to detect a 2% change in mean Wr at a relatively high level of power (beta = 0.01, alpha = 0.05).
Li, Zhenghua; Cheng, Fansheng; Xia, Zhining
2011-01-01
The chemical structures of 114 polycyclic aromatic sulfur heterocycles (PASHs) have been studied by the molecular electronegativity-distance vector (MEDV). The linear relationships between the gas chromatographic retention index and the MEDV were established by a multiple linear regression (MLR) model. The results of variable selection by stepwise multiple regression (SMR) and the predictive ability of the optimized model, appraised by leave-one-out cross-validation, showed that the optimized model, with a correlation coefficient (R) of 0.9947 and a cross-validated correlation coefficient (Rcv) of 0.9940, possessed the best statistical quality. Furthermore, when the 114 PASH compounds were divided into calibration and test sets in a ratio of 2:1, the statistical analysis showed that our model possesses almost equal statistical quality, very similar regression coefficients, and good robustness. The quantitative structure-retention relationship (QSRR) model established may provide a convenient and powerful method for predicting the gas chromatographic retention of PASHs.
Presotto, L; Bettinardi, V; De Bernardi, E; Belli, M L; Cattaneo, G M; Broggi, S; Fiorino, C
2018-06-01
The analysis of PET images by textural features, also known as radiomics, shows promising results in tumor characterization. However, radiomic metrics (RMs) analysis is currently not standardized and the impact of the whole processing chain still needs deep investigation. We characterized the impact on RM values of: i) two discretization methods, ii) acquisition statistics, and iii) reconstruction algorithm. The influence of tumor volume and standardized uptake value (SUV) on RMs was also investigated. The Chang-Gung-Image-Texture-Analysis (CGITA) software was used to calculate 39 RMs using phantom data. Thirty noise realizations were acquired to measure statistical effect size indicators for each RM. The parameter η² (fraction of variance explained by the nuisance factor) was used to assess the effect of categorical variables, considering η² < 20% and 20% < η² < 40% as representative of a "negligible" and a "small" dependence, respectively. Cohen's d was used as discriminatory power to quantify the separation of two distributions. We found the discretization method based on a fixed bin number (FBN) to outperform the one based on a fixed bin size in units of SUV (FBS), as the latter shows a higher SUV dependence, with 30 RMs showing η² > 20%. FBN was also less influenced by the acquisition and reconstruction setup: with FBN, 37 RMs had η² < 40%, but only 20 with FBS. Most RMs showed a good discriminatory power among heterogeneous PET signals (for FBN: 29 out of 39 RMs with d > 3). For RM analysis, FBN should be preferred. A group of 21 RMs was suggested for PET radiomics analysis. Copyright © 2018 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
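The two effect-size indicators named above can be computed directly; a minimal sketch on simulated placeholder data (not the phantom measurements):

```python
# Sketch of eta-squared for a categorical nuisance factor and Cohen's d for
# the separation of two distributions. All numbers below are simulated.
import numpy as np

rng = np.random.default_rng(1)
groups = [rng.normal(mu, 1.0, 30) for mu in (0.0, 0.3, 0.6)]  # e.g. three setups

# eta^2 = SS_between / SS_total (fraction of variance explained by the factor)
all_vals = np.concatenate(groups)
grand = all_vals.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_total = ((all_vals - grand) ** 2).sum()
print(f"eta^2 = {ss_between / ss_total:.2%}")

# Cohen's d between two signal classes, using the pooled standard deviation
a, b = rng.normal(0.0, 1.0, 30), rng.normal(3.5, 1.0, 30)
s_pool = np.sqrt(((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                 / (len(a) + len(b) - 2))
d = (b.mean() - a.mean()) / s_pool
print(f"Cohen's d = {d:.2f}")  # d > 3 counts as good discriminatory power above
```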
Sunspot activity and influenza pandemics: a statistical assessment of the purported association.
Towers, S
2017-10-01
Since 1978, a series of papers in the literature have claimed to find a significant association between sunspot activity and the timing of influenza pandemics. This paper examines these analyses, and attempts to recreate the three most recent statistical analyses by Ertel (1994), Tapping et al. (2001), and Yeung (2006), which all have purported to find a significant relationship between sunspot numbers and pandemic influenza. As will be discussed, each analysis had errors in the data. In addition, each analysis made arbitrary selections or assumptions, and the authors did not assess the robustness of their analyses to changes in those arbitrary assumptions. Varying the arbitrary assumptions to other, equally valid, assumptions negates the claims of significance. Indeed, an arbitrary selection made in one of the analyses appears to have resulted in almost maximal apparent significance; changing it only slightly yields a null result. This analysis applies statistically rigorous methodology to examine the purported sunspot/pandemic link, using more statistically powerful un-binned analysis methods, rather than relying on arbitrarily binned data. The analyses are repeated using both the Wolf and Group sunspot numbers. In all cases, no statistically significant evidence of any association was found. However, while the focus in this particular analysis was on the purported relationship of influenza pandemics to sunspot activity, the faults found in the past analyses are common pitfalls; inattention to analysis reproducibility and robustness assessment are common problems in the sciences that are unfortunately not noted often enough in review.
Meta-analysis of gene-level associations for rare variants based on single-variant statistics.
Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu
2013-08-08
Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
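The central insight, that a gene-level statistic can be recovered from single-variant summary statistics plus their correlation matrix, can be illustrated with a simple burden-type test; the z-scores and LD correlations below are invented for illustration, and the authors' method covers a broader family of gene-level tests:

```python
# Hedged illustration: a gene-level (burden-type) test reconstructed from
# single-variant z-scores and their correlation matrix, without
# individual-level data. All inputs are hypothetical.
import numpy as np
from scipy.stats import norm

z = np.array([1.8, 2.1, -0.4, 1.2])   # single-variant z-scores from meta-analysis
R = np.array([[1.0, 0.3, 0.1, 0.0],   # LD-induced correlation of test statistics,
              [0.3, 1.0, 0.2, 0.1],   # e.g. estimated from one participating
              [0.1, 0.2, 1.0, 0.3],   # study or a public reference panel
              [0.0, 0.1, 0.3, 1.0]])
w = np.ones_like(z)                   # variant weights (uniform here)

# Burden statistic: weighted sum of z-scores, with null variance w' R w
t = w @ z / np.sqrt(w @ R @ w)
p = 2.0 * norm.sf(abs(t))
print(f"gene-level z = {t:.2f}, p = {p:.4f}")
```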
O'Keefe, Daniel J; Jensen, Jakob D
2007-01-01
A meta-analytic review of 93 studies (N = 21,656) finds that in disease prevention messages, gain-framed appeals, which emphasize the advantages of compliance with the communicator's recommendation, are statistically significantly more persuasive than loss-framed appeals, which emphasize the disadvantages of noncompliance. This difference is quite small (corresponding to r = .03), however, and appears attributable to a relatively large (and statistically significant) effect for messages advocating dental hygiene behaviors. Despite very good statistical power, the analysis finds no statistically significant differences in persuasiveness between gain- and loss-framed messages concerning other preventive actions such as safer-sex behaviors, skin cancer prevention behaviors, or diet and nutrition behaviors.
Ma, Li-Xin; Liu, Jian-Ping
2012-01-01
To investigate whether the power of the effect size was based on adequate sample sizes in randomized controlled trials (RCTs) of Chinese medicine for the treatment of patients with type 2 diabetes mellitus (T2DM). The China Knowledge Resource Integrated Database (CNKI), VIP Database for Chinese Technical Periodicals (VIP), Chinese Biomedical Database (CBM), and Wanfang Data were systematically searched using terms such as "Xiaoke" or diabetes, Chinese herbal medicine, patent medicine, traditional Chinese medicine, randomized, controlled, blinded, and placebo-controlled. A limit of an intervention course ≥ 3 months was set in order to identify information on outcome assessment and sample size. Data collection forms were made according to the checklists in the CONSORT statement. Independent double data extraction was performed on all included trials. The statistical power of the effect size for each RCT was assessed using sample size calculation equations. (1) A total of 207 RCTs were included: 111 superiority trials and 96 non-inferiority trials. (2) Among the 111 superiority trials, the fasting plasma glucose (FPG) and glycosylated hemoglobin (HbA1c) outcome measures were reported in 9% and 12% of the RCTs, respectively, with a sample size > 150 in each trial. For the outcome of HbA1c, only 10% of the RCTs had more than 80% power; for FPG, 23% of the RCTs had more than 80% power. (3) In the 96 non-inferiority trials, the outcomes FPG and HbA1c were reported in 31% and 36%, respectively, of RCTs with a sample size > 150. For HbA1c, only 36% of the RCTs had more than 80% power; for FPG, only 27%. The sample sizes used for statistical analysis were distressingly low and most RCTs did not achieve 80% power. In order to obtain sufficient statistical power, it is recommended that clinical trials first establish a clear research objective and hypothesis, choose a scientific and evidence-based study design and outcome measures, and calculate the required sample size to ensure a precise research conclusion.
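For reference, the kind of back-of-envelope sample size calculation the authors recommend can be done with the standard two-sample normal-approximation formula; the HbA1c difference of 0.5% and SD of 1.5% below are illustrative assumptions, not values from the review:

```python
# Closed-form sample size per arm for detecting a mean difference `delta`
# with standard deviation `sd`: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2
import math
from scipy.stats import norm

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    z_a = norm.ppf(1.0 - alpha / 2.0)
    z_b = norm.ppf(power)
    return math.ceil(2.0 * ((z_a + z_b) * sd / delta) ** 2)

print(n_per_group(delta=0.5, sd=1.5))   # ~142 per arm under these assumptions
```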
Estimating the vibration level of an L-shaped beam using power flow techniques
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.; Mccollum, M.; Rassineux, J. L.; Gilbert, T.
1986-01-01
The response of one component of an L-shaped beam, with point force excitation on the other component, is estimated using the power flow method. The transmitted power from the source component to the receiver component is expressed in terms of the transfer and input mobilities at the excitation point and the joint. The response is estimated both in narrow frequency bands, using the exact geometry of the beams, and as a frequency averaged response using infinite beam models. The results using this power flow technique are compared to the results obtained using finite element analysis (FEA) of the L-shaped beam for the low frequency response and to results obtained using statistical energy analysis (SEA) for the high frequencies. The agreement between the FEA results and the power flow method results at low frequencies is very good. SEA results are in terms of frequency averaged levels and these are in perfect agreement with the results obtained using the infinite beam models in the power flow method. The narrow frequency band results from the power flow method also converge to the SEA results at high frequencies. The advantage of the power flow method is that detail of the response can be retained while reducing computation time, which will allow the narrow frequency band analysis of the response to be extended to higher frequencies.
EEG and MEG data analysis in SPM8.
Litvak, Vladimir; Mattout, Jérémie; Kiebel, Stefan; Phillips, Christophe; Henson, Richard; Kilner, James; Barnes, Gareth; Oostenveld, Robert; Daunizeau, Jean; Flandin, Guillaume; Penny, Will; Friston, Karl
2011-01-01
SPM is free and open-source software written in MATLAB (The MathWorks, Inc.). In addition to standard M/EEG preprocessing, we presently offer three main analysis tools: (i) statistical analysis of scalp-maps, time-frequency images, and volumetric 3D source reconstruction images based on the general linear model, with correction for multiple comparisons using random field theory; (ii) Bayesian M/EEG source reconstruction, including support for group studies, simultaneous EEG and MEG, and fMRI priors; (iii) dynamic causal modelling (DCM), an approach combining neural modelling with data analysis for which there are several variants dealing with evoked responses, steady state responses (power spectra and cross-spectra), induced responses, and phase coupling. SPM8 is integrated with the FieldTrip toolbox, making it possible for users to combine a variety of standard analysis methods with new schemes implemented in SPM and build custom analysis tools using powerful graphical user interface (GUI) and batching tools.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dong, Lihua; Cui, Jingkun; Tang, Fengjiao
Purpose: Studies of the association between ataxia telangiectasia-mutated (ATM) gene polymorphisms and acute radiation injuries are often small in sample size, and the results are inconsistent. We conducted the first meta-analysis to provide a systematic review of published findings. Methods and Materials: Publications were identified by searching PubMed up to April 25, 2014. Primary meta-analysis was performed for all acute radiation injuries, and subgroup meta-analyses were based on clinical endpoint. The influence of sample size and radiation injury incidence on genetic effects was estimated in sensitivity analyses. Power calculations were also conducted. Results: The meta-analysis was conducted on the ATM polymorphism rs1801516, including 5 studies with 1588 participants. For all studies, the cut-off for differentiating cases from controls was grade 2 acute radiation injuries. The primary meta-analysis showed a significant association with overall acute radiation injuries (allelic model: odds ratio = 1.33, 95% confidence interval: 1.04-1.71). Subgroup analyses detected an association between the rs1801516 polymorphism and a significant increase in urinary and lower gastrointestinal injuries and an increase in skin injury that was not statistically significant. There was no between-study heterogeneity in any meta-analyses. In the sensitivity analyses, small studies did not show larger effects than large studies. In addition, studies with high incidence of acute radiation injuries showed larger effects than studies with low incidence. Power calculations revealed that the statistical power of the primary meta-analysis was borderline, whereas there was adequate power for the subgroup analysis of studies with high incidence of acute radiation injuries. Conclusions: Our meta-analysis showed a consistency of the results from the overall and subgroup analyses. We also showed that the genetic effect of the rs1801516 polymorphism on acute radiation injuries was dependent on the incidence of the injury. These findings support the evidence of an association between the rs1801516 polymorphism and acute radiation injuries, encouraging further research on this topic.
Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert
2016-01-01
The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471
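A conceptual sketch of such a two-step screen-then-test pipeline follows; it is a simplified stand-in for COMBI (simulated genotypes, linear SVM screening, chi-square tests with a Bonferroni correction over the screened subset only), not the published implementation:

```python
# Two-step sketch: 1) rank SNPs by |SVM weight|, 2) test only the top k.
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
n, p, k = 400, 1000, 30                                 # subjects, SNPs, subset size
X = rng.integers(0, 3, size=(n, p)).astype(float)       # genotypes coded 0/1/2
y = (X[:, 5] + rng.normal(0, 2, n) > 1.5).astype(int)   # SNP 5 is causal here

# Step 1: screen with a linear SVM and keep the k largest-weight candidates
svm = LinearSVC(C=0.1, dual=False).fit(X, y)
candidates = np.argsort(-np.abs(svm.coef_[0]))[:k]

# Step 2: association tests for candidates, Bonferroni over k (not over p)
for j in candidates[:5]:                                # print the top 5 only
    table = np.array([[np.sum((X[:, j] == g) & (y == c)) for g in (0, 1, 2)]
                      for c in (0, 1)])
    stat, pval, dof, expected = chi2_contingency(table)
    flag = "*" if pval < 0.05 / k else " "
    print(f"SNP {j:4d}  p = {pval:.2e} {flag}")
```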
NASA Astrophysics Data System (ADS)
Petersen, Alexander M.; Jung, Woo-Sung; Stanley, H. Eugene
2008-09-01
Statistical analysis is a major aspect of baseball, from player averages to historical benchmarks and records. Much of baseball fanfare is based around players exceeding the norm, some in a single game and others over a long career. Career statistics serve as a metric for classifying players and establishing their historical legacy. However, the concept of records and benchmarks assumes that the level of competition in baseball is stationary in time. Here we show that power law probability density functions, a hallmark of many complex systems that are driven by competition, govern career longevity in baseball. We also find similar power laws in the density functions of all major performance metrics for pitchers and batters. The use of performance-enhancing drugs has a dark history, emerging as a problem for both amateur and professional sports. We find statistical evidence consistent with performance-enhancing drugs in the analysis of home runs hit by players in the last 25 years. This is corroborated by the findings of the Mitchell Report (2007), a two-year investigation into the use of illegal steroids in Major League Baseball, which recently revealed that over 5 percent of Major League Baseball players tested positive for performance-enhancing drugs in an anonymous 2003 survey.
Television broadcast relay system, volume 2
NASA Technical Reports Server (NTRS)
Graf, E. R.
1972-01-01
A statistical analysis of propagation characteristics at 12 GHz for the design of a worldwide communication satellite system is reported. High power spaceborne TV transmitter design and the development of a low cost, low noise ground receiver for domestic reception are considered most important for implementation of a TVBS system.
Spatial variability effects on precision and power of forage yield estimation
USDA-ARS?s Scientific Manuscript database
Spatial analyses of yield trials are important, as they adjust cultivar means for spatial variation and improve the statistical precision of yield estimation. While the relative efficiency of spatial analysis has been frequently reported in several yield trials, its application on long-term forage y...
Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity
ERIC Educational Resources Information Center
Beasley, T. Mark
2014-01-01
Increasing the correlation between the independent variable and the mediator ("a" coefficient) increases the effect size ("ab") for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation caused by…
The open-source movement: an introduction for forestry professionals
Patrick Proctor; Paul C. Van Deusen; Linda S. Heath; Jeffrey H. Gove
2005-01-01
In recent years, the open-source movement has yielded a generous and powerful suite of software and utilities that rivals those developed by many commercial software companies. Open-source programs are available for many scientific needs: operating systems, databases, statistical analysis, Geographic Information System applications, and object-oriented programming....
Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu
2017-12-07
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N_total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines
NASA Astrophysics Data System (ADS)
Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.
2016-12-01
Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model, while satisfying all the necessities of flexibility, accuracy and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation. Our approach allows us to perform global sensitivity analysis with ease. We also present an approach to calibration of simulators using field data.
Relative risk estimates from spatial and space-time scan statistics: Are they biased?
Prates, Marcos O.; Kulldorff, Martin; Assunção, Renato M.
2014-01-01
The purely spatial and space-time scan statistics have been successfully used by many scientists to detect and evaluate geographical disease clusters. Although the scan statistic has high power in correctly identifying a cluster, no study has considered the estimates of the cluster relative risk in the detected cluster. In this paper we evaluate whether there is any bias in these estimated relative risks. Intuitively, one may expect the estimated relative risk to have an upward bias, since the scan statistic cherry-picks high-rate areas to include in the cluster. We show that this intuition is correct for clusters with low statistical power, but with medium to high power the bias becomes negligible. The same behaviour is not observed for the prospective space-time scan statistic, where there is an increasingly conservative downward bias of the relative risk as the power to detect the cluster increases. PMID:24639031
Burguete-Argueta, Nelsi; Martínez De la Cruz, Braulio; Camacho-Mejorado, Rafael; Santana, Carla; Noris, Gino; López-Bayghen, Esther; Arellano-Galindo, José; Majluf-Cruz, Abraham; Antonio Meraz-Ríos, Marco; Gómez, Rocío
2016-11-01
STRs are powerful tools intensively used in forensic and kinship studies. In order to assess the effectiveness of non-CODIS genetic markers in forensic and paternity tests, the genetic composition of six mini short tandem repeats (mini-STRs: D1S1656, D2S441, D6S1043, D10S1248, D12S391, D22S1045) and the microsatellite SE33 was studied in Mestizo and Amerindian populations from Mexico. Using multiplex polymerase chain reactions and capillary electrophoresis, this study genotyped all loci from 870 chromosomes and evaluated the statistical genetic parameters. All mini-STRs studied were in Hardy-Weinberg (HW) and linkage equilibrium; however, an important HW departure for SE33 was found in the Mestizo population (p ≤ 0.0001). Regarding paternity and forensic statistical parameters, high values of combined power of discrimination and mean power of exclusion were found using these seven markers. The principal co-ordinate analysis based on allele frequencies of three mini-STRs showed the complex genetic architecture of the Mestizo population. The results indicate that this set of loci is suitable to genetically identify individuals in the Mexican population, supporting its effectiveness in human identification casework. In addition, these findings add new statistical values and emphasise the importance of the use of non-CODIS markers in complex populations in order to avoid erroneous assumptions.
Martinez-Murcia, Francisco Jesús; Lai, Meng-Chuan; Górriz, Juan Manuel; Ramírez, Javier; Young, Adam M H; Deoni, Sean C L; Ecker, Christine; Lombardo, Michael V; Baron-Cohen, Simon; Murphy, Declan G M; Bullmore, Edward T; Suckling, John
2017-03-01
Neuroimaging studies have reported structural and physiological differences that could help understand the causes and development of Autism Spectrum Disorder (ASD). Many of them rely on multisite designs, with the recruitment of larger samples increasing statistical power. However, recent large-scale studies have put some findings into question, considering the results to be strongly dependent on the database used, and demonstrating the substantial heterogeneity within this clinically defined category. One major source of variance may be the acquisition of the data in multiple centres. In this work we analysed the differences found in the multisite, multi-modal neuroimaging database from the UK Medical Research Council Autism Imaging Multicentre Study (MRC AIMS) in terms of both diagnosis and acquisition sites. Since the dissimilarities between sites were higher than between diagnostic groups, we developed a technique called Significance Weighted Principal Component Analysis (SWPCA) to reduce the undesired intensity variance due to acquisition site and to increase the statistical power in detecting group differences. After eliminating site-related variance, statistically significant group differences were found, including Broca's area and the temporo-parietal junction. However, discriminative power was not sufficient to classify diagnostic groups, yielding accuracies close to random. Our work supports recent claims that ASD is a highly heterogeneous condition that is difficult to globally characterize by neuroimaging, and therefore different (and more homogeneous) subgroups should be defined to obtain a deeper understanding of ASD. Hum Brain Mapp 38:1208-1223, 2017. © 2016 Wiley Periodicals, Inc.
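The underlying idea, removing variance components associated with acquisition site before group comparison, can be sketched roughly as follows; this simplified PCA-plus-ANOVA stand-in is not the published SWPCA algorithm:

```python
# Rough sketch: drop principal components whose scores differ across sites,
# then reconstruct the data. Data and the site effect are simulated.
import numpy as np
from scipy.stats import f_oneway
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
site = np.repeat([0, 1, 2], 40)                  # 3 sites, 40 scans each
X = rng.normal(0, 1, (120, 50))                  # 50 imaging features
X += site[:, None] * rng.normal(1, 0.1, 50)      # inject a site-driven shift

Xc = X - X.mean(0)
pca = PCA(n_components=10).fit(Xc)
scores = pca.transform(Xc)

# Keep only components whose scores are NOT significantly associated with site
keep = [i for i in range(scores.shape[1])
        if f_oneway(*(scores[site == s, i] for s in (0, 1, 2))).pvalue > 0.001]
X_clean = scores[:, keep] @ pca.components_[keep, :] + X.mean(0)
print(f"removed {10 - len(keep)} of 10 components as site-related")
# X_clean can now enter the group-level analysis
```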
Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J
2017-05-01
Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.
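Simple eigenvalue-based estimators of the effective number of tests can be written in a few lines; the sketch below shows Nyholt- and Galwey-style estimators on an invented correlation matrix, which may differ from the estimators derived in the paper:

```python
# Effective number of tests from the correlation matrix of gene-set statistics.
import numpy as np

def m_eff(R):
    """Nyholt- and Galwey-style estimators from correlation matrix R."""
    lam = np.linalg.eigvalsh(R)
    m = len(lam)
    nyholt = 1.0 + (m - 1.0) * (1.0 - lam.var(ddof=1) / m)
    lam_pos = np.clip(lam, 0.0, None)
    galwey = np.sqrt(lam_pos).sum() ** 2 / lam_pos.sum()
    return nyholt, galwey

# Hypothetical correlation of 4 overlapping gene-set test statistics
R = np.array([[1.0, 0.6, 0.4, 0.2],
              [0.6, 1.0, 0.5, 0.3],
              [0.4, 0.5, 1.0, 0.4],
              [0.2, 0.3, 0.4, 1.0]])
ny, ga = m_eff(R)
print(f"Nyholt M_eff = {ny:.2f}, Galwey M_eff = {ga:.2f}")
print(f"family-wise alpha per test ~ {0.05 / ga:.4f}")
```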
Afshari, Kasra; Samavati, Vahid; Shahidi, Seyed-Ahmad
2015-03-01
The effects of ultrasonic power, extraction time, extraction temperature, and the water-to-raw-material ratio on the extraction yield of crude polysaccharide from the leaf of Hibiscus rosa-sinensis (HRLP) were optimized statistically using response surface methodology (RSM), implementing a Box-Behnken design (BBD). The experimental data obtained were fitted to a second-order polynomial equation using multiple regression analysis and were also analyzed by appropriate statistical methods (ANOVA). Analysis of the results showed that the linear and quadratic terms of these four variables had significant effects. The optimal conditions for the highest extraction yield of HRLP were: ultrasonic power, 93.59 W; extraction time, 25.71 min; extraction temperature, 93.18°C; and water-to-raw-material ratio, 24.3 mL/g. Under these conditions, the experimental yield was 9.66±0.18%, in close agreement with the value of 9.526% predicted by the model. The results demonstrated that HRLP had strong in vitro scavenging activities on DPPH and hydroxyl radicals. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Wan, Tao; Naoe, Takashi; Futakawa, Masatoshi
2016-01-01
A double-wall structure mercury target will be installed at the high-power pulsed spallation neutron source in the Japan Proton Accelerator Research Complex (J-PARC). Cavitation damage on the inner wall is an important factor governing the lifetime of the target vessel. To monitor the structural integrity of the target vessel, displacement velocity at a point on the outer surface of the target vessel is measured using a laser Doppler vibrometer (LDV). The measured signals can be used for evaluating the damage inside the target vessel due to cyclic loading and cavitation bubble collapse caused by pulsed-beam-induced pressure waves. Wavelet differential analysis (WDA) was applied to reveal the effects of the damage on vibrational cycling. To reduce the effects on the WDA results of noise superimposed on the vibration signals, two statistical methods, analysis of variance (ANOVA) and analysis of covariance (ANCOVA), were applied. Results from laboratory experiments, numerical simulation results with random noise added, and target vessel field data were analyzed by the WDA and the statistical methods. The analyses demonstrated that the established in-situ diagnostic technique can be used to effectively evaluate the structural response of the target vessel.
A simulator for evaluating methods for the detection of lesion-deficit associations
NASA Technical Reports Server (NTRS)
Megalooikonomou, V.; Davatzikos, C.; Herskovits, E. H.
2000-01-01
Although much has been learned about the functional organization of the human brain through lesion-deficit analysis, the variety of statistical and image-processing methods developed for this purpose precludes a closed-form analysis of the statistical power of these systems. Therefore, we developed a lesion-deficit simulator (LDS), which generates artificial subjects, each of which consists of a set of functional deficits, and a brain image with lesions; the deficits and lesions conform to predefined distributions. We used probability distributions to model the number, sizes, and spatial distribution of lesions, to model the structure-function associations, and to model registration error. We used the LDS to evaluate, as examples, the effects of the complexities and strengths of lesion-deficit associations, and of registration error, on the power of lesion-deficit analysis. We measured the numbers of recovered associations from these simulated data, as a function of the number of subjects analyzed, the strengths and number of associations in the statistical model, the number of structures associated with a particular function, and the prior probabilities of structures being abnormal. The number of subjects required to recover the simulated lesion-deficit associations was found to have an inverse relationship to the strength of associations, and to the smallest probability in the structure-function model. The number of structures associated with a particular function (i.e., the complexity of associations) had a much greater effect on the performance of the analysis method than did the total number of associations. We also found that registration error of 5 mm or less reduces the number of associations discovered by approximately 13% compared to perfect registration. The LDS provides a flexible framework for evaluating many aspects of lesion-deficit analysis.
Pereira, Tiago Veiga; Rudnicki, Martina; Pereira, Alexandre Costa; Pombo-de-Oliveira, Maria S; Franco, Rendrik França
2006-01-01
Meta-analysis has become an important statistical tool in genetic association studies, since it may provide more powerful and precise estimates. However, meta-analytic studies are prone to several potential biases, not only because of the preferential publication of "positive" studies but also because of difficulties in obtaining all relevant information during the study selection process. In this letter, we point out major problems in meta-analysis that may lead to biased conclusions, illustrating them with an empirical example of two recent meta-analyses on the relation between MTHFR polymorphisms and risk of acute lymphoblastic leukemia that, despite their similarity in statistical methods and period of study selection, provided partially conflicting results.
Statistical issues in quality control of proteomic analyses: good experimental design and planning.
Cairns, David A
2011-03-01
Quality control is becoming increasingly important in proteomic investigations as experiments become more multivariate and quantitative. Quality control applies to all stages of an investigation and statistics can play a key role. In this review, the role of statistical ideas in the design and planning of an investigation is described. This involves the design of unbiased experiments using key concepts from statistical experimental design, the understanding of the biological and analytical variation in a system using variance components analysis and the determination of a required sample size to perform a statistically powerful investigation. These concepts are described through simple examples and an example data set from a 2-D DIGE pilot experiment. Each of these concepts can prove useful in producing better and more reproducible data. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
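For orientation, here is a hedged sketch of the logistic-regression DIF procedure being evaluated, with a McFadden pseudo-R² difference as the effect size; the data are simulated placeholders and the study's exact simulation conditions differ:

```python
# Nested logistic models for one item: score only; + group (uniform DIF);
# + score-by-group interaction (non-uniform DIF). Effect size is delta-R^2.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(5)
n = 1000
group = rng.integers(0, 2, n)            # focal vs reference group
theta = rng.normal(size=n)               # ability (total-score proxy)
logit = 1.2 * theta + 0.5 * group        # uniform DIF injected on this item
item = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

def fit(cols):
    X = sm.add_constant(np.column_stack(cols))
    return sm.Logit(item, X).fit(disp=0)

m1 = fit([theta])                        # score only
m2 = fit([theta, group])                 # + group
m3 = fit([theta, group, theta * group])  # + interaction

lr_uniform = 2 * (m2.llf - m1.llf)       # 1-df test of uniform DIF
lr_total = 2 * (m3.llf - m1.llf)         # 2-df joint test
delta_r2 = m3.prsquared - m1.prsquared   # McFadden pseudo-R^2 effect size
print(f"uniform DIF p = {chi2.sf(lr_uniform, 1):.4f}, "
      f"joint p = {chi2.sf(lr_total, 2):.4f}, delta R^2 = {delta_r2:.4f}")
```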
Comparison of Comet Enflow and VA One Acoustic-to-Structure Power Flow Predictions
NASA Technical Reports Server (NTRS)
Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.
2010-01-01
Comet Enflow is a commercially available, high frequency vibroacoustic analysis software based on the Energy Finite Element Analysis (EFEA). In this method the same finite element mesh used for structural and acoustic analysis can be employed for the high frequency solutions. Comet Enflow is being validated for a floor-equipped composite cylinder by comparing the EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) results from the commercial software program VA One from ESI Group. Early in this program a number of discrepancies became apparent in the Enflow predicted response for the power flow from an acoustic space to a structural subsystem. The power flow anomalies were studied for a simple cubic, a rectangular and a cylindrical structural model connected to an acoustic cavity. The current investigation focuses on three specific discrepancies between the Comet Enflow and the VA One predictions: the Enflow power transmission coefficient relative to the VA One coupling loss factor; the importance of the accuracy of the acoustic modal density formulation used within Enflow; and the recommended use of fast solvers in Comet Enflow. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 16 Hz to 4000 Hz.
Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock
2017-09-29
Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients.
Explanation of Two Anomalous Results in Statistical Mediation Analysis.
Fritz, Matthew S; Taylor, Aaron B; Mackinnon, David P
2012-01-01
Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
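A minimal sketch of the bias-corrected bootstrap test discussed above, on simulated data with a small b path; the percentile adjustment follows the standard BC formula (without the acceleration term):

```python
# Bias-corrected bootstrap CI for the indirect effect a*b (simulated data).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
m = 0.3 * x + rng.normal(size=n)        # a-path
y = 0.2 * m + rng.normal(size=n)        # small b-path, no direct X->Y effect

def indirect(idx):
    """a*b from OLS fits of M ~ X and Y ~ M + X on the resampled rows."""
    xi, mi, yi = x[idx], m[idx], y[idx]
    a = np.linalg.lstsq(np.column_stack([np.ones(len(xi)), xi]), mi, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([np.ones(len(xi)), mi, xi]), yi, rcond=None)[0][1]
    return a * b

est = indirect(np.arange(n))
boot = np.array([indirect(rng.integers(0, n, n)) for _ in range(2000)])

# Bias correction: z0 from the share of bootstrap estimates below the estimate,
# then adjusted percentiles Phi(2*z0 + z_{alpha/2}) and Phi(2*z0 + z_{1-alpha/2})
z0 = norm.ppf((boot < est).mean())
lo, hi = norm.cdf(2 * z0 + norm.ppf([0.025, 0.975]))
ci = np.quantile(boot, [lo, hi])
print(f"ab = {est:.3f}, BC 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```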
GWAMA: software for genome-wide association meta-analysis.
Mägi, Reedik; Morris, Andrew P
2010-05-28
Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way of improving power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical analysis software packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error-trapping facilities and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. The software, with source files, documentation, and example data files, is freely available online at http://www.well.ox.ac.uk/GWAMA.
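The core computation in tools of this kind is fixed-effects inverse-variance meta-analysis of per-study effect estimates. A minimal numpy sketch for a single variant follows; the betas and standard errors are invented illustrative values, not GWAMA output.

```python
import numpy as np
from scipy.stats import norm

# Per-study effect estimates (e.g., log-odds ratios) and standard errors
# for one SNP across three hypothetical GWAS cohorts.
beta = np.array([0.10, 0.14, 0.08])
se   = np.array([0.04, 0.05, 0.06])

# Fixed-effects inverse-variance meta-analysis.
w = 1.0 / se**2
beta_meta = np.sum(w * beta) / np.sum(w)
se_meta = np.sqrt(1.0 / np.sum(w))
z = beta_meta / se_meta
p = 2 * norm.sf(abs(z))

# Cochran's Q and I^2 quantify between-study heterogeneity.
Q = np.sum(w * (beta - beta_meta) ** 2)
I2 = max(0.0, (Q - (len(beta) - 1)) / Q) * 100
print(f"beta={beta_meta:.3f} (SE {se_meta:.3f}), z={z:.2f}, p={p:.2e}, I2={I2:.0f}%")
```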
Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao
2009-01-01
Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have advanced methodologies for the analysis of DNA, RNA, protein, and low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlight various sources of experimental variation in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provide guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
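Power analysis and sample-size calculation, two of the terms defined above, reduce to a short computation in practice. A minimal sketch with statsmodels; the design values (Cohen's d = 0.5, alpha = 0.05, 80% power, n = 20) are hypothetical.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed to detect a medium effect (Cohen's d = 0.5)
# with alpha = 0.05 and 80% power in a two-sample comparison.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"n per group: {n_per_group:.1f}")

# Conversely, the power achieved with 20 subjects per group.
power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=20)
print(f"power with n=20: {power:.2f}")
```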
Eickhoff, Simon B; Nichols, Thomas E; Laird, Angela R; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T; Bzdok, Danilo; Eickhoff, Claudia R
2016-08-15
Given the increasing number of neuroimaging publications, the automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and thus to formulate recommendations for future ALE studies. First, we show that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. Second, researchers should aim to include at least 20 experiments in an ALE meta-analysis to achieve sufficient power for moderate effects. We note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. Copyright © 2016 Elsevier Inc. All rights reserved.
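The ALE pipeline's thresholding is implemented in dedicated tools; purely as a conceptual illustration, the toy sketch below shows the max-statistic logic behind cluster-level FWE correction on synthetic 2D maps. All thresholds, map sizes, and noise models are arbitrary choices, not the paper's settings.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)

def max_cluster_size(stat_map, z_thresh):
    """Size of the largest supra-threshold cluster in a statistic map."""
    labels, n = ndimage.label(stat_map > z_thresh)
    if n == 0:
        return 0
    return np.bincount(labels.ravel())[1:].max()

# Null distribution of the maximum cluster size from noise-only maps.
null_max = np.array([max_cluster_size(rng.normal(size=(64, 64)), 2.3)
                     for _ in range(500)])
crit = np.quantile(null_max, 0.95)  # 5% cluster-level FWE critical size

# An 'observed' map: noise plus a blob of genuine signal.
obs = rng.normal(size=(64, 64))
obs[20:28, 20:28] += 3.0
labels, n = ndimage.label(obs > 2.3)
sizes = np.bincount(labels.ravel())[1:]
print("clusters surviving FWE correction:", int(np.sum(sizes > crit)))
```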
LADES: a software for constructing and analyzing longitudinal designs in biomedical research.
Vázquez-Alcocer, Alan; Garzón-Cortes, Daniel Ladislao; Sánchez-Casas, Rosa María
2014-01-01
One of the most important steps in biomedical longitudinal studies is choosing a good experimental design that can provide high accuracy in the analysis of results with a minimum sample size. Several methods for constructing efficient longitudinal designs have been developed based on power analysis and the statistical model used for analyzing the final results. However, this technology has not been made available to practitioners through user-friendly software. In this paper we introduce LADES (Longitudinal Analysis and Design of Experiments Software) as an alternative and easy-to-use tool for conducting longitudinal analysis and constructing efficient longitudinal designs. LADES incorporates methods for creating cost-efficient longitudinal designs, unequal longitudinal designs, and simple longitudinal designs. In addition, LADES includes different methods for analyzing longitudinal data, such as linear mixed models and generalized estimating equations, among others. A study of European eels is reanalyzed in order to demonstrate LADES's capabilities. Three treatments, one per aquarium of five eels, were analyzed. Data were collected from week 0 through the 12th week post-treatment for all the eels (the complete design). The response under evaluation was sperm volume. A linear mixed model was fitted to the results using LADES. The complete design had a power of 88.7% using 15 eels. With LADES we propose the use of an unequal design with only 14 eels and 89.5% efficiency. LADES was developed as a powerful and simple tool to promote the use of statistical methods for analyzing and creating longitudinal experiments in biomedical research.
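The model form used in the eel reanalysis (a linear mixed model with repeated measures per animal) can be sketched outside of LADES as well. A minimal statsmodels sketch on simulated stand-in data; the effect sizes, noise scales, and variable names are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Simulated stand-in for the eel study: 3 treatments x 5 eels, with the
# response measured weekly from week 0 to week 12 (the 'complete design').
rows = []
for trt in range(3):
    for eel in range(5):
        u = rng.normal(scale=0.5)  # random intercept for each eel
        for week in range(13):
            y = 1.0 + 0.3 * trt + 0.1 * week + u + rng.normal(scale=0.3)
            rows.append(dict(eel=f"t{trt}e{eel}", treatment=trt, week=week, y=y))
df = pd.DataFrame(rows)

# Linear mixed model: fixed effects for treatment and week, random
# intercept for each eel (the repeated-measures unit).
model = smf.mixedlm("y ~ C(treatment) + week", df, groups=df["eel"])
print(model.fit().summary())
```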
Reuschel, Anna; Bogatsch, Holger; Barth, Thomas; Wiedemann, Renate
2010-11-01
To compare the intraoperative and postoperative outcomes of conventional longitudinal phacoemulsification and torsional phacoemulsification. Department of Ophthalmology, University of Leipzig, Germany. Randomized single-center clinical trial. Eyes with senile cataract were randomized to have phacoemulsification using the Infiniti Vision System and the torsional mode (OZil) or conventional longitudinal mode. Primary outcomes were corrected distance visual acuity (CDVA) and central endothelial cell density (ECD), calculated according to the Conference on Harmonisation-E9 Guidelines in which missing values were substituted by the median in each group (primary analysis) and the loss was then calculated using actual data (secondary analysis). Secondary outcomes were ultrasound (US) time, cumulative dissipated energy (CDE), and percentage total equivalent power in position 3. Postoperative follow-up was at 3 months. The mean preoperative CDVA was 0.41 logMAR in the torsional group and 0.38 logMAR in the longitudinal group, improving to 0.07 logMAR postoperatively in both groups. The mean ECD loss was 7.2% ± 4.6% in the torsional group (72 patients) and 7.1% ± 4.4% in the longitudinal group (76 patients), with no statistically significant differences in the primary analysis (P = .342) or secondary analysis (P = .906). The mean US time, CDE, and percentage total equivalent power in position 3 were statistically significantly lower in the torsional group (98 patients) than in the longitudinal group (94 patients) (P<.001). The torsional mode was as safe as the longitudinal mode in phacoemulsification for age-related cataract. Copyright © 2010 ASCRS and ESCRS. Published by Elsevier Inc. All rights reserved.
Verification of Space Station Secondary Power System Stability Using Design of Experiment
NASA Technical Reports Server (NTRS)
Karimi, Kamiar J.; Booker, Andrew J.; Mong, Alvin C.; Manners, Bruce
1998-01-01
This paper describes analytical methods used in verification of large DC power systems, with applications to the International Space Station (ISS). Large DC power systems contain many switching power converters with negative resistance characteristics. The ISS power system presents numerous challenges with respect to system stability, such as complex sources and undefined loads. The Space Station program has developed impedance specifications for sources and loads. The overall approach to system stability consists of specific hardware requirements coupled with extensive system analysis and testing. Testing of large complex distributed power systems is not practical due to the size and complexity of the system. Computer modeling has been extensively used to develop hardware specifications as well as to identify system configurations for lab testing. The statistical method of Design of Experiments (DoE) is used as an analysis tool for verification of these large systems. DoE reduces the number of computer runs that are necessary to analyze the performance of a complex power system consisting of hundreds of DC/DC converters. DoE also provides valuable information about the effect of changes in system parameters on the performance of the system, about various operating scenarios, and about identification of the ones with potential for instability. In this paper we describe how we have used computer modeling to analyze a large DC power system. A brief description of DoE is given. Examples using applications of DoE to analysis and verification of the ISS power system are provided.
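The DoE idea above, screening a large parameter space with a structured set of runs, can be illustrated with a toy two-level factorial. The factor names and the stand-in response function below are hypothetical, not the ISS models.

```python
import itertools
import numpy as np

# Two-level full factorial over three hypothetical system parameters
# (coded -1/+1): source impedance, load power, and filter damping.
factors = ["Z_source", "P_load", "damping"]
design = np.array(list(itertools.product([-1, 1], repeat=3)))

def stability_margin(x):
    """Stand-in for a system simulation returning a stability margin (dB)."""
    z, p, d = x
    return 6.0 - 1.5 * p + 2.0 * d - 0.8 * z * p  # includes an interaction

response = np.array([stability_margin(run) for run in design])

# Main effect of each factor: mean response at +1 minus mean at -1.
for j, name in enumerate(factors):
    effect = (response[design[:, j] == 1].mean()
              - response[design[:, j] == -1].mean())
    print(f"{name:9s} main effect: {effect:+.2f} dB")
```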
SQC: secure quality control for meta-analysis of genome-wide association studies.
Huang, Zhicong; Lin, Huang; Fellay, Jacques; Kutalik, Zoltán; Hubaux, Jean-Pierre
2017-08-01
Due to the limited power of small-scale genome-wide association studies (GWAS), researchers tend to collaborate and establish larger consortia in order to perform large-scale GWAS. Genome-wide association meta-analysis (GWAMA) is a statistical tool that aims to synthesize results from multiple independent studies to increase the statistical power and reduce false-positive findings of GWAS. However, it has been demonstrated that the aggregate data of individual studies are subject to inference attacks; hence privacy concerns arise when researchers share study data in GWAMA. In this article, we propose a secure quality control (SQC) protocol, which enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. SQC employs state-of-the-art cryptographic and statistical techniques for privacy protection. We implement the solution in a meta-analysis pipeline with real data to demonstrate the efficiency and scalability on commodity machines. The distributed execution of SQC on a cluster of 128 cores for one million genetic variants takes less than one hour, which is a modest cost compared with the 10-month time span usually observed for completion of the QC procedure, including logistics. SQC is implemented in Java and is publicly available at https://github.com/acs6610987/secureqc. Contact: jean-pierre.hubaux@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Large signal design - Performance and simulation of a 3 W C-band GaAs power MMIC
NASA Astrophysics Data System (ADS)
White, Paul M.; Hendrickson, Mary A.; Chang, Wayne H.; Curtice, Walter R.
1990-04-01
This paper describes a C-band GaAs power MMIC amplifier that achieved a gain of 17 dB and 1 dB compressed CW power output of 34 dBm across a 4.5-6.25-GHz frequency range, without design iteration. The first-pass design success was achieved due to the application of a harmonic balance simulator to define the optimum output load, using a large-signal FET model determined statistically on a well controlled foundry-ready process line. The measured performance was close to that predicted by a full harmonic balance circuit analysis.
Quantum fluctuation theorems and power measurements
NASA Astrophysics Data System (ADS)
Prasanna Venkatesh, B.; Watanabe, Gentaro; Talkner, Peter
2015-07-01
Work in the paradigm of the quantum fluctuation theorems of Crooks and Jarzynski is determined by projective measurements of energy at the beginning and end of the force protocol. In analogy to classical systems, we consider an alternative definition of work given by the integral of the supplied power, obtained by integrating the results of repeated measurements of the instantaneous power during the force protocol. We observe that such a definition of work, in spite of taking account of the process dependence, has different possible values and statistics from the work determined by the conventional two-energy-measurement approach (TEMA). In the limit of many projective measurements of power, the system’s dynamics is frozen in the power measurement basis due to the quantum Zeno effect, leading to statistics only trivially dependent on the force protocol. In general the Jarzynski relation is not satisfied, except for the case when the instantaneous power operator commutes with the total Hamiltonian at all times. We also consider properties of the joint statistics of the power-based definition of work and TEMA work in protocols where both values are determined. This allows us to quantify their correlations. Relaxing the projective measurement condition, weak continuous measurements of power are considered within the stochastic master equation formalism. Even in this scenario the power-based work statistics is in general unable to reproduce qualitative features of the TEMA work statistics.
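For reference, the two work definitions being compared can be written explicitly. These are the standard textbook forms, stated here for clarity rather than copied from the paper's notation:

```latex
% TEMA work: difference of projective energy measurements at t = 0 and t = \tau
W_{\mathrm{TEMA}} = E_{\tau} - E_{0}

% Power-based work: integral of repeatedly measured instantaneous power
W_{P} = \int_{0}^{\tau} P(t)\,\mathrm{d}t

% Jarzynski equality, satisfied by W_{TEMA} but in general not by W_P
% unless the power operator commutes with the total Hamiltonian at all times:
\left\langle e^{-\beta W} \right\rangle = e^{-\beta \Delta F}
```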
Abbott, Eduardo F; Serrano, Valentina P; Rethlefsen, Melissa L; Pandian, T K; Naik, Nimesh D; West, Colin P; Pankratz, V Shane; Cook, David A
2018-02-01
To characterize reporting of P values, confidence intervals (CIs), and statistical power in health professions education research (HPER) through manual and computerized analysis of published research reports. The authors searched PubMed, Embase, and CINAHL in May 2016, for comparative research studies. For manual analysis of abstracts and main texts, they randomly sampled 250 HPER reports published in 1985, 1995, 2005, and 2015, and 100 biomedical research reports published in 1985 and 2015. Automated computerized analysis of abstracts included all HPER reports published 1970-2015. In the 2015 HPER sample, P values were reported in 69/100 abstracts and 94 main texts. CIs were reported in 6 abstracts and 22 main texts. Most P values (≥77%) were ≤.05. Across all years, 60/164 two-group HPER studies had ≥80% power to detect a between-group difference of 0.5 standard deviations. From 1985 to 2015, the proportion of HPER abstracts reporting a CI did not change significantly (odds ratio [OR] 2.87; 95% CI 1.04, 7.88) whereas that of main texts reporting a CI increased (OR 1.96; 95% CI 1.39, 2.78). Comparison with biomedical studies revealed similar reporting of P values, but more frequent use of CIs in biomedicine. Automated analysis of 56,440 HPER abstracts found 14,867 (26.3%) reporting a P value, 3,024 (5.4%) reporting a CI, and increased reporting of P values and CIs from 1970 to 2015. P values are ubiquitous in HPER, CIs are rarely reported, and most studies are underpowered. Most reported P values would be considered statistically significant.
Is psychology suffering from a replication crisis? What does "failure to replicate" really mean?
Maxwell, Scott E; Lau, Michael Y; Howard, George S
2015-09-01
Psychology has recently been viewed as facing a replication crisis because efforts to replicate past study findings frequently do not show the same result. Often, the first study showed a statistically significant result but the replication does not. Questions then arise about whether the first study results were false positives, and whether the replication study correctly indicates that there is truly no effect after all. This article suggests these so-called failures to replicate may not be failures at all, but rather are the result of low statistical power in single replication studies, and the result of failure to appreciate the need for multiple replications in order to have enough power to identify true effects. We provide examples of these power problems and suggest some solutions using Bayesian statistics and meta-analysis. Although the need for multiple replication studies may frustrate those who would prefer quick answers to psychology's alleged crisis, the large sample sizes typically needed to provide firm evidence will almost always require concerted efforts from multiple investigators. As a result, it remains to be seen how many of the recently claimed failures to replicate will be supported or instead may turn out to be artifacts of inadequate sample sizes and single study replications. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
The language of gene ontology: a Zipf's law analysis.
Kalankesh, Leila Ranandeh; Stevens, Robert; Brass, Andy
2012-06-07
Most major genome projects and sequence databases provide a GO annotation of their data, either automatically or through human annotators, creating a large corpus of data written in the language of GO. Texts written in natural language show a statistical power law behaviour, Zipf's law, the exponent of which can provide useful information on the nature of the language being used. We have therefore explored the hypothesis that collections of GO annotations will show similar statistical behaviours to natural language. Annotations from the Gene Ontology Annotation project were found to follow Zipf's law. Surprisingly, the measured power law exponents were consistently different between annotation captured using the three GO sub-ontologies in the corpora (function, process and component). On filtering the corpora using GO evidence codes we found that the value of the measured power law exponent responded in a predictable way as a function of the evidence codes used to support the annotation. Techniques from computational linguistics can provide new insights into the annotation process. GO annotations show similar statistical behaviours to those seen in natural language with measured exponents that provide a signal which correlates with the nature of the evidence codes used to support the annotations, suggesting that the measured exponent might provide a signal regarding the information content of the annotation.
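Measuring a Zipf exponent of the kind discussed above is a short computation. A minimal sketch using rank-frequency regression on log-log axes; the corpus is simulated, since the GOA term counts are not reproduced here, and a serious analysis would prefer maximum-likelihood estimators over this crude regression.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical frequency counts of GO terms in an annotation corpus,
# sorted from most to least frequent.
freqs = np.sort(rng.zipf(a=2.0, size=5000))[::-1].astype(float)
ranks = np.arange(1, len(freqs) + 1)

# Zipf's law: freq ~ rank^(-s). The exponent s is the negative slope of
# the rank-frequency relation on log-log axes.
slope, intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
print(f"estimated Zipf exponent s = {-slope:.2f}")
```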
Self-powered monitoring of repeated head impacts using time-dilation energy measurement circuit.
Feng, Tao; Aono, Kenji; Covassin, Tracey; Chakrabartty, Shantanu
2015-04-01
Due to the current epidemic levels of sport-related concussions (SRC) in the U.S., there is a pressing need for technologies that can facilitate long-term and continuous monitoring of head impacts. Existing helmet-sensor technology is inconsistent, inaccurate, and is not economically or logistically practical for large-scale human studies. In this paper, we present the design of a miniature, battery-less, self-powered sensor that can be embedded inside sport helmets and can continuously monitor and store different spatial and temporal statistics of the helmet impacts. At the core of the proposed sensor is a novel time-dilation circuit that allows measurement of a wide-range of impact energies. In this paper an array of linear piezo-floating-gate (PFG) injectors has been used for self-powered sensing and storage of linear and rotational head-impact statistics. The stored statistics are then retrieved using a plug-and-play reader and has been used for offline data analysis. We report simulation and measurement results validating the functionality of the time-dilation circuit for different levels of impact energies. Also, using prototypes of linear PFG integrated circuits fabricated in a 0.5 μm CMOS process, we demonstrate the functionality of the proposed helmet-sensors using controlled drop tests.
Statistical Power of Psychological Research: What Have We Gained in 20 Years?
ERIC Educational Resources Information Center
Rossi, Joseph S.
1990-01-01
Calculated power for 6,155 statistical tests in 221 journal articles published in 1982 volumes of "Journal of Abnormal Psychology,""Journal of Consulting and Clinical Psychology," and "Journal of Personality and Social Psychology." Power to detect small, medium, and large effects was .17, .57, and .83, respectively. Concluded that power of…
Ferrarini, Luca; Veer, Ilya M; van Lew, Baldur; Oei, Nicole Y L; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, J
2011-06-01
In recent years, graph theory has been successfully applied to study functional and anatomical connectivity networks in the human brain. Most of these networks have shown small-world topological characteristics: high efficiency in long distance communication between nodes, combined with highly interconnected local clusters of nodes. Moreover, functional studies performed at high resolutions have presented convincing evidence that resting-state functional connectivity networks exhibit (exponentially truncated) scale-free behavior. Such evidence, however, was mostly presented qualitatively, in terms of linear regressions of the degree distributions on log-log plots. Even when quantitative measures were given, these were usually limited to the r² correlation coefficient. However, the r² statistic is not an optimal estimator of explained variance when dealing with (truncated) power-law models. Recent developments in statistics have introduced new non-parametric approaches, based on the Kolmogorov-Smirnov test, for the problem of model selection. In this work, we have built on this idea to statistically tackle the issue of model selection for the degree distribution of functional connectivity at rest. The analysis, performed at voxel level and in a subject-specific fashion, confirmed the superiority of a truncated power-law model, showing high consistency across subjects. Moreover, the most highly connected voxels were found to be consistently part of the default mode network. Our results provide statistically sound support to the evidence previously presented in the literature for a truncated power-law model of resting-state functional connectivity. Copyright © 2010 Elsevier Inc. All rights reserved.
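The kind of model selection described above (pure vs truncated power law, in the Clauset-Shalizi-Newman KS-based framework) is implemented in the community `powerlaw` package. A hedged sketch on simulated degree data; this illustrates the comparison machinery, not the paper's voxel-level pipeline.

```python
import numpy as np
import powerlaw  # pip install powerlaw

rng = np.random.default_rng(4)

# Stand-in for a voxel-wise degree distribution from a connectivity network.
degrees = rng.zipf(a=2.5, size=10000)

fit = powerlaw.Fit(degrees, discrete=True)

# Log-likelihood ratio R between pure and exponentially truncated power
# laws; p is the significance of the comparison.
R, p = fit.distribution_compare("power_law", "truncated_power_law")
print(f"R = {R:.2f}, p = {p:.3f}  (R < 0 favours the truncated model)")
```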
Statistical dynamics of regional populations and economies
NASA Astrophysics Data System (ADS)
Huo, Jie; Wang, Xu-Ming; Hao, Rui; Wang, Peng
Quantitative analysis of human behavior and social development is becoming a hot spot of some interdisciplinary studies. A statistical analysis of the population and GDP of 150 cities in China from 1990 to 2013 is conducted. The results indicate that the cumulative probability distributions of the populations and of the GDPs each obey a shifted power law. In order to understand these characteristics, a generalized Langevin equation describing the variation of population is proposed, based on the correlations between population and GDP as well as the random fluctuations of the related factors. The equation is transformed into the Fokker-Planck equation to express the evolution of the population distribution. The general solution demonstrates a transition of the distribution from the normal Gaussian distribution to a shifted power law, which suggests a critical point of time at which the transition takes place. The shifted power-law distribution in the supercritical situation is qualitatively in accordance with the empirical result. The distribution of the GDPs is derived from the well-known Cobb-Douglas production function. The result presents a change, in the supercritical situation, from a shifted power law to the Gaussian distribution. This is a surprising result: the regional GDP distribution of our world will be Gaussian one day in the future. Discussion based on the changing trend of economic growth suggests this will be true. Therefore, these theoretical attempts may draw a historical picture of our society in the aspects of population and economy.
On damage detection in wind turbine gearboxes using outlier analysis
NASA Astrophysics Data System (ADS)
Antoniadou, Ifigeneia; Manson, Graeme; Dervilis, Nikolaos; Staszewski, Wieslaw J.; Worden, Keith
2012-04-01
The proportion of worldwide installed wind power in power systems increases over the years as a result of the steadily growing interest in renewable energy sources. Still, the advantages offered by the use of wind power are overshadowed by the high operational and maintenance costs, resulting in the low competitiveness of wind power in the energy market. In order to reduce the costs of corrective maintenance, the application of condition monitoring to gearboxes becomes highly important, since gearboxes are among the wind turbine components with the most frequent failure observations. While condition monitoring of gearboxes in general is common practice, with various methods having been developed over the last few decades, wind turbine gearbox condition monitoring faces a major challenge: the detection of faults under the time-varying load conditions prevailing in wind turbine systems. Classical time and frequency domain methods fail to detect faults under variable load conditions, due to the temporary effect that these faults have on vibration signals. This paper uses the statistical discipline of outlier analysis for the damage detection of gearbox tooth faults. A simplified two-degree-of-freedom gearbox model considering nonlinear backlash, time-periodic mesh stiffness and static transmission error, simulates the vibration signals to be analysed. Local stiffness reduction is used for the simulation of tooth faults and statistical processes determine the existence of intermittencies. The lowest level of fault detection, the threshold value, is considered and the Mahalanobis squared-distance is calculated for the novelty detection problem.
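The novelty-detection step described above rests on the Mahalanobis squared-distance. A minimal sketch on simulated features; the feature dimension and the chi-squared threshold are illustrative choices (the paper determines detection thresholds by its own procedure), not the authors' settings.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(5)

# Training features from the undamaged gearbox (e.g., statistics of the
# simulated vibration signals under varying load), here random stand-ins.
X_train = rng.normal(size=(500, 4))
mu = X_train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))

def mahalanobis_sq(x):
    """Squared Mahalanobis distance of x from the training distribution."""
    d = x - mu
    return d @ cov_inv @ d

# Under multivariate normality, the squared distance of a healthy sample
# follows a chi-squared distribution with dim degrees of freedom.
threshold = chi2.ppf(0.999, df=X_train.shape[1])

x_new = rng.normal(size=4) + np.array([0.0, 0.0, 3.0, 0.0])  # faulty sample
d2 = mahalanobis_sq(x_new)
print("outlier" if d2 > threshold else "normal",
      f"(D^2 = {d2:.1f}, threshold = {threshold:.1f})")
```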
Bartsch, L.A.; Richardson, W.B.; Naimo, T.J.
1998-01-01
Estimation of benthic macroinvertebrate populations over large spatial scales is difficult due to the high variability in abundance and the cost of sample processing and taxonomic analysis. To determine a cost-effective, statistically powerful sample design, we conducted an exploratory study of the spatial variation of benthic macroinvertebrates in a 37 km reach of the Upper Mississippi River. We sampled benthos at 36 sites within each of two strata, contiguous backwater and channel border. Three standard ponar (525 cm²) grab samples were obtained at each site ('Original Design'). Analysis of variance and sampling cost of strata-wide estimates for abundance of Oligochaeta, Chironomidae, and total invertebrates showed that only one ponar sample per site ('Reduced Design') yielded essentially the same abundance estimates as the Original Design, while reducing the overall cost by 63%. A posteriori statistical power analysis (alpha = 0.05, beta = 0.20) on the Reduced Design estimated that at least 18 sites per stratum were needed to detect differences in mean abundance between contiguous backwater and channel border areas for Oligochaeta, Chironomidae, and total invertebrates. Statistical power was nearly identical for the three taxonomic groups. The abundances of several taxa of concern (e.g., Hexagenia mayflies and Musculium fingernail clams) were too spatially variable to estimate power with our method. Resampling simulations indicated that to achieve adequate sampling precision for Oligochaeta, at least 36 sample sites per stratum would be required, whereas a sampling precision of 0.2 would not be attained with any sample size for Hexagenia in channel border areas, or Chironomidae and Musculium in both strata, given the variance structure of the original samples. Community-wide diversity indices (Brillouin and 1-Simpson's) increased as sample area per site increased. The backwater area had higher diversity than the channel border area. The number of sampling sites required to sample benthic macroinvertebrates during our sampling period depended on the study objective and ranged from 18 to more than 40 sites per stratum. No single sampling regime would efficiently and adequately sample all components of the macroinvertebrate community.
Kuhn-Tucker optimization based reliability analysis for probabilistic finite elements
NASA Technical Reports Server (NTRS)
Liu, W. K.; Besterfield, G.; Lawrence, M.; Belytschko, T.
1988-01-01
The fusion of the probabilistic finite element method (PFEM) and reliability analysis for fracture mechanics is considered. Reliability analysis with specific application to fracture mechanics is presented, and computational procedures are discussed. Explicit expressions for the optimization procedure with regard to fracture mechanics are given. The results show the PFEM is a very powerful tool in determining the second-moment statistics. The method can determine the probability of failure or fracture subject to randomness in load, material properties, and crack length, orientation, and location.
NASA Astrophysics Data System (ADS)
Kovalevsky, Louis; Langley, Robin S.; Caro, Stephane
2016-05-01
Due to the high cost of experimental EMI measurements, significant attention has been focused on numerical simulation. Classical methods such as the Method of Moments or Finite-Difference Time-Domain are not well suited for this type of problem, as they require a fine discretisation of space and fail to take uncertainties into account. In this paper, the authors show that Statistical Energy Analysis (SEA) is well suited for this type of application. SEA is a statistical approach employed to solve high-frequency problems of electromagnetically reverberant cavities at a reduced computational cost. The key aspects of this approach are (i) to consider an ensemble of systems that share the same gross parameters, and (ii) to avoid solving Maxwell's equations inside the cavity, using the power balance principle. The output is an estimate of the field magnitude distribution in each cavity. The method is applied to a typical aircraft structure.
Rowlands, G J; Musoke, A J; Morzaria, S P; Nagda, S M; Ballingall, K T; McKeever, D J
2000-04-01
A statistically derived disease reaction index based on parasitological, clinical and haematological measurements observed in 309 five- to eight-month-old Boran cattle following laboratory challenge with Theileria parva is described. Principal component analysis was applied to 13 measures including first appearance of schizonts, first appearance of piroplasms and first occurrence of pyrexia, together with the duration and severity of these symptoms, and white blood cell count. The first principal component, which was based on approximately equal contributions of the 13 variables, provided the definition for the disease reaction index, defined on a scale of 0-10. As well as providing a more objective measure of the severity of the reaction, the continuous nature of the index score enables more powerful statistical analysis of the data than was previously possible with clinically derived categories of non-, mild, moderate and severe reactions.
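Deriving an index of this kind (first principal component, rescaled to 0-10) is mechanical once the measurements are standardized. A minimal sketch with scikit-learn on simulated stand-in data; the shared severity factor injected below is an assumption made so the first component is meaningful.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)

# Stand-in for the 13 parasitological/clinical/haematological measurements
# on each of 309 animals, sharing a latent severity factor.
X = rng.normal(size=(309, 13)) + rng.normal(size=(309, 1))

scores = PCA(n_components=1).fit_transform(
    StandardScaler().fit_transform(X)).ravel()

# Rescale the first principal component to a 0-10 reaction index.
index = 10 * (scores - scores.min()) / (scores.max() - scores.min())
print(index[:5].round(2))
```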
Li, Yumei; Xiang, Yang; Xu, Chao; Shen, Hui; Deng, Hongwen
2018-01-15
The development of next-generation sequencing technologies has facilitated the identification of rare variants. Family-based designs are commonly used to effectively control for population admixture and substructure, which are more prominent for rare variants. Case-parents studies, as typical strategies in family-based design, are widely used in rare variant-disease association analysis. Current methods in case-parents studies are based on complete case-parents data; however, parental genotypes may be missing in case-parents trios, and removing these data may lead to a loss in statistical power. The present study focuses on testing for rare variant-disease association in case-parents studies while allowing for missing parental genotypes. In this report, we extended the collapsing method for rare variant association analysis in case-parents studies to allow for missing parental genotypes, and investigated the performance of two methods, one using the difference of genotypes between affected offspring and their corresponding "complements" in case-parent trios, and one using the TDT framework. Using simulations, we showed that, compared with methods using only complete case-parents data, the proposed strategy allowing for missing parental genotypes, or even adding unrelated affected individuals, can greatly improve the statistical power and meanwhile is not affected by population stratification. We conclude that adding case-parents data with missing parental genotypes to a complete case-parents data set can greatly improve the power of our strategy for rare variant-disease association.
NASA Astrophysics Data System (ADS)
Quinn, Kevin Martin
The total amount of precipitation integrated across a precipitation cluster (contiguous precipitating grid cells exceeding a minimum rain rate) is a useful measure of the aggregate size of the disturbance, expressed as the rate of water mass lost or latent heat released, i.e. the power of the disturbance. Probability distributions of cluster power are examined during boreal summer (May-September) and winter (January-March) using satellite-retrieved rain rates from the Tropical Rainfall Measuring Mission (TRMM) 3B42 and Special Sensor Microwave Imager and Sounder (SSM/I and SSMIS) programs, model output from the High Resolution Atmospheric Model (HIRAM, roughly 0.25-0.5° resolution), seven 1-2° resolution members of the Coupled Model Intercomparison Project Phase 5 (CMIP5) experiment, and the National Center for Atmospheric Research Large Ensemble (NCAR LENS). Spatial distributions of precipitation-weighted centroids are also investigated in observations (TRMM-3B42) and climate models during winter as a metric for changes in mid-latitude storm tracks. Observed probability distributions for both seasons are scale-free from the smallest clusters up to a cutoff scale at high cluster power, after which the probability density drops rapidly. When low rain rates are excluded by choosing a minimum rain rate threshold in defining clusters, the models accurately reproduce observed cluster power statistics and winter storm tracks. Changes in behavior in the tail of the distribution, above the cutoff, are important for impacts since these quantify the frequency of the most powerful storms. End-of-century cluster power distributions and storm track locations are investigated in these models under a "business as usual" global warming scenario. The probability of high cluster power events increases by end-of-century across all models, by up to an order of magnitude for the highest-power events for which statistics can be computed. For the three models in the suite with continuous time series of high resolution output, there is substantial variability in when these probability increases for the most powerful precipitation clusters become detectable, ranging from detectable within the observational period to statistically significant trends emerging only after 2050. A similar analysis of National Centers for Environmental Prediction (NCEP) Reanalysis 2 and SSM/I-SSMIS rain rate retrievals in the recent observational record does not yield reliable evidence of trends in high-power cluster probabilities at this time. Large impacts to mid-latitude storm tracks are projected over the West Coast and eastern North America, with no less than 8 of the 9 models examined showing large increases by end-of-century in the probability density of the most powerful storms, ranging up to a factor of 6.5 in the highest range bin for which historical statistics are computed. However, within these regional domains, there is considerable variation among models in pinpointing exactly where the largest increases will occur.
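The cluster-power measure defined at the start of this abstract is straightforward to compute from a gridded rain-rate field. A minimal sketch with scipy's connected-component labeling; the field is simulated, the threshold is an arbitrary choice, and a real analysis would weight each cell by its area.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(7)

# Gridded rain rates (mm/h) on a lat-lon snapshot, here simulated.
rain = rng.gamma(shape=0.3, scale=2.0, size=(180, 360))

MIN_RATE = 0.5  # mm/h threshold defining 'precipitating' cells

# A cluster is a set of contiguous precipitating grid cells; its 'power'
# is the rain rate integrated over the cluster (arbitrary units here,
# ignoring the cell-area weighting a real analysis would apply).
labels, n = ndimage.label(rain >= MIN_RATE)
power = ndimage.sum(rain, labels, index=np.arange(1, n + 1))

# Tail of the cluster-power distribution.
print(f"{n} clusters; 99th percentile power = {np.percentile(power, 99):.1f}")
```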
An analysis of quantum coherent solar photovoltaic cells
NASA Astrophysics Data System (ADS)
Kirk, A. P.
2012-02-01
A new hypothesis (Scully et al., Proc. Natl. Acad. Sci. USA 108 (2011) 15097) suggests that it is possible to break the detailed-balance limit on power conversion efficiency, derived from statistical physics, and increase the power output of a solar photovoltaic cell by using “noise-induced quantum coherence” to increase the current. The fundamental errors of this hypothesis are explained here. As part of this analysis, we show that the maximum photogenerated current density for a practical solar cell is a function of the incident spectrum, sunlight concentration factor, and solar cell energy bandgap; thus the presence of quantum coherence is irrelevant, as it is unable to lead to increased current output from a solar cell.
A T1 and DTI fused 3D corpus callosum analysis in pre- vs. post-season contact sports players
NASA Astrophysics Data System (ADS)
Lao, Yi; Law, Meng; Shi, Jie; Gajawelli, Niharika; Haas, Lauren; Wang, Yalin; Leporé, Natasha
2015-01-01
Sports-related traumatic brain injury (TBI) is a worldwide public health issue, and damage to the corpus callosum (CC) has been considered an important indicator of TBI. However, contact sports players suffer repeated hits to the head during the course of a season even in the absence of diagnosed concussion, and less is known about their effect on callosal anatomy. In addition, T1-weighted and diffusion tensor imaging (DTI) brain magnetic resonance data have been analyzed separately, but a joint analysis of both types of data may increase statistical power and give a more complete understanding of the anatomical correlates of subclinical concussions in these athletes. Here, for the first time, we fuse T1 surface-based morphometry and a new DTI analysis on 3D surface representations of the CCs into a single statistical analysis on these subjects. Our new combined method successfully increases power to detect differences between pre- vs. post-season contact sports players. Alterations are found in the ventral genu, isthmus, and splenium of the CC. Our findings may inform future health assessments in contact sports players. The method is also the first truly multimodal diffusion and T1-weighted analysis of the CC, and may be useful to detect anatomical changes in the corpus callosum in other multimodal datasets.
Lazic, Stanley E
2008-07-21
Analysis of variance (ANOVA) is a common statistical technique in physiological research, and often one or more of the independent/predictor variables, such as dose, time, or age, can be treated as a continuous rather than a categorical variable during analysis, even if subjects were randomly assigned to treatment groups. While this is not common, such an approach has a number of advantages, including greater statistical power due to increased precision, a simpler and more informative interpretation of the results, greater parsimony, and the possibility of transforming the predictor variable. An example is given from an experiment where rats were randomly assigned to receive either 0, 60, 180, or 240 mg/L of fluoxetine in their drinking water, with performance on the forced swim test as the outcome measure. Dose was treated as either a categorical or continuous variable during analysis, with the latter analysis leading to a more powerful test (p = 0.021 vs. p = 0.159). This will be true in general, and the reasons for this are discussed. There are many advantages to treating variables as continuous numeric variables if the data allow this, and this should be employed more often in experimental biology. Failure to use the optimal analysis runs the risk of missing significant effects or relationships.
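The categorical-versus-continuous comparison above is easy to reproduce in outline. A minimal statsmodels sketch on simulated data mimicking the fluoxetine design; the slope, noise scale, and group size are invented, so the p-values will not match the paper's.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)

# Simulated analogue of the experiment: rats randomized to four doses,
# with immobility time on the forced swim test as the outcome.
dose = np.repeat([0, 60, 180, 240], 12)
immobility = 200 - 0.1 * dose + rng.normal(scale=30, size=dose.size)
df = pd.DataFrame({"dose": dose, "immobility": immobility})

# Dose as a categorical factor (the usual one-way ANOVA)...
anova = smf.ols("immobility ~ C(dose)", df).fit()
# ...versus dose as a continuous numeric predictor (a 1-df trend test).
trend = smf.ols("immobility ~ dose", df).fit()

print("categorical p:", anova.f_pvalue.round(4))
print("continuous p: ", trend.pvalues["dose"].round(4))
```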
Simulation on a car interior aerodynamic noise control based on statistical energy analysis
NASA Astrophysics Data System (ADS)
Chen, Xin; Wang, Dengfeng; Ma, Zhengdong
2012-09-01
How to simulate interior aerodynamic noise accurately is an important question in car interior noise reduction. The unsteady aerodynamic pressure on body surfaces has been shown to be the key factor in car interior aerodynamic noise control at high frequencies and high speeds. In this paper, a detailed statistical energy analysis (SEA) model is built, and the vibro-acoustic power inputs are loaded onto the model to obtain valid results for car interior noise analysis. The model is the solid foundation for further optimization of car interior noise control. After the subsystems contributing most sensitively to car interior noise power are identified by comprehensive SEA analysis, the sound pressure level of car interior aerodynamic noise can be reduced by improving their sound and damping characteristics. Further vehicle testing results show that it is possible to improve the interior acoustic performance by using the detailed SEA model, which comprises more than 80 subsystems, together with the calculation of unsteady aerodynamic pressure on body surfaces and improvement of the sound/damping properties of materials. A reduction of more than 2 dB is achieved at centre frequencies in the spectrum above 800 Hz. The proposed optimization method can be regarded as a reference for car interior aerodynamic noise control using a detailed SEA model integrating unsteady computational fluid dynamics (CFD) and sensitivity analysis of acoustic contributions.
NASA Astrophysics Data System (ADS)
Karuppiah, R.; Faldi, A.; Laurenzi, I.; Usadi, A.; Venkatesh, A.
2014-12-01
An increasing number of studies are focused on assessing the environmental footprint of different products and processes, especially using life cycle assessment (LCA). This work shows how combining statistical methods and Geographic Information Systems (GIS) with environmental analyses can help improve the quality of results and their interpretation. Most environmental assessments in the literature yield single numbers that characterize the environmental impact of a process or product, typically global or country averages, often unchanging in time. In this work, we show how statistical analysis and GIS can help address these limitations. For example, we demonstrate a method to separately quantify uncertainty and variability in the results of LCA models using a power generation case study. This is important for rigorous comparisons between the impacts of different processes. Another challenge is the lack of data, which can affect the rigor of LCAs. We have developed an approach to estimate environmental impacts of incompletely characterized processes using predictive statistical models. This method is applied to estimate unreported coal power plant emissions in several world regions. There is also a general lack of spatio-temporal characterization of results in environmental analyses. For instance, studies that focus on water usage do not put in context where and when water is withdrawn. Through the use of hydrological modeling combined with GIS, we quantify water stress on a regional and seasonal basis to understand water supply and demand risks for multiple users. Another example where it is important to consider the regional dependency of impacts is when characterizing how agricultural land occupation affects biodiversity in a region. We developed a data-driven methodology used in conjunction with GIS to determine if there is a statistically significant difference between the impacts of growing different crops on different species in various biomes of the world.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ade, P. A. R.; Aghanim, N.; Akrami, Y.
In this paper, we test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The “Cold Spot” is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Finally, where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.
Tips and Tricks for Successful Application of Statistical Methods to Biological Data.
Schlenker, Evelyn
2016-01-01
This chapter discusses experimental design and the use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance between committing type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (randomized clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premises and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
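The two options the chapter describes for skewed data (log-transform then a parametric test, or a rank-based nonparametric test) look like this in practice. A minimal scipy sketch on simulated log-normal groups; the group sizes and distribution parameters are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

# Two groups with right-skewed (log-normal) measurements.
a = rng.lognormal(mean=0.0, sigma=0.8, size=30)
b = rng.lognormal(mean=0.5, sigma=0.8, size=30)

# Option 1: log-transform, then a parametric test on the normalized data.
t, p_t = stats.ttest_ind(np.log(a), np.log(b))

# Option 2: a rank-based nonparametric test on the raw data.
u, p_u = stats.mannwhitneyu(a, b, alternative="two-sided")

print(f"t test on logs: p = {p_t:.4f};  Mann-Whitney U: p = {p_u:.4f}")
```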
An, Ming-Wen; Lu, Xin; Sargent, Daniel J; Mandrekar, Sumithra J
2015-01-01
A phase II design with an option for direct assignment (stop randomization and assign all patients to experimental treatment based on interim analysis, IA) for a predefined subgroup was previously proposed. Here, we illustrate the modularity of the direct assignment option by applying it to the setting of two predefined subgroups and testing for separate subgroup main effects. We power the 2-subgroup direct assignment option design with 1 IA (DAD-1) to test for separate subgroup main effects, with assessment of power to detect an interaction in a post-hoc test. Simulations assessed the statistical properties of this design compared to the 2-subgroup balanced randomized design with 1 IA, BRD-1. Different response rates for treatment/control in subgroup 1 (0.4/0.2) and in subgroup 2 (0.1/0.2, 0.4/0.2) were considered. The 2-subgroup DAD-1 preserves power and type I error rate compared to the 2-subgroup BRD-1, while exhibiting reasonable power in a post-hoc test for interaction. The direct assignment option is a flexible design component that can be incorporated into broader design frameworks, while maintaining desirable statistical properties, clinical appeal, and logistical simplicity.
Four modes of optical parametric operation for squeezed state generation
NASA Astrophysics Data System (ADS)
Andersen, U. L.; Buchler, B. C.; Lam, P. K.; Wu, J. W.; Gao, J. R.; Bachor, H.-A.
2003-11-01
We report a versatile instrument, based on a monolithic optical parametric amplifier, which reliably generates four different types of squeezed light. We obtained vacuum squeezing, low power amplitude squeezing, phase squeezing and bright amplitude squeezing. We show a complete analysis of this light, including a full quantum state tomography. In addition we demonstrate the direct detection of the squeezed state statistics without the aid of a spectrum analyser. This technique makes the nonclassical properties directly visible and allows complete measurement of the statistical moments of the squeezed quadrature.
Moretti, Stefano; van Leeuwen, Danitsja; Gmuender, Hans; Bonassi, Stefano; van Delft, Joost; Kleinjans, Jos; Patrone, Fioravante; Merlo, Domenico Franco
2008-01-01
Background: In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, over the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. Results: In this paper we introduce a bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, called Comparative Analysis of Shapley value (CASh for short), is applied to data concerning gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for differential gene expression. Both lists of genes provided by CASh and the t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than the t-test for the detection of differential gene expression variability. Conclusion: CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. We demonstrate a synergistic effect between coalitional games and statistics that resulted in a selection of genes with a potential impact on the regulation of complex pathways. PMID:18764936
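The Shapley value at the heart of CASh has a simple exact form for small games: a player's average marginal contribution over all join orders. The toy game below is purely illustrative; in CASh the players are genes and the characteristic function is defined from the microarray data, which is not reproduced here.

```python
import itertools
import math

# A small coalitional game: v maps each coalition (frozenset of players)
# to its worth.
players = ["g1", "g2", "g3"]
v = {frozenset(): 0, frozenset({"g1"}): 1, frozenset({"g2"}): 1,
     frozenset({"g3"}): 0, frozenset({"g1", "g2"}): 3,
     frozenset({"g1", "g3"}): 1, frozenset({"g2", "g3"}): 2,
     frozenset({"g1", "g2", "g3"}): 4}

def shapley(player):
    """Average marginal contribution of `player` over all join orders."""
    total = 0.0
    for order in itertools.permutations(players):
        before = frozenset(order[:order.index(player)])
        total += v[before | {player}] - v[before]
    return total / math.factorial(len(players))

for g in players:
    print(g, shapley(g))  # the values sum to v of the grand coalition
```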
Universal self-similarity of propagating populations
NASA Astrophysics Data System (ADS)
Eliazar, Iddo; Klafter, Joseph
2010-07-01
This paper explores the universal self-similarity of propagating populations. The following general propagation model is considered: particles are randomly emitted from the origin of a d -dimensional Euclidean space and propagate randomly and independently of each other in space; all particles share a statistically common—yet arbitrary—motion pattern; each particle has its own random propagation parameters—emission epoch, motion frequency, and motion amplitude. The universally self-similar statistics of the particles’ displacements and first passage times (FPTs) are analyzed: statistics which are invariant with respect to the details of the displacement and FPT measurements and with respect to the particles’ underlying motion pattern. Analysis concludes that the universally self-similar statistics are governed by Poisson processes with power-law intensities and by the Fréchet and Weibull extreme-value laws.
Statistical detection of patterns in unidimensional distributions by continuous wavelet transforms
NASA Astrophysics Data System (ADS)
Baluev, R. V.
2018-04-01
Objective detection of specific patterns in statistical distributions, like groupings, gaps, or abrupt transitions between different subsets, is a task with a rich range of applications in astronomy: Milky Way stellar population analysis, investigations of exoplanet diversity, Solar System minor body statistics, extragalactic studies, etc. We adapt the powerful technique of wavelet transforms to this generalized task, placing a strong emphasis on the assessment of the significance of pattern detections. Among other things, our method also involves optimal minimum-noise wavelets and minimum-noise reconstruction of the distribution density function. Based on this development, we construct a self-contained algorithmic pipeline aimed at processing statistical samples. It is currently applicable to single-dimensional distributions only, but it is flexible enough to undergo further generalizations and development.
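The basic machinery, a continuous wavelet transform of a one-dimensional density estimate, can be sketched with plain numpy. This toy version uses the common Ricker wavelet rather than the paper's optimal minimum-noise wavelets, and the binning and scale choices are arbitrary; the paper's contribution, the significance assessment, is only noted in a comment.

```python
import numpy as np

def ricker(points, width):
    """Ricker ('Mexican hat') wavelet, a common choice for peak/gap detection."""
    t = np.arange(points) - (points - 1) / 2.0
    a2 = width ** 2
    return (2 / (np.sqrt(3 * width) * np.pi ** 0.25)) \
        * (1 - t ** 2 / a2) * np.exp(-t ** 2 / (2 * a2))

def cwt(signal, widths):
    """Continuous wavelet transform as a bank of wavelet convolutions."""
    return np.array([np.convolve(signal, ricker(min(10 * w, len(signal)), w),
                                 mode="same") for w in widths])

rng = np.random.default_rng(10)
# A sample with two groupings separated by a gap, binned into a density.
sample = np.concatenate([rng.normal(-2, 0.5, 400), rng.normal(2, 0.5, 400)])
density, edges = np.histogram(sample, bins=128, density=True)

coeffs = cwt(density, widths=[2, 4, 8, 16])
# Large |coefficients| flag structure at the corresponding location/scale;
# assessing their significance against a noise model is the paper's focus.
print(coeffs.shape, np.abs(coeffs).max().round(3))
```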
NASA Astrophysics Data System (ADS)
Ryazanova, A. A.; Okladnikov, I. G.; Gordov, E. P.
2017-11-01
The frequency of occurrence and magnitude of extreme precipitation and temperature events show positive trends in several geographical regions. These events must be analyzed and studied in order to better understand their impact on the environment, predict their occurrence, and mitigate their effects. For this purpose, we augmented the web-GIS "CLIMATE" with a dedicated statistical package developed in the R language. The web-GIS "CLIMATE" is a software platform for cloud storage, processing, and visualization of distributed archives of spatial datasets. It is based on a combined use of web and GIS technologies with reliable procedures for searching, extracting, processing, and visualizing spatial data archives. The system provides a set of thematic online tools for the complex analysis of current and future climate changes and their effects on the environment. The package includes new, powerful methods of time-dependent statistics of extremes, quantile regression, and the copula approach for the detailed analysis of various climate extreme events. In particular, the very promising copula approach allows one to obtain the structural connections between the extremes and various environmental characteristics. The new statistical methods integrated into the web-GIS "CLIMATE" can significantly facilitate and accelerate the complex analysis of climate extremes using only a desktop PC connected to the Internet.
A data base and analysis program for shuttle main engine dynamic pressure measurements
NASA Technical Reports Server (NTRS)
Coffin, T.
1986-01-01
A dynamic pressure data base management system is described for measurements obtained from space shuttle main engine (SSME) hot firing tests. The data were provided in terms of engine power level and rms pressure time histories, and power spectra of the dynamic pressure measurements at selected times during each test. Test measurements and engine locations are defined along with a discussion of data acquisition and reduction procedures. A description of the data base management analysis system is provided and subroutines developed for obtaining selected measurement means, variances, ranges and other statistics of interest are discussed. A summary of pressure spectra obtained at SSME rated power level is provided for reference. Application of the singular value decomposition technique to spectrum interpolation is discussed and isoplots of interpolated spectra are presented to indicate measurement trends with engine power level. Program listings of the data base management and spectrum interpolation software are given. Appendices are included to document all data base measurements.
Upper limits on the 21 cm power spectrum at z = 5.9 from quasar absorption line spectroscopy
NASA Astrophysics Data System (ADS)
Pober, Jonathan C.; Greig, Bradley; Mesinger, Andrei
2016-11-01
We present upper limits on the 21 cm power spectrum at z = 5.9 calculated from the model-independent limit on the neutral fraction of the intergalactic medium of x_HI < 0.06 + 0.05 (1σ) derived from dark pixel statistics of quasar absorption spectra. Using 21CMMC, a Markov chain Monte Carlo Epoch of Reionization analysis code, we explore the probability distribution of 21 cm power spectra consistent with this constraint on the neutral fraction. We present 99 per cent confidence upper limits of Δ²(k) < 10–20 mK² over a range of k from 0.5 to 2.0 h Mpc⁻¹, with the exact limit dependent on the sampled k mode. This limit can be used as a null test for 21 cm experiments: a detection of power at z = 5.9 in excess of this value is highly suggestive of residual foreground contamination or other systematic errors affecting the analysis.
Sex Differences in the Response of Children with ADHD to Once-Daily Formulations of Methylphenidate
ERIC Educational Resources Information Center
Sonuga-Barke, J. S.; Coghill, David; Markowitz, John S.; Swanson, James M.; Vandenberghe, Mieke; Hatch, Simon J.
2007-01-01
Objectives: Studies of sex differences in methylphenidate response by children with attention-deficit/hyperactivity disorder have lacked methodological rigor and statistical power. This paper reports an examination of sex differences based on further analysis of data from a comparison of two once-daily methylphenidate formulations (the COMACS…
Managing Student Loan Default Risk: Evidence from a Privately Guaranteed Portfolio.
ERIC Educational Resources Information Center
Monteverde, Kirk
2000-01-01
Application of the statistical techniques of survival analysis and credit scoring to private education loans extended to law students found a pronounced seasoning effect for such loans and the robust predictive power of credit bureau scoring of borrowers. Other predictors of default included school-of-attendance, school's geographic location, and…
Take It Back: A Manual for Fighting Slurs on Campus.
ERIC Educational Resources Information Center
2003
This manual provides information on fighting verbal harassment on the school campus. Chapter 1, "Background," discusses the history and power analysis of verbal abuse and offers statistics about slurs and harassment (the widespread use of slurs, the damaging effect of slurs on people who are targeted by them, and how infrequently efforts…
Time Delay Embedding Increases Estimation Precision of Models of Intraindividual Variability
ERIC Educational Resources Information Center
von Oertzen, Timo; Boker, Steven M.
2010-01-01
This paper investigates the precision of parameters estimated from local samples of time dependent functions. We find that "time delay embedding," i.e., structuring data prior to analysis by constructing a data matrix of overlapping samples, increases the precision of parameter estimates and in turn statistical power compared to standard…
NASA Astrophysics Data System (ADS)
Kim, Hyun-Sil; Kim, Jae-Seung; Lee, Seong-Hyun; Seo, Yun-Ho
2014-12-01
Insertion loss prediction of large acoustical enclosures using the Statistical Energy Analysis (SEA) method is presented. The SEA model consists of three elements: the sound field inside the enclosure, the vibration energy of the enclosure panel, and the sound field outside the enclosure. It is assumed that the space surrounding the enclosure is sufficiently large so that there is no energy flow from the outside to the wall panel or to the air cavity inside the enclosure. The comparison of the predicted insertion loss to the measured data for typical large acoustical enclosures shows good agreement. It is found that if the critical frequency of the wall panel falls above the frequency region of interest, insertion loss is dominated by the sound transmission loss of the wall panel and the averaged sound absorption coefficient inside the enclosure. However, if the critical frequency of the wall panel falls into the frequency region of interest, the acoustic power from sound radiation by the wall panel must be added to the acoustic power from transmission through the panel.
Statistical analysis of CSP plants by simulating extensive meteorological series
NASA Astrophysics Data System (ADS)
Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana
2017-06-01
The feasibility analysis of any power plant project needs an estimate of the amount of energy it will be able to deliver to the grid during its lifetime. To achieve this, its feasibility study requires precise knowledge of the solar resource over a long-term period. In Concentrating Solar Power (CSP) projects, financing institutions typically require several statistical probability-of-exceedance scenarios for the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by simulating the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need for standardized procedures for the elaboration of DNI time series representative of a given probability of exceedance of annual DNI.
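For an empirical series of annual values, the probability-of-exceedance scenarios mentioned above reduce to simple quantiles. A minimal sketch with hypothetical yield numbers:

    import numpy as np

    # annual energy yields simulated for 34 meteorological years (MWh);
    # the values here are purely illustrative
    yields = np.random.default_rng(3).normal(100_000, 8_000, 34)

    # PXX = value exceeded in XX% of years; P90 is a common financing scenario
    p50 = np.percentile(yields, 50)
    p90 = np.percentile(yields, 10)  # exceeded 90% of the time -> 10th percentile
    print(f"P50 = {p50:.0f} MWh, P90 = {p90:.0f} MWh")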
NASA Astrophysics Data System (ADS)
He, Honghui; Dong, Yang; Zhou, Jialing; Ma, Hui
2017-03-01
As one of the salient features of light, polarization contains abundant structural and optical information about media. Recently, as a comprehensive description of polarization properties, Mueller matrix polarimetry has been applied to various biomedical studies such as the detection of cancerous tissues. Previous works have found that the structural information encoded in 2D Mueller matrix images can be presented by transformed parameters with a more explicit relationship to certain microstructural features. In this paper, we present a statistical analysis method that transforms the 2D Mueller matrix images into frequency distribution histograms (FDHs) and their central moments to reveal the dominant structural features of samples quantitatively. The experimental results for porcine heart, intestine, stomach, and liver tissues demonstrate that the transformation parameters and central moments based on the statistical analysis of Mueller matrix elements have simple relationships to the dominant microstructural properties of biomedical samples, including the density and orientation of fibrous structures and the depolarization power, diattenuation, and absorption abilities. The statistical analysis of 2D images of Mueller matrix elements may thus provide quantitative or semi-quantitative criteria for biomedical diagnosis.
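The FDH-plus-central-moments transformation described here is straightforward to sketch. The following is a simplified illustration on a synthetic element image; normalization choices and the mapping from moments to microstructure in the actual study are not reproduced.

    import numpy as np

    def fdh_moments(element_image, bins=256):
        # Frequency distribution histogram (FDH) of one Mueller matrix
        # element image, plus its low-order central moments.
        values = element_image.ravel()
        hist, edges = np.histogram(values, bins=bins, density=True)
        mean = values.mean()
        central = {k: np.mean((values - mean) ** k) for k in (2, 3, 4)}
        return hist, edges, mean, central

    # toy stand-in for a 2D Mueller matrix element image
    img = np.random.default_rng(4).beta(2, 5, size=(128, 128))
    hist, edges, mean, central = fdh_moments(img)
    print(f"mean={mean:.3f}, variance={central[2]:.4f}, "
          f"3rd central moment={central[3]:.5f}")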
NASA Technical Reports Server (NTRS)
Edwards, B. F.; Waligora, J. M.; Horrigan, D. J., Jr.
1985-01-01
This analysis was done to determine whether various decompression response groups could be characterized by the pooled nitrogen (N2) washout profiles of the group members. Pooling individual washout profiles provided a smooth, time-dependent function of means representative of each decompression response group. No statistically significant differences were detected. The statistical comparisons of the profiles were performed by means of univariate weighted t-tests at each 5-minute profile point, at significance levels of 5 and 10 percent. The estimated powers of the tests (i.e., the probabilities of detecting the observed differences in the pooled profiles) were of the order of 8 to 30 percent.
Monitoring Statistics Which Have Increased Power over a Reduced Time Range.
ERIC Educational Resources Information Center
Tang, S. M.; MacNeill, I. B.
1992-01-01
The problem of monitoring trends for changes at unknown times is considered. Statistics that permit one to focus high power on a segment of the monitored period are studied. Numerical procedures are developed to compute the null distribution of these statistics. (Author)
ViPAR: a software platform for the Virtual Pooling and Analysis of Research Data.
Carter, Kim W; Francis, Richard W; Bresnahan, M; Gissler, M; Grønborg, T K; Gross, R; Gunnes, N; Hammond, G; Hornig, M; Hultman, C M; Huttunen, J; Langridge, A; Leonard, H; Newman, S; Parner, E T; Petersson, G; Reichenberg, A; Sandin, S; Schendel, D E; Schalkwyk, L; Sourander, A; Steadman, C; Stoltenberg, C; Suominen, A; Surén, P; Susser, E; Sylvester Vethanayagam, A; Yusof, Z
2016-04-01
Research studies exploring the determinants of disease require sufficient statistical power to detect meaningful effects. Sample size is often increased through centralized pooling of disparately located datasets, though ethical, privacy and data ownership issues can often hamper this process. Methods that facilitate the sharing of research data that are sympathetic with these issues and which allow flexible and detailed statistical analyses are therefore in critical need. We have created a software platform for the Virtual Pooling and Analysis of Research data (ViPAR), which employs free and open source methods to provide researchers with a web-based platform to analyse datasets housed in disparate locations. Database federation permits controlled access to remotely located datasets from a central location. The Secure Shell protocol allows data to be securely exchanged between devices over an insecure network. ViPAR combines these free technologies into a solution that facilitates 'virtual pooling' where data can be temporarily pooled into computer memory and made available for analysis without the need for permanent central storage. Within the ViPAR infrastructure, remote sites manage their own harmonized research dataset in a database hosted at their site, while a central server hosts the data federation component and a secure analysis portal. When an analysis is initiated, requested data are retrieved from each remote site and virtually pooled at the central site. The data are then analysed by statistical software and, on completion, results of the analysis are returned to the user and the virtually pooled data are removed from memory. ViPAR is a secure, flexible and powerful analysis platform built on open source technology that is currently in use by large international consortia, and is made publicly available at [http://bioinformatics.childhealthresearch.org.au/software/vipar/]. © The Author 2015. Published by Oxford University Press on behalf of the International Epidemiological Association.
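The "virtual pooling" idea can be conveyed in a few lines. The sketch below is purely conceptual and is not ViPAR's implementation: the database federation and SSH transport are replaced by in-memory stand-ins, and all names are hypothetical. The key design point it mirrors is that pooled data exist only transiently in memory during the analysis.

    import pandas as pd

    SIMULATED_SITES = {
        "site_A": pd.DataFrame({"age": [34, 41], "case": [0, 1]}),
        "site_B": pd.DataFrame({"age": [29, 55], "case": [1, 1]}),
    }

    def fetch_remote(site):
        # Stand-in for the secure retrieval step (in reality: database
        # federation plus SSH transport from each remote site).
        return SIMULATED_SITES[site]

    def virtual_pool_and_analyse(sites):
        # temporarily pool harmonized data in memory, analyse, then discard
        pooled = pd.concat([fetch_remote(s) for s in sites], ignore_index=True)
        result = pooled.groupby("case")["age"].mean()  # the 'analysis'
        del pooled                                     # no permanent central copy
        return result

    print(virtual_pool_and_analyse(["site_A", "site_B"]))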
POWER ANALYSIS FOR COMPLEX MEDIATIONAL DESIGNS USING MONTE CARLO METHODS
Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.
2013-01-01
Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex mediational models. The approach is based on the well known technique of generating a large number of samples in a Monte Carlo study, and estimating power as the percentage of cases in which an estimate of interest is significantly different from zero. Examples of power calculation for commonly used mediational models are provided. Power analyses for the single mediator, multiple mediators, three-path mediation, mediation with latent variables, moderated mediation, and mediation in longitudinal designs are described. Annotated sample syntax for Mplus is appended and tabled values of required sample sizes are shown for some models. PMID:23935262
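The Monte Carlo framework described here is easy to sketch for the single-mediator case. The example below uses the joint-significance test in Python rather than the paper's Mplus workflow; the path coefficients, sample size, and simplified regression of Y on M alone are illustrative assumptions.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    def mediation_power(n, a=0.3, b=0.3, n_sims=2000, alpha=0.05):
        # Monte Carlo power for a single-mediator model X -> M -> Y:
        # power = fraction of simulated samples where both paths are
        # significant (joint-significance test).
        hits = 0
        for _ in range(n_sims):
            x = rng.normal(size=n)
            m = a * x + rng.normal(size=n)
            y = b * m + rng.normal(size=n)
            p_a = stats.linregress(x, m).pvalue
            p_b = stats.linregress(m, y).pvalue  # simplification: X->Y path ignored
            hits += (p_a < alpha) and (p_b < alpha)
        return hits / n_sims

    print(mediation_power(n=100))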
Publication bias was not a good reason to discourage trials with low power.
Borm, George F; den Heijer, Martin; Zielhuis, Gerhard A
2009-01-01
The objective was to investigate whether it is justified to discourage trials with less than 80% power. Trials with low power are unlikely to produce conclusive results, but their findings can be used by pooling them in a meta-analysis. However, such an analysis may be biased, because trials with low power are likely to have a nonsignificant result and are less likely to be published than trials with a statistically significant outcome. We simulated several series of studies with varying degrees of publication bias and then calculated the "real" one-sided type I error and the bias of meta-analyses with a "nominal" error rate (significance level) of 2.5%. In single trials, in which heterogeneity was set at zero, low, and high, the error rates were 2.3%, 4.7%, and 16.5%, respectively. In multiple trials with 80%-90% power and a publication rate of 90% for nonsignificant results, the error rates could be as high as 5.1%. When the power was 50% and the publication rate of nonsignificant results was 60%, the error rates did not exceed 5.3%, whereas the bias was at most 15% of the difference used in the power calculation. The impact of publication bias does not warrant the exclusion of trials with 50% power.
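A stripped-down version of this simulation design can be written in a few lines. The sketch below works on per-trial z-statistics under a true null and an equal-weight fixed-effect pooling; trial counts, publication rates, and the omission of heterogeneity are illustrative assumptions.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)

    def biased_meta_error(pub_rate_nonsig, n_trials=20, n_sims=4000,
                          alpha=0.025):
        # One-sided type I error of a fixed-effect meta-analysis of equal-size
        # trials when nonsignificant trials are published only with probability
        # pub_rate_nonsig. The true effect is zero, so any excess over alpha
        # is attributable to publication bias.
        crit = stats.norm.ppf(1 - alpha)
        errors = 0
        for _ in range(n_sims):
            z = rng.normal(0.0, 1.0, n_trials)            # per-trial z under H0
            published = (z > crit) | (rng.uniform(size=n_trials) < pub_rate_nonsig)
            zp = z[published]
            # pooled z for equal-weight studies is mean(z) * sqrt(k)
            if zp.size and zp.mean() * np.sqrt(zp.size) > crit:
                errors += 1
        return errors / n_sims

    print(biased_meta_error(pub_rate_nonsig=0.6))  # inflated above nominal 2.5%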
Kling, Daniel; Egeland, Thore; Piñero, Mariana Herrera; Vigeland, Magnus Dehli
2017-11-01
Methods and implementations of DNA-based identification are well established in several forensic contexts. However, assessing the statistical power of these methods has been largely overlooked, except in the simplest cases. In this paper we outline general methods for such power evaluation, and apply them to a large set of family reunification cases, where the objective is to decide whether a person of interest (POI) is identical to the missing person (MP) in a family, based on the DNA profile of the POI and available family members. As such, this application closely resembles database searching and disaster victim identification (DVI). If parents or children of the MP are available, they will typically provide sufficient statistical evidence to settle the case. However, if one must resort to more distant relatives, it is not a priori obvious that a reliable conclusion is likely to be reached. In these cases power evaluation can be highly valuable, for instance in the recruitment of additional family members. To assess the power in an identification case, we advocate the combined use of two statistics: the Probability of Exclusion and the Probability of Exceedance. The former is the probability that the genotypes of a random, unrelated person are incompatible with the available family data. If this is close to 1, it is likely that a conclusion will be achieved regarding general relatedness, but not necessarily the specific relationship. To evaluate the ability to recognize a true match, we use simulations to estimate exceedance probabilities, i.e. the probability that the likelihood ratio will exceed a given threshold, assuming that the POI is indeed the MP. All simulations are done conditionally on available family data. Such conditional simulations have a long history in medical linkage analysis, but to our knowledge this is the first systematic forensic genetics application. Also, for forensic markers mutations cannot be ignored and therefore current models and implementations must be extended. All the tools are freely available in Familias (http://www.familias.no), empowered by the R library paramlink. The above approach is applied to a large and important data set: 'The missing grandchildren of Argentina'. We evaluate the power of 196 families from the DNA reference databank (Banco Nacional de Datos Genéticos, http://www.bndg.gob.ar). As a result, we show that 58 of the families have poor statistical power and require additional genetic data to enable a positive identification. Copyright © 2017 Elsevier B.V. All rights reserved.
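The two statistics advocated here can be illustrated on a deliberately simplified toy model. The sketch below is not the Familias/paramlink implementation: it uses a single made-up marker, ignores mutations, treats the MP as homozygous at every marker, and does not condition on family data, all of which are loud simplifications.

    import numpy as np

    rng = np.random.default_rng(7)

    # toy marker with four alleles; real casework uses many STR markers,
    # family-conditional simulation, and mutation models
    freqs = np.array([0.4, 0.3, 0.2, 0.1])

    def prob_no_shared_allele(genotype_freqs):
        # P(a random unrelated person carries neither allele of a given
        # genotype) under Hardy-Weinberg: a toy stand-in for the paper's
        # Probability of Exclusion.
        q = genotype_freqs.sum()
        return (1 - q) ** 2

    def exceedance_prob(lr_threshold=1_000.0, n_markers=15, n_sims=100_000):
        # P(identity LR exceeds a threshold | POI is the MP), by simulation.
        # With the MP treated as homozygous at each marker, the per-marker
        # LR for a matching genotype is 1 / p_i**2.
        idx = rng.choice(freqs.size, size=(n_sims, n_markers), p=freqs)
        log_lr = -2.0 * np.log(freqs[idx]).sum(axis=1)
        return float(np.mean(log_lr > np.log(lr_threshold)))

    print(prob_no_shared_allele(freqs[[0, 1]]))  # genotype a1/a2
    print(exceedance_prob())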
HARMONIC SPACE ANALYSIS OF PULSAR TIMING ARRAY REDSHIFT MAPS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roebber, Elinore; Holder, Gilbert, E-mail: roebbere@physics.mcgill.ca
2017-01-20
In this paper, we propose a new framework for treating the angular information in the pulsar timing array (PTA) response to a gravitational wave (GW) background based on standard cosmic microwave background techniques. We calculate the angular power spectrum of the all-sky gravitational redshift pattern induced at the Earth for both a single bright source of gravitational radiation and a statistically isotropic, unpolarized Gaussian random GW background. The angular power spectrum is the harmonic transform of the Hellings and Downs curve. We use the power spectrum to examine the expected variance in the Hellings and Downs curve in both cases. Finally, we discuss the extent to which PTAs are sensitive to the angular power spectrum and find that the power spectrum sensitivity is dominated by the quadrupole anisotropy of the gravitational redshift map.
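The harmonic transform of the Hellings and Downs curve mentioned here can be computed numerically. The sketch below uses the standard closed form of the curve and a Gauss-Legendre quadrature; normalization conventions differ between papers, so the coefficients should be read up to an overall constant.

    import numpy as np
    from scipy.special import eval_legendre

    def hellings_downs(costheta):
        # Hellings & Downs correlation vs cos(angular separation) between
        # pulsar pairs; standard closed form, equal to 1/2 at zero separation.
        x = (1 - costheta) / 2
        with np.errstate(divide="ignore", invalid="ignore"):
            term = np.where(x > 0, 1.5 * x * np.log(x), 0.0)
        return term - x / 4 + 0.5

    # Gauss-Legendre quadrature nodes for cos(theta) in [-1, 1]
    nodes, weights = np.polynomial.legendre.leggauss(200)
    mu = hellings_downs(nodes)

    # Legendre (harmonic) transform: c_l = (2l+1)/2 * integral of mu * P_l
    for ell in range(6):
        c_ell = (2 * ell + 1) / 2 * np.sum(weights * mu * eval_legendre(ell, nodes))
        print(ell, f"{c_ell: .4f}")   # monopole and dipole vanish; l >= 2 carry power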
NASA Astrophysics Data System (ADS)
Tomarov, G. V.; Povarov, V. P.; Shipkov, A. A.; Gromov, A. F.; Kiselev, A. N.; Shepelev, S. V.; Galanin, A. V.
2015-02-01
Specific features relating to the development of an information-analytical system for the problem of flow-accelerated corrosion of pipeline elements in the secondary coolant circuit of the VVER-440-based power units at the Novovoronezh nuclear power plant are considered. The results of a statistical analysis of data on the quantity, location, and operating conditions of the elements and preinserted segments of pipelines used in the condensate-feedwater and wet-steam paths are presented. The principles of preparing and using the information-analytical system for determining the service life before inadmissible wall thinning is reached in elements of pipelines used in the secondary coolant circuit of the VVER-440-based power units at the Novovoronezh NPP are considered.
Steeves, Darren; Campagna, Phil
2018-02-14
This project investigated whether there was a relationship between maximal aerobic power and the recovery or performance in elite ice hockey players during a simulated hockey game. An on-ice protocol was used to simulate a game of ice hockey. Recovery values were determined by the differences in lactate and heart rate measures. Total distance traveled was also recorded as a performance measure. On two other days, subjects returned and completed a maximal aerobic power test on a treadmill and a maximal lactate test on ice. Statistical analysis showed no relationship between maximal aerobic power or maximal lactate values and recovery (heart rate, lactate) or the performance measure of distance traveled. It was concluded there was no relationship between maximal aerobic power and recovery during a simulated game in elite hockey players.
NASA Astrophysics Data System (ADS)
Jain, A.
2017-08-01
Computer-based methods can help in the discovery of leads and can potentially eliminate the chemical synthesis and screening of many irrelevant compounds, saving time as well as cost. Molecular modeling systems are powerful tools for building, visualizing, analyzing, and storing models of complex molecular structures that can help interpret structure-activity relationships. The use of various molecular mechanics and dynamics techniques and software in computer-aided drug design, along with statistical analysis, is a powerful tool for medicinal chemists seeking to synthesize therapeutic and effective drugs with minimal side effects.
Gait analysis in patients operated on for sacrococcygeal teratoma.
Zaccara, Antonio; Iacobelli, Barbara D; Adorisio, Ottavio; Petrarca, Maurizio; Di Rosa, Giuseppe; Pierro, Marcello M; Bagolan, Pietro
2004-06-01
Long-term follow-up of sacrococcygeal teratoma (SCT) is well established; however, little is known about the effects of extensive surgery in the pelvic and perineal region, which involves disruption of muscles providing maximal support in normal walking. Thirteen patients operated on at birth for SCT with extensive muscle dissection underwent gait studies with a Vicon 3-D motion analysis system with 6 cameras. Results were compared with 15 age-matched controls. Statistical analysis was performed with the Mann-Whitney test; correlations were sought with Spearman's correlation coefficient. All subjects were independent ambulators, and no statistically significant differences were seen in walking velocity and stride length. However, in all patients, toe-off occurred earlier (at 58% ± 1.82% of stride length) than in controls (at 65.5% ± 0.52%; P < .05). On kinetics, all patients exhibited, on both limbs, a significant reduction of hip extensory moment (-0.11 ± 0.11 left; -0.16 ± 0.15 right vs 1.19 ± 0.08 Nm/kg; P < .05) and of ankle dorsi/plantar moment (-0.07 ± 0.09 right; -0.08 ± 0.16 left vs -0.15 ± 0.05 Nm/kg; P < .05). Knee power was also significantly reduced (0.44 ± 0.55 right, 0.63 ± 0.45 left vs 0.04 ± 0.05 W/kg), whereas ankle power was increased (3 ± 1.5 right; 2.8 ± 0.9 left vs 1.97 ± 0.2 W/kg; P < .05). No statistically significant correlation was found between tumor size and either muscle power generation or flexory/extensory moments. Patients operated on for SCT exhibit nearly normal gait patterns. However, this normal pattern is accompanied by abnormal kinetics of some ambulatory muscles, and the extent of these abnormalities appears to be independent of tumor size. A careful follow-up is warranted to verify whether such modifications are stable or progress over the years, thereby impairing ambulatory potential or leading to early arthrosis.
The Power Laws of Violence against Women: Rescaling Research and Policies
Kappler, Karolin E.; Kaltenbrunner, Andreas
2012-01-01
Background Violence against Women –despite its perpetuation over centuries and its omnipresence at all social levels– entered into social consciousness and the general agenda of Social Sciences only recently, mainly thanks to feminist research, campaigns, and general social awareness. The present article analyzes in a secondary analysis of German prevalence data on Violence against Women, whether the frequency and severity of Violence against Women can be described with power laws. Principal Findings Although the investigated distributions all resemble power-law distributions, a rigorous statistical analysis accepts this hypothesis at a significance level of 0.1 only for 1 of 5 cases of the tested frequency distributions and with some restrictions for the severity of physical violence. Lowering the significance level to 0.01 leads to the acceptance of the power-law hypothesis in 2 of the 5 tested frequency distributions and as well for the severity of domestic violence. The rejections might be mainly due to the noise in the data, with biases caused by self-reporting, errors through rounding, desirability response bias, and selection bias. Conclusion Future victimological surveys should be designed explicitly to avoid these deficiencies in the data to be able to clearly answer the question whether Violence against Women follows a power-law pattern. This finding would not only have statistical implications for the processing and presentation of the data, but also groundbreaking consequences on the general understanding of Violence against Women and policy modeling, as the skewed nature of the underlying distributions makes evident that Violence against Women is a highly disparate and unequal social problem. This opens new questions for interdisciplinary research, regarding the interplay between environmental, experimental, and social factors on victimization. PMID:22768348
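The power-law testing pipeline referenced here (maximum-likelihood fit above a lower cutoff plus a goodness-of-fit distance, in the spirit of Clauset, Shalizi, and Newman) can be sketched in a few lines. The data, the choice of xmin, and the omitted bootstrap p-value step are illustrative assumptions.

    import numpy as np

    def fit_power_law(data, xmin):
        # Continuous power-law MLE above xmin, with a Kolmogorov-Smirnov
        # distance as a rough goodness-of-fit measure. A full analysis also
        # bootstraps a p-value for the KS distance.
        tail = np.asarray(data, dtype=float)
        tail = tail[tail >= xmin]
        n = tail.size
        alpha = 1 + n / np.sum(np.log(tail / xmin))
        # KS distance between empirical and fitted tail CDFs
        s = np.sort(tail)
        emp = np.arange(1, n + 1) / n
        model = 1 - (s / xmin) ** (1 - alpha)
        ks = np.max(np.abs(emp - model))
        return alpha, ks

    rng = np.random.default_rng(8)
    sample = rng.pareto(1.5, 5000) + 1.0     # true alpha = 2.5, xmin = 1
    print(fit_power_law(sample, xmin=1.0))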
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bihn T. Pham; Jeffrey J. Einerson
2010-06-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory's Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
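Of the three techniques named, control charting is the simplest to sketch. The snippet below is a generic Shewhart-style chart in Python, not the SAS-based NDMAS logic; the baseline window, limits, and readings are hypothetical.

    import numpy as np

    def shewhart_limits(baseline, k=3.0):
        # Control-chart limits from a baseline window of in-control readings;
        # points outside mu +/- k*sigma are flagged for review.
        mu, sigma = baseline.mean(), baseline.std(ddof=1)
        return mu - k * sigma, mu + k * sigma

    rng = np.random.default_rng(9)
    baseline = rng.normal(1000, 5, 200)               # in-control temperatures (C)
    lo, hi = shewhart_limits(baseline)
    new_readings = np.array([1003.0, 998.0, 1042.0])  # last one: failing sensor?
    flags = (new_readings < lo) | (new_readings > hi)
    print(list(zip(new_readings, flags)))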
Mining protein-protein interaction networks: denoising effects
NASA Astrophysics Data System (ADS)
Marras, Elisabetta; Capobianco, Enrico
2009-01-01
A typical instrument in complex network studies is the analysis of statistical distributions. These are usually computed for measures that characterize network topology and are aimed at capturing both structural and dynamic aspects. Protein-protein interaction networks (PPIN) have also been studied through several such measures. In general, a power law is expected to characterize scale-free networks. However, the mixture of intrinsic noise, outlying information, and other system-dependent fluctuations makes the empirical detection of the power law a difficult task. As a result, the uncertainty level increases when looking at the observed sample; in particular, one may wonder whether the computed features are sufficient to explain the interactome. We address these noise problems by implementing both decomposition and denoising techniques that reduce the impact of factors known to affect the accuracy of power-law detection.
Avalanche statistics from data with low time resolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
LeBlanc, Michael; Nawano, Aya; Wright, Wendelin J.
Extracting avalanche distributions from experimental microplasticity data can be hampered by limited time resolution. We compute the effects of low time resolution on avalanche size distributions and give quantitative criteria for diagnosing and circumventing problems associated with low time resolution. We show that traditional analysis of data obtained at low acquisition rates can lead to avalanche size distributions with incorrect power-law exponents or no power-law scaling at all. Furthermore, we demonstrate that it can lead to apparent data collapses with incorrect power-law and cutoff exponents. We propose new methods to analyze low-resolution stress-time series that can recover the size distribution of the underlying avalanches even when the resolution is so low that naive analysis methods give incorrect results. We test these methods on both downsampled simulation data from a simple model and downsampled bulk metallic glass compression data and find that the methods recover the correct critical exponents.
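The "naive" avalanche extraction whose failure at low acquisition rates this paper quantifies can be sketched as follows. The signal, threshold, and downsampling factor are toy assumptions; the point of the demo is simply that coarse sampling merges or loses avalanches.

    import numpy as np

    def avalanche_sizes(signal, dt, threshold=0.0):
        # Extract avalanche sizes from a velocity-like time series by
        # integrating contiguous excursions above threshold.
        above = signal > threshold
        sizes, current = [], 0.0
        for v, a in zip(signal, above):
            if a:
                current += v * dt
            elif current > 0:
                sizes.append(current)
                current = 0.0
        if current > 0:
            sizes.append(current)
        return np.array(sizes)

    rng = np.random.default_rng(10)
    v = np.maximum(rng.normal(0, 1, 100_000), 0)   # toy stand-in signal
    fine = avalanche_sizes(v, dt=1.0)
    coarse = avalanche_sizes(v.reshape(-1, 10).mean(axis=1), dt=10.0)  # downsampled
    print(len(fine), len(coarse))  # coarse sampling merges distinct avalanches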
Estimating statistical power for open-enrollment group treatment trials.
Morgan-Lopez, Antonio A; Saavedra, Lissette M; Hien, Denise A; Fals-Stewart, William
2011-01-01
Modeling turnover in group membership has been identified as a key barrier contributing to a disconnect between the manner in which behavioral treatment is conducted (open-enrollment groups) and the designs of substance abuse treatment trials (closed-enrollment groups, individual therapy). Latent class pattern mixture models (LCPMMs) are emerging tools for modeling data from open-enrollment groups with membership turnover in recently proposed treatment trials. The current article illustrates an approach to conducting power analyses for open-enrollment designs based on the Monte Carlo simulation of LCPMM models using parameters derived from published data from a randomized controlled trial comparing Seeking Safety to a Community Care condition for women presenting with comorbid posttraumatic stress disorder and substance use disorders. The example addresses discrepancies between the analysis framework assumed in power analyses of many recently proposed open-enrollment trials and the proposed use of LCPMM for data analysis. Copyright © 2011 Elsevier Inc. All rights reserved.
Power calculation for overall hypothesis testing with high-dimensional commensurate outcomes.
Chi, Yueh-Yun; Gribbin, Matthew J; Johnson, Jacqueline L; Muller, Keith E
2014-02-28
The complexity of system biology means that any metabolic, genetic, or proteomic pathway typically includes so many components (e.g., molecules) that statistical methods specialized for overall testing of high-dimensional and commensurate outcomes are required. While many overall tests have been proposed, very few have power and sample size methods. We develop accurate power and sample size methods and software to facilitate study planning for high-dimensional pathway analysis. With an account of any complex correlation structure between high-dimensional outcomes, the new methods allow power calculation even when the sample size is less than the number of variables. We derive the exact (finite-sample) and approximate non-null distributions of the 'univariate' approach to repeated measures test statistic, as well as power-equivalent scenarios useful to generalize our numerical evaluations. Extensive simulations of group comparisons support the accuracy of the approximations even when the ratio of number of variables to sample size is large. We derive a minimum set of constants and parameters sufficient and practical for power calculation. Using the new methods and specifying the minimum set to determine power for a study of metabolic consequences of vitamin B6 deficiency helps illustrate the practical value of the new results. Free software implementing the power and sample size methods applies to a wide range of designs, including one group pre-intervention and post-intervention comparisons, multiple parallel group comparisons with one-way or factorial designs, and the adjustment and evaluation of covariate effects. Copyright © 2013 John Wiley & Sons, Ltd.
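Power for the 'univariate' approach to repeated measures (UNIREP) is conventionally approximated via a noncentral F distribution. The sketch below is only the textbook sphericity-corrected approximation, not the exact finite-sample derivation of this paper; the sample size, number of variables, noncentrality, and epsilon are all hypothetical inputs.

    from scipy import stats

    def unirep_power(n, p, effect_ncp, alpha=0.05, epsilon=1.0):
        # Approximate UNIREP power: compare a noncentral F with
        # sphericity-corrected degrees of freedom against the null
        # critical value. epsilon is the Greenhouse-Geisser correction.
        dfn = epsilon * (p - 1)
        dfd = epsilon * (p - 1) * (n - 1)
        fcrit = stats.f.ppf(1 - alpha, dfn, dfd)
        return stats.ncf.sf(fcrit, dfn, dfd, effect_ncp)

    # hypothetical study: 25 subjects, 40 commensurate outcomes
    print(unirep_power(n=25, p=40, effect_ncp=12.0, epsilon=0.5))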
Statistical power calculations for mixed pharmacokinetic study designs using a population approach.
Kloprogge, Frank; Simpson, Julie A; Day, Nicholas P J; White, Nicholas J; Tarning, Joel
2014-09-01
Simultaneous modelling of dense and sparse pharmacokinetic data is possible with a population approach. To determine the number of individuals required to detect the effect of a covariate, simulation-based power calculation methodologies can be employed. The Monte Carlo Mapped Power method (a simulation-based power calculation methodology using the likelihood ratio test) was extended in the current study to perform sample size calculations for mixed pharmacokinetic studies (i.e. both sparse and dense data collection). A workflow guiding an easy and straightforward pharmacokinetic study design, considering also the cost-effectiveness of alternative study designs, was used in this analysis. Initially, data were simulated for a hypothetical drug and then for the anti-malarial drug, dihydroartemisinin. Two datasets (sampling design A: dense; sampling design B: sparse) were simulated using a pharmacokinetic model that included a binary covariate effect and subsequently re-estimated using (1) the same model and (2) a model not including the covariate effect in NONMEM 7.2. Power calculations were performed for varying numbers of patients with sampling designs A and B. Study designs with statistical power >80% were selected and further evaluated for cost-effectiveness. The simulation studies of the hypothetical drug and the anti-malarial drug dihydroartemisinin demonstrated that the simulation-based power calculation methodology, based on the Monte Carlo Mapped Power method, can be utilised to evaluate and determine the sample size of mixed (part sparsely and part densely sampled) study designs. The developed method can contribute to the design of robust and efficient pharmacokinetic studies.