Statistical Significance Testing.
ERIC Educational Resources Information Center
McLean, James E., Ed.; Kaufman, Alan S., Ed.
1998-01-01
The controversy about the use or misuse of statistical significance testing has become the major methodological issue in educational research. This special issue contains three articles that explore the controversy, three commentaries on these articles, an overall response, and three rejoinders by the first three authors. They are: (1)…
Lack of Statistical Significance
ERIC Educational Resources Information Center
Kehle, Thomas J.; Bray, Melissa A.; Chafouleas, Sandra M.; Kawano, Takuji
2007-01-01
Criticism has been leveled against the use of statistical significance testing (SST) in many disciplines. However, the field of school psychology has been largely devoid of critiques of SST. Inspection of the primary journals in school psychology indicated numerous examples of SST with nonrandom samples and/or samples of convenience. In this…
Statistical or biological significance?
Saxon, Emma
2015-01-01
Oat plants grown at an agricultural research facility produce higher yields in Field 1 than in Field 2, under well fertilised conditions and with similar weather exposure; all oat plants in both fields are healthy and show no sign of disease. In this study, the authors hypothesised that the soil microbial community might be different in each field, and these differences might explain the difference in oat plant growth. They carried out a metagenomic analysis of the 16S ribosomal 'signature' sequences from bacteria in 50 randomly located soil samples in each field to determine the composition of the bacterial community. The study identified >1000 species, most of which were present in both fields. The authors identified two plant growth-promoting species that were significantly reduced in soil from Field 2 (Student's t-test P < 0.05), and concluded that these species might have contributed to reduced yield. PMID:26541972
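The scenario above involves many simultaneous comparisons, so a small calculation helps show why P < 0.05 alone is weak evidence here. The sketch below assumes one t-test per species and uses 1000 as a round stand-in for the ">1000 species" reported; the abstract does not say whether any multiple-testing correction was applied, and the Bonferroni threshold is shown only as one common option.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected count of tests with P < alpha if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# Assumed, illustrative figures (not taken from the study's data).
n_species = 1000
alpha = 0.05

print("expected chance findings:", expected_false_positives(n_species, alpha))
print("Bonferroni per-test threshold:", bonferroni_threshold(n_species, alpha))
```

Under these assumptions, roughly fifty species would be expected to cross P < 0.05 by chance alone, so two significant species would not stand out.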
Statistically significant relational data mining
Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.
2014-02-01
This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.
Significant results: statistical or clinical?
2016-01-01
The null hypothesis significance test method is popular in biological and medical research. Many researchers have used this method for their research without exact knowledge, though it has both merits and shortcomings. Readers will learn its shortcomings, as well as several complementary or alternative methods, such as the estimated effect size and the confidence interval. PMID:27066201
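As a minimal illustration of the two complements named above, the sketch below computes a standardized effect size (Cohen's d with a pooled standard deviation) and an approximate 95% confidence interval for a difference in means. The data are invented, and the interval uses the normal quantile 1.96 rather than a t quantile, which is an assumption that is only reasonable for larger samples.

```python
import math
import statistics


def pooled_variance(a, b):
    """Pooled sample variance of two independent groups."""
    na, nb = len(a), len(b)
    return ((na - 1) * statistics.variance(a)
            + (nb - 1) * statistics.variance(b)) / (na + nb - 2)


def cohens_d(a, b):
    """Standardized mean difference (b minus a) using the pooled standard deviation."""
    return (statistics.mean(b) - statistics.mean(a)) / math.sqrt(pooled_variance(a, b))


def mean_difference_ci(a, b, z=1.96):
    """Approximate confidence interval for mean(b) - mean(a), normal approximation."""
    diff = statistics.mean(b) - statistics.mean(a)
    se = math.sqrt(pooled_variance(a, b) * (1 / len(a) + 1 / len(b)))
    return diff - z * se, diff + z * se


# Hypothetical measurements for a control and a treated group.
control = [1, 2, 3, 4, 5]
treated = [2, 3, 4, 5, 6]
print("Cohen's d:", cohens_d(control, treated))
print("approx. 95% CI for the difference:", mean_difference_ci(control, treated))
```

Reporting the interval alongside the effect size shows both the magnitude of the difference and how precisely it was estimated, which a bare P value does not.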
Statistical significance of the gallium anomaly
Giunti, Carlo; Laveder, Marco
2011-06-15
We calculate the statistical significance of the anomalous deficit of electron neutrinos measured in the radioactive source experiments of the GALLEX and SAGE solar neutrino detectors, taking into account the uncertainty of the detection cross section. We found that the statistical significance of the anomaly is ≈3.0σ. A fit of the data in terms of neutrino oscillations favors at ≈2.7σ short-baseline electron neutrino disappearance with respect to the null hypothesis of no oscillations.
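For readers more used to P values, a significance quoted in σ can be translated into a Gaussian tail probability. The sketch below assumes a standard normal test statistic; the abstract does not say whether a one-sided or two-sided convention was used, so both are shown.

```python
import math


def two_sided_p_from_sigma(n_sigma: float) -> float:
    """Probability that a standard normal variable exceeds n_sigma in absolute value."""
    return math.erfc(n_sigma / math.sqrt(2.0))


def one_sided_p_from_sigma(n_sigma: float) -> float:
    """Probability that a standard normal variable exceeds n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


for n_sigma in (3.0, 2.7):
    print(n_sigma, two_sided_p_from_sigma(n_sigma), one_sided_p_from_sigma(n_sigma))
```

Under the two-sided convention, 3.0σ corresponds to a probability of roughly 0.0027 and 2.7σ to roughly 0.0069.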
Statistical Significance vs. Practical Significance: An Exploration through Health Education
ERIC Educational Resources Information Center
Rosen, Brittany L.; DeMaria, Andrea L.
2012-01-01
The purpose of this paper is to examine the differences between statistical and practical significance, including strengths and criticisms of both methods, as well as provide information surrounding the application of various effect sizes and confidence intervals within health education research. Provided are recommendations, explanations and…
Comments on the Statistical Significance Testing Articles.
ERIC Educational Resources Information Center
Knapp, Thomas R.
1998-01-01
Expresses a "middle-of-the-road" position on statistical significance testing, suggesting that it has its place but that confidence intervals are generally more useful. Identifies 10 errors of omission or commission in the papers reviewed that weaken the positions taken in their discussions. (SLD)
Statistical significance of normalized global alignment.
Peris, Guillermo; Marzal, Andrés
2014-03-01
The comparison of homologous proteins from different species is a first step toward a function assignment and a reconstruction of the species evolution. Though local alignment is mostly used for this purpose, global alignment is important for constructing multiple alignments or phylogenetic trees. However, statistical significance of global alignments is not completely clear, lacking a specific statistical model to describe alignments or depending on computationally expensive methods like Z-score. Recently we presented a normalized global alignment, defined as the best compromise between global alignment cost and length, and showed that this new technique led to better classification results than Z-score at a much lower computational cost. However, it is necessary to analyze the statistical significance of the normalized global alignment in order to be considered a completely functional algorithm for protein alignment. Experiments with unrelated proteins extracted from the SCOP ASTRAL database showed that normalized global alignment scores can be fitted to a log-normal distribution. This fact, obtained without any theoretical support, can be used to derive statistical significance of normalized global alignments. Results are summarized in a table with fitted parameters for different scoring schemes. PMID:24400820
Assessing the statistical significance of periodogram peaks
NASA Astrophysics Data System (ADS)
Baluev, R. V.
2008-04-01
The least-squares (or Lomb-Scargle) periodogram is a powerful tool that is routinely used in many branches of astronomy to search for periodicities in observational data. The problem of assessing the statistical significance of candidate periodicities for a number of periodograms is considered. Based on results in extreme value theory, improved analytic estimations of false alarm probabilities are given. These include an upper limit to the false alarm probability (or a lower limit to the significance). The estimations are tested numerically in order to establish regions of their practical applicability.
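To make the notion of a false alarm probability concrete, the sketch below implements the classic independent-frequencies estimate, FAP = 1 - (1 - e^{-z})^M. This is not the extreme-value result derived in the paper: it assumes a periodogram normalized so that the power z at a single frequency is exponentially distributed under the noise-only hypothesis, and M (the effective number of independent frequencies) is a quantity the user has to supply.

```python
import math


def false_alarm_probability(z: float, n_independent: int) -> float:
    """Chance that the highest of n_independent noise-only peaks exceeds power z."""
    p_single = math.exp(-z)  # single-frequency tail probability under the null
    if p_single >= 1.0:
        return 1.0
    # log of (1 - p_single)^M, computed with log1p/expm1 for numerical stability
    log_no_alarm = n_independent * math.log1p(-p_single)
    return -math.expm1(log_no_alarm)


# Hypothetical peak height and number of independent frequencies.
print(false_alarm_probability(z=10.0, n_independent=100))
```

The estimate never exceeds the simple bound M·e^{-z}, which is a quick sanity check on any peak.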
Social significance of community structure: Statistical view
NASA Astrophysics Data System (ADS)
Li, Hui-Jia; Daniels, Jasmine J.
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p-value theory and network analysis, and then we obtain a significance measure of statistical form. Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc.
Social significance of community structure: statistical view.
Li, Hui-Jia; Daniels, Jasmine J
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p-value theory and network analysis, and then we obtain a significance measure of statistical form. Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc. PMID:25679651
Statistical Significance of Trends in Exoplanetary Atmospheres
NASA Astrophysics Data System (ADS)
Harrington, Joseph; Bowman, M.; Blumenthal, S. D.; Loredo, T. J.; UCF Exoplanets Group
2013-10-01
Cowan and Agol (2011) and we (Harrington et al. 2007, 2010, 2011, 2012, 2013) have noted that at higher equilibrium temperatures, observed exoplanet fluxes are substantially higher than even the elevated equilibrium temperature predicts. With a substantial increase in the number of atmospheric flux measurements, we can now test the statistical significance of this trend. We can also cast the data on a variety of axes to search further for the physics behind both the jump in flux above about 2000 K and the wide scatter in fluxes at all temperatures. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G.
On the statistical significance of climate trends
NASA Astrophysics Data System (ADS)
Franzke, Christian
2010-05-01
One of the major problems in climate science is the prediction of future climate change due to anthropogenic greenhouse gas emissions. The earth's climate is not changing in a uniform way because it is a complex nonlinear system of many interacting components. The overall warming trend can be interrupted by cooling periods due to natural variability. Thus, in order to statistically distinguish between internal climate variability and genuine trends one has to assume a certain null model of the climate variability. Traditionally a short-range, and not a long-range, dependent null model is chosen. Here I show evidence for the first time that temperature data at 8 stations across Antarctica are long-range dependent and that the choice of a long-range, rather than a short-range, dependent null model negates the statistical significance of temperature trends at 2 out of 3 stations. These results show the shortcomings of traditional trend analysis and imply that more attention should be given to the correlation structure of climate data, in particular if they are long-range dependent. In this study I use the Empirical Mode Decomposition (EMD) to decompose the univariate temperature time series into a finite number of Intrinsic Mode Functions (IMF) and an instantaneous mean. While there is no unambiguous definition of a trend, in this study we interpret the instantaneous mean as a trend which is possibly nonlinear. The EMD method has been shown to be a powerful method for extracting trends from noisy and nonlinear time series. I will show that this way of identifying trends is superior to the traditional linear least-squares fits.
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
ERIC Educational Resources Information Center
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Reviewer Bias for Statistically Significant Results: A Reexamination.
ERIC Educational Resources Information Center
Fagley, N. S.; McKinney, I. Jean
1983-01-01
Reexamines the article by Atkinson, Furlong, and Wampold (1982) and questions their conclusion that reviewers were biased toward statistically significant results. A statistical power analysis shows the power of their bogus study was low. Low power in a study reporting nonsignificant findings is a valid reason for recommending not to publish.…
Advances in Testing the Statistical Significance of Mediation Effects
ERIC Educational Resources Information Center
Mallinckrodt, Brent; Abraham, W. Todd; Wei, Meifen; Russell, Daniel W.
2006-01-01
P. A. Frazier, A. P. Tix, and K. E. Barron (2004) highlighted a normal theory method popularized by R. M. Baron and D. A. Kenny (1986) for testing the statistical significance of indirect effects (i.e., mediator variables) in multiple regression contexts. However, simulation studies suggest that this method lacks statistical power relative to some…
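A normal theory test of an indirect effect is commonly carried out as a Sobel-type z test on the product of the two path coefficients. The sketch below assumes that form, which the truncated abstract does not name explicitly; a is the predictor-to-mediator coefficient, b the mediator-to-outcome coefficient, and the numeric inputs are invented.

```python
import math


def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
    """First-order normal theory z statistic for the indirect effect a * b."""
    standard_error = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return (a * b) / standard_error


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical regression estimates and standard errors.
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print("z:", z, "p:", two_sided_p(z))
```

The product a·b is generally not normally distributed in small samples, which is one reason resampling approaches such as the bootstrap are often suggested as alternatives.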
Statistical significance test for transition matrices of atmospheric Markov chains
NASA Technical Reports Server (NTRS)
Vautard, Robert; Mo, Kingtse C.; Ghil, Michael
1990-01-01
Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
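A Monte Carlo test of a single transition can be sketched as follows. This is a simplified permutation version, not necessarily the procedure used in the paper: the null hypothesis assumed here is that regime labels occur in random order with their observed frequencies, and the function names and regime labels are hypothetical.

```python
import random
from collections import Counter


def transition_counts(sequence):
    """Count how often each (from, to) pair of consecutive regimes occurs."""
    return Counter(zip(sequence, sequence[1:]))


def monte_carlo_transition_pvalue(sequence, src, dst, n_trials=2000, seed=0):
    """Estimate P(count of src->dst transitions >= observed) under random ordering."""
    rng = random.Random(seed)
    observed = transition_counts(sequence)[(src, dst)]
    shuffled = list(sequence)
    hits = 0
    for _ in range(n_trials):
        rng.shuffle(shuffled)  # destroys temporal order, keeps regime frequencies
        if transition_counts(shuffled)[(src, dst)] >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return (hits + 1) / (n_trials + 1)


# Hypothetical regime sequence in which regime A is always followed by regime B.
regimes = list("AB" * 50)
print(monte_carlo_transition_pvalue(regimes, "A", "B"))
```

Counting shuffles at or below the observed count instead gives the corresponding test for unusually rare transitions.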
Decadal power in land air temperatures: Is it statistically significant?
NASA Astrophysics Data System (ADS)
Thejll, Peter A.
2001-12-01
The geographical distribution and properties of the well-known 10-11 year signal in terrestrial temperature records are investigated. By analyzing the Global Historical Climate Network data for surface air temperatures we verify that the signal is strongest in North America and is similar in nature to that reported earlier by R. G. Currie. The decadal signal is statistically significant for individual stations, but it is not possible to show that the signal is statistically significant globally, using strict tests. In North America, during the twentieth century, the decadal variability in the solar activity cycle is associated with the decadal part of the North Atlantic Oscillation index series in such a way that both of these signals correspond to the same spatial pattern of cooling and warming. A method for testing statistical results with Monte Carlo trials on data fields with specified temporal structure and specific spatial correlation retained is presented.
Your Chi-Square Test Is Statistically Significant: Now What?
ERIC Educational Resources Information Center
Sharpe, Donald
2015-01-01
Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…
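Of the four follow-up approaches listed, calculating residuals is the easiest to show in a few lines. The sketch below computes adjusted standardized residuals for a contingency table using the standard formula (observed minus expected, divided by the square root of expected × (1 − row share) × (1 − column share)); the table values are invented and the paper's own worked data are not reproduced here.

```python
import math


def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            scale = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out_row.append((observed - expected) / scale)
        residuals.append(out_row)
    return residuals


# Hypothetical 2 x 2 table of counts.
for row in adjusted_residuals([[10, 20], [20, 10]]):
    print([round(value, 3) for value in row])
```

Cells whose adjusted residual is large in absolute value (values beyond about 2 are a common rule of thumb) are the ones contributing most to a significant chi-square result.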
A Comparison of Statistical Significance Tests for Selecting Equating Functions
ERIC Educational Resources Information Center
Moses, Tim
2009-01-01
This study compared the accuracies of nine previously proposed statistical significance tests for selecting identity, linear, and equipercentile equating functions in an equivalent groups equating design. The strategies included likelihood ratio tests for the loglinear models of tests' frequency distributions, regression tests, Kolmogorov-Smirnov…
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those that occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Assigning statistical significance to proteotypic peptides via database searches.
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2011-02-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId's knowledge database to include proteotypic information, utilized RAId's statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId's programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those that occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Statistical significance of climate sensitivity predictors obtained by data mining
NASA Astrophysics Data System (ADS)
Caldwell, Peter M.; Bretherton, Christopher S.; Zelinka, Mark D.; Klein, Stephen A.; Santer, Benjamin D.; Sanderson, Benjamin M.
2014-03-01
Several recent efforts to estimate Earth's equilibrium climate sensitivity (ECS) focus on identifying quantities in the current climate which are skillful predictors of ECS yet can be constrained by observations. This study automates the search for observable predictors using data from phase 5 of the Coupled Model Intercomparison Project. The primary focus of this paper is assessing statistical significance of the resulting predictive relationships. Failure to account for dependence between models, variables, locations, and seasons is shown to yield misleading results. A new technique for testing the field significance of data-mined correlations which avoids these problems is presented. Using this new approach, all 41,741 relationships we tested were found to be explainable by chance. This leads us to conclude that data mining is best used to identify potential relationships which are then validated or discarded using physically based hypothesis testing.
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance
Kramer, Karen L.; Veile, Amanda; Otárola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children’s growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children’s monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children’s growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children’s growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children’s growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance. PMID:26938742
Statistical significance across multiple optimization models for community partition
NASA Astrophysics Data System (ADS)
Li, Ju; Li, Hui-Jia; Mao, He-Jin; Chen, Junhua
2016-05-01
The study of community structure is an important problem in a wide range of applications, which can help us understand the real network system deeply. However, due to the existence of random factors and error edges in real networks, how to measure the significance of community structure efficiently is a crucial question. In this paper, we present a novel statistical framework computing the significance of community structure across multiple optimization methods. Different from the universal approaches, we calculate the similarity between a given node and its leader and employ the distribution of link tightness to derive the significance score, instead of a direct comparison to a randomized model. Based on the distribution of community tightness, a new “p-value” form significance measure is proposed for community structure analysis. Specifically, the well-known approaches and their corresponding quality functions are unified into a novel general formulation, which facilitates a detailed comparison across them. To determine the position of leaders and their corresponding followers, an efficient algorithm is proposed based on the spectral theory. Finally, we apply the significance analysis to some famous benchmark networks, and the good performance verifies the effectiveness and efficiency of our framework.
ERIC Educational Resources Information Center
Gordon, Howard R. D.
A random sample of 113 members of the American Vocational Education Research Association (AVERA) was surveyed to obtain baseline information regarding AVERA members' perceptions of statistical significance tests. The Psychometrics Group Instrument was used to collect data from participants. Of those surveyed, 67% were male, 93% had earned a…
Statistical controversies in clinical research: statistical significance-too much of a good thing ….
Buyse, M; Hurvitz, S A; Andre, F; Jiang, Z; Burris, H A; Toi, M; Eiermann, W; Lindsay, M-A; Slamon, D
2016-05-01
The use and interpretation of P values is a matter of debate in applied research. We argue that P values are useful as a pragmatic guide to interpret the results of a clinical trial, not as a strict binary boundary that separates real treatment effects from lack thereof. We illustrate our point using the result of BOLERO-1, a randomized, double-blind trial evaluating the efficacy and safety of adding everolimus to trastuzumab and paclitaxel as first-line therapy for HER2+ advanced breast cancer. In this trial, the benefit of everolimus was seen only in the predefined subset of patients with hormone receptor-negative breast cancer at baseline (progression-free survival hazard ratio = 0.66, P = 0.0049). A strict interpretation of this finding, based on complex 'alpha splitting' rules to assess statistical significance, led to the conclusion that the benefit of everolimus was not statistically significant either overall or in the subset. We contend that this interpretation does not do justice to the data, and we argue that the benefit of everolimus in hormone receptor-negative breast cancer is both statistically compelling and clinically relevant. PMID:26861602
Statistical downscaling rainfall using artificial neural network: significantly wetter Bangkok?
NASA Astrophysics Data System (ADS)
Vu, Minh Tue; Aribarg, Thannob; Supratid, Siriporn; Raghavan, Srivatsan V.; Liong, Shie-Yui
2015-08-01
Artificial neural network (ANN) is an established technique with a flexible mathematical structure that is capable of identifying complex nonlinear relationships between input and output data. The present study utilizes ANN as a method of statistically downscaling global climate models (GCMs) during the rainy season at meteorological site locations in Bangkok, Thailand. The study illustrates the applications of the feed forward back propagation using large-scale predictor variables derived from both the ERA-Interim reanalyses data and present day/future GCM data. The predictors are first selected over different grid boxes surrounding Bangkok region and then screened by using principal component analysis (PCA) to filter the best correlated predictors for ANN training. The reanalyses downscaled results of the present day climate show good agreement against station precipitation with a correlation coefficient of 0.8 and a Nash-Sutcliffe efficiency of 0.65. The final downscaled results for four GCMs show an increasing trend of precipitation for rainy season over Bangkok by the end of the twenty-first century. The extreme values of precipitation determined using statistical indices show strong increases of wetness. These findings will be useful for policy makers in pondering adaptation measures due to flooding such as whether the current drainage network system is sufficient to meet the changing climate and to plan for a range of related adaptation/mitigation measures.
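The Nash-Sutcliffe efficiency quoted above compares the squared error of a simulation with the variance of the observations: NSE = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))². A minimal sketch follows; the rainfall figures are invented and are not the study's data.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE: 1 is a perfect match, 0 is no better than predicting the observed mean."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_observed = sum(observed) / len(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual_ss / total_ss


# Hypothetical monthly rainfall totals (mm): station record vs. downscaled output.
station = [120.0, 180.0, 260.0, 310.0, 240.0]
downscaled = [135.0, 170.0, 240.0, 330.0, 225.0]
print(nash_sutcliffe_efficiency(station, downscaled))
```

Unlike the correlation coefficient, NSE penalizes bias and scale errors, which is why the two scores are often reported together.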
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
ERIC Educational Resources Information Center
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
Smith, Ariana L; Wein, Alan J
2011-05-01
To evaluate the statistical and clinical efficacy of the pharmacological treatments of nocturia using non-antidiuretic agents. A literature review of treatments of nocturia specifically addressing the impact of alpha blockers, 5-alpha reductase inhibitors (5ARI) and antimuscarinics on reduction in nocturnal voids. Despite commonly reported statistically significant results, nocturia has shown a poor clinical response to traditional therapies for benign prostatic hyperplasia including alpha blockers and 5ARI. Similarly, nocturia has shown a poor clinical response to traditional therapies for overactive bladder including antimuscarinics. Statistical success has been achieved in some groups with a variety of alpha blockers and antimuscarinic agents, but the clinical significance of these changes is doubtful. It is likely that other types of therapy will need to be employed in order to achieve a clinically significant reduction in nocturia. PMID:21518417
ERIC Educational Resources Information Center
Monterde-i-Bort, Hector; Frias-Navarro, Dolores; Pascual-Llobell, Juan
2010-01-01
The empirical study we present here deals with a pedagogical issue that has not been thoroughly explored up until now in our field. Previous empirical studies in other sectors have identified the opinions of researchers about this topic, showing that completely unacceptable interpretations have been made of significance tests and other statistical…
ERIC Educational Resources Information Center
Simpson, Robert G.
1981-01-01
Occasionally, differences in test scores seem to indicate that a student performs much better in one reading area than in another when, in reality, the differences may not be statistically significant. The author presents a table in which statistically significant differences between Woodcock test standard scores are identified. (Author)
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
ERIC Educational Resources Information Center
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
ERIC Educational Resources Information Center
Norris, John M.
2015-01-01
Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…
The Importance of Invariance Procedures as against Tests of Statistical Significance.
ERIC Educational Resources Information Center
Fish, Larry
A growing controversy surrounds the strict interpretation of statistical significance tests in social research. Statistical significance tests fail in particular to provide estimates for the stability of research results. Methods that do provide such estimates are known as invariance or cross-validation procedures. Invariance analysis is largely…
A Review of Post-1994 Literature on Whether Statistical Significance Tests Should Be Banned.
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
This paper summarizes the literature regarding statistical significance testing with an emphasis on: (1) the post-1994 literature in various disciplines; (2) alternatives to statistical significance testing; and (3) literature exploring why researchers have demonstrably failed to be influenced by the 1994 American Psychological Association…
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
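The procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the null distribution is built by pooling the individual histograms of both categories and resampling them with replacement, and it uses only the Euclidean distance. The function names and the pooled-resampling scheme are assumptions made for the example.

```python
import math
import random


def summary_histogram(histograms):
    """Sum individual histograms bin by bin and normalize to unit total."""
    n_bins = len(histograms[0])
    summed = [sum(h[b] for h in histograms) for b in range(n_bins)]
    total = sum(summed)
    return [s / total for s in summed]


def euclidean_distance(p, q):
    """Euclidean distance between two normalized histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def bootstrap_p_value(group_a, group_b, n_boot=2000, seed=0):
    """Return the observed distance and a bootstrap significance level.

    Assumption: under the null hypothesis both groups come from one
    population, so resamples are drawn from the pooled histograms.
    """
    rng = random.Random(seed)
    observed = euclidean_distance(
        summary_histogram(group_a), summary_histogram(group_b)
    )
    pooled = group_a + group_b
    exceed = 0
    for _ in range(n_boot):
        res_a = [rng.choice(pooled) for _ in group_a]
        res_b = [rng.choice(pooled) for _ in group_b]
        d = euclidean_distance(summary_histogram(res_a), summary_histogram(res_b))
        if d >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)
```

Swapping in another distance statistic only requires replacing `euclidean_distance`; the resampling loop stays the same.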
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
2001-01-01
Summarizes the post-1994 literature in psychology and education regarding statistical significance testing, emphasizing limitations and defenses of statistical testing and alternatives or supplements to statistical significance testing. (SLD)
Chládek, J; Brázdil, M; Halámek, J; Plešinger, F; Jurák, P
2013-01-01
We present an off-line analysis procedure for exploring brain activity recorded from intra-cerebral electroencephalographic data (SEEG). The objective is to determine the statistical differences between different types of stimulations in the time-frequency domain. The procedure is based on computing relative signal power change and subsequent statistical analysis. An example of characteristic statistically significant event-related de/synchronization (ERD/ERS) detected across different frequency bands following different oddball stimuli is presented. The method is used for off-line functional classification of different brain areas. PMID:24109865
Armijo-Olivo, Susan; Warren, Sharon; Fuentes, Jorge; Magee, David J
2011-12-01
Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. The assessment of clinical relevance can facilitate the interpretation of the research results into clinical practice. The objective of this study was to explore different methods to evaluate the clinical relevance of the results using a cross-sectional study as an example comparing different neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The evaluation of clinical relevance of the results was performed based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having statistical significance, or to have neither statistical significance nor clinical relevance. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results. PMID:21658987
Fidler, Fiona; Burgman, Mark A; Cumming, Geoff; Buttrose, Robert; Thomason, Neil
2006-10-01
Over the last decade, criticisms of null-hypothesis significance testing have grown dramatically, and several alternative practices, such as confidence intervals, information theoretic, and Bayesian methods, have been advocated. Have these calls for change had an impact on the statistical reporting practices in conservation biology? In 2000 and 2001, 92% of sampled articles in Conservation Biology and Biological Conservation reported results of null-hypothesis tests. In 2005 this figure dropped to 78%. There were corresponding increases in the use of confidence intervals, information theoretic, and Bayesian techniques. Of those articles reporting null-hypothesis testing--which still easily constitute the majority--very few report statistical power (8%) and many misinterpret statistical nonsignificance as evidence for no effect (63%). Overall, results of our survey show some improvements in statistical practice, but further efforts are clearly required to move the discipline toward improved practices. PMID:17002771
Effect size, confidence interval and statistical significance: a practical guide for biologists.
Nakagawa, Shinichi; Cuthill, Innes C
2007-11-01
Null hypothesis significance testing (NHST) is the dominant statistical approach in biology, although it has many, frequently unappreciated, problems. Most importantly, NHST does not provide us with two crucial pieces of information: (1) the magnitude of an effect of interest, and (2) the precision of the estimate of the magnitude of that effect. All biologists should be ultimately interested in biological importance, which may be assessed using the magnitude of an effect, but not its statistical significance. Therefore, we advocate presentation of measures of the magnitude of effects (i.e. effect size statistics) and their confidence intervals (CIs) in all biological journals. Combined use of an effect size and its CIs enables one to assess the relationships within data more effectively than the use of p values, regardless of statistical significance. In addition, routine presentation of effect sizes will encourage researchers to view their results in the context of previous research and facilitate the incorporation of results into future meta-analysis, which has been increasingly used as the standard method of quantitative review in biology. In this article, we extensively discuss two dimensionless (and thus standardised) classes of effect size statistics: d statistics (standardised mean difference) and r statistics (correlation coefficient), because these can be calculated from almost all study designs and also because their calculations are essential for meta-analysis. However, our focus on these standardised effect size statistics does not mean unstandardised effect size statistics (e.g. mean difference and regression coefficient) are less important. We provide potential solutions for four main technical problems researchers may encounter when calculating effect size and CIs: (1) when covariates exist, (2) when bias in estimating effect size is possible, (3) when data have non-normal error structure and/or variances, and (4) when data are non
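As a small illustration of reporting an effect size with its confidence interval, the sketch below computes a standardised mean difference (Cohen's d with a pooled standard deviation) for two independent groups. The interval relies on a common large-sample approximation to the standard error of d; that formula and the function names are assumptions of this example and are not taken from the article.

```python
import math
import statistics


def cohens_d(x, y):
    """Standardised mean difference using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(x) - statistics.mean(y)) / pooled_sd


def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate CI for d (large-sample normal approximation, an assumption)."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


x = [2, 4, 6, 8]
y = [1, 3, 5, 7]
d = cohens_d(x, y)
low, high = d_confidence_interval(d, len(x), len(y))
print(f"d = {d:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With such small groups the interval is wide and spans zero, which is exactly the kind of information about precision that a bare p value does not convey.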
NASA Astrophysics Data System (ADS)
Wilks, Daniel S.
1996-04-01
A simple approach to long-range forecasting of monthly or seasonal quantities is as the average of observations over some number of the most recent years. Finding this 'optimal climate normal' (OCN) involves examining the relationships between the observed variable and averages of its values over the previous one to 30 years and selecting the averaging period yielding the best results. This procedure involves a multiplicity of comparisons, which will lead to misleadingly positive results for developmental data. The statistical significance of these OCNs is assessed here using a resampling procedure, in which time series of U.S. Climate Division data are repeatedly shuffled to produce statistical distributions of forecast performance measures, under the null hypothesis that the OCNs exhibit no predictive skill. Substantial areas in the United States are found for which forecast performance appears to be significantly better than would occur by chance. Another complication in the assessment of the statistical significance of the OCNs derives from the spatial correlation exhibited by the data. Because of this correlation, instances of Type I errors (false rejections of local null hypotheses) will tend to occur with spatial coherency and accordingly have the potential to be confused with regions for which there may be real predictability. The 'field significance' of the collections of local tests is also assessed here by simultaneously and coherently shuffling the time series for the Climate Divisions. Areas exhibiting significant local tests are large enough to conclude that seasonal OCN temperature forecasts exhibit significant skill over parts of the United States for all seasons except SON, OND, and NDJ, and that seasonal OCN precipitation forecasts are significantly skillful only in the fall. Statistical significance is weaker for monthly than for seasonal OCN temperature forecasts, and the monthly OCN precipitation forecasts do not exhibit significant predictive
Alphas and Asterisks: The Development of Statistical Significance Testing Standards in Sociology
ERIC Educational Resources Information Center
Leahey, Erin
2005-01-01
In this paper, I trace the development of statistical significance testing standards in sociology by analyzing data from articles published in two prestigious sociology journals between 1935 and 2000. I focus on the role of two key elements in the diffusion literature, contagion and rationality, as well as the role of institutional factors. I…
ERIC Educational Resources Information Center
Snyder, Patricia; Lawson, Stephen
Magnitude of effect measures (MEMs), when adequately understood and correctly used, are important aids for researchers who do not want to rely solely on tests of statistical significance in substantive result interpretation. The MEM tells how much of the dependent variable can be controlled, predicted, or explained by the independent variables.…
ERIC Educational Resources Information Center
Spinella, Sarah
2011-01-01
As result replicability is essential to science and difficult to achieve through external replicability, the present paper notes the insufficiency of null hypothesis statistical significance testing (NHSST) and explains the bootstrap as a plausible alternative, with a heuristic example to illustrate the bootstrap method. The bootstrap relies on…
Statistical Significance, Effect Size, and Replication: What Do the Journals Say?
ERIC Educational Resources Information Center
DeVaney, Thomas A.
2001-01-01
Studied the attitudes of representatives of journals in education, sociology, and psychology through an electronic survey completed by 194 journal representatives. Results suggest that the majority of journals do not have written policies concerning the reporting of results from statistical significance testing, and most indicated that statistical…
Statistical Significance of the Trends in Monthly Heavy Precipitation Over the US
Mahajan, Salil; North, Dr. Gerald R.; Saravanan, Dr. R.; Genton, Dr. Marc G.
2012-01-01
Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall's τ test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upward trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate system are found to dominate monthly heavy precipitation, as some GCM simulations show a downward trend even in the twenty-first century projections when the greenhouse gas forcings are strong.
ERIC Educational Resources Information Center
Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.
2011-01-01
In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…
Weighing the costs of different errors when determining statistical significance during monitoring
Technology Transfer Automated Retrieval System (TEKTRAN)
Selecting appropriate significance levels when constructing confidence intervals and performing statistical analyses with rangeland monitoring data is not a straightforward process. This process is burdened by the conventional selection of “95% confidence” (i.e., Type I error rate, α = 0.05) as the d...
Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.
ERIC Educational Resources Information Center
Deegear, James
This paper summarizes the literature regarding statistical significance testing with an emphasis on recent literature in various disciplines and literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
ERIC Educational Resources Information Center
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate unless "corrected" effect…
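The dependence of "p" on sample size can be made concrete with a short "what if" sketch. It holds a standardized mean difference fixed and asks what two-sided "p" value would result at different per-group sample sizes. To stay within the Python standard library it uses a normal approximation rather than the t distribution, and it works with an uncorrected effect size; both choices, along with the function names, are assumptions of this example rather than the method proposed in the paper.

```python
import math
from statistics import NormalDist


def what_if_p_value(d, n_per_group):
    """Two-sided p for a fixed standardized mean difference d.

    Assumes two independent groups of equal size and a normal
    approximation to the test statistic: z = d * sqrt(n / 2).
    """
    z = d * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def what_if_table(d, sample_sizes):
    """Pair each hypothetical per-group sample size with its p value."""
    return [(n, what_if_p_value(d, n)) for n in sample_sizes]


for n, p in what_if_table(0.3, [10, 25, 50, 100, 200]):
    print(f"n per group = {n:4d}  p = {p:.4f}")
```

The effect size never changes across rows; only the sample size does, which is the point such analyses are meant to teach.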
Liu, Yang; Vijver, Martina G; Qiu, Hao; Baas, Jan; Peijnenburg, Willie J G M
2015-12-01
There is increasing attention from scientists and policy makers to the joint effects of multiple metals on organisms when present in a mixture. Using root elongation of lettuce (Lactuca sativa L.) as a toxicity endpoint, the combined effects of binary mixtures of Cu, Cd, and Ni were studied. The statistical MixTox model was used to search for deviations from the reference models, i.e. concentration addition (CA) and independent action (IA). The deviations were subsequently interpreted as 'interactions'. A comprehensive experiment was designed to test the reproducibility of the 'interactions'. The results showed that the toxicity of binary metal mixtures was equally well predicted by both reference models. We found statistically significant 'interactions' in four of the five total datasets. However, the patterns of 'interactions' were found to be inconsistent or even contradictory across the different independent experiments. It is recommended that a statistically significant 'interaction' be treated with care, as it is not necessarily biologically relevant. Searching for a statistically significant interaction can be the starting point for further measurements and modeling to advance the understanding of underlying mechanisms and non-additive interactions occurring inside the organisms. PMID:26188643
The Effects of Electrode Impedance on Data Quality and Statistical Significance in ERP Recordings
Kappenman, Emily S.; Luck, Steven J.
2010-01-01
To determine whether data quality is meaningfully reduced by high electrode impedance, EEG was recorded simultaneously from low- and high-impedance electrode sites during an oddball task. Low-frequency noise was found to be increased at high-impedance sites relative to low-impedance sites, especially when the recording environment was warm and humid. The increased noise at the high-impedance sites caused an increase in the number of trials needed to obtain statistical significance in analyses of P3 amplitude, but this could be partially mitigated by high-pass filtering and artifact rejection. High electrode impedance did not reduce statistical power for the N1 wave unless the recording environment was warm and humid. Thus, high electrode impedance may increase noise and decrease statistical power under some conditions, but these effects can be reduced by using a cool and dry recording environment and appropriate signal processing methods. PMID:20374541
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Suffredini, Anthony F; Sacks, David B; Yu, Yi-Kuo
2016-02-01
Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html. PMID:26510657
Mass spectrometry-based protein identification with accurate statistical significance assignment
Alves, Gelio; Yu, Yi-Kuo
2015-01-01
Motivation: Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of metadata at any level may propagate to downstream analyses, undermining the validity of scientific conclusions thus drawn. From the perspective of mass spectrometry-based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein level statistics remain challenging. Results: We have constructed a protein ID method that combines peptide evidences of a candidate protein based on a rigorous formula derived earlier; in this formula the database P-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein level E-value, eliminating the need for empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with other methods tested. Availability and implementation: The source code, implemented in C++ on a Linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Contact: yyu@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25362092
Rudd, James; Moore, Jason H.; Urbanowicz, Ryan J.
2013-01-01
Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative to large and complex real world applications such as genetic epidemiology where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own. The compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our Python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear. PMID:24358057
NASA Astrophysics Data System (ADS)
Ritter, Axel; Muñoz-Carpena, Rafael
2013-02-01
Summary: Success in the use of computer models for simulating environmental variables and processes requires objective model calibration and verification procedures. Several methods for quantifying the goodness-of-fit of observations against model-calculated values have been proposed, but none of them is free of limitations and they are often ambiguous. When a single indicator is used it may lead to incorrect verification of the model. Instead, a combination of graphical results, absolute value error statistics (e.g. root mean square error), and normalized goodness-of-fit statistics (e.g. Nash-Sutcliffe Efficiency coefficient, NSE) is currently recommended. Interpretation of NSE values is often subjective, and may be biased by the magnitude and number of data points, data outliers and repeated data. The statistical significance of the performance statistics is an aspect generally ignored that helps in reducing subjectivity in the proper interpretation of the model performance. In this work, approximated probability distributions for two common indicators (NSE and root mean square error) are derived with bootstrapping (block bootstrapping when dealing with time series), followed by bias corrected and accelerated calculation of confidence intervals. Hypothesis testing of the indicators exceeding threshold values is proposed in a unified framework for statistically accepting or rejecting the model performance. It is illustrated how model performance is not linearly related with NSE, which is critical for its proper interpretation. Additionally, the sensitivity of the indicators to model bias, outliers and repeated data is evaluated. The potential of the difference between root mean square error and mean absolute error for detecting outliers is explored, showing that this may be considered a necessary but not a sufficient condition of outlier presence. The usefulness of the approach for the evaluation of model performance is illustrated with case studies including those with
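A rough sketch of attaching an interval to the NSE is given below. It is deliberately simpler than what the abstract describes: it resamples observation and simulation pairs independently (an ordinary bootstrap, not a block bootstrap for time series) and reports a plain percentile interval instead of the bias corrected and accelerated interval. The function names and default settings are assumptions of this example.

```python
import random


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / spread


def bootstrap_nse_interval(observed, simulated, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for NSE (simplification: not BCa, not block)."""
    rng = random.Random(seed)
    n = len(observed)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        sim = [simulated[i] for i in idx]
        if len(set(obs)) < 2:
            continue  # NSE is undefined when all resampled observations are equal
        values.append(nse(obs, sim))
    values.sort()
    lower = values[int((1 - level) / 2 * len(values))]
    upper = values[int((1 + level) / 2 * len(values)) - 1]
    return lower, upper
```

A model could then be judged against a threshold by checking whether the lower bound of the interval exceeds it, in the spirit of the hypothesis test the abstract proposes.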
Massage induces an immediate, albeit short-term, reduction in muscle stiffness.
Eriksson Crommert, M; Lacourpaille, L; Heales, L J; Tucker, K; Hug, F
2015-10-01
Using ultrasound shear wave elastography, the aims of this study were: (a) to evaluate the effect of massage on stiffness of the medial gastrocnemius (MG) muscle and (b) to determine whether this effect (if any) persists over a short period of rest. A 7-min massage protocol was performed unilaterally on MG in 18 healthy volunteers. Measurements of muscle shear elastic modulus (stiffness) were performed bilaterally (control and massaged leg) in a moderately stretched position at three time points: before massage (baseline), directly after massage (follow-up 1), and following 3 min of rest (follow-up 2). Directly after massage, participants rated pain experienced during the massage. MG shear elastic modulus of the massaged leg decreased significantly at follow-up 1 (-5.2 ± 8.8%, P = 0.019, d = -0.66). There was no difference between follow-up 2 and baseline for the massaged leg (P = 0.83) indicating that muscle stiffness returned to baseline values. Shear elastic modulus was not different between time points in the control leg. There was no association between perceived pain during the massage and stiffness reduction (r = 0.035; P = 0.89). This is the first study to provide evidence that massage reduces muscle stiffness. However, this effect is short lived and returns to baseline values quickly after cessation of the massage. PMID:25487283
On the statistical significance of the bulk flow measured by the Planck satellite
NASA Astrophysics Data System (ADS)
Atrio-Barandela, F.
2013-09-01
A recent analysis of data collected by the Planck satellite detected a net dipole at the location of X-ray selected galaxy clusters, corresponding to a large-scale bulk flow extending at least to z ~ 0.18, the median redshift of the cluster sample. The amplitude of this flow, as measured with Planck, is consistent with earlier findings based on data from the Wilkinson Microwave Anisotropy Probe (WMAP). However, the uncertainty assigned to the dipole by the Planck team is much larger than that found in the WMAP studies, leading the authors of the Planck study to conclude that the observed bulk flow is not statistically significant. Here, we show that two of the three implementations of random sampling used in the error analysis of the Planck study lead to systematic overestimates in the uncertainty of the measured dipole. Random simulations of the sky do not take into account that the actual realization of the sky leads to filtered data that have a 12% lower root-mean-square dispersion than the average simulation. Using rotations around the Galactic pole (the Z axis) increases the uncertainty of the X and Y components of the dipole and artificially reduces the significance of the dipole detection from 98-99% to less than 90% confidence. When either effect is taken into account, the corrected errors agree with those obtained using random distributions of clusters on Planck data, and the resulting statistical significance of the dipole measured by Planck is consistent with that of the WMAP results.
NASA Astrophysics Data System (ADS)
Santer, B. D.; Wigley, T. M. L.; Boyle, J. S.; Gaffen, D. J.; Hnilo, J. J.; Nychka, D.; Parker, D. E.; Taylor, K. E.
2000-03-01
This paper examines trend uncertainties in layer-average free atmosphere temperatures arising from the use of different trend estimation methods. It also considers statistical issues that arise in assessing the significance of individual trends and of trend differences between data sets. Possible causes of these trends are not addressed. We use data from satellite and radiosonde measurements and from two reanalysis projects. To facilitate intercomparison, we compute from reanalyses and radiosonde data temperatures equivalent to those from the satellite-based Microwave Sounding Unit (MSU). We compare linear trends based on minimization of absolute deviations (LA) and minimization of squared deviations (LS). Differences are generally less than 0.05°C/decade over 1959-1996. Over 1979-1993, they exceed 0.10°C/decade for lower tropospheric time series and 0.15°C/decade for the lower stratosphere. Trend fitting by the LA method can degrade the lower-tropospheric trend agreement of 0.03°C/decade (over 1979-1996) previously reported for the MSU and radiosonde data. In assessing trend significance we employ two methods to account for temporal autocorrelation effects. With our preferred method, virtually none of the individual 1979-1993 trends in deep-layer temperatures are significantly different from zero. To examine trend differences between data sets we compute 95% confidence intervals for individual trends and show that these overlap for almost all data sets considered. Confidence intervals for lower-tropospheric trends encompass both zero and the model-projected trends due to anthropogenic effects. We also test the significance of a trend in d(t), the time series of differences between a pair of data sets. Use of d(t) removes variability common to both time series and facilitates identification of small trend differences. This more discerning test reveals that roughly 30% of the data set comparisons have significant differences in lower-tropospheric trends
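One widely used way to account for temporal autocorrelation when testing a least-squares trend is to shrink the sample size according to the lag-1 autocorrelation of the regression residuals. The sketch below illustrates that idea; whether it matches either of the two methods used in the paper in every detail is an assumption, and the function names are hypothetical.

```python
import math


def ols_trend(y):
    """Least-squares slope of y against t = 0, 1, ..., n-1, plus residuals."""
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    return slope, resid, sxx


def lag1_autocorr(x):
    """Lag-1 autocorrelation coefficient of a series."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((xi - m) ** 2 for xi in x)
    return num / den


def adjusted_trend_t(y):
    """Slope, effective sample size and t-ratio with an AR(1)-style adjustment.

    n_eff = n * (1 - r1) / (1 + r1); positive r1 reduces the effective
    sample size and therefore widens the standard error of the trend.
    """
    n = len(y)
    slope, resid, sxx = ols_trend(y)
    r1 = lag1_autocorr(resid)
    n_eff = n * (1 - r1) / (1 + r1)
    resid_var = sum(e * e for e in resid) / (n_eff - 2)
    slope_se = math.sqrt(resid_var / sxx)
    return slope, n_eff, slope / slope_se
```

The t-ratio would then be compared with a Student's t critical value using n_eff - 2 degrees of freedom. Strongly persistent series can push n_eff below 2, in which case this simple sketch is not applicable.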
Crow, C.J.
1985-01-01
Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
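The G-test mentioned here is a likelihood-ratio test on a table of counts. A minimal sketch of the statistic for a bioherm-by-taxon table follows; the layout of the table (rows as bioherms, columns as organism groups) and the function name `g_test` are assumptions made for illustration, and no counts from the study are reproduced.

```python
import math


def g_test(table):
    """G statistic and degrees of freedom for an r x c table of counts.

    G = 2 * sum(O * ln(O / E)), with E = row total * column total / grand total.
    Cells with an observed count of zero contribute nothing to the sum.
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_tot[i] * col_tot[j] / total
                g += observed * math.log(observed / expected)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return 2 * g, dof
```

A large G relative to the chi-square distribution with the returned degrees of freedom indicates heterogeneity among the rows, which is the sense in which the abstract reports a lack of statistical homogeneity.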
A Network-Based Method to Assess the Statistical Significance of Mild Co-Regulation Effects
Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna
2013-01-01
Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis. PMID:24039936
NASA Astrophysics Data System (ADS)
Kellerer-Pirklbauer, Andreas
2016-04-01
Longer data series (e.g. >10 a) of ground temperatures in alpine regions are helpful to improve the understanding regarding the effects of present climate change on distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004 - and more intensively since 2006 - a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c.60 ground temperature (surface and near-surface) monitoring sites which are located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and at longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trend during the observation period, and regional pattern of changes. Calculations and analyses of several different temperature-related parameters were accomplished. At an annual scale a region-wide statistically significant warming during the observation period was revealed by e.g. an increase in mean annual temperature values (mean, maximum) or the significant lowering of the surface frost number (F+). At a seasonal scale no significant trend of any temperature-related parameter was in most cases revealed for spring (MAM) and autumn (SON). Winter (DJF) shows only a weak warming. In contrast, the summer (JJA) season reveals in general a significant warming as confirmed by several different temperature-related parameters such as mean seasonal temperature, number of thawing degree days, number of freezing degree days, or days without night frost. On a monthly basis August shows the statistically most robust and strongest warming of all months, although regional differences occur. Despite the fact that the general ground temperature warming during the last decade is confirmed by the field data in the study region, complications in trend analyses arise from temperature anomalies (e.g. warm winter 2006/07) or substantial variations in the winter
Statistics, Probability, Significance, Likelihood: Words Mean What We Define Them to Mean
ERIC Educational Resources Information Center
Drummond, Gordon B.; Tom, Brian D. M.
2011-01-01
Statisticians use words deliberately and specifically, but not necessarily in the way they are used colloquially. For example, in general parlance "statistics" can mean numerical information, usually data. In contrast, one large statistics textbook defines the term "statistic" to denote "a characteristic of a "sample", such as the average score",…
Carr, J.R.; Roberts, K.P.
1989-02-01
Universal kriging is compared with ordinary kriging for estimation of earthquake ground motion. Ordinary kriging is based on a stationary random function model; universal kriging is based on a nonstationary random function model representing first-order drift. Accuracy of universal kriging is compared with that for ordinary kriging; cross-validation is used as the basis for comparison. Hypothesis testing on these results shows that accuracy obtained using universal kriging is not significantly different from accuracy obtained using ordinary kriging. Tests based on normal distribution assumptions are applied to errors measured in the cross-validation procedure; t and F tests reveal no evidence to suggest universal and ordinary kriging are different for estimation of earthquake ground motion. Nonparametric hypothesis tests applied to these errors and jackknife statistics yield the same conclusion: universal and ordinary kriging are not significantly different for this application as determined by a cross-validation procedure. These results are based on application to four independent data sets (four different seismic events).
There's more than one way to conduct a replication study: Beyond statistical significance.
Anderson, Samantha F; Maxwell, Scott E
2016-03-01
As the field of psychology struggles to trust published findings, replication research has begun to become more of a priority to both scientists and journals. With this increasing emphasis placed on reproducibility, it is essential that replication studies be capable of advancing the field. However, we argue that many researchers have been only narrowly interpreting the meaning of replication, with studies being designed with a simple statistically significant or nonsignificant results framework in mind. Although this interpretation may be desirable in some cases, we develop a variety of additional "replication goals" that researchers could consider when planning studies. Even if researchers are aware of these goals, we show that they are rarely used in practice-as results are typically analyzed in a manner only appropriate to a simple significance test. We discuss each goal conceptually, explain appropriate analysis procedures, and provide 1 or more examples to illustrate these analyses in practice. We hope that these various goals will allow researchers to develop a more nuanced understanding of replication that can be flexible enough to answer the various questions that researchers might seek to understand. PMID:26214497
Gehrmann, Thies; Reinders, Marcel J.T.
2015-01-01
Background: With more and more genomes being sequenced, detecting synteny between genomes becomes more and more important. However, for microorganisms the genomic divergence quickly becomes large, resulting in different codon usage and shuffling of gene order and gene elements such as exons. Results: We present Proteny, a methodology to detect synteny between diverged genomes. It operates on the amino acid sequence level to be insensitive to codon usage adaptations and clusters groups of exons disregarding order to handle diversity in genomic ordering between genomes. Furthermore, Proteny assigns significance levels to the syntenic clusters such that they can be selected on statistical grounds. Finally, Proteny provides novel ways to visualize results at different scales, facilitating the exploration and interpretation of syntenic regions. We test the performance of Proteny on a standard ground truth dataset, and we illustrate the use of Proteny on two closely related genomes (two different strains of Aspergillus niger) and on two distant genomes (two species of Basidiomycota). In comparison to other tools, we find that Proteny finds clusters with more true homologies in fewer clusters that contain more genes, i.e. Proteny is able to identify a more consistent synteny. Further, we show how genome rearrangements, assembly errors, gene duplications and the conservation of specific genes can be easily studied with Proteny. Availability and implementation: Proteny is freely available at the Delft Bioinformatics Lab website http://bioinformatics.tudelft.nl/dbl/software. Contact: t.gehrmann@tudelft.nl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26116928
Alexandrov, N. N.; Go, N.
1994-01-01
We have completed an exhaustive search for the common spatial arrangements of backbone fragments (SARFs) in nonhomologous proteins. This type of local structural similarity, incorporating short fragments of backbone atoms, arranged not necessarily in the same order along the polypeptide chain, appears to be important for protein function and stability. To estimate the statistical significance of the similarities, we have introduced a similarity score. We present several locally similar structures, with a large similarity score, which have not yet been reported. On the basis of the results of pairwise comparison, we have performed hierarchical cluster analysis of protein structures. Our analysis is not limited by comparison of single chains but also includes complex molecules consisting of several subunits. The SARFs with backbone fragments from different polypeptide chains provide a stable interaction between subunits in protein molecules. In many cases the active site of enzyme is located at the same position relative to the common SARFs, implying a function of the certain SARFs as a universal interface of the protein-substrate interaction. PMID:8069217
2013-01-01
Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9). PMID:23721463
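The relative validity described above is a ratio of two one-way ANOVA F-statistics computed on the same patient groups, and its uncertainty can be approximated by resampling patients. The sketch below assumes a percentile interval and hypothetical function names; the number of replicates, the interval type and the handling of degenerate resamples are illustrative choices, not details taken from the study.

```python
import random


def anova_f(groups, scores):
    """One-way ANOVA F-statistic for scores split by group label."""
    by_group = {}
    for g, s in zip(groups, scores):
        by_group.setdefault(g, []).append(s)
    n, k = len(scores), len(by_group)
    grand = sum(scores) / n
    ss_between = sum(
        len(v) * (sum(v) / len(v) - grand) ** 2 for v in by_group.values()
    )
    ss_within = sum(
        (x - sum(v) / len(v)) ** 2 for v in by_group.values() for x in v
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_validity(groups, comparator, reference):
    """RV = F(comparator measure) / F(reference measure)."""
    return anova_f(groups, comparator) / anova_f(groups, reference)


def bootstrap_rv_ci(groups, comparator, reference,
                    n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for RV, resampling patients as rows."""
    rng = random.Random(seed)
    n = len(groups)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            values.append(relative_validity(
                [groups[i] for i in idx],
                [comparator[i] for i in idx],
                [reference[i] for i in idx],
            ))
        except ZeroDivisionError:
            continue  # resample lost a group or had no within-group variance
    values.sort()
    return (values[int(alpha / 2 * len(values))],
            values[int((1 - alpha / 2) * len(values)) - 1])
```

An interval that excludes 1 would suggest that the comparator discriminates between the groups differently from the reference measure.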
Badr, Lina Kurdahi
2009-01-01
By adopting more appropriate statistical methods to appraise data from a previously published randomized controlled trial (RCT), we evaluated the statistical and clinical significance of an intervention on the 18 month neurodevelopmental outcome of infants with suspected brain injury. The intervention group (n =32) received extensive, individualized cognitive/sensorimotor stimulation by public health nurses (PHNs) while the control group (n = 30) received standard follow-up care. At 18 months 43 infants remained in the study (22 = intervention, 21 = control). The results indicate that there was a significant statistical change within groups and a clinical significance whereby more infants in the intervention group improved in mental, motor and neurological functioning at 18 months compared to the control group. The benefits of looking at clinical significance from a meaningful aspect for practitioners are emphasized. PMID:19276403
NASA Astrophysics Data System (ADS)
Alves, Gelio
After the sequencing of many complete genomes, we are in a post-genomic era in which the most important task has changed from gathering genetic information to organizing the mass of data as well as understanding how components interact with each other. The former is usually undertaken using bioinformatics methods, while the latter task is generally termed proteomics. Success in both parts demands correct statistical significance assignments for results found. In my dissertation, I study two concrete examples: global sequence alignment statistics and peptide sequencing/identification using mass spectrometry. High-performance liquid chromatography coupled to a mass spectrometer (HPLC/MS/MS), enabling peptide identifications and thus protein identifications, has become the tool of choice in large-scale proteomics experiments. Peptide identification is usually done by database search methods. The lack of robust statistical significance assignment among current methods motivated the development of a novel de novo algorithm, RAId, whose score statistics then provide statistical significance for high scoring peptides found in our custom, enzyme-digested peptide library. The ease of incorporating post-translational modifications is another important feature of RAId. To organize the massive protein/DNA data accumulated, biologists often cluster proteins according to their similarity via tools such as sequence alignment. Homologous proteins share similar domains. Assessing the similarity of two domains usually requires alignment from head to toe, i.e., a global alignment. Good alignment score statistics with an appropriate null model enable us to distinguish the biologically meaningful similarity from chance similarity. There has been much progress in local alignment statistics, which characterize score statistics when alignments tend to appear as a short segment of the whole sequence. For global alignment, which is useful in domain alignment, there is still much room for
NASA Astrophysics Data System (ADS)
Williams, Arnold C.; Pachowicz, Peter W.
2004-09-01
Current mine detection research indicates that no single sensor or single look from a sensor will detect mines/minefields in a real-time manner at a performance level suitable for a forward maneuver unit. Hence, the integrated development of detectors and fusion algorithms is of primary importance. A problem in this development process has been the evaluation of these algorithms with relatively small data sets, leading to anecdotal and frequently overtrained results. These anecdotal results are often unreliable and conflicting among various sensors and algorithms. Consequently, the physical phenomena that ought to be exploited and the performance benefits of this exploitation are often ambiguous. The Army RDECOM CERDEC Night Vision Laboratory and Electron Sensors Directorate has collected large amounts of multisensor data such that statistically significant evaluations of detection and fusion algorithms can be obtained. Even with these large data sets, care must be taken in algorithm design and data processing to achieve statistically significant performance results for combined detectors and fusion algorithms. This paper discusses statistically significant detection and combined multilook fusion results for the Ellipse Detector (ED) and the Piecewise Level Fusion Algorithm (PLFA). These statistically significant performance results are characterized by ROC curves that have been obtained through processing this multilook data for the high resolution SAR data of the Veridian X-Band radar. We discuss the implications of these results on mine detection and the importance of statistical significance, sample size, ground truth, and algorithm design in performance evaluation.
Determination of significant variables in compound wear using a statistical model
Pumwa, J.; Griffin, R.B.; Smith, C.M.
1997-07-01
This paper will report on a study of dry compound wear of normalized 1018 steel on A2 tool steel. Compound wear is a combination of sliding and impact wear. The compound wear machine consisted of an A2 tool steel wear plate that could be rotated, and an indentor head that held the 1018 carbon steel wear pins. The variables in the system were the rpm of the wear plate, the force with which the indentor strikes the wear plate, and the frequency with which the indentor strikes the wear plate. A statistically designed experiment was used to analyze the effects of the different variables on the compound wear process. The model developed showed that wear could be reasonably well predicted using a defined variable that was called the workrate. The paper will discuss the results of the modeling and the metallurgical changes that occurred at the indentor interface, with the wear plate, during the wear process.
Krumbholz, Aniko; Anielski, Patricia; Gfrerer, Lena; Graw, Matthias; Geyer, Hans; Schänzer, Wilhelm; Dvorak, Jiri; Thieme, Detlef
2014-01-01
Clenbuterol is a well-established β2-agonist, which is prohibited in sports and strictly regulated for use in the livestock industry. During the last few years clenbuterol-positive results in doping controls and in samples from residents or travellers from a high-risk country were suspected to be related to the illegal use of clenbuterol for fattening. A sensitive liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed to detect low clenbuterol residues in hair with a detection limit of 0.02 pg/mg. A sub-therapeutic application study and a field study with volunteers, who have a high risk of contamination, were performed. For the application study, a total dosage of 30 µg clenbuterol was applied to 20 healthy volunteers on 5 subsequent days. One month after the beginning of the application, clenbuterol was detected in the proximal hair segment (0-1 cm) in concentrations between 0.43 and 4.76 pg/mg. For the second part, samples of 66 Mexican soccer players were analyzed. In 89% of these volunteers, clenbuterol was detectable in their hair at concentrations between 0.02 and 1.90 pg/mg. A comparison of both parts showed no statistical difference between sub-therapeutic application and contamination. In contrast, discrimination from a typical abuse of clenbuterol is apparently possible. Due to these findings, results of real doping control samples can be evaluated. PMID:25388545
ERIC Educational Resources Information Center
Jacobson, Neil S.; Truax, Paula
1991-01-01
Describes ways of operationalizing clinically significant change, defined as extent to which therapy moves someone outside range of dysfunctional population or within range of functional population. Uses examples to show how clients can be categorized on basis of this definition. Proposes reliable change index (RC) to determine whether magnitude…
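The reliable change index is commonly computed as the pre-to-post difference divided by the standard error of the difference, which in turn is derived from the measure's standard deviation and test-retest reliability. The sketch below follows that standard formulation; the variable names and the example scores are hypothetical.

```python
import math


def reliable_change_index(pre, post, sd, reliability):
    """RC = (post - pre) / S_diff.

    S_E    = sd * sqrt(1 - reliability)   (standard error of measurement)
    S_diff = sqrt(2 * S_E ** 2)           (standard error of a difference)
    """
    se_measurement = sd * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0 * se_measurement ** 2)
    return (post - pre) / s_diff


def is_reliable_change(rc, z_crit=1.96):
    """A change is conventionally called reliable when |RC| exceeds 1.96."""
    return abs(rc) > z_crit


# Hypothetical client: score falls from 40 to 25 on a scale with
# sd = 10 and test-retest reliability 0.75.
rc = reliable_change_index(pre=40, post=25, sd=10, reliability=0.75)
```

With these illustrative numbers the standard error of measurement is 5 and S_diff is about 7.07, so the change of -15 points corresponds to an RC of about -2.12 and would count as reliable at the conventional cutoff.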
ERIC Educational Resources Information Center
Hojat, Mohammadreza; Xu, Gang
2004-01-01
Effect Sizes (ES) are an increasingly important index used to quantify the degree of practical significance of study results. This paper gives an introduction to the computation and interpretation of effect sizes from the perspective of the consumer of the research literature. The key points made are: (1) "ES" is a useful indicator of the…
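As a concrete instance of an effect size, the sketch below computes Cohen's d, the standardized difference between two group means using a pooled standard deviation. The choice of d over other effect size indices is an assumption made for illustration; the truncated abstract does not say which indices the paper covers.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference with a pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical scores for two small groups.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

Unlike a p-value, d does not shrink or grow with sample size alone, which is why it is read as an index of practical rather than statistical significance.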
ERIC Educational Resources Information Center
Onwuegbuzie, Anthony J.; Roberts, J. Kyle; Daniel, Larry G.
2005-01-01
In this article, the authors (a) illustrate how displaying disattenuated correlation coefficients alongside their unadjusted counterparts will allow researchers to assess the impact of unreliability on bivariate relationships and (b) demonstrate how a proposed new "what if reliability" analysis can complement null hypothesis significance tests of…
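The classical correction for attenuation divides an observed correlation by the square root of the product of the two score reliabilities. The sketch below applies that formula and then tabulates the corrected value over a few hypothetical reliabilities, in the spirit of a "what if" analysis; the exact procedure the authors propose is not given in the truncated abstract, so the loop is an assumption.

```python
import math


def disattenuated_r(r_xy, rel_x, rel_y):
    """Correlation corrected for unreliability: r_xy / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


def what_if_table(r_xy, reliabilities):
    """Corrected correlation for each assumed reliability, applied to both scores."""
    return {rel: disattenuated_r(r_xy, rel, rel) for rel in reliabilities}


# Hypothetical observed correlation and reliability estimates.
corrected = disattenuated_r(0.30, 0.64, 0.81)
table = what_if_table(0.30, [0.6, 0.7, 0.8, 0.9, 1.0])
```

Corrected values can exceed 1 when the reliabilities are low relative to the observed correlation, which signals that the reliability estimates or the correlation are themselves unstable.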
NASA Technical Reports Server (NTRS)
Staubert, R.
1985-01-01
Methods for calculating the statistical significance of excess events and the interpretation of the formally derived values are discussed. It is argued that a simple formula for a conservative estimate should generally be used in order to provide a common understanding of quoted values.
Key statistics related to CO2 emissions: Significant contributing countries
Kellogg, M.A.; Edmonds, J.A.; Scott, M.J.; Pomykala, J.S.
1987-07-01
This country selection task report describes and applies a methodology for identifying a set of countries responsible for significant present and anticipated future emissions of CO2 and other radiatively important gases (RIGs). The identification of countries responsible for CO2 and other RIGs emissions will help determine to what extent a select number of countries might be capable of influencing future emissions. Once identified, those countries could potentially exercise cooperative collective control of global emissions and thus mitigate the associated adverse effects of those emissions. The methodology developed consists of two approaches: the resource approach and the emissions approach. While conceptually very different, both approaches yield the same fundamental conclusion. The core of any international initiative to control global emissions must include three key countries: the US, USSR, and the People's Republic of China. It was also determined that broader control can be achieved through the inclusion of sixteen additional countries with significant contributions to worldwide emissions.
NASA Technical Reports Server (NTRS)
Massey, J. L.
1976-01-01
The very low error probability obtained with long error-correcting codes results in a very small number of observed errors in simulation studies of practical size and renders the usual confidence interval techniques inapplicable to the observed error probability. A natural extension of the notion of a 'confidence interval' is made and applied to such determinations of error probability by simulation. An example is included to show the surprisingly great significance of as few as two decoding errors in a very large number of decoding trials.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Potts, T.T.; Hylko, J.M.; Almond, D.
2007-07-01
A company's overall safety program becomes an important consideration to continue performing work and for procuring future contract awards. When injuries or accidents occur, the employer ultimately loses on two counts - increased medical costs and employee absences. This paper summarizes the human and organizational components that contributed to successful safety programs implemented by WESKEM, LLC's Environmental, Safety, and Health Departments located in Paducah, Kentucky, and Oak Ridge, Tennessee. The philosophy of 'safety, compliance, and then production' and programmatic components implemented at the start of the contracts were qualitatively identified as contributing factors resulting in a significant accumulation of safe work hours and an Experience Modification Rate (EMR) of <1.0. Furthermore, a study by the Associated General Contractors of America quantitatively validated components, already found in the WESKEM, LLC programs, as contributing factors to prevent employee accidents and injuries. Therefore, an investment in the human and organizational components now can pay dividends later by reducing the EMR, which is the key to reducing Workers' Compensation premiums. Also, knowing your employees' demographics and taking an active approach to evaluate and prevent fatigue may help employees balance work and non-work responsibilities. In turn, this approach can assist employers in maintaining a healthy and productive workforce. For these reasons, it is essential that safety needs be considered as the starting point when performing work. (authors)
NASA Astrophysics Data System (ADS)
Baluev, Roman V.
2013-11-01
We consider the 'multifrequency' periodogram, in which the putative signal is modelled as a sum of two or more sinusoidal harmonics with independent frequencies. It is useful in cases when the data may contain several periodic components, especially when their interaction with each other and with the data sampling patterns might produce misleading results. Although the multifrequency statistic itself was constructed earlier, for example by G. Foster in his CLEANest algorithm, its probabilistic properties (the detection significance levels) are still poorly known and much of what is deemed known is not rigorous. These detection levels are nonetheless important for data analysis. We argue that to prove the simultaneous existence of all n components revealed in a multiperiodic variation, it is mandatory to apply at least 2^n - 1 significance tests, among which most involve various multifrequency statistics, and only n tests are single-frequency ones. The main result of this paper is an analytic estimation of the statistical significance of the frequency tuples that the multifrequency periodogram can reveal. Using the theory of extreme values of random fields (the generalized Rice method), we find a useful approximation to the relevant false alarm probability. For the double-frequency periodogram, this approximation is given by the elementary formula (π/16) W^2 e^{-z} z^2, where W denotes the normalized width of the settled frequency range, and z is the observed periodogram maximum. We carried out intensive Monte Carlo simulations to show that the practical quality of this approximation is satisfactory. A similar analytic expression for the general multifrequency periodogram is also given, although with less numerical verification.
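The double-frequency approximation quoted in this abstract, read here as FAP ≈ (π/16) W^2 e^{-z} z^2, is simple enough to evaluate directly. The sketch below does only that; the function name and the example values of W and z are hypothetical, and the formula is an asymptotic estimate, so it is informative only where it returns values well below 1.

```python
import math


def double_frequency_fap(width, z_max):
    """Approximate false alarm probability for the double-frequency periodogram.

    width : normalized width W of the searched frequency range
    z_max : observed periodogram maximum z
    Evaluates (pi / 16) * W**2 * exp(-z) * z**2 without clipping; results
    of order 1 or above mean the approximation carries no information.
    """
    return (math.pi / 16.0) * width ** 2 * math.exp(-z_max) * z_max ** 2


# Hypothetical search: W = 10 and an observed peak of z = 20.
fap = double_frequency_fap(10.0, 20.0)
```

For fixed W the estimate falls off roughly exponentially in z once z exceeds 2, so modest increases in the observed peak translate into large gains in significance.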
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high–throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
NASA Astrophysics Data System (ADS)
Casati, Michele
2014-05-01
The assertion that solar activity may play a significant role in triggering large volcanic eruptions is, and has been, discussed by many geophysicists. Numerous scientific papers have established a possible correlation between these events and the electromagnetic coupling between the Earth and the Sun, but none of them has been able to highlight a possible statistically significant relationship between large volcanic eruptions and any of the series, such as geomagnetic activity, solar wind, or sunspot number. In our research, we compare the 148 volcanic eruptions with index VEI4, and the 37 major historical volcanic eruptions with index VEI5 or greater, recorded from 1610 to 2012, with the corresponding sunspot number (SSN). Taking, as the threshold value, a monthly sunspot number of 46 (recorded during the great eruption of Krakatoa, historical index VEI6, August 1883), we note some possible relationships and conduct a statistical test. • Of the 31 historical large volcanic eruptions with index VEI5+ recorded between 1610 and 1955, 29 were recorded when the SSN<46. The remaining 2 eruptions were not recorded when the SSN<46, but rather during solar maxima, in the solar cycle of the year 1739 and in solar cycle No. 14 (Shikotsu eruption of 1739 and Ksudach 1907). • Of the 8 historical large volcanic eruptions with index VEI6+ recorded from 1610 to the present, 7 were recorded with SSN<46 and, more specifically, within the three known large solar minima: Maunder (1645-1710), Dalton (1790-1830) and the solar minima that occurred between 1880 and 1920. As the only exception, we note the eruption of Pinatubo of June 1991, recorded in the solar maximum of cycle 22. • Of the 6 historical major volcanic eruptions with index VEI5+ recorded after 1955, 5 were not recorded during periods of low solar activity, but rather during solar maxima, of cycles 19, 21 and 22. The significance tests, conducted with the chi-square statistic χ² = 7.782, detect a
Webb-Robertson, Bobbie-Jo M.; McCue, Lee Ann; Waters, Katrina M.; Matzke, Melissa M.; Jacobs, Jon M.; Metz, Thomas O.; Varnum, Susan M.; Pounds, Joel G.
2010-11-01
Liquid chromatography-mass spectrometry-based (LC-MS) proteomics uses peak intensities of proteolytic peptides to infer the differential abundance of peptides/proteins. However, substantial run-to-run variability in peptide intensities and observations (presence/absence) of peptides makes data analysis quite challenging. The missing abundance values in LC-MS proteomics data are difficult to address with traditional imputation-based approaches because the mechanisms by which data are missing are unknown a priori. Data can be missing due to random mechanisms such as experimental error, or non-random mechanisms such as a true biological effect. We present a statistical approach that uses a test of independence known as a G-test to test the null hypothesis of independence between the number of missing values and the experimental groups. We pair the G-test results evaluating independence of missing data (IMD) with a standard analysis of variance (ANOVA) that uses only means and variances computed from the observed data. Each peptide is therefore represented by two statistical confidence metrics, one for qualitative differential observation and one for quantitative differential intensity. We use two simulated and two real LC-MS datasets to demonstrate the robustness and sensitivity of the ANOVA-IMD approach for assigning confidence to peptides with significant differential abundance among experimental groups.
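The missingness half of the approach described above can be sketched as a G-test of independence on a 2 × k table (missing versus observed counts across k experimental groups). This is a minimal standard-library sketch, not the authors' implementation: the function names and the example counts are hypothetical, and the p-value uses the usual chi-square approximation with k − 1 degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    # Closed forms for df = 1 and df = 2, then step df upward by 2.
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        q, k = math.exp(-x / 2.0), 2
    while k < df:
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def g_test_missingness(missing, observed):
    """G-test of independence between missingness and group membership.

    missing[i] and observed[i] are the counts of missing and observed
    values for one peptide in group i. Returns (G statistic, p-value).
    """
    if len(missing) != len(observed) or len(missing) < 2:
        raise ValueError("need matching counts for at least two groups")
    table = [list(missing), list(observed)]
    row_totals = [sum(row) for row in table]
    col_totals = [m + o for m, o in zip(missing, observed)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing to G
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * count * math.log(count / expected)
    return g, chi2_sf(g, len(missing) - 1)


if __name__ == "__main__":
    # Hypothetical peptide: mostly missing in group 1, mostly observed in group 2.
    print(g_test_missingness(missing=[9, 1], observed=[1, 9]))
```

In the scheme the abstract describes, a small p-value here would flag a qualitative (presence/absence) difference, to be read alongside the ANOVA result on the observed intensities.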
NASA Astrophysics Data System (ADS)
Wang, H. J.; Shi, W. L.; Chen, X. H.
2006-05-01
The West Development Policy being implemented in China is causing significant land use and land cover (LULC) changes in West China. With the up-to-date satellite database of the Global Land Cover Characteristics Database (GLCCD) that characterizes the lower boundary conditions, the regional climate model RIEMS-TEA is used to simulate possible impacts of the significant LULC variation. The model was run for five continuous three-month periods from 1 June to 1 September of 1993, 1994, 1995, 1996, and 1997, and the results of the five groups are examined by means of a Student's t-test to identify the statistical significance of regional climate variation. The main results are: (1) The regional climate is affected by the LULC variation because the equilibrium of water and heat transfer in the air-vegetation interface is changed. (2) The integrated impact of the LULC variation on regional climate is not only limited to West China where the LULC varies, but also extends to some areas in the model domain where the LULC does not vary at all. (3) The East Asian monsoon system and its vertical structure are adjusted by the large-scale LULC variation in western China, where the consequences are the enhancement of the westward water vapor transfer from the east and the relevant increase of wet-hydrostatic energy in the middle-upper atmospheric layers. (4) The ecological engineering in West China significantly affects the regional climate in Northwest China, North China and the middle-lower reaches of the Yangtze River; there are obvious effects in South, Northeast, and Southwest China, but minor effects in Tibet.
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier
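One simple way to see where significance levels of this kind can come from is a binomial null model: if alarms cover a fraction p of the space-time volume, each target earthquake falls inside an alarm by chance with probability p. The sketch below computes the chance probability of at least k hits out of n events under that assumption. The binomial model and the function name are assumptions made for illustration; the abstract does not state the authors' exact procedure.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of at least k successes in n independent trials with success rate p."""
    if not 0.0 <= p <= 1.0 or not 0 <= k <= n:
        raise ValueError("need 0 <= p <= 1 and 0 <= k <= n")
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # (events, hits, alarm volume) as quoted in the abstract.
    cases = {
        "M8, magnitude 8.0+": (5, 5, 0.36),
        "M8-MSc, magnitude 8.0+": (5, 4, 0.18),
        "M8, magnitude 7.5+": (19, 10, 0.40),
        "M8-MSc, magnitude 7.5+": (19, 5, 0.13),
    }
    for label, (n, k, p) in cases.items():
        chance = binomial_tail(n, k, p)
        print(f"{label}: chance probability {chance:.4f}, confidence {1 - chance:.1%}")
```

Under this simplified null, the confidence values come out close to the levels quoted in the abstract, which suggests, but does not confirm, that a comparable calculation underlies them.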
Denton, Debra L; Diamond, Jerry; Zheng, Lei
2011-05-01
The U.S. Environmental Protection Agency (U.S. EPA) and state agencies implement the Clean Water Act, in part, by evaluating the toxicity of effluent and surface water samples. A common goal for both regulatory authorities and permittees is confidence in an individual test result (e.g., no-observed-effect concentration [NOEC], pass/fail, 25% effective concentration [EC25]), which is used to make regulatory decisions, such as reasonable potential determinations, permit compliance, and watershed assessments. This paper discusses an additional statistical approach (test of significant toxicity [TST]), based on bioequivalence hypothesis testing, or, more appropriately, test of noninferiority, which examines whether there is a nontoxic effect at a single concentration of concern compared with a control. Unlike the traditional hypothesis testing approach in whole effluent toxicity (WET) testing, TST is designed to incorporate explicitly both α and β error rates at levels of toxicity that are unacceptable and acceptable, given routine laboratory test performance for a given test method. Regulatory management decisions are used to identify unacceptable toxicity levels for acute and chronic tests, and the null hypothesis is constructed such that test power is associated with the ability to declare correctly a truly nontoxic sample as acceptable. This approach provides a positive incentive to generate high-quality WET data to make informed decisions regarding regulatory decisions. This paper illustrates how α and β error rates were established for specific test method designs and tests the TST approach using both simulation analyses and actual WET data. In general, those WET test endpoints having higher routine (e.g., 50th percentile) within-test control variation, on average, have higher method-specific α values (type I error rate), to maintain a desired type II error rate. This paper delineates the technical underpinnings of this approach and demonstrates the benefits
Fisher, Aaron; Anderson, G Brooke; Peng, Roger; Leek, Jeff
2014-01-01
Scatterplots are the most common way for statisticians, scientists, and the public to visually detect relationships between measured variables. At the same time, and despite widely publicized controversy, P-values remain the most commonly used measure to statistically justify relationships identified between variables. Here we measure the ability to detect statistically significant relationships from scatterplots in a randomized trial of 2,039 students in a statistics massive open online course (MOOC). Each subject was shown a random set of scatterplots and asked to visually determine if the underlying relationships were statistically significant at the P < 0.05 level. Subjects correctly classified only 47.4% (95% CI [45.1%-49.7%]) of statistically significant relationships, and 74.6% (95% CI [72.5%-76.6%]) of non-significant relationships. Adding visual aids such as a best fit line or scatterplot smooth increased the probability a relationship was called significant, regardless of whether the relationship was actually significant. Classification of statistically significant relationships improved on repeat attempts of the survey, although classification of non-significant relationships did not. Our results suggest: (1) that evidence-based data analysis can be used to identify weaknesses in theoretical procedures in the hands of average users, (2) data analysts can be trained to improve detection of statistically significant results with practice, but (3) data analysts have incorrect intuition about what statistically significant relationships look like, particularly for small effects. We have built a web tool for people to compare scatterplots with their corresponding p-values which is available here: http://glimmer.rstudio.com/afisher/EDA/. PMID:25337457
NASA Technical Reports Server (NTRS)
Friedlander, Alan L.; Harry, David P., III
1960-01-01
An exploratory analysis of vehicle guidance during the approach to a target planet is presented. The objective of the guidance maneuver is to guide the vehicle to a specific perigee distance with a high degree of accuracy and minimum corrective velocity expenditure. The guidance maneuver is simulated by considering the random sampling of real measurements with significant error and reducing this information to prescribe appropriate corrective action. The instrumentation system assumed includes optical and/or infrared devices to indicate range and a reference angle in the trajectory plane. Statistical results are obtained by Monte-Carlo techniques and are shown as the expectation of guidance accuracy and velocity-increment requirements. Results are nondimensional and applicable to any planet within limits of two-body assumptions. The problem of determining how many corrections to make and when to make them is a consequence of the conflicting requirement of accurate trajectory determination and propulsion. Optimum values were found for a vehicle approaching a planet along a parabolic trajectory with an initial perigee distance of 5 radii and a target perigee of 1.02 radii. In this example measurement errors were less than 1 minute of arc. Results indicate that four corrections applied in the vicinity of 50, 16, 15, and 1.5 radii, respectively, yield minimum velocity-increment requirements. Thrust devices capable of producing a large variation of velocity-increment size are required. For a vehicle approaching the earth, miss distances within 32 miles are obtained with 90-percent probability. Total velocity increments used in guidance are less than 3300 feet per second with 90-percent probability. It is noted that the above representative results are valid only for the particular guidance scheme hypothesized in this analysis. A parametric study is presented which indicates the effects of measurement error size, initial perigee, and initial energy on the guidance
Greenhalgh, T.
1997-01-01
It is possible to be seriously misled by taking the statistical competence (and/or the intellectual honesty) of authors for granted. Some common errors committed (deliberately or inadvertently) by the authors of papers are given in the final box. PMID:9277611
Kurtz, S.E.; Fields, D.E.
1983-10-01
This report describes a version of the TERPED/P computer code that is very useful for small data sets. A new algorithm for determining the Kolmogorov-Smirnov (KS) statistics is used to extend program applicability. The TERPED/P code facilitates the analysis of experimental data and assists the user in determining its probability distribution function. Graphical and numerical tests are performed interactively in accordance with the user's assumption of normally or log-normally distributed data. Statistical analysis options include computation of the chi-square statistic and the KS one-sample test statistic and the corresponding significance levels. Cumulative probability plots of the user's data are generated either via a local graphics terminal, a local line printer or character-oriented terminal, or a remote high-resolution graphics device such as the FR80 film plotter or the Calcomp paper plotter. Several useful computer methodologies suffer from limitations of their implementations of the KS nonparametric test. This test is one of the more powerful analysis tools for examining the validity of an assumption about the probability distribution of a set of data. KS algorithms are found in other analysis codes, including the Statistical Analysis Subroutine (SAS) package and earlier versions of TERPED. The inability of these algorithms to generate significance levels for sample sizes less than 50 has limited their usefulness. The release of the TERPED code described herein contains algorithms to allow computation of the KS statistic and significance level for data sets of, if the user wishes, as few as three points. Values computed for the KS statistic are within 3% of the correct value for all data set sizes.
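For orientation, the one-sample KS statistic itself is easy to compute for any sample size, including very small ones; the harder part addressed by the report is the significance level. The sketch below computes only the statistic against a fully specified normal distribution. It is a generic textbook formulation with hypothetical function names, not the TERPED/P algorithm, and it does not reproduce the report's significance-level computation.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(data, cdf) -> float:
    """One-sample Kolmogorov-Smirnov statistic D = sup |F_empirical - F_hypothesized|.

    `cdf` is the hypothesized cumulative distribution function. The supremum
    is attained at a data point, just before or just after the step of the
    empirical distribution, so both sides of every step are checked.
    """
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


if __name__ == "__main__":
    # Hypothetical three-point data set tested against a standard normal.
    sample = [-1.0, 0.0, 1.0]
    print(ks_statistic(sample, lambda x: normal_cdf(x, 0.0, 1.0)))
```

For log-normally distributed data the same function can be applied to the logarithms of the observations. Note that when the mean and standard deviation are estimated from the same data, the standard KS significance tables no longer apply directly.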
Lindmark, Anita; van Rompaye, Bart; Goetghebeur, Els; Glader, Eva-Lotta; Eriksson, Marie
2016-01-01
Background: When profiling hospital performance, quality indicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke) to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance. Methods: The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008–2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method. Results: Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252) and high specificity (0.991). There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence. Conclusions: The study emphasizes the importance of combining clinical relevance
Gaonkar, Bilwaj; Davatzikos, Christos
2013-01-01
Multivariate pattern analysis (MVPA) methods such as support vector machines (SVMs) have been increasingly applied to fMRI and sMRI analyses, enabling the detection of distinctive imaging patterns. However, identifying brain regions that significantly contribute to the classification/group separation requires computationally expensive permutation testing. In this paper we show that the results of SVM-permutation testing can be analytically approximated. This approximation leads to more than a thousand fold speed up of the permutation testing procedure, thereby rendering it feasible to perform such tests on standard computers. The speed up achieved makes SVM based group difference analysis competitive with standard univariate group difference analysis methods. PMID:23583748
Boareto, Marcelo; Caticha, Nestor
2014-01-01
Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
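The invariance property the abstract relies on can be checked numerically: applying the same positive rescaling and shift to both groups leaves the t statistic unchanged. The sketch below uses the pooled-variance (Student) form of the statistic; the function name and the data are hypothetical, and nothing here reproduces the authors' normalization pipeline.

```python
import math
from statistics import mean, variance


def student_t(a, b) -> float:
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))


if __name__ == "__main__":
    # Hypothetical intensities for one probe under two conditions.
    control = [1.0, 2.0, 3.0, 4.0]
    treated = [2.0, 4.0, 6.0, 8.0]

    # The same affine transform (scale > 0, arbitrary offset) applied to both groups.
    scale, offset = 3.5, 10.0
    control_t = [scale * x + offset for x in control]
    treated_t = [scale * x + offset for x in treated]

    print(student_t(control, treated))
    print(student_t(control_t, treated_t))  # agrees with the line above up to rounding
```

The offset cancels in the difference of means and the scale factor cancels between numerator and denominator, which is why a constant background term or a proportionality constant shared by both conditions does not affect the test.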
NASA Technical Reports Server (NTRS)
Druzhinin, I. P.; Khamyanova, N. V.; Yagodinskiy, V. N.
1974-01-01
Statistical evaluations of the significance of the relationship of abrupt changes in solar activity and discontinuities in the multi-year pattern of an epidemic process are reported. They reliably (with probability of more than 99.9%) show the real nature of this relationship and its great specific weight (about half) in the formation of discontinuities in the multi-year pattern of the processes in question.
NASA Astrophysics Data System (ADS)
Kalimeris, A.; Potirakis, S. M.; Eftaxias, K.; Antonopoulos, G.; Kopanas, J.; Nomikos, C.
2016-05-01
A multi-spectral analysis of the kHz electromagnetic time series associated with the Athens earthquake (M = 5.9, 7 September 1999) is presented here, resulting in the reliable discrimination of the fracto-electromagnetic emissions from the natural geo-electromagnetic field background. Five spectral analysis methods are utilized in order to resolve the statistically significant variability modes of the studied dynamical system out of a red noise background (the revised Multi-Taper Method, the Singular Spectrum Analysis, and the Wavelet Analysis among them). The performed analysis reveals the existence of three distinct epochs in the time series for the period before the earthquake: a "quiet", a "transitional" and an "active" epoch. Towards the end of the active epoch, during a sub-period starting approximately two days before the earthquake, the dynamical system passes into a high-activity state, where electromagnetic signal emissions become powerful and statistically significant at almost all time-scales. The temporal behavior of the studied system in each of these epochs is further examined through mathematical reconstruction in the time domain of those spectral features that were found to be statistically significant. The transition of the system from the quiet to the active state proved to be detectable first in the long time-scales and afterwards in the short scales. Finally, a Hurst exponent analysis revealed persistent characteristics embedded in the two strong EM bursts observed during the "active" epoch.
Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Aliakbari, Massumeh; Lindlöf, Angelica; Ebrahimi, Mahdi; Ebrahimie, Esmaeil
2015-01-01
Cis regulatory elements (CREs), located within promoter regions, play a significant role in the blueprint for transcriptional regulation of genes. There is a growing interest to study the combinatorial nature of CREs including presence or absence of CREs, the number of occurrences of each CRE, as well as of their order and location relative to their target genes. Comparative promoter analysis has been shown to be a reliable strategy to test the significance of each component of promoter architecture. However, it remains unclear what level of difference in the number of occurrences of each CRE is of statistical significance in order to explain different expression patterns of two genes. In this study, we present a novel statistical approach for pairwise comparison of promoters of Arabidopsis genes in the context of number of occurrences of each CRE within the promoters. First, using the sample of 1000 Arabidopsis promoters, the results of the goodness of fit test and non-parametric analysis revealed that the number of occurrences of CREs in a promoter sequence is Poisson distributed. As a promoter sequence contained functional and non-functional CREs, we addressed the issue of the statistical distribution of functional CREs by analyzing the ChIP-seq datasets. The results showed that the number of occurrences of functional CREs over the genomic regions was determined as being Poisson distributed. In accordance with the obtained distribution of CREs occurrences, we suggested the Audic and Claverie (AC) test to compare two promoters based on the number of occurrences for the CREs. Superiority of the AC test over Chi-square (2×2) and Fisher's exact tests was also shown, as the AC test was able to detect a higher number of significant CREs. The two case studies on the Arabidopsis genes were performed in order to biologically verify the pairwise test for promoter comparison. Consequently, a number of CREs with significantly different occurrences was identified between
Aouinti, Safa; Malouche, Dhafer; Giudicelli, Véronique; Kossida, Sofia; Lefranc, Marie-Paule
2015-01-01
The adaptive immune responses of humans and of other jawed vertebrate species (gnathostomata) are characterized by the B and T cells and their specific antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR) (up to 2×10^12 different IG and TR per individual). IMGT, the international ImMunoGeneTics information system (http://www.imgt.org), was created in 1989 by Marie-Paule Lefranc (Montpellier University and CNRS) to manage the huge and complex diversity of these antigen receptors. IMGT, built on IMGT-ONTOLOGY concepts of identification (keywords), description (labels), classification (gene and allele nomenclature) and numerotation (IMGT unique numbering), is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. IMGT/HighV-QUEST, the first web portal, and so far the only one, for the next generation sequencing (NGS) analysis of IG and TR, is the paradigm for immune repertoire standardized outputs and immunoprofiles of the adaptive immune responses. It provides the identification of the variable (V), diversity (D) and joining (J) genes and alleles, analysis of the V-(D)-J junction and complementarity determining region 3 (CDR3) and the characterization of the 'IMGT clonotype (AA)' (AA for amino acid) diversity and expression. IMGT/HighV-QUEST compares outputs of different batches, up to one million nucleotide sequences for the statistical module. These high throughput IG and TR repertoire immunoprofiles are of prime importance in vaccination, cancer, infectious diseases, autoimmunity and lymphoproliferative disorders; however, their comparative statistical analysis still remains a challenge. We present a standardized statistical procedure to analyze IMGT/HighV-QUEST outputs for the evaluation of the significance of the IMGT clonotype (AA) diversity differences in proportions, per gene of a given group, between NGS IG and TR repertoire immunoprofiles. The procedure is generic and
Best, R; Harrell, A; Geesey, C; Libby, B; Wijesooriya, K
2014-06-15
Purpose: The purpose of this study is to inter-compare and find statistically significant differences between flattened field fixed-beam (FB) IMRT with flattening-filter free (FFF) volumetric modulated arc therapy (VMAT) for stereotactic body radiation therapy SBRT. Methods: SBRT plans using FB IMRT and FFF VMAT were generated for fifteen SBRT lung patients using 6 MV beams. For each patient, both IMRT and VMAT plans were created for comparison. Plans were generated utilizing RTOG 0915 (peripheral, 10 patients) and RTOG 0813 (medial, 5 patients) lung protocols. Target dose, critical structure dose, and treatment time were compared and tested for statistical significance. Parameters of interest included prescription isodose surface coverage, target dose heterogeneity, high dose spillage (location and volume), low dose spillage (location and volume), lung dose spillage, and critical structure maximum- and volumetric-dose limits. Results: For all criteria, we found equivalent or higher conformality with VMAT plans as well as reduced critical structure doses. Several differences passed a Student's t-test of significance: VMAT reduced the high dose spillage, evaluated with conformality index (CI), by an average of 9.4%±15.1% (p=0.030) compared to IMRT. VMAT plans reduced the lung volume receiving 20 Gy by 16.2%±15.0% (p=0.016) compared with IMRT. For the RTOG 0915 peripheral lesions, the volumes of lung receiving 12.4 Gy and 11.6 Gy were reduced by 27.0%±13.8% and 27.5%±12.6% (for both, p<0.001) in VMAT plans. Of the 26 protocol pass/fail criteria, VMAT plans were able to achieve an average of 0.2±0.7 (p=0.026) more constraints than the IMRT plans. Conclusions: FFF VMAT has dosimetric advantages over fixed beam IMRT for lung SBRT. Significant advantages included increased dose conformity, and reduced organs-at-risk doses. The overall improvements in terms of protocol pass/fail criteria were more modest and will require more patient data to establish difference
Wang, Hong-Qiang; Tsai, Chung-Jui
2013-01-01
With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. Software
NASA Astrophysics Data System (ADS)
Draper, David S.; van Westrenen, Wim
2007-12-01
As a complement to our efforts to update and revise the thermodynamic basis for predicting garnet-melt trace element partitioning using lattice-strain theory (van Westrenen and Draper in Contrib Mineral Petrol, this issue), we have performed detailed statistical evaluations of possible correlations between intensive and extensive variables and experimentally determined garnet-melt partitioning values for trivalent cations (rare earth elements, Y, and Sc) entering the dodecahedral garnet X-site. We applied these evaluations to a database containing over 300 partition coefficient determinations, compiled both from literature values and from our own work designed in part to expand that database. Available data include partitioning measurements in ultramafic to basaltic to intermediate bulk compositions, and recent studies in Fe-rich systems relevant to extraterrestrial petrogenesis, at pressures sufficiently high such that a significant component of majorite, the high-pressure form of garnet, is present. Through the application of lattice-strain theory, we obtained best-fit values for the ideal ionic radius of the dodecahedral garnet X-site, r0(3+), its apparent Young’s modulus E(3+), and the strain-free partition coefficient D0(3+) for a fictive REE element J of ionic radius r0(3+). Resulting values of E, D0, and r0 were used in multiple linear regressions involving sixteen variables that reflect the possible influence of garnet composition and stoichiometry, melt composition and structure, major-element partitioning, pressure, and temperature. We find no statistically significant correlations between fitted r0 and E values and any combination of variables. However, a highly robust correlation between fitted D0 and garnet-melt Fe-Mg exchange and DMg is identified. The identification of more explicit melt-compositional influence is a first for this type of predictive modeling. We combine this statistically-derived expression for predicting D0 with the new
ERIC Educational Resources Information Center
Meyer, Donald L.
Bayesian statistical methodology and its possible uses in the behavioral sciences are discussed in relation to the solution of problems in both the use and teaching of fundamental statistical methods, including confidence intervals, significance tests, and sampling. The Bayesian model explains these statistical methods and offers a consistent…
Minor changes in the indicator used to measure fine PM, which cause only modest changes in mass concentrations, can lead to dramatic changes in the statistical relationship of fine PM mass with cardiovascular mortality. An epidemiologic study in Phoenix (Mar et al., 2000), augme...
ERIC Educational Resources Information Center
Cicchetti, Domenic V.; Koenig, Kathy; Klin, Ami; Volkmar, Fred R.; Paul, Rhea; Sparrow, Sara
2011-01-01
The objectives of this report are: (a) to trace the theoretical roots of the concept of clinical significance that derives from Bayesian thinking, Marginal Utility/Diminishing Returns in Economics, and the "just noticeable difference" in Psychophysics. These concepts then translated into: Effect Size (ES), strength of agreement, clinical…
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2001-01-01
Since 1750, the number of cataclysmic volcanic eruptions (volcanic explosivity index (VEI)>=4) per decade spans 2-11, with 96 percent located in the tropics and extra-tropical Northern Hemisphere. A two-point moving average of the volcanic time series has higher values since the 1860's than before, being 8.00 in the 1910's (the highest value) and 6.50 in the 1980's, the highest since the 1910's peak. Because of the usual behavior of the first difference of the two-point moving averages, one infers that its value for the 1990's will measure approximately 6.50 +/- 1, implying that approximately 7 +/- 4 cataclysmic volcanic eruptions should be expected during the present decade (2000-2009). Because cataclysmic volcanic eruptions (especially those having VEI>=5) nearly always have been associated with short-term episodes of global cooling, the occurrence of even one might confuse our ability to assess the effects of global warming. Poisson probability distributions reveal that the probability of one or more events with a VEI>=4 within the next ten years is >99 percent. It is approximately 49 percent for an event with a VEI>=5, and 18 percent for an event with a VEI>=6. Hence, the likelihood that a climatically significant volcanic eruption will occur within the next ten years appears reasonably high.
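The Poisson figures quoted above follow from the probability of at least one event, P(N >= 1) = 1 - exp(-lambda). The sketch below is illustrative only: the function names are hypothetical, the rate of 7 per decade is taken from the abstract's "approximately 7 +/- 4", and the rates for VEI>=5 and VEI>=6 are back-calculated from the quoted probabilities rather than reported by the author.

```python
import math


def prob_at_least_one(rate_per_decade: float) -> float:
    """P(N >= 1) for a Poisson count with the given mean per decade."""
    return 1.0 - math.exp(-rate_per_decade)


def rate_from_prob(p_at_least_one: float) -> float:
    """Invert P(N >= 1) = 1 - exp(-rate) to recover the implied mean rate."""
    return -math.log(1.0 - p_at_least_one)


# About 7 eruptions with VEI >= 4 expected per decade (from the abstract).
print(prob_at_least_one(7.0))   # about 0.999, consistent with ">99 percent"

# Implied decadal rates behind the quoted 49 and 18 percent (back-calculated).
print(rate_from_prob(0.49))     # about 0.67 events per decade for VEI >= 5
print(rate_from_prob(0.18))     # about 0.20 events per decade for VEI >= 6
```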
Williams, Scott G.; Buyyounouski, Mark K.; Pickles, Tom; Kestin, Larry; Martinez, Alvaro; Hanlon, Alexandra L.; Duchesne, Gillian M.
2008-03-15
Purpose: To define and incorporate the impact of the percentage of positive biopsy cores (PPC) into a predictive model of prostate cancer radiotherapy biochemical outcome. Methods and Materials: The data of 3264 men with clinically localized prostate cancer treated with external beam radiotherapy at four institutions were retrospectively analyzed. Standard prognostic and treatment factors plus the number of biopsy cores collected and the number positive for malignancy by transrectal ultrasound-guided biopsy were available. The primary endpoint was biochemical failure (bF, Phoenix definition). Multivariate proportional hazards analyses were performed and expressed as a nomogram and the model's predictive ability assessed using the concordance index (c-index). Results: The cohort consisted of 21% low-, 51% intermediate-, and 28% high-risk cancer patients, and 30% had androgen deprivation with radiotherapy. The median PPC was 50% (interquartile range [IQR] 29-67%), and median follow-up was 51 months (IQR 29-71 months). Percentage of positive biopsy cores displayed an independent association with the risk of bF (p = 0.01), as did age, prostate-specific antigen value, Gleason score, clinical stage, androgen deprivation duration, and radiotherapy dose (p < 0.001 for all). Including PPC increased the c-index from 0.72 to 0.73 in the overall model. The influence of PPC varied significantly with radiotherapy dose and clinical stage (p = 0.02 for both interactions), with doses <66 Gy and palpable tumors showing the strongest relationship between PPC and bF. Intermediate-risk patients were poorly discriminated regardless of PPC inclusion (c-index 0.65 for both models). Conclusions: Outcome models incorporating PPC show only minor additional ability to predict biochemical failure beyond those containing standard prognostic factors.
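For readers unfamiliar with the concordance index used above, the following sketch computes one common version (Harrell's c-index) for right-censored data. It is an assumption that this matches the authors' exact estimator; the function name and the toy inputs are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored outcomes.

    times: follow-up time per patient; events: True if failure was observed,
    False if censored; risk_scores: model output, higher means higher risk.
    A pair is usable only when the patient with the shorter time actually
    failed; ties in risk score count as half-concordant.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data: earlier failures were given higher risk scores, except for one swap.
print(concordance_index([5, 10, 15, 20], [True, True, False, True],
                        [0.9, 0.4, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a change from 0.72 to 0.73 is read as a minor gain in discrimination.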
Grossling, Bernardo F.
1975-01-01
Exploratory drilling is still in incipient or youthful stages in those areas of the world where the bulk of the potential petroleum resources is yet to be discovered. Methods of assessing resources from projections based on historical production and reserve data are limited to mature areas. For most of the world's petroleum-prospective areas, a more speculative situation calls for a critical review of resource-assessment methodology. The language of mathematical statistics is required to define more rigorously the appraisal of petroleum resources. Basically, two approaches have been used to appraise the amounts of undiscovered mineral resources in a geologic province: (1) projection models, which use statistical data on the past outcome of exploration and development in the province; and (2) estimation models of the overall resources of the province, which use certain known parameters of the province together with the outcome of exploration and development in analogous provinces. These two approaches often lead to widely different estimates. Some of the controversy that arises results from a confusion of the probabilistic significance of the quantities yielded by each of the two approaches. Also, inherent limitations of analytic projection models, such as those using the logistic and Gompertz functions, have often been ignored. The resource-assessment problem should be recast in terms that provide for consideration of the probability of existence of the resource and of the probability of discovery of a deposit. Then the two above-mentioned models occupy the two ends of the probability range. The new approach accounts for (1) what can be expected with reasonably high certainty by mere projections of what has been accomplished in the past; (2) the inherent biases of decision-makers and resource estimators; (3) upper bounds that can be set up as goals for exploration; and (4) the uncertainties in geologic conditions in a search for minerals. Actual outcomes can then
NASA Astrophysics Data System (ADS)
Temme, F. P.
1992-12-01
Realisation of the invariance properties of the p ⩽ 2 number partitional inventory components of the 20-fold spin algebra associated with [A] 20 nuclear spin clusters under SU2 × L20 allows the mappings {[λ] → Γ} to be derived. In addition, recent general inner tensor product expressions under Ln, for n even (odd), also facilitates the evaluation of many higher [λ] ( L20; p = 3) correlative mappings onto SU3↓SO(3) × L↓20T A 5 subduced symmetry from SU2 duality, thus providing results that determine the nature of adapted NMR bases for both dodecahedrane and its d 20 analogue. The significance of this work lies in the pertinence of nuclear spin statistics to both selective MQ-NMR and to other spectroscopic aspects of cage clusters, e.g., [ 13C] n, n = 20, 60, fullerenes. Mappings onto Ln irreps sets of specific p ⩽ 3 number partitions arise in combinatorial treatment of {M iti} Rota fields, defining scalar invariants in the context of Cayley algebra. Inclusion of the Ln group in the specific Racah chain for NMR symmetry gives rise to significant further physical insight.
Cosmic statistics of statistics
NASA Astrophysics Data System (ADS)
Szapudi, István; Colombi, Stéphane; Bernardeau, Francis
1999-12-01
The errors on statistics measured in finite galaxy catalogues are exhaustively investigated. The theory of errors on factorial moments by Szapudi & Colombi is applied to cumulants via a series expansion method. All results are subsequently extended to the weakly non-linear regime. Together with previous investigations this yields an analytic theory of the errors for moments and connected moments of counts in cells from highly non-linear to weakly non-linear scales. For non-linear functions of unbiased estimators, such as the cumulants, the phenomenon of cosmic bias is identified and computed. Since it is subdued by the cosmic errors in the range of applicability of the theory, correction for it is inconsequential. In addition, the method of Colombi, Szapudi & Szalay concerning sampling effects is generalized, adapting the theory for inhomogeneous galaxy catalogues. While previous work focused on the variance only, the present article calculates the cross-correlations between moments and connected moments as well for a statistically complete description. The final analytic formulae representing the full theory are explicit but somewhat complicated. Therefore we have made available a Fortran program capable of calculating the described quantities numerically (for further details e-mail SC at colombi@iap.fr). An important special case is the evaluation of the errors on the two-point correlation function, for which this should be more accurate than any method put forward previously. This tool will be immensely useful in the future for assessing the precision of measurements from existing catalogues, as well as aiding the design of new galaxy surveys. To illustrate the applicability of the results and to explore the numerical aspects of the theory qualitatively and quantitatively, the errors and cross-correlations are predicted under a wide range of assumptions for the future Sloan Digital Sky Survey. The principal results concerning the cumulants ξ, Q3 and Q4 is that
Shi, Runhua; McLarty, Jerry W
2009-10-01
In this article, we introduced basic concepts of statistics, type of distributions, and descriptive statistics. A few examples were also provided. The basic concepts presented herein are only a fraction of the concepts related to descriptive statistics. Also, there are many commonly used distributions not presented herein, such as Poisson distributions for rare events and exponential distributions, F distributions, and logistic distributions. More information can be found in many statistics books and publications. PMID:19891281
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2008-01-01
As a branch of knowledge, Statistics is ubiquitous and its applications can be found in (almost) every field of human endeavour. In this article, the authors track down the possible source of the link between the "Siren song" and applications of Statistics. Answers to their previous five questions and five new questions on Statistics are presented.
ERIC Educational Resources Information Center
Callamaras, Peter
1983-01-01
This buyer's guide to seven major types of statistics software packages for microcomputers reviews Edu-Ware Statistics 3.0; Financial Planning; Speed Stat; Statistics with DAISY; Human Systems Dynamics package of Stats Plus, ANOVA II, and REGRESS II; Maxistat; and Moore-Barnes' MBC Test Construction and MBC Correlation. (MBR)
Significant lexical relationships
Pedersen, T.; Kayaalp, M.; Bruce, R.
1996-12-31
Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
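An exact conditional test on a 2x2 contingency table of word co-occurrence counts can be sketched as follows. Fisher's exact test is used here as one well-known exact conditional test; treating it as the procedure meant in the abstract is an assumption, and the function name and the counts are made up for illustration.

```python
from math import comb


def fisher_exact_right_tail(a: int, b: int, c: int, d: int) -> float:
    """Right-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    For a bigram (w1, w2): a = count of w1 followed by w2, b = w1 followed by
    another word, c = another word followed by w2, d = neither. With all
    margins held fixed, the top-left cell is hypergeometric, so the p-value is
    the probability of a cell count at least as large as the one observed.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    total = comb(n, col1)
    tail = 0
    for k in range(a, min(row1, col1) + 1):
        # math.comb returns 0 when the lower index exceeds the upper one.
        tail += comb(row1, k) * comb(n - row1, col1 - k)
    return tail / total


# Hypothetical counts for a rare bigram in a 10,000-bigram corpus.
print(fisher_exact_right_tail(3, 7, 12, 9978))
```

Because the p-value is summed exactly over the hypergeometric distribution, no large-sample approximation is involved, which is the property that matters when most cell counts are small.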
ERIC Educational Resources Information Center
Andrews, Ian A.
1999-01-01
Provides a crossword puzzle with an answer key corresponding to the book entitled "Significant Treasures/Tresors Parlants" that is filled with color and black-and-white prints of paintings and artifacts from 131 museums and art galleries as a sampling of the 2,200 such Canadian institutions. (CMK)
Kogalovskii, M.R.
1995-03-01
This paper presents a review of problems related to statistical database systems, which are widespread in various fields of activity. Statistical databases (SDB) are databases whose data are used for statistical analysis. Topics under consideration are: SDB peculiarities, properties of data models adequate for SDB requirements, metadata functions, null-value problems, SDB compromise protection problems, stored data compression techniques, and statistical data representation means. Also examined is whether the present Database Management Systems (DBMS) satisfy the SDB requirements. Some current research directions in SDB systems are considered.
Smith, Alwyn
1969-01-01
This paper is based on an analysis of questionnaires sent to the health ministries of Member States of WHO asking for information about the extent, nature, and scope of morbidity statistical information. It is clear that most countries collect some statistics of morbidity and many countries collect extensive data. However, few countries relate their collection to the needs of health administrators for information, and many countries collect statistics principally for publication in annual volumes which may appear anything up to 3 years after the year to which they refer. The desiderata of morbidity statistics may be summarized as reliability, representativeness, and relevance to current health problems. PMID:5306722
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2008-01-01
In this article, the authors focus on hypothesis testing--that peculiarly statistical way of deciding things. Statistical methods for testing hypotheses were developed in the 1920s and 1930s by some of the most famous statisticians, in particular Ronald Fisher, Jerzy Neyman and Egon Pearson, who laid the foundations of almost all modern methods of…
ERIC Educational Resources Information Center
Huberty, Carl J.
An approach to statistical testing, which combines Neyman-Pearson hypothesis testing and Fisher significance testing, is recommended. The use of P-values in this approach is discussed in some detail. The author also discusses some problems which are often found in introductory statistics textbooks. The problems involve the definitions of…
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute works to provide information on cancer statistics in an effort to reduce the burden of cancer among the U.S. population.
... cancer statistics across the world. U.S. Cancer Mortality Trends The best indicator of progress against cancer is ... the number of cancer survivors has increased. These trends show that progress is being made against the ...
NASA Astrophysics Data System (ADS)
Hermann, Claudine
Statistical Physics bridges the properties of a macroscopic system and the microscopic behavior of its constituting particles, otherwise impossible due to the giant magnitude of Avogadro's number. Numerous systems of today's key technologies - such as semiconductors or lasers - are macroscopic quantum objects; only statistical physics allows for understanding their fundamentals. Therefore, this graduate text also focuses on particular applications such as the properties of electrons in solids with applications, and radiation thermodynamics and the greenhouse effect.
Bishop, Joseph E.; Strack, O. E.
2011-03-22
A novel method is presented for assessing the convergence of a sequence of statistical distributions generated by direct Monte Carlo sampling. The primary application is to assess the mesh or grid convergence, and possibly divergence, of stochastic outputs from non-linear continuum systems. Example systems include those from fluid or solid mechanics, particularly those with instabilities and sensitive dependence on initial conditions or system parameters. The convergence assessment is based on demonstrating empirically that a sequence of cumulative distribution functions converges in the L∞ norm. The effect of finite sample sizes is quantified using confidence levels from the Kolmogorov–Smirnov statistic. The statistical method is independent of the underlying distributions. The statistical method is demonstrated using two examples: (1) the logistic map in the chaotic regime, and (2) a fragmenting ductile ring modeled with an explicit-dynamics finite element code. In the fragmenting ring example the convergence of the distribution describing neck spacing is investigated. The initial yield strength is treated as a random field. Two different random fields are considered, one with spatial correlation and the other without. Both cases converged, albeit to different distributions. The case with spatial correlation exhibited a significantly higher convergence rate compared with the one without spatial correlation.
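The comparison of two sampled distributions in the sup norm can be sketched as below. This is a simplified stand-in, not the authors' procedure: the function names, the use of the logistic map with r = 4, random initial conditions, and the large-sample two-sample Kolmogorov-Smirnov critical value are all assumptions for illustration.

```python
import math
import random


def logistic_map_samples(n_samples, n_iter=100, r=4.0, seed=0):
    """Monte Carlo samples of the logistic map after n_iter iterations,
    each started from an independent uniform random initial condition."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        x = rng.random()
        for _ in range(n_iter):
            x = r * x * (1.0 - x)
        samples.append(x)
    return samples


def ks_distance(a, b):
    """Sup-norm distance between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical(n, m, alpha=0.05):
    """Large-sample two-sample KS critical value at significance level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


coarse = logistic_map_samples(500, n_iter=100, seed=1)
fine = logistic_map_samples(500, n_iter=200, seed=2)
print(ks_distance(coarse, fine), ks_critical(500, 500))
```

If the observed distance stays below the critical value as the discretization is refined, the difference between successive distributions cannot be distinguished from sampling noise at the chosen confidence level.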
Han, Yali; Liu, Jie; Sun, Meili; Zhang, Zongpu; Liu, Chuanyong; Sun, Yuping
2016-01-01
Background. There is no definitive conclusion so far on the predictive values of ERCC1 polymorphisms for clinical outcomes of platinum-based chemotherapy in non-small cell lung cancer (NSCLC). We updated this meta-analysis with an expectation to obtain some statistical advancement on this issue. Methods. Relevant studies were identified by searching MEDLINE, EMBASE databases from inception to April 2015. Primary outcomes included objective response rate (ORR), progression-free survival (PFS), and overall survival (OS). All analyses were performed using the Review Manager version 5.3 and the Stata version 12.0. Results. A total of 33 studies including 5373 patients were identified. ERCC1 C118T and C8092A could predict both ORR and OS for platinum-based chemotherapy in Asian NSCLC patients (CT + TT versus CC, ORR: OR = 0.80, 95% CI = 0.67–0.94; OS: HR = 1.24, 95% CI = 1.01–1.53) (CA + AA versus CC, ORR: OR = 0.76, 95% CI = 0.60–0.96; OS: HR = 1.37, 95% CI = 1.06–1.75). Conclusions. Current evidence strongly indicated the prospect of ERCC1 C118T and C8092A as predictive biomarkers for platinum-based chemotherapy in Asian NSCLC patients. However, the results should be interpreted with caution and large prospective studies are still required to further investigate these findings. PMID:27057082
NASA Astrophysics Data System (ADS)
Goodman, J. W.
This book is based on the thesis that some training in the area of statistical optics should be included as a standard part of any advanced optics curriculum. Random variables are discussed, taking into account definitions of probability and random variables, distribution functions and density functions, an extension to two or more random variables, statistical averages, transformations of random variables, sums of real random variables, Gaussian random variables, complex-valued random variables, and random phasor sums. Other subjects examined are related to random processes, some first-order properties of light waves, the coherence of optical waves, some problems involving high-order coherence, effects of partial coherence on imaging systems, imaging in the presence of randomly inhomogeneous media, and fundamental limits in photoelectric detection of light. Attention is given to deterministic versus statistical phenomena and models, the Fourier transform, and the fourth-order moment of the spectrum of a detected speckle image.
ERIC Educational Resources Information Center
Chicot, Katie; Holmes, Hilary
2012-01-01
The use, and misuse, of statistics is commonplace, yet in the printed format data representations can be either oversimplified, supposedly for impact, or so complex as to lead to boredom, supposedly for completeness and accuracy. In this article the link to the video clip shows how dynamic visual representations can enliven and enhance the…
ERIC Educational Resources Information Center
Catley, Alan
2007-01-01
Following the announcement last year that there will be no more math coursework assessment at General Certificate of Secondary Education (GCSE), teachers will in the future be able to devote more time to preparing learners for formal examinations. One of the key things that the author has learned when teaching statistics is that it makes for far…
Januszyk, Michael; Gurtner, Geoffrey C
2011-01-01
The scope of biomedical research has expanded rapidly during the past several decades, and statistical analysis has become increasingly necessary to understand the meaning of large and diverse quantities of raw data. As such, a familiarity with this lexicon is essential for critical appraisal of medical literature. This article attempts to provide a practical overview of medical statistics, with an emphasis on the selection, application, and interpretation of specific tests. This includes a brief review of statistical theory and its nomenclature, particularly with regard to the classification of variables. A discussion of descriptive methods for data presentation is then provided, followed by an overview of statistical inference and significance analysis, and detailed treatment of specific statistical tests and guidelines for their interpretation. PMID:21200241
ERIC Educational Resources Information Center
Peterson, Lisa S.
2008-01-01
Clinical significance is an important concept in research, particularly in education and the social sciences. The present article first compares clinical significance to other measures of "significance" in statistics. The major methods used to determine clinical significance are explained and the strengths and weaknesses of clinical significance…
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. PMID:26466186
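The test-performance measures named above follow directly from a 2x2 table of test result against disease status. The sketch below uses standard textbook definitions; the function name and the counts are hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, accuracy, and likelihood ratios from counts of
    true positives, false positives, false negatives, and true negatives."""
    sensitivity = tp / (tp + fn)          # P(test positive | disease)
    specificity = tn / (tn + fp)          # P(test negative | no disease)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (tp + fn)
    # Likelihood ratios are unbounded when the denominator rate is zero.
    lr_pos = sensitivity / false_positive_rate if false_positive_rate else float("inf")
    lr_neg = false_negative_rate / specificity if specificity else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_positive": lr_pos,
        "lr_negative": lr_neg,
    }


# Hypothetical study: 100 diseased and 100 healthy subjects.
for name, value in diagnostic_metrics(tp=90, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.3f}")
```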
NASA Astrophysics Data System (ADS)
Goodman, Joseph W.
2000-07-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Robert G. Bartle The Elements of Integration and Lebesgue Measure George E. P. Box & Norman R. Draper Evolutionary Operation: A Statistical Method for Process Improvement George E. P. Box & George C. Tiao Bayesian Inference in Statistical Analysis R. W. Carter Finite Groups of Lie Type: Conjugacy Classes and Complex Characters R. W. Carter Simple Groups of Lie Type William G. Cochran & Gertrude M. Cox Experimental Designs, Second Edition Richard Courant Differential and Integral Calculus, Volume I Richard Courant Differential and Integral Calculus, Volume II Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume I Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume II D. R. Cox Planning of Experiments Harold S. M. Coxeter Introduction to Geometry, Second Edition Charles W. Curtis & Irving Reiner Representation Theory of Finite Groups and Associative Algebras Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II Cuthbert Daniel Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition Bruno de Finetti Theory of Probability, Volume I Bruno de Finetti Theory of Probability, Volume 2 W. Edwards Deming Sample Design in Business Research
A study on the use of Gumbel approximation with the Bernoulli spatial scan statistic.
Read, S; Bath, P A; Willett, P; Maheswaran, R
2013-08-30
The Bernoulli version of the spatial scan statistic is a well established method of detecting localised spatial clusters in binary labelled point data, a typical application being the epidemiological case-control study. A recent study suggests the inferential accuracy of several versions of the spatial scan statistic (principally the Poisson version) can be improved, at little computational cost, by using the Gumbel distribution, a method now available in SaTScan(TM) (www.satscan.org). We study in detail the effect of this technique when applied to the Bernoulli version and demonstrate that it is highly effective, albeit with some increase in false alarm rates at certain significance thresholds. We explain how this increase is due to the discrete nature of the Bernoulli spatial scan statistic and demonstrate that it can affect even small p-values. Despite this, we argue that the Gumbel method is actually preferable for very small p-values. Furthermore, we extend previous research by running benchmark trials on 12 000 synthetic datasets, thus demonstrating that the overall detection capability of the Bernoulli version (i.e. ratio of power to false alarm rate) is not noticeably affected by the use of the Gumbel method. We also provide an example application of the Gumbel method using data on hospital admissions for chronic obstructive pulmonary disease. PMID:23348825
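The contrast between a rank-based Monte Carlo p-value and a Gumbel-based one can be sketched as follows. This is not SaTScan's implementation or API: the function names are hypothetical, and fitting the Gumbel distribution by the method of moments to the replicate maxima is an assumption made for the example.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def rank_pvalue(observed, replicate_maxima):
    """Standard Monte Carlo p-value: rank of the observed statistic among the
    maxima obtained from datasets simulated under the null hypothesis."""
    exceed = sum(1 for m in replicate_maxima if m >= observed)
    return (exceed + 1) / (len(replicate_maxima) + 1)


def gumbel_pvalue(observed, replicate_maxima):
    """P-value from a Gumbel distribution fitted to the replicate maxima by the
    method of moments (scale from the standard deviation, location from the mean)."""
    mean = statistics.fmean(replicate_maxima)
    sd = statistics.stdev(replicate_maxima)
    beta = sd * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for tiny p-values.
    return -math.expm1(-math.exp(-(observed - mu) / beta))


replicates = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.6, 4.9, 5.0]
print(rank_pvalue(9.0, replicates))    # bounded below by 1 / (R + 1)
print(gumbel_pvalue(9.0, replicates))  # can fall below that bound
```

The rank-based p-value can never be smaller than 1/(R + 1) for R replicates, whereas the fitted distribution extrapolates into the tail, which is the practical reason the approximation is attractive for very small p-values.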
1986-01-01
Official population data for the USSR are presented for 1985 and 1986. Part 1 (pp. 65-72) contains data on capitals of union republics and cities with over one million inhabitants, including population estimates for 1986 and vital statistics for 1985. Part 2 (p. 72) presents population estimates by sex and union republic, 1986. Part 3 (pp. 73-6) presents data on population growth, including birth, death, and natural increase rates, 1984-1985; seasonal distribution of births and deaths; birth order; age-specific birth rates in urban and rural areas and by union republic; marriages; age at marriage; and divorces. PMID:12178831
Intervention for Maltreating Fathers: Statistically and Clinically Significant Change
ERIC Educational Resources Information Center
Scott, Katreena L.; Lishak, Vicky
2012-01-01
Objective: Fathers are seldom the focus of efforts to address child maltreatment and little is currently known about the effectiveness of intervention for this population. To address this gap, we examined the efficacy of a community-based group treatment program for fathers who had abused or neglected their children or exposed their children to…
Worry, Intolerance of Uncertainty, and Statistics Anxiety
ERIC Educational Resources Information Center
Williams, Amanda S.
2013-01-01
Statistics anxiety is a problem for most graduate students. This study investigates the relationship between intolerance of uncertainty, worry, and statistics anxiety. Intolerance of uncertainty was significantly related to worry, and worry was significantly related to three types of statistics anxiety. Six types of statistics anxiety were…
Suite versus composite statistics
Balsillie, J.H.; Tanner, W.F.
1999-01-01
Suite and composite methodologies, two statistically valid approaches for producing statistical descriptive measures, are investigated for sample groups representing a probability distribution where, in addition, each sample is itself a probability distribution. Suite and composite means (first moment measures) are always equivalent. Composite standard deviations (second moment measures) are always larger than suite standard deviations. Suite and composite values for higher moment measures have more complex relationships. Very seldom, however, are they equivalent, and they normally yield statistically significant but different results. Multiple samples are preferable to single samples (including composites) because they permit the investigator to examine sample-to-sample variability. These and other relationships for suite and composite probability distribution analyses are investigated and reported using granulometric data.
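The first two relationships stated above can be checked numerically. The sketch below assumes that a suite statistic is the average of the per-sample statistics and that a composite statistic is computed after pooling all samples into one; these working definitions, the equal sample sizes, and the toy grain-size values are assumptions for illustration.

```python
import statistics


def suite_and_composite(samples):
    """Return (suite_mean, composite_mean, suite_sd, composite_sd).

    Suite: average of the statistic computed separately on each sample.
    Composite: statistic computed on all samples pooled together.
    """
    suite_mean = statistics.fmean(statistics.fmean(s) for s in samples)
    suite_sd = statistics.fmean(statistics.pstdev(s) for s in samples)
    pooled = [x for s in samples for x in s]
    composite_mean = statistics.fmean(pooled)
    composite_sd = statistics.pstdev(pooled)
    return suite_mean, composite_mean, suite_sd, composite_sd


# Three equally sized samples (hypothetical grain sizes in phi units).
samples = [
    [1.8, 2.0, 2.2, 2.1],
    [2.4, 2.6, 2.5, 2.9],
    [1.5, 1.7, 1.6, 1.9],
]
print(suite_and_composite(samples))
```

Pooling adds the spread between sample means to the spread within samples, so the composite standard deviation exceeds the suite value whenever the sample means differ, while the two means coincide for equally sized samples.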
Candidate Assembly Statistical Evaluation
Energy Science and Technology Software Center (ESTSC)
1998-07-15
The Savannah River Site (SRS) receives aluminum clad spent Material Test Reactor (MTR) fuel from all over the world for storage and eventual reprocessing. There are hundreds of different kinds of MTR fuels and these fuels will continue to be received at SRS for approximately ten more years. SRS's current criticality evaluation methodology requires the modeling of all MTR fuels utilizing Monte Carlo codes, which is extremely time consuming and resource intensive. Now that a significant number of MTR calculations have been conducted, it is feasible to consider building statistical models that will provide reasonable estimations of MTR behavior. These statistical models can be incorporated into a standardized model homogenization spreadsheet package to provide analysts with a means of performing routine MTR fuel analyses with a minimal commitment of time and resources. This became the purpose for development of the Candidate Assembly Statistical Evaluation (CASE) program at SRS.
Smith, Kristy Breuhl; Smith, Michael Seth
2016-03-01
Obesity is a chronic disease that is strongly associated with an increase in mortality and morbidity, including certain types of cancer, cardiovascular disease, disability, diabetes mellitus, hypertension, osteoarthritis, and stroke. In adults, overweight is defined as a body mass index (BMI) of 25 kg/m(2) to 29 kg/m(2) and obesity as a BMI of greater than 30 kg/m(2). If current trends continue, it is estimated that, by the year 2030, 38% of the world's adult population will be overweight and another 20% obese. Significant global health strategies must reduce the morbidity and mortality associated with the obesity epidemic. PMID:26896205
Cosmetic Plastic Surgery Statistics
2014 Cosmetic Plastic Surgery Statistics Cosmetic Procedure Trends 2014 Plastic Surgery Statistics Report Please credit the AMERICAN SOCIETY OF PLASTIC SURGEONS when citing statistical data or using ...
Statistics Anxiety among Postgraduate Students
ERIC Educational Resources Information Center
Koh, Denise; Zawi, Mohd Khairi
2014-01-01
Most postgraduate programmes that have research components require students to take at least one course in research statistics. Not all postgraduate programmes are science based; a significant number of postgraduate students from the social sciences will be taking statistics courses as they try to complete their…
Thermodynamics of cellular statistical inference
NASA Astrophysics Data System (ADS)
Lang, Alex; Fisher, Charles; Mehta, Pankaj
2014-03-01
Successful organisms must be capable of accurately sensing the surrounding environment in order to locate nutrients and evade toxins or predators. However, single cell organisms face a multitude of limitations on their accuracy of sensing. Berg and Purcell first examined the canonical example of statistical limitations to cellular learning of a diffusing chemical and established a fundamental limit to statistical accuracy. Recent work has shown that the Berg and Purcell learning limit can be exceeded using Maximum Likelihood Estimation. Here, we recast the cellular sensing problem as a statistical inference problem and discuss the relationship between the efficiency of an estimator and its thermodynamic properties. We explicitly model a single non-equilibrium receptor and examine the constraints on statistical inference imposed by noisy biochemical networks. Our work shows that cells must balance sample number, specificity, and energy consumption when performing statistical inference. These tradeoffs place significant constraints on the practical implementation of statistical estimators in a cell.
Predict! Teaching Statistics Using Informational Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie
2013-01-01
Statistics is one of the most widely used topics for everyday life in the school mathematics curriculum. Unfortunately, the statistics taught in schools focuses on calculations and procedures before students have a chance to see it as a useful and powerful tool. Researchers have found that a dominant view of statistics is as an assortment of tools…
Statistics Poker: Reinforcing Basic Statistical Concepts
ERIC Educational Resources Information Center
Leech, Nancy L.
2008-01-01
Learning basic statistical concepts does not need to be tedious or dry; it can be fun and interesting through cooperative learning in the small-group activity of Statistics Poker. This article describes a teaching approach for reinforcing basic statistical concepts that can help students who have high anxiety and makes learning and reinforcing…
Statistics of atmospheric correlations.
Santhanam, M S; Patra, P K
2001-07-01
For a large class of quantum systems, the statistical properties of their spectrum show remarkable agreement with random matrix predictions. Recent advances show that the scope of random matrix theory is much wider. In this work, we show that the random matrix approach can be beneficially applied to a completely different classical domain, namely, to the empirical correlation matrices obtained from the analysis of the basic atmospheric parameters that characterize the state of atmosphere. We show that the spectrum of atmospheric correlation matrices satisfies the random matrix prescription. In particular, the eigenmodes of the atmospheric empirical correlation matrices that have physical significance are marked by deviations from the eigenvector distribution. PMID:11461326
Neuroendocrine Tumor: Statistics
... Tumor > Neuroendocrine Tumor - Statistics Request Permissions Neuroendocrine Tumor - Statistics Approved by the Cancer.Net Editorial Board , 04/ ... the body. It is important to remember that statistics on how many people survive this type of ...
Antecedents of students' achievement in statistics
NASA Astrophysics Data System (ADS)
Awaludin, Izyan Syazana; Razak, Ruzanna Ab; Harris, Hezlin; Selamat, Zarehan
2015-02-01
The applications of statistics in most fields have been vast. Many degree programmes at local universities require students to enroll in at least one statistics course. The standard of these courses varies across different degree programmes. This is because of students' diverse academic backgrounds, some of which are far from the field of statistics. The high failure rate in statistics courses among non-science stream students has been a concern every year. The purpose of this research is to investigate the antecedents of students' achievement in statistics. A total of 272 students participated in the survey. Multiple linear regression was applied to examine the relationship between the factors and achievement. We found that statistics anxiety was a significant predictor of students' achievement. We also found that students' age has a significant effect on achievement. Older students are more likely to achieve lower scores in statistics. Students' level of study also has a significant impact on their achievement in statistics.
[Comment on] Statistical discrimination
NASA Astrophysics Data System (ADS)
Chinn, Douglas
In the December 8, 1981, issue of Eos, a news item reported the conclusion of a National Research Council study that sexual discrimination against women with Ph.D.'s exists in the field of geophysics. Basically, the item reported that even when allowances are made for motherhood the percentage of female Ph.D.'s holding high university and corporate positions is significantly lower than the percentage of male Ph.D.'s holding the same types of positions. The sexual discrimination conclusion, based only on these statistics, assumes that there are no basic psychological differences between men and women that might cause different populations in the employment group studied. Therefore, the reasoning goes, after taking into account possible effects from differences related to anatomy, such as women stopping their careers in order to bear and raise children, the statistical distributions of positions held by male and female Ph.D.'s ought to be very similar to one another. Any significant differences between the distributions must be caused primarily by sexual discrimination.
Statistical Reference Datasets
National Institute of Standards and Technology Data Gateway
Statistical Reference Datasets (Web, free access) The Statistical Reference Datasets is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.
Chiou, Chei-Chang; Wang, Yu-Min; Lee, Li-Tze
2014-08-01
Statistical knowledge is widely used in academia; however, statistics teachers struggle with the issue of how to reduce students' statistics anxiety and enhance students' statistics learning. This study assesses the effectiveness of a "one-minute paper strategy" in reducing students' statistics-related anxiety and in improving students' statistics-related achievement. Participants were 77 undergraduates from two classes enrolled in applied statistics courses. An experiment was implemented according to a pretest/posttest comparison group design. The quasi-experimental design showed that the one-minute paper strategy significantly reduced students' statistics anxiety and improved students' statistics learning achievement. The strategy was a better instructional tool than the textbook exercise for reducing students' statistics anxiety and improving students' statistics achievement. PMID:25153964
[Big data in official statistics].
Zwick, Markus
2015-08-01
The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany. PMID:26077871
Ranald Macdonald and statistical inference.
Smith, Philip T
2009-05-01
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing. PMID:19351454
Mathematical and statistical analysis
NASA Technical Reports Server (NTRS)
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Statistical Modeling of Occupational Exposure to Polycyclic Aromatic Hydrocarbons Using OSHA Data.
Lee, Derrick G; Lavoué, Jérôme; Spinelli, John J; Burstyn, Igor
2015-01-01
Polycyclic aromatic hydrocarbons (PAHs) are a group of pollutants with multiple variants classified as carcinogenic. The Occupational Safety and Health Administration (OSHA) provided access to two PAH exposure databanks of United States workplace compliance testing data collected between 1979 and 2010. Mixed-effects logistic models were used to predict the exceedance fraction (EF), i.e., the probability of exceeding OSHA's Permissible Exposure Limit (PEL = 0.2 mg/m³) for PAHs based on industry and occupation. Measurements of coal tar pitch volatiles were used as a surrogate for PAHs. Time, databank, occupation, and industry were included as fixed effects while an identifier for the compliance inspection number was included as a random effect. Analyses involved 2,509 full-shift personal measurements. Results showed that the majority of industries had an estimated EF < 0.5, although several industries, including Standardized Industry Classification codes 1623 (Water, Sewer, Pipeline, and Communication and Powerline Construction), 1711 (Plumbing, Heating, and Air-Conditioning), 2824 (Manmade Organic Fibres), 3496 (Misc. Fabricated Wire products), and 5812 (Eating Places), and Major groups 13 (Oil and Gas Extraction) and 30 (Rubber and Miscellaneous Plastic Products), were estimated to have more than an 80% likelihood of exceeding the PEL. There was an inverse temporal trend of exceeding the PEL, with lower risk in most recent years, albeit not statistically significant. Similar results were shown when incorporating occupation, but varied depending on the occupation as the majority of industries predicted at the administrative level, e.g., managers, had an estimated EF < 0.5 while at the minimally skilled/laborer level there was a substantial increase in the estimated EF. These statistical models allow the prediction of PAH exposure risk through individual occupational histories and will be used to create a job-exposure matrix for use in a population-based case
Minnesota Health Statistics 1988.
ERIC Educational Resources Information Center
Minnesota State Dept. of Health, St. Paul.
This document comprises the 1988 annual statistical report of the Minnesota Center for Health Statistics. After introductory technical notes on changes in format, sources of data, and geographic allocation of vital events, an overview is provided of vital health statistics in all areas. Thereafter, separate sections of the report provide tables…
ERIC Educational Resources Information Center
Lenard, Christopher; McCarthy, Sally; Mills, Terence
2014-01-01
There are many different aspects of statistics. Statistics involves mathematics, computing, and applications to almost every field of endeavour. Each aspect provides an opportunity to spark someone's interest in the subject. In this paper we discuss some ethical aspects of statistics, and describe how an introduction to ethics has been…
ERIC Educational Resources Information Center
Strasser, Nora
2007-01-01
Avoiding statistical mistakes is important for educators at all levels. Basic concepts will help you to avoid making mistakes using statistics and to look at data with a critical eye. Statistical data is used at educational institutions for many purposes. It can be used to support budget requests, changes in educational philosophy, changes to…
Statistical quality management
NASA Astrophysics Data System (ADS)
Vanderlaan, Paul
1992-10-01
Some aspects of statistical quality management are discussed. Quality has to be defined as a concrete, measurable quantity. The concepts of Total Quality Management (TQM), Statistical Process Control (SPC), and inspection are explained. In most cases SPC is better than inspection. It can be concluded that statistics has great possibilities in the field of TQM.
Explorations in statistics: statistical facets of reproducibility.
Curran-Everett, Douglas
2016-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eleventh installment of Explorations in Statistics explores statistical facets of reproducibility. If we obtain an experimental result that is scientifically meaningful and statistically unusual, we would like to know that our result reflects a general biological phenomenon that another researcher could reproduce if (s)he repeated our experiment. But more often than not, we may learn this researcher cannot replicate our result. The National Institutes of Health and the Federation of American Societies for Experimental Biology have created training modules and outlined strategies to help improve the reproducibility of research. These particular approaches are necessary, but they are not sufficient. The principles of hypothesis testing and estimation are inherent to the notion of reproducibility in science. If we want to improve the reproducibility of our research, then we need to rethink how we apply fundamental concepts of statistics to our science. PMID:27231259
Statistical prediction of cyclostationary processes
Kim, K.Y.
2000-03-15
Considered in this study is a cyclostationary generalization of an EOF-based prediction method. While linear statistical prediction methods are typically optimal in the sense that prediction error variance is minimal within the assumption of stationarity, there is some room for improved performance since many physical processes are not stationary. For instance, El Niño is known to be strongly phase locked with the seasonal cycle, which suggests nonstationarity of the El Niño statistics. Many geophysical and climatological processes may be termed cyclostationary since their statistics show strong cyclicity instead of stationarity. Therefore, developed in this study is a cyclostationary prediction method. Test results demonstrate that performance of prediction methods can be improved significantly by accounting for the cyclostationarity of underlying processes. The improvement comes from an accurate rendition of covariance structure both in space and time.
Thermodynamic Limit in Statistical Physics
NASA Astrophysics Data System (ADS)
Kuzemsky, A. L.
2014-03-01
The thermodynamic limit in statistical thermodynamics of many-particle systems is an important but often overlooked issue in the various applied studies of condensed matter physics. To settle this issue, we review tersely the past and present disposition of thermodynamic limiting procedure in the structure of the contemporary statistical mechanics and our current understanding of this problem. We pick out the ingenious approach by Bogoliubov, who developed a general formalism for establishing the limiting distribution functions in the form of formal series in powers of the density. In that study, he outlined the method of justification of the thermodynamic limit when he derived the generalized Boltzmann equations. To enrich and to weave our discussion, we take this opportunity to give a brief survey of the closely related problems, such as the equipartition of energy and the equivalence and nonequivalence of statistical ensembles. The validity of the equipartition of energy permits one to decide what are the boundaries of applicability of statistical mechanics. The major aim of this work is to provide a better qualitative understanding of the physical significance of the thermodynamic limit in modern statistical physics of the infinite and "small" many-particle systems.
NASA Astrophysics Data System (ADS)
Schieve, William C.; Horwitz, Lawrence P.
2009-04-01
1. Foundations of quantum statistical mechanics; 2. Elementary examples; 3. Quantum statistical master equation; 4. Quantum kinetic equations; 5. Quantum irreversibility; 6. Entropy and dissipation: the microscopic theory; 7. Global equilibrium: thermostatics and the microcanonical ensemble; 8. Bose-Einstein ideal gas condensation; 9. Scaling, renormalization and the Ising model; 10. Relativistic covariant statistical mechanics of many particles; 11. Quantum optics and damping; 12. Entanglements; 13. Quantum measurement and irreversibility; 14. Quantum Langevin equation: quantum Brownian motion; 15. Linear response: fluctuation and dissipation theorems; 16. Time dependent quantum Green's functions; 17. Decay scattering; 18. Quantum statistical mechanics, extended; 19. Quantum transport with tunneling and reservoir ballistic transport; 20. Black hole thermodynamics; Appendix; Index.
Statistical distribution sampling
NASA Technical Reports Server (NTRS)
Johnson, E. S.
1975-01-01
Determining the distribution of statistics by sampling was investigated. Characteristic functions, the quadratic regression problem, and the differential equations for the characteristic functions are analyzed.
Statistical comparison of dissolution profiles.
Wang, Yifan; Snee, Ronald D; Keyvan, Golshid; Muzzio, Fernando J
2016-05-01
Statistical methods to assess similarity of dissolution profiles are introduced. Sixteen groups of dissolution profiles from a full factorial design were used to demonstrate implementation details. Variables in the design include drug strength, tablet stability time, and dissolution testing condition. The 16 groups were considered similar when compared using the similarity factor f2 (f2 > 50). However, multivariate ANOVA (MANOVA) repeated measures suggested statistical differences. A modified principal component analysis (PCA) was used to describe the dissolution curves in terms of level and shape. The advantage of the modified PCA approach is that the calculated shape principal components will not be confounded by level effect. Effect size test using omega-squared was also used for dissolution comparisons. Effects indicated by omega-squared are independent of sample size and are a necessary supplement to p value reported from the MANOVA table. Methods to compare multiple groups show that product strength and dissolution testing condition had significant effects on both level and shape. For pairwise analysis, a post-hoc analysis using Tukey's method categorized three similar groups, and was consistent with level-shape analysis. All these methods provide valuable information that is missed using f2 method alone to compare average profiles. The improved statistical analysis approach introduced here enables one to better ascertain both statistical significance and clinical relevance, supporting more objective regulatory decisions. PMID:26294289
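For readers unfamiliar with the similarity factor mentioned in this abstract, the following is a minimal sketch of the standard f2 formula, f2 = 50 * log10(100 / sqrt(1 + mean squared difference)), applied to two mean dissolution profiles measured at the same time points. The function name and the profile values are hypothetical and are not taken from the study.

```python
import math


def f2_similarity(reference, test):
    """Similarity factor f2 for two mean dissolution profiles.

    Both arguments are percent-dissolved values at the same time points.
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(reference)
    # Mean squared difference between the profiles across time points.
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Hypothetical percent-dissolved values at five time points.
reference = [20.0, 45.0, 70.0, 85.0, 95.0]
test = [18.0, 41.0, 66.0, 83.0, 94.0]
f2_value = f2_similarity(reference, test)
print(f"f2 = {f2_value:.1f}")
```

Identical profiles give f2 = 100, and a uniform 10-point difference at every time point gives a value just under 50, which is why f2 > 50 is used as the similarity threshold. Because f2 compares only average profiles, it cannot show the level and shape differences that the abstract reports from MANOVA and PCA.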
Statistical Mechanics of Zooplankton
Hinow, Peter; Nihongi, Ai; Strickler, J. Rudi
2015-01-01
Statistical mechanics provides the link between microscopic properties of many-particle systems and macroscopic properties such as pressure and temperature. Observations of similar “microscopic” quantities exist for the motion of zooplankton, as well as many species of other social animals. Herein, we propose to take average squared velocities as the definition of the “ecological temperature” of a population under different conditions on nutrients, light, oxygen and others. We test the usefulness of this definition on observations of the crustacean zooplankton Daphnia pulicaria. In one set of experiments, D. pulicaria is infested with the pathogen Vibrio cholerae, the causative agent of cholera. We find that infested D. pulicaria under light exposure have a significantly greater ecological temperature, which puts them at a greater risk of detection by visual predators. In a second set of experiments, we observe D. pulicaria in cold and warm water, and in darkness and under light exposure. Overall, our ecological temperature is a good discriminator of the crustacean’s swimming behavior. PMID:26270537
Queer (v.) Queer (v.): Biology as Curriculum, Pedagogy, and Being albeit Queer (v.)
ERIC Educational Resources Information Center
Broadway, Francis S.
2011-01-01
In order to advance the purpose of education as creating a sustainable world yet to be imagined, educationally, queer (v.) queer (v.) expounds curriculum, pedagogy and being, which has roots in sexuality--the public face of the private confluence of sexuality, gender, race and class, are a necessary framework for queer. If queer is a complicated…
Nitrofurantoin-induced interstitial pneumonitis: albeit rare, should not be missed.
Syed, Haamid; Bachuwa, Ghassan; Upadhaya, Sunil; Abed, Firas
2016-01-01
Interstitial lung disease (ILD) is a rare adverse effect of nitrofurantoin and can range from benign infiltrates to a fatal condition. Nitrofurantoin acts via inhibiting the protein synthesis in bacteria by helping reactive intermediates and is known to produce primary lung parenchymal injury through an oxidant mechanism. Stopping the drug leads to complete recovery of symptoms. In this report, we present a case of nitrofurantoin-induced ILD with the recovery of symptoms and disease process after stopping the drug. PMID:26912767
Queer (v.) queer (v.): biology as curriculum, pedagogy, and being albeit queer (v.)
NASA Astrophysics Data System (ADS)
Broadway, Francis S.
2011-06-01
In order to advance the purpose of education as creating a sustainable world yet to be imagined, educationally, queer (v.) queer (v.) expounds curriculum, pedagogy and being, which has roots in sexuality—the public face of the private confluence of sexuality, gender, race and class, are a necessary framework for queer. If queer is a complicated conversation of strangers' eros, then queer facilitates the creation of space, revolution and transformation. In other words, queer, for science education, is more than increasing and privileging the heteronormative and non-heteronormative science content that extends capitalism's hegemony, but rather science as the dignity, identity, and loving and caring of and by one's self and fellow human beings as strangers.
ERIC Educational Resources Information Center
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U.
2015-01-01
This article uses definitions provided by Cronbach in his seminal paper for coefficient alpha to show the concepts of reliability, dimensionality, and internal consistency are distinct but interrelated. The article begins with a critique of the definition of reliability and then explores mathematical properties of Cronbach's alpha. Internal consistency…
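As a reminder of the quantity this article examines, Cronbach's coefficient alpha for k items is k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below computes it with the Python standard library; the function name and the item scores are hypothetical and do not come from the article.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for k items.

    items: list of k lists, each holding one score per respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical scores: three items answered by five respondents.
items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 5, 1],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

A high alpha here reflects strong covariance among the items; on its own it does not show that the items measure a single dimension, which is the distinction the article draws.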
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…
Multidimensional Visual Statistical Learning
ERIC Educational Resources Information Center
Turk-Browne, Nicholas B.; Isola, Phillip J.; Scholl, Brian J.; Treat, Teresa A.
2008-01-01
Recent studies of visual statistical learning (VSL) have demonstrated that statistical regularities in sequences of visual stimuli can be automatically extracted, even without intent or awareness. Despite much work on this topic, however, several fundamental questions remain about the nature of VSL. In particular, previous experiments have not…
Deconstructing Statistical Analysis
ERIC Educational Resources Information Center
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article or making a new product or service legitimate needs to be monitored and questioned for accuracy. 1) The more complicated the statistical analysis and research, the fewer the learned readers who can understand it. This adds a…
Explorations in Statistics: Power
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2010-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fifth installment of "Explorations in Statistics" revisits power, a concept fundamental to the test of a null hypothesis. Power is the probability that we reject the null hypothesis when it is false. Four things affect…
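The definition of power given in this abstract can be made concrete with a small calculation. The sketch below is not from the article: it assumes a two-sided one-sample z-test with known standard deviation, and the function and parameter names are hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect: true difference between the population mean and the null value
    sigma:  known population standard deviation
    n:      sample size
    alpha:  significance level of the test
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true mean shifts the test statistic, in standard errors.
    shift = effect * n ** 0.5 / sigma
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


for n in (16, 32, 64):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

In this setting power rises with sample size, and when the true effect is zero the rejection probability equals the significance level.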
ERIC Educational Resources Information Center
Huizingh, Eelko K. R. E.
2007-01-01
Accessibly written and easy to use, "Applied Statistics Using SPSS" is an all-in-one self-study guide to SPSS and do-it-yourself guide to statistics. What is unique about Eelko Huizingh's approach is that this book is based around the needs of undergraduate students embarking on their own research project, and its self-help style is designed to…
Vijayaraj, Veeraraghavan; Cheriyadat, Anil M; Bhaduri, Budhendra L; Vatsavai, Raju; Bright, Eddie A
2008-01-01
Statistical properties of high-resolution overhead images representing different land use categories are analyzed using various local and global statistical image properties based on the shape of the power spectrum, image gradient distributions, edge co-occurrence, and inter-scale wavelet coefficient distributions. The analysis was performed on a database of high-resolution (1 meter) overhead images representing a multitude of different downtown, suburban, commercial, agricultural and wooded exemplars. Various statistical properties relating to these image categories and their relationship are discussed. The categorical variations in power spectrum contour shapes, the unique gradient distribution characteristics of wooded categories, the similarity in edge co-occurrence statistics for overhead and natural images, and the unique edge co-occurrence statistics of downtown categories are presented in this work. Though previous work on natural image statistics has shown some of the unique characteristics for different categories, the relationships for overhead images are not well understood. The statistical properties of natural images were used in previous studies to develop prior image models, to predict and index objects in a scene and to improve computer vision models. The results from our research findings can be used to augment and adapt computer vision algorithms that rely on prior image statistics to process overhead images, calibrate the performance of overhead image analysis algorithms, and derive features for better discrimination of overhead image categories.
Understanding Undergraduate Statistical Anxiety
ERIC Educational Resources Information Center
McKim, Courtney
2014-01-01
The purpose of this study was to understand undergraduate students' views of statistics. Results reveal that students with less anxiety have a higher interest in statistics and also believe in their ability to perform well in the course. Also students who have a more positive attitude about the class tend to have a higher belief in their…
Croarkin, M. Carroll
2001-01-01
For more than 50 years, the Statistical Engineering Division (SED) has been instrumental in the success of a broad spectrum of metrology projects at NBS/NIST. This paper highlights fundamental contributions of NBS/NIST statisticians to statistics and to measurement science and technology. Published methods developed by SED staff, especially during the early years, endure as cornerstones of statistics not only in metrology and standards applications, but as data-analytic resources used across all disciplines. The history of statistics at NBS/NIST began with the formation of what is now the SED. Examples from the first five decades of the SED illustrate the critical role of the division in the successful resolution of a few of the highly visible, and sometimes controversial, statistical studies of national importance. A review of the history of major early publications of the division on statistical methods, design of experiments, and error analysis and uncertainty is followed by a survey of several thematic areas. The accompanying examples illustrate the importance of SED in the history of statistics, measurements and standards: calibration and measurement assurance, interlaboratory tests, development of measurement methods, Standard Reference Materials, statistical computing, and dissemination of measurement technology. A brief look forward sketches the expanding opportunity and demand for SED statisticians created by current trends in research and development at NIST.
ERIC Educational Resources Information Center
Hodgson, Ted; Andersen, Lyle; Robison-Cox, Jim; Jones, Clain
2004-01-01
Water quality experiments, especially the use of macroinvertebrates as indicators of water quality, offer an ideal context for connecting statistics and science. In the STAR program for secondary students and teachers, water quality experiments were also used as a context for teaching statistics. In this article, we trace one activity that uses…
ERIC Educational Resources Information Center
Council of Ontario Universities, Toronto.
Summary statistics on application and registration patterns of applicants wishing to pursue full-time study in first-year places in Ontario universities (for the fall of 1987) are given. Data on registrations were received indirectly from the universities as part of their annual submission of USIS/UAR enrollment data to Statistics Canada and MCU.…
Introduction to Statistical Physics
NASA Astrophysics Data System (ADS)
Casquilho, João Paulo; Ivo Cortez Teixeira, Paulo
2014-12-01
Preface; 1. Random walks; 2. Review of thermodynamics; 3. The postulates of statistical physics. Thermodynamic equilibrium; 4. Statistical thermodynamics – developments and applications; 5. The classical ideal gas; 6. The quantum ideal gas; 7. Magnetism; 8. The Ising model; 9. Liquid crystals; 10. Phase transitions and critical phenomena; 11. Irreversible processes; Appendixes; Index.
Reform in Statistical Education
ERIC Educational Resources Information Center
Huck, Schuyler W.
2007-01-01
Two questions are considered in this article: (a) What should professionals in school psychology do in an effort to stay current with developments in applied statistics? (b) What should they do with their existing knowledge to move from surface understanding of statistics to deep understanding? Written for school psychologists who have completed…
Statistical Mapping by Computer.
ERIC Educational Resources Information Center
Utano, Jack J.
The function of a statistical map is to provide readers with a visual impression of the data so that they may be able to identify any geographic characteristics of the displayed phenomena. The increasingly important role played by the computer in the production of statistical maps is manifested by the varied examples of computer maps in recent…
The purpose of the Disability Statistics Center is to produce and disseminate statistical information on disability and the status of people with disabilities in American society and to establish and monitor indicators of how conditions are changing over time to meet their health...
Lessons from Inferentialism for Statistics Education
ERIC Educational Resources Information Center
Bakker, Arthur; Derry, Jan
2011-01-01
This theoretical paper relates recent interest in informal statistical inference (ISI) to the semantic theory termed inferentialism, a significant development in contemporary philosophy, which places inference at the heart of human knowing. This theory assists epistemological reflection on challenges in statistics education encountered when…
NASA Astrophysics Data System (ADS)
Dunbar, P. K.; Furtney, M.; McLean, S. J.; Sweeney, A. D.
2014-12-01
Tsunamis have inflicted death and destruction on the coastlines of the world throughout history. The occurrence of tsunamis and the resulting effects have been collected and studied as far back as the second millennium B.C. The knowledge gained from cataloging and examining these events has led to significant changes in our understanding of tsunamis, tsunami sources, and methods to mitigate the effects of tsunamis. The most significant, not surprisingly, are often the most devastating, such as the 2011 Tohoku, Japan earthquake and tsunami. The goal of this poster is to give a brief overview of the occurrence of tsunamis and then focus specifically on several significant tsunamis. There are various criteria to determine the most significant tsunamis: the number of deaths, amount of damage, maximum runup height, impact on tsunami science or policy, etc. As a result, descriptions will include some of the most costly (2011 Tohoku, Japan), the most deadly (2004 Sumatra, 1883 Krakatau), and the highest runup ever observed (1958 Lituya Bay, Alaska). The discovery of the Cascadia subduction zone as the source of the 1700 Japanese "Orphan" tsunami and a future tsunami threat to the U.S. northwest coast contributed to the decision to form the U.S. National Tsunami Hazard Mitigation Program. The great Lisbon earthquake of 1755 marked the beginning of the modern era of seismology. Knowledge gained from the 1964 Alaska earthquake and tsunami helped confirm the theory of plate tectonics. The 1946 Alaska, 1952 Kuril Islands, 1960 Chile, 1964 Alaska, and 2004 Banda Aceh tsunamis all resulted in warning centers or systems being established. The data descriptions on this poster were extracted from NOAA's National Geophysical Data Center (NGDC) global historical tsunami database. Additional information about these tsunamis, as well as water level data, can be found by accessing the NGDC website www.ngdc.noaa.gov/hazard/
Ector, Hugo
2010-12-01
I still remember my first book on statistics: "Elementary statistics with applications in medicine and the biological sciences" by Frederick E. Croxton. For me, it was the start of my pursuit of understanding statistics in daily life and in medical practice. It was the first volume in a long line of books. In his introduction, Croxton claims that "nearly everyone involved in any aspect of medicine needs to have some knowledge of statistics". The reality is that for many clinicians, statistics are limited to a "P < 0.05 = ok". I do not blame my colleagues who omit the paragraph on statistical methods. They have never had the opportunity to learn concise and clear descriptions of the key features. I have experienced how some authors can describe difficult methods in readily understandable language. Others fail completely. As a teacher, I tell my students that life is impossible without a basic knowledge of statistics. This feeling has resulted in an annual seminar of 90 minutes. This tutorial is the summary of this seminar. It is a summary and a transcription of the best pages I have found. PMID:21302664
NASA Astrophysics Data System (ADS)
Cook, Samuel A.; Fukawa-Connelly, Timothy
2016-02-01
Studies have shown that at the end of an introductory statistics course, students struggle with building block concepts, such as mean and standard deviation, and rely on procedural understandings of the concepts. This study aims to investigate the understandings of introductory statistics held by entering freshmen of a department of mathematics and statistics (including mathematics education), students who are presumably better prepared in terms of mathematics and statistics than the average university student. This case study found that these students enter college with common statistical misunderstandings, lack of knowledge, and idiosyncratic collections of correct statistical knowledge. Moreover, they also have a wide range of beliefs about their knowledge, with some of the students who believe that they have the strongest knowledge also having significant misconceptions. More attention to these statistical building blocks may be required in a university introductory statistics course.
Predicting Success in Psychological Statistics Courses.
Lester, David
2016-06-01
Many students perform poorly in courses on psychological statistics, and it is useful to be able to predict which students will have difficulties. In a study of 93 undergraduates enrolled in Statistical Methods (18 men, 75 women; M age = 22.0 years, SD = 5.1), performance was significantly associated with sex (female students performed better) and proficiency in algebra in a linear regression analysis. Anxiety about statistics was not associated with course performance, indicating that basic mathematical skills are the best correlate of performance in statistics courses and can be used to stream students into classes by ability. PMID:27273557
Winters, Ryan; Winters, Andrew; Amedee, Ronald G.
2010-01-01
The Accreditation Council for Graduate Medical Education sets forth a number of required educational topics that must be addressed in residency and fellowship programs. We sought to provide a primer on some of the important basic statistical concepts to consider when examining the medical literature. It is not essential to understand the exact workings and methodology of every statistical test encountered, but it is necessary to understand selected concepts such as parametric and nonparametric tests, correlation, and numerical versus categorical data. This working knowledge will allow you to spot obvious irregularities in statistical analyses that you encounter. PMID:21603381
Statistics of football dynamics
NASA Astrophysics Data System (ADS)
Mendes, R. S.; Malacarne, L. C.; Anteneodo, C.
2007-06-01
We investigate the dynamics of football matches. Our goal is to characterize statistically the temporal sequence of ball movements in this collective sport game, searching for traits of complex behavior. Data were collected over a variety of matches in South American, European and World championships throughout 2005 and 2006. We show that the statistics of ball touches presents power-law tails and can be described by q-gamma distributions. To explain such behavior we propose a model that provides information on the characteristics of football dynamics. Furthermore, we discuss the statistics of duration of out-of-play intervals, not directly related to the previous scenario.
Hockey sticks, principal components, and spurious significance
NASA Astrophysics Data System (ADS)
McIntyre, Stephen; McKitrick, Ross
2005-02-01
The "hockey stick" shaped temperature reconstruction of Mann et al. (1998, 1999) has been widely applied. However it has not been previously noted in print that, prior to their principal components (PCs) analysis on tree ring networks, they carried out an unusual data transformation which strongly affects the resulting PCs. Their method, when tested on persistent red noise, nearly always produces a hockey stick shaped first principal component (PC1) and overstates the first eigenvalue. In the controversial 15th century period, the MBH98 method effectively selects only one species (bristlecone pine) into the critical North American PC1, making it implausible to describe it as the "dominant pattern of variance". Through Monte Carlo analysis, we show that MBH98 benchmarks for significance of the Reduction of Error (RE) statistic are substantially under-stated and, using a range of cross-validation statistics, we show that the MBH98 15th century reconstruction lacks statistical significance.
Petroleum statistics in France
De Saint Germain, H.; Lamiraux, C.
1995-08-01
Thirty-three oil companies, including Elf, Exxon, Agip, and Conoco as well as Coparex, Enron, Hadson, Midland, Hunt, Canyon, and Union Texas, are active in oil and gas exploration and production in France. The production of oil and gas in France amounts to some 60,000 bopd of oil and 350 MMcfpd of marketed natural gas each year, which still accounts for 3.5% and 10% of French domestic needs, respectively. To date, 166 fields have been discovered, representing a total reserve of 3 billion bbl of crude oil and 13 trillion cf of raw gas. These fields are concentrated in two major onshore sedimentary basins of Mesozoic age, the Aquitaine basin and the Paris basin. The Aquitaine basin should be subdivided into two distinct domains: the Parentis basin, where the largest field, Parentis, was discovered in 1954 and still produces about 3700 bopd of oil, and where the Les Arbousiers field, discovered at the end of 1991, is currently producing about 10,000 bopd of oil; and the northern Pyrenees and their foreland, where the Lacq field, discovered in 1951, has produced about 7.7 tcf of gas since 1957 and is still producing 138 MMcfpd. In the Paris basin, the two large oil fields are Villeperdue, discovered in 1982 by Triton and Total, and Chaunoy, discovered in 1983 by Essorep, which are still producing about 10,000 and 15,000 bopd, respectively. The last significantly sized discovery occurred in 1990 with Itteville by Elf Aquitaine, which is currently producing 4,200 bopd. The poster shows statistical data related to the past 20 years of oil and gas exploration and production in France.
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
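The two normal-distribution calculations described in this record can be sketched in a few lines of Python. This is an illustration only, not the toolset's spreadsheet logic: the function names value_at_probability and normal_from_two_points are hypothetical, and the sketch assumes the two supplied probabilities differ and are ordered consistently with the data points (otherwise the fitted standard deviation would be non-positive and NormalDist raises an error).

```python
from statistics import NormalDist


def value_at_probability(mean, sd, prob):
    """Return x such that P(X <= x) = prob for X ~ Normal(mean, sd)."""
    return NormalDist(mean, sd).inv_cdf(prob)


def normal_from_two_points(x1, p1, x2, p2):
    """Fit a normal distribution through two (value, cumulative probability) pairs.

    Uses x = mean + z * sd at both points, where z is the standard normal
    quantile of the given cumulative probability, and solves for mean and sd.
    """
    z1 = NormalDist().inv_cdf(p1)
    z2 = NormalDist().inv_cdf(p2)
    sd = (x2 - x1) / (z2 - z1)
    mean = x1 - z1 * sd
    return NormalDist(mean, sd)


if __name__ == "__main__":
    # 95th percentile of a normal distribution with mean 100 and sd 15
    print(value_at_probability(100.0, 15.0, 0.95))
    # Distribution passing through the median 10 and the 90th percentile 14
    fitted = normal_from_two_points(10.0, 0.5, 14.0, 0.9)
    print(fitted.mean, fitted.stdev)
```

Only the standard library is used, so the sketch runs without spreadsheet software or third-party packages.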
Playing at Statistical Mechanics
ERIC Educational Resources Information Center
Clark, Paul M.; And Others
1974-01-01
Discussed are the applications of counting techniques of a sorting game to distributions and concepts in statistical mechanics. Included are the following distributions: Fermi-Dirac, Bose-Einstein, and most probable. (RH)
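As a rough illustration of the counting involved, the standard textbook formulas for placing n identical particles into g single-particle states can be written as below. The record does not describe the rules of the sorting game itself, so this is a generic sketch and the function names are hypothetical.

```python
from math import comb


def fermi_dirac_count(n, g):
    """Ways to place n identical particles in g states, at most one per state."""
    return comb(g, n)


def bose_einstein_count(n, g):
    """Ways to place n identical particles in g states with no occupancy limit."""
    return comb(n + g - 1, n)


if __name__ == "__main__":
    for n, g in [(2, 3), (3, 5)]:
        print(n, g, fermi_dirac_count(n, g), bose_einstein_count(n, g))
```

For two particles in three states, the three Fermi-Dirac arrangements are the three ways of choosing which two states are occupied; Bose-Einstein counting adds the three arrangements in which both particles share a state.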
Cooperative Learning in Statistics.
ERIC Educational Resources Information Center
Keeler, Carolyn M.; And Others
1994-01-01
Formal use of cooperative learning techniques proved effective in improving student performance and retention in a freshman level statistics course. Lectures interspersed with group activities proved effective in increasing conceptual understanding and overall class performance. (11 references) (Author)
Understanding Solar Flare Statistics
NASA Astrophysics Data System (ADS)
Wheatland, M. S.
2005-12-01
A review is presented of work aimed at understanding solar flare statistics, with emphasis on the well known flare power-law size distribution. Although avalanche models are perhaps the favoured model to describe flare statistics, their physical basis is unclear, and they are divorced from developing ideas in large-scale reconnection theory. An alternative model, aimed at reconciling large-scale reconnection models with solar flare statistics, is revisited. The solar flare waiting-time distribution has also attracted recent attention. Observed waiting-time distributions are described, together with what they might tell us about the flare phenomenon. Finally, a practical application of flare statistics to flare prediction is described in detail, including the results of a year of automated (web-based) predictions from the method.
Titanic: A Statistical Exploration.
ERIC Educational Resources Information Center
Takis, Sandra L.
1999-01-01
Uses the available data about the Titanic's passengers to interest students in exploring categorical data and the chi-square distribution. Describes activities incorporated into a statistics class and gives additional resources for collecting information about the Titanic. (ASK)
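A chi-square test of independence on categorical data of this kind can be sketched as follows. The counts in the example table are hypothetical placeholders, not actual Titanic figures, and the function names are illustrative. The p-value helper is valid only for one degree of freedom (a 2 x 2 table), where the chi-square tail probability reduces to a complementary error function.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts.

    `table` is a list of rows; every row and column total must be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical counts: rows = two passenger groups, columns = survived / died
    observed = [[30, 10], [20, 40]]
    stat, df = chi_square_independence(observed)
    print(stat, df, p_value_df1(stat))
```

For tables larger than 2 x 2, the statistic and degrees of freedom are computed the same way, but the p-value would need a general chi-square distribution function from a statistics library.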
... and Statistics Plague in the United States Plague was first introduced ... per year in the United States: 1900-2012. Plague Worldwide Plague epidemics have occurred in Africa, Asia, ...
NASA Astrophysics Data System (ADS)
Grégoire, G.
2016-05-01
This chapter is devoted to two objectives. The first one is to answer the request expressed by attendees of the first Astrostatistics School (Annecy, October 2013) to be provided with an elementary vademecum of statistics that would facilitate understanding of the given courses. In this spirit we recall very basic notions, that is definitions and properties that we think sufficient to benefit from courses given in the Astrostatistical School. Thus we give briefly definitions and elementary properties on random variables and vectors, distributions, estimation and tests, maximum likelihood methodology. We intend to present basic ideas in a hopefully comprehensible way. We do not try to give a rigorous presentation, and due to the place devoted to this chapter, can cover only a rather limited field of statistics. The second aim is to focus on some statistical tools that are useful in classification: basic introduction to Bayesian statistics, maximum likelihood methodology, Gaussian vectors and Gaussian mixture models.
Tuberculosis Data and Statistics
... United States publication. PDF [6 MB] Interactive TB Data Tool Online Tuberculosis Information System (OTIS) OTIS is ...
NASA Astrophysics Data System (ADS)
Richfield, Jon; bookfeller
2016-07-01
In reply to Ralph Kenna and Pádraig Mac Carron's feature article “Maths meets myths” in which they describe how they are using techniques from statistical physics to characterize the societies depicted in ancient Icelandic sagas.
... facts and statistics here include brain and central nervous system tumors (including spinal cord, pituitary and pineal gland ... U.S. living with a primary brain and central nervous system tumor. This year, nearly 17,000 people will ...
Purposeful Statistical Investigations
ERIC Educational Resources Information Center
Day, Lorraine
2014-01-01
Lorraine Day provides us with a great range of statistical investigations using various resources such as maths300 and TinkerPlots. Each of the investigations link mathematics to students' lives and provide engaging and meaningful contexts for mathematical inquiry.
Oakland, J.S.
1986-01-01
Addressing the increasing importance for firms to have a thorough knowledge of statistically based quality control procedures, this book presents the fundamentals of statistical process control (SPC) in a non-mathematical, practical way. It provides real-life examples and data drawn from a wide variety of industries. The foundations of good quality management and process control, and control of conformance and consistency during production are given. Offers clear guidance to those who wish to understand and implement modern SPC techniques.
Statistical Physics of Particles
NASA Astrophysics Data System (ADS)
Kardar, Mehran
2006-06-01
Statistical physics has its origins in attempts to describe the thermal properties of matter in terms of its constituent particles, and has played a fundamental role in the development of quantum mechanics. Based on lectures for a course in statistical mechanics taught by Professor Kardar at Massachusetts Institute of Technology, this textbook introduces the central concepts and tools of statistical physics. It contains a chapter on probability and related issues such as the central limit theorem and information theory, and covers interacting particles, with an extensive description of the van der Waals equation and its derivation by mean field approximation. It also contains an integrated set of problems, with solutions to selected problems at the end of the book. It will be invaluable for graduate and advanced undergraduate courses in statistical physics. A complete set of solutions is available to lecturers on a password protected website at www.cambridge.org/9780521873420. Based on lecture notes from a course on Statistical Mechanics taught by the author at MIT. Contains 89 exercises, with solutions to selected problems. Contains chapters on probability and interacting particles. Ideal for graduate courses in Statistical Mechanics.
NASA Astrophysics Data System (ADS)
Kardar, Mehran
2006-06-01
While many scientists are familiar with fractals, fewer are familiar with the concepts of scale-invariance and universality which underlie the ubiquity of their shapes. These properties may emerge from the collective behaviour of simple fundamental constituents, and are studied using statistical field theories. Based on lectures for a course in statistical mechanics taught by Professor Kardar at Massachusetts Institute of Technology, this textbook demonstrates how such theories are formulated and studied. Perturbation theory, exact solutions, renormalization groups, and other tools are employed to demonstrate the emergence of scale invariance and universality, and the non-equilibrium dynamics of interfaces and directed paths in random media are discussed. Ideal for advanced graduate courses in statistical physics, it contains an integrated set of problems, with solutions to selected problems at the end of the book. A complete set of solutions is available to lecturers on a password protected website at www.cambridge.org/9780521873413. Based on lecture notes from a course on Statistical Mechanics taught by the author at MIT. Contains 65 exercises, with solutions to selected problems. Features a thorough introduction to the methods of Statistical Field theory. Ideal for graduate courses in Statistical Physics.
Anthropological significance of phenylketonuria.
Saugstad, L F
1975-01-01
The highest incidence rates of phenylketonuria (PKU) have been observed in Ireland and Scotland. Parents heterozygous for PKU in Norway differ significantly from the general population in the Rhesus, Kell and PGM systems. The parents investigated showed an excess of Rh negative, Kell plus and PGM type 1 individuals, which makes them similar to the present populations in Ireland and Scotland. It is postulated that the heterozygotes for PKU in Norway are descended from a completely assimilated sub-population of Celtic origin, who came or were brought here 1000 years ago. Bronze objects of Western European (Scottish, Irish) origin, found in Viking graves widely distributed in Norway, have been taken as evidence of Vikings returning with loot (including a number of Celts) from Western Viking settlements. The continuity of residence since the Viking age in most habitable parts of Norway, and what seems to be a nearly complete regional relationship between the sites where Viking graves contain western imported objects and the birthplaces of grandparents of PKUs identified in Norway, lend further support to the hypothesis that the heterozygotes for PKU in Norway are descended from a completely assimilated subpopulation. The remarkable resemblance between Iceland and Ireland, in respect of several genetic markers (including the Rhesus, PGM and Kell systems), is considered to be an expression of a similar proportion of people of Celtic origin in each of the two countries. Their identical, high incidence rates of PKU are regarded as further evidence of this. The significant decline in the incidence of PKU when one passes from Ireland, Scotland and Iceland, to Denmark and on to Norway and Sweden, is therefore explained as being related to a reduction in the proportion of inhabitants of Celtic extraction in the respective populations. PMID:803884
Josse, Florent; Lefebvre, Yannick; Todeschini, Patrick; Turato, Silvia; Meister, Eric
2006-07-01
Assessing the structural integrity of a nuclear Reactor Pressure Vessel (RPV) subjected to pressurized-thermal-shock (PTS) transients is extremely important to safety. In addition to conventional deterministic calculations to confirm RPV integrity, Electricite de France (EDF) carries out probabilistic analyses. Probabilistic analyses are interesting because some key variables, albeit conventionally taken at conservative values, can be modeled more accurately through statistical variability. One variable which significantly affects RPV structural integrity assessment is cleavage fracture initiation toughness. The reference fracture toughness method currently in use at EDF is the RCCM and ASME Code lower-bound K_IC based on the indexing parameter RT_NDT. However, in order to quantify the toughness scatter for probabilistic analyses, the master curve method is being analyzed at present. Furthermore, the master curve method is a direct means of evaluating fracture toughness based on K_JC data. In the framework of the master curve investigation undertaken by EDF, this article deals with the following two statistical items: building a master curve from an extract of a fracture toughness dataset (from the European project 'Unified Reference Fracture Toughness Design curves for RPV Steels') and controlling statistical uncertainty for both mono-temperature and multi-temperature tests. Concerning the first point, master curve temperature dependence is empirical in nature. To determine the 'original' master curve, Wallin postulated that a unified description of fracture toughness temperature dependence for ferritic steels is possible, and used a large number of data corresponding to nuclear-grade pressure vessel steels and welds. Our working hypothesis is that some ferritic steels may behave in slightly different ways. Therefore we focused exclusively on the basic French reactor vessel metal of types A508 Class 3 and A 533 grade B Class 1, taking the sampling
Sanabria, Federico; Killeen, Peter R.
2008-01-01
Despite being under challenge for the past 50 years, null hypothesis significance testing (NHST) remains dominant in the scientific field for want of viable alternatives. NHST, along with its significance level p, is inadequate for most of the uses to which it is put, a flaw that is of particular interest to educational practitioners who too often must use it to sanctify their research. In this article, we review the failure of NHST and propose p_rep, the probability of replicating an effect, as a more useful statistic for evaluating research and aiding practical decision making. PMID:19122766
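The abstract does not state how the probability of replication is computed. As a hedged sketch, the code below assumes the commonly cited normal-approximation form, in which the z-score corresponding to a one-tailed p-value is divided by the square root of 2 (the replication doubles the sampling variance) and converted back to a probability. The function name and the choice of a one-tailed input are assumptions made for illustration.

```python
from statistics import NormalDist


def p_rep(p_one_tailed):
    """Approximate probability that a replication finds an effect of the same sign.

    Assumes the normal-approximation form Phi(z / sqrt(2)), where z is the
    standard normal quantile of 1 - p and p is a one-tailed p-value in (0, 1).
    """
    z = NormalDist().inv_cdf(1.0 - p_one_tailed)
    return NormalDist().cdf(z / 2 ** 0.5)


if __name__ == "__main__":
    for p in (0.5, 0.05, 0.01):
        print(p, round(p_rep(p), 3))
```

Under this assumed form, a p-value of 0.5 (no evidence either way) maps to a replication probability of 0.5, and smaller p-values map to values closer to 1.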
Statistical learning and selective inference
Taylor, Jonathan; Tibshirani, Robert J.
2015-01-01
We describe the problem of “selective inference.” This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have “cherry-picked”—searched for the strongest associations—means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis. PMID:26100887
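The need for a higher bar after searching can be illustrated with a simple calculation: if m independent null associations are each tested at level alpha and the strongest is reported, the chance of at least one false positive grows quickly with m. The sketch below uses the Bonferroni correction as a basic remedy. That is a stand-in for illustration, not one of the selective-inference methods the abstract refers to, and the function names are hypothetical.

```python
def familywise_error_rate(m, alpha):
    """Probability of at least one false positive among m independent null tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(m, alpha):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 20, 1000):
        naive = familywise_error_rate(m, alpha)
        corrected = familywise_error_rate(m, bonferroni_threshold(m, alpha))
        print(m, round(naive, 4), round(corrected, 4))
```

Bonferroni is conservative and ignores how the selection was made; the procedures described in the abstract instead condition on the selection event itself.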
Significant Radionuclides Determination
Jo A. Ziegler
2001-07-31
The purpose of this calculation is to identify radionuclides that are significant to offsite doses from potential preclosure events for spent nuclear fuel (SNF) and high-level radioactive waste expected to be received at the potential Monitored Geologic Repository (MGR). In this calculation, high-level radioactive waste is included in references to DOE SNF. A previous document, "DOE SNF DBE Offsite Dose Calculations" (CRWMS M&O 1999b), calculated the source terms and offsite doses for Department of Energy (DOE) and Naval SNF for use in design basis event analyses. This calculation reproduces only DOE SNF work (i.e., no naval SNF work is included in this calculation) created in "DOE SNF DBE Offsite Dose Calculations" and expands the calculation to include DOE SNF expected to produce a high dose consequence (even though the quantity of the SNF is expected to be small) and SNF owned by commercial nuclear power producers. The calculation does not address any specific off-normal/DBE event scenarios for receiving, handling, or packaging of SNF. The results of this calculation are developed for comparative analysis to establish the important radionuclides and do not represent the final source terms to be used for license application. This calculation will be used as input to preclosure safety analyses and is performed in accordance with procedure AP-3.12Q, "Calculations", and is subject to the requirements of DOE/RW-0333P, "Quality Assurance Requirements and Description" (DOE 2000) as determined by the activity evaluation contained in "Technical Work Plan for: Preclosure Safety Analysis, TWP-MGR-SE-000010" (CRWMS M&O 2000b) in accordance with procedure AP-2.21Q, "Quality Determinations and Planning for Scientific, Engineering, and Regulatory Compliance Activities".
Fungi producing significant mycotoxins.
2012-01-01
Mycotoxins are secondary metabolites of microfungi that are known to cause sickness or death in humans or animals. Although many such toxic metabolites are known, it is generally agreed that only a few are significant in causing disease: aflatoxins, fumonisins, ochratoxin A, deoxynivalenol, zearalenone, and ergot alkaloids. These toxins are produced by just a few species from the common genera Aspergillus, Penicillium, Fusarium, and Claviceps. All Aspergillus and Penicillium species either are commensals, growing in crops without obvious signs of pathogenicity, or invade crops after harvest and produce toxins during drying and storage. In contrast, the important Fusarium and Claviceps species infect crops before harvest. The most important Aspergillus species, occurring in warmer climates, are A. flavus and A. parasiticus, which produce aflatoxins in maize, groundnuts, tree nuts, and, less frequently, other commodities. The main ochratoxin A producers, A. ochraceus and A. carbonarius, commonly occur in grapes, dried vine fruits, wine, and coffee. Penicillium verrucosum also produces ochratoxin A but occurs only in cool temperate climates, where it infects small grains. F. verticillioides is ubiquitous in maize, with an endophytic nature, and produces fumonisins, which are generally more prevalent when crops are under drought stress or suffer excessive insect damage. It has recently been shown that Aspergillus niger also produces fumonisins, and several commodities may be affected. F. graminearum, which is the major producer of deoxynivalenol and zearalenone, is pathogenic on maize, wheat, and barley and produces these toxins whenever it infects these grains before harvest. Also included is a short section on Claviceps purpurea, which produces sclerotia among the seeds in grasses, including wheat, barley, and triticale. The main thrust of the chapter contains information on the identification of these fungi and their morphological characteristics, as well as factors
Statistical Physics of Fracture
Alava, Mikko; Nukala, Phani K; Zapperi, Stefano
2006-05-01
Disorder and long-range interactions are two of the key components that make material failure an interesting playfield for the application of statistical mechanics. The cornerstone in this respect has been lattice models of fracture, in which a network of elastic beams, bonds, or electrical fuses with random failure thresholds is subjected to an increasing external load. These models describe on a qualitative level the failure processes of real, brittle, or quasi-brittle materials. This has been particularly important in solving the classical engineering problems of material strength: the size dependence of maximum stress and its sample-to-sample statistical fluctuations. At the same time, lattice models pose many new fundamental questions in statistical physics, such as the relation between fracture and phase transitions. Experimental results point to the existence of an intriguing crackling noise in the acoustic emission and of self-affine fractals in the crack surface morphology. Recent advances in computer power have enabled considerable progress in the understanding of such models. Among the issues addressed, some of them still controversial, are the scaling and size effects in material strength and accumulated damage, the statistics of avalanches or bursts of microfailures, and the morphology of the crack surface. Here we present an overview of the results obtained with lattice models for fracture, highlighting the relations with statistical physics theories and more conventional fracture mechanics approaches.
Statistical Downscaling: Lessons Learned
NASA Astrophysics Data System (ADS)
Walton, D.; Hall, A. D.; Sun, F.
2013-12-01
In this study, we examine ways to improve statistical downscaling of general circulation model (GCM) output. Why do we downscale GCM output? GCMs have low resolution, so they cannot represent local dynamics and topographic effects that cause spatial heterogeneity in the regional climate change signal. Statistical downscaling recovers fine-scale information by utilizing relationships between the large-scale and fine-scale signals to bridge this gap. In theory, the downscaled climate change signal is more credible and accurate than its GCM counterpart, but in practice, there may be little improvement. Here, we tackle the practical problems that arise in statistical downscaling, using temperature change over the Los Angeles region as a test case. This region is an ideal place to apply downscaling since its complex topography and shoreline are poorly simulated by GCMs. By comparing two popular statistical downscaling methods and one dynamical downscaling method, we identify issues with statistically downscaled climate change signals and develop ways to fix them. We focus on scale mismatch, domain of influence, and other problems - many of which users may be unaware of - and discuss practical solutions.
Statistical properties of Fourier-based time-lag estimates
NASA Astrophysics Data System (ADS)
Epitropakis, A.; Papadakis, I. E.
2016-06-01
observed time series; b) smoothing of the cross-periodogram should be avoided, as this may introduce significant bias to the time-lag estimates, which can be taken into account by assuming a model cross-spectrum (and not just a model time-lag spectrum); c) time-lags should be estimated by dividing observed time series into a number, say m, of shorter data segments and averaging the resulting cross-periodograms; d) if the data segments have a duration ≳ 20 ks, the time-lag bias is ≲15% of its intrinsic value for the model cross-spectra and power-spectra considered in this work. This bias should be estimated in practise (by considering possible intrinsic cross-spectra that may be applicable to the time-lag spectra at hand) to assess the reliability of any time-lag analysis; e) the effects of experimental noise can be minimised by only estimating time-lags in the frequency range where the sample coherence is larger than 1.2/(1 + 0.2m). In this range, the amplitude of noise variations caused by measurement errors is smaller than the amplitude of the signal's intrinsic variations. As long as m ≳ 20, time-lags estimated by averaging over individual data segments have analytical error estimates that are within 95% of the true scatter around their mean, and their distribution is similar, albeit not identical, to a Gaussian.
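Points c) and e) of the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `segment_averaged_lags`, its arguments, and the sign convention (a positive lag means the second series trails the first) are assumptions made here. The only quantity taken from the abstract is the coherence threshold 1.2/(1 + 0.2m).

```python
import numpy as np

def segment_averaged_lags(x, y, dt, m):
    """Estimate time-lags by averaging cross-periodograms over m segments.

    Hypothetical helper: x and y are evenly sampled series of equal length,
    dt is the sampling interval, m is the number of segments.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_seg = len(x) // m                      # samples per segment
    xs = x[: n_seg * m].reshape(m, n_seg)
    ys = y[: n_seg * m].reshape(m, n_seg)

    # Remove each segment's mean, then drop the zero-frequency bin.
    xs = xs - xs.mean(axis=1, keepdims=True)
    ys = ys - ys.mean(axis=1, keepdims=True)
    X = np.fft.rfft(xs, axis=1)[:, 1:]
    Y = np.fft.rfft(ys, axis=1)[:, 1:]
    freqs = np.fft.rfftfreq(n_seg, d=dt)[1:]

    # Average the (unsmoothed) cross-periodograms and periodograms.
    cross = (X * np.conj(Y)).mean(axis=0)
    pxx = (np.abs(X) ** 2).mean(axis=0)
    pyy = (np.abs(Y) ** 2).mean(axis=0)

    coherence = np.abs(cross) ** 2 / (pxx * pyy)
    lags = np.angle(cross) / (2.0 * np.pi * freqs)

    # Keep only frequencies above the coherence threshold quoted above.
    reliable = coherence > 1.2 / (1.0 + 0.2 * m)
    return freqs, lags, coherence, reliable
```

Because the lag is derived from a phase, it is only unambiguous at frequencies below 1/(2 × lag); at higher frequencies the phase wraps.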
Innovative trend significance test and applications
NASA Astrophysics Data System (ADS)
Şen, Zekai
2015-11-01
Hydro-climatological time series may embed characteristics of past changes in climate variability in the form of shifts, cyclic fluctuations, and, more significantly, trends. Identification of such features from the available records is one of the prime tasks of hydrologists, climatologists, applied statisticians, or experts in related topics. Although there are different trend identification and significance tests in the literature, they require restrictive assumptions, which may not hold for hydro-climatological time series. In this paper, an innovative method for trend identification with a statistical significance test is suggested. The method has a non-parametric basis without any restrictive assumption, and its application is rather simple, relying on comparisons of sub-series extracted from the main time series. The method allows the selection of sub-temporal half periods for the comparison and, finally, identifies trends in an objective and quantitative manner. The necessary statistical equations are derived for innovative trend identification and statistical significance test application. The proposed methodology is applied to three time series from different parts of the world: Southern New Jersey annual temperature, Danube River annual discharge, and Tigris River Diyarbakir meteorology station annual total rainfall records. Each record has a significant trend, increasing in the New Jersey case and decreasing in the other two cases.
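The half-period comparison described in this abstract can be illustrated roughly as follows. The abstract does not give the paper's equations, so this is only a sketch under stated assumptions: the function name `half_series_trend`, the choice to drop the final observation when the record length is odd, and the slope indicator 2(mean of second half − mean of first half)/n are illustrative choices. No significance test is included.

```python
import numpy as np

def half_series_trend(values):
    """Compare the sorted first and second halves of a time series.

    Hypothetical helper; returns the two sorted halves, a slope-like
    indicator per time step, and the fraction of paired points in which
    the second half exceeds the first.
    """
    y = np.asarray(values, dtype=float)
    n = len(y) - (len(y) % 2)          # assumption: drop last point if odd
    y = y[:n]

    first = np.sort(y[: n // 2])       # earlier half, ascending
    second = np.sort(y[n // 2:])       # later half, ascending

    # For an exactly linear series this indicator equals the true slope.
    slope = 2.0 * (second.mean() - first.mean()) / n

    # Share of sorted pairs lying above the 1:1 (no-trend) line.
    frac_above = np.mean(second > first)
    return first, second, slope, frac_above
```

Plotting `second` against `first` gives a scatter in which points above the 1:1 line suggest an increasing trend and points below it a decreasing one.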
Perception in statistical graphics
NASA Astrophysics Data System (ADS)
VanderPlas, Susan Ruth
There has been quite a bit of research on statistical graphics and visualization, generally focused on new types of graphics, new software to create graphics, interactivity, and usability studies. Our ability to interpret and use statistical graphics hinges on the interface between the graph itself and the brain that perceives and interprets it, and there is substantially less research on the interplay between graph, eye, brain, and mind than is sufficient to understand the nature of these relationships. The goal of the work presented here is to further explore the interplay between a static graph, the translation of that graph from paper to mental representation (the journey from eye to brain), and the mental processes that operate on that graph once it is transferred into memory (mind). Understanding the perception of statistical graphics should allow researchers to create more effective graphs which produce fewer distortions and viewer errors while reducing the cognitive load necessary to understand the information presented in the graph. Taken together, these experiments should lay a foundation for exploring the perception of statistical graphics. There has been considerable research into the accuracy of numerical judgments viewers make from graphs, and these studies are useful, but it is more effective to understand how errors in these judgments occur so that the root cause of the error can be addressed directly. Understanding how visual reasoning relates to the ability to make judgments from graphs allows us to tailor graphics to particular target audiences. In addition, understanding the hierarchy of salient features in statistical graphics allows us to clearly communicate the important message from data or statistical models by constructing graphics which are designed specifically for the perceptual system.
The functional significance of stereopsis.
O'Connor, Anna R; Birch, Eileen E; Anderson, Susan; Draper, Hayley
2010-04-01
Purpose. Development or restoration of binocular vision is one of the key goals of strabismus management; however, the functional impact of stereoacuity has largely been neglected. Methods. Subjects aged 10 to 30 years with normal, reduced, or nil stereoacuity performed three tasks: Purdue pegboard (measured how many pegs placed in 30 seconds), bead threading (with two sizes of bead, to increase the difficulty; measured time taken to thread a number of beads), and water pouring (measured both accuracy and time). All tests were undertaken both with and without occlusion of one eye. Results. One hundred forty-three subjects were recruited, 32.9% (n = 47) with a manifest deviation. Performances on the pegboard and bead tasks were significantly worse in the nil stereoacuity group when compared with that of the normal stereoacuity group. On the large and small bead tasks, those with reduced stereoacuity were better than those with nil stereoacuity (when the Preschool Randot Stereoacuity Test [Stereo Optical Co, Inc., Chicago, IL] results were used to determine stereoacuity levels). Comparison of the short-term monocular conditions (those with normal stereoacuity but occluded) with nil stereoacuity showed that, on all measures, the performance was best in the nil stereoacuity group and was statistically significant for the large and small beads task, irrespective of which test result was used to define the stereoacuity levels. Conclusions. Performance on motor skills tasks was related to stereoacuity, with subjects with normal stereoacuity performing best on all tests. This quantifiable degradation in performance on some motor skill tasks supports the need to implement management strategies to maximize development of high-grade stereoacuity. PMID:19933184
Statistical aspects of solar flares
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
1987-01-01
A survey of the statistical properties of 850 H alpha solar flares during 1975 is presented. The results found here are compared with those reported elsewhere for different epochs. Distributions of rise time, decay time, and duration are given, as are the mean, mode, median, and 90th percentile values. Proportions by selected groupings are also determined. For flares in general, mean values for rise time, decay time, and duration are 5.2 ± 0.4 min, and 18.1 ± 1.1 min, respectively. Subflares, accounting for nearly 90 percent of the flares, had mean values lower than those found for flares of H alpha importance greater than 1, and the differences are statistically significant. Likewise, flares of bright and normal relative brightness have mean values of decay time and duration that are significantly longer than those computed for faint flares, and mass-motion related flares are significantly longer than non-mass-motion related flares. Seventy-three percent of the mass-motion related flares are categorized as being a two-ribbon flare and/or being accompanied by a high-speed dark filament. Slow rise time flares (rise time greater than 5 min) have a mean value for duration that is significantly longer than that computed for fast rise time flares, and long-lived duration flares (duration greater than 18 min) have a mean value for rise time that is significantly longer than that computed for short-lived duration flares, suggesting a positive linear relationship between rise time and duration for flares. Monthly occurrence rates for flares in general and by group are found to be linearly related in a positive sense to monthly sunspot number. Statistical testing reveals the association between sunspot number and numbers of flares to be significant at the 95 percent level of confidence, and the t statistic for slope is significant at greater than 99 percent level of confidence. Dependent upon the specific fit, between 58 percent and 94 percent of
Mixed-effects statistical model for comparative LC-MS proteomics studies.
Daly, D S; Anderson, K K; Panisko, E A; Purvine, S O; Fang, R; Monroe, M E; Baker, S E
2008-03-01
Comparing a protein's concentrations across two or more treatments is the focus of many proteomics studies. A frequent source of measurements for these comparisons is a mass spectrometry (MS) analysis of a protein's peptide ions separated by liquid chromatography (LC) following its enzymatic digestion. Alas, LC-MS identification and quantification of equimolar peptides can vary significantly due to their unequal digestion, separation, and ionization. This unequal measurability of peptides, the largest source of LC-MS nuisance variation, stymies confident comparison of a protein's concentration across treatments. Our objective is to introduce a mixed-effects statistical model for comparative LC-MS proteomics studies. We describe LC-MS peptide abundance with a linear model featuring pivotal terms that account for unequal peptide LC-MS measurability. We advance fitting this model to an often incomplete LC-MS data set with REstricted Maximum Likelihood (REML) estimation, producing estimates of model goodness-of-fit, treatment effects, standard errors, confidence intervals, and protein relative concentrations. We illustrate the model with an experiment featuring a known dilution series of a filamentous ascomycete fungus Trichoderma reesei protein mixture. For 781 of the 1546 T. reesei proteins with sufficient data coverage, the fitted mixed-effects models capably described the LC-MS measurements. The LC-MS measurability terms effectively accounted for this major source of uncertainty. Ninety percent of the relative concentration estimates were within 0.5-fold of the true relative concentrations. Akin to the common ratio method, this model also produced biased estimates, albeit less biased. Bias decreased significantly, both absolutely and relative to the ratio method, as the number of observed peptides per protein increased. Mixed-effects statistical modeling offers a flexible, well-established methodology for comparative proteomics studies integrating common
A mixed-effects Statistical Model for Comparative LC-MS Proteomics Studies
Daly, Don S.; Anderson, Kevin K.; Panisko, Ellen A.; Purvine, Samuel O.; Fang, Ruihua; Monroe, Matthew E.; Baker, Scott E.
2008-03-01
Comparing a protein’s concentrations across two or more treatments is the focus of many proteomics studies. A frequent source of measurements for these comparisons is a mass spectrometry (MS) analysis of a protein’s peptide ions separated by liquid chromatography (LC) following its enzymatic digestion. Alas, LC-MS identification and quantification of equimolar peptides can vary significantly due to their unequal digestion, separation and ionization. This unequal measurability of peptides, the largest source of LC-MS nuisance variation, stymies confident comparison of a protein’s concentration across treatments. Our objective is to introduce a mixed-effects statistical model for comparative LC-MS proteomics studies. We describe LC-MS peptide abundance with a linear model featuring pivotal terms that account for unequal peptide LC-MS measurability. We advance fitting this model to an often incomplete LC-MS dataset with REstricted Maximum Likelihood (REML) estimation, producing estimates of model goodness-of-fit, treatment effects, standard errors, confidence intervals, and protein relative concentrations. We illustrate the model with an experiment featuring a known dilution series of a filamentous ascomycete fungus Trichoderma reesei protein mixture. For the 781 of 1546 T. reesei proteins with sufficient data coverage, the fitted mixed-effects models capably described the LC-MS measurements. The LC-MS measurability terms effectively accounted for this major source of uncertainty. Ninety percent of the relative concentration estimates were within 1/2 fold of the true relative concentrations. Akin to the common ratio method, this model also produced biased estimates, albeit less biased. Bias decreased significantly, both absolutely and relative to the ratio method, as the number of observed peptides per protein increased. Mixed-effects statistical modeling offers a flexible, well-established methodology for comparative proteomics studies integrating common
Analogies for Understanding Statistics
ERIC Educational Resources Information Center
Hocquette, Jean-Francois
2004-01-01
This article describes a simple way to explain the limitations of statistics to scientists and students to avoid the publication of misleading conclusions. Biologists examine their results extremely critically and carefully choose the appropriate analytic methods depending on their scientific objectives. However, no such close attention is usually…
Statistical methods in microbiology.
Ilstrup, D M
1990-01-01
Statistical methodology is viewed by the average laboratory scientist, or physician, sometimes with fear and trepidation, occasionally with loathing, and seldom with fondness. Statistics may never be loved by the medical community, but it does not have to be hated by them. It is true that statistical science is sometimes highly mathematical, always philosophical, and occasionally obtuse, but for the majority of medical studies it can be made palatable. The goal of this article has been to outline a finite set of methods of analysis that investigators should choose based on the nature of the variable being studied and the design of the experiment. The reader is encouraged to seek the advice of a professional statistician when there is any doubt about the appropriate method of analysis. A statistician can also help the investigator with problems that have nothing to do with statistical tests, such as quality control, choice of response variable and comparison groups, randomization, and blinding of assessment of response variables. PMID:2200604
Statistical Energy Analysis Program
NASA Technical Reports Server (NTRS)
Ferebee, R. C.; Trudell, R. W.; Yano, L. I.; Nygaard, S. I.
1985-01-01
Statistical Energy Analysis (SEA) is powerful tool for estimating high-frequency vibration spectra of complex structural systems and incorporated into computer program. Basic SEA analysis procedure divided into three steps: Idealization, parameter generation, and problem solution. SEA computer program written in FORTRAN V for batch execution.
Education Statistics Quarterly, 2003.
ERIC Educational Resources Information Center
Marenus, Barbara; Burns, Shelley; Fowler, William; Greene, Wilma; Knepper, Paula; Kolstad, Andrew; McMillen Seastrom, Marilyn; Scott, Leslie
2003-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released in a 3-month period. Each issue also contains a message from the NCES on a timely…
Spitball Scatterplots in Statistics
ERIC Educational Resources Information Center
Wagaman, John C.
2012-01-01
This paper describes an active learning idea that I have used in my applied statistics class as a first lesson in correlation and regression. Students propel spitballs from various standing distances from the target and use the recorded data to determine if the spitball accuracy is associated with standing distance and review the algebra of lines…
Juvenile Court Statistics - 1972.
ERIC Educational Resources Information Center
Office of Youth Development (DHEW), Washington, DC.
This report is a statistical study of juvenile court cases in 1972. The data demonstrate how the court is frequently utilized in dealing with juvenile delinquency by the police as well as by other community agencies and parents. Excluded from this report are the ordinary traffic cases handled by juvenile court. The data indicate that: (1) in…
Library Research and Statistics.
ERIC Educational Resources Information Center
Lynch, Mary Jo; St. Lifer, Evan; Halstead, Kent; Fox, Bette-Lee; Miller, Marilyn L.; Shontz, Marilyn L.
2001-01-01
These nine articles discuss research and statistics on libraries and librarianship, including libraries in the United States, Canada, and Mexico; acquisition expenditures in public, academic, special, and government libraries; price indexes; state rankings of public library data; library buildings; expenditures in school library media centers; and…
Foundations of Statistical Seismology
NASA Astrophysics Data System (ADS)
Vere-Jones, David
2010-06-01
A brief account is given of the principles of stochastic modelling in seismology, with special regard to the role and development of stochastic models for seismicity. Stochastic models are seen as arising in a hierarchy of roles in seismology, as in other scientific disciplines. At their simplest, they provide a convenient descriptive tool for summarizing data patterns; in engineering and other applications, they provide a practical way of bridging the gap between the detailed modelling of a complex system, and the need to fit models to limited data; at the most fundamental level they arise as a basic component in the modelling of earthquake phenomena, analogous to that of stochastic models in statistical mechanics or turbulence theory. As an emerging subdiscipline, statistical seismology includes elements of all of these. The scope for the development of stochastic models depends crucially on the quantity and quality of the available data. The availability of extensive, high-quality catalogues and other relevant data lies behind the recent explosion of interest in statistical seismology. At just such a stage, it seems important to review the underlying principles on which statistical modelling is based, and that is the main purpose of the present paper.
Graduate Statistics: Student Attitudes
ERIC Educational Resources Information Center
Kennedy, Robert L.; Broadston, Pamela M.
2004-01-01
This study investigated the attitudes toward statistics of graduate students who used a computer program as part of the instruction, which allowed for an individualized, self-paced, student-centered, activity-based course. The twelve sections involved in this study were offered in the spring and fall 2001, spring and fall 2002, spring and fall…
Geopositional Statistical Methods
NASA Technical Reports Server (NTRS)
Ross, Kenton
2006-01-01
RMSE based methods distort circular error estimates (up to 50% overestimation). The empirical approach is the only statistically unbiased estimator offered. Ager modification to Shultz approach is nearly unbiased, but cumbersome. All methods hover around 20% uncertainty (@ 95% confidence) for low geopositional bias error estimates. This requires careful consideration in assessment of higher accuracy products.
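The contrast between an RMSE-based circular error estimate and an empirical one can be illustrated with a short sketch. It is not the Ager or Shultz method named above; the function names are hypothetical, and the RMSE conversion factor assumes zero-mean, equal-variance, uncorrelated errors in x and y, which is exactly the assumption that a positional bias violates.

```python
import numpy as np

# Under the circular-normal assumption, CE95 = sqrt(-2 ln 0.05) * sigma
# and the radial RMSE equals sqrt(2) * sigma, so the factor is about 1.73.
RMSE_TO_CE95 = np.sqrt(-2.0 * np.log(0.05)) / np.sqrt(2.0)

def ce95_from_rmse(dx, dy):
    """RMSE-based 95% circular error (assumes no bias in dx, dy)."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    radial_rmse = np.sqrt(np.mean(dx ** 2 + dy ** 2))
    return RMSE_TO_CE95 * radial_rmse

def ce95_empirical(dx, dy):
    """Empirical 95% circular error: 95th percentile of radial errors."""
    return np.percentile(np.hypot(dx, dy), 95)
```

With unbiased simulated errors the two estimates should agree closely. When a constant offset is added to one axis, the RMSE-based value tends to exceed the empirical percentile, consistent with the distortion noted above.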
Statistical Reasoning over Lunch
ERIC Educational Resources Information Center
Selmer, Sarah J.; Bolyard, Johnna J.; Rye, James A.
2011-01-01
Students in the 21st century are exposed daily to a staggering amount of numerically infused media. In this era of abundant numeric data, students must be able to engage in sound statistical reasoning when making life decisions after exposure to varied information. The context of nutrition can be used to engage upper elementary and middle school…
Fractional statistics and confinement
NASA Astrophysics Data System (ADS)
Gaete, P.; Wotzasek, C.
2005-02-01
It is shown that a pointlike composite having charge and magnetic moment displays a confining potential for the static interaction while simultaneously obeying fractional statistics in a pure gauge theory in three dimensions, without a Chern-Simons term. This result is distinct from the Maxwell-Chern-Simons theory that shows a screening nature for the potential.
Statistics for Learning Genetics
ERIC Educational Resources Information Center
Charles, Abigail Sheena
2012-01-01
This study investigated the knowledge and skills that biology students may need to help them understand statistics/mathematics as it applies to genetics. The data are based on analyses of current representative genetics texts, practicing genetics professors' perspectives, and more directly, students' perceptions of, and performance in,…
ERIC Educational Resources Information Center
Akram, Muhammad; Siddiqui, Asim Jamal; Yasmeen, Farah
2004-01-01
In order to learn the concept of statistical techniques one needs to run real experiments that generate reliable data. In practice, obtaining data from a well-defined process or system is very costly and time consuming. It is difficult to run real experiments during the teaching period in the university. To overcome these difficulties, statisticians…
Knot theory and statistical mechanics
Jones, V.F.R.
1990-11-01
Certain algebraic relations used to solve models in statistical mechanics were key to describing a mathematical property of knots known as a polynomial invariant. This connection, tenuous at first, has since developed into a significant flow of ideas. The appearance of such common ground is not atypical of recent developments in mathematics and physics: ideas from different fields interact and produce unexpected results. Indeed, the discovery of the connection between knots and statistical mechanics passed through a theory intimately related to the mathematical structure of quantum physics. This theory, called von Neumann algebras, is distinguished by the idea of continuous dimensionality. Spaces typically have dimensions that are natural numbers, such as 2, 3 or 11, but in von Neumann algebras dimensions such as √2 or π are equally possible. This possibility for continuous dimension played a key role in joining knot theory and statistical mechanics. In another direction, the knot invariants were soon found to occur in quantum field theory. Indeed, Edward Witten of the Institute for Advanced Study in Princeton, N.J., has shown that topological quantum field theory provides a natural way of expressing the new ideas about knots. This advance, in turn, has allowed a beautiful generalization about the invariants of knots in more complicated three-dimensional spaces known as three-manifolds, in which space itself may contain holes and loops.
Statistics, Uncertainty, and Transmitted Variation
Wendelberger, Joanne Roth
2014-11-05
The field of Statistics provides methods for modeling and understanding data and making decisions in the presence of uncertainty. When examining response functions, variation present in the input variables will be transmitted via the response function to the output variables. This phenomenon can potentially have significant impacts on the uncertainty associated with results from subsequent analysis. This presentation will examine the concept of transmitted variation, its impact on designed experiments, and a method for identifying and estimating sources of transmitted variation in certain settings.
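One common way to quantify transmitted variation is the first-order (delta-method) approximation Var[f(X)] ≈ f'(μ)² σ². The abstract does not say which method the presentation uses, so the following is a sketch of that standard approximation only, with hypothetical function names, checked against a Monte Carlo estimate that assumes a normally distributed input.

```python
import numpy as np

def transmitted_variance(f, mu, sigma, rel_step=1e-5):
    """First-order estimate of output variance transmitted from one input.

    f is a scalar response function, mu and sigma are the input mean and
    standard deviation; the derivative is taken by central difference.
    """
    h = rel_step * max(1.0, abs(mu))
    slope = (f(mu + h) - f(mu - h)) / (2.0 * h)
    return (slope * sigma) ** 2

def monte_carlo_variance(f, mu, sigma, n_draws=200_000, seed=0):
    """Simulation check, assuming a normally distributed input."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, n_draws)
    return np.var(f(x), ddof=1)
```

For f(x) = x² with μ = 10 and σ = 0.1, the first-order estimate is (20 × 0.1)² = 4. The approximation degrades when the response is strongly curved over the range of the input.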
Statistics for Learning Genetics
NASA Astrophysics Data System (ADS)
Charles, Abigail Sheena
This study investigated the knowledge and skills that biology students may need to help them understand statistics/mathematics as it applies to genetics. The data are based on analyses of current representative genetics texts, practicing genetics professors' perspectives, and more directly, students' perceptions of, and performance in, doing statistically-based genetics problems. This issue is at the emerging edge of modern college-level genetics instruction, and this study attempts to identify key theoretical components for creating a specialized biological statistics curriculum. The goal of this curriculum will be to prepare biology students with the skills for assimilating quantitatively-based genetic processes, increasingly at the forefront of modern genetics. To fulfill this, two college-level classes at two universities were surveyed. One university was located in the northeastern US and the other in the West Indies. There was a sample size of 42 students, and a supplementary interview was administered to a select 9 students. Interviews were also administered to professors in the field in order to gain insight into the teaching of statistics in genetics. Key findings indicated that students had very little to no background in statistics (55%). Although students did perform well on exams, with 60% of the population receiving an A or B grade, 77% of them did not offer good explanations on a probability question associated with the normal distribution provided in the survey. The scope and presentation of the applicable statistics/mathematics in some of the most used textbooks in genetics teaching, as well as genetics syllabi used by instructors, do not help the issue. It was found that the textbooks often either did not give effective explanations for students or completely left out certain topics. The omission of certain statistically or mathematically oriented topics was also seen in the genetics syllabi reviewed for this study. Nonetheless
ERIC Educational Resources Information Center
Chan, Shiau Wei; Ismail, Zaleha
2014-01-01
The focus of assessment in statistics has gradually shifted from traditional assessment towards alternative assessment where more attention has been paid to the core statistical concepts such as center, variability, and distribution. In spite of this, there are comparatively few assessments that combine the significant three types of statistical…
The Statistical Drake Equation
NASA Astrophysics Data System (ADS)
Maccone, Claudio
2010-12-01
We provide the statistical generalization of the Drake equation. From a simple product of seven positive numbers, the Drake equation is now turned into the product of seven positive random variables. We call this "the Statistical Drake Equation". The mathematical consequences of this transformation are then derived. The proof of our results is based on the Central Limit Theorem (CLT) of Statistics. In loose terms, the CLT states that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable. This is called the Lyapunov Form of the CLT, or the Lindeberg Form of the CLT, depending on the mathematical constraints assumed on the third moments of the various probability distributions. In conclusion, we show that: The new random variable N, yielding the number of communicating civilizations in the Galaxy, follows the LOGNORMAL distribution. Then, as a consequence, the mean value of this lognormal distribution is the ordinary N in the Drake equation. The standard deviation, mode, and all the moments of this lognormal N are also found. The seven factors in the ordinary Drake equation now become seven positive random variables. The probability distribution of each random variable may be ARBITRARY. The CLT in the so-called Lyapunov or Lindeberg forms (that both do not assume the factors to be identically distributed) allows for that. In other words, the CLT "translates" into our statistical Drake equation by allowing an arbitrary probability distribution for each factor. This is both physically realistic and practically very useful, of course. An application of our statistical Drake equation then follows. The (average) DISTANCE between any two neighboring and communicating civilizations in the Galaxy may be shown to be inversely proportional to the cubic root of N. Then, in our approach, this distance becomes a new random variable. We derive the relevant probability density
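The central claim of this abstract, that a product of independent positive random variables is approximately lognormal because its logarithm is a sum to which the CLT applies, can be checked numerically with a small Monte Carlo sketch. The function names and the idea of passing samplers as callables are assumptions made here, and any factor distributions supplied to it are placeholders rather than estimates of the actual Drake factors.

```python
import numpy as np

def simulate_product(samplers, n_draws, seed=0):
    """Draw n_draws samples of the product of independent positive factors.

    samplers is a list of callables taking (rng, n) and returning n
    positive draws; each may follow a different distribution.
    """
    rng = np.random.default_rng(seed)
    draws = np.column_stack([s(rng, n_draws) for s in samplers])
    return draws.prod(axis=1)

def lognormal_summary(product):
    """Summarise how closely log(product) resembles a Gaussian."""
    logs = np.log(product)
    mu = logs.mean()
    sigma = logs.std(ddof=1)
    skew = np.mean(((logs - mu) / sigma) ** 3)   # 0 for an exact Gaussian
    return {
        "mu": mu,
        "sigma": sigma,
        "log_skewness": skew,
        "lognormal_mean": np.exp(mu + 0.5 * sigma ** 2),
        "sample_mean": product.mean(),
    }
```

With only seven factors the Gaussian limit is approximate, so some residual skewness in the logarithm is expected. Independently of the CLT, the sample mean of the product should match the product of the factor means when the factors are independent.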
NASA Astrophysics Data System (ADS)
Maccone, C.
This paper provides the statistical generalization of the Fermi paradox. The statistics of habitable planets may be based on a set of ten (and possibly more) astrobiological requirements first pointed out by Stephen H. Dole in his book Habitable planets for man (1964). The statistical generalization of the original and by now too simplistic Dole equation is provided by replacing a product of ten positive numbers by the product of ten positive random variables. This is denoted the SEH, an acronym standing for “Statistical Equation for Habitables”. The proof in this paper is based on the Central Limit Theorem (CLT) of Statistics, stating that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable (Lyapunov form of the CLT). It is then shown that: 1. The new random variable NHab, yielding the number of habitables (i.e. habitable planets) in the Galaxy, follows the log-normal distribution. By construction, the mean value of this log-normal distribution is the total number of habitable planets as given by the statistical Dole equation. 2. The ten (or more) astrobiological factors are now positive random variables. The probability distribution of each random variable may be arbitrary. The CLT in the so-called Lyapunov or Lindeberg forms (neither of which assumes the factors to be identically distributed) allows for that. In other words, the CLT "translates" into the SEH by allowing an arbitrary probability distribution for each factor. This is both astrobiologically realistic and useful for any further investigations. 3. By applying the SEH it is shown that the (average) distance between any two nearby habitable planets in the Galaxy is inversely proportional to the cubic root of NHab. This distance is denoted by the new random variable D. The relevant probability density function is derived, which was named the "Maccone distribution" by Paul Davies in