Statistical Significance Testing.
ERIC Educational Resources Information Center
McLean, James E., Ed.; Kaufman, Alan S., Ed.
1998-01-01
The controversy about the use or misuse of statistical significance testing has become the major methodological issue in educational research. This special issue contains three articles that explore the controversy, three commentaries on these articles, an overall response, and three rejoinders by the first three authors. They are: (1)…
Statistically significant relational data mining :
Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.
2014-02-01
This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.
Significant results: statistical or clinical?
2016-01-01
The null hypothesis significance test method is popular in biological and medical research. Many researchers have used this method without fully understanding it, though it has both merits and shortcomings. This article presents its shortcomings, as well as several complementary or alternative methods, such as the estimated effect size and the confidence interval. PMID:27066201
Statistical Significance of Threading Scores
Fayyaz Movaghar, Afshin; Launay, Guillaume; Schbath, Sophie; Gibrat, Jean-François
2012-01-01
We present a general method for assessing threading score significance. The threading score of a protein sequence, threaded onto a given structure, should be compared with the threading score distribution of a random amino-acid sequence of the same length threaded onto the same structure; small p-values indicate significantly high scores. We claim that, due to general protein contact map properties, this reference distribution is a Weibull extreme value distribution whose parameters depend on the threading method, the structure, the length of the query and the random sequence simulation model used. These parameters can be estimated off-line with simulated sequence samples, for different sequence lengths. They can further be interpolated at the exact length of a query, enabling the quick computation of the p-value. PMID:22149633
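The p-value computation described above can be sketched in a few lines. This is not the authors' implementation: the "threading scores" below are synthetic draws (Weibull-distributed by construction), and the fit uses `scipy.stats.weibull_min`.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for threading scores of random sequences of a given
# length, threaded onto one fixed structure.
null_scores = stats.weibull_min.rvs(c=2.0, scale=10.0, size=5000,
                                    random_state=0)

# Fit the Weibull reference distribution to the simulated null sample.
shape, loc, scale = stats.weibull_min.fit(null_scores, floc=0.0)

# p-value of an observed query score: probability that a random sequence
# scores at least this high under the fitted null distribution.
query_score = 25.0
p_value = stats.weibull_min.sf(query_score, shape, loc=loc, scale=scale)
print(f"p = {p_value:.2e}")
```

In practice the fitted parameters would be tabulated off-line for several sequence lengths and interpolated at the query length, as the abstract describes.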
Statistical significance of the gallium anomaly
Giunti, Carlo; Laveder, Marco
2011-06-15
We calculate the statistical significance of the anomalous deficit of electron neutrinos measured in the radioactive source experiments of the GALLEX and SAGE solar neutrino detectors, taking into account the uncertainty of the detection cross section. We find that the statistical significance of the anomaly is ≈3.0σ. A fit of the data in terms of neutrino oscillations favors short-baseline electron neutrino disappearance over the null hypothesis of no oscillations at ≈2.7σ.
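A significance quoted in σ units is, at its simplest, the size of a deficit divided by its total uncertainty. A minimal illustration with made-up numbers (not the GALLEX/SAGE values, which require the full cross-section uncertainty treatment):

```python
import math

# Illustrative numbers only: the ratio R of measured to predicted events
# and its total (statistical + systematic) uncertainty.
R = 0.86
sigma_R = 0.05

# Significance of the deficit in standard deviations, with the
# corresponding one-sided Gaussian p-value.
z = (1.0 - R) / sigma_R
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
print(f"deficit significance: {z:.1f} sigma (one-sided p = {p_one_sided:.4f})")
```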
The insignificance of statistical significance testing
Johnson, Douglas H.
1999-01-01
Despite their use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
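One of the alternatives advocated above, estimation with confidence intervals, can be sketched as follows. The data are synthetic and the simple equal-variance approximation is assumed; the point is that the interval conveys direction and magnitude, not just a pass/fail verdict.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic measurements for two groups (stand-ins for field data).
a = rng.normal(10.0, 2.0, size=40)
b = rng.normal(9.0, 2.0, size=40)

# Estimate the group difference and report a 95% confidence interval.
diff = a.mean() - b.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
dof = len(a) + len(b) - 2  # simple equal-variance approximation
t_crit = stats.t.ppf(0.975, dof)
low, high = diff - t_crit * se, diff + t_crit * se
print(f"difference = {diff:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```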
Statistical Significance vs. Practical Significance: An Exploration through Health Education
ERIC Educational Resources Information Center
Rosen, Brittany L.; DeMaria, Andrea L.
2012-01-01
The purpose of this paper is to examine the differences between statistical and practical significance, including strengths and criticisms of both methods, as well as provide information surrounding the application of various effect sizes and confidence intervals within health education research. Provided are recommendations, explanations and…
Determining the Statistical Significance of Relative Weights
ERIC Educational Resources Information Center
Tonidandel, Scott; LeBreton, James M.; Johnson, Jeff W.
2009-01-01
Relative weight analysis is a procedure for estimating the relative importance of correlated predictors in a regression equation. Because the sampling distribution of relative weights is unknown, researchers using relative weight analysis are unable to make judgments regarding the statistical significance of the relative weights. J. W. Johnson…
Statistical significance testing and clinical trials.
Krause, Merton S
2011-09-01
The efficacy of treatments is better expressed for clinical purposes in terms of the treatments' outcome distributions and their overlap, rather than in terms of the statistical significance of the differences between these distributions' means, because clinical practice is primarily concerned with the outcome of each individual client rather than with the mean of the variety of outcomes in any group of clients. The obtained outcome distributions for the comparison groups of all competently designed and executed randomized clinical trials should be publicly available, whatever the statistical significance of the mean differences among these groups, because these outcome distributions provide clinically useful information about the efficacy of the treatments compared.
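The overlap of two outcome distributions can be summarized by the probability of superiority: the chance that a randomly chosen treated client has a better outcome than a randomly chosen control client. A sketch with synthetic outcomes (all numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic outcome scores for two comparison groups.
treated = rng.normal(12.0, 4.0, size=200)
control = rng.normal(10.0, 4.0, size=200)

# Fraction of all treated/control pairs in which the treated
# client has the higher outcome.
wins = (treated[:, None] > control[None, :]).mean()
print(f"P(treated > control) = {wins:.2f}")
```

A value near 0.5 means near-total overlap of the two distributions, however "significant" the mean difference may be.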
Systematic identification of statistically significant network measures
NASA Astrophysics Data System (ADS)
Ziv, Etay; Koytcheff, Robin; Middendorf, Manuel; Wiggins, Chris
2005-01-01
We present a graph embedding space (i.e., a set of measures on graphs) for performing statistical analyses of networks. Key improvements over existing approaches include discovery of “motif hubs” (multiple overlapping significant subgraphs), computational efficiency relative to subgraph census, and flexibility (the method is easily generalizable to weighted and signed graphs). The embedding space is based on scalars, functionals of the adjacency matrix representing the network. Scalars are global, involving all nodes; although they can be related to subgraph enumeration, there is not a one-to-one mapping between scalars and subgraphs. Improvements in network randomization and significance testing—we learn the distribution rather than assuming Gaussianity—are also presented. The resulting algorithm establishes a systematic approach to the identification of the most significant scalars and suggests machine-learning techniques for network classification.
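Scalars of the kind described above are global functionals of the adjacency matrix. A toy example of two such scalars: the edge count, and the triangle count obtained from the trace of A³ (each triangle contributes six closed walks of length 3).

```python
import numpy as np

# Adjacency matrix of a small undirected graph: a triangle (nodes 0-2)
# plus a pendant node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

# Two scalar functionals of A: global quantities involving all nodes.
n_edges = int(A.sum()) // 2
n_triangles = int(round(np.trace(np.linalg.matrix_power(A, 3)) / 6))
print(n_edges, n_triangles)  # 4 1
```

As the abstract notes, such scalars relate to subgraph counts but need not map one-to-one onto subgraphs.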
Finding Statistically Significant Communities in Networks
Lancichinetti, Andrea; Radicchi, Filippo; Ramasco, José J.; Fortunato, Santo
2011-01-01
Community structure is one of the main structural features of networks, revealing both their internal organization and the similarity of their elementary units. Despite the large variety of methods proposed to detect communities in graphs, there is still great need for multi-purpose techniques able to handle different types of datasets and the subtleties of community structure. In this paper we present OSLOM (Order Statistics Local Optimization Method), the first method capable of detecting clusters in networks while accounting for edge directions, edge weights, overlapping communities, hierarchies and community dynamics. It is based on the local optimization of a fitness function expressing the statistical significance of clusters with respect to random fluctuations, which is estimated with tools of Extreme and Order Statistics. OSLOM can be used alone or as a refinement procedure for partitions/covers delivered by other techniques. We have also implemented sequential algorithms combining OSLOM with other fast techniques, so that the community structure of very large networks can be uncovered. Our method performs comparably to the best existing algorithms on artificial benchmark graphs. Several applications to real networks are shown as well. OSLOM is implemented in freely available software (http://www.oslom.org), and we believe it will be a valuable tool in the analysis of networks. PMID:21559480
Social significance of community structure: Statistical view
NASA Astrophysics Data System (ADS)
Li, Hui-Jia; Daniels, Jasmine J.
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p-value theory and network analysis, and then we obtain a significance measure of statistical form. Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc.
Statistical Significance of Clustering using Soft Thresholding
Huang, Hanwen; Liu, Yufeng; Yuan, Ming; Marron, J. S.
2015-01-01
Clustering methods have led to a number of important discoveries in bioinformatics and beyond. A major challenge in their use is determining which clusters represent important underlying structure, as opposed to spurious sampling artifacts. This challenge is especially serious, and very few methods are available, when the data are very high in dimension. Statistical Significance of Clustering (SigClust) is a recently developed cluster evaluation tool for high dimensional low sample size data. An important component of the SigClust approach is the very definition of a single cluster as a subset of data sampled from a multivariate Gaussian distribution. The implementation of SigClust requires the estimation of the eigenvalues of the covariance matrix for the null multivariate Gaussian distribution. We show that the original eigenvalue estimation can lead to a test that suffers from severe inflation of type-I error, in the important case where there are a few very large eigenvalues. This paper addresses this critical challenge using a novel likelihood based soft thresholding approach to estimate these eigenvalues, which leads to a much improved SigClust. Major improvements in SigClust performance are shown by both mathematical analysis, based on the new notion of Theoretical Cluster Index, and extensive simulation studies. Applications to some cancer genomic data further demonstrate the usefulness of these improvements. PMID:26755893
[Significance of medical statistics in insurance medicine].
Becher, J
2001-03-01
Knowledge of medical statistics is of great benefit to every insurance medical officer, as it facilitates communication with actuaries, allows officers to make their own calculations and is the basis for correctly interpreting medical journals. Only about 20% of original work in medicine today is published without statistics or with only descriptive statistics, and this proportion is falling. The reader of medical publications should be in a position to make a critical analysis of the methodology and content, since one cannot always rely on the conclusions drawn by the authors: statistical errors appear very frequently in medical publications. Due to the specific methodological features involved, the assessment of meta-analyses demands special attention. The number of published meta-analyses has risen 40-fold over the last ten years. Important examples of the practical use of statistical methods in insurance medicine include estimating extra mortality from published survival analyses and evaluating diagnostic test results. The purpose of this article is to highlight statistical problems and issues of relevance to insurance medicine and to establish the bases for understanding them.
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
ERIC Educational Resources Information Center
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
The Use of Meta-Analytic Statistical Significance Testing
ERIC Educational Resources Information Center
Polanin, Joshua R.; Pigott, Terri D.
2015-01-01
Meta-analysis multiplicity, the concept of conducting multiple tests of statistical significance within one review, is an underdeveloped literature. We address this issue by considering how Type I errors can impact meta-analytic results, suggest how statistical power may be affected through the use of multiplicity corrections, and propose how…
Reviewer Bias for Statistically Significant Results: A Reexamination.
ERIC Educational Resources Information Center
Fagley, N. S.; McKinney, I. Jean
1983-01-01
Reexamines the article by Atkinson, Furlong, and Wampold (1982) and questions their conclusion that reviewers were biased toward statistically significant results. A statistical power analysis shows the power of their bogus study was low. Low power in a study reporting nonsignificant findings is a valid reason for recommending not to publish.…
The questioned p value: clinical, practical and statistical significance.
Jiménez-Paneque, Rosa
2016-09-09
The use of the p-value and statistical significance has been questioned from the early 1980s to the present day. Much has been discussed about it in the field of statistics and its applications, especially in epidemiology and public health. As a matter of fact, the p-value and its equivalent, statistical significance, are difficult concepts to grasp for many health professionals involved in one way or another in research applied to their work areas. However, its meaning should be clear in intuitive terms even though it is based on theoretical concepts from the field of statistics. This paper attempts to present the p-value as a concept that applies to everyday life and is therefore intuitively simple, but whose proper use cannot be separated from theoretical and methodological elements of inherent complexity. The reasons behind the criticism of the p-value and its isolated use are explained intuitively, mainly the need to demarcate statistical significance from clinical significance, and some of the recommended remedies for these problems are discussed as well. The paper closes with the current trend to vindicate the p-value, appealing to the convenience of its use in certain situations, and with the recent statement of the American Statistical Association in this regard.
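The statistical-versus-clinical distinction can be demonstrated numerically: with a large enough sample, a clinically trivial difference will often yield a small p-value while the effect size stays negligible. A sketch with invented blood-pressure-like numbers:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
# A clinically trivial true difference (0.2 units on a scale with SD 10),
# measured in a very large sample.
a = rng.normal(120.0, 10.0, size=50000)
b = rng.normal(120.2, 10.0, size=50000)

t, p = stats.ttest_ind(a, b)
d = abs(b.mean() - a.mean()) / 10.0  # standardized effect size
print(f"p = {p:.4f}, effect size d = {d:.3f}")
# A small p here signals statistical, not clinical, significance.
```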
Statistical significance test for transition matrices of atmospheric Markov chains
NASA Technical Reports Server (NTRS)
Vautard, Robert; Mo, Kingtse C.; Ghil, Michael
1990-01-01
Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
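A Monte Carlo significance test for a single transition can be sketched by comparing the observed transition count against counts from shuffled sequences, which preserve regime frequencies while destroying temporal order. The regime sequence below is synthetic, not the atmospheric clusters of the study.

```python
import numpy as np

rng = np.random.default_rng(3)
# A synthetic regime sequence over 3 flow regimes (0, 1, 2), built so
# that transitions 0 -> 1 are deliberately common.
seq = []
state = 0
for _ in range(500):
    seq.append(state)
    if state == 0:
        state = rng.choice([0, 1, 2], p=[0.2, 0.7, 0.1])
    else:
        state = rng.choice([0, 1, 2], p=[0.5, 0.25, 0.25])
seq = np.array(seq)

def count_transition(s, i, j):
    return int(np.sum((s[:-1] == i) & (s[1:] == j)))

observed = count_transition(seq, 0, 1)

# Null distribution: shuffle the sequence many times.
null = [count_transition(rng.permutation(seq), 0, 1) for _ in range(2000)]
p_value = (1 + sum(n >= observed for n in null)) / (1 + len(null))
print(f"0 -> 1 transitions: {observed}, p = {p_value:.4f}")
```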
Statistical Significance and Effect Size: Two Sides of a Coin.
ERIC Educational Resources Information Center
Fan, Xitao
This paper suggests that statistical significance testing and effect size are two sides of the same coin; they complement each other, but do not substitute for one another. Good research practice requires that both should be taken into consideration to make sound quantitative decisions. A Monte Carlo simulation experiment was conducted, and a…
Interpretation of Statistical Significance Testing: A Matter of Perspective.
ERIC Educational Resources Information Center
McClure, John; Suen, Hoi K.
1994-01-01
This article compares three models that have been the foundation for approaches to the analysis of statistical significance in early childhood research--the Fisherian and the Neyman-Pearson models (both considered "classical" approaches), and the Bayesian model. The article concludes that all three models have a place in the analysis of research…
Your Chi-Square Test Is Statistically Significant: Now What?
ERIC Educational Resources Information Center
Sharpe, Donald
2015-01-01
Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…
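Of the four follow-up approaches mentioned, calculating residuals is the most common. A sketch with a hypothetical 2×3 table, using adjusted (standardized) residuals, where absolute values above roughly 2 flag the cells driving the result:

```python
import numpy as np
from scipy import stats

# A hypothetical 2x3 contingency table of observed counts.
table = np.array([[30, 20, 10],
                  [10, 20, 30]])

chi2, p, dof, expected = stats.chi2_contingency(table)

# Adjusted standardized residuals: which cells drive the result?
row = table.sum(axis=1, keepdims=True)
col = table.sum(axis=0, keepdims=True)
n = table.sum()
adj = (table - expected) / np.sqrt(expected * (1 - row / n) * (1 - col / n))
print(f"chi2 = {chi2:.1f}, p = {p:.5f}")
print(np.round(adj, 2))  # |residual| > 2 flags a noteworthy cell
```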
Estimation of the geochemical threshold and its statistical significance
Miesch, A.T.
1981-01-01
A statistic is proposed for estimating the geochemical threshold and its statistical significance, or it may be used to identify a group of extreme values that can be tested for significance by other means. The statistic is the maximum gap between adjacent values in an ordered array after each gap has been adjusted for the expected frequency. The values in the ordered array are geochemical values transformed by either ln(?? - ??) or ln(?? - ??) and then standardized so that the mean is zero and the variance is unity. The expected frequency is taken from a fitted normal curve with unit area. The midpoint of an adjusted gap that exceeds the corresponding critical value may be taken as an estimate of the geochemical threshold, and the associated probability indicates the likelihood that the threshold separates two geochemical populations. The adjusted gap test may fail to identify threshold values if the variation tends to be continuous from background values to the higher values that reflect mineralized ground. However, the test will serve to identify other anomalies that may be too subtle to have been noted by other means. © 1981.
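A much-simplified version of the adjusted-gap idea can be sketched as follows. Weighting each gap by the fitted normal density stands in for the frequency adjustment, and the critical-value test is omitted; the two lognormal populations are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Two synthetic lognormal populations: background values and a smaller
# mineralized population with higher concentrations.
background = rng.lognormal(mean=1.0, sigma=0.5, size=150)
mineralized = rng.lognormal(mean=3.5, sigma=0.3, size=50)
values = np.sort(np.concatenate([background, mineralized]))

# Log-transform, then standardize to zero mean and unit variance.
logs = np.log(values)
z = (logs - logs.mean()) / logs.std(ddof=1)

# Gaps between adjacent ordered values, down-weighted by the fitted
# normal density so that the large gaps expected in the tails do not
# dominate (a crude stand-in for the frequency adjustment).
mids = (z[:-1] + z[1:]) / 2
adj_gaps = np.diff(z) * stats.norm.pdf(mids)
k = int(np.argmax(adj_gaps))

# Threshold: midpoint of the largest adjusted gap, back-transformed.
threshold = np.exp((logs[k] + logs[k + 1]) / 2)
print(f"estimated geochemical threshold: {threshold:.1f}")
```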
Beyond Statistical Significance: Implications of Network Structure on Neuronal Activity
Vlachos, Ioannis; Aertsen, Ad; Kumar, Arvind
2012-01-01
It is a common and good practice in experimental sciences to assess the statistical significance of measured outcomes. For this, the probability of obtaining the actual results is estimated under the assumption of an appropriately chosen null-hypothesis. If this probability is smaller than some threshold, the results are deemed statistically significant and the researchers are content in having revealed, within their own experimental domain, a “surprising” anomaly, possibly indicative of a hitherto hidden fragment of the underlying “ground-truth”. What is often neglected, though, is the actual importance of these experimental outcomes for understanding the system under investigation. We illustrate this point by giving practical and intuitive examples from the field of systems neuroscience. Specifically, we use the notion of embeddedness to quantify the impact of a neuron's activity on its downstream neurons in the network. We show that the network response strongly depends on the embeddedness of stimulated neurons and that embeddedness is a key determinant of the importance of neuronal activity on local and downstream processing. We extrapolate these results to other fields in which networks are used as a theoretical framework. PMID:22291581
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance
Kramer, Karen L.; Veile, Amanda; Otárola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children’s growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children’s monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children’s growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children’s growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children’s growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance. PMID:26938742
Fostering Students' Statistical Literacy through Significant Learning Experience
ERIC Educational Resources Information Center
Krishnan, Saras
2015-01-01
A major objective of statistics education is to develop students' statistical literacy that enables them to be educated users of data in context. Teaching statistics in today's educational settings is not an easy feat because teachers have a huge task in keeping up with the demands of the new generation of learners. The present day students have…
A Tutorial on Hunting Statistical Significance by Chasing N.
Szucs, Denes
2016-01-01
There is increasing concern about the replicability of studies in psychology and cognitive neuroscience. Hidden data dredging (also called p-hacking) is a major contributor to this crisis because it substantially increases Type I error resulting in a much larger proportion of false positive findings than the usually expected 5%. In order to build better intuition to avoid, detect and criticize some typical problems, here I systematically illustrate the large impact of some easy to implement and so, perhaps frequent data dredging techniques on boosting false positive findings. I illustrate several forms of two special cases of data dredging. First, researchers may violate the data collection stopping rules of null hypothesis significance testing by repeatedly checking for statistical significance with various numbers of participants. Second, researchers may group participants post hoc along potential but unplanned independent grouping variables. The first approach 'hacks' the number of participants in studies, the second approach 'hacks' the number of variables in the analysis. I demonstrate the high amount of false positive findings generated by these techniques with data from true null distributions. I also illustrate that it is extremely easy to introduce strong bias into data by very mild selection and re-testing. Similar, usually undocumented data dredging steps can easily lead to having 20-50%, or more false positives.
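The first form of data dredging described above, optional stopping, is easy to simulate: repeatedly test a true null while adding participants, and count how often any peek crosses the 5% line. The peek schedule below is an arbitrary illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def significant_somewhere(peek_points, alpha=0.05):
    """Simulate one study on a true null: keep adding participants and
    re-test at each peek; report True if any peek gives p < alpha."""
    data = rng.normal(0.0, 1.0, size=max(peek_points))
    for n in peek_points:
        if stats.ttest_1samp(data[:n], 0.0).pvalue < alpha:
            return True
    return False

peeks = [10, 20, 30, 40, 50]  # test after every 10 participants
rate = np.mean([significant_somewhere(peeks) for _ in range(2000)])
print(f"false positive rate with optional stopping: {rate:.3f}")
```

Even this mild schedule pushes the false positive rate well above the nominal 5%.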
Shukla, R.; Yu Daohai; Fulk, F.
1995-12-31
Short-term toxicity tests with aquatic organisms are a valuable measurement tool in the assessment of the toxicity of effluents, environmental samples and single chemicals. Currently toxicity tests are utilized in a wide range of US EPA regulatory activities including effluent discharge compliance. In the current approach for determining the No Observed Effect Concentration, an effluent concentration is presumed safe if there is no statistically significant difference in toxicant response versus control response. The conclusion of a safe concentration may be due to the fact that it truly is safe, or alternatively, that the ability of the statistical test to detect an effect, given its existence, is inadequate. Results of research of a new statistical approach, the basis of which is to move away from a demonstration of no difference to a demonstration of equivalence, will be discussed. The concept of observed confidence distributions, first suggested by Cox, is proposed as a measure of the strength of evidence for practically equivalent responses between a given effluent concentration and the control. The research included determination of intervals of practically equivalent responses as a function of the variability of control response. The approach is illustrated using reproductive data from tests with Ceriodaphnia dubia and survival and growth data from tests with fathead minnow. The data are from the US EPA's National Reference Toxicant Database.
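The shift from "no significant difference" to "demonstrated equivalence" is closely related to the two one-sided tests (TOST) procedure, sketched here with synthetic reproduction counts and an assumed equivalence margin. The confidence-distribution machinery of the abstract is not reproduced.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
# Synthetic reproduction counts: control vs. one effluent concentration
# (illustrative stand-ins, not the National Reference Toxicant data).
control = rng.normal(25.0, 4.0, size=20)
effluent = rng.normal(24.0, 4.0, size=20)

delta = 5.0  # assumed margin of practically equivalent response
diff = effluent.mean() - control.mean()
se = np.sqrt(control.var(ddof=1) / 20 + effluent.var(ddof=1) / 20)
dof = 38

# Two one-sided tests: equivalence is concluded only if the difference
# is significantly greater than -delta AND significantly less than +delta.
p_lower = stats.t.sf((diff + delta) / se, dof)
p_upper = stats.t.cdf((diff - delta) / se, dof)
p_tost = max(p_lower, p_upper)
print(f"difference = {diff:.2f}, TOST p = {p_tost:.4f}")
```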
Tipping points in the arctic: eyeballing or statistical significance?
Carstensen, Jacob; Weydmann, Agata
2012-02-01
Arctic ecosystems have experienced, and are projected to experience, continued large increases in temperature and declines in sea ice cover. It has been hypothesized that small changes in ecosystem drivers can fundamentally alter ecosystem functioning, and that this might be particularly pronounced for Arctic ecosystems. We present a suite of simple statistical analyses to identify changes in the statistical properties of data, emphasizing that changes in the standard error should be considered in addition to changes in mean properties. The methods are exemplified using sea ice extent, and suggest that the loss rate of sea ice accelerated by a factor of ~5 in 1996, as reported in other studies, but that increases in random fluctuations, an early warning signal, were already observed in 1990. We recommend employing the proposed methods more systematically when analyzing tipping points, to document effects of climate change in the Arctic.
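Tracking changes in variability alongside changes in the mean can be sketched with a rolling standard deviation. The series below is a stylized synthetic stand-in for the sea-ice record, with the variance rising before the mean shifts:

```python
import numpy as np

rng = np.random.default_rng(7)
# A stylized annual series: random fluctuations grow in 1990 (an early
# warning signal) before the mean itself drops in 1996.
years = np.arange(1979, 2013)
noise_sd = np.where(years < 1990, 0.3, 0.8)
level = np.where(years < 1996, 7.0, 5.5)
series = level + rng.normal(0.0, 1.0, size=years.size) * noise_sd

# Rolling standard deviation as a simple change-in-variability indicator.
window = 8
rolling_sd = np.array([series[i:i + window].std(ddof=1)
                       for i in range(series.size - window + 1)])
early = rolling_sd[:5].mean()   # windows entirely before 1990
late = rolling_sd[-5:].mean()   # windows after both changes
print(f"rolling SD rose from {early:.2f} to {late:.2f}")
```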
Statistical downscaling rainfall using artificial neural network: significantly wetter Bangkok?
NASA Astrophysics Data System (ADS)
Vu, Minh Tue; Aribarg, Thannob; Supratid, Siriporn; Raghavan, Srivatsan V.; Liong, Shie-Yui
2016-11-01
Artificial neural network (ANN) is an established technique with a flexible mathematical structure that is capable of identifying complex nonlinear relationships between input and output data. The present study utilizes ANN as a method of statistically downscaling global climate models (GCMs) during the rainy season at meteorological site locations in Bangkok, Thailand. The study illustrates the application of feed-forward back propagation using large-scale predictor variables derived from both ERA-Interim reanalysis data and present-day/future GCM data. The predictors are first selected over different grid boxes surrounding the Bangkok region and then screened using principal component analysis (PCA) to filter the best-correlated predictors for ANN training. The reanalysis-based downscaled results for the present-day climate show good agreement with station precipitation, with a correlation coefficient of 0.8 and a Nash-Sutcliffe efficiency of 0.65. The final downscaled results for four GCMs show an increasing trend of rainy-season precipitation over Bangkok by the end of the twenty-first century. The extreme values of precipitation determined using statistical indices show strong increases in wetness. These findings will be useful for policy makers in considering flood-adaptation measures, such as whether the current drainage network is sufficient for the changing climate, and in planning a range of related adaptation/mitigation measures.
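The PCA-screening-plus-feed-forward-network workflow described above can be sketched with scikit-learn. This is a hedged toy version: the latent-factor predictors and all dimensions are invented stand-ins for the gridded reanalysis fields and station rainfall, not the study's data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Invented stand-ins: 40 grid-box predictor columns driven by 5 latent
# large-scale modes, and a station rainfall series tied to two of them.
Z = rng.normal(size=(500, 5))
X = Z @ rng.normal(size=(5, 40)) + 0.1 * rng.normal(size=(500, 40))
y = 2.0 * Z[:, 0] - 1.0 * Z[:, 1] + rng.normal(0.0, 0.3, size=500)

# PCA screening of predictors, then a feed-forward network, mirroring the
# paper's workflow of filtering predictors before ANN training.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X, y)
print("training R^2:", round(model.score(X, y), 3))
```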
Wilkinson, Michael
2014-03-01
Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
Evaluating clinical significance: incorporating robust statistics with normative comparison tests.
van Wieringen, Katrina; Cribbie, Robert A
2014-05-01
The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non-normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann-Yuen test of equivalence, which utilizes trimmed means, with that of the previously recommended Schuirmann and Schuirmann-Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann-Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann-Welch tests, and the power of the Schuirmann-Yuen was substantially greater than that of the Schuirmann or Schuirmann-Welch tests when distributions were skewed or outliers were present. The Schuirmann-Yuen test is recommended for assessing clinical significance with normative comparisons.
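A minimal sketch of a Schuirmann-style two one-sided equivalence test built on trimmed means, in the spirit of the Schuirmann-Yuen procedure. It leans on SciPy's `ttest_ind` with the `trim` argument (Yuen's trimmed-mean test); the samples and the equivalence margin `delta` below are invented, and the paper's exact formulation may differ in detail.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Invented samples: a clinical group compared against a normative group.
clinical = rng.normal(50.5, 10.0, 100)
normative = rng.normal(50.0, 10.0, 200)
delta = 5.0   # margin of practical equivalence (an assumed value)

def equivalence_trimmed(x, y, delta, trim=0.2):
    """Two one-sided trimmed-mean (Yuen) tests; equivalence if both reject."""
    # H0a: mean difference <= -delta
    p_lower = stats.ttest_ind(x + delta, y, trim=trim, alternative='greater').pvalue
    # H0b: mean difference >= +delta
    p_upper = stats.ttest_ind(x - delta, y, trim=trim, alternative='less').pvalue
    return max(p_lower, p_upper)

p = equivalence_trimmed(clinical, normative, delta)
print("equivalence p-value:", p)   # small p supports practical equivalence
```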
Lies, damned lies and statistics: Clinical importance versus statistical significance in research.
Mellis, Craig
2017-02-28
Correctly performed and interpreted statistics play a crucial role both for those who 'produce' clinical research and for those who 'consume' it. Unfortunately, however, there are many misunderstandings and misinterpretations of statistics by both groups. In particular, there is a widespread lack of appreciation of the severe limitations of p values. This is a particular problem with small sample sizes and low event rates, common features of many published clinical trials. These issues have resulted in increasing numbers of false positive clinical trials (false 'discoveries'), and the well-publicised inability to replicate many of the findings. While chance clearly plays a role in these errors, many more are due to either poorly performed or badly misinterpreted statistics. Consequently, it is essential that whenever p values appear, they be accompanied by both 95% confidence limits and effect sizes. These enable readers to immediately assess the plausible range of results, and whether or not the effect is clinically meaningful.
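The recommendation, never a p value alone, always with a 95% confidence interval and an effect size, is straightforward to follow in code. A sketch with two invented trial arms (the Welch-style standard error and df = 78 are simplifications for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Two invented trial arms.
treatment = rng.normal(10.8, 3.0, 40)
control = rng.normal(10.0, 3.0, 40)

t_stat, p = stats.ttest_ind(treatment, control)

# Effect size and 95% confidence interval for the mean difference.
diff = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / 40 + control.var(ddof=1) / 40)
ci = diff + np.array([-1.0, 1.0]) * stats.t.ppf(0.975, 78) * se

pooled_sd = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)
cohens_d = diff / pooled_sd   # standardized effect size

print(f"p = {p:.3f}, diff = {diff:.2f}, "
      f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}], d = {cohens_d:.2f}")
```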
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
ERIC Educational Resources Information Center
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
ERIC Educational Resources Information Center
Monterde-i-Bort, Hector; Frias-Navarro, Dolores; Pascual-Llobell, Juan
2010-01-01
The empirical study we present here deals with a pedagogical issue that has not been thoroughly explored up until now in our field. Previous empirical studies in other sectors have identified the opinions of researchers about this topic, showing that completely unacceptable interpretations have been made of significance tests and other statistical…
ERIC Educational Resources Information Center
Norris, John M.
2015-01-01
Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
ERIC Educational Resources Information Center
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
A Review of Post-1994 Literature on Whether Statistical Significance Tests Should Be Banned.
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
This paper summarizes the literature regarding statistical significance testing with an emphasis on: (1) the post-1994 literature in various disciplines; (2) alternatives to statistical significance testing; and (3) literature exploring why researchers have demonstrably failed to be influenced by the 1994 American Psychological Association…
The Historical Growth of Statistical Significance Testing in Psychology--and Its Future Prospects.
ERIC Educational Resources Information Center
Hubbard, Raymond; Ryan, Patricia A.
2000-01-01
Examined the historical growth in the popularity of statistical significance testing using a random sample of data from 12 American Psychological Association journals. Results replicate and extend findings from a study that used only one such journal. Discusses the role of statistical significance testing and the use of replication and…
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
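The proposed procedure, a distance statistic between summary histograms plus a bootstrap significance level, can be sketched as follows. Synthetic samples replace the cloud-object footprint data, and the Euclidean distance (the statistic the study found most suitable) is used:

```python
import numpy as np

rng = np.random.default_rng(3)

def hist_distance(a, b, bins):
    """Euclidean distance between two normalized summary histograms."""
    ha, _ = np.histogram(a, bins=bins)
    hb, _ = np.histogram(b, bins=bins)
    ha = ha / ha.sum()
    hb = hb / hb.sum()
    return np.sqrt(((ha - hb) ** 2).sum())

def bootstrap_pvalue(a, b, bins, n_boot=2000):
    """P(distance >= observed) under the null that a and b share one distribution."""
    observed = hist_distance(a, b, bins)
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_boot):
        ra = rng.choice(pooled, size=len(a), replace=True)
        rb = rng.choice(pooled, size=len(b), replace=True)
        if hist_distance(ra, rb, bins) >= observed:
            count += 1
    return count / n_boot

bins = np.linspace(-4, 4, 21)
same = bootstrap_pvalue(rng.normal(0, 1, 400), rng.normal(0, 1, 400), bins)
diff = bootstrap_pvalue(rng.normal(0, 1, 400), rng.normal(0.8, 1, 400), bins)
print("same distribution p:", same, "  shifted distribution p:", diff)
```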
NASA Astrophysics Data System (ADS)
Vermeesch, Pieter
2011-02-01
In my Eos Forum of 24 November 2009 (90(47), 443), I used the chi-square test to reject the null hypothesis that earthquakes occur independently of the weekday to make the point that statistical significance should not be confused with geological significance. Of the five comments on my article, only the one by Sornette and Pisarenko [2011] disputes this conclusion, while the remaining comments take issue with certain aspects of the geophysical case study. In this reply I will address all of these points, after providing some necessary further background about statistical tests. Two types of error can result from a hypothesis test. A Type I error occurs when a true null hypothesis is erroneously rejected by chance. A Type II error occurs when a false null hypothesis is erroneously accepted by chance. By definition, the p value is the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one observed. In other words, the smaller the p value, the lower the probability that a Type I error has been made. In light of the exceedingly small p value of the earthquake data set, Tseng and Chen's [2011] assertion that a Type I error has been committed is clearly wrong. How about Type II errors?
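The chi-square test at the centre of the exchange is one line in SciPy. The weekday counts below are invented purely to illustrate the mechanics, not the catalogue used in the Forum piece:

```python
from scipy import stats

# Hypothetical earthquake counts by weekday (Mon..Sun); the implausible
# Saturday excess is made up for illustration.
counts = [520, 498, 505, 510, 530, 2010, 495]

# Chi-square test of the null that events are uniform over weekdays.
result = stats.chisquare(counts)
print(result.statistic, result.pvalue)

# A tiny p-value bounds the Type I error risk, but says nothing about
# whether the departure from uniformity is *geologically* meaningful.
```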
Chládek, J; Brázdil, M; Halámek, J; Plešinger, F; Jurák, P
2013-01-01
We present an off-line analysis procedure for exploring brain activity recorded from intra-cerebral electroencephalographic data (SEEG). The objective is to determine the statistical differences between different types of stimulations in the time-frequency domain. The procedure is based on computing relative signal power change and subsequent statistical analysis. An example of characteristic statistically significant event-related de/synchronization (ERD/ERS) detected across different frequency bands following different oddball stimuli is presented. The method is used for off-line functional classification of different brain areas.
Coulson, Melissa; Healey, Michelle; Fidler, Fiona; Cumming, Geoff
2010-01-01
A statistically significant result, and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST), or confidence intervals (CIs). Authors of articles published in psychology, behavioral neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant, and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs respondents who mentioned NHST were 60% likely to conclude, unjustifiably, the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs but, for full benefit, such highly desirable statistical reform requires also that researchers interpret CIs without recourse to NHST.
PMID:21607077
ERIC Educational Resources Information Center
Thompson, Bruce
This paper evaluates the logic underlying various criticisms of statistical significance testing and makes specific recommendations for scientific and editorial practice that might better increase the knowledge base. Reliance on the traditional hypothesis testing model has led to a major bias against nonsignificant results and to misinterpretation…
ERIC Educational Resources Information Center
Snyder, Patricia; Lawson, Stephen
Magnitude of effect measures (MEMs), when adequately understood and correctly used, are important aids for researchers who do not want to rely solely on tests of statistical significance in substantive result interpretation. The MEM tells how much of the dependent variable can be controlled, predicted, or explained by the independent variables.…
Alphas and Asterisks: The Development of Statistical Significance Testing Standards in Sociology
ERIC Educational Resources Information Center
Leahey, Erin
2005-01-01
In this paper, I trace the development of statistical significance testing standards in sociology by analyzing data from articles published in two prestigious sociology journals between 1935 and 2000. I focus on the role of two key elements in the diffusion literature, contagion and rationality, as well as the role of institutional factors. I…
Statistical Significance of the Trends in Monthly Heavy Precipitation Over the US
Mahajan, Salil; North, Dr. Gerald R.; Saravanan, Dr. R.; Genton, Dr. Marc G.
2012-01-01
Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall's {tau} test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upward trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate system are found to dominate monthly heavy precipitation, as some GCM simulations show a downward trend even in the twenty-first century projections when the greenhouse gas forcings are strong.
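A hedged sketch of the non-parametric side of this testing strategy: an observed least-squares trend is compared against a Monte Carlo null distribution built from shuffled (trend-free) series, alongside the traditional Kendall's tau test. The precipitation series is synthetic, with an imposed trend for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Invented annual heavy-precipitation index with an imposed upward trend.
years = np.arange(1950, 2000)
x = rng.gamma(shape=2.0, scale=10.0, size=len(years)) + 0.8 * (years - 1950)

# Traditional non-parametric check: Kendall's tau.
tau, p_tau = stats.kendalltau(years, x)

# Monte Carlo null: least-squares slopes of shuffled series.
observed_slope = np.polyfit(years, x, 1)[0]
null_slopes = np.array(
    [np.polyfit(years, rng.permutation(x), 1)[0] for _ in range(2000)]
)
p_mc = np.mean(np.abs(null_slopes) >= abs(observed_slope))
print("Kendall p:", p_tau, "  Monte Carlo p:", p_mc)
```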
Weighing the costs of different errors when determining statistical significance during monitoring
Technology Transfer Automated Retrieval System (TEKTRAN)
Selecting appropriate significance levels when constructing confidence intervals and performing statistical analyses with rangeland monitoring data is not a straightforward process. This process is burdened by the conventional selection of “95% confidence” (i.e., Type I error rate, α = 0.05) as the d...
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
ERIC Educational Resources Information Center
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate…
Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.
ERIC Educational Resources Information Center
Deegear, James
This paper summarizes the literature regarding statistical significance testing with an emphasis on recent literature in various disciplines and literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…
ERIC Educational Resources Information Center
Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.
2011-01-01
In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…
ERIC Educational Resources Information Center
Spinella, Sarah
2011-01-01
As result replicability is essential to science and difficult to achieve through external replication, the present paper notes the insufficiency of null hypothesis statistical significance testing (NHSST) and explains the bootstrap as a plausible alternative, with a heuristic example to illustrate the bootstrap method. The bootstrap relies on…
2010-01-01
Background The null hypothesis significance test (NHST) is the most frequently used statistical method, although its inferential validity has been widely criticized since its introduction. In 1988, the International Committee of Medical Journal Editors (ICMJE) warned against sole reliance on NHST to substantiate study conclusions and suggested supplementary use of confidence intervals (CI). Our objective was to evaluate the extent and quality in the use of NHST and CI, both in English and Spanish language biomedical publications between 1995 and 2006, taking into account the International Committee of Medical Journal Editors recommendations, with particular focus on the accuracy of the interpretation of statistical significance and the validity of conclusions. Methods Original articles published in three English and three Spanish biomedical journals in three fields (General Medicine, Clinical Specialties and Epidemiology - Public Health) were considered for this study. Papers published in 1995-1996, 2000-2001, and 2005-2006 were selected through a systematic sampling method. After excluding the purely descriptive and theoretical articles, analytic studies were evaluated for their use of NHST with P-values and/or CI for interpretation of statistical "significance" and "relevance" in study conclusions. Results Among 1,043 original papers, 874 were selected for detailed review. The exclusive use of P-values was less frequent in English language publications as well as in Public Health journals; overall such use decreased from 41% in 1995-1996 to 21% in 2005-2006. While the use of CI increased over time, the "significance fallacy" (to equate statistical and substantive significance) appeared very often, mainly in journals devoted to clinical specialties (81%). In papers originally written in English and Spanish, 15% and 10%, respectively, mentioned statistical significance in their conclusions. Conclusions Overall, results of our review show some improvements in
Discrete Fourier Transform: statistical effect size and significance of Fourier components.
NASA Astrophysics Data System (ADS)
Crockett, Robin
2016-04-01
A key analytical technique in the context of investigating cyclic/periodic features in time-series (and other sequential data) is the Discrete (Fast) Fourier Transform (DFT/FFT). However, assessment of the statistical effect-size and significance of the Fourier components in the DFT/FFT spectrum can be subjective and variable. This presentation will outline an approach and method for the statistical evaluation of the effect-size and significance of individual Fourier components from their DFT/FFT coefficients. The effect size is determined in terms of the proportions of the variance in the time-series that individual components account for. The statistical significance is determined using an hypothesis-test / p-value approach with respect to a null hypothesis that the time-series has no linear dependence on a given frequency (of a Fourier component). This approach also allows spectrograms to be presented in terms of these statistical parameters. The presentation will use sunspot cycles as an illustrative example.
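The proposed evaluation can be sketched directly: the proportion of variance carried by each Fourier component serves as its effect size, and for a demeaned series at an exact DFT frequency it equals the R² of regressing the series on the cos/sin pair at that frequency, giving an F-test p-value against a white-noise null. A toy example with a 120-sample cycle (this is one plausible reconstruction of the approach, not necessarily the presenter's exact method):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Synthetic "sunspot-like" series: a 120-sample cycle buried in white noise.
n = 600
t = np.arange(n)
x = np.sin(2 * np.pi * t / 120) + rng.normal(0.0, 1.0, n)
x = x - x.mean()

X = np.fft.fft(x)
power = np.abs(X) ** 2
total = power[1:].sum()          # Parseval: equals n * sum(x**2) for demeaned x

k = np.arange(1, n // 2)         # positive frequencies below Nyquist
prop = 2.0 * power[k] / total    # effect size: share of variance per component
                                 # (factor 2 pairs each bin with its conjugate)

# Significance: prop is the R^2 of regressing x on the cos/sin pair at bin k,
# so an F-test with (2, n - 3) degrees of freedom applies under a white null.
r2 = np.clip(prop, 0.0, 1.0 - 1e-12)
f_stat = (r2 / 2.0) / ((1.0 - r2) / (n - 3))
p_val = stats.f.sf(f_stat, 2, n - 3)

i = np.argmax(prop)
print("peak bin:", k[i], "period:", n / k[i], "variance share:", round(prop[i], 3))
```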
Evidence for tt̄ production at the Tevatron: Statistical significance and cross section
Koningsberg, J.; CDF Collaboration
1994-09-01
We summarize here the results of the "counting experiments" by the CDF Collaboration in the search for tt̄ production in pp̄ collisions at √s = 1.8 TeV at the Tevatron. We analyze their statistical significance by calculating the probability that the observed excess is a fluctuation of the expected backgrounds and, assuming the excess is from top events, extract a measurement of the tt̄ production cross section.
Massage induces an immediate, albeit short-term, reduction in muscle stiffness.
Eriksson Crommert, M; Lacourpaille, L; Heales, L J; Tucker, K; Hug, F
2015-10-01
Using ultrasound shear wave elastography, the aims of this study were: (a) to evaluate the effect of massage on stiffness of the medial gastrocnemius (MG) muscle and (b) to determine whether this effect (if any) persists over a short period of rest. A 7-min massage protocol was performed unilaterally on MG in 18 healthy volunteers. Measurements of muscle shear elastic modulus (stiffness) were performed bilaterally (control and massaged leg) in a moderately stretched position at three time points: before massage (baseline), directly after massage (follow-up 1), and following 3 min of rest (follow-up 2). Directly after massage, participants rated pain experienced during the massage. MG shear elastic modulus of the massaged leg decreased significantly at follow-up 1 (-5.2 ± 8.8%, P = 0.019, d = -0.66). There was no difference between follow-up 2 and baseline for the massaged leg (P = 0.83) indicating that muscle stiffness returned to baseline values. Shear elastic modulus was not different between time points in the control leg. There was no association between perceived pain during the massage and stiffness reduction (r = 0.035; P = 0.89). This is the first study to provide evidence that massage reduces muscle stiffness. However, this effect is short lived and returns to baseline values quickly after cessation of the massage.
NASA Astrophysics Data System (ADS)
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Suffredini, Anthony F.; Sacks, David B.; Yu, Yi-Kuo
2016-02-01
Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Statistical significance of the rich-club phenomenon in complex networks
NASA Astrophysics Data System (ADS)
Jiang, Zhi-Qiang; Zhou, Wei-Xing
2008-04-01
We propose that the rich-club phenomenon in complex networks should be defined in the spirit of bootstrapping, in which a null model is adopted to assess the statistical significance of the rich-club detected. Our method can serve as a definition of the rich-club phenomenon and is applied to analyze three real networks and three model networks. The results show significant improvement compared with previously reported results. We report a dilemma with an exceptional example, showing that there does not exist an omnipotent definition for the rich-club phenomenon.
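NetworkX implements exactly this style of null-model assessment: `rich_club_coefficient(..., normalized=True)` divides the observed coefficient by that of a degree-preserving randomized network. A small example on a scale-free graph (graph size, seed, and the degree threshold are arbitrary choices for illustration):

```python
import networkx as nx

# Scale-free test graph.
G = nx.barabasi_albert_graph(300, 3, seed=42)

# Raw rich-club coefficient phi(k), and the version normalized against a
# degree-preserving randomized network, a null model in the spirit of the
# bootstrapping definition proposed in the paper.
raw = nx.rich_club_coefficient(G, normalized=False)
norm = nx.rich_club_coefficient(G, normalized=True, Q=100, seed=42)

k = 10
print(f"phi({k}) = {raw[k]:.3f}, phi({k})/phi_rand({k}) = {norm[k]:.3f}")
# Normalized values well above 1 at high k suggest a genuine rich club
# rather than an artifact of the degree distribution alone.
```

Averaging over several independent rewirings, rather than the single null network NetworkX uses, would track the paper's bootstrapping spirit even more closely.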
NASA Astrophysics Data System (ADS)
Eggert, Silke; Walter, Thomas R.
2009-06-01
The study of volcanic triggering and interaction with the tectonic surroundings has received special attention in recent years, using both direct field observations and historical descriptions of eruptions and earthquake activity. Repeated reports of clustered eruptions and earthquakes may imply that interaction is important in some subregions. However, the subregions likely to suffer such clusters have not been systematically identified, and the processes responsible for the observed interaction remain unclear. We first review previous works about the clustered occurrence of eruptions and earthquakes, and describe selected events. We further elaborate available databases and confirm a statistically significant relationship between volcanic eruptions and earthquakes on the global scale. Moreover, our study implies that closed volcanic systems in particular tend to be activated in association with a tectonic earthquake trigger. We then perform a statistical study at the subregional level, showing that certain subregions are especially predisposed to concurrent eruption-earthquake sequences, whereas such clustering is statistically less significant in other subregions. Based on this study, we argue that individual and selected observations may bias the perceptible weight of coupling. The activity at volcanoes located in the predisposed subregions (e.g., Japan, Indonesia, Melanesia), however, often unexpectedly changes in association with either an imminent or a past earthquake.
Zou, Fei; Fine, Jason P.; Hu, Jianhua; Lin, D. Y.
2004-01-01
Assessing genome-wide statistical significance is an important and difficult problem in multipoint linkage analysis. Due to multiple tests on the same genome, the usual pointwise significance level based on the chi-square approximation is inappropriate. Permutation is widely used to determine genome-wide significance. Theoretical approximations are available for simple experimental crosses. In this article, we propose a resampling procedure to assess the significance of genome-wide QTL mapping for experimental crosses. The proposed method is computationally much less intensive than the permutation procedure (by a factor on the order of 10² or higher) and is applicable to complex breeding designs and sophisticated genetic models that cannot be handled by the permutation and theoretical methods. The usefulness of the proposed method is demonstrated through simulation studies and an application to a Drosophila backcross. PMID:15611194
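The permutation baseline the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the paper's resampling procedure: the cross layout, marker count, effect size, and statistic are all invented for the example. The key idea is that shuffling phenotypes preserves the correlation among markers while breaking any genotype-phenotype link, so the null distribution of the genome-wide maximum statistic accounts for the multiple tests.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical backcross: n individuals, m markers, one true QTL at marker 10.
n, m = 200, 50
genotypes = rng.integers(0, 2, size=(n, m))          # 0/1 genotype codes
phenotype = 0.5 * genotypes[:, 10] + rng.normal(size=n)

def genomewide_max_t(pheno, geno):
    """Maximum over markers of the absolute two-group t-like statistic."""
    best = 0.0
    for j in range(geno.shape[1]):
        g = geno[:, j]
        a, b = pheno[g == 0], pheno[g == 1]
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        best = max(best, abs(a.mean() - b.mean()) / se)
    return best

observed = genomewide_max_t(phenotype, genotypes)

# Permutation null: shuffling phenotypes breaks any genotype-phenotype
# link while preserving the correlation structure among markers.
n_perm = 200
null = np.array([genomewide_max_t(rng.permutation(phenotype), genotypes)
                 for _ in range(n_perm)])
p_genomewide = (1 + np.sum(null >= observed)) / (n_perm + 1)
```

The cost driver is clear from the loop: every permutation rescans the whole genome, which is what the paper's resampling procedure avoids.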
On the statistical significance of surface air temperature trends in the Eurasian Arctic region
NASA Astrophysics Data System (ADS)
Franzke, C.
2012-12-01
This study investigates the statistical significance of the trends of station temperature time series from the European Climate Assessment & Data archive poleward of 60°N. The trends are identified by different methods and their significance is assessed by three different null models of climate noise. All stations show a warming trend but only 17 out of the 109 considered stations have trends which cannot be explained as arising from intrinsic climate fluctuations when tested against any of the three null models. Out of those 17, only one station exhibits a warming trend which is significant against all three null models. The stations with significant warming trends are located mainly in Scandinavia and Iceland.
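The logic of testing a trend against a null model of climate noise can be illustrated with a Monte Carlo sketch. All specifics here (an AR(1) null, the persistence parameter, series length, imposed trend) are illustrative assumptions, not the paper's models: the point is that an observed slope is judged against slopes that trend-free autocorrelated noise produces by itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def ar1_surrogate(n, phi, sigma, rng):
    """Trend-free AR(1) 'climate noise' with lag-1 autocorrelation phi."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(scale=sigma)
    return x

def trend(y):
    """Least-squares slope of y against time."""
    return np.polyfit(np.arange(len(y)), y, 1)[0]

# Toy station series: AR(1) noise plus an imposed warming trend.
n = 100
series = ar1_surrogate(n, 0.6, 1.0, rng) + 0.02 * np.arange(n)
obs_slope = trend(series)

# Null distribution of slopes from trend-free surrogates: the trend is
# significant only if intrinsic fluctuations rarely produce a slope as large.
null_slopes = np.array([trend(ar1_surrogate(n, 0.6, 1.0, rng))
                        for _ in range(500)])
p = (1 + np.sum(np.abs(null_slopes) >= abs(obs_slope))) / 501
```

A more persistent null model (larger phi, or long-range dependence) widens the null slope distribution, which is why most station trends in the study fail significance against at least one null model.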
Significance probability mapping: the final touch in t-statistic mapping.
Hassainia, F; Petit, D; Montplaisir, J
1994-01-01
Significance Probability Mapping (SPM), based on Student's t-statistic, is widely used for comparing mean brain topography maps of two groups. The map resulting from this process represents the distribution of t-values over the entire scalp. However, t-values by themselves cannot reveal whether or not group differences are significant. Significance levels associated with a few t-values are therefore commonly indicated on map legends to give the reader an idea of the significance levels of t-values. Nevertheless, a precise significance level topography cannot be achieved with these few significance values. We introduce a new kind of map which directly displays significance level topography in order to relieve the reader from converting multiple t-values to their corresponding significance probabilities, and to obtain a good quantification and a better localization of regions with significant differences between groups. As an illustration of this type of map, we present a comparison of EEG activity in Alzheimer's patients and age-matched control subjects for both wakefulness and REM sleep.
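The conversion from a t-statistic map to the significance-probability map described above can be sketched in a few lines. The group sizes, electrode count, and effect size are invented for the example; only the t-to-p conversion step reflects the idea in the abstract.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical topographies: one value per subject and electrode.
n_patients, n_controls, n_electrodes = 12, 12, 19
patients = rng.normal(0.5, 1.0, size=(n_patients, n_electrodes))
controls = rng.normal(0.0, 1.0, size=(n_controls, n_electrodes))

# Per-electrode two-sample t-values (the conventional t-statistic map) ...
t_map, p_map = stats.ttest_ind(patients, controls, axis=0)

# ... where p_map is the significance-probability map: displaying p-values
# directly spares the reader from converting a handful of legend t-values
# into significance levels by eye.
```

Interpolating `p_map` over the scalp, rather than `t_map`, yields the map type the paper proposes.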
How to get statistically significant effects in any ERP experiment (and why you shouldn't).
Luck, Steven J; Gaspelin, Nicholas
2017-01-01
ERP experiments generate massive datasets, often containing thousands of values for each participant, even after averaging. The richness of these datasets can be very useful in testing sophisticated hypotheses, but this richness also creates many opportunities to obtain effects that are statistically significant but do not reflect true differences among groups or conditions (bogus effects). The purpose of this paper is to demonstrate how common and seemingly innocuous methods for quantifying and analyzing ERP effects can lead to very high rates of significant but bogus effects, with the likelihood of obtaining at least one such bogus effect exceeding 50% in many experiments. We focus on two specific problems: using the grand-averaged data to select the time windows and electrode sites for quantifying component amplitudes and latencies, and using one or more multifactor statistical analyses. Reanalyses of prior data and simulations of typical experimental designs are used to show how these problems can greatly increase the likelihood of significant but bogus results. Several strategies are described for avoiding these problems and for increasing the likelihood that significant effects actually reflect true differences among groups or conditions.
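The window-selection problem described above is easy to demonstrate by simulation. The design below (subject count, epoch length, window width) is invented for illustration, but the mechanism matches the paper's first problem: choosing the measurement window from the grand average of the very data being tested.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def one_null_experiment(rng, n_subj=20, n_time=100, win=10):
    """Two conditions with NO true difference; pick the window where the
    grand-average difference is largest, then test that window."""
    a = rng.normal(size=(n_subj, n_time))
    b = rng.normal(size=(n_subj, n_time))
    grand_diff = (a - b).mean(axis=0)
    # Data-driven window selection (the problematic step):
    k = np.abs(np.convolve(grand_diff, np.ones(win), "valid")).argmax()
    amp_a = a[:, k:k + win].mean(axis=1)
    amp_b = b[:, k:k + win].mean(axis=1)
    return stats.ttest_rel(amp_a, amp_b).pvalue

pvals = np.array([one_null_experiment(rng) for _ in range(200)])
false_pos_rate = np.mean(pvals < 0.05)
# The rate lands well above the nominal 5% even though no effect exists,
# because the window was tuned to the noise being tested.
```

Fixing the window a priori, or on independent data, restores the nominal error rate.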
Statistical significance estimation of a signal within the GooFit framework on GPUs
NASA Astrophysics Data System (ADS)
Cristella, Leonardo; Di Florio, Adriano; Pompili, Alexis
2017-03-01
To test the computing capabilities of GPUs with respect to traditional CPU cores, a high-statistics toy Monte Carlo technique has been implemented in both the ROOT/RooFit and GooFit frameworks to estimate the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B+ → J/ψϕK+. GooFit is an open-source data analysis tool under development that interfaces ROOT/RooFit to the CUDA platform on NVIDIA GPUs. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides a striking speed-up with respect to the RooFit application parallelised on multiple CPUs by means of the PROOF-Lite tool. The resulting speed-up, evident when comparing concurrent GooFit processes allowed by the CUDA Multi-Process Service with a RooFit/PROOF-Lite process using multiple CPU workers, is presented and discussed in detail. GooFit has also made it possible to explore the behaviour of a likelihood-ratio test statistic in situations where the Wilks theorem may or may not apply because its regularity conditions are not satisfied.
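When the regularity conditions do hold, the likelihood-ratio statistic converts to a significance without any Monte Carlo at all, which is the baseline the toy studies are checked against. The statistic value below is hypothetical, purely for illustration:

```python
from scipy import stats

# Hypothetical observed test statistic q0 = -2 ln(L_bkg / L_sig+bkg)
# comparing a background-only fit with a signal-plus-background fit.
q0 = 29.8

# If Wilks' theorem applies (here: one extra free parameter, regularity
# conditions satisfied), q0 follows a chi-squared distribution with 1
# degree of freedom under the null hypothesis; otherwise the null
# distribution must be built with the toy Monte Carlo approach described
# in the abstract.
p_value = stats.chi2.sf(q0, df=1)
significance_z = stats.norm.isf(p_value)   # one-sided Gaussian sigmas
```

The expensive part in practice is exactly the fallback case: generating and fitting enough background-only toys to resolve tail probabilities at the 5-sigma level, which is what motivates running the toys on GPUs.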
Tables of Significance Points for the Variance-Weighted Kolmogorov-Smirnov Statistics.
1981-02-19
Niederhausen, Heinrich. Technical Report No. 298, prepared under Contract N00014-76-C-0475 (NR-042-267) for the Office of Naval Research.
NASA Astrophysics Data System (ADS)
Eggert, S.; Walter, T. R.
2009-04-01
The study of volcanic triggering and coupling to the tectonic surroundings has received special attention in recent years, using both direct field observations and historical descriptions of eruptions and earthquake activity. Repeated reports of volcano-earthquake interactions in, e.g., Europe and Japan may imply that clustered occurrence is important in some regions. However, the regions likely to suffer clustered eruption-earthquake activity have not been systematically identified, and the processes responsible for the observed interaction are debated. We first review previous work on the correlation of volcanic eruptions and earthquakes and describe selected local clustered events. Following an overview of previous statistical studies, we further analyze the databases of correlated eruptions and earthquakes from a global perspective. Since we can confirm a relationship between volcanic eruptions and earthquakes at the global scale, we then perform a statistical study at the regional level, showing that time and distance between events follow a linear relationship. Before an earthquake, a period of volcanic silence often occurs, whereas afterwards an increase in volcanic activity is evident. Our statistical tests imply that certain regions, e.g., Japan, are especially predisposed to concurrent eruption-earthquake pairs, whereas such pairing is statistically less significant in other regions, such as Europe. Based on this study, we argue that individual and selected observations may bias the perceived strength of coupling. Nevertheless, volcanoes located in the predisposed regions (e.g., Japan, Indonesia, Melanesia) have indeed often changed unexpectedly in association with either an imminent or a past earthquake.
RT-PSM, a real-time program for peptide-spectrum matching with statistical significance.
Wu, Fang-Xiang; Gagné, Pierre; Droit, Arnaud; Poirier, Guy G
2006-01-01
The analysis of complex biological peptide mixtures by tandem mass spectrometry (MS/MS) produces a huge body of collision-induced dissociation (CID) MS/MS spectra. Several methods have been developed for identifying peptide-spectrum matches (PSMs) by assigning MS/MS spectra to peptides in a database. However, most of these methods either do not give the statistical significance of PSMs (e.g., SEQUEST) or employ time-consuming computational methods to estimate the statistical significance (e.g., PeptideProphet). In this paper, we describe a new algorithm, RT-PSM, which can be used to identify PSMs and estimate their accuracy statistically in real time. RT-PSM first computes PSM scores between an MS/MS spectrum and a set of candidate peptides whose masses are within a preset tolerance of the MS/MS precursor ion mass. Then the computed PSM scores of all candidate peptides are employed to fit the expectation value distribution of the scores into a second-degree polynomial function in PSM score. The statistical significance of the best PSM is estimated by extrapolating the fitting polynomial function to the best PSM score. RT-PSM was tested on two pairs of MS/MS spectrum datasets and protein databases to investigate its performance. The MS/MS spectra were acquired using an ion trap mass spectrometer equipped with a nano-electrospray ionization source. The results show that RT-PSM has good sensitivity and specificity. Using a 55,577-entry protein database and running on a standard Pentium-4, 2.8-GHz CPU personal computer, RT-PSM can process peptide spectra on a sequential, one-by-one basis in 0.047 s on average, compared to more than 7 s per spectrum on average for Sequest and X!Tandem, in their current batch-mode processing implementations. RT-PSM is clearly shown to be fast enough for real-time PSM assignment of MS/MS spectra generated every 3 s or so by a 3D ion trap or by a QqTOF instrument.
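The fit-and-extrapolate step at the heart of RT-PSM can be sketched roughly as follows. This is a loose approximation, not the published algorithm: the score distribution, the rank-based empirical expectation values, and the choice of which candidates to exclude from the fit are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical PSM scores of all candidate peptides for one spectrum
# (Gumbel-like, as maxima over random matches tend to be), best first.
scores = np.sort(rng.gumbel(loc=10.0, scale=2.0, size=500))[::-1]
best_score = scores[0]

# The empirical expectation value of the i-th best score is its rank i:
# the number of candidates scoring at least that well.  Fit log10(E) as
# a second-degree polynomial in the PSM score over the bulk of
# (presumably random) candidates, excluding the top hits ...
bulk_scores = scores[10:]
bulk_ranks = np.arange(11, len(scores) + 1)
coeffs = np.polyfit(bulk_scores, np.log10(bulk_ranks), deg=2)

# ... then extrapolate the fitted polynomial to the best PSM score to
# estimate its statistical significance (expectation value).
e_value_best = 10 ** np.polyval(coeffs, best_score)
```

Because the fit reuses scores already computed during database search, the significance estimate adds almost no cost per spectrum, which is what makes the real-time claim plausible.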
Jefferson, L; Cooper, E; Hewitt, C; Torgerson, T; Cook, L; Tharmanathan, P; Cockayne, S; Torgerson, D
2016-01-01
Objective Time-lag from study completion to publication is a potential source of publication bias in randomised controlled trials. This study sought to update the evidence base by identifying the effect of the statistical significance of research findings on time to publication of trial results. Design Literature searches were carried out in four general medical journals from June 2013 to June 2014 inclusive (BMJ, JAMA, the Lancet and the New England Journal of Medicine). Setting Methodological review of four general medical journals. Participants Original research articles presenting the primary analyses from phase 2, 3 and 4 parallel-group randomised controlled trials were included. Main outcome measures Time from trial completion to publication. Results The median time from trial completion to publication was 431 days (n = 208, interquartile range 278–618). A multivariable adjusted Cox model found no statistically significant difference in time to publication for trials reporting positive or negative results (hazard ratio: 0.86, 95% CI 0.64 to 1.16, p = 0.32). Conclusion In contrast to previous studies, this review did not demonstrate the presence of time-lag bias in time to publication. This may be a result of these articles being published in four high-impact general medical journals that may be more inclined to publish rapidly, whatever the findings. Further research is needed to explore the presence of time-lag bias in lower quality studies and lower impact journals. PMID:27757242
NASA Astrophysics Data System (ADS)
Hu, Rui; Wang, Bin
2001-02-01
Finding statistically significant words in DNA and protein sequences forms the basis for many genetic studies. By applying the maximal-entropy principle, we give a systematic way to study the nonrandom occurrence of words in DNA or protein sequences. Comparison with experimental results showed that patterns of regulatory binding sites in the Saccharomyces cerevisiae (yeast) genome occur significantly often in promoter regions. We studied two correlated gene families of yeast. The method successfully extracts the experimentally verified binding sites in each family. Many putative regulatory sites in the upstream regions are proposed. The study also suggested that some regulatory sites are active in both directions, while others show directional preference.
2014-01-01
Background Most work on the topic of activity landscapes has focused on their quantitative description and visual representation, with the aim of aiding navigation of SAR. Recent developments have addressed applications such as quantifying the proportion of activity cliffs, investigating the predictive abilities of activity landscape methods and so on. However, all these publications have worked under the assumption that the activity landscape models are “real” (i.e., statistically significant). Results The current study addresses for the first time, in a quantitative manner, the significance of a landscape or individual cliffs in the landscape. In particular, we question whether the activity landscape derived from observed (experimental) activity data is different from a randomly generated landscape. To address this we used the SALI measure with six different data sets tested against one or more molecular targets. We also assessed the significance of the landscapes for single and multiple representations. Conclusions We find that non-random landscapes are data set and molecular representation dependent. For the data sets and representations used in this work, our results suggest that not all representations lead to non-random landscapes. This indicates that not all molecular representations should be used to a) interpret the SAR and b) combined to generate consensus models. Our results suggest that significance testing of activity landscape models and in particular, activity cliffs, is key, prior to the use of such models. PMID:24694189
Crow, C.J.
1985-01-01
Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
A network-based method to assess the statistical significance of mild co-regulation effects.
Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna
2013-01-01
Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis.
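The significance of a co-occurrence count can be illustrated with a much simpler null model than the one SICORE uses. Under a fixed-margin null (both nodes keep their degree, neighbours drawn uniformly), the number of common neighbours is hypergeometric; the numbers below are invented, and SICORE's actual null model is more refined, so treat this only as a stand-in for the idea:

```python
from scipy import stats

# Bipartite toy: two miRNAs each regulate a subset of N proteins.
N = 200                 # proteins (right-side nodes)
deg_a, deg_b = 30, 40   # number of targets of miRNA a and miRNA b
shared = 15             # observed common targets (co-occurrence)

# Fixed-degree null model: if miRNA b's deg_b targets were drawn uniformly
# from the N proteins, the overlap with miRNA a's deg_a targets would be
# hypergeometric.  The p-value is the chance of at least `shared` common
# targets arising by chance (sf(k-1) gives P(X >= k)).
p = stats.hypergeom.sf(shared - 1, N, deg_a, deg_b)
```

Here the expected overlap under the null is deg_a * deg_b / N = 6, so observing 15 common targets is far in the tail; a significant co-occurrence of this kind is what the abstract ties to functional synergy.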
Statistics, Probability, Significance, Likelihood: Words Mean What We Define Them to Mean
ERIC Educational Resources Information Center
Drummond, Gordon B.; Tom, Brian D. M.
2011-01-01
Statisticians use words deliberately and specifically, but not necessarily in the way they are used colloquially. For example, in general parlance "statistics" can mean numerical information, usually data. In contrast, one large statistics textbook defines the term "statistic" to denote "a characteristic of a…
NASA Astrophysics Data System (ADS)
Kellerer-Pirklbauer, Andreas
2016-04-01
Longer data series (e.g. >10 a) of ground temperatures in alpine regions help to improve understanding of the effects of present climate change on the distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004, and more intensively since 2006, a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c. 60 ground temperature (surface and near-surface) monitoring sites located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trends during the observation period, and regional patterns of change. Several different temperature-related parameters were calculated and analyzed. At the annual scale, a region-wide statistically significant warming during the observation period was revealed by, e.g., an increase in mean annual temperature values (mean, maximum) and a significant lowering of the surface frost number (F+). At the seasonal scale, in most cases no significant trend of any temperature-related parameter was revealed for spring (MAM) or autumn (SON). Winter (DJF) shows only weak warming. In contrast, the summer (JJA) season reveals in general a significant warming, confirmed by several temperature-related parameters such as mean seasonal temperature, number of thawing degree days, number of freezing degree days, and days without night frost. On a monthly basis, August shows the statistically most robust and strongest warming of all months, although regional differences occur. Although the field data confirm general ground temperature warming in the study region during the last decade, trend analyses are complicated by temperature anomalies (e.g. the warm winter of 2006/07) or substantial variations in the winter
Sassenhagen, Jona; Alday, Phillip M
2016-11-01
Experimental research on behavior and cognition frequently rests on stimulus or subject selection where not all characteristics can be fully controlled, even when attempting strict matching. For example, when contrasting patients to controls, variables such as intelligence or socioeconomic status are often correlated with patient status. Similarly, when presenting word stimuli, variables such as word frequency are often correlated with primary variables of interest. One procedure very commonly employed to control for such nuisance effects is conducting inferential tests on confounding stimulus or subject characteristics. For example, if word length is not significantly different for two stimulus sets, they are considered matched for word length. Such a test has high error rates and is conceptually misguided. It reflects a common misunderstanding of statistical tests: interpreting significance as referring not to inference about a population parameter, but to (1) the sample in question, or (2) the practical relevance of a sample difference (so that a nonsignificant test is taken as evidence for the absence of relevant differences). We show inferential testing for assessing nuisance effects to be inappropriate both pragmatically and philosophically, present a survey showing its high prevalence, and briefly discuss an alternative in the form of regression including nuisance variables.
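Why a nonsignificant matching test fails as evidence of matching can be shown by simulation. The scenario below (group sizes, the size of the true nuisance difference) is invented for illustration: the populations genuinely differ on the nuisance variable, yet most samples pass the matching check, and the passing samples still carry the confound.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Patients differ from controls on a nuisance variable (say, years of
# education), but small samples make the matching test underpowered.
n, true_gap, n_sim = 15, 0.5, 500
passed, gaps = 0, []
for _ in range(n_sim):
    controls = rng.normal(0.0, 1.0, n)
    patients = rng.normal(true_gap, 1.0, n)
    if stats.ttest_ind(controls, patients).pvalue >= 0.05:
        passed += 1                       # declared "matched"
        gaps.append(patients.mean() - controls.mean())

share_passed = passed / n_sim
mean_gap_in_passed = float(np.mean(gaps))
# Most simulated studies pass the matching check, yet the nuisance
# difference in those "matched" samples remains clearly non-zero.
```

This is the test's conceptual failure in miniature: absence of significance reflects low power, not absence of a population difference, which is why the paper recommends modelling the nuisance variable instead.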
Cheng, Chia-Ying; Huang, Chung-Yuan; Sun, Chuen-Tsai
2008-02-01
A major task for postgenomic systems biology researchers is to systematically catalogue molecules and their interactions within living cells. Advancements in complex-network theory are being made toward uncovering organizing principles that govern cell formation and evolution, but we lack understanding of how molecules and their interactions determine how complex systems function. Molecular bridge motifs include isolated motifs that neither interact nor overlap with others, whereas brick motifs act as network foundations that play a central role in defining global topological organization. To emphasize their structural organizing and evolutionary characteristics, we define bridge motifs as consisting of weak links only and brick motifs as consisting of strong links only, then propose a method for performing two tasks simultaneously, which are as follows: 1) detecting global statistical features and local connection structures in biological networks and 2) locating functionally and statistically significant network motifs. To further understand the role of biological networks in system contexts, we examine functional and topological differences between bridge and brick motifs for predicting biological network behaviors and functions. After observing brick motif similarities between E. coli and S. cerevisiae, we note that bridge motifs differentiate C. elegans from Drosophila and sea urchin in three types of networks. Similarities (differences) in bridge and brick motifs imply similar (different) key circuit elements in the three organisms. We suggest that motif-content analyses can provide researchers with global and local data for real biological networks and assist in the search for either isolated or functionally and topologically overlapping motifs when investigating and comparing biological system functions and behaviors.
Gehrmann, Thies; Reinders, Marcel J.T.
2015-01-01
Background: With more and more genomes being sequenced, detecting synteny between genomes becomes more and more important. However, for microorganisms the genomic divergence quickly becomes large, resulting in different codon usage and shuffling of gene order and gene elements such as exons. Results: We present Proteny, a methodology to detect synteny between diverged genomes. It operates on the amino acid sequence level to be insensitive to codon usage adaptations and clusters groups of exons disregarding order to handle diversity in genomic ordering between genomes. Furthermore, Proteny assigns significance levels to the syntenic clusters such that they can be selected on statistical grounds. Finally, Proteny provides novel ways to visualize results at different scales, facilitating the exploration and interpretation of syntenic regions. We test the performance of Proteny on a standard ground truth dataset, and we illustrate the use of Proteny on two closely related genomes (two different strains of Aspergillus niger) and on two distant genomes (two species of Basidiomycota). In comparison to other tools, we find that Proteny finds clusters with more true homologies in fewer clusters that contain more genes, i.e. Proteny is able to identify a more consistent synteny. Further, we show how genome rearrangements, assembly errors, gene duplications and the conservation of specific genes can be easily studied with Proteny. Availability and implementation: Proteny is freely available at the Delft Bioinformatics Lab website http://bioinformatics.tudelft.nl/dbl/software. Contact: t.gehrmann@tudelft.nl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26116928
Wang, Bo; Shi, Zhanquan; Weber, Georg F; Kennedy, Michael A
2013-10-01
Nuclear magnetic resonance (NMR) spectroscopy-based metabonomics is of growing importance for discovery of human disease biomarkers. Identification and validation of disease biomarkers using statistical significance analysis (SSA) is critical for translation to clinical practice. SSA is performed by assessing a null hypothesis test using a derivative of the Student's t test, e.g., a Welch's t test. Choosing how to correct the significance level for rejecting null hypotheses in the case of multiple testing to maintain a constant family-wise type I error rate is a common problem in such tests. The multiple testing problem arises because the likelihood of falsely rejecting the null hypothesis, i.e., a false positive, grows as the number of tests applied to the same data set increases. Several methods have been introduced to address this problem. Bonferroni correction (BC) assumes all variables are independent and therefore sacrifices sensitivity for detecting true positives in partially dependent data sets. False discovery rate (FDR) methods are more sensitive than BC but uniformly ascribe highest stringency to lowest p value variables. Here, we introduce standard deviation step down (SDSD), which is more sensitive and appropriate than BC for partially dependent data sets. Sensitivity and type I error rate of SDSD can be adjusted based on the degree of variable dependency. SDSD generates fundamentally different profiles of critical p values compared with FDR methods potentially leading to reduced type II error rates. SDSD is increasingly sensitive for more concentrated metabolites. SDSD is demonstrated using NMR-based metabonomics data collected on three different breast cancer cell line extracts.
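The two reference procedures that SDSD is compared against, Bonferroni correction and Benjamini-Hochberg FDR, can be shown side by side on a toy p-value vector. The p-values are invented; this sketch does not implement SDSD itself, only the BC and FDR baselines the abstract discusses.

```python
import numpy as np

# Toy p-values from m metabolite tests (a few true effects, mostly null).
pvals = np.array([0.0002, 0.001, 0.004, 0.012, 0.03,
                  0.21, 0.44, 0.58, 0.76, 0.91])
m, alpha = len(pvals), 0.05

# Bonferroni: every test judged at alpha/m.  Assumes independence, hence
# sacrifices sensitivity on partially dependent metabolite data.
bonferroni_hits = pvals < alpha / m

# Benjamini-Hochberg FDR: sorted p-values compared against the
# increasingly lenient thresholds alpha * k / m; reject the k smallest,
# where k is the largest index still under its threshold.
order = np.argsort(pvals)
thresholds = alpha * np.arange(1, m + 1) / m
below = pvals[order] <= thresholds
k = below.nonzero()[0].max() + 1 if below.any() else 0
bh_hits = np.zeros(m, dtype=bool)
bh_hits[order[:k]] = True
```

On this vector Bonferroni rejects 3 nulls and BH rejects 4, illustrating the sensitivity gap; SDSD aims to sit between these by letting the stringency depend on variable dependency.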
Elçi, Alper; Polat, Rahime
2011-01-01
The main objective of this study was to statistically evaluate the significance of seasonal groundwater quality change and to provide an assessment on the spatial distribution of specific groundwater quality parameters. The studied area was the Mount Nif karstic aquifer system located in the southeast of the city of Izmir. Groundwater samples were collected at 57 sampling points in the rainy winter and dry summer seasons. Groundwater quality indicators of interest were electrical conductivity (EC), nitrate, chloride, sulfate, sodium, some heavy metals, and arsenic. Maps showing the spatial distributions and temporal changes of these parameters were created to further interpret spatial patterns and seasonal changes in groundwater quality. Furthermore, statistical tests were conducted to confirm whether the seasonal changes for each quality parameter were statistically significant. It was evident from the statistical tests that the seasonal changes in most groundwater quality parameters were statistically not significant. However, the increase in EC values and aluminum concentrations from winter to summer was found to be significant. Furthermore, a negative correlation between sampling elevation and groundwater quality was found. It was shown that with simple statistical testing, important conclusions can be drawn from limited monitoring data. It was concluded that less groundwater recharge in the dry period of the year does not always imply higher concentrations for all groundwater quality parameters because water circulation times, lithology, quality and extent of recharge, and land use patterns also play an important role on the alteration of groundwater quality.
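The kind of simple statistical test the study relies on can be sketched as a paired seasonal comparison. The well count, EC values, and the size of the seasonal shift below are invented; the abstract does not state which test was used, so a nonparametric Wilcoxon signed-rank test is assumed here as a reasonable choice for limited monitoring data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Hypothetical electrical conductivity (uS/cm) at the same 20 wells,
# sampled in the rainy winter and again in the dry summer.
winter = rng.normal(650.0, 80.0, 20)
summer = winter + rng.normal(40.0, 30.0, 20)   # modest systematic rise

# Paired nonparametric test: is the seasonal change significant, or
# consistent with sampling noise at the same wells?
statistic, p = stats.wilcoxon(winter, summer)
```

Because the same wells are sampled in both seasons, the paired design removes between-well variability, which is what lets even a small monitoring network support conclusions like the significant summer increase in EC reported above.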
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
ERIC Educational Resources Information Center
Rogosa, David
1981-01-01
The form of the Johnson-Neyman region of significance is shown to be determined by the statistic for testing the null hypothesis that the population within-group regressions are parallel. Results are obtained for both simultaneous and nonsimultaneous regions of significance. (Author)
Krumbholz, Aniko; Anielski, Patricia; Gfrerer, Lena; Graw, Matthias; Geyer, Hans; Schänzer, Wilhelm; Dvorak, Jiri; Thieme, Detlef
2014-01-01
Clenbuterol is a well-established β2-agonist, which is prohibited in sports and strictly regulated for use in the livestock industry. During the last few years, clenbuterol-positive results in doping controls and in samples from residents of or travellers from high-risk countries were suspected to be related to the illegal use of clenbuterol for livestock fattening. A sensitive liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed to detect low clenbuterol residues in hair with a detection limit of 0.02 pg/mg. A sub-therapeutic application study and a field study with volunteers who have a high risk of contamination were performed. For the application study, a total dosage of 30 µg clenbuterol was applied to 20 healthy volunteers on 5 subsequent days. One month after the beginning of the application, clenbuterol was detected in the proximal hair segment (0-1 cm) at concentrations between 0.43 and 4.76 pg/mg. For the second part, samples from 66 Mexican soccer players were analyzed. In 89% of these volunteers, clenbuterol was detectable in hair at concentrations between 0.02 and 1.90 pg/mg. A comparison of both parts showed no statistical difference between sub-therapeutic application and contamination. In contrast, discrimination from typical abuse of clenbuterol is apparently possible. These findings make it possible to evaluate results from real doping control samples.
ERIC Educational Resources Information Center
Thompson, Bruce; Snyder, Patricia A.
1998-01-01
Investigates two aspects of research analyses in quantitative research studies reported in the 1996 issues of "Journal of Counseling & Development" (JCD). Acceptable methodological practice regarding significance testing and evaluation of score reliability has evolved considerably. Contemporary thinking on these issues is described; practice as…
Key statistics related to CO2 emissions: Significant contributing countries
Kellogg, M.A.; Edmonds, J.A.; Scott, M.J.; Pomykala, J.S.
1987-07-01
This country selection task report describes and applies a methodology for identifying a set of countries responsible for significant present and anticipated future emissions of CO2 and other radiatively important gases (RIGs). The identification of countries responsible for CO2 and other RIG emissions will help determine to what extent a select number of countries might be capable of influencing future emissions. Once identified, those countries could potentially exercise cooperative collective control of global emissions and thus mitigate the associated adverse effects of those emissions. The methodology developed consists of two approaches: the resource approach and the emissions approach. While conceptually very different, both approaches yield the same fundamental conclusion: the core of any international initiative to control global emissions must include three key countries: the US, the USSR, and the People's Republic of China. It was also determined that broader control can be achieved through the inclusion of sixteen additional countries with significant contributions to worldwide emissions.
Petykhov, A B; Maev, I V; Deriabin, V E
2012-01-01
Anthropometry is a technique for obtaining the features that characterize changes in the human body in health and in pathology. For the first time in domestic medicine, a statistical analysis of anthropometric parameters (body mass, height, waist, hip, shoulder, and wrist circumferences; skinfold thicknesses over the triceps, under the shoulder blade, on the chest, on the abdomen, and over the biceps) was carried out, with calculation of indices and an assessment of possible age effects. Complexes of interrelated anthropometric characteristics were detected. Correlation coefficients (r) were computed, and factor analysis (by the principal components method with subsequent varimax rotation), covariance analysis, and discriminant analysis (using the Kaiser and Wilks criteria and the F-test) were applied. Intergroup variability of body composition was studied for individual characteristics in a group of healthy individuals (135 subjects aged 45.6 +/- 1.2 years; 56.3% men and 43.7% women) and in internal pathology: patients after gastrectomy, 121 (57.7 +/- 1.2 years; 52% men and 48% women); after Billroth operation, 214 (56.1 +/- 1.0 years; 53% men and 47% women); after enterectomy, 103 (44.5 +/- 1.8 years; 53% men and 47% women); and with protein-energy wasting of mixed genesis, 206 (29.04 +/- 1.6 years; 79% men and 21% women). The analysis identified a group of interlocking characteristics comprising the anthropometric parameters of subcutaneous fat deposition (skinfold thicknesses over the triceps and biceps, under the shoulder blade, and on the abdomen) and fatty body mass. These characteristics are interrelated with age and height and show a more pronounced dependence in women, reflecting the development of the fat component of the body when body mass index is assessed in women (unlike men). The waist-hip circumference index differs irrespective of body composition indicators, which does not allow it to be characterized in terms of truncal or
ERIC Educational Resources Information Center
Oshima, T. C.; Raju, Nambury S.; Nanda, Alice O.
2006-01-01
A new item parameter replication method is proposed for assessing the statistical significance of the noncompensatory differential item functioning (NCDIF) index associated with the differential functioning of items and tests framework. In this new method, a cutoff score for each item is determined by obtaining a (1 - alpha) percentile rank score…
Adams, James; Kruger, Uwe; Geis, Elizabeth; Gehn, Eva; Fimbres, Valeria; Pollard, Elena; Mitchell, Jessica; Ingram, Julie; Hellmers, Robert; Quig, David; Hahn, Juergen
2017-01-01
Introduction A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum and neurotypical participants based upon the UTM data alone. The analysis also included examining the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. “Leave-one-out” cross-validation was used to ensure statistical independence of results. Results and Discussion Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, but nonlinear multivariate statistical analysis significantly improved ASD classification with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods determined a significantly stronger association between the behavioral measures and toxic metal excretion. The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate
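The leave-one-out validation described above can be sketched with a toy classifier. Everything below is an illustrative assumption (the data, and a nearest-centroid rule standing in for the paper's nonlinear multivariate method); only the cross-validation structure mirrors the abstract.

```python
import math

def nearest_centroid_predict(train, labels, x):
    # class centroid = mean point of each class; assign x to the nearest one
    cents = {}
    for lab in set(labels):
        pts = [p for p, l in zip(train, labels) if l == lab]
        cents[lab] = [sum(coord) / len(pts) for coord in zip(*pts)]
    return min(cents, key=lambda lab: math.dist(cents[lab], x))

def loo_accuracy(data, labels):
    # each sample is predicted by a model fit on all *other* samples,
    # keeping the prediction statistically independent of the held-out point
    hits = 0
    for i in range(len(data)):
        pred = nearest_centroid_predict(data[:i] + data[i + 1:],
                                        labels[:i] + labels[i + 1:], data[i])
        hits += pred == labels[i]
    return hits / len(data)

# invented 2-feature "excretion profiles" for two small groups
asd = [(3.1, 2.0), (2.9, 2.2), (3.3, 1.9)]
ctrl = [(1.0, 0.9), (1.2, 1.1), (0.8, 1.0)]
acc = loo_accuracy(asd + ctrl, ["ASD"] * 3 + ["NT"] * 3)
```

With such well-separated toy clusters every held-out point is classified correctly; real excretion data, as the abstract notes, show far more overlap.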
Potts, T.T.; Hylko, J.M.; Almond, D.
2007-07-01
A company's overall safety program becomes an important consideration to continue performing work and for procuring future contract awards. When injuries or accidents occur, the employer ultimately loses on two counts - increased medical costs and employee absences. This paper summarizes the human and organizational components that contributed to successful safety programs implemented by WESKEM, LLC's Environmental, Safety, and Health Departments located in Paducah, Kentucky, and Oak Ridge, Tennessee. The philosophy of 'safety, compliance, and then production' and programmatic components implemented at the start of the contracts were qualitatively identified as contributing factors resulting in a significant accumulation of safe work hours and an Experience Modification Rate (EMR) of <1.0. Furthermore, a study by the Associated General Contractors of America quantitatively validated components, already found in the WESKEM, LLC programs, as contributing factors to prevent employee accidents and injuries. Therefore, an investment in the human and organizational components now can pay dividends later by reducing the EMR, which is the key to reducing Workers' Compensation premiums. Also, knowing your employees' demographics and taking an active approach to evaluate and prevent fatigue may help employees balance work and non-work responsibilities. In turn, this approach can assist employers in maintaining a healthy and productive workforce. For these reasons, it is essential that safety needs be considered as the starting point when performing work. (authors)
Meta-analysis using effect size distributions of only statistically significant studies.
van Assen, Marcel A L M; van Aert, Robbie C M; Wicherts, Jelte M
2015-09-01
Publication bias threatens the validity of meta-analytic results and leads to overestimation of the effect size in traditional meta-analysis. This particularly applies to meta-analyses that feature small studies, which are ubiquitous in psychology. Here we develop a new method for meta-analysis that deals with publication bias. This method, p-uniform, enables (a) testing of publication bias, (b) effect size estimation, and (c) testing of the null-hypothesis of no effect. No current method for meta-analysis possesses all 3 qualities. Application of p-uniform is straightforward because no additional data on missing studies are needed and no sophisticated assumptions or choices need to be made before applying it. Simulations show that p-uniform generally outperforms the trim-and-fill method and the test of excess significance (TES; Ioannidis & Trikalinos, 2007b) if publication bias exists and population effect size is homogeneous or heterogeneity is slight. For illustration, p-uniform and other publication bias analyses are applied to the meta-analysis of McCall and Carriger (1993) examining the association between infants' habituation to a stimulus and their later cognitive ability (IQ). We conclude that p-uniform is a valuable technique for examining publication bias and estimating population effects in fixed-effect meta-analyses, and as sensitivity analysis to draw inferences about publication bias.
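A much-simplified sketch of the intuition behind p-uniform, under loudly labeled assumptions: if the population effect is zero, the p-values of statistically significant studies are uniformly distributed on (0, alpha), so rescaling them by alpha and measuring their distance from Uniform(0, 1) probes the null of no effect using only published studies. The authors' actual estimator and tests are more involved; the p-values below are invented.

```python
def ks_uniform_stat(values):
    # one-sample Kolmogorov-Smirnov distance to the Uniform(0, 1) CDF
    xs = sorted(values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

alpha = 0.05
sig_p = [0.003, 0.011, 0.019, 0.027, 0.041]   # invented published p-values
rescaled = [p / alpha for p in sig_p]          # ~ Uniform(0, 1) if no true effect
D = ks_uniform_stat(rescaled)                  # large D hints at a real effect
```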
Saha, Ranajit; Pan, Sudip; Chattaraj, Pratim K
2016-11-05
The validity of the maximum hardness principle (MHP) is tested in the cases of 50 chemical reactions, most of which are organic in nature and exhibit anomeric effect. To explore the effect of the level of theory on the validity of MHP in an exothermic reaction, B3LYP/6-311++G(2df,3pd) and LC-BLYP/6-311++G(2df,3pd) (def2-QZVP for iodine and mercury) levels are employed. Different approximations like the geometric mean of hardness and combined hardness are considered in case there are multiple reactants and/or products. It is observed that, based on the geometric mean of hardness, while 82% of the studied reactions obey the MHP at the B3LYP level, 84% of the reactions follow this rule at the LC-BLYP level. Most of the reactions possess the hardest species on the product side. A 50% null hypothesis is rejected at a 1% level of significance.
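The geometric-mean approximation mentioned for reaction sides with multiple species can be illustrated in a few lines; the hardness values are invented placeholders, not results from the paper.

```python
from math import prod

def geometric_mean_hardness(etas):
    # combined hardness of several species, taken as a geometric mean
    return prod(etas) ** (1.0 / len(etas))

# invented hardness values (eV) for a two-reactant, two-product reaction
reactants = [4.8, 5.1]
products = [5.6, 5.9]

# MHP check: is the product side harder than the reactant side?
obeys_mhp = geometric_mean_hardness(products) > geometric_mean_hardness(reactants)
```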
Wang, Q.; Denton, D.L.; Shukla, R.
2000-01-01
As a follow up to the recommendations of the September 1995 SETAC Pellston Workshop on Whole Effluent Toxicity (WET) on test methods and appropriate endpoints, this paper will discuss the applications and statistical properties of using a statistical criterion of minimum significant difference (MSD). The authors examined the upper limits of acceptable MSDs as acceptance criterion in the case of normally distributed data. The implications of this approach are examined in terms of false negative rate as well as false positive rate. Results indicated that the proposed approach has reasonable statistical properties. Reproductive data from short-term chronic WET test with Ceriodaphnia dubia tests were used to demonstrate the applications of the proposed approach. The data were collected by the North Carolina Department of Environment, Health, and Natural Resources (Raleigh, NC, USA) as part of their National Pollutant Discharge Elimination System program.
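A minimal sketch of an MSD computed for a two-group WET comparison, using the standard pooled-variance form MSD = t * sqrt(2 * s_p^2 / n). The critical t value is hard-coded for a one-sided test at alpha = 0.05 with 18 degrees of freedom, and the reproduction counts are hypothetical.

```python
import math
import statistics

# hypothetical C. dubia reproduction counts (neonates per female)
control = [22, 25, 24, 26, 23, 25, 24, 26, 25, 24]
effluent = [20, 22, 21, 23, 19, 22, 21, 20, 22, 21]

n = len(control)
sp2 = (statistics.variance(control) + statistics.variance(effluent)) / 2  # pooled
t_crit = 1.734                        # t(alpha=0.05, one-sided, df=18)
msd = t_crit * math.sqrt(2 * sp2 / n)
pct_msd = 100 * msd / statistics.mean(control)   # MSD as a percent of control mean
```

An upper limit on acceptable pct_msd is the kind of acceptance criterion the paper examines.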
Escoto Ponce de León, M C; Mancilla Díaz, J M; Camacho Ruiz, E J
2008-09-01
The current study used clinical and statistical significance tests to investigate the effects of two forms (didactic or interactive) of a universal prevention program on attitudes about shape and weight, eating behaviors, the influence of body aesthetic models, and self-esteem. Three schools were randomly assigned to one of three conditions: interactive, didactic, or control. Children (61 girls and 59 boys, aged 9-11 years) were evaluated at pre-intervention, post-intervention, and 6-month follow-up. Both programs comprised eight 90-min sessions. Statistical and clinical significance tests showed more changes in boys and girls in the interactive program than in the didactic intervention and control groups. The findings support the use of interactive programs that highlight identified risk factors and the construction of identity based on positive traits distinct from physical appearance.
Iacucci, Ernesto; Zingg, Hans H; Perkins, Theodore J
2012-01-01
High-throughput molecular biology studies, such as microarray assays of gene expression, two-hybrid experiments for detecting protein interactions, or ChIP-Seq experiments for transcription factor binding, often result in an "interesting" set of genes - say, genes that are co-expressed or bound by the same factor. One way of understanding the biological meaning of such a set is to consider what processes or functions, as defined in an ontology, are over-represented (enriched) or under-represented (depleted) among genes in the set. Usually, the significance of enrichment or depletion scores is based on simple statistical models and on the membership of genes in different classifications. We consider the more general problem of computing p-values for arbitrary integer additive statistics, or weighted membership functions. Such membership functions can be used to represent, for example, prior knowledge on the role of certain genes or classifications, differential importance of different classifications or genes to the experimenter, hierarchical relationships between classifications, or different degrees of interestingness or evidence for specific genes. We describe a generic dynamic programming algorithm that can compute exact p-values for arbitrary integer additive statistics. We also describe several optimizations for important special cases, which can provide orders-of-magnitude speed up in the computations. We apply our methods to datasets describing oxidative phosphorylation and parturition and compare p-values based on computations of several different statistics for measuring enrichment. We find major differences between p-values resulting from these statistics, and that some statistics recover "gold standard" annotations of the data better than others. Our work establishes a theoretical and algorithmic basis for far richer notions of enrichment or depletion of gene sets with respect to gene ontologies than has previously been available.
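The generic dynamic program described above can be sketched as follows: for integer gene weights, count how many subsets of a fixed size achieve each possible total weight, then read off an exact tail probability. The weights and observed set are invented, and the paper's special-case optimizations are omitted.

```python
from math import comb

def exact_pvalue(weights, k, observed_sum):
    # counts[j] maps a total weight s -> number of size-j subsets with that sum
    counts = [dict() for _ in range(k + 1)]
    counts[0][0] = 1
    for w in weights:
        # descending j so each gene is used at most once (0/1 knapsack counting)
        for j in range(k, 0, -1):
            for s, c in counts[j - 1].items():
                counts[j][s + w] = counts[j].get(s + w, 0) + c
    # exact P(statistic >= observed) over uniformly drawn size-k gene sets
    tail = sum(c for s, c in counts[k].items() if s >= observed_sum)
    return tail / comb(len(weights), k)

# hypothetical integer weights (e.g. prior importance) for 8 genes;
# the "interesting" set has size 3 and total weight 6
p = exact_pvalue([2, 0, 1, 3, 0, 1, 2, 0], k=3, observed_sum=6)
```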
NASA Astrophysics Data System (ADS)
Baluev, Roman V.
2013-11-01
We consider the `multifrequency' periodogram, in which the putative signal is modelled as a sum of two or more sinusoidal harmonics with independent frequencies. It is useful in cases when the data may contain several periodic components, especially when their interaction with each other and with the data sampling patterns might produce misleading results. Although the multifrequency statistic itself was constructed earlier, for example by G. Foster in his CLEANest algorithm, its probabilistic properties (the detection significance levels) are still poorly known and much of what is deemed known is not rigorous. These detection levels are nonetheless important for data analysis. We argue that to prove the simultaneous existence of all n components revealed in a multiperiodic variation, it is mandatory to apply at least 2n - 1 significance tests, among which most involve various multifrequency statistics, and only n tests are single-frequency ones. The main result of this paper is an analytic estimation of the statistical significance of the frequency tuples that the multifrequency periodogram can reveal. Using the theory of extreme values of random fields (the generalized Rice method), we find a useful approximation to the relevant false alarm probability. For the double-frequency periodogram, this approximation is given by the elementary formula (π/16) W²e^(−z)z², where W denotes the normalized width of the settled frequency range, and z is the observed periodogram maximum. We carried out intensive Monte Carlo simulations to show that the practical quality of this approximation is satisfactory. A similar analytic expression for the general multifrequency periodogram is also given, although with less numerical verification.
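The elementary false-alarm approximation quoted in the abstract, (π/16) W²e^(−z)z², translates directly into code; the W and z values below are arbitrary illustrations, not values from the paper.

```python
import math

def fap_double_frequency(W, z):
    # (pi/16) * W^2 * exp(-z) * z^2: the paper's approximation to the
    # false alarm probability of the double-frequency periodogram
    return (math.pi / 16) * W ** 2 * math.exp(-z) * z ** 2

fap = fap_double_frequency(W=100.0, z=30.0)   # a strong peak gives a tiny FAP
```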
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high–throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
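A sketch of the additive scoring idea behind WFS, with invented counts and features. A hypergeometric tail is used here as one plausible enrichment-significance measure; the paper's exact weighting may differ.

```python
import math
from math import comb

def enrichment_logp(k, n, K, N):
    # hypergeometric tail P(X >= k): k toxic compounds among n feature carriers,
    # with K toxic compounds out of N total in the training set
    p = sum(comb(K, i) * comb(N - K, n - i)
            for i in range(k, min(n, K) + 1)) / comb(N, n)
    return -math.log10(p)

N, K = 1000, 200     # invented training set: 200 toxic compounds out of 1000
feature_counts = {"nitroso": (18, 30), "epoxide": (12, 40)}  # (toxic, carriers)
weights = {f: enrichment_logp(k, n, K, N) for f, (k, n) in feature_counts.items()}

def wfs_score(features):
    # additive toxicity score: sum of enrichment weights of the present features
    return sum(weights.get(f, 0.0) for f in features)

score = wfs_score({"nitroso", "epoxide"})
```

The simplicity of the scoring step is what gives the approach its interpretability: each feature's contribution to a compound's score can be read off directly.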
NASA Astrophysics Data System (ADS)
Ahmed, Sheehan H.; Brooks, Alyson M.; Christensen, Charlotte R.
2017-04-01
We investigate whether the inclusion of baryonic physics influences the formation of thin, coherently rotating planes of satellites such as those seen around the Milky Way and Andromeda. For four Milky Way-mass simulations, each run both as dark matter-only and with baryons included, we are able to identify a planar configuration that significantly maximizes the number of plane satellite members. The maximum plane member satellites are consistently different between the dark matter-only and baryonic versions of the same run due to the fact that satellites are both more likely to be destroyed and to infall later in the baryonic runs. Hence, studying satellite planes in dark matter-only simulations is misleading, because they will be composed of different satellite members than those that would exist if baryons were included. Additionally, the destruction of satellites in the baryonic runs leads to less radially concentrated satellite distributions, a result that is critical to making planes that are statistically significant compared to a random distribution. Since all planes pass through the centre of the galaxy, it is much harder to create a plane of a given height from a random distribution if the satellites have a low radial concentration. We identify Andromeda's low radial satellite concentration as a key reason why the plane in Andromeda is highly significant. Despite this, when corotation is considered, none of the satellite planes identified for the simulated galaxies are as statistically significant as the observed planes around the Milky Way and Andromeda, even in the baryonic runs.
NASA Astrophysics Data System (ADS)
Casati, Michele
2014-05-01
The assertion that solar activity may play a significant role in triggering large volcanic eruptions has long been discussed by geophysicists. Numerous scientific papers have established a possible correlation between these events and the electromagnetic coupling between the Earth and the Sun, but none of them has been able to highlight a statistically significant relationship between large volcanic eruptions and any of the series, such as geomagnetic activity, solar wind, or sunspot number. In our research, we compare the 148 volcanic eruptions with index VEI4 and the 37 major historical volcanic eruptions with index VEI5 or greater, recorded from 1610 to 2012, with the sunspot number. Taking as the threshold value a monthly sunspot number of 46 (recorded during the great Krakatoa eruption, historical index VEI6, August 1883), we note some possible relationships and conduct a statistical test. • Of the 31 large historical volcanic eruptions with index VEI5+ recorded between 1610 and 1955, 29 occurred when the SSN < 46. The remaining 2 eruptions were recorded not when the SSN < 46 but during the solar maxima of the solar cycle of the year 1739 and of solar cycle No. 14 (the Shikotsu eruption of 1739 and Ksudach in 1907). • Of the 8 large historical volcanic eruptions with index VEI6+ recorded from 1610 to the present, 7 occurred with SSN < 46 and, more specifically, within the three large known solar minima: the Maunder minimum (1645-1710), the Dalton minimum (1790-1830), and the solar minima that occurred between 1880 and 1920. The only exception is the eruption of Pinatubo in June 1991, recorded at the solar maximum of cycle 22. • Of the 6 major historical volcanic eruptions with index VEI5+ recorded after 1955, 5 were recorded not during periods of low solar activity but during the solar maxima of cycles 19, 21, and 22. The significance tests, conducted with the chi-square statistic χ² = 7.782, detect a
Linden, Ariel
2008-04-01
Prior to implementing a disease management (DM) strategy, a needs assessment should be conducted to determine whether sufficient opportunity exists for an intervention to be successful in the given population. A central component of this assessment is a sample size analysis to determine whether the population is of sufficient size to allow the expected program effect to achieve statistical significance. This paper discusses the parameters that comprise the generic sample size formula for independent samples and their interrelationships, followed by modifications for the DM setting. In addition, a table is provided with sample size estimates for various effect sizes. Examples are described in detail along with strategies for overcoming common barriers. Ultimately, conducting these calculations up front will help set appropriate expectations about the ability to demonstrate the success of the intervention.
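The generic independent-samples formula the paper builds on, per-group n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ², can be sketched directly. The z quantiles are hard-coded for two-sided α = 0.05 and 80% power, and the cost figures are hypothetical.

```python
import math

def n_per_group(sigma, delta, z_alpha=1.96, z_beta=0.8416):
    # z_alpha: two-sided alpha = 0.05; z_beta: power = 0.80
    # per-group sample size for detecting a mean difference delta
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# hypothetical DM example: detect a $50 per-member monthly saving, sd $250
n = n_per_group(sigma=250.0, delta=50.0)
```

As the paper emphasizes, running this calculation before launching the program sets expectations: if the eligible population is smaller than 2n, the expected effect cannot be demonstrated at conventional significance levels.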
NASA Astrophysics Data System (ADS)
Wang, H. J.; Shi, W. L.; Chen, X. H.
2006-05-01
The West Development Policy being implemented in China is causing significant land use and land cover (LULC) changes in West China. With the up-to-date satellite database of the Global Land Cover Characteristics Database (GLCCD) characterizing the lower boundary conditions, the regional climate model RIEMS-TEA is used to simulate possible impacts of this significant LULC variation. The model was run for five continuous three-month periods, from 1 June to 1 September of 1993, 1994, 1995, 1996, and 1997, and the results of the five groups are examined by means of Student's t-test to identify the statistical significance of regional climate variation. The main results are: (1) The regional climate is affected by the LULC variation because the equilibrium of water and heat transfer at the air-vegetation interface is changed. (2) The integrated impact of the LULC variation on regional climate is not limited to West China, where the LULC varies, but extends to some areas in the model domain where the LULC does not vary at all. (3) The East Asian monsoon system and its vertical structure are adjusted by the large-scale LULC variation in western China, the consequences being the enhancement of westward water vapor transfer from the east and the related increase of wet-hydrostatic energy in the middle-upper atmospheric layers. (4) The ecological engineering in West China significantly affects the regional climate in Northwest China, North China, and the middle-lower reaches of the Yangtze River; there are obvious effects in South, Northeast, and Southwest China, but minor effects in Tibet.
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first; then the areas of alarm are reduced by MSc, at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8, and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, respectively, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events doubled and all of them became exclusively normal or reverse faulting events. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier
NASA Astrophysics Data System (ADS)
Govindan, R. B.; Al-Shargabi, Tareq; Andescavage, Nickie N.; Metzler, Marina; Lenin, R. B.; Plessis, Adré du
2017-01-01
Phase differences of two signals in perfect synchrony exhibit a narrow band distribution, whereas the phase differences of two asynchronous signals exhibit uniform distribution. We assess the statistical significance of the phase synchronization between two signals by using a signed rank test to compare the distribution of their phase differences to the theoretically expected uniform distribution for two asynchronous signals. Using numerical simulation of a second order autoregressive (AR2) process, we show that the proposed approach correctly identifies the coupling between the AR2 process and the driving white noise. We also identify the optimal p-value that distinguishes coupled scenarios from uncoupled ones. To identify the limiting cases, we study the phase synchronization between two independent white noises as a function of bandwidth of the filter in a different second simulation. We identify the frequency bandwidth below which the proposed approach fails and suggest using a data-driven approach for those scenarios. Finally, we demonstrate the application of this approach to study the coupling between beat-to-beat cardiac intervals and continuous blood pressure obtained from critically-ill infants to characterize the baroreflex function.
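A stdlib-only sketch of the core idea: wrapped phase differences of coupled signals pile up in a few histogram bins instead of spreading uniformly. A simple sign test on (bin count − expected count) stands in here for the authors' signed-rank test, and the phases are simulated rather than derived from cardiac data.

```python
import math
import random
from math import comb

random.seed(1)
n, nbins = 2000, 20
expected = n / nbins

# simulated phases: phase2 tracks phase1 up to a constant lag and small
# jitter, so the pair is coupled (synchronous)
phase1 = [random.uniform(0.0, 2 * math.pi) for _ in range(n)]
phase2 = [(p + 0.3 + random.gauss(0.0, 0.1)) % (2 * math.pi) for p in phase1]

# histogram of wrapped phase differences
counts = [0] * nbins
for p1, p2 in zip(phase1, phase2):
    d = (p1 - p2) % (2 * math.pi)
    counts[int(d / (2 * math.pi) * nbins) % nbins] += 1

# two-sided sign test: under asynchrony each bin lies above or below the
# uniform level with probability 1/2
k = sum(c > expected for c in counts)
m = min(k, nbins - k)
p = 2 * sum(comb(nbins, i) for i in range(m + 1)) / 2 ** nbins
```

For the coupled pair above, nearly all phase differences fall in one or two bins, so the sign test rejects uniformity; for two independent noises, k hovers near nbins/2 and p stays large.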
NASA Technical Reports Server (NTRS)
Friedlander, Alan L.; Harry, David P., III
1960-01-01
An exploratory analysis of vehicle guidance during the approach to a target planet is presented. The objective of the guidance maneuver is to guide the vehicle to a specific perigee distance with a high degree of accuracy and minimum corrective velocity expenditure. The guidance maneuver is simulated by considering the random sampling of real measurements with significant error and reducing this information to prescribe appropriate corrective action. The instrumentation system assumed includes optical and/or infrared devices to indicate range and a reference angle in the trajectory plane. Statistical results are obtained by Monte-Carlo techniques and are shown as the expectation of guidance accuracy and velocity-increment requirements. Results are nondimensional and applicable to any planet within limits of two-body assumptions. The problem of determining how many corrections to make and when to make them is a consequence of the conflicting requirements of accurate trajectory determination and propulsion. Optimum values were found for a vehicle approaching a planet along a parabolic trajectory with an initial perigee distance of 5 radii and a target perigee of 1.02 radii. In this example measurement errors were less than 1 minute of arc. Results indicate that four corrections applied in the vicinity of 50, 16, 15, and 1.5 radii, respectively, yield minimum velocity-increment requirements. Thrust devices capable of producing a large variation of velocity-increment size are required. For a vehicle approaching the earth, miss distances within 32 miles are obtained with 90-percent probability. Total velocity increments used in guidance are less than 3300 feet per second with 90-percent probability. It is noted that the above representative results are valid only for the particular guidance scheme hypothesized in this analysis. A parametric study is presented which indicates the effects of measurement error size, initial perigee, and initial energy on the guidance
Fisher, Aaron; Anderson, G Brooke; Peng, Roger; Leek, Jeff
2014-01-01
Scatterplots are the most common way for statisticians, scientists, and the public to visually detect relationships between measured variables. At the same time, and despite widely publicized controversy, P-values remain the most commonly used measure to statistically justify relationships identified between variables. Here we measure the ability to detect statistically significant relationships from scatterplots in a randomized trial of 2,039 students in a statistics massive open online course (MOOC). Each subject was shown a random set of scatterplots and asked to visually determine if the underlying relationships were statistically significant at the P < 0.05 level. Subjects correctly classified only 47.4% (95% CI [45.1%-49.7%]) of statistically significant relationships, and 74.6% (95% CI [72.5%-76.6%]) of non-significant relationships. Adding visual aids such as a best fit line or scatterplot smooth increased the probability a relationship was called significant, regardless of whether the relationship was actually significant. Classification of statistically significant relationships improved on repeat attempts of the survey, although classification of non-significant relationships did not. Our results suggest: (1) that evidence-based data analysis can be used to identify weaknesses in theoretical procedures in the hands of average users, (2) data analysts can be trained to improve detection of statistically significant results with practice, but (3) data analysts have incorrect intuition about what statistically significant relationships look like, particularly for small effects. We have built a web tool for people to compare scatterplots with their corresponding p-values which is available here: http://glimmer.rstudio.com/afisher/EDA/.
Greenhalgh, T.
1997-01-01
It is possible to be seriously misled by taking the statistical competence (and/or the intellectual honesty) of authors for granted. Some common errors committed (deliberately or inadvertently) by the authors of papers are given in the final box. PMID:9277611
Yokoyama, Shozo; Takenaka, Naomi
2005-04-01
Red-green color vision is strongly suspected to enhance the survival of its possessors. Despite being red-green color blind, however, many species have successfully competed in nature, which brings into question the evolutionary advantage of achieving red-green color vision. Here, we propose a new method of identifying positive selection at individual amino acid sites with the premise that if positive Darwinian selection has driven the evolution of the protein under consideration, then it should be found mostly at the branches in the phylogenetic tree where its function had changed. The statistical and molecular methods have been applied to 29 visual pigments with the wavelengths of maximal absorption at approximately 510-540 nm (green- or middle wavelength-sensitive [MWS] pigments) and at approximately 560 nm (red- or long wavelength-sensitive [LWS] pigments), which are sampled from a diverse range of vertebrate species. The results show that the MWS pigments are positively selected through amino acid replacements S180A, Y277F, and T285A and that the LWS pigments have been subjected to strong evolutionary conservation. The fact that these positively selected M/LWS pigments are found not only in animals with red-green color vision but also in those with red-green color blindness strongly suggests that both red-green color vision and color blindness have undergone adaptive evolution independently in different species.
Kurtz, S.E.; Fields, D.E.
1983-10-01
This report describes a version of the TERPED/P computer code that is very useful for small data sets. A new algorithm for determining the Kolmogorov-Smirnov (KS) statistics is used to extend program applicability. The TERPED/P code facilitates the analysis of experimental data and assists the user in determining its probability distribution function. Graphical and numerical tests are performed interactively in accordance with the user's assumption of normally or log-normally distributed data. Statistical analysis options include computation of the chi-square statistic and the KS one-sample test statistic and the corresponding significance levels. Cumulative probability plots of the user's data are generated either via a local graphics terminal, a local line printer or character-oriented terminal, or a remote high-resolution graphics device such as the FR80 film plotter or the Calcomp paper plotter. Several useful computer methodologies suffer from limitations of their implementations of the KS nonparametric test. This test is one of the more powerful analysis tools for examining the validity of an assumption about the probability distribution of a set of data. KS algorithms are found in other analysis codes, including the Statistical Analysis Subroutine (SAS) package and earlier versions of TERPED. The inability of these algorithms to generate significance levels for sample sizes less than 50 has limited their usefulness. The release of the TERPED code described herein contains algorithms to allow computation of the KS statistic and significance level for data sets of, if the user wishes, as few as three points. Values computed for the KS statistic are within 3% of the correct value for all data set sizes.
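For readers working outside TERPED, modern libraries also provide exact KS significance levels for very small samples. The sketch below is a minimal SciPy illustration (not part of TERPED) of a three-point data set; note that it tests against a fully specified normal null, whereas testing normality with parameters estimated from the same data formally requires the Lilliefors correction.

```python
from scipy import stats

# Three observations tested against a fully specified N(0, 1) null.
# method="exact" requests the exact small-sample distribution of D.
data = [0.1, -0.4, 0.3]
result = stats.kstest(data, "norm", args=(0.0, 1.0), method="exact")
print(f"D = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```

With only three points the test has little power, which is why exact (rather than asymptotic) significance levels matter at these sample sizes.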
Liu, Wei; Ding, Jinhui
2016-05-25
The application of the principle of the intention-to-treat (ITT) to the analysis of clinical trials is challenged in the presence of missing outcome data. The consequences of stopping an assigned treatment in a withdrawn subject are unknown. It is difficult to make a single assumption about missing mechanisms for all clinical trials because there are complicated reactions in the human body to drugs due to the presence of complex biological networks, leading to data missing randomly or non-randomly. Currently there is no statistical method that can tell whether a difference between two treatments in the ITT population of a randomized clinical trial with missing data is significant at a pre-specified level. Making no assumptions about the missing mechanisms, we propose a generalized complete-case (GCC) analysis based on the data of completers. An evaluation of the impact of missing data on the ITT analysis reveals that a statistically significant GCC result implies a significant treatment effect in the ITT population at a pre-specified significance level unless, relative to the comparator, the test drug is poisonous to the non-completers as documented in their medical records. Applications of the GCC analysis are illustrated using literature data, and its properties and limits are discussed.
Best, R; Harrell, A; Geesey, C; Libby, B; Wijesooriya, K
2014-06-15
Purpose: The purpose of this study is to inter-compare and find statistically significant differences between flattened-field fixed-beam (FB) IMRT and flattening-filter-free (FFF) volumetric modulated arc therapy (VMAT) for stereotactic body radiation therapy (SBRT). Methods: SBRT plans using FB IMRT and FFF VMAT were generated for fifteen SBRT lung patients using 6 MV beams. For each patient, both IMRT and VMAT plans were created for comparison. Plans were generated utilizing RTOG 0915 (peripheral, 10 patients) and RTOG 0813 (medial, 5 patients) lung protocols. Target dose, critical structure dose, and treatment time were compared and tested for statistical significance. Parameters of interest included prescription isodose surface coverage, target dose heterogeneity, high dose spillage (location and volume), low dose spillage (location and volume), lung dose spillage, and critical structure maximum- and volumetric-dose limits. Results: For all criteria, we found equivalent or higher conformality with VMAT plans as well as reduced critical structure doses. Several differences passed a Student's t-test of significance: VMAT reduced the high dose spillage, evaluated with conformality index (CI), by an average of 9.4%±15.1% (p=0.030) compared to IMRT. VMAT plans reduced the lung volume receiving 20 Gy by 16.2%±15.0% (p=0.016) compared with IMRT. For the RTOG 0915 peripheral lesions, the volumes of lung receiving 12.4 Gy and 11.6 Gy were reduced by 27.0%±13.8% and 27.5%±12.6% (for both, p<0.001) in VMAT plans. Of the 26 protocol pass/fail criteria, VMAT plans were able to achieve an average of 0.2±0.7 (p=0.026) more constraints than the IMRT plans. Conclusions: FFF VMAT has dosimetric advantages over fixed-beam IMRT for lung SBRT. Significant advantages included increased dose conformity and reduced organs-at-risk doses. The overall improvements in terms of protocol pass/fail criteria were more modest and will require more patient data to establish difference
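The per-patient plan comparisons above rely on paired Student's t-tests, since each patient contributes one IMRT and one VMAT plan. A minimal sketch of that test (with hypothetical conformality-index values, not the study's data):

```python
from scipy import stats

# Hypothetical per-patient conformality indices for six patients,
# one pair of plans each -- paired data, hence a paired t-test.
ci_imrt = [1.25, 1.18, 1.32, 1.21, 1.28, 1.35]
ci_vmat = [1.12, 1.15, 1.20, 1.10, 1.22, 1.19]

t, p = stats.ttest_rel(ci_imrt, ci_vmat)
print(f"t = {t:.3f}, p = {p:.4f}")
```

Pairing matters: an unpaired test on the same numbers would ignore the within-patient correlation and generally lose power.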
Nhu, Nguyen Van; Singh, Mahendra; Leonhard, Kai
2008-05-08
We have computed molecular descriptors for sizes, shapes, charge distributions, and dispersion interactions for 67 compounds using quantum chemical ab initio and density functional theory methods. For the same compounds, we have fitted the three perturbed-chain polar statistical associating fluid theory (PCP-SAFT) equation of state (EOS) parameters to experimental data and have performed a statistical analysis for relations between the descriptors and the EOS parameters. On this basis, an analysis of the physical significance of the parameters, the limits of the present descriptors, and the PCP-SAFT EOS has been performed. The result is a method that can be used to estimate the vapor pressure curve including the normal boiling point, the liquid volume, the enthalpy of vaporization, the critical data, mixture properties, and so on. When only two of the three parameters are predicted and one is adjusted to experimental normal boiling point data, excellent predictions of all investigated pure compound and mixture properties are obtained. We are convinced that the methodology presented in this work will lead to new EOS applications as well as improved EOS models whose predictive performance is likely to surpass that of most present quantum chemically based, quantitative structure-property relationship, and group contribution methods for a broad range of chemical substances.
Maric, Marija; de Haan, Else; Hogendoorn, Sanne M; Wolters, Lidewij H; Huizenga, Hilde M
2015-03-01
Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a data-analytic method to analyze univariate (i.e., one symptom) single-case data using the common package SPSS. This method can help the clinical researcher to investigate whether an intervention works as compared with a baseline period or another intervention type, and to determine whether symptom improvement is clinically significant. First, we describe the statistical method in a conceptual way and show how it can be implemented in SPSS. Simulation studies were performed to determine the number of observation points required per intervention phase. Second, to illustrate this method and its implications, we present a case study of an adolescent with anxiety disorders treated with cognitive-behavioral therapy techniques in an outpatient psychotherapy clinic, whose symptoms were regularly assessed before each session. We provide a description of the data analyses and results of this case study. Finally, we discuss the advantages and shortcomings of the proposed method.
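The article implements its method in SPSS; as a language-neutral sketch of the underlying idea, a single-case A/B comparison can be framed as a regression of the symptom score on a phase indicator. The data below are hypothetical, and a real single-case analysis must also handle serial autocorrelation, which this toy example ignores.

```python
import numpy as np

# Hypothetical weekly anxiety ratings: baseline phase (A), then
# intervention phase (B).
baseline = [7, 8, 7, 9, 8, 7]
treatment = [6, 5, 5, 4, 3, 3]

y = np.array(baseline + treatment, dtype=float)
phase = np.array([0] * len(baseline) + [1] * len(treatment))

# OLS with a phase dummy: the intercept is the baseline mean and the
# coefficient on `phase` estimates the mean shift between phases.
X = np.column_stack([np.ones_like(y), phase])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"baseline mean = {beta[0]:.2f}, phase effect = {beta[1]:.2f}")
```

Here the phase coefficient directly answers the clinical question: how much did the symptom level change from baseline to intervention?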
Aouinti, Safa; Malouche, Dhafer; Giudicelli, Véronique; Kossida, Sofia; Lefranc, Marie-Paule
2015-01-01
The adaptive immune responses of humans and of other jawed vertebrate species (Gnathostomata) are characterized by the B and T cells and their specific antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR) (up to 2 × 10^12 different IG and TR per individual). IMGT, the international ImMunoGeneTics information system (http://www.imgt.org), was created in 1989 by Marie-Paule Lefranc (Montpellier University and CNRS) to manage the huge and complex diversity of these antigen receptors. IMGT, built on the IMGT-ONTOLOGY concepts of identification (keywords), description (labels), classification (gene and allele nomenclature) and numerotation (IMGT unique numbering), is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. IMGT/HighV-QUEST, the first web portal, and so far the only one, for the next generation sequencing (NGS) analysis of IG and TR, is the paradigm for immune repertoire standardized outputs and immunoprofiles of the adaptive immune responses. It provides the identification of the variable (V), diversity (D) and joining (J) genes and alleles, analysis of the V-(D)-J junction and complementarity determining region 3 (CDR3) and the characterization of the ‘IMGT clonotype (AA)’ (AA for amino acid) diversity and expression. IMGT/HighV-QUEST compares outputs of different batches, up to one million nucleotide sequences for the statistical module. These high throughput IG and TR repertoire immunoprofiles are of prime importance in vaccination, cancer, infectious diseases, autoimmunity and lymphoproliferative disorders; however, their comparative statistical analysis still remains a challenge. We present a standardized statistical procedure to analyze IMGT/HighV-QUEST outputs for the evaluation of the significance of the IMGT clonotype (AA) diversity differences in proportions, per gene of a given group, between NGS IG and TR repertoire immunoprofiles. The procedure is generic and
NASA Astrophysics Data System (ADS)
Løvsletten, Ola; Rypdal, Martin; Rypdal, Kristoffer; Fredriksen, Hege-Beate
2015-04-01
We explore the statistics of instrumental surface temperature records on 5° × 5°, 2° × 2°, and equal-area grids. In particular, we compute the significance of deterministic trends against two parsimonious null models: auto-regressive processes of order 1, AR(1), and fractional Gaussian noises (fGn's). Both of these null models contain a memory parameter which quantifies the temporal climate variability, with white noise nested in both classes of models. Estimates of the persistence parameters show significant positive serial correlation for most grid cells, with higher persistence over oceans compared to land areas. This shows that, in a trend detection framework, we need to take into account larger spurious trends than what follows from the frequently used white noise assumption. Tested against the fGn null hypothesis, we find that ~68% (~47%) of the time series have significant trends at the 5% (1%) significance level. If we assume an AR(1) null hypothesis instead, then ~94% (~88%) of the time series have significant trends at the 5% (1%) significance level. For both null models, the locations where we do not find significant trends are mostly the ENSO regions and the North Atlantic. We try to discriminate between the two null models by means of likelihood ratios. If we at each grid point choose the null model preferred by the model selection test, we find that ~82% (~73%) of the time series have significant trends at the 5% (1%) significance level. We conclude that there is emerging evidence of significant warming trends also at regional scales, although with a much lower signal-to-noise ratio compared to global mean temperatures. Another finding is that many temperature records are consistent with error models for internal variability that exhibit long-range dependence, whereas the temperature fluctuations of the tropical oceans are strongly influenced by the ENSO, and therefore seemingly more consistent with random processes with short
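The idea of testing a trend against an AR(1) null can be sketched with a parametric Monte Carlo. This is a simplified illustration only: the AR(1) coefficient is assumed known here, whereas the study estimates the memory parameters from data and also considers fGn nulls.

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1(n, phi):
    """One realization of an AR(1) process with unit innovations."""
    z = np.zeros(n)
    for i in range(1, n):
        z[i] = phi * z[i - 1] + rng.normal()
    return z

def trend_slope(y):
    return np.polyfit(np.arange(len(y)), y, 1)[0]

# Synthetic "observed" record: AR(1) noise plus a genuine linear trend.
n, phi = 120, 0.6
observed = 0.02 * np.arange(n) + ar1(n, phi)

# Null distribution of trend slopes under pure AR(1) variability.
null_slopes = np.array([abs(trend_slope(ar1(n, phi))) for _ in range(2000)])
p = float(np.mean(null_slopes >= abs(trend_slope(observed))))
print(f"Monte Carlo p-value = {p:.3f}")
```

Because positively correlated noise produces larger spurious slopes than white noise, the same observed slope is less significant under this null than under an i.i.d. assumption, which is the paper's central methodological point.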
Ioannidis, John P. A.
2017-01-01
A typical rule that has been used for the endorsement of new medications by the Food and Drug Administration is to have two trials, each convincing on its own, demonstrating effectiveness. “Convincing” may be subjectively interpreted, but the use of p-values and the focus on statistical significance (in particular with p < .05 being coined significant) is pervasive in clinical research. Therefore, in this paper, we calculate with simulations what it means to have exactly two trials, each with p < .05, in terms of the actual strength of evidence quantified by Bayes factors. Our results show that different cases where two trials have a p-value below .05 have wildly differing Bayes factors. Bayes factors of at least 20 in favor of the alternative hypothesis are not necessarily achieved and they fail to be reached in a large proportion of cases, in particular when the true effect size is small (0.2 standard deviations) or zero. In a non-trivial number of cases, evidence actually points to the null hypothesis, in particular when the true effect size is zero, when the number of trials is large, and when the number of participants in both groups is low. We recommend use of Bayes factors as a routine tool to assess endorsement of new medications, because Bayes factors consistently quantify strength of evidence. Use of p-values may lead to paradoxical and spurious decision-making regarding the use of new medications. PMID:28273140
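One way to see the gap between "two trials at p < .05" and strong evidence, without re-running the paper's simulations, is the classical minimum Bayes factor bound exp(-z²/2) for a z-test: it gives the most favorable evidence against the null that any p-value can represent. The sketch below applies that bound, which is an upper limit rarely attained in practice, consistent with the paper's finding that actual Bayes factors often fall far short.

```python
import math
from statistics import NormalDist

def min_bf01(p_two_sided):
    """Lower bound on the Bayes factor for H0 vs H1,
    BF01 >= exp(-z^2 / 2), for a two-sided z-test p-value."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return math.exp(-z * z / 2)

one_trial = min_bf01(0.05)   # best case: roughly 7:1 against the null
two_trials = one_trial ** 2  # two independent trials multiply
print(f"{1 / one_trial:.1f}:1 per trial, {1 / two_trials:.1f}:1 for two trials")
```

So even in the best case, a single p = .05 trial cannot exceed about 7:1 evidence, and two such trials cannot exceed about 47:1; realistic alternatives yield much less, and small or null true effects can push the evidence toward H0.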
Minor changes in the indicator used to measure fine PM, which cause only modest changes in Mass concentrations, can lead to dramatic changes in the statistical relationship of fine PM mass with cardiovascular mortality. An epidemiologic study in Phoenix (Mar et al., 2000), augme...
Abe, T; Tsuiki, T; Murai, K; Sasamori, S
1990-12-01
A statistical study was made of 41 cases of denture foreign bodies in the air and upper food passages treated in our department during the past 21 years. (1) Males were more frequently affected; the ratio of males to females was about 2 to 1. (2) Of the 41 dentures, 2, 2 and 37 were lodged in the air passages, hypopharynx and esophagus respectively. (3) There were 5 complete mandibular dentures among the 41 cases. (4) The cause of the foreign body was attributable to the denture itself in 29 cases, to the patient in 2 cases, and to both in 10 cases. (5) Of 39 problematic dentures, 16 showed breakage such as plate fracture or clasp deformity, while the other 23 were intact; in this latter group, poor retention of the denture was ascribed to faulty fabrication or design. (6) Of 12 patients with impaired physical function, 5 had suffered from cerebrovascular disease and 3 from geriatric dementia. (7) Denture foreign bodies in aged patients with impaired physical function have tended to increase in recent years. (8) Of 39 dentures removed by esophagoscopy, 18 were removed with difficulty; these were detachable partial dentures with one artificial tooth and two-arm clasps lodged at the first and/or second isthmus of the esophagus. Although one denture required three attempts before successful removal, no case required external esophagotomy. (9) Duplicated denture models were made in 20 cases prior to the procedure, and we confirmed that these models play an important role in the safer removal of denture foreign bodies.
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2001-01-01
Since 1750, the number of cataclysmic volcanic eruptions (volcanic explosivity index (VEI)>=4) per decade spans 2-11, with 96 percent located in the tropics and extra-tropical Northern Hemisphere. A two-point moving average of the volcanic time series has higher values since the 1860's than before, being 8.00 in the 1910's (the highest value) and 6.50 in the 1980's, the highest since the 1910's peak. Because of the usual behavior of the first difference of the two-point moving averages, one infers that its value for the 1990's will measure approximately 6.50 +/- 1, implying that approximately 7 +/- 4 cataclysmic volcanic eruptions should be expected during the present decade (2000-2009). Because cataclysmic volcanic eruptions (especially those having VEI>=5) nearly always have been associated with short-term episodes of global cooling, the occurrence of even one might confuse our ability to assess the effects of global warming. Poisson probability distributions reveal that the probability of one or more events with a VEI>=4 within the next ten years is >99 percent. It is approximately 49 percent for an event with a VEI>=5, and 18 percent for an event with a VEI>=6. Hence, the likelihood that a climatically significant volcanic eruption will occur within the next ten years appears reasonably high.
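The Poisson probabilities quoted above follow directly from P(N ≥ 1) = 1 − e^(−λ) with the decadal rate λ taken from the moving-average estimate. A one-line check (the rate of 6.5 per decade is the abstract's own estimate):

```python
import math

def prob_at_least_one(rate_per_decade):
    """P(N >= 1) for a Poisson count with the given decadal rate."""
    return 1.0 - math.exp(-rate_per_decade)

# ~6.5 VEI >= 4 eruptions per decade, per the moving-average estimate.
print(f"P(at least one VEI>=4 event in a decade) = "
      f"{prob_at_least_one(6.5):.4f}")
```

With λ = 6.5 the probability exceeds 0.99, matching the ">99 percent" figure in the text; the smaller quoted probabilities for VEI ≥ 5 and VEI ≥ 6 correspond to proportionally smaller rates.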
Williams, Scott G. Buyyounouski, Mark K.; Pickles, Tom; Kestin, Larry; Martinez, Alvaro; Hanlon, Alexandra L.; Duchesne, Gillian M.
2008-03-15
Purpose: To define and incorporate the impact of the percentage of positive biopsy cores (PPC) into a predictive model of prostate cancer radiotherapy biochemical outcome. Methods and Materials: The data of 3264 men with clinically localized prostate cancer treated with external beam radiotherapy at four institutions were retrospectively analyzed. Standard prognostic and treatment factors plus the number of biopsy cores collected and the number positive for malignancy by transrectal ultrasound-guided biopsy were available. The primary endpoint was biochemical failure (bF, Phoenix definition). Multivariate proportional hazards analyses were performed and expressed as a nomogram and the model's predictive ability assessed using the concordance index (c-index). Results: The cohort consisted of 21% low-, 51% intermediate-, and 28% high-risk cancer patients, and 30% had androgen deprivation with radiotherapy. The median PPC was 50% (interquartile range [IQR] 29-67%), and median follow-up was 51 months (IQR 29-71 months). Percentage of positive biopsy cores displayed an independent association with the risk of bF (p = 0.01), as did age, prostate-specific antigen value, Gleason score, clinical stage, androgen deprivation duration, and radiotherapy dose (p < 0.001 for all). Including PPC increased the c-index from 0.72 to 0.73 in the overall model. The influence of PPC varied significantly with radiotherapy dose and clinical stage (p = 0.02 for both interactions), with doses <66 Gy and palpable tumors showing the strongest relationship between PPC and bF. Intermediate-risk patients were poorly discriminated regardless of PPC inclusion (c-index 0.65 for both models). Conclusions: Outcome models incorporating PPC show only minor additional ability to predict biochemical failure beyond those containing standard prognostic factors.
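The concordance index used to evaluate the nomogram measures how often the model ranks patients correctly. As a simplified illustration (binary outcomes and hypothetical risks; Harrell's version for censored survival data additionally restricts to pairs comparable under follow-up):

```python
from itertools import combinations

def concordance_index(risk, event):
    """Binary-outcome c-index: the fraction of event/non-event pairs
    in which the subject with the event got the higher predicted risk
    (ties count as half)."""
    concordant = ties = usable = 0
    for i, j in combinations(range(len(risk)), 2):
        if event[i] == event[j]:
            continue  # pairs with the same outcome carry no information
        usable += 1
        ev, no = (i, j) if event[i] else (j, i)
        if risk[ev] > risk[no]:
            concordant += 1
        elif risk[ev] == risk[no]:
            ties += 1
    return (concordant + 0.5 * ties) / usable

# Hypothetical predicted risks and biochemical-failure indicators.
print(concordance_index([0.9, 0.3, 0.5, 0.2, 0.6], [1, 0, 1, 0, 0]))
```

A c-index of 0.5 is chance-level discrimination and 1.0 is perfect ranking, which puts the study's move from 0.72 to 0.73 in context as a small gain.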
NASA Astrophysics Data System (ADS)
Khan, Shahjahan
Often scientific information on various data generating processes is presented in the form of numerical and categorical data. Except for some very rare occasions, such data generally represent a small part of the population, or selected outcomes of a data generating process. Although valuable and useful information is lurking in the array of scientific data, it is generally unavailable to the users. Appropriate statistical methods are essential to reveal the hidden "jewels" in the mess of the raw data. Exploratory data analysis methods are used to uncover such valuable characteristics of the observed data. Statistical inference provides techniques to make valid conclusions about the unknown characteristics or parameters of the population from which scientifically drawn sample data are selected. Usually, statistical inference includes estimation of population parameters as well as performing tests of hypotheses on the parameters. However, prediction of future responses and determining the prediction distributions are also part of statistical inference. Both the Classical (frequentist) and Bayesian approaches are used in statistical inference. The commonly used Classical approach is based on the sample data alone. In contrast, the increasingly popular Bayesian approach uses a prior distribution on the parameters along with the sample data to make inferences. Non-parametric and robust methods are also used in situations where commonly used model assumptions are unsupported. In this chapter, we cover the philosophical and methodological aspects of both the Classical and Bayesian approaches. Moreover, some aspects of predictive inference are also included. In the absence of any evidence to support assumptions regarding the distribution of the underlying population, or if the variable is measured only on an ordinal scale, non-parametric methods are used. Robust methods are employed to avoid any significant changes in the results due to deviations from the model
Grossling, Bernardo F.
1975-01-01
Exploratory drilling is still in incipient or youthful stages in those areas of the world where the bulk of the potential petroleum resources is yet to be discovered. Methods of assessing resources from projections based on historical production and reserve data are limited to mature areas. For most of the world's petroleum-prospective areas, a more speculative situation calls for a critical review of resource-assessment methodology. The language of mathematical statistics is required to define more rigorously the appraisal of petroleum resources. Basically, two approaches have been used to appraise the amounts of undiscovered mineral resources in a geologic province: (1) projection models, which use statistical data on the past outcome of exploration and development in the province; and (2) estimation models of the overall resources of the province, which use certain known parameters of the province together with the outcome of exploration and development in analogous provinces. These two approaches often lead to widely different estimates. Some of the controversy that arises results from a confusion of the probabilistic significance of the quantities yielded by each of the two approaches. Also, inherent limitations of analytic projection models, such as those using the logistic and Gompertz functions, have often been ignored. The resource-assessment problem should be recast in terms that provide for consideration of the probability of existence of the resource and of the probability of discovery of a deposit. Then the two above-mentioned models occupy the two ends of the probability range. The new approach accounts for (1) what can be expected with reasonably high certainty by mere projection of what has been accomplished in the past; (2) the inherent biases of decision-makers and resource estimators; (3) upper bounds that can be set up as goals for exploration; and (4) the uncertainties in geologic conditions in a search for minerals. Actual outcomes can then
Statistics of superior records
NASA Astrophysics Data System (ADS)
Ben-Naim, E.; Krapivsky, P. L.
2013-08-01
We study the statistics of records in a sequence of random variables. These independent and identically distributed variables are drawn from the parent distribution ρ. The running record equals the maximum of all elements in the sequence up to a given point. We define a superior sequence as one where all running records are above the average record expected for the parent distribution ρ. We find that the fraction of superior sequences S_N decays algebraically with sequence length N, S_N ~ N^(-β) in the limit N → ∞. Interestingly, the decay exponent β is nontrivial, being the root of an integral equation. For example, when ρ is a uniform distribution with compact support, we find β = 0.450265. In general, the tail of the parent distribution governs the exponent β. We also consider the dual problem of inferior sequences, where all records are below average, and find that the fraction of inferior sequences I_N also decays algebraically, albeit with a different decay exponent, I_N ~ N^(-α). We use the above statistical measures to analyze earthquake data.
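The superior-sequence fraction is straightforward to estimate by simulation for the uniform parent distribution, where the expected running record after k draws is k/(k+1). The sketch below is a Monte Carlo illustration of the qualitative decay, not a reproduction of the paper's analysis.

```python
import random

random.seed(1)

def is_superior(n):
    """Check one uniform(0,1) sequence: every running record must
    exceed k/(k+1), the expected maximum of k uniform variables."""
    record = 0.0
    for k in range(1, n + 1):
        record = max(record, random.random())
        if record <= k / (k + 1):
            return False
    return True

def superior_fraction(n, trials=20000):
    return sum(is_superior(n) for _ in range(trials)) / trials

s4, s16 = superior_fraction(4), superior_fraction(16)
print(s4, s16)  # the fraction decays with N, consistent with S_N ~ N^(-0.45)
```

Note the fraction is at most 1/2 even for N = 1, since the first draw must already beat its own expected record of 1/2.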
Hunt, N C; Ghosh, K M; Blain, A P; Rushton, S P; Longstaff, L M; Deehan, D J
2015-05-01
The aim of this study was to compare the maximum laxity conferred by the cruciate-retaining (CR) and posterior-stabilised (PS) Triathlon single-radius total knee arthroplasty (TKA) for anterior drawer, varus-valgus opening and rotation in eight cadaver knees through a defined arc of flexion (0° to 110°). The null hypothesis was that the limits of laxity of CR- and PS-TKAs are not significantly different. The investigation was undertaken in eight loaded cadaver knees undergoing subjective stress testing using a measurement rig. First, the native knee was tested prior to preparation for CR-TKA and subsequently for PS-TKA implantation. Surgical navigation was used to track maximal displacements/rotations at 0°, 30°, 60°, 90° and 110° of flexion. Mixed-effects modelling was used to define the behaviour of the TKAs. The laxity measured for the CR- and PS-TKAs revealed no statistically significant differences over the studied flexion arc for the two versions of TKA. Compared with the native knee, both TKAs exhibited slightly increased anterior drawer and decreased varus-valgus and internal-external rotational laxities. We believe further study is required to define the clinical states for which the additional constraint offered by a PS-TKA implant may be beneficial.
NASA Astrophysics Data System (ADS)
Temme, F. P.
1992-12-01
Realisation of the invariance properties of the p ⩽ 2 number-partitional inventory components of the 20-fold spin algebra associated with [A]_20 nuclear spin clusters under SU(2) × L_20 allows the mappings {[λ] → Γ} to be derived. In addition, recent general inner tensor product expressions under L_n, for n even (odd), also facilitate the evaluation of many higher [λ](L_20; p = 3) correlative mappings onto SU(3)↓SO(3) × L_20↓A_5 subduced symmetry from SU(2) duality, thus providing results that determine the nature of adapted NMR bases for both dodecahedrane and its d_20 analogue. The significance of this work lies in the pertinence of nuclear spin statistics to both selective MQ-NMR and to other spectroscopic aspects of cage clusters, e.g., [13C]_n, n = 20, 60, fullerenes. Mappings onto L_n irrep sets of specific p ⩽ 3 number partitions arise in the combinatorial treatment of {M_i t_i} Rota fields, defining scalar invariants in the context of Cayley algebra. Inclusion of the L_n group in the specific Racah chain for NMR symmetry gives rise to significant further physical insight.
Shi, Runhua; McLarty, Jerry W
2009-10-01
In this article, we introduced basic concepts of statistics, type of distributions, and descriptive statistics. A few examples were also provided. The basic concepts presented herein are only a fraction of the concepts related to descriptive statistics. Also, there are many commonly used distributions not presented herein, such as Poisson distributions for rare events and exponential distributions, F distributions, and logistic distributions. More information can be found in many statistics books and publications.
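The descriptive statistics introduced above can be tried directly with Python's standard library; the seven measurements below are purely illustrative.

```python
import statistics

# Illustrative sample of seven measurements.
data = [2.1, 3.4, 2.9, 4.0, 3.1, 2.5, 3.8]

print("mean   =", round(statistics.mean(data), 3))    # central tendency
print("median =", statistics.median(data))            # robust center
print("stdev  =", round(statistics.stdev(data), 3))   # sample SD (n - 1)
```

Note `statistics.stdev` divides by n − 1 (the sample standard deviation); `statistics.pstdev` gives the population version that divides by n.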
Miyauchi, T; Hagimoto, H; Saito, T; Endo, K; Ishii, M; Yamaguchi, T; Kajiwara, A; Matsushita, M
1989-01-01
EEG power amplitude and power ratio data obtained from 15 patients (3 men and 12 women) with Alzheimer's disease (AD) and 8 patients (2 men and 6 women) with senile dementia of Alzheimer type (SDAT) were compared with similar data from 40 age- and sex-matched normal controls. Compared with the healthy controls, both patient groups demonstrated increased EEG background slowing, and the slowing was more pronounced in AD than in SDAT. Moreover, each group showed characteristic findings on EEG topography and t-statistic significance probability mapping (SPM). The differences between AD patients and their controls indicated marked slowing with reductions in alpha 2, beta 1 and beta 2 activity. The SPMs of power ratio showed the most prominent significance in the right posterior-temporal region for the theta and alpha 2 bands, and in the frontal region for the delta and beta bands. Severe AD showed only frontal delta slowing compared to mild AD. The differences between SDAT patients and their controls indicated only mild slowing in the delta and theta bands. The SPM of power amplitude showed occipital slowing, whereas the SPM of power ratio showed slowing in the frontal region; judging from both topographic findings, these were considered to denote a diffuse slow tendency. In summary, these results suggest that in AD, cortical damage accompanied by EEG slowing with reductions of the alpha 2 and beta bands arises rapidly and is followed by subcortical (non-specific thalamic) changes with frontal delta activity on SPM, whereas in SDAT, diffuse cortico-subcortical damage with diffuse slowing on EEG topography develops gradually.
ERIC Educational Resources Information Center
Callamaras, Peter
1983-01-01
This buyer's guide to seven major types of statistics software packages for microcomputers reviews Edu-Ware Statistics 3.0; Financial Planning; Speed Stat; Statistics with DAISY; Human Systems Dynamics package of Stats Plus, ANOVA II, and REGRESS II; Maxistat; and Moore-Barnes' MBC Test Construction and MBC Correlation. (MBR)
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2008-01-01
As a branch of knowledge, Statistics is ubiquitous and its applications can be found in (almost) every field of human endeavour. In this article, the authors track down the possible source of the link between the "Siren song" and applications of Statistics. Answers to their previous five questions and five new questions on Statistics are presented.
Statistics-based research--a pig in a poke?
Penston, James
2011-10-01
Much of medical research involves large-scale randomized controlled trials designed to detect small differences in outcome between the study groups. This approach is believed to produce reliable evidence on which the management of patients is based. But can we be sure that the demonstration of a small, albeit statistically significant, difference is sufficient to infer the presence of a causal relationship between the drug and the outcome? A study is claimed to have internal validity when other explanations for the observed difference - namely, inequalities between the groups, bias in the assessment of the outcome and chance - have been excluded. Despite the various processes that are put into place - including, for example, randomization, allocation concealment, double-blinding and intention-to-treat analysis - it remains doubtful whether the groups are equal in terms of all factors relevant to the outcome and whether bias has been excluded. As for the exclusion of chance, not only may inappropriate statistical tests be used, but also frequentist statistics has been subjected to serious criticisms in recent years that further bring internal validity into question. But the problems do not end with the flaws in internal validity. The philosophical basis of large-scale randomized controlled trials and epidemiological studies is unsound. When examined closely, many obstacles emerge that threaten the inference from a small, statistically significant difference to the presence of a causal relationship between the drug and the outcome. Given the influence of statistics-based research on the practice of medicine, it is of the utmost importance that the flaws in this methodology are brought to the fore.
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2008-01-01
In this article, the authors focus on hypothesis testing--that peculiarly statistical way of deciding things. Statistical methods for testing hypotheses were developed in the 1920s and 1930s by some of the most famous statisticians, in particular Ronald Fisher, Jerzy Neyman and Egon Pearson, who laid the foundations of almost all modern methods of…
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
Tellinghuisen, Joel
2008-01-01
The method of least squares is probably the most powerful data analysis tool available to scientists. Toward a fuller appreciation of that power, this work begins with an elementary review of statistics fundamentals, and then progressively increases in sophistication as the coverage is extended to the theory and practice of linear and nonlinear least squares. The results are illustrated in application to data analysis problems important in the life sciences. The review of fundamentals includes the role of sampling and its connection to probability distributions, the Central Limit Theorem, and the importance of finite variance. Linear least squares are presented using matrix notation, and the significance of the key probability distributions-Gaussian, chi-square, and t-is illustrated with Monte Carlo calculations. The meaning of correlation is discussed, including its role in the propagation of error. When the data themselves are correlated, special methods are needed for the fitting, as they are also when fitting with constraints. Nonlinear fitting gives rise to nonnormal parameter distributions, but the 10% Rule of Thumb suggests that such problems will be insignificant when the parameter is sufficiently well determined. Illustrations include calibration with linear and nonlinear response functions, the dangers inherent in fitting inverted data (e.g., Lineweaver-Burk equation), an analysis of the reliability of the van't Hoff analysis, the problem of correlated data in the Guggenheim method, and the optimization of isothermal titration calorimetry procedures using the variance-covariance matrix for experiment design. The work concludes with illustrations on assessing and presenting results.
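As an illustration of the linear least-squares machinery reviewed above, here is a small self-contained sketch. The model, coefficients, and noise level are invented for the example; for a two-parameter line, the matrix normal equations (X^T X)b = X^T y reduce to a 2x2 system that can be solved in closed form:

```python
import random

random.seed(1)

# Synthetic data: y = 2 + 3x + Gaussian noise (hypothetical example)
xs = [i / 10 for i in range(50)]
ys = [2 + 3 * x + random.gauss(0, 0.5) for x in xs]

# Normal equations (X^T X) b = X^T y for the model y = b0 + b1*x,
# written out explicitly for the 2x2 case instead of general matrix code.
n = len(xs)
sx = sum(xs); sxx = sum(x * x for x in xs)
sy = sum(ys); sxy = sum(x * y for x, y in zip(xs, ys))

det = n * sxx - sx * sx
b0 = (sxx * sy - sx * sxy) / det   # intercept estimate
b1 = (n * sxy - sx * sy) / det     # slope estimate

print(f"intercept ~ {b0:.2f}, slope ~ {b1:.2f}")
```

With well-spread x values and modest noise, the estimates land close to the true coefficients; the same normal-equation structure generalizes to any number of parameters in matrix form.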
ERIC Educational Resources Information Center
Huberty, Carl J.
An approach to statistical testing, which combines Neyman-Pearson hypothesis testing and Fisher significance testing, is recommended. The use of P-values in this approach is discussed in some detail. The author also discusses some problems which are often found in introductory statistics textbooks. The problems involve the definitions of…
ERIC Educational Resources Information Center
Chicot, Katie; Holmes, Hilary
2012-01-01
The use, and misuse, of statistics is commonplace, yet in printed form data representations can be either oversimplified, supposedly for impact, or so complex as to lead to boredom, supposedly for completeness and accuracy. In this article the link to the video clip shows how dynamic visual representations can enliven and enhance the…
Rendón-Macías, Mario Enrique; Villasís-Keever, Miguel Ángel; Miranda-Novales, María Guadalupe
2016-01-01
Descriptive statistics is the branch of statistics that gives recommendations on how to summarize research data clearly and simply in tables, figures, charts, or graphs. Before performing a descriptive analysis, it is paramount to define the goal or goals of the study and to identify the measurement scales of the different variables recorded in it. Tables and charts aim to provide timely information on the results of an investigation. Graphs show trends and can be histograms, pie charts, "box and whiskers" plots, line graphs, or scatter plots. Images serve as examples to reinforce concepts or facts. The choice of a table, chart, graph, or image must be based on the study objectives. It is usually not recommended to use more than seven of them in an article, depending also on its length.
Order Statistics and Nonparametric Statistics.
2014-09-26
Topics investigated include the following: the probability that a fuze will fire; moving order statistics; distribution theory and properties of the…problem posed by an Army scientist: a fuze will fire when at least n-1 (or n-2) of n detonators function within time span t. What is the probability of
NASA Astrophysics Data System (ADS)
Goodman, Joseph W.
2000-07-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the series:
T. W. Anderson, The Statistical Analysis of Time Series
T. S. Arthanari & Yadolah Dodge, Mathematical Programming in Statistics
Emil Artin, Geometric Algebra
Norman T. J. Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences
Robert G. Bartle, The Elements of Integration and Lebesgue Measure
George E. P. Box & Norman R. Draper, Evolutionary Operation: A Statistical Method for Process Improvement
George E. P. Box & George C. Tiao, Bayesian Inference in Statistical Analysis
R. W. Carter, Finite Groups of Lie Type: Conjugacy Classes and Complex Characters
R. W. Carter, Simple Groups of Lie Type
William G. Cochran & Gertrude M. Cox, Experimental Designs, Second Edition
Richard Courant, Differential and Integral Calculus, Volume I
Richard Courant, Differential and Integral Calculus, Volume II
Richard Courant & D. Hilbert, Methods of Mathematical Physics, Volume I
Richard Courant & D. Hilbert, Methods of Mathematical Physics, Volume II
D. R. Cox, Planning of Experiments
Harold S. M. Coxeter, Introduction to Geometry, Second Edition
Charles W. Curtis & Irving Reiner, Representation Theory of Finite Groups and Associative Algebras
Charles W. Curtis & Irving Reiner, Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I
Charles W. Curtis & Irving Reiner, Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II
Cuthbert Daniel, Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition
Bruno de Finetti, Theory of Probability, Volume I
Bruno de Finetti, Theory of Probability, Volume II
W. Edwards Deming, Sample Design in Business Research
Pestana, Dinis
2013-01-01
Statistics is a privileged tool in building knowledge from information, since its purpose is to extend conclusions drawn from the limited information in a sample to the whole population. The pervasive use of statistical software (which always provides an answer, whether or not the question is adequate), and the abuse of statistics to confer a scientific flavour on so much bad science, have had a pernicious effect, fostering some disbelief in statistical research. Were Lord Rutherford alive today, it is almost certain that he would not condemn the use of statistics in research, as he did at the dawn of the 20th century. But he would indeed urge everyone to use statistics quantum satis, since using bad data, too many data, and statistics to enquire into irrelevant questions is a source of bad science, not least because with too many data we can establish the statistical significance of irrelevant results. This is an important point that devotees of evidence-based medicine should be aware of, since the meta-analysis of too many data will inevitably establish senseless results.
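The point about too many data deserves a numeric illustration. The sketch below (standard library only; the effect size and sample sizes are arbitrary) applies a large-sample z test to a mean difference of 0.01 standard deviations, which is scientifically negligible, and shows that with a million observations per group it is nevertheless declared highly significant:

```python
import math
import random

random.seed(0)

def two_sample_z_p(a, b):
    """Approximate two-sided p-value for a difference in means (large-sample z test)."""
    ma = sum(a) / len(a); mb = sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

# A scientifically irrelevant difference of 0.01 standard deviations...
small = [[random.gauss(0, 1) for _ in range(100)],
         [random.gauss(0.01, 1) for _ in range(100)]]
large = [[random.gauss(0, 1) for _ in range(1_000_000)],
         [random.gauss(0.01, 1) for _ in range(1_000_000)]]

print("n = 100:     p =", round(two_sample_z_p(*small), 3))   # typically not significant
print("n = 1000000: p =", round(two_sample_z_p(*large), 9))   # typically "highly significant"
```

Statistical significance here certifies only that the difference is detectable, not that it matters; the effect size has to be judged separately.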
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced.
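Several of the diagnostic-accuracy measures mentioned here follow directly from a 2x2 table of test results against disease status. A minimal sketch, using invented counts:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, accuracy, and likelihood ratios from a 2x2 table."""
    sens = tp / (tp + fn)          # P(test positive | disease present)
    spec = tn / (tn + fp)          # P(test negative | disease absent)
    acc = (tp + tn) / (tp + fp + fn + tn)
    lr_pos = sens / (1 - spec)     # positive likelihood ratio
    lr_neg = (1 - sens) / spec     # negative likelihood ratio
    return sens, spec, acc, lr_pos, lr_neg

# Hypothetical 2x2 table: 90 true positives, 10 false negatives,
# 20 false positives, 180 true negatives.
sens, spec, acc, lr_pos, lr_neg = diagnostic_summary(tp=90, fp=20, fn=10, tn=180)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

Unlike accuracy, the likelihood ratios do not depend on disease prevalence in the sample, which is why they are often preferred for carrying a test result into an individual patient's pre-test odds.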
NASA Astrophysics Data System (ADS)
Paine, Gregory Harold
1982-03-01
The primary objective of the thesis is to explore the dynamical properties of small nerve networks by means of the methods of statistical mechanics. To this end, a general formalism is developed and applied to elementary groupings of model neurons which are driven by either constant (steady state) or nonconstant (nonsteady state) forces. Neuronal models described by a system of coupled, nonlinear, first-order, ordinary differential equations are considered. A linearized form of the neuronal equations is studied in detail. A Lagrange function corresponding to the linear neural network is constructed which, through a Legendre transformation, provides a constant of motion. By invoking the Maximum-Entropy Principle with the single integral of motion as a constraint, a probability distribution function for the network in a steady state can be obtained. The formalism is implemented for some simple networks driven by a constant force; accordingly, the analysis focuses on a study of fluctuations about the steady state. In particular, a network composed of N noninteracting neurons, termed Free Thinkers, is considered in detail, with a view to interpretation and numerical estimation of the Lagrange multiplier corresponding to the constant of motion. As an archetypical example of a net of interacting neurons, the classical neural oscillator, consisting of two mutually inhibitory neurons, is investigated. It is further shown that in the case of a network driven by a nonconstant force, the Maximum-Entropy Principle can be applied to determine a probability distribution functional describing the network in a nonsteady state. The above examples are reconsidered with nonconstant driving forces which produce small deviations from the steady state. Numerical studies are performed on simplified models of two physical systems: the starfish central nervous system and the mammalian olfactory bulb. Discussions are given as to how statistical neurodynamics can be used to gain a better
A study on the use of Gumbel approximation with the Bernoulli spatial scan statistic.
Read, S; Bath, P A; Willett, P; Maheswaran, R
2013-08-30
The Bernoulli version of the spatial scan statistic is a well established method of detecting localised spatial clusters in binary labelled point data, a typical application being the epidemiological case-control study. A recent study suggests the inferential accuracy of several versions of the spatial scan statistic (principally the Poisson version) can be improved, at little computational cost, by using the Gumbel distribution, a method now available in SaTScan(TM) (www.satscan.org). We study in detail the effect of this technique when applied to the Bernoulli version and demonstrate that it is highly effective, albeit with some increase in false alarm rates at certain significance thresholds. We explain how this increase is due to the discrete nature of the Bernoulli spatial scan statistic and demonstrate that it can affect even small p-values. Despite this, we argue that the Gumbel method is actually preferable for very small p-values. Furthermore, we extend previous research by running benchmark trials on 12 000 synthetic datasets, thus demonstrating that the overall detection capability of the Bernoulli version (i.e. ratio of power to false alarm rate) is not noticeably affected by the use of the Gumbel method. We also provide an example application of the Gumbel method using data on hospital admissions for chronic obstructive pulmonary disease.
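The core of the Gumbel technique, fitting an extreme-value distribution to Monte Carlo replicates of a maximum statistic so that p-values smaller than 1/(R+1) can be resolved, can be sketched in a few lines. The statistic below is a stand-in (the maximum of 100 unit normals), not the actual Bernoulli scan statistic, and the method-of-moments fit is one simple choice:

```python
import math
import random

random.seed(7)

# Stand-in for a scan statistic: the maximum of 100 unit normals per replicate.
# (The real Bernoulli scan statistic is a maximum log-likelihood ratio over
# scanning windows; only the Gumbel tail-approximation step is sketched here.)
def one_replicate():
    return max(random.gauss(0, 1) for _ in range(100))

maxima = [one_replicate() for _ in range(999)]

# Method-of-moments Gumbel fit to the Monte Carlo maxima:
# mean = mu + gamma*beta, variance = (pi*beta)^2 / 6.
m = sum(maxima) / len(maxima)
s = math.sqrt(sum((x - m) ** 2 for x in maxima) / (len(maxima) - 1))
beta = s * math.sqrt(6) / math.pi
mu = m - 0.5772156649 * beta  # Euler-Mascheroni constant

def gumbel_p(x):
    """Upper-tail p-value under the fitted Gumbel distribution."""
    return 1 - math.exp(-math.exp(-(x - mu) / beta))

observed = 4.5  # hypothetical observed statistic
print(f"Gumbel p-value: {gumbel_p(observed):.5f}")
# A plain Monte Carlo p-value from 999 replicates cannot go below 1/1000;
# the fitted Gumbel tail can resolve much smaller p-values.
```

This is exactly the regime discussed in the abstract: for very small p-values the fitted tail is smoother and often more accurate than the coarse empirical rank, at essentially no extra computational cost.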
Whither Statistics Education Research?
ERIC Educational Resources Information Center
Watson, Jane
2016-01-01
This year marks the 25th anniversary of the publication of a "National Statement on Mathematics for Australian Schools", the first curriculum statement in this country to include "Chance and Data" as a significant component. It is hence an opportune time to survey the history of the related statistics education…
Detecting Statistically Significant Communities of Triangle Motifs in Undirected Networks
2015-03-16
…to LFR benchmark graphs, relative to the method proposed by Perry et al. [6]. (Distribution A: approved for public release; distribution is unlimited.) …trials. Specifically, let (X, Y) denote any two observed triangles; then for a Bernoulli(p) graph, E(X) = E(Y) = p^3. …the observed adjacency matrix, and consider the null hypothesis H0: the number of triangles in A is consistent with a Bernoulli graph with edge probability p.
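The expectation E(X) = p^3 for a single triple extends to the expected triangle count of a Bernoulli(p) graph: each of the C(n,3) vertex triples closes into a triangle with probability p^3. A small simulation (graph size and edge probability are arbitrary choices) checks this against an observed count:

```python
import itertools
import random

random.seed(3)

def bernoulli_graph(n, p):
    """Undirected Erdos-Renyi / Bernoulli(p) graph as a set of sorted edge pairs."""
    edges = set()
    for u, v in itertools.combinations(range(n), 2):
        if random.random() < p:
            edges.add((u, v))
    return edges

def count_triangles(n, edges):
    has = lambda u, v: (min(u, v), max(u, v)) in edges
    return sum(1 for a, b, c in itertools.combinations(range(n), 3)
               if has(a, b) and has(b, c) and has(a, c))

n, p = 60, 0.2
expected = (n * (n - 1) * (n - 2) / 6) * p ** 3  # C(n,3) triples, each closes w.p. p^3
observed = count_triangles(n, bernoulli_graph(n, p))
print(f"expected ~ {expected:.0f}, observed = {observed}")
```

A test of the null hypothesis H0 above amounts to asking whether the observed count is a plausible draw from this null distribution, e.g. by simulating many such graphs.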
Intervention for Maltreating Fathers: Statistically and Clinically Significant Change
ERIC Educational Resources Information Center
Scott, Katreena L.; Lishak, Vicky
2012-01-01
Objective: Fathers are seldom the focus of efforts to address child maltreatment and little is currently known about the effectiveness of intervention for this population. To address this gap, we examined the efficacy of a community-based group treatment program for fathers who had abused or neglected their children or exposed their children to…
Worry, Intolerance of Uncertainty, and Statistics Anxiety
ERIC Educational Resources Information Center
Williams, Amanda S.
2013-01-01
Statistics anxiety is a problem for most graduate students. This study investigates the relationship between intolerance of uncertainty, worry, and statistics anxiety. Intolerance of uncertainty was significantly related to worry, and worry was significantly related to three types of statistics anxiety. Six types of statistics anxiety were…
Suite versus composite statistics
Balsillie, J.H.; Tanner, W.F.
1999-01-01
Suite and composite methodologies, two statistically valid approaches for producing descriptive statistical measures, are investigated for sample groups representing a probability distribution in which, in addition, each sample is itself a probability distribution. Suite and composite means (first moment measures) are always equivalent. Composite standard deviations (second moment measures) are always larger than suite standard deviations. Suite and composite values for higher moment measures have more complex relationships; very seldom are they equivalent, and they normally yield statistically significantly different results. Multiple samples are preferable to single samples (including composites) because they permit the investigator to examine sample-to-sample variability. These and other relationships for suite and composite probability distribution analyses are investigated and reported using granulometric data.
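The relationship between suite and composite moments is easy to verify numerically. In the sketch below (three invented samples of equal size), the suite and composite means agree exactly, while the composite standard deviation exceeds the average within-sample standard deviation because it also absorbs the between-sample spread:

```python
import random
import statistics

random.seed(11)

# Three samples (e.g., grain-size measurements) with different centers.
samples = [[random.gauss(mu, 1.0) for _ in range(200)] for mu in (0.0, 0.5, 1.0)]

# Suite statistics: compute a moment per sample, then summarize across samples.
suite_means = [statistics.mean(s) for s in samples]
suite_mean = statistics.mean(suite_means)
suite_sd = statistics.mean(statistics.stdev(s) for s in samples)

# Composite statistics: pool all observations first, then compute moments.
pooled = [x for s in samples for x in s]
composite_mean = statistics.mean(pooled)
composite_sd = statistics.stdev(pooled)

print(f"suite mean={suite_mean:.3f}  composite mean={composite_mean:.3f}")  # equal (equal n)
print(f"suite sd  ={suite_sd:.3f}  composite sd  ={composite_sd:.3f}")      # composite larger
```

With equal sample sizes the two means coincide exactly; the composite variance is, in effect, the within-sample variance plus the variance of the sample means.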
Statistics Anxiety among Postgraduate Students
ERIC Educational Resources Information Center
Koh, Denise; Zawi, Mohd Khairi
2014-01-01
Most postgraduate programmes that have research components require students to take at least one course in research statistics. Not all postgraduate programmes are science-based; a significant number of postgraduate students from the social sciences will also be taking statistics courses as they try to complete their…
Nursing student attitudes toward statistics.
Mathew, Lizy; Aktan, Nadine M
2014-04-01
Nursing is guided by evidence-based practice. To understand and apply research to practice, nurses must be knowledgeable in statistics; therefore, it is crucial to promote a positive attitude toward statistics among nursing students. The purpose of this quantitative cross-sectional study was to assess differences in attitudes toward statistics among undergraduate nursing, graduate nursing, and undergraduate non-nursing students. The Survey of Attitudes Toward Statistics Scale-36 (SATS-36) was used to measure student attitudes, with higher scores denoting more positive attitudes. The convenience sample was composed of 175 students from a public university in the northeastern United States. Statistically significant relationships were found among some of the key demographic variables. Graduate nursing students had a significantly lower score on the SATS-36, compared with baccalaureate nursing and non-nursing students. Therefore, an innovative nursing curriculum that incorporates knowledge of student attitudes and key demographic variables may result in favorable outcomes.
Statistics Poker: Reinforcing Basic Statistical Concepts
ERIC Educational Resources Information Center
Leech, Nancy L.
2008-01-01
Learning basic statistical concepts does not need to be tedious or dry; it can be fun and interesting through cooperative learning in the small-group activity of Statistics Poker. This article describes a teaching approach for reinforcing basic statistical concepts that can help students who have high anxiety and makes learning and reinforcing…
Predict! Teaching Statistics Using Informational Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie
2013-01-01
Statistics is one of the most widely used topics for everyday life in the school mathematics curriculum. Unfortunately, the statistics taught in schools focuses on calculations and procedures before students have a chance to see it as a useful and powerful tool. Researchers have found that a dominant view of statistics is as an assortment of tools…
Neuroendocrine Tumor: Statistics
Adrenal Gland Tumors: Statistics
Keywords: statistical analysis (reports); probability (reports); information theory; differential equations; statistical processes; stochastic processes; multivariate analysis; distribution theory; decision theory; measure theory; optimization
Wild, M.; Rouhani, S.
1995-02-01
A typical site investigation entails extensive sampling and monitoring. In the past, sampling plans have been designed on purely ad hoc bases, leading to significant expenditures and, in some cases, collection of redundant information. In many instances, sampling costs exceed the true worth of the collected data. The US Environmental Protection Agency (EPA) therefore has advocated the use of geostatistics to provide a logical framework for sampling and analysis of environmental data. Geostatistical methodology uses statistical techniques for the spatial analysis of a variety of earth-related data. The use of geostatistics was developed by the mining industry to estimate ore concentrations. The same procedure is effective in quantifying environmental contaminants in soils for risk assessments. Unlike classical statistical techniques, geostatistics offers procedures to incorporate the underlying spatial structure of the investigated field. Sample points spaced close together tend to be more similar than samples spaced further apart. This can guide sampling strategies and reveal complex contaminant distributions. Geostatistical techniques can be used to evaluate site conditions on the basis of regular, irregular, random, and even spatially biased samples. In most environmental investigations, it is desirable to concentrate sampling in areas of known or suspected contamination. The rigorous mathematical procedures of geostatistics allow for accurate estimates at unsampled locations, potentially reducing sampling requirements. The use of geostatistics serves as a decision-aiding and planning tool; it can significantly reduce short-term site assessment costs and long-term sampling and monitoring needs, as well as lead to more accurate and realistic remedial design criteria.
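The statement that nearby samples tend to be more similar than distant ones is usually quantified with an empirical semivariogram. Here is a one-dimensional toy sketch (the readings and their spatial trend are invented) in which the semivariance grows with lag distance:

```python
import math
import random

random.seed(5)

# Hypothetical 1-D contaminant readings with spatial correlation:
# a smooth trend plus measurement noise, sampled every metre.
xs = list(range(100))
zs = [math.sin(x / 15.0) + random.gauss(0, 0.1) for x in xs]

def semivariogram(xs, zs, lag, tol=0.5):
    """Empirical semivariance gamma(h): mean of (z_i - z_j)^2 / 2 over pairs ~lag apart."""
    sq = [(zs[i] - zs[j]) ** 2
          for i in range(len(xs)) for j in range(i + 1, len(xs))
          if abs(abs(xs[i] - xs[j]) - lag) <= tol]
    return sum(sq) / (2 * len(sq))

for h in (1, 5, 20):
    print(f"gamma({h:2d}) = {semivariogram(xs, zs, h):.3f}")
```

The rising curve of gamma(h) against h is what a kriging model fits; its shape determines how much an unsampled location can borrow from its neighbours.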
[Comment on] Statistical discrimination
NASA Astrophysics Data System (ADS)
Chinn, Douglas
In the December 8, 1981, issue of Eos, a news item reported the conclusion of a National Research Council study that sexual discrimination against women with Ph.D.'s exists in the field of geophysics. Basically, the item reported that even when allowances are made for motherhood the percentage of female Ph.D.'s holding high university and corporate positions is significantly lower than the percentage of male Ph.D.'s holding the same types of positions. The sexual discrimination conclusion, based only on these statistics, assumes that there are no basic psychological differences between men and women that might cause different populations in the employment group studied. Therefore, the reasoning goes, after taking into account possible effects from differences related to anatomy, such as women stopping their careers in order to bear and raise children, the statistical distributions of positions held by male and female Ph.D.'s ought to be very similar to one another. Any significant differences between the distributions must be caused primarily by sexual discrimination.
Antecedents of students' achievement in statistics
NASA Astrophysics Data System (ADS)
Awaludin, Izyan Syazana; Razak, Ruzanna Ab; Harris, Hezlin; Selamat, Zarehan
2015-02-01
The applications of statistics in most fields are vast. Many degree programmes at local universities require students to enrol in at least one statistics course. The standard of these courses varies across degree programmes, because students come from diverse academic backgrounds, some far from the field of statistics. The high failure rate in statistics courses for non-science-stream students has been a concern every year. The purpose of this research is to investigate the antecedents of students' achievement in statistics. A total of 272 students participated in the survey. Multiple linear regression was applied to examine the relationship between the factors and achievement. We found that statistics anxiety was a significant predictor of students' achievement. We also found that students' age has a significant effect on achievement: older students are more likely to achieve lower scores in statistics. Students' level of study also has a significant impact on their achievement in statistics.
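A multiple linear regression of the kind used in this study can be sketched from first principles via the normal equations. All data below are simulated under an invented model (score falling with anxiety and age), so only the mechanics, not the study's findings, are reproduced:

```python
import random

random.seed(2)

# Hypothetical model: score = 70 - 3*anxiety - 0.5*age + noise
n = 300
anxiety = [random.uniform(1, 5) for _ in range(n)]
age = [random.uniform(20, 40) for _ in range(n)]
score = [70 - 3 * a - 0.5 * g + random.gauss(0, 2) for a, g in zip(anxiety, age)]

# Ordinary least squares via the normal equations (X^T X) b = X^T y.
X = [[1.0, a, g] for a, g in zip(anxiety, age)]
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * y for r, y in zip(X, score)) for i in range(3)]

def solve3(A, b):
    """Gauss-Jordan elimination with partial pivoting for a 3x3 system."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

b0, b_anxiety, b_age = solve3(XtX, Xty)
print(f"intercept~{b0:.1f}  anxiety~{b_anxiety:.2f}  age~{b_age:.2f}")
```

The fitted coefficients recover the simulated effects; in practice a statistics package would add standard errors and p-values for each predictor.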
Statistical Reference Datasets
National Institute of Standards and Technology Data Gateway
Statistical Reference Datasets (Web, free access) The Statistical Reference Datasets is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.
Explorations in statistics: statistical facets of reproducibility.
Curran-Everett, Douglas
2016-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eleventh installment of Explorations in Statistics explores statistical facets of reproducibility. If we obtain an experimental result that is scientifically meaningful and statistically unusual, we would like to know that our result reflects a general biological phenomenon that another researcher could reproduce if (s)he repeated our experiment. But more often than not, we may learn this researcher cannot replicate our result. The National Institutes of Health and the Federation of American Societies for Experimental Biology have created training modules and outlined strategies to help improve the reproducibility of research. These particular approaches are necessary, but they are not sufficient. The principles of hypothesis testing and estimation are inherent to the notion of reproducibility in science. If we want to improve the reproducibility of our research, then we need to rethink how we apply fundamental concepts of statistics to our science.
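The gap between statistical significance and reproducibility can be simulated directly. In the sketch below (effect size, group size, and significance level are arbitrary choices), each "discovery" that reaches p < 0.05 is followed by an independent replication attempt; only a minority of discoveries replicate because the design is underpowered:

```python
import math
import random

random.seed(9)

def p_value(effect, n):
    """Two-sided p for one simulated two-sample comparison (large-sample z test)."""
    ma = sum(random.gauss(effect, 1) for _ in range(n)) / n
    mb = sum(random.gauss(0.0, 1) for _ in range(n)) / n
    z = (ma - mb) / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))

effect, n, trials = 0.5, 20, 2000
significant = 0
significant_then_replicated = 0
for _ in range(trials):
    if p_value(effect, n) < 0.05:          # "discovery" experiment
        significant += 1
        if p_value(effect, n) < 0.05:      # independent replication attempt
            significant_then_replicated += 1

print(f"significant discoveries: {significant}/{trials}")
print(f"replicated at p < 0.05:  {significant_then_replicated}/{significant}")
# With a modest true effect and n = 20 per group, most "significant"
# results fail to replicate - significance alone is weak evidence.
```

The replication rate here is roughly the power of the design, which is why underpowered studies are a structural driver of irreproducibility even when every individual analysis is done correctly.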
[Big data in official statistics].
Zwick, Markus
2015-08-01
The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany.
(Errors in statistical tests)³.
Phillips, Carl V; MacLehose, Richard F; Kaufman, Jay S
2008-07-14
In 2004, Garcia-Berthou and Alcaraz published "Incongruence between test statistics and P values in medical papers," a critique of statistical errors that received a tremendous amount of attention. One of their observations was that the final reported digit of p-values in articles published in the journal Nature departed substantially from the uniform distribution that they suggested should be expected. In 2006, Jeng critiqued that critique, observing that the statistical analysis of those terminal digits had been based on comparing the actual distribution to a uniform continuous distribution, when digits obviously are discretely distributed. Jeng corrected the calculation and reported statistics that did not so clearly support the claim of a digit preference. However delightful it may be to read a critique of statistical errors in a critique of statistical errors, we nevertheless found several aspects of the whole exchange to be quite troubling, prompting our own meta-critique of the analysis. The previous discussion emphasized statistical significance testing. But there are various reasons to expect departure from the uniform distribution in terminal digits of p-values, so that simply rejecting the null hypothesis is not terribly informative. Much more importantly, Jeng found that the original p-value of 0.043 should have been 0.086, and suggested this represented an important difference because it was on the other side of 0.05. Among the most widely reiterated (though often ignored) tenets of modern quantitative research methods is that we should not treat statistical significance as a bright line test of whether we have observed a phenomenon. Moreover, it sends the wrong message about the role of statistics to suggest that a result should be dismissed because of limited statistical precision when it is so easy to gather more data. In response to these limitations, we gathered more data to improve the statistical precision, and analyzed the actual pattern of the
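The terminal-digit analysis at issue amounts to a chi-square goodness-of-fit test against a discrete uniform distribution on the digits 0-9. A minimal sketch with simulated (truly uniform) digits:

```python
import random

random.seed(4)

def chi_square_stat(digits):
    """Chi-square goodness-of-fit statistic against a discrete uniform on 0-9."""
    counts = [0] * 10
    for d in digits:
        counts[d] += 1
    expected = len(digits) / 10
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical terminal digits of reported p-values; truly uniform here.
digits = [random.randrange(10) for _ in range(1000)]
stat = chi_square_stat(digits)

# Critical value for alpha = 0.05 with 10 - 1 = 9 degrees of freedom is 16.92.
print(f"chi-square = {stat:.2f}; reject uniformity at 5%: {stat > 16.92}")
```

Jeng's correction was essentially about using this discrete reference rather than a continuous one; and as the abstract argues, even a correct rejection of uniformity says little by itself, since mild digit preferences in reported p-values have mundane explanations.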
Ranald Macdonald and statistical inference.
Smith, Philip T
2009-05-01
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing.
Chiou, Chei-Chang; Wang, Yu-Min; Lee, Li-Tze
2014-08-01
Statistical knowledge is widely used in academia; however, statistics teachers struggle with the issue of how to reduce students' statistics anxiety and enhance students' statistics learning. This study assesses the effectiveness of a "one-minute paper strategy" in reducing students' statistics-related anxiety and in improving students' statistics-related achievement. Participants were 77 undergraduates from two classes enrolled in applied statistics courses. An experiment was implemented according to a pretest/posttest comparison group design. The quasi-experimental design showed that the one-minute paper strategy significantly reduced students' statistics anxiety and improved students' statistics learning achievement. The strategy was a better instructional tool than the textbook exercise for reducing students' statistics anxiety and improving students' statistics achievement.
Mathematical and statistical analysis
NASA Technical Reports Server (NTRS)
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Experiment in Elementary Statistics
ERIC Educational Resources Information Center
Fernando, P. C. B.
1976-01-01
Presents an undergraduate laboratory exercise in elementary statistics in which students verify empirically the various aspects of the Gaussian distribution. Sampling techniques and other commonly used statistical procedures are introduced. (CP)
Significance Analysis of Prognostic Signatures
Beck, Andrew H.; Knoblauch, Nicholas W.; Hefti, Marco M.; Kaplan, Jennifer; Schnitt, Stuart J.; Culhane, Aedin C.; Schroeder, Markus S.; Risch, Thomas; Quackenbush, John; Haibe-Kains, Benjamin
2013-01-01
A major goal in translational cancer research is to identify biological signatures driving cancer progression and metastasis. A common technique applied in genomics research is to cluster patients using gene expression data from a candidate prognostic gene set, and if the resulting clusters show statistically significant outcome stratification, to associate the gene set with prognosis, suggesting its biological and clinical importance. Recent work has questioned the validity of this approach by showing in several breast cancer data sets that “random” gene sets tend to cluster patients into prognostically variable subgroups. This work suggests that new rigorous statistical methods are needed to identify biologically informative prognostic gene sets. To address this problem, we developed Significance Analysis of Prognostic Signatures (SAPS) which integrates standard prognostic tests with a new prognostic significance test based on stratifying patients into prognostic subtypes with random gene sets. SAPS ensures that a significant gene set is not only able to stratify patients into prognostically variable groups, but is also enriched for genes showing strong univariate associations with patient prognosis, and performs significantly better than random gene sets. We use SAPS to perform a large meta-analysis (the largest completed to date) of prognostic pathways in breast and ovarian cancer and their molecular subtypes. Our analyses show that only a small subset of the gene sets found statistically significant using standard measures achieve significance by SAPS. We identify new prognostic signatures in breast and ovarian cancer and their corresponding molecular subtypes, and we show that prognostic signatures in ER negative breast cancer are more similar to prognostic signatures in ovarian cancer than to prognostic signatures in ER positive breast cancer. SAPS is a powerful new method for deriving robust prognostic biological signatures from clinically annotated
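The core idea of benchmarking a candidate gene set against random gene sets of the same size can be sketched as an empirical permutation test. This is a simplified illustration, not the SAPS implementation; the function name and simulated scores are assumptions:

```python
import random

def empirical_p(candidate_mean, all_gene_scores, set_size, n_random=2000, rng=None):
    """Empirical p-value: the fraction of same-sized random gene sets
    whose mean score is at least as large as the candidate set's mean.
    The +1 terms keep the estimate strictly above zero."""
    rng = rng or random.Random(3)
    hits = sum(
        1 for _ in range(n_random)
        if sum(rng.sample(all_gene_scores, set_size)) / set_size >= candidate_mean
    )
    return (hits + 1) / (n_random + 1)

# Simulated per-gene association scores for 1000 genes.
gene_rng = random.Random(11)
scores = [gene_rng.gauss(0.0, 1.0) for _ in range(1000)]

p_strong = empirical_p(1.5, scores, set_size=20)  # strongly associated set
p_null = empirical_p(0.0, scores, set_size=20)    # unremarkable set
```

A gene set whose mean association score merely matches what random sets achieve gets a large p-value, which is the failure mode of "random signatures look prognostic" that SAPS is designed to guard against.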
ERIC Educational Resources Information Center
Lenard, Christopher; McCarthy, Sally; Mills, Terence
2014-01-01
There are many different aspects of statistics. Statistics involves mathematics, computing, and applications to almost every field of endeavour. Each aspect provides an opportunity to spark someone's interest in the subject. In this paper we discuss some ethical aspects of statistics, and describe how an introduction to ethics has been…
Teaching Statistics Using SAS.
ERIC Educational Resources Information Center
Mandeville, Garrett K.
The Statistical Analysis System (SAS) is presented as the single most appropriate statistical package to use as an aid in teaching statistics. A brief review of literature in which SAS is compared to SPSS, BMDP, and other packages is followed by six examples which demonstrate features unique to SAS which have pedagogical utility. Of particular…
Minnesota Health Statistics 1988.
ERIC Educational Resources Information Center
Minnesota State Dept. of Health, St. Paul.
This document comprises the 1988 annual statistical report of the Minnesota Center for Health Statistics. After introductory technical notes on changes in format, sources of data, and geographic allocation of vital events, an overview is provided of vital health statistics in all areas. Thereafter, separate sections of the report provide tables…
Statistical Methods for Astronomy
NASA Astrophysics Data System (ADS)
Feigelson, Eric D.; Babu, G. Jogesh
Statistical methodology, with deep roots in probability theory, provides quantitative procedures for extracting scientific knowledge from astronomical data and for testing astrophysical theory. In recent decades, statistics has enormously increased in scope and sophistication. After a historical perspective, this review outlines concepts of mathematical statistics, elements of probability theory, hypothesis tests, and point estimation. Least squares, maximum likelihood, and Bayesian approaches to statistical inference are outlined. Resampling methods, particularly the bootstrap, provide valuable procedures when distribution functions of statistics are not known. Several approaches to model selection and goodness of fit are considered.
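The bootstrap idea mentioned in the review fits in a few lines: resample the data with replacement, recompute the statistic each time, and read a confidence interval off the percentiles. A generic sketch (the function name and toy data are assumptions, not from the review):

```python
import random
import statistics

def bootstrap_ci(sample, stat, n_boot=2000, alpha=0.05, rng=None):
    """Percentile-bootstrap confidence interval for an arbitrary statistic,
    for use when the statistic's sampling distribution is unknown."""
    rng = rng or random.Random(42)
    n = len(sample)
    boots = sorted(
        stat([rng.choice(sample) for _ in range(n)]) for _ in range(n_boot)
    )
    lo = boots[int((alpha / 2) * n_boot)]
    hi = boots[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

data = [4.1, 5.0, 4.7, 5.6, 4.4, 5.2, 4.9, 5.3, 4.6, 5.1]
low, high = bootstrap_ci(data, statistics.median)
```

Because `stat` is just a callable, the same routine covers the median, a trimmed mean, or any estimator whose distribution has no closed form.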
Gender Issues in Labour Statistics.
ERIC Educational Resources Information Center
Greenwood, Adriana Mata
1999-01-01
Presents the main features needed for labor statistics to reflect the respective situations for women and men in the labor market. Identifies topics to be covered and detail needed for significant distinctions to emerge. Explains how the choice of measurement method and data presentation can influence the final result. (Author/JOW)
Statistical distribution sampling
NASA Technical Reports Server (NTRS)
Johnson, E. S.
1975-01-01
Determining the distribution of statistics by sampling was investigated. Characteristic functions, the quadratic regression problem, and the differential equations for the characteristic functions are analyzed.
Thermodynamic Limit in Statistical Physics
NASA Astrophysics Data System (ADS)
Kuzemsky, A. L.
2014-03-01
The thermodynamic limit in statistical thermodynamics of many-particle systems is an important but often overlooked issue in the various applied studies of condensed matter physics. To settle this issue, we review tersely the past and present disposition of thermodynamic limiting procedure in the structure of the contemporary statistical mechanics and our current understanding of this problem. We pick out the ingenious approach by Bogoliubov, who developed a general formalism for establishing the limiting distribution functions in the form of formal series in powers of the density. In that study, he outlined the method of justification of the thermodynamic limit when he derived the generalized Boltzmann equations. To enrich and to weave our discussion, we take this opportunity to give a brief survey of the closely related problems, such as the equipartition of energy and the equivalence and nonequivalence of statistical ensembles. The validity of the equipartition of energy permits one to decide what are the boundaries of applicability of statistical mechanics. The major aim of this work is to provide a better qualitative understanding of the physical significance of the thermodynamic limit in modern statistical physics of the infinite and "small" many-particle systems.
Statistical Mechanics of Zooplankton.
Hinow, Peter; Nihongi, Ai; Strickler, J Rudi
2015-01-01
Statistical mechanics provides the link between microscopic properties of many-particle systems and macroscopic properties such as pressure and temperature. Observations of similar "microscopic" quantities exist for the motion of zooplankton, as well as many species of other social animals. Herein, we propose to take average squared velocities as the definition of the "ecological temperature" of a population under different conditions on nutrients, light, oxygen and others. We test the usefulness of this definition on observations of the crustacean zooplankton Daphnia pulicaria. In one set of experiments, D. pulicaria is infested with the pathogen Vibrio cholerae, the causative agent of cholera. We find that infested D. pulicaria under light exposure have a significantly greater ecological temperature, which puts them at a greater risk of detection by visual predators. In a second set of experiments, we observe D. pulicaria in cold and warm water, and in darkness and under light exposure. Overall, our ecological temperature is a good discriminator of the crustacean's swimming behavior.
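The paper's "ecological temperature" is simply the mean squared speed of the tracked animals. A minimal sketch (the function name and toy velocity data are assumptions, not the authors' measurements):

```python
def ecological_temperature(velocities):
    """Mean squared speed of a set of 2-D velocity vectors,
    the proposed 'ecological temperature' of the population."""
    return sum(vx * vx + vy * vy for vx, vy in velocities) / len(velocities)

# Toy velocity samples (arbitrary units) for two hypothetical conditions.
calm = [(0.1, 0.0), (0.0, -0.1), (-0.1, 0.1), (0.05, 0.05)]
agitated = [(1.0, 0.5), (-0.8, 1.1), (1.2, -0.4), (-0.9, -1.0)]

t_calm = ecological_temperature(calm)
t_agitated = ecological_temperature(agitated)
```

The analogy to kinetic temperature is direct: in statistical mechanics, temperature is likewise proportional to the mean squared particle velocity.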
Explorations in Statistics: Power
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2010-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fifth installment of "Explorations in Statistics" revisits power, a concept fundamental to the test of a null hypothesis. Power is the probability that we reject the null hypothesis when it is false. Four…
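Power, as defined above, lends itself to direct simulation. A sketch under simple assumptions (a two-sided one-sample z-test with known unit variance; the names and scenario are illustrative, not from the article):

```python
import math
import random
import statistics

def simulated_power(mu_true, n, n_sim=2000, rng=None):
    """Estimate the power of a two-sided one-sample z-test of H0: mu = 0
    (known sigma = 1, alpha = 0.05) as the fraction of simulated samples
    in which H0 is rejected when the true mean is mu_true."""
    rng = rng or random.Random(1)
    z_crit = 1.96  # two-sided 5% critical value of the standard normal
    rejections = 0
    for _ in range(n_sim):
        sample = [rng.gauss(mu_true, 1.0) for _ in range(n)]
        z = statistics.fmean(sample) * math.sqrt(n)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim

power_small_effect = simulated_power(0.2, n=25)  # true mean 0.2
power_large_effect = simulated_power(0.8, n=25)  # true mean 0.8
```

The simulation makes the dependence tangible: for a fixed sample size, power grows quickly with the true effect size, and a small effect leaves the test likely to miss a truly false null.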
Teaching Statistics without Sadistics.
ERIC Educational Resources Information Center
Forte, James A.
1995-01-01
Five steps designed to take anxiety out of statistics for social work students are outlined. First, statistics anxiety is identified as an educational problem. Second, instructional objectives and procedures to achieve them are presented and methods and tools for evaluating the course are explored. Strategies for, and obstacles to, making…
STATSIM: Exercises in Statistics.
ERIC Educational Resources Information Center
Thomas, David B.; And Others
A computer-based learning simulation was developed at Florida State University that allows highly interactive responding via a time-sharing terminal for the purpose of demonstrating descriptive and inferential statistics. The statistical simulation (STATSIM) comprises four modules--chi-square, t, z, and F distribution--and elucidates the…
Understanding Undergraduate Statistical Anxiety
ERIC Educational Resources Information Center
McKim, Courtney
2014-01-01
The purpose of this study was to understand undergraduate students' views of statistics. Results reveal that students with less anxiety have a higher interest in statistics and also believe in their ability to perform well in the course. Also students who have a more positive attitude about the class tend to have a higher belief in their…
ERIC Educational Resources Information Center
Hodgson, Ted; Andersen, Lyle; Robison-Cox, Jim; Jones, Clain
2004-01-01
Water quality experiments, especially the use of macroinvertebrates as indicators of water quality, offer an ideal context for connecting statistics and science. In the STAR program for secondary students and teachers, water quality experiments were also used as a context for teaching statistics. In this article, we trace one activity that uses…
Towards Statistically Undetectable Steganography
2011-06-30
Contract FA9550-08-1-0084. Author: Prof. Jessica… Approved for public release; distribution is unlimited. Abstract: fundamental asymptotic laws for imperfect steganography … formats. Subject terms: steganography, covert communication, statistical detectability, asymptotic performance, secure payload, minimum…
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive…
ERIC Educational Resources Information Center
Singer, Arlene
This guide outlines a one semester Option Y course, which has seven learner objectives. The course is designed to provide students with an introduction to the concerns and methods of statistics, and to equip them to deal with the many statistical matters of importance to society. Topics covered include graphs and charts, collection and…
Croarkin, M. Carroll
2001-01-01
For more than 50 years, the Statistical Engineering Division (SED) has been instrumental in the success of a broad spectrum of metrology projects at NBS/NIST. This paper highlights fundamental contributions of NBS/NIST statisticians to statistics and to measurement science and technology. Published methods developed by SED staff, especially during the early years, endure as cornerstones of statistics not only in metrology and standards applications, but as data-analytic resources used across all disciplines. The history of statistics at NBS/NIST began with the formation of what is now the SED. Examples from the first five decades of the SED illustrate the critical role of the division in the successful resolution of a few of the highly visible, and sometimes controversial, statistical studies of national importance. A review of the history of major early publications of the division on statistical methods, design of experiments, and error analysis and uncertainty is followed by a survey of several thematic areas. The accompanying examples illustrate the importance of SED in the history of statistics, measurements and standards: calibration and measurement assurance, interlaboratory tests, development of measurement methods, Standard Reference Materials, statistical computing, and dissemination of measurement technology. A brief look forward sketches the expanding opportunity and demand for SED statisticians created by current trends in research and development at NIST. PMID:27500023
Reform in Statistical Education
ERIC Educational Resources Information Center
Huck, Schuyler W.
2007-01-01
Two questions are considered in this article: (a) What should professionals in school psychology do in an effort to stay current with developments in applied statistics? (b) What should they do with their existing knowledge to move from surface understanding of statistics to deep understanding? Written for school psychologists who have completed…
Deconstructing Statistical Analysis
ERIC Educational Resources Information Center
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article, or of making a new product or service look legitimate, needs to be monitored and questioned for accuracy. (1) The more complicated the statistical analysis and research, the fewer learned readers can understand it. This adds a…
ERIC Educational Resources Information Center
Huizingh, Eelko K. R. E.
2007-01-01
Accessibly written and easy to use, "Applied Statistics Using SPSS" is an all-in-one self-study guide to SPSS and do-it-yourself guide to statistics. What is unique about Eelko Huizingh's approach is that this book is based around the needs of undergraduate students embarking on their own research project, and its self-help style is designed to…
Vijayaraj, Veeraraghavan; Cheriyadat, Anil M; Bhaduri, Budhendra L; Vatsavai, Raju; Bright, Eddie A
2008-01-01
Statistical properties of high-resolution overhead images representing different land use categories are analyzed using various local and global statistical image properties based on the shape of the power spectrum, image gradient distributions, edge co-occurrence, and inter-scale wavelet coefficient distributions. The analysis was performed on a database of high-resolution (1 meter) overhead images representing a multitude of different downtown, suburban, commercial, agricultural and wooded exemplars. Various statistical properties relating to these image categories and their relationships are discussed. The categorical variations in power spectrum contour shapes, the unique gradient distribution characteristics of wooded categories, the similarity in edge co-occurrence statistics for overhead and natural images, and the unique edge co-occurrence statistics of downtown categories are presented in this work. Though previous work on natural image statistics has shown some of the unique characteristics for different categories, the relationships for overhead images are not well understood. The statistical properties of natural images were used in previous studies to develop prior image models, to predict and index objects in a scene and to improve computer vision models. The results from our research findings can be used to augment and adapt computer vision algorithms that rely on prior image statistics to process overhead images, calibrate the performance of overhead image analysis algorithms, and derive features for better discrimination of overhead image categories.
Queer (v.) queer (v.): biology as curriculum, pedagogy, and being albeit queer (v.)
NASA Astrophysics Data System (ADS)
Broadway, Francis S.
2011-06-01
In order to advance the purpose of education as creating a sustainable world yet to be imagined, educationally, queer (v.) queer (v.) expounds curriculum, pedagogy and being, which has roots in sexuality—the public face of the private confluence of sexuality, gender, race and class, are a necessary framework for queer. If queer is a complicated conversation of strangers' eros, then queer facilitates the creation of space, revolution and transformation. In other words, queer, for science education, is more than increasing and privileging the heteronormative and non-heteronormative science content that extends capitalism's hegemony, but rather science as the dignity, identity, and loving and caring of and by one's self and fellow human beings as strangers.
Queer (v.) Queer (v.): Biology as Curriculum, Pedagogy, and Being albeit Queer (v.)
ERIC Educational Resources Information Center
Broadway, Francis S.
2011-01-01
In order to advance the purpose of education as creating a sustainable world yet to be imagined, educationally, queer (v.) queer (v.) expounds curriculum, pedagogy and being, which has roots in sexuality--the public face of the private confluence of sexuality, gender, race and class, are a necessary framework for queer. If queer is a complicated…
ERIC Educational Resources Information Center
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U.
2015-01-01
This article uses definitions provided by Cronbach in his seminal paper for coefficient alpha to show the concepts of reliability, dimensionality, and internal consistency are distinct but interrelated. The article begins with a critique of the definition of reliability and then explores mathematical properties of Cronbach's alpha. Internal consistency…
Ector, Hugo
2010-12-01
I still remember my first book on statistics: "Elementary Statistics with Applications in Medicine and the Biological Sciences" by Frederick E. Croxton. For me, it was the start of a pursuit of understanding statistics in daily life and in medical practice. It was the first volume in a long row of books. In his introduction, Croxton claims that "nearly everyone involved in any aspect of medicine needs to have some knowledge of statistics". The reality is that for many clinicians, statistics are limited to "P < 0.05 = ok". I do not blame my colleagues who omit the paragraph on statistical methods. They have never had the opportunity to learn concise and clear descriptions of the key features. I have experienced how some authors can describe difficult methods in readily understandable language. Others fail completely. As a teacher, I tell my students that life is impossible without a basic knowledge of statistics. This feeling has resulted in an annual seminar of 90 minutes. This tutorial is the summary of that seminar. It is a summary and a transcription of the best pages I have detected.
NASA Technical Reports Server (NTRS)
Young, M.; Koslovsky, M.; Schaefer, Caroline M.; Feiveson, A. H.
2017-01-01
Back by popular demand, the JSC Biostatistics Laboratory and LSAH statisticians are offering an opportunity to discuss your statistical challenges and needs. Take the opportunity to meet the individuals offering expert statistical support to the JSC community. Join us for an informal conversation about any questions you may have encountered with issues of experimental design, analysis, or data visualization. Get answers to common questions about sample size, repeated measures, statistical assumptions, missing data, multiple testing, time-to-event data, and when to trust the results of your analyses.
Commentary: statistics for biomarkers.
Lovell, David P
2012-05-01
This short commentary discusses Biomarkers' requirements for the reporting of statistical analyses in submitted papers. It is expected that submitters will follow the general instructions of the journal, the more detailed guidance given by the International Committee of Medical Journal Editors, the specific guidelines developed by the EQUATOR network, and those of various specialist groups. Biomarkers expects that the study design and subsequent statistical analyses are clearly reported and that the data reported can be made available for independent assessment. The journal recognizes that there is continuing debate about different approaches to statistical science. Biomarkers appreciates that the field continues to develop rapidly and encourages the use of new methodologies.
LED champing: statistically blessed?
Wang, Zhuo
2015-06-10
LED champing (smart mixing of individual LEDs to match the desired color and lumens) and color mixing strategies have been widely used to maintain the color consistency of light engines. Light engines with champed LEDs can easily achieve the color consistency of a couple MacAdam steps with widely distributed LEDs to begin with. From a statistical point of view, the distributions for the color coordinates and the flux after champing are studied. The related statistical parameters are derived, which facilitate process improvements such as Six Sigma and are instrumental to statistical quality control for mass productions.
Breast cancer statistics, 2011.
DeSantis, Carol; Siegel, Rebecca; Bandi, Priti; Jemal, Ahmedin
2011-01-01
In this article, the American Cancer Society provides an overview of female breast cancer statistics in the United States, including trends in incidence, mortality, survival, and screening. Approximately 230,480 new cases of invasive breast cancer and 39,520 breast cancer deaths are expected to occur among US women in 2011. Breast cancer incidence rates were stable among all racial/ethnic groups from 2004 to 2008. Breast cancer death rates have been declining since the early 1990s for all women except American Indians/Alaska Natives, among whom rates have remained stable. Disparities in breast cancer death rates are evident by state, socioeconomic status, and race/ethnicity. While significant declines in mortality rates were observed for 36 states and the District of Columbia over the past 10 years, rates for 14 states remained level. Analyses by county-level poverty rates showed that the decrease in mortality rates began later and was slower among women residing in poor areas. As a result, the highest breast cancer death rates shifted from the affluent areas to the poor areas in the early 1990s. Screening rates continue to be lower in poor women compared with non-poor women, despite much progress in increasing mammography utilization. In 2008, 51.4% of poor women had undergone a screening mammogram in the past 2 years compared with 72.8% of non-poor women. Encouraging patients aged 40 years and older to have annual mammography and a clinical breast examination is the single most important step that clinicians can take to reduce suffering and death from breast cancer. Clinicians should also ensure that patients at high risk of breast cancer are identified and offered appropriate screening and follow-up. Continued progress in the control of breast cancer will require sustained and increased efforts to provide high-quality screening, diagnosis, and treatment to all segments of the population.
Playing at Statistical Mechanics
ERIC Educational Resources Information Center
Clark, Paul M.; And Others
1974-01-01
Discussed are the applications of counting techniques of a sorting game to distributions and concepts in statistical mechanics. Included are the following distributions: Fermi-Dirac, Bose-Einstein, and most probable. (RH)
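The counting behind those distributions is elementary combinatorics, which is exactly what the sorting game exercises. A brief sketch (the function names are illustrative): for g single-particle states and n indistinguishable particles, Fermi-Dirac counting allows at most one particle per state, while Bose-Einstein counting allows unlimited occupancy (stars and bars).

```python
from math import comb

def fermi_dirac_states(g, n):
    """Microstates for n indistinguishable fermions in g single-particle
    states: at most one particle per state."""
    return comb(g, n)

def bose_einstein_states(g, n):
    """Microstates for n indistinguishable bosons in g single-particle
    states: unlimited occupancy (stars and bars)."""
    return comb(n + g - 1, n)

fd = fermi_dirac_states(6, 2)
be = bose_einstein_states(6, 2)
```

For the same g and n, the boson count is always at least the fermion count, since every fermionic arrangement is also an allowed bosonic one.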
Hemophilia Data and Statistics
… diagnosed at a very young age. Based on CDC data, the median age at diagnosis is 36 months …
Cooperative Learning in Statistics.
ERIC Educational Resources Information Center
Keeler, Carolyn M.; And Others
1994-01-01
Formal use of cooperative learning techniques proved effective in improving student performance and retention in a freshman level statistics course. Lectures interspersed with group activities proved effective in increasing conceptual understanding and overall class performance. (11 references) (Author)
NASA Astrophysics Data System (ADS)
Richfield, Jon; bookfeller
2016-07-01
In reply to Ralph Kenna and Pádraig Mac Carron's feature article “Maths meets myths” in which they describe how they are using techniques from statistical physics to characterize the societies depicted in ancient Icelandic sagas.
NASA Astrophysics Data System (ADS)
Grégoire, G.
2016-05-01
This chapter is devoted to two objectives. The first is to answer a request expressed by attendees of the first Astrostatistics School (Annecy, October 2013): to be provided with an elementary vademecum of statistics that would facilitate understanding of the given courses. In this spirit we recall very basic notions, that is, definitions and properties that we think sufficient to benefit from the courses given in the Astrostatistics School. Thus we briefly give definitions and elementary properties of random variables and vectors, distributions, estimation and tests, and maximum likelihood methodology. We intend to present basic ideas in a hopefully comprehensible way. We do not attempt a rigorous presentation and, given the space devoted to this chapter, can cover only a rather limited field of statistics. The second aim is to focus on some statistical tools that are useful in classification: a basic introduction to Bayesian statistics, maximum likelihood methodology, Gaussian vectors, and Gaussian mixture models.
Plague in the United States: plague was first introduced … per year in the United States, 1900-2012. Plague worldwide: plague epidemics have occurred in Africa, Asia, …
Understanding Solar Flare Statistics
NASA Astrophysics Data System (ADS)
Wheatland, M. S.
2005-12-01
A review is presented of work aimed at understanding solar flare statistics, with emphasis on the well known flare power-law size distribution. Although avalanche models are perhaps the favoured model to describe flare statistics, their physical basis is unclear, and they are divorced from developing ideas in large-scale reconnection theory. An alternative model, aimed at reconciling large-scale reconnection models with solar flare statistics, is revisited. The solar flare waiting-time distribution has also attracted recent attention. Observed waiting-time distributions are described, together with what they might tell us about the flare phenomenon. Finally, a practical application of flare statistics to flare prediction is described in detail, including the results of a year of automated (web-based) predictions from the method.
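The power-law size distribution discussed above is usually characterized by its exponent, which has a simple closed-form maximum-likelihood estimator. A sketch with synthetic data (the names and the simulation are assumptions; the estimator itself is the standard one for a continuous power law above a threshold x_min):

```python
import math
import random

def powerlaw_alpha_mle(sizes, x_min):
    """Maximum-likelihood estimate of the exponent alpha of a continuous
    power law p(x) proportional to x**(-alpha) for x >= x_min."""
    logs = [math.log(x / x_min) for x in sizes if x >= x_min]
    return 1.0 + len(logs) / sum(logs)

# Synthetic "flare sizes" via inverse-transform sampling from a power law
# with alpha = 2.0 above x_min = 1.0.
rng = random.Random(7)
alpha_true = 2.0
sizes = [(1 - rng.random()) ** (-1 / (alpha_true - 1)) for _ in range(20000)]
alpha_hat = powerlaw_alpha_mle(sizes, x_min=1.0)
```

This MLE avoids the well-known biases of fitting a straight line to a log-log histogram, which matters when the estimated exponent feeds into flare prediction.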
Titanic: A Statistical Exploration.
ERIC Educational Resources Information Center
Takis, Sandra L.
1999-01-01
Uses the available data about the Titanic's passengers to interest students in exploring categorical data and the chi-square distribution. Describes activities incorporated into a statistics class and gives additional resources for collecting information about the Titanic. (ASK)
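The categorical analysis the activity describes is a chi-square test of independence on a contingency table. A sketch with invented counts (the numbers below are illustrative, not the actual Titanic figures):

```python
def chi_square_independence(table):
    """Chi-square statistic for independence in a two-way contingency
    table (rows = groups, columns = outcomes)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Illustrative (survived, died) counts by passenger class.
table = [
    [200, 120],  # first class
    [120, 160],  # second class
    [180, 530],  # third class
]
stat = chi_square_independence(table)
# With (3-1)*(2-1) = 2 degrees of freedom, the 5% critical value is 5.99.
```

A statistic far above the critical value, as here, is the kind of result that lets students reject independence between class and survival.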
Purposeful Statistical Investigations
ERIC Educational Resources Information Center
Day, Lorraine
2014-01-01
Lorraine Day provides us with a great range of statistical investigations using various resources such as maths300 and TinkerPlots. Each of the investigations links mathematics to students' lives and provides engaging and meaningful contexts for mathematical inquiry.
NASA Astrophysics Data System (ADS)
Testa, Massimo
2015-08-01
Starting with the basic principles of Relativistic Quantum Mechanics, we give a rigorous, but completely elementary proof of the relation between fundamental observables of a statistical system, when measured within two inertial reference frames, related by a Lorentz transformation.
How Statistics "Excel" Online.
ERIC Educational Resources Information Center
Chao, Faith; Davis, James
2000-01-01
Discusses the use of Microsoft Excel software and provides examples of its use in an online statistics course at Golden Gate University in the areas of randomness and probability, sampling distributions, confidence intervals, and regression analysis. (LRW)
Lessons from Inferentialism for Statistics Education
ERIC Educational Resources Information Center
Bakker, Arthur; Derry, Jan
2011-01-01
This theoretical paper relates recent interest in informal statistical inference (ISI) to the semantic theory termed inferentialism, a significant development in contemporary philosophy, which places inference at the heart of human knowing. This theory assists epistemological reflection on challenges in statistics education encountered when…
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates calculates the statistical value that corresponds to given cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from Two Data Points extends and generates a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication, or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA curve-fits data to the linear equation y = f(x) and performs an ANOVA to check its significance.
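The "Normal Distribution Estimates" step (the value whose cumulative probability is p, given a sample mean and standard deviation) can be mimicked in a few lines of modern Python; this is a sketch of what the spreadsheet computes, with invented data values:

```python
from statistics import NormalDist, mean, stdev

data = [12.1, 9.8, 11.4, 10.6, 12.9, 10.2, 11.7, 9.5]  # hypothetical sample
mu, sigma = mean(data), stdev(data)
print(f"mean = {mu:.3f}, sd = {sigma:.3f}")

# "Normal Distribution Estimates": the value whose cumulative probability
# is p under N(mu, sigma), i.e. the inverse CDF (percent-point function)
dist = NormalDist(mu, sigma)
for p in (0.05, 0.50, 0.95):
    print(f"P(X <= {dist.inv_cdf(p):.3f}) = {p}")
```

At p = 0.5 the inverse CDF returns the mean itself, which is a quick sanity check on the computation.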
NASA Astrophysics Data System (ADS)
Cook, Samuel A.; Fukawa-Connelly, Timothy
2016-02-01
Studies have shown that at the end of an introductory statistics course, students struggle with building-block concepts, such as mean and standard deviation, and rely on procedural understandings of the concepts. This study investigates the understandings of introductory statistics held by entering freshmen of a department of mathematics and statistics (including mathematics education), students who are presumably better prepared in mathematics and statistics than the average university student. This case study found that these students enter college with common statistical misunderstandings, gaps in knowledge, and idiosyncratic collections of correct statistical knowledge. Moreover, they also hold a wide range of beliefs about their knowledge, with some of the students who believe they have the strongest knowledge also having significant misconceptions. More attention to these statistical building blocks may be required in a university introductory statistics course.
Predicting Success in Psychological Statistics Courses.
Lester, David
2016-06-01
Many students perform poorly in courses on psychological statistics, and it is useful to be able to predict which students will have difficulties. In a study of 93 undergraduates enrolled in Statistical Methods (18 men, 75 women; M age = 22.0 years, SD = 5.1), performance was significantly associated with sex (female students performed better) and proficiency in algebra in a linear regression analysis. Anxiety about statistics was not associated with course performance, indicating that basic mathematical skills are the best predictor of performance in statistics courses and can be used to stream students into classes by ability.
Statistical Physics of Fracture
Alava, Mikko; Nukala, Phani K; Zapperi, Stefano
2006-05-01
Disorder and long-range interactions are two of the key components that make material failure an interesting playing field for the application of statistical mechanics. The cornerstone in this respect has been lattice models of fracture, in which a network of elastic beams, bonds, or electrical fuses with random failure thresholds is subjected to an increasing external load. These models describe, on a qualitative level, the failure processes of real brittle or quasi-brittle materials. This has been particularly important in addressing the classical engineering problems of material strength: the size dependence of maximum stress and its sample-to-sample statistical fluctuations. At the same time, lattice models pose many new fundamental questions in statistical physics, such as the relation between fracture and phase transitions. Experimental results point to the existence of an intriguing crackling noise in the acoustic emission and of self-affine fractals in the crack surface morphology. Recent advances in computer power have enabled considerable progress in the understanding of such models. Among these partly still controversial issues are the scaling and size effects in material strength and accumulated damage, the statistics of avalanches or bursts of microfailures, and the morphology of the crack surface. Here we present an overview of the results obtained with lattice models for fracture, highlighting the relations with statistical physics theories and more conventional fracture mechanics approaches.
SANABRIA, FEDERICO; KILLEEN, PETER R.
2008-01-01
Despite being under challenge for the past 50 years, null hypothesis significance testing (NHST) remains dominant in the scientific field for want of viable alternatives. NHST, along with its significance level p, is inadequate for most of the uses to which it is put, a flaw that is of particular interest to educational practitioners who too often must use it to sanctify their research. In this article, we review the failure of NHST and propose prep, the probability of replicating an effect, as a more useful statistic for evaluating research and aiding practical decision making. PMID:19122766
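Killeen's prep can be computed from a two-tailed p-value via the normal distribution; the commonly cited approximation is p_rep = Phi(z / sqrt(2)), where z is the standard-normal deviate for the observed p. This implementation is my sketch, not code from the article:

```python
from statistics import NormalDist

_std = NormalDist()

def p_rep(p_two_tailed):
    """Killeen's (2005) approximate probability of replicating an effect:
    p_rep = Phi(z / sqrt(2)), with z the deviate for a two-tailed p."""
    z = _std.inv_cdf(1.0 - p_two_tailed / 2.0)
    return _std.cdf(z / 2.0 ** 0.5)

print(f"{p_rep(0.05):.3f}")  # ~0.917
```

So a result just at p = .05 corresponds to roughly a 92% chance of replicating the direction of the effect under this approximation, which is the reframing the authors advocate.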
SHARE: Statistical hadronization with resonances
NASA Astrophysics Data System (ADS)
Torrieri, G.; Steinke, S.; Broniowski, W.; Florkowski, W.; Letessier, J.; Rafelski, J.
2005-05-01
errors are independent, since the systematic error is not a random variable). Aside from χ, the program also calculates the statistical significance [2], defined as the probability that, given a "true" theory and a statistical (Gaussian) experimental error, the fitted χ assumes values at or above the considered value. In the case that the best fit has a statistical significance well below unity, the model under consideration is very likely inappropriate. In the limit of many degrees of freedom (N), the statistical significance function depends only on χ/N, with 90% statistical significance at χ/N ∼ 1, and falling steeply at χ/N > 1. However, the degrees of freedom in fits involving ratios are generally not sufficient to reach the asymptotic limit. Hence, statistical significance depends strongly on χ and N separately. In particular, if N < 20, a χ/N significantly less than 1 is often required for a fit to have an acceptable statistical significance. The fit routine does not always find the true lowest χ minimum. Specifically, multi-parameter fits with too few degrees of freedom generally exhibit a non-trivial structure in parameter space, with several secondary minima, saddle points, valleys, etc. To help the user perform the minimization effectively, we have added tools to compute the χ contours and profiles. In addition, our program's flexibility allows for many strategies in performing the fit. It is therefore possible, by following the techniques described in Section 3.7, to scan the parameter space and ensure that the minimum found is the true one. Further systematic deviations between the model and experiment can be recognized via the program's output, which includes a particle-by-particle comparison between experiment and theory. Additional comments: In consideration of the wide stream of new data coming out of RHIC, there is on-going activity, with several groups performing analyses of particle yields. It is our hope that SHARE will allow to
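The significance defined here, the probability of a chi-square value at or above the observed one, is the upper-tail probability of the chi-square distribution. For an even number of degrees of freedom it has a closed form that needs no special-function library; the sketch below is mine and is independent of SHARE's own implementation:

```python
import math

def chi2_sf_even(x, n):
    """P(X >= x) for a chi-square variable with an even number n of
    degrees of freedom: exp(-x/2) * sum_{k<n/2} (x/2)**k / k!."""
    assert n % 2 == 0 and n > 0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# For n = 2 the survival function reduces to exp(-x/2)
print(chi2_sf_even(2.0, 2))    # exp(-1) ~ 0.368
print(chi2_sf_even(10.0, 10))  # chi-square equal to its dof: ~0.44
```

Note how the tail probability at the same ratio of statistic to degrees of freedom still depends on N itself, which is the point the abstract makes about fits with few degrees of freedom.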
Statistical learning and selective inference
Taylor, Jonathan; Tibshirani, Robert J.
2015-01-01
We describe the problem of “selective inference.” This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have “cherry-picked”—searched for the strongest associations—means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis. PMID:26100887
Inverse statistics and information content
NASA Astrophysics Data System (ADS)
Ebadi, H.; Bolgorian, Meysam; Jafari, G. R.
2010-12-01
Inverse statistics analysis studies the distribution of investment horizons needed to achieve a predefined level of return. The maximum of this distribution determines the most likely horizon for gaining a specific return. There exists a significant difference between the inverse statistics of financial market data and those of a fractional Brownian motion (fBm), as an uncorrelated time series, which makes inverse statistics a suitable criterion for measuring the information content in financial data. In this paper we perform this analysis for the DJIA and S&P500, as two developed markets, and the Tehran price index (TEPIX), as an emerging market. We also compare these probability distributions with the fBm probability distribution to detect when the behavior of the stocks is the same as that of fBm.
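Inverse statistics can be sketched directly from the definition: for each starting point in a return series, record how long it takes the cumulative log-return to first reach a target level rho. The code below is a minimal illustration on uncorrelated Gaussian returns (a stand-in for the Brownian benchmark; all names and parameter values are mine):

```python
import random

def inverse_statistics(log_returns, rho):
    """Waiting times (in steps) until the cumulative log-return first
    gains at least rho, for every starting point in the series."""
    waits = []
    n = len(log_returns)
    for start in range(n):
        total = 0.0
        for t in range(start, n):
            total += log_returns[t]
            if total >= rho:
                waits.append(t - start + 1)
                break
    return waits

# Uncorrelated Gaussian returns stand in for an uncorrelated benchmark
random.seed(1)
returns = [random.gauss(0.0, 0.01) for _ in range(2000)]
waits = inverse_statistics(returns, rho=0.02)
most_likely = max(set(waits), key=waits.count)  # modal investment horizon
print(f"{len(waits)} horizons recorded, mode = {most_likely} steps")
```

The histogram of these waiting times is the inverse-statistics distribution; comparing its shape for market data against the benchmark is what the paper uses to gauge information content.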
Perception in statistical graphics
NASA Astrophysics Data System (ADS)
VanderPlas, Susan Ruth
There has been quite a bit of research on statistical graphics and visualization, generally focused on new types of graphics, new software to create graphics, interactivity, and usability studies. Our ability to interpret and use statistical graphics hinges on the interface between the graph itself and the brain that perceives and interprets it, and there is substantially less research on the interplay between graph, eye, brain, and mind than is sufficient to understand the nature of these relationships. The goal of the work presented here is to further explore the interplay between a static graph, the translation of that graph from paper to mental representation (the journey from eye to brain), and the mental processes that operate on that graph once it is transferred into memory (mind). Understanding the perception of statistical graphics should allow researchers to create more effective graphs which produce fewer distortions and viewer errors while reducing the cognitive load necessary to understand the information presented in the graph. Taken together, these experiments should lay a foundation for exploring the perception of statistical graphics. There has been considerable research into the accuracy of numerical judgments viewers make from graphs, and these studies are useful, but it is more effective to understand how errors in these judgments occur so that the root cause of the error can be addressed directly. Understanding how visual reasoning relates to the ability to make judgments from graphs allows us to tailor graphics to particular target audiences. In addition, understanding the hierarchy of salient features in statistical graphics allows us to clearly communicate the important message from data or statistical models by constructing graphics which are designed specifically for the perceptual system.
Significance of periodogram peaks
NASA Astrophysics Data System (ADS)
Süveges, Maria; Guy, Leanne; Zucker, Shay
2016-10-01
Three versions of significance measures or False Alarm Probabilities (FAPs) for periodogram peaks are presented and compared for sinusoidal and box-like signals, with specific application on large-scale surveys in mind.
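For reference, the classical textbook FAP for the highest peak of a normalized periodogram can be sketched as follows; this is the standard Scargle-style approximation, not necessarily one of the three versions compared in the paper:

```python
import math

def false_alarm_probability(z, n_freq):
    """Probability that pure noise produces a normalized-periodogram peak
    of power >= z anywhere among n_freq independent frequencies."""
    single = math.exp(-z)               # P(peak >= z) at one frequency
    return 1.0 - (1.0 - single) ** n_freq

print(f"{false_alarm_probability(10.0, 1000):.4f}")  # ~0.044
```

The number of effectively independent frequencies is itself an approximation, which is one reason refined FAP measures such as those in the paper are needed for large surveys.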
Josse, Florent; Lefebvre, Yannick; Todeschini, Patrick; Turato, Silvia; Meister, Eric
2006-07-01
Assessing the structural integrity of a nuclear Reactor Pressure Vessel (RPV) subjected to pressurized-thermal-shock (PTS) transients is extremely important to safety. In addition to conventional deterministic calculations to confirm RPV integrity, Electricite de France (EDF) carries out probabilistic analyses. Probabilistic analyses are interesting because some key variables, albeit conventionally taken at conservative values, can be modeled more accurately through statistical variability. One variable which significantly affects RPV structural integrity assessment is cleavage fracture initiation toughness. The reference fracture toughness method currently in use at EDF is the RCC-M and ASME Code lower-bound K_IC based on the indexing parameter RT_NDT. However, in order to quantify the toughness scatter for probabilistic analyses, the master curve method is being analyzed at present. Furthermore, the master curve method is a direct means of evaluating fracture toughness based on K_JC data. In the framework of the master curve investigation undertaken by EDF, this article deals with the following two statistical items: building a master curve from an extract of a fracture toughness dataset (from the European project 'Unified Reference Fracture Toughness Design curves for RPV Steels') and controlling statistical uncertainty for both mono-temperature and multi-temperature tests. Concerning the first point, master curve temperature dependence is empirical in nature. To determine the 'original' master curve, Wallin postulated that a unified description of fracture toughness temperature dependence for ferritic steels is possible, and used a large number of data corresponding to nuclear-grade pressure vessel steels and welds. Our working hypothesis is that some ferritic steels may behave in slightly different ways. Therefore we focused exclusively on the French reactor vessel base metal of types A508 Class 3 and A533 Grade B Class 1, taking the sampling
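For context, the master curve referred to here is usually written (as in ASTM E1921; this specific form is my addition, not stated in the abstract) as a universal temperature dependence of the median fracture toughness, indexed only by a reference temperature T0:

```latex
K_{Jc,\mathrm{med}}(T) \;=\; 30 + 70\,\exp\!\bigl[\,0.019\,(T - T_{0})\,\bigr]
\qquad \text{(MPa}\sqrt{\text{m}},\ T \text{ in } {}^{\circ}\mathrm{C})
```

Under this form, fitting T0 from a limited K_JC dataset and quantifying its statistical uncertainty is exactly the problem the article addresses.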
Banerjee, Rabin; Majhi, Bibhas Ranjan
2010-06-15
Starting from the definition of entropy used in statistical mechanics we show that it is proportional to the gravity action. For a stationary black hole this entropy is expressed as S=E/2T, where T is the Hawking temperature and E is shown to be the Komar energy. This relation is also compatible with the generalized Smarr formula for mass.
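As a quick consistency check of S = E/2T (my verification, in geometrized units G = c = hbar = k_B = 1): for a Schwarzschild black hole the standard results are E = M for the Komar energy and T = 1/(8πM) for the Hawking temperature, so

```latex
S \;=\; \frac{E}{2T}
  \;=\; \frac{M}{2}\cdot 8\pi M
  \;=\; 4\pi M^{2}
  \;=\; \frac{A}{4},
\qquad A = 16\pi M^{2},
```

which reproduces the Bekenstein-Hawking entropy, as the relation requires.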
Statistical Reasoning over Lunch
ERIC Educational Resources Information Center
Selmer, Sarah J.; Bolyard, Johnna J.; Rye, James A.
2011-01-01
Students in the 21st century are exposed daily to a staggering amount of numerically infused media. In this era of abundant numeric data, students must be able to engage in sound statistical reasoning when making life decisions after exposure to varied information. The context of nutrition can be used to engage upper elementary and middle school…
ERIC Educational Resources Information Center
Akram, Muhammad; Siddiqui, Asim Jamal; Yasmeen, Farah
2004-01-01
In order to learn the concept of statistical techniques one needs to run real experiments that generate reliable data. In practice, the data from some well-defined process or system is very costly and time consuming. It is difficult to run real experiments during the teaching period in the university. To overcome these difficulties, statisticians…
Analogies for Understanding Statistics
ERIC Educational Resources Information Center
Hocquette, Jean-Francois
2004-01-01
This article describes a simple way to explain the limitations of statistics to scientists and students to avoid the publication of misleading conclusions. Biologists examine their results extremely critically and carefully choose the appropriate analytic methods depending on their scientific objectives. However, no such close attention is usually…
Structurally Sound Statistics Instruction
ERIC Educational Resources Information Center
Casey, Stephanie A.; Bostic, Jonathan D.
2016-01-01
The Common Core's Standards for Mathematical Practice (SMP) call for all K-grade 12 students to develop expertise in the processes and proficiencies of doing mathematics. However, the Common Core State Standards for Mathematics (CCSSM) (CCSSI 2010) as a whole addresses students' learning of not only mathematics but also statistics. This situation…
General Aviation Avionics Statistics.
1980-12-01
Report No. FAA-MS-80-7, December 1980. Equipment list (partially recovered from the garbled report documentation page): 2. Altimeter, 3. Compass, 4. Tachometer, 5. Oil temperature, 6. Emergency locator, 8. Fuel gage, 9. Landing gear, 10. Belts, 11. Special equipment for over water.
NACME Statistical Report 1986.
ERIC Educational Resources Information Center
Miranda, Luis A.; Ruiz, Esther
This statistical report summarizes data on enrollment and graduation of minority students in engineering degree programs from 1974 to 1985. First, an introduction identifies major trends and briefly describes the Incentive Grants Program (IGP), the nation's largest privately supported source of scholarship funds available to minority engineering…
ERIC Educational Resources Information Center
Barnes, Bernis, Ed.; And Others
This teacher's guide to probability and statistics contains three major sections. The first section on elementary combinatorial principles includes activities, student problems, and suggested teaching procedures for the multiplication principle, permutations, and combinations. Section two develops an intuitive approach to probability through…
ERIC Educational Resources Information Center
Office of the Assistant Secretary of Defense -- Comptroller (DOD), Washington, DC.
This document contains summaries of basic manpower statistical data for the Department of Defense, with the Army, Navy, Marine Corps, and Air Force totals shown separately and collectively. Included are figures for active duty military personnel, civilian personnel, reserve components, and retired military personnel. Some of the data show…
NASA Astrophysics Data System (ADS)
Williams, R. L.; Gateley, Wilson Y.
1993-05-01
This paper summarizes the statistical quality control methods and procedures that can be employed in mass producing electronic parts (integrated circuits, buffers, capacitors, connectors) to reduce variability and ensure performance to specified radiation, current, voltage, temperature, shock, and vibration levels. Producing such quality parts reduces uncertainties in performance and will aid materially in validating the survivability of components, subsystems, and systems to specified threats.
Statistics for Learning Genetics
ERIC Educational Resources Information Center
Charles, Abigail Sheena
2012-01-01
This study investigated the knowledge and skills that biology students may need to help them understand statistics/mathematics as it applies to genetics. The data are based on analyses of current representative genetics texts, practicing genetics professors' perspectives, and more directly, students' perceptions of, and performance in, doing…
Education Statistics Quarterly, 2003.
ERIC Educational Resources Information Center
Marenus, Barbara; Burns, Shelley; Fowler, William; Greene, Wilma; Knepper, Paula; Kolstad, Andrew; McMillen Seastrom, Marilyn; Scott, Leslie
2003-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released in a 3-month period. Each issue also contains a message from the NCES on a timely…
Quartiles in Elementary Statistics
ERIC Educational Resources Information Center
Langford, Eric
2006-01-01
The calculation of the upper and lower quartile values of a data set in an elementary statistics course is done in at least a dozen different ways, depending on the text or computer/calculator package being used (such as SAS, JMP, MINITAB, "Excel," and the TI-83 Plus). In this paper, we examine the various methods and offer a suggestion for a new…
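Python's own standard library exposes two of the dozen conventions the article surveys, and the two methods already disagree on a simple ten-point data set:

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# "exclusive" (the default): quartile positions at p * (n + 1)
print(quantiles(data, n=4))                      # [2.75, 5.5, 8.25]
# "inclusive" (the convention used by Excel, among others): 1 + p * (n - 1)
print(quantiles(data, n=4, method='inclusive'))  # [3.25, 5.5, 7.75]
```

Only the median agrees; the upper and lower quartiles differ by half a unit, which is exactly the kind of package-dependent discrepancy the article catalogs.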
Statistical Energy Analysis Program
NASA Technical Reports Server (NTRS)
Ferebee, R. C.; Trudell, R. W.; Yano, L. I.; Nygaard, S. I.
1985-01-01
Statistical Energy Analysis (SEA) is a powerful tool for estimating high-frequency vibration spectra of complex structural systems, and it has been incorporated into a computer program. The basic SEA analysis procedure is divided into three steps: idealization, parameter generation, and problem solution. The SEA computer program is written in FORTRAN V for batch execution.
Library Research and Statistics.
ERIC Educational Resources Information Center
Lynch, Mary Jo; St. Lifer, Evan; Halstead, Kent; Fox, Bette-Lee; Miller, Marilyn L.; Shontz, Marilyn L.
2001-01-01
These nine articles discuss research and statistics on libraries and librarianship, including libraries in the United States, Canada, and Mexico; acquisition expenditures in public, academic, special, and government libraries; price indexes; state rankings of public library data; library buildings; expenditures in school library media centers; and…
Statistics for Learning Genetics
NASA Astrophysics Data System (ADS)
Charles, Abigail Sheena
This study investigated the knowledge and skills that biology students may need to help them understand statistics/mathematics as it applies to genetics. The data are based on analyses of current representative genetics texts, practicing genetics professors' perspectives, and more directly, students' perceptions of, and performance in, doing statistically-based genetics problems. This issue is at the emerging edge of modern college-level genetics instruction, and this study attempts to identify key theoretical components for creating a specialized biological statistics curriculum. The goal of this curriculum will be to prepare biology students with the skills for assimilating quantitatively-based genetic processes, increasingly at the forefront of modern genetics. To fulfill this, two college-level classes at two universities were surveyed. One university was located in the northeastern US and the other in the West Indies. The sample comprised 42 students, and a supplementary interview was administered to 9 selected students. Interviews were also administered to professors in the field in order to gain insight into the teaching of statistics in genetics. Key findings indicated that 55% of students had very little to no background in statistics. Although students did perform well on exams, with 60% of the population receiving an A or B grade, 77% of them did not offer good explanations on a probability question associated with the normal distribution provided in the survey. The scope and presentation of the applicable statistics/mathematics in some of the most-used textbooks in genetics teaching, as well as in the genetics syllabi used by instructors, do not help the issue. It was found that the textbooks often either did not give effective explanations for students or completely left out certain topics. The omission of certain statistically/mathematically oriented topics was also seen to be true of the genetics syllabi reviewed for this study. Nonetheless
Statistical properties of Fourier-based time-lag estimates
NASA Astrophysics Data System (ADS)
Epitropakis, A.; Papadakis, I. E.
2016-06-01
observed time series; b) smoothing of the cross-periodogram should be avoided, as this may introduce significant bias to the time-lag estimates, which can be taken into account by assuming a model cross-spectrum (and not just a model time-lag spectrum); c) time-lags should be estimated by dividing observed time series into a number, say m, of shorter data segments and averaging the resulting cross-periodograms; d) if the data segments have a duration ≳ 20 ks, the time-lag bias is ≲15% of its intrinsic value for the model cross-spectra and power-spectra considered in this work. This bias should be estimated in practise (by considering possible intrinsic cross-spectra that may be applicable to the time-lag spectra at hand) to assess the reliability of any time-lag analysis; e) the effects of experimental noise can be minimised by only estimating time-lags in the frequency range where the sample coherence is larger than 1.2/(1 + 0.2m). In this range, the amplitude of noise variations caused by measurement errors is smaller than the amplitude of the signal's intrinsic variations. As long as m ≳ 20, time-lags estimated by averaging over individual data segments have analytical error estimates that are within 95% of the true scatter around their mean, and their distribution is similar, albeit not identical, to a Gaussian.
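Criterion (e) above is a simple closed-form threshold on the sample coherence; a sketch (the function name is mine):

```python
def coherence_floor(m):
    """Minimum sample coherence at which time-lag estimation is advised
    when m cross-periodograms are averaged: 1.2 / (1 + 0.2 m)."""
    return 1.2 / (1.0 + 0.2 * m)

for m in (5, 10, 20, 40):
    print(f"m = {m:3d}: keep frequencies with coherence > {coherence_floor(m):.3f}")
```

The floor drops as more segments are averaged, so heavier segmentation both stabilizes the error estimates and widens the usable frequency range.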
Statistical aspects of solar flares
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
1987-01-01
A survey of the statistical properties of 850 H alpha solar flares during 1975 is presented, and the results found here are compared with those reported elsewhere for different epochs. Distributions of rise time, decay time, and duration are given, as are the mean, mode, median, and 90th-percentile values. Proportions by selected groupings are also determined. For flares in general, mean values for rise time, decay time, and duration are 5.2 ± 0.4 min, …, and 18.1 ± 1.1 min, respectively. Subflares, accounting for nearly 90 percent of the flares, had mean values lower than those found for flares of H alpha importance greater than 1, and the differences are statistically significant. Likewise, flares of bright and normal relative brightness have mean values of decay time and duration that are significantly longer than those computed for faint flares, and mass-motion-related flares are significantly longer than non-mass-motion-related flares. Seventy-three percent of the mass-motion-related flares are categorized as being two-ribbon flares and/or being accompanied by a high-speed dark filament. Slow-rise-time flares (rise time greater than 5 min) have a mean value for duration that is significantly longer than that computed for fast-rise-time flares, and long-lived flares (duration greater than 18 min) have a mean value for rise time that is significantly longer than that computed for short-lived flares, suggesting a positive linear relationship between rise time and duration. Monthly occurrence rates for flares in general and by group are found to be positively and linearly related to monthly sunspot number. Statistical testing reveals the association between sunspot number and number of flares to be significant at the 95 percent level of confidence, and the t statistic for slope is significant at greater than the 99 percent level of confidence. Dependent upon the specific fit, between 58 percent and 94 percent of
NASA Technical Reports Server (NTRS)
Black, D. C.
1986-01-01
The significance of brown dwarfs for resolving some major problems in astronomy is discussed. The importance of brown dwarfs for models of star formation by fragmentation of molecular clouds and for obtaining independent measurements of the ages of stars in binary systems is addressed. The relationship of brown dwarfs to planets is considered.
The Statistical Drake Equation
NASA Astrophysics Data System (ADS)
Maccone, Claudio
2010-12-01
We provide the statistical generalization of the Drake equation. From a simple product of seven positive numbers, the Drake equation is now turned into the product of seven positive random variables. We call this "the Statistical Drake Equation". The mathematical consequences of this transformation are then derived. The proof of our results is based on the Central Limit Theorem (CLT) of Statistics. In loose terms, the CLT states that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable. This is called the Lyapunov Form of the CLT, or the Lindeberg Form of the CLT, depending on the mathematical constraints assumed on the third moments of the various probability distributions. In conclusion, we show that: The new random variable N, yielding the number of communicating civilizations in the Galaxy, follows the LOGNORMAL distribution. Then, as a consequence, the mean value of this lognormal distribution is the ordinary N in the Drake equation. The standard deviation, mode, and all the moments of this lognormal N are also found. The seven factors in the ordinary Drake equation now become seven positive random variables. The probability distribution of each random variable may be ARBITRARY. The CLT in the so-called Lyapunov or Lindeberg forms (that both do not assume the factors to be identically distributed) allows for that. In other words, the CLT "translates" into our statistical Drake equation by allowing an arbitrary probability distribution for each factor. This is both physically realistic and practically very useful, of course. An application of our statistical Drake equation then follows. The (average) DISTANCE between any two neighboring and communicating civilizations in the Galaxy may be shown to be inversely proportional to the cubic root of N. Then, in our approach, this distance becomes a new random variable. We derive the relevant probability density
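The core of the argument can be reproduced numerically: take seven arbitrary positive random variables (the distributional choices below are illustrative, not the paper's), multiply them, and check that log N looks Gaussian, i.e. that N looks lognormal:

```python
import math
import random
from statistics import mean, stdev

random.seed(0)

def draw_factors():
    """Seven positive random factors with deliberately mixed, arbitrary
    distributions (four uniform, three lognormal) -- purely illustrative."""
    return ([random.uniform(0.1, 10.0) for _ in range(4)]
            + [random.lognormvariate(0.0, 1.0) for _ in range(3)])

# log N is a sum of seven independent terms, so the CLT makes it ~Gaussian
log_N = [sum(math.log(f) for f in draw_factors()) for _ in range(50_000)]

mu, sigma = mean(log_N), stdev(log_N)
# For a Gaussian, P(X <= mu + sigma) ~ 0.841; compare the empirical value
below = sum(1 for x in log_N if x <= mu + sigma) / len(log_N)
print(f"empirical P(log N <= mu + sigma) = {below:.3f}")
```

Even with only seven factors, the empirical fraction lands very close to the Gaussian value, which is the practical content of the lognormal claim.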
NASA Astrophysics Data System (ADS)
Maccone, C.
In this paper the statistical generalization of the Fermi paradox is provided. The statistics of habitable planets may be based on a set of ten (and possibly more) astrobiological requirements first pointed out by Stephen H. Dole in his book Habitable Planets for Man (1964). The statistical generalization of the original, and by now too simplistic, Dole equation is provided by replacing a product of ten positive numbers with the product of ten positive random variables. This is denoted the SEH, an acronym standing for "Statistical Equation for Habitables". The proof in this paper is based on the Central Limit Theorem (CLT) of statistics, stating that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable (Lyapunov form of the CLT). It is then shown that: 1. The new random variable NHab, yielding the number of habitables (i.e. habitable planets) in the Galaxy, follows the log-normal distribution. By construction, the mean value of this log-normal distribution is the total number of habitable planets as given by the statistical Dole equation. 2. The ten (or more) astrobiological factors are now positive random variables. The probability distribution of each random variable may be arbitrary. The CLT in the so-called Lyapunov or Lindeberg forms (which do not assume the factors to be identically distributed) allows for that. In other words, the CLT "translates" into the SEH by allowing an arbitrary probability distribution for each factor. This is both astrobiologically realistic and useful for any further investigations. 3. By applying the SEH it is shown that the (average) distance between any two nearby habitable planets in the Galaxy is inversely proportional to the cubic root of NHab. This distance is denoted by the new random variable D. The relevant probability density function is derived, which was named the "Maccone distribution" by Paul Davies in
Statistical considerations for preclinical studies.
Aban, Inmaculada B; George, Brandon
2015-08-01
Research studies must always have proper planning, conduct, analysis and reporting in order to preserve scientific integrity. Preclinical studies, the first stage of the drug development process, are no exception to this rule. The decision to advance to clinical trials in humans relies on the results of these studies. Recent observations show that a significant number of preclinical studies lack rigor in their conduct and reporting. This paper discusses statistical aspects, such as design, sample size determination, and methods of analyses, that will help add rigor and improve the quality of preclinical studies.
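One concrete planning step the authors mention, sample size determination, has a standard normal-approximation form for a two-group comparison. The sketch below is a generic illustration with made-up planning values (effect size, SD, alpha, power), not a recommendation from the paper:

```python
import math
from statistics import NormalDist

# Normal approximation to the two-sample t-test sample size:
# n per group = 2 * ((z_{1-alpha/2} + z_{1-power}) * sigma / delta)^2.
def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    z = NormalDist().inv_cdf
    za, zb = z(1 - alpha / 2), z(power)
    n = 2 * ((za + zb) * sigma / delta) ** 2
    return math.ceil(n)

# Detect a mean difference of 1.0 SD with 80% power at alpha = 0.05:
print(n_per_group(delta=1.0, sigma=1.0))  # 16 per group (approximate)
```

Exact t-based calculations give slightly larger n for small samples; the approximation is the usual starting point at the planning stage.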
Statistics, Uncertainty, and Transmitted Variation
Wendelberger, Joanne Roth
2014-11-05
The field of Statistics provides methods for modeling and understanding data and making decisions in the presence of uncertainty. When examining response functions, variation present in the input variables will be transmitted via the response function to the output variables. This phenomenon can potentially have significant impacts on the uncertainty associated with results from subsequent analysis. This presentation will examine the concept of transmitted variation, its impact on designed experiments, and a method for identifying and estimating sources of transmitted variation in certain settings.
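A common first-order way to quantify transmitted variation is the delta method: Var(f(X)) is approximately f'(mu)^2 * Var(X). The sketch below checks that approximation against simulation for a hypothetical response function f(x) = x^2; all values are invented and not from the presentation:

```python
import random
import statistics

# Variation in the input x is transmitted through the response
# function f to the output. First-order delta method prediction:
# SD(f(X)) ~= |f'(mu)| * SD(X).
random.seed(2)

def f(x):
    return x ** 2  # example response function

mu, sigma = 3.0, 0.1
fprime = 2 * mu                  # derivative of x^2 at mu
delta_sd = abs(fprime) * sigma   # predicted SD of the output

# Monte Carlo check of the transmitted variation:
ys = [f(random.gauss(mu, sigma)) for _ in range(50000)]
mc_sd = statistics.pstdev(ys)
print(round(delta_sd, 3), round(mc_sd, 3))
```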
Statistical Inference: The Big Picture.
Kass, Robert E
2011-02-01
Statistics has moved beyond the frequentist-Bayesian controversies of the past. Where does this leave our ability to interpret results? I suggest that a philosophy compatible with statistical practice, labelled here statistical pragmatism, serves as a foundation for inference. Statistical pragmatism is inclusive and emphasizes the assumptions that connect statistical models with observed data. I argue that introductory courses often mis-characterize the process of statistical inference and I propose an alternative "big picture" depiction.
Composite Defect Significance.
1982-07-13
COMPOSITE DEFECT SIGNIFICANCE (U). Materials Sciences Corp., Spring House, PA. S. N. Chatterjee et al., 13 Jul 82. Report Nos. MSC/TFR/1288/il87; NADC-80848... (The remainder of the record is cover-page matter: a directorate listing and the standard disclaimer that references to commercial products do not constitute a Government endorsement or convey any license or right to use.)
NASA Astrophysics Data System (ADS)
Dunbar, P. K.; Furtney, M.; McLean, S. J.; Sweeney, A. D.
2014-12-01
Tsunamis have inflicted death and destruction on the coastlines of the world throughout history. The occurrence of tsunamis and the resulting effects have been collected and studied as far back as the second millennium B.C. The knowledge gained from cataloging and examining these events has led to significant changes in our understanding of tsunamis, tsunami sources, and methods to mitigate the effects of tsunamis. The most significant, not surprisingly, are often the most devastating, such as the 2011 Tohoku, Japan earthquake and tsunami. The goal of this poster is to give a brief overview of the occurrence of tsunamis and then focus specifically on several significant tsunamis. There are various criteria to determine the most significant tsunamis: the number of deaths, the amount of damage, the maximum runup height, a major impact on tsunami science or policy, and so on. As a result, descriptions will include some of the most costly (2011 Tohoku, Japan), the most deadly (2004 Sumatra, 1883 Krakatau), and the highest runup ever observed (1958 Lituya Bay, Alaska). The discovery of the Cascadia subduction zone as the source of the 1700 Japanese "Orphan" tsunami and a future tsunami threat to the U.S. northwest coast contributed to the decision to form the U.S. National Tsunami Hazard Mitigation Program. The great Lisbon earthquake of 1755 marked the beginning of the modern era of seismology. Knowledge gained from the 1964 Alaska earthquake and tsunami helped confirm the theory of plate tectonics. The 1946 Alaska, 1952 Kuril Islands, 1960 Chile, 1964 Alaska, and the 2004 Banda Aceh tsunamis all resulted in warning centers or systems being established. The data descriptions on this poster were extracted from NOAA's National Geophysical Data Center (NGDC) global historical tsunami database. Additional information about these tsunamis, as well as water level data, can be found by accessing the NGDC website www.ngdc.noaa.gov/hazard/
A mixed-effects Statistical Model for Comparative LC-MS Proteomics Studies
Daly, Don S.; Anderson, Kevin K.; Panisko, Ellen A.; Purvine, Samuel O.; Fang, Ruihua; Monroe, Matthew E.; Baker, Scott E.
2008-03-01
Comparing a protein’s concentrations across two or more treatments is the focus of many proteomics studies. A frequent source of measurements for these comparisons is a mass spectrometry (MS) analysis of a protein’s peptide ions separated by liquid chromatography (LC) following its enzymatic digestion. Alas, LC-MS identification and quantification of equimolar peptides can vary significantly due to their unequal digestion, separation and ionization. This unequal measurability of peptides, the largest source of LC-MS nuisance variation, stymies confident comparison of a protein’s concentration across treatments. Our objective is to introduce a mixed-effects statistical model for comparative LC-MS proteomics studies. We describe LC-MS peptide abundance with a linear model featuring pivotal terms that account for unequal peptide LC-MS measurability. We advance fitting this model to an often incomplete LC-MS dataset with REstricted Maximum Likelihood (REML) estimation, producing estimates of model goodness-of-fit, treatment effects, standard errors, confidence intervals, and protein relative concentrations. We illustrate the model with an experiment featuring a known dilution series of a filamentous ascomycete fungus Trichoderma reesei protein mixture. For the 781 of 1546 T. reesei proteins with sufficient data coverage, the fitted mixed-effects models capably described the LC-MS measurements. The LC-MS measurability terms effectively accounted for this major source of uncertainty. Ninety percent of the relative concentration estimates were within 1/2 fold of the true relative concentrations. Akin to the common ratio method, this model also produced biased estimates, albeit less biased. Bias decreased significantly, both absolutely and relative to the ratio method, as the number of observed peptides per protein increased. Mixed-effects statistical modeling offers a flexible, well-established methodology for comparative proteomics studies integrating common
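The core idea, that per-peptide measurability is a nuisance term which a paired, model-based comparison can cancel, can be caricatured in a few lines. This is a deliberately simplified stand-in with invented numbers, not the authors' REML-fitted mixed-effects model:

```python
import random
import statistics

# Toy model: observed log-abundance = treatment effect
#                                   + peptide measurability offset
#                                   + noise.
# Comparing within each peptide across treatments cancels the
# (large) measurability offsets, exposing the protein effect.
random.seed(3)

true_log_fold_change = 1.0                                  # protein effect
peptide_offset = [random.gauss(0, 2.0) for _ in range(8)]   # big nuisance

def observe(pep, treated):
    effect = true_log_fold_change if treated else 0.0
    return effect + peptide_offset[pep] + random.gauss(0, 0.1)

# Paired within-peptide differences cancel the measurability offsets:
diffs = [observe(p, True) - observe(p, False) for p in range(8)]
estimate = statistics.fmean(diffs)
print(round(estimate, 2))
```

The real model handles incomplete peptide coverage and provides standard errors, which this paired-difference caricature does not.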
Statistical evaluation of forecasts.
Mader, Malenka; Mader, Wolfgang; Gluckman, Bruce J; Timmer, Jens; Schelter, Björn
2014-08-01
Reliable forecasts of extreme but rare events, such as earthquakes, financial crashes, and epileptic seizures, would render interventions and precautions possible. Therefore, forecasting methods have been developed which intend to raise an alarm if an extreme event is about to occur. In order to statistically validate the performance of a prediction system, it must be compared to the performance of a random predictor, which raises alarms independent of the events. Such a random predictor can be obtained by bootstrapping or analytically. We propose an analytic statistical framework which, in contrast to conventional methods, allows for validating independently the sensitivity and specificity of a forecasting method. Moreover, our method accounts for the periods during which an event has to remain absent or occur after a respective forecast.
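A bootstrap flavour of the comparison against a random predictor can be sketched as follows. The data are synthetic, alarm/event alignment is simplified to the same time step, and, unlike the authors' analytic framework, this toy does not validate sensitivity and specificity separately:

```python
import random

# Compare an (intentionally informative) forecaster against a
# chance-level predictor obtained by shuffling the alarm sequence.
random.seed(4)

n = 1000
events = [1 if random.random() < 0.05 else 0 for _ in range(n)]
# Synthetic forecaster: alarms tend to accompany events.
alarms = [1 if (e and random.random() < 0.8) or random.random() < 0.05
          else 0 for e in events]

def sensitivity(alarms, events):
    hits = sum(a and e for a, e in zip(alarms, events))
    return hits / sum(events)

observed = sensitivity(alarms, events)

# Null distribution: alarms raised independently of the events.
null = []
for _ in range(500):
    shuffled = alarms[:]
    random.shuffle(shuffled)
    null.append(sensitivity(shuffled, events))

p_value = sum(s >= observed for s in null) / len(null)
print(round(observed, 2), round(p_value, 3))
```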
Thacker, Michael A.; Moseley, G. Lorimer
2017-01-01
Perception is seen as a process that utilises partial and noisy information to construct a coherent understanding of the world. Here we argue that the experience of pain is no different; it is based on incomplete, multimodal information, which is used to estimate potential bodily threat. We outline a Bayesian inference model, incorporating the key components of cue combination, causal inference, and temporal integration, which highlights the statistical problems in everyday perception. It is from this platform that we are able to review the pain literature, providing evidence from experimental, acute, and persistent phenomena to demonstrate the advantages of adopting a statistical account in pain. Our probabilistic conceptualisation suggests a principles-based view of pain, explaining a broad range of experimental and clinical findings and making testable predictions. PMID:28081134
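One component named above, cue combination, has a standard closed form under Gaussian assumptions: each cue is weighted by its precision (inverse variance). The numbers below are invented for illustration:

```python
# Precision-weighted combination of two noisy cues about the same
# quantity (the Bayes-optimal estimate under Gaussian assumptions).
def combine(mu1, var1, mu2, var2):
    w1, w2 = 1 / var1, 1 / var2
    mu = (w1 * mu1 + w2 * mu2) / (w1 + w2)
    var = 1 / (w1 + w2)
    return mu, var

# A reliable cue (variance 1) dominates an unreliable one (variance 4),
# and the combined estimate is more precise than either cue alone:
mu, var = combine(10.0, 1.0, 20.0, 4.0)
print(mu, var)  # 12.0 0.8
```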
1979 DOE statistical symposium
Gardiner, D.A.; Truett, T.
1980-09-01
The 1979 DOE Statistical Symposium was the fifth in the series of annual symposia designed to bring together statisticians and other interested parties who are actively engaged in helping to solve the nation's energy problems. The program included presentations of technical papers centered around exploration and disposal of nuclear fuel, general energy-related topics, and health-related issues, and workshops on model evaluation, risk analysis, analysis of large data sets, and resource estimation.
Relativistic statistical arbitrage
NASA Astrophysics Data System (ADS)
Wissner-Gross, A. D.; Freer, C. E.
2010-11-01
Recent advances in high-frequency financial trading have made light propagation delays between geographically separated exchanges relevant. Here we show that there exist optimal locations from which to coordinate the statistical arbitrage of pairs of spacelike separated securities, and calculate a representative map of such locations on Earth. Furthermore, trading local securities along chains of such intermediate locations results in a novel econophysical effect, in which the relativistic propagation of tradable information is effectively slowed or stopped by arbitrage.
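For a rough sense of the scales involved, here is the light propagation delay along the great circle between two exchanges. The coordinates and the great-circle/vacuum-light-speed assumptions are illustrative simplifications, not the paper's full optimization over intermediate locations:

```python
import math

R_EARTH_KM = 6371.0
C_KM_PER_MS = 299.792  # speed of light, km per millisecond

def great_circle_km(lat1, lon1, lat2, lon2):
    # Spherical law of cosines for the central angle between two points.
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    ang = math.acos(math.sin(p1) * math.sin(p2)
                    + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    return R_EARTH_KM * ang

# New York <-> London (rough coordinates): the one-way signal delay
# sets the time scale on which intermediate trading nodes matter.
d = great_circle_km(40.7, -74.0, 51.5, -0.1)
one_way_ms = d / C_KM_PER_MS
print(round(d), round(one_way_ms, 1))
```

Real links (fibre, with refractive index above 1, and non-geodesic routes) are slower still, which only strengthens the relevance of propagation delays.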
Statistical Methods in Cosmology
NASA Astrophysics Data System (ADS)
Verde, L.
2010-03-01
The advent of large data-sets in cosmology has meant that in the past 10 or 20 years our knowledge and understanding of the Universe has changed not only quantitatively but also, and most importantly, qualitatively. Cosmologists rely on data in which a host of useful information is enclosed, but encoded in a non-trivial way. The challenges in extracting this information must be overcome to make the most of a large experimental effort. Even after having converged to a standard cosmological model (the LCDM model) we should keep in mind that this model is described by 10 or more physical parameters, and if we want to study deviations from it, the number of parameters is even larger. Dealing with such a high-dimensional parameter space and finding parameter constraints is a challenge in itself. Cosmologists want to be able to compare and combine different data sets, both for testing for possible disagreements (which could indicate new physics) and for improving parameter determinations. Finally, cosmologists in many cases want to find out, before actually doing the experiment, how much one would be able to learn from it. For all these reasons, sophisticated statistical techniques are being employed in cosmology, and it has become crucial to know some statistical background to understand recent literature in the field. I will introduce some statistical tools that any cosmologist should know about in order to be able to understand recently published results from the analysis of cosmological data sets. I will not present a complete and rigorous introduction to statistics, as there are several good books, which are reported in the references. The reader should refer to those.
Statistical Challenges of Astronomy
NASA Astrophysics Data System (ADS)
Feigelson, Eric D.; Babu, G. Jogesh
Digital sky surveys, data from orbiting telescopes, and advances in computation have increased the quantity and quality of astronomical data by several orders of magnitude in recent years. Making sense of this wealth of data requires sophisticated statistical and data analytic techniques. Fortunately, statistical methodologies have similarly made great strides in recent years. Powerful synergies thus emerge when astronomers and statisticians join in examining astrostatistical problems and approaches. The volume focuses on several themes: · The increasing power of Bayesian approaches to modeling astronomical data · The growth of enormous databases, leading to an emerging federated Virtual Observatory, and their impact on modern astronomical research · Statistical modeling of critical datasets, such as galaxy clustering and fluctuations in the microwave background radiation, leading to a new era of precision cosmology · Methodologies for uncovering clusters and patterns in multivariate data · The characterization of multiscale patterns in imaging and time series data As in earlier volumes in this series, research contributions discussing topics in one field are joined with commentary from scholars in the other. Short contributed papers covering dozens of astrostatistical topics are also included.
Statistics in fusion experiments
NASA Astrophysics Data System (ADS)
McNeill, D. H.
1997-11-01
Since the reasons for the variability in data from plasma experiments are often unknown or uncontrollable, statistical methods must be applied. Reliable interpretation and public accountability require full data sets. Two examples of data misrepresentation at PPPL are analyzed: (1) Te > 100 eV on the S-1 spheromak (M. Yamada, Nucl. Fusion 25, 1327 (1985); reports to DoE; etc.). The reported high values (statistical artifacts of Thomson scattering measurements) were selected from a mass of data with an average of 40 eV or less. "Correlated" spectroscopic data were meaningless. (2) Extrapolation to Q >= 0.5 for DT in TFTR (D. Meade et al., IAEA Baltimore (1990), V. 1, p. 9; H. P. Furth, Statements to U.S. Congress (1989)). The DD yield used there was the highest through 1990 (>= 50% above average) and the DT to DD power ratio used was about twice any published value. Average DD yields and published yield ratios scale to Q < 0.15 for DT, in accord with the observed performance over the last 3 1/2 years. Press reports of outlier data from TFTR have obscured the fact that the DT behavior follows from trivial scaling of the DD data. Good practice in future fusion research would have confidence intervals and other descriptive statistics accompanying reported numerical values (cf. JAMA).
Statistical Inference at Work: Statistical Process Control as an Example
ERIC Educational Resources Information Center
Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia
2008-01-01
To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
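The workplace inference the paper discusses, SPC, is commonly operationalized as a Shewhart control chart. Below is a minimal individuals-chart sketch with made-up measurements; real charts usually estimate sigma from moving ranges rather than the plain sample SD used here:

```python
import statistics

# Shewhart individuals chart: points beyond mean +/- 3 sigma signal a
# process that is out of statistical control. Baseline data invented.
baseline = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 10.0, 9.7, 10.3, 10.0]
mu = statistics.fmean(baseline)
sigma = statistics.stdev(baseline)
ucl, lcl = mu + 3 * sigma, mu - 3 * sigma   # control limits

# Monitor new observations against the limits:
new_points = [10.1, 9.9, 11.5, 10.0]
signals = [x for x in new_points if not (lcl <= x <= ucl)]
print(signals)  # [11.5]
```

Unlike a one-shot hypothesis test, the chart embodies the ongoing, decision-oriented inference the paper contrasts with classroom hypothesis testing.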
Machtay; Glatstein
1998-01-01
have shown overall survivals superior to age-matched controls). It is fallacious and illogical to compare nonrandomized series of observation to those of aggressive therapy. In addition to the above problem, the use of DSS introduces another potential issue, which we will call the bias of cause-of-death interpretation. All statistical endpoints (e.g., response rates, local-regional control, freedom from brain metastases), except OS, are known to depend heavily on the methods used to define the endpoint and are often subject to significant interobserver variability. There is no reason to believe that this problem does not occasionally occur with respect to defining a death as due to the index cancer or to intercurrent disease, even though this issue has been poorly studied. In many oncologic situations (for example, metastatic lung cancer) this form of bias does not exist. In some situations, such as head and neck cancer, this could be an intermediate problem (Was that lethal chest tumor a second primary or a metastasis? Would the fatal aspiration pneumonia have occurred if he still had a tongue? And what about Mr. B., described above?). In some situations, particularly relatively "good prognosis" neoplasms, this could be a substantial problem, particularly if the adjudication of whether or not a death is cancer-related is performed solely by researchers who have an "interest" in demonstrating a good DSS. What we are most concerned about with this form of bias relates to recent series on observation, such as in early prostate cancer. It is interesting to note that although only 10% of the "observed" patients die from prostate cancer, many develop distant metastases by 10 years (approximately 40% among patients with intermediate-grade tumors). Thus, it is implied that prostate cancer metastases are usually not of themselves lethal, a misconception to anyone experienced in taking care of prostate cancer patients. This is inconsistent with U.S. studies of
Correcting a Significance Test for Clustering
ERIC Educational Resources Information Center
Hedges, Larry V.
2007-01-01
A common mistake in analysis of cluster randomized trials is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. This typically leads to an overstatement of the precision of results and anticonservative conclusions about precision and statistical significance of treatment effects. This…
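The standard first-order version of such a correction uses the design effect, DEFF = 1 + (m - 1) * rho, for clusters of size m and intraclass correlation rho; Hedges' paper additionally adjusts the degrees of freedom, which this sketch omits. All values below are invented:

```python
import math

# Ignoring clustering overstates precision by the design effect.
# Dividing the naive t statistic by sqrt(DEFF) is the first-order fix.
def design_effect(m, rho):
    return 1 + (m - 1) * rho

m, rho = 25, 0.10   # 25 pupils per classroom, ICC of 0.10
naive_t = 3.0       # t statistic from the (incorrect) i.i.d. analysis

deff = design_effect(m, rho)
corrected_t = naive_t / math.sqrt(deff)
print(round(deff, 2), round(corrected_t, 2))  # 3.4 1.63
```

A "significant" naive t of 3.0 shrinks to about 1.63 here, illustrating how anticonservative the unadjusted analysis can be.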
Statistical considerations in design of spacelab experiments
NASA Technical Reports Server (NTRS)
Robinson, J.
1978-01-01
After making an analysis of experimental error sources, statistical models were developed for the design and analysis of potential Space Shuttle experiments. Guidelines for statistical significance and/or confidence limits of expected results were also included. The models were then tested out on the following proposed Space Shuttle biomedical experiments: (1) bone density by computer tomography; (2) basal metabolism; and (3) total body water. Analysis of those results and therefore of the models proved inconclusive due to the lack of previous research data and statistical values. However, the models were seen as possible guides to making some predictions and decisions.
Truth, Damn Truth, and Statistics
ERIC Educational Resources Information Center
Velleman, Paul F.
2008-01-01
Statisticians and Statistics teachers often have to push back against the popular impression that Statistics teaches how to lie with data. Those who believe incorrectly that Statistics is solely a branch of Mathematics (and thus algorithmic), often see the use of judgment in Statistics as evidence that we do indeed manipulate our results. In the…
Experimental Mathematics and Computational Statistics
Bailey, David H.; Borwein, Jonathan M.
2009-04-30
The field of statistics has long been noted for techniques to detect patterns and regularities in numerical data. In this article we explore connections between statistics and the emerging field of 'experimental mathematics'. These include both applications of experimental mathematics in statistics and statistical methods applied to computational mathematics.
Using scientifically and statistically sufficient statistics in comparing image segmentations.
Chi, Yueh-Yun; Muller, Keith E
2010-01-01
Automatic computer segmentation in three dimensions creates opportunity to reduce the cost of three-dimensional treatment planning of radiotherapy for cancer treatment. Comparisons between human and computer accuracy in segmenting kidneys in CT scans generate distance values far larger in number than the number of CT scans. Such high dimension, low sample size (HDLSS) data present a grand challenge to statisticians: how do we find good estimates and make credible inference? We recommend discovering and using scientifically and statistically sufficient statistics as an additional strategy for overcoming the curse of dimensionality. First, we reduced the three-dimensional array of distances for each image comparison to a histogram to be modeled individually. Second, we used non-parametric kernel density estimation to explore distributional patterns and assess multi-modality. Third, a systematic exploratory search for parametric distributions and truncated variations led to choosing a Gaussian form as approximating the distribution of a cube root transformation of distance. Fourth, representing each histogram by an individually estimated distribution eliminated the HDLSS problem by reducing on average 26,000 distances per histogram to just 2 parameter estimates. In the fifth and final step we used classical statistical methods to demonstrate that the two human observers disagreed significantly less with each other than with the computer segmentation. Nevertheless, the size of all disagreements was clinically unimportant relative to the size of a kidney. The hierarchical modeling approach to object-oriented data created response variables deemed sufficient by both the scientists and statisticians. We believe the same strategy provides a useful addition to the imaging toolkit and will succeed with many other high throughput technologies in genetics, metabolomics and chemical analysis.
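The paper's dimension-reduction step, transforming skewed distances toward normality and keeping only two Gaussian parameters per histogram, can be sketched with synthetic data. The gamma "distances" below are an invented stand-in for the real segmentation distances:

```python
import random
import statistics

# Reduce ~26,000 skewed distances to 2 Gaussian parameters by first
# applying a cube-root transformation toward normality.
random.seed(5)
distances = [random.gammavariate(2.0, 1.5) for _ in range(26000)]

def skewness(xs):
    m, s = statistics.fmean(xs), statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

transformed = [d ** (1 / 3) for d in distances]
# The whole histogram is now summarized by just two parameters:
params = (statistics.fmean(transformed), statistics.stdev(transformed))

print(round(skewness(distances), 2), round(skewness(transformed), 2))
```

The cube root strongly symmetrizes gamma-like positive data (the same idea as the Wilson-Hilferty approximation), which is why a Gaussian fit of the transformed values becomes defensible.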
Who Needs Statistics? | Poster
You may know the feeling. You have collected a lot of new data on an important experiment. Now you are faced with multiple groups of data, a sea of numbers, and a deadline for submitting your paper to a peer-reviewed journal. And you are not sure which data are relevant, or even the best way to present them. The statisticians at Data Management Services (DMS) know how to help. This small group of experts provides a wide array of statistical and mathematical consulting services to the scientific community at NCI at Frederick and NCI-Bethesda.
International petroleum statistics report
1995-10-01
The International Petroleum Statistics Report is a monthly publication that provides current international oil data. This report presents data on international oil production, demand, imports, exports and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). Section 2 presents an oil supply/demand balance for the world, in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries.
NASA Technical Reports Server (NTRS)
1994-01-01
Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
NASA Technical Reports Server (NTRS)
1995-01-01
NASA Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
NASA Technical Reports Server (NTRS)
1996-01-01
This booklet of pocket statistics includes the 1996 NASA Major Launch Record, NASA Procurement, Financial, and Workforce data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
Fungi producing significant mycotoxins.
2012-01-01
Mycotoxins are secondary metabolites of microfungi that are known to cause sickness or death in humans or animals. Although many such toxic metabolites are known, it is generally agreed that only a few are significant in causing disease: aflatoxins, fumonisins, ochratoxin A, deoxynivalenol, zearalenone, and ergot alkaloids. These toxins are produced by just a few species from the common genera Aspergillus, Penicillium, Fusarium, and Claviceps. All Aspergillus and Penicillium species either are commensals, growing in crops without obvious signs of pathogenicity, or invade crops after harvest and produce toxins during drying and storage. In contrast, the important Fusarium and Claviceps species infect crops before harvest. The most important Aspergillus species, occurring in warmer climates, are A. flavus and A. parasiticus, which produce aflatoxins in maize, groundnuts, tree nuts, and, less frequently, other commodities. The main ochratoxin A producers, A. ochraceus and A. carbonarius, commonly occur in grapes, dried vine fruits, wine, and coffee. Penicillium verrucosum also produces ochratoxin A but occurs only in cool temperate climates, where it infects small grains. F. verticillioides is ubiquitous in maize, with an endophytic nature, and produces fumonisins, which are generally more prevalent when crops are under drought stress or suffer excessive insect damage. It has recently been shown that Aspergillus niger also produces fumonisins, and several commodities may be affected. F. graminearum, which is the major producer of deoxynivalenol and zearalenone, is pathogenic on maize, wheat, and barley and produces these toxins whenever it infects these grains before harvest. Also included is a short section on Claviceps purpurea, which produces sclerotia among the seeds in grasses, including wheat, barley, and triticale. The main thrust of the chapter contains information on the identification of these fungi and their morphological characteristics, as well as factors
Fragile entanglement statistics
NASA Astrophysics Data System (ADS)
Brody, Dorje C.; Hughston, Lane P.; Meier, David M.
2015-10-01
If X and Y are independent, Y and Z are independent, and so are X and Z, one might be tempted to conclude that X, Y, and Z are independent. But it has long been known in classical probability theory that, intuitive as it may seem, this is not true in general. In quantum mechanics one can ask whether analogous statistics can emerge for configurations of particles in certain types of entangled states. The explicit construction of such states, along with the specification of suitable sets of observables that have the purported statistical properties, is not entirely straightforward. We show that an example of such a configuration arises in the case of an N-particle GHZ state, and we are able to identify a family of observables with the property that the associated measurement outcomes are independent for any choice of 2, 3, ..., N-1 of the particles, even though the measurement outcomes for all N particles are not independent. Although such states are highly entangled, the entanglement turns out to be ‘fragile’, i.e. the associated density matrix has the property that if one traces out the freedom associated with even a single particle, the resulting reduced density matrix is separable.
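The classical caveat in the abstract's opening sentences has a textbook witness: two fair bits X, Y and Z = X XOR Y are pairwise independent but not mutually independent. The enumeration below verifies this classical analogue (it is not the paper's quantum construction):

```python
from itertools import product

# Four equally likely outcomes of (X, Y, Z) with Z = X XOR Y.
outcomes = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]
p = 1 / len(outcomes)

def prob(pred):
    return sum(p for o in outcomes if pred(o))

# Every pair of variables is independent ...
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for a, b in product([0, 1], repeat=2):
        joint = prob(lambda o: o[i] == a and o[j] == b)
        assert joint == prob(lambda o: o[i] == a) * prob(lambda o: o[j] == b)

# ... but the triple is not: P(X=1, Y=1, Z=1) = 0, not 1/8.
print(prob(lambda o: o == (1, 1, 1)))  # 0
```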
International petroleum statistics report
1997-05-01
The International Petroleum Statistics Report is a monthly publication that provides current international oil data. This report is published for the use of Members of Congress, Federal agencies, State agencies, industry, and the general public. Publication of this report is in keeping with responsibilities given the Energy Information Administration in Public Law 95-91. The International Petroleum Statistics Report presents data on international oil production, demand, imports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1995; OECD stocks from 1973 through 1995; and OECD trade from 1985 through 1995.
Elements of Statistical Mechanics
NASA Astrophysics Data System (ADS)
Sachs, Ivo; Sen, Siddhartha; Sexton, James
2006-05-01
This textbook provides a concise introduction to the key concepts and tools of modern statistical mechanics. It also covers advanced topics such as non-relativistic quantum field theory and numerical methods. After introducing classical analytical techniques, such as cluster expansion and Landau theory, the authors present important numerical methods with applications to magnetic systems, Lennard-Jones fluids and biophysics. Quantum statistical mechanics is discussed in detail and applied to Bose-Einstein condensation and topics in astrophysics and cosmology. In order to describe emergent phenomena in interacting quantum systems, canonical non-relativistic quantum field theory is introduced and then reformulated in terms of Feynman integrals. Combining the authors' many years' experience of teaching courses in this area, this textbook is ideal for advanced undergraduate and graduate students in physics, chemistry and mathematics. Analytical and numerical techniques in one text, including sample codes and solved problems on the web at www.cambridge.org/0521841984. Covers a wide range of applications including magnetic systems, turbulence, astrophysics, and biology. Contains a concise introduction to Markov processes and molecular dynamics.
Statistical clumped isotope signatures
Röckmann, T.; Popa, M. E.; Krol, M. C.; Hofmann, M. E. G.
2016-01-01
High precision measurements of molecules containing more than one heavy isotope may provide novel constraints on element cycles in nature. These so-called clumped isotope signatures are reported relative to the random (stochastic) distribution of heavy isotopes over all available isotopocules of a molecule, which is the conventional reference. When multiple indistinguishable atoms of the same element are present in a molecule, this reference is calculated from the bulk (≈average) isotopic composition of the involved atoms. We show here that this referencing convention leads to apparent negative clumped isotope anomalies (anti-clumping) when the indistinguishable atoms originate from isotopically different populations. Such statistical clumped isotope anomalies must occur in any system where two or more indistinguishable atoms of the same element, but with different isotopic composition, combine in a molecule. The size of the anti-clumping signal is closely related to the difference of the initial isotope ratios of the indistinguishable atoms that have combined. Therefore, a measured statistical clumped isotope anomaly, relative to an expected (e.g. thermodynamical) clumped isotope composition, may allow assessment of the heterogeneity of the isotopic pools of atoms that are the substrate for formation of molecules. PMID:27535168
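The arithmetic behind this referencing effect fits in a few lines. The sketch below is illustrative only (the pool compositions and the per-mil scaling are my choices, not values from the paper): combining two atom pools with different heavy-isotope fractions in a symmetric molecule X2 produces an apparent negative anomaly relative to the stochastic reference computed from the bulk composition.

```python
def clumping_anomaly(pA, pB):
    """Apparent clumped anomaly (per mil) of the doubly substituted species
    of a molecule X2, relative to the stochastic reference computed from
    the bulk composition. pA, pB: heavy-isotope atom fractions of the two
    source pools supplying the two indistinguishable atoms."""
    actual = pA * pB                   # one atom drawn from each pool
    bulk = 0.5 * (pA + pB)             # bulk (average) composition
    stochastic = bulk * bulk           # conventional stochastic reference
    return (actual / stochastic - 1.0) * 1000.0

print(clumping_anomaly(0.01, 0.01))        # 0.0: identical pools, no anomaly
print(clumping_anomaly(0.008, 0.012) < 0)  # True: anti-clumping from mixing
```

The anomaly vanishes when the two pools are isotopically identical and grows with the difference of their isotope ratios, matching the statement in the abstract.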
ERIC Educational Resources Information Center
Perepiczka, Michelle; Chandler, Nichelle; Becerra, Michael
2011-01-01
Statistics plays an integral role in graduate programs. However, numerous intra- and interpersonal factors may hinder successful completion of needed coursework in this area. The authors examined the extent of the relationship between self-efficacy to learn statistics and statistics anxiety, attitude towards statistics, and social support of 166…
Nonlinear Statistical Modeling of Speech
NASA Astrophysics Data System (ADS)
Srinivasan, S.; Ma, T.; May, D.; Lazarou, G.; Picone, J.
2009-12-01
Contemporary approaches to speech and speaker recognition decompose the problem into four components: feature extraction, acoustic modeling, language modeling and search. Statistical signal processing is an integral part of each of these components, and Bayes Rule is used to merge these components into a single optimal choice. Acoustic models typically use hidden Markov models based on Gaussian mixture models for state output probabilities. This popular approach suffers from an inherent assumption of linearity in speech signal dynamics. Language models often employ a variety of maximum entropy techniques, but can employ many of the same statistical techniques used for acoustic models. In this paper, we focus on introducing nonlinear statistical models to the feature extraction and acoustic modeling problems as a first step towards speech and speaker recognition systems based on notions of chaos and strange attractors. Our goal in this work is to improve the generalization and robustness properties of a speech recognition system. Three nonlinear invariants are proposed for feature extraction: Lyapunov exponents, correlation fractal dimension, and correlation entropy. We demonstrate an 11% relative improvement on speech recorded under noise-free conditions, but show that a comparable degradation occurs for mismatched training conditions on noisy speech. We conjecture that the degradation is due to difficulties in estimating invariants reliably from noisy data. To circumvent these problems, we introduce two dynamic models to the acoustic modeling problem: (1) a linear dynamic model (LDM) that uses a state-space formulation to explicitly model the evolution of hidden states using an autoregressive process, and (2) a data-dependent mixture of autoregressive (MixAR) models. Results show that LDM and MixAR models can achieve performance comparable to HMM systems while using significantly fewer parameters. Currently we are developing Bayesian parameter estimation and
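As a concrete illustration of one of the proposed invariants, here is a minimal correlation-sum sketch in the spirit of the Grassberger-Procaccia estimator of the correlation fractal dimension. The embedding parameters, test signal, and radii are my choices; this is not the paper's feature-extraction pipeline.

```python
import numpy as np

def correlation_sum(x, r, dim=3, tau=1):
    """Fraction of delay-embedded point pairs closer than radius r."""
    n = len(x) - (dim - 1) * tau
    pts = np.column_stack([x[i * tau: i * tau + n] for i in range(dim)])
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    iu = np.triu_indices(n, k=1)        # each pair counted once
    return float(np.mean(d[iu] < r))

# The correlation dimension is the slope of log C(r) versus log r.
rng = np.random.default_rng(0)
x = np.sin(0.3 * np.arange(400)) + 0.01 * rng.standard_normal(400)
r1, r2 = 0.1, 0.4
slope = np.log(correlation_sum(x, r2) / correlation_sum(x, r1)) / np.log(r2 / r1)
print(round(slope, 2))  # close to 1: the embedded signal is curve-like
```

A noisy sinusoid embeds onto a closed curve, so the estimated dimension is near one; chaotic speech-like signals would yield non-integer values.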
"t" for Two: Using Mnemonics to Teach Statistics
ERIC Educational Resources Information Center
Stalder, Daniel R.; Olson, Elizabeth A.
2011-01-01
This article provides a list of statistical mnemonics for instructor use. This article also reports on the potential for such mnemonics to help students learn, enjoy, and become less apprehensive about statistics. Undergraduates from two sections of a psychology statistics course rated 8 of 11 mnemonics as significantly memorable and helpful in…
A Tablet-PC Software Application for Statistics Classes
ERIC Educational Resources Information Center
Probst, Alexandre C.
2014-01-01
A significant deficiency in the area of introductory statistics education exists: Student performance on standardized assessments after a full semester statistics course is poor and students report a very low desire to learn statistics. Research on the current generation of students indicates an affinity for technology and for multitasking.…
A Statistics Curriculum for the Undergraduate Chemistry Major
ERIC Educational Resources Information Center
Schlotter, Nicholas E.
2013-01-01
Our ability to statistically analyze data has grown significantly with the maturing of computer hardware and software. However, the evolution of our statistics capabilities has taken place without a corresponding evolution in the curriculum for the undergraduate chemistry major. Most faculty understand the need for a statistical educational…
Should College Algebra be a Prerequisite for Taking Psychology Statistics?
ERIC Educational Resources Information Center
Sibulkin, Amy E.; Butler, J. S.
2008-01-01
In order to consider whether a course in college algebra should be a prerequisite for taking psychology statistics, we recorded students' grades in elementary psychology statistics and in college algebra at a 4-year university. Students who earned credit in algebra prior to enrolling in statistics for the first time had a significantly higher mean…
Innovative trend significance test and applications
NASA Astrophysics Data System (ADS)
Şen, Zekai
2015-11-01
Hydro-climatological time series may embed characteristics of past changes in climate variability in the form of shifts, cyclic fluctuations, and, most significantly, trends. Identification of such features from the available records is one of the prime tasks of hydrologists, climatologists, applied statisticians, and experts in related fields. Although different trend identification and significance tests exist in the literature, they require restrictive assumptions that may not hold for the structure of hydro-climatological time series. In this paper, a method with a statistical significance test is suggested for trend identification in an innovative manner. The method has a non-parametric basis without any restrictive assumptions, and its application is rather simple, resting on comparisons between sub-series extracted from the main time series. It allows the selection of sub-temporal half periods for the comparison and, finally, identifies trends in an objective and quantitative manner. The necessary statistical equations are derived for innovative trend identification and the statistical significance test. The proposed methodology is applied to three time series from different parts of the world: Southern New Jersey annual temperature, Danube River annual discharge, and Tigris River Diyarbakir meteorology station annual total rainfall records. The New Jersey record has a significant increasing trend, whereas the other two records have decreasing trends.
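A minimal sketch of the sub-series comparison idea, as I read it (this is not the paper's full significance test; the indicator below only captures the sign and rough size of a trend by comparing the sorted halves of the record):

```python
def innovative_trend_indicator(series):
    """Mean departure of the sorted second half from the sorted first half:
    positive for an increasing trend, negative for a decreasing one."""
    n = len(series) // 2
    first = sorted(series[:n])
    second = sorted(series[-n:])
    return sum(b - a for a, b in zip(first, second)) / n

rising = [i + 0.1 * ((-1) ** i) for i in range(40)]   # noisy upward series
print(innovative_trend_indicator(rising) > 0)         # True
print(innovative_trend_indicator(rising[::-1]) < 0)   # True
```

In the paper's graphical version the sorted halves are plotted against each other and departures from the 1:1 line indicate trend; the indicator here is just the mean of those departures.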
Rossell, David
2016-01-01
Big Data brings unprecedented power to address scientific, economic and societal issues, but also amplifies the possibility of certain pitfalls. These include using purely data-driven approaches that disregard understanding the phenomenon under study, aiming at a dynamically moving target, ignoring critical data collection issues, summarizing or preprocessing the data inadequately and mistaking noise for signal. We review some success stories and illustrate how statistical principles can help obtain more reliable information from data. We also touch upon current challenges that require active methodological research, such as strategies for efficient computation, integration of heterogeneous data, extending the underlying theory to increasingly complex questions and, perhaps most importantly, training a new generation of scientists to develop and deploy these strategies. PMID:27722040
Dienes, J.K.
1983-01-01
An alternative to the use of plasticity theory to characterize the inelastic behavior of solids is to represent the flaws by statistical methods. We have taken such an approach to study fragmentation because it offers a number of advantages. Foremost among these is that, by considering the effects of flaws, it becomes possible to address the underlying physics directly. For example, we have been able to explain why rocks exhibit large strain-rate effects (a consequence of the finite growth rate of cracks), why a spherical explosive imbedded in oil shale produces a cavity with a nearly square section (opening of bedding cracks) and why propellants may detonate following low-speed impact (a consequence of frictional hot spots).
NASA Astrophysics Data System (ADS)
Hsu, Hsiao-Ping; Nadler, Walder; Grassberger, Peter
2005-07-01
The scaling behavior of randomly branched polymers in a good solvent is studied in two to nine dimensions, modeled by lattice animals on simple hypercubic lattices. For the simulations, we use a biased sequential sampling algorithm with re-sampling, similar to the pruned-enriched Rosenbluth method (PERM) used extensively for linear polymers. We obtain high statistics of animals with up to several thousand sites in all dimensions 2⩽d⩽9. The partition sum (number of different animals) and gyration radii are estimated. In all dimensions we verify the Parisi-Sourlas prediction, and we verify all exactly known critical exponents in dimensions 2, 3, 4, and ⩾8. In addition, we present the hitherto most precise estimates for growth constants in d⩾3. For clusters with one site attached to an attractive surface, we verify the superuniversality of the cross-over exponent at the adsorption transition predicted by Janssen and Lyssy.
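For intuition about the objects being counted, a brute-force enumeration of small fixed lattice animals (translation-distinct polyominoes in d=2) can be written directly; the paper itself samples far larger animals with a PERM-like algorithm rather than enumerating.

```python
def count_fixed_animals(max_size):
    """Exactly enumerate fixed (translation-distinct) 2D lattice animals
    by growing each animal one neighboring cell at a time."""
    def normalize(cells):
        mx = min(x for x, y in cells)
        my = min(y for x, y in cells)
        return frozenset((x - mx, y - my) for x, y in cells)

    current = {frozenset([(0, 0)])}
    counts = [len(current)]
    for _ in range(max_size - 1):
        grown = set()
        for animal in current:
            for (x, y) in animal:
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    cell = (x + dx, y + dy)
                    if cell not in animal:
                        grown.add(normalize(animal | {cell}))
        current = grown
        counts.append(len(current))
    return counts

print(count_fixed_animals(4))  # [1, 2, 6, 19]
```

Exact enumeration grows exponentially in the number of sites, which is why stochastic sampling methods such as PERM are needed for the thousand-site animals studied here.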
Conditional statistical model building
NASA Astrophysics Data System (ADS)
Hansen, Mads Fogtmann; Hansen, Michael Sass; Larsen, Rasmus
2008-03-01
We present a new statistical deformation model suited for parameterized grids with different resolutions. Our method models the covariances between multiple grid levels explicitly, and allows for very efficient fitting of the model to data on multiple scales. The model is validated on a data set consisting of 62 annotated MR images of Corpus Callosum. One fifth of the data set was used as a training set, which was non-rigidly registered to each other without a shape prior. From the non-rigidly registered training set a shape prior was constructed by performing principal component analysis on each grid level and using the results to construct a conditional shape model, conditioning the finer parameters with the coarser grid levels. The remaining shapes were registered with the constructed shape prior. The dice measures for the registration without prior and the registration with a prior were 0.875 +/- 0.042 and 0.8615 +/- 0.051, respectively.
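The conditioning step described above, where finer grid-level parameters are conditioned on coarser ones, is standard Gaussian conditioning. A sketch (the variable names and toy covariance are mine, not the paper's):

```python
import numpy as np

def condition_fine_on_coarse(mu_f, mu_c, S_ff, S_fc, S_cc, c):
    """Gaussian conditioning: distribution of fine parameters f given
    observed coarse parameters c, from the blocks of the joint covariance."""
    K = S_fc @ np.linalg.inv(S_cc)          # regression of fine on coarse
    mean = mu_f + K @ (c - mu_c)
    cov = S_ff - K @ S_fc.T
    return mean, cov

# Toy joint covariance [[S_ff, S_fc], [S_cf, S_cc]]: one fine, one coarse parameter.
S = np.array([[2.0, 0.8],
              [0.8, 1.0]])
mean, cov = condition_fine_on_coarse(
    np.zeros(1), np.zeros(1), S[:1, :1], S[:1, 1:], S[1:, 1:], np.array([1.0]))
print(mean[0], round(cov[0, 0], 6))  # 0.8 1.36
```

Conditioning shrinks the fine-level variance (from 2.0 to 1.36 here), which is exactly what makes a coarse-to-fine shape prior useful in registration.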
Statistical physics ""Beyond equilibrium
Ecke, Robert E
2009-01-01
The scientific challenges of the 21st century will increasingly involve competing interactions, geometric frustration, spatial and temporal intrinsic inhomogeneity, nanoscale structures, and interactions spanning many scales. We will focus on a broad class of emerging problems that will require new tools in non-equilibrium statistical physics and that will find application in new material functionality, in predicting complex spatial dynamics, and in understanding novel states of matter. Our work will encompass materials under extreme conditions involving elastic/plastic deformation, competing interactions, intrinsic inhomogeneity, frustration in condensed matter systems, scaling phenomena in disordered materials from glasses to granular matter, quantum chemistry applied to nano-scale materials, soft-matter materials, and spatio-temporal properties of both ordinary and complex fluids.
Statistical Thermodynamics of Biomembranes
Devireddy, Ram V.
2010-01-01
An overview of the major issues involved in the statistical thermodynamic treatment of phospholipid membranes at the atomistic level is summarized: thermodynamic ensembles, initial configuration (or the physical system being modeled), force field representation as well as the representation of long-range interactions. This is followed by a description of the various ways that the simulated ensembles can be analyzed: area of the lipid, mass density profiles, radial distribution functions (RDFs), water orientation profile, deuterium order parameter, free energy profiles and void (pore) formation, with particular focus on the results obtained from our recent molecular dynamics (MD) simulations of phospholipids interacting with dimethylsulfoxide (Me2SO), a commonly used cryoprotective agent (CPA). PMID:19460363
Statistical physics of vaccination
NASA Astrophysics Data System (ADS)
Wang, Zhen; Bauch, Chris T.; Bhattacharyya, Samit; d'Onofrio, Alberto; Manfredi, Piero; Perc, Matjaž; Perra, Nicola; Salathé, Marcel; Zhao, Dawei
2016-12-01
Historically, infectious diseases caused considerable damage to human societies, and they continue to do so today. To help reduce their impact, mathematical models of disease transmission have been studied to help understand disease dynamics and inform prevention strategies. Vaccination, one of the most important preventive measures of modern times, is of great interest both theoretically and empirically. And in contrast to traditional approaches, recent research increasingly explores the pivotal implications of individual behavior and heterogeneous contact patterns in populations. Our report reviews the developmental arc of theoretical epidemiology with emphasis on vaccination, as it led from classical models assuming homogeneously mixing (mean-field) populations and ignoring human behavior, to recent models that account for behavioral feedback and/or population spatial/social structure. Many of the methods used originated in statistical physics, such as lattice and network models, and their associated analytical frameworks. Similarly, the feedback loop between vaccinating behavior and disease propagation forms a coupled nonlinear system with analogs in physics. We also review the new paradigm of digital epidemiology, wherein sources of digital data such as online social media are mined for high-resolution information on epidemiologically relevant individual behavior. Armed with the tools and concepts of statistical physics, and further assisted by new sources of digital data, models that capture nonlinear interactions between behavior and disease dynamics offer a novel way of modeling real-world phenomena, and can help improve health outcomes. We conclude the review by discussing open problems in the field and promising directions for future research.
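One classical mean-field result underlying this literature can be stated in a few lines: with a vaccinated fraction p, the effective reproduction number is (1 - p) R0, giving the textbook herd-immunity threshold. This is generic SIR reasoning, not a specific model from the review.

```python
def herd_immunity_threshold(R0):
    """Critical vaccination coverage in the homogeneous mean-field SIR picture."""
    return max(0.0, 1.0 - 1.0 / R0)

def r_effective(R0, p):
    """Reproduction number when a fraction p of the population is immune."""
    return (1.0 - p) * R0

R0 = 4.0
pc = herd_immunity_threshold(R0)
print(pc)                   # 0.75
print(r_effective(R0, pc))  # 1.0: exactly critical at the threshold coverage
```

Much of the reviewed work studies how network structure and behavioral feedback shift or destabilize this simple threshold.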
Statistical Literacy: Developing a Youth and Adult Education Statistical Project
ERIC Educational Resources Information Center
Conti, Keli Cristina; Lucchesi de Carvalho, Dione
2014-01-01
This article focuses on the notion of literacy--general and statistical--in the analysis of data from a fieldwork research project carried out as part of a master's degree that investigated the teaching and learning of statistics in adult education mathematics classes. We describe the statistical context of the project that involved the…
Key Statistics for Thyroid Cancer
HPV-Associated Cancers Statistics
Muscular Dystrophy: Data and Statistics
International petroleum statistics report
1996-05-01
The International Petroleum Statistics Report presents data on international oil production, demand, imports, exports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1995; OECD stocks from 1973 through 1995; and OECD trade from 1984 through 1994.
Topics in statistical mechanics
Elser, V.
1984-05-01
This thesis deals with four independent topics in statistical mechanics: (1) the dimer problem is solved exactly for a hexagonal lattice with general boundary using a known generating function from the theory of partitions. It is shown that the leading term in the entropy depends on the shape of the boundary; (2) continuum models of percolation and self-avoiding walks are introduced with the property that their series expansions are sums over linear graphs with intrinsic combinatorial weights and explicit dimension dependence; (3) a constrained SOS model is used to describe the edge of a simple cubic crystal. Low and high temperature results are derived as well as the detailed behavior near the crystal facet; (4) the microscopic model of the lambda-transition involving atomic permutation cycles is reexamined. In particular, a new derivation of the two-component field theory model of the critical behavior is presented. Results for a lattice model originally proposed by Kikuchi are extended with a high temperature series expansion and Monte Carlo simulation. 30 references.
Statistics of indistinguishable particles.
Wittig, Curt
2009-07-02
The wave function of a system containing identical particles takes into account the relationship between a particle's intrinsic spin and its statistical property. Specifically, the exchange of two identical particles having odd-half-integer spin results in the wave function changing sign, whereas the exchange of two identical particles having integer spin is accompanied by no such sign change. This is embodied in a term (-1)(2s), which has the value +1 for integer s (bosons), and -1 for odd-half-integer s (fermions), where s is the particle spin. All of this is well-known. In the nonrelativistic limit, a detailed consideration of the exchange of two identical particles shows that exchange is accompanied by a 2pi reorientation that yields the (-1)(2s) term. The same bookkeeping is applicable to the relativistic case described by the proper orthochronous Lorentz group, because any proper orthochronous Lorentz transformation can be expressed as the product of spatial rotations and a boost along the direction of motion.
International petroleum statistics report
1996-10-01
The International Petroleum Statistics Report presents data on international oil production, demand, imports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1995; OECD stocks from 1973 through 1995; and OECD trade from 1985 through 1995.
International petroleum statistics report
1995-11-01
The International Petroleum Statistics Report presents data on international oil production, demand, imports, exports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1994; OECD stocks from 1973 through 1994; and OECD trade from 1984 through 1994.
International petroleum statistics report
1995-07-27
The International Petroleum Statistics Report presents data on international oil production, demand, imports, exports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1994; OECD stocks from 1973 through 1994; and OECD trade from 1984 through 1994.
International petroleum statistics report
1997-07-01
The International Petroleum Statistics Report is a monthly publication that provides current international data. The report presents data on international oil production, demand, imports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent 12 months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1996; OECD stocks from 1973 through 1996; and OECD trade from 1986 through 1996.
Information in statistical physics
NASA Astrophysics Data System (ADS)
Balian, Roger
We review with a tutorial scope the information theory foundations of quantum statistical physics. Only a small proportion of the variables that characterize a system at the microscopic scale can be controlled, for both practical and theoretical reasons, and a probabilistic description involving the observers is required. The criterion of maximum von Neumann entropy is then used for making reasonable inferences. It means that no spurious information is introduced besides the known data. Its outcomes can be given a direct justification based on the principle of indifference of Laplace. We introduce the concept of relevant entropy associated with some set of relevant variables; it characterizes the information that is missing at the microscopic level when only these variables are known. For equilibrium problems, the relevant variables are the conserved ones, and the Second Law is recovered as a second step of the inference process. For non-equilibrium problems, the increase of the relevant entropy expresses an irretrievable loss of information from the relevant variables towards the irrelevant ones. Two examples illustrate the flexibility of the choice of relevant variables and the multiplicity of the associated entropies: the thermodynamic entropy (satisfying the Clausius-Duhem inequality) and the Boltzmann entropy (satisfying the H-theorem). The identification of entropy with missing information is also supported by the paradox of Maxwell's demon. Spin-echo experiments show that irreversibility itself is not an absolute concept: use of hidden information may overcome the arrow of time.
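The maximum-entropy inference described above can be made concrete for a discrete system: the entropy-maximizing distribution under a mean-energy constraint is the Gibbs form, with the Lagrange multiplier found numerically. This is a standard textbook calculation; the energy levels below are arbitrary.

```python
import math

def gibbs(levels, beta):
    """Maximum-entropy distribution with a mean-energy constraint:
    p_i proportional to exp(-beta * E_i)."""
    w = [math.exp(-beta * e) for e in levels]
    z = sum(w)
    return [x / z for x in w]

def solve_beta(levels, mean_energy, lo=-50.0, hi=50.0):
    """Bisection for the multiplier; mean energy decreases as beta grows."""
    def mean(beta):
        p = gibbs(levels, beta)
        return sum(pi * e for pi, e in zip(p, levels))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(mid) > mean_energy:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [0.0, 1.0, 2.0]
beta = solve_beta(levels, 0.5)
p = gibbs(levels, beta)
print(round(sum(pi * e for pi, e in zip(p, levels)), 6))  # 0.5: constraint met
```

No information beyond the constraint is injected: among all distributions with the given mean energy, this one has maximal entropy, which is the inference principle the review builds on.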
Statistical Mechanics of Money
NASA Astrophysics Data System (ADS)
Dragulescu, Adrian; Yakovenko, Victor
2000-03-01
We study a network of agents exchanging money between themselves. We find that the stationary probability distribution of money M is the Gibbs distribution exp(-M/T), where T is an effective ``temperature'' equal to the average amount of money per agent. This is in agreement with the general laws of statistical mechanics, because money is conserved during each transaction and the number of agents is held constant. We have verified the emergence of the Gibbs distribution in computer simulations of various trading rules and models. When the time-reversal symmetry of the trading rules is explicitly broken, deviations from the Gibbs distribution may occur, as follows from the Boltzmann-equation approach to the problem. Money distribution characterizes the purchasing power of a system. A seller would maximize his/her income by setting the price of a product equal to the temperature T of the system. Buying products from a system at temperature T1 and selling them to a system at temperature T2 would generate profit T2 - T1 > 0, as in a thermal machine.
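A minimal simulation of one of the trading rules discussed (random redistribution of a pair's pooled money; the population size, initial money, and step count are my choices) reproduces the claimed behavior:

```python
import random

random.seed(1)
N, M0, steps = 500, 10.0, 200_000
money = [M0] * N
for _ in range(steps):
    i, j = random.randrange(N), random.randrange(N)
    if i == j:
        continue
    eps = random.random()                 # split the pooled money at random
    pool = money[i] + money[j]
    money[i], money[j] = eps * pool, (1.0 - eps) * pool

T = sum(money) / N                        # effective "temperature"
frac_below_T = sum(1 for m in money if m < T) / N
print(round(T, 6))                        # 10.0: total money is conserved
print(round(frac_below_T, 2))             # near 1 - exp(-1) ≈ 0.63 for a Gibbs law
```

Conservation per transaction is what drives the result: because each trade preserves the pair's total, the stationary distribution takes the exponential Gibbs form with T equal to the mean money per agent.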
Statistical mechanics of nucleosomes
NASA Astrophysics Data System (ADS)
Chereji, Razvan V.
Eukaryotic cells contain long DNA molecules (about two meters for a human cell) which are tightly packed inside the micrometric nuclei. Nucleosomes are the basic packaging unit of the DNA which allows this millionfold compactification. A longstanding puzzle is to understand the principles which allow cells to both organize their genomes into chromatin fibers in the crowded space of their nuclei, and also to keep the DNA accessible to many factors and enzymes. With the nucleosomes covering about three quarters of the DNA, their positions are essential because these influence which genes can be regulated by the transcription factors and which cannot. We study physical models which predict the genome-wide organization of the nucleosomes and also the relevant energies which dictate this organization. In the last five years, the study of chromatin has seen many important advances. In particular, in the field of nucleosome positioning, new techniques for identifying nucleosomes and the competing DNA-binding factors have appeared, such as chemical mapping with hydroxyl radicals and ChIP-exo, among others; the resolution of nucleosome maps has increased through paired-end sequencing; and the price of sequencing an entire genome has decreased. We present a rigorous statistical mechanics model which is able to explain the recent experimental results by taking into account nucleosome unwrapping, competition between different DNA-binding proteins, and both the interaction between histones and DNA, and between neighboring histones. We show a series of predictions of our new model, all in agreement with the experimental observations.
Statistics: It's in the Numbers!
ERIC Educational Resources Information Center
Deal, Mary M.; Deal, Walter F., III
2007-01-01
Mathematics and statistics play important roles in peoples' lives today. A day hardly passes that they are not bombarded with many different kinds of statistics. As consumers they see statistical information as they surf the web, watch television, listen to their satellite radios, or even read the nutrition facts panel on a cereal box in the…
Digest of Education Statistics, 1980.
ERIC Educational Resources Information Center
Grant, W. Vance; Eiden, Leo J.
The primary purpose of this publication is to provide an abstract of statistical information covering the broad field of American education from prekindergarten through graduate school. Statistical information is presented in 14 figures and 200 tables with brief trend analyses. In addition to updating many of the statistics that have appeared in…
Statistical log analysis made practical
Mitchell, W.K.; Nelson, R.J.
1991-06-01
This paper discusses the advantages of a statistical approach to log analysis. Statistical techniques use inverse methods to calculate formation parameters. The use of statistical techniques has been limited, however, by the complexity of the mathematics and lengthy computer time required to minimize traditionally used nonlinear equations.
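The inverse-method framing can be sketched with a linear toy problem (the response matrix, mineral end-members, and readings below are hypothetical, not from the paper): each log reading is modeled as a mix of end-member responses, and volume fractions are recovered by least squares.

```python
import numpy as np

# Columns are hypothetical end-members (quartz, shale, water); rows are tools.
R = np.array([
    [2.65, 2.35, 1.00],   # bulk density response
    [0.00, 0.30, 1.00],   # neutron porosity response
    [1.00, 1.00, 1.00],   # closure: volume fractions sum to 1
])
logs = np.array([2.395, 0.19, 1.0])       # readings at one depth level
fractions, *_ = np.linalg.lstsq(R, logs, rcond=None)
print(np.round(fractions, 3))             # recovers [0.6, 0.3, 0.1]
```

Real tool responses are nonlinear in the formation parameters, which is the source of the minimization complexity and computer time the paper refers to; the linear case above is the tractable core of the idea.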
Invention Activities Support Statistical Reasoning
ERIC Educational Resources Information Center
Smith, Carmen Petrick; Kenlan, Kris
2016-01-01
Students' experiences with statistics and data analysis in middle school are often limited to little more than making and interpreting graphs. Although students may develop fluency in statistical procedures and vocabulary, they frequently lack the skills necessary to apply statistical reasoning in situations other than clear-cut textbook examples.…
Teaching Statistics Online Using "Excel"
ERIC Educational Resources Information Center
Jerome, Lawrence
2011-01-01
As anyone who has taught or taken a statistics course knows, statistical calculations can be tedious and error-prone, with the details of a calculation sometimes distracting students from understanding the larger concepts. Traditional statistics courses typically use scientific calculators, which can relieve some of the tedium and errors but…
Exact significance test for Markov order
NASA Astrophysics Data System (ADS)
Pethel, S. D.; Hahs, D. W.
2014-02-01
We describe an exact significance test of the null hypothesis that a Markov chain is nth order. The procedure utilizes surrogate data to yield an exact test statistic distribution valid for any sample size. Surrogate data are generated using a novel algorithm that guarantees, per shot, a uniform sampling from the set of sequences that exactly match the nth order properties of the observed data. Using the test, the Markov order of Tel Aviv rainfall data is examined.
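The paper's algorithm guarantees per-shot uniform sampling from the set of sequences that exactly match the observed nth-order counts; the sketch below substitutes a simpler, approximate surrogate generator that resimulates from the fitted first-order chain, so it illustrates the logic of the test rather than the exact method. All names are illustrative.

```python
import numpy as np

def order_llr(seq, n_states):
    """Log-likelihood ratio of a 2nd-order vs. a 1st-order Markov fit,
    computed over the same set of (x[t-2], x[t-1], x[t]) triples;
    it is ~0 when the chain is genuinely 1st order."""
    c2 = np.zeros((n_states, n_states, n_states))
    for a, b, c in zip(seq[:-2], seq[1:-1], seq[2:]):
        c2[a, b, c] += 1
    c1 = c2.sum(axis=0)  # 1st-order counts restricted to the same triples

    def loglik(counts):
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = counts * np.log(counts / counts.sum(axis=-1, keepdims=True))
        return np.nansum(terms)  # 0 * log(0) -> nan, dropped as 0

    return loglik(c2) - loglik(c1)

def markov_order_test(seq, n_states, n_surrogates=100, rng=None):
    """Monte Carlo p-value for H0: the chain is 1st order. Surrogates are
    resimulated from the fitted 1st-order transition matrix (approximate;
    assumes every visited state has an observed outgoing transition)."""
    rng = np.random.default_rng(rng)
    seq = np.asarray(seq)
    counts = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    p = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)

    observed = order_llr(seq, n_states)
    hits = 0
    for _ in range(n_surrogates):
        s = [seq[0]]
        for _ in range(len(seq) - 1):
            s.append(rng.choice(n_states, p=p[s[-1]]))
        if order_llr(np.array(s), n_states) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_surrogates + 1)
```

A strictly alternating sequence is perfectly 1st order (LLR of 0, p-value of 1), while a period-3 pattern needs two symbols of memory and yields a large observed LLR.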
Superluminal motion statistics and cosmology
NASA Astrophysics Data System (ADS)
Vermeulen, R. C.; Cohen, M. H.
1994-08-01
This paper has three parts. First, we give an up-to-date overview of the available apparent velocity (β_app) data; second, we present some statistical predictions from simple relativistic beaming models; third, we discuss the inferences which a comparison of data and models allows for both relativistic jets and cosmology. We demonstrate that, in objects selected by Doppler-boosted flux density, likely Lorentz factors (γ) can be estimated from the first-ranked β_app in samples as small as 5. Using 25 core-selected quasars, we find that the dependence of γ on redshift differs depending on the value of q_0: γ is close to constant over z if q_0 = 0.5, but increases with z if q_0 = 0.05. Conversely, this result could be used to constrain q_0, using either theoretical limits on γ or observational constraints on the full distribution of γ in each of several redshift bins, as could be derived from the β_app statistics in larger samples. We investigate several modifications to the simple relativistic beam concept, and their effects on the β_app statistics. There is likely to be a spread of γ over the sample, with relative width W. There could also be a separate pattern and bulk γ, which we model with a factor r ≡ γ_p/γ_b. The values of W and r are coupled, and a swath in the (W, r)-plane is allowed by the β_app data in core-selected quasars. Interestingly, γ_p could be both smaller and larger than γ_b, or they could be equal, if W is large, but the most naive model (0, 1), with the same Lorentz factor in all sources and no separate pattern motions, is excluded. A possible cutoff in quasar jet orientations, as in some unification models, causes a sharp shift toward higher β_app in randomly oriented samples but does not strongly affect the statistics of core-selected samples. If there is moderate bending of the jets on parsec scales, on the other hand, this has no significant impact on
Ideal statistically quasi Cauchy sequences
NASA Astrophysics Data System (ADS)
Savas, Ekrem; Cakalli, Huseyin
2016-08-01
An ideal I is a family of subsets of ℕ, the set of positive integers, which is closed under taking finite unions and subsets of its elements. A sequence (x_k) of real numbers is said to be S(I)-statistically convergent to a real number L if, for each ε > 0 and for each δ > 0, the set {n ∈ ℕ : (1/n)|{k ≤ n : |x_k - L| ≥ ε}| ≥ δ} belongs to I. We introduce S(I)-statistically ward compactness of a subset of ℝ, the set of real numbers, and S(I)-statistically ward continuity of a real function, in the following senses: a subset E of ℝ is S(I)-statistically ward compact if any sequence of points in E has an S(I)-statistically quasi-Cauchy subsequence, and a real function is S(I)-statistically ward continuous if it preserves S(I)-statistically quasi-Cauchy sequences, where a sequence (x_k) is said to be S(I)-statistically quasi-Cauchy when (Δx_k) is S(I)-statistically convergent to 0. We obtain results related to S(I)-statistically ward continuity, S(I)-statistically ward compactness, Nθ-ward continuity, and slowly oscillating continuity.
Statistical Symbolic Execution with Informed Sampling
NASA Technical Reports Server (NTRS)
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space, and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that the informed sampling obtains more precise results and converges faster than a purely statistical analysis and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
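The Bayesian estimation step at the core of this approach can be sketched as Bernoulli sampling of paths with a conjugate Beta prior. This is a generic illustration, not Symbolic PathFinder's API; `sample_path` stands in for drawing one concrete path and checking whether it reaches the target event, and the informed-sampling pruning step is omitted.

```python
import random

def bayesian_hit_estimate(sample_path, n_samples=10_000, alpha=1.0, beta=1.0, seed=0):
    """Estimate the probability of reaching a target event (e.g. an assert
    violation) by sampling paths and updating a Beta(alpha, beta) prior.
    Returns the posterior mean and variance of the hit probability."""
    rng = random.Random(seed)
    hits = sum(bool(sample_path(rng)) for _ in range(n_samples))
    a = alpha + hits
    b = beta + n_samples - hits
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var
```

The posterior mean and variance also give the hypothesis-testing machinery: one can report, e.g., the posterior probability that the event probability exceeds a safety threshold.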
Basic statistics in cell biology.
Vaux, David L
2014-01-01
The physicist Ernest Rutherford said, "If your experiment needs statistics, you ought to have done a better experiment." Although this aphorism remains true for much of today's research in cell biology, a basic understanding of statistics can be useful to cell biologists to help in monitoring the conduct of their experiments, in interpreting the results, in presenting them in publications, and when critically evaluating research by others. However, training in statistics is often focused on the sophisticated needs of clinical researchers, psychologists, and epidemiologists, whose conclusions depend wholly on statistics, rather than the practical needs of cell biologists, whose experiments often provide evidence that is not statistical in nature. This review describes some of the basic statistical principles that may be of use to experimental biologists, but it does not cover the sophisticated statistics needed for papers that contain evidence of no other kind.
Statistical Analysis Experiment for Freshman Chemistry Lab.
ERIC Educational Resources Information Center
Salzsieder, John C.
1995-01-01
Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…
Statistical Misconceptions and Rushton's Writings on Race.
ERIC Educational Resources Information Center
Cernovsky, Zack Z.
The term "statistical significance" is often misunderstood or abused to imply a large effect size. A recent example is in the work of J. P. Rushton (1988, 1990) on differences between Negroids and Caucasoids. Rushton used brain size and cranial size as indicators of intelligence, using Pearson "r"s ranging from 0.03 to 0.35.…
Gaussian statistics for palaeomagnetic vectors
Love, J.J.; Constable, C.G.
2003-01-01
formulate the inverse problem, and how to estimate the mean and variance of the magnetic vector field, even when the data consist of mixed combinations of directions and intensities. We examine palaeomagnetic secular-variation data from Hawaii and Réunion, and although these two sites are on almost opposite latitudes, we find significant differences in the mean vector and differences in the local vectorial variances, with the Hawaiian data being particularly anisotropic. These observations are inconsistent with a description of the mean field as being a simple geocentric axial dipole and with secular variation being statistically symmetrical with respect to reflection through the equatorial plane. Finally, our analysis of palaeomagnetic acquisition data from the 1960 Kilauea flow in Hawaii and the Holocene Xitle flow in Mexico is consistent with the widely held suspicion that directional data are more accurate than intensity data.
Statistical Seismology and Induced Seismicity
NASA Astrophysics Data System (ADS)
Tiampo, K. F.; González, P. J.; Kazemian, J.
2014-12-01
While seismicity triggered or induced by natural resources production such as mining or water impoundment in large dams has long been recognized, the recent increase in the unconventional production of oil and gas has been linked to a rapid rise in seismicity in many places, including central North America (Ellsworth et al., 2012; Ellsworth, 2013). Worldwide, induced events of M~5 have occurred and, although rare, have resulted in both damage and public concern (Horton, 2012; Keranen et al., 2013). In addition, over the past twenty years, the increase in both the number and coverage of seismic stations has resulted in an unprecedented ability to precisely record the magnitude and location of large numbers of small magnitude events. The increase in the number and type of seismic sequences available for detailed study has revealed differences in their statistics that were previously difficult to quantify. For example, seismic swarms that produce significant numbers of foreshocks as well as aftershocks have been observed in different tectonic settings, including California, Iceland, and the East Pacific Rise (McGuire et al., 2005; Shearer, 2012; Kazemian et al., 2014). Similarly, smaller events have been observed prior to larger induced events in several occurrences from energy production. The field of statistical seismology has long focused on the question of triggering and the mechanisms responsible (Stein et al., 1992; Hill et al., 1993; Steacy et al., 2005; Parsons, 2005; Main et al., 2006). For example, in most cases the associated stress perturbations are much smaller than the earthquake stress drop, suggesting an inherent sensitivity to relatively small stress changes (Nalbant et al., 2005). Induced seismicity provides the opportunity to investigate triggering and, in particular, the differences between long- and short-range triggering. Here we investigate the statistics of induced seismicity sequences from around the world, including central North America and Spain, and
Interpretation and use of statistics in nursing research.
Giuliano, Karen K; Polanowicz, Michelle
2008-01-01
A working understanding of the major fundamentals of statistical analysis is required to incorporate the findings of empirical research into nursing practice. The primary focus of this article is to describe common statistical terms, present some common statistical tests, and explain the interpretation of results from inferential statistics in nursing research. An overview of major concepts in statistics, including the distinction between parametric and nonparametric statistics, different types of data, and the interpretation of statistical significance, is reviewed. Examples of some of the most common statistical techniques used in nursing research, such as the Student independent t test, analysis of variance, and regression, are also discussed. Nursing knowledge based on empirical research plays a fundamental role in the development of evidence-based nursing practice. The ability to interpret and use quantitative findings from nursing research is an essential skill for advanced practice nurses to ensure provision of the best care possible for our patients.
Siegel, Rebecca L; Miller, Kimberly D; Jemal, Ahmedin
2017-01-01
Each year, the American Cancer Society estimates the numbers of new cancer cases and deaths that will occur in the United States in the current year and compiles the most recent data on cancer incidence, mortality, and survival. Incidence data were collected by the Surveillance, Epidemiology, and End Results Program; the National Program of Cancer Registries; and the North American Association of Central Cancer Registries. Mortality data were collected by the National Center for Health Statistics. In 2017, 1,688,780 new cancer cases and 600,920 cancer deaths are projected to occur in the United States. For all sites combined, the cancer incidence rate is 20% higher in men than in women, while the cancer death rate is 40% higher. However, sex disparities vary by cancer type. For example, thyroid cancer incidence rates are 3-fold higher in women than in men (21 vs 7 per 100,000 population), despite equivalent death rates (0.5 per 100,000 population), largely reflecting sex differences in the "epidemic of diagnosis." Over the past decade of available data, the overall cancer incidence rate (2004-2013) was stable in women and declined by approximately 2% annually in men, while the cancer death rate (2005-2014) declined by about 1.5% annually in both men and women. From 1991 to 2014, the overall cancer death rate dropped 25%, translating to approximately 2,143,200 fewer cancer deaths than would have been expected if death rates had remained at their peak. Although the cancer death rate was 15% higher in blacks than in whites in 2014, increasing access to care as a result of the Patient Protection and Affordable Care Act may expedite the narrowing racial gap; from 2010 to 2015, the proportion of blacks who were uninsured halved, from 21% to 11%, as it did for Hispanics (31% to 16%). Gains in coverage for traditionally underserved Americans will facilitate the broader application of existing cancer control knowledge across every segment of the population. CA Cancer J Clin
NASA Astrophysics Data System (ADS)
Holmes, Jon L.
2000-06-01
IP-number access. Current subscriptions can be upgraded to IP-number access at little additional cost. We are pleased to be able to offer to institutions and libraries this convenient mode of access to subscriber only resources at JCE Online. JCE Online Usage Statistics We are continually amazed by the activity at JCE Online. So far, the year 2000 has shown a marked increase. Given the phenomenal overall growth of the Internet, perhaps our surprise is not warranted. However, during the months of January and February 2000, over 38,000 visitors requested over 275,000 pages. This is a monthly increase of over 33% from the October-December 1999 levels. It is good to know that people are visiting, but we would very much like to know what you would most like to see at JCE Online. Please send your suggestions to JCEOnline@chem.wisc.edu. For those who are interested, JCE Online year-to-date statistics are available. Biographical Snapshots of Famous Chemists: Mission Statement Feature Editor: Barbara Burke Chemistry Department, California State Polytechnic University-Pomona, Pomona, CA 91768 phone: 909/869-3664 fax: 909/869-4616 email: baburke@csupomona.edu The primary goal of this JCE Internet column is to provide information about chemists who have made important contributions to chemistry. For each chemist, there is a short biographical "snapshot" that provides basic information about the person's chemical work, gender, ethnicity, and cultural background. Each snapshot includes links to related websites and to a biobibliographic database. The database provides references for the individual and can be searched through key words listed at the end of each snapshot. All students, not just science majors, need to understand science as it really is: an exciting, challenging, human, and creative way of learning about our natural world. Investigating the life experiences of chemists can provide a means for students to gain a more realistic view of chemistry. In addition students
Statistics without Tears: Complex Statistics with Simple Arithmetic
ERIC Educational Resources Information Center
Smith, Brian
2011-01-01
One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…
SOCR: Statistics Online Computational Resource
Dinov, Ivo D.
2011-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning. PMID:21451741
ERIC Educational Resources Information Center
Bothe, Anne K.; Richardson, Jessica D.
2011-01-01
Purpose: To discuss constructs and methods related to assessing the magnitude and the meaning of clinical outcomes, with a focus on applications in speech-language pathology. Method: Professionals in medicine, allied health, psychology, education, and many other fields have long been concerned with issues referred to variously as practical…
ERIC Educational Resources Information Center
Buchanan, Taylor L.; Lohse, Keith R.
2016-01-01
We surveyed researchers in the health and exercise sciences to explore different areas and magnitudes of bias in researchers' decision making. Participants were presented with scenarios (testing a central hypothesis with p = 0.06 or p = 0.04) in a random order and surveyed about what they would do in each scenario. Participants showed significant…
ERIC Educational Resources Information Center
La Spata, Michelle G.; Carter, Christopher W.; Johnson, Wendi L.; McGill, Ryan J.
2016-01-01
The present study examined the utility of video self-modeling (VSM) for reducing externalizing behaviors (e.g., aggression, conduct problems, hyperactivity, and impulsivity) observed within the classroom environment. After identification of relevant target behaviors, VSM interventions were developed for first and second grade students (N = 4),…
Horne, Jim
2008-03-01
Habitually insufficient sleep could contribute towards obesity, metabolic syndrome, etc., via sleepiness-related inactivity and excess energy intake; more controversially, through more direct physiological changes. Epidemiological studies in adults/children point to small clinical risk only in very short (around 5 h in adults) or long sleepers, developing over many years, involving hundreds of hours of 'too little' or 'too much' sleep. Although acute 4 h/day sleep restriction leads to glucose intolerance and incipient metabolic syndrome, this is too little sleep and cannot be sustained beyond a few days. Few obese adults/children are short sleepers, and few short-sleeping adults/children are obese or suffer obesity-related disorders. For adults, about 7 h of uninterrupted daily sleep is 'healthy'. Extending sleep, even with hypnotics, to lose weight may take years, compared with the rapidity of utilising extra sleep time to exercise and evaluate one's diet. The real health risk of inadequate sleep comes from a sleepiness-related accident.
Statistical Significance, Effect Size Reporting, and Confidence Intervals: Best Reporting Strategies
ERIC Educational Resources Information Center
Capraro, Robert M.
2004-01-01
With great interest the author read the May 2002 editorial in the "Journal for Research in Mathematics Education (JRME)" (King, 2002) regarding changes to the 5th edition of the "Publication Manual of the American Psychological Association" (APA, 2001). Of special note to him, and of great import to the field of mathematics education research, are…
Constructing the Exact Significance Level for a Person-Fit Statistic.
ERIC Educational Resources Information Center
Liou, Michelle; Chang, Chih-Hsin
1992-01-01
An extension is proposed for the network algorithm introduced by C.R. Mehta and N.R. Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. A simulation study indicates the efficiency of the algorithm. (SLD)
NASA Astrophysics Data System (ADS)
Combes, Frédéric; Trescher, Maximilian; Piéchon, Frédéric; Fuchs, Jean-Noël
2016-10-01
We develop a theory for the analytic computation of the free energy of band insulators in the presence of a uniform and constant electric field. The two key ingredients are a perturbation-like expression of the Wannier-Stark energy spectrum of electrons and a modified statistical mechanics approach involving a local chemical potential in order to deal with the unbounded spectrum and impose the physically relevant electronic filling. At first order in the field, we recover the result of King-Smith, Vanderbilt, and Resta for the electric polarization in terms of a Zak phase—albeit at finite temperature—and, at second order, deduce a general formula for the electric susceptibility, or equivalently for the dielectric constant. Advantages of our method are the validity of the formalism both at zero and finite temperature and the easy computation of higher order derivatives of the free energy. We verify our findings on two different one-dimensional tight-binding models.
Graphic presentation of the simplest statistical tests
NASA Astrophysics Data System (ADS)
Georgiev, Tsvetan B.
This paper presents graphically the well-known tests for a change of a population mean and standard deviation, for the comparison of population means and standard deviations, and for the significance of correlation and regression coefficients. The critical bounds and criteria for variability with statistical guarantee P = 95% and P = 99% are presented as dependences on the number of data points n. The graphs give fast visual solutions of the direct problem (estimation of the confidence interval for specified P and n), as well as of the reverse problem (estimation of the n necessary to achieve a desired statistical guarantee for the result). The aim of the work is to present the simplest statistical tests in comprehensible and convenient graphs that are always at hand. The graphs may be useful in investigations of time series in astronomy, geophysics, ecology, etc., as well as in education.
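Under a normal approximation, the direct and reverse problems for a mean reduce to a few lines of code. This is only a sketch of the same calculations the graphs encode (the paper's graphs also cover variance, comparison, and correlation tests); the function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def ci_half_width(s, n, p=0.95):
    """Direct problem: half-width of the confidence interval for a mean
    with sample standard deviation s and n data points (normal
    approximation, adequate for large n)."""
    z = NormalDist().inv_cdf((1 + p) / 2)
    return z * s / sqrt(n)

def required_n(s, half_width, p=0.95):
    """Reverse problem: the n needed to reach a desired half-width with
    statistical guarantee P."""
    z = NormalDist().inv_cdf((1 + p) / 2)
    return ceil((z * s / half_width) ** 2)
```

For example, with s = 1 and n = 100 at P = 95%, the half-width is about 0.196, and inverting that half-width recovers n = 100.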
A spatial scan statistic for multinomial data.
Jung, Inkyung; Kulldorff, Martin; Richard, Otukei John
2010-08-15
As a geographical cluster detection analysis tool, the spatial scan statistic has been developed for different types of data such as Bernoulli, Poisson, ordinal, exponential and normal. Another interesting data type is multinomial. For example, one may want to find clusters where the disease-type distribution is statistically significantly different from the rest of the study region when there are different types of disease. In this paper, we propose a spatial scan statistic for such data, which is useful for geographical cluster detection analysis for categorical data without any intrinsic order information. The proposed method is applied to meningitis data consisting of five different disease categories to identify areas with distinct disease-type patterns in two counties in the U.K. The performance of the method is evaluated through a simulation study.
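For a fixed candidate zone, the multinomial scan statistic reduces to a log-likelihood ratio comparing the category mix inside the zone with that in the rest of the study region. The sketch below shows only this per-zone computation; the full method also scans over many candidate zones and assesses significance by Monte Carlo. Names are illustrative.

```python
import numpy as np

def multinomial_scan_llr(inside, total):
    """Log-likelihood ratio that the category mix inside a candidate zone
    differs from the rest of the study region. inside[k] and total[k] are
    counts of category k inside the zone and in the whole region; the
    zone must be a proper, non-empty subset of the region."""
    inside = np.asarray(inside, dtype=float)
    total = np.asarray(total, dtype=float)
    outside = total - inside
    c, n = inside.sum(), total.sum()

    def xlogy(x, y):  # convention: 0 * log(0) = 0
        return np.where(x > 0, x * np.log(np.where(x > 0, y, 1.0)), 0.0)

    ll_alt = xlogy(inside, inside / c).sum() + xlogy(outside, outside / (n - c)).sum()
    ll_null = xlogy(total, total / n).sum()
    return ll_alt - ll_null
```

The ratio is zero when the zone's category proportions match the region's, and grows as the disease-type distribution inside the zone diverges from the rest.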
Schmidt decomposition and multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Bogdanov, Yu. I.; Bogdanova, N. A.; Fastovets, D. V.; Luckichev, V. F.
2016-12-01
A new method of multivariate data analysis, based on complementing a classical probability distribution to a quantum state and on the Schmidt decomposition, is presented. We consider the application of the Schmidt formalism to problems of statistical correlation analysis. The correlation of photons in the beam-splitter output channels, when the input photon statistics are given by a compound Poisson distribution, is examined. The developed formalism allows us to analyze multidimensional systems, and we obtain analytical formulas for the Schmidt decomposition of multivariate Gaussian states. It is shown that the mathematical tools of quantum mechanics can significantly improve classical statistical analysis. The presented formalism is a natural approach for the analysis of both classical and quantum multivariate systems and can be applied in various tasks associated with the study of dependences.
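The link between a classical joint distribution and the Schmidt decomposition can be sketched as follows: complement p(x, y) to the state ψ(x, y) = √p(x, y) and take its singular values. This minimal sketch assumes a discrete bivariate distribution; function names are illustrative.

```python
import numpy as np

def schmidt_coefficients(joint):
    """Schmidt coefficients of psi(x, y) = sqrt(p(x, y)): the singular
    values of the matrix sqrt(p). Their squares sum to 1 whenever the
    joint distribution sums to 1."""
    return np.linalg.svd(np.sqrt(np.asarray(joint, dtype=float)),
                         compute_uv=False)

def schmidt_number(joint):
    """Effective number of correlated modes, 1 / sum(lambda_k^2), where
    lambda_k are the squared Schmidt coefficients: 1 for independent
    variables, larger when the variables are correlated."""
    lam = schmidt_coefficients(joint) ** 2
    return 1.0 / np.sum(lam ** 2)
```

An independent joint distribution (an outer product of marginals) has a single Schmidt mode, while a perfectly correlated one spreads weight over several modes.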
Statistical initial orbit determination
Taff, L. G.; Belkin, B.; Schweiter, G. A.; Sommar, K. (D. H. Wagner Associates, Inc., Paoli, PA)
1992-02-01
For the ballistic missile initial orbit determination problem in particular, the concept of 'launch folders' is extended. This makes it possible to decouple the observational data from the initial orbit determination problem per se. The observational data are used only to select among the possible orbital element sets in the group of folders. Monte Carlo simulations using up to 7200 orbital element sets are described. The results are compared to the true orbital element set and to the one a good radar would have been able to produce if collocated with the optical sensor. The simplest version of the new method routinely outperforms the radar initial orbital element set by a factor of two in future miss distance. In addition, not only can a differentially corrected orbital element set be produced via this approach, after only two measurements of direction, but an updated, meaningful, six-dimensional covariance array for it can also be calculated. This technique represents a significant advance in initial orbit determination for this problem, and the concept can easily be extended to minor planets and artificial satellites. 9 refs.
Review: Zinc’s functional significance in the vertebrate retina
Chappell, Richard L.
2014-01-01
This review covers a broad range of topics related to the actions of zinc on the cells of the vertebrate retina. Much of this review relies on studies in which zinc was applied exogenously, and therefore the results, albeit highly suggestive, lack physiologic significance. This view stems from the fact that the concentrations of zinc used in these studies may not be encountered under the normal circumstances of life. This caveat is due to the lack of a zinc-specific probe with which to measure the concentrations of Zn2+ that may be released from neurons or act upon them. However, a great deal of relevant information has been garnered from studies in which Zn2+ was chelated, and the effects of its removal compared with findings obtained in its presence. For a more complete discussion of the consequences of depletion or excess in the body’s trace elements, the reader is referred to a recent review by Ugarte et al. in which they provide a detailed account of the interactions, toxicity, and metabolic activity of the essential trace elements iron, zinc, and copper in retinal physiology and disease. In addition, Smart et al. have published a splendid review on the modulation by zinc of inhibitory and excitatory amino acid receptor ion channels. PMID:25324679
Rock Statistics at the Mars Pathfinder Landing Site, Roughness and Roving on Mars
NASA Technical Reports Server (NTRS)
Haldemann, A. F. C.; Bridges, N. T.; Anderson, R. C.; Golombek, M. P.
1999-01-01
Several rock counts have been carried out at the Mars Pathfinder landing site producing consistent statistics of rock coverage and size-frequency distributions. These rock statistics provide a primary element of "ground truth" for anchoring remote sensing information used to pick the Pathfinder, and future, landing sites. The observed rock population statistics should also be consistent with the emplacement and alteration processes postulated to govern the landing site landscape. The rock population databases can however be used in ways that go beyond the calculation of cumulative number and cumulative area distributions versus rock diameter and height. Since the spatial parameters measured to characterize each rock are determined with stereo image pairs, the rock database serves as a subset of the full landing site digital terrain model (DTM). Insofar as a rock count can be carried out in a speedier, albeit coarser, manner than the full DTM analysis, rock counting offers several operational and scientific products in the near term. Quantitative rock mapping adds further information to the geomorphic study of the landing site, and can also be used for rover traverse planning. Statistical analysis of the surface roughness using the rock count proxy DTM is sufficiently accurate when compared to the full DTM to compare with radar remote sensing roughness measures, and with rover traverse profiles.
Statistical label fusion with hierarchical performance models
NASA Astrophysics Data System (ADS)
Asman, Andrew J.; Dagley, Alexander S.; Landman, Bennett A.
2014-03-01
Label fusion is a critical step in many image segmentation frameworks (e.g., multi-atlas segmentation) as it provides a mechanism for generalizing a collection of labeled examples into a single estimate of the underlying segmentation. In the multi-label case, typical label fusion algorithms treat all labels equally - fully neglecting the known, yet complex, anatomical relationships exhibited in the data. To address this problem, we propose a generalized statistical fusion framework using hierarchical models of rater performance. Building on the seminal work in statistical fusion, we reformulate the traditional rater performance model from a multi-tiered hierarchical perspective. This new approach provides a natural framework for leveraging known anatomical relationships and accurately modeling the types of errors that raters (or atlases) make within a hierarchically consistent formulation. Herein, we describe several contributions. First, we derive a theoretical advancement to the statistical fusion framework that enables the simultaneous estimation of multiple (hierarchical) performance models within the statistical fusion context. Second, we demonstrate that the proposed hierarchical formulation is highly amenable to the state-of-the-art advancements that have been made to the statistical fusion framework. Lastly, in an empirical whole-brain segmentation task we demonstrate substantial qualitative and significant quantitative improvement in overall segmentation accuracy.
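The rater-performance idea underlying statistical fusion can be sketched, in its flat (non-hierarchical) form, as a naive-Bayes combination of votes with per-rater confusion matrices: essentially the E-step of a STAPLE-style model, not the authors' hierarchical formulation. Names are illustrative.

```python
import numpy as np

def fuse_labels(votes, confusions, prior):
    """Posterior over the true label at one voxel, given each rater's vote
    and a per-rater confusion matrix theta[observed, true] (naive-Bayes
    fusion; the E-step of a STAPLE-style statistical fusion model)."""
    post = np.asarray(prior, dtype=float).copy()
    for vote, theta in zip(votes, confusions):
        post *= np.asarray(theta)[vote]  # P(rater says `vote` | each true label)
    return post / post.sum()
```

The hierarchical extension described in the paper would replace the flat confusion matrix with tiered performance models that respect anatomical label relationships.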
Education Statistics Quarterly, Fall 2000.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2000-01-01
The "Education Statistics Quarterly" gives a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released during a 3-month period. Each message also contains a…
Zemstvo Statistics on Public Education.
ERIC Educational Resources Information Center
Abramov, V. F.
1997-01-01
Surveys the general organizational principles and forms of keeping the zemstvo (regional) statistics on Russian public education. Conveys that they were subdivided into three types: (1) the current statistics that continuously monitored schools; (2) basic surveys that provided a comprehensive characterization of a given territory's public…
Representational Versatility in Learning Statistics
ERIC Educational Resources Information Center
Graham, Alan T.; Thomas, Michael O. J.
2005-01-01
Statistical data can be represented in a number of qualitatively different ways, the choice depending on the following three conditions: the concepts to be investigated; the nature of the data; and the purpose for which they were collected. This paper begins by setting out frameworks that describe the nature of statistical thinking in schools, and…
Modern Statistical Methods for Astronomy
NASA Astrophysics Data System (ADS)
Feigelson, Eric D.; Babu, G. Jogesh
2012-07-01
1. Introduction; 2. Probability; 3. Statistical inference; 4. Probability distribution functions; 5. Nonparametric statistics; 6. Density estimation or data smoothing; 7. Regression; 8. Multivariate analysis; 9. Clustering, classification and data mining; 10. Nondetections: censored and truncated data; 11. Time series analysis; 12. Spatial point processes; Appendices; Index.
Digest of Education Statistics, 1998.
ERIC Educational Resources Information Center
Snyder, Thomas D.; Hoffman, Charlene M.; Geddes, Claire M.
This 1998 edition of the "Digest of Education Statistics" is the 34th in a series of publications initiated in 1962. Its primary purpose is to provide a compilation of statistical information covering the broad field of American education from kindergarten through graduate school. The digest includes data from many government and private…
Explorations in Statistics: Confidence Intervals
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2009-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This third installment of "Explorations in Statistics" investigates confidence intervals. A confidence interval is a range that we expect, with some level of confidence, to include the true value of a population parameter…
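The interval described above can be illustrated with a minimal large-sample sketch; the z ≈ 1.96 normal critical value and the toy data are illustrative assumptions, not taken from the article:

```python
import math
import statistics

def mean_ci(sample, z=1.96):
    """Large-sample confidence interval for a population mean:
    mean +/- z * (sample standard deviation / sqrt(n))."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (m - z * se, m + z * se)

data = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
lo, hi = mean_ci(data)
print(round(lo, 2), round(hi, 2))  # 4.86 5.14
```

For small samples one would normally use a t critical value rather than 1.96; the normal value keeps the sketch dependency-free.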
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Bosch, Stephen; Ink, Gary; Lofquist, William S.
1998-01-01
Provides data on prices of U.S. and foreign materials; book title output and average prices, 1996 final and 1997 preliminary figures; book sales statistics, 1997--AAP preliminary estimates; U.S. trade in books, 1997; international book title output, 1990-95; book review media statistics; and number of book outlets in the U.S. and Canada. (PEN)
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Sullivan, Sharon G.; Ink, Gary; Grabois, Andrew; Barr, Catherine
2001-01-01
Includes six articles that discuss research and statistics relating to the book trade. Topics include prices of U.S. and foreign materials; book title output and average prices; book sales statistics; book exports and imports; book outlets in the U.S. and Canada; and books and other media reviewed. (LRW)
Canadian Statistics in the Classroom.
ERIC Educational Resources Information Center
School Libraries in Canada, 2002
2002-01-01
Includes 22 articles that address the use of Canadian statistics in the classroom. Highlights include the Statistics Canada Web site; other Web resources; original sources; critical thinking; debating with talented and gifted students; teaching marketing; environmental resources; data management; social issues and values; math instruction; reading…
Statistical Factors in Complexation Reactions.
ERIC Educational Resources Information Center
Chung, Chung-Sun
1985-01-01
Four cases which illustrate statistical factors in complexation reactions (where two of the reactants are monodentate ligands) are presented. Included are tables showing statistical factors for the reactions of: (1) square-planar complexes; (2) tetrahedral complexes; and (3) octahedral complexes. (JN)
SOCR: Statistics Online Computational Resource
ERIC Educational Resources Information Center
Dinov, Ivo D.
2006-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an…
Statistical Methods in Psychology Journals.
ERIC Educational Resources Information Center
Wilkinson, Leland
1999-01-01
Proposes guidelines for revising the American Psychological Association (APA) publication manual or other APA materials to clarify the application of statistics in research reports. The guidelines are intended to induce authors and editors to recognize the thoughtless application of statistical methods. Contains 54 references. (SLD)
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Alexander, Adrian W.; And Others
1994-01-01
The six articles in this section examine prices of U.S. and foreign materials; book title output and average prices; book sales statistics; U.S. book exports and imports; number of book outlets in the United States and Canada; and book review media statistics. (LRW)
Design of order statistics filters using feedforward neural networks
NASA Astrophysics Data System (ADS)
Maslennikova, Yu. S.; Bochkarev, V. V.
2016-08-01
In recent years, significant progress has been made in the development of nonlinear data processing techniques. Such techniques are widely used in digital data filtering and image enhancement. Many of the most effective nonlinear filters are based on order statistics. The widely used median filter is the best-known order statistic filter. A generalized form of these filters can be presented based on Lloyd's statistics. Filters based on order statistics have excellent robustness properties in the presence of impulsive noise. In this paper, we present a special approach for the synthesis of order statistics filters using artificial neural networks. Optimal Lloyd's statistics are used for selecting the initial weights of the neural network. The adaptive properties of neural networks provide opportunities to optimize order statistics filters for data with asymmetric distribution functions. Different examples demonstrate the properties and performance of the presented approach.
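The median filter named above is the simplest order-statistic filter. A minimal sketch (window size and edge handling are my own choices; the paper's neural-network synthesis is not reproduced here):

```python
def median_filter(signal, window=3):
    """Order-statistic (median) filter: replace each sample with the
    median of its window; edges are handled by clipping the window."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        w = sorted(signal[max(0, i - half):i + half + 1])
        out.append(w[len(w) // 2])
    return out

# The impulsive spike at index 3 is suppressed while the ramp survives,
# illustrating the robustness property the abstract mentions.
print(median_filter([1, 2, 3, 100, 5, 6, 7]))  # [2, 2, 3, 5, 6, 6, 7]
```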
Impaired Statistical Learning in Developmental Dyslexia
Thiessen, Erik D.; Holt, Lori L.
2015-01-01
Purpose Developmental dyslexia (DD) is commonly thought to arise from phonological impairments. However, an emerging perspective is that a more general procedural learning deficit, not specific to phonological processing, may underlie DD. The current study examined if individuals with DD are capable of extracting statistical regularities across sequences of passively experienced speech and nonspeech sounds. Such statistical learning is believed to be domain-general, to draw upon procedural learning systems, and to relate to language outcomes. Method DD and control groups were familiarized with a continuous stream of syllables or sine-wave tones, the ordering of which was defined by high or low transitional probabilities across adjacent stimulus pairs. Participants subsequently judged two 3-stimulus test items with either high or low statistical coherence as being the most similar to the sounds heard during familiarization. Results As with control participants, the DD group was sensitive to the transitional probability structure of the familiarization materials as evidenced by above-chance performance. However, the performance of participants with DD was significantly poorer than controls across linguistic and nonlinguistic stimuli. In addition, reading-related measures were significantly correlated with statistical learning performance of both speech and nonspeech material. Conclusion Results are discussed in light of procedural learning impairments among participants with DD. PMID:25860795
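The transitional probabilities that defined the familiarization streams can be estimated from adjacent stimulus pairs. A toy sketch (the syllable names and stream are invented for illustration):

```python
from collections import Counter

def transitional_probabilities(stream):
    """Estimate P(next | current) from adjacent pairs in a stream."""
    pairs = Counter(zip(stream, stream[1:]))
    firsts = Counter(stream[:-1])
    return {(a, b): c / firsts[a] for (a, b), c in pairs.items()}

# Toy stream: 'ba' is always followed by 'bi' (TP = 1.0), while 'bi' is
# followed by 'bu' twice and by 'ba' once (TPs 2/3 and 1/3).
stream = ['ba', 'bi', 'ba', 'bi', 'bu', 'ba', 'bi', 'bu']
tp = transitional_probabilities(stream)
print(tp[('ba', 'bi')], round(tp[('bi', 'bu')], 2))  # 1.0 0.67
```

High- versus low-TP test items in such studies are sequences whose adjacent pairs have high versus low values in this table.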
Significant warming of the Antarctic winter troposphere.
Turner, J; Lachlan-Cope, T A; Colwell, S; Marshall, G J; Connolley, W M
2006-03-31
We report an undocumented major warming of the Antarctic winter troposphere that is larger than any previously identified regional tropospheric warming on Earth. This result has come to light through an analysis of recently digitized and rigorously quality controlled Antarctic radiosonde observations. The data show that regional midtropospheric temperatures have increased at a statistically significant rate of 0.5 degrees to 0.7 degrees Celsius per decade over the past 30 years. Analysis of the time series of radiosonde temperatures indicates that the data are temporally homogeneous. The available data do not allow us to unambiguously assign a cause to the tropospheric warming at this stage.
Characterizations of linear sufficient statistics
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Reoner, R.; Decell, H. P., Jr.
1977-01-01
A surjective bounded linear operator T from a Banach space X to a Banach space Y must be a sufficient statistic for a dominated family of probability measures defined on the Borel sets of X. These results are applied to characterize linear sufficient statistics for families of the exponential type, including as special cases the Wishart and multivariate normal distributions. The latter result is used to establish precisely which procedures for sampling from a normal population have the property that the sample mean is a sufficient statistic.
Indigenous family violence: a statistical challenge.
Cripps, Kyllie
2008-12-01
The issue of family violence and sexual abuse in Indigenous communities across Australia has attracted much attention throughout 2007, including significant intervention by the federal government into communities deemed to be in crisis. This paper critically examines the reporting and recording of Indigenous violence in Australia and reflects on what 'statistics' can offer as we grapple with how to respond appropriately to a problem defined as a 'national emergency'.
The faulty statistics of complementary alternative medicine (CAM).
Pandolfi, Maurizio; Carreras, Giulia
2014-09-01
The authors illustrate the difficulties involved in obtaining a valid statistical significance in clinical studies especially when the prior probability of the hypothesis under scrutiny is low. Since the prior probability of a research hypothesis is directly related to its scientific plausibility, the commonly used frequentist statistics, which does not take into account this probability, is particularly unsuitable for studies exploring matters in various degree disconnected from science such as complementary alternative medicine (CAM) interventions. Any statistical significance obtained in this field should be considered with great caution and may be better applied to more plausible hypotheses (like placebo effect) than that examined - which usually is the specific efficacy of the intervention. Since achieving meaningful statistical significance is an essential step in the validation of medical interventions, CAM practices, producing only outcomes inherently resistant to statistical validation, appear not to belong to modern evidence-based medicine.
The Statistical Basis of Chemical Equilibria.
ERIC Educational Resources Information Center
Hauptmann, Siegfried; Menger, Eva
1978-01-01
Describes a machine which demonstrates the statistical bases of chemical equilibrium, and in doing so conveys insight into the connections among statistical mechanics, quantum mechanics, Maxwell Boltzmann statistics, statistical thermodynamics, and transition state theory. (GA)
Spina Bifida Data and Statistics
Statistical ecology comes of age.
Gimenez, Olivier; Buckland, Stephen T; Morgan, Byron J T; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric
2014-12-01
The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1-4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data.
Middle atmosphere general circulation statistics
NASA Technical Reports Server (NTRS)
Geller, M. A.
1985-01-01
With the increased availability of remote sensing data for the middle atmosphere from satellites, more analyses of the middle atmosphere circulation are being published. Some of these are process studies for limited periods, and some are statistical analyses of middle atmosphere general circulation statistics. Results from the latter class of studies will be reviewed. These include analysis of the zonally averaged middle atmosphere structure, temperature, and zonal winds; analysis of planetary wave structures; analysis of heat and momentum fluxes; and analysis of Eliassen-Palm flux vectors and flux divergences. Emphasis is on the annual march of these quantities; Northern and Southern Hemisphere asymmetries; and interannual variability in these statistics. Statistics involving the global ozone distribution and transports of ozone are also discussed.
Summary statistics in auditory perception.
McDermott, Josh H; Schemitsch, Michael; Simoncelli, Eero P
2013-04-01
Sensory signals are transduced at high resolution, but their structure must be stored in a more compact format. Here we provide evidence that the auditory system summarizes the temporal details of sounds using time-averaged statistics. We measured discrimination of 'sound textures' that were characterized by particular statistical properties, as normally result from the superposition of many acoustic features in auditory scenes. When listeners discriminated examples of different textures, performance improved with excerpt duration. In contrast, when listeners discriminated different examples of the same texture, performance declined with duration, a paradoxical result given that the information available for discrimination grows with duration. These results indicate that once these sounds are of moderate length, the brain's representation is limited to time-averaged statistics, which, for different examples of the same texture, converge to the same values with increasing duration. Such statistical representations produce good categorical discrimination, but limit the ability to discern temporal detail.
National Center for Health Statistics
FUNSTAT and statistical image representations
NASA Technical Reports Server (NTRS)
Parzen, E.
1983-01-01
General ideas of functional statistical inference are outlined for one-sample and two-sample analyses, both univariate and bivariate. The ONESAM program is applied to analyze the univariate probability distributions of multi-spectral image data.
Heart Disease and Stroke Statistics
Faculty Salary Equity Cases: Combining Statistics with the Law
ERIC Educational Resources Information Center
Luna, Andrew L.
2006-01-01
Researchers have used many statistical models to determine whether an institution's faculty pay structure is equitable, with varying degrees of success. Little attention, however, has been given to court interpretations of statistical significance or to what variables courts have acknowledged should be used in an equity model. This article…
The Effect Size Statistic: Overview of Various Choices.
ERIC Educational Resources Information Center
Mahadevan, Lakshmi
Over the years, methodologists have been recommending that researchers use magnitude of effect estimates in result interpretation to highlight the distinction between statistical and practical significance (cf. R. Kirk, 1996). A magnitude of effect statistic (i.e., effect size) tells to what degree the dependent variable can be controlled,…
Statistical Theory of Breakup Reactions
NASA Astrophysics Data System (ADS)
Bertulani, Carlos A.; Descouvemont, Pierre; Hussein, Mahir S.
2014-04-01
We propose an alternative to Coupled-Channels calculations with loosely bound exotic nuclei (CDCC), based on the Random Matrix Model of the statistical theory of nuclear reactions. The coupled-channels equations are divided into two sets: the first is described by the CDCC, and the other is treated with RMT. The resulting theory, a Statistical CDCC (CDCCs), is able in principle to take into account many pseudo channels.
Hidden Statistics of Schroedinger Equation
NASA Technical Reports Server (NTRS)
Zak, Michail
2011-01-01
Work was carried out to determine the mathematical origin of randomness in quantum mechanics and to create a hidden statistics of the Schrödinger equation; i.e., to expose the transitional stochastic process as a "bridge" to the quantum world. The governing equations of hidden statistics would preserve such properties of quantum physics as superposition, entanglement, and direct-product decomposability, while allowing one to measure its state variables using classical methods.
Sanitary Surveys & Significant Deficiencies Presentation
The Sanitary Surveys & Significant Deficiencies Presentation highlights some of the things EPA looks for during drinking water system site visits, how to avoid significant deficiencies and what to do if you receive one.
NASA Astrophysics Data System (ADS)
Gorzkowski, Waldemar
The following sections are included: National Physics Olympiads; Distribution of Prizes in Twenty International Physics Olympiads; Numbers of Prizes in Subsequent International Physics Olympiads; Problems and Their Marking.
Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science
Veldkamp, Coosje L. S.; Nuijten, Michèle B.; Dominguez-Alvarez, Linda; van Assen, Marcel A. L. M.; Wicherts, Jelte M.
2014-01-01
Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this ‘co-piloting’ currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors. PMID:25493918
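An automated consistency check of the kind described recomputes the p-value from the reported test statistic and degrees of freedom. Real tools handle t, F, and χ² tests; a minimal z-only sketch conveys the idea (the tolerance value is my own choice):

```python
import math

def p_from_z(z):
    """Two-sided p-value for a z statistic, from the normal CDF."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def consistent(z, reported_p, tol=0.005):
    """Flag whether a reported p-value agrees with its z statistic."""
    return abs(p_from_z(z) - reported_p) <= tol

print(consistent(1.96, 0.05))   # True: z = 1.96 corresponds to p ~ .05
print(consistent(1.96, 0.001))  # False: p inconsistent with z
```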
Statistical inference and Aristotle's Rhetoric.
Macdonald, Ranald R
2004-11-01
Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.
Fractional statistical potential in graphene
NASA Astrophysics Data System (ADS)
Ardenghi, J. S.
2017-03-01
In this work, fractional statistics is applied to an anyon gas in graphene to obtain the special features that an arbitrary phase interchange of the particle coordinates introduces in the thermodynamic properties. The electron gas is constituted by N anyons in the long-wavelength approximation obeying fractional exclusion statistics, and the partition function is analyzed in terms of a perturbation expansion up to first order in the dimensionless constant λ/L, where L is the length of the graphene sheet and λ = βℏvF is the thermal wavelength. By considering the correct permutation expansion of the many-anyon wavefunction, taking into account that the phase changes with the number of inversions in each permutation, the statistical fermionic/bosonic potential is obtained and the intermediate statistical behavior is found. It is shown that "extra" fermionic and bosonic particle states appear, and that this "statistical particle" distribution depends on N. The entropy and specific heat are obtained up to first order in λ/L, showing that the results differ from those obtained with other approximations to fractional exclusion statistics.
Dai, Wu-Sheng; Xie, Mi
2013-05-15
In this paper, we give a general discussion on the calculation of the statistical distribution from a given operator relation of creation, annihilation, and number operators. Our result shows that as long as the relation between the number operator and the creation and annihilation operators can be expressed as a†b = Λ(N) or N = Λ⁻¹(a†b), where N, a†, and b denote the number, creation, and annihilation operators, i.e., N is a function of a quadratic product of the creation and annihilation operators, the corresponding statistical distribution is the Gentile distribution, a statistical distribution in which the maximum occupation number is an arbitrary integer. As examples, we discuss the statistical distributions corresponding to various operator relations. In particular, besides the Bose-Einstein and Fermi-Dirac cases, we discuss the statistical distributions for various schemes of intermediate statistics, especially various q-deformation schemes. Our result shows that the statistical distributions corresponding to various q-deformation schemes are various Gentile distributions with different maximum occupation numbers, determined by the deformation parameter q. This shows that the results given in much of the literature on the q-deformation distribution are inaccurate or incomplete. Highlights: (1) a general discussion on calculating the statistical distribution from relations of creation, annihilation, and number operators; (2) a systematic study of the statistical distributions corresponding to various q-deformation schemes; (3) an argument that many results on q-deformation distributions in the literature are inaccurate or incomplete.
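The Gentile distribution named above has a standard closed form for the mean occupation number, which reduces to Fermi-Dirac at maximum occupation 1 and to Bose-Einstein as the maximum occupation grows. A numerical check (the notation x = β(ε − μ) and the variable names are mine, not the paper's):

```python
import math

def gentile(x, n_max):
    """Mean occupation number for Gentile statistics with maximum
    occupation n_max, at x = beta * (eps - mu) > 0."""
    return (1 / (math.exp(x) - 1)
            - (n_max + 1) / (math.exp((n_max + 1) * x) - 1))

x = 1.0
fermi_dirac = 1 / (math.exp(x) + 1)
bose_einstein = 1 / (math.exp(x) - 1)
print(abs(gentile(x, 1) - fermi_dirac) < 1e-12)     # True: n_max = 1
print(abs(gentile(x, 50) - bose_einstein) < 1e-12)  # True: large n_max
```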
Ergodic theorem, ergodic theory, and statistical mechanics
Moore, Calvin C.
2015-01-01
This perspective highlights the mean ergodic theorem established by John von Neumann and the pointwise ergodic theorem established by George Birkhoff, proofs of which were published nearly simultaneously in PNAS in 1931 and 1932. These theorems were of great significance both in mathematics and in statistical mechanics. In statistical mechanics they provided a key insight into a 60-y-old fundamental problem of the subject—namely, the rationale for the hypothesis that time averages can be set equal to phase averages. The evolution of this problem is traced from the origins of statistical mechanics and Boltzman's ergodic hypothesis to the Ehrenfests' quasi-ergodic hypothesis, and then to the ergodic theorems. We discuss communications between von Neumann and Birkhoff in the Fall of 1931 leading up to the publication of these papers and related issues of priority. These ergodic theorems initiated a new field of mathematical-research called ergodic theory that has thrived ever since, and we discuss some of recent developments in ergodic theory that are relevant for statistical mechanics. PMID:25691697
Statistical Sampling of Tide Heights Study
NASA Technical Reports Server (NTRS)
2002-01-01
The goal of the study was to determine if it was possible to reduce the cost of verifying computational models of tidal waves and currents. Statistical techniques were used to determine the least number of samples required, in a given situation, to remain statistically significant, and thereby reduce overall project costs. Commercial, academic, and Federal agencies could benefit by applying these techniques, without the need to 'touch' every item in the population. For example, the requirement of this project was to measure the heights and times of high and low tides at 8,000 locations for verification of computational models of tidal waves and currents. The application of the statistical techniques began with observations to determine the correctness of submitted measurement data, followed by some assumptions based on the observations. Among the assumptions were that the data were representative of data-collection techniques used at the measurement locations, that time measurements could be ignored (that is, height measurements alone would suffice), and that the height measurements were from a statistically normal distribution. Sample means and standard deviations were determined for all locations. Interval limits were determined for confidence levels of 95, 98, and 99 percent. It was found that the numbers of measurement locations needed to attain these confidence levels were 55, 78, and 96, respectively.
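The sample-size logic described above follows the usual normal-approximation formula n ≥ (zσ/E)². A sketch with illustrative σ and margin values; these are assumptions, not the study's parameters, so the printed counts are not the study's 55/78/96:

```python
import math

def required_n(z, sigma, margin):
    """Smallest n so a z-level confidence interval for a mean has
    half-width <= margin, under the normal approximation."""
    return math.ceil((z * sigma / margin) ** 2)

# Illustrative sigma and margin only (not from the study).
for z, level in [(1.96, 95), (2.33, 98), (2.58, 99)]:
    print(level, required_n(z, sigma=0.3, margin=0.08))
```

Halving the margin roughly quadruples the required sample size, which is why tightening the confidence requirement drives the counts up.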
The use of statistics in heart rhythm research: a review.
Shen, Changyu; Yu, Zhangsheng; Liu, Ziyue
2015-06-01
In this article, we provide a brief review of key statistical concepts/methods that are commonly used in heart rhythm research, including concepts such as standard deviation, standard error, confidence interval, statistical/clinical significance, correlation coefficients, multiple comparisons, cohort and case-control studies, and missing data, as well as methods such as statistical hypothesis testing, receiver operating characteristic curve, binary vs time-to-event outcome, competing risk methods, and analysis of correlated data. We also make recommendations on how related statistical procedures should be applied and results should be reported.
Significance testing as perverse probabilistic reasoning
2011-01-01
Truth claims in the medical literature rely heavily on statistical significance testing. Unfortunately, most physicians misunderstand the underlying probabilistic logic of significance tests and consequently often misinterpret their results. This near-universal misunderstanding is highlighted by means of a simple quiz which we administered to 246 physicians at two major academic hospitals, on which the proportion of incorrect responses exceeded 90%. A solid understanding of the fundamental concepts of probability theory is becoming essential to the rational interpretation of medical information. This essay provides a technically sound review of these concepts that is accessible to a medical audience. We also briefly review the debate in the cognitive sciences regarding physicians' aptitude for probabilistic inference. PMID:21356064
ERIC Educational Resources Information Center
Martins, Jose Alexandre; Nascimento, Maria Manuel; Estrada, Assumpta
2012-01-01
Teachers' attitudes towards statistics can have a significant effect on their own statistical training, their teaching of statistics, and the future attitudes of their students. The influence of attitudes in teaching statistics in different contexts was previously studied in the work of Estrada et al. (2004, 2010a, 2010b) and Martins et al.…
On More Sensitive Periodogram Statistics
NASA Astrophysics Data System (ADS)
Bélanger, G.
2016-05-01
Period searches in event data have traditionally used the Rayleigh statistic, R². For X-ray pulsars, the standard has been the Z² statistic, which sums over more than one harmonic. For γ-rays, the H-test, which optimizes the number of harmonics to sum, is often used. These periodograms all suffer from the same problem, namely artifacts caused by correlations in the Fourier components that arise from testing frequencies with a non-integer number of cycles. This article addresses this problem. The modified Rayleigh statistic is discussed, its generalization to any harmonic, ℛₖ², is formulated, and from the latter the modified Z² statistic, 𝒵², is constructed. Versions of these statistics for binned data and point measurements are derived, and it is shown that the variance in the uncertainties can have an important influence on the periodogram. It is shown how to combine the information about the signal frequency from the different harmonics to estimate its value with maximum accuracy. The methods are applied to an XMM-Newton observation of the Crab pulsar, for which a decomposition of the pulse profile is presented, showing that most of the power is in the second, third, and fifth harmonics. The statistical detection power of the ℛₖ² statistic is superior to the FFT and equivalent to the Lomb-Scargle (LS) periodogram. The response to gaps in the data is assessed, and it is shown that the LS does not protect against the distortions they cause. The main conclusion of this work is that the classical R² and Z² should be replaced by ℛₖ² and 𝒵² in all applications with event data, and the LS should be replaced by the ℛₖ² when the uncertainty varies from one point measurement to another.
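The classical Rayleigh statistic named above is R² = (2/N)|Σⱼ exp(2πi f tⱼ)|² over the N event times. A minimal sketch on simulated event times (the simulation parameters are invented; the article's modified statistics are not reproduced here):

```python
import math
import random

def rayleigh(times, freq):
    """Classical Rayleigh power at a trial frequency for event times:
    R^2 = (2/N) * |sum_j exp(2*pi*i*freq*t_j)|^2."""
    phases = [2 * math.pi * freq * t for t in times]
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    return 2 * (c * c + s * s) / len(times)

random.seed(0)
pulsed = [i / 2 + 0.02 * random.random() for i in range(200)]  # 2 Hz pulses
flat = [100 * random.random() for _ in range(200)]             # unpulsed
print(rayleigh(pulsed, 2.0) > rayleigh(flat, 2.0))  # True
```

Under the null (no signal), R² follows an approximate χ² distribution with 2 degrees of freedom, which is what makes it usable as a detection statistic.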
Statistical algorithms for ontology-based annotation of scientific literature
2014-01-01
Background Ontologies encode relationships within a domain in robust data structures that can be used to annotate data objects, including scientific papers, in ways that ease tasks such as search and meta-analysis. However, the annotation process requires significant time and effort when performed by humans. Text mining algorithms can facilitate this process, but they render an analysis mainly based upon keyword, synonym and semantic matching. They do not leverage information embedded in an ontology's structure. Methods We present a probabilistic framework that facilitates the automatic annotation of literature by indirectly modeling the restrictions among the different classes in the ontology. Our research focuses on annotating human functional neuroimaging literature within the Cognitive Paradigm Ontology (CogPO). We use an approach that combines the stochastic simplicity of naïve Bayes with the formal transparency of decision trees. Our data structure is easily modifiable to reflect changing domain knowledge. Results We compare our results across naïve Bayes, Bayesian Decision Trees, and Constrained Decision Tree classifiers that keep a human expert in the loop, in terms of the quality measure of the F1-micro score. Conclusions Unlike traditional text mining algorithms, our framework can model the knowledge encoded by the dependencies in an ontology, albeit indirectly. We successfully exploit the fact that CogPO has explicitly stated restrictions, and implicit dependencies in the form of patterns in the expert curated annotations. PMID:25093071
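As a rough illustration of the naïve Bayes side of this approach only (the ontology-constraint modeling is not reproduced), here is a from-scratch multinomial classifier that assigns one label per abstract. The labels, training snippets, and vocabulary are invented for the sketch, not drawn from CogPO or the paper's data.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (label, text). Returns log-priors and per-label word log-likelihoods."""
    label_counts = Counter(label for label, _ in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        words = text.lower().split()
        word_counts[label].update(words)
        vocab.update(words)
    priors = {l: math.log(c / len(docs)) for l, c in label_counts.items()}
    likelihoods = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        # Laplace smoothing over the shared vocabulary.
        likelihoods[label] = {w: math.log((counts[w] + 1) / (total + len(vocab)))
                              for w in vocab}
    return priors, likelihoods, vocab

def classify(text, priors, likelihoods, vocab):
    scores = {}
    for label in priors:
        score = priors[label]
        for w in text.lower().split():
            if w in vocab:
                score += likelihoods[label][w]
        scores[label] = score
    return max(scores, key=scores.get)

# Invented paradigm labels and training snippets, for illustration only.
training = [
    ("visual_stimulus", "subjects viewed flashing checkerboard images on a screen"),
    ("visual_stimulus", "pictures of faces were shown to participants"),
    ("auditory_stimulus", "tones were played through headphones to subjects"),
    ("auditory_stimulus", "participants listened to spoken words and tones"),
]
priors, likelihoods, vocab = train(training)
print(classify("participants viewed images of faces", priors, likelihoods, vocab))
```

The paper's contribution is to go beyond this keyword-level matching by encoding the ontology's class restrictions; the sketch shows only the probabilistic baseline being combined with that structure.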
Environmental Health Practice: Statistically Based Performance Measurement
Enander, Richard T.; Gagnon, Ronald N.; Hanumara, R. Choudary; Park, Eugene; Armstrong, Thomas; Gute, David M.
2007-01-01
Objectives. State environmental and health protection agencies have traditionally relied on a facility-by-facility inspection-enforcement paradigm to achieve compliance with government regulations. We evaluated the effectiveness of a new approach that uses a self-certification random sampling design. Methods. Comprehensive environmental and occupational health data from a 3-year statewide industry self-certification initiative were collected from representative automotive refinishing facilities located in Rhode Island. Statistical comparisons between baseline and postintervention data facilitated a quantitative evaluation of statewide performance. Results. The analysis of field data collected from 82 randomly selected automotive refinishing facilities showed statistically significant improvements (P<.05, Fisher exact test) in 4 major performance categories: occupational health and safety, air pollution control, hazardous waste management, and wastewater discharge. Statistical significance was also shown when a modified Bonferroni adjustment for multiple comparisons was performed. Conclusions. Our findings suggest that the new self-certification approach to environmental and worker protection is effective and can be used as an adjunct to further enhance state and federal enforcement programs. PMID:17267709
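The style of analysis reported above can be sketched with a from-scratch Fisher exact test and the Bonferroni threshold over the four performance categories. All counts below are hypothetical stand-ins, not the study's data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def p_table(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables no more probable than the observed one.
    return sum(p_table(x) for x in range(lo, hi + 1) if p_table(x) <= p_obs + 1e-12)

# Hypothetical counts: facilities in / out of compliance, baseline vs post-intervention.
p = fisher_exact_two_sided(30, 52, 60, 22)
print(p < 0.05)      # significant at the unadjusted level
print(p < 0.05 / 4)  # survives a Bonferroni correction over 4 categories
```

The study used a modified Bonferroni adjustment; the plain 0.05/4 threshold here is the simplest (most conservative) version of that idea.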
FRB repetition and non-Poissonian statistics
NASA Astrophysics Data System (ADS)
Connor, Liam; Pen, Ue-Li; Oppermann, Niels
2016-05-01
We discuss some of the claims that have been made regarding the statistics of fast radio bursts (FRBs). In an earlier Letter, we conjectured that flicker noise associated with FRB repetition could show up in non-cataclysmic neutron star emission models, like supergiant pulses. We show how the current limits on repetition would be significantly weakened if the repeat rate really were non-Poissonian with a pink or red spectrum. Repetition and its statistics have implications for observing strategy, generally favouring shallow wide-field surveys, since in the non-repeating scenario survey depth is unimportant. We also discuss the statistics of the apparent latitudinal dependence of FRBs and offer a simple method for calculating the significance of this effect. We provide a generalized Bayesian framework for addressing this problem, which allows for direct model comparison. It is shown that the evidence for a steep latitudinal gradient of the FRB rate is less strong than initially suggested, and that simple explanations like increased scattering and sky temperature in the plane are sufficient to decrease the low-latitude burst rate, given current data. The reported dearth of bursts near the plane is further complicated if FRBs have non-Poissonian repetition, since in that case the event rate inferred from observation depends on observing strategy.
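The clustering idea can be illustrated with a toy simulation, using Weibull inter-arrival times as a generic stand-in for a clustered, non-Poissonian repeater (the shape and rate values are illustrative, not the paper's flicker-noise model): clustered arrivals make burst counts per observing window far more variable than Poisson, which is why inferred rates depend on observing strategy.

```python
import math
import random

def counts_per_window(shape, mean_rate, window, n_windows, rng):
    """Events per observing window for a Weibull renewal process.
    shape = 1 is Poisson; shape < 1 gives clustered ("bursty") arrivals."""
    # Scale chosen so the mean inter-arrival time is 1/mean_rate.
    scale = (1.0 / mean_rate) / math.gamma(1.0 + 1.0 / shape)
    counts = []
    for _ in range(n_windows):
        t, n = 0.0, 0
        while True:
            t += rng.weibullvariate(scale, shape)
            if t > window:
                break
            n += 1
        counts.append(n)
    return counts

def fano(counts):
    """Variance-to-mean ratio: 1 for a Poisson process, > 1 for clustering."""
    m = sum(counts) / len(counts)
    return sum((c - m) ** 2 for c in counts) / len(counts) / m

rng = random.Random(7)
poisson = counts_per_window(1.0, 1.0, 50.0, 400, rng)
clustered = counts_per_window(0.5, 1.0, 50.0, 400, rng)
print(fano(poisson))    # near 1
print(fano(clustered))  # well above 1: counts fluctuate far more than Poisson
```

Both processes have the same mean rate, yet the clustered one piles events into some windows and leaves others sparse, so a repeat-rate limit derived under a Poisson assumption overstates the constraint.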
Statistical methods in translational medicine.
Chow, Shein-Chung; Tse, Siu-Keung; Lin, Min
2008-12-01
This study focuses on strategies and statistical considerations for assessment of translation in language (e.g. translation of case report forms in multinational clinical trials), information (e.g. translation of basic discoveries to the clinic) and technology (e.g. translation of Chinese diagnostic techniques to well-established clinical study endpoints) in pharmaceutical/clinical research and development. However, most of our efforts will be directed to statistical considerations for translation in information. Translational medicine has been defined as bench-to-bedside research, where a basic laboratory discovery becomes applicable to the diagnosis, treatment or prevention of a specific disease, and is brought forth by either a physician-scientist who works at the interface between the research laboratory and patient care, or by a team of basic and clinical science investigators. Statistics plays an important role in translational medicine to ensure that the translational process is accurate and reliable with certain statistical assurance. Statistical inference for the applicability of an animal model to a human model is also discussed. Strategies for selection of clinical study endpoints (e.g. absolute changes, relative changes, or responder-defined, based on either absolute or relative change) are reviewed.
Integrable matrix theory: Level statistics.
Scaramazza, Jasen A; Shastry, B Sriram; Yuzbashyan, Emil A
2016-09-01
We study level statistics in ensembles of integrable N×N matrices linear in a real parameter x. The matrix H(x) is considered integrable if it has a prescribed number n>1 of linearly independent commuting partners H^{i}(x) (integrals of motion) [H(x),H^{i}(x)]=0, [H^{i}(x),H^{j}(x)]=0, for all x. In a recent work [Phys. Rev. E 93, 052114 (2016); doi:10.1103/PhysRevE.93.052114], we developed a basis-independent construction of H(x) for any n from which we derived the probability density function, thereby determining how to choose a typical integrable matrix from the ensemble. Here, we find that typical integrable matrices have Poisson statistics in the N→∞ limit provided n scales at least as logN; otherwise, they exhibit level repulsion. Exceptions to the Poisson case occur at isolated coupling values x=x_{0} or when correlations are introduced between typically independent matrix parameters. However, level statistics cross over to Poisson at O(N^{-0.5}) deviations from these exceptions, indicating that non-Poissonian statistics characterize only subsets of measure zero in the parameter space. Furthermore, we present strong numerical evidence that ensembles of integrable matrices are stationary and ergodic with respect to nearest-neighbor level statistics.
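The Poisson end of this Poisson-vs-repulsion dichotomy is easy to check numerically with the consecutive-gap-ratio statistic (a standard spectral diagnostic, not this paper's own construction): for uncorrelated levels, the mean of min(r, 1/r) over consecutive gap ratios tends to 2 ln 2 − 1 ≈ 0.386, whereas level repulsion pushes it up toward ≈ 0.53. Here iid uniform "levels" stand in for an uncorrelated (Poisson) spectrum.

```python
import math
import random

def mean_gap_ratio(levels):
    """Mean of min(s_i, s_{i+1}) / max(s_i, s_{i+1}) over consecutive level gaps."""
    levels = sorted(levels)
    gaps = [b - a for a, b in zip(levels, levels[1:])]
    ratios = [min(s, t) / max(s, t) for s, t in zip(gaps, gaps[1:])]
    return sum(ratios) / len(ratios)

rng = random.Random(0)
levels = [rng.random() for _ in range(20000)]  # uncorrelated stand-in spectrum
r = mean_gap_ratio(levels)
print(round(r, 3))                    # close to the Poisson value
print(round(2 * math.log(2) - 1, 3))  # Poisson prediction: 0.386
```

The gap-ratio form is convenient because, unlike raw spacing histograms, it needs no unfolding of the spectrum's local density.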
Equivalent statistics and data interpretation.
Francis, Gregory
2016-10-14
Recent reform efforts in psychological science have led to a plethora of choices for scientists to analyze their data. A scientist making an inference about their data must now decide whether to report a p value, summarize the data with a standardized effect size and its confidence interval, report a Bayes Factor, or use other model comparison methods. To make good choices among these options, it is necessary for researchers to understand the characteristics of the various statistics used by the different analysis frameworks. Toward that end, this paper makes two contributions. First, it shows that for the case of a two-sample t test with known sample sizes, many different summary statistics are mathematically equivalent in the sense that they are based on the very same information in the data set. When the sample sizes are known, the p value provides as much information about a data set as the confidence interval of Cohen's d or a JZS Bayes factor. Second, this equivalence means that different analysis methods differ only in their interpretation of the empirical data. At first glance, it might seem that mathematical equivalence of the statistics suggests that it does not matter much which statistic is reported, but the opposite is true because the appropriateness of a reported statistic is relative to the inference it promotes. Accordingly, scientists should choose an analysis method appropriate for their scientific investigation. A direct comparison of the different inferential frameworks provides some guidance for scientists to make good choices and improve scientific practice.
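The equivalence claim can be made concrete for the two-sample t test: with the sample sizes known, Cohen's d is a deterministic, invertible transform of t, namely d = t·√(1/n₁ + 1/n₂), so the two statistics carry identical information about the data. A minimal sketch with made-up samples:

```python
import math
import statistics

def pooled_sd(x, y):
    """Pooled standard deviation for two independent samples."""
    n1, n2 = len(x), len(y)
    sp2 = (((n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y))
           / (n1 + n2 - 2))
    return math.sqrt(sp2)

def two_sample_t(x, y):
    n1, n2 = len(x), len(y)
    return ((statistics.mean(x) - statistics.mean(y))
            / (pooled_sd(x, y) * math.sqrt(1 / n1 + 1 / n2)))

def cohens_d(x, y):
    return (statistics.mean(x) - statistics.mean(y)) / pooled_sd(x, y)

# Invented data, for illustration only.
x = [5.1, 4.8, 6.0, 5.5, 5.9, 5.2]
y = [4.4, 4.9, 4.1, 4.7, 4.5, 4.2]
t = two_sample_t(x, y)
d = cohens_d(x, y)
# The two statistics are exact transforms of each other given n1 and n2:
print(abs(d - t * math.sqrt(1 / len(x) + 1 / len(y))) < 1e-12)
```

The p value and the JZS Bayes factor discussed in the paper are likewise monotone functions of t for fixed sample sizes, which is the sense in which the frameworks differ only in interpretation, not in information.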
40 CFR Appendix IV to Part 265 - Tests for Significance
Code of Federal Regulations, 2012 CFR
2012-07-01
... changes in the concentration or value of an indicator parameter in periodic ground-water samples when... then be compared to the value of the t-statistic found in a table for t-test of significance at the specified level of significance. A calculated value of t which exceeds the value of t found in the...
40 CFR Appendix IV to Part 265 - Tests for Significance
Code of Federal Regulations, 2010 CFR
2010-07-01
... changes in the concentration or value of an indicator parameter in periodic ground-water samples when... then be compared to the value of the t-statistic found in a table for t-test of significance at the specified level of significance. A calculated value of t which exceeds the value of t found in the...
Ethical Statistics and Statistical Ethics: Making an Interdisciplinary Module
ERIC Educational Resources Information Center
Lesser, Lawrence M.; Nordenhaug, Erik
2004-01-01
This article describes an innovative curriculum module the first author created on the two-way exchange between statistics and applied ethics. The module, having no particular mathematical prerequisites beyond high school algebra, is part of an undergraduate interdisciplinary ethics course which begins with a 3-week introduction to basic applied…
Statistics for People Who (Think They) Hate Statistics. Third Edition
ERIC Educational Resources Information Center
Salkind, Neil J.
2007-01-01
This text teaches an often intimidating and difficult subject in a way that is informative, personable, and clear. The author takes students through various statistical procedures, beginning with correlation and graphical representation of data and ending with inferential techniques and analysis of variance. In addition, the text covers SPSS, and…
Writing to Learn Statistics in an Advanced Placement Statistics Course
ERIC Educational Resources Information Center
Northrup, Christian Glenn
2012-01-01
This study investigated the use of writing in a statistics classroom to learn if writing provided a rich description of problem-solving processes of students as they solved problems. Through analysis of 329 written samples provided by students, it was determined that writing provided a rich description of problem-solving processes and enabled…
Noise level and MPEG-2 encoder statistics
NASA Astrophysics Data System (ADS)
Lee, Jungwoo
1997-01-01
Most source material in the movie and broadcasting industries is still in analog film or tape format, which typically contains random noise originating from film, CCD cameras, and tape recording. The performance of the MPEG-2 encoder may be significantly degraded by this noise. It is also affected by the scene type, which includes spatial and temporal activity. The statistical properties of noise originating from the camera and the tape player are analyzed, and models for the two types of noise are developed. The relationship between the noise, the scene type, and encoder statistics for a number of MPEG-2 parameters, such as motion vector magnitude, prediction error, and quantization scale, is discussed. This analysis is intended to be a tool for designing robust MPEG encoding algorithms, such as preprocessing and rate control.
Quantum statistical mechanics in arithmetic topology
NASA Astrophysics Data System (ADS)
Marcolli, Matilde; Xu, Yujie
2017-04-01
This paper provides a construction of a quantum statistical mechanical system associated to knots in the 3-sphere and cyclic branched coverings of the 3-sphere, which is an analog, in the sense of arithmetic topology, of the Bost-Connes system, with knots replacing primes, and cyclic branched coverings of the 3-sphere replacing abelian extensions of the field of rational numbers. The operator algebraic properties of this system differ significantly from the Bost-Connes case, due to the properties of the action of the semigroup of knots on a direct limit of knot groups. The resulting algebra of observables is a noncommutative Bernoulli product. We describe the main properties of the associated quantum statistical mechanical system and of the relevant partition functions, which are obtained from simple knot invariants like genus and crossing number.
Statistical validation of stochastic models
Hunter, N.F.; Barney, P.; Paez, T.L.; Ferregut, C.; Perez, L.
1996-12-31
It is common practice in structural dynamics to develop mathematical models for system behavior, and the authors are now capable of developing stochastic models, i.e., models whose parameters are random variables. Such models have random characteristics that are meant to simulate the randomness in characteristics of experimentally observed systems. This paper suggests a formal statistical procedure for the validation of mathematical models of stochastic systems when data taken during operation of the stochastic system are available. The statistical characteristics of the experimental system are obtained using the bootstrap, a technique for the statistical analysis of non-Gaussian data. The authors propose a procedure to determine whether or not a mathematical model is an acceptable model of a stochastic system with regard to user-specified measures of system behavior. A numerical example is presented to demonstrate the application of the technique.
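A minimal sketch of this validation idea under simplifying assumptions: a percentile bootstrap gives the sampling distribution of a user-specified measure of system behavior, and the stochastic model is judged acceptable if its prediction falls inside the resulting interval. The data, the chosen statistic (the mean), and the model's predicted value are all invented for illustration.

```python
import random
import statistics

def bootstrap_interval(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap (1 - alpha) interval for stat(data)."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]  # resample with replacement
        reps.append(stat(sample))
    reps.sort()
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical measured peak responses of the experimental system.
measured = [2.9, 3.1, 3.4, 2.7, 3.0, 3.3, 2.8, 3.2, 3.1, 2.95]
lo, hi = bootstrap_interval(measured, statistics.mean)
model_prediction = 3.05  # output of the stochastic model, illustrative
print(lo <= model_prediction <= hi)  # inside the band: model not rejected
```

The bootstrap's appeal here matches the paper's motivation: it makes no Gaussian assumption about the experimental data, only that the sample is representative of the system's variability.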
Statistical modeling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1992-01-01
This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.
Wallace, D L; Perlman, M D
1980-06-01
This report describes the research activities of the Department of Statistics, University of Chicago, during the period June 15, 1975 to July 30, 1979. Nine research projects are briefly described on the following subjects: statistical computing and approximation techniques in statistics; numerical computation of first passage distributions; probabilities of large deviations; combining independent tests of significance; small-sample efficiencies of tests and estimates; improved procedures for simultaneous estimation and testing of many correlations; statistical computing and improved regression methods; comparison of several populations; and unbiasedness in multivariate statistics. A description of the statistical consultation activities of the Department that are of interest to DOE, in particular, the scientific interactions between the Department and the scientists at Argonne National Laboratory, is given. A list of publications issued during the term of the contract is included.
XMM-Newton publication statistics
NASA Astrophysics Data System (ADS)
Ness, J.-U.; Parmar, A. N.; Valencic, L. A.; Smith, R.; Loiseau, N.; Salama, A.; Ehle, M.; Schartel, N.
2014-02-01
We assessed the scientific productivity of XMM-Newton by examining XMM-Newton publications and data usage statistics. We analyse 3272 refereed papers, published until the end of 2012, that directly use XMM-Newton data. The SAO/NASA Astrophysics Data System (ADS) was used to provide additional information on each paper including the number of citations. For each paper, the XMM-Newton observation identifiers and instruments used to provide the scientific results were determined. The identifiers were used to access the XMM-Newton Science Archive (XSA) to provide detailed information on the observations themselves and on the original proposals. The information obtained from these sources was then combined to allow the scientific productivity of the mission to be assessed. Since around three years after the launch of XMM-Newton there have been around 300 refereed papers per year that directly use XMM-Newton data. After more than 13 years in operation, this rate shows no evidence that it is decreasing. Since 2002, around 100 scientists per year become lead authors for the first time on a refereed paper which directly uses XMM-Newton data. Each refereed XMM-Newton paper receives around four citations per year in the first few years with a long-term citation rate of three citations per year, more than five years after publication. About half of the articles citing XMM-Newton articles are not primarily X-ray observational papers. The distribution of elapsed time between observations taken under the Guest Observer programme and first article peaks at 2 years with a possible second peak at 3.25 years. Observations taken under the Target of Opportunity programme are published significantly faster, after one year on average. The fraction of science time taken until the end of 2009 that has been used in at least one article is ~90%. Most observations were used more than once, yielding on average a factor of two in usage on available observing time per year. About 20 % of
ERIC Educational Resources Information Center
Schneider, William R.
2011-01-01
The purpose of this study was to determine the relationship between statistics self-efficacy, statistics anxiety, and performance in introductory graduate statistics courses. The study design compared two statistics self-efficacy measures developed by Finney and Schraw (2003), a statistics anxiety measure developed by Cruise and Wilkins (1980),…
Illustrating the practice of statistics
Hamada, Christina A; Hamada, Michael S
2009-01-01
The practice of statistics involves analyzing data and planning data collection schemes to answer scientific questions. Issues often arise with the data that must be dealt with and can lead to new procedures. In analyzing data, these issues can sometimes be addressed through the statistical models that are developed. Simulation can also be helpful in evaluating a new procedure. Moreover, simulation coupled with optimization can be used to plan a data collection scheme. The practice of statistics as just described is much more than just using a statistical package. In analyzing the data, it involves understanding the scientific problem and incorporating the scientist's knowledge. In modeling the data, it involves understanding how the data were collected and accounting for limitations of the data where possible. Moreover, the modeling is likely to be iterative by considering a series of models and evaluating the fit of these models. Designing a data collection scheme involves understanding the scientist's goal and staying within his/her budget in terms of time and the available resources. Consequently, a practicing statistician is faced with such tasks and requires skills and tools to do them quickly. We have written this article for students to provide a glimpse of the practice of statistics. To illustrate the practice of statistics, we consider a problem motivated by some precipitation data that our relative, Masaru Hamada, collected some years ago. We describe his rain gauge observational study in Section 2. We describe modeling and an initial analysis of the precipitation data in Section 3. In Section 4, we consider alternative analyses that address potential issues with the precipitation data. In Section 5, we consider the impact of incorporating additional information. We design a data collection scheme to illustrate the use of simulation and optimization in Section 6. We conclude this article in Section 7 with a discussion.
Key China Energy Statistics 2012
Levine, Mark; Fridley, David; Lu, Hongyou; Fino-Chen, Cecilia
2012-05-01
The China Energy Group at Lawrence Berkeley National Laboratory (LBNL) was established in 1988. Over the years the Group has gained recognition as an authoritative source of China energy statistics through the publication of its China Energy Databook (CED). The Group has published seven editions to date of the CED (http://china.lbl.gov/research/chinaenergy-databook). This handbook summarizes key statistics from the CED and is expressly modeled on the International Energy Agency’s “Key World Energy Statistics” series of publications. The handbook contains timely, clearly-presented data on the supply, transformation, and consumption of all major energy sources.
Statistical parameters for gloss evaluation
Peiponen, Kai-Erik; Juuti, Mikko
2006-02-13
The measurement of minute changes in local gloss has not been presented in international standards due to a lack of suitable glossmeters. The development of a diffractive-element-based glossmeter (DOG) made it possible to detect local variation of gloss from planar and complex-shaped surfaces. Hence, proper statistical gloss parameters for classifying surface quality by gloss, analogous to the standardized surface roughness classification, have become necessary. In this letter, we define statistical gloss parameters and utilize them, as an example, in the characterization of gloss from metal surface roughness standards by the DOG.
Statistical inference for inverse problems
NASA Astrophysics Data System (ADS)
Bissantz, Nicolai; Holzmann, Hajo
2008-06-01
In this paper we study statistical inference for certain inverse problems. We go beyond mere estimation purposes and review and develop the construction of confidence intervals and confidence bands in some inverse problems, including deconvolution and the backward heat equation. Further, we discuss the construction of certain hypothesis tests, in particular concerning the number of local maxima of the unknown function. The methods are illustrated in a case study, where we analyze the distribution of heliocentric escape velocities of galaxies in the Centaurus galaxy cluster, and provide statistical evidence for its bimodality.
Statistical Mechanics of Prion Diseases
Slepoy, A.; Singh, R. R. P.; Pazmandi, F.; Kulkarni, R. V.; Cox, D. L.
2001-07-30
We present a two-dimensional, lattice-based, protein-level statistical mechanical model for prion diseases (e.g., mad cow disease) with concomitant prion protein misfolding and aggregation. Our studies lead us to the hypothesis that the observed broad incubation-time distribution in epidemiological data reflects fluctuation-dominated growth seeded by a few nanometer-scale aggregates, while the much narrower incubation-time distributions for inoculated lab animals arise from statistical self-averaging. We model "species barriers" to prion infection and assess a related treatment protocol.
The Statistics of Visual Representation
NASA Technical Reports Server (NTRS)
Jobson, Daniel J.; Rahman, Zia-Ur; Woodell, Glenn A.
2002-01-01
The experience of retinex image processing has prompted us to reconsider fundamental aspects of imaging and image processing. Foremost is the idea that a good visual representation requires a non-linear transformation of the recorded (approximately linear) image data. Further, this transformation appears to converge on a specific distribution. Here we investigate the connection between numerical and visual phenomena. Specifically, the questions explored are: (1) Is there a well-defined, consistent statistical character associated with good visual representations? (2) Does there exist an ideal visual image? (3) What are its statistical properties?
Key China Energy Statistics 2011
Levine, Mark; Fridley, David; Lu, Hongyou; Fino-Chen, Cecilia
2012-01-15
The China Energy Group at Lawrence Berkeley National Laboratory (LBNL) was established in 1988. Over the years the Group has gained recognition as an authoritative source of China energy statistics through the publication of its China Energy Databook (CED). In 2008 the Group published the Seventh Edition of the CED (http://china.lbl.gov/research/chinaenergy-databook). This handbook summarizes key statistics from the CED and is expressly modeled on the International Energy Agency’s “Key World Energy Statistics” series of publications. The handbook contains timely, clearly-presented data on the supply, transformation, and consumption of all major energy sources.
Vector statistics of LANDSAT imagery
NASA Technical Reports Server (NTRS)
Jayroe, R. R., Jr.; Underwood, D.
1977-01-01
A digitized multispectral image, such as LANDSAT data, is composed of numerous four-dimensional vectors, which quantitatively describe the ground scene from which the data are acquired. The statistics of unique vectors that occur in LANDSAT imagery are studied to determine whether that information can provide some guidance on reducing image processing costs. A second purpose of this report is to investigate how the vector statistics are changed by various types of image processing techniques and to determine whether that information can be useful in choosing one processing approach over another.
Fluid turbulence - Deterministic or statistical
NASA Astrophysics Data System (ADS)
Cheng, Sin-I.
The deterministic view of turbulence suggests that the classical theory of fluid turbulence may be treating the wrong entity. The paper explores the physical implications of such an abstract mathematical result and provides a constructive computational demonstration of the deterministic and wave nature of fluid turbulence. The associated pressure disturbance for restoring solenoidal velocity is the primary agent, and its reflection from solid surfaces is the dominant mechanism of turbulence production. Statistical properties and their modeling must address the statistics of the uncertainties in the initial and boundary data of the ensemble.
Statistics of Statisticians: Critical Mass of Statistics and Operational Research Groups
NASA Astrophysics Data System (ADS)
Kenna, Ralph; Berche, Bertrand
Using a recently developed model, inspired by mean field theory in statistical physics, and data from the UK's Research Assessment Exercise, we analyse the relationship between the qualities of statistics and operational research groups and the quantities of researchers in them. Similar to other academic disciplines, we provide evidence for a linear dependency of quality on quantity up to an upper critical mass, which is interpreted as the average maximum number of colleagues with whom a researcher can communicate meaningfully within a research group. The model also predicts a lower critical mass, which research groups should strive to achieve to avoid extinction. For statistics and operational research, the lower critical mass is estimated to be 9 ± 3. The upper critical mass, beyond which research quality does not significantly depend on group size, is 17 ± 6.
Schmidler, Scott C; Lucas, Joseph E; Oas, Terrence G
2007-12-01
Analysis of biopolymer sequences and structures generally adopts one of two approaches: use of detailed biophysical theoretical models of the system with experimentally-determined parameters, or largely empirical statistical models obtained by extracting parameters from large datasets. In this work, we demonstrate a merger of these two approaches using Bayesian statistics. We adopt a common biophysical model for local protein folding and peptide configuration, the helix-coil model. The parameters of this model are estimated by statistical fitting to a large dataset, using prior distributions based on experimental data. L1-norm shrinkage priors are applied to induce sparsity among the estimated parameters, resulting in a significantly simplified model. Formal statistical procedures for evaluating support in the data for previously proposed model extensions are presented. We demonstrate the advantages of this approach including improved prediction accuracy and quantification of prediction uncertainty, and discuss opportunities for statistical design of experiments. Our approach yields a 39% improvement in mean-squared predictive error over the current best algorithm for this problem. In the process we also provide an efficient recursive algorithm for exact calculation of ensemble helicity including sidechain interactions, and derive an explicit relation between homo- and heteropolymer helix-coil theories and Markov chains and (non-standard) hidden Markov models respectively, which has not appeared in the literature previously.
Significant Decisions in Labor Cases.
ERIC Educational Resources Information Center
Monthly Labor Review, 1979
1979-01-01
Several significant court decisions involving labor cases are discussed including a series of decisions concerning constitutional protections afforded aliens; the First Amendment and national labor relations laws; and the bifurcated backpay rule. (BM)
Significant Scales in Community Structure
Traag, V. A.; Krings, G.; Van Dooren, P.
2013-01-01
Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant. This problem shows foremost in multi-resolution methods. We here introduce an efficient method for scanning for resolutions in one such method. Additionally, we introduce the notion of “significance” of a partition, based on subgraph probabilities. Significance is independent of the exact method used, so could also be applied in other methods, and can be interpreted as the gain in encoding a graph by making use of a partition. Using significance, we can determine “good” resolution parameters, which we demonstrate on benchmark networks. Moreover, optimizing significance itself also shows excellent performance. We demonstrate our method on voting data from the European Parliament. Our analysis suggests the European Parliament has become increasingly ideologically divided and that nationality plays no role. PMID:24121597
Astronomical Significance of Ancient Monuments
NASA Astrophysics Data System (ADS)
Simonia, I.
2011-06-01
Astronomical significance of Gokhnari megalithic monument (eastern Georgia) is considered. Possible connection of Amirani ancient legend with Gokhnari monument is discussed. Concepts of starry practicality and solar stations are proposed.
20 CFR 634.4 - Statistical standards.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 20 Employees' Benefits 3 2011-04-01 2011-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs....
20 CFR 634.4 - Statistical standards.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 20 Employees' Benefits 3 2010-04-01 2010-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs....
Teaching Statistics in Integration with Psychology
ERIC Educational Resources Information Center
Wiberg, Marie
2009-01-01
The aim was to revise a statistics course in order to get the students motivated to learn statistics and to integrate statistics more throughout a psychology course. Further, we wish to make students become more interested in statistics and to help them see the importance of using statistics in psychology research. To achieve this goal, several…
Understanding Statistics Using Computer Demonstrations
ERIC Educational Resources Information Center
Dunn, Peter K.
2004-01-01
This paper discusses programs that clarify some statistical ideas often discussed yet poorly understood by students. The programs adopt the approach of demonstrating what is happening, rather than using the computer to do the work for the students (and hide the understanding). The programs demonstrate normal probability plots, overfitting of…
Introductory Statistics and Fish Management.
ERIC Educational Resources Information Center
Jardine, Dick
2002-01-01
Describes how fisheries research and management data (available on a website) have been incorporated into an Introductory Statistics course. In addition to the motivation gained from seeing the practical relevance of the course, some students have participated in the data collection and analysis for the New Hampshire Fish and Game Department. (MM)
China's Statistical System and Resources
ERIC Educational Resources Information Center
Xue, Susan
2004-01-01
As the People's Republic of China plays an increasingly important role in international politics and trade, countries with economic interests there find they need to know more about this nation. Access to primary information sources, including official statistics from China, however, is very limited, as little exploration has been done into this…
Education Statistics Quarterly, Fall 2001.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2001-01-01
The publication gives a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products, and funding opportunities developed over a 3-month period. Each issue also contains a message from…
Education Statistics Quarterly, Fall 2002.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2003-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released in a 3-month period. Each issue also contains a message from the NCES on a timely…
Education Statistics Quarterly, Spring 2002.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2002-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products, and funding opportunities developed over a 3-month period. Each issue also contains a message…
Education Statistics Quarterly, Summer 2002.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2002-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products, and funding opportunities developed over a 3-month period. Each issue also contains a message…
Digest of Education Statistics, 1990.
ERIC Educational Resources Information Center
Snyder, Thomas D.; Hoffman, Charlene M.
This document, consisting of 7 chapters, 35 figures, and 380 tables, provides statistical data on most aspects of United States education, both public and private, from kindergarten through graduate school. The chapters cover the following topics: (1) all levels of education; (2) elementary and secondary education; (3) postsecondary, college,…
Concept Maps in Introductory Statistics
ERIC Educational Resources Information Center
Witmer, Jeffrey A.
2016-01-01
Concept maps are tools for organizing thoughts on the main ideas in a course. I present an example of a concept map that was created through the work of students in an introductory class and discuss major topics in statistics and relationships among them.
Quantum Mechanics without Statistical Postulates
Geiger, G.; et al.
2000-11-01
The Bohmian formulation of quantum mechanics describes the measurement process in an intuitive way without a reduction postulate. Due to the chaotic motion of the hidden classical particle, all statistical features of quantum mechanics during a sequence of repeated measurements can be derived in the framework of a deterministic single-system theory.
Undergraduate experiments on statistical optics
NASA Astrophysics Data System (ADS)
Scholz, Ruediger; Friege, Gunnar; Weber, Kim-Alessandro
2016-09-01
Since the pioneering experiments of Forrester et al (1955 Phys. Rev. 99 1691) and Hanbury Brown and Twiss (1956 Nature 177 27; Nature 178 1046), along with the introduction of the laser in the 1960s, the systematic analysis of random fluctuations of optical fields has developed to become an indispensable part of physical optics for gaining insight into features of the fields. In 1985 Joseph W Goodman prefaced his textbook on statistical optics with a strong commitment to the ‘tools of probability and statistics’ (Goodman 2000 Statistical Optics (New York: John Wiley & Sons Inc.)) in the education of advanced optics. Since then a wide range of novel undergraduate optical counting experiments and corresponding pedagogical approaches have been introduced to underpin the rapid growth of the interest in coherence and photon statistics. We propose low cost experimental steps that are a fair way off ‘real’ quantum optics, but that give deep insight into random optical fluctuation phenomena: (1) the introduction of statistical methods into undergraduate university optical lab work, and (2) the connection between the photoelectrical signal and the characteristics of the light source. We describe three experiments and theoretical approaches which may be used to pave the way for a well balanced growth of knowledge, providing students with an opportunity to enhance their abilities to adapt the ‘tools of probability and statistics’.
Statistical methods for evolutionary trees.
Edwards, A W F
2009-09-01
In 1963 and 1964, L. L. Cavalli-Sforza and A. W. F. Edwards introduced novel methods for computing evolutionary trees from genetical data, initially for human populations from blood-group gene frequencies. The most important development was their introduction of statistical methods of estimation applied to stochastic models of evolution.
What Price Statistical Tables Now?
ERIC Educational Resources Information Center
Hunt, Neville
1997-01-01
Describes the generation of all the tables required for school-level study of statistics using Microsoft's Excel spreadsheet package. Highlights cumulative binomial probabilities, cumulative Poisson probabilities, normal distribution, t-distribution, chi-squared distribution, F-distribution, random numbers, and accuracy. (JRH)
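The same tables the article builds in Excel can be generated programmatically. As a minimal stdlib-only Python sketch (the function name and table layout are illustrative, not from the article), one entry of a cumulative binomial table is:

```python
from math import comb

def cumulative_binomial(n, p, k):
    """P(X <= k) for X ~ Binomial(n, p): one cell of a cumulative binomial table."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# One table row: n = 10, p = 0.5, k = 0..10
row = [round(cumulative_binomial(10, 0.5, k), 4) for k in range(11)]
```

The other tables mentioned (Poisson, normal, t, chi-squared, F) follow the same pattern from their respective distribution functions.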
Statistical modelling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1991-01-01
During the six-month period from 1 April 1991 to 30 September 1991 the following research papers in statistical modeling of software reliability appeared: (1) A Nonparametric Software Reliability Growth Model; (2) On the Use and the Performance of Software Reliability Growth Models; (3) Research and Development Issues in Software Reliability Engineering; (4) Special Issues on Software; and (5) Software Reliability and Safety.
Education Statistics Quarterly, Winter 2001.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2002-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released in a 3-month period. Each issue also contains a message from the NCES on a timely…
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Bosch, Stephen; Ink, Gary; Greco, Albert N.
1999-01-01
Presents: "Prices of United States and Foreign Published Materials"; "Book Title Output and Average Prices"; "Book Sales Statistics, 1998"; "United States Book Exports and Imports: 1998"; "International Book Title Output: 1990-96"; "Number of Book Outlets in the United States and Canada";…
American Youth: A Statistical Snapshot.
ERIC Educational Resources Information Center
Wetzel, James R.
This document presents a statistical snapshot of young people, aged 15 to 24 years. It provides a broad overview of trends documenting the direction of changes in social behavior and economic circumstances. The projected decline in the total number of youth from 43 million in 1980 to 35 million in 1995 will affect marriage and childbearing…
The Statistical Handbook on Technology.
ERIC Educational Resources Information Center
Berinstein, Paula
This volume tells stories about the tools we use, but these narratives are told in numbers rather than in words. Organized by various aspects of society, each chapter uses tables and statistics to examine budgets, costs, sales, trade, employment, patents, prices, usage, access and consumption. In each chapter, each major topic is…
Discussion on Statistics Teaching Management
ERIC Educational Resources Information Center
Wu, Qingjun
2008-01-01
Teaching management requires the reasonable deployment of all essential teaching factors in the teaching process to promote students' comprehensive and harmonious development. Having analyzed problems that appear frequently in the statistics teaching management of colleges, the article identifies their causes, according to which this article…
Central Statistical Libraries in Europe.
ERIC Educational Resources Information Center
Kaiser, Lisa
The paper tries to clarify the position special governmental libraries hold in the system of libraries of today by investigating only one specific type of library mainly from a formal and historical point of view. Central statistical libraries in Europe were first regarded as administrative and archival libraries. Their early holdings of foreign…
Instructional Theory for Teaching Statistics.
ERIC Educational Resources Information Center
Atwood, Jan R.; Dinham, Sarah M.
Metatheoretical analysis of Ausubel's Theory of Meaningful Verbal Learning and Gagne's Theory of Instruction using the Dickoff and James paradigm produced two instructional systems for basic statistics. The systems were tested with a pretest-posttest control group design utilizing students enrolled in an introductory-level graduate statistics…
Statistics of premixed flame cells
NASA Technical Reports Server (NTRS)
Noever, David A.
1991-01-01
The statistics of random cellular patterns in premixed flames are analyzed. Agreement is found with a variety of topological relations previously found for other networks, namely, Lewis's law and Aboav's law. Despite the diverse underlying physics, flame cells are shown to share a broad class of geometric properties with other random networks: metal grains, soap foams, bioconvection, and Langmuir monolayers.
Statistics of premixed flame cells
Noever, D. A.
1991-07-15
The statistics of random cellular patterns in premixed flames are analyzed. Agreement is found with a variety of topological relations previously found for other networks, namely, Lewis's law and Aboav's law. Despite the diverse underlying physics, flame cells are shown to share a broad class of geometric properties with other random networks: metal grains, soap foams, bioconvection, and Langmuir monolayers.
A Simple Statistical Thermodynamics Experiment
ERIC Educational Resources Information Center
LoPresto, Michael C.
2010-01-01
Comparing the predicted and actual rolls of combinations of both two and three dice can help to introduce many of the basic concepts of statistical thermodynamics, including multiplicity, probability, microstates, and macrostates, and demonstrate that entropy is indeed a measure of randomness, that disordered states (those of higher entropy) are…
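The dice exercise above maps directly onto code: each ordered roll is a microstate and each sum is a macrostate, so counting rolls per sum gives the multiplicity. A minimal sketch (function name is illustrative):

```python
from itertools import product
from collections import Counter

def multiplicities(n_dice, sides=6):
    """Map each macrostate (the sum) to its multiplicity (number of ordered rolls)."""
    return Counter(sum(roll) for roll in product(range(1, sides + 1), repeat=n_dice))

two = multiplicities(2)    # sum 7 is the most probable macrostate: 6 of 36 microstates
three = multiplicities(3)  # sums 10 and 11 are the most probable: 27 of 216 each
```

The most probable sums are exactly the macrostates of highest multiplicity, which is the point of the classroom demonstration.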
Statistical description of tectonic motions
NASA Technical Reports Server (NTRS)
Agnew, Duncan Carr
1993-01-01
This report summarizes investigations regarding tectonic motions. The topics discussed include statistics of crustal deformation, Earth rotation studies, using multitaper spectrum analysis techniques applied to both space-geodetic data and conventional astrometric estimates of the Earth's polar motion, and the development, design, and installation of high-stability geodetic monuments for use with the global positioning system.
Tsallis statistics and neurodegenerative disorders
NASA Astrophysics Data System (ADS)
Iliopoulos, Aggelos C.; Tsolaki, Magdalini; Aifantis, Elias C.
2016-08-01
In this paper, we perform statistical analysis of time series deriving from four neurodegenerative disorders, namely epilepsy, amyotrophic lateral sclerosis (ALS), Parkinson's disease (PD), and Huntington's disease (HD). The time series are concerned with electroencephalograms (EEGs) of healthy and epileptic states, as well as gait dynamics (in particular stride intervals) in ALS, PD and HD. We study data concerning one subject for each neurodegenerative disorder and one healthy control. The analysis is based on Tsallis non-extensive statistical mechanics and in particular on the estimation of the Tsallis q-triplet, namely {q_stat, q_sen, q_rel}. The deviation of the Tsallis q-triplet from unity indicates non-Gaussian statistics and long-range dependencies for all time series considered. In addition, the results reveal the efficiency of Tsallis statistics in capturing differences in brain dynamics between healthy and epileptic states, as well as differences between ALS, PD and HD patients and healthy control subjects. The results indicate that estimations of Tsallis q-indices could be used as possible biomarkers, along with others, for improving classification and prediction of epileptic seizures, as well as for studying the complex gait dynamics of various diseases, providing new insights into severity, medications and fall risk, and improving therapeutic interventions.
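The q-triplet is defined in terms of the Tsallis q-exponential, which generalizes the ordinary exponential and recovers it as q approaches 1. A minimal sketch of that function (the abstract's estimation procedure for the q-indices is not reproduced here):

```python
import math

def q_exponential(x, q):
    """Tsallis q-exponential e_q(x); reduces to the ordinary exp(x) as q -> 1."""
    if q == 1:
        return math.exp(x)
    base = 1 + (1 - q) * x
    # The q-exponential is defined as [1 + (1-q)x]^(1/(1-q)) where the bracket
    # is positive, and 0 otherwise (the standard cutoff convention).
    return base ** (1 / (1 - q)) if base > 0 else 0.0
```

For q > 1 and negative arguments this decays as a power law rather than exponentially, which is what makes q-statistics sensitive to the long-range dependencies the paper reports.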
GPS: Geometry, Probability, and Statistics
ERIC Educational Resources Information Center
Field, Mike
2012-01-01
It might be said that for most occupations there is now less of a need for mathematics than there was say fifty years ago. But, the author argues, geometry, probability, and statistics constitute essential knowledge for everyone. Maybe not the geometry of Euclid, but certainly geometrical ways of thinking that might enable us to describe the world…
Measuring Skewness: A Forgotten Statistic?
ERIC Educational Resources Information Center
Doane, David P.; Seward, Lori E.
2011-01-01
This paper discusses common approaches to presenting the topic of skewness in the classroom, and explains why students need to know how to measure it. Two skewness statistics are examined: the Fisher-Pearson standardized third moment coefficient, and the Pearson 2 coefficient that compares the mean and median. The former is reported in statistical…
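The two skewness statistics the paper examines have short closed forms, so a stdlib-only sketch is easy to state (function names are mine; the Fisher-Pearson g1 here is the population form, without the small-sample adjustment factor some software applies):

```python
import statistics

def fisher_pearson_g1(x):
    """Fisher-Pearson standardized third moment coefficient g1 = m3 / m2^(3/2)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in x) / n  # third central moment
    return m3 / m2 ** 1.5

def pearson_2(x):
    """Pearson 2 skewness coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(x) - statistics.median(x)) / statistics.stdev(x)
```

Both statistics are zero for symmetric data and positive for right-skewed data, which is the comparison the paper asks students to make.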
Teaching Statistics through Learning Projects
ERIC Educational Resources Information Center
Moreira da Silva, Mauren Porciúncula; Pinto, Suzi Samá
2014-01-01
This paper aims to reflect on the teaching of statistics through student research, in the form of projects carried out by students on self-selected topics. The paper reports on a study carried out with two undergraduate classes using a methodology of teaching that we call "learning projects." Monitoring the development of the various…
Education Statistics Quarterly, Summer 2001.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2001-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released during a 3-month period. Each issue also contains a message from the NCES on a…
Statistical Prediction in Proprietary Rehabilitation.
ERIC Educational Resources Information Center
Johnson, Kurt L.; And Others
1987-01-01
Applied statistical methods to predict case expenditures for low back pain rehabilitation cases in proprietary rehabilitation. Extracted predictor variables from case records of 175 workers compensation claimants with some degree of permanent disability due to back injury. Performed several multiple regression analyses resulting in a formula that…
Statistics by Example, Detecting Patterns.
ERIC Educational Resources Information Center
Mosteller, Frederick; And Others
This booklet is part of a series of four pamphlets, each intended to stand alone, which provide problems in probability and statistics at the secondary school level. Twelve different real-life examples (written by professional statisticians and experienced teachers) have been collected in this booklet to illustrate the ideas of mean, variation,…
An operational definition of a statistically meaningful trend.
Bryhn, Andreas C; Dimberg, Peter H
2011-04-28
Linear trend analysis of time series is standard procedure in many scientific disciplines. If the number of data is large, a trend may be statistically significant even if data are scattered far from the trend line. This study introduces and tests a quality criterion for time trends referred to as statistical meaningfulness, which is a stricter quality criterion for trends than high statistical significance. The time series is divided into intervals and interval mean values are calculated. Thereafter, r² and p values are calculated from regressions concerning time and interval mean values. If r² ≥ 0.65 at p ≤ 0.05 in any of these regressions, then the trend is regarded as statistically meaningful. Out of ten investigated time series from different scientific disciplines, five displayed statistically meaningful trends. A Microsoft Excel application (add-in) was developed which can perform statistical meaningfulness tests and which may increase the operationality of the test. The presented method for distinguishing statistically meaningful trends should be reasonably uncomplicated for researchers with basic statistics skills and may thus be useful for determining which trends are worth analysing further, for instance with respect to causal factors. The method can also be used for determining which segments of a time trend may be particularly worthwhile to focus on.
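The interval-mean step of the criterion is mechanical enough to sketch. The following stdlib-only Python sketch (function names mine) splits a series into intervals, computes interval means, and checks the r² ≥ 0.65 part of the criterion; the companion p ≤ 0.05 check from the same regression is omitted, since a t-distribution p-value needs a statistics library:

```python
def interval_means(series, n_intervals):
    """Split the series into contiguous intervals and return each interval's mean."""
    k, rem = divmod(len(series), n_intervals)
    means, start = [], 0
    for i in range(n_intervals):
        end = start + k + (1 if i < rem else 0)
        chunk = series[start:end]
        means.append(sum(chunk) / len(chunk))
        start = end
    return means

def r_squared(y):
    """r^2 of a least-squares line fitted to y against its index (the time axis)."""
    n = len(y)
    x = range(n)
    mx, my = (n - 1) / 2, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return 0.0 if syy == 0 else sxy * sxy / (sxx * syy)

# A noise-free linear trend passes the r^2 part of the criterion trivially.
means = interval_means(list(range(20)), 4)
meaningful = r_squared(means) >= 0.65
```

In practice one would repeat this for several interval counts, as the abstract's phrase "in any of these regressions" suggests.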
Significant and meaningful effects in sports biomechanics research.
Knudson, Duane
2009-03-01
Errors in statistical analysis of multiple dependent variables and in documenting the size of effects are common in the scientific and biomechanical literature. In this paper, I review these errors and several solutions that can improve the validity of sports biomechanics research reports. Studies examining multiple dependent variables should either control for the inflation of Type I errors (e.g. Holm's procedure) during multiple comparisons or use multivariate analysis of variance to focus on the structure and interaction of the dependent variables. When statistically significant differences are observed, research reports should provide confidence limits or effect sizes to document the size of the effects. Authors of sports biomechanics research reports are encouraged to analyse and present their data accounting for the experiment-wise Type I error rate, as well as reporting data documenting the size or practical significance of effects reaching their standard of statistical significance.
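Holm's procedure, mentioned above as a control for Type I error inflation, is a short step-down algorithm. A minimal sketch returning Holm-adjusted p-values (function name mine):

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values for a family of m comparisons."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * p_values[i])  # multiplier shrinks as rank grows
        running_max = max(running_max, adj)       # enforce monotonic adjusted p-values
        adjusted[i] = running_max
    return adjusted
```

An adjusted p-value below the nominal alpha (e.g. 0.05) is then significant at the experiment-wise level.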
A spatial scan statistic for multiple clusters.
Li, Xiao-Zhou; Wang, Jin-Feng; Yang, Wei-Zhong; Li, Zhong-Jie; Lai, Sheng-Jie
2011-10-01
Spatial scan statistics are commonly used for geographical disease surveillance and cluster detection. When multiple clusters coexist in the study area, they become difficult to detect because of the clusters' shadowing effect on each other. The recently proposed sequential method showed better power for detecting the second, weaker cluster, but did not improve the ability to detect the first, stronger cluster, which is more important than the second one. We propose a new extension of the spatial scan statistic that can be used to detect multiple clusters. By constructing two or more clusters in the alternative hypothesis, our proposed method accounts for other coexisting clusters in the detection and evaluation process. The performance of the proposed method is compared to the sequential method through an intensive simulation study, in which our proposed method shows better power in terms of both rejecting the null hypothesis and accurately detecting the coexisting clusters. In a real study of hand-foot-mouth disease data in Pingdu city, a true cluster town is successfully detected by our proposed method; it could not be evaluated as statistically significant by the standard method because of another cluster's shadowing effect.
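To fix ideas, here is a toy one-dimensional analogue of the basic (single-cluster) Poisson scan statistic: score every contiguous window by its log-likelihood ratio and keep the best. This is only a sketch of the standard statistic the authors extend, not their multiple-cluster method, and all names are illustrative:

```python
import math

def poisson_llr(c, e, C):
    """Log-likelihood ratio of an elevated-risk window: c observed, e expected, C total."""
    if c <= e:
        return 0.0  # only windows with excess risk are of interest
    inside = c * math.log(c / e)
    outside = 0.0 if c == C else (C - c) * math.log((C - c) / (C - e))
    return inside + outside

def scan_1d(cases, pops):
    """Exhaustively scan contiguous windows; return the most likely cluster and its LLR."""
    C, P = sum(cases), sum(pops)
    best_llr, best_win = 0.0, None
    for i in range(len(cases)):
        for j in range(i, len(cases)):
            c = sum(cases[i:j + 1])
            e = C * sum(pops[i:j + 1]) / P  # expected count under the null
            if 0 < e < C:
                llr = poisson_llr(c, e, C)
                if llr > best_llr:
                    best_llr, best_win = llr, (i, j)
    return best_win, best_llr
```

Significance of the best window is then usually assessed by Monte Carlo replication under the null; the shadowing problem arises because a second cluster inflates the expected counts used for the first.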
Critical analysis of adsorption data statistically
NASA Astrophysics Data System (ADS)
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of experimental data. In the present study, adsorption was carried out to remove zinc ions from a contaminated aqueous solution using mango leaf powder. The experimental data were analysed statistically by hypothesis testing, applying the t test, the paired t test and the Chi-square test, to (a) test the optimum value of the process pH, (b) verify the success of the experiment and (c) study the effect of adsorbent dose on zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ² showed the results in favour of the data collected from the experiment, and this has been shown on probability charts. The K value obtained for the Langmuir isotherm was 0.8582 and the m value for the Freundlich adsorption isotherm was 0.725; both are <1, indicating favourable isotherms. Karl Pearson's correlation coefficient values for the Langmuir and Freundlich adsorption isotherms were 0.99 and 0.95 respectively, which show a high degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
The new statistics: why and how.
Cumming, Geoff
2014-01-01
We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesis significance testing (NHST), we need to shift from reliance on NHST to estimation and other preferred techniques. The new statistics refers to recommended practices, including estimation based on effect sizes, confidence intervals, and meta-analysis. The techniques are not new, but adopting them widely would be new for many researchers, as well as highly beneficial. This article explains why the new statistics are important and offers guidance for their use. It describes an eight-step new-statistics strategy for research with integrity, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.
Ideas for Effective Communication of Statistical Results
Anderson-Cook, Christine M.
2015-03-01
Effective presentation of statistical results to those with less statistical training, including managers and decision-makers, requires planning, anticipation and thoughtful delivery. Here are several recommendations for effectively presenting statistical results.
A Perspective on Teaching Elementary Statistics.
ERIC Educational Resources Information Center
Wainwright, Barbara A.; Austin, Homer W.
1997-01-01
Shares the perspectives of two instructors of elementary statistics at the college level. Describes a course developed to increase statistics learning and student motivation to learn statistics by introducing writing into the course content. (DDR)
Societal Statistics by virtue of the Statistical Drake Equation
NASA Astrophysics Data System (ADS)
Maccone, Claudio
2012-09-01
The Drake equation, first proposed by Frank D. Drake in 1961, is the foundational equation of SETI. It yields an estimate of the number N of extraterrestrial communicating civilizations in the Galaxy given by the product N=Ns×fp×ne×fl×fi×fc×fL, where: Ns is the number of stars in the Milky Way Galaxy; fp is the fraction of stars that have planetary systems; ne is the number of planets in a given system that are ecologically suitable for life; fl is the fraction of otherwise suitable planets on which life actually arises; fi is the fraction of inhabited planets on which an intelligent form of life evolves; fc is the fraction of planets inhabited by intelligent beings on which a communicative technical civilization develops; and fL is the fraction of planetary lifetime graced by a technical civilization. The first three terms may be called "the astrophysical terms" in the Drake equation since their numerical value is provided by astrophysical considerations. The fourth term, fl, may be called "the origin-of-life term" and entails biology. The last three terms may be called "the societal terms" inasmuch as their respective numerical values are provided by anthropology, telecommunication science and "futuristic science", respectively. In this paper, we seek to provide a statistical estimate of the three societal terms in the Drake equation basing our calculations on the Statistical Drake Equation first proposed by this author at the 2008 IAC. In that paper the author extended the simple 7-factor product so as to embody Statistics. He proved that, no matter which probability distribution may be assigned to each factor, if the number of factors tends to infinity, then the random variable N follows the lognormal distribution (central limit theorem of Statistics). This author also proved at the 2009 IAC that the Dole (1964) [7] equation, yielding the number of Habitable Planets for Man in the Galaxy, has the same mathematical structure as the Drake equation. So the
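The lognormal result quoted above is easy to illustrate by Monte Carlo: since N is a product of independent factors, log N is a sum of independent terms and the central limit theorem makes it approximately normal. A minimal sketch, with placeholder uniform ranges for the seven factors (my own illustrative numbers, not the paper's distributions):

```python
import math
import random

random.seed(42)

# Illustrative ranges for Ns, fp, ne, fl, fi, fc, fL (placeholder values only);
# each factor is drawn uniformly from its range.
factor_ranges = [(1e11, 4e11), (0.2, 0.6), (0.5, 3.0), (0.1, 1.0),
                 (0.01, 1.0), (0.01, 0.5), (1e-6, 1e-4)]

log_n_samples = []
for _ in range(10000):
    n = 1.0
    for lo, hi in factor_ranges:
        n *= random.uniform(lo, hi)
    log_n_samples.append(math.log(n))

# log N is a sum of independent random terms, so it is approximately normal
# and N itself approximately lognormal, as the statistical Drake equation asserts.
```

A histogram of `log_n_samples` comes out close to a bell curve even with only seven factors; the paper's result is the limit statement as the number of factors grows.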
Significance testing testate amoeba water table reconstructions
NASA Astrophysics Data System (ADS)
Payne, Richard J.; Babeshko, Kirill V.; van Bellen, Simon; Blackford, Jeffrey J.; Booth, Robert K.; Charman, Dan J.; Ellershaw, Megan R.; Gilbert, Daniel; Hughes, Paul D. M.; Jassey, Vincent E. J.; Lamentowicz, Łukasz; Lamentowicz, Mariusz; Malysheva, Elena A.; Mauquoy, Dmitri; Mazei, Yuri; Mitchell, Edward A. D.; Swindles, Graeme T.; Tsyganov, Andrey N.; Turner, T. Edward; Telford, Richard J.
2016-04-01
Transfer functions are valuable tools in palaeoecology, but their output may not always be meaningful. A recently-developed statistical test ('randomTF') offers the potential to distinguish among reconstructions which are more likely to be useful, and those less so. We applied this test to a large number of reconstructions of peatland water table depth based on testate amoebae. Contrary to our expectations, a substantial majority (25 of 30) of these reconstructions gave non-significant results (P > 0.05). The underlying reasons for this outcome are unclear. We found no significant correlation between randomTF P-value and transfer function performance, the properties of the training set and reconstruction, or measures of transfer function fit. These results give cause for concern but we believe it would be extremely premature to discount the results of non-significant reconstructions. We stress the need for more critical assessment of transfer function output, replication of results and ecologically-informed interpretation of palaeoecological data.
P-Value Club: Teaching Significance Level on the Dance Floor
ERIC Educational Resources Information Center
Gray, Jennifer
2010-01-01
Courses: Beginning research methods and statistics courses, as well as advanced communication courses that require reading research articles and completing research projects involving statistics. Objective: Students will understand the difference between significant and nonsignificant statistical results based on p-value.
Statistical modeling of the arterial vascular tree
NASA Astrophysics Data System (ADS)
Beck, Thomas; Godenschwager, Christian; Bauer, Miriam; Bernhardt, Dominik; Dillmann, Rüdiger
2011-03-01
Automatic examination of medical images becomes increasingly important due to the rising amount of data. Therefore automated methods are required which combine anatomical knowledge and robust segmentation to examine the structure of interest. We propose a statistical model of the vascular tree based on vascular landmarks and unbranched vessel sections. An undirected graph provides anatomical topology, semantics, existing landmarks and attached vessel sections. The atlas was built using semi-automatically generated geometric models of various body regions ranging from carotid arteries to the lower legs. Geometric models contain vessel centerlines as well as orthogonal cross-sections in equidistant intervals with the vessel contour having the form of a polygon path. The geometric vascular model is supplemented by anatomical landmarks which are not necessarily related to the vascular system. These anatomical landmarks define point correspondences which are used for registration with a Thin-Plate-Spline interpolation. After the registration process, the models were merged to form the statistical model which can be mapped to unseen images based on a subset of anatomical landmarks. This approach provides probability distributions for the location of landmarks, vessel-specific geometric properties including shape, expected radii and branching points and vascular topology. The applications of this statistical model include model-based extraction of the vascular tree which greatly benefits from vessel-specific geometry description and variation ranges. Furthermore, the statistical model can be applied as a basis for computer aided diagnosis systems as indicator for pathologically deformed vessels and the interaction with the geometric model is significantly more user friendly for physicians through anatomical names.
Status and Significance of Credentialing.
ERIC Educational Resources Information Center
Musgrave, Dorothea
1984-01-01
Discusses the current status, significance, and future of credentialing in the field of environmental health. Also discusses four phases of a Bureau of Health Professions (BHP) Credentialing Program and BHP-funded projects related to their development and implementation. Phases include role delineation, resources development, examination…
Statistical phenomena in particle beams
Bisognano, J.J.
1984-09-01
Particle beams are subject to a variety of apparently distinct statistical phenomena such as intrabeam scattering, stochastic cooling, electron cooling, coherent instabilities, and radiofrequency noise diffusion. In fact, both the physics and mathematical description of these mechanisms are quite similar, with the notion of correlation as a powerful unifying principle. In this presentation we will attempt to provide both a physical and a mathematical basis for understanding the wide range of statistical phenomena that have been discussed. In the course of this study the tools of the trade will be introduced, e.g., the Vlasov and Fokker-Planck equations, noise theory, correlation functions, and beam transfer functions. Although a major concern will be to provide equations for analyzing machine design, the primary goal is to introduce a basic set of physical concepts having a very broad range of applicability.
Parallel contingency statistics with Titan.
Thompson, David C.; Pebay, Philippe Pierre
2009-09-01
This report summarizes the existing statistical engines in VTK/Titan and presents the recently parallelized contingency statistics engine. It is a sequel to [PT08] and [BPRT09], which studied the parallel descriptive, correlative, multi-correlative, and principal component analysis engines. The ease of use of this new parallel engine is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; however, the very nature of contingency tables prevents this new engine from exhibiting the optimal parallel speed-up achieved by the aforementioned engines. This report therefore discusses the design trade-offs we made and studies performance with up to 200 processors.
Performance Measures For Statistical Segmentation
NASA Astrophysics Data System (ADS)
Shazeer, Dov J.
1983-03-01
Performance measures for statistical segmentation have been developed for a space-and-time critical Bayesian statistical tracker. They are intended to become an integral part of a knowledge-based tracking algorithm, which has been developed by RCA. The performance measures are serving to quantify the usefulness of the processed input, to assist in the identification of each tracking state and give its reliability, and to predict impending changes of state. They have been tested using stochastically generated target-background frames. Performance measure results have correlated well with the parameters which characterize the difference in the target and background distributions. A host of possible performance measures are discussed in relation to their strengths and weaknesses. Experimental results for the measures currently being employed by RCA are given, and areas for future research are indicated.
Statistical mechanics and Lorentz violation
NASA Astrophysics Data System (ADS)
Colladay, Don; McDonald, Patrick
2004-12-01
The theory of statistical mechanics is studied in the presence of Lorentz-violating background fields. The analysis is performed using the Standard-Model Extension (SME) together with a Jaynesian formulation of statistical inference. Conventional laws of thermodynamics are obtained in the presence of a perturbed hamiltonian that contains the Lorentz-violating terms. As an example, properties of the nonrelativistic ideal gas are calculated in detail. To lowest order in Lorentz violation, the scalar thermodynamic variables are only corrected by a rotationally invariant combination of parameters that mimics a (frame dependent) effective mass. Spin-couplings can induce a temperature-independent polarization in the classical gas that is not present in the conventional case. Precision measurements in the residual expectation values of the magnetic moment of Fermi gases in the limit of high temperature may provide interesting limits on these parameters.
Introduction to Statistically Designed Experiments
Heaney, Mike
2016-09-13
Statistically designed experiments can save researchers time and money by reducing the number of necessary experimental trials, while resulting in more conclusive experimental results. Surprisingly, many researchers are still not aware of this efficient and effective experimental methodology. As reported in a 2013 article from Chemical & Engineering News, there has been a resurgence of this methodology in recent years (http://cen.acs.org/articles/91/i13/Design-Experiments-Makes-Comeback.html?h=2027056365). This presentation will provide a brief introduction to statistically designed experiments. The main advantages will be reviewed along with some basic concepts such as factorial and fractional factorial designs. The recommended sequential approach to experimentation will be introduced, and finally a case study will be presented to demonstrate the methodology.
Statistical description for survival data
2016-01-01
Statistical description is always the first step in data analysis. It gives the investigator a general impression of the data at hand. Traditionally, data are described in terms of central tendency and deviation. However, this framework does not fit survival data (also termed time-to-event data). This data type contains two components: the survival time and the status. Researchers are usually interested in the probability of an event at a given survival time point. The hazard function, cumulative hazard function, and survival function are commonly used to describe survival data. The survival function can be estimated using the Kaplan-Meier estimator, which is also the default method in most statistical packages. Alternatively, the Nelson-Aalen estimator of the cumulative hazard can also be used to estimate the survival function. Survival functions of subgroups can be compared using the log-rank test. Furthermore, the article also introduces how to describe time-to-event data with parametric modeling. PMID:27867953
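The Kaplan-Meier product-limit idea the abstract mentions is simple enough to sketch directly: at each distinct event time, multiply the running survival estimate by the fraction of at-risk subjects who did not experience the event. The following is a minimal pure-Python sketch; the example times and censoring flags are made up for illustration, not data from the article.

```python
# Minimal Kaplan-Meier sketch; illustrative data, not from the article.
def kaplan_meier(times, events):
    """Return [(t, S(t))] at event times; events[i]=1 means event, 0 means censored."""
    data = sorted(zip(times, events))  # order observations by time
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = c = 0  # events and censorings at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                d += 1
            else:
                c += 1
            i += 1
        if d > 0:  # the estimate only drops at event times
            survival *= 1.0 - d / n_at_risk
            curve.append((t, survival))
        n_at_risk -= d + c
    return curve

times = [2, 3, 3, 5, 8, 8, 9, 13]
events = [1, 1, 0, 1, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
print(curve)
```

In practice one would use a statistical package (the abstract notes this is the default method in most of them); the sketch only shows why censored observations leave the risk set without lowering the curve.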
Statistical Methods for Cardiovascular Researchers
Moyé, Lem
2016-01-01
Rationale: Biostatistics continues to play an essential role in contemporary cardiovascular investigations, but successful implementation of biostatistical methods can be complex. Objective: To present the rationale behind statistical applications and to review useful tools for cardiology research. Methods and Results: Prospective declaration of the research question, clear methodology, and study execution that adheres to the protocol together serve as the critical foundation of a research endeavor. Both parametric and distribution-free measures of central tendency and dispersion are presented. T-testing, analysis of variance, and regression analyses are reviewed. Survival analysis, logistic regression, and interim monitoring are also discussed. Finally, common weaknesses in statistical analyses are considered. Conclusion: Biostatistics can be productively applied to cardiovascular research if investigators 1) develop and rely on a well-written protocol and analysis plan, 2) consult with a biostatistician when necessary, and 3) write results clearly, differentiating confirmatory from exploratory findings. PMID:26846639
The significance of the washout period in Preconditioning.
Salie, Ruduwaan; Lochner, Amanda; Loubser, Dirk J
2017-01-24
Exposure of the heart to 5 min global ischaemia (I) followed by 5 min reperfusion (R) (ischaemic preconditioning, IPC) or transient Beta 2-adrenergic receptor (B2-AR) stimulation with formoterol (B2PC), followed by 5 min washout before index ischaemia, elicits cardioprotection against subsequent sustained ischaemia. Since the washout period during preconditioning is essential for subsequent cardioprotection, the aim of this study was to investigate the involvement of protein kinase A (PKA), reactive oxygen species (ROS), extracellular signal-regulated kinase (ERK), PKB/Akt, p38 MAPK and c-jun N-terminal kinase (JNK) during this period. Isolated perfused rat hearts were exposed to IPC (1x5min I / 5min R) or B2PC (1x5min formoterol / 5min R) followed by 35 min regional ischaemia and reperfusion. Inhibitors of PKA (Rp-8CPT-cAMP) (16μM), ROS (NAC) (300μM), PKB (A-6730) (2.5μM), ERKp44/p42 (PD98059) (10μM), p38MAPK (SB239063) (1μM) or JNK (SP600125) (10μM) were administered for 5 minutes before 5 minutes global ischaemia / 5 min reperfusion (IPC), or for 5 minutes before and during administration of formoterol (B2PC), prior to regional ischaemia, reperfusion and infarct size (IS) determination. Hearts exposed to B2PC or IPC were freeze-clamped during the washout period for Western blot analysis of PKB, ERKp44/p42, p38MAPK and JNK. The PKA blocker abolished both B2PC and IPC, while NAC significantly increased IS of IPC but not of B2PC. Western blot analysis showed that ERKp44/p42 and PKB activation during washout was significantly increased after B2PC compared to IPC. IPC compared to B2PC showed significant p38MAPK and JNKp54/p46 activation. PKB and ERK inhibition or p38MAPK and JNK inhibition during the washout period of B2PC and IPC respectively significantly increased IS. PKA activation before regional ischaemia is a prerequisite for cardioprotection in both B2PC and IPC. However, ROS was crucial only in IPC. Kinase activation during the washout phase of IPC and B2
Statistical Perspectives on Stratospheric Transport
NASA Technical Reports Server (NTRS)
Sparling, L. C.
1999-01-01
Long-lived tropospheric source gases, such as nitrous oxide, enter the stratosphere through the tropical tropopause, are transported throughout the stratosphere by the Brewer-Dobson circulation, and are photochemically destroyed in the upper stratosphere. These chemical constituents, or "tracers," can be used to track mixing and transport by the stratospheric winds. Much of our understanding about the stratospheric circulation is based on large-scale gradients and other spatial features in tracer fields constructed from satellite measurements. The point of view presented in this paper is different, but complementary, in that transport is described in terms of tracer probability distribution functions (PDFs). The PDF is computed from the measurements, and is proportional to the area occupied by tracer values in a given range. The flavor of this paper is tutorial, and the ideas are illustrated with several examples of transport-related phenomena, annotated with remarks that summarize the main point or suggest new directions. One example shows how the multimodal shape of the PDF gives information about the different branches of the circulation. Another example shows how the statistics of fluctuations from the most probable tracer value give insight into mixing between different regions of the atmosphere. Also included is an analysis of the time-dependence of the PDF during the onset and decline of the winter circulation, and a study of how "bursts" in the circulation are reflected in transient periods of rapid evolution of the PDF. The dependence of the statistics on location and time is also shown to be important for practical problems related to statistical robustness and satellite sampling. The examples illustrate how physically-based statistical analysis can shed some light on aspects of stratospheric transport that may not be obvious or quantifiable with other types of analyses. An important motivation for the work presented here is the need for synthesis of the
Statistical Mechanics of Dynamical Systems
NASA Astrophysics Data System (ADS)
Mori, H.; Hata, H.; Horita, T.; Kobayashi, T.
A statistical-mechanical formalism of chaos based on the geometry of invariant sets in phase space is discussed to show that chaotic dynamical systems can be treated by a formalism analogous to that of thermodynamic systems if one takes a relevant coarse-grained quantity, but that their statistical laws are quite different from those of thermodynamic systems. This is a generalization of statistical mechanics for dealing with dissipative and hamiltonian (i.e., conservative) dynamical systems of a few degrees of freedom. Thus the sum of the local expansion rates of nearby orbits along a relevant orbit over a long but finite time has been introduced in order to describe and characterize (1) a drastic change of the structure of a chaotic attractor at a bifurcation and the associated anomalous phenomena, (2) a critical scaling of chaos in the neighborhood of a critical point for the bifurcation to a nonexotic state, and a self-similar temporal structure of a critical orbit on the critical 2^∞ attractor and the critical golden tori without mixing, and (3) the critical KAM torus, diffusion, and repeated sticking of a chaotic orbit to a critical torus in hamiltonian systems. Here a q-phase transition, analogous to the ferromagnetic phase transition, plays an important role. These phenomena are illustrated numerically and theoretically by treating the driven damped pendulum, the driven Duffing equation, the Hénon map, and the dissipative and conservative standard maps. This description of chaos breaks the time-reversal symmetry of hamiltonian dynamical laws, analogously to the statistical mechanics of irreversible processes. The broken time-reversal symmetry is brought about by the orbital instability of chaos.
Introduction to Modern Statistical Mechanics
NASA Astrophysics Data System (ADS)
Chandler, David
1987-09-01
Leading physical chemist David Chandler takes a new approach to statistical mechanics to provide the only introductory-level work on the modern topics of renormalization group theory, Monte Carlo simulations, time correlation functions, and liquid structure. The author provides compact summaries of the fundamentals of this branch of physics and discussions of many of its traditional elementary applications, interspersed with over 150 exercises and microcomputer programs.
[Pro Familia statistics for 1974].
1975-09-01
Statistics for 1974 for the West German family planning organization Pro Familia are reported. 56 offices are now operating, and 23,726 clients were seen. Men were seen more frequently than previously. 10,000 telephone calls were also handled. 16-25 year olds were increasingly represented in the clientele, as were unmarried persons of all ages. 1,242 patients were referred to physicians or clinics for clinical diagnosis.
The natural statistics of blur
Sprague, William W.; Cooper, Emily A.; Reissier, Sylvain; Yellapragada, Baladitya; Banks, Martin S.
2016-01-01
Blur from defocus can be both useful and detrimental for visual perception: It can be useful as a source of depth information and detrimental because it degrades image quality. We examined these aspects of blur by measuring the natural statistics of defocus blur across the visual field. Participants wore an eye-and-scene tracker that measured gaze direction, pupil diameter, and scene distances as they performed everyday tasks. We found that blur magnitude increases with increasing eccentricity. There is a vertical gradient in the distances that generate defocus blur: Blur below the fovea is generally due to scene points nearer than fixation; blur above the fovea is mostly due to points farther than fixation. There is no systematic horizontal gradient. Large blurs are generally caused by points farther rather than nearer than fixation. Consistent with the statistics, participants in a perceptual experiment perceived vertical blur gradients as slanted top-back whereas horizontal gradients were perceived equally as left-back and right-back. The tendency for people to see sharp as near and blurred as far is also consistent with the observed statistics. We calculated how many observations will be perceived as unsharp and found that perceptible blur is rare. Finally, we found that eye shape in ground-dwelling animals conforms to that required to put likely distances in best focus. PMID:27580043
Statistical inference and string theory
NASA Astrophysics Data System (ADS)
Heckman, Jonathan J.
2015-09-01
In this paper, we expose some surprising connections between string theory and statistical inference. We consider a large collective of agents sweeping out a family of nearby statistical models for an M-dimensional manifold of statistical fitting parameters. When the agents making nearby inferences align along a d-dimensional grid, we find that the pooled probability that the collective reaches a correct inference is the partition function of a nonlinear sigma model in d dimensions. Stability under perturbations to the original inference scheme requires the agents of the collective to distribute along two dimensions. Conformal invariance of the sigma model corresponds to the condition of a stable inference scheme, directly leading to the Einstein field equations for classical gravity. By summing over all possible arrangements of the agents in the collective, we reach a string theory. We also use this perspective to quantify how much an observer can hope to learn about the internal geometry of a superstring compactification. Finally, we present some brief speculative remarks on applications to the AdS/CFT correspondence and Lorentzian signature space-times.
Statistical mechanics of economics I
NASA Astrophysics Data System (ADS)
Kusmartsev, F. V.
2011-02-01
We show that statistical mechanics is useful in the description of financial crises and economics. Taking a large number of instant snapshots of a market over an interval of time, we construct ensembles of them and study their statistical inference. This results in a probabilistic description of the market and gives capital, money, income, wealth and debt distributions, which in most cases take the form of the Bose-Einstein distribution. In addition, statistical mechanics provides the main market equations and laws which govern the correlations between the amount of money, debt, product, prices and number of retailers. We applied the relations found to a study of the evolution of the US economy between 1996 and 2008 and observe that over that time the income of the majority of the population is well described by a Bose-Einstein distribution whose parameters differ from year to year. Each financial crisis corresponds to a peak in the absolute activity coefficient. The analysis correctly indicates past crises and predicts the next one.
The Reverse Statistical Disclosure Attack
NASA Astrophysics Data System (ADS)
Mallesh, Nayantara; Wright, Matthew
Statistical disclosure is a well-studied technique that an attacker can use to uncover relations between users in mix-based anonymity systems. Prior work has focused on finding the receivers to whom a given targeted user sends. In this paper, we investigate the effectiveness of statistical disclosure in finding all of a user's contacts, including those from whom she receives messages. To this end, we propose a new attack called the Reverse Statistical Disclosure Attack (RSDA). RSDA uses observations of all users' sending patterns to estimate both the targeted user's sending pattern and her receiving pattern. The estimated patterns are combined to find a set of the targeted user's most likely contacts. We study the performance of RSDA in simulation using different mix network configurations and also study the effectiveness of cover traffic as a countermeasure. Our results show that RSDA outperforms the traditional SDA in finding the user's contacts, particularly as the amounts of user traffic and cover traffic rise.
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
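The detrended fluctuation analysis (DFA) the abstract applies can be sketched compactly: integrate the mean-subtracted series, remove a least-squares linear trend in windows of size n, and measure the RMS residual F(n); the slope of log F(n) versus log n is the scaling exponent. The sketch below uses a toy sequence of uncorrelated ±1 steps (for which the exponent should be near 0.5), not GenBank data, and estimates the slope from just two window sizes for brevity.

```python
# DFA sketch on a toy uncorrelated sequence (not DNA data).
import math
import random

def dfa_fluctuation(x, n):
    """RMS fluctuation F(n) after linear detrending in non-overlapping windows of size n."""
    mean = sum(x) / len(x)
    y, s = [], 0.0
    for v in x:               # step 1: integrated "profile" of the series
        s += v - mean
        y.append(s)
    sq_sum, count = 0.0, 0
    for start in range(0, len(y) - n + 1, n):  # step 2: detrend each window
        w = y[start:start + n]
        t = list(range(n))
        tm, wm = (n - 1) / 2.0, sum(w) / n
        denom = sum((ti - tm) ** 2 for ti in t)
        slope = sum((ti - tm) * (wi - wm) for ti, wi in zip(t, w)) / denom
        for ti, wi in zip(t, w):
            resid = wi - (wm + slope * (ti - tm))
            sq_sum += resid ** 2
            count += 1
    return math.sqrt(sq_sum / count)

random.seed(0)
steps = [random.choice([-1, 1]) for _ in range(4096)]  # uncorrelated steps
f4, f64 = dfa_fluctuation(steps, 4), dfa_fluctuation(steps, 64)
alpha = (math.log(f64) - math.log(f4)) / (math.log(64) - math.log(4))
print(round(alpha, 2))  # expected near 0.5 for uncorrelated noise
```

Long-range correlated sequences, such as the non-coding regions discussed in the abstract, would yield an exponent above 0.5; a real analysis fits F(n) across many window sizes, not two.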
Redshift data and statistical inference
NASA Technical Reports Server (NTRS)
Newman, William I.; Haynes, Martha P.; Terzian, Yervant
1994-01-01
Frequency histograms and the 'power spectrum analysis' (PSA) method, the latter developed by Yu & Peebles (1969), have been widely employed as techniques for establishing the existence of periodicities. We provide a formal analysis of these two classes of methods, including controlled numerical experiments, to better understand their proper use and application. In particular, we note that typical published applications of frequency histograms commonly employ far greater numbers of class intervals or bins than is advisable by statistical theory, sometimes giving rise to the appearance of spurious patterns. The PSA method generates a sequence of random numbers from observational data which, it is claimed, is exponentially distributed with unit mean and variance, essentially independent of the distribution of the original data. We show that the derived random process is nonstationary and produces a small but systematic bias in the usual estimate of the mean and variance. Although the derived variable may be reasonably described by an exponential distribution, the tail of the distribution is far removed from that of an exponential, thereby rendering statistical inference and confidence testing based on the tail of the distribution completely unreliable. Finally, we examine a number of astronomical examples wherein these methods have been used, giving rise to widespread acceptance of statistically unconfirmed conclusions.
Quantum Estimation, meet Computational Statistics; Computational Statistics, meet Quantum Estimation
NASA Astrophysics Data System (ADS)
Ferrie, Chris; Granade, Chris; Combes, Joshua
2013-03-01
Quantum estimation, that is, post processing data to obtain classical descriptions of quantum states and processes, is an intractable problem--scaling exponentially with the number of interacting systems. Thankfully there is an entire field, Computational Statistics, devoted to designing algorithms to estimate probabilities for seemingly intractable problems. So, why not look to the most advanced machine learning algorithms for quantum estimation tasks? We did. I'll describe how we adapted and combined machine learning methodologies to obtain an online learning algorithm designed to estimate quantum states and processes.
What can we learn from noise? - Mesoscopic nonequilibrium statistical physics.
Kobayashi, Kensuke
2016-01-01
Mesoscopic systems - small electric circuits working in the quantum regime - offer us a unique experimental stage to explore quantum transport in a tunable and precise way. The purpose of this Review is to show how they can contribute to statistical physics. We introduce the significance of fluctuation, or equivalently noise, as noise measurement enables us to address the fundamental aspects of a physical system. The significance of the fluctuation theorem (FT) in statistical physics is noted. We explain what information can be deduced from current noise measurements in mesoscopic systems. As an important application of noise measurement to statistical physics, we describe our experimental work on the current and current noise in an electron interferometer, which is the first experimental test of the FT in the quantum regime. Our attempt will shed new light on the research field of mesoscopic quantum statistical physics.
Insights into Corona Formation through Statistical Analyses
NASA Technical Reports Server (NTRS)
Glaze, L. S.; Stofan, E. R.; Smrekar, S. E.; Baloga, S. M.
2002-01-01
Statistical analysis of an expanded database of coronae on Venus indicates that the populations of Type 1 (with fracture annuli) and Type 2 (without fracture annuli) corona diameters are statistically indistinguishable, and therefore we have no basis for assuming different formation mechanisms. Analysis of the topography and diameters of coronae shows that coronae that are depressions, rimmed depressions, and domes tend to be significantly smaller than those that are plateaus, rimmed plateaus, or domes with surrounding rims. This is consistent with the model of Smrekar and Stofan and inconsistent with predictions of the spreading drop model of Koch and Manga. The diameter range for domes, the initial stage of corona formation, provides a broad constraint on the buoyancy of corona-forming plumes. Coronae are only slightly more likely to be topographically raised than depressions, with Type 1 coronae most frequently occurring as rimmed depressions and Type 2 coronae most frequently occurring with flat interiors and raised rims. Most Type 1 coronae are located along chasmata systems or fracture belts, while Type 2 coronae are found predominantly as isolated features in the plains. Coronae at hotspot rises tend to be significantly larger than coronae in other settings, consistent with a hotter upper mantle at hotspot rises and their active state.
Statistical methods of estimating mining costs
Long, K.R.
2011-01-01
Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.
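"Taylor's Rule," which the abstract says was re-estimated as the first step, relates a deposit's tonnage to its expected mine life and hence its operating rate. The sketch below uses the commonly quoted classic coefficients (life in years ≈ 0.2 × tonnage^0.25, with an assumed 350 operating days per year), not the USGS re-estimates, so the numbers are purely illustrative.

```python
# Classic Taylor's Rule sketch; coefficients are the textbook values,
# not the re-estimated ones described in the abstract.
def taylor_mine_life_years(tonnage_tonnes):
    """Expected mine life from ore tonnage (classic form: 0.2 * T^0.25)."""
    return 0.2 * tonnage_tonnes ** 0.25

def taylor_daily_rate_tonnes(tonnage_tonnes, operating_days=350):
    """Implied operating rate: total tonnage spread over the expected life."""
    life = taylor_mine_life_years(tonnage_tonnes)
    return tonnage_tonnes / (life * operating_days)

# Example: a 100-million-tonne deposit
T = 100e6
life = taylor_mine_life_years(T)
rate = taylor_daily_rate_tonnes(T)
print(round(life, 1))   # 20.0 years
print(round(rate))      # 14286 tonnes per day
```

The point of the rule is the 0.25 exponent: a 16-fold increase in tonnage only doubles the expected mine life, so the operating rate grows roughly as T^0.75.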
Further developments in cloud statistics for computer simulations
NASA Technical Reports Server (NTRS)
Chang, D. T.; Willand, J. H.
1972-01-01
This study is a part of NASA's continued program to provide global statistics of cloud parameters for computer simulation. The primary emphasis was on the development of the data bank of the global statistical distributions of cloud types and cloud layers and their applications in the simulation of the vertical distributions of in-cloud parameters such as liquid water content. These statistics were compiled from actual surface observations as recorded in Standard WBAN forms. Data for a total of 19 stations were obtained and reduced. These stations were selected to be representative of the 19 primary cloud climatological regions defined in previous studies of cloud statistics. Using the data compiled in this study, a limited study was conducted of the homogeneity of cloud regions, the latitudinal dependence of cloud-type distributions, the dependence of these statistics on sample size, and other factors in the statistics which are of significance to the problem of simulation. The application of the statistics in cloud simulation was investigated. In particular, the inclusion of the new statistics in an expanded multi-step Monte Carlo simulation scheme is suggested and briefly outlined.
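The elementary step in the kind of Monte Carlo cloud simulation the abstract suggests is drawing a cloud type from an empirical frequency table. The sketch below shows that step with an inverse-CDF draw over a categorical distribution; the frequencies are made up for illustration, not the WBAN-derived statistics.

```python
# Categorical sampling sketch; frequencies are hypothetical, not WBAN statistics.
import random

def sample_cloud_type(freqs, rng):
    """Draw one cloud type from a {type: probability} table (inverse-CDF draw)."""
    r, acc = rng.random(), 0.0
    for cloud_type, p in freqs.items():
        acc += p
        if r <= acc:
            return cloud_type
    return cloud_type  # guard against floating-point shortfall in the sum

freqs = {"cirrus": 0.25, "stratus": 0.35, "cumulus": 0.30, "clear": 0.10}
rng = random.Random(42)
draws = [sample_cloud_type(freqs, rng) for _ in range(10000)]
print(draws.count("stratus") / len(draws))  # should land near 0.35
```

A multi-step scheme would chain such draws, conditioning each layer's type and in-cloud parameters (e.g. liquid water content) on the types already drawn for the layers below.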
Application of parametric statistical weights in CAD imaging systems
NASA Astrophysics Data System (ADS)
Galperin, Michael
2005-04-01
PURPOSE: To propose a method for Parametric Statistical Weight (PSW) estimation and to analyze its statistical impact in Computer-Aided Diagnosis Imaging Systems based on a Relative Similarity (CADIRS) classification approach. MATERIALS AND METHODS: A multifactor statistical method was developed and applied to Parametric Statistical Weight calculations in CADIRS. The implemented PSW method was used for statistical estimation of the PSW impact when applied to a clinically validated breast ultrasound digital database of 332 patients' cases with biopsy-proven findings. The method is based on the assumption that each parameter used in the Relative Similarity (RS) classifier contributes to the deviation of the diagnostic prediction proportionally to the normalized value of its coefficient of multiple regression. The Relative Similarity values calculated by CADIRS with and without PSW were statistically estimated, compared, and analyzed (on a subset of cases) using classic Receiver Operating Characteristic (ROC) analysis methods. RESULTS: When the CADIRS classification scheme was augmented with PSW, the calculated Relative Similarity values were 2-5% higher on average. Numeric estimation of PSW allowed decomposition of the statistical significance of each component (factor) and its impact on similarity to the diagnostic (biopsy-proven) results. CONCLUSION: Parametric Statistical Weights in Computer-Aided Diagnosis Imaging Systems based on a Relative Similarity classification approach can be successfully applied to enhance overall classification (including scoring) outcomes. For the analyzed cohort of 332 cases, the application of PSW increased Relative Similarity to the retrieved templates with known findings by 2-5% on average.
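The core assumption stated in the abstract — each parameter's weight is proportional to the normalized value of its multiple-regression coefficient — can be sketched in a few lines. The parameter names and coefficient values below are hypothetical, not from the CADIRS database.

```python
# Sketch of the stated weighting assumption; coefficients are hypothetical.
def parametric_weights(coefficients):
    """Normalize absolute regression-coefficient magnitudes into weights summing to 1."""
    total = sum(abs(c) for c in coefficients.values())
    return {name: abs(c) / total for name, c in coefficients.items()}

# Hypothetical multiple-regression coefficients for three image features
coeffs = {"margin": 0.42, "echo_texture": -0.21, "shape": 0.07}
weights = parametric_weights(coeffs)
print(weights)  # margin dominates: 0.6 vs 0.3 vs 0.1
```

A weighted similarity score would then combine per-feature similarities with these weights, so features whose coefficients explain more of the diagnostic deviation contribute more to the Relative Similarity value.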
Statistical modelling of citation exchange between statistics journals.
Varin, Cristiano; Cattelan, Manuela; Firth, David
2016-01-01
Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation 'export scores' within the discipline of statistics.
Statistical anisotropies in gravitational waves in solid inflation
Akhshik, Mohammad; Emami, Razieh; Firouzjahi, Hassan; Wang, Yi E-mail: emami@ipm.ir E-mail: yw366@cam.ac.uk
2014-09-01
Solid inflation can support a long period of anisotropic inflation. We calculate the statistical anisotropies in the scalar and tensor power spectra and their cross-correlation in anisotropic solid inflation. The tensor-scalar cross-correlation can be either positive or negative, and it impacts the statistical anisotropies of the TT and TB spectra in the CMB map more significantly than the tensor self-correlation does. The tensor power spectrum contains potentially comparable contributions from quadrupole and octopole angular patterns, in contrast to the scalar power spectrum, the cross-correlation, and the scalar bispectrum, where the quadrupole-type statistical anisotropy dominates over the octopole.
Statistics of contractive cracking patterns. [frozen soil-water rheology]
NASA Technical Reports Server (NTRS)
Noever, David A.
1991-01-01
The statistics of convective soil patterns are analyzed using statistical crystallography. An underlying hierarchy of order is found to span four orders of magnitude in characteristic pattern length. Strict mathematical requirements determine the two-dimensional (2D) topology, such that random partitioning of space yields a predictable statistical geometry for polygons. For all lengths, Aboav's and Lewis's laws are verified; this result is consistent both with the need to fill 2D space and most significantly with energy carried not by the patterns' interior, but by the boundaries. Together, this suggests a common mechanism of formation for both micro- and macro-freezing patterns.
PSD Significance Levels for Monitoring
This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.
Where boosted significances come from
NASA Astrophysics Data System (ADS)
Plehn, Tilman; Schichtel, Peter; Wiegand, Daniel
2014-03-01
In an era of increasingly advanced experimental analysis techniques it is crucial to understand which phase space regions contribute to the extraction of a signal from backgrounds. Based on the Neyman-Pearson lemma we compute the maximum significance for a signal extraction as an integral over phase space regions. We then study to what degree boosted Higgs strategies benefit ZH and tt¯H searches and which transverse momenta of the Higgs are most promising. We find that Higgs and top taggers are the appropriate tools, but that they would profit from a targeted optimization towards smaller transverse momenta. MadMax is available as an add-on to MadGraph 5.
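The idea of a maximum significance built up from phase space regions can be illustrated with a toy counting analysis. This is not the MadMax code: the binning in Higgs transverse momentum and the exponential signal and background shapes are invented for the example. In the Gaussian limit the best-possible expected significance is sqrt(sum over bins of s_i^2/b_i), and the per-bin terms show which regions of phase space drive it.

```python
import numpy as np

# Hypothetical differential rates binned in Higgs transverse momentum.
bins = np.linspace(0, 500, 51)
centers = 0.5 * (bins[:-1] + bins[1:])
signal = 50 * np.exp(-centers / 150)       # falls more slowly than background
background = 500 * np.exp(-centers / 60)

# Per-region contribution to the (squared) maximum significance.
per_bin = signal**2 / background

# Gaussian-limit maximum significance, summed over phase space regions.
max_significance = np.sqrt(per_bin.sum())
```

With these assumed shapes the largest `per_bin` contributions come from the high transverse-momentum (boosted) bins, mirroring the paper's question of which momenta are most promising.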
The use and misuse of statistics in space physics
NASA Technical Reports Server (NTRS)
Reiff, Patricia H.
1990-01-01
This paper presents the statistical techniques most commonly used in space physics, including Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superimposed epoch analysis, together with tests to assess the significance of the results. Newer techniques such as bootstrapping and jackknifing are also presented. Where no test of significance is in common usage, a plausible test is suggested.
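The bootstrap mentioned above can be sketched briefly: resample the data with replacement to estimate the sampling distribution of a statistic without parametric assumptions. The statistic chosen here (a linear correlation coefficient) and the synthetic data are assumptions for illustration, not an example from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)          # correlated synthetic measurements
r_obs = np.corrcoef(x, y)[0, 1]           # observed linear correlation

# Bootstrap: resample (x, y) pairs with replacement, recompute the statistic.
boot = np.empty(2000)
for i in range(2000):
    idx = rng.integers(0, n, size=n)
    boot[i] = np.corrcoef(x[idx], y[idx])[0, 1]

# 95% bootstrap confidence interval for the correlation.
lo, hi = np.percentile(boot, [2.5, 97.5])
```

If the interval excludes zero, the correlation is significant at roughly the 5% level, which is the kind of distribution-free significance assessment the bootstrap provides when no standard test applies.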
Public health significance of neuroticism.
Lahey, Benjamin B
2009-01-01
The personality trait of neuroticism refers to relatively stable tendencies to respond with negative emotions to threat, frustration, or loss. Individuals in the population vary markedly on this trait, ranging from frequent and intense emotional reactions to minor challenges to little emotional reaction even in the face of significant difficulties. Although not widely appreciated, there is growing evidence that neuroticism is a psychological trait of profound public health significance. Neuroticism is a robust correlate and predictor of many different mental and physical disorders, comorbidity among them, and the frequency of mental and general health service use. Indeed, neuroticism apparently is a predictor of the quality and longevity of our lives. Achieving a full understanding of the nature and origins of neuroticism, and the mechanisms through which neuroticism is linked to mental and physical disorders, should be a top priority for research. Knowing why neuroticism predicts such a wide variety of seemingly diverse outcomes should lead to improved understanding of commonalities among those outcomes and improved strategies for preventing them.
Significance of biofilms in dentistry.
Wróblewska, Marta; Strużycka, Izabela; Mierzwińska-Nastalska, Elżbieta
2015-01-01
In the past decades significant scientific progress has been made in the knowledge of biofilms. They constitute multilayer conglomerates of bacteria and fungi, surrounded by carbohydrates which they produce, as well as by substances derived from saliva and gingival fluid. Modern techniques have revealed significant diversity of the biofilm environment and a system of microbial communication (quorum sensing) that enhances their survival. At present it is believed that the majority of infections, particularly chronic infections with exacerbations, result from biofilm formation, especially in the presence of biomaterials. It should be emphasised that penetration of antibiotics and other antimicrobial agents into the deeper layers of a biofilm is poor, causing therapeutic problems and sometimes necessitating removal of the implant or prosthesis. Biofilms play an increasing role in dentistry as a result of the increasingly broad use of plastic and implantable materials in dental practice. Biofilms form on the surfaces of teeth as dental plaque, in the paranasal sinuses, on prostheses and dental implants, as well as in the waterlines of dental units, posing a particular risk for severely immunocompromised patients. New methods for the therapy and prevention of biofilm-associated infections are under development.
21 CFR 820.250 - Statistical techniques.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Statistical techniques. 820.250 Section 820.250...) MEDICAL DEVICES QUALITY SYSTEM REGULATION Statistical Techniques § 820.250 Statistical techniques. (a... statistical techniques required for establishing, controlling, and verifying the acceptability of...
21 CFR 820.250 - Statistical techniques.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Statistical techniques. 820.250 Section 820.250...) MEDICAL DEVICES QUALITY SYSTEM REGULATION Statistical Techniques § 820.250 Statistical techniques. (a... statistical techniques required for establishing, controlling, and verifying the acceptability of...