Statistical Significance Testing.
ERIC Educational Resources Information Center
McLean, James E., Ed.; Kaufman, Alan S., Ed.
1998-01-01
The controversy about the use or misuse of statistical significance testing has become the major methodological issue in educational research. This special issue contains three articles that explore the controversy, three commentaries on these articles, an overall response, and three rejoinders by the first three authors. They are: (1)…
Statistical or biological significance?
Saxon, Emma
2015-01-01
Oat plants grown at an agricultural research facility produce higher yields in Field 1 than in Field 2, under well fertilised conditions and with similar weather exposure; all oat plants in both fields are healthy and show no sign of disease. In this study, the authors hypothesised that the soil microbial community might be different in each field, and these differences might explain the difference in oat plant growth. They carried out a metagenomic analysis of the 16S ribosomal 'signature' sequences from bacteria in 50 randomly located soil samples in each field to determine the composition of the bacterial community. The study identified >1000 species, most of which were present in both fields. The authors identified two plant growth-promoting species that were significantly reduced in soil from Field 2 (Student's t-test P < 0.05), and concluded that these species might have contributed to reduced yield. PMID:26541972
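The group comparison described above (a Student's t-test at P < 0.05 on species abundance) can be sketched with a permutation test, which reaches the same kind of conclusion without distributional assumptions; the read counts below are invented for illustration:

```python
import random
import statistics

def perm_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:  # permuted split at least as extreme as observed
            hits += 1
    return hits / n_perm

# Hypothetical read counts of one growth-promoting species, per soil sample
field1 = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
field2 = [9, 10, 8, 11, 9, 10, 12, 8, 9, 10]
print(perm_test(field1, field2))  # small p: abundance differs between fields
```

A permutation p-value near zero here plays the same role as the study's "P < 0.05"; whether the difference matters biologically is a separate question, as the next record argues.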
Statistically significant relational data mining :
Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.
2014-02-01
This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs, and to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model; statisticians favor such models for graph analysis, and the new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.
Statistical significance of the gallium anomaly
Giunti, Carlo; Laveder, Marco
2011-06-15
We calculate the statistical significance of the anomalous deficit of electron neutrinos measured in the radioactive source experiments of the GALLEX and SAGE solar neutrino detectors, taking into account the uncertainty of the detection cross section. We find that the statistical significance of the anomaly is ≈3.0σ. A fit of the data in terms of neutrino oscillations favors, at ≈2.7σ, short-baseline electron neutrino disappearance with respect to the null hypothesis of no oscillations.
The insignificance of statistical significance testing
Johnson, Douglas H.
1999-01-01
Despite their use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
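The distinction Johnson draws between statistical and biological significance can be shown numerically: with a large enough sample, a biologically trivial effect becomes statistically significant. A minimal sketch using a one-sample z-test (normal approximation, standard deviation assumed known; the effect size is invented):

```python
import math

def two_sided_p(mean_diff, sd, n):
    """p-value for a one-sample z-test of a mean difference against zero."""
    z = mean_diff / (sd / math.sqrt(n))
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))) is the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# A biologically trivial effect of 0.1 standard deviation units:
tiny_effect, sd = 0.1, 1.0
print(two_sided_p(tiny_effect, sd, 25))     # modest n: not significant
print(two_sided_p(tiny_effect, sd, 10000))  # huge n: highly "significant"
```

The effect is identical in both calls; only the sample size changes the verdict, which is exactly why the abstract recommends estimation and confidence intervals over bare P-values.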
Statistical Significance vs. Practical Significance: An Exploration through Health Education
ERIC Educational Resources Information Center
Rosen, Brittany L.; DeMaria, Andrea L.
2012-01-01
The purpose of this paper is to examine the differences between statistical and practical significance, including strengths and criticisms of both methods, as well as provide information surrounding the application of various effect sizes and confidence intervals within health education research. Provided are recommendations, explanations and…
Understanding Statistical Significance: A Conceptual History.
ERIC Educational Resources Information Center
Little, Joseph
2001-01-01
Considers how if literacy is envisioned as a sort of competence in a set of social and intellectual practices, then scientific literacy must encompass the realization that "statistical significance," the cardinal arbiter of social scientific knowledge, was not born out of an immanent logic of mathematics but socially constructed and reconstructed…
Determining the Statistical Significance of Relative Weights
ERIC Educational Resources Information Center
Tonidandel, Scott; LeBreton, James M.; Johnson, Jeff W.
2009-01-01
Relative weight analysis is a procedure for estimating the relative importance of correlated predictors in a regression equation. Because the sampling distribution of relative weights is unknown, researchers using relative weight analysis are unable to make judgments regarding the statistical significance of the relative weights. J. W. Johnson…
Comments on the Statistical Significance Testing Articles.
ERIC Educational Resources Information Center
Knapp, Thomas R.
1998-01-01
Expresses a "middle-of-the-road" position on statistical significance testing, suggesting that it has its place but that confidence intervals are generally more useful. Identifies 10 errors of omission or commission in the papers reviewed that weaken the positions taken in their discussions. (SLD)
Social significance of community structure: Statistical view
NASA Astrophysics Data System (ADS)
Li, Hui-Jia; Daniels, Jasmine J.
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p-value theory and network analysis, and then we obtain a significance measure of statistical form. Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc.
Statistical Significance of Clustering using Soft Thresholding
Huang, Hanwen; Liu, Yufeng; Yuan, Ming; Marron, J. S.
2015-01-01
Clustering methods have led to a number of important discoveries in bioinformatics and beyond. A major challenge in their use is determining which clusters represent important underlying structure, as opposed to spurious sampling artifacts. This challenge is especially serious, and very few methods are available, when the data are very high in dimension. Statistical Significance of Clustering (SigClust) is a recently developed cluster evaluation tool for high dimensional low sample size data. An important component of the SigClust approach is the very definition of a single cluster as a subset of data sampled from a multivariate Gaussian distribution. The implementation of SigClust requires the estimation of the eigenvalues of the covariance matrix for the null multivariate Gaussian distribution. We show that the original eigenvalue estimation can lead to a test that suffers from severe inflation of type-I error, in the important case where there are a few very large eigenvalues. This paper addresses this critical challenge using a novel likelihood based soft thresholding approach to estimate these eigenvalues, which leads to a much improved SigClust. Major improvements in SigClust performance are shown by both mathematical analysis, based on the new notion of Theoretical Cluster Index, and extensive simulation studies. Applications to some cancer genomic data further demonstrate the usefulness of these improvements. PMID:26755893
The Use of Meta-Analytic Statistical Significance Testing
ERIC Educational Resources Information Center
Polanin, Joshua R.; Pigott, Terri D.
2015-01-01
Meta-analysis multiplicity, the concept of conducting multiple tests of statistical significance within one review, is an underdeveloped literature. We address this issue by considering how Type I errors can impact meta-analytic results, suggest how statistical power may be affected through the use of multiplicity corrections, and propose how…
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
ERIC Educational Resources Information Center
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Advances in Testing the Statistical Significance of Mediation Effects
ERIC Educational Resources Information Center
Mallinckrodt, Brent; Abraham, W. Todd; Wei, Meifen; Russell, Daniel W.
2006-01-01
P. A. Frazier, A. P. Tix, and K. E. Barron (2004) highlighted a normal theory method popularized by R. M. Baron and D. A. Kenny (1986) for testing the statistical significance of indirect effects (i.e., mediator variables) in multiple regression contexts. However, simulation studies suggest that this method lacks statistical power relative to some…
The questioned p value: clinical, practical and statistical significance.
Jiménez-Paneque, Rosa
2016-01-01
The use of the p-value and statistical significance has been questioned from the early 1980s to the present day. Much has been written about it in the field of statistics and its applications, especially in epidemiology and public health. Indeed, the p-value and its equivalent, statistical significance, are difficult concepts to grasp for the many health professionals involved in some way in research applied to their work areas. However, their meaning should be clear in intuitive terms even though they rest on theoretical concepts from the field of statistics. This paper attempts to present the p-value as a concept that applies to everyday life and is therefore intuitively simple, but whose proper use cannot be separated from theoretical and methodological elements of inherent complexity. The reasons behind the criticism of the p-value and its isolated use are explained intuitively, mainly the need to distinguish statistical significance from clinical significance, and some of the recommended remedies for these problems are discussed as well. The paper ends with the current trend to vindicate the p-value by appealing to the convenience of its use in certain situations, and with the recent statement of the American Statistical Association in this regard. PMID:27636600
Has Testing for Statistical Significance Outlived Its Usefulness?
ERIC Educational Resources Information Center
McLean, James E.; Ernest, James M.
The research methodology literature in recent years has included a full frontal assault on statistical significance testing. An entire edition of "Experimental Education" explored this controversy. The purpose of this paper is to promote the position that while significance testing by itself may be flawed, it has not outlived its usefulness.…
On detection and assessment of statistical significance of Genomic Islands
Chatterjee, Raghunath; Chaudhuri, Keya; Chaudhuri, Probal
2008-01-01
Background Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in GC content or codon and amino acid usage of the islands. However, these methods either do not use any formal statistical test of significance or use statistical tests for which the critical values and the P-values are not adequately justified. We propose a method, which is unsupervised in nature and uses Monte-Carlo statistical tests based on randomly selected segments of a chromosome. Such tests are supported by precise statistical distribution theory, and consequently, the resulting P-values are quite reliable for making the decision. Results Our algorithm (named Design-Island, an acronym for Detection of Statistically Significant Genomic Island) runs in two phases. Some 'putative GIs' are identified in the first phase, and those are refined into smaller segments containing horizontally acquired genes in the refinement phase. This method is applied to Salmonella typhi CT18 genome leading to the discovery of several new pathogenicity, antibiotic resistance and metabolic islands that were missed by earlier methods. Many of these islands contain mobile genetic elements like phage-mediated genes, transposons, integrase and IS elements confirming their horizontal acquirement. Conclusion The proposed method is based on statistical tests supported by precise distribution theory and reliable P-values along with a technique for visualizing statistically significant islands. The performance of our method is better than many other well known methods in terms of their sensitivity and accuracy, and in terms of specificity, it is comparable to other methods. PMID:18380895
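The Monte Carlo idea behind Design-Island, testing a candidate window against randomly selected segments of the same chromosome, can be sketched on a toy genome. The GC-content statistic, sequence lengths, and seed below are illustrative assumptions, not the authors' actual implementation:

```python
import random

rng = random.Random(7)

def make_seq(n, gc_frac):
    """Random DNA with roughly the requested GC fraction."""
    return "".join(rng.choice("GC") if rng.random() < gc_frac else rng.choice("AT")
                   for _ in range(n))

def gc(seq):
    """GC fraction of a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)

def monte_carlo_p(genome, start, length, n_rand=999):
    """How often does a randomly placed segment deviate from the
    genome-wide GC content at least as much as the window under test?"""
    base = gc(genome)
    observed = abs(gc(genome[start:start + length]) - base)
    hits = 0
    for _ in range(n_rand):
        s = rng.randrange(len(genome) - length)
        if abs(gc(genome[s:s + length]) - base) >= observed:
            hits += 1
    return (hits + 1) / (n_rand + 1)   # conservative Monte Carlo p-value

# Toy genome: neutral background with one high-GC "island" inserted
genome = make_seq(5000, 0.5) + make_seq(500, 0.85) + make_seq(4500, 0.5)
p_island = monte_carlo_p(genome, 5000, 500)      # the inserted island
p_background = monte_carlo_p(genome, 1000, 500)  # an ordinary window
print(p_island, p_background)
```

Because the null distribution is built from the chromosome itself, the resulting p-value has the "precise distribution theory" flavor the abstract describes, without a training dataset.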
A Comparison of Statistical Significance Tests for Selecting Equating Functions
ERIC Educational Resources Information Center
Moses, Tim
2009-01-01
This study compared the accuracies of nine previously proposed statistical significance tests for selecting identity, linear, and equipercentile equating functions in an equivalent groups equating design. The strategies included likelihood ratio tests for the loglinear models of tests' frequency distributions, regression tests, Kolmogorov-Smirnov…
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurring within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Estimation of the geochemical threshold and its statistical significance
Miesch, A.T.
1981-01-01
A statistic is proposed for estimating the geochemical threshold and its statistical significance, or it may be used to identify a group of extreme values that can be tested for significance by other means. The statistic is the maximum gap between adjacent values in an ordered array after each gap has been adjusted for the expected frequency. The values in the ordered array are geochemical values transformed by either ln(x − α) or ln(β − x) and then standardized so that the mean is zero and the variance is unity. The expected frequency is taken from a fitted normal curve with unit area. The midpoint of an adjusted gap that exceeds the corresponding critical value may be taken as an estimate of the geochemical threshold, and the associated probability indicates the likelihood that the threshold separates two geochemical populations. The adjusted gap test may fail to identify threshold values if the variation tends to be continuous from background values to the higher values that reflect mineralized ground. However, the test will serve to identify other anomalies that may be too subtle to have been noted by other means.
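The core of the statistic, the maximum gap between ordered standardized values after weighting each gap by the expected normal frequency at its midpoint, can be sketched as follows (omitting the log transforms and the formal critical-value test; the data are invented):

```python
import math
import statistics

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def adjusted_gap_threshold(values):
    """Standardize and order the values, weight each gap by the expected
    normal density at its midpoint, and return the midpoint of the largest
    adjusted gap as the threshold estimate (in original units)."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    z = sorted((v - mu) / sd for v in values)
    best_adj, best_mid = -1.0, None
    for lo, hi in zip(z, z[1:]):
        mid = (lo + hi) / 2
        adj = (hi - lo) * normal_pdf(mid)   # gap adjusted for expected frequency
        if adj > best_adj:
            best_adj, best_mid = adj, mid
    return best_mid * sd + mu               # back-transform the midpoint

background = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0, 1.15, 0.95]
anomalous = [3.5, 3.8]
print(adjusted_gap_threshold(background + anomalous))  # falls between the groups
```

The density weighting matters: a gap of the same width counts for more near the center of the fitted normal, where many values are expected, than far out in the tail.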
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance
Kramer, Karen L.; Veile, Amanda; Otárola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children’s growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children’s monthly gains in height and weight from weaning to age five in a high-fertility Maya community. We predict that: 1) as an aggregate measure, family size will not have a major impact on child growth during the post-weaning period; 2) competition from young siblings will negatively impact child growth during the post-weaning period; 3) because of their economic value, however, older siblings will have a negligible effect on young children’s growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children’s growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children’s growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance. PMID:26938742
Fostering Students' Statistical Literacy through Significant Learning Experience
ERIC Educational Resources Information Center
Krishnan, Saras
2015-01-01
A major objective of statistics education is to develop students' statistical literacy that enables them to be educated users of data in context. Teaching statistics in today's educational settings is not an easy feat because teachers have a huge task in keeping up with the demands of the new generation of learners. The present day students have…
A Tutorial on Hunting Statistical Significance by Chasing N
Szucs, Denes
2016-01-01
There is increasing concern about the replicability of studies in psychology and cognitive neuroscience. Hidden data dredging (also called p-hacking) is a major contributor to this crisis because it substantially increases Type I error, resulting in a much larger proportion of false positive findings than the usually expected 5%. To build better intuition for avoiding, detecting, and criticizing some typical problems, here I systematically illustrate the large impact of some easy-to-implement, and therefore perhaps frequent, data dredging techniques on boosting false positive findings. I illustrate several forms of two special cases of data dredging. First, researchers may violate the data collection stopping rules of null hypothesis significance testing by repeatedly checking for statistical significance with various numbers of participants. Second, researchers may group participants post hoc along potential but unplanned independent grouping variables. The first approach ‘hacks’ the number of participants in studies; the second approach ‘hacks’ the number of variables in the analysis. I demonstrate the high amount of false positive findings generated by these techniques with data from true null distributions. I also illustrate that it is extremely easy to introduce strong bias into data by very mild selection and re-testing. Similar, usually undocumented data dredging steps can easily lead to 20–50% or more false positives. PMID:27713723
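The first technique, repeatedly checking for significance while collecting data, is easy to simulate. A sketch assuming a one-sample z-test with known unit variance, five interim looks, and a true null (zero mean); the look schedule and trial count are arbitrary choices:

```python
import math
import random

def p_value(sample):
    """Two-sided one-sample z-test of a zero mean, sd assumed known (= 1)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def hacked_experiment(rng, looks=(10, 20, 30, 40, 50)):
    """Declare 'significance' the first time p < .05 at any interim look."""
    data = []
    for n in looks:
        data += [rng.gauss(0, 1) for _ in range(n - len(data))]
        if p_value(data) < 0.05:
            return True   # stop early and report a positive finding
    return False

rng = random.Random(42)
trials = 2000
rate = sum(hacked_experiment(rng) for _ in range(trials)) / trials
print(rate)  # well above the nominal 5% Type I error rate
```

Even though every dataset is drawn from a true null, taking five looks roughly triples the false positive rate, which is the inflation the abstract warns about.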
Statistical controversies in clinical research: statistical significance-too much of a good thing ….
Buyse, M; Hurvitz, S A; Andre, F; Jiang, Z; Burris, H A; Toi, M; Eiermann, W; Lindsay, M-A; Slamon, D
2016-05-01
The use and interpretation of P values is a matter of debate in applied research. We argue that P values are useful as a pragmatic guide to interpret the results of a clinical trial, not as a strict binary boundary that separates real treatment effects from lack thereof. We illustrate our point using the result of BOLERO-1, a randomized, double-blind trial evaluating the efficacy and safety of adding everolimus to trastuzumab and paclitaxel as first-line therapy for HER2+ advanced breast cancer. In this trial, the benefit of everolimus was seen only in the predefined subset of patients with hormone receptor-negative breast cancer at baseline (progression-free survival hazard ratio = 0.66, P = 0.0049). A strict interpretation of this finding, based on complex 'alpha splitting' rules to assess statistical significance, led to the conclusion that the benefit of everolimus was not statistically significant either overall or in the subset. We contend that this interpretation does not do justice to the data, and we argue that the benefit of everolimus in hormone receptor-negative breast cancer is both statistically compelling and clinically relevant. PMID:26861602
Shukla, R.; Yu Daohai; Fulk, F.
1995-12-31
Short-term toxicity tests with aquatic organisms are a valuable measurement tool in the assessment of the toxicity of effluents, environmental samples and single chemicals. Currently toxicity tests are utilized in a wide range of US EPA regulatory activities including effluent discharge compliance. In the current approach for determining the No Observed Effect Concentration, an effluent concentration is presumed safe if there is no statistically significant difference in toxicant response versus control response. The conclusion of a safe concentration may be due to the fact that it truly is safe, or alternatively, that the ability of the statistical test to detect an effect, given its existence, is inadequate. Results of research on a new statistical approach, the basis of which is to move away from a demonstration of no difference to a demonstration of equivalence, will be discussed. The concept of observed confidence distributions, first suggested by Cox, is proposed as a measure of the strength of evidence for practically equivalent responses between a given effluent concentration and the control. The research included determination of intervals of practically equivalent responses as a function of the variability of control response. The approach is illustrated using reproductive data from tests with Ceriodaphnia dubia and survival and growth data from tests with fathead minnow. The data are from the US EPA's National Reference Toxicant Database.
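The shift from demonstrating "no difference" to demonstrating equivalence can be sketched with the standard two one-sided tests (TOST) procedure, here phrased as a 90% confidence interval under a normal approximation. This is a generic equivalence test, not the confidence-distribution method the abstract proposes, and the counts and equivalence margin are invented:

```python
import math
import statistics

def tost_equivalence(control, treated, margin):
    """Two one-sided tests (TOST): conclude practical equivalence if the
    90% CI for the mean difference lies entirely within (-margin, margin).
    Normal approximation with one-sided alpha = 0.05."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = math.sqrt(statistics.variance(control) / len(control)
                   + statistics.variance(treated) / len(treated))
    z = 1.645  # one-sided 5% critical value of the standard normal
    lo, hi = diff - z * se, diff + z * se
    return -margin < lo and hi < margin

# Hypothetical offspring counts per female in control vs. effluent exposure
control = [20, 22, 19, 21, 20, 23, 21, 20]
effluent = [21, 20, 22, 20, 19, 22, 21, 21]
print(tost_equivalence(control, effluent, margin=3))  # practically equivalent
```

Note the asymmetry with the NOEC approach: here the burden of proof is on showing the response stays inside the margin, so an underpowered test fails to declare safety instead of defaulting to it.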
Statistical downscaling rainfall using artificial neural network: significantly wetter Bangkok?
NASA Astrophysics Data System (ADS)
Vu, Minh Tue; Aribarg, Thannob; Supratid, Siriporn; Raghavan, Srivatsan V.; Liong, Shie-Yui
2015-08-01
Artificial neural network (ANN) is an established technique with a flexible mathematical structure that is capable of identifying complex nonlinear relationships between input and output data. The present study utilizes ANN as a method of statistically downscaling global climate models (GCMs) during the rainy season at meteorological site locations in Bangkok, Thailand. The study illustrates the application of feed-forward back-propagation networks using large-scale predictor variables derived from both the ERA-Interim reanalysis data and present day/future GCM data. The predictors are first selected over different grid boxes surrounding the Bangkok region and then screened by using principal component analysis (PCA) to filter the best correlated predictors for ANN training. The reanalysis-downscaled results for the present day climate show good agreement against station precipitation with a correlation coefficient of 0.8 and a Nash-Sutcliffe efficiency of 0.65. The final downscaled results for four GCMs show an increasing trend of precipitation for the rainy season over Bangkok by the end of the twenty-first century. The extreme values of precipitation determined using statistical indices show strong increases of wetness. These findings will be useful for policy makers in pondering adaptation measures due to flooding, such as whether the current drainage network system is sufficient to meet the changing climate, and to plan for a range of related adaptation/mitigation measures.
Statistically significant data base of rock properties for geothermal use
NASA Astrophysics Data System (ADS)
Koch, A.; Jorand, R.; Clauser, C.
2009-04-01
The high risk of failure due to the unknown properties of the target rocks at depth is a major obstacle for the exploration of geothermal energy. In general, the ranges of thermal and hydraulic properties given in compilations of rock properties are too large to be useful to constrain properties at a specific site. To overcome this problem, we study the thermal and hydraulic rock properties of the main rock types in Germany in a statistical approach. An important aspect is the use of data from exploration wells that are largely untapped for the purpose of geothermal exploration. In the current project stage, we have been analyzing mostly Devonian and Carboniferous drill cores from 20 deep boreholes in the region of the Lower Rhine Embayment and the Ruhr area (western North Rhine Westphalia). In total, we selected 230 core samples with a length of up to 30 cm from the core archive of the State Geological Survey. The use of core scanning technology allowed the rapid measurement of thermal conductivity, sonic velocity, and gamma density under dry and water saturated conditions with high resolution for a large number of samples. In addition, we measured porosity, bulk density, and matrix density based on Archimedes' principle and pycnometer analysis. As first results we present arithmetic means, medians and standard deviations characterizing the petrophysical properties and their variability for specific lithostratigraphic units. Bi- and multimodal frequency distributions correspond to the occurrence of different lithologies such as shale, limestone, dolomite, sandstone, siltstone, marlstone, and quartz-schist. In a next step, the data set will be combined with logging data and complementary mineralogical analyses to derive the variation of thermal conductivity with depth. As a final result, this may be used to infer thermal conductivity for boreholes without appropriate core data which were drilled in similar geological settings.
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
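The spurious-correlation point, that a SNP can look significant marginally merely because it is correlated with a truly causal SNP, can be illustrated with a partial-correlation sketch. The simulation below is an illustration of the phenomenon, not the hierGWAS procedure itself; genotype coding, correlation level, and seed are arbitrary:

```python
import math
import random

def corr(x, y):
    """Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

rng = random.Random(1)
n = 5000
g1 = [rng.randint(0, 2) for _ in range(n)]      # truly causal SNP (0/1/2 copies)
g2 = [g if rng.random() < 0.9 else rng.randint(0, 2) for g in g1]  # merely correlated SNP
y = [g + rng.gauss(0, 1) for g in g1]           # phenotype driven by g1 only

r_y2, r_y1, r_12 = corr(y, g2), corr(y, g1), corr(g1, g2)
# Partial correlation of y and g2, controlling for g1
partial = (r_y2 - r_y1 * r_12) / math.sqrt((1 - r_y1 ** 2) * (1 - r_12 ** 2))
print(round(r_y2, 2), round(partial, 2))  # strong marginal signal, near zero given g1
```

The marginal correlation of the non-causal SNP is substantial, but once the causal SNP is controlled for it carries essentially no additional information, which is the question the multivariable method is designed to test.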
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
ERIC Educational Resources Information Center
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly since the publication of the 1994 American Psychological Association style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
ERIC Educational Resources Information Center
Monterde-i-Bort, Hector; Frias-Navarro, Dolores; Pascual-Llobell, Juan
2010-01-01
The empirical study we present here deals with a pedagogical issue that has not been thoroughly explored up until now in our field. Previous empirical studies in other sectors have identified the opinions of researchers about this topic, showing that completely unacceptable interpretations have been made of significance tests and other statistical…
ERIC Educational Resources Information Center
Huston, Holly L.
This paper begins with a general discussion of statistical significance, effect size, and power analysis; and concludes by extending the discussion to the multivariate case (MANOVA). Historically, traditional statistical significance testing has guided researchers' thinking about the meaningfulness of their data. The use of significance testing…
Assessing Genome-Wide Statistical Significance for Large p Small n Problems
Diao, Guoqing; Vidyashankar, Anand N.
2013-01-01
Assessing genome-wide statistical significance is an important issue in genetic studies. We describe a new resampling approach for determining the appropriate thresholds for statistical significance. Our simulation results demonstrate that the proposed approach accurately controls the genome-wide type I error rate even under the large p small n situations. PMID:23666935
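The abstract does not spell out the authors' specific resampling scheme, so the sketch below shows one common way to derive a genome-wide threshold by resampling: the permutation max-statistic approach, in which the null distribution of the maximum marginal association over all features controls the genome-wide type I error rate. The function name and the marginal-correlation test statistic are illustrative assumptions, not the paper's method.

```python
import numpy as np

def maxT_threshold(X, y, n_perm=500, alpha=0.05, seed=0):
    """Permutation max-statistic threshold for genome-wide significance.

    X: (n, p) genotype-like matrix, y: (n,) phenotype. Under each
    permutation of y, record the maximum absolute marginal correlation
    over all p features; the (1 - alpha) quantile of these maxima is a
    threshold that controls the genome-wide type I error rate.
    """
    rng = np.random.default_rng(seed)
    Xc = (X - X.mean(0)) / X.std(0)
    maxima = []
    for _ in range(n_perm):
        yp = rng.permutation(y)
        ypc = (yp - yp.mean()) / yp.std()
        stat = np.abs(Xc.T @ ypc) / len(y)   # marginal correlations
        maxima.append(stat.max())
    return float(np.quantile(maxima, 1 - alpha))
```

Because the null distribution is built from the maximum over all p features, the threshold automatically tightens as p grows, which is the property that matters in large p, small n settings.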
ERIC Educational Resources Information Center
Norris, John M.
2015-01-01
Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…
A Review of Post-1994 Literature on Whether Statistical Significance Tests Should Be Banned.
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
This paper summarizes the literature regarding statistical significance testing with an emphasis on: (1) the post-1994 literature in various disciplines; (2) alternatives to statistical significance testing; and (3) literature exploring why researchers have demonstrably failed to be influenced by the 1994 American Psychological Association…
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
ERIC Educational Resources Information Center
Ozturk, Elif
2012-01-01
The present paper reviews two motivations for conducting "what if" analyses using Excel and "R" to understand statistical significance tests in the context of sample size. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
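A "what if" analysis of the kind described above can be sketched in a few lines (here in Python rather than Excel or "R"): hold an observed standardized effect size fixed and recompute the two-sided p-value for hypothetical sample sizes, using a normal approximation to a one-sample test. The function names and the z-test form are illustrative assumptions.

```python
# "What if" sketch (assumed z-test form): hold the standardized effect
# size d fixed and ask how the p-value changes with sample size n.
import math

def p_value_for_n(d, n):
    """Two-sided p-value for effect size d at sample size n (normal approx.)."""
    z = d * math.sqrt(n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def smallest_significant_n(d, alpha=0.05, n_max=10_000):
    """First n at which the fixed effect d would reach p < alpha."""
    for n in range(2, n_max):
        if p_value_for_n(d, n) < alpha:
            return n
    return None
```

The exercise makes the pedagogical point directly: a small effect of d = 0.2 is non-significant at n = 96 but crosses p < 0.05 at n = 97, with nothing about the effect itself having changed.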
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
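The bootstrap test described above can be sketched as follows, using the Euclidean distance the authors found most suitable. The pooling-and-resampling null model and the function names are illustrative assumptions; the paper's cloud-object data handling is not reproduced.

```python
import numpy as np

def euclidean_distance(h1, h2):
    """Euclidean distance between two normalized histograms."""
    return float(np.sqrt(((h1 - h2) ** 2).sum()))

def bootstrap_pvalue(sample_a, sample_b, bins, n_boot=1000, seed=0):
    """Bootstrap significance of the distance between two summary histograms.

    Null hypothesis: both samples come from the same population. Pool the
    samples, resample two groups of the original sizes with replacement,
    and compare the observed distance with the resampled distances.
    """
    rng = np.random.default_rng(seed)
    ha, _ = np.histogram(sample_a, bins=bins, density=True)
    hb, _ = np.histogram(sample_b, bins=bins, density=True)
    observed = euclidean_distance(ha, hb)
    pooled = np.concatenate([sample_a, sample_b])
    count = 0
    for _ in range(n_boot):
        ra = rng.choice(pooled, size=len(sample_a), replace=True)
        rb = rng.choice(pooled, size=len(sample_b), replace=True)
        da, _ = np.histogram(ra, bins=bins, density=True)
        db, _ = np.histogram(rb, bins=bins, density=True)
        if euclidean_distance(da, db) >= observed:
            count += 1
    return count / n_boot
```

Swapping in the Jeffries-Matusita or Kuiper distance only requires replacing the distance function, which is how the three statistics can be compared on the same footing.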
Chládek, J; Brázdil, M; Halámek, J; Plešinger, F; Jurák, P
2013-01-01
We present an off-line analysis procedure for exploring brain activity recorded from intra-cerebral electroencephalographic data (SEEG). The objective is to determine the statistical differences between different types of stimulations in the time-frequency domain. The procedure is based on computing relative signal power change and subsequent statistical analysis. An example of characteristic statistically significant event-related de/synchronization (ERD/ERS) detected across different frequency bands following different oddball stimuli is presented. The method is used for off-line functional classification of different brain areas. PMID:24109865
Coulson, Melissa; Healey, Michelle; Fidler, Fiona; Cumming, Geoff
2010-01-01
A statistically significant result and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST) or confidence intervals (CIs). Authors of articles published in psychology, behavioral neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs, respondents who mentioned NHST were 60% likely to conclude, unjustifiably, that the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, that the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs but, for full benefit, such highly desirable statistical reform requires also that researchers interpret CIs without recourse to NHST. PMID:21607077
NASA Astrophysics Data System (ADS)
Wilks, Daniel S.
1996-04-01
A simple approach to long-range forecasting of monthly or seasonal quantities is as the average of observations over some number of the most recent years. Finding this `optimal climate normal' (OCN) involves examining the relationships between the observed variable and averages of its values over the previous one to 30 years and selecting the averaging period yielding the best results. This procedure involves a multiplicity of comparisons, which will lead to misleadingly positive results for the developmental data. The statistical significance of these OCNs is assessed here using a resampling procedure, in which time series of U.S. Climate Division data are repeatedly shuffled to produce statistical distributions of forecast performance measures, under the null hypothesis that the OCNs exhibit no predictive skill. Substantial areas in the United States are found for which forecast performance appears to be significantly better than would occur by chance. Another complication in the assessment of the statistical significance of the OCNs derives from the spatial correlation exhibited by the data. Because of this correlation, instances of Type I errors (false rejections of local null hypotheses) will tend to occur with spatial coherency and accordingly have the potential to be confused with regions for which there may be real predictability. The `field significance' of the collections of local tests is also assessed here by simultaneously and coherently shuffling the time series for the Climate Divisions. Areas exhibiting significant local tests are large enough to conclude that seasonal OCN temperature forecasts exhibit significant skill over parts of the United States for all seasons except SON, OND, and NDJ, and that seasonal OCN precipitation forecasts are significantly skillful only in the fall. Statistical significance is weaker for monthly than for seasonal OCN temperature forecasts, and the monthly OCN precipitation forecasts do not exhibit significant predictive
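The shuffling procedure described above, which builds a null distribution of best-case OCN skill to absorb the multiplicity of comparisons, can be sketched as follows. The skill measure (mean squared error of a previous-k-year-mean forecast) and function names are illustrative assumptions, not the paper's exact performance measures.

```python
import numpy as np

def best_ocn_skill(x, max_k=10):
    """Best (lowest) mean squared forecast error over averaging periods
    k = 1..max_k. The forecast for year t is the mean of the previous k
    observations; scanning over k reproduces the multiplicity of
    comparisons inherent in selecting an 'optimal' climate normal."""
    best = np.inf
    for k in range(1, max_k + 1):
        preds = np.array([x[t - k:t].mean() for t in range(max_k, len(x))])
        mse = ((preds - x[max_k:]) ** 2).mean()
        best = min(best, mse)
    return best

def ocn_p_value(x, max_k=10, n_shuffle=500, seed=0):
    """Fraction of shuffled series whose best OCN skill beats the observed
    one; shuffling destroys any real year-to-year persistence, so this is
    the null distribution under 'no predictive skill'."""
    rng = np.random.default_rng(seed)
    observed = best_ocn_skill(x, max_k)
    hits = sum(best_ocn_skill(rng.permutation(x), max_k) <= observed
               for _ in range(n_shuffle))
    return hits / n_shuffle
```

Because the same best-over-k selection is applied to every shuffled series, the selection bias of the multiple comparisons is present in the null distribution as well, which is the point of the resampling design.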
Weighing the costs of different errors when determining statistical significance during monitoring
Technology Transfer Automated Retrieval System (TEKTRAN)
Selecting appropriate significance levels when constructing confidence intervals and performing statistical analyses with rangeland monitoring data is not a straightforward process. This process is burdened by the conventional selection of “95% confidence” (i.e., Type I error rate, α = 0.05) as the d...
ERIC Educational Resources Information Center
Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.
2011-01-01
In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…
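The abstract is truncated before it describes the new strategy, so the sketch below illustrates only the general idea of a permutation test for a variable's contribution to PCA: permute one variable's column, recompute its contribution to the first principal component, and compare with the observed value. The one-column-at-a-time choice, statistic, and names are illustrative assumptions, not necessarily the strategy the paper proposes.

```python
import numpy as np

def loading(X, j):
    """Squared loading of variable j on the first principal component."""
    cov = np.cov(X - X.mean(0), rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    return float(vecs[:, -1][j] ** 2)   # eigh sorts eigenvalues ascending

def permute_one_column_pvalue(X, j, n_perm=500, seed=0):
    """Permutation p-value for variable j's contribution to PC1.

    Only column j is permuted, which leaves the correlation structure of
    the remaining variables intact (one of the permutation strategies
    discussed in this literature).
    """
    rng = np.random.default_rng(seed)
    observed = loading(X, j)
    count = 0
    for _ in range(n_perm):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        if loading(Xp, j) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

A variable that genuinely drives PC1 loses its loading when its column is scrambled, so few permuted loadings reach the observed one and the p-value is small; a noise variable's loading is unaffected in distribution.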
ERIC Educational Resources Information Center
Spinella, Sarah
2011-01-01
Because result replicability is essential to science and difficult to achieve through external replication, the present paper notes the insufficiency of null hypothesis statistical significance testing (NHSST) and explains the bootstrap as a plausible alternative, with a heuristic example to illustrate the bootstrap method. The bootstrap relies on…
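A heuristic bootstrap example of the kind the paper describes can be sketched in a few lines: resample the observed sample with replacement many times and read off a percentile confidence interval, giving an internal sense of how stable the estimate is. The data and function name here are hypothetical.

```python
# Heuristic bootstrap sketch (hypothetical data): estimate the stability
# of a sample mean by resampling the observed sample with replacement,
# as an internal-replicability complement to a single p-value.
import random
import statistics

def bootstrap_ci(sample, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

The width of the interval, rather than a binary reject/retain decision, conveys how much the estimate would be expected to vary across internal replications of the sampling.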
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
ERIC Educational Resources Information Center
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate unless "corrected" effect…
Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.
ERIC Educational Resources Information Center
Deegear, James
This paper summarizes the literature regarding statistical significance testing with an emphasis on recent literature in various disciplines and literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…
Statistical Significance of the Trends in Monthly Heavy Precipitation Over the US
Mahajan, Salil; North, Dr. Gerald R.; Saravanan, Dr. R.; Genton, Dr. Marc G.
2012-01-01
Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall's τ test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upward trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate system are found to dominate monthly heavy precipitation as some GCM simulations show a downward trend even in the twenty-first century projections when the greenhouse gas forcings are strong.
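A minimal Monte Carlo version of the trend assessment above can be sketched with Kendall's tau against time and a permutation null (the paper's non-parametric and parametric bootstrapping schemes are more elaborate; this is an illustrative stand-in, and the names are assumptions).

```python
import numpy as np

def kendall_tau(x):
    """Kendall's tau of a series against time (Mann-Kendall style)."""
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n) for j in range(i + 1, n))
    return s / (n * (n - 1) / 2)

def trend_pvalue(x, n_boot=500, seed=0):
    """Monte Carlo (permutation) significance of the observed trend:
    shuffling the series destroys any temporal ordering, giving the
    null distribution of tau under 'no trend'."""
    rng = np.random.default_rng(seed)
    observed = abs(kendall_tau(x))
    hits = sum(abs(kendall_tau(rng.permutation(x))) >= observed
               for _ in range(n_boot))
    return hits / n_boot
```

Because tau depends only on the ordering of values, the same machinery applies unchanged to heavy-tailed precipitation series, which is one reason rank-based statistics are favoured in this setting.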
ERIC Educational Resources Information Center
Snyder, Patricia; Lawson, Stephen
Magnitude of effect measures (MEMs), when adequately understood and correctly used, are important aids for researchers who do not want to rely solely on tests of statistical significance in substantive result interpretation. The MEM tells how much of the dependent variable can be controlled, predicted, or explained by the independent variables.…
Reflections on Statistical and Substantive Significance, with a Slice of Replication.
ERIC Educational Resources Information Center
Robinson, Daniel H.; Levin, Joel R.
1997-01-01
Proposes modifications to the recent suggestions by B. Thompson (1996) for an American Educational Research Association editorial policy on statistical significance testing. Points out that, although it is useful to include effect sizes, they can be misinterpreted, and argues, as does Thompson, for greater attention to replication in educational…
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Suffredini, Anthony F; Sacks, David B; Yu, Yi-Kuo
2016-02-01
Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Rudd, James; Moore, Jason H; Urbanowicz, Ryan J
2013-11-01
Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative to large and complex real world applications such as genetic epidemiology where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own. The compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear. PMID:24358057
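Because each permutation run is independent, the external parallelization the study examines maps naturally onto a process pool. The sketch below (a toy mean-difference statistic, not the LCS pipeline; all names are illustrative) keeps the mapping function pluggable, so the same code runs serially with the built-in map or in parallel with a ProcessPoolExecutor.

```python
# Sketch of externally parallelizing independent permutation runs across
# CPU cores. The statistic and names are illustrative, not the paper's
# LCS pipeline.
import random
from concurrent.futures import ProcessPoolExecutor

def one_permutation(args):
    """One independent run: mean-difference statistic of label-permuted data."""
    values, labels, seed = args
    rng = random.Random(seed)
    shuffled = labels[:]
    rng.shuffle(shuffled)
    group1 = [v for v, g in zip(values, shuffled) if g == 1]
    group0 = [v for v, g in zip(values, shuffled) if g == 0]
    return sum(group1) / len(group1) - sum(group0) / len(group0)

def permutation_pvalue(values, labels, observed, n_perm=200, mapper=map):
    """Pass mapper=ProcessPoolExecutor(...).map to fan the independent runs
    out over processes; the default built-in map runs them serially."""
    jobs = [(values, labels, seed) for seed in range(n_perm)]
    null = list(mapper(one_permutation, jobs))
    hits = sum(abs(d) >= abs(observed) for d in null)
    return hits / n_perm
```

Parallel usage would look like `with ProcessPoolExecutor(max_workers=4) as pool: p = permutation_pvalue(vals, labs, obs, mapper=pool.map)`; consistent with the study's observation, the speedup stays roughly linear only while the number of worker processes does not exceed the number of CPU cores.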
Martinez-Val, Ana; Garcia, Fernando; Ximénez-Embún, Pilar; Ibarz, Nuria; Zarzuela, Eduardo; Ruppen, Isabel; Mohammed, Shabaz; Munoz, Javier
2016-09-01
Isobaric labeling is gaining popularity in proteomics due to its multiplexing capacity. However, co-fragmentation of co-isolated peptides introduces a bias that undermines its accuracy. Several strategies have been shown to partially and, in some cases, completely solve this issue. However, it is still not clear how ratio compression affects the ability to identify a protein's change of abundance as statistically significant. Here, by using the "two proteomes" approach (E. coli lysates with fixed 2.5 ratios in the presence or absence of human lysates acting as the background interference) and manipulating isolation width values, we were able to model isobaric data with different levels of accuracy and precision in three types of mass spectrometers: LTQ Orbitrap Velos, Impact, and Q Exactive. We determined the influence of these variables on the statistical significance of the distorted ratios and compared them to the ratios measured without impurities. Our results confirm previous findings [1-4] regarding the importance of optimizing acquisition parameters in each instrument in order to minimize interference without compromising precision and identification. We also show that, under these experimental conditions, the inclusion of a second replicate increases statistical sensitivity 2-3-fold and counterbalances to a large extent the issue of ratio compression.
Deriving statistical significance maps for SVM based image classification and group comparisons.
Gaonkar, Bilwaj; Davatzikos, Christos
2012-01-01
Population based pattern analysis and classification for quantifying structural and functional differences between diverse groups has been shown to be a powerful tool for the study of a number of diseases, and is quite commonly used especially in neuroimaging. The alternative to these pattern analysis methods, namely mass univariate methods such as voxel based analysis and all related methods, cannot detect multivariate patterns associated with group differences, and are not particularly suitable for developing individual-based diagnostic and prognostic biomarkers. A commonly used pattern analysis tool is the support vector machine (SVM). Unlike univariate statistical frameworks for morphometry, analytical tools for statistical inference are unavailable for the SVM. In this paper, we show that null distributions ordinarily obtained by permutation tests using SVMs can be analytically approximated from the data. The analytical computation takes a small fraction of the time it takes to do an actual permutation test, thereby rendering it possible to quickly create statistical significance maps derived from SVMs. Such maps are critical for understanding imaging patterns of group differences and interpreting which anatomical regions are important in determining the classifier's decision.
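The permutation logic behind such significance maps can be sketched without an SVM library by substituting a simple linear stand-in (a class-mean-difference weight per feature). The stand-in classifier and all names are illustrative assumptions; the paper's contribution is precisely that, for SVMs, this permutation null can be approximated analytically instead of being simulated.

```python
import numpy as np

def weight_map(X, y):
    """Stand-in linear 'classifier' weights: class-mean difference per
    feature. (A real application would use SVM weights; this keeps the
    sketch dependency-free while preserving the permutation logic.)"""
    return X[y == 1].mean(0) - X[y == 0].mean(0)

def significance_map(X, y, n_perm=1000, seed=0):
    """Per-feature permutation p-values for the weight map: permute the
    group labels, recompute the weights, and count how often each
    permuted |weight| reaches the observed one."""
    rng = np.random.default_rng(seed)
    observed = np.abs(weight_map(X, y))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        w = np.abs(weight_map(X, rng.permutation(y)))
        exceed += (w >= observed)
    return (exceed + 1) / (n_perm + 1)
```

The cost of the loop, multiplied by the expense of retraining a real SVM per permutation, is what makes the paper's analytical approximation of this null distribution attractive.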
Jefferson, L; Cooper, E; Hewitt, C; Torgerson, T; Cook, L; Tharmanathan, P; Cockayne, S; Torgerson, D
2016-01-01
Objective: Time-lag from study completion to publication is a potential source of publication bias in randomised controlled trials. This study sought to update the evidence base by identifying the effect of the statistical significance of research findings on time to publication of trial results. Design: Literature searches were carried out in four general medical journals from June 2013 to June 2014 inclusive (BMJ, JAMA, the Lancet and the New England Journal of Medicine). Setting: Methodological review of four general medical journals. Participants: Original research articles presenting the primary analyses from phase 2, 3 and 4 parallel-group randomised controlled trials were included. Main outcome measures: Time from trial completion to publication. Results: The median time from trial completion to publication was 431 days (n = 208, interquartile range 278–618). A multivariable adjusted Cox model found no statistically significant difference in time to publication for trials reporting positive or negative results (hazard ratio: 0.86, 95% CI 0.64 to 1.16, p = 0.32). Conclusion: In contrast to previous studies, this review did not demonstrate the presence of time-lag bias in time to publication. This may be a result of these articles being published in four high-impact general medical journals that may be more inclined to publish rapidly, whatever the findings. Further research is needed to explore the presence of time-lag bias in lower quality studies and lower impact journals. PMID:27757242
McKay, J Lucas; Welch, Torrence D J; Vidakovic, Brani; Ting, Lena H
2013-01-01
We developed wavelet-based functional ANOVA (wfANOVA) as a novel approach for comparing neurophysiological signals that are functions of time. Temporal resolution is often sacrificed by analyzing such data in large time bins, increasing statistical power by reducing the number of comparisons. We performed ANOVA in the wavelet domain because differences between curves tend to be represented by a few temporally localized wavelets, which we transformed back to the time domain for visualization. We compared wfANOVA and ANOVA performed in the time domain (tANOVA) on both experimental electromyographic (EMG) signals from responses to perturbation during standing balance across changes in peak perturbation acceleration (3 levels) and velocity (4 levels) and on simulated data with known contrasts. In experimental EMG data, wfANOVA revealed the continuous shape and magnitude of significant differences over time without a priori selection of time bins. However, tANOVA revealed only the largest differences at discontinuous time points, resulting in features with later onsets and shorter durations than those identified using wfANOVA (P < 0.02). Furthermore, wfANOVA required significantly fewer (~1/4×; P < 0.015) significant F tests than tANOVA, resulting in post hoc tests with increased power. In simulated EMG data, wfANOVA identified known contrast curves with a high level of precision (r² = 0.94 ± 0.08) and performed better than tANOVA across noise levels (P << 0.01). Therefore, wfANOVA may be useful for revealing differences in the shape and magnitude of neurophysiological signals (e.g., EMG, firing rates) across multiple conditions with both high temporal resolution and high statistical power. PMID:23100136
2014-01-01
Background: Most work on the topic of activity landscapes has focused on their quantitative description and visual representation, with the aim of aiding navigation of SAR. Recent developments have addressed applications such as quantifying the proportion of activity cliffs, investigating the predictive abilities of activity landscape methods and so on. However, all these publications have worked under the assumption that the activity landscape models are “real” (i.e., statistically significant). Results: The current study addresses for the first time, in a quantitative manner, the significance of a landscape or individual cliffs in the landscape. In particular, we question whether the activity landscape derived from observed (experimental) activity data is different from a randomly generated landscape. To address this we used the SALI measure with six different data sets tested against one or more molecular targets. We also assessed the significance of the landscapes for single and multiple representations. Conclusions: We find that non-random landscapes are data set and molecular representation dependent. For the data sets and representations used in this work, our results suggest that not all representations lead to non-random landscapes. This indicates that not all molecular representations should be used to a) interpret the SAR and b) combined to generate consensus models. Our results suggest that significance testing of activity landscape models and in particular, activity cliffs, is key, prior to the use of such models. PMID:24694189
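The randomization question the study poses, whether an observed SALI landscape differs from a randomly generated one, can be sketched by scrambling the activities over the molecules and recomputing the landscape statistic. The max-SALI summary and function names are illustrative assumptions; the paper works with full landscapes, not a single summary value.

```python
import numpy as np

def max_sali(similarity, activity):
    """Largest pairwise SALI value: |activity difference| / (1 - similarity)."""
    n = len(activity)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i, j] < 1.0:
                sali = abs(activity[i] - activity[j]) / (1.0 - similarity[i, j])
                best = max(best, sali)
    return best

def landscape_pvalue(similarity, activity, n_rand=500, seed=0):
    """Compare the observed max SALI against landscapes built from the
    same similarity matrix but with scrambled activities."""
    rng = np.random.default_rng(seed)
    observed = max_sali(similarity, activity)
    hits = sum(max_sali(similarity, rng.permutation(activity)) >= observed
               for _ in range(n_rand))
    return (hits + 1) / (n_rand + 1)
```

A genuine activity cliff (two very similar molecules with very different activities) is rarely reproduced when activities are assigned at random, so the observed landscape scores as non-random; running the same test per molecular representation is what would reveal the representation dependence the authors report.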
Massage induces an immediate, albeit short-term, reduction in muscle stiffness.
Eriksson Crommert, M; Lacourpaille, L; Heales, L J; Tucker, K; Hug, F
2015-10-01
Using ultrasound shear wave elastography, the aims of this study were: (a) to evaluate the effect of massage on stiffness of the medial gastrocnemius (MG) muscle and (b) to determine whether this effect (if any) persists over a short period of rest. A 7-min massage protocol was performed unilaterally on MG in 18 healthy volunteers. Measurements of muscle shear elastic modulus (stiffness) were performed bilaterally (control and massaged leg) in a moderately stretched position at three time points: before massage (baseline), directly after massage (follow-up 1), and following 3 min of rest (follow-up 2). Directly after massage, participants rated pain experienced during the massage. MG shear elastic modulus of the massaged leg decreased significantly at follow-up 1 (-5.2 ± 8.8%, P = 0.019, d = -0.66). There was no difference between follow-up 2 and baseline for the massaged leg (P = 0.83) indicating that muscle stiffness returned to baseline values. Shear elastic modulus was not different between time points in the control leg. There was no association between perceived pain during the massage and stiffness reduction (r = 0.035; P = 0.89). This is the first study to provide evidence that massage reduces muscle stiffness. However, this effect is short lived and returns to baseline values quickly after cessation of the massage.
NASA Astrophysics Data System (ADS)
Santer, B. D.; Wigley, T. M. L.; Boyle, J. S.; Gaffen, D. J.; Hnilo, J. J.; Nychka, D.; Parker, D. E.; Taylor, K. E.
2000-03-01
This paper examines trend uncertainties in layer-average free atmosphere temperatures arising from the use of different trend estimation methods. It also considers statistical issues that arise in assessing the significance of individual trends and of trend differences between data sets. Possible causes of these trends are not addressed. We use data from satellite and radiosonde measurements and from two reanalysis projects. To facilitate intercomparison, we compute from reanalyses and radiosonde data temperatures equivalent to those from the satellite-based Microwave Sounding Unit (MSU). We compare linear trends based on minimization of absolute deviations (LA) and minimization of squared deviations (LS). Differences are generally less than 0.05°C/decade over 1959-1996. Over 1979-1993, they exceed 0.10°C/decade for lower tropospheric time series and 0.15°C/decade for the lower stratosphere. Trend fitting by the LA method can degrade the lower-tropospheric trend agreement of 0.03°C/decade (over 1979-1996) previously reported for the MSU and radiosonde data. In assessing trend significance we employ two methods to account for temporal autocorrelation effects. With our preferred method, virtually none of the individual 1979-1993 trends in deep-layer temperatures are significantly different from zero. To examine trend differences between data sets we compute 95% confidence intervals for individual trends and show that these overlap for almost all data sets considered. Confidence intervals for lower-tropospheric trends encompass both zero and the model-projected trends due to anthropogenic effects. We also test the significance of a trend in d(t), the time series of differences between a pair of data sets. Use of d(t) removes variability common to both time series and facilitates identification of small trend differences. This more discerning test reveals that roughly 30% of the data set comparisons have significant differences in lower-tropospheric trends
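One standard way to account for the temporal autocorrelation mentioned above is to deflate the sample size by the lag-1 autocorrelation of the regression residuals before testing the trend. The sketch below uses that effective-sample-size adjustment with an ordinary least-squares trend; the paper compares two such adjustment methods, and this is only one common variant, with illustrative names.

```python
import numpy as np

def trend_significance(y, alpha=0.05):
    """OLS trend with an AR(1)-style effective-sample-size adjustment.

    The lag-1 autocorrelation r1 of the regression residuals reduces the
    effective number of independent samples to n * (1 - r1) / (1 + r1),
    which inflates the standard error of the trend estimate.
    """
    n = len(y)
    t = np.arange(n, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    n_eff = n * (1 - r1) / (1 + r1)
    se = (np.sqrt((resid ** 2).sum() / (n_eff - 2))
          / np.sqrt(((t - t.mean()) ** 2).sum()))
    z = slope / se
    return slope, z, bool(abs(z) > 1.96)   # normal approximation
```

With strongly autocorrelated residuals, n_eff can be a small fraction of n, which is why trends that look impressive on the raw sample size can fail to differ significantly from zero, as found for most of the 1979-1993 layer temperatures.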
Crow, C.J.
1985-01-01
Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
A Network-Based Method to Assess the Statistical Significance of Mild Co-Regulation Effects
Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna
2013-01-01
Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis. PMID:24039936
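The co-occurrence quantity described above (common neighbours of two same-type nodes in a bipartite graph) can be sketched with a simple Monte-Carlo null. Note the hedge: SICORE's published null model is more refined (degree-preserving randomization); this sketch keeps each left node's degree but redraws its neighbours uniformly:

```python
import random
from itertools import combinations

def cooccurrence(adj):
    """Number of common neighbours for every pair of left-side nodes.
    `adj` maps each left node to a set of right-side neighbours."""
    return {(u, v): len(adj[u] & adj[v]) for u, v in combinations(sorted(adj), 2)}

def cooccurrence_pvalues(adj, right_nodes, n_sim=2000, seed=0):
    """Empirical one-sided p-values for each pair's co-occurrence under a
    null that keeps each left node's degree but draws its neighbours
    uniformly at random (a simplification of the degree-preserving
    randomization used by methods such as SICORE)."""
    rng = random.Random(seed)
    observed = cooccurrence(adj)
    exceed = {pair: 0 for pair in observed}
    degrees = {u: len(nbrs) for u, nbrs in adj.items()}
    pool = list(right_nodes)
    for _ in range(n_sim):
        rand_adj = {u: set(rng.sample(pool, d)) for u, d in degrees.items()}
        rand_co = cooccurrence(rand_adj)
        for pair in exceed:
            if rand_co[pair] >= observed[pair]:
                exceed[pair] += 1
    # add-one correction keeps empirical p-values strictly positive
    return {pair: (exceed[pair] + 1) / (n_sim + 1) for pair in observed}
```

Pairs whose observed common-neighbour count is rarely matched by the randomized graphs get small p-values and are candidates for functional synergy.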
Statistics, Probability, Significance, Likelihood: Words Mean What We Define Them to Mean
ERIC Educational Resources Information Center
Drummond, Gordon B.; Tom, Brian D. M.
2011-01-01
Statisticians use words deliberately and specifically, but not necessarily in the way they are used colloquially. For example, in general parlance "statistics" can mean numerical information, usually data. In contrast, one large statistics textbook defines the term "statistic" to denote "a characteristic of a "sample", such as the average score",…
NASA Astrophysics Data System (ADS)
Kellerer-Pirklbauer, Andreas
2016-04-01
Longer data series (e.g. >10 a) of ground temperatures in alpine regions are helpful to improve the understanding of the effects of present climate change on the distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004 - and more intensively since 2006 - a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c. 60 ground temperature (surface and near-surface) monitoring sites located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trends during the observation period, and regional patterns of change. Calculations and analyses of several different temperature-related parameters were accomplished. At an annual scale, a region-wide statistically significant warming during the observation period was revealed by e.g. an increase in mean annual temperature values (mean, maximum) or the significant lowering of the surface frost number (F+). At a seasonal scale, no significant trend of any temperature-related parameter was revealed in most cases for spring (MAM) and autumn (SON). Winter (DJF) shows only a weak warming. In contrast, the summer (JJA) season reveals in general a significant warming, as confirmed by several different temperature-related parameters such as mean seasonal temperature, number of thawing degree days, number of freezing degree days, or days without night frost. On a monthly basis, August shows the statistically most robust and strongest warming of all months, although regional differences occur. Despite the fact that the general ground temperature warming during the last decade is confirmed by the field data in the study region, complications in trend analyses arise from temperature anomalies (e.g. the warm winter 2006/07) or substantial variations in the winter
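Two of the parameters named above (degree-day sums and the surface frost number F+) have simple standard definitions; a sketch, assuming the common Nelson-Outcalt form of the frost number:

```python
import math

def degree_days(daily_mean_temps):
    """Split a series of daily mean temperatures (°C) into thawing and
    freezing degree-day sums (TDD, FDD), both returned as positive numbers."""
    tdd = sum(t for t in daily_mean_temps if t > 0)
    fdd = -sum(t for t in daily_mean_temps if t < 0)
    return tdd, fdd

def frost_number(tdd, fdd):
    """Surface frost number F+ = sqrt(FDD) / (sqrt(FDD) + sqrt(TDD));
    values above 0.5 indicate freezing dominance (after Nelson & Outcalt)."""
    return math.sqrt(fdd) / (math.sqrt(fdd) + math.sqrt(tdd))
```

A significant lowering of F+ over the years, as reported above, corresponds to freezing degree days shrinking relative to thawing degree days.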
Carr, J.R.; Roberts, K.P.
1989-02-01
Universal kriging is compared with ordinary kriging for estimation of earthquake ground motion. Ordinary kriging is based on a stationary random function model; universal kriging is based on a nonstationary random function model representing first-order drift. Accuracy of universal kriging is compared with that of ordinary kriging, using cross-validation as the basis for comparison. Hypothesis testing on these results shows that accuracy obtained using universal kriging is not significantly different from accuracy obtained using ordinary kriging. Tests based on normal distribution assumptions are applied to errors measured in the cross-validation procedure; t and F tests reveal no evidence to suggest universal and ordinary kriging differ for estimation of earthquake ground motion. Nonparametric hypothesis tests applied to these errors and jackknife statistics yield the same conclusion: universal and ordinary kriging are not significantly different for this application, as determined by a cross-validation procedure. These results are based on application to four independent data sets (four different seismic events).
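The t-test on paired cross-validation errors described above can be sketched as follows (a generic paired t-statistic; the paper's exact error metric is not specified in the abstract):

```python
import math
import statistics

def paired_t(errors_a, errors_b):
    """Paired t-statistic on cross-validation errors of two estimators,
    e.g. universal vs. ordinary kriging at the same held-out sites.
    Returns (t, degrees_of_freedom); a small |t| relative to t_{df}
    critical values suggests no significant accuracy difference."""
    d = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(d)
    dbar = statistics.fmean(d)
    sd = statistics.stdev(d)  # sample standard deviation of the differences
    return dbar / (sd / math.sqrt(n)), n - 1
```

An F-test on the ratio of the two error variances is the natural companion check, and nonparametric alternatives (sign or Wilcoxon tests on the same differences) avoid the normality assumption.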
Alexandrov, N. N.; Go, N.
1994-01-01
We have completed an exhaustive search for the common spatial arrangements of backbone fragments (SARFs) in nonhomologous proteins. This type of local structural similarity, incorporating short fragments of backbone atoms, arranged not necessarily in the same order along the polypeptide chain, appears to be important for protein function and stability. To estimate the statistical significance of the similarities, we have introduced a similarity score. We present several locally similar structures, with a large similarity score, which have not yet been reported. On the basis of the results of pairwise comparison, we have performed hierarchical cluster analysis of protein structures. Our analysis is not limited to comparison of single chains but also includes complex molecules consisting of several subunits. The SARFs with backbone fragments from different polypeptide chains provide a stable interaction between subunits in protein molecules. In many cases the active site of the enzyme is located at the same position relative to the common SARFs, implying that certain SARFs function as a universal interface for protein-substrate interactions. PMID:8069217
Hulshizer, Randall; Blalock, Eric M
2007-01-01
Background Researchers using RNA expression microarrays in experimental designs with more than two treatment groups often identify statistically significant genes with ANOVA approaches. However, the ANOVA test does not discriminate which of the multiple treatment groups differ from one another. Thus, post hoc tests, such as linear contrasts, template correlations, and pairwise comparisons are used. Linear contrasts and template correlations work extremely well, especially when the researcher has a priori information pointing to a particular pattern/template among the different treatment groups. Further, all pairwise comparisons can be used to identify particular, treatment group-dependent patterns of gene expression. However, these approaches are biased by the researcher's assumptions, and some treatment-based patterns may fail to be detected using these approaches. Finally, different patterns may have different probabilities of occurring by chance, importantly influencing researchers' conclusions about a pattern and its constituent genes. Results We developed a four step, post hoc pattern matching (PPM) algorithm to automate single channel gene expression pattern identification/significance. First, 1-Way Analysis of Variance (ANOVA), coupled with post hoc 'all pairwise' comparisons are calculated for all genes. Second, for each ANOVA-significant gene, all pairwise contrast results are encoded to create unique pattern ID numbers. The number of genes found in each pattern in the data is recorded as that pattern's 'actual' frequency. Third, using Monte Carlo simulations, those patterns' frequencies are estimated in random data ('random' gene pattern frequency). Fourth, a Z-score for overrepresentation of the pattern is calculated ('actual' against 'random' gene pattern frequencies). We wrote a Visual Basic program (StatiGen) that automates the PPM procedure, constructs an Excel workbook with standardized graphs of overrepresented patterns, and lists of the genes comprising
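The second PPM step (encoding all pairwise contrast results into a unique pattern ID) is not spelled out in the abstract; one plausible scheme, offered purely as an illustration, assigns a base-3 digit per group pair:

```python
from itertools import combinations

def pattern_id(group_means, sig_pairs):
    """Encode the outcome of all pairwise comparisons as a base-3 pattern ID.
    For each group pair (i, j), taken in a fixed order, record:
      0 - no significant difference,
      1 - group i significantly higher,
      2 - group j significantly higher.
    `sig_pairs` is the set of (i, j) pairs the post hoc test called
    significant. Genes sharing an ID share a treatment-response pattern."""
    code = 0
    for i, j in combinations(range(len(group_means)), 2):
        if (i, j) not in sig_pairs:
            digit = 0
        elif group_means[i] > group_means[j]:
            digit = 1
        else:
            digit = 2
        code = code * 3 + digit
    return code
```

Counting genes per ID gives the 'actual' frequency; repeating the same encoding on permuted data gives the 'random' frequencies used for the Z-score in steps three and four.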
Statistical Significance of Periodicity and Log-Periodicity with Heavy-Tailed Correlated Noise
NASA Astrophysics Data System (ADS)
Zhou, Wei-Xing; Sornette, Didier
We estimate the probability that random noise, of several plausible standard distributions, creates a false alarm that a periodicity (or log-periodicity) is found in a time series. The solution of this problem is already known for independent Gaussian distributed noise. We investigate more general situations with non-Gaussian correlated noises and present synthetic tests on the detectability and statistical significance of periodic components. A periodic component of a time series is usually detected by some sort of Fourier analysis. Here, we use the Lomb periodogram analysis, which is suitable and outperforms Fourier transforms for unevenly sampled time series. We examine the false-alarm probability of the largest spectral peak of the Lomb periodogram in the presence of power-law distributed noises, of short-range and of long-range fractional-Gaussian noises. Increasing heavy-tailedness (respectively correlations describing persistence) tends to decrease (respectively increase) the false-alarm probability of finding a large spurious Lomb peak. Increasing anti-persistence tends to decrease the false-alarm probability. We also study the interplay between heavy-tailedness and long-range correlations. In order to fully determine if a Lomb peak signals a genuine rather than a spurious periodicity, one should in principle characterize the Lomb peak height, its width and its relations to other peaks in the complete spectrum. As a step towards this full characterization, we construct the joint-distribution of the frequency position (relative to other peaks) and of the height of the highest peak of the power spectrum. We also provide the distributions of the ratio of the highest Lomb peak to the second highest one. Using the insight obtained by the present statistical study, we re-examine previously reported claims of 'log-periodicity' and find that the credibility for log-periodicity in 2D-freely decaying turbulence is weakened while it is strengthened for fracture, for the
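The Lomb periodogram and the Monte-Carlo false-alarm estimate described above can be sketched as follows (classic Lomb normalization, zero-mean data assumed; the noise generator here is Gaussian, whereas the study's point is to swap in heavy-tailed or correlated generators):

```python
import math
import random

def lomb(t, y, omega):
    """Lomb normalized periodogram power at angular frequency omega,
    for unevenly sampled, zero-mean data y observed at times t."""
    s2 = sum(math.sin(2 * omega * ti) for ti in t)
    c2 = sum(math.cos(2 * omega * ti) for ti in t)
    tau = math.atan2(s2, c2) / (2 * omega)
    cs = [math.cos(omega * (ti - tau)) for ti in t]
    sn = [math.sin(omega * (ti - tau)) for ti in t]
    yc = sum(yi * ci for yi, ci in zip(y, cs))
    ys = sum(yi * si for yi, si in zip(y, sn))
    var = sum(yi * yi for yi in y) / len(y)  # variance of zero-mean data
    return 0.5 * (yc * yc / sum(c * c for c in cs)
                  + ys * ys / sum(s * s for s in sn)) / var

def false_alarm_prob(t, peak, omegas, n_sim=500, seed=1):
    """Monte-Carlo false-alarm probability: the fraction of pure-noise
    series (Gaussian here; replace rng.gauss with a heavy-tailed or
    correlated generator to probe the effects studied above) whose
    largest Lomb peak over `omegas` reaches `peak`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        y = [rng.gauss(0.0, 1.0) for _ in t]
        m = sum(y) / len(y)
        y = [yi - m for yi in y]
        if max(lomb(t, y, w) for w in omegas) >= peak:
            hits += 1
    return hits / n_sim
```

For independent Gaussian noise the analytic result quoted above applies (peak heights are roughly exponentially distributed), so the simulation mainly earns its keep for the non-Gaussian, correlated cases.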
2013-01-01
Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9). PMID:23721463
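The bootstrap procedure for relative validity described above can be sketched as follows (patient-level resampling with a percentile interval; the record layout and group handling here are illustrative assumptions, not the paper's code):

```python
import random
import statistics

def anova_f(groups):
    """One-way ANOVA F-statistic across lists of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = statistics.fmean([x for g in groups for x in g])
    ssb = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

def rv_bootstrap_ci(data, n_boot=1000, seed=0):
    """Point estimate and 95% percentile bootstrap CI for relative
    validity, the ratio of the comparator measure's ANOVA F to the
    reference measure's F. `data` is a list of records
    (group, reference_score, comparator_score), resampled per patient."""
    rng = random.Random(seed)
    def rv(records):
        by_group = {}
        for g, ref, comp in records:
            by_group.setdefault(g, ([], []))
            by_group[g][0].append(ref)
            by_group[g][1].append(comp)
        refs = [v[0] for v in by_group.values()]
        comps = [v[1] for v in by_group.values()]
        return anova_f(comps) / anova_f(refs)
    boots = sorted(rv([rng.choice(data) for _ in data])
                   for _ in range(n_boot))
    return rv(data), boots[int(0.025 * n_boot)], boots[int(0.975 * n_boot)]
```

An RV is declared significantly below 1 when the upper CI bound falls under 1; the simulations summarized above say this needs a reasonably large denominator F unless the two measures are highly correlated.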
NASA Astrophysics Data System (ADS)
Woodruff, J. D.; Donnelly, J. P.; Emanuel, K.
2007-12-01
Coastal overwash deposits preserved within backbarrier sediments extend the documented record of tropical cyclone strikes back several millennia, providing valuable new data that help to elucidate links between tropical cyclone activity and climate variability. Certain caveats should be considered, however, when assessing trends observed within these paleo-storm records. For instance, gaps in overwash activity at a particular site could simply be artifacts produced by the random nature of these episodic events. Recently, a 5000 year record of intense hurricane strikes has been developed using coarse-grained overwash deposits from Laguna Playa Grande (LPG), a coastal lagoon located on the island of Vieques, Puerto Rico. The LPG record exhibits periods of frequent and infrequent hurricane-induced overwash activity spanning many centuries. These trends are consistent with overwash reconstructions from western Long Island, NY, and have been linked in part to variability in the El Niño/Southern Oscillation and the West African monsoon. Here we assess the statistical significance for active and inactive periods at LPG by creating thousands of synthetic overwash records for the site using storm tracks generated by a coupled ocean-atmosphere hurricane model set to mimic modern climatology. Results show that periods of infrequent overwash activity at the LPG site between 3600 and 1500 yrs BP and 1000 and 250 yrs BP are extremely unlikely to occur under modern climate conditions (above 99 percent confidence). This suggests that the variability observed in the Vieques record is consistent with changing climatic boundary conditions. Overwash frequency is greatest over the last 300 years, with 2 to 3 deposits/century compared to 0.6 deposits/century for earlier active regimes from 2500 to 1000 yrs BP and 5000 to 3600 yrs BP. While this may reflect an unprecedented level of activity over the last 5000 years, it may also in part be due to an undercounting of events in older
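The core question above (could a long overwash-free gap arise by chance from an unchanged storm climate?) can be illustrated with a toy Monte-Carlo test. This sketch uses a homogeneous Poisson process as the synthetic-record generator, a strong simplification of the coupled-model storm tracks the study actually uses:

```python
import random

def quiet_period_pvalue(rate, record_len, gap_len, n_sim=5000, seed=0):
    """Monte-Carlo probability that a stationary Poisson event record of
    length `record_len` (mean `rate` events per unit time) contains at
    least one event-free gap of `gap_len` or longer. A small value means
    the observed quiet period is unlikely under unchanged climate."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        t, times = 0.0, [0.0]  # sentinel at the record start
        while t < record_len:
            t += rng.expovariate(rate)
            if t < record_len:
                times.append(t)
        times.append(record_len)  # sentinel at the record end
        if max(b - a for a, b in zip(times, times[1:])) >= gap_len:
            hits += 1
    return hits / n_sim
```

Under this null, a multi-century gap in a record averaging a few deposits per century gets a tiny probability, which is the logic behind the >99 percent confidence statement above (the study's synthetic records are climatologically richer than this Poisson toy).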
NASA Astrophysics Data System (ADS)
Alves, Gelio
After the sequencing of many complete genomes, we are in a post-genomic era in which the most important task has changed from gathering genetic information to organizing the mass of data as well as understanding how components interact with each other. The former is usually undertaken using bioinformatics methods, while the latter task is generally termed proteomics. Success in both parts demands correct statistical significance assignments for the results found. In my dissertation, I study two concrete examples: global sequence alignment statistics and peptide sequencing/identification using mass spectrometry. High-performance liquid chromatography coupled to a mass spectrometer (HPLC/MS/MS), enabling peptide identifications and thus protein identifications, has become the tool of choice in large-scale proteomics experiments. Peptide identification is usually done by database search methods. The lack of robust statistical significance assignment among current methods motivated the development of a novel de novo algorithm, RAId, whose score statistics then provide statistical significance for high-scoring peptides found in our custom, enzyme-digested peptide library. The ease of incorporating post-translational modifications is another important feature of RAId. To organize the massive protein/DNA data accumulated, biologists often cluster proteins according to their similarity via tools such as sequence alignment. Homologous proteins share similar domains. To assess the similarity of two domains usually requires alignment from head to toe, i.e. a global alignment. Good alignment score statistics with an appropriate null model enable us to distinguish biologically meaningful similarity from chance similarity. There has been much progress in local alignment statistics, which characterize score statistics when alignments tend to appear as a short segment of the whole sequence. For global alignment, which is useful in domain alignment, there is still much room for
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
NASA Astrophysics Data System (ADS)
Williams, Arnold C.; Pachowicz, Peter W.
2004-09-01
Current mine detection research indicates that no single sensor or single look from a sensor will detect mines/minefields in a real-time manner at a performance level suitable for a forward maneuver unit. Hence, the integrated development of detectors and fusion algorithms is of primary importance. A problem in this development process has been the evaluation of these algorithms with relatively small data sets, leading to anecdotal and frequently overtrained results. These anecdotal results are often unreliable and conflicting among various sensors and algorithms. Consequently, the physical phenomena that ought to be exploited and the performance benefits of this exploitation are often ambiguous. The Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate has collected large amounts of multisensor data such that statistically significant evaluations of detection and fusion algorithms can be obtained. Even with these large data sets, care must be taken in algorithm design and data processing to achieve statistically significant performance results for combined detectors and fusion algorithms. This paper discusses statistically significant detection and combined multilook fusion results for the Ellipse Detector (ED) and the Piecewise Level Fusion Algorithm (PLFA). These statistically significant performance results are characterized by ROC curves that have been obtained through processing this multilook data for the high-resolution SAR data of the Veridian X-Band radar. We discuss the implications of these results for mine detection and the importance of statistical significance, sample size, ground truth, and algorithm design in performance evaluation.
Krumbholz, Aniko; Anielski, Patricia; Gfrerer, Lena; Graw, Matthias; Geyer, Hans; Schänzer, Wilhelm; Dvorak, Jiri; Thieme, Detlef
2014-01-01
Clenbuterol is a well-established β2-agonist, which is prohibited in sports and strictly regulated for use in the livestock industry. During the last few years, clenbuterol-positive results in doping controls and in samples from residents of or travellers from a high-risk country were suspected to be related to the illegal use of clenbuterol for fattening. A sensitive liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed to detect low clenbuterol residues in hair with a detection limit of 0.02 pg/mg. A sub-therapeutic application study and a field study with volunteers, who have a high risk of contamination, were performed. For the application study, a total dosage of 30 µg clenbuterol was applied to 20 healthy volunteers on 5 subsequent days. One month after the beginning of the application, clenbuterol was detected in the proximal hair segment (0-1 cm) in concentrations between 0.43 and 4.76 pg/mg. For the second part, samples of 66 Mexican soccer players were analyzed. In 89% of these volunteers, clenbuterol was detectable in their hair at concentrations between 0.02 and 1.90 pg/mg. A comparison of both parts showed no statistical difference between sub-therapeutic application and contamination. In contrast, discrimination from typical abuse of clenbuterol is apparently possible. Due to these findings, results of real doping control samples can be evaluated. PMID:25388545
ERIC Educational Resources Information Center
Hojat, Mohammadreza; Xu, Gang
2004-01-01
The effect size (ES) is an increasingly important index used to quantify the degree of practical significance of study results. This paper gives an introduction to the computation and interpretation of effect sizes from the perspective of the consumer of the research literature. The key points made are: (1) "ES" is a useful indicator of the…
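The most common effect-size index for two-group comparisons, Cohen's d, can be computed in a few lines (a standard textbook formula, not specific to this paper):

```python
import math
import statistics

def cohens_d(group1, group2):
    """Cohen's d: the standardized difference between two group means,
    using the pooled standard deviation. Conventional benchmarks:
    roughly 0.2 = small, 0.5 = medium, 0.8 = large."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd
```

Unlike a P value, d does not shrink or grow with sample size, which is why it complements significance tests as a measure of practical importance.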
Key statistics related to CO₂ emissions: Significant contributing countries
Kellogg, M.A.; Edmonds, J.A.; Scott, M.J.; Pomykala, J.S.
1987-07-01
This country selection task report describes and applies a methodology for identifying a set of countries responsible for significant present and anticipated future emissions of CO₂ and other radiatively important gases (RIGs). The identification of countries responsible for CO₂ and other RIGs emissions will help determine to what extent a select number of countries might be capable of influencing future emissions. Once identified, those countries could potentially exercise cooperative collective control of global emissions and thus mitigate the associated adverse effects of those emissions. The methodology developed consists of two approaches: the resource approach and the emissions approach. While conceptually very different, both approaches yield the same fundamental conclusion. The core of any international initiative to control global emissions must include three key countries: the US, USSR, and the People's Republic of China. It was also determined that broader control can be achieved through the inclusion of sixteen additional countries with significant contributions to worldwide emissions.
Petykhov, A B; Maev, I V; Deriabin, V E
2012-01-01
Anthropometry is a technique that yields the features needed to characterize changes of the human body in health and in pathology. For the first time in domestic medicine, a statistical analysis of anthropometric parameters - body mass, body length, waist, hip, shoulder and wrist circumferences, and skinfold thicknesses over the triceps, under the shoulder blade, on the chest, on the abdomen and over the biceps - was carried out, with calculation of indices and an assessment of possible age influence. Complexes of interrelated anthropometric characteristics were detected. Correlation coefficients (r) were computed, and factor analysis (principal components with subsequent varimax rotation), covariance and discriminant analyses (applying the Kaiser and Wilks criteria and the F-test) were performed. Intergroup variability of body composition was studied for separate characteristics in groups of healthy individuals (135 subjects aged 45.6 +/- 1.2 years; 56.3% men and 43.7% women) and in internal pathology: patients after gastrectomy (121 subjects; 57.7 +/- 1.2 years; 52% men and 48% women); after Billroth operation (214; 56.1 +/- 1.0 years; 53% men and 47% women); after enterectomy (103; 44.5 +/- 1.8 years; 53% men and 47% women); and after protein-energy wasting of mixed genesis (206; 29.04 +/- 1.6 years; 79% men and 21% women). The analysis identified a group of interrelated characteristics comprising the anthropometric parameters of subcutaneous fat deposition (skinfold thicknesses over the triceps and biceps, under the shoulder blade, and on the abdomen) and fatty body mass. These characteristics are related to age and height and show a more pronounced dependence in women, reflecting the development of the fatty component of the body when assessing body mass index in women (unlike men). The waist-hip circumference index differs irrespective of body composition indicators, which does not allow it to be characterized in terms of truncal or
ERIC Educational Resources Information Center
Oshima, T. C.; Raju, Nambury S.; Nanda, Alice O.
2006-01-01
A new item parameter replication method is proposed for assessing the statistical significance of the noncompensatory differential item functioning (NCDIF) index associated with the differential functioning of items and tests framework. In this new method, a cutoff score for each item is determined by obtaining a (1-alpha) percentile rank score…
NASA Technical Reports Server (NTRS)
Staubert, R.
1985-01-01
Methods for calculating the statistical significance of excess events and the interpretation of the formally derived values are discussed. It is argued that a simple formula for a conservative estimate should generally be used in order to provide a common understanding of quoted values.
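A widely used simple estimate of the kind discussed above is the Gaussian-approximation significance of an on-source excess over background (cf. Li & Ma, 1983); whether this is the specific formula the paper recommends is not stated in the abstract:

```python
import math

def excess_significance(n_on, n_off, alpha):
    """Approximate significance (in standard deviations) of an on-source
    excess over background: S = (N_on - alpha*N_off) /
    sqrt(N_on + alpha^2 * N_off), where alpha is the ratio of on- to
    off-source exposure. A simple, approximately Gaussian estimate."""
    excess = n_on - alpha * n_off
    return excess / math.sqrt(n_on + alpha * alpha * n_off)
```

For example, 130 on-source counts against 100 background counts with equal exposure gives roughly a 2-sigma excess, which is why isolated "detections" near this level warrant the conservative interpretation the paper argues for.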
WISCOD: a statistical web-enabled tool for the identification of significant protein coding regions.
Vilardell, Mireia; Parra, Genis; Civit, Sergi
2014-01-01
Classically, gene prediction programs are based on detecting signals such as boundary sites (splice sites, starts, and stops) and coding regions in the DNA sequence in order to build potential exons and join them into a gene structure. Although nowadays it is possible to improve their performance with additional information from related species and/or cDNA databases, further improvement at any step could help to obtain better predictions. Here, we present WISCOD, a web-enabled tool for the identification of significant protein coding regions, a novel software tool that tackles the exon prediction problem in eukaryotic genomes. WISCOD has the capacity to detect real exons in large lists of potential exons, and it provides an easy-to-use global P value, the expected probability of being a false exon (EPFE), that is useful for ranking potential exons in a probabilistic framework without additional computational cost. The advantage of our approach is that it significantly increases specificity and sensitivity (both between 80% and 90%) in comparison to other ab initio methods (where they are in the range of 70-75%). WISCOD is written in Java and R and is available for download and local use on Linux and Windows platforms. PMID:25313355
Kavvoura, Fotini K.; McQueen, Matthew B.; Khoury, Muin J.; Tanzi, Rudolph E.; Bertram, Lars
2008-01-01
The authors evaluated whether there is an excess of statistically significant results in studies of genetic associations with Alzheimer's disease, reflecting either between-study heterogeneity or bias. Among published articles on genetic associations entered into the comprehensive AlzGene database (www.alzgene.org) through January 31, 2007, 1,348 studies included in 175 meta-analyses with 3 or more studies each were analyzed. The number of observed studies (O) with statistically significant results (P = 0.05 threshold) was compared with the expected number (E) under different assumptions for the magnitude of the effect size. In the main analysis, the plausible effect size of each association was the summary effect presented in the respective meta-analysis. Overall, 19 meta-analyses (all with eventually nonsignificant summary effects) had a documented excess of O over E: typically, single studies had significant effects pointing in opposite directions, and early summary effects were dissipated over time. Across the whole domain, O was 235 (17.4%), while E was 164.8 (12.2%) (P < 10(-6)). The excess showed a predilection for meta-analyses with nonsignificant summary effects and between-study heterogeneity. The excess was seen for all levels of statistical significance and also for studies with borderline P values (P = 0.05-0.10). The excess of significant findings may represent significance-chasing biases in a setting of massive testing. PMID:18779388
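A simplified version of the O-versus-E comparison above treats each study as an independent trial that is significant with the expected rate, and computes an exact binomial tail probability (the study's actual expectation calculation is per-association and more refined):

```python
import math

def binom_tail(observed, total, rate):
    """Exact one-sided binomial tail: the probability of at least
    `observed` significant studies among `total` when each study is
    significant with probability `rate`. Terms are computed in log space
    via lgamma so large `total` does not overflow."""
    log_rate, log_q = math.log(rate), math.log(1 - rate)
    p = 0.0
    for k in range(observed, total + 1):
        log_term = (math.lgamma(total + 1) - math.lgamma(k + 1)
                    - math.lgamma(total - k + 1)
                    + k * log_rate + (total - k) * log_q)
        p += math.exp(log_term)
    return p
```

Plugging in the abstract's numbers (O = 235 observed among 1,348 studies, E = 164.8 expected) reproduces a tail probability below 10^-6, matching the reported P < 10(-6).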
Potts, T.T.; Hylko, J.M.; Almond, D.
2007-07-01
A company's overall safety program becomes an important consideration to continue performing work and for procuring future contract awards. When injuries or accidents occur, the employer ultimately loses on two counts - increased medical costs and employee absences. This paper summarizes the human and organizational components that contributed to successful safety programs implemented by WESKEM, LLC's Environmental, Safety, and Health Departments located in Paducah, Kentucky, and Oak Ridge, Tennessee. The philosophy of 'safety, compliance, and then production' and programmatic components implemented at the start of the contracts were qualitatively identified as contributing factors resulting in a significant accumulation of safe work hours and an Experience Modification Rate (EMR) of <1.0. Furthermore, a study by the Associated General Contractors of America quantitatively validated components, already found in the WESKEM, LLC programs, as contributing factors to prevent employee accidents and injuries. Therefore, an investment in the human and organizational components now can pay dividends later by reducing the EMR, which is the key to reducing Workers' Compensation premiums. Also, knowing your employees' demographics and taking an active approach to evaluate and prevent fatigue may help employees balance work and non-work responsibilities. In turn, this approach can assist employers in maintaining a healthy and productive workforce. For these reasons, it is essential that safety needs be considered as the starting point when performing work. (authors)
Wang, Q.; Denton, D.L.; Shukla, R.
2000-01-01
As a follow-up to the recommendations of the September 1995 SETAC Pellston Workshop on Whole Effluent Toxicity (WET) regarding test methods and appropriate endpoints, this paper discusses the applications and statistical properties of using a statistical criterion of minimum significant difference (MSD). The authors examined the upper limits of acceptable MSDs as an acceptance criterion in the case of normally distributed data. The implications of this approach are examined in terms of the false negative rate as well as the false positive rate. Results indicated that the proposed approach has reasonable statistical properties. Reproductive data from short-term chronic WET tests with Ceriodaphnia dubia were used to demonstrate the applications of the proposed approach. The data were collected by the North Carolina Department of Environment, Health, and Natural Resources (Raleigh, NC, USA) as part of their National Pollutant Discharge Elimination System program.
Li, Qingbo; Roxas, Bryan AP
2009-01-01
Background Many studies have provided algorithms or methods to assess statistical significance in quantitative proteomics when multiple replicates for a protein sample and an LC/MS analysis are available. However, confidence is still lacking in using datasets for a biological interpretation without protein sample replicates. Although a fold-change is a conventional threshold that can be used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of the reliability of identifying differentially expressed proteins. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. An FDR is used to gauge the statistical significance of the differentially expressed proteins. Results We experimented with several parameters to control the FDR, including a fold-change, a statistical test, and a minimum number of permuted significant pairings. Although none of these parameters alone gives satisfactory control of the FDR, we find that a combination of them provides a very effective means to control the FDR without compromising the sensitivity. The results suggest that it is possible to perform a significance analysis without protein sample replicates; only duplicate LC/MS injections per sample are needed. We illustrate that differentially expressed proteins can be detected with an FDR between 0 and 15% at a positive rate of 4–16%. The method is evaluated for its sensitivity and specificity by a ROC analysis, and is further validated with a [15N]-labeled internal-standard protein sample and additional unlabeled protein sample replicates. Conclusion We demonstrate that statistical significance can be inferred without protein sample replicates in label-free quantitative proteomics. The
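The permutation-based FDR idea above can be sketched generically: score each protein (e.g. by |t| after a fold-change filter), recompute the scores under permuted sample labels, and take the ratio of mean null exceedances to observed exceedances. All numbers below are hypothetical and this is not the paper's exact algorithm:

```python
def fdr_estimate(obs_scores, perm_scores_list, threshold):
    """Permutation FDR estimate: mean number of scores exceeding the
    threshold under permuted labels, divided by the observed count."""
    n_obs = sum(s >= threshold for s in obs_scores)
    if n_obs == 0:
        return 0.0
    mean_null = sum(sum(s >= threshold for s in perm)
                    for perm in perm_scores_list) / len(perm_scores_list)
    return min(mean_null / n_obs, 1.0)

# Hypothetical |t|-like scores per protein: observed labels vs. two
# label permutations; the threshold is illustrative.
fdr = fdr_estimate([3.1, 2.8, 0.4, 0.2],
                   [[0.5, 0.3, 0.2, 0.1], [0.9, 0.4, 0.3, 0.2]],
                   threshold=0.35)
```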
NASA Astrophysics Data System (ADS)
Baluev, Roman V.
2013-11-01
We consider the `multifrequency' periodogram, in which the putative signal is modelled as a sum of two or more sinusoidal harmonics with independent frequencies. It is useful in cases when the data may contain several periodic components, especially when their interaction with each other and with the data sampling patterns might produce misleading results. Although the multifrequency statistic itself was constructed earlier, for example by G. Foster in his CLEANest algorithm, its probabilistic properties (the detection significance levels) are still poorly known and much of what is deemed known is not rigorous. These detection levels are nonetheless important for data analysis. We argue that to prove the simultaneous existence of all n components revealed in a multiperiodic variation, it is mandatory to apply at least 2n - 1 significance tests, among which most involve various multifrequency statistics, and only n tests are single-frequency ones. The main result of this paper is an analytic estimation of the statistical significance of the frequency tuples that the multifrequency periodogram can reveal. Using the theory of extreme values of random fields (the generalized Rice method), we find a useful approximation to the relevant false alarm probability. For the double-frequency periodogram, this approximation is given by the elementary formula (π/16) W² z² e^(−z), where W denotes the normalized width of the settled frequency range, and z is the observed periodogram maximum. We carried out intensive Monte Carlo simulations to show that the practical quality of this approximation is satisfactory. A similar analytic expression for the general multifrequency periodogram is also given, although with less numerical verification.
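The quoted double-frequency approximation is elementary to evaluate; a minimal sketch (the values of W and z below are illustrative, not taken from the paper):

```python
import math

def fap_double_freq(W, z):
    """Approximate false alarm probability (pi/16) * W^2 * z^2 * exp(-z)
    for the double-frequency periodogram, where W is the normalized width
    of the settled frequency range and z the observed periodogram maximum."""
    return (math.pi / 16.0) * W**2 * z**2 * math.exp(-z)

fap_30 = fap_double_freq(100.0, 30.0)
fap_40 = fap_double_freq(100.0, 40.0)
```

As expected, the false alarm probability decays steeply with the observed maximum z.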
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high-throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
NASA Astrophysics Data System (ADS)
Casati, Michele
2014-05-01
The assertion that solar activity may play a significant role in triggering large volcanic eruptions has been discussed by many geophysicists. Numerous scientific papers have suggested a possible correlation between these events and the electromagnetic coupling between the Earth and the Sun, but none has been able to establish a statistically significant relationship between large volcanic eruptions and any of the relevant series, such as geomagnetic activity, solar wind, or sunspot number. In our research, we compare the 148 volcanic eruptions of index VEI4 and the 37 major historical volcanic eruptions of index VEI5 or greater, recorded from 1610 to 2012, with the sunspot number (SSN). Taking as the threshold value a monthly sunspot number of 46 (recorded during the great Krakatoa eruption of August 1883, historical index VEI6), we note some possible relationships and conduct a statistical test. • Of the 31 historical large volcanic eruptions with index VEI5+ recorded between 1610 and 1955, 29 were recorded when the SSN<46. The remaining 2 eruptions were recorded not when the SSN<46 but during solar maxima: the Shikotsu eruption during the solar cycle of 1739 and the Ksudach eruption of 1907 during solar cycle No. 14. • Of the 8 historical large volcanic eruptions with index VEI6+ recorded from 1610 to the present, 7 were recorded with SSN<46 and, more specifically, within the three large solar minima known as the Maunder minimum (1645-1710), the Dalton minimum (1790-1830), and the solar minima that occurred between 1880 and 1920. The only exception is the eruption of Pinatubo in June 1991, recorded at the solar maximum of cycle 22. • Of the 6 major historical volcanic eruptions with index VEI5+ recorded after 1955, 5 were recorded not during periods of low solar activity but during the solar maxima of cycles 19, 21, and 22. The significance tests, conducted with the chi-square statistic χ² = 7.782, detect a
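A Pearson chi-square test on an eruption-by-solar-activity contingency table can be sketched as follows. The counts and table layout here are hypothetical illustrations drawn loosely from the bullet points above; the abstract does not specify which table yielded χ² = 7.782:

```python
def chi_square(table):
    """Pearson chi-square statistic for a 2D contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n  # expected under independence
            chi2 += (obs - exp) ** 2 / exp
    return chi2

# Hypothetical counts: rows are eruption eras (pre-/post-1955),
# columns are SSN < 46 vs SSN >= 46 at the time of eruption.
stat = chi_square([[29, 2], [1, 5]])
```

For a 2x2 table (one degree of freedom), values above 3.84 are significant at the 0.05 level.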
Webb-Robertson, Bobbie-Jo M.; McCue, Lee Ann; Waters, Katrina M.; Matzke, Melissa M.; Jacobs, Jon M.; Metz, Thomas O.; Varnum, Susan M.; Pounds, Joel G.
2010-11-01
Liquid chromatography-mass spectrometry-based (LC-MS) proteomics uses peak intensities of proteolytic peptides to infer the differential abundance of peptides/proteins. However, substantial run-to-run variability in peptide intensities and observations (presence/absence) of peptides makes data analysis quite challenging. The missing abundance values in LC-MS proteomics data are difficult to address with traditional imputation-based approaches because the mechanisms by which data are missing are unknown a priori. Data can be missing due to random mechanisms such as experimental error, or non-random mechanisms such as a true biological effect. We present a statistical approach that uses a test of independence known as a G-test to test the null hypothesis of independence between the number of missing values and the experimental groups. We pair the G-test results evaluating independence of missing data (IMD) with a standard analysis of variance (ANOVA) that uses only means and variances computed from the observed data. Each peptide is therefore represented by two statistical confidence metrics, one for qualitative differential observation and one for quantitative differential intensity. We use two simulated and two real LC-MS datasets to demonstrate the robustness and sensitivity of the ANOVA-IMD approach for assigning confidence to peptides with significant differential abundance among experimental groups.
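The G-test of independence behind the IMD metric compares observed and expected counts through a likelihood ratio; a minimal sketch with hypothetical counts of observed vs. missing peptide values per experimental group:

```python
import math

def g_test(table):
    """Likelihood-ratio statistic G = 2 * sum(O * ln(O / E)) for a 2D
    contingency table; cells with zero observed count contribute nothing."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:
                exp = row_tot[i] * col_tot[j] / n
                g += obs * math.log(obs / exp)
    return 2.0 * g

# Rows: experimental groups; columns: (observed, missing) counts
# for one peptide across LC-MS runs. Counts are hypothetical.
g_stat = g_test([[10, 20], [20, 10]])
```

Like the chi-square statistic, G is referred to a chi-square distribution with the table's degrees of freedom.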
NASA Astrophysics Data System (ADS)
Wang, H. J.; Shi, W. L.; Chen, X. H.
2006-05-01
The West Development Policy being implemented in China is causing significant land use and land cover (LULC) changes in West China. With the up-to-date satellite database of the Global Land Cover Characteristics Database (GLCCD) that characterizes the lower boundary conditions, the regional climate model RIEMS-TEA is used to simulate possible impacts of the significant LULC variation. The model was run for five continuous three-month periods from 1 June to 1 September of 1993, 1994, 1995, 1996, and 1997, and the results of the five runs are examined by means of a Student's t-test to identify the statistical significance of regional climate variation. The main results are: (1) The regional climate is affected by the LULC variation because the equilibrium of water and heat transfer at the air-vegetation interface is changed. (2) The integrated impact of the LULC variation on regional climate is not limited to West China, where the LULC varies, but extends to some areas in the model domain where the LULC does not vary at all. (3) The East Asian monsoon system and its vertical structure are adjusted by the large-scale LULC variation in western China, where the consequences are the enhancement of the westward water vapor transfer from the east and the relevant increase of wet-hydrostatic energy in the middle-upper atmospheric layers. (4) The ecological engineering in West China significantly affects the regional climate in Northwest China, North China, and the middle-lower reaches of the Yangtze River; there are obvious effects in South, Northeast, and Southwest China, but minor effects in Tibet.
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8, and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, respectively, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events doubled and all of them became exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier
ERIC Educational Resources Information Center
Harrison, Judith; Thompson, Bruce; Vannest, Kimberly J.
2009-01-01
This article reviews the literature on interventions targeting the academic performance of students with attention-deficit/hyperactivity disorder (ADHD) and does so within the context of the statistical significance testing controversy. Both the arguments for and against null hypothesis statistical significance tests are reviewed. Recent standards…
ERIC Educational Resources Information Center
Weigle, David C.
The purposes of the present paper are to address the historical development of statistical significance testing and to briefly examine contemporary practices regarding such testing in the light of these historical origins. Precursors leading to the advent of statistical significance testing are examined as are more recent controversies surrounding…
Bornmann, Lutz; Leydesdorff, Loet
2013-01-01
Using the InCites tool of Thomson Reuters, this study compares normalized citation impact values calculated for China, Japan, France, Germany, the United States, and the UK throughout the time period from 1981 to 2010. InCites offers a unique opportunity to study the normalized citation impacts of countries using (i) a long publication window (1981 to 2010), (ii) a differentiation in (broad or more narrow) subject areas, and (iii) statistical procedures that allow an insightful investigation of national citation trends across the years. Using four broad categories, our results show significantly increasing trends in citation impact values for France, the UK, and especially Germany across the last thirty years in all areas. The citation impact of papers from China is still at a relatively low level (mostly below the world average), but the country follows an increasing trend line. The USA exhibits a stable pattern of high citation impact values across the years. With small impact differences between the publication years, the US trend is increasing in engineering and technology but decreasing in medical and health sciences as well as in agricultural sciences. Similar to the USA, Japan follows increasing as well as decreasing trends in different subject areas, but the variability across the years is small. In most of the years, papers from Japan perform below or approximately at the world average in each subject area.
Denton, Debra L; Diamond, Jerry; Zheng, Lei
2011-05-01
The U.S. Environmental Protection Agency (U.S. EPA) and state agencies implement the Clean Water Act, in part, by evaluating the toxicity of effluent and surface water samples. A common goal for both regulatory authorities and permittees is confidence in an individual test result (e.g., no-observed-effect concentration [NOEC], pass/fail, 25% effective concentration [EC25]), which is used to make regulatory decisions, such as reasonable potential determinations, permit compliance, and watershed assessments. This paper discusses an additional statistical approach (test of significant toxicity [TST]), based on bioequivalence hypothesis testing, or, more appropriately, test of noninferiority, which examines whether there is a nontoxic effect at a single concentration of concern compared with a control. Unlike the traditional hypothesis testing approach in whole effluent toxicity (WET) testing, TST is designed to incorporate explicitly both α and β error rates at levels of toxicity that are unacceptable and acceptable, given routine laboratory test performance for a given test method. Regulatory management decisions are used to identify unacceptable toxicity levels for acute and chronic tests, and the null hypothesis is constructed such that test power is associated with the ability to declare correctly a truly nontoxic sample as acceptable. This approach provides a positive incentive to generate high-quality WET data to make informed decisions regarding regulatory decisions. This paper illustrates how α and β error rates were established for specific test method designs and tests the TST approach using both simulation analyses and actual WET data. In general, those WET test endpoints having higher routine (e.g., 50th percentile) within-test control variation, on average, have higher method-specific α values (type I error rate), to maintain a desired type II error rate. This paper delineates the technical underpinnings of this approach and demonstrates the benefits
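The TST statistic can be sketched as a Welch-type t comparing the treatment mean against a fixed fraction b of the control mean; b = 0.75 is the chronic-test regulatory bound described in the TST literature, and the reproduction counts below are hypothetical (this is an illustrative sketch, not the U.S. EPA's reference implementation):

```python
import math

def welch_tst(control, treatment, b=0.75):
    """Welch-type statistic for the test of significant toxicity:
    H0: mean(treatment) <= b * mean(control). Large positive values
    favour declaring the sample acceptable (non-toxic)."""
    def mean_var(x):
        m = sum(x) / len(x)
        v = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
        return m, v
    mc, vc = mean_var(control)
    mt, vt = mean_var(treatment)
    se = math.sqrt(vt / len(treatment) + b * b * vc / len(control))
    return (mt - b * mc) / se

# Hypothetical reproduction counts: the treatment matches the control,
# so the statistic is large and positive (sample declared acceptable).
t_stat = welch_tst([10.0, 10.5, 9.5, 10.0], [10.0, 10.0, 10.2, 9.8])
```

Note how the null hypothesis is inverted relative to traditional WET hypothesis testing: statistical power here corresponds to correctly declaring a truly nontoxic sample acceptable.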
NASA Technical Reports Server (NTRS)
Friedlander, Alan L.; Harry, David P., III
1960-01-01
An exploratory analysis of vehicle guidance during the approach to a target planet is presented. The objective of the guidance maneuver is to guide the vehicle to a specific perigee distance with a high degree of accuracy and minimum corrective velocity expenditure. The guidance maneuver is simulated by considering the random sampling of real measurements with significant error and reducing this information to prescribe appropriate corrective action. The instrumentation system assumed includes optical and/or infrared devices to indicate range and a reference angle in the trajectory plane. Statistical results are obtained by Monte-Carlo techniques and are shown as the expectation of guidance accuracy and velocity-increment requirements. Results are nondimensional and applicable to any planet within limits of two-body assumptions. The problem of determining how many corrections to make and when to make them is a consequence of the conflicting requirements of accurate trajectory determination and propulsion. Optimum values were found for a vehicle approaching a planet along a parabolic trajectory with an initial perigee distance of 5 radii and a target perigee of 1.02 radii. In this example measurement errors were less than 1 minute of arc. Results indicate that four corrections applied in the vicinity of 50, 16, 15, and 1.5 radii, respectively, yield minimum velocity-increment requirements. Thrust devices capable of producing a large variation of velocity-increment size are required. For a vehicle approaching the earth, miss distances within 32 miles are obtained with 90-percent probability. Total velocity increments used in guidance are less than 3300 feet per second with 90-percent probability. It is noted that the above representative results are valid only for the particular guidance scheme hypothesized in this analysis. A parametric study is presented which indicates the effects of measurement error size, initial perigee, and initial energy on the guidance
Fisher, Aaron; Anderson, G Brooke; Peng, Roger; Leek, Jeff
2014-01-01
Scatterplots are the most common way for statisticians, scientists, and the public to visually detect relationships between measured variables. At the same time, and despite widely publicized controversy, P-values remain the most commonly used measure to statistically justify relationships identified between variables. Here we measure the ability to detect statistically significant relationships from scatterplots in a randomized trial of 2,039 students in a statistics massive open online course (MOOC). Each subject was shown a random set of scatterplots and asked to visually determine if the underlying relationships were statistically significant at the P < 0.05 level. Subjects correctly classified only 47.4% (95% CI [45.1%-49.7%]) of statistically significant relationships, and 74.6% (95% CI [72.5%-76.6%]) of non-significant relationships. Adding visual aids such as a best fit line or scatterplot smooth increased the probability a relationship was called significant, regardless of whether the relationship was actually significant. Classification of statistically significant relationships improved on repeat attempts of the survey, although classification of non-significant relationships did not. Our results suggest: (1) that evidence-based data analysis can be used to identify weaknesses in theoretical procedures in the hands of average users, (2) data analysts can be trained to improve detection of statistically significant results with practice, but (3) data analysts have incorrect intuition about what statistically significant relationships look like, particularly for small effects. We have built a web tool for people to compare scatterplots with their corresponding p-values which is available here: http://glimmer.rstudio.com/afisher/EDA/. PMID:25337457
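The "statistically significant at P < 0.05" judgment the students made visually corresponds, for a linear relationship, to a t-test on the Pearson correlation; a minimal sketch with made-up data:

```python
import math

def pearson_t(x, y):
    """Pearson correlation r and the statistic t = r*sqrt(n-2)/sqrt(1-r^2)
    used to test H0: no linear relationship between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return r, t

# Made-up points with a strong (but not perfect) linear trend.
r, t = pearson_t([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

The t statistic is referred to a t distribution with n - 2 degrees of freedom, which is what a conventional significance call on a scatterplot implicitly encodes.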
Greenhalgh, T.
1997-01-01
It is possible to be seriously misled by taking the statistical competence (and/or the intellectual honesty) of authors for granted. Some common errors committed (deliberately or inadvertently) by the authors of papers are given in the final box. PMID:9277611
Yokoyama, Shozo; Takenaka, Naomi
2005-04-01
Red-green color vision is strongly suspected to enhance the survival of its possessors. Despite being red-green color blind, however, many species have successfully competed in nature, which brings into question the evolutionary advantage of achieving red-green color vision. Here, we propose a new method of identifying positive selection at individual amino acid sites with the premise that if positive Darwinian selection has driven the evolution of the protein under consideration, then it should be found mostly at the branches in the phylogenetic tree where its function had changed. The statistical and molecular methods have been applied to 29 visual pigments with the wavelengths of maximal absorption at approximately 510-540 nm (green- or middle wavelength-sensitive [MWS] pigments) and at approximately 560 nm (red- or long wavelength-sensitive [LWS] pigments), which are sampled from a diverse range of vertebrate species. The results show that the MWS pigments are positively selected through amino acid replacements S180A, Y277F, and T285A and that the LWS pigments have been subjected to strong evolutionary conservation. The fact that these positively selected M/LWS pigments are found not only in animals with red-green color vision but also in those with red-green color blindness strongly suggests that both red-green color vision and color blindness have undergone adaptive evolution independently in different species.
Kurtz, S.E.; Fields, D.E.
1983-10-01
This report describes a version of the TERPED/P computer code that is very useful for small data sets. A new algorithm for determining the Kolmogorov-Smirnov (KS) statistics is used to extend program applicability. The TERPED/P code facilitates the analysis of experimental data and assists the user in determining its probability distribution function. Graphical and numerical tests are performed interactively in accordance with the user's assumption of normally or log-normally distributed data. Statistical analysis options include computation of the chi-square statistic and the KS one-sample test statistic and the corresponding significance levels. Cumulative probability plots of the user's data are generated either via a local graphics terminal, a local line printer or character-oriented terminal, or a remote high-resolution graphics device such as the FR80 film plotter or the Calcomp paper plotter. Several useful computer methodologies suffer from limitations of their implementations of the KS nonparametric test. This test is one of the more powerful analysis tools for examining the validity of an assumption about the probability distribution of a set of data. KS algorithms are found in other analysis codes, including the Statistical Analysis Subroutine (SAS) package and earlier versions of TERPED. The inability of these algorithms to generate significance levels for sample sizes less than 50 has limited their usefulness. The release of the TERPED code described herein contains algorithms to allow computation of the KS statistic and significance level for data sets of, if the user wishes, as few as three points. Values computed for the KS statistic are within 3% of the correct value for all data set sizes.
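The one-sample KS statistic itself is straightforward to compute; a minimal sketch against a fully specified CDF, using the standard asymptotic Kolmogorov series with a small-sample correction for the significance level. TERPED's own small-sample algorithm is not given in the abstract, so this is a generic approximation, not the code's method:

```python
import math

def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic D: the largest gap
    between the empirical CDF and the hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_significance(d, n, terms=100):
    """Asymptotic significance Q = 2 * sum (-1)^(k-1) exp(-2 k^2 lam^2),
    with the common correction lam = (sqrt(n) + 0.12 + 0.11/sqrt(n)) * D."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    q = sum(2.0 * (-1.0) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(max(q, 0.0), 1.0)

# Three points (the smallest sample size TERPED accepts) tested
# against a Uniform(0, 1) null hypothesis.
d = ks_statistic([0.1, 0.2, 0.3], lambda x: x)
p = ks_significance(d, 3)
```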
Boareto, Marcelo; Caticha, Nestor
2014-01-01
Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
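Because the t-test is scale- and location-invariant, it can be applied directly to the normalized probe intensities as the abstract describes; a minimal sketch of the standard pooled two-sample statistic with made-up intensity values:

```python
import math

def pooled_t(a, b):
    """Standard two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)  # pooled variance estimate
    return (ma - mb) / math.sqrt(sp2 * (1.0 / na + 1.0 / nb))

# Hypothetical normalized intensities for one probe in two conditions.
t_stat = pooled_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Rescaling or shifting both groups by the same factor leaves the statistic unchanged, which is why normalization rather than background correction suffices here.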
NASA Technical Reports Server (NTRS)
Druzhinin, I. P.; Khamyanova, N. V.; Yagodinskiy, V. N.
1974-01-01
Statistical evaluations of the significance of the relationship of abrupt changes in solar activity and discontinuities in the multi-year pattern of an epidemic process are reported. They reliably (with probability of more than 99.9%) show the real nature of this relationship and its great specific weight (about half) in the formation of discontinuities in the multi-year pattern of the processes in question.
Burden, Adrian Mark; Lewis, Sandra Elizabeth; Willcox, Emma
2014-12-01
Numerous ways exist to process raw electromyograms (EMGs). However, the effect of altering processing methods on peak and mean EMG has seldom been investigated. The aim of this study was to investigate the effect of using different root mean square (RMS) window lengths and overlaps on the amplitude, reliability and inter-individual variability of gluteus maximus EMGs recorded during the clam exercise, and on the statistical significance and clinical relevance of amplitude differences between two exercise conditions. Mean and peak RMS of 10 repetitions from 17 participants were obtained using processing window lengths of 0.01, 0.15, 0.2, 0.25 and 1 s, with no overlap and overlaps of 25, 50 and 75% of window length. The effect of manipulating window length on reliability and inter-individual variability was greater for peak EMG (coefficient of variation [CV] <9%) than for mean EMG (CV <3%), with the 1 s window generally displaying the lowest variability. As a consequence, neither the statistical significance nor the clinical relevance (effect size [ES]) of mean EMG was affected by manipulation of window length. The statistical significance of peak EMG was more sensitive to changes in window length, with lower p-values generally being recorded for the 1 s window. Because the use of different window lengths has a greater effect on the variability and statistical significance of peak EMG, clinicians should use mean EMG. They should also be aware that the use of different numbers of exercise repetitions and participants can have a greater effect on EMG parameters than the length of the processing window.
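The windowed RMS processing investigated in the study can be sketched as below; rms_envelope is an illustrative name, and the default window length and overlap are examples rather than the study's settings:

```python
from math import sqrt

def rms_envelope(emg, fs, window_s=0.25, overlap=0.5):
    """Sliding-window RMS of a raw EMG signal.

    emg: sequence of raw samples; fs: sampling rate in Hz;
    window_s: window length in seconds; overlap: fraction of
    window overlap between successive windows.
    """
    win = int(round(window_s * fs))
    step = max(1, int(round(win * (1.0 - overlap))))
    return [sqrt(sum(v * v for v in emg[i:i + win]) / win)
            for i in range(0, len(emg) - win + 1, step)]
```

Mean EMG would then be the average of the returned envelope and peak EMG its maximum; the study's finding is that the peak is the more sensitive of the two to the choice of window parameters.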
Aouinti, Safa; Malouche, Dhafer; Giudicelli, Véronique; Kossida, Sofia; Lefranc, Marie-Paule
2015-01-01
The adaptive immune responses of humans and of other jawed vertebrate species (gnathostomata) are characterized by the B and T cells and their specific antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR) (up to 2 × 10^12 different IG and TR per individual). IMGT, the international ImMunoGeneTics information system (http://www.imgt.org), was created in 1989 by Marie-Paule Lefranc (Montpellier University and CNRS) to manage the huge and complex diversity of these antigen receptors. IMGT, built on the IMGT-ONTOLOGY concepts of identification (keywords), description (labels), classification (gene and allele nomenclature) and numerotation (IMGT unique numbering), is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. IMGT/HighV-QUEST, the first web portal, and so far the only one, for the next generation sequencing (NGS) analysis of IG and TR, is the paradigm for immune repertoire standardized outputs and immunoprofiles of the adaptive immune responses. It provides the identification of the variable (V), diversity (D) and joining (J) genes and alleles, analysis of the V-(D)-J junction and complementarity determining region 3 (CDR3), and the characterization of the 'IMGT clonotype (AA)' (AA for amino acid) diversity and expression. IMGT/HighV-QUEST compares outputs of different batches, up to one million nucleotide sequences for the statistical module. These high throughput IG and TR repertoire immunoprofiles are of prime importance in vaccination, cancer, infectious diseases, autoimmunity and lymphoproliferative disorders; however, their comparative statistical analysis still remains a challenge. We present a standardized statistical procedure to analyze IMGT/HighV-QUEST outputs for the evaluation of the significance of the IMGT clonotype (AA) diversity differences in proportions, per gene of a given group, between NGS IG and TR repertoire immunoprofiles. The procedure is generic and
Best, R; Harrell, A; Geesey, C; Libby, B; Wijesooriya, K
2014-06-15
Purpose: The purpose of this study is to inter-compare and find statistically significant differences between flattened field fixed-beam (FB) IMRT and flattening-filter free (FFF) volumetric modulated arc therapy (VMAT) for stereotactic body radiation therapy (SBRT). Methods: SBRT plans using FB IMRT and FFF VMAT were generated for fifteen SBRT lung patients using 6 MV beams. For each patient, both IMRT and VMAT plans were created for comparison. Plans were generated utilizing RTOG 0915 (peripheral, 10 patients) and RTOG 0813 (medial, 5 patients) lung protocols. Target dose, critical structure dose, and treatment time were compared and tested for statistical significance. Parameters of interest included prescription isodose surface coverage, target dose heterogeneity, high dose spillage (location and volume), low dose spillage (location and volume), lung dose spillage, and critical structure maximum- and volumetric-dose limits. Results: For all criteria, we found equivalent or higher conformality with VMAT plans as well as reduced critical structure doses. Several differences passed a Student's t-test of significance: VMAT reduced the high dose spillage, evaluated with conformality index (CI), by an average of 9.4%±15.1% (p=0.030) compared to IMRT. VMAT plans reduced the lung volume receiving 20 Gy by 16.2%±15.0% (p=0.016) compared with IMRT. For the RTOG 0915 peripheral lesions, the volumes of lung receiving 12.4 Gy and 11.6 Gy were reduced by 27.0%±13.8% and 27.5%±12.6% (for both, p<0.001) in VMAT plans. Of the 26 protocol pass/fail criteria, VMAT plans were able to achieve an average of 0.2±0.7 (p=0.026) more constraints than the IMRT plans. Conclusions: FFF VMAT has dosimetric advantages over fixed-beam IMRT for lung SBRT. Significant advantages included increased dose conformity and reduced organs-at-risk doses. The overall improvements in terms of protocol pass/fail criteria were more modest and will require more patient data to establish difference
Maric, Marija; de Haan, Else; Hogendoorn, Sanne M; Wolters, Lidewij H; Huizenga, Hilde M
2015-03-01
Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a data-analytic method to analyze univariate (i.e., one symptom) single-case data using the common package SPSS. This method can help the clinical researcher to investigate whether an intervention works as compared with a baseline period or another intervention type, and to determine whether symptom improvement is clinically significant. First, we describe the statistical method in a conceptual way and show how it can be implemented in SPSS. Simulation studies were performed to determine the number of observation points required per intervention phase. Second, to illustrate this method and its implications, we present a case study of an adolescent with anxiety disorders treated with cognitive-behavioral therapy techniques in an outpatient psychotherapy clinic, whose symptoms were regularly assessed before each session. We provide a description of the data analyses and results of this case study. Finally, we discuss the advantages and shortcomings of the proposed method.
Jäger, Markus; Bottlender, Ronald; Strauss, Anton; Möller, Hans-Jürgen
2005-01-01
Following Kraepelin's original description of "manic-depressive insanity," the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) embodies a broad concept of affective disorders that includes mood-congruent and mood-incongruent psychotic features. Controversial results have been reported about the prognostic significance of psychotic symptoms in depressive disorders, challenging this broad concept of affective disorders. One hundred seventeen inpatients first hospitalized in 1980 to 1982 who retrospectively fulfilled the DSM-IV criteria for depressive disorders with mood-congruent or mood-incongruent psychotic features (n = 20), nonpsychotic depressive disorders (n = 33), or schizophrenia (n = 64) were followed up 15 years after their first hospitalization. Global functioning was recorded with the Global Assessment Scale; the clinical picture at follow-up was assessed using the Hamilton Rating Scale for Depression, the Positive and Negative Syndrome Scale, and the Scale for the Assessment of Negative Symptoms. With respect to global functioning, clinical picture, and social impairment at follow-up, depressive disorders with psychotic features were similar to those without, but markedly different from schizophrenia. However, patients with psychotic depressive disorders experienced more rehospitalizations than those with nonpsychotic ones. The findings, indicating low prognostic significance of psychotic symptoms in depressive disorders, are in line with the broad concept of affective disorders in DSM-IV.
Minor changes in the indicator used to measure fine PM, which cause only modest changes in mass concentrations, can lead to dramatic changes in the statistical relationship of fine PM mass with cardiovascular mortality. An epidemiologic study in Phoenix (Mar et al., 2000), augme...
ERIC Educational Resources Information Center
Cicchetti, Domenic V.; Koenig, Kathy; Klin, Ami; Volkmar, Fred R.; Paul, Rhea; Sparrow, Sara
2011-01-01
The objectives of this report are: (a) to trace the theoretical roots of the concept clinical significance that derives from Bayesian thinking, Marginal Utility/Diminishing Returns in Economics, and the "just noticeable difference", in Psychophysics. These concepts then translated into: Effect Size (ES), strength of agreement, clinical…
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2001-01-01
Since 1750, the number of cataclysmic volcanic eruptions (volcanic explosivity index (VEI)>=4) per decade spans 2-11, with 96 percent located in the tropics and extra-tropical Northern Hemisphere. A two-point moving average of the volcanic time series has higher values since the 1860's than before, being 8.00 in the 1910's (the highest value) and 6.50 in the 1980's, the highest since the 1910's peak. Because of the usual behavior of the first difference of the two-point moving averages, one infers that its value for the 1990's will measure approximately 6.50 +/- 1, implying that approximately 7 +/- 4 cataclysmic volcanic eruptions should be expected during the present decade (2000-2009). Because cataclysmic volcanic eruptions (especially those having VEI>=5) nearly always have been associated with short-term episodes of global cooling, the occurrence of even one might confuse our ability to assess the effects of global warming. Poisson probability distributions reveal that the probability of one or more events with a VEI>=4 within the next ten years is >99 percent. It is approximately 49 percent for an event with a VEI>=5, and 18 percent for an event with a VEI>=6. Hence, the likelihood that a climatically significant volcanic eruption will occur within the next ten years appears reasonably high.
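The Poisson reasoning in the abstract above is easy to reproduce: for a Poisson count with mean λ events per decade, the probability of at least one event is 1 - e^(-λ). The decade rates below are illustrative values chosen to be consistent with the probabilities quoted in the abstract, not figures taken from the study itself:

```python
from math import exp

def prob_at_least_one(rate_per_decade):
    """P(N >= 1) for a Poisson count N with the given mean per decade."""
    return 1.0 - exp(-rate_per_decade)

# Assumed rates (events per decade), roughly consistent with the abstract:
for vei, lam in [(">=4", 7.0), (">=5", 0.67), (">=6", 0.20)]:
    print(f"VEI {vei}: P(at least one per decade) = {prob_at_least_one(lam):.2f}")
```

With a mean of about 7 VEI>=4 eruptions per decade, the chance of at least one is 1 - e^(-7), comfortably above 99 percent, matching the abstract's figure.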
Tonello, Lucio; Conway de Macario, Everly; Marino Gammazza, Antonella; Cocchi, Massimo; Gabrielli, Fabio; Zummo, Giovanni; Cappello, Francesco; Macario, Alberto J L
2015-03-01
The pathogenesis of Hashimoto's thyroiditis includes autoimmunity involving thyroid antigens, autoantibodies, and possibly cytokines. It is unclear what role Hsp60 plays, but our recent data indicate that it may contribute to pathogenesis as an autoantigen. Its role in the induction of cytokine production, pro- or anti-inflammatory, was not elucidated, except that we found that peripheral blood mononucleated cells (PBMC) from patients or from healthy controls did not respond to stimulation by Hsp60 in vitro with cytokine-production patterns that would differentiate patients from controls with statistical significance. This "negative" outcome appeared when the data were pooled and analyzed with conventional statistical methods. We re-analyzed our data with non-conventional statistical methods based on data mining, using the classification and regression tree learning algorithm and clustering methodology. The results indicate that by focusing on IFN-γ and IL-2 levels before and after Hsp60 stimulation of PBMC in each patient, it is possible to differentiate patients from controls. A major general conclusion is that when trying to identify disease markers such as levels of cytokines and Hsp60, reference to standards obtained from pooled data from many patients may be misleading. The chosen biomarker, e.g., production of IFN-γ and IL-2 by PBMC upon stimulation with Hsp60, must be assessed before and after stimulation, and the results compared within each patient and analyzed with conventional and data mining statistical methods.
Grossling, Bernardo F.
1975-01-01
Exploratory drilling is still in incipient or youthful stages in those areas of the world where the bulk of the potential petroleum resources is yet to be discovered. Methods of assessing resources from projections based on historical production and reserve data are limited to mature areas. For most of the world's petroleum-prospective areas, a more speculative situation calls for a critical review of resource-assessment methodology. The language of mathematical statistics is required to define more rigorously the appraisal of petroleum resources. Basically, two approaches have been used to appraise the amounts of undiscovered mineral resources in a geologic province: (1) projection models, which use statistical data on the past outcome of exploration and development in the province; and (2) estimation models of the overall resources of the province, which use certain known parameters of the province together with the outcome of exploration and development in analogous provinces. These two approaches often lead to widely different estimates. Some of the controversy that arises results from a confusion of the probabilistic significance of the quantities yielded by each of the two approaches. Also, inherent limitations of analytic projection models (such as those using the logistic and Gompertz functions) have often been ignored. The resource-assessment problem should be recast in terms that provide for consideration of the probability of existence of the resource and of the probability of discovery of a deposit. Then the two above-mentioned models occupy the two ends of the probability range. The new approach accounts for (1) what can be expected with reasonably high certainty by mere projections of what has been accomplished in the past; (2) the inherent biases of decision-makers and resource estimators; (3) upper bounds that can be set up as goals for exploration; and (4) the uncertainties in geologic conditions in a search for minerals. Actual outcomes can then
Cosmic statistics of statistics
NASA Astrophysics Data System (ADS)
Szapudi, István; Colombi, Stéphane; Bernardeau, Francis
1999-12-01
The errors on statistics measured in finite galaxy catalogues are exhaustively investigated. The theory of errors on factorial moments by Szapudi & Colombi is applied to cumulants via a series expansion method. All results are subsequently extended to the weakly non-linear regime. Together with previous investigations this yields an analytic theory of the errors for moments and connected moments of counts in cells from highly non-linear to weakly non-linear scales. For non-linear functions of unbiased estimators, such as the cumulants, the phenomenon of cosmic bias is identified and computed. Since it is subdued by the cosmic errors in the range of applicability of the theory, correction for it is inconsequential. In addition, the method of Colombi, Szapudi & Szalay concerning sampling effects is generalized, adapting the theory for inhomogeneous galaxy catalogues. While previous work focused on the variance only, the present article calculates the cross-correlations between moments and connected moments as well, for a statistically complete description. The final analytic formulae representing the full theory are explicit but somewhat complicated. Therefore we have made available a Fortran program capable of calculating the described quantities numerically (for further details e-mail SC at colombi@iap.fr). An important special case is the evaluation of the errors on the two-point correlation function, for which this should be more accurate than any method put forward previously. This tool will be immensely useful in the future for assessing the precision of measurements from existing catalogues, as well as aiding the design of new galaxy surveys. To illustrate the applicability of the results and to explore the numerical aspects of the theory qualitatively and quantitatively, the errors and cross-correlations are predicted under a wide range of assumptions for the future Sloan Digital Sky Survey. The principal results concerning the cumulants ξ, Q3 and Q4 are that
Hunt, N C; Ghosh, K M; Blain, A P; Rushton, S P; Longstaff, L M; Deehan, D J
2015-05-01
The aim of this study was to compare the maximum laxity conferred by the cruciate-retaining (CR) and posterior-stabilised (PS) Triathlon single-radius total knee arthroplasty (TKA) for anterior drawer, varus-valgus opening and rotation in eight cadaver knees through a defined arc of flexion (0° to 110°). The null hypothesis was that the limits of laxity of CR- and PS-TKAs are not significantly different. The investigation was undertaken in eight loaded cadaver knees undergoing subjective stress testing using a measurement rig. Firstly the native knee was tested prior to preparation for CR-TKA and subsequently for PS-TKA implantation. Surgical navigation was used to track maximal displacements/rotations at 0°, 30°, 60°, 90° and 110° of flexion. Mixed-effects modelling was used to define the behaviour of the TKAs. The laxity measured for the CR- and PS-TKAs revealed no statistically significant differences over the studied flexion arc for the two versions of TKA. Compared with the native knee, both TKAs exhibited slightly increased anterior drawer and decreased varus-valgus and internal-external rotational laxities. We believe further study is required to define the clinical states for which the additional constraint offered by a PS-TKA implant may be beneficial.
NASA Astrophysics Data System (ADS)
Temme, F. P.
1992-12-01
Realisation of the invariance properties of the p ≤ 2 number partitional inventory components of the 20-fold spin algebra associated with [A]20 nuclear spin clusters under SU2 × L20 allows the mappings {[λ] → Γ} to be derived. In addition, recent general inner tensor product expressions under Ln, for n even (odd), also facilitate the evaluation of many higher [λ](L20; p = 3) correlative mappings onto SU3 ↓ SO(3) × L20 ↓ TA5 subduced symmetry from SU2 duality, thus providing results that determine the nature of adapted NMR bases for both dodecahedrane and its d20 analogue. The significance of this work lies in the pertinence of nuclear spin statistics to both selective MQ-NMR and to other spectroscopic aspects of cage clusters, e.g., [13C]n, n = 20, 60, fullerenes. Mappings onto Ln irrep sets of specific p ≤ 3 number partitions arise in the combinatorial treatment of {Miti} Rota fields, defining scalar invariants in the context of Cayley algebra. Inclusion of the Ln group in the specific Racah chain for NMR symmetry gives rise to significant further physical insight.
Statistics-based research--a pig in a poke?
Penston, James
2011-10-01
Much of medical research involves large-scale randomized controlled trials designed to detect small differences in outcome between the study groups. This approach is believed to produce reliable evidence on which the management of patients is based. But can we be sure that the demonstration of a small, albeit statistically significant, difference is sufficient to infer the presence of a causal relationship between the drug and the outcome? A study is claimed to have internal validity when other explanations for the observed difference - namely, inequalities between the groups, bias in the assessment of the outcome and chance - have been excluded. Despite the various processes that are put into place - including, for example, randomization, allocation concealment, double-blinding and intention-to-treat analysis - it remains doubtful whether the groups are equal in terms of all factors relevant to the outcome and whether bias has been excluded. As for the exclusion of chance, not only may inappropriate statistical tests be used, but also frequentist statistics has been subjected to serious criticisms in recent years that further bring internal validity into question. But the problems do not end with the flaws in internal validity. The philosophical basis of large-scale randomized controlled trials and epidemiological studies is unsound. When examined closely, many obstacles emerge that threaten the inference from a small, statistically significant difference to the presence of a causal relationship between the drug and the outcome. Given the influence of statistics-based research on the practice of medicine, it is of the utmost importance that the flaws in this methodology are brought to the fore.
Smith, Alwyn
1969-01-01
This paper is based on an analysis of questionnaires sent to the health ministries of Member States of WHO asking for information about the extent, nature, and scope of morbidity statistical information. It is clear that most countries collect some statistics of morbidity and many countries collect extensive data. However, few countries relate their collection to the needs of health administrators for information, and many countries collect statistics principally for publication in annual volumes which may appear anything up to 3 years after the year to which they refer. The desiderata of morbidity statistics may be summarized as reliability, representativeness, and relevance to current health problems. PMID:5306722
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2012-01-01
The term "data snooping" refers to the practice of choosing which statistical analyses to apply to a set of data after having first looked at those data. Data snooping contradicts a fundamental precept of applied statistics, that the scheme of analysis is to be planned in advance. In this column, the authors shall elucidate the statistical…
Simulations of avalanche breakdown statistics: probability and timing
NASA Astrophysics Data System (ADS)
Ng, Jo Shien; Tan, Chee Hing; David, John P. R.
2010-04-01
Important avalanche breakdown statistics for single photon avalanche diodes (SPADs), such as avalanche breakdown probability, dark count rate, and the distribution of time taken to reach breakdown (providing mean time to breakdown and jitter), were simulated. These simulations enable unambiguous studies of the effects of avalanche region width, ionization coefficient ratio and carrier dead space on the avalanche statistics, which set the fundamental limits of SPADs. The effects of the quenching resistor/circuit have been ignored. Due to competing effects between dead spaces, which are significant in modern SPADs with narrow avalanche regions, and converging ionization coefficients, the breakdown probability versus overbias characteristics for different avalanche region widths are fairly close to each other. Concerning avalanche breakdown timing at a given value of breakdown probability, using an avalanche material with similar ionization coefficients yields fast avalanche breakdowns with small timing jitter (albeit at a higher operating field), compared to a material with dissimilar ionization coefficients. This is the opposite of the requirement for abrupt breakdown probability versus overbias characteristics. In addition, by taking band-to-band tunneling current (dark carriers) into account, the minimum avalanche region width for practical SPADs was found to be 0.3 and 0.2 μm for InP and InAlAs, respectively.
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
ERIC Educational Resources Information Center
Huberty, Carl J.
An approach to statistical testing, which combines Neyman-Pearson hypothesis testing and Fisher significance testing, is recommended. The use of P-values in this approach is discussed in some detail. The author also discusses some problems which are often found in introductory statistics textbooks. The problems involve the definitions of…
The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute works to provide information on cancer statistics in an effort to reduce the burden of cancer among the U.S. population.
NASA Astrophysics Data System (ADS)
Hermann, Claudine
Statistical physics bridges the properties of a macroscopic system and the microscopic behaviour of its constituting particles, a connection that would otherwise be impossible to establish because of the enormous magnitude of Avogadro's number. Numerous systems of today's key technologies, such as semiconductors or lasers, are macroscopic quantum objects; only statistical physics allows for understanding their fundamentals. Therefore, this graduate text also focuses on particular applications, such as the properties of electrons in solids, and on radiation thermodynamics and the greenhouse effect.
Significant lexical relationships
Pedersen, T.; Kayaalp, M.; Bruce, R.
1996-12-31
Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
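An exact conditional test of the kind the authors describe can be computed directly from the hypergeometric distribution. The sketch below is a right-tailed Fisher's exact test on a 2x2 contingency table (for example, counts of a word pair co-occurring versus not); fisher_exact_right is an illustrative name, not the authors' software:

```python
from math import comb

def fisher_exact_right(a, b, c, d):
    """Right-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is hypergeometric under the null
    hypothesis of independence between rows and columns.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, col1)
    return sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1)) / denom
```

Because the test conditions on the table margins and enumerates outcomes exactly, it remains valid for the rare-event counts typical of lexical data, where chi-square approximations break down.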
Bishop, Joseph E.; Strack, O. E.
2011-03-22
A novel method is presented for assessing the convergence of a sequence of statistical distributions generated by direct Monte Carlo sampling. The primary application is to assess the mesh or grid convergence, and possibly divergence, of stochastic outputs from non-linear continuum systems. Example systems include those from fluid or solid mechanics, particularly those with instabilities and sensitive dependence on initial conditions or system parameters. The convergence assessment is based on demonstrating empirically that a sequence of cumulative distribution functions converges in the L∞ norm. The effect of finite sample sizes is quantified using confidence levels from the Kolmogorov–Smirnov statistic. The statistical method is independent of the underlying distributions. The statistical method is demonstrated using two examples: (1) the logistic map in the chaotic regime, and (2) a fragmenting ductile ring modeled with an explicit-dynamics finite element code. In the fragmenting ring example the convergence of the distribution describing neck spacing is investigated. The initial yield strength is treated as a random field. Two different random fields are considered, one with spatial correlation and the other without. Both cases converged, albeit to different distributions. The case with spatial correlation exhibited a significantly higher convergence rate compared with the one without spatial correlation.
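The convergence check described above, comparing empirical CDFs in the L∞ norm, reduces to the two-sample Kolmogorov-Smirnov statistic between successive sample sets; the sketch below is a minimal illustration, not the authors' code:

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the two empirical CDFs, evaluated at every
    observed value in the pooled sample."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    return max(abs(bisect_right(a, v) / na - bisect_right(b, v) / nb)
               for v in a + b)
```

Convergence of a mesh-refinement sequence would then be argued by showing that this statistic between outputs at successive refinements falls below the KS confidence bound appropriate to the sample sizes used.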
NASA Astrophysics Data System (ADS)
Goodman, Joseph W.
2000-07-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Robert G. Bartle The Elements of Integration and Lebesgue Measure George E. P. Box & Norman R. Draper Evolutionary Operation: A Statistical Method for Process Improvement George E. P. Box & George C. Tiao Bayesian Inference in Statistical Analysis R. W. Carter Finite Groups of Lie Type: Conjugacy Classes and Complex Characters R. W. Carter Simple Groups of Lie Type William G. Cochran & Gertrude M. Cox Experimental Designs, Second Edition Richard Courant Differential and Integral Calculus, Volume I Richard Courant Differential and Integral Calculus, Volume II Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume I Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume II D. R. Cox Planning of Experiments Harold S. M. Coxeter Introduction to Geometry, Second Edition Charles W. Curtis & Irving Reiner Representation Theory of Finite Groups and Associative Algebras Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II Cuthbert Daniel Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition Bruno de Finetti Theory of Probability, Volume I Bruno de Finetti Theory of Probability, Volume II W. Edwards Deming Sample Design in Business Research
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. PMID:26466186
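The 2×2-table measures this review covers (sensitivity, specificity, accuracy, likelihood ratios) can be sketched in a few lines; the counts below are hypothetical, chosen only for illustration:

```python
# Sketch: common diagnostic-test metrics from a 2x2 table (hypothetical counts).
def diagnostic_metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)               # P(test+ | disease+)
    specificity = tn / (tn + fp)               # P(test- | disease-)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    return sensitivity, specificity, accuracy, lr_pos, lr_neg

sens, spec, acc, lrp, lrn = diagnostic_metrics(tp=90, fp=20, fn=10, tn=80)
print(sens, spec, acc, lrp)   # 0.9 0.8 0.85 4.5
```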
Han, Yali; Liu, Jie; Sun, Meili; Zhang, Zongpu; Liu, Chuanyong; Sun, Yuping
2016-01-01
Background. There is no definitive conclusion so far on the predictive values of ERCC1 polymorphisms for clinical outcomes of platinum-based chemotherapy in non-small cell lung cancer (NSCLC). We updated this meta-analysis with the aim of providing a firmer statistical basis on this issue. Methods. Relevant studies were identified by searching the MEDLINE and EMBASE databases from inception to April 2015. Primary outcomes included objective response rate (ORR), progression-free survival (PFS), and overall survival (OS). All analyses were performed using Review Manager version 5.3 and Stata version 12.0. Results. A total of 33 studies including 5373 patients were identified. ERCC1 C118T and C8092A could predict both ORR and OS for platinum-based chemotherapy in Asian NSCLC patients (CT + TT versus CC, ORR: OR = 0.80, 95% CI = 0.67–0.94; OS: HR = 1.24, 95% CI = 1.01–1.53) (CA + AA versus CC, ORR: OR = 0.76, 95% CI = 0.60–0.96; OS: HR = 1.37, 95% CI = 1.06–1.75). Conclusions. Current evidence strongly indicated the prospect of ERCC1 C118T and C8092A as predictive biomarkers for platinum-based chemotherapy in Asian NSCLC patients. However, the results should be interpreted with caution and large prospective studies are still required to further investigate these findings. PMID:27057082
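The odds ratios above summarize 2×2 tables; a minimal sketch of an odds ratio with a Wald-type 95% confidence interval, using made-up counts rather than the meta-analysis data:

```python
import math

# Sketch: odds ratio with a 95% Wald CI from a 2x2 table.
# Counts are hypothetical, not taken from the study above.
def odds_ratio_ci(a, b, c, d, z=1.96):
    """a, b = events / non-events in group 1; c, d = in group 2."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

or_, lo, hi = odds_ratio_ci(40, 60, 50, 50)
print(round(or_, 3), round(lo, 3), round(hi, 3))
```

An OR whose interval excludes 1 corresponds to a statistically significant association at the 5% level.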
1986-01-01
Official population data for the USSR are presented for 1985 and 1986. Part 1 (pp. 65-72) contains data on capitals of union republics and cities with over one million inhabitants, including population estimates for 1986 and vital statistics for 1985. Part 2 (p. 72) presents population estimates by sex and union republic, 1986. Part 3 (pp. 73-76) presents data on population growth, including birth, death, and natural increase rates, 1984-1985; seasonal distribution of births and deaths; birth order; age-specific birth rates in urban and rural areas and by union republic; marriages; age at marriage; and divorces. PMID:12178831
A study on the use of Gumbel approximation with the Bernoulli spatial scan statistic.
Read, S; Bath, P A; Willett, P; Maheswaran, R
2013-08-30
The Bernoulli version of the spatial scan statistic is a well established method of detecting localised spatial clusters in binary labelled point data, a typical application being the epidemiological case-control study. A recent study suggests the inferential accuracy of several versions of the spatial scan statistic (principally the Poisson version) can be improved, at little computational cost, by using the Gumbel distribution, a method now available in SaTScan(TM) (www.satscan.org). We study in detail the effect of this technique when applied to the Bernoulli version and demonstrate that it is highly effective, albeit with some increase in false alarm rates at certain significance thresholds. We explain how this increase is due to the discrete nature of the Bernoulli spatial scan statistic and demonstrate that it can affect even small p-values. Despite this, we argue that the Gumbel method is actually preferable for very small p-values. Furthermore, we extend previous research by running benchmark trials on 12 000 synthetic datasets, thus demonstrating that the overall detection capability of the Bernoulli version (i.e. ratio of power to false alarm rate) is not noticeably affected by the use of the Gumbel method. We also provide an example application of the Gumbel method using data on hospital admissions for chronic obstructive pulmonary disease. PMID:23348825
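The Gumbel step can be sketched as follows. This illustrates the general idea only (a method-of-moments Gumbel fit to Monte Carlo maxima, read off in the upper tail), not SaTScan's implementation, and the null distribution below is a stand-in rather than a scan statistic:

```python
import math, random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def gumbel_p_value(observed, null_maxima):
    """Fit a Gumbel by method of moments to Monte Carlo maxima,
    then return P(max >= observed) under the fitted distribution."""
    n = len(null_maxima)
    mean = sum(null_maxima) / n
    var = sum((x - mean) ** 2 for x in null_maxima) / (n - 1)
    beta = math.sqrt(6 * var) / math.pi     # scale parameter
    mu = mean - EULER_GAMMA * beta          # location parameter
    return 1.0 - math.exp(-math.exp(-(observed - mu) / beta))

random.seed(0)
# Stand-in null: the maximum of 20 standard normals per replicate.
maxima = [max(random.gauss(0, 1) for _ in range(20)) for _ in range(999)]
p = gumbel_p_value(5.0, maxima)
print(p)   # below the 1/(999+1) resolution of plain Monte Carlo ranking
```

The payoff is exactly the one the paper studies: with 999 replicates, rank-based p-values cannot go below 0.001, whereas the fitted Gumbel tail can estimate much smaller values.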
ERIC Educational Resources Information Center
Peterson, Lisa S.
2008-01-01
Clinical significance is an important concept in research, particularly in education and the social sciences. The present article first compares clinical significance to other measures of "significance" in statistics. The major methods used to determine clinical significance are explained and the strengths and weaknesses of clinical significance…
Does the Aging Process Significantly Modify the Mean Heart Rate?
Santos, Marcos Antonio Almeida; Sousa, Antonio Carlos Sobral; Reis, Francisco Prado; Santos, Thayná Ramos; Lima, Sonia Oliveira; Barreto-Filho, José Augusto
2013-01-01
Background The Mean Heart Rate (MHR) tends to decrease with age. When adjusted for gender and diseases, the magnitude of this effect is unclear. Objective To analyze the MHR in a stratified sample of active and functionally independent individuals. Methods A total of 1,172 patients aged ≥ 40 years underwent Holter monitoring and were stratified by age group: 1 = 40-49, 2 = 50-59, 3 = 60-69, 4 = 70-79, 5 = ≥ 80 years. The MHR was evaluated according to age and gender, adjusted for Hypertension (SAH), dyslipidemia and non-insulin dependent diabetes mellitus (NIDDM). Several models of ANOVA, correlation and linear regression were employed. A two-tailed p value < 0.05 was considered significant (95% CI). Results The MHR tended to decrease with the age range: 1 = 77.20 ± 7.10; 2 = 76.66 ± 7.07; 3 = 74.02 ± 7.46; 4 = 72.93 ± 7.35; 5 = 73.41 ± 7.98 (p < 0.001). Women showed a correlation with higher MHR (p < 0.001). In the ANOVA and regression models, age and gender were predictors (p < 0.001). However, R² and η² values < 0.10, as well as modest standardized beta coefficients, indicated a reduced effect size. Dyslipidemia, hypertension and NIDDM did not influence the findings. Conclusion The MHR decreased with age. Women had higher values of MHR, regardless of the age group. Correlations between MHR and age or gender, albeit significant, showed the effect magnitude had little statistical relevance. The prevalence of SAH, dyslipidemia and diabetes mellitus did not influence the results. PMID:24029962
Worry, Intolerance of Uncertainty, and Statistics Anxiety
ERIC Educational Resources Information Center
Williams, Amanda S.
2013-01-01
Statistics anxiety is a problem for most graduate students. This study investigates the relationship between intolerance of uncertainty, worry, and statistics anxiety. Intolerance of uncertainty was significantly related to worry, and worry was significantly related to three types of statistics anxiety. Six types of statistics anxiety were…
Intervention for Maltreating Fathers: Statistically and Clinically Significant Change
ERIC Educational Resources Information Center
Scott, Katreena L.; Lishak, Vicky
2012-01-01
Objective: Fathers are seldom the focus of efforts to address child maltreatment and little is currently known about the effectiveness of intervention for this population. To address this gap, we examined the efficacy of a community-based group treatment program for fathers who had abused or neglected their children or exposed their children to…
Suite versus composite statistics
Balsillie, J.H.; Tanner, W.F.
1999-01-01
Suite and composite methodologies, two statistically valid approaches for producing statistical descriptive measures, are investigated for sample groups representing a probability distribution where, in addition, each sample is itself a probability distribution. Suite and composite means (first moment measures) are always equivalent. Composite standard deviations (second moment measures) are always larger than suite standard deviations. Suite and composite values for higher moment measures have more complex relationships. Very seldom, however, are they equivalent, and they normally yield statistically significant but different results. Multiple samples are preferable to single samples (including composites) because they permit the investigator to examine sample-to-sample variability. These and other relationships for suite and composite probability distribution analyses are investigated and reported using granulometric data.
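The suite/composite distinction can be illustrated with toy data: a suite statistic averages each sample's own descriptive measure, while a composite statistic pools all raw values first. The two samples below are invented:

```python
import statistics

# Sketch: suite vs composite descriptive measures on toy data.
samples = [[1, 2, 3], [11, 12, 13]]

# Suite: compute the measure per sample, then average across samples.
suite_mean = statistics.mean(statistics.mean(s) for s in samples)
suite_sd = statistics.mean(statistics.stdev(s) for s in samples)

# Composite: pool every raw value, then compute the measure once.
pooled = [x for s in samples for x in s]
composite_mean = statistics.mean(pooled)
composite_sd = statistics.stdev(pooled)

print(suite_mean, composite_mean)   # means agree: 7 7
print(suite_sd, composite_sd)       # composite SD exceeds suite SD
```

The means agree, but the composite standard deviation absorbs the between-sample spread and so exceeds the suite value, matching the paper's second-moment result.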
Candidate Assembly Statistical Evaluation
1998-07-15
The Savannah River Site (SRS) receives aluminum clad spent Material Test Reactor (MTR) fuel from all over the world for storage and eventual reprocessing. There are hundreds of different kinds of MTR fuels and these fuels will continue to be received at SRS for approximately ten more years. SRS's current criticality evaluation methodology requires the modeling of all MTR fuels utilizing Monte Carlo codes, which is extremely time consuming and resource intensive. Now that a significant number of MTR calculations have been conducted it is feasible to consider building statistical models that will provide reasonable estimations of MTR behavior. These statistical models can be incorporated into a standardized model homogenization spreadsheet package to provide analysts with a means of performing routine MTR fuel analyses with a minimal commitment of time and resources. This became the purpose for development of the Candidate Assembly Statistical Evaluation (CASE) program at SRS.
Bell, Graham
2016-01-01
In this experiment, the authors were interested in testing the effect of a small molecule inhibitor on the ratio of males and females in the offspring of their model Dipteran species. The authors report that in a wild-type population, ~50 % of offspring are male. They then test the effect of treating females with the chemical, which they think might affect the male:female ratio compared with the untreated group. They claim that there is a statistically significant increase in the percentage of males produced and conclude that the drug affects sex ratios. PMID:27338560
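The claim above rests on a test of a binomial proportion. A sketch of an exact two-sided test of a 50:50 sex ratio, with hypothetical counts rather than the authors' data; under H0 the distribution is symmetric, so the two-sided p-value doubles the larger tail:

```python
from math import comb

# Sketch: exact two-sided binomial test of P(male) = 0.5 (hypothetical counts).
def binom_test_half(males, n):
    k = max(males, n - males)                          # fold onto the larger tail
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)                          # symmetric null: double it

p = binom_test_half(males=130, n=200)
print(p)   # far below 0.05, so the shift from 50:50 would be significant
```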
Cosmetic Plastic Surgery Statistics
2014 Cosmetic Plastic Surgery Statistics: Cosmetic Procedure Trends. 2014 Plastic Surgery Statistics Report. Please credit the AMERICAN SOCIETY OF PLASTIC SURGEONS when citing statistical data or using ...
Statistics Anxiety among Postgraduate Students
ERIC Educational Resources Information Center
Koh, Denise; Zawi, Mohd Khairi
2014-01-01
Most postgraduate programmes, that have research components, require students to take at least one course of research statistics. Not all postgraduate programmes are science based, there are a significant number of postgraduate students who are from the social sciences that will be taking statistics courses, as they try to complete their…
Nursing student attitudes toward statistics.
Mathew, Lizy; Aktan, Nadine M
2014-04-01
Nursing is guided by evidence-based practice. To understand and apply research to practice, nurses must be knowledgeable in statistics; therefore, it is crucial to promote a positive attitude toward statistics among nursing students. The purpose of this quantitative cross-sectional study was to assess differences in attitudes toward statistics among undergraduate nursing, graduate nursing, and undergraduate non-nursing students. The Survey of Attitudes Toward Statistics Scale-36 (SATS-36) was used to measure student attitudes, with higher scores denoting more positive attitudes. The convenience sample was composed of 175 students from a public university in the northeastern United States. Statistically significant relationships were found among some of the key demographic variables. Graduate nursing students had a significantly lower score on the SATS-36, compared with baccalaureate nursing and non-nursing students. Therefore, an innovative nursing curriculum that incorporates knowledge of student attitudes and key demographic variables may result in favorable outcomes.
Predict! Teaching Statistics Using Informational Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie
2013-01-01
Statistics is one of the most widely used topics for everyday life in the school mathematics curriculum. Unfortunately, the statistics taught in schools focuses on calculations and procedures before students have a chance to see it as a useful and powerful tool. Researchers have found that a dominant view of statistics is as an assortment of tools…
Antecedents of students' achievement in statistics
NASA Astrophysics Data System (ADS)
Awaludin, Izyan Syazana; Razak, Ruzanna Ab; Harris, Hezlin; Selamat, Zarehan
2015-02-01
The applications of statistics in most fields have been vast. Many degree programmes at local universities require students to enroll in at least one statistics course. The standard of these courses varies across degree programmes because students come from diverse academic backgrounds, some far removed from the field of statistics. The high failure rate in statistics courses among non-science-stream students has been a concern every year. The purpose of this research is to investigate the antecedents of students' achievement in statistics. A total of 272 students participated in the survey. Multiple linear regression was applied to examine the relationship between the factors and achievement. We found that statistics anxiety was a significant predictor of students' achievement. We also found that students' age has a significant effect on achievement: older students are more likely to achieve lower scores in statistics. Students' level of study also has a significant impact on their achievement in statistics.
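The regression step can be sketched for a single predictor. The scores below are invented, and `anxiety`/`achievement` are illustrative variable names, not the study's data:

```python
# Sketch: ordinary least squares for one predictor (made-up scores).
anxiety     = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]
achievement = [75.0, 62.0, 88.0, 55.0, 70.0, 66.0]

n = len(anxiety)
mx = sum(anxiety) / n
my = sum(achievement) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(anxiety, achievement))
sxx = sum((x - mx) ** 2 for x in anxiety)
slope = sxy / sxx                 # change in achievement per unit anxiety
intercept = my - slope * mx
print(round(slope, 2))            # negative: higher anxiety, lower scores
```

The study's multiple regression is the same idea with several predictors (anxiety, age, level of study) fitted simultaneously.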
[Comment on] Statistical discrimination
NASA Astrophysics Data System (ADS)
Chinn, Douglas
In the December 8, 1981, issue of Eos, a news item reported the conclusion of a National Research Council study that sexual discrimination against women with Ph.D.'s exists in the field of geophysics. Basically, the item reported that even when allowances are made for motherhood the percentage of female Ph.D.'s holding high university and corporate positions is significantly lower than the percentage of male Ph.D.'s holding the same types of positions. The sexual discrimination conclusion, based only on these statistics, assumes that there are no basic psychological differences between men and women that might cause different populations in the employment group studied. Therefore, the reasoning goes, after taking into account possible effects from differences related to anatomy, such as women stopping their careers in order to bear and raise children, the statistical distributions of positions held by male and female Ph.D.'s ought to be very similar to one another. Any significant differences between the distributions must be caused primarily by sexual discrimination.
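The comparison described can be sketched as a two-proportion z-test. The counts below are hypothetical, since the news item's actual numbers are not reproduced here:

```python
import math

# Sketch: two-proportion z-test (pooled), hypothetical counts.
def two_prop_z(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                       # pooled proportion under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# e.g. 45 of 300 female Ph.D.'s vs 130 of 500 male Ph.D.'s in senior positions
z, p = two_prop_z(45, 300, 130, 500)
print(round(z, 2), round(p, 4))
```

A significant z here says only that the two proportions differ; as the letter argues, attributing the difference to a particular cause requires assumptions beyond the test itself.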
Statistical Reference Datasets
National Institute of Standards and Technology Data Gateway
Statistical Reference Datasets (Web, free access) The Statistical Reference Datasets project is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.
Explorations in statistics: statistical facets of reproducibility.
Curran-Everett, Douglas
2016-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eleventh installment of Explorations in Statistics explores statistical facets of reproducibility. If we obtain an experimental result that is scientifically meaningful and statistically unusual, we would like to know that our result reflects a general biological phenomenon that another researcher could reproduce if (s)he repeated our experiment. But more often than not, we may learn this researcher cannot replicate our result. The National Institutes of Health and the Federation of American Societies for Experimental Biology have created training modules and outlined strategies to help improve the reproducibility of research. These particular approaches are necessary, but they are not sufficient. The principles of hypothesis testing and estimation are inherent to the notion of reproducibility in science. If we want to improve the reproducibility of our research, then we need to rethink how we apply fundamental concepts of statistics to our science.
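The link between hypothesis testing and reproducibility can be explored by simulation: if an experiment has modest power, even a true effect will often fail to replicate. All numbers below (effect size, sample size, known unit variance) are illustrative assumptions:

```python
import math, random

# Sketch: how often does an exact replication of a study come out
# significant? Assume a true effect of 0.5 SD, n = 16, sigma = 1 known.
def p_value(sample):
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)    # one-sample z-test of mean = 0
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(1)
reps = 2000
hits = sum(
    p_value([random.gauss(0.5, 1) for _ in range(16)]) < 0.05
    for _ in range(reps)
)
print(hits / reps)   # the design's power: roughly a coin flip
```

With these assumptions the replication rate is only about one in two, which is why the installment argues that training modules alone cannot fix reproducibility without attention to power and estimation.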
[Big data in official statistics].
Zwick, Markus
2015-08-01
The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany. PMID:26077871
Ranald Macdonald and statistical inference.
Smith, Philip T
2009-05-01
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing. PMID:19351454
Chiou, Chei-Chang; Wang, Yu-Min; Lee, Li-Tze
2014-08-01
Statistical knowledge is widely used in academia; however, statistics teachers struggle with the issue of how to reduce students' statistics anxiety and enhance students' statistics learning. This study assesses the effectiveness of a "one-minute paper strategy" in reducing students' statistics-related anxiety and in improving students' statistics-related achievement. Participants were 77 undergraduates from two classes enrolled in applied statistics courses. An experiment was implemented according to a pretest/posttest comparison group design. The quasi-experimental design showed that the one-minute paper strategy significantly reduced students' statistics anxiety and improved students' statistics learning achievement. The strategy was a better instructional tool than the textbook exercise for reducing students' statistics anxiety and improving students' statistics achievement.
Developments in Statistical Education.
ERIC Educational Resources Information Center
Kapadia, Ramesh
1980-01-01
The current status of statistics education at the secondary level is reviewed, with particular attention focused on the various instructional programs in England. A description and preliminary evaluation of the Schools Council Project on Statistical Education is included. (MP)
Mathematical and statistical analysis
NASA Technical Reports Server (NTRS)
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
ERIC Educational Resources Information Center
Bopp, Richard E.; Van Der Laan, Sharon J.
1985-01-01
Presents a search strategy for locating time-series or cross-sectional statistical data in published sources which was designed for undergraduate students who require 30 units of data for five separate variables in a statistical model. Instructional context and the broader applicability of the search strategy for general statistical research is…
ERIC Educational Resources Information Center
Strasser, Nora
2007-01-01
Avoiding statistical mistakes is important for educators at all levels. Basic concepts will help you to avoid making mistakes using statistics and to look at data with a critical eye. Statistical data is used at educational institutions for many purposes. It can be used to support budget requests, changes in educational philosophy, changes to…
ERIC Educational Resources Information Center
Lenard, Christopher; McCarthy, Sally; Mills, Terence
2014-01-01
There are many different aspects of statistics. Statistics involves mathematics, computing, and applications to almost every field of endeavour. Each aspect provides an opportunity to spark someone's interest in the subject. In this paper we discuss some ethical aspects of statistics, and describe how an introduction to ethics has been…
Statistical quality management
NASA Astrophysics Data System (ADS)
Vanderlaan, Paul
1992-10-01
Some aspects of statistical quality management are discussed. Quality has to be defined as a concrete, measurable quantity. The concepts of Total Quality Management (TQM), Statistical Process Control (SPC), and inspection are explained. In most cases SPC is better than inspection. It can be concluded that statistics has great possibilities in the field of TQM.
Statistical Modeling of Occupational Exposure to Polycyclic Aromatic Hydrocarbons Using OSHA Data.
Lee, Derrick G; Lavoué, Jérôme; Spinelli, John J; Burstyn, Igor
2015-01-01
Polycyclic aromatic hydrocarbons (PAHs) are a group of pollutants with multiple variants classified as carcinogenic. The Occupational Safety and Health Administration (OSHA) provided access to two PAH exposure databanks of United States workplace compliance testing data collected between 1979 and 2010. Mixed-effects logistic models were used to predict the exceedance fraction (EF), i.e., the probability of exceeding OSHA's Permissible Exposure Limit (PEL = 0.2 mg/m3) for PAHs based on industry and occupation. Measurements of coal tar pitch volatiles were used as a surrogate for PAHs. Time, databank, occupation, and industry were included as fixed effects while an identifier for the compliance inspection number was included as a random effect. Analyses involved 2,509 full-shift personal measurements. Results showed that the majority of industries had an estimated EF < 0.5, although several industries, including Standardized Industry Classification codes 1623 (Water, Sewer, Pipeline, and Communication and Powerline Construction), 1711 (Plumbing, Heating, and Air-Conditioning), 2824 (Manmade Organic Fibres), 3496 (Misc. Fabricated Wire Products), and 5812 (Eating Places), and Major Groups 13 (Oil and Gas Extraction) and 30 (Rubber and Miscellaneous Plastic Products), were estimated to have more than an 80% likelihood of exceeding the PEL. There was an inverse temporal trend of exceeding the PEL, with lower risk in more recent years, albeit not statistically significant. Similar results were shown when incorporating occupation, but varied depending on the occupation: the majority of industries predicted at the administrative level, e.g., managers, had an estimated EF < 0.5, while at the minimally skilled/laborer level there was a substantial increase in the estimated EF. These statistical models allow the prediction of PAH exposure risk through individual occupational histories and will be used to create a job-exposure matrix for use in a population-based case-control study.
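The exceedance fraction itself can be sketched under a simpler model than the paper's mixed-effects logistic regression: assume lognormal exposures, the conventional model in occupational hygiene, with an illustrative geometric mean and geometric standard deviation (both made up):

```python
import math

# Sketch: exceedance fraction EF = P(X > PEL) for lognormal exposures.
# gm and gsd below are hypothetical, not estimates from the OSHA databanks.
def exceedance_fraction(pel, gm, gsd):
    """gm = geometric mean, gsd = geometric standard deviation (> 1)."""
    z = (math.log(pel) - math.log(gm)) / math.log(gsd)
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))   # upper normal tail

ef = exceedance_fraction(pel=0.2, gm=0.05, gsd=2.5)
print(round(ef, 3))   # a modest fraction of shifts exceed the PEL
```

An EF near 0 or 1 corresponds to the low- and high-risk industries the models identify.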
Tannery, Nancy Hrinya; Silverman, Deborah L; Epstein, Barbara A
2002-01-01
Online use statistics can provide libraries with a tool to be used when developing an online collection of resources. Statistics can provide information on overall use of a collection, individual print and electronic journal use, and collection use by specific user populations. They can also be used to determine the number of user licenses to purchase. This paper focuses on the issue of use statistics made available for one collection of online resources.
Statistical distribution sampling
NASA Technical Reports Server (NTRS)
Johnson, E. S.
1975-01-01
Determining the distribution of statistics by sampling was investigated. Characteristic functions, the quadratic regression problem, and the differential equations for the characteristic functions are analyzed.
Statistical prediction of cyclostationary processes
Kim, K.Y.
2000-03-15
Considered in this study is a cyclostationary generalization of an EOF-based prediction method. While linear statistical prediction methods are typically optimal in the sense that prediction error variance is minimal within the assumption of stationarity, there is some room for improved performance since many physical processes are not stationary. For instance, El Nino is known to be strongly phase locked with the seasonal cycle, which suggests nonstationarity of the El Nino statistics. Many geophysical and climatological processes may be termed cyclostationary since their statistics show strong cyclicity instead of stationarity. Therefore, developed in this study is a cyclostationary prediction method. Test results demonstrate that performance of prediction methods can be improved significantly by accounting for the cyclostationarity of underlying processes. The improvement comes from an accurate rendition of covariance structure both in space and time.
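Phase-dependent statistics, the core of cyclostationarity, can be sketched on a synthetic monthly series (all numbers illustrative): grouping by calendar month recovers a seasonally varying mean that a single "stationary" mean hides.

```python
import math, random

# Sketch: cyclostationary statistics on a synthetic monthly series with a
# seasonally varying mean plus noise (all parameters are made up).
random.seed(2)
months, years = 12, 50
series = [
    10 + 5 * math.sin(2 * math.pi * m / months) + random.gauss(0, 1)
    for _ in range(years) for m in range(months)
]

# Cyclostationary estimate: one mean per phase (calendar month).
phase_means = [
    sum(series[y * months + m] for y in range(years)) / years
    for m in range(months)
]
# Stationary estimate: a single overall mean.
overall_mean = sum(series) / len(series)
print([round(v, 1) for v in phase_means], round(overall_mean, 1))
```

A prediction scheme built on the phase-dependent statistics can exploit this seasonal structure, which is the intuition behind the improvement reported for phase-locked processes such as El Nino.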
Statistical Mechanics of Zooplankton.
Hinow, Peter; Nihongi, Ai; Strickler, J Rudi
2015-01-01
Statistical mechanics provides the link between microscopic properties of many-particle systems and macroscopic properties such as pressure and temperature. Observations of similar "microscopic" quantities exist for the motion of zooplankton, as well as many species of other social animals. Herein, we propose to take average squared velocities as the definition of the "ecological temperature" of a population under different conditions on nutrients, light, oxygen and others. We test the usefulness of this definition on observations of the crustacean zooplankton Daphnia pulicaria. In one set of experiments, D. pulicaria is infested with the pathogen Vibrio cholerae, the causative agent of cholera. We find that infested D. pulicaria under light exposure have a significantly greater ecological temperature, which puts them at a greater risk of detection by visual predators. In a second set of experiments, we observe D. pulicaria in cold and warm water, and in darkness and under light exposure. Overall, our ecological temperature is a good discriminator of the crustacean's swimming behavior. PMID:26270537
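The proposed "ecological temperature" (average squared velocity) can be computed directly from positions sampled at fixed time steps; the short trajectory below is made up for illustration:

```python
# Sketch: "ecological temperature" as mean squared speed over a 2-D
# trajectory of (x, y) positions sampled every dt seconds (toy data).
def ecological_temperature(positions, dt):
    v2 = [
        ((x2 - x1) ** 2 + (y2 - y1) ** 2) / dt ** 2   # squared speed per step
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    ]
    return sum(v2) / len(v2)

track = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (3.0, 2.0)]
temp = ecological_temperature(track, dt=1.0)
print(temp)   # (1 + 4 + 4) / 3 = 3.0
```

Comparing this quantity between populations (e.g. infested vs healthy, light vs dark) is the discrimination the paper performs.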
Statistical Mechanics of Zooplankton
Hinow, Peter; Nihongi, Ai; Strickler, J. Rudi
2015-01-01
Statistical mechanics provides the link between microscopic properties of many-particle systems and macroscopic properties such as pressure and temperature. Observations of similar “microscopic” quantities exist for the motion of zooplankton, as well as many species of other social animals. Herein, we propose to take average squared velocities as the definition of the “ecological temperature” of a population under different conditions on nutrients, light, oxygen and others. We test the usefulness of this definition on observations of the crustacean zooplankton Daphnia pulicaria. In one set of experiments, D. pulicaria is infested with the pathogen Vibrio cholerae, the causative agent of cholera. We find that infested D. pulicaria under light exposure have a significantly greater ecological temperature, which puts them at a greater risk of detection by visual predators. In a second set of experiments, we observe D. pulicaria in cold and warm water, and in darkness and under light exposure. Overall, our ecological temperature is a good discriminator of the crustacean’s swimming behavior. PMID:26270537
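The "ecological temperature" defined in the abstract above is just an average squared velocity, which is straightforward to compute from tracking data. A minimal sketch with simulated (not observed) 3-D velocities; the spreads chosen for the "healthy" and "infected" groups are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)

def ecological_temperature(velocities):
    """Mean squared speed of a population; velocities is (n_animals, n_dims)."""
    return np.mean(np.sum(velocities ** 2, axis=1))

# Hypothetical velocity samples: the 'infected' group swims faster on average.
healthy = rng.normal(0.0, 1.0, size=(500, 3))
infected = rng.normal(0.0, 1.5, size=(500, 3))

t_healthy = ecological_temperature(healthy)
t_infected = ecological_temperature(infected)
```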
Multidimensional Visual Statistical Learning
ERIC Educational Resources Information Center
Turk-Browne, Nicholas B.; Isola, Phillip J.; Scholl, Brian J.; Treat, Teresa A.
2008-01-01
Recent studies of visual statistical learning (VSL) have demonstrated that statistical regularities in sequences of visual stimuli can be automatically extracted, even without intent or awareness. Despite much work on this topic, however, several fundamental questions remain about the nature of VSL. In particular, previous experiments have not…
Croarkin, M. Carroll
2001-01-01
For more than 50 years, the Statistical Engineering Division (SED) has been instrumental in the success of a broad spectrum of metrology projects at NBS/NIST. This paper highlights fundamental contributions of NBS/NIST statisticians to statistics and to measurement science and technology. Published methods developed by SED staff, especially during the early years, endure as cornerstones of statistics not only in metrology and standards applications, but as data-analytic resources used across all disciplines. The history of statistics at NBS/NIST began with the formation of what is now the SED. Examples from the first five decades of the SED illustrate the critical role of the division in the successful resolution of a few of the highly visible, and sometimes controversial, statistical studies of national importance. A review of the history of major early publications of the division on statistical methods, design of experiments, and error analysis and uncertainty is followed by a survey of several thematic areas. The accompanying examples illustrate the importance of SED in the history of statistics, measurements and standards: calibration and measurement assurance, interlaboratory tests, development of measurement methods, Standard Reference Materials, statistical computing, and dissemination of measurement technology. A brief look forward sketches the expanding opportunity and demand for SED statisticians created by current trends in research and development at NIST. PMID:27500023
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…
Reform in Statistical Education
ERIC Educational Resources Information Center
Huck, Schuyler W.
2007-01-01
Two questions are considered in this article: (a) What should professionals in school psychology do in an effort to stay current with developments in applied statistics? (b) What should they do with their existing knowledge to move from surface understanding of statistics to deep understanding? Written for school psychologists who have completed…
Demonstrating Poisson Statistics.
ERIC Educational Resources Information Center
Vetterling, William T.
1980-01-01
Describes an apparatus that offers a very lucid demonstration of Poisson statistics as applied to electrical currents, and the manner in which such statistics account for shot noise when applied to macroscopic currents. The experiment described is intended for undergraduate physics students. (HM)
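The property behind shot noise that the apparatus demonstrates is that Poisson counts have variance equal to their mean (Fano factor 1). A quick simulated check, independent of the apparatus described:

```python
import numpy as np

rng = np.random.default_rng(2)

# Electrons arriving in a fixed time window: the count is Poisson distributed,
# so the variance of the count equals its mean -- the origin of shot noise.
mean_count = 40.0
counts = rng.poisson(mean_count, size=100_000)

fano = counts.var() / counts.mean()   # Fano factor: ~1 for Poisson statistics
```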
Statistical Summaries: Public Institutions.
ERIC Educational Resources Information Center
Virginia State Council of Higher Education, Richmond.
This document presents a statistical portrait of Virginia's 17 public higher education institutions. Data provided include: enrollment figures (broken down in categories such as sex, residency, full- and part-time status, residence, ethnicity, age, and level of postsecondary education); FTE figures; admissions statistics (such as number…
Explorations in Statistics: Power
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2010-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fifth installment of "Explorations in Statistics" revisits power, a concept fundamental to the test of a null hypothesis. Power is the probability that we reject the null hypothesis when it is false. Four things affect…
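Power, as defined in the abstract above, can be estimated by simulation: generate many experiments under a given true effect and count how often the null is rejected. The sketch below uses a one-sample two-sided z-test (an illustrative choice, not the installment's own example) and shows sample size, one of the factors that affect power, at work.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)

def simulated_power(effect, n, alpha=0.05, reps=2000):
    """Fraction of simulated experiments in which a two-sided one-sample
    z-test rejects H0: mean = 0, when the true mean is `effect` (sd = 1)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    x = rng.normal(effect, 1.0, size=(reps, n))
    z = x.mean(axis=1) * np.sqrt(n)       # z statistic under known sd = 1
    return np.mean(np.abs(z) > z_crit)

low = simulated_power(0.5, n=10)    # small sample: modest power
high = simulated_power(0.5, n=50)   # larger sample: much higher power
```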
ERIC Educational Resources Information Center
Huizingh, Eelko K. R. E.
2007-01-01
Accessibly written and easy to use, "Applied Statistics Using SPSS" is an all-in-one self-study guide to SPSS and do-it-yourself guide to statistics. What is unique about Eelko Huizingh's approach is that this book is based around the needs of undergraduate students embarking on their own research project, and its self-help style is designed to…
ERIC Educational Resources Information Center
Council of Ontario Universities, Toronto.
Summary statistics on application and registration patterns of applicants wishing to pursue full-time study in first-year places in Ontario universities (for the fall of 1987) are given. Data on registrations were received indirectly from the universities as part of their annual submission of USIS/UAR enrollment data to Statistics Canada and MCU.…
Introduction to Statistical Physics
NASA Astrophysics Data System (ADS)
Casquilho, João Paulo; Ivo Cortez Teixeira, Paulo
2014-12-01
Preface; 1. Random walks; 2. Review of thermodynamics; 3. The postulates of statistical physics. Thermodynamic equilibrium; 4. Statistical thermodynamics – developments and applications; 5. The classical ideal gas; 6. The quantum ideal gas; 7. Magnetism; 8. The Ising model; 9. Liquid crystals; 10. Phase transitions and critical phenomena; 11. Irreversible processes; Appendixes; Index.
Deconstructing Statistical Analysis
ERIC Educational Resources Information Center
Snell, Joel
2014-01-01
Using a very complex statistical analysis or research method merely to enhance the prestige of an article, or to make a new product or service appear legitimate, needs to be monitored and questioned for accuracy. (1) The more complicated the statistical analysis and research design, the fewer learned readers can understand it. This adds a…
ERIC Educational Resources Information Center
Hodgson, Ted; Andersen, Lyle; Robison-Cox, Jim; Jones, Clain
2004-01-01
Water quality experiments, especially the use of macroinvertebrates as indicators of water quality, offer an ideal context for connecting statistics and science. In the STAR program for secondary students and teachers, water quality experiments were also used as a context for teaching statistics. In this article, we trace one activity that uses…
Understanding Undergraduate Statistical Anxiety
ERIC Educational Resources Information Center
McKim, Courtney
2014-01-01
The purpose of this study was to understand undergraduate students' views of statistics. Results reveal that students with less anxiety have a higher interest in statistics and also believe in their ability to perform well in the course. Also students who have a more positive attitude about the class tend to have a higher belief in their…
Explorations in Statistics: Correlation
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2010-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This sixth installment of "Explorations in Statistics" explores correlation, a familiar technique that estimates the magnitude of a straight-line relationship between two variables. Correlation is meaningful only when the…
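Correlation as described above, the magnitude of a straight-line relationship, can be illustrated in a few lines, including the caveat implicit in "meaningful only when…": Pearson's r can be near zero for a perfectly systematic but nonlinear relationship. All data here are simulated, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

# A noisy straight-line relationship: r estimates its strength.
x = rng.normal(size=5000)
y = 2.0 * x + rng.normal(size=5000)
r = np.corrcoef(x, y)[0, 1]          # close to 2/sqrt(5) ~ 0.89

# But r is blind to nonlinear association: a symmetric parabola gives r ~ 0.
u = rng.normal(size=5000)
r_parabola = np.corrcoef(u, u ** 2)[0, 1]
```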
LED champing: statistically blessed?
Wang, Zhuo
2015-06-10
LED champing (smart mixing of individual LEDs to match the desired color and lumens) and color mixing strategies have been widely used to maintain the color consistency of light engines. Light engines with champed LEDs can easily achieve color consistency within a couple of MacAdam steps, even with widely distributed LEDs to begin with. From a statistical point of view, the distributions of the color coordinates and the flux after champing are studied. The related statistical parameters are derived, which facilitate process improvements such as Six Sigma and are instrumental to statistical quality control for mass production. PMID:26192863
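One statistical intuition behind mixing LEDs can be sketched without the paper's derivations: a light engine that sums k LEDs drawn from a widely distributed lot has a relative flux spread roughly 1/sqrt(k) that of a single LED. This is only a random-mixing baseline, not the smart champing strategy itself, and all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical lot of LEDs with widely distributed flux (lumens).
lot = rng.normal(loc=100.0, scale=10.0, size=100_000)

# Each light engine sums k LEDs picked from the lot.
k, engines = 8, 20_000
picks = rng.choice(lot, size=(engines, k))
engine_flux = picks.sum(axis=1)

# Relative spread shrinks by about 1/sqrt(k) versus a single LED.
rel_single = lot.std() / lot.mean()                    # ~0.10
rel_engine = engine_flux.std() / engine_flux.mean()    # ~0.035
```

Champing does better than this baseline by pairing complementary LEDs deliberately, but the 1/sqrt(k) averaging already explains why mixed engines hold tighter color and flux tolerances than their components.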
Winters, Ryan; Winters, Andrew; Amedee, Ronald G.
2010-01-01
The Accreditation Council for Graduate Medical Education sets forth a number of required educational topics that must be addressed in residency and fellowship programs. We sought to provide a primer on some of the important basic statistical concepts to consider when examining the medical literature. It is not essential to understand the exact workings and methodology of every statistical test encountered, but it is necessary to understand selected concepts such as parametric and nonparametric tests, correlation, and numerical versus categorical data. This working knowledge will allow you to spot obvious irregularities in statistical analyses that you encounter. PMID:21603381
NASA Technical Reports Server (NTRS)
Young, M.; Koslovsky, M.; Schaefer, Caroline M.; Feiveson, A. H.
2017-01-01
Back by popular demand, the JSC Biostatistics Laboratory and LSAH statisticians are offering an opportunity to discuss your statistical challenges and needs. Take the opportunity to meet the individuals offering expert statistical support to the JSC community. Join us for an informal conversation about any questions you may have encountered with issues of experimental design, analysis, or data visualization. Get answers to common questions about sample size, repeated measures, statistical assumptions, missing data, multiple testing, time-to-event data, and when to trust the results of your analyses.
Lessons from Inferentialism for Statistics Education
ERIC Educational Resources Information Center
Bakker, Arthur; Derry, Jan
2011-01-01
This theoretical paper relates recent interest in informal statistical inference (ISI) to the semantic theory termed inferentialism, a significant development in contemporary philosophy, which places inference at the heart of human knowing. This theory assists epistemological reflection on challenges in statistics education encountered when…
Petroleum statistics in France
De Saint Germain, H.; Lamiraux, C.
1995-08-01
Thirty-three oil companies, including Elf, Exxon, Agip, and Conoco, as well as Coparex, Enron, Hadson, Midland, Hunt, Canyon, and Union Texas, are present in oil and gas exploration and production in France. Production in France amounts to some 60,000 bopd of oil and 350 MMcfpd of marketed natural gas, which still account for 3.5% and 10% of French domestic needs, respectively. To date, 166 fields have been discovered, representing total reserves of 3 billion bbl of crude oil and 13 trillion cf of raw gas. These fields are concentrated in two major onshore sedimentary basins of Mesozoic age: the Aquitaine basin and the Paris basin. The Aquitaine basin can be subdivided into two distinct domains: the Parentis basin, where the largest field, Parentis, was discovered in 1954 and still produces about 3,700 bopd of oil, and where Les Arbouslers field, discovered at the end of 1991, is currently producing about 10,000 bopd of oil; and the northern Pyrenees and their foreland, where the Lacq field, discovered in 1951, has produced about 7.7 tcf of gas since 1957 and is still producing 138 MMcfpd. In the Paris basin, the two large oil fields are Villeperclue, discovered in 1982 by Triton and Total, and Chaunoy, discovered in 1983 by Essorep, which are still producing about 10,000 and 15,000 bopd, respectively. The last discovery of significant size occurred in 1990 with Itteville, by Elf Aquitaine, which is currently producing 4,200 bopd. The poster shows statistical data from the past 20 years of oil and gas exploration and production in France.
NASA Astrophysics Data System (ADS)
Richfield, Jon; bookfeller
2016-07-01
In reply to Ralph Kenna and Pádraig Mac Carron's feature article “Maths meets myths” in which they describe how they are using techniques from statistical physics to characterize the societies depicted in ancient Icelandic sagas.
... facts and statistics here include brain and central nervous system tumors (including spinal cord, pituitary and pineal gland ... U.S. living with a primary brain and central nervous system tumor. This year, nearly 17,000 people will ...
Titanic: A Statistical Exploration.
ERIC Educational Resources Information Center
Takis, Sandra L.
1999-01-01
Uses the available data about the Titanic's passengers to interest students in exploring categorical data and the chi-square distribution. Describes activities incorporated into a statistics class and gives additional resources for collecting information about the Titanic. (ASK)
NASA Astrophysics Data System (ADS)
Grégoire, G.
2016-05-01
This chapter is devoted to two objectives. The first is to answer the request, expressed by attendees of the first Astrostatistics School (Annecy, October 2013), to be provided with an elementary vademecum of statistics that would facilitate understanding of the courses given. In this spirit we recall very basic notions, that is, definitions and properties that we think sufficient to benefit from the courses given in the Astrostatistics School. Thus we briefly give definitions and elementary properties of random variables and vectors, distributions, estimation and tests, and maximum likelihood methodology. We intend to present basic ideas in a hopefully comprehensible way. We do not attempt a rigorous presentation and, given the space devoted to this chapter, can cover only a rather limited portion of statistics. The second aim is to focus on some statistical tools that are useful in classification: a basic introduction to Bayesian statistics, maximum likelihood methodology, Gaussian vectors, and Gaussian mixture models.
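One of the tools the chapter lists, maximum likelihood, has a closed form in the Gaussian case on which its classification material builds: the MLEs are the sample mean and the 1/n sample variance. A minimal check on simulated data (illustrative, not the chapter's own example):

```python
import numpy as np

rng = np.random.default_rng(5)

# For an i.i.d. Gaussian sample the maximum-likelihood estimates are in
# closed form: the sample mean and the (biased, 1/n) sample variance.
data = rng.normal(loc=3.0, scale=2.0, size=10_000)

mu_hat = data.mean()
var_hat = np.mean((data - mu_hat) ** 2)   # MLE divides by n, not n - 1
```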
... and Statistics Plague in the United States: Plague was first introduced ... per year in the United States: 1900-2012. Plague Worldwide: Plague epidemics have occurred in Africa, Asia, ...
Cooperative Learning in Statistics.
ERIC Educational Resources Information Center
Keeler, Carolyn M.; And Others
1994-01-01
Formal use of cooperative learning techniques proved effective in improving student performance and retention in a freshman level statistics course. Lectures interspersed with group activities proved effective in increasing conceptual understanding and overall class performance. (11 references) (Author)
Purposeful Statistical Investigations
ERIC Educational Resources Information Center
Day, Lorraine
2014-01-01
Lorraine Day provides us with a great range of statistical investigations using various resources such as maths300 and TinkerPlots. Each of the investigations links mathematics to students' lives and provides engaging and meaningful contexts for mathematical inquiry.
Tuberculosis Data and Statistics
... Data and Statistics ... United States publication. PDF [6 MB] Interactive TB Data Tool: Online Tuberculosis Information System (OTIS). OTIS is ...
Understanding Solar Flare Statistics
NASA Astrophysics Data System (ADS)
Wheatland, M. S.
2005-12-01
A review is presented of work aimed at understanding solar flare statistics, with emphasis on the well known flare power-law size distribution. Although avalanche models are perhaps the favoured model to describe flare statistics, their physical basis is unclear, and they are divorced from developing ideas in large-scale reconnection theory. An alternative model, aimed at reconciling large-scale reconnection models with solar flare statistics, is revisited. The solar flare waiting-time distribution has also attracted recent attention. Observed waiting-time distributions are described, together with what they might tell us about the flare phenomenon. Finally, a practical application of flare statistics to flare prediction is described in detail, including the results of a year of automated (web-based) predictions from the method.
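The flare power-law size distribution mentioned above is commonly fitted by maximum likelihood rather than by histogram slopes. A self-contained sketch with synthetic (not solar) data: draw sizes from p(s) ∝ s^(−α) by inverse-transform sampling, then recover α with the standard MLE; the value α = 1.8 is an illustrative choice near reported flare indices.

```python
import numpy as np

rng = np.random.default_rng(6)

# Sample from a power law p(s) ~ s^(-alpha), s >= s_min, via the inverse CDF:
# F(s) = 1 - (s / s_min)^(1 - alpha)  =>  s = s_min * (1 - u)^(-1/(alpha - 1)).
alpha, s_min, n = 1.8, 1.0, 50_000
u = rng.uniform(size=n)
sizes = s_min * (1 - u) ** (-1.0 / (alpha - 1.0))

# Standard maximum-likelihood (Hill) estimator of the exponent.
alpha_hat = 1.0 + n / np.sum(np.log(sizes / s_min))
```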
Oakland, J.S.
1986-01-01
Addressing the increasing importance for firms of a thorough knowledge of statistically based quality-control procedures, this book presents the fundamentals of statistical process control (SPC) in a non-mathematical, practical way. It provides real-life examples and data drawn from a wide variety of industries. It covers the foundations of good quality management and process control, and the control of conformance and consistency during production. Offers clear guidance to those who wish to understand and implement modern SPC techniques.
NASA Astrophysics Data System (ADS)
Kardar, Mehran
2006-06-01
While many scientists are familiar with fractals, fewer are familiar with the concepts of scale invariance and universality which underlie the ubiquity of their shapes. These properties may emerge from the collective behaviour of simple fundamental constituents, and are studied using statistical field theories. Based on lectures for a course in statistical mechanics taught by Professor Kardar at Massachusetts Institute of Technology, this textbook demonstrates how such theories are formulated and studied. Perturbation theory, exact solutions, renormalization groups, and other tools are employed to demonstrate the emergence of scale invariance and universality, and the non-equilibrium dynamics of interfaces and directed paths in random media are discussed. Ideal for advanced graduate courses in statistical physics, it contains an integrated set of problems, with solutions to selected problems at the end of the book. A complete set of solutions is available to lecturers on a password-protected website at www.cambridge.org/9780521873413. Based on lecture notes from a course on statistical mechanics taught by the author at MIT; contains 65 exercises, with solutions to selected problems; features a thorough introduction to the methods of statistical field theory; ideal for graduate courses in statistical physics.
Statistical Physics of Particles
NASA Astrophysics Data System (ADS)
Kardar, Mehran
2006-06-01
Statistical physics has its origins in attempts to describe the thermal properties of matter in terms of its constituent particles, and has played a fundamental role in the development of quantum mechanics. Based on lectures for a course in statistical mechanics taught by Professor Kardar at Massachusetts Institute of Technology, this textbook introduces the central concepts and tools of statistical physics. It contains a chapter on probability and related issues such as the central limit theorem and information theory, and covers interacting particles, with an extensive description of the van der Waals equation and its derivation by mean field approximation. It also contains an integrated set of problems, with solutions to selected problems at the end of the book. It will be invaluable for graduate and advanced undergraduate courses in statistical physics. A complete set of solutions is available to lecturers on a password-protected website at www.cambridge.org/9780521873420. Based on lecture notes from a course on statistical mechanics taught by the author at MIT; contains 89 exercises, with solutions to selected problems; contains chapters on probability and interacting particles; ideal for graduate courses in statistical mechanics.
NASA Astrophysics Data System (ADS)
Cook, Samuel A.; Fukawa-Connelly, Timothy
2016-02-01
Studies have shown that at the end of an introductory statistics course, students struggle with building-block concepts, such as the mean and standard deviation, and rely on procedural understandings of those concepts. This study investigates the understanding of introductory statistics that entering freshmen of a department of mathematics and statistics (including mathematics education) have; such students are presumably better prepared in mathematics and statistics than the average university student. This case study found that these students enter college with common statistical misunderstandings, gaps in knowledge, and idiosyncratic collections of correct statistical knowledge. Moreover, they hold a wide range of beliefs about their own knowledge, and some of the students who believe their knowledge is strongest also have significant misconceptions. More attention to these statistical building blocks may be required in university introductory statistics courses.
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
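The "Normal Distribution Estimates" calculation described above, the value corresponding to a cumulative probability given a sample mean and standard deviation, is an inverse-CDF lookup. A one-function Python equivalent of that spreadsheet's role (a sketch, not the toolset's actual code; the mean and SD are made-up inputs):

```python
from statistics import NormalDist

def normal_estimate(p, mean, sd):
    """Value whose cumulative probability is p under Normal(mean, sd)."""
    return NormalDist(mean, sd).inv_cdf(p)

# Hypothetical inputs: the 95th-percentile value for mean 100, SD 15.
x95 = normal_estimate(0.95, mean=100.0, sd=15.0)   # ~124.67
```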
Predicting Success in Psychological Statistics Courses.
Lester, David
2016-06-01
Many students perform poorly in courses on psychological statistics, and it is useful to be able to predict which students will have difficulties. In a study of 93 undergraduates enrolled in Statistical Methods (18 men, 75 women; M age = 22.0 years, SD = 5.1), performance was significantly associated with sex (female students performed better) and proficiency in algebra in a linear regression analysis. Anxiety about statistics was not associated with course performance, indicating that basic mathematical skills are the best correlate of performance in statistics courses and can be used to stream students into classes by ability. PMID:27273557
Sanabria, Federico; Killeen, Peter R.
2008-01-01
Despite being under challenge for the past 50 years, null hypothesis significance testing (NHST) remains dominant in the scientific field for want of viable alternatives. NHST, along with its significance level p, is inadequate for most of the uses to which it is put, a flaw that is of particular interest to educational practitioners who too often must use it to sanctify their research. In this article, we review the failure of NHST and propose p_rep, the probability of replicating an effect, as a more useful statistic for evaluating research and aiding practical decision making. PMID:19122766
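Killeen's replication probability has a simple closed form under Gaussian assumptions: p_rep = Φ(z/√2), where z is the observed effect's z-score and the √2 reflects sampling error entering both the original study and its replication. A sketch of that standard form (not code from the article):

```python
import math
from statistics import NormalDist

def p_rep(z):
    """Killeen's probability of replication for an observed effect with
    z-score z: the chance that a same-size replication yields an effect in
    the same direction. Sampling error is counted twice, hence sqrt(2)."""
    return NormalDist().cdf(z / math.sqrt(2))

# An effect just significant at two-sided p = .05 (z = 1.96) replicates in
# direction with probability ~0.92, not 0.95.
p = p_rep(1.96)
```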
Statistical Physics of Fracture
Alava, Mikko; Nukala, Phani K; Zapperi, Stefano
2006-05-01
Disorder and long-range interactions are two of the key components that make material failure an interesting playfield for the application of statistical mechanics. The cornerstone in this respect has been lattice models of fracture, in which a network of elastic beams, bonds, or electrical fuses with random failure thresholds is subject to an increasing external load. These models describe on a qualitative level the failure processes of real, brittle, or quasi-brittle materials. This has been particularly important in solving the classical engineering problems of material strength: the size dependence of maximum stress and its sample-to-sample statistical fluctuations. At the same time, lattice models pose many new fundamental questions in statistical physics, such as the relation between fracture and phase transitions. Experimental results point to the existence of an intriguing crackling noise in the acoustic emission and of self-affine fractals in the crack surface morphology. Recent advances in computer power have enabled considerable progress in the understanding of such models. Among these partly still controversial issues are the scaling and size effects in material strength and accumulated damage, the statistics of avalanches or bursts of microfailures, and the morphology of the crack surface. Here we present an overview of the results obtained with lattice models for fracture, highlighting the relations with statistical physics theories and more conventional fracture mechanics approaches.
SHARE: Statistical hadronization with resonances
NASA Astrophysics Data System (ADS)
Torrieri, G.; Steinke, S.; Broniowski, W.; Florkowski, W.; Letessier, J.; Rafelski, J.
2005-05-01
errors are independent, since the systematic error is not a random variable). Aside from χ², the program also calculates the statistical significance [2], defined as the probability that, given a "true" theory and a statistical (Gaussian) experimental error, the fitted χ² assumes values at or above the considered value. If the best fit has a statistical significance far below unity, the model under consideration is very likely inappropriate. In the limit of many degrees of freedom (N), the statistical significance function depends only on χ²/N, with 90% statistical significance at χ²/N ≈ 1, and falling steeply for χ²/N > 1. However, the degrees of freedom in fits involving ratios are generally not sufficient to reach the asymptotic limit. Hence, statistical significance depends strongly on χ² and N separately. In particular, if N < 20, a χ²/N significantly less than 1 is often required for a fit to have acceptable statistical significance. The fit routine does not always find the true lowest χ² minimum. Specifically, multi-parameter fits with too few degrees of freedom generally exhibit a non-trivial structure in parameter space, with several secondary minima, saddle points, valleys, etc. To help the user perform the minimization effectively, we have added tools to compute the χ² contours and profiles. In addition, our program's flexibility allows for many strategies in performing the fit. It is therefore possible, by following the techniques described in Section 3.7, to scan the parameter space and ensure that the minimum found is the true one. Further systematic deviations between the model and experiment can be recognized via the program's output, which includes a particle-by-particle comparison between experiment and theory. Additional comments: In consideration of the wide stream of new data coming out of RHIC, there is ongoing activity, with several groups performing analyses of particle yields. It is our hope that SHARE will allow to
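The abstract's point that significance depends on χ² and N separately, not only on χ²/N, is easy to verify by Monte Carlo (a generic check, unrelated to SHARE's implementation): at the same χ²/N = 1.5, ten degrees of freedom give an unremarkable significance while one hundred give a damning one.

```python
import numpy as np

rng = np.random.default_rng(7)

def chi2_significance(chi2_obs, dof, reps=200_000):
    """Monte-Carlo estimate of P(chi^2_dof >= chi2_obs): the probability that
    pure Gaussian noise would produce a fit this bad or worse."""
    samples = rng.chisquare(dof, size=reps)
    return np.mean(samples >= chi2_obs)

# Same chi^2/N = 1.5, very different verdicts at small versus large N:
p_small = chi2_significance(chi2_obs=15.0, dof=10)     # ~0.13: acceptable
p_large = chi2_significance(chi2_obs=150.0, dof=100)   # <0.01: model rejected
```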
Helping Alleviate Statistical Anxiety with Computer Aided Statistical Classes
ERIC Educational Resources Information Center
Stickels, John W.; Dobbs, Rhonda R.
2007-01-01
This study, Helping Alleviate Statistical Anxiety with Computer Aided Statistics Classes, investigated whether undergraduate students' anxiety about statistics changed when statistics is taught using computers compared to the traditional method. Two groups of students were questioned concerning their anxiety about statistics. One group was taught…
Statistical learning and selective inference
Taylor, Jonathan; Tibshirani, Robert J.
2015-01-01
We describe the problem of “selective inference.” This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have “cherry-picked”—searched for the strongest associations—means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis. PMID:26100887
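The cherry-picking effect described above can be reproduced in a few lines: with purely null data, the best of many candidate correlations looks far stronger than a typical one, which is why selective inference must "set a higher bar". All data here are simulated noise.

```python
import numpy as np

rng = np.random.default_rng(8)

# Null data: 50 candidate predictors, none truly associated with y.
n, k = 100, 50
X = rng.normal(size=(n, k))
y = rng.normal(size=n)

# Mine the data: correlate every predictor with y and keep the best.
corrs = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(k)])
best = corrs.max()            # the cherry-picked association
typical = np.median(corrs)    # what a randomly chosen predictor looks like
```

Judged by an unadjusted test, `best` would look convincingly "significant", yet it is guaranteed noise; an honest assessment must account for the 50-way search that produced it.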
ERIC Educational Resources Information Center
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U.
2015-01-01
This article uses definitions provided by Cronbach in his seminal paper on coefficient alpha to show that the concepts of reliability, dimensionality, and internal consistency are distinct but interrelated. The article begins with a critique of the definition of reliability and then explores mathematical properties of Cronbach's alpha. Internal consistency…
Nitrofurantoin-induced interstitial pneumonitis: albeit rare, should not be missed.
Syed, Haamid; Bachuwa, Ghassan; Upadhaya, Sunil; Abed, Firas
2016-01-01
Interstitial lung disease (ILD) is a rare adverse effect of nitrofurantoin and can range from benign infiltrates to a fatal condition. Nitrofurantoin inhibits protein synthesis in bacteria via reactive intermediates and is known to produce primary lung parenchymal injury through an oxidant mechanism. Stopping the drug leads to complete recovery of symptoms. In this report, we present a case of nitrofurantoin-induced ILD with recovery of symptoms and of the disease process after stopping the drug. PMID:26912767
Periodontal Disease as a Specific, albeit Chronic, Infection: Diagnosis and Treatment
Loesche, Walter J.; Grossman, Natalie S.
2001-01-01
Periodontal disease is perhaps the most common chronic infection in adults. Evidence has been accumulating for the past 30 years which indicates that almost all forms of periodontal disease are chronic but specific bacterial infections due to the overgrowth in the dental plaque of a finite number of mostly anaerobic species such as Porphyromonas gingivalis, Bacteroides forsythus, and Treponema denticola. The success of traditional debridement procedures and/or antimicrobial agents in improving periodontal health can be associated with the reduction in levels of these anaerobes in the dental plaque. These findings suggest that patients and clinicians have a choice in the treatment of this overgrowth, either a debridement and surgery approach or a debridement and antimicrobial treatment approach. However, the antimicrobial approach, while supported by a wealth of scientific evidence, goes contrary to centuries of dental teaching that states that periodontal disease results from a “dirty mouth.” If periodontal disease is demonstrated to be a risk factor for cardiovascular disease and stroke, it will be a modifiable risk factor since periodontal disease can be prevented and treated. Since the antimicrobial approach may be as effective as a surgical approach in the restoration and maintenance of a periodontally healthy dentition, this would give a cardiac or stroke patient and his or her physician a choice in the implementation of treatment seeking to improve the patient's periodontal condition so as to reduce and/or delay future cardiovascular events. PMID:11585783
Queer (v.) Queer (v.): Biology as Curriculum, Pedagogy, and Being albeit Queer (v.)
ERIC Educational Resources Information Center
Broadway, Francis S.
2011-01-01
In order to advance the purpose of education as creating a sustainable world yet to be imagined, educationally, queer (v.) queer (v.) expounds curriculum, pedagogy and being, which has roots in sexuality--the public face of the private confluence of sexuality, gender, race and class--is a necessary framework for queer. If queer is a complicated…
Queer (v.) queer (v.): biology as curriculum, pedagogy, and being albeit queer (v.)
NASA Astrophysics Data System (ADS)
Broadway, Francis S.
2011-06-01
In order to advance the purpose of education as creating a sustainable world yet to be imagined, educationally, queer (v.) queer (v.) expounds curriculum, pedagogy and being, which has roots in sexuality—the public face of the private confluence of sexuality, gender, race and class—is a necessary framework for queer. If queer is a complicated conversation of strangers' eros, then queer facilitates the creation of space, revolution and transformation. In other words, queer, for science education, is more than increasing and privileging the heteronormative and non-heteronormative science content that extends capitalism's hegemony, but rather science as the dignity, identity, and loving and caring of and by one's self and fellow human beings as strangers.
Perception in statistical graphics
NASA Astrophysics Data System (ADS)
VanderPlas, Susan Ruth
There has been quite a bit of research on statistical graphics and visualization, generally focused on new types of graphics, new software to create graphics, interactivity, and usability studies. Our ability to interpret and use statistical graphics hinges on the interface between the graph itself and the brain that perceives and interprets it, and there is substantially less research on the interplay between graph, eye, brain, and mind than is sufficient to understand the nature of these relationships. The goal of the work presented here is to further explore the interplay between a static graph, the translation of that graph from paper to mental representation (the journey from eye to brain), and the mental processes that operate on that graph once it is transferred into memory (mind). Understanding the perception of statistical graphics should allow researchers to create more effective graphs which produce fewer distortions and viewer errors while reducing the cognitive load necessary to understand the information presented in the graph. Taken together, these experiments should lay a foundation for exploring the perception of statistical graphics. There has been considerable research into the accuracy of numerical judgments viewers make from graphs, and these studies are useful, but it is more effective to understand how errors in these judgments occur so that the root cause of the error can be addressed directly. Understanding how visual reasoning relates to the ability to make judgments from graphs allows us to tailor graphics to particular target audiences. In addition, understanding the hierarchy of salient features in statistical graphics allows us to clearly communicate the important message from data or statistical models by constructing graphics which are designed specifically for the perceptual system.
NASA Astrophysics Data System (ADS)
Inomata, Akira
1997-03-01
To understand possible physical consequences of quantum deformation, we investigate the statistical behavior of a quon gas. The quon is an object which obeys the minimally deformed commutator (or q-mutator): a a† - q a†a = 1 with -1 ≤ q ≤ 1. Although q = 1 and q = -1 appear to correspond to boson and fermion statistics, respectively, it is not easy to create a gas which unifies the boson gas and the fermion gas. We present a model which is able to interpolate between the two limits. The quon gas shows Bose-Einstein condensation near the boson limit in two dimensions.
NASA Astrophysics Data System (ADS)
2014-02-01
When promoting the value of their research or procuring funding, researchers often need to explain the significance of their work to the community -- something that can be just as tricky as the research itself.
Statistical aspects of solar flares
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
1987-01-01
A survey of the statistical properties of 850 H alpha solar flares during 1975 is presented. Comparison of the results found here with those reported elsewhere for different epochs is accomplished. Distributions of rise time, decay time, and duration are given, as are the mean, mode, median, and 90th percentile values. Proportions by selected groupings are also determined. For flares in general, mean values for rise time, decay time, and duration are 5.2 ± 0.4 min, and 18.1 ± 1.1 min, respectively. Subflares, accounting for nearly 90 percent of the flares, had mean values lower than those found for flares of H alpha importance greater than 1, and the differences are statistically significant. Likewise, flares of bright and normal relative brightness have mean values of decay time and duration that are significantly longer than those computed for faint flares, and mass-motion related flares are significantly longer than non-mass-motion related flares. Seventy-three percent of the mass-motion related flares are categorized as being two-ribbon flares and/or being accompanied by a high-speed dark filament. Slow rise time flares (rise time greater than 5 min) have a mean value for duration that is significantly longer than that computed for fast rise time flares, and long-lived flares (duration greater than 18 min) have a mean value for rise time that is significantly longer than that computed for short-lived flares, suggesting a positive linear relationship between rise time and duration. Monthly occurrence rates for flares in general and by group are found to be linearly related in a positive sense to monthly sunspot number. Statistical testing reveals the association between sunspot number and number of flares to be significant at the 95 percent level of confidence, and the t statistic for slope is significant at greater than the 99 percent level of confidence. Dependent upon the specific fit, between 58 percent and 94 percent of
Statistical insight: a review.
Vardell, Emily; Garcia-Barcena, Yanira
2012-01-01
Statistical Insight is a database that offers the ability to search across multiple sources of data, including the federal government, private organizations, research centers, and international intergovernmental organizations in one search. Two sample searches on the same topic, a basic and an advanced, were conducted to evaluate the database.
Pilot Class Testing: Statistics.
ERIC Educational Resources Information Center
Washington Univ., Seattle. Washington Foreign Language Program.
Statistics derived from test score data from the pilot classes participating in the Washington Foreign Language Program are presented in tables in this report. An index accompanies the tables, itemizing the classes by level (FLES, middle, and high school), grade test, language skill, and school. MLA-Coop test performances for each class were…
Statistical Reasoning over Lunch
ERIC Educational Resources Information Center
Selmer, Sarah J.; Bolyard, Johnna J.; Rye, James A.
2011-01-01
Students in the 21st century are exposed daily to a staggering amount of numerically infused media. In this era of abundant numeric data, students must be able to engage in sound statistical reasoning when making life decisions after exposure to varied information. The context of nutrition can be used to engage upper elementary and middle school…
Selected Outdoor Recreation Statistics.
ERIC Educational Resources Information Center
Bureau of Outdoor Recreation (Dept. of Interior), Washington, DC.
In this recreational information report, 96 tables are compiled from Bureau of Outdoor Recreation programs and surveys, other governmental agencies, and private sources. Eight sections comprise the document: (1) The Bureau of Outdoor Recreation, (2) Federal Assistance to Recreation, (3) Recreation Surveys for Planning, (4) Selected Statistics of…
ASURV: Astronomical SURVival Statistics
NASA Astrophysics Data System (ADS)
Feigelson, E. D.; Nelson, P. I.; Isobe, T.; LaValley, M.
2014-06-01
ASURV (Astronomical SURVival Statistics) provides astronomy survival analysis for right- and left-censored data including the maximum-likelihood Kaplan-Meier estimator and several univariate two-sample tests, bivariate correlation measures, and linear regressions. ASURV is written in FORTRAN 77, and is stand-alone and does not call any specialized libraries.
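ASURV's core tool, the Kaplan-Meier product-limit estimator for censored data, is easy to illustrate outside FORTRAN. Below is a minimal Python sketch of the estimator on invented toy data; it is not ASURV's code and handles only right-censoring:

```python
import numpy as np

def kaplan_meier(times, observed):
    """Product-limit estimate of the survival function.

    times    -- event or censoring times
    observed -- 1 if the event was seen, 0 if right-censored
    """
    times = np.asarray(times, dtype=float)
    observed = np.asarray(observed, dtype=int)

    surv = []
    s = 1.0
    event_times = np.unique(times[observed == 1])
    for t in event_times:
        at_risk = np.sum(times >= t)                 # still under observation
        deaths = np.sum((times == t) & (observed == 1))
        s *= 1.0 - deaths / at_risk                  # product-limit step
        surv.append(s)
    return event_times, np.array(surv)

# Toy right-censored sample: 0 marks a censored observation.
t_obs = [1, 2, 2, 3, 4, 5]
seen = [1, 1, 0, 1, 0, 1]
steps, s_hat = kaplan_meier(t_obs, seen)
```

Censored points never trigger a step in the curve, but they do shrink the at-risk count for later event times, which is exactly how the estimator uses the partial information in a censored observation.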
Statistics for Learning Genetics
ERIC Educational Resources Information Center
Charles, Abigail Sheena
2012-01-01
This study investigated the knowledge and skills that biology students may need to help them understand statistics/mathematics as it applies to genetics. The data are based on analyses of current representative genetics texts, practicing genetics professors' perspectives, and more directly, students' perceptions of, and performance in,…
Spitball Scatterplots in Statistics
ERIC Educational Resources Information Center
Wagaman, John C.
2012-01-01
This paper describes an active learning idea that I have used in my applied statistics class as a first lesson in correlation and regression. Students propel spitballs from various standing distances from the target and use the recorded data to determine if the spitball accuracy is associated with standing distance and review the algebra of lines…
Geopositional Statistical Methods
NASA Technical Reports Server (NTRS)
Ross, Kenton
2006-01-01
RMSE-based methods distort circular error estimates (up to 50% overestimation). The empirical approach is the only statistically unbiased estimator offered. The Ager modification to the Shultz approach is nearly unbiased, but cumbersome. All methods hover around 20% uncertainty (at 95% confidence) for low geopositional bias error estimates. This requires careful consideration in the assessment of higher-accuracy products.
ERIC Educational Resources Information Center
Akram, Muhammad; Siddiqui, Asim Jamal; Yasmeen, Farah
2004-01-01
In order to learn the concept of statistical techniques one needs to run real experiments that generate reliable data. In practice, the data from some well-defined process or system is very costly and time consuming. It is difficult to run real experiments during the teaching period in the university. To overcome these difficulties, statisticians…
Education Statistics Quarterly, 2003.
ERIC Educational Resources Information Center
Marenus, Barbara; Burns, Shelley; Fowler, William; Greene, Wilma; Knepper, Paula; Kolstad, Andrew; McMillen Seastrom, Marilyn; Scott, Leslie
2003-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released in a 3-month period. Each issue also contains a message from the NCES on a timely…
Analogies for Understanding Statistics
ERIC Educational Resources Information Center
Hocquette, Jean-Francois
2004-01-01
This article describes a simple way to explain the limitations of statistics to scientists and students to avoid the publication of misleading conclusions. Biologists examine their results extremely critically and carefully choose the appropriate analytic methods depending on their scientific objectives. However, no such close attention is usually…
Statistical properties of Fourier-based time-lag estimates
NASA Astrophysics Data System (ADS)
Epitropakis, A.; Papadakis, I. E.
2016-06-01
observed time series; b) smoothing of the cross-periodogram should be avoided, as this may introduce significant bias to the time-lag estimates, which can be taken into account by assuming a model cross-spectrum (and not just a model time-lag spectrum); c) time-lags should be estimated by dividing observed time series into a number, say m, of shorter data segments and averaging the resulting cross-periodograms; d) if the data segments have a duration ≳ 20 ks, the time-lag bias is ≲ 15% of its intrinsic value for the model cross-spectra and power-spectra considered in this work. This bias should be estimated in practice (by considering possible intrinsic cross-spectra that may be applicable to the time-lag spectra at hand) to assess the reliability of any time-lag analysis; e) the effects of experimental noise can be minimised by only estimating time-lags in the frequency range where the sample coherence is larger than 1.2/(1 + 0.2m). In this range, the amplitude of noise variations caused by measurement errors is smaller than the amplitude of the signal's intrinsic variations. As long as m ≳ 20, time-lags estimated by averaging over individual data segments have analytical error estimates that are within 95% of the true scatter around their mean, and their distribution is similar, albeit not identical, to a Gaussian.
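Points c) and e) lend themselves to a compact numerical illustration. The Python sketch below segments two series, averages the cross-periodograms, and applies the 1.2/(1 + 0.2m) coherence cut; the white-noise toy signal and the sign convention (positive lag meaning the second series lags the first) are assumptions of this sketch, not the authors' setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def time_lags(x, y, m, dt=1.0):
    """Average cross-periodograms over m equal-length segments (point c)
    and keep only frequencies passing the 1.2/(1 + 0.2m) coherence cut
    (point e)."""
    n = len(x) // m
    cross = ps_x = ps_y = 0.0
    for k in range(m):
        xs = x[k * n:(k + 1) * n]
        ys = y[k * n:(k + 1) * n]
        fx = np.fft.rfft(xs - xs.mean())
        fy = np.fft.rfft(ys - ys.mean())
        cross = cross + fx * np.conj(fy)   # assumed sign: phase > 0 when y lags x
        ps_x = ps_x + np.abs(fx) ** 2
        ps_y = ps_y + np.abs(fy) ** 2
    freq = np.fft.rfftfreq(n, d=dt)[1:]    # drop the zero frequency
    cross, ps_x, ps_y = cross[1:], ps_x[1:], ps_y[1:]
    coh2 = np.abs(cross) ** 2 / (ps_x * ps_y)   # sample coherence
    keep = coh2 > 1.2 / (1 + 0.2 * m)
    lag = np.angle(cross[keep]) / (2 * np.pi * freq[keep])
    return freq[keep], lag

# Toy data: y is x delayed by 3 samples plus weak measurement noise.
x = rng.standard_normal(4000)
y = np.roll(x, 3) + 0.1 * rng.standard_normal(4000)
f, lag = time_lags(x, y, m=20)
```

Below the phase-wrapping frequency 1/(2 × 3) the recovered lags sit near the true delay of 3 samples; above it the phase of the cross-spectrum wraps, which is why lag analyses of this kind are restricted to low frequencies.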
NASA Technical Reports Server (NTRS)
Black, D. C.
1986-01-01
The significance of brown dwarfs for resolving some major problems in astronomy is discussed. The importance of brown dwarfs for models of star formation by fragmentation of molecular clouds and for obtaining independent measurements of the ages of stars in binary systems is addressed. The relationship of brown dwarfs to planets is considered.
Statistics, Uncertainty, and Transmitted Variation
Wendelberger, Joanne Roth
2014-11-05
The field of Statistics provides methods for modeling and understanding data and making decisions in the presence of uncertainty. When examining response functions, variation present in the input variables will be transmitted via the response function to the output variables. This phenomenon can potentially have significant impacts on the uncertainty associated with results from subsequent analysis. This presentation will examine the concept of transmitted variation, its impact on designed experiments, and a method for identifying and estimating sources of transmitted variation in certain settings.
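The idea of transmitted variation can be made concrete with a small Monte Carlo sketch. The response function below is hypothetical (not from the presentation), chosen only to show input variation propagating to the output, alongside the first-order delta-method approximation of the transmitted variance:

```python
import numpy as np

rng = np.random.default_rng(1)

# A hypothetical response function of two inputs.
def response(a, b):
    return a ** 2 + 0.5 * a * b

# Variation present in the inputs (Gaussian here, an assumption) is
# transmitted via the response function to the output.
a = rng.normal(2.0, 0.1, 100_000)
b = rng.normal(1.0, 0.2, 100_000)
y = response(a, b)

# First-order (delta-method) estimate of the transmitted variance:
# var(y) ~ (dy/da)^2 var(a) + (dy/db)^2 var(b), derivatives at the means.
dyda = 2 * 2.0 + 0.5 * 1.0
dydb = 0.5 * 2.0
var_approx = dyda ** 2 * 0.1 ** 2 + dydb ** 2 * 0.2 ** 2
```

Comparing `np.var(y)` with `var_approx` shows how much of the output uncertainty is accounted for by linearly transmitted input variation; the gap between them is the higher-order contribution of the response's curvature.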
NASA Astrophysics Data System (ADS)
Maccone, C.
In this paper we provide the statistical generalization of the Fermi paradox. The statistics of habitable planets may be based on a set of ten (and possibly more) astrobiological requirements first pointed out by Stephen H. Dole in his book Habitable planets for man (1964). The statistical generalization of the original and by now too simplistic Dole equation is provided by replacing a product of ten positive numbers by the product of ten positive random variables. This is denoted the SEH, an acronym standing for "Statistical Equation for Habitables". The proof in this paper is based on the Central Limit Theorem (CLT) of Statistics, stating that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable (Lyapunov form of the CLT). It is then shown that: 1. The new random variable NHab, yielding the number of habitables (i.e. habitable planets) in the Galaxy, follows the log-normal distribution. By construction, the mean value of this log-normal distribution is the total number of habitable planets as given by the statistical Dole equation. 2. The ten (or more) astrobiological factors are now positive random variables. The probability distribution of each random variable may be arbitrary. The CLT in the so-called Lyapunov or Lindeberg forms (that both do not assume the factors to be identically distributed) allows for that. In other words, the CLT "translates" into the SEH by allowing an arbitrary probability distribution for each factor. This is both astrobiologically realistic and useful for any further investigations. 3. By applying the SEH it is shown that the (average) distance between any two nearby habitable planets in the Galaxy may be shown to be inversely proportional to the cubic root of NHab. This distance is denoted by the new random variable D. The relevant probability density function is derived, which was named the "Maccone distribution" by Paul Davies in
The Statistical Drake Equation
NASA Astrophysics Data System (ADS)
Maccone, Claudio
2010-12-01
We provide the statistical generalization of the Drake equation. From a simple product of seven positive numbers, the Drake equation is now turned into the product of seven positive random variables. We call this "the Statistical Drake Equation". The mathematical consequences of this transformation are then derived. The proof of our results is based on the Central Limit Theorem (CLT) of Statistics. In loose terms, the CLT states that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable. This is called the Lyapunov Form of the CLT, or the Lindeberg Form of the CLT, depending on the mathematical constraints assumed on the third moments of the various probability distributions. In conclusion, we show that: The new random variable N, yielding the number of communicating civilizations in the Galaxy, follows the LOGNORMAL distribution. Then, as a consequence, the mean value of this lognormal distribution is the ordinary N in the Drake equation. The standard deviation, mode, and all the moments of this lognormal N are also found. The seven factors in the ordinary Drake equation now become seven positive random variables. The probability distribution of each random variable may be ARBITRARY. The CLT in the so-called Lyapunov or Lindeberg forms (that both do not assume the factors to be identically distributed) allows for that. In other words, the CLT "translates" into our statistical Drake equation by allowing an arbitrary probability distribution for each factor. This is both physically realistic and practically very useful, of course. An application of our statistical Drake equation then follows. The (average) DISTANCE between any two neighboring and communicating civilizations in the Galaxy may be shown to be inversely proportional to the cubic root of N. Then, in our approach, this distance becomes a new random variable. We derive the relevant probability density
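The lognormal result can be checked numerically in a few lines. The sketch below multiplies seven positive random factors with arbitrary distributions (the uniforms and lognormals here are illustrative assumptions, not the distributions used in the paper) and verifies the CLT argument that the log of the product is close to Gaussian:

```python
import numpy as np

rng = np.random.default_rng(2)

# Seven positive factors with arbitrary, invented distributions.
n = 200_000
factors = [
    rng.uniform(1.0, 20.0, n),     # e.g. a star-formation-rate-like factor
    rng.uniform(0.2, 1.0, n),
    rng.lognormal(0.0, 0.5, n),
    rng.uniform(0.1, 1.0, n),
    rng.lognormal(-1.0, 0.7, n),
    rng.uniform(0.01, 0.2, n),
    rng.lognormal(4.0, 1.0, n),    # e.g. a lifetime-like factor
]
N = np.prod(factors, axis=0)       # the statistical Drake product

# CLT on log N: the sum of seven independent log-factors is close to
# Gaussian, so N itself is close to lognormal (small skewness of log N).
logN = np.log(N)
skewness = np.mean((logN - logN.mean()) ** 3) / logN.std() ** 3

# Independence also gives E[N] = product of the factor means,
# mirroring the paper's "mean of the lognormal is the ordinary N".
mean_ratio = N.mean() / np.prod([fa.mean() for fa in factors])
```

Even though several individual log-factors are noticeably skewed, the skewness of their sum is already small with only seven terms, which is the practical content of the Lyapunov/Lindeberg CLT invoked above.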
NASA Astrophysics Data System (ADS)
Dunbar, P. K.; Furtney, M.; McLean, S. J.; Sweeney, A. D.
2014-12-01
Tsunamis have inflicted death and destruction on the coastlines of the world throughout history. The occurrence of tsunamis and the resulting effects have been collected and studied as far back as the second millennium B.C. The knowledge gained from cataloging and examining these events has led to significant changes in our understanding of tsunamis, tsunami sources, and methods to mitigate the effects of tsunamis. The most significant, not surprisingly, are often the most devastating, such as the 2011 Tohoku, Japan earthquake and tsunami. The goal of this poster is to give a brief overview of the occurrence of tsunamis and then focus specifically on several significant tsunamis. There are various criteria to determine the most significant tsunamis: the number of deaths, amount of damage, maximum runup height, major impact on tsunami science or policy, etc. As a result, descriptions will include some of the most costly (2011 Tohoku, Japan), the most deadly (2004 Sumatra, 1883 Krakatau), and the highest runup ever observed (1958 Lituya Bay, Alaska). The discovery of the Cascadia subduction zone as the source of the 1700 Japanese "Orphan" tsunami and a future tsunami threat to the U.S. northwest coast contributed to the decision to form the U.S. National Tsunami Hazard Mitigation Program. The great Lisbon earthquake of 1755 marked the beginning of the modern era of seismology. Knowledge gained from the 1964 Alaska earthquake and tsunami helped confirm the theory of plate tectonics. The 1946 Alaska, 1952 Kuril Islands, 1960 Chile, 1964 Alaska, and the 2004 Banda Aceh tsunamis all resulted in warning centers or systems being established. The data descriptions on this poster were extracted from NOAA's National Geophysical Data Center (NGDC) global historical tsunami database. Additional information about these tsunamis, as well as water level data, can be found by accessing the NGDC website www.ngdc.noaa.gov/hazard/
A mixed-effects Statistical Model for Comparative LC-MS Proteomics Studies
Daly, Don S.; Anderson, Kevin K.; Panisko, Ellen A.; Purvine, Samuel O.; Fang, Ruihua; Monroe, Matthew E.; Baker, Scott E.
2008-03-01
Comparing a protein’s concentrations across two or more treatments is the focus of many proteomics studies. A frequent source of measurements for these comparisons is a mass spectrometry (MS) analysis of a protein’s peptide ions separated by liquid chromatography (LC) following its enzymatic digestion. Alas, LC-MS identification and quantification of equimolar peptides can vary significantly due to their unequal digestion, separation and ionization. This unequal measurability of peptides, the largest source of LC-MS nuisance variation, stymies confident comparison of a protein’s concentration across treatments. Our objective is to introduce a mixed-effects statistical model for comparative LC-MS proteomics studies. We describe LC-MS peptide abundance with a linear model featuring pivotal terms that account for unequal peptide LC-MS measurability. We advance fitting this model to an often incomplete LC-MS dataset with REstricted Maximum Likelihood (REML) estimation, producing estimates of model goodness-of-fit, treatment effects, standard errors, confidence intervals, and protein relative concentrations. We illustrate the model with an experiment featuring a known dilution series of a filamentous ascomycete fungus Trichoderma reesei protein mixture. For the 781 of 1546 T. reesei proteins with sufficient data coverage, the fitted mixed-effects models capably described the LC-MS measurements. The LC-MS measurability terms effectively accounted for this major source of uncertainty. Ninety percent of the relative concentration estimates were within 1/2 fold of the true relative concentrations. Akin to the common ratio method, this model also produced biased estimates, albeit less biased. Bias decreased significantly, both absolutely and relative to the ratio method, as the number of observed peptides per protein increased. Mixed-effects statistical modeling offers a flexible, well-established methodology for comparative proteomics studies integrating common
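The modeling idea, a peptide "measurability" term alongside the treatment effect, can be sketched with ordinary least squares on simulated data. The authors fit a mixed-effects model by REML; here the peptide effects are treated as fixed for simplicity, and every number below is invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical setup: one protein, 6 peptides, 2 treatments, 3 runs each.
# log-abundance = treatment-level protein effect
#                 + a peptide "measurability" offset + measurement noise.
n_pep, n_trt, n_rep = 6, 2, 3
true_trt = np.array([0.0, 1.0])           # a fold change of 1 on the log scale
true_pep = rng.normal(0.0, 2.0, n_pep)    # unequal peptide measurability

rows, y = [], []
for p in range(n_pep):
    for t in range(n_trt):
        for _ in range(n_rep):
            x = np.zeros(n_pep + 1)
            x[p] = 1.0                    # peptide indicator (absorbs measurability)
            x[n_pep] = float(t)           # treatment-2 indicator
            rows.append(x)
            y.append(true_pep[p] + true_trt[t] + rng.normal(0.0, 0.2))

X, y = np.array(rows), np.array(y)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
trt_effect = beta[-1]                     # estimated treatment contrast
```

Because the peptide indicators soak up the large measurability offsets, the treatment contrast is estimated from within-peptide differences, which is the essential point of the paper's model (REML additionally handles missing peptides and random-effect variance components).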
Nock, Richard; Nielsen, Frank
2004-11-01
This paper explores a statistical basis for a process often described in computer vision: image segmentation by region merging following a particular order in the choice of regions. We exhibit a particular blend of algorithmics and statistics whose segmentation error is, as we show, limited from both the qualitative and quantitative standpoints. This approach can be efficiently approximated in linear time/space, leading to a fast segmentation algorithm tailored to processing images described using most common numerical pixel attribute spaces. The conceptual simplicity of the approach makes it simple to modify and cope with hard noise corruption, handle occlusion, authorize the control of the segmentation scale, and process unconventional data such as spherical images. Experiments on gray-level and color images, obtained with a short readily available C-code, display the quality of the segmentations obtained.
Modeling cosmic void statistics
NASA Astrophysics Data System (ADS)
Hamaus, Nico; Sutter, P. M.; Wandelt, Benjamin D.
2016-10-01
Understanding the internal structure and spatial distribution of cosmic voids is crucial when considering them as probes of cosmology. We present recent advances in modeling void density- and velocity-profiles in real space, as well as void two-point statistics in redshift space, by examining voids identified via the watershed transform in state-of-the-art ΛCDM n-body simulations and mock galaxy catalogs. The simple and universal characteristics that emerge from these statistics indicate the self-similarity of large-scale structure and suggest cosmic voids to be among the most pristine objects to consider for future studies on the nature of dark energy, dark matter and modified gravity.
Statistical evaluation of forecasts
NASA Astrophysics Data System (ADS)
Mader, Malenka; Mader, Wolfgang; Gluckman, Bruce J.; Timmer, Jens; Schelter, Björn
2014-08-01
Reliable forecasts of extreme but rare events, such as earthquakes, financial crashes, and epileptic seizures, would render interventions and precautions possible. Therefore, forecasting methods have been developed which intend to raise an alarm if an extreme event is about to occur. In order to statistically validate the performance of a prediction system, it must be compared to the performance of a random predictor, which raises alarms independent of the events. Such a random predictor can be obtained by bootstrapping or analytically. We propose an analytic statistical framework which, in contrast to conventional methods, allows for validating independently the sensitivity and specificity of a forecasting method. Moreover, our method accounts for the periods during which an event has to remain absent or occur after a respective forecast.
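The random-predictor baseline described above is simple to simulate. In this sketch, alarms are raised independently of the events at an assumed rate q, and the observed sensitivity is compared with its analytic value (which is simply q for this memoryless predictor); the event rate and alarm rate are invented:

```python
import numpy as np

rng = np.random.default_rng(4)

# A long binary record of rare events (about 1% of time steps).
n = 50_000
events = rng.random(n) < 0.01

# A random predictor raises alarms independently of the events at an
# assumed overall alarm rate q, mimicking the rate of the method under test.
q = 0.05
random_alarms = rng.random(n) < q

# Sensitivity of the random predictor: the fraction of events that were
# preceded by an alarm at the previous time step.
hits = random_alarms[:-1] & events[1:]
sens_random = hits.sum() / events[1:].sum()
# For this memoryless predictor the analytic sensitivity equals q:
# this is the baseline a genuine forecasting method must beat.
```

A real forecaster is then judged by how far its sensitivity exceeds q at the same alarm rate, which is the comparison the analytic framework in the abstract formalises for sensitivity and specificity separately.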
Journey Through Statistical Mechanics
NASA Astrophysics Data System (ADS)
Yang, C. N.
2013-05-01
My first involvement with statistical mechanics and the many body problem was when I was a student at The National Southwest Associated University in Kunming during the war. At that time Professor Wang Zhu-Xi had just come back from Cambridge, England, where he was a student of Fowler, and his thesis was on phase transitions, a hot topic at that time, and still a very hot topic today...
Statistical Methods in Cosmology
NASA Astrophysics Data System (ADS)
Verde, L.
2010-03-01
The advent of large data sets in cosmology has meant that in the past 10 or 20 years our knowledge and understanding of the Universe has changed not only quantitatively but also, and most importantly, qualitatively. Cosmologists rely on data where a host of useful information is enclosed, but it is encoded in a non-trivial way. The challenges in extracting this information must be overcome to make the most of a large experimental effort. Even after having converged to a standard cosmological model (the LCDM model) we should keep in mind that this model is described by 10 or more physical parameters, and if we want to study deviations from it, the number of parameters is even larger. Dealing with such a high-dimensional parameter space and finding parameter constraints is a challenge in itself. Cosmologists want to be able to compare and combine different data sets both for testing for possible disagreements (which could indicate new physics) and for improving parameter determinations. Finally, cosmologists in many cases want to find out, before actually doing the experiment, how much one would be able to learn from it. For all these reasons, sophisticated statistical techniques are being employed in cosmology, and it has become crucial to know some statistical background to understand recent literature in the field. I will introduce some statistical tools that any cosmologist should know about in order to be able to understand recently published results from the analysis of cosmological data sets. I will not present a complete and rigorous introduction to statistics as there are several good books which are reported in the references. The reader should refer to those.
NASA Astrophysics Data System (ADS)
Talkner, Peter
2003-07-01
The statistical properties of the transitions of a discrete Markov process are investigated in terms of entrance times. A simple formula for their density is given and used to measure the synchronization of a process with a periodic driving force. For the McNamara-Wiesenfeld model of stochastic resonance we find parameter regions in which the transition frequency of the process is locked with the frequency of the external driving.
1979 DOE statistical symposium
Gardiner, D. A.; Truett, T.
1980-09-01
The 1979 DOE Statistical Symposium was the fifth in the series of annual symposia designed to bring together statisticians and other interested parties who are actively engaged in helping to solve the nation's energy problems. The program included presentations of technical papers centered around exploration and disposal of nuclear fuel, general energy-related topics, and health-related issues, and workshops on model evaluation, risk analysis, analysis of large data sets, and resource estimation.
Hockey sticks, principal components, and spurious significance
NASA Astrophysics Data System (ADS)
McIntyre, Stephen; McKitrick, Ross
2005-02-01
The "hockey stick" shaped temperature reconstruction of Mann et al. (1998, 1999) has been widely applied. However, it has not been previously noted in print that, prior to their principal components (PCs) analysis on tree ring networks, they carried out an unusual data transformation which strongly affects the resulting PCs. Their method, when tested on persistent red noise, nearly always produces a hockey stick shaped first principal component (PC1) and overstates the first eigenvalue. In the controversial 15th century period, the MBH98 method effectively selects only one species (bristlecone pine) into the critical North American PC1, making it implausible to describe it as the "dominant pattern of variance". Through Monte Carlo analysis, we show that MBH98 benchmarks for significance of the Reduction of Error (RE) statistic are substantially understated and, using a range of cross-validation statistics, we show that the MBH98 15th century reconstruction lacks statistical significance.
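The Monte Carlo argument can be sketched in a few lines of Python: generate pure AR(1) "red noise" proxies with no signal, centre them only on a short modern calibration window in the style attributed to MBH98, and extract PC1. The parameters below are illustrative assumptions, not the paper's benchmark settings:

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative settings: 50 proxy series, 600 "years",
# a 100-year modern calibration window, AR(1) persistence 0.9.
n_series, n_years, cal = 50, 600, 100
phi = 0.9

# Persistent red noise with no climate signal at all.
shocks = rng.standard_normal((n_years, n_series))
X = np.empty_like(shocks)
X[0] = shocks[0]
for t in range(1, n_years):
    X[t] = phi * X[t - 1] + shocks[t]

# Short-centring: centre each series on the calibration period only,
# rather than on its full length.
X_short = X - X[-cal:].mean(axis=0)

# First principal component of the short-centred data.
_, sv, vt = np.linalg.svd(X_short, full_matrices=False)
pc1 = X_short @ vt[0]

# Hockey-stick offset: by construction the calibration segment of PC1
# averages to zero while its long-term mean is free to wander, so the
# shaft-versus-blade offset discussed in the paper can be read off directly.
bend = abs(pc1[-cal:].mean() - pc1.mean()) / pc1.std()
```

Because short-centring rewards series whose calibration-period mean departs from their long-term mean, the SVD tends to load PC1 on exactly those series, which is the mechanism behind the spurious hockey-stick shapes the paper reports.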
Guta, Madalin; Butucea, Cristina
2010-10-15
The notion of a U-statistic for an n-tuple of identical quantum systems is introduced in analogy to the classical (commutative) case: given a self-adjoint 'kernel' K acting on (C^d)^⊗r with r
Statistical Inference at Work: Statistical Process Control as an Example
ERIC Educational Resources Information Center
Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia
2008-01-01
To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
ERIC Educational Resources Information Center
Chan, Shiau Wei; Ismail, Zaleha
2014-01-01
The focus of assessment in statistics has gradually shifted from traditional assessment towards alternative assessment, where more attention has been paid to the core statistical concepts such as center, variability, and distribution. In spite of this, there are comparatively few assessments that combine the three significant types of statistical…
Machtay; Glatstein
1998-01-01
have shown overall survivals superior to age-matched controls). It is fallacious and illogical to compare nonrandomized series of observation to those of aggressive therapy. In addition to the above problem, the use of DSS introduces another potential issue which we will call the bias of cause-of-death interpretation. All statistical endpoints (e.g., response rates, local-regional control, freedom from brain metastases), except OS, are known to depend heavily on the methods used to define the endpoint and are often subject to significant interobserver variability. There is no reason to believe that this problem does not occasionally occur with respect to defining a death as due to the index cancer or to intercurrent disease, even though this issue has been poorly studied. In many oncologic situations, for example metastatic lung cancer, this form of bias does not exist. In some situations, such as head and neck cancer, this could be an intermediate problem (Was that lethal chest tumor a second primary or a metastasis? Would the fatal aspiration pneumonia have occurred if he still had a tongue? And what about Mr. B., described above?). In some situations, particularly relatively "good prognosis" neoplasms, this could be a substantial problem, particularly if the adjudication of whether or not a death is cancer-related is performed solely by researchers who have an "interest" in demonstrating a good DSS. What we are most concerned about with this form of bias relates to recent series on observation, such as in early prostate cancer. It is interesting to note that although only 10% of the "observed" patients die from prostate cancer, many develop distant metastases by 10 years (approximately 40% among patients with intermediate grade tumors). Thus, it is implied that prostate cancer metastases are usually not of themselves lethal, which is a misconception to anyone experienced in taking care of prostate cancer patients. This is inconsistent with U.S. studies of
Statistical considerations in design of spacelab experiments
NASA Technical Reports Server (NTRS)
Robinson, J.
1978-01-01
After making an analysis of experimental error sources, statistical models were developed for the design and analysis of potential Space Shuttle experiments. Guidelines for statistical significance and/or confidence limits of expected results were also included. The models were then tested out on the following proposed Space Shuttle biomedical experiments: (1) bone density by computer tomography; (2) basal metabolism; and (3) total body water. Analysis of those results and therefore of the models proved inconclusive due to the lack of previous research data and statistical values. However, the models were seen as possible guides to making some predictions and decisions.
Anthropological significance of phenylketonuria.
Saugstad, L F
1975-01-01
The highest incidence rates of phenylketonuria (PKU) have been observed in Ireland and Scotland. Parents heterozygous for PKU in Norway differ significantly from the general population in the Rhesus, Kell and PGM systems. The parents investigated showed an excess of Rh negative, Kell plus and PGM type 1 individuals, which makes them similar to the present populations in Ireland and Scotland. It is postulated that the heterozygotes for PKU in Norway are descended from a completely assimilated sub-population of Celtic origin, who came or were brought here 1000 years ago. Bronze objects of Western European (Scottish, Irish) origin, found in Viking graves widely distributed in Norway, have been taken as evidence of Vikings returning with loot (including a number of Celts) from Western Viking settlements. The continuity of residence since the Viking age in most habitable parts of Norway, and what seems to be a nearly complete regional relationship between the sites where Viking graves contain western imported objects and the birthplaces of grandparents of PKUs identified in Norway, lend further support to the hypothesis that the heterozygotes for PKU in Norway are descended from a completely assimilated subpopulation. The remarkable resemblance between Iceland and Ireland, in respect of several genetic markers (including the Rhesus, PGM and Kell systems), is considered to be an expression of a similar proportion of people of Celtic origin in each of the two countries. Their identical, high incidence rates of PKU are regarded as further evidence of this. The significant decline in the incidence of PKU when one passes from Ireland, Scotland and Iceland, to Denmark and on to Norway and Sweden, is therefore explained as being related to a reduction in the proportion of inhabitants of Celtic extraction in the respective populations.
Statistical design for microwave systems
NASA Technical Reports Server (NTRS)
Cooke, Roland; Purviance, John
1991-01-01
This paper presents an introduction to statistical system design. Basic ideas needed to understand statistical design and a method for implementing statistical design are presented. The nonlinear characteristics of the system amplifiers and mixers are accounted for in the given examples. Specifications of group delay, signal-to-noise ratio, and output power are considered in these statistical designs.
Experimental Mathematics and Computational Statistics
Bailey, David H.; Borwein, Jonathan M.
2009-04-30
The field of statistics has long been noted for techniques to detect patterns and regularities in numerical data. In this article we explore connections between statistics and the emerging field of 'experimental mathematics'. These include both applications of experimental mathematics in statistics and statistical methods applied to computational mathematics.
NASA Technical Reports Server (NTRS)
1995-01-01
NASA Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
NASA Technical Reports Server (NTRS)
1994-01-01
Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
NASA Technical Reports Server (NTRS)
1996-01-01
This booklet of pocket statistics includes the 1996 NASA Major Launch Record, NASA Procurement, Financial, and Workforce data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
Who Needs Statistics? | Poster
You may know the feeling. You have collected a lot of new data on an important experiment. Now you are faced with multiple groups of data, a sea of numbers, and a deadline for submitting your paper to a peer-reviewed journal. And you are not sure which data are relevant, or even the best way to present them. The statisticians at Data Management Services (DMS) know how to help. This small group of experts provides a wide array of statistical and mathematical consulting services to the scientific community at NCI at Frederick and NCI-Bethesda.
Statistical physics and ecology
NASA Astrophysics Data System (ADS)
Volkov, Igor
This work addresses the applications of the methods of statistical physics to problems in population ecology. A theoretical framework based on stochastic Markov processes for the unified neutral theory of biodiversity is presented, and an analytical solution for the relative species abundance distribution both in the large meta-community and in the small local community is obtained. It is shown that the framework of the current neutral theory in ecology can be easily generalized to incorporate symmetric density dependence. An analytically tractable model is studied that provides an accurate description of beta-diversity and exhibits novel scaling behavior that leads to links between ecological measures such as relative species abundance and the species-area relationship. We develop a simple framework that incorporates the Janzen-Connell, dispersal and immigration effects and leads to a description of the distribution of relative species abundance, the equilibrium species richness, beta-diversity and the species-area relationship, in good accord with data. It is also shown that an ecosystem can be mapped into an unconventional statistical ensemble and is quite generally tuned in the vicinity of a phase transition where biodiversity and the use of resources are optimized. We also perform a detailed study of the unconventional statistical ensemble, in which, unlike in physics, the total number of particles and the energy are not fixed but bounded. We show that the temperature and the chemical potential play a dual role: they determine the average energy and the population of the levels in the system and at the same time they act as an imbalance between the energy and population ceilings and the corresponding average values. Different types of statistics (Boltzmann, Bose-Einstein, Fermi-Dirac and one corresponding to the description of a simple ecosystem) are considered. In all cases, we show that the systems may undergo a first or a second order
International petroleum statistics report
1995-10-01
The International Petroleum Statistics Report is a monthly publication that provides current international oil data. This report presents data on international oil production, demand, imports, exports and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). Section 2 presents an oil supply/demand balance for the world, in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries.
NASA Technical Reports Server (NTRS)
Freilich, M. H.; Pawka, S. S.
1987-01-01
The statistics of Sxy estimates derived from orthogonal-component measurements are examined. Based on results of Goodman (1957), the probability density function (pdf) for Sxy(f) estimates is derived, and a closed-form solution for arbitrary moments of the distribution is obtained. Characteristic functions are used to derive the exact pdf of Sxy(tot). In practice, a simple Gaussian approximation is found to be highly accurate even for relatively few degrees of freedom. Implications for experiment design are discussed, and a maximum-likelihood estimator for a posteriori estimation is outlined.
Fragile entanglement statistics
NASA Astrophysics Data System (ADS)
Brody, Dorje C.; Hughston, Lane P.; Meier, David M.
2015-10-01
If X and Y are independent, Y and Z are independent, and so are X and Z, one might be tempted to conclude that X, Y, and Z are independent. But it has long been known in classical probability theory that, intuitive as it may seem, this is not true in general. In quantum mechanics one can ask whether analogous statistics can emerge for configurations of particles in certain types of entangled states. The explicit construction of such states, along with the specification of suitable sets of observables that have the purported statistical properties, is not entirely straightforward. We show that an example of such a configuration arises in the case of an N-particle GHZ state, and we are able to identify a family of observables with the property that the associated measurement outcomes are independent for any choice of 2, 3, ..., N-1 of the particles, even though the measurement outcomes for all N particles are not independent. Although such states are highly entangled, the entanglement turns out to be 'fragile', i.e. the associated density matrix has the property that if one traces out the freedom associated with even a single particle, the resulting reduced density matrix is separable.
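The classical probability fact the abstract opens with can be checked directly. The sketch below is the textbook bit-flip counterexample (not the paper's quantum configuration): X and Y are fair bits and Z = X XOR Y, so every pair is independent while the triple is not.

```python
from itertools import product
from fractions import Fraction

# Classic counterexample: X and Y are fair bits, Z = X XOR Y.
# Each pair of variables is independent, but Z is determined by X and Y.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
p = Fraction(1, len(outcomes))  # each (x, y) pair is equally likely

def prob(event):
    return sum(p for o in outcomes if event(o))

# Pairwise independence: P(X=1, Z=1) == P(X=1) * P(Z=1) == 1/4
assert prob(lambda o: o[0] == 1 and o[2] == 1) == Fraction(1, 4)
# No mutual independence: P(X=1, Y=1, Z=1) is 0, not 1/8
assert prob(lambda o: o[0] == 1 and o[1] == 1 and o[2] == 1) == 0
print("pairwise independent, jointly dependent")
```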
Statistical clumped isotope signatures.
Röckmann, T; Popa, M E; Krol, M C; Hofmann, M E G
2016-08-18
High precision measurements of molecules containing more than one heavy isotope may provide novel constraints on element cycles in nature. These so-called clumped isotope signatures are reported relative to the random (stochastic) distribution of heavy isotopes over all available isotopocules of a molecule, which is the conventional reference. When multiple indistinguishable atoms of the same element are present in a molecule, this reference is calculated from the bulk (≈average) isotopic composition of the involved atoms. We show here that this referencing convention leads to apparent negative clumped isotope anomalies (anti-clumping) when the indistinguishable atoms originate from isotopically different populations. Such statistical clumped isotope anomalies must occur in any system where two or more indistinguishable atoms of the same element, but with different isotopic composition, combine in a molecule. The size of the anti-clumping signal is closely related to the difference of the initial isotope ratios of the indistinguishable atoms that have combined. Therefore, a measured statistical clumped isotope anomaly, relative to an expected (e.g. thermodynamical) clumped isotope composition, may allow assessment of the heterogeneity of the isotopic pools of atoms that are the substrate for formation of molecules.
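A toy numeric sketch of the referencing effect described above, for a hypothetical diatomic molecule assembled by randomly pairing one atom from each of two isotopically distinct pools (the function and the numbers are illustrative assumptions, not the paper's calculations):

```python
# Hypothetical sketch: a diatomic molecule X2 forms by pairing one atom from
# pool A and one from pool B, with heavy-isotope fractions a and b.
def statistical_clumping_anomaly(a, b):
    observed_heavy_pair = a * b        # doubly substituted fraction
    bulk = (a + b) / 2                 # bulk (average) isotopic composition
    stochastic_reference = bulk ** 2   # random pairing from the bulk pool
    return observed_heavy_pair / stochastic_reference - 1

# Identical pools: no anomaly.
print(statistical_clumping_anomaly(0.01, 0.01))   # 0.0
# Isotopically distinct pools: apparent anti-clumping (negative anomaly),
# even though the pairing itself was perfectly random.
print(statistical_clumping_anomaly(0.01, 0.02))   # ≈ -0.111
```

The negative sign follows from the AM-GM inequality: the product ab is always below the squared average ((a+b)/2)^2 unless a = b, matching the paper's point that the anomaly grows with the difference between the pools.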
International petroleum statistics report
1997-05-01
The International Petroleum Statistics Report is a monthly publication that provides current international oil data. This report is published for the use of Members of Congress, Federal agencies, State agencies, industry, and the general public. Publication of this report is in keeping with responsibilities given the Energy Information Administration in Public Law 95-91. The International Petroleum Statistics Report presents data on international oil production, demand, imports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1995; OECD stocks from 1973 through 1995; and OECD trade from 1985 through 1995.
Sufficient Statistics: an Example
NASA Technical Reports Server (NTRS)
Quirein, J.
1973-01-01
The feature selection problem resulting from the transformation x = Bz is considered, where B is a k by n matrix of rank k and k ≤ n. Such a transformation can be considered to reduce the dimension of each observation vector z, and in general, such a transformation results in a loss of information. In terms of the divergence, this information loss is expressed by the fact that the average divergence D_B computed using variable x is less than or equal to the average divergence D computed using variable z. If D_B = D, then B is said to be a sufficient statistic for the average divergence D. If B is a sufficient statistic for the average divergence, then it can be shown that the probability of misclassification computed using variable x (of dimension k ≤ n) is equal to the probability of misclassification computed using variable z. Also included is what is believed to be a new proof of the well-known fact that D ≥ D_B. Using the techniques necessary to prove the above fact, it is shown that the Bhattacharyya distance as measured by variable x is less than or equal to the Bhattacharyya distance as measured by variable z.
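A small numeric sketch of the divergence inequality for the special case of two Gaussian classes with identity covariance, where the symmetric divergence reduces to the squared distance between the class means (the means and projection matrices below are invented for illustration, not from the report):

```python
import numpy as np

# Two Gaussian classes with identity covariance in z-space: the symmetric
# divergence is ||mu1 - mu2||^2.  A projection x = Bz can only lose
# divergence; equality holds when B keeps the separating direction,
# i.e. when B is a sufficient statistic for the divergence.
mu1 = np.array([0.0, 0.0])
mu2 = np.array([3.0, 4.0])

def divergence(m1, m2):
    d = m2 - m1
    return float(d @ d)

def projected_divergence(B):
    # B is 1 x 2 with a unit-norm row, so the projected covariance stays 1.
    d = B @ (mu2 - mu1)
    return float(d @ d)

D = divergence(mu1, mu2)                        # 25.0
B_bad = np.array([[1.0, 0.0]])                  # drops the second coordinate
B_good = (mu2 - mu1)[None, :] / np.linalg.norm(mu2 - mu1)
print(D, projected_divergence(B_bad), projected_divergence(B_good))
```

Here `B_bad` yields D_B = 9 < 25, while `B_good` attains D_B = D, the equality case described in the abstract.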
Fungi producing significant mycotoxins.
2012-01-01
Mycotoxins are secondary metabolites of microfungi that are known to cause sickness or death in humans or animals. Although many such toxic metabolites are known, it is generally agreed that only a few are significant in causing disease: aflatoxins, fumonisins, ochratoxin A, deoxynivalenol, zearalenone, and ergot alkaloids. These toxins are produced by just a few species from the common genera Aspergillus, Penicillium, Fusarium, and Claviceps. All Aspergillus and Penicillium species either are commensals, growing in crops without obvious signs of pathogenicity, or invade crops after harvest and produce toxins during drying and storage. In contrast, the important Fusarium and Claviceps species infect crops before harvest. The most important Aspergillus species, occurring in warmer climates, are A. flavus and A. parasiticus, which produce aflatoxins in maize, groundnuts, tree nuts, and, less frequently, other commodities. The main ochratoxin A producers, A. ochraceus and A. carbonarius, commonly occur in grapes, dried vine fruits, wine, and coffee. Penicillium verrucosum also produces ochratoxin A but occurs only in cool temperate climates, where it infects small grains. F. verticillioides is ubiquitous in maize, with an endophytic nature, and produces fumonisins, which are generally more prevalent when crops are under drought stress or suffer excessive insect damage. It has recently been shown that Aspergillus niger also produces fumonisins, and several commodities may be affected. F. graminearum, which is the major producer of deoxynivalenol and zearalenone, is pathogenic on maize, wheat, and barley and produces these toxins whenever it infects these grains before harvest. Also included is a short section on Claviceps purpurea, which produces sclerotia among the seeds in grasses, including wheat, barley, and triticale. The main thrust of the chapter contains information on the identification of these fungi and their morphological characteristics, as well as factors
ERIC Educational Resources Information Center
Perepiczka, Michelle; Chandler, Nichelle; Becerra, Michael
2011-01-01
Statistics plays an integral role in graduate programs. However, numerous intra- and interpersonal factors may lead to successful completion of needed coursework in this area. The authors examined the extent of the relationship between self-efficacy to learn statistics and statistics anxiety, attitude towards statistics, and social support of 166…
Nonlinear Statistical Modeling of Speech
NASA Astrophysics Data System (ADS)
Srinivasan, S.; Ma, T.; May, D.; Lazarou, G.; Picone, J.
2009-12-01
Contemporary approaches to speech and speaker recognition decompose the problem into four components: feature extraction, acoustic modeling, language modeling and search. Statistical signal processing is an integral part of each of these components, and Bayes Rule is used to merge these components into a single optimal choice. Acoustic models typically use hidden Markov models based on Gaussian mixture models for state output probabilities. This popular approach suffers from an inherent assumption of linearity in speech signal dynamics. Language models often employ a variety of maximum entropy techniques, but can employ many of the same statistical techniques used for acoustic models. In this paper, we focus on introducing nonlinear statistical models to the feature extraction and acoustic modeling problems as a first step towards speech and speaker recognition systems based on notions of chaos and strange attractors. Our goal in this work is to improve the generalization and robustness properties of a speech recognition system. Three nonlinear invariants are proposed for feature extraction: Lyapunov exponents, correlation fractal dimension, and correlation entropy. We demonstrate an 11% relative improvement on speech recorded under noise-free conditions, but show a comparable degradation occurs for mismatched training conditions on noisy speech. We conjecture that the degradation is due to difficulties in estimating invariants reliably from noisy data. To circumvent these problems, we introduce two dynamic models to the acoustic modeling problem: (1) a linear dynamic model (LDM) that uses a state space-like formulation to explicitly model the evolution of hidden states using an autoregressive process, and (2) a data-dependent mixture of autoregressive (MixAR) models. Results show that LDM and MixAR models can achieve comparable performance with HMM systems while using significantly fewer parameters. Currently we are developing Bayesian parameter estimation and
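As a concrete illustration of one of the nonlinear invariants proposed above, the sketch below estimates the largest Lyapunov exponent, but for the logistic map rather than for real speech; the map, parameters, and iteration counts are illustrative assumptions, not the authors' feature-extraction pipeline.

```python
import numpy as np

# Largest Lyapunov exponent of the logistic map x -> r x (1 - x),
# estimated as the trajectory average of log|f'(x)| = log|r (1 - 2x)|.
def lyapunov_logistic(r, n=10_000, x0=0.2):
    x, total = x0, 0.0
    for _ in range(n):
        total += np.log(abs(r * (1 - 2 * x)))  # log of the local stretching
        x = r * x * (1 - x)
    return total / n

print(lyapunov_logistic(4.0))   # ≈ ln 2 ≈ 0.693: positive, chaotic regime
print(lyapunov_logistic(2.5))   # negative: trajectory settles on a fixed point
```

A positive exponent signals sensitive dependence on initial conditions, which is the property the paper exploits as a discriminative feature.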
Should College Algebra be a Prerequisite for Taking Psychology Statistics?
ERIC Educational Resources Information Center
Sibulkin, Amy E.; Butler, J. S.
2008-01-01
In order to consider whether a course in college algebra should be a prerequisite for taking psychology statistics, we recorded students' grades in elementary psychology statistics and in college algebra at a 4-year university. Students who earned credit in algebra prior to enrolling in statistics for the first time had a significantly higher mean…
A Statistics Curriculum for the Undergraduate Chemistry Major
ERIC Educational Resources Information Center
Schlotter, Nicholas E.
2013-01-01
Our ability to statistically analyze data has grown significantly with the maturing of computer hardware and software. However, the evolution of our statistics capabilities has taken place without a corresponding evolution in the curriculum for the undergraduate chemistry major. Most faculty understand the need for a statistical educational…
A Tablet-PC Software Application for Statistics Classes
ERIC Educational Resources Information Center
Probst, Alexandre C.
2014-01-01
A significant deficiency in the area of introductory statistics education exists: Student performance on standardized assessments after a full semester statistics course is poor and students report a very low desire to learn statistics. Research on the current generation of students indicates an affinity for technology and for multitasking.…
"t" for Two: Using Mnemonics to Teach Statistics
ERIC Educational Resources Information Center
Stalder, Daniel R.; Olson, Elizabeth A.
2011-01-01
This article provides a list of statistical mnemonics for instructor use. This article also reports on the potential for such mnemonics to help students learn, enjoy, and become less apprehensive about statistics. Undergraduates from two sections of a psychology statistics course rated 8 of 11 mnemonics as significantly memorable and helpful in…
Innovative trend significance test and applications
NASA Astrophysics Data System (ADS)
Şen, Zekai
2015-11-01
Hydro-climatological time series might embed characteristics of past changes concerning climate variability in terms of shifts, cyclic fluctuations, and, more significantly, trends. Identification of such features from the available records is one of the prime tasks of hydrologists, climatologists, applied statisticians, and experts in related topics. Although there are different trend identification and significance tests in the literature, they require restrictive assumptions that may not hold for hydro-climatological time series. In this paper, an innovative method for trend identification, together with a statistical significance test, is suggested. The method has a non-parametric basis without restrictive assumptions, and its application is simple, resting on the comparison of sub-series extracted from the main time series. The method allows selection of sub-temporal half-periods for the comparison and identifies trends in an objective and quantitative manner. The necessary statistical equations are derived for innovative trend identification and for application of the statistical significance test. The proposed methodology is applied to three time series from different parts of the world: Southern New Jersey annual temperature, Danube River annual discharge, and Tigris River Diyarbakir meteorology station annual total rainfall records. The New Jersey record shows a significant increasing trend, whereas the other two records show decreasing trends.
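A minimal sketch of the sub-series comparison idea on synthetic data (the series, the half-split, and the simple indicator below are simplified assumptions for illustration, not the paper's exact test statistic):

```python
import numpy as np

# Innovative-trend-style comparison: split the record into two halves,
# sort each half, and compare them against the 1:1 line.  Sorted
# second-half values consistently above the sorted first half suggest
# an increasing trend.
rng = np.random.default_rng(1)
t = np.arange(100)
series = 10.0 + 0.05 * t + rng.normal(0.0, 1.0, size=100)  # synthetic record

half = len(series) // 2
first = np.sort(series[:half])
second = np.sort(series[half:])

# A simple trend indicator: mean vertical distance from the 1:1 line.
trend_indicator = float(np.mean(second - first))
print(trend_indicator > 0)   # True for this upward-trending synthetic series
```

In the paper's graphical version, plotting `second` against `first` puts points above the 1:1 line for increasing trends and below it for decreasing trends; the significance test then asks whether the departure exceeds what sampling variability alone would produce.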
NASA Astrophysics Data System (ADS)
Talkner, Peter
2003-03-01
The statistical properties of discrete Markov processes are investigated in terms of entrance times. Simple relations are given for their density and higher order distributions. These quantities are used for introducing a generalized Rice phase and for characterizing the synchronization of a process with an external driving force. For the McNamara Wiesenfeld model of stochastic resonance parameter regions (spanned by the noise strength, driving frequency and strength) are identified in which the process is locked with the frequency of the external driving and in which the diffusion of the Rice phase becomes minimal. At the same time the Fano factor of the number of entrances per period of the driving force has a minimum.
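A toy illustration of entrance-time counting and the Fano factor for a two-state Markov chain; the chain, its rates, and the window length are invented for illustration and are not the McNamara-Wiesenfeld model or its driven variant.

```python
import numpy as np

# Count "entrances" into state 1 of a two-state Markov chain and compute
# the Fano factor (variance / mean) of the number of entrances per fixed
# window, the quantity used above to characterize synchronization.
rng = np.random.default_rng(2)
p01, p10 = 0.1, 0.2              # per-step transition probabilities 0->1, 1->0
steps, window = 200_000, 100

state = 0
entrances = np.zeros(steps // window)
for t in range(steps):
    if state == 0:
        nxt = 1 if rng.random() < p01 else 0
        if nxt == 1:
            entrances[t // window] += 1   # an entrance into state 1
    else:
        nxt = 0 if rng.random() < p10 else 1
    state = nxt

fano = float(entrances.var() / entrances.mean())
print(round(fano, 2))
```

For this undriven chain the Fano factor sits below 1 (the geometric sojourn times make the entrance count sub-Poissonian); in the driven, stochastic-resonance setting of the paper it is the minimum of this quantity over noise strength that marks frequency locking.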
Dienes, J.K.
1983-01-01
An alternative to the use of plasticity theory to characterize the inelastic behavior of solids is to represent the flaws by statistical methods. We have taken such an approach to study fragmentation because it offers a number of advantages. Foremost among these is that, by considering the effects of flaws, it becomes possible to address the underlying physics directly. For example, we have been able to explain why rocks exhibit large strain-rate effects (a consequence of the finite growth rate of cracks), why a spherical explosive imbedded in oil shale produces a cavity with a nearly square section (opening of bedding cracks) and why propellants may detonate following low-speed impact (a consequence of frictional hot spots).
Conditional statistical model building
NASA Astrophysics Data System (ADS)
Hansen, Mads Fogtmann; Hansen, Michael Sass; Larsen, Rasmus
2008-03-01
We present a new statistical deformation model suited for parameterized grids with different resolutions. Our method models the covariances between multiple grid levels explicitly and allows for very efficient fitting of the model to data on multiple scales. The model is validated on a data set consisting of 62 annotated MR images of the corpus callosum. One fifth of the data set was used as a training set, the images of which were non-rigidly registered to each other without a shape prior. From the non-rigidly registered training set, a shape prior was constructed by performing principal component analysis on each grid level and using the results to construct a conditional shape model, conditioning the finer parameters on the coarser grid levels. The remaining shapes were registered with the constructed shape prior. The Dice measures for the registration without a prior and the registration with a prior were 0.875 ± 0.042 and 0.8615 ± 0.051, respectively.
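The conditioning step can be sketched with the standard Gaussian conditional formulas; the synthetic 4-dimensional joint covariance below stands in for the paper's grid-level covariances and is not the authors' code or data.

```python
import numpy as np

# Conditioning fine-grid parameters on coarse-grid parameters, assuming they
# are jointly Gaussian:
#   mean_f|c = mu_f + S_fc S_cc^{-1} (c - mu_c)
#   cov_f|c  = S_ff - S_fc S_cc^{-1} S_cf
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
S = A @ A.T + 4 * np.eye(4)      # synthetic joint covariance: 2 coarse + 2 fine
mu = np.zeros(4)
cc, ff = slice(0, 2), slice(2, 4)

S_cc, S_ff = S[cc, cc], S[ff, ff]
S_fc = S[ff, cc]

coarse_obs = np.array([1.0, -0.5])           # observed coarse parameters
K = S_fc @ np.linalg.inv(S_cc)
cond_mean = mu[ff] + K @ (coarse_obs - mu[cc])
cond_cov = S_ff - K @ S_fc.T                 # Schur complement: still PD
print(cond_mean, np.all(np.linalg.eigvalsh(cond_cov) > 0))
```

The conditional covariance is the Schur complement of the coarse block, so the fine-level uncertainty shrinks once the coarse parameters are fixed, which is what lets the coarse levels regularize the fine ones during registration.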
Statistical design controversy
Evans, L.S.; Hendrey, G.R.; Thompson, K.H.
1985-02-01
This article responds to criticisms received by Evans, Hendrey, and Thompson that their earlier article was biased because of omissions and misrepresentations. The authors contend that their conclusion that experimental designs having only one plot per treatment "were, from the outset, not capable of differentiating between treatment effects and field-position effects" remains valid and is supported by decades of agronomic research. Irving, Troiano, and McCune treated the article as a review of all studies of acid rain effects on soybeans; it was not. The article was written out of concern over comparisons being made among studies that purport to evaluate effects of acid deposition on field-grown crops, comparisons that implicitly assume all of the studies are of equal scientific value. They are not. Only experimental approaches that are well focused and designed with appropriate agronomic and statistical procedures should be used for credible regional and national assessments of crop inventories. 12 references.
Rossell, David
2016-01-01
Big Data brings unprecedented power to address scientific, economic and societal issues, but also amplifies the possibility of certain pitfalls. These include using purely data-driven approaches that disregard understanding the phenomenon under study, aiming at a dynamically moving target, ignoring critical data collection issues, summarizing or preprocessing the data inadequately and mistaking noise for signal. We review some success stories and illustrate how statistical principles can help obtain more reliable information from data. We also touch upon current challenges that require active methodological research, such as strategies for efficient computation, integration of heterogeneous data, extending the underlying theory to increasingly complex questions and, perhaps most importantly, training a new generation of scientists to develop and deploy these strategies. PMID:27722040
Statistical physics "Beyond equilibrium"
Ecke, Robert E
2009-01-01
The scientific challenges of the 21st century will increasingly involve competing interactions, geometric frustration, spatial and temporal intrinsic inhomogeneity, nanoscale structures, and interactions spanning many scales. We will focus on a broad class of emerging problems that will require new tools in non-equilibrium statistical physics and that will find application in new material functionality, in predicting complex spatial dynamics, and in understanding novel states of matter. Our work will encompass materials under extreme conditions involving elastic/plastic deformation, competing interactions, intrinsic inhomogeneity, frustration in condensed matter systems, scaling phenomena in disordered materials from glasses to granular matter, quantum chemistry applied to nano-scale materials, soft-matter materials, and spatio-temporal properties of both ordinary and complex fluids.
Statistically determined nickel cadmium performance relationships
NASA Technical Reports Server (NTRS)
Gross, Sidney
1987-01-01
A statistical analysis was performed on sealed nickel cadmium cell manufacturing data and cell matching data. The cells subjected to the analysis were 30 Ah sealed Ni/Cd cells made by General Electric. A total of 213 data parameters was investigated, including such information as plate thickness, amount of electrolyte added, weight of active material, positive and negative capacity, and charge-discharge behavior. Statistical analyses were made to determine possible correlations between test events. The data show many departures from normal distribution. Product consistency from one lot to another is an important attribute for aerospace applications. It is clear from these examples that there are some significant differences between lots, and statistical analyses are an excellent way to spot those differences.
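The lot-screening analysis described above comes down to computing correlations between pairs of manufacturing parameters. A minimal sketch using a plain Pearson coefficient; the weights and capacities below are invented for illustration, not the General Electric cell data:

```python
import random

random.seed(5)

def pearson_r(xs, ys):
    # Plain Pearson correlation coefficient, computed by hand.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

# Hypothetical cells: capacity loosely tracks the weight of active material.
weights = [random.gauss(100, 5) for _ in range(50)]
capacities = [0.3 * w + random.gauss(0, 1) for w in weights]

r = pearson_r(weights, capacities)
# A strong r flags a parameter pair worth examining across lots.
```

Screening all 213 parameters this way produces a correlation matrix whose largest off-diagonal entries point to the lot-to-lot differences worth investigating.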
Wide Wide World of Statistics: International Statistics on the Internet.
ERIC Educational Resources Information Center
Foudy, Geraldine
2000-01-01
Explains how to find statistics on the Internet, especially international statistics. Discusses advantages over print sources, including convenience, currency of information, cost effectiveness, and value-added formatting; sources of international statistics; United Nations agencies; search engines and power searching; and evaluating sources. (LRW)
Understanding Statistics and Statistics Education: A Chinese Perspective
ERIC Educational Resources Information Center
Shi, Ning-Zhong; He, Xuming; Tao, Jian
2009-01-01
In recent years, statistics education in China has made great strides. However, there still exists a fairly large gap with the advanced levels of statistics education in more developed countries. In this paper, we identify some existing problems in statistics education in Chinese schools and make some proposals as to how they may be overcome. We…
Statistical Literacy: Developing a Youth and Adult Education Statistical Project
ERIC Educational Resources Information Center
Conti, Keli Cristina; Lucchesi de Carvalho, Dione
2014-01-01
This article focuses on the notion of literacy--general and statistical--in the analysis of data from a fieldwork research project carried out as part of a master's degree that investigated the teaching and learning of statistics in adult education mathematics classes. We describe the statistical context of the project that involved the…
Statistical Modelling of Compound Floods
NASA Astrophysics Data System (ADS)
Bevacqua, Emanuele; Maraun, Douglas; Vrac, Mathieu; Widmann, Martin; Manning, Colin
2016-04-01
of interest. This is based on real data for river discharge (Y_RIVER) and sea level (Y_SEA) from the River Têt in the south of France. The impact of the compound flood is the water level in the area between the river and sea stations, which we define here as h = αY_RIVER + (1 − α)Y_SEA. Here we show the sensitivity of the system to changes in the two physical parameters. Through variations in α we can study the system in one or two dimensions, which allows for the assessment of the risk associated with either of the two variables alone or with a combination of them. Varying instead the second parameter, i.e. the dependence between Y_RIVER and Y_SEA, we show how an apparently weak dependence can increase the risk of flooding significantly with respect to the independent case. The model can be applied to future climate by inserting predictors into the statistical model as additional conditioning variables. By conditioning the simulation of the statistical model on predictors obtained for future projections from climate models, both the change in risk and the characteristics of compound floods in the future can be analysed.
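The effect of dependence on the impact variable h can be illustrated with a quick Monte Carlo sketch. Standard-normal margins and a single Gaussian dependence parameter rho are our simplifying assumptions here, not the paper's fitted model for the Têt data:

```python
import math
import random

random.seed(42)

def flood_exceedance(rho, alpha=0.5, threshold=2.0, n=200_000):
    # Estimate P(h > threshold) for h = alpha*Y_river + (1 - alpha)*Y_sea,
    # with standard-normal margins and Gaussian dependence rho.
    hits = 0
    for _ in range(n):
        z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
        y_river = z1
        y_sea = rho * z1 + math.sqrt(1 - rho ** 2) * z2
        if alpha * y_river + (1 - alpha) * y_sea > threshold:
            hits += 1
    return hits / n

p_indep = flood_exceedance(rho=0.0)  # independent river and sea levels
p_dep = flood_exceedance(rho=0.6)    # moderately dependent
# Even moderate dependence inflates the joint flood risk several-fold.
```

The intuition: correlation raises the variance of the weighted sum h, so its tail probability grows much faster than the marginal risks alone would suggest.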
Heart Disease and Stroke Statistics
Fact sheets (PDF) cover topics including nutrition, obesity, and peripheral artery disease. For statistics questions, contact the American Heart Association National Center, Office of Science & Medicine, at statistics@heart.org.
Muscular Dystrophy: Data and Statistics
MD STARnet Data and Statistics. The following data and statistics are from MD STARnet. For more information on MD STARnet, see Research and Tracking.
Thoughts About Theories and Statistics.
Fawcett, Jacqueline
2015-07-01
The purpose of this essay is to share my ideas about the connection between theories and statistics. The essay content reflects my concerns about some researchers' and readers' apparent lack of clarity about what constitutes appropriate statistical testing and conclusions about the empirical adequacy of theories. The reciprocal relation between theories and statistics is emphasized and the conclusion is that statistics without direction from theory is no more than a hobby.
Springer Handbook of Engineering Statistics
NASA Astrophysics Data System (ADS)
Pham, Hoang
The Springer Handbook of Engineering Statistics gathers together the full range of statistical techniques required by engineers from all fields to gain sensible statistical feedback on how their processes or products are functioning and to give them realistic predictions of how these could be improved.
Statistical log analysis made practical
Mitchell, W.K.; Nelson, R.J.
1991-06-01
This paper discusses the advantages of a statistical approach to log analysis. Statistical techniques use inverse methods to calculate formation parameters. The use of statistical techniques has been limited, however, by the complexity of the mathematics and lengthy computer time required to minimize traditionally used nonlinear equations.
Invention Activities Support Statistical Reasoning
ERIC Educational Resources Information Center
Smith, Carmen Petrick; Kenlan, Kris
2016-01-01
Students' experiences with statistics and data analysis in middle school are often limited to little more than making and interpreting graphs. Although students may develop fluency in statistical procedures and vocabulary, they frequently lack the skills necessary to apply statistical reasoning in situations other than clear-cut textbook examples.…
Explorations in Statistics: the Bootstrap
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2009-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fourth installment of Explorations in Statistics explores the bootstrap. The bootstrap gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic such as the…
Teaching Statistics Online Using "Excel"
ERIC Educational Resources Information Center
Jerome, Lawrence
2011-01-01
As anyone who has taught or taken a statistics course knows, statistical calculations can be tedious and error-prone, with the details of a calculation sometimes distracting students from understanding the larger concepts. Traditional statistics courses typically use scientific calculators, which can relieve some of the tedium and errors but…
Statistics Anxiety and Instructor Immediacy
ERIC Educational Resources Information Center
Williams, Amanda S.
2010-01-01
The purpose of this study was to investigate the relationship between instructor immediacy and statistics anxiety. It was predicted that students receiving immediacy would report lower levels of statistics anxiety. Using a pretest-posttest-control group design, immediacy was measured using the Instructor Immediacy scale. Statistics anxiety was…
Statistics: It's in the Numbers!
ERIC Educational Resources Information Center
Deal, Mary M.; Deal, Walter F., III
2007-01-01
Mathematics and statistics play important roles in peoples' lives today. A day hardly passes that they are not bombarded with many different kinds of statistics. As consumers they see statistical information as they surf the web, watch television, listen to their satellite radios, or even read the nutrition facts panel on a cereal box in the…
Statistics of indistinguishable particles.
Wittig, Curt
2009-07-01
The wave function of a system containing identical particles takes into account the relationship between a particle's intrinsic spin and its statistical property. Specifically, the exchange of two identical particles having odd-half-integer spin results in the wave function changing sign, whereas the exchange of two identical particles having integer spin is accompanied by no such sign change. This is embodied in a term (-1)^(2s), which has the value +1 for integer s (bosons), and -1 for odd-half-integer s (fermions), where s is the particle spin. All of this is well known. In the nonrelativistic limit, a detailed consideration of the exchange of two identical particles shows that exchange is accompanied by a 2π reorientation that yields the (-1)^(2s) term. The same bookkeeping is applicable to the relativistic case described by the proper orthochronous Lorentz group, because any proper orthochronous Lorentz transformation can be expressed as the product of spatial rotations and a boost along the direction of motion. PMID:19552474
International petroleum statistics report
1996-05-01
The International Petroleum Statistics Report presents data on international oil production, demand, imports, exports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1995; OECD stocks from 1973 through 1995; and OECD trade from 1984 through 1994.
International petroleum statistics report
1995-11-01
The International Petroleum Statistics Report presents data on international oil production, demand, imports, exports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1994; OECD stocks from 1973 through 1994; and OECD trade from 1984 through 1994.
International petroleum statistics report
1995-07-27
The International Petroleum Statistics Report presents data on international oil production, demand, imports, and exports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1994; OECD stocks from 1973 through 1994; and OECD trade from 1984 through 1994.
Topics in statistical mechanics
Elser, V.
1984-05-01
This thesis deals with four independent topics in statistical mechanics: (1) the dimer problem is solved exactly for a hexagonal lattice with general boundary using a known generating function from the theory of partitions. It is shown that the leading term in the entropy depends on the shape of the boundary; (2) continuum models of percolation and self-avoiding walks are introduced with the property that their series expansions are sums over linear graphs with intrinsic combinatorial weights and explicit dimension dependence; (3) a constrained SOS model is used to describe the edge of a simple cubic crystal. Low and high temperature results are derived as well as the detailed behavior near the crystal facet; (4) the microscopic model of the lambda-transition involving atomic permutation cycles is reexamined. In particular, a new derivation of the two-component field theory model of the critical behavior is presented. Results for a lattice model originally proposed by Kikuchi are extended with a high temperature series expansion and Monte Carlo simulation. 30 references.
Statistical mechanics of nucleosomes
NASA Astrophysics Data System (ADS)
Chereji, Razvan V.
Eukaryotic cells contain long DNA molecules (about two meters for a human cell) which are tightly packed inside the micrometric nuclei. Nucleosomes are the basic packaging unit of the DNA which allows this millionfold compactification. A longstanding puzzle is to understand the principles which allow cells both to organize their genomes into chromatin fibers in the crowded space of their nuclei and to keep the DNA accessible to many factors and enzymes. With the nucleosomes covering about three quarters of the DNA, their positions are essential because these influence which genes can be regulated by the transcription factors and which cannot. We study physical models which predict the genome-wide organization of the nucleosomes and also the relevant energies which dictate this organization. In the last five years, the study of chromatin has seen many important advances. In particular, in the field of nucleosome positioning, new techniques for identifying nucleosomes and the competing DNA-binding factors have appeared, such as chemical mapping with hydroxyl radicals and ChIP-exo; the resolution of nucleosome maps has increased with paired-end sequencing; and the price of sequencing an entire genome has decreased. We present a rigorous statistical mechanics model which is able to explain the recent experimental results by taking into account nucleosome unwrapping, competition between different DNA-binding proteins, and both the interaction between histones and DNA and that between neighboring histones. We show a series of predictions of our new model, all in agreement with the experimental observations.
International petroleum statistics report
1997-07-01
The International Petroleum Statistics Report is a monthly publication that provides current international data. The report presents data on international oil production, demand, imports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent 12 months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1996; OECD stocks from 1973 through 1996; and OECD trade from 1986 through 1996.
International petroleum statistics report
1996-10-01
The International Petroleum Statistics Report presents data on international oil production, demand, imports, and stocks. The report has four sections. Section 1 contains time series data on world oil production, and on oil demand and stocks in the Organization for Economic Cooperation and Development (OECD). This section contains annual data beginning in 1985, and monthly data for the most recent two years. Section 2 presents an oil supply/demand balance for the world. This balance is presented in quarterly intervals for the most recent two years. Section 3 presents data on oil imports by OECD countries. This section contains annual data for the most recent year, quarterly data for the most recent two quarters, and monthly data for the most recent twelve months. Section 4 presents annual time series data on world oil production and oil stocks, demand, and trade in OECD countries. World oil production and OECD demand data are for the years 1970 through 1995; OECD stocks from 1973 through 1995; and OECD trade from 1985 through 1995.
A statistical mechanical problem?
Costa, Tommaso; Ferraro, Mario
2014-01-01
The problem of deriving the processes of perception and cognition or the modes of behavior from states of the brain appears to be unsolvable in view of the huge numbers of elements involved. However, neural activities are not random, nor independent, but are constrained to form spatio-temporal patterns, and thanks to these restrictions, which in turn are due to connections among neurons, the problem can at least be approached. The situation is similar to what happens in large physical ensembles, where global behaviors are derived from microscopic properties. Despite the obvious differences between neural and physical systems, a statistical mechanics approach is almost inescapable, since the dynamics of the brain as a whole are clearly determined by the outputs of single neurons. In this paper it will be shown how, starting from very simple systems, connectivity engenders levels of increasing complexity in the functions of the brain depending on specific constraints. Correspondingly, levels of explanation must take into account the fundamental role of constraints and assign at each level proper model structures and variables that, on the one hand, emerge from the outputs of the lower levels, and yet are specific, in that they ignore irrelevant details. PMID:25228891
2008-01-01
There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9% improvement, p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study. PMID:18765754
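Pre/post survey gains of the kind described above are the natural setting for a paired t-test. A minimal by-hand sketch; the scores below are invented for illustration, not the study's survey data:

```python
import math

# Hypothetical pre/post scores on an 11-item statistics survey.
pre = [6, 5, 7, 4, 6, 5, 8, 6, 5, 7, 6, 5]
post = [8, 7, 8, 6, 7, 7, 9, 8, 6, 8, 8, 7]

diffs = [b - a for a, b in zip(pre, post)]
n = len(diffs)
mean_d = sum(diffs) / n
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t = mean_d / (sd_d / math.sqrt(n))
# Compare |t| with the t critical value (about 2.2 for 11 df at alpha = 0.05).
```

Pairing each student with their own pretest removes between-student variation, which is why the same design detects gains even in students who had already taken a statistics course.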
Exact significance test for Markov order
NASA Astrophysics Data System (ADS)
Pethel, S. D.; Hahs, D. W.
2014-02-01
We describe an exact significance test of the null hypothesis that a Markov chain is nth order. The procedure utilizes surrogate data to yield an exact test statistic distribution valid for any sample size. Surrogate data are generated using a novel algorithm that guarantees, per shot, a uniform sampling from the set of sequences that exactly match the nth order properties of the observed data. Using the test, the Markov order of Tel Aviv rainfall data is examined.
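The flavor of a surrogate-data test can be sketched in the simplest case: under a zeroth-order (i.i.d.) null, random permutations of the sequence are exact surrogates, since they preserve the symbol counts. The paper's algorithm generalizes this to arbitrary order n; the toy statistic and the sticky chain below are our illustrative choices, not the authors':

```python
import random

random.seed(7)

def repeat_stat(seq):
    # Test statistic: count of adjacent equal symbols (sensitive to memory).
    return sum(a == b for a, b in zip(seq, seq[1:]))

def order0_surrogate_test(seq, n_surr=999):
    # Permutation surrogates exactly preserve symbol counts, i.e. the
    # zeroth-order properties; the p-value is the rank of the observed value.
    obs = repeat_stat(seq)
    ge = sum(repeat_stat(random.sample(seq, len(seq))) >= obs
             for _ in range(n_surr))
    return (ge + 1) / (n_surr + 1)

# A "sticky" two-state chain: the next symbol usually repeats the last one.
state, chain = 0, []
for _ in range(300):
    if random.random() < 0.1:
        state = 1 - state
    chain.append(state)

p = order0_surrogate_test(chain)  # small p: the chain is not zeroth order
```

Because every surrogate matches the null's sufficient statistics exactly, the resulting p-value is exact for any sample size, which is the key property the paper exploits.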
Multivariate statistical analysis of environmental monitoring data
Ross, D.L.
1997-11-01
EPA requires statistical procedures to determine whether soil or ground water adjacent to or below waste units is contaminated. These statistical procedures are often based on comparisons between two sets of data: one representing background conditions, and one representing site conditions. Since the statistical requirements were originally promulgated in the 1980s, EPA has made several improvements and modifications. Problems remain, however. One is that the regulations do not require a minimum probability that contaminated sites will be correctly identified. Another is that the effect of testing several correlated constituents on the probable outcome of the statistical tests has not been quantified. Results from computer simulations to determine power functions for realistic monitoring situations are presented here. Power functions for two different statistical procedures, the Student's t-test and the multivariate Hotelling's T² test, are compared. The comparisons indicate that the multivariate test is often more powerful when the tests are applied with significance levels chosen to control the probability of falsely identifying clean sites as contaminated. This program could also be used to verify that statistical procedures achieve some minimum power standard at a regulated waste unit.
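For two constituents, the Hotelling's T² statistic can be written out by hand, which makes the contrast with per-variable t-tests concrete. A minimal sketch with simulated "background" and "site" samples; the correlation, shift, and sample sizes are hypothetical, not taken from the report's simulations:

```python
import random

random.seed(1)

def hotelling_t2(xs, ys):
    # One-sample Hotelling's T^2 against mu = (0, 0); 2-D case by hand,
    # with the 2x2 covariance inverse written out explicitly.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    return n * (syy * mx ** 2 - 2 * sxy * mx * my + sxx * my ** 2) / det

def correlated_pair(shift, rho=0.7):
    # Two constituents with correlation rho and a common mean shift.
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    return shift + z1, shift + rho * z1 + (1 - rho ** 2) ** 0.5 * z2

background = [correlated_pair(0.0) for _ in range(200)]  # clean conditions
site = [correlated_pair(0.5) for _ in range(200)]        # mild contamination

t2_bg = hotelling_t2(*zip(*background))
t2_site = hotelling_t2(*zip(*site))
```

Because T² accounts for the correlation between constituents, it pools evidence from all variables in one test at the nominal significance level, whereas multiple separate t-tests must split that level to control the false-positive rate.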
Statistical Symbolic Execution with Informed Sampling
NASA Technical Reports Server (NTRS)
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space, and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that informed sampling obtains more precise results and converges faster than a purely statistical analysis, and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
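The core estimation step, Monte Carlo sampling plus Bayesian estimation of an event probability, can be sketched independently of symbolic execution. Here the target event is a hypothetical stand-in for "this input reaches the assert violation", not anything from Symbolic PathFinder; with a uniform Beta(1, 1) prior, the posterior over the hit probability is Beta(hits + 1, n − hits + 1):

```python
import random

random.seed(3)

def target_event(x, y):
    # Hypothetical stand-in for an input reaching the assert violation.
    return x * x + y * y < 0.25

n = 10_000
hits = sum(target_event(random.uniform(-1, 1), random.uniform(-1, 1))
           for _ in range(n))

# Uniform Beta(1, 1) prior => posterior Beta(hits + 1, n - hits + 1);
# its mean is a regularized estimate of the event probability.
posterior_mean = (hits + 1) / (n + 2)
```

Informed sampling goes further by computing the probability of some paths exactly and removing them from the sampled space, so the remaining Monte Carlo effort is spent only on the paths that still carry uncertainty.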
Statistical Analysis Experiment for Freshman Chemistry Lab.
ERIC Educational Resources Information Center
Salzsieder, John C.
1995-01-01
Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…
The Academic Pecking Order: A Statistical Expose.
ERIC Educational Resources Information Center
Ciampa, Bartholomew J.
This study was designed to provide statistical analysis of certain curricular characteristics that could be used as a projective device to be considered prior to the implementation of any further changes of curricular or philosophical significance. The population of the study comprised all students at Nasson College in the classes of 1968 through…
Ideal statistically quasi Cauchy sequences
NASA Astrophysics Data System (ADS)
Savas, Ekrem; Cakalli, Huseyin
2016-08-01
An ideal I is a family of subsets of N, the set of positive integers, that is closed under taking finite unions and subsets of its elements. A sequence (x_k) of real numbers is said to be S(I)-statistically convergent to a real number L if, for each ε > 0 and for each δ > 0, the set {n ∈ N : (1/n)|{k ≤ n : |x_k − L| ≥ ε}| ≥ δ} belongs to I. We introduce S(I)-statistically ward compactness of a subset of R, the set of real numbers, and S(I)-statistically ward continuity of a real function, in the senses that a subset E of R is S(I)-statistically ward compact if any sequence of points in E has an S(I)-statistically quasi-Cauchy subsequence, and a real function is S(I)-statistically ward continuous if it preserves S(I)-statistically quasi-Cauchy sequences, where a sequence (x_k) is said to be S(I)-statistically quasi-Cauchy when (Δx_k) is S(I)-statistically convergent to 0. We obtain results related to S(I)-statistically ward continuity, S(I)-statistically ward compactness, Nθ-ward continuity, and slowly oscillating continuity.
Basic statistics in cell biology.
Vaux, David L
2014-01-01
The physicist Ernest Rutherford said, "If your experiment needs statistics, you ought to have done a better experiment." Although this aphorism remains true for much of today's research in cell biology, a basic understanding of statistics can be useful to cell biologists to help in monitoring the conduct of their experiments, in interpreting the results, in presenting them in publications, and when critically evaluating research by others. However, training in statistics is often focused on the sophisticated needs of clinical researchers, psychologists, and epidemiologists, whose conclusions depend wholly on statistics, rather than the practical needs of cell biologists, whose experiments often provide evidence that is not statistical in nature. This review describes some of the basic statistical principles that may be of use to experimental biologists, but it does not cover the sophisticated statistics needed for papers that contain evidence of no other kind.
Statistical Seismology and Induced Seismicity
NASA Astrophysics Data System (ADS)
Tiampo, K. F.; González, P. J.; Kazemian, J.
2014-12-01
While seismicity triggered or induced by natural resources production such as mining or water impoundment in large dams has long been recognized, the recent increase in the unconventional production of oil and gas has been linked to a rapid rise in seismicity in many places, including central North America (Ellsworth et al., 2012; Ellsworth, 2013). Worldwide, induced events of M~5 have occurred and, although rare, have resulted in both damage and public concern (Horton, 2012; Keranen et al., 2013). In addition, over the past twenty years, the increase in both number and coverage of seismic stations has resulted in an unprecedented ability to precisely record the magnitude and location of large numbers of small magnitude events. The increase in the number and type of seismic sequences available for detailed study has revealed differences in their statistics that were previously difficult to quantify. For example, seismic swarms that produce significant numbers of foreshocks as well as aftershocks have been observed in different tectonic settings, including California, Iceland, and the East Pacific Rise (McGuire et al., 2005; Shearer, 2012; Kazemian et al., 2014). Similarly, smaller events have been observed prior to larger induced events in several occurrences from energy production. The field of statistical seismology has long focused on the question of triggering and the mechanisms responsible (Stein et al., 1992; Hill et al., 1993; Steacy et al., 2005; Parsons, 2005; Main et al., 2006). For example, in most cases the associated stress perturbations are much smaller than the earthquake stress drop, suggesting an inherent sensitivity to relatively small stress changes (Nalbant et al., 2005). Induced seismicity provides the opportunity to investigate triggering and, in particular, the differences between long- and short-range triggering. Here we investigate the statistics of induced seismicity sequences from around the world, including central North America and Spain, and
Gaussian statistics for palaeomagnetic vectors
Love, J.J.; Constable, C.G.
2003-01-01
formulate the inverse problem, and how to estimate the mean and variance of the magnetic vector field, even when the data consist of mixed combinations of directions and intensities. We examine palaeomagnetic secular-variation data from Hawaii and Réunion, and although these two sites are on almost opposite latitudes, we find significant differences in the mean vector and differences in the local vectorial variances, with the Hawaiian data being particularly anisotropic. These observations are inconsistent with a description of the mean field as being a simple geocentric axial dipole and with secular variation being statistically symmetrical with respect to reflection through the equatorial plane. Finally, our analysis of palaeomagnetic acquisition data from the 1960 Kilauea flow in Hawaii and the Holocene Xitle flow in Mexico is consistent with the widely held suspicion that directional data are more accurate than intensity data.
NASA Astrophysics Data System (ADS)
Holmes, Jon L.
2000-06-01
IP-number access. Current subscriptions can be upgraded to IP-number access at little additional cost. We are pleased to be able to offer to institutions and libraries this convenient mode of access to subscriber only resources at JCE Online. JCE Online Usage Statistics We are continually amazed by the activity at JCE Online. So far, the year 2000 has shown a marked increase. Given the phenomenal overall growth of the Internet, perhaps our surprise is not warranted. However, during the months of January and February 2000, over 38,000 visitors requested over 275,000 pages. This is a monthly increase of over 33% from the October-December 1999 levels. It is good to know that people are visiting, but we would very much like to know what you would most like to see at JCE Online. Please send your suggestions to JCEOnline@chem.wisc.edu. For those who are interested, JCE Online year-to-date statistics are available. Biographical Snapshots of Famous Chemists: Mission Statement Feature Editor: Barbara Burke Chemistry Department, California State Polytechnic University-Pomona, Pomona, CA 91768 phone: 909/869-3664 fax: 909/869-4616 email: baburke@csupomona.edu The primary goal of this JCE Internet column is to provide information about chemists who have made important contributions to chemistry. For each chemist, there is a short biographical "snapshot" that provides basic information about the person's chemical work, gender, ethnicity, and cultural background. Each snapshot includes links to related websites and to a biobibliographic database. The database provides references for the individual and can be searched through key words listed at the end of each snapshot. All students, not just science majors, need to understand science as it really is: an exciting, challenging, human, and creative way of learning about our natural world. Investigating the life experiences of chemists can provide a means for students to gain a more realistic view of chemistry. In addition students
NASA Astrophysics Data System (ADS)
Roychoudhuri, Chandrasekhar
2009-03-01
We use the following epistemology: understanding and visualizing the invisible processes behind all natural phenomena, through iterative reconstruction and/or refinement of current working theories toward their limits, constitutes our best approach to discovering the actual realities of nature, followed by new breakthrough theories. We use this epistemology to explore the roots of the statistical nature of the real world: classical physics, quantum physics, and even our mental constructs. Diversity is a natural and healthy outcome of this statistical nature. First, we use a two-beam superposition experiment as an illustrative example of the quantum world to visualize the root of fluctuations (or randomness) in photoelectron counting statistics. We recognize that the fluctuating weak background fields make the quantum world inherently random, but the fluctuations are still statistically bounded, indicating that the fundamental laws of nature are still causal. Theoreticians will forever be challenged to construct a causal, closed-form theory free of statistical randomness out of incomplete information. We show, by analyzing the essential steps behind any experiment, that gaps in the information gathered about any phenomenon are inevitable. This lack of information also causes our personal epistemologies to have a "statistical spread" due to their molecular origin, albeit bounded and constrained by the causally driven atomic and molecular interactions across the board. While there are clear differences in the root and manifestation of classical and quantum statistical behavior, on a fundamental level both originate in our theories from the lack of complete information about everything involved in every interaction in our experiments. The statistical nature of our theories is a product of incomplete information, and we should take it as an inevitable paradigm.
Analysis and modeling of resistive switching statistics
NASA Astrophysics Data System (ADS)
Long, Shibing; Cagli, Carlo; Ielmini, Daniele; Liu, Ming; Suñé, Jordi
2012-04-01
The resistive random access memory (RRAM), based on the reversible switching between different resistance states, is a promising candidate for next-generation nonvolatile memories. One of the most important challenges to foster the practical application of RRAM is the control of the statistical variation of switching parameters to gain low variability and high reliability. In this work, starting from the well-known percolation model of dielectric breakdown (BD), we establish a framework of analysis and modeling of the resistive switching statistics in RRAM devices, which are based on the formation and disconnection of a conducting filament (CF). One key aspect of our proposal is the relation between the CF resistance and the switching statistics. Hence, establishing the correlation between SET and RESET switching variables and the initial resistance of the device in the OFF and ON states, respectively, is a fundamental issue. Our modeling approach to the switching statistics is fully analytical and contains two main elements: (i) a geometrical cell-based description of the CF and (ii) a deterministic model for the switching dynamics. Both ingredients might be slightly different for the SET and RESET processes, for the type of switching (bipolar or unipolar), and for the kind of considered resistive structure (oxide-based, conductive bridge, etc.). However, the basic structure of our approach is thought to be useful for all the cases and should provide a framework for the physics-based understanding of the switching mechanisms and the associated statistics, for the trustworthy estimation of RRAM performance, and for the successful forecast of reliability. As a first application example, we start by considering the case of the RESET statistics of NiO-based RRAM structures. In particular, we statistically analyze the RESET transitions of a statistically significant number of switching cycles of Pt/NiO/W devices. In the RESET transition, the ON-state resistance (RON) is a
Statistics without Tears: Complex Statistics with Simple Arithmetic
ERIC Educational Resources Information Center
Smith, Brian
2011-01-01
One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…
SOCR: Statistics Online Computational Resource
Dinov, Ivo D.
2011-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students’ intuition and enhance their learning. PMID:21451741
Characterizations of linear sufficient statistics
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Redner, R.; Decell, H. P., Jr.
1976-01-01
A necessary and sufficient condition is developed such that there exists a continuous linear sufficient statistic T for a dominated collection of totally finite measures defined on the Borel field generated by the open sets of a Banach space X. In particular, corollary necessary and sufficient conditions are given so that there exists a rank K linear sufficient statistic T for any finite collection of probability measures having n-variate normal densities. In this case a simple calculation, involving only the population means and covariances, determines the smallest integer K for which there exists a rank K linear sufficient statistic T (as well as an associated statistic T itself).
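In the special case where all populations share one covariance matrix, the "simple calculation" the abstract alludes to reduces to the rank of the whitened mean differences (the Fisher-discriminant construction). The sketch below illustrates only that equal-covariance special case; it is not the paper's general formula, and the function name is ours:

```python
import numpy as np

def sufficient_rank(means, cov):
    """Smallest rank K of a linear sufficient statistic for a family of
    normal densities sharing one covariance matrix: the rank of the
    whitened mean differences (equal-covariance special case only)."""
    diffs = np.column_stack(
        [np.linalg.solve(cov, m - means[0]) for m in means[1:]])
    return int(np.linalg.matrix_rank(diffs))

# Three collinear means: a single linear functional of x is sufficient
means = [np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([2.0, 0.0, 0.0])]
K = sufficient_rank(means, np.eye(3))
```

Collinear means collapse onto one discriminant direction, so K = 1 even though the ambient space is three-dimensional.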
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differentially expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big Data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
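The large-scale simultaneous testing mentioned above is commonly handled with a false-discovery-rate procedure. A minimal sketch of the Benjamini-Hochberg step-up rule, shown as one illustrative example rather than the paper's specific method:

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: reject the hypotheses whose
    sorted p-values fall below the line alpha * rank / m, controlling the
    false discovery rate at level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k_max = rank          # largest rank passing the threshold
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject

# Toy screen of 10 genes: only the two smallest p-values survive FDR control
pvals = [0.001, 0.008, 0.039, 0.041, 0.09, 0.2, 0.3, 0.5, 0.7, 0.9]
rejected = benjamini_hochberg(pvals, alpha=0.05)
```

Note that 0.039 is below 0.05 but is not rejected: with 10 tests its rank-3 threshold is only 0.015, which is exactly the multiplicity correction at work.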
A spatial scan statistic for multinomial data
Jung, Inkyung; Kulldorff, Martin; Richard, Otukei John
2014-01-01
As a geographical cluster detection analysis tool, the spatial scan statistic has been developed for different types of data such as Bernoulli, Poisson, ordinal, exponential and normal. Another interesting data type is multinomial. For example, one may want to find clusters where the disease-type distribution is statistically significantly different from the rest of the study region when there are different types of disease. In this paper, we propose a spatial scan statistic for such data, which is useful for geographical cluster detection analysis for categorical data without any intrinsic order information. The proposed method is applied to meningitis data consisting of five different disease categories to identify areas with distinct disease-type patterns in two counties in the U.K. The performance of the method is evaluated through a simulation study. PMID:20680984
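The core of a scan statistic is a log-likelihood ratio comparing counts inside a candidate zone with the rest of the region, maximized over zones. A hedged sketch of that ratio for multinomial (categorical) counts, following the general form of multinomial scan statistics rather than the paper's exact derivation:

```python
import math

def multinomial_scan_llr(inside, total):
    """Log-likelihood ratio for a candidate zone: compares the category
    (e.g. disease-type) distribution inside the zone with the rest of the
    study region. inside[k] and total[k] are counts for category k."""
    C, N = sum(inside), sum(total)
    ll_alt = ll_null = 0.0
    for c, n in zip(inside, total):
        o = n - c                       # category count outside the zone
        if c:
            ll_alt += c * math.log(c / C)
        if o:
            ll_alt += o * math.log(o / (N - C))
        if n:
            ll_null += n * math.log(n / N)
    return ll_alt - ll_null

llr_cluster = multinomial_scan_llr([30, 10], [40, 40])  # skewed zone
llr_flat = multinomial_scan_llr([20, 20], [40, 40])     # same mix as rest
```

A zone whose category mix matches the rest of the region scores zero; significance of the maximum ratio is then typically assessed by Monte Carlo replication.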
Statistical process control in nursing research.
Polit, Denise F; Chaboyer, Wendy
2012-02-01
In intervention studies in which randomization to groups is not possible, researchers typically use quasi-experimental designs. Time series designs are strong quasi-experimental designs but are seldom used, perhaps because of technical and analytic hurdles. Statistical process control (SPC) is an alternative analytic approach to testing hypotheses about intervention effects using data collected over time. SPC, like traditional statistical methods, is a tool for understanding variation and involves the construction of control charts that distinguish between normal, random fluctuations (common cause variation) and statistically significant special cause variation that can result from an innovation. The purpose of this article is to provide an overview of SPC and to illustrate its use in a study of a nursing practice improvement intervention. PMID:22095634
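The control-chart logic described above can be sketched with an individuals (XmR) chart: the centre line is the mean, sigma is estimated from the average moving range, and points beyond three sigma are flagged as special cause variation. A minimal illustration (the data are invented):

```python
def xmr_limits(data):
    """Individuals (XmR) control chart limits: centre line at the mean,
    sigma estimated from the mean moving range divided by d2 = 1.128."""
    mean = sum(data) / len(data)
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    sigma = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    return mean - 3 * sigma, mean, mean + 3 * sigma

# Weekly measurements with one shifted value at index 8
data = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0, 13.5, 10.1]
lcl, centre, ucl = xmr_limits(data)
special = [x for x in data if x < lcl or x > ucl]   # special cause signals
```

The common cause scatter around 10 stays inside the limits; only the shifted point is signalled, which is the distinction between common and special cause variation the abstract describes.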
Statistical Approaches to Functional Neuroimaging Data
DuBois Bowman, F; Guo, Ying; Derado, Gordana
2007-01-01
Synopsis The field of statistics makes valuable contributions to functional neuroimaging research by establishing procedures for the design and conduct of neuroimaging experiments and by providing tools for objectively quantifying and measuring the strength of scientific evidence provided by the data. Two common functional neuroimaging research objectives include detecting brain regions that reveal task-related alterations in measured brain activity (activations) and identifying highly correlated brain regions that exhibit similar patterns of activity over time (functional connectivity). In this article, we highlight various statistical procedures for analyzing data from activation studies and from functional connectivity studies, focusing on functional magnetic resonance imaging (fMRI) and positron emission tomography (PET) data. We also discuss emerging statistical methods for prediction using fMRI and PET data, which stand to increase the translational significance of functional neuroimaging data to clinical practice. PMID:17983962
Statistical concepts in metrology with a postscript on statistical graphics
NASA Astrophysics Data System (ADS)
Ku, Harry H.
1988-08-01
Statistical Concepts in Metrology was originally written as Chapter 2 for the Handbook of Industrial Metrology published by the American Society of Tool and Manufacturing Engineers, 1967. It was reprinted as one of 40 papers in NBS Special Publication 300, Volume 1, Precision Measurement and Calibration; Statistical Concepts and Procedures, 1969. Since then this chapter has been used as basic text in statistics in Bureau-sponsored courses and seminars, including those for Electricity, Electronics, and Analytical Chemistry. While concepts and techniques introduced in the original chapter remain valid and appropriate, some additions on recent development of graphical methods for the treatment of data would be useful. Graphical methods can be used effectively to explore information in data sets prior to the application of classical statistical procedures. For this reason additional sections on statistical graphics are added as a postscript.
ERIC Educational Resources Information Center
La Spata, Michelle G.; Carter, Christopher W.; Johnson, Wendi L.; McGill, Ryan J.
2016-01-01
The present study examined the utility of video self-modeling (VSM) for reducing externalizing behaviors (e.g., aggression, conduct problems, hyperactivity, and impulsivity) observed within the classroom environment. After identification of relevant target behaviors, VSM interventions were developed for first and second grade students (N = 4),…
Chekanov, S.; Levchenko, B. B.; High Energy Physics; Skobeltsyn Inst. of Nuclear Physics
2007-01-01
An empirical principle for the construction of a linear relationship between the total angular momentum and squared-mass of baryons is proposed. In order to examine linearity of the trajectories, a rigorous least-squares regression analysis was performed. Unlike the standard Regge-Chew-Frautschi approach, the constructed trajectories do not have nonlinear behavior. A similar regularity may exist for lowest-mass mesons. The linear baryonic trajectories are well described by a semiclassical picture based on a spinning relativistic string with tension. The obtained numerical solution of this model was used to extract the (di)quark masses.
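The linear relationship between total angular momentum and squared mass discussed above takes the familiar Regge form; in the semiclassical spinning-string picture the slope is fixed by the string tension. The relations below are the standard textbook forms, quoted for orientation rather than taken from the paper's fits:

```latex
% Linear (Regge-like) trajectory relating total angular momentum J
% and squared mass M^2, with intercept \alpha_0 and slope \alpha':
J = \alpha_0 + \alpha' M^2
% For a rigidly rotating relativistic string of tension \sigma with
% massless ends, the classical slope is
\alpha' = \frac{1}{2\pi\sigma}
```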
Constructing the Exact Significance Level for a Person-Fit Statistic.
ERIC Educational Resources Information Center
Liou, Michelle; Chang, Chih-Hsin
1992-01-01
An extension is proposed for the network algorithm introduced by C.R. Mehta and N.R. Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. A simulation study indicates the efficiency of the algorithm. (SLD)
ERIC Educational Resources Information Center
Buchanan, Taylor L.; Lohse, Keith R.
2016-01-01
We surveyed researchers in the health and exercise sciences to explore different areas and magnitudes of bias in researchers' decision making. Participants were presented with scenarios (testing a central hypothesis with p = 0.06 or p = 0.04) in a random order and surveyed about what they would do in each scenario. Participants showed significant…
Deriving statistical significance maps for support vector regression using medical imaging data.
Gaonkar, Bilwaj; Sotiras, Aristeidis; Davatzikos, Christos
2013-01-01
Regression analysis involves predicting a continuous variable using imaging data. The Support Vector Regression (SVR) algorithm has previously been used in addressing regression analysis in neuroimaging. However, identifying the regions of the image that the SVR uses to model the dependence of a target variable remains an open problem. It is an important issue when one wants to biologically interpret the meaning of a pattern that predicts the variable(s) of interest, and therefore to understand normal or pathological processes. One possible approach to the identification of these regions is the use of permutation testing. Permutation testing involves 1) generation of a large set of 'null SVR models' using randomly permuted sets of target variables, and 2) comparison of the SVR model trained using the original labels to the set of null models. These permutation tests often require prohibitively long computational time. Recent work in support vector classification shows that it is possible to analytically approximate the results of permutation testing in medical image analysis. We propose an analogous approach to approximate permutation testing based analysis for support vector regression with medical imaging data. In this paper we present 1) the theory behind our approximation, and 2) experimental results using two real datasets.
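The explicit permutation test that the paper seeks to approximate can be sketched in a few lines. Here the test statistic is a simple absolute covariance standing in for the SVR weight map (an assumption for illustration); the permutation logic itself is unchanged:

```python
import random

def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Permutation p-value for the association between x and y: shuffle
    the targets, recompute the statistic, and count how often the null
    value meets or exceeds the observed one."""
    rng = random.Random(seed)

    def stat(xs, ys):
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        return abs(sum((a - mx) * (b - my) for a, b in zip(xs, ys)))

    observed = stat(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if stat(x, y_perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one correction

x = list(range(20))
y = [2 * v for v in x]                 # perfectly linear relationship
p = permutation_pvalue(x, y)
```

Even this toy version needs hundreds of refits; with an SVR retrained per permutation over whole-brain images the cost becomes prohibitive, which motivates the paper's analytic approximation.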
Statistical Significance, Effect Size Reporting, and Confidence Intervals: Best Reporting Strategies
ERIC Educational Resources Information Center
Capraro, Robert M.
2004-01-01
With great interest the author read the May 2002 editorial in the "Journal for Research in Mathematics Education (JRME)" (King, 2002) regarding changes to the 5th edition of the "Publication Manual of the American Psychological Association" (APA, 2001). Of special note to him, and of great import to the field of mathematics education research, are…
ERIC Educational Resources Information Center
Bothe, Anne K.; Richardson, Jessica D.
2011-01-01
Purpose: To discuss constructs and methods related to assessing the magnitude and the meaning of clinical outcomes, with a focus on applications in speech-language pathology. Method: Professionals in medicine, allied health, psychology, education, and many other fields have long been concerned with issues referred to variously as practical…
ERIC Educational Resources Information Center
Thompson, Bruce
After presenting a general linear model as a framework for discussion, this paper reviews five methodology errors that occur in educational research: (1) the use of stepwise methods; (2) the failure to consider in result interpretation the context specificity of analytic weights (e.g., regression beta weights, factor pattern coefficients,…
Efforts to improve international migration statistics: a historical perspective.
Kraly, E P; Gnanasekaran, K S
1987-01-01
During the past decade, the international statistical community has made several efforts to develop standards for the definition, collection, and publication of statistics on international migration. This article surveys the history of official initiatives to standardize international migration statistics by reviewing the recommendations of the International Statistical Institute, the International Labor Organization, and the UN, and reports a recently proposed agenda for moving toward comparability among national statistical systems. Heightening awareness of the benefits of exchange and creating motivation to implement international standards requires a three-pronged effort from the international statistical community. First, it is essential to continue discussion about the significance of improvement, specifically standardization, of international migration statistics. The move from theory to practice in this area requires ongoing focus by migration statisticians so that conformity to international standards itself becomes a criterion by which national statistical practices are examined and assessed. Second, countries should be provided with technical documentation to support and facilitate the implementation of the recommended statistical systems. Documentation should be developed with an understanding that conformity to international standards for migration and travel statistics must be achieved within existing national statistical programs. Third, the call for statistical research in this area requires more effort from the community of migration statisticians, beginning with the mobilization of bilateral and multilateral resources to undertake the preceding list of activities. PMID:12280924
NASA Astrophysics Data System (ADS)
Combes, Frédéric; Trescher, Maximilian; Piéchon, Frédéric; Fuchs, Jean-Noël
2016-10-01
We develop a theory for the analytic computation of the free energy of band insulators in the presence of a uniform and constant electric field. The two key ingredients are a perturbation-like expression of the Wannier-Stark energy spectrum of electrons and a modified statistical mechanics approach involving a local chemical potential in order to deal with the unbounded spectrum and impose the physically relevant electronic filling. At first order in the field, we recover the result of King-Smith, Vanderbilt, and Resta for the electric polarization in terms of a Zak phase—albeit at finite temperature—and, at second order, deduce a general formula for the electric susceptibility, or equivalently for the dielectric constant. Advantages of our method are the validity of the formalism both at zero and finite temperature and the easy computation of higher order derivatives of the free energy. We verify our findings on two different one-dimensional tight-binding models.
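The first-order result recovered above is the King-Smith-Vanderbilt-Resta polarization; in one dimension its standard zero-temperature Zak-phase form, quoted here for orientation with the usual Bloch-state notation, is:

```latex
% Electric polarization of a 1D band insulator from the Zak phase of the
% occupied Bloch states u_{nk} (defined modulo a polarization quantum e):
P = \frac{e}{2\pi} \sum_{n \in \mathrm{occ}} \int_{\mathrm{BZ}} dk \,
    \langle u_{nk} | \, i \, \partial_k \, | u_{nk} \rangle
% The second-order coefficient, the electric susceptibility, follows as a
% derivative of the free energy F with respect to the field E:
\chi = -\left. \frac{\partial^2 F}{\partial E^2} \right|_{E=0}
```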
Statistical label fusion with hierarchical performance models
Asman, Andrew J.; Dagley, Alexander S.; Landman, Bennett A.
2014-01-01
Label fusion is a critical step in many image segmentation frameworks (e.g., multi-atlas segmentation) as it provides a mechanism for generalizing a collection of labeled examples into a single estimate of the underlying segmentation. In the multi-label case, typical label fusion algorithms treat all labels equally – fully neglecting the known, yet complex, anatomical relationships exhibited in the data. To address this problem, we propose a generalized statistical fusion framework using hierarchical models of rater performance. Building on the seminal work in statistical fusion, we reformulate the traditional rater performance model from a multi-tiered hierarchical perspective. This new approach provides a natural framework for leveraging known anatomical relationships and accurately modeling the types of errors that raters (or atlases) make within a hierarchically consistent formulation. Herein, we describe several contributions. First, we derive a theoretical advancement to the statistical fusion framework that enables the simultaneous estimation of multiple (hierarchical) performance models within the statistical fusion context. Second, we demonstrate that the proposed hierarchical formulation is highly amenable to the state-of-the-art advancements that have been made to the statistical fusion framework. Lastly, in an empirical whole-brain segmentation task we demonstrate substantial qualitative and significant quantitative improvement in overall segmentation accuracy. PMID:24817809
Rock Statistics at the Mars Pathfinder Landing Site, Roughness and Roving on Mars
NASA Technical Reports Server (NTRS)
Haldemann, A. F. C.; Bridges, N. T.; Anderson, R. C.; Golombek, M. P.
1999-01-01
Several rock counts have been carried out at the Mars Pathfinder landing site producing consistent statistics of rock coverage and size-frequency distributions. These rock statistics provide a primary element of "ground truth" for anchoring remote sensing information used to pick the Pathfinder, and future, landing sites. The observed rock population statistics should also be consistent with the emplacement and alteration processes postulated to govern the landing site landscape. The rock population databases can however be used in ways that go beyond the calculation of cumulative number and cumulative area distributions versus rock diameter and height. Since the spatial parameters measured to characterize each rock are determined with stereo image pairs, the rock database serves as a subset of the full landing site digital terrain model (DTM). Insofar as a rock count can be carried out in a speedier, albeit coarser, manner than the full DTM analysis, rock counting offers several operational and scientific products in the near term. Quantitative rock mapping adds further information to the geomorphic study of the landing site, and can also be used for rover traverse planning. Statistical analysis of the surface roughness using the rock count proxy DTM is sufficiently accurate when compared to the full DTM to compare with radar remote sensing roughness measures, and with rover traverse profiles.
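The cumulative size-frequency distributions mentioned above are straightforward to compute from a rock-count database. A minimal sketch (invented counts; the real databases also record heights and stereo-derived positions):

```python
def cumulative_size_frequency(diameters, area_m2):
    """Cumulative size-frequency distribution: for each measured rock,
    the number of rocks per square metre with diameter >= that rock's."""
    ds = sorted(diameters, reverse=True)
    return [(d, (i + 1) / area_m2) for i, d in enumerate(ds)]

# Four rocks counted over a 10 m^2 patch (diameters in metres)
sfd = cumulative_size_frequency([0.1, 0.2, 0.2, 0.5], 10.0)
```

Plotting such curves against diameter is how the counts are compared with remote-sensing rock-abundance models when certifying landing sites.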
Research Design and Statistical Design.
ERIC Educational Resources Information Center
Szymanski, Edna Mora
1993-01-01
Presents fourth editorial in series, this one describing research design and explaining its relationship to statistical design. Research design, validity, and research approaches are examined, quantitative research designs and hypothesis testing are described, and control and statistical designs are discussed. Concludes with section on the art of…
Explorations in Statistics: Confidence Intervals
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2009-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This third installment of "Explorations in Statistics" investigates confidence intervals. A confidence interval is a range that we expect, with some level of confidence, to include the true value of a population parameter…
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Bosch, Stephen; Ink, Gary; Lofquist, William S.
1998-01-01
Provides data on prices of U.S. and foreign materials; book title output and average prices, 1996 final and 1997 preliminary figures; book sales statistics, 1997--AAP preliminary estimates; U.S. trade in books, 1997; international book title output, 1990-95; book review media statistics; and number of book outlets in the U.S. and Canada. (PEN)
Representational Versatility in Learning Statistics
ERIC Educational Resources Information Center
Graham, Alan T.; Thomas, Michael O. J.
2005-01-01
Statistical data can be represented in a number of qualitatively different ways, the choice depending on the following three conditions: the concepts to be investigated; the nature of the data; and the purpose for which they were collected. This paper begins by setting out frameworks that describe the nature of statistical thinking in schools, and…
Motivating Play Using Statistical Reasoning
ERIC Educational Resources Information Center
Cross Francis, Dionne I.; Hudson, Rick A.; Lee, Mi Yeon; Rapacki, Lauren; Vesperman, Crystal Marie
2014-01-01
Statistical literacy is essential in everyone's personal lives as consumers, citizens, and professionals. To make informed life and professional decisions, students are required to read, understand, and interpret vast amounts of information, much of which is quantitative. To develop statistical literacy so students are able to make sense of…
Statistical Methods in Psychology Journals.
ERIC Educational Resources Information Center
Wilkinson, Leland
1999-01-01
Proposes guidelines for revising the American Psychological Association (APA) publication manual or other APA materials to clarify the application of statistics in research reports. The guidelines are intended to induce authors and editors to recognize the thoughtless application of statistical methods. Contains 54 references. (SLD)
Computing contingency statistics in parallel.
Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre
2010-09-01
Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and χ² independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
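The map-reduce pattern for contingency tables can be sketched simply: each processor counts co-occurrences in its block, the tables merge by addition, and derived statistics such as χ² come from the merged counts. A minimal illustration (not the paper's open-source implementation):

```python
from collections import Counter
from itertools import product

def merge_tables(tables):
    """Reduce step: contingency tables computed on separate data blocks
    merge by adding their co-occurrence counts."""
    merged = Counter()
    for t in tables:
        merged.update(t)
    return merged

def chi2_statistic(table):
    """Pearson chi-squared independence statistic from a joint-count
    table keyed by (row_category, column_category)."""
    n = sum(table.values())
    rows, cols = Counter(), Counter()
    for (r, c), k in table.items():
        rows[r] += k
        cols[c] += k
    chi2 = 0.0
    for r, c in product(rows, cols):
        expected = rows[r] * cols[c] / n
        chi2 += (table.get((r, c), 0) - expected) ** 2 / expected
    return chi2

# Per-processor blocks of (category, outcome) counts
t1 = Counter({("a", "x"): 10, ("a", "y"): 5, ("b", "x"): 5})
t2 = Counter({("a", "y"): 5, ("b", "x"): 5, ("b", "y"): 10})
table = merge_tables([t1, t2])
```

The communication cost of the reduce step grows with the number of distinct keys, which is the scalability limitation the abstract contrasts with fixed-size moment-based statistics.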
Education Statistics Quarterly, Spring 2001.
ERIC Educational Resources Information Center
Education Statistics Quarterly, 2001
2001-01-01
The "Education Statistics Quarterly" gives a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products and funding opportunities developed over a 3-month period. Each issue also…
SOCR: Statistics Online Computational Resource
ERIC Educational Resources Information Center
Dinov, Ivo D.
2006-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an…
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Bosch, Stephen; Ink, Gary; Greco, Albert N.
1999-01-01
Presents: "Prices of United States and Foreign Published Materials"; "Book Title Output and Average Prices"; "Book Sales Statistics, 1998"; "United States Book Exports and Imports: 1998"; "International Book Title Output: 1990-96"; "Number of Book Outlets in the United States and Canada"; and "Book Review Media Statistics". (AEF)
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Sullivan, Sharon G.; Ink, Gary; Grabois, Andrew; Barr, Catherine
2001-01-01
Includes six articles that discuss research and statistics relating to the book trade. Topics include prices of U.S. and foreign materials; book title output and average prices; book sales statistics; book exports and imports; book outlets in the U.S. and Canada; and books and other media reviewed. (LRW)
Book Trade Research and Statistics.
ERIC Educational Resources Information Center
Alexander, Adrian W.; And Others
1994-01-01
The six articles in this section examine prices of U.S. and foreign materials; book title output and average prices; book sales statistics; U.S. book exports and imports; number of book outlets in the United States and Canada; and book review media statistics. (LRW)
Education Statistics Quarterly, Fall 2000.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2000-01-01
The "Education Statistics Quarterly" gives a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released during a 3-month period. Each issue also contains a message from…
Students' Attitudes toward Statistics (STATS).
ERIC Educational Resources Information Center
Sutarso, Toto
The purposes of this study were to develop an instrument to measure students' attitude toward statistics (STATS), and to define the underlying dimensions that comprise the STATS. The instrument consists of 24 items. The sample included 79 male and 97 female students from the statistics classes at the College of Education and the College of…
Statistical Factors in Complexation Reactions.
ERIC Educational Resources Information Center
Chung, Chung-Sun
1985-01-01
Four cases which illustrate statistical factors in complexation reactions (where two of the reactants are monodentate ligands) are presented. Included are tables showing statistical factors for the reactions of: (1) square-planar complexes; (2) tetrahedral complexes; and (3) octahedral complexes. (JN)
Will health fund rationalisation lead to significant premium reductions?
Hanning, Brian
2003-01-01
It has been suggested that rationalisation of health funds will generate significant albeit unquantified cost savings and thus hold or reduce health fund premiums. 2001-2 Private Health Industry Administration Council (PHIAC) data has been used to analyse these suggestions. Payments by funds for clinical services will not vary after fund rationalisation. The savings after rationalisation will arise from reductions in management expenses, which form 10.9% of total fund expenditure. A number of rationalisation scenarios are considered. The highest theoretical industry-wide saving found in any plausible scenario is 2.5%, and it is uncertain whether this level of saving could be achieved in practice. If a one-off saving of this order were achieved, it would have no medium- and long-term impact on fund premium increases, given that funds are facing cost increases of 4% to 5% per annum due to demographic changes and age-standardised utilisation increases. It is suggested that discussions on fund amalgamation divert attention from the major factors increasing fund costs, which are substantially beyond fund control.
Design of order statistics filters using feedforward neural networks
NASA Astrophysics Data System (ADS)
Maslennikova, Yu. S.; Bochkarev, V. V.
2016-08-01
In recent years significant progress has been made in the development of nonlinear data processing techniques. Such techniques are widely used in digital data filtering and image enhancement. Many of the most effective nonlinear filters are based on order statistics; the widely used median filter is the best known order statistic filter. A generalized form of these filters can be constructed based on Lloyd's statistics. Filters based on order statistics have excellent robustness properties in the presence of impulsive noise. In this paper, we present a special approach for the synthesis of order statistics filters using artificial neural networks. Optimal Lloyd's statistics are used for selecting the initial weights of the neural network. The adaptive properties of neural networks provide opportunities to optimize order statistics filters for data with asymmetric distribution functions. Different examples demonstrate the properties and performance of the presented approach.
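The class of filters described in this abstract can be illustrated with a minimal sketch, assuming a simple 1-D signal; the `l_filter` name, edge padding, and weight convention are illustrative choices, not the authors' implementation. An L-filter sorts each sliding window and takes a weighted sum of the ranked samples; putting all weight on the middle rank recovers the median filter.

```python
import numpy as np

def l_filter(signal, weights):
    # Order-statistic (L-)filter: sort each sliding window, then take a
    # weighted sum of the ranked samples. A weight vector like [0, 1, 0]
    # (all weight on the middle rank) reproduces the median filter.
    k = len(weights)
    w = np.asarray(weights, dtype=float)
    pad = k // 2
    padded = np.pad(np.asarray(signal, dtype=float), pad, mode="edge")
    out = np.empty(len(signal))
    for i in range(len(signal)):
        window = np.sort(padded[i : i + k])  # ranked samples of the window
        out[i] = w @ window
    return out
```

In the spirit of the abstract, the weight vector is exactly what a neural network initialized from Lloyd's statistics would adapt for a given noise distribution.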
Impaired Statistical Learning in Developmental Dyslexia
Thiessen, Erik D.; Holt, Lori L.
2015-01-01
Purpose: Developmental dyslexia (DD) is commonly thought to arise from phonological impairments. However, an emerging perspective is that a more general procedural learning deficit, not specific to phonological processing, may underlie DD. The current study examined whether individuals with DD are capable of extracting statistical regularities across sequences of passively experienced speech and nonspeech sounds. Such statistical learning is believed to be domain-general, to draw upon procedural learning systems, and to relate to language outcomes. Method: DD and control groups were familiarized with a continuous stream of syllables or sine-wave tones, the ordering of which was defined by high or low transitional probabilities across adjacent stimulus pairs. Participants subsequently judged two 3-stimulus test items with either high or low statistical coherence as being the most similar to the sounds heard during familiarization. Results: As with control participants, the DD group was sensitive to the transitional probability structure of the familiarization materials, as evidenced by above-chance performance. However, the performance of participants with DD was significantly poorer than controls across linguistic and nonlinguistic stimuli. In addition, reading-related measures were significantly correlated with statistical learning performance for both speech and nonspeech material. Conclusion: Results are discussed in light of procedural learning impairments among participants with DD. PMID:25860795
Students' attitudes towards learning statistics
NASA Astrophysics Data System (ADS)
Ghulami, Hassan Rahnaward; Hamid, Mohd Rashid Ab; Zakaria, Roslinazairimah
2015-05-01
A positive attitude towards learning is vital in order to master the core content of the subject matter under study. This is no exception in learning a statistics course, especially at the university level. Therefore, this study investigates students' attitudes towards learning statistics. Six variables or constructs were identified: affect, cognitive competence, value, difficulty, interest, and effort. The instrument used for the study is a questionnaire adopted and adapted from the reliable Survey of Attitudes Towards Statistics (SATS©) instrument. The study was conducted on engineering undergraduate students at a university on the East Coast of Malaysia. The respondents consisted of students taking the applied statistics course from different faculties. The results are analysed in terms of descriptive analysis and contribute to a descriptive understanding of students' attitudes towards the teaching and learning process of statistics.
Probability, Information and Statistical Physics
NASA Astrophysics Data System (ADS)
Kuzemsky, A. L.
2016-03-01
In this short survey review we discuss foundational issues of the probabilistic approach to information theory and statistical mechanics from a unified standpoint. Emphasis is on the inter-relations between the theories. The basic aim is tutorial, i.e., to give a basic introduction to the analysis and applications of probabilistic concepts in the description of various aspects of complexity and stochasticity. We consider probability as a foundational concept in statistical mechanics and review selected advances in the theoretical understanding of the interrelation of probability, information, and statistical description with regard to basic notions of the statistical mechanics of complex systems. The review also includes a synthesis of past and present research and a survey of methodology. The purpose of this terse overview is to discuss and partially describe those probabilistic methods and approaches that are used in statistical mechanics, with the aim of making these ideas easier to understand and to apply.
Statistical Thermodynamics and Microscale Thermophysics
NASA Astrophysics Data System (ADS)
Carey, Van P.
1999-08-01
Many exciting new developments in microscale engineering are based on the application of traditional principles of statistical thermodynamics. In this text Van Carey offers a modern view of thermodynamics, interweaving classical and statistical thermodynamic principles and applying them to current engineering systems. He begins with coverage of microscale energy storage mechanisms from a quantum mechanics perspective and then develops the fundamental elements of classical and statistical thermodynamics. Subsequent chapters discuss applications of equilibrium statistical thermodynamics to solid, liquid, and gas phase systems. The remainder of the book is devoted to nonequilibrium thermodynamics of transport phenomena and to nonequilibrium effects and noncontinuum behavior at the microscale. Although the text emphasizes mathematical development, Carey includes many examples and exercises to illustrate how the theoretical concepts are applied to systems of scientific and engineering interest. In the process he offers a fresh view of statistical thermodynamics for advanced undergraduate and graduate students, as well as practitioners, in mechanical, chemical, and materials engineering.
Clinical statistics: five key statistical concepts for clinicians.
Choi, Yong-Geun
2013-10-01
Statistics is the science of data. As the foundation of scientific knowledge, data refers to evidentiary facts obtained from nature by human action, observation, or experiment. Clinicians should be aware of the conditions of good data to support the validity of clinical modalities when reading scientific articles, one of the resources for revising or updating their clinical knowledge and skills. The cause-effect link between a clinical modality and its outcome is ascertained as a pattern statistic. The uniformity of nature guarantees the recurrence of data as the basic scientific evidence. Variation statistics are examined for patterns of recurrence; this provides information on the probability of recurrence of the cause-effect phenomenon. Multiple causal factors of a natural phenomenon need a counterproof of absence in terms of a control group. A pattern of relation between a causal factor and an effect becomes recognizable, and thus should be estimated as a relation statistic. The type and meaning of each relation statistic should be well understood. A study based on a sample from a population of wide variation requires clinicians to be aware of error statistics due to random chance. Incomplete human senses, coarse measurement instruments, and preconceived ideas held as hypotheses all tend to bias research, which gives rise to the necessity of a keen, critical, independent mind with regard to reported data.
The faulty statistics of complementary alternative medicine (CAM).
Pandolfi, Maurizio; Carreras, Giulia
2014-09-01
The authors illustrate the difficulties involved in obtaining a valid statistical significance in clinical studies, especially when the prior probability of the hypothesis under scrutiny is low. Since the prior probability of a research hypothesis is directly related to its scientific plausibility, the commonly used frequentist statistics, which does not take this probability into account, is particularly unsuitable for studies exploring matters in various degrees disconnected from science, such as complementary alternative medicine (CAM) interventions. Any statistical significance obtained in this field should be considered with great caution and may be better applied to more plausible hypotheses (such as a placebo effect) than the one examined, which is usually the specific efficacy of the intervention. Since achieving meaningful statistical significance is an essential step in the validation of medical interventions, CAM practices, producing only outcomes inherently resistant to statistical validation, appear not to belong to modern evidence-based medicine.
Fresh stirrings among statisticians: statistical commentary.
Godfrey, Keith
2016-05-01
For some years there has been unrest in the statistical world regarding the use of the p-value. It has been indicated that the significance of p-values is open to question, which therefore reduces the ability to measure the strength of evidence. This paper examines the use and misuse of the p-value and recommends consideration in its application. PMID:27468598
Indigenous family violence: a statistical challenge.
Cripps, Kyllie
2008-12-01
The issue of family violence and sexual abuse in Indigenous communities across Australia has attracted much attention throughout 2007, including significant intervention by the federal government into communities deemed to be in crisis. This paper critically examines the reporting and recording of Indigenous violence in Australia and reflects on what 'statistics' can offer as we grapple with how to respond appropriately to a problem defined as a 'national emergency'. PMID:19130914
Characterizations of linear sufficient statistics
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Reoner, R.; Decell, H. P., Jr.
1977-01-01
Conditions are established under which a surjective bounded linear operator T from a Banach space X to a Banach space Y is a sufficient statistic for a dominated family of probability measures defined on the Borel sets of X. These results are applied to characterize linear sufficient statistics for families of the exponential type, including as special cases the Wishart and multivariate normal distributions. The latter result is used to establish precisely which procedures for sampling from a normal population have the property that the sample mean is a sufficient statistic.
An introduction to statistical finance
NASA Astrophysics Data System (ADS)
Bouchaud, Jean-Philippe
2002-10-01
We summarize recent research in a rapidly growing field, that of statistical finance, also called ‘econophysics’. There are three main themes in this activity: (i) empirical studies and the discovery of interesting universal features in the statistical texture of financial time series, (ii) the use of these empirical results to devise better models of risk and derivative pricing, of direct interest for the financial industry, and (iii) the study of ‘agent-based models’ in order to unveil the basic mechanisms that are responsible for the statistical ‘anomalies’ observed in financial time series. We give a brief overview of some of the results in these three directions.
The Effect Size Statistic: Overview of Various Choices.
ERIC Educational Resources Information Center
Mahadevan, Lakshmi
Over the years, methodologists have been recommending that researchers use magnitude of effect estimates in result interpretation to highlight the distinction between statistical and practical significance (cf. R. Kirk, 1996). A magnitude of effect statistic (i.e., effect size) tells to what degree the dependent variable can be controlled,…
What's Funny about Statistics? A Technique for Reducing Student Anxiety.
ERIC Educational Resources Information Center
Schacht, Steven; Stewart, Brad J.
1990-01-01
Studied the use of humorous cartoons to reduce the anxiety levels of students in statistics classes. Used the Mathematics Anxiety Rating Scale (MARS) to measure the level of student anxiety before and after a statistics course. Found that there was a significant reduction in levels of mathematics anxiety after the course. (SLM)
Program for standard statistical distributions
NASA Technical Reports Server (NTRS)
Falls, L. W.
1972-01-01
The development of a procedure to describe frequency distributions involved in statistical theory is discussed. Representation of frequency distributions by a first-order differential equation is presented. Classification of various types of distributions based on the Pearson parameters is analyzed.
Statistical ecology comes of age
Gimenez, Olivier; Buckland, Stephen T.; Morgan, Byron J. T.; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M.; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M.; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric
2014-01-01
The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1–4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data. PMID:25540151
Back Pain Facts and Statistics
... Although doctors of chiropractic (DCs) ... time. A few interesting facts about back pain: Low back pain is the single leading cause of disability ...
Statistical description of turbulent dispersion
NASA Astrophysics Data System (ADS)
Brouwers, J. J. H.
2012-12-01
We derive a comprehensive statistical model for the dispersion of passive or almost passive admixture particles, such as fine particulate matter, aerosols, smoke, and fumes, in turbulent flow. The model rests on the Markov limit for particle velocity. It is in accordance with the asymptotic structure of turbulence at large Reynolds number as described by Kolmogorov. The model consists of Langevin and diffusion equations in which the damping and diffusivity are expressed by expansions in powers of the reciprocal Kolmogorov constant C0. We derive solutions of O(C0^0) and O(C0^(-1)). We truncate at O(C0^(-2)), which is shown to result in an error of a few percent in predicted dispersion statistics for representative cases of turbulent flow. We reveal analogies and remarkable differences between the solutions of classical statistical mechanics and those of statistical turbulence.
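The Langevin part of such a dispersion model can be sketched as a discretized Ornstein-Uhlenbeck process. This is a generic illustration, not the paper's C0-expansion model; `tau` (damping time) and `sigma` (velocity scale) are assumed stand-in parameters.

```python
import numpy as np

def simulate_langevin(n_steps, dt, tau, sigma, rng=None):
    # Euler-Maruyama integration of a Langevin model for particle velocity:
    #   dv = -(v / tau) dt + sqrt(2 sigma^2 / tau) dW
    # followed by integration of the particle position, dx = v dt.
    rng = rng if rng is not None else np.random.default_rng(0)
    v = np.zeros(n_steps)
    x = np.zeros(n_steps)
    for i in range(1, n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))  # Wiener increment
        v[i] = v[i - 1] - (v[i - 1] / tau) * dt + np.sqrt(2.0 * sigma**2 / tau) * dW
        x[i] = x[i - 1] + v[i] * dt
    return x, v
```

The stationary velocity variance of this process is sigma^2, which gives a quick sanity check on any implementation.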
The Malpractice of Statistical Interpretation
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
1978-01-01
Problems associated with the use of gain scores, analysis of covariance, multicollinearity, part and partial correlation, and the lack of rectilinearity in regression are discussed. Particular attention is paid to the misuse of statistical techniques. (JKS)
National Center for Health Statistics
... Population Surveys: National Health and Nutrition Examination Survey, National Health Interview Survey, National Survey of Family Growth, Vital Records, National Vital Statistics System, National Death ...
Spina Bifida Data and Statistics
... non-Hispanic white and non-Hispanic black women. Data from 12 state-based birth defects tracking programs ...
Birth Defects Data and Statistics
... of birth defects in the United States. For data on specific birth defects, please visit the specific ...
Hidden Statistics of Schroedinger Equation
NASA Technical Reports Server (NTRS)
Zak, Michail
2011-01-01
Work was carried out in determination of the mathematical origin of randomness in quantum mechanics and creating a hidden statistics of the Schrödinger equation; i.e., to expose the transitional stochastic process as a "bridge" to the quantum world. The governing equations of hidden statistics would preserve such properties of quantum physics as superposition, entanglement, and direct-product decomposability while allowing one to measure its state variables using classical methods.
Statistical Physics, 2nd Edition
NASA Astrophysics Data System (ADS)
Mandl, F.
1989-01-01
The Manchester Physics Series. General Editors: D. J. Sandiford, F. Mandl, A. C. Phillips, Department of Physics and Astronomy, University of Manchester. Titles in the series: Properties of Matter (B. H. Flowers and E. Mendoza); Optics, Second Edition (F. G. Smith and J. H. Thomson); Statistical Physics, Second Edition (F. Mandl); Electromagnetism, Second Edition (I. S. Grant and W. R. Phillips); Statistics (R. J. Barlow); Solid State Physics, Second Edition (J. R. Hook and H. E. Hall); Quantum Mechanics (F. Mandl); Particle Physics, Second Edition (B. R. Martin and G. Shaw); The Physics of Stars, Second Edition (A. C. Phillips); Computing for Scientists (R. J. Barlow and A. R. Barnett). Statistical Physics, Second Edition develops a unified treatment of statistical mechanics and thermodynamics, which emphasises the statistical nature of the laws of thermodynamics and the atomic nature of matter. Prominence is given to the Gibbs distribution, leading to a simple treatment of quantum statistics and of chemical reactions. Undergraduate students of physics and related sciences will find this a stimulating account of the basic physics and its applications. Only an elementary knowledge of kinetic theory and atomic physics, as well as the rudiments of quantum theory, is presupposed for an understanding of this book. Statistical Physics, Second Edition features: a fully integrated treatment of thermodynamics and statistical mechanics; a flow diagram allowing topics to be studied in different orders or omitted altogether; optional "starred" and highlighted sections containing more advanced and specialised material for the more ambitious reader; and sets of problems at the end of each chapter to help student understanding. Hints for solving the problems are given in an Appendix.
[Statistical process control in healthcare].
Anhøj, Jacob; Bjørn, Brian
2009-05-18
Statistical process control (SPC) is a branch of statistical science which comprises methods for the study of process variation. Common cause variation is inherent in any process and predictable within limits. Special cause variation is unpredictable and indicates change in the process. The run chart is a simple tool for analysis of process variation. Run chart analysis may reveal anomalies that suggest shifts or unusual patterns that are attributable to special cause variation. PMID:19454196
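One common run chart check can be sketched in a few lines. The function name and the specific rule (an unusually long run of consecutive points on one side of the median suggests special cause variation) are a standard SPC illustration, not taken from this paper.

```python
import numpy as np

def longest_run(data):
    # Longest run of consecutive points on the same side of the median.
    # Run chart rules treat an unusually long run as a signal of
    # special cause variation (a shift in the process).
    med = np.median(data)
    sides = [x > med for x in data if x != med]  # points on the median are skipped
    if not sides:
        return 0
    best = cur = 1
    for prev, nxt in zip(sides, sides[1:]):
        cur = cur + 1 if prev == nxt else 1
        best = max(best, cur)
    return best
```

For a process in control, runs stay short; a sustained shift produces one long run on one side of the median.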
1993-02-01
In 1984, 99% of abortions conducted in Bombay, India, were of female fetuses. In 1986-87, 30,000-50,000 female fetuses were aborted in India. In 1987-88, 7 Delhi clinics conducted 13,000 sex determination tests. Thus, discrimination against females begins before birth in India. Some states (Maharashtra, Goa, and Gujarat) have drafted legislation to prevent the use of prenatal diagnostic tests (e.g., ultrasonography) for sex determination purposes. Families make decisions about an infant's nutrition based on the infant's sex, so it is not surprising to see a higher incidence of morbidity among girls than boys (e.g., for respiratory infections in 1985, 55.5% vs. 27.3%). Consequently, girls are more likely to die than boys. Even though vasectomy is simpler and safer than tubectomy, the government promotes female sterilizations. The percentage of all sexual sterilizations being tubectomy has increased steadily from 84% to 94% (1986-90). Family planning programs focus on female contraceptive methods, despite the higher incidence of adverse health effects from female methods (e.g., the IUD causes pain and heavy bleeding). Some women's advocates believe the effects to be so great that India should ban contraceptives and injectable contraceptives. The maternal mortality rate is quite high (460/100,000 live births), equaling a lifetime risk of 1:18 of a pregnancy-related death. 70% of these maternal deaths are preventable. Leading causes of maternal death in India are anemia, hemorrhage, eclampsia, sepsis, and abortion. Most pregnant women do not receive prenatal care. Untrained personnel attend about 70% of deliveries in rural areas and 29% in urban areas. Appropriate health services and other interventions would prevent the higher age-specific death rates for females between 0 and 35 years old. Even though the government does provide maternal and child health services, it needs to stop decreasing resource allocation for health and start increasing it. PMID:12286355
Applied extreme-value statistics
Kinnison, R.R.
1983-05-01
The statistical theory of extreme values is a well established part of theoretical statistics. Unfortunately, it is seldom part of applied statistics and is infrequently a part of statistical curricula except in advanced studies programs. This has resulted in the impression that it is difficult to understand and not of practical value. In recent environmental and pollution literature, several short articles have appeared with the purpose of documenting all that is necessary for the practical application of extreme value theory to field problems (for example, Roberts, 1979). These articles are so concise that only a statistician can recognise all the subtleties and assumptions necessary for the correct use of the material presented. The intent of this text is to expand upon several recent articles, and to provide the necessary statistical background so that the non-statistician scientist can recognise an extreme value problem when it occurs in his work, be confident in handling simple extreme value problems himself, and know when the problem is statistically beyond his capabilities and requires consultation.
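As an example of the kind of simple extreme value problem the text has in mind, a Gumbel distribution can be fitted to a series of block maxima (e.g., annual maximum pollutant levels) by the method of moments. The sketch below is illustrative and not drawn from the text itself.

```python
import math

def gumbel_fit_moments(maxima):
    # Method-of-moments fit of a Gumbel distribution to block maxima:
    #   scale    beta = s * sqrt(6) / pi   (s = sample standard deviation)
    #   location mu   = mean - gamma * beta (gamma = Euler-Mascheroni constant)
    n = len(maxima)
    mean = sum(maxima) / n
    var = sum((x - mean) ** 2 for x in maxima) / (n - 1)
    beta = math.sqrt(6.0 * var) / math.pi
    mu = mean - 0.5772156649 * beta
    return mu, beta
```

With the fitted parameters, exceedance probabilities for levels beyond the observed record follow from the Gumbel CDF, exp(-exp(-(x - mu)/beta)).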
Ergodic theorem, ergodic theory, and statistical mechanics
Moore, Calvin C.
2015-01-01
This perspective highlights the mean ergodic theorem established by John von Neumann and the pointwise ergodic theorem established by George Birkhoff, proofs of which were published nearly simultaneously in PNAS in 1931 and 1932. These theorems were of great significance both in mathematics and in statistical mechanics. In statistical mechanics they provided a key insight into a 60-y-old fundamental problem of the subject—namely, the rationale for the hypothesis that time averages can be set equal to phase averages. The evolution of this problem is traced from the origins of statistical mechanics and Boltzmann's ergodic hypothesis to the Ehrenfests' quasi-ergodic hypothesis, and then to the ergodic theorems. We discuss communications between von Neumann and Birkhoff in the Fall of 1931 leading up to the publication of these papers and related issues of priority. These ergodic theorems initiated a new field of mathematical research called ergodic theory that has thrived ever since, and we discuss some of the recent developments in ergodic theory that are relevant for statistical mechanics. PMID:25691697
Statistical Sampling of Tide Heights Study
NASA Technical Reports Server (NTRS)
2002-01-01
The goal of the study was to determine if it was possible to reduce the cost of verifying computational models of tidal waves and currents. Statistical techniques were used to determine the least number of samples required, in a given situation, to remain statistically significant, and thereby reduce overall project costs. Commercial, academic, and Federal agencies could benefit by applying these techniques, without the need to 'touch' every item in the population. For example, the requirement of this project was to measure the heights and times of high and low tides at 8,000 locations for verification of computational models of tidal waves and currents. The application of the statistical techniques began with observations to determine the correctness of submitted measurement data, followed by some assumptions based on the observations. Among the assumptions were that the data were representative of data-collection techniques used at the measurement locations, that time measurements could be ignored (that is, height measurements alone would suffice), and that the height measurements were from a statistically normal distribution. Sample means and standard deviations were determined for all locations. Interval limits were determined for confidence levels of 95, 98, and 99 percent. It was found that the numbers of measurement locations needed to attain these confidence levels were 55, 78, and 96, respectively.
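The sample size calculation underlying such confidence levels follows the standard normal-theory formula n = (z·σ/E)², rounded up, where σ is the population standard deviation and E the acceptable margin of error. The sketch below is a generic illustration; any σ and margin values supplied to it are placeholders, not the study's tide-height figures.

```python
import math

# Two-sided standard normal quantiles for common confidence levels
Z = {0.95: 1.960, 0.98: 2.326, 0.99: 2.576}

def sample_size(sigma, margin, confidence):
    # Smallest n such that a confidence interval for the mean with the
    # given confidence level has half-width at most `margin`:
    #   n = ceil((z * sigma / margin)^2)
    z = Z[confidence]
    return math.ceil((z * sigma / margin) ** 2)
```

As in the study, raising the confidence level from 95 to 99 percent increases the required number of measurement locations.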
Statistical genetics in traditionally cultivated crops.
Artoisenet, Pierre; Minsart, Laure-Anne
2014-11-01
Traditional farming systems have attracted a lot of attention over the past decades, as they have been recognized to supply an important component in the maintenance of genetic diversity worldwide. A broad spectrum of traditionally managed crops has been studied to investigate how reproductive properties, in combination with husbandry characteristics, shape the genetic structure of the crops over time. However, traditional farms typically involve populations of small size whose genetic evolution is overwhelmed by statistical fluctuations inherent to the stochastic nature of the crossings. Hence there is generally no one-to-one mapping between crop properties and measured genotype data, and claims regarding crop properties on the basis of the observed genetic structure must be stated within a confidence level to be estimated by means of a dedicated statistical analysis. In this paper, we propose a comprehensive framework to carry out such statistical analyses. We illustrate the capabilities of our approach by applying it to crops of C. lanatus var. lanatus oleaginous type cultivated in Côte d'Ivoire. While some properties, such as the effective field size, considerably evade the constraints from experimental data, others, such as the mating system, turn out to be characterized with a higher statistical significance. We discuss the importance of our approach for studies on traditionally cultivated crops in general. PMID:24992232
Blinking statistics of silicon quantum dots.
Bruhn, Benjamin; Valenta, Jan; Sangghaleh, Fatemeh; Linnros, Jan
2011-12-14
The blinking statistics of numerous single silicon quantum dots fabricated by electron-beam lithography, plasma etching, and oxidation have been analyzed. Purely exponential on- and off-time distributions were found consistent with the absence of statistical aging. This is in contrast to blinking reports in the literature where power-law distributions prevail as well as observations of statistical aging in nanocrystal ensembles. A linear increase of the switching frequency with excitation power density indicates a domination of single-photon absorption processes, possibly through a direct transfer of charges to trap states without the need for a bimolecular Auger mechanism. Photoluminescence saturation with increasing excitation is not observed; however, there is a threshold in excitation (coinciding with a mean occupation of one exciton per nanocrystal) where a change from linear to square-root increase occurs. Finally, the statistics of blinking of single quantum dots in terms of average on-time, blinking frequency and blinking amplitude reveal large variations (several orders) without any significant correlation demonstrating the individual microscopic character of each quantum dot.
Statistical Treatment of Looking-Time Data
2016-01-01
Looking times (LTs) are frequently measured in empirical research on infant cognition. We analyzed the statistical distribution of LTs across participants to develop recommendations for their treatment in infancy research. Our analyses focused on a common within-subject experimental design, in which longer looking to novel or unexpected stimuli is predicted. We analyzed data from 2 sources: an in-house set of LTs that included data from individual participants (47 experiments, 1,584 observations), and a representative set of published articles reporting group-level LT statistics (149 experiments from 33 articles). We established that LTs are log-normally distributed across participants, and therefore, should always be log-transformed before parametric statistical analyses. We estimated the typical size of significant effects in LT studies, which allowed us to make recommendations about setting sample sizes. We show how our estimate of the distribution of effect sizes of LT studies can be used to design experiments to be analyzed by Bayesian statistics, where the experimenter is required to determine in advance the predicted effect size rather than the sample size. We demonstrate the robustness of this method in both sets of LT experiments. PMID:26845505
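The paper's central recommendation, log-transforming LTs before parametric analysis, can be sketched as follows; the function name and any example numbers are illustrative, not the authors' code or data.

```python
import numpy as np

def paired_t_on_logs(lt_novel, lt_familiar):
    # Looking times are log-normally distributed across participants,
    # so log-transform them before a parametric test. Here: a paired
    # t statistic on the per-participant difference of log LTs.
    d = np.log(np.asarray(lt_novel, dtype=float)) - np.log(np.asarray(lt_familiar, dtype=float))
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```

On the log scale the paired difference is the log of the novel/familiar LT ratio, which is the natural effect measure for multiplicative data.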
DNA statistics, overlapping word paradox and Conway equation
Pevzner, P.A.
1993-12-31
The overlapping word paradox, known in combinatorics for 20 years, is to this day disregarded in many papers on DNA statistics. The author considers the Conway equation for the 'best bet for simpletons' game as an example of the overlapping word paradox. He gives a new short proof of the Conway equation and discusses the implications of the overlapping word paradox for DNA statistics. In particular, he demonstrates that ignoring the overlapping word paradox in DNA statistics can easily lead to 500% errors in estimations of statistical significance. He also presents formulas allowing one to find 'anomalous' words in DNA texts.
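The overlap effect admits a compact statement: the expected waiting time for a word over an alphabet of size q is the sum of q^k over every length k at which the word's prefix equals its suffix (its autocorrelation). A minimal sketch, with an illustrative function name:

```python
def expected_wait(pattern, alphabet_size):
    # Expected number of symbols until `pattern` first appears in a
    # uniform i.i.d. stream: sum q^k over all k where the length-k
    # prefix of the pattern equals its length-k suffix.
    total = 0
    for k in range(1, len(pattern) + 1):
        if pattern[:k] == pattern[-k:]:
            total += alphabet_size ** k
    return total
```

This is the crux of the paradox: over a binary alphabet, 'HH' (which overlaps itself) has expected waiting time 6, while 'HT' has only 4, even though both words are equally likely at any fixed position.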
ERIC Educational Resources Information Center
Martins, Jose Alexandre; Nascimento, Maria Manuel; Estrada, Assumpta
2012-01-01
Teachers' attitudes towards statistics can have a significant effect on their own statistical training, their teaching of statistics, and the future attitudes of their students. The influence of attitudes in teaching statistics in different contexts was previously studied in the work of Estrada et al. (2004, 2010a, 2010b) and Martins et al.…
On More Sensitive Periodogram Statistics
NASA Astrophysics Data System (ADS)
Bélanger, G.
2016-05-01
Period searches in event data have traditionally used the Rayleigh statistic, R². For X-ray pulsars, the standard has been the Z² statistic, which sums over more than one harmonic. For γ-rays, the H-test, which optimizes the number of harmonics to sum, is often used. These periodograms all suffer from the same problem, namely artifacts caused by correlations in the Fourier components that arise from testing frequencies with a non-integer number of cycles. This article addresses this problem. The modified Rayleigh statistic is discussed, its generalization to any harmonic, 𝓡²ₖ, is formulated, and from the latter, the modified Z² statistic, 𝓩², is constructed. Versions of these statistics for binned data and point measurements are derived, and it is shown that the variance in the uncertainties can have an important influence on the periodogram. It is shown how to combine the information about the signal frequency from the different harmonics to estimate its value with maximum accuracy. The methods are applied to an XMM-Newton observation of the Crab pulsar for which a decomposition of the pulse profile is presented, and shows that most of the power is in the second, third, and fifth harmonics. The statistical detection power of the 𝓡²ₖ statistic is superior to the FFT and equivalent to the Lomb-Scargle (LS). Response to gaps in the data is assessed, and it is shown that the LS does not protect against the distortions they cause. The main conclusion of this work is that the classical R² and Z² should be replaced by 𝓡²ₖ and 𝓩² in all applications with event data, and the LS should be replaced by the 𝓡²ₖ when the uncertainty varies from one point measurement to another.
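For reference, the classical (unmodified) R² and Z²_n statistics that the article proposes to replace can be written down compactly. This is a plain-Python sketch of the textbook definitions, not of the modified statistics derived in the article:

```python
import math

def rayleigh_r2(times, freq):
    """Classical Rayleigh statistic at a trial frequency:
    R^2 = (2/N) * [ (sum cos wt)^2 + (sum sin wt)^2 ],
    approximately chi-square with 2 dof under the null of no periodicity."""
    w = 2 * math.pi * freq
    c = sum(math.cos(w * t) for t in times)
    s = sum(math.sin(w * t) for t in times)
    return 2.0 * (c * c + s * s) / len(times)

def z2_n(times, freq, n_harm=2):
    """Classical Z^2_n statistic: Rayleigh powers summed over
    the first n_harm harmonics of the trial frequency."""
    return sum(rayleigh_r2(times, k * freq) for k in range(1, n_harm + 1))

# Strictly periodic event times at 1 Hz give the maximal power 2N.
events = [float(i) for i in range(100)]
print(rayleigh_r2(events, 1.0))  # 200.0 for N = 100 events
```

Evaluating either statistic on a grid of trial frequencies yields the periodogram; the correlations between Fourier components at non-integer cycle counts, which motivate the modified versions, show up as structure in exactly this kind of scan.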
FRB repetition and non-Poissonian statistics
NASA Astrophysics Data System (ADS)
Connor, Liam; Pen, Ue-Li; Oppermann, Niels
2016-05-01
We discuss some of the claims that have been made regarding the statistics of fast radio bursts (FRBs). In an earlier Letter, we conjectured that flicker noise associated with FRB repetition could show up in non-cataclysmic neutron star emission models, like supergiant pulses. We show how the current limits of repetition would be significantly weakened if their repeat rate really were non-Poissonian and had a pink or red spectrum. Repetition and its statistics have implications for observing strategy, generally favouring shallow wide-field surveys, since in the non-repeating scenario survey depth is unimportant. We also discuss the statistics of the apparent latitudinal dependence of FRBs, and offer a simple method for calculating the significance of this effect. We provide a generalized Bayesian framework for addressing this problem, which allows for direct model comparison. It is shown how the evidence for a steep latitudinal gradient of the FRB rate is less strong than initially suggested and simple explanations like increased scattering and sky temperature in the plane are sufficient to decrease the low-latitude burst rate, given current data. The reported dearth of bursts near the plane is further complicated if FRBs have non-Poissonian repetition, since in that case the event rate inferred from observation depends on observing strategy.
Statistical analysis of diversification with species traits.
Paradis, Emmanuel
2005-01-01
Testing whether some species traits have a significant effect on diversification rates is central in the assessment of macroevolutionary theories. However, we still lack a powerful method to tackle this objective. I present a new method for the statistical analysis of diversification with species traits. The required data are observations of the traits on recent species, the phylogenetic tree of these species, and reconstructions of ancestral values of the traits. Several traits, either continuous or discrete, and in some cases their interactions, can be analyzed simultaneously. The parameters are estimated by the method of maximum likelihood. The statistical significance of the effects in a model can be tested with likelihood ratio tests. A simulation study showed that past random extinction events do not affect the Type I error rate of the tests, whereas statistical power is decreased, though some power is retained if the effect of the simulated trait on speciation is strong. The use of the method is illustrated by the analysis of published data on primates. The analysis of these data showed that the apparent overall positive relationship between body mass and species diversity is actually an artifact due to a clade-specific effect. Within each clade the effect of body mass on speciation rate was in fact negative. The present method allows both effects (clade and body mass) to be taken into account simultaneously.
Statistical Evaluation of Small-scale Explosives Testing
NASA Astrophysics Data System (ADS)
Guymon, Clint
2013-06-01
Small-scale explosives sensitivity testing is used to qualitatively and quantitatively evaluate risk. Both relative comparison and characterization of the transition from no reaction to reaction is used to estimate that risk. Statistical comparisons and use of statistically efficient methods are critical to accurately and efficiently make risk related decisions. Many public and private entities are not making accurate decisions based on the test data because of the lack of properly applying basic statistical principles. We present methods and examples showing how to use statistics to accurately and efficiently evaluate the risk for relative comparison and in-process risk evaluation. Some of the methods presented include the Significance Chart Method and adaptive step-size techniques like the Neyer D-Optimal method. These methods are compared to the more traditional approaches like Bruceton and Probit. Use of statistical methods can significantly improve the efficiency, accuracy, and applicability of small-scale explosives sensitivity testing.
Assay optimization: a statistical design of experiments approach.
Altekar, Maneesha; Homon, Carol A; Kashem, Mohammed A; Mason, Steven W; Nelson, Richard M; Patnaude, Lori A; Yingling, Jeffrey; Taylor, Paul B
2007-03-01
With the transition from manual to robotic HTS in the last several years, assay optimization has become a significant bottleneck. Recent advances in robotic liquid handling have made it feasible to reduce assay optimization timelines with the application of statistically designed experiments. When implemented, they can efficiently optimize assays by rapidly identifying significant factors, complex interactions, and nonlinear responses. This article focuses on the use of statistically designed experiments in assay optimization.
Integrable matrix theory: Level statistics
NASA Astrophysics Data System (ADS)
Scaramazza, Jasen A.; Shastry, B. Sriram; Yuzbashyan, Emil A.
2016-09-01
We study level statistics in ensembles of integrable N × N matrices linear in a real parameter x. The matrix H(x) is considered integrable if it has a prescribed number n > 1 of linearly independent commuting partners Hi(x) (integrals of motion), [Hi(x), Hj(x)] = [H(x), Hi(x)] = 0, for all x. In a recent work [Phys. Rev. E 93, 052114 (2016), 10.1103/PhysRevE.93.052114], we developed a basis-independent construction of H(x) for any n from which we derived the probability density function, thereby determining how to choose a typical integrable matrix from the ensemble. Here, we find that typical integrable matrices have Poisson statistics in the N → ∞ limit provided n scales at least as log N; otherwise, they exhibit level repulsion. Exceptions to the Poisson case occur at isolated coupling values x = x0 or when correlations are introduced between typically independent matrix parameters. However, level statistics cross over to Poisson at O(N^-0.5) deviations from these exceptions, indicating that non-Poissonian statistics characterize only subsets of measure zero in the parameter space. Furthermore, we present strong numerical evidence that ensembles of integrable matrices are stationary and ergodic with respect to nearest-neighbor level statistics.
Statistical methods in language processing.
Abney, Steven
2011-05-01
The term statistical methods here refers to a methodology that has been dominant in computational linguistics since about 1990. It is characterized by the use of stochastic models, substantial data sets, machine learning, and rigorous experimental evaluation. The shift to statistical methods in computational linguistics parallels a movement in artificial intelligence more broadly. Statistical methods have so thoroughly permeated computational linguistics that almost all work in the field draws on them in some way. There has, however, been little penetration of the methods into general linguistics. The methods themselves are largely borrowed from machine learning and information theory. We limit attention to that which has direct applicability to language processing, though the methods are quite general and have many nonlinguistic applications. Not every use of statistics in language processing falls under statistical methods as we use the term. Standard hypothesis testing and experimental design, for example, are not covered in this article. WIREs Cogn Sci 2011 2 315-322 DOI: 10.1002/wcs.111 For further resources related to this article, please visit the WIREs website.
40 CFR Appendix IV to Part 265 - Tests for Significance
Code of Federal Regulations, 2010 CFR
2010-07-01
... changes in the concentration or value of an indicator parameter in periodic ground-water samples when... then be compared to the value of the t-statistic found in a table for t-test of significance at the specified level of significance. A calculated value of t which exceeds the value of t found in the...
40 CFR Appendix IV to Part 265 - Tests for Significance
Code of Federal Regulations, 2012 CFR
2012-07-01
... changes in the concentration or value of an indicator parameter in periodic ground-water samples when... then be compared to the value of the t-statistic found in a table for t-test of significance at the specified level of significance. A calculated value of t which exceeds the value of t found in the...
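The procedure the appendix describes, compute a t statistic from ground-water samples and compare it against a tabulated critical value at the specified significance level, can be sketched as follows. The concentrations are invented for illustration, and the pooled-variance two-sample t is a common textbook form, not necessarily the exact computation Appendix IV prescribes:

```python
import math

def t_statistic(sample, background):
    """Pooled-variance two-sample Student's t comparing a monitoring
    sample with background measurements."""
    n1, n2 = len(sample), len(background)
    m1 = sum(sample) / n1
    m2 = sum(background) / n2
    v1 = sum((x - m1) ** 2 for x in sample) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in background) / (n2 - 1)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# One-tailed 0.05-level critical values of t (standard table excerpt).
T_CRIT_05 = {4: 2.132, 6: 1.943, 8: 1.860, 10: 1.812}

sample     = [12.1, 13.4, 12.8, 13.9, 12.5, 13.1]  # indicator parameter, mg/L
background = [10.2, 10.8, 10.1, 10.6, 10.4, 10.9]
t = t_statistic(sample, background)
df = len(sample) + len(background) - 2
print(t > T_CRIT_05[df])  # calculated t exceeds the tabulated value
```

A calculated t that exceeds the tabulated value at the chosen level is what the regulation treats as a statistically significant change in the indicator parameter.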
Clinically Significant Change: Jacobson and Truax (1991) Revisited.
ERIC Educational Resources Information Center
Speer, David C.
1992-01-01
Considers relationship between statistically and clinically significant change. Sees Jacobson and Truax's index of clinically significant change as neglecting possible confounding of improvement rate estimates by regression to the mean. Describes alternative method (Edwards-Nunnally method) that incorporates an adjustment that minimizes this…
Wallace, D L; Perlman, M D
1980-06-01
This report describes the research activities of the Department of Statistics, University of Chicago, during the period June 15, 1975 to July 30, 1979. Nine research projects are briefly described on the following subjects: statistical computing and approximation techniques in statistics; numerical computation of first passage distributions; probabilities of large deviations; combining independent tests of significance; small-sample efficiencies of tests and estimates; improved procedures for simultaneous estimation and testing of many correlations; statistical computing and improved regression methods; comparison of several populations; and unbiasedness in multivariate statistics. A description of the statistical consultation activities of the Department that are of interest to DOE, in particular, the scientific interactions between the Department and the scientists at Argonne National Laboratories, is given. A list of publications issued during the term of the contract is included.
Statistical algorithms for ontology-based annotation of scientific literature
2014-01-01
Background Ontologies encode relationships within a domain in robust data structures that can be used to annotate data objects, including scientific papers, in ways that ease tasks such as search and meta-analysis. However, the annotation process requires significant time and effort when performed by humans. Text mining algorithms can facilitate this process, but they render an analysis mainly based upon keyword, synonym and semantic matching. They do not leverage information embedded in an ontology's structure. Methods We present a probabilistic framework that facilitates the automatic annotation of literature by indirectly modeling the restrictions among the different classes in the ontology. Our research focuses on annotating human functional neuroimaging literature within the Cognitive Paradigm Ontology (CogPO). We use an approach that combines the stochastic simplicity of naïve Bayes with the formal transparency of decision trees. Our data structure is easily modifiable to reflect changing domain knowledge. Results We compare our results across naïve Bayes, Bayesian Decision Trees, and Constrained Decision Tree classifiers that keep a human expert in the loop, in terms of the quality measure of the F1-micro score. Conclusions Unlike traditional text mining algorithms, our framework can model the knowledge encoded by the dependencies in an ontology, albeit indirectly. We successfully exploit the fact that CogPO has explicitly stated restrictions, and implicit dependencies in the form of patterns in the expert curated annotations. PMID:25093071
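While the paper's framework layers ontology structure on top of the classifier, the naïve Bayes core it builds on is simple to sketch. The toy documents and labels below are invented, and this bag-of-words multinomial model with add-one smoothing is only a generic stand-in for the paper's annotation pipeline:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal multinomial naive Bayes over bag-of-words tokens,
    with add-one (Laplace) smoothing."""
    def fit(self, docs, labels):
        self.classes = set(labels)
        self.prior = Counter(labels)          # class frequencies
        self.counts = defaultdict(Counter)    # per-class word counts
        self.vocab = set()
        for doc, lab in zip(docs, labels):
            for w in doc.split():
                self.counts[lab][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, doc):
        def score(lab):
            total = sum(self.counts[lab].values())
            s = math.log(self.prior[lab])
            for w in doc.split():
                s += math.log((self.counts[lab][w] + 1) /
                              (total + len(self.vocab)))
            return s
        return max(self.classes, key=score)

nb = NaiveBayes().fit(
    ["visual stimulus cortex", "visual attention task",
     "reward learning dopamine", "reward feedback task"],
    ["perception", "perception", "reward", "reward"])
print(nb.predict("visual cortex task"))  # perception
```

The framework in the paper goes further by constraining such per-label decisions with the restrictions encoded in CogPO, which a flat classifier like this one ignores.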
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-11-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood. PMID:25213136
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-10-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, however, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1) P-hacking, which is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want; 2) overemphasis on P values rather than on the actual size of the observed effect; 3) overuse of statistical hypothesis testing, and being seduced by the word "significant"; and 4) over-reliance on standard errors, which are often misunderstood. PMID:25204545
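The first mistake on the list, P-hacking by optional stopping, is easy to demonstrate by simulation: with no true effect, re-testing after each batch of added replicates inflates the false-positive rate well above the nominal level. A sketch using a rough fixed cutoff of |t| > 2.0 in place of exact per-df critical values:

```python
import math
import random

def welch_t(xs, ys):
    """Welch's two-sample t statistic (unequal variances)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

def experiment(peek=False, start=10, step=5, stop=30):
    """One null experiment (both groups from the same distribution).
    With peek=True, keep adding replicates and re-testing until
    'significance' or the budget runs out: optional-stopping P-hacking."""
    xs = [random.gauss(0, 1) for _ in range(start)]
    ys = [random.gauss(0, 1) for _ in range(start)]
    while True:
        if abs(welch_t(xs, ys)) > 2.0:   # rough 5% two-sided cutoff
            return True
        if not peek or len(xs) >= stop:
            return False
        xs += [random.gauss(0, 1) for _ in range(step)]
        ys += [random.gauss(0, 1) for _ in range(step)]

random.seed(7)
reps = 4000
honest = sum(experiment(peek=False) for _ in range(reps)) / reps
hacked = sum(experiment(peek=True) for _ in range(reps)) / reps
print(round(honest, 3), round(hacked, 3))  # the peeking rate is clearly inflated
```

Both groups are drawn from the same distribution, so every rejection is a false positive; the gap between the two rates is the cost of reanalyzing "with additional replicates until you get the result you want".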
Protein Sectors: Statistical Coupling Analysis versus Conservation
Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas
2015-01-01
Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535
Statistics for People Who (Think They) Hate Statistics. Third Edition
ERIC Educational Resources Information Center
Salkind, Neil J.
2007-01-01
This text teaches an often intimidating and difficult subject in a way that is informative, personable, and clear. The author takes students through various statistical procedures, beginning with correlation and graphical representation of data and ending with inferential techniques and analysis of variance. In addition, the text covers SPSS, and…
Writing to Learn Statistics in an Advanced Placement Statistics Course
ERIC Educational Resources Information Center
Northrup, Christian Glenn
2012-01-01
This study investigated the use of writing in a statistics classroom to learn if writing provided a rich description of problem-solving processes of students as they solved problems. Through analysis of 329 written samples provided by students, it was determined that writing provided a rich description of problem-solving processes and enabled…
Ethical Statistics and Statistical Ethics: Making an Interdisciplinary Module
ERIC Educational Resources Information Center
Lesser, Lawrence M.; Nordenhaug, Erik
2004-01-01
This article describes an innovative curriculum module the first author created on the two-way exchange between statistics and applied ethics. The module, having no particular mathematical prerequisites beyond high school algebra, is part of an undergraduate interdisciplinary ethics course which begins with a 3-week introduction to basic applied…
ERIC Educational Resources Information Center
Schneider, William R.
2011-01-01
The purpose of this study was to determine the relationship between statistics self-efficacy, statistics anxiety, and performance in introductory graduate statistics courses. The study design compared two statistics self-efficacy measures developed by Finney and Schraw (2003), a statistics anxiety measure developed by Cruise and Wilkins (1980),…
Population and vital statistics, 1981.
1983-02-01
"For various reasons some of the data relating to population estimates, vital statistics and causes of death in 1981 were not included in the Statistical Abstract of Israel No. 33, 1982. The purpose of this [article] is to complete the missing data and to revise and update some other data." Statistics are included on population by age, sex, marital status, population group, origin, continent of birth, period of immigration, and religion; marriages, divorces, live births, deaths, natural increase, infant deaths, and stillbirths by religion; characteristics of persons marrying and divorcing, including place of residence, religion, age, previous marital status, and year and duration of marriage; live births, deaths, and infant deaths by district, sub-district, and type of locality of residence; deaths by age, sex, and continent of birth; infant deaths by age, sex, and population group; and selected life table values by population group and sex.
The Statistical Loop Analyzer (SLA)
NASA Technical Reports Server (NTRS)
Lindsey, W. C.
1985-01-01
The statistical loop analyzer (SLA) is designed to automatically measure the acquisition, tracking and frequency stability performance characteristics of symbol synchronizers, code synchronizers, carrier tracking loops, and coherent transponders. Automated phase lock and system level tests can also be made using the SLA. Standard baseband, carrier and spread spectrum modulation techniques can be accommodated. Through the SLA's phase error jitter and cycle slip measurements the acquisition and tracking thresholds of the unit under test are determined; any false phase and frequency lock events are statistically analyzed and reported in the SLA output in probabilistic terms. Automated signal drop-out tests can be performed in order to troubleshoot algorithms and evaluate the reacquisition statistics of the unit under test. Cycle slip rates and cycle slip probabilities can be measured using the SLA. These measurements, combined with bit error probability measurements, are all that are needed to fully characterize the acquisition and tracking performance of a digital communication system.
Statistical modeling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1992-01-01
This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.
Statistical mechanics of complex networks
NASA Astrophysics Data System (ADS)
Albert, Reka Zsuzsanna
2001-07-01
The emergence of order in natural systems is a constant source of inspiration for both physical and biological sciences. While the spatial order characterizing, for example, crystals has been the basis of many advances in contemporary physics, most complex systems in nature do not offer such a high degree of order. Many of these systems form complex networks whose nodes are the elements of the system and edges represent the interactions between them. Traditionally, complex networks have been described by the random graph theory founded in 1959 by Paul Erdős and Alfréd Rényi. One of the defining features of random graphs is that they are statistically homogeneous, and their degree distribution (characterizing the spread in the number of edges starting from a node) is a Poisson distribution. In contrast, recent empirical studies, including the work of our group, indicate that the topology of real networks is much richer than that of random graphs. In particular, the degree distribution of real networks is a power law, indicating a heterogeneous topology in which the majority of the nodes have a small degree, but there is a significant fraction of highly connected nodes that play an important role in the connectivity of the network. The scale-free topology of real networks has very important consequences for their functioning. For example, we have discovered that scale-free networks are extremely resilient to the random disruption of their nodes. On the other hand, the selective removal of the nodes with highest degree induces a rapid breakdown of the network into isolated subparts that cannot communicate with each other. The non-trivial scaling of the degree distribution of real networks is also an indication of their assembly and evolution. Indeed, our modeling studies have shown us that there are general principles governing the evolution of networks. Most networks start from a small seed and grow by the addition of new nodes which attach to the nodes already in
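The growth-plus-preferential-attachment mechanism the abstract describes can be sketched in a few lines. This is a minimal Barabási-Albert-style toy, not the authors' exact models, and the network size and attachment parameter are arbitrary:

```python
import random
from collections import Counter

def preferential_attachment(n, m, seed=0):
    """Grow a network of n nodes: each new node attaches m edges to
    existing nodes chosen with probability proportional to degree.
    Returns the degree of every node."""
    rng = random.Random(seed)
    targets = list(range(m))   # initial attachment targets
    repeated = []              # each node appears once per unit of degree
    degree = Counter()
    for new in range(m, n):
        for t in set(targets):
            degree[new] += 1
            degree[t] += 1
            repeated += [new, t]
        # Sampling uniformly from `repeated` implements preferential attachment.
        targets = [rng.choice(repeated) for _ in range(m)]
    return degree

deg = preferential_attachment(5000, 2)
hubs = sum(1 for d in deg.values() if d >= 20)
print(max(deg.values()), hubs)  # heavy tail: a few highly connected hubs
```

Under random growth without the degree-proportional choice the degree distribution would be narrow; the preferential rule is what produces the power-law tail and the hubs whose targeted removal, per the abstract, fragments the network.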
XMM-Newton publication statistics
NASA Astrophysics Data System (ADS)
Ness, J.-U.; Parmar, A. N.; Valencic, L. A.; Smith, R.; Loiseau, N.; Salama, A.; Ehle, M.; Schartel, N.
2014-02-01
We assessed the scientific productivity of XMM-Newton by examining XMM-Newton publications and data usage statistics. We analyse 3272 refereed papers, published until the end of 2012, that directly use XMM-Newton data. The SAO/NASA Astrophysics Data System (ADS) was used to provide additional information on each paper including the number of citations. For each paper, the XMM-Newton observation identifiers and instruments used to provide the scientific results were determined. The identifiers were used to access the XMM-Newton Science Archive (XSA) to provide detailed information on the observations themselves and on the original proposals. The information obtained from these sources was then combined to allow the scientific productivity of the mission to be assessed. Since around three years after the launch of XMM-Newton there have been around 300 refereed papers per year that directly use XMM-Newton data. After more than 13 years in operation, this rate shows no evidence that it is decreasing. Since 2002, around 100 scientists per year become lead authors for the first time on a refereed paper which directly uses XMM-Newton data. Each refereed XMM-Newton paper receives around four citations per year in the first few years with a long-term citation rate of three citations per year, more than five years after publication. About half of the articles citing XMM-Newton articles are not primarily X-ray observational papers. The distribution of elapsed time between observations taken under the Guest Observer programme and first article peaks at 2 years with a possible second peak at 3.25 years. Observations taken under the Target of Opportunity programme are published significantly faster, after one year on average. The fraction of science time taken until the end of 2009 that has been used in at least one article is ~90%. Most observations were used more than once, yielding on average a factor of two in usage on available observing time per year. About 20% of
Statistics of Statisticians: Critical Mass of Statistics and Operational Research Groups
NASA Astrophysics Data System (ADS)
Kenna, Ralph; Berche, Bertrand
Using a recently developed model, inspired by mean field theory in statistical physics, and data from the UK's Research Assessment Exercise, we analyse the relationship between the qualities of statistics and operational research groups and the quantities of researchers in them. Similar to other academic disciplines, we provide evidence for a linear dependency of quality on quantity up to an upper critical mass, which is interpreted as the average maximum number of colleagues with whom a researcher can communicate meaningfully within a research group. The model also predicts a lower critical mass, which research groups should strive to achieve to avoid extinction. For statistics and operational research, the lower critical mass is estimated to be 9 ± 3. The upper critical mass, beyond which research quality does not significantly depend on group size, is 17 ± 6.
Statistical learning under incidental versus intentional conditions.
Arciuli, Joanne; Torkildsen, Janne von Koss; Stevens, David J; Simpson, Ian C
2014-01-01
Statistical learning (SL) studies have shown that participants are able to extract regularities in input they are exposed to without any instruction to do so. This and other findings, such as the fact that participants are often unable to verbalize their acquired knowledge, suggest that SL can occur implicitly or incidentally. Interestingly, several studies using the related paradigms of artificial grammar learning and serial response time tasks have shown that explicit instructions can aid learning under certain conditions. Within the SL literature, however, very few studies have contrasted incidental and intentional learning conditions. The aim of the present study was to investigate the effect of having prior knowledge of the statistical regularities in the input when undertaking a task of visual sequential SL. Specifically, we compared the degree of SL exhibited by participants who were informed (intentional group) versus those who were uninformed (incidental group) about the presence of embedded triplets within a familiarization stream. Somewhat surprisingly, our results revealed that there were no statistically significant differences (and only a small effect size) in the amount of SL exhibited between the intentional versus the incidental groups. We discuss the ways in which this result can be interpreted and suggest that short presentation times for stimuli in the familiarization stream in our study may have limited the opportunity for explicit learning. This suggestion is in line with recent research revealing a statistically significant difference (and a large effect size) between intentional versus incidental groups using a very similar visual sequential SL task, but with longer presentation times. Finally, we outline a number of directions for future research. PMID:25071692
Illustrating the practice of statistics
Hamada, Christina A; Hamada, Michael S
2009-01-01
The practice of statistics involves analyzing data and planning data collection schemes to answer scientific questions. Issues often arise with the data that must be dealt with and can lead to new procedures. In analyzing data, these issues can sometimes be addressed through the statistical models that are developed. Simulation can also be helpful in evaluating a new procedure. Moreover, simulation coupled with optimization can be used to plan a data collection scheme. The practice of statistics as just described is much more than just using a statistical package. In analyzing the data, it involves understanding the scientific problem and incorporating the scientist's knowledge. In modeling the data, it involves understanding how the data were collected and accounting for limitations of the data where possible. Moreover, the modeling is likely to be iterative by considering a series of models and evaluating the fit of these models. Designing a data collection scheme involves understanding the scientist's goal and staying within his/her budget in terms of time and the available resources. Consequently, a practicing statistician is faced with such tasks and requires skills and tools to do them quickly. We have written this article for students to provide a glimpse of the practice of statistics. To illustrate the practice of statistics, we consider a problem motivated by some precipitation data that our relative, Masaru Hamada, collected some years ago. We describe his rain gauge observational study in Section 2. We describe modeling and an initial analysis of the precipitation data in Section 3. In Section 4, we consider alternative analyses that address potential issues with the precipitation data. In Section 5, we consider the impact of incorporating additional information. We design a data collection scheme to illustrate the use of simulation and optimization in Section 6. We conclude this article in Section 7 with a discussion.
The Statistics of Visual Representation
NASA Technical Reports Server (NTRS)
Jobson, Daniel J.; Rahman, Zia-Ur; Woodell, Glenn A.
2002-01-01
The experience of retinex image processing has prompted us to reconsider fundamental aspects of imaging and image processing. Foremost is the idea that a good visual representation requires a non-linear transformation of the recorded (approximately linear) image data. Further, this transformation appears to converge on a specific distribution. Here we investigate the connection between numerical and visual phenomena. Specifically the questions explored are: (1) Is there a well-defined consistent statistical character associated with good visual representations? (2) Does there exist an ideal visual image? And (3) what are its statistical properties?
Key China Energy Statistics 2012
Levine, Mark; Fridley, David; Lu, Hongyou; Fino-Chen, Cecilia
2012-05-01
The China Energy Group at Lawrence Berkeley National Laboratory (LBNL) was established in 1988. Over the years the Group has gained recognition as an authoritative source of China energy statistics through the publication of its China Energy Databook (CED). The Group has published seven editions to date of the CED (http://china.lbl.gov/research/chinaenergy-databook). This handbook summarizes key statistics from the CED and is expressly modeled on the International Energy Agency’s “Key World Energy Statistics” series of publications. The handbook contains timely, clearly-presented data on the supply, transformation, and consumption of all major energy sources.
Key China Energy Statistics 2011
Levine, Mark; Fridley, David; Lu, Hongyou; Fino-Chen, Cecilia
2012-01-15
The China Energy Group at Lawrence Berkeley National Laboratory (LBNL) was established in 1988. Over the years the Group has gained recognition as an authoritative source of China energy statistics through the publication of its China Energy Databook (CED). In 2008 the Group published the Seventh Edition of the CED (http://china.lbl.gov/research/chinaenergy-databook). This handbook summarizes key statistics from the CED and is expressly modeled on the International Energy Agency’s “Key World Energy Statistics” series of publications. The handbook contains timely, clearly-presented data on the supply, transformation, and consumption of all major energy sources.
Statistical mechanics of prion diseases.
Slepoy, A; Singh, R R; Pázmándi, F; Kulkarni, R V; Cox, D L
2001-07-30
We present a two-dimensional, lattice-based, protein-level statistical mechanical model for prion diseases (e.g., mad cow disease) with concomitant prion protein misfolding and aggregation. Our studies lead us to the hypothesis that the observed broad incubation time distribution in epidemiological data reflects fluctuation-dominated growth seeded by a few nanometer-scale aggregates, while much narrower incubation time distributions for inoculated lab animals arise from statistical self-averaging. We model "species barriers" to prion infection and assess a related treatment protocol. PMID:11497806
On statistical aspects of Qjets
NASA Astrophysics Data System (ADS)
Ellis, Stephen D.; Hornig, Andrew; Krohn, David; Roy, Tuhin S.
2015-01-01
The process by which jet algorithms construct jets and subjets is inherently ambiguous and equally well motivated algorithms often return very different answers. The Qjets procedure was introduced by the authors to account for this ambiguity by considering many reconstructions of a jet at once, allowing one to assign a weight to each interpretation of the jet. Employing these weighted interpretations leads to an improvement in the statistical stability of many measurements. Here we explore in detail the statistical properties of these sets of weighted measurements and demonstrate how they can be used to improve the reach of jet-based studies.
Statistical parameters for gloss evaluation
Peiponen, Kai-Erik; Juuti, Mikko
2006-02-13
The measurement of minute changes in local gloss has not been presented in international standards due to a lack of suitable glossmeters. The development of a diffractive-element-based glossmeter (DOG) made it possible to detect local variation of gloss from planar and complex-shaped surfaces. Hence, a demand for proper statistical gloss parameters for classifying surface quality by gloss, similar to the standardized surface roughness classification, has become necessary. In this letter, we define statistical gloss parameters and utilize them as an example in the characterization of gloss from metal surface roughness standards by the DOG.
Statistical Mechanics of Prion Diseases
NASA Astrophysics Data System (ADS)
Slepoy, A.; Singh, R. R.; Pázmándi, F.; Kulkarni, R. V.; Cox, D. L.
2001-07-01
We present a two-dimensional, lattice-based, protein-level statistical mechanical model for prion diseases (e.g., mad cow disease) with concomitant prion protein misfolding and aggregation. Our studies lead us to the hypothesis that the observed broad incubation time distribution in epidemiological data reflects fluctuation-dominated growth seeded by a few nanometer-scale aggregates, while much narrower incubation time distributions for inoculated lab animals arise from statistical self-averaging. We model "species barriers" to prion infection and assess a related treatment protocol.
Statistical Constraints on Joy's Law
NASA Astrophysics Data System (ADS)
Amouzou, Ernest C.; Munoz-Jaramillo, Andres; Martens, Petrus C.; DeLuca, Edward E.
2014-06-01
Using sunspot data from the observatories at Mt. Wilson and Kodaikanal, active region tilt angles are analyzed for different active region sizes and latitude bins. A number of similarly-shaped statistical distributions were fitted to the data using maximum likelihood estimation. In all cases, we find that the statistical distribution best describing the number of active regions at a given tilt angle is a Laplace distribution of the form (2β)^(-1) exp(-|x-μ|/β), with 2° ≤ μ ≤ 11° and 10° ≤ β ≤ 40°.
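The Laplace fit described in this abstract can be sketched numerically: for the density (2β)^(-1) exp(-|x-μ|/β), the maximum likelihood estimate of μ is the sample median and that of β is the mean absolute deviation from the median. The "tilt angles" below are synthetic, generated only to exercise the estimator; the loc/scale values are arbitrary choices, not the study's fitted results.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "tilt angles" drawn from a Laplace distribution (illustrative only;
# mu and beta here are arbitrary, not the paper's fitted values).
tilts = rng.laplace(loc=6.0, scale=20.0, size=5000)

# MLEs for the Laplace density (2*beta)^-1 * exp(-|x - mu|/beta):
mu_hat = np.median(tilts)                    # MLE of the location mu: sample median
beta_hat = np.mean(np.abs(tilts - mu_hat))   # MLE of the scale beta: mean |deviation|

print(round(mu_hat, 1), round(beta_hat, 1))  # recovers roughly 6 and 20
```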
Probability, statistics, and computational science.
Beerenwinkel, Niko; Siebourg, Juliane
2012-01-01
In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.
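As a minimal illustration of one of the chapter's topics, the sketch below iterates a toy two-state Markov chain to its stationary distribution; the transition matrix is invented for the example and is not taken from the chapter itself.

```python
import numpy as np

# A toy 2-state Markov chain; each row of the transition matrix sums to 1.
# Purely illustrative numbers, not from the chapter.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# The stationary distribution pi satisfies pi @ P = pi; repeated application
# of P converges to it from any starting distribution (power iteration).
pi = np.array([0.5, 0.5])
for _ in range(200):
    pi = pi @ P

print(pi)  # converges to [5/6, 1/6]
```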
Statistical Mechanics of Prion Diseases
Slepoy, A.; Singh, R. R. P.; Pazmandi, F.; Kulkarni, R. V.; Cox, D. L.
2001-07-30
We present a two-dimensional, lattice-based, protein-level statistical mechanical model for prion diseases (e.g., mad cow disease) with concomitant prion protein misfolding and aggregation. Our studies lead us to the hypothesis that the observed broad incubation time distribution in epidemiological data reflects fluctuation-dominated growth seeded by a few nanometer-scale aggregates, while much narrower incubation time distributions for inoculated lab animals arise from statistical self-averaging. We model "species barriers" to prion infection and assess a related treatment protocol.
ERIC Educational Resources Information Center
Leaf, Donald C., Comp.; Neely, Linda, Comp.
This edition focuses on statistical data supplied by Michigan public libraries, public library cooperatives, and those public libraries which serve as regional or subregional outlets for blind and physically handicapped services. Since statistics in Michigan academic libraries are typically collected in odd-numbered years, they are not included…
Teaching Statistics in Integration with Psychology
ERIC Educational Resources Information Center
Wiberg, Marie
2009-01-01
The aim was to revise a statistics course in order to get the students motivated to learn statistics and to integrate statistics more throughout a psychology course. Further, we wished to make students more interested in statistics and to help them see the importance of using statistics in psychology research. To achieve this goal, several…
20 CFR 634.4 - Statistical standards.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 20 Employees' Benefits 3 2012-04-01 2012-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs....
ERIC Educational Resources Information Center
Kyrillidou, Martha, Comp.; Bland, Les, Comp.
2009-01-01
"ARL Statistics 2007-2008" is the latest in a series of annual publications that describe collections, staffing, expenditures, and service activities for the 123 members of the Association of Research Libraries (ARL). Of these, 113 are university libraries; the remaining 10 are public, governmental, and nonprofit research libraries. Data reported…
Education Statistics Quarterly, Spring 2002.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2002-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products, and funding opportunities developed over a 3-month period. Each issue also contains a message…
Statistics by Example, Finding Models.
ERIC Educational Resources Information Center
Mosteller, Frederick; And Others
This booklet, part of a series of four which provide problems in probability and statistics for the secondary school level, is aimed at aiding the student in developing models as structure for data and in learning how to change models to fit real-life problems. Twelve different problem situations arising from biology, business, English, physical…
Statistics by Example, Weighing Chances.
ERIC Educational Resources Information Center
Mosteller, Frederick; And Others
Part of a series of four pamphlets providing problems in probability and statistics taken from real-life situations, this booklet develops probability methods through random numbers, simulations, and simple probability models, and presents the idea of scatter and residuals for analyzing complex data. The pamphlet was designed for a student having…
Statistics by Example, Exploring Data.
ERIC Educational Resources Information Center
Mosteller, Frederick; And Others
Part of a series of four pamphlets providing real-life problems in probability and statistics for the secondary school level, this booklet shows how to organize data in tables and graphs in order to get and to exhibit messages. Elementary probability concepts are also introduced. Fourteen different problem situations arising from biology,…
Statistics of premixed flame cells
NASA Technical Reports Server (NTRS)
Noever, David A.
1991-01-01
The statistics of random cellular patterns in premixed flames are analyzed. Agreement is found with a variety of topological relations previously found for other networks, namely, Lewis's law and Aboav's law. Despite the diverse underlying physics, flame cells are shown to share a broad class of geometric properties with other random networks: metal grains, soap foams, bioconvection, and Langmuir monolayers.
Education Statistics Quarterly, Fall 2001.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2001-01-01
The publication gives a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products, and funding opportunities developed over a 3-month period. Each issue also contains a message from…
China's Statistical System and Resources
ERIC Educational Resources Information Center
Xue, Susan
2004-01-01
As the People's Republic of China plays an increasingly important role in international politics and trade, countries with economic interests there find they need to know more about this nation. Access to primary information sources, including official statistics from China, however, is very limited, as little exploration has been done into this…
A Simple Statistical Thermodynamics Experiment
ERIC Educational Resources Information Center
LoPresto, Michael C.
2010-01-01
Comparing the predicted and actual rolls of combinations of both two and three dice can help to introduce many of the basic concepts of statistical thermodynamics, including multiplicity, probability, microstates, and macrostates, and demonstrate that entropy is indeed a measure of randomness, that disordered states (those of higher entropy) are…
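The multiplicity counting the article describes can be done directly by enumerating all ordered die rolls (microstates) for each total (macrostate); the helper below is a straightforward sketch of that idea.

```python
from collections import Counter
from itertools import product

# Count microstates (ordered die faces) for each macrostate (the total) --
# the multiplicity idea used to introduce entropy and probability.
def multiplicities(n_dice, sides=6):
    return Counter(sum(roll) for roll in product(range(1, sides + 1), repeat=n_dice))

two = multiplicities(2)
print(two[7], two[2])   # 6 microstates give a total of 7; only 1 gives a total of 2
print(two[7] / 6**2)    # probability of rolling 7 with two dice: 6/36
```

The most probable macrostate (a total of 7) is the one with the most microstates, which is exactly the sense in which high-entropy states are "disordered" yet most likely.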
Education Statistics Quarterly, Winter 2001.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2002-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released in a 3-month period. Each issue also contains a message from the NCES on a timely…
Understanding Statistics Using Computer Demonstrations
ERIC Educational Resources Information Center
Dunn, Peter K.
2004-01-01
This paper discusses programs that clarify some statistical ideas often discussed yet poorly understood by students. The programs adopt the approach of demonstrating what is happening, rather than using the computer to do the work for the students (and hide the understanding). The programs demonstrate normal probability plots, overfitting of…
Inverting an Introductory Statistics Classroom
ERIC Educational Resources Information Center
Kraut, Gertrud L.
2015-01-01
The inverted classroom allows more in-class time for inquiry-based learning and for working through more advanced problem-solving activities than does the traditional lecture class. The skills acquired in this learning environment offer benefits far beyond the statistics classroom. This paper discusses four ways that can make the inverted…
GPS: Geometry, Probability, and Statistics
ERIC Educational Resources Information Center
Field, Mike
2012-01-01
It might be said that for most occupations there is now less of a need for mathematics than there was say fifty years ago. But, the author argues, geometry, probability, and statistics constitute essential knowledge for everyone. Maybe not the geometry of Euclid, but certainly geometrical ways of thinking that might enable us to describe the world…
Fit Indices Versus Test Statistics
ERIC Educational Resources Information Center
Yuan, Ke-Hai
2005-01-01
Model evaluation is one of the most important aspects of structural equation modeling (SEM). Many model fit indices have been developed. It is not an exaggeration to say that nearly every publication using the SEM methodology has reported at least one fit index. Most fit indices are defined through test statistics. Studies and interpretation of…
Introductory Statistics and Fish Management.
ERIC Educational Resources Information Center
Jardine, Dick
2002-01-01
Describes how fisheries research and management data (available on a website) have been incorporated into an Introductory Statistics course. In addition to the motivation gained from seeing the practical relevance of the course, some students have participated in the data collection and analysis for the New Hampshire Fish and Game Department. (MM)
What Price Statistical Tables Now?
ERIC Educational Resources Information Center
Hunt, Neville
1997-01-01
Describes the generation of all the tables required for school-level study of statistics using Microsoft's Excel spreadsheet package. Highlights cumulative binomial probabilities, cumulative Poisson probabilities, normal distribution, t-distribution, chi-squared distribution, F-distribution, random numbers, and accuracy. (JRH)
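Tables like those the article builds in Excel can be reproduced with SciPy's distribution functions; this is a sketch of equivalents, not the article's spreadsheet formulas.

```python
from scipy import stats

# Counterparts of the article's school-level statistical tables:
print(round(stats.binom.cdf(3, n=10, p=0.5), 4))  # cumulative binomial P(X <= 3)
print(round(stats.poisson.cdf(2, mu=1.5), 4))     # cumulative Poisson P(X <= 2)
print(round(stats.norm.cdf(1.96), 4))             # standard normal Phi(1.96)
print(round(stats.t.ppf(0.975, df=10), 3))        # t critical value, 95% two-sided
print(round(stats.chi2.ppf(0.95, df=4), 3))       # chi-squared 95% critical value
```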
Teaching Statistics through Learning Projects
ERIC Educational Resources Information Center
Moreira da Silva, Mauren Porciúncula; Pinto, Suzi Samá
2014-01-01
This paper aims to reflect on the teaching of statistics through student research, in the form of projects carried out by students on self-selected topics. The paper reports on a study carried out with two undergraduate classes using a methodology of teaching that we call "learning projects." Monitoring the development of the various…
Education Statistics Quarterly, Summer 2001.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2001-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications and data products released during a 3-month period. Each issue also contains a message from the NCES on a…
American Youth: A Statistical Snapshot.
ERIC Educational Resources Information Center
Wetzel, James R.
This report presents and analyzes statistical data on the status and condition of American youth, ages 16-24. A brief commentary on the problems of collecting data concerning Hispanic youth precedes the report's seven main sections, which deal with the following topics: population; marriage; childbearing and living arrangements; family income and…
Statistical Prediction in Proprietary Rehabilitation.
ERIC Educational Resources Information Center
Johnson, Kurt L.; And Others
1987-01-01
Applied statistical methods to predict case expenditures for low back pain rehabilitation cases in proprietary rehabilitation. Extracted predictor variables from case records of 175 workers compensation claimants with some degree of permanent disability due to back injury. Performed several multiple regression analyses resulting in a formula that…
American Youth: A Statistical Snapshot.
ERIC Educational Resources Information Center
Wetzel, James R.
This document presents a statistics snapshot of young people, aged 15 to 24 years. It provides a broad overview of trends documenting the direction of changes in social behavior and economic circumstances. The projected decline in the total number of youth from 43 million in 1980 to 35 million in 1995 will affect marriage and childbearing…
ERIC Educational Resources Information Center
Gordon, Sheldon P.; Gordon, Florence S.
2010-01-01
One of the most important applications of the definite integral in a modern calculus course is the mean value of a function. Thus, if a function "f" is defined on an interval ["a", "b"], then the mean, or average value, of "f" is given by (1/(b - a)) ∫_a^b f(x) dx. In this note, we will investigate the meaning of other statistics associated with a function…
Concept Maps in Introductory Statistics
ERIC Educational Resources Information Center
Witmer, Jeffrey A.
2016-01-01
Concept maps are tools for organizing thoughts on the main ideas in a course. I present an example of a concept map that was created through the work of students in an introductory class and discuss major topics in statistics and relationships among them.
Undergraduate experiments on statistical optics
NASA Astrophysics Data System (ADS)
Scholz, Ruediger; Friege, Gunnar; Weber, Kim-Alessandro
2016-09-01
Since the pioneering experiments of Forrester et al (1955 Phys. Rev. 99 1691) and Hanbury Brown and Twiss (1956 Nature 177 27; Nature 178 1046), along with the introduction of the laser in the 1960s, the systematic analysis of random fluctuations of optical fields has developed to become an indispensable part of physical optics for gaining insight into features of the fields. In 1985 Joseph W Goodman prefaced his textbook on statistical optics with a strong commitment to the ‘tools of probability and statistics’ (Goodman 2000 Statistical Optics (New York: John Wiley & Sons Inc.)) in the education of advanced optics. Since then a wide range of novel undergraduate optical counting experiments and corresponding pedagogical approaches have been introduced to underpin the rapid growth of the interest in coherence and photon statistics. We propose low cost experimental steps that are a fair way off ‘real’ quantum optics, but that give deep insight into random optical fluctuation phenomena: (1) the introduction of statistical methods into undergraduate university optical lab work, and (2) the connection between the photoelectrical signal and the characteristics of the light source. We describe three experiments and theoretical approaches which may be used to pave the way for a well balanced growth of knowledge, providing students with an opportunity to enhance their abilities to adapt the ‘tools of probability and statistics’.
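One counting analysis of the kind the authors advocate can be sketched in simulation: coherent (laser-like) light produces Poissonian photocounts, for which the Fano factor (variance/mean) is close to 1, whereas thermal light is super-Poissonian. The mean count rate below is a hypothetical choice, not a value from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated photocounts from a coherent source: Poisson-distributed counts per
# counting window. The mean rate is an arbitrary, illustrative choice.
mean_rate = 8.0
counts = rng.poisson(mean_rate, size=100_000)

# Fano factor = variance/mean; ~1 for Poissonian (coherent) light,
# > 1 for super-Poissonian (thermal, bunched) light.
fano = counts.var() / counts.mean()
print(round(fano, 2))
```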
Tsallis statistics and neurodegenerative disorders
NASA Astrophysics Data System (ADS)
Iliopoulos, Aggelos C.; Tsolaki, Magdalini; Aifantis, Elias C.
2016-08-01
In this paper, we perform statistical analysis of time series deriving from four neurodegenerative disorders, namely epilepsy, amyotrophic lateral sclerosis (ALS), Parkinson's disease (PD), and Huntington's disease (HD). The time series are concerned with electroencephalograms (EEGs) of healthy and epileptic states, as well as gait dynamics (in particular stride intervals) in ALS, PD and HD. We study data concerning one subject for each neurodegenerative disorder and one healthy control. The analysis is based on Tsallis non-extensive statistical mechanics and in particular on the estimation of the Tsallis q-triplet, namely {q_stat, q_sen, q_rel}. The deviation of the Tsallis q-triplet from unity indicates non-Gaussian statistics and long-range dependencies for all time series considered. In addition, the results reveal the efficiency of Tsallis statistics in capturing differences in brain dynamics between healthy and epileptic states, as well as differences between ALS, PD and HD patients and healthy control subjects. The results indicate that estimations of Tsallis q-indices could be used as possible biomarkers, along with others, for improving classification and prediction of epileptic seizures, as well as for studying the complex gait dynamics of various diseases, providing new insights into severity, medications and fall risk, and improving therapeutic interventions.
The Statistical Handbook on Technology.
ERIC Educational Resources Information Center
Berinstein, Paula
This volume tells stories about the tools we use, but these narratives are told in numbers rather than in words. Organized by various aspects of society, each chapter uses tables and statistics to examine everything from budgets, costs, sales, trade, employment, patents, prices, usage, access and consumption. In each chapter, each major topic is…
Education Statistics Quarterly, Summer 2002.
ERIC Educational Resources Information Center
Dillow, Sally, Ed.
2002-01-01
This publication provides a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products, and funding opportunities developed over a 3-month period. Each issue also contains a message…
The ENSEMBLES Statistical Downscaling Portal
NASA Astrophysics Data System (ADS)
Cofino, Antonio S.; San-Martín, Daniel; Gutiérrez, Jose M.
2010-05-01
The demand for high-resolution seasonal and ACC predictions is continuously increasing due to the multiple end-user applications in a variety of sectors (hydrology, agronomy, energy, etc.) which require regional meteorological inputs. To fill the gap between the coarse-resolution grids used by global weather models and the regional needs of applications, a number of statistical downscaling techniques have been proposed. Statistical downscaling is a complex multi-disciplinary problem which requires a cascade of different scientific tools to access and process different sources of data, from GCM outputs to local observations, and to run complex statistical algorithms. Thus, an end-to-end approach is needed in order to link the outputs of the ensemble prediction systems to a range of impact applications. To accomplish this task in an interactive and user-friendly form, a Web portal has been developed within the European ENSEMBLES project, integrating the necessary tools and providing the appropriate technology for distributed data access and computing. In this way, users can obtain their downscaled data, testing and validating different statistical methods (weather typing, regression, or weather generators) in a transparent form, without worrying about the details of the downscaling techniques or the data formats and access.
Statistics of premixed flame cells
Noever, D.A. )
1991-07-15
The statistics of random cellular patterns in premixed flames are analyzed. Agreement is found with a variety of topological relations previously found for other networks, namely, Lewis's law and Aboav's law. Despite the diverse underlying physics, flame cells are shown to share a broad class of geometric properties with other random networks: metal grains, soap foams, bioconvection, and Langmuir monolayers.
Statistical modelling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1991-01-01
During the six-month period from 1 April 1991 to 30 September 1991 the following research papers in statistical modeling of software reliability appeared: (1) A Nonparametric Software Reliability Growth Model; (2) On the Use and the Performance of Software Reliability Growth Models; (3) Research and Development Issues in Software Reliability Engineering; (4) Special Issues on Software; and (5) Software Reliability and Safety.
Astronomical Significance of Ancient Monuments
NASA Astrophysics Data System (ADS)
Simonia, I.
2011-06-01
Astronomical significance of Gokhnari megalithic monument (eastern Georgia) is considered. Possible connection of Amirani ancient legend with Gokhnari monument is discussed. Concepts of starry practicality and solar stations are proposed.
[Forensic significance of depressive syndromes].
Lammel, M
1987-10-01
The three chief problems arising when an expert opinion is to be given are dealt with in brief, and the forensic significance of the depressive syndrome is described, without entering into the question of giving an opinion as to responsibility.
Two statistical tests for meiotic breakpoint analysis.
Plaetke, R; Schachtel, G A
1995-01-01
Meiotic breakpoint analysis (BPA), a statistical method for ordering genetic markers, is increasing in importance as a method for building genetic maps of human chromosomes. Although BPA does not provide estimates of genetic distances between markers, it efficiently locates new markers on already defined dense maps, when likelihood analysis becomes cumbersome or the sample size is small. However, until now no assessments of statistical significance have been available for evaluating the possibility that the results of a BPA were produced by chance. In this paper, we propose two statistical tests to determine whether the size of a sample and its genetic information content are sufficient to distinguish between "no linkage" and "linkage" of a marker mapped by BPA to a certain region. Both tests are exact and should be conducted after a BPA has assigned the marker to an interval on the map. Applications of the new tests are demonstrated by three examples: (1) a synthetic data set, (2) a data set of five markers on human chromosome 8p, and (3) a data set of four markers on human chromosome 17q. PMID:7847387
Critical analysis of adsorption data statistically
NASA Astrophysics Data System (ADS)
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in different ways using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data were analysed statistically by hypothesis testing, applying the t test, paired t test and chi-square test to (a) test the optimum value of the process pH, (b) verify the success of the experiment and (c) study the effect of adsorbent dose on zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ2 showed the results in favour of the data collected from the experiment, and this has been shown on probability charts. The K value for the Langmuir isotherm was 0.8582 and the m value obtained for the Freundlich adsorption isotherm was 0.725, both <1, indicating favourable isotherms. Karl Pearson's correlation coefficient values for the Langmuir and Freundlich adsorption isotherms were obtained as 0.99 and 0.95 respectively, which show a high degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
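The kinds of hypothesis tests and correlation analysis named in this abstract can be sketched with SciPy; the removal percentages and regression data below are invented for illustration and are not the study's measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Invented zinc-removal percentages at two adsorbent doses (illustrative only).
dose_low  = 60 + rng.normal(0, 2, size=10)
dose_high = 75 + rng.normal(0, 2, size=10)

# Two-sample t test: does adsorbent dose significantly affect removal?
t_stat, p_val = stats.ttest_ind(dose_low, dose_high)
print(p_val < 0.05)    # the simulated doses clearly differ

# Karl Pearson's correlation coefficient, as used for the isotherm fits.
x = np.linspace(1, 10, 10)
y = 2 * x + rng.normal(0, 0.5, size=10)
r, _ = stats.pearsonr(x, y)
print(round(r, 2))     # close to 1 for this nearly linear relation
```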
Investigating statistical epistasis in complex disorders.
Turton, James C; Bullock, James; Medway, Christopher; Shi, Hui; Brown, Kristelle; Belbin, Olivia; Kalsheker, Noor; Carrasquillo, Minerva M; Dickson, Dennis W; Graff-Radford, Neill R; Petersen, Ronald C; Younkin, Steven G; Morgan, Kevin
2011-01-01
The missing heritability exhibited by late-onset Alzheimer's disease is unexplained and has been partly attributed to epistatic interaction. Methods available to explore this are often based on logistic regression and allow for determination of deviation from an expected outcome as a result of statistical epistasis. Three such methodologies, including the Synergy Factor and the PLINK modules --epistasis and --fast-epistasis, were applied to study an epistatic interaction between interleukin-6 and interleukin-10. The models analyzed consisted of two synergistic interactions (SF ≈ 4.2 and 1.6) and two antagonistic interactions (SF ≈ 0.9 and 0.6). As with any statistical test, power to detect association is paramount, and most studies will be underpowered for the task. However, the availability of large sample sizes through genome-wide association studies makes it feasible to examine approaches for determining epistatic interactions. This study documents the sample sizes needed to achieve a statistically significant outcome from each of the methods examined and discusses the limitations/advantages of the chosen approaches.
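A synergy-factor-style calculation can be sketched from hypothetical case/control counts, assuming the common definition SF = OR(both factors) / (OR(A alone) x OR(B alone)); both the counts and that working definition are stated here as assumptions for illustration, not as the study's exact procedure.

```python
# Hedged sketch of a synergy-factor-style calculation (assumed definition:
# SF = OR(both) / (OR(A) * OR(B))). All counts are invented for illustration.

def odds_ratio(cases_exp, cases_unexp, ctrls_exp, ctrls_unexp):
    # Odds ratio from a 2x2 table: (case odds of exposure) / (control odds).
    return (cases_exp / cases_unexp) / (ctrls_exp / ctrls_unexp)

or_a    = odds_ratio(30, 70, 20, 80)   # factor A alone vs neither
or_b    = odds_ratio(25, 70, 20, 80)   # factor B alone vs neither
or_both = odds_ratio(60, 70, 20, 80)   # both factors vs neither

sf = or_both / (or_a * or_b)
print(round(sf, 2))   # prints 1.4; SF > 1 suggests a synergistic interaction
```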
van den Bos, Ruud; Harteveld, Marlies; Stoop, Hein
2009-11-01
Acutely elevated levels of cortisol are associated with euphoria and reward-like properties related to sensation-seeking behaviour. Thus, acute stress and elevated levels of cortisol may promote risk-taking behaviour. High cortisol responders are more sensitive to immediate rewards than low cortisol responders. In this study we therefore tested whether acute stress in male and female subjects, induced by the Trier Social Stress Test (TSST), affects decision-making as measured by the Iowa Gambling Task (IGT), and to what extent this is related to cortisol reactivity. Control subjects did not receive the stress manipulation. We specifically predicted that high responders would show increased risk-taking behaviour in the IGT compared with low responders and controls. The data show that the more (salivary) cortisol levels are elevated after the TSST, the poorer the subsequent performance in the IGT in male subjects. In female subjects an inverted U-shaped relationship between cortisol levels and IGT performance is observed: slightly elevated levels of cortisol after the TSST improve IGT performance, while highly elevated levels decrease IGT performance. Thus, acute stress as induced by the TSST affects the decision-making behaviour of men and women differently, and cortisol reactivity is associated with decision-making performance. PMID:19497677
Salcedo, Guillermo; Cano-Sánchez, Patricia; de Gómez-Puyou, Marietta Tuena; Velázquez-Campoy, Adrián; García-Hernández, Enrique
2014-01-01
The function of F1-ATPase relies critically on the intrinsic ability of its catalytic and noncatalytic subunits to interact with nucleotides. Therefore, the study of isolated subunits represents an opportunity to dissect elementary energetic contributions that drive the enzyme's rotary mechanism. In this study we have calorimetrically characterized the association of adenosine nucleotides to the isolated noncatalytic α-subunit. The resulting recognition behavior was compared with that previously reported for the isolated catalytic β-subunit (N.O. Pulido, G. Salcedo, G. Pérez-Hernández, C. José-Núñez, A. Velázquez-Campoy, E. García-Hernández, Energetic effects of magnesium in the recognition of adenosine nucleotides by the F1-ATPase β subunit, Biochemistry 49 (2010) 5258-5268). The two subunits exhibit nucleotide-binding thermodynamic signatures similar to each other, characterized by enthalpically-driven affinities in the μM range. Nevertheless, contrary to the catalytic subunit, which recognizes MgATP and MgADP with comparable strength, the noncatalytic subunit strongly prefers the triphosphate nucleotide. In addition, the α-subunit depends more on Mg(II) for stabilizing the interaction with ATP, while both subunits are rather metal-independent for ADP recognition. These binding behaviors are discussed in terms of the properties that the two subunits exhibit in the whole enzyme.
Introductory Statistics: Questions, Content, and Approach.
ERIC Educational Resources Information Center
Weaver, Frederick Stirton
1989-01-01
An introductory statistics course is a common requirement for undergraduate economics, psychology, and sociology majors. An approach to statistics that involves the effort to encourage habits of systematic, critical quantitative thinking through focusing on descriptive statistics is discussed. (MLW)
Ideas for Effective Communication of Statistical Results
Anderson-Cook, Christine M.
2015-03-01
Effective presentation of statistical results to those with less statistical training, including managers and decision-makers, requires planning, anticipation, and thoughtful delivery. Here are several recommendations for effectively presenting statistical results.
Societal Statistics by virtue of the Statistical Drake Equation
NASA Astrophysics Data System (ADS)
Maccone, Claudio
2012-09-01
The Drake equation, first proposed by Frank D. Drake in 1961, is the foundational equation of SETI. It yields an estimate of the number N of extraterrestrial communicating civilizations in the Galaxy given by the product N=Ns×fp×ne×fl×fi×fc×fL, where: Ns is the number of stars in the Milky Way Galaxy; fp is the fraction of stars that have planetary systems; ne is the number of planets in a given system that are ecologically suitable for life; fl is the fraction of otherwise suitable planets on which life actually arises; fi is the fraction of inhabited planets on which an intelligent form of life evolves; fc is the fraction of planets inhabited by intelligent beings on which a communicative technical civilization develops; and fL is the fraction of planetary lifetime graced by a technical civilization. The first three terms may be called "the astrophysical terms" in the Drake equation since their numerical value is provided by astrophysical considerations. The fourth term, fl, may be called "the origin-of-life term" and entails biology. The last three terms may be called "the societal terms" inasmuch as their respective numerical values are provided by anthropology, telecommunication science and "futuristic science", respectively. In this paper, we seek to provide a statistical estimate of the three societal terms in the Drake equation basing our calculations on the Statistical Drake Equation first proposed by this author at the 2008 IAC. In that paper the author extended the simple 7-factor product so as to embody Statistics. He proved that, no matter which probability distribution may be assigned to each factor, if the number of factors tends to infinity, then the random variable N follows the lognormal distribution (central limit theorem of Statistics). This author also proved at the 2009 IAC that the Dole (1964) [7] equation, yielding the number of Habitable Planets for Man in the Galaxy, has the same mathematical structure as the Drake equation. So the
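The lognormal limit invoked here, that a product of many independent positive random factors tends toward a lognormal distribution, is straightforward to check numerically. The uniform factor ranges below are arbitrary stand-ins, not Maccone's actual input distributions:

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_skewness(x):
    """Standardised third moment; near 0 for a Gaussian sample."""
    z = (x - x.mean()) / x.std()
    return (z**3).mean()

# N = product of 7 independent positive random factors, one per Drake
# term.  By the central limit theorem applied to log N = sum of logs,
# the distribution of N approaches a lognormal as factors multiply.
trials = 100_000
factors = rng.uniform(0.1, 10.0, size=(trials, 7))
N = factors.prod(axis=1)

skew_one = sample_skewness(np.log(factors[:, 0]))  # single log-factor: skewed
skew_prod = sample_skewness(np.log(N))             # log of product: closer to 0
```

The skewness of log N shrinks roughly as 1/sqrt(k) with the number of factors k, which is the central-limit mechanism behind the lognormal claim.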
Significance testing testate amoeba water table reconstructions
NASA Astrophysics Data System (ADS)
Payne, Richard J.; Babeshko, Kirill V.; van Bellen, Simon; Blackford, Jeffrey J.; Booth, Robert K.; Charman, Dan J.; Ellershaw, Megan R.; Gilbert, Daniel; Hughes, Paul D. M.; Jassey, Vincent E. J.; Lamentowicz, Łukasz; Lamentowicz, Mariusz; Malysheva, Elena A.; Mauquoy, Dmitri; Mazei, Yuri; Mitchell, Edward A. D.; Swindles, Graeme T.; Tsyganov, Andrey N.; Turner, T. Edward; Telford, Richard J.
2016-04-01
Transfer functions are valuable tools in palaeoecology, but their output may not always be meaningful. A recently developed statistical test ('randomTF') offers the potential to distinguish between reconstructions that are more likely to be useful and those that are less so. We applied this test to a large number of reconstructions of peatland water table depth based on testate amoebae. Contrary to our expectations, a substantial majority (25 of 30) of these reconstructions gave non-significant results (P > 0.05). The underlying reasons for this outcome are unclear. We found no significant correlation between the randomTF P-value and transfer function performance, the properties of the training set and reconstruction, or measures of transfer function fit. These results give cause for concern, but we believe it would be extremely premature to discount the results of non-significant reconstructions. We stress the need for more critical assessment of transfer function output, replication of results, and ecologically informed interpretation of palaeoecological data.
Statistical phenomena in particle beams
Bisognano, J.J.
1984-09-01
Particle beams are subject to a variety of apparently distinct statistical phenomena such as intrabeam scattering, stochastic cooling, electron cooling, coherent instabilities, and radiofrequency noise diffusion. In fact, both the physics and mathematical description of these mechanisms are quite similar, with the notion of correlation as a powerful unifying principle. In this presentation we will attempt to provide both a physical and a mathematical basis for understanding the wide range of statistical phenomena that have been discussed. In the course of this study the tools of the trade will be introduced, e.g., the Vlasov and Fokker-Planck equations, noise theory, correlation functions, and beam transfer functions. Although a major concern will be to provide equations for analyzing machine design, the primary goal is to introduce a basic set of physical concepts having a very broad range of applicability.
Universal Grammar, statistics or both?
Yang, Charles D
2004-10-01
Recent demonstrations of statistical learning in infants have reinvigorated the innateness versus learning debate in language acquisition. This article addresses these issues from both computational and developmental perspectives. First, I argue that statistical learning using transitional probabilities cannot reliably segment words when scaled to a realistic setting (e.g. child-directed English). To be successful, it must be constrained by knowledge of phonological structure. Then, turning to the bona fide theory of innateness--the Principles and Parameters framework--I argue that a full explanation of children's grammar development must abandon the domain-specific learning model of triggering, in favor of probabilistic learning mechanisms that might be domain-general but nevertheless operate in the domain-specific space of syntactic parameters.
Parallel contingency statistics with Titan.
Thompson, David C.; Pebay, Philippe Pierre
2009-09-01
This report summarizes existing statistical engines in VTK/Titan and presents the recently parallelized contingency statistics engine. It is a sequel to [PT08] and [BPRT09], which studied the parallel descriptive, correlative, multi-correlative, and principal component analysis engines. The ease of use of this new parallel engine is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; however, the very nature of contingency tables prevents this new engine from exhibiting the optimal parallel speed-up that the aforementioned engines do. This report therefore discusses the design trade-offs we made and studies performance with up to 200 processors.
Status and Significance of Credentialing.
ERIC Educational Resources Information Center
Musgrave, Dorothea
1984-01-01
Discusses the current status, significance, and future of credentialing in the field of environmental health. Also discusses four phases of a Bureau of Health Professions (BHP) Credentialing Program and BHP-funded projects related to their development and implementation. Phases include role delineation, resources development, examination…
Statistical Perspectives on Stratospheric Transport
NASA Technical Reports Server (NTRS)
Sparling, L. C.
1999-01-01
Long-lived tropospheric source gases, such as nitrous oxide, enter the stratosphere through the tropical tropopause, are transported throughout the stratosphere by the Brewer-Dobson circulation, and are photochemically destroyed in the upper stratosphere. These chemical constituents, or "tracers", can be used to track mixing and transport by the stratospheric winds. Much of our understanding about the stratospheric circulation is based on large scale gradients and other spatial features in tracer fields constructed from satellite measurements. The point of view presented in this paper is different, but complementary, in that transport is described in terms of tracer probability distribution functions (PDFs). The PDF is computed from the measurements, and is proportional to the area occupied by tracer values in a given range. The flavor of this paper is tutorial, and the ideas are illustrated with several examples of transport-related phenomena, annotated with remarks that summarize the main point or suggest new directions. One example shows how the multimodal shape of the PDF gives information about the different branches of the circulation. Another example shows how the statistics of fluctuations from the most probable tracer value give insight into mixing between different regions of the atmosphere. Also included is an analysis of the time-dependence of the PDF during the onset and decline of the winter circulation, and a study of how "bursts" in the circulation are reflected in transient periods of rapid evolution of the PDF. The dependence of the statistics on location and time is also shown to be important for practical problems related to statistical robustness and satellite sampling. The examples illustrate how physically-based statistical analysis can shed some light on aspects of stratospheric transport that may not be obvious or quantifiable with other types of analyses. An important motivation for the work presented here is the need for synthesis of the
[Pro Familia statistics for 1974].
1975-09-01
Statistics for 1974 for the West German family planning organization Pro Familia are reported. 56 offices are now operating, and 23,726 clients were seen. Men were seen more frequently than previously. 10,000 telephone calls were also handled. 16-25 year olds were increasingly represented in the clientele, as were unmarried persons of all ages. 1,242 patients were referred to physicians or clinics for clinical diagnosis.
Statistical Description of Associative Memory
NASA Astrophysics Data System (ADS)
Samengo, Inés
2003-03-01
The storage of memories, in the brain, induces some kind of modification in the structural and functional properties of a neural network. Here, a few neuropsychological and neurophysiological experiments are reviewed, suggesting that the plastic changes taking place during memory storage are governed, among other things, by the correlations in the activity of a set of neurons. The Hopfield model is briefly described, showing the way the methods of statistical physics can be useful to describe the storage and retrieval of memories.
Leadership statistics in random structures
NASA Astrophysics Data System (ADS)
Ben-Naim, E.; Krapivsky, P. L.
2004-01-01
The largest component ("the leader") in evolving random structures often exhibits universal statistical properties. This phenomenon is demonstrated analytically for two ubiquitous structures: random trees and random graphs. In both cases, lead changes are rare: the average number of lead changes grows only quadratically with the logarithm of the system size. As a function of time, the number of lead changes is self-similar. Additionally, the probability that no lead change ever occurs decays exponentially with the average number of lead changes.
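A toy analogue of leadership statistics (not the random-tree or random-graph model of this paper) is record counting in an i.i.d. sequence, where the "leader" is the running maximum. For i.i.d. data the expected number of lead changes is the harmonic number, growing like ln n, whereas the evolving structures studied here give growth quadratic in ln n:

```python
import numpy as np

rng = np.random.default_rng(5)

def count_lead_changes(seq):
    """Count how many times a new running maximum ('leader') appears."""
    leader, changes = -np.inf, 0
    for value in seq:
        if value > leader:
            leader, changes = value, changes + 1
    return changes

# For an i.i.d. sequence of length n, the expected number of records is
# the harmonic number H_n ~ ln(n) + 0.5772; for n = 10,000 that is ~9.8.
n, trials = 10_000, 200
avg_changes = np.mean(
    [count_lead_changes(rng.random(n)) for _ in range(trials)]
)
```

Even in this simplest setting lead changes are rare relative to the system size, which is the qualitative point the abstract makes for richer structures.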
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
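A minimal sketch of the DFA algorithm mentioned above, applied here to synthetic white noise rather than DNA sequences (for uncorrelated data the scaling exponent should come out near 0.5, while long-range-correlated data give larger exponents):

```python
import numpy as np

def dfa_exponent(x, box_sizes):
    """Detrended fluctuation analysis: return the scaling exponent alpha.

    Integrate the mean-removed series, split it into boxes of size n,
    remove a linear trend in each box, and regress log F(n) on log n.
    """
    y = np.cumsum(x - np.mean(x))              # integrated "walk"
    fluctuations = []
    for n in box_sizes:
        n_boxes = len(y) // n
        f2 = 0.0
        for b in range(n_boxes):
            seg = y[b * n:(b + 1) * n]
            t = np.arange(n)
            coeffs = np.polyfit(t, seg, 1)     # local linear trend
            f2 += np.mean((seg - np.polyval(coeffs, t)) ** 2)
        fluctuations.append(np.sqrt(f2 / n_boxes))
    slope, _ = np.polyfit(np.log(box_sizes), np.log(fluctuations), 1)
    return slope

rng = np.random.default_rng(1)
white = rng.normal(size=2**14)                 # uncorrelated noise
alpha = dfa_exponent(white, [16, 32, 64, 128, 256, 512])
```

Applied to a numeric encoding of a DNA sequence (for example the purine/pyrimidine walk), the same procedure is what distinguishes long-range-correlated non-coding regions from coding regions.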
The natural statistics of blur.
Sprague, William W; Cooper, Emily A; Reissier, Sylvain; Yellapragada, Baladitya; Banks, Martin S
2016-08-01
Blur from defocus can be both useful and detrimental for visual perception: It can be useful as a source of depth information and detrimental because it degrades image quality. We examined these aspects of blur by measuring the natural statistics of defocus blur across the visual field. Participants wore an eye-and-scene tracker that measured gaze direction, pupil diameter, and scene distances as they performed everyday tasks. We found that blur magnitude increases with increasing eccentricity. There is a vertical gradient in the distances that generate defocus blur: Blur below the fovea is generally due to scene points nearer than fixation; blur above the fovea is mostly due to points farther than fixation. There is no systematic horizontal gradient. Large blurs are generally caused by points farther rather than nearer than fixation. Consistent with the statistics, participants in a perceptual experiment perceived vertical blur gradients as slanted top-back whereas horizontal gradients were perceived equally as left-back and right-back. The tendency for people to see sharp as near and blurred as far is also consistent with the observed statistics. We calculated how many observations will be perceived as unsharp and found that perceptible blur is rare. Finally, we found that eye shape in ground-dwelling animals conforms to that required to put likely distances in best focus.
Statistical analysis of planetary surfaces
NASA Astrophysics Data System (ADS)
Schmidt, Frederic; Landais, Francois; Lovejoy, Shaun
2015-04-01
In recent decades, a huge amount of topographic data has been obtained by several techniques (laser and radar altimetry, DTM…) for different bodies in the solar system, including the Earth, Mars, and the Moon. In each case, topographic fields exhibit extremely high variability with details at every scale, from millimetres to thousands of kilometres. This complexity seems to prohibit global descriptions or global topography models. Nevertheless, this topographic complexity is well known to exhibit scaling laws that establish a similarity between scales and permit simpler descriptions and models. Indeed, efficient simulations can be made using the statistical properties of scaling fields (fractals). But realistic simulations of global topographic fields must exhibit multi- (not mono-) scaling behaviour, reflecting the extreme variability and intermittency observed in real fields, which cannot be generated by simple scaling models. A multiscaling theory has been developed in order to model high variability and intermittency. This theory is a good statistical candidate to model the topography field with a limited number of parameters (called the multifractal parameters). In our study, we show that the statistical properties of the Martian topography are accurately reproduced by this model, leading to new interpretations of geomorphological processes.
Environmental statistics and optimal regulation
NASA Astrophysics Data System (ADS)
Sivak, David; Thomson, Matt
2015-03-01
The precision with which an organism can detect its environment, and the timescale for and statistics of environmental change, will affect the suitability of different strategies for regulating protein levels in response to environmental inputs. We propose a general framework--here applied to the enzymatic regulation of metabolism in response to changing nutrient concentrations--to predict the optimal regulatory strategy given the statistics of fluctuations in the environment and measurement apparatus, and the costs associated with enzyme production. We find: (i) relative convexity of enzyme expression cost and benefit influences the fitness of thresholding or graded responses; (ii) intermediate levels of measurement uncertainty call for a sophisticated Bayesian decision rule; and (iii) in dynamic contexts, intermediate levels of uncertainty call for retaining memory of the past. Statistical properties of the environment, such as variability and correlation times, set optimal biochemical parameters, such as thresholds and decay rates in signaling pathways. Our framework provides a theoretical basis for interpreting molecular signal processing algorithms and a classification scheme that organizes known regulatory strategies and may help conceptualize heretofore unknown ones.
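The "sophisticated Bayesian decision rule" at intermediate measurement uncertainty can be illustrated with a two-state sketch; all numbers here are illustrative assumptions, not parameters from the paper:

```python
import numpy as np

# Two environment states (low/high nutrient) with Gaussian measurement
# noise; the cell computes the posterior probability of the high state
# from a noisy measurement m and expresses enzyme when that is likely.
p_high, mu_low, mu_high, sigma = 0.5, 1.0, 3.0, 1.0

def posterior_high(m):
    """P(high | m) via Bayes' rule with Gaussian likelihoods."""
    like_high = np.exp(-(m - mu_high) ** 2 / (2 * sigma**2))
    like_low = np.exp(-(m - mu_low) ** 2 / (2 * sigma**2))
    return p_high * like_high / (p_high * like_high + (1 - p_high) * like_low)

# With equal priors the posterior crosses 0.5 exactly at the midpoint of
# the two state means; a larger sigma makes the crossover more graded,
# which is the thresholding-versus-graded-response trade-off above.
```
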
Statistical ring current of Saturn
NASA Astrophysics Data System (ADS)
Carbary, J. F.; Achilleos, N.; Arridge, C. S.
2012-06-01
The statistical ring current of Saturn has been determined from the curl of the median magnetic field derived from over 5 years of observations of the Cassini magnetometer. The main issue addressed here is the calculation of the statistical ring current of Saturn by directly computing, for the first time, the symmetrical part of the ring current J from the Maxwell equation ∇ × B = μ0J from assembling the perturbation magnetic field B from 2004 through 2010. This study validates previous studies, based on fewer data and not using ∇ × B, and shows that the ring current flows eastward (in the +ϕ or corotation direction) and extends from ˜3 RS to at least ˜20 RS (1 RS = 60,268 km), which is the vicinity of the dayside magnetopause; that the ring current has a peak strength of ˜75 pA/m2 at ˜9.5 RS; and that the ring current has a half-width of ˜1.5 RS. Two outcomes of this study are that the ring current bends northward, as suggested by the “bowl” model of Saturn's plasma sheet, and that the total ring current is 9.2 ± 1.0 MA. In the context of future endeavors, the statistical ring current presented here can be used for calculations of the magnetic field of Saturn for particle drifts, field line mapping, and J × B force.
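As a check on the ∇ × B = μ0 J extraction used in this study, the sketch below numerically recovers a uniform axial current density from its azimuthal field. This is a textbook geometry, not Saturn's azimuthal ring current, but the same principle; the 75 pA/m² magnitude is borrowed from the abstract only for scale:

```python
import numpy as np

MU0 = 4e-7 * np.pi   # vacuum permeability (T m / A)

# A uniform axial current density J_z inside a cylinder produces an
# azimuthal field B_phi = mu0 * J_z * r / 2 (Ampere's law).  Invert
# numerically via the axisymmetric curl: J_z = d(r*B_phi)/dr / (mu0*r).
J_true = 75e-12                          # 75 pA/m^2, for scale
R_S = 6.0268e7                           # Saturn radius in metres
r = np.linspace(0.1, 1.0, 200) * R_S     # radial grid, 0.1-1 R_S
B_phi = MU0 * J_true * r / 2.0

J_rec = np.gradient(r * B_phi, r, edge_order=2) / (MU0 * r)
```

The study's actual computation uses the azimuthal component of the curl on a statistically assembled median field, but the discrete-derivative step is the same idea.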
Statistical theory of Internet exploration
NASA Astrophysics Data System (ADS)
Dall'Asta, Luca; Alvarez-Hamelin, Ignacio; Barrat, Alain; Vázquez, Alexei; Vespignani, Alessandro
2005-03-01
The general methodology used to construct Internet maps consists in merging all the discovered paths obtained by sending data packets from a set of active computers to a set of destination hosts, obtaining a graph-like representation of the network. This technique, sometimes referred to as Internet tomography, raises the question of the statistical reliability of such empirical maps. We tackle this problem by modeling the network sampling process on synthetic graphs and by using a mean-field approximation to obtain expressions for the probability of edge and vertex detection in the sampled graph. This allows a general understanding of the origin of possible sampling biases. In particular, we find a direct dependence of the map's statistical accuracy upon the topological properties (in particular, the betweenness centrality) of the underlying network. In this framework, it appears that statistically heterogeneous network topologies are captured better than homogeneous ones during the mapping process. Finally, the analytical discussion is complemented with a thorough numerical investigation of simulated mapping strategies in network models with varying topological properties.
Statistical mechanics of economics I
NASA Astrophysics Data System (ADS)
Kusmartsev, F. V.
2011-02-01
We show that statistical mechanics is useful in the description of financial crises and economics. Taking a large number of instant snapshots of a market over an interval of time, we construct their ensembles and study their statistical interference. This results in a probabilistic description of the market and gives capital, money, income, wealth, and debt distributions, which in most cases take the form of the Bose-Einstein distribution. In addition, statistical mechanics provides the main market equations and laws which govern the correlations between the amount of money, debt, product, prices, and the number of retailers. We applied the relations found to a study of the evolution of the economy of the USA between 1996 and 2008 and observe that over that time the income of the majority of the population is well described by the Bose-Einstein distribution, whose parameters differ from year to year. Each financial crisis corresponds to a peak in the absolute activity coefficient. The analysis correctly indicates the past crises and predicts the future one.
Statistical Behavior of Filamentary Plasmas
NASA Astrophysics Data System (ADS)
Kinney, Rodney Michael
This work describes a study of plasmas with highly intermittent filamentary structures. A statistical model of two-dimensional magnetohydrodynamics is presented, based on a representation of the fluid as a collection of discrete current-vorticity concentrations. This approach is modeled after discrete vortex models of hydrodynamical turbulence, which cannot be expected in general to produce results identical to a theory based on a Fourier decomposition of the fields. In a highly intermittent plasma, the induction force is small compared to the convective motion, and when this force is neglected, the plasma vortex system is described by a Hamiltonian. Canonical and micro-canonical statistical calculations show that both the vorticity and the current may exhibit large-scale structure, and the expected states revert to known hydrodynamical states as the magnetic field vanishes. These results differ from previous Fourier-based statistical theories, but it is found that when the filament calculation is expanded to include the inductive force, the results approach the Fourier equilibria in the low-temperature limit, and the previous Hamiltonian plasma vortex results in the high-temperature limit. Numerical simulations of a large number of filaments are carried out and support the theory. A three-dimensional vortex model is outlined as well, which is also Hamiltonian when the inductive force is neglected. A statistical calculation in the canonical ensemble and numerical simulations show that a non-zero large-scale magnetic field is statistically favored, and that the preferred shape of this field is a long, thin tube of flux. In a tokamak, a stochastic magnetic field will give rise to strongly filamented current distributions. An external magnetic field possesses field lines described by a non-linear map, while current fluctuations along these field lines have a toroidal dependence which takes the same form as the time dependence of a system of hydrodynamical vortices.
What can we learn from noise? - Mesoscopic nonequilibrium statistical physics.
Kobayashi, Kensuke
2016-01-01
Mesoscopic systems, small electric circuits working in the quantum regime, offer us a unique experimental stage to explore quantum transport in a tunable and precise way. The purpose of this Review is to show how they can contribute to statistical physics. We introduce the significance of fluctuation, or equivalently noise, as noise measurement enables us to address the fundamental aspects of a physical system. The significance of the fluctuation theorem (FT) in statistical physics is noted. We explain what information can be deduced from current noise measurements in mesoscopic systems. As an important application of noise measurement to statistical physics, we describe our experimental work on the current and current noise in an electron interferometer, which is the first experimental test of the FT in the quantum regime. Our attempt will shed new light on the research field of mesoscopic quantum statistical physics. PMID:27477456
[Submitting studies without significant results].
Texier, Gaëtan; Meynard, Jean-Baptiste; Michel, Rémy; Migliani, René; Boutin, Jean-Paul
2007-03-01
When a study finds that no exposure factor or therapy is significantly related to a given effect, researchers legitimately wonder whether the results should be submitted for publication, and to what journal. Clinical trials that report significant associations have a higher probability of publication, a phenomenon known as selective publication. The principal reasons for this selective publication include author self-censorship, peer review, trials not intended for publication, interpretation of the p value, the cost of journal subscriptions, and policies. Subsequent reviews and meta-analyses are biased by the unavailability of nonsignificant results. Suggestions for preventing this risk include university training, trial registries, the International Standard Randomised Controlled Trial Number (ISRCTN), the Cochrane Collaboration, and the gray literature. Journals (including electronic journals) interested in studies with nonsignificant results are listed. New technologies are changing the relations between publishers, libraries, authors, and readers. PMID:17287106
Further developments in cloud statistics for computer simulations
NASA Technical Reports Server (NTRS)
Chang, D. T.; Willand, J. H.
1972-01-01
This study is part of NASA's continued program to provide global statistics of cloud parameters for computer simulation. The primary emphasis was on the development of the data bank of the global statistical distributions of cloud types and cloud layers and their applications in the simulation of the vertical distributions of in-cloud parameters such as liquid water content. These statistics were compiled from actual surface observations as recorded in Standard WBAN forms. Data for a total of 19 stations were obtained and reduced. These stations were selected to be representative of the 19 primary cloud climatological regions defined in previous studies of cloud statistics. Using the data compiled in this study, a limited study was conducted of the homogeneity of cloud regions, the latitudinal dependence of cloud-type distributions, the dependence of these statistics on sample size, and other factors in the statistics which are of significance to the problem of simulation. The application of the statistics in cloud simulation was investigated. In particular, the inclusion of the new statistics in an expanded multi-step Monte Carlo simulation scheme is suggested and briefly outlined.
Insights into Corona Formation Through Statistical Analyses
NASA Technical Reports Server (NTRS)
Glaze, L. S.; Stofan, E. R.; Smrekar, S. E.; Baloga, S. M.
2002-01-01
Statistical analysis of an expanded database of coronae on Venus indicates that the populations of Type 1 (with fracture annuli) and 2 (without fracture annuli) corona diameters are statistically indistinguishable, and therefore we have no basis for assuming different formation mechanisms. Analysis of the topography and diameters of coronae shows that coronae that are depressions, rimmed depressions, and domes tend to be significantly smaller than those that are plateaus, rimmed plateaus, or domes with surrounding rims. This is consistent with the model of Smrekar and Stofan and inconsistent with predictions of the spreading drop model of Koch and Manga. The diameter range for domes, the initial stage of corona formation, provides a broad constraint on the buoyancy of corona-forming plumes. Coronae are only slightly more likely to be topographically raised than depressions, with Type 1 coronae most frequently occurring as rimmed depressions and Type 2 coronae most frequently occurring with flat interiors and raised rims. Most Type 1 coronae are located along chasmata systems or fracture belts, while Type 2 coronae are found predominantly as isolated features in the plains. Coronae at hot spot rises tend to be significantly larger than coronae in other settings, consistent with a hotter upper mantle at hot spot rises and their active state.
Statistical methods of estimating mining costs
Long, K.R.
2011-01-01
Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.
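The "Taylor's Rule" re-estimation described above relates optimum operating rate to available ore tonnage. A minimal sketch of the rule in its classic textbook form is given below; the 0.2 coefficient and fourth-root exponent are Taylor's original published values, not the USGS re-estimated ones, and the 350 operating days per year is a common illustrative assumption:

```python
def taylor_mine_life(tonnage: float) -> float:
    """Taylor's Rule, classic form: optimum mine life in years scales
    with the fourth root of ore tonnage (tonnes). Coefficient 0.2 is
    Taylor's original; a re-estimation would yield different values."""
    return 0.2 * tonnage ** 0.25

def operating_rate(tonnage: float, days_per_year: int = 350) -> float:
    """Implied daily throughput (tonnes/day) over that mine life."""
    return tonnage / (taylor_mine_life(tonnage) * days_per_year)

# Example: a 100-million-tonne deposit.
life = taylor_mine_life(100e6)   # 0.2 * (1e8)**0.25 = 20 years
rate = operating_rate(100e6)     # ~14,300 t/day
```

Cost models of the kind the abstract describes then regress capital and operating costs against this processing rate together with deposit-specific variables such as strip ratio and distance to infrastructure.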
Statistical anisotropies in gravitational waves in solid inflation
Akhshik, Mohammad; Emami, Razieh; Firouzjahi, Hassan; Wang, Yi E-mail: emami@ipm.ir E-mail: yw366@cam.ac.uk
2014-09-01
Solid inflation can support a long period of anisotropic inflation. We calculate the statistical anisotropies in the scalar and tensor power spectra and their cross-correlation in anisotropic solid inflation. The tensor-scalar cross-correlation can be either positive or negative, which impacts the statistical anisotropies of the TT and TB spectra in the CMB map more significantly compared with the tensor self-correlation. The tensor power spectrum contains potentially comparable contributions from quadrupole and octopole angular patterns, which is different from the scalar power spectrum, the cross-correlation, or the scalar bispectrum, where the quadrupole-type statistical anisotropy dominates over the octopole.
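A common generic parametrization of such statistical anisotropy (a sketch of the standard form, not the specific result derived in this paper) expands the power spectrum in Legendre polynomials of the angle between the wavevector and a preferred direction:

```latex
P(\mathbf{k}) = P_0(k)\left[\,1 + \sum_{L\ge 1} c_L \, \mathcal{P}_L\!\left(\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}\right)\right]
```

Here $P_0(k)$ is the isotropic spectrum, $\hat{\mathbf{n}}$ the preferred direction, and $\mathcal{P}_L$ the Legendre polynomials; $L=2$ is the quadrupole pattern and $L=3$ the octopole. In this language, the abstract's statement is that for the tensor spectrum the $c_2$ and $c_3$ contributions can be comparable, whereas for the scalar spectrum, the cross-correlation, and the scalar bispectrum the $c_2$ (quadrupole) term dominates.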
Statistical limitations in functional neuroimaging. II. Signal detection and statistical inference.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
The field of functional neuroimaging (FNI) methodology has developed into a mature but evolving area of knowledge and its applications have been extensive. A general problem in the analysis of FNI data is finding a signal embedded in noise. This is sometimes called signal detection. Signal detection theory focuses in general on issues relating to the optimization of conditions for separating the signal from noise. When methods from probability theory and mathematical statistics are directly applied in this procedure it is also called statistical inference. In this paper we briefly discuss some aspects of signal detection theory relevant to FNI and, in addition, some common approaches to statistical inference used in FNI. Low-pass filtering in relation to functional-anatomical variability and some effects of filtering on signal detection of interest to FNI are discussed. Also, some general aspects of hypothesis testing and statistical inference are discussed. This includes the need for characterizing the signal in data when the null hypothesis is rejected, the problem of multiple comparisons that is central to FNI data analysis, omnibus tests and some issues related to statistical power in the context of FNI. In turn, random field, scale space, non-parametric and Monte Carlo approaches are reviewed, representing the most common approaches to statistical inference used in FNI. Complementary to these issues an overview and discussion of non-inferential descriptive methods, common statistical models and the problem of model selection is given in a companion paper. In general, model selection is an important prelude to subsequent statistical inference. The emphasis in both papers is on the assumptions and inherent limitations of the methods presented. Most of the methods described here generally serve their purposes well when the inherent assumptions and limitations are taken into account. Significant differences in results between different methods are most apparent in
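The multiple-comparisons problem that the paper calls central to FNI data analysis can be illustrated with a toy simulation (a generic sketch, not FNI-specific code): testing many independent null "voxels" at an uncorrected alpha of 0.05 produces spurious detections at roughly the 5% rate, which is why corrections such as Bonferroni (dividing alpha by the number of tests) or the random field and non-parametric approaches reviewed in the paper are needed.

```python
import random
import statistics

random.seed(0)

N_VOXELS = 1000      # toy "voxels", all pure noise: every null is true
N_SUBJECTS = 20
T_CRIT = 2.093       # two-sided 5% critical t value, 19 degrees of freedom

def t_stat(values):
    """One-sample t statistic against a zero mean."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return m / (s / len(values) ** 0.5)

false_positives = 0
for _ in range(N_VOXELS):
    noise = [random.gauss(0.0, 1.0) for _ in range(N_SUBJECTS)]
    if abs(t_stat(noise)) > T_CRIT:
        false_positives += 1

# Uncorrected testing: expect roughly 5% spurious "activations",
# i.e. on the order of 50 of the 1000 null voxels.
print(false_positives)
```

A Bonferroni-corrected threshold for the same family of tests would use alpha / N_VOXELS per voxel, at the cost of statistical power, which is exactly the trade-off the omnibus and random-field methods in the paper try to improve on.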
The use and misuse of statistics in space physics
NASA Technical Reports Server (NTRS)
Reiff, Patricia H.
1990-01-01
This paper presents several statistical techniques most commonly used in space physics, including Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superposed epoch analysis, and presents tests to assess the significance of the results. New techniques such as bootstrapping and jackknifing are presented. When no test of significance is in common usage, a plausible test is suggested.
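Bootstrapping, one of the resampling techniques the paper introduces, can be sketched as follows (a generic illustration, not the paper's own example): resample the observed pairs with replacement many times, recompute the statistic of interest each time — here a linear correlation coefficient — and read a confidence interval off the empirical distribution.

```python
import random
import statistics

def pearson_r(x, y):
    """Linear (Pearson) correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, rng=random):
    """Percentile bootstrap confidence interval for the correlation:
    resample (x, y) pairs with replacement, collect the statistic,
    and take the empirical alpha/2 and 1 - alpha/2 quantiles."""
    n = len(x)
    rs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rs.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    rs.sort()
    lo = rs[int(alpha / 2 * n_boot)]
    hi = rs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

if __name__ == "__main__":
    # Illustrative synthetic data with a strong linear relationship.
    random.seed(1)
    x = [random.gauss(0, 1) for _ in range(100)]
    y = [xi + random.gauss(0, 0.5) for xi in x]
    print(bootstrap_ci(x, y))  # an interval that should exclude zero
```

If the resulting interval excludes zero, the correlation is deemed significant at the chosen level without assuming any particular sampling distribution for the statistic, which is the appeal of the method when the usual parametric tests do not apply.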