Liu, Yang; Vijver, Martina G; Qiu, Hao; Baas, Jan; Peijnenburg, Willie J G M
2015-12-01
There is increasing attention from scientists and policy makers to the joint effects of multiple metals on organisms when present in a mixture. Using root elongation of lettuce (Lactuca sativa L.) as a toxicity endpoint, the combined effects of binary mixtures of Cu, Cd, and Ni were studied. The statistical MixTox model was used to search deviations from the reference models i.e. concentration addition (CA) and independent action (IA). The deviations were subsequently interpreted as 'interactions'. A comprehensive experiment was designed to test the reproducibility of the 'interactions'. The results showed that the toxicity of binary metal mixtures was equally well predicted by both reference models. We found statistically significant 'interactions' in four of the five total datasets. However, the patterns of 'interactions' were found to be inconsistent or even contradictory across the different independent experiments. It is recommended that a statistically significant 'interaction', must be treated with care and is not necessarily biologically relevant. Searching a statistically significant interaction can be the starting point for further measurements and modeling to advance the understanding of underlying mechanisms and non-additive interactions occurring inside the organisms. PMID:26188643
Statistical Significance Testing.
ERIC Educational Resources Information Center
McLean, James E., Ed.; Kaufman, Alan S., Ed.
1998-01-01
The controversy about the use or misuse of statistical significance testing has become the major methodological issue in educational research. This special issue contains three articles that explore the controversy, three commentaries on these articles, an overall response, and three rejoinders by the first three authors. They are: (1)…
Lack of Statistical Significance
ERIC Educational Resources Information Center
Kehle, Thomas J.; Bray, Melissa A.; Chafouleas, Sandra M.; Kawano, Takuji
2007-01-01
Criticism has been leveled against the use of statistical significance testing (SST) in many disciplines. However, the field of school psychology has been largely devoid of critiques of SST. Inspection of the primary journals in school psychology indicated numerous examples of SST with nonrandom samples and/or samples of convenience. In this…
Statistical or biological significance?
Saxon, Emma
2015-01-01
Oat plants grown at an agricultural research facility produce higher yields in Field 1 than in Field 2, under well fertilised conditions and with similar weather exposure; all oat plants in both fields are healthy and show no sign of disease. In this study, the authors hypothesised that the soil microbial community might be different in each field, and these differences might explain the difference in oat plant growth. They carried out a metagenomic analysis of the 16 s ribosomal 'signature' sequences from bacteria in 50 randomly located soil samples in each field to determine the composition of the bacterial community. The study identified >1000 species, most of which were present in both fields. The authors identified two plant growth-promoting species that were significantly reduced in soil from Field 2 (Student's t-test P < 0.05), and concluded that these species might have contributed to reduced yield. PMID:26541972
Statistically significant relational data mining :
Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.
2014-02-01
This report summarizes the work performed under the project (3z(BStatitically significant relational data mining.(3y (BThe goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concetrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second are statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.
Significant results: statistical or clinical?
2016-01-01
The null hypothesis significance test method is popular in biological and medical research. Many researchers have used this method for their research without exact knowledge, though it has both merits and shortcomings. Readers will know its shortcomings, as well as several complementary or alternative methods, as such the estimated effect size and the confidence interval. PMID:27066201
Statistical significance of the gallium anomaly
Giunti, Carlo; Laveder, Marco
2011-06-15
We calculate the statistical significance of the anomalous deficit of electron neutrinos measured in the radioactive source experiments of the GALLEX and SAGE solar neutrino detectors, taking into account the uncertainty of the detection cross section. We found that the statistical significance of the anomaly is {approx}3.0{sigma}. A fit of the data in terms of neutrino oscillations favors at {approx}2.7{sigma} short-baseline electron neutrino disappearance with respect to the null hypothesis of no oscillations.
Statistical Significance vs. Practical Significance: An Exploration through Health Education
ERIC Educational Resources Information Center
Rosen, Brittany L.; DeMaria, Andrea L.
2012-01-01
The purpose of this paper is to examine the differences between statistical and practical significance, including strengths and criticisms of both methods, as well as provide information surrounding the application of various effect sizes and confidence intervals within health education research. Provided are recommendations, explanations and…
Comments on the Statistical Significance Testing Articles.
ERIC Educational Resources Information Center
Knapp, Thomas R.
1998-01-01
Expresses a "middle-of-the-road" position on statistical significance testing, suggesting that it has its place but that confidence intervals are generally more useful. Identifies 10 errors of omission or commission in the papers reviewed that weaken the positions taken in their discussions. (SLD)
Statistical significance of normalized global alignment.
Peris, Guillermo; Marzal, Andrés
2014-03-01
The comparison of homologous proteins from different species is a first step toward a function assignment and a reconstruction of the species evolution. Though local alignment is mostly used for this purpose, global alignment is important for constructing multiple alignments or phylogenetic trees. However, statistical significance of global alignments is not completely clear, lacking a specific statistical model to describe alignments or depending on computationally expensive methods like Z-score. Recently we presented a normalized global alignment, defined as the best compromise between global alignment cost and length, and showed that this new technique led to better classification results than Z-score at a much lower computational cost. However, it is necessary to analyze the statistical significance of the normalized global alignment in order to be considered a completely functional algorithm for protein alignment. Experiments with unrelated proteins extracted from the SCOP ASTRAL database showed that normalized global alignment scores can be fitted to a log-normal distribution. This fact, obtained without any theoretical support, can be used to derive statistical significance of normalized global alignments. Results are summarized in a table with fitted parameters for different scoring schemes. PMID:24400820
Assessing the statistical significance of periodogram peaks
NASA Astrophysics Data System (ADS)
Baluev, R. V.
2008-04-01
The least-squares (or Lomb-Scargle) periodogram is a powerful tool that is routinely used in many branches of astronomy to search for periodicities in observational data. The problem of assessing the statistical significance of candidate periodicities for a number of periodograms is considered. Based on results in extreme value theory, improved analytic estimations of false alarm probabilities are given. These include an upper limit to the false alarm probability (or a lower limit to the significance). The estimations are tested numerically in order to establish regions of their practical applicability.
Social significance of community structure: Statistical view
NASA Astrophysics Data System (ADS)
Li, Hui-Jia; Daniels, Jasmine J.
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p -value theory and network analysis, and then we obtain a significance measure of statistical form . Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc.
Social significance of community structure: statistical view.
Li, Hui-Jia; Daniels, Jasmine J
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p-value theory and network analysis, and then we obtain a significance measure of statistical form . Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc. PMID:25679651
Statistical Significance of Trends in Exoplanetary Atmospheres
NASA Astrophysics Data System (ADS)
Harrington, Joseph; Bowman, M.; Blumenthal, S. D.; Loredo, T. J.; UCF Exoplanets Group
2013-10-01
Cowan and Agol (2011) and we (Harrington et al. 2007, 2010, 2011, 2012, 2013) have noted that at higher equilibrium temperatures, observed exoplanet fluxes are substantially higher than even the elevated equilibrium temperature predicts. With a substantial increase in the number of atmospheric flux measurements, we can now test the statistical significance of this trend. We can also cast the data on a variety of axes to search further for the physics behind both the jump in flux above about 2000 K and the wide scatter in fluxes at all temperatures. This work was supported by NASA Planetary Atmospheres grant NNX12AI69G and NASA Astrophysics Data Analysis Program grant NNX13AF38G.
On the statistical significance of climate trends
NASA Astrophysics Data System (ADS)
Franzke, Christian
2010-05-01
One of the major problems in climate science is the prediction of future climate change due to anthropogenic green-house gas emissions. The earth's climate is not changing in a uniform way because it is a complex nonlinear system of many interacting components. The overall warming trend can be interrupted by cooling periods due to natural variability. Thus, in order to statistically distinguish between internal climate variability and genuine trends one has to assume a certain null model of the climate variability. Traditionally a short-range, and not a long-range, dependent null model is chosen. Here I show evidence for the first time that temperature data at 8 stations across Antarctica are long-range dependent and that the choice of a long-range, rather than a short-range, dependent null model negates the statistical significance of temperature trends at 2 out of 3 stations. These results show the short comings of traditional trend analysis and imply that more attention should be given to the correlation structure of climate data, in particular if they are long-range dependent. In this study I use the Empirical Mode Decomposition (EMD) to decompose the univariate temperature time series into a finite number of Intrinsic Mode Functions (IMF) and an instantaneous mean. While there is no unambiguous definition of a trend, in this study we interpret the instantaneous mean as a trend which is possibly nonlinear. The EMD method has been shown to be a powerful method for extracting trends from noisy and nonlinear time series. I will show that this way of identifying trends is superior to the traditional linear least-square fits.
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance
Kramer, Karen L.; Veile, Amanda; Otárola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children’s growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children’s monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children’s growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children’s growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children’s growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance. PMID:26938742
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
ERIC Educational Resources Information Center
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Reviewer Bias for Statistically Significant Results: A Reexamination.
ERIC Educational Resources Information Center
Fagley, N. S.; McKinney, I. Jean
1983-01-01
Reexamines the article by Atkinson, Furlong, and Wampold (1982) and questions their conclusion that reviewers were biased toward statistically significant results. A statistical power analysis shows the power of their bogus study was low. Low power in a study reporting nonsignificant findings is a valid reason for recommending not to publish.…
Advances in Testing the Statistical Significance of Mediation Effects
ERIC Educational Resources Information Center
Mallinckrodt, Brent; Abraham, W. Todd; Wei, Meifen; Russell, Daniel W.
2006-01-01
P. A. Frazier, A. P. Tix, and K. E. Barron (2004) highlighted a normal theory method popularized by R. M. Baron and D. A. Kenny (1986) for testing the statistical significance of indirect effects (i.e., mediator variables) in multiple regression contexts. However, simulation studies suggest that this method lacks statistical power relative to some…
Statistical significance test for transition matrices of atmospheric Markov chains
NASA Technical Reports Server (NTRS)
Vautard, Robert; Mo, Kingtse C.; Ghil, Michael
1990-01-01
Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
Decadal power in land air temperatures: Is it statistically significant?
NASA Astrophysics Data System (ADS)
Thejll, Peter A.
2001-12-01
The geographical distribution and properties of the well-known 10-11 year signal in terrestrial temperature records is investigated. By analyzing the Global Historical Climate Network data for surface air temperatures we verify that the signal is strongest in North America and is similar in nature to that reported earlier by R. G. Currie. The decadal signal is statistically significant for individual stations, but it is not possible to show that the signal is statistically significant globally, using strict tests. In North America, during the twentieth century, the decadal variability in the solar activity cycle is associated with the decadal part of the North Atlantic Oscillation index series in such a way that both of these signals correspond to the same spatial pattern of cooling and warming. A method for testing statistical results with Monte Carlo trials on data fields with specified temporal structure and specific spatial correlation retained is presented.
Your Chi-Square Test Is Statistically Significant: Now What?
ERIC Educational Resources Information Center
Sharpe, Donald
2015-01-01
Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…
A Comparison of Statistical Significance Tests for Selecting Equating Functions
ERIC Educational Resources Information Center
Moses, Tim
2009-01-01
This study compared the accuracies of nine previously proposed statistical significance tests for selecting identity, linear, and equipercentile equating functions in an equivalent groups equating design. The strategies included likelihood ratio tests for the loglinear models of tests' frequency distributions, regression tests, Kolmogorov-Smirnov…
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Assigning statistical significance to proteotypic peptides via database searches.
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2011-02-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId's knowledge database to include proteotypic information, utilized RAId's statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId's programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those that occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Addition of Cryoprotectant Significantly Alters the Epididymal Sperm Proteome
Yoon, Sung-Jae; Rahman, Md Saidur; Kwon, Woo-Sung; Park, Yoo-Jin; Pang, Myung-Geol
2016-01-01
Although cryopreservation has been developed and optimized over the past decades, it causes various stresses, including cold shock, osmotic stress, and ice crystal formation, thereby reducing fertility. During cryopreservation, addition of cryoprotective agent (CPA) is crucial for protecting spermatozoa from freezing damage. However, the intrinsic toxicity and osmotic stress induced by CPA cause damage to spermatozoa. To identify the effects of CPA addition during cryopreservation, we assessed the motility (%), motion kinematics, capacitation status, and viability of epididymal spermatozoa using computer-assisted sperm analysis and Hoechst 33258/chlortetracycline fluorescence staining. Moreover, the effects of CPA addition were also demonstrated at the proteome level using two-dimensional electrophoresis. Our results demonstrated that CPA addition significantly reduced sperm motility (%), curvilinear velocity, viability (%), and non-capacitated spermatozoa, whereas straightness and acrosome-reacted spermatozoa increased significantly (p < 0.05). Ten proteins were differentially expressed (two decreased and eight increased) (>3 fold, p < 0.05) after CPA, whereas NADH dehydrogenase flavoprotein 2, f-actin-capping protein subunit beta, superoxide dismutase 2, and outer dense fiber protein 2 were associated with several important signaling pathways (p < 0.05). The present study provides a mechanistic basis for specific cryostresses and potential markers of CPA-induced stress. Therefore, these might provide information about the development of safe biomaterials for cryopreservation and basic ground for sperm cryopreservation. PMID:27031703
Statistical significance of climate sensitivity predictors obtained by data mining
NASA Astrophysics Data System (ADS)
Caldwell, Peter M.; Bretherton, Christopher S.; Zelinka, Mark D.; Klein, Stephen A.; Santer, Benjamin D.; Sanderson, Benjamin M.
2014-03-01
Several recent efforts to estimate Earth's equilibrium climate sensitivity (ECS) focus on identifying quantities in the current climate which are skillful predictors of ECS yet can be constrained by observations. This study automates the search for observable predictors using data from phase 5 of the Coupled Model Intercomparison Project. The primary focus of this paper is assessing statistical significance of the resulting predictive relationships. Failure to account for dependence between models, variables, locations, and seasons is shown to yield misleading results. A new technique for testing the field significance of data-mined correlations which avoids these problems is presented. Using this new approach, all 41,741 relationships we tested were found to be explainable by chance. This leads us to conclude that data mining is best used to identify potential relationships which are then validated or discarded using physically based hypothesis testing.
Statistical significance across multiple optimization models for community partition
NASA Astrophysics Data System (ADS)
Li, Ju; Li, Hui-Jia; Mao, He-Jin; Chen, Junhua
2016-05-01
The study of community structure is an important problem in a wide range of applications, which can help us understand the real network system deeply. However, due to the existence of random factors and error edges in real networks, how to measure the significance of community structure efficiently is a crucial question. In this paper, we present a novel statistical framework computing the significance of community structure across multiple optimization methods. Different from the universal approaches, we calculate the similarity between a given node and its leader and employ the distribution of link tightness to derive the significance score, instead of a direct comparison to a randomized model. Based on the distribution of community tightness, a new “p-value” form significance measure is proposed for community structure analysis. Specially, the well-known approaches and their corresponding quality functions are unified to a novel general formulation, which facilitates in providing a detailed comparison across them. To determine the position of leaders and their corresponding followers, an efficient algorithm is proposed based on the spectral theory. Finally, we apply the significance analysis to some famous benchmark networks and the good performance verified the effectiveness and efficiency of our framework.
ERIC Educational Resources Information Center
Gordon, Howard R. D.
A random sample of 113 members of the American Vocational Education Research Association (AVERA) was surveyed to obtain baseline information regarding AVERA members' perceptions of statistical significance tests. The Psychometrics Group Instrument was used to collect data from participants. Of those surveyed, 67% were male, 93% had earned a…
Weak additivity principle for current statistics in d dimensions.
Pérez-Espigares, C; Garrido, P L; Hurtado, P I
2016-04-01
The additivity principle (AP) allows one to compute the current distribution in many one-dimensional nonequilibrium systems. Here we extend this conjecture to general d-dimensional driven diffusive systems, and validate its predictions against both numerical simulations of rare events and microscopic exact calculations of three paradigmatic models of diffusive transport in d=2. Crucially, the existence of a structured current vector field at the fluctuating level, coupled to the local mobility, turns out to be essential to understand current statistics in d>1. We prove that, when compared to the straightforward extension of the AP to high d, the so-called weak AP always yields a better minimizer of the macroscopic fluctuation theory action for current statistics. PMID:27176236
Weak additivity principle for current statistics in d dimensions
NASA Astrophysics Data System (ADS)
Pérez-Espigares, C.; Garrido, P. L.; Hurtado, P. I.
2016-04-01
The additivity principle (AP) allows one to compute the current distribution in many one-dimensional nonequilibrium systems. Here we extend this conjecture to general d -dimensional driven diffusive systems, and validate its predictions against both numerical simulations of rare events and microscopic exact calculations of three paradigmatic models of diffusive transport in d =2 . Crucially, the existence of a structured current vector field at the fluctuating level, coupled to the local mobility, turns out to be essential to understand current statistics in d >1 . We prove that, when compared to the straightforward extension of the AP to high d , the so-called weak AP always yields a better minimizer of the macroscopic fluctuation theory action for current statistics.
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
Statistical tests of additional plate boundaries from plate motion inversions
NASA Technical Reports Server (NTRS)
Stein, S.; Gordon, R. G.
1984-01-01
The application of the F-ratio test, a standard statistical technique, to the results of relative plate motion inversions has been investigated. The method tests whether the improvement in fit of the model to the data resulting from the addition of another plate to the model is greater than that expected purely by chance. This approach appears to be useful in determining whether additional plate boundaries are justified. Previous results have been confirmed favoring separate North American and South American plates with a boundary located beween 30 N and the equator. Using Chase's global relative motion data, it is shown that in addition to separate West African and Somalian plates, separate West Indian and Australian plates, with a best-fitting boundary between 70 E and 90 E, can be resolved. These results are generally consistent with the observation that the Indian plate's internal deformation extends somewhat westward of the Ninetyeast Ridge. The relative motion pole is similar to Minster and Jordan's and predicts the NW-SE compression observed in earthquake mechanisms near the Ninetyeast Ridge.
Statistical controversies in clinical research: statistical significance-too much of a good thing ….
Buyse, M; Hurvitz, S A; Andre, F; Jiang, Z; Burris, H A; Toi, M; Eiermann, W; Lindsay, M-A; Slamon, D
2016-05-01
The use and interpretation of P values is a matter of debate in applied research. We argue that P values are useful as a pragmatic guide to interpret the results of a clinical trial, not as a strict binary boundary that separates real treatment effects from lack thereof. We illustrate our point using the result of BOLERO-1, a randomized, double-blind trial evaluating the efficacy and safety of adding everolimus to trastuzumab and paclitaxel as first-line therapy for HER2+ advanced breast cancer. In this trial, the benefit of everolimus was seen only in the predefined subset of patients with hormone receptor-negative breast cancer at baseline (progression-free survival hazard ratio = 0.66, P = 0.0049). A strict interpretation of this finding, based on complex 'alpha splitting' rules to assess statistical significance, led to the conclusion that the benefit of everolimus was not statistically significant either overall or in the subset. We contend that this interpretation does not do justice to the data, and we argue that the benefit of everolimus in hormone receptor-negative breast cancer is both statistically compelling and clinically relevant. PMID:26861602
Statistical downscaling rainfall using artificial neural network: significantly wetter Bangkok?
NASA Astrophysics Data System (ADS)
Vu, Minh Tue; Aribarg, Thannob; Supratid, Siriporn; Raghavan, Srivatsan V.; Liong, Shie-Yui
2015-08-01
Artificial neural network (ANN) is an established technique with a flexible mathematical structure that is capable of identifying complex nonlinear relationships between input and output data. The present study utilizes ANN as a method of statistically downscaling global climate models (GCMs) during the rainy season at meteorological site locations in Bangkok, Thailand. The study illustrates the applications of the feed forward back propagation using large-scale predictor variables derived from both the ERA-Interim reanalyses data and present day/future GCM data. The predictors are first selected over different grid boxes surrounding Bangkok region and then screened by using principal component analysis (PCA) to filter the best correlated predictors for ANN training. The reanalyses downscaled results of the present day climate show good agreement against station precipitation with a correlation coefficient of 0.8 and a Nash-Sutcliffe efficiency of 0.65. The final downscaled results for four GCMs show an increasing trend of precipitation for rainy season over Bangkok by the end of the twenty-first century. The extreme values of precipitation determined using statistical indices show strong increases of wetness. These findings will be useful for policy makers in pondering adaptation measures due to flooding such as whether the current drainage network system is sufficient to meet the changing climate and to plan for a range of related adaptation/mitigation measures.
Stork, LeAnna M.; Gennings, Chris; Carchman, Richard; Carter, Jr., Walter H.; Pounds, Joel G.; Mumtaz, Moiz
2006-12-01
Several assumptions, defined and undefined, are used in the toxicity assessment of chemical mixtures. In scientific practice mixture components in the low-dose region, particularly subthreshold doses, are often assumed to behave additively (i.e., zero interaction) based on heuristic arguments. This assumption has important implications in the practice of risk assessment, but has not been experimentally tested. We have developed methodology to test for additivity in the sense of Berenbaum (Advances in Cancer Research, 1981), based on the statistical equivalence testing literature where the null hypothesis of interaction is rejected for the alternative hypothesis of additivity when data support the claim. The implication of this approach is that conclusions of additivity are made with a false positive rate controlled by the experimenter. The claim of additivity is based on prespecified additivity margins, which are chosen using expert biological judgment such that small deviations from additivity, which are not considered to be biologically important, are not statistically significant. This approach is in contrast to the usual hypothesis-testing framework that assumes additivity in the null hypothesis and rejects when there is significant evidence of interaction. In this scenario, failure to reject may be due to lack of statistical power making the claim of additivity problematic. The proposed method is illustrated in a mixture of five organophosphorus pesticides that were experimentally evaluated alone and at relevant mixing ratios. Motor activity was assessed in adult male rats following acute exposure. Four low-dose mixture groups were evaluated. Evidence of additivity is found in three of the four low-dose mixture groups.The proposed method tests for additivity of the whole mixture and does not take into account subset interactions (e.g., synergistic, antagonistic) that may have occurred and cancelled each other out.
Effect size, confidence interval and statistical significance: a practical guide for biologists.
Nakagawa, Shinichi; Cuthill, Innes C
2007-11-01
Null hypothesis significance testing (NHST) is the dominant statistical approach in biology, although it has many, frequently unappreciated, problems. Most importantly, NHST does not provide us with two crucial pieces of information: (1) the magnitude of an effect of interest, and (2) the precision of the estimate of the magnitude of that effect. All biologists should be ultimately interested in biological importance, which may be assessed using the magnitude of an effect, but not its statistical significance. Therefore, we advocate presentation of measures of the magnitude of effects (i.e. effect size statistics) and their confidence intervals (CIs) in all biological journals. Combined use of an effect size and its CIs enables one to assess the relationships within data more effectively than the use of p values, regardless of statistical significance. In addition, routine presentation of effect sizes will encourage researchers to view their results in the context of previous research and facilitate the incorporation of results into future meta-analysis, which has been increasingly used as the standard method of quantitative review in biology. In this article, we extensively discuss two dimensionless (and thus standardised) classes of effect size statistics: d statistics (standardised mean difference) and r statistics (correlation coefficient), because these can be calculated from almost all study designs and also because their calculations are essential for meta-analysis. However, our focus on these standardised effect size statistics does not mean unstandardised effect size statistics (e.g. mean difference and regression coefficient) are less important. We provide potential solutions for four main technical problems researchers may encounter when calculating effect size and CIs: (1) when covariates exist, (2) when bias in estimating effect size is possible, (3) when data have non-normal error structure and/or variances, and (4) when data are non
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
ERIC Educational Resources Information Center
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
Smith, Ariana L; Wein, Alan J
2011-05-01
To evaluate the statistical and clinical efficacy of the pharmacological treatments of nocturia using non-antidiuretic agents. A literature review of treatments of nocturia specifically addressing the impact of alpha blockers, 5-alpha reductase inhibitors (5ARI) and antimuscarinics on reduction in nocturnal voids. Despite commonly reported statistically significant results, nocturia has shown a poor clinical response to traditional therapies for benign prostatic hyperplasia including alpha blockers and 5ARI. Similarly, nocturia has shown a poor clinical response to traditional therapies for overactive bladder including antimuscarinics. Statistical success has been achieved in some groups with a variety of alpha blockers and antimuscarinic agents, but the clinical significance of these changes is doubtful. It is likely that other types of therapy will need to be employed in order to achieve a clinically significant reduction in nocturia. PMID:21518417
ERIC Educational Resources Information Center
Monterde-i-Bort, Hector; Frias-Navarro, Dolores; Pascual-Llobell, Juan
2010-01-01
The empirical study we present here deals with a pedagogical issue that has not been thoroughly explored up until now in our field. Previous empirical studies in other sectors have identified the opinions of researchers about this topic, showing that completely unacceptable interpretations have been made of significance tests and other statistical…
ERIC Educational Resources Information Center
Simpson, Robert G.
1981-01-01
Occasionally, differences in test scores seem to indicate that a student performs much better in one reading area than in another when, in reality, the differences may not be statistically significant. The author presents a table in which statistically significant differences between Woodcock test standard scores are identified. (Author)
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
ERIC Educational Resources Information Center
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
ERIC Educational Resources Information Center
Norris, John M.
2015-01-01
Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…
The Importance of Invariance Procedures as against Tests of Statistical Significance.
ERIC Educational Resources Information Center
Fish, Larry
A growing controversy surrounds the strict interpretation of statistical significance tests in social research. Statistical significance tests fail in particular to provide estimates for the stability of research results. Methods that do provide such estimates are known as invariance or cross-validation procedures. Invariance analysis is largely…
A Review of Post-1994 Literature on Whether Statistical Significance Tests Should Be Banned.
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
This paper summarizes the literature regarding statistical significance testing with an emphasis on: (1) the post-1994 literature in various disciplines; (2) alternatives to statistical significance testing; and (3) literature exploring why researchers have demonstrably failed to be influenced by the 1994 American Psychological Association…
NASA Astrophysics Data System (ADS)
Ritter, Axel; Muñoz-Carpena, Rafael
2013-02-01
SummarySuccess in the use of computer models for simulating environmental variables and processes requires objective model calibration and verification procedures. Several methods for quantifying the goodness-of-fit of observations against model-calculated values have been proposed but none of them is free of limitations and are often ambiguous. When a single indicator is used it may lead to incorrect verification of the model. Instead, a combination of graphical results, absolute value error statistics (i.e. root mean square error), and normalized goodness-of-fit statistics (i.e. Nash-Sutcliffe Efficiency coefficient, NSE) is currently recommended. Interpretation of NSE values is often subjective, and may be biased by the magnitude and number of data points, data outliers and repeated data. The statistical significance of the performance statistics is an aspect generally ignored that helps in reducing subjectivity in the proper interpretation of the model performance. In this work, approximated probability distributions for two common indicators (NSE and root mean square error) are derived with bootstrapping (block bootstrapping when dealing with time series), followed by bias corrected and accelerated calculation of confidence intervals. Hypothesis testing of the indicators exceeding threshold values is proposed in a unified framework for statistically accepting or rejecting the model performance. It is illustrated how model performance is not linearly related with NSE, which is critical for its proper interpretation. Additionally, the sensitivity of the indicators to model bias, outliers and repeated data is evaluated. The potential of the difference between root mean square error and mean absolute error for detecting outliers is explored, showing that this may be considered a necessary but not a sufficient condition of outlier presence. The usefulness of the approach for the evaluation of model performance is illustrated with case studies including those with
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
2001-01-01
Summarizes the post-1994 literature in psychology and education regarding statistical significance testing, emphasizing limitations and defenses of statistical testing and alternatives or supplements to statistical significance testing. (SLD)
There's more than one way to conduct a replication study: Beyond statistical significance.
Anderson, Samantha F; Maxwell, Scott E
2016-03-01
As the field of psychology struggles to trust published findings, replication research has begun to become more of a priority to both scientists and journals. With this increasing emphasis placed on reproducibility, it is essential that replication studies be capable of advancing the field. However, we argue that many researchers have been only narrowly interpreting the meaning of replication, with studies being designed with a simple statistically significant or nonsignificant results framework in mind. Although this interpretation may be desirable in some cases, we develop a variety of additional "replication goals" that researchers could consider when planning studies. Even if researchers are aware of these goals, we show that they are rarely used in practice-as results are typically analyzed in a manner only appropriate to a simple significance test. We discuss each goal conceptually, explain appropriate analysis procedures, and provide 1 or more examples to illustrate these analyses in practice. We hope that these various goals will allow researchers to develop a more nuanced understanding of replication that can be flexible enough to answer the various questions that researchers might seek to understand. PMID:26214497
Chládek, J; Brázdil, M; Halámek, J; Plešinger, F; Jurák, P
2013-01-01
We present an off-line analysis procedure for exploring brain activity recorded from intra-cerebral electroencephalographic data (SEEG). The objective is to determine the statistical differences between different types of stimulations in the time-frequency domain. The procedure is based on computing relative signal power change and subsequent statistical analysis. An example of characteristic statistically significant event-related de/synchronization (ERD/ERS) detected across different frequency bands following different oddball stimuli is presented. The method is used for off-line functional classification of different brain areas. PMID:24109865
Armijo-Olivo, Susan; Warren, Sharon; Fuentes, Jorge; Magee, David J
2011-12-01
Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. The assessment of clinical relevance can facilitate the interpretation of the research results into clinical practice. The objective of this study was to explore different methods to evaluate the clinical relevance of the results using a cross-sectional study as an example comparing different neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The evaluation of clinical relevance of the results was performed based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having statistical significance, or to have neither statistical significance nor clinical relevance. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results. PMID:21658987
Fidler, Fiona; Burgman, Mark A; Cumming, Geoff; Buttrose, Robert; Thomason, Neil
2006-10-01
Over the last decade, criticisms of null-hypothesis significance testing have grown dramatically, and several alternative practices, such as confidence intervals, information theoretic, and Bayesian methods, have been advocated. Have these calls for change had an impact on the statistical reporting practices in conservation biology? In 2000 and 2001, 92% of sampled articles in Conservation Biology and Biological Conservation reported results of null-hypothesis tests. In 2005 this figure dropped to 78%. There were corresponding increases in the use of confidence intervals, information theoretic, and Bayesian techniques. Of those articles reporting null-hypothesis testing--which still easily constitute the majority--very few report statistical power (8%) and many misinterpret statistical nonsignificance as evidence for no effect (63%). Overall, results of our survey show some improvements in statistical practice, but further efforts are clearly required to move the discipline toward improved practices. PMID:17002771
NASA Astrophysics Data System (ADS)
Wilks, Daniel S.
1996-04-01
A simple approach to long-range forecasting of monthly or seasonal quantities is as the average of observations over some number of the most recent years. Finding this `optimal climate normal' (OCN) involves examining the relationships between the observed variable and averages of its values over the previous one to 30 years and selecting the averaging period yielding the best results. This procedure involves a multiplicity of comparisons, which will lead to misleadingly positive results for developments data. The statistical significance of these OCNs are assessed here using a resampling procedure, in which time series of U.S. Climate Division data are repeatedly shuffled to produce statistical distributions of forecast performance measures, under the null hypothesis that the OCNs exhibit no predictive skill. Substantial areas in the United States are found for which forecast performance appears to be significantly better than would occur by chance.Another complication in the assessment of the statistical significance of the OCNs derives from the spatial correlation exhibited by the data. Because of this correlation, instances of Type I errors (false rejections of local null hypotheses) will tend to occur with spatial coherency and accordingly have the potential to be confused with regions for which there may be real predictability. The `field significance' of the collections of local tests is also assessed here by simultaneously and coherently shuffling the time series for the Climate Divisions. Areas exhibiting significant local tests are large enough to conclude that seasonal OCN temperature forecasts exhibit significant skill over parts of the United States for all seasons except SON, OND, and NDJ, and that seasonal OCN precipitation forecasts are significantly skillful only in the fall. Statistical significance is weaker for monthly than for seasonal OCN temperature forecasts, and the monthly OCN precipitation forecasts do not exhibit significant predictive
Alphas and Asterisks: The Development of Statistical Significance Testing Standards in Sociology
ERIC Educational Resources Information Center
Leahey, Erin
2005-01-01
In this paper, I trace the development of statistical significance testing standards in sociology by analyzing data from articles published in two prestigious sociology journals between 1935 and 2000. I focus on the role of two key elements in the diffusion literature, contagion and rationality, as well as the role of institutional factors. I…
ERIC Educational Resources Information Center
Snyder, Patricia; Lawson, Stephen
Magnitude of effect measures (MEMs), when adequately understood and correctly used, are important aids for researchers who do not want to rely solely on tests of statistical significance in substantive result interpretation. The MEM tells how much of the dependent variable can be controlled, predicted, or explained by the independent variables.…
ERIC Educational Resources Information Center
Spinella, Sarah
2011-01-01
As result replicability is essential to science and difficult to achieve through external replicability, the present paper notes the insufficiency of null hypothesis statistical significance testing (NHSST) and explains the bootstrap as a plausible alternative, with a heuristic example to illustrate the bootstrap method. The bootstrap relies on…
Statistical Significance, Effect Size, and Replication: What Do the Journals Say?
ERIC Educational Resources Information Center
DeVaney, Thomas A.
2001-01-01
Studied the attitudes of representatives of journals in education, sociology, and psychology through an electronic survey completed by 194 journal representatives. Results suggest that the majority of journals do not have written policies concerning the reporting of results from statistical significance testing, and most indicated that statistical…
Statistical Significance of the Trends in Monthly Heavy Precipitation Over the US
Mahajan, Salil; North, Dr. Gerald R.; Saravanan, Dr. R.; Genton, Dr. Marc G.
2012-01-01
Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall's {tau} test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upwards trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate-system are found to be dominate monthly heavy precipitation as some GCM simulations show a downwards trend even in the twenty-first century projections when the greenhouse gas forcings are strong.
ERIC Educational Resources Information Center
Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.
2011-01-01
In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…
Weighing the costs of different errors when determining statistical significant during monitoring
Technology Transfer Automated Retrieval System (TEKTRAN)
Selecting appropriate significance levels when constructing confidence intervals and performing statistical analyses with rangeland monitoring data is not a straightforward process. This process is burdened by the conventional selection of “95% confidence” (i.e., Type I error rate, a =0.05) as the d...
Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.
ERIC Educational Resources Information Center
Deegear, James
This paper summarizes the literature regarding statistical significant testing with an emphasis on recent literature in various discipline and literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
ERIC Educational Resources Information Center
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significant tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate unless "corrected" effect…
The Effects of Electrode Impedance on Data Quality and Statistical Significance in ERP Recordings
Kappenman, Emily S.; Luck, Steven J.
2010-01-01
To determine whether data quality is meaningfully reduced by high electrode impedance, EEG was recorded simultaneously from low- and high-impedance electrode sites during an oddball task. Low-frequency noise was found to be increased at high-impedance sites relative to low-impedance sites, especially when the recording environment was warm and humid. The increased noise at the high-impedance sites caused an increase in the number of trials needed to obtain statistical significance in analyses of P3 amplitude, but this could be partially mitigated by high-pass filtering and artifact rejection. High electrode impedance did not reduce statistical power for the N1 wave unless the recording environment was warm and humid. Thus, high electrode impedance may increase noise and decrease statistical power under some conditions, but these effects can be reduced by using a cool and dry recording environment and appropriate signal processing methods. PMID:20374541
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Suffredini, Anthony F; Sacks, David B; Yu, Yi-Kuo
2016-02-01
Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html . Graphical Abstract ᅟ. PMID:26510657
NASA Astrophysics Data System (ADS)
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Suffredini, Anthony F.; Sacks, David B.; Yu, Yi-Kuo
2016-02-01
Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple `fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Mass spectrometry-based protein identification with accurate statistical significance assignment
Alves, Gelio; Yu, Yi-Kuo
2015-01-01
Motivation: Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of metadata at any level may propagate to downstream analyses, undermining the validity of scientific conclusions thus drawn. From the perspective of mass spectrometry-based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein level statistics remain challenging. Results: We have constructed a protein ID method that combines peptide evidences of a candidate protein based on a rigorous formula derived earlier; in this formula the database P-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein level E-value, eliminating the need of using empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with other methods tested. Availability and implementation: The source code, implemented in C++ on a linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Contact: yyu@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25362092
Rudd, James; Moore, Jason H.; Urbanowicz, Ryan J.
2013-01-01
Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative to large and complex real world applications such as genetic epidemiology where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own. The compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear. PMID:24358057
On an Additive Semigraphoid Model for Statistical Networks With Application to Pathway Analysis
Li, Bing; Chun, Hyonho; Zhao, Hongyu
2014-01-01
We introduce a nonparametric method for estimating non-gaussian graphical models based on a new statistical relation called additive conditional independence, which is a three-way relation among random vectors that resembles the logical structure of conditional independence. Additive conditional independence allows us to use one-dimensional kernel regardless of the dimension of the graph, which not only avoids the curse of dimensionality but also simplifies computation. It also gives rise to a parallel structure to the gaussian graphical model that replaces the precision matrix by an additive precision operator. The estimators derived from additive conditional independence cover the recently introduced nonparanormal graphical model as a special case, but outperform it when the gaussian copula assumption is violated. We compare the new method with existing ones by simulations and in genetic pathway analysis. PMID:26401064
Kim, In-Soo; Yang, Pil-Sung; Kim, Tae-Hoon; Park, Junbeum; Park, Jin-Kyu; Uhm, Jae Sun; Joung, Boyoung; Lee, Moon Hyoung
2016-01-01
Purpose The clinical significance of post-procedural atrial premature beats immediately after catheter ablation for atrial fibrillation (AF) has not been clearly determined. We hypothesized that the provocation of immediate recurrence of atrial premature beats (IRAPB) and additional ablation improves the clinical outcome of AF ablation. Materials and Methods We enrolled 200 patients with AF (76.5% males; 57.4±11.1 years old; 64.3% paroxysmal AF) who underwent catheter ablation. Post-procedure IRAPB was defined as frequent atrial premature beats (≥6/min) under isoproterenol infusion (5 µg/min), monitored for 10 min after internal cardioversion, and we ablated mappable IRAPBs. Post-procedural IRAPB provocations were conducted in 100 patients. We compared the patients who showed IRAPB with those who did not. We also compared the IRAPB provocation group with 100 age-, sex-, and AF-type-matched patients who completed ablation without provocation (No-Test group). Results 1) Among the post-procedural IRAPB provocation group, 33% showed IRAPB and required additional ablation with a longer procedure time (p=0.001) than those without IRAPB, without increasing the complication rate. 2) During 18.0±6.6 months of follow-up, the patients who showed IRAPB had a worse clinical recurrence rate than those who did not (27.3% vs. 9.0%; p=0.016), in spite of additional IRAPB ablation. 3) However, the clinical recurrence rate was significantly lower in the IRAPB provocation group (15.0%) than in the No-Test group (28.0%; p=0.025) without lengthening of the procedure time or raising complication rate. Conclusion The presence of post-procedural IRAPB was associated with a higher recurrence rate after AF ablation. However, IRAPB provocation and additional ablation might facilitate a better clinical outcome. A further prospective randomized study is warranted. PMID:26632385
On the statistical significance of the bulk flow measured by the Planck satellite
NASA Astrophysics Data System (ADS)
Atrio-Barandela, F.
2013-09-01
A recent analysis of data collected by the Planck satellite detected a net dipole at the location of X-ray selected galaxy clusters, corresponding to a large-scale bulk flow extending at least to z ~ 0.18, the median redshift of the cluster sample. The amplitude of this flow, as measured with Planck, is consistent with earlier findings based on data from the Wilkinson Microwave Anisotropy Probe (WMAP). However, the uncertainty assigned to the dipole by the Planck team is much larger than that found in the WMAP studies, leading the authors of the Planck study to conclude that the observed bulk flow is not statistically significant. Here, we show that two of the three implementations of random sampling used in the error analysis of the Planck study lead to systematic overestimates in the uncertainty of the measured dipole. Random simulations of the sky do not take into account that the actual realization of the sky leads to filtered data that have a 12% lower root-mean-square dispersion than the average simulation. Using rotations around the Galactic pole (the Z axis), increases the uncertainty of the X and Y components of the dipole and artificially reduces the significance of the dipole detection from 98-99% to less than 90% confidence. When either effect is taken into account, the corrected errors agree with those obtained using random distributions of clusters on Planck data, and the resulting statistical significance of the dipole measured by Planck is consistent with that of the WMAP results.
NASA Astrophysics Data System (ADS)
Santer, B. D.; Wigley, T. M. L.; Boyle, J. S.; Gaffen, D. J.; Hnilo, J. J.; Nychka, D.; Parker, D. E.; Taylor, K. E.
2000-03-01
This paper examines trend uncertainties in layer-average free atmosphere temperatures arising from the use of different trend estimation methods. It also considers statistical issues that arise in assessing the significance of individual trends and of trend differences between data sets. Possible causes of these trends are not addressed. We use data from satellite and radiosonde measurements and from two reanalysis projects. To facilitate intercomparison, we compute from reanalyses and radiosonde data temperatures equivalent to those from the satellite-based Microwave Sounding Unit (MSU). We compare linear trends based on minimization of absolute deviations (LA) and minimization of squared deviations (LS). Differences are generally less than 0.05°C/decade over 1959-1996. Over 1979-1993, they exceed 0.10°C/decade for lower tropospheric time series and 0.15°C/decade for the lower stratosphere. Trend fitting by the LA method can degrade the lower-tropospheric trend agreement of 0.03°C/decade (over 1979-1996) previously reported for the MSU and radiosonde data. In assessing trend significance we employ two methods to account for temporal autocorrelation effects. With our preferred method, virtually none of the individual 1979-1993 trends in deep-layer temperatures are significantly different from zero. To examine trend differences between data sets we compute 95% confidence intervals for individual trends and show that these overlap for almost all data sets considered. Confidence intervals for lower-tropospheric trends encompass both zero and the model-projected trends due to anthropogenic effects. We also test the significance of a trend in d(t), the time series of differences between a pair of data sets. Use of d(t) removes variability common to both time series and facilitates identification of small trend differences. This more discerning test reveals that roughly 30% of the data set comparisons have significant differences in lower-tropospheric trends
Crow, C.J.
1985-01-01
Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
A Network-Based Method to Assess the Statistical Significance of Mild Co-Regulation Effects
Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna
2013-01-01
Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis. PMID:24039936
NASA Astrophysics Data System (ADS)
Kellerer-Pirklbauer, Andreas
2016-04-01
Longer data series (e.g. >10 a) of ground temperatures in alpine regions are helpful to improve the understanding regarding the effects of present climate change on distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004 - and more intensively since 2006 - a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c.60 ground temperature (surface and near-surface) monitoring sites which are located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and at longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trend during the observation period, and regional pattern of changes. Calculations and analyses of several different temperature-related parameters were accomplished. At an annual scale a region-wide statistical significant warming during the observation period was revealed by e.g. an increase in mean annual temperature values (mean, maximum) or the significant lowering of the surface frost number (F+). At a seasonal scale no significant trend of any temperature-related parameter was in most cases revealed for spring (MAM) and autumn (SON). Winter (DJF) shows only a weak warming. In contrast, the summer (JJA) season reveals in general a significant warming as confirmed by several different temperature-related parameters such as e.g. mean seasonal temperature, number of thawing degree days, number of freezing degree days, or days without night frost. On a monthly basis August shows the statistically most robust and strongest warming of all months, although regional differences occur. Despite the fact that the general ground temperature warming during the last decade is confirmed by the field data in the study region, complications in trend analyses arise by temperature anomalies (e.g. warm winter 2006/07) or substantial variations in the winter
Statistics, Probability, Significance, Likelihood: Words Mean What We Define Them to Mean
ERIC Educational Resources Information Center
Drummond, Gordon B.; Tom, Brian D. M.
2011-01-01
Statisticians use words deliberately and specifically, but not necessarily in the way they are used colloquially. For example, in general parlance "statistics" can mean numerical information, usually data. In contrast, one large statistics textbook defines the term "statistic" to denote "a characteristic of a "sample", such as the average score",…
Carr, J.R.; Roberts, K.P.
1989-02-01
Universal kriging is compared with ordinary kriging for estimation of earthquake ground motion. Ordinary kriging is based on a stationary random function model; universal kriging is based on a nonstationary random function model representing first-order drift. Accuracy of universal kriging is compared with that for ordinary kriging; cross-validation is used as the basis for comparison. Hypothesis testing on these results shows that accuracy obtained using universal kriging is not significantly different from accuracy obtained using ordinary kriging. Test based on normal distribution assumptions are applied to errors measured in the cross-validation procedure; t and F tests reveal no evidence to suggest universal and ordinary kriging are different for estimation of earthquake ground motion. Nonparametric hypothesis tests applied to these errors and jackknife statistics yield the same conclusion: universal and ordinary kriging are not significantly different for this application as determined by a cross-validation procedure. These results are based on application to four independent data sets (four different seismic events).
Key statistics related to CO/sub 2/ emissions: Significant contributing countries
Kellogg, M.A.; Edmonds, J.A.; Scott, M.J.; Pomykala, J.S.
1987-07-01
This country selection task report describes and applies a methodology for identifying a set of countries responsible for significant present and anticipated future emissions of CO/sub 2/ and other radiatively important gases (RIGs). The identification of countries responsible for CO/sub 2/ and other RIGs emissions will help determine to what extent a select number of countries might be capable of influencing future emissions. Once identified, those countries could potentially exercise cooperative collective control of global emissions and thus mitigate the associated adverse affects of those emissions. The methodology developed consists of two approaches: the resource approach and the emissions approach. While conceptually very different, both approaches yield the same fundamental conclusion. The core of any international initiative to control global emissions must include three key countries: the US, USSR, and the People's Republic of China. It was also determined that broader control can be achieved through the inclusion of sixteen additional countries with significant contributions to worldwide emissions.
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high–throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
Gehrmann, Thies; Reinders, Marcel J.T.
2015-01-01
Background: With more and more genomes being sequenced, detecting synteny between genomes becomes more and more important. However, for microorganisms the genomic divergence quickly becomes large, resulting in different codon usage and shuffling of gene order and gene elements such as exons. Results: We present Proteny, a methodology to detect synteny between diverged genomes. It operates on the amino acid sequence level to be insensitive to codon usage adaptations and clusters groups of exons disregarding order to handle diversity in genomic ordering between genomes. Furthermore, Proteny assigns significance levels to the syntenic clusters such that they can be selected on statistical grounds. Finally, Proteny provides novel ways to visualize results at different scales, facilitating the exploration and interpretation of syntenic regions. We test the performance of Proteny on a standard ground truth dataset, and we illustrate the use of Proteny on two closely related genomes (two different strains of Aspergillus niger) and on two distant genomes (two species of Basidiomycota). In comparison to other tools, we find that Proteny finds clusters with more true homologies in fewer clusters that contain more genes, i.e. Proteny is able to identify a more consistent synteny. Further, we show how genome rearrangements, assembly errors, gene duplications and the conservation of specific genes can be easily studied with Proteny. Availability and implementation: Proteny is freely available at the Delft Bioinformatics Lab website http://bioinformatics.tudelft.nl/dbl/software. Contact: t.gehrmann@tudelft.nl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26116928
Alexandrov, N. N.; Go, N.
1994-01-01
We have completed an exhaustive search for the common spatial arrangements of backbone fragments (SARFs) in nonhomologous proteins. This type of local structural similarity, incorporating short fragments of backbone atoms, arranged not necessarily in the same order along the polypeptide chain, appears to be important for protein function and stability. To estimate the statistical significance of the similarities, we have introduced a similarity score. We present several locally similar structures, with a large similarity score, which have not yet been reported. On the basis of the results of pairwise comparison, we have performed hierarchical cluster analysis of protein structures. Our analysis is not limited by comparison of single chains but also includes complex molecules consisting of several subunits. The SARFs with backbone fragments from different polypeptide chains provide a stable interaction between subunits in protein molecules. In many cases the active site of enzyme is located at the same position relative to the common SARFs, implying a function of the certain SARFs as a universal interface of the protein-substrate interaction. PMID:8069217
NASA Astrophysics Data System (ADS)
Zhu, Yunmin; Rong, Haibo; Mai, Shaowei; Luo, Xueyi; Li, Xiaoping; Li, Weishan
2015-12-01
Spinel lithium manganese oxide, LiMn2O4, is a promising cathode for lithium ion battery in large-scale applications, because it possesses many advantages compared with currently used layered lithium cobalt oxide (LiCoO2) and olivine phosphate (LiFePO4), including naturally abundant resource, environmental friendliness and high and long work potential plateau. Its poor cyclability under high temperature, however, limits its application. In this work, we report a significant cyclability improvement of LiMn2O4 under elevated temperature by using dimethyl phenylphonite (DMPP) as an electrolyte additive. Charge/discharge tests demonstrate that the application of 0.5 wt.% DMPP yields a capacity retention improvement from 16% to 82% for LiMn2O4 after 200 cycles under 55 °C at 1 C (1C = 148 mAh g-1) between 3 and 4.5 V. Electrochemical and physical characterizations indicate that DMPP is electrochemically oxidized at the potential lower than that for lithium extraction, forming a protective cathode interphase on LiMn2O4, which suppresses the electrolyte decomposition and prevents LiMn2O4 from crystal destruction.
2013-01-01
Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9). PMID:23721463
NASA Technical Reports Server (NTRS)
Smalheer, C. V.
1973-01-01
The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.
NASA Astrophysics Data System (ADS)
Volk, William W.; Hess, Carl; Ruch, Wayne; Yu, Zongchang; Ma, Weimin; Fisher, Lisa; Vickery, Carl; Ma, Z. Mark
2003-12-01
With expected implementation of low k1 lithography on 193nm scanners for 65nm node wafer production, high resolution defect inspection will be needed to insure reticle quality and reticle manufacture process monitoring. Reticle cost and reticle defectivity are both increasing with each shrink to the next node. Simultaneously, system on chip (SoC) designs are increasing in which a large area of the exposure field typically contains dummy patterns and other features which are not electrically active. Knowing which defects will electrically impact device yield and performance can improve reticle manufacturing yield and cycle time -- resulting in lower reticle costs. This investigation examines the feasibility of using additional design data layers for die-to-database reticle inspection to determine in real time the relevance of a reticle defect by its location in the device (Smart InspectionTM). The impact to data preparation and inspection throughput is evaluated. The current prototype algorithm is built on the XPA and XPE die-to-database algorithms for chrome-on-glass and EPSM reticles, respectively. The algorithms implement variable sensitivity based on the additional design data regions. During defect review the defects are intelligently binned into the different predetermined design regions. Tests show the new Smart Inspection algorithm provides the capability of using higher than normal sensitivity in critical regions while reducing sensitivity in less critical regions to filter total defect counts and allow for the review of just defects that matter. Performance characterization of a variable sensitivity Smart Inspection algorithm is discussed in addition to the filtering of the total defect count during review to show the defects that matter to device performance. Using seven critical layer production reticles from a system on chip device we examine the applications of Smart Inspection by layer including active, poly, contact, metal and via layers. Data volume
Badr, Lina Kurdahi
2009-01-01
By adopting more appropriate statistical methods to appraise data from a previously published randomized controlled trial (RCT), we evaluated the statistical and clinical significance of an intervention on the 18 month neurodevelopmental outcome of infants with suspected brain injury. The intervention group (n =32) received extensive, individualized cognitive/sensorimotor stimulation by public health nurses (PHNs) while the control group (n = 30) received standard follow-up care. At 18 months 43 infants remained in the study (22 = intervention, 21 = control). The results indicate that there was a significant statistical change within groups and a clinical significance whereby more infants in the intervention group improved in mental, motor and neurological functioning at 18 months compared to the control group. The benefits of looking at clinical significance from a meaningful aspect for practitioners are emphasized. PMID:19276403
NASA Astrophysics Data System (ADS)
Alves, Gelio
After the sequencing of many complete genomes, we are in a post-genomic era in which the most important task has changed from gathering genetic information to organizing the mass of data as well as under standing how components interact with each other. The former is usually undertaking using bioinformatics methods, while the latter task is generally termed proteomics. Success in both parts demands correct statistical significance assignments for results found. In my dissertation. I study two concrete examples: global sequence alignment statistics and peptide sequencing/identification using mass spectrometry. High-performance liquid chromatography coupled to a mass spectrometer (HPLC/MS/MS), enabling peptide identifications and thus protein identifications, has become the tool of choice in large-scale proteomics experiments. Peptide identification is usually done by database searches methods. The lack of robust statistical significance assignment among current methods motivated the development of a novel de novo algorithm, RAId, whose score statistics then provide statistical significance for high scoring peptides found in our custom, enzyme-digested peptide library. The ease of incorporating post-translation modifications is another important feature of RAId. To organize the massive protein/DNA data accumulated, biologists often cluster proteins according to their similarity via tools such as sequence alignment. Homologous proteins share similar domains. To assess the similarity of two domains usually requires alignment from head to toe, ie. a global alignment. A good alignment score statistics with an appropriate null model enable us to distinguish the biologically meaningful similarity from chance similarity. There has been much progress in local alignment statistics, which characterize score statistics when alignments tend to appear as a short segment of the whole sequence. For global alignment, which is useful in domain alignment, there is still much room for
NASA Astrophysics Data System (ADS)
Williams, Arnold C.; Pachowicz, Peter W.
2004-09-01
Current mine detection research indicates that no single sensor or single look from a sensor will detect mines/minefields in a real-time manner at a performance level suitable for a forward maneuver unit. Hence, the integrated development of detectors and fusion algorithms are of primary importance. A problem in this development process has been the evaluation of these algorithms with relatively small data sets, leading to anecdotal and frequently over trained results. These anecdotal results are often unreliable and conflicting among various sensors and algorithms. Consequently, the physical phenomena that ought to be exploited and the performance benefits of this exploitation are often ambiguous. The Army RDECOM CERDEC Night Vision Laboratory and Electron Sensors Directorate has collected large amounts of multisensor data such that statistically significant evaluations of detection and fusion algorithms can be obtained. Even with these large data sets care must be taken in algorithm design and data processing to achieve statistically significant performance results for combined detectors and fusion algorithms. This paper discusses statistically significant detection and combined multilook fusion results for the Ellipse Detector (ED) and the Piecewise Level Fusion Algorithm (PLFA). These statistically significant performance results are characterized by ROC curves that have been obtained through processing this multilook data for the high resolution SAR data of the Veridian X-Band radar. We discuss the implications of these results on mine detection and the importance of statistical significance, sample size, ground truth, and algorithm design in performance evaluation.
Determination of significant variables in compound wear using a statistical model
Pumwa, J.; Griffin, R.B.; Smith, C.M.
1997-07-01
This paper will report on a study of dry compound wear of normalized 1018 steel on A2 tool steel. Compound wear is a combination of sliding and impact wear. The compound wear machine consisted of an A2 tool steel wear plate that could be rotated, and an indentor head that held the 1018 carbon steel wear pins. The variables in the system were the rpm of the wear plate, the force with which the indentor strikes the wear plate, and the frequency with which the indentor strikes the wear plate. A statistically designed experiment was used to analyze the effects of the different variables on the compound wear process. The model developed showed that wear could be reasonably well predicted using a defined variable that was called the workrate. The paper will discuss the results of the modeling and the metallurgical changes that occurred at the indentor interface, with the wear plate, during the wear process.
Krumbholz, Aniko; Anielski, Patricia; Gfrerer, Lena; Graw, Matthias; Geyer, Hans; Schänzer, Wilhelm; Dvorak, Jiri; Thieme, Detlef
2014-01-01
Clenbuterol is a well-established β2-agonist, which is prohibited in sports and strictly regulated for use in the livestock industry. During the last few years clenbuterol-positive results in doping controls and in samples from residents or travellers from a high-risk country were suspected to be related the illegal use of clenbuterol for fattening. A sensitive liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed to detect low clenbuterol residues in hair with a detection limit of 0.02 pg/mg. A sub-therapeutic application study and a field study with volunteers, who have a high risk of contamination, were performed. For the application study, a total dosage of 30 µg clenbuterol was applied to 20 healthy volunteers on 5 subsequent days. One month after the beginning of the application, clenbuterol was detected in the proximal hair segment (0-1 cm) in concentrations between 0.43 and 4.76 pg/mg. For the second part, samples of 66 Mexican soccer players were analyzed. In 89% of these volunteers, clenbuterol was detectable in their hair at concentrations between 0.02 and 1.90 pg/mg. A comparison of both parts showed no statistical difference between sub-therapeutic application and contamination. In contrast, discrimination to a typical abuse of clenbuterol is apparently possible. Due to these findings results of real doping control samples can be evaluated. PMID:25388545
ERIC Educational Resources Information Center
Jacobson, Neil S.; Truax, Paula
1991-01-01
Describes ways of operationalizing clinically significant change, defined as extent to which therapy moves someone outside range of dysfunctional population or within range of functional population. Uses examples to show how clients can be categorized on basis of this definition. Proposes reliable change index (RC) to determine whether magnitude…
ERIC Educational Resources Information Center
Hojat, Mohammadreza; Xu, Gang
2004-01-01
Effect Sizes (ES) are an increasingly important index used to quantify the degree of practical significance of study results. This paper gives an introduction to the computation and interpretation of effect sizes from the perspective of the consumer of the research literature. The key points made are: (1) "ES" is a useful indicator of the…
ERIC Educational Resources Information Center
Onwuegbuzie, Anthony J.; Roberts, J. Kyle; Daniel, Larry G.
2005-01-01
In this article, the authors (a) illustrate how displaying disattenuated correlation coefficients alongside their unadjusted counterparts will allow researchers to assess the impact of unreliability on bivariate relationships and (b) demonstrate how a proposed new "what if reliability" analysis can complement null hypothesis significance tests of…
NASA Technical Reports Server (NTRS)
Staubert, R.
1985-01-01
Methods for calculating the statistical significance of excess events and the interpretation of the formally derived values are discussed. It is argued that a simple formula for a conservative estimate should generally be used in order to provide a common understanding of quoted values.
NASA Technical Reports Server (NTRS)
Massey, J. L.
1976-01-01
The very low error probability obtained with long error-correcting codes results in a very small number of observed errors in simulation studies of practical size and renders the usual confidence interval techniques inapplicable to the observed error probability. A natural extension of the notion of a 'confidence interval' is made and applied to such determinations of error probability by simulation. An example is included to show the surprisingly great significance of as few as two decoding errors in a very large number of decoding trials.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Potts, T.T.; Hylko, J.M.; Almond, D.
2007-07-01
A company's overall safety program becomes an important consideration to continue performing work and for procuring future contract awards. When injuries or accidents occur, the employer ultimately loses on two counts - increased medical costs and employee absences. This paper summarizes the human and organizational components that contributed to successful safety programs implemented by WESKEM, LLC's Environmental, Safety, and Health Departments located in Paducah, Kentucky, and Oak Ridge, Tennessee. The philosophy of 'safety, compliance, and then production' and programmatic components implemented at the start of the contracts were qualitatively identified as contributing factors resulting in a significant accumulation of safe work hours and an Experience Modification Rate (EMR) of <1.0. Furthermore, a study by the Associated General Contractors of America quantitatively validated components, already found in the WESKEM, LLC programs, as contributing factors to prevent employee accidents and injuries. Therefore, an investment in the human and organizational components now can pay dividends later by reducing the EMR, which is the key to reducing Workers' Compensation premiums. Also, knowing your employees' demographics and taking an active approach to evaluate and prevent fatigue may help employees balance work and non-work responsibilities. In turn, this approach can assist employers in maintaining a healthy and productive workforce. For these reasons, it is essential that safety needs be considered as the starting point when performing work. (authors)
Denton, Debra L; Diamond, Jerry; Zheng, Lei
2011-05-01
The U.S. Environmental Protection Agency (U.S. EPA) and state agencies implement the Clean Water Act, in part, by evaluating the toxicity of effluent and surface water samples. A common goal for both regulatory authorities and permittees is confidence in an individual test result (e.g., no-observed-effect concentration [NOEC], pass/fail, 25% effective concentration [EC25]), which is used to make regulatory decisions, such as reasonable potential determinations, permit compliance, and watershed assessments. This paper discusses an additional statistical approach (test of significant toxicity [TST]), based on bioequivalence hypothesis testing, or, more appropriately, test of noninferiority, which examines whether there is a nontoxic effect at a single concentration of concern compared with a control. Unlike the traditional hypothesis testing approach in whole effluent toxicity (WET) testing, TST is designed to incorporate explicitly both α and β error rates at levels of toxicity that are unacceptable and acceptable, given routine laboratory test performance for a given test method. Regulatory management decisions are used to identify unacceptable toxicity levels for acute and chronic tests, and the null hypothesis is constructed such that test power is associated with the ability to declare correctly a truly nontoxic sample as acceptable. This approach provides a positive incentive to generate high-quality WET data to make informed decisions regarding regulatory decisions. This paper illustrates how α and β error rates were established for specific test method designs and tests the TST approach using both simulation analyses and actual WET data. In general, those WET test endpoints having higher routine (e.g., 50th percentile) within-test control variation, on average, have higher method-specific α values (type I error rate), to maintain a desired type II error rate. This paper delineates the technical underpinnings of this approach and demonstrates the benefits
Statistical inference for the additive hazards model under outcome-dependent sampling
Yu, Jichang; Liu, Yanyan; Sandler, Dale P.; Zhou, Haibo
2015-01-01
Cost-effective study design and proper inference procedures for data from such designs are always of particular interests to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters for the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating the relative efficiency of the proposed method against simple random sampling design and derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and the proposed estimator is more efficient than other estimators. We apply our method to analyze a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to study the risk of radon exposure to cancer. PMID:26379363
NASA Astrophysics Data System (ADS)
Baluev, Roman V.
2013-11-01
We consider the `multifrequency' periodogram, in which the putative signal is modelled as a sum of two or more sinusoidal harmonics with independent frequencies. It is useful in cases when the data may contain several periodic components, especially when their interaction with each other and with the data sampling patterns might produce misleading results. Although the multifrequency statistic itself was constructed earlier, for example by G. Foster in his CLEANest algorithm, its probabilistic properties (the detection significance levels) are still poorly known and much of what is deemed known is not rigorous. These detection levels are nonetheless important for data analysis. We argue that to prove the simultaneous existence of all n components revealed in a multiperiodic variation, it is mandatory to apply at least 2n - 1 significance tests, among which most involve various multifrequency statistics, and only n tests are single-frequency ones. The main result of this paper is an analytic estimation of the statistical significance of the frequency tuples that the multifrequency periodogram can reveal. Using the theory of extreme values of random fields (the generalized Rice method), we find a useful approximation to the relevant false alarm probability. For the double-frequency periodogram, this approximation is given by the elementary formula (π/16)W2e- zz2, where W denotes the normalized width of the settled frequency range, and z is the observed periodogram maximum. We carried out intensive Monte Carlo simulations to show that the practical quality of this approximation is satisfactory. A similar analytic expression for the general multifrequency periodogram is also given, although with less numerical verification.
NASA Astrophysics Data System (ADS)
Casati, Michele
2014-05-01
The assertion that solar activity may play a significant role in the trigger of large volcanic eruptions is, and has been discussed by many geophysicists. Numerous scientific papers have established a possible correlation between these events and the electromagnetic coupling between the Earth and the Sun, but none of them has been able to highlight a possible statistically significant relationship between large volcanic eruptions and any of the series, such as geomagnetic activity, solar wind, sunspots number. In our research, we compare the 148 volcanic eruptions with index VEI4, the major 37 historical volcanic eruptions equal to or greater than index VEI5, recorded from 1610 to 2012 , with its sunspots number. Staring, as the threshold value, a monthly sunspot number of 46 (recorded during the great eruption of Krakatoa VEI6 historical index, August 1883), we note some possible relationships and conduct a statistical test. • Of the historical 31 large volcanic eruptions with index VEI5+, recorded between 1610 and 1955, 29 of these were recorded when the SSN<46. The remaining 2 eruptions were not recorded when the SSN<46, but rather during solar maxima of the solar cycle of the year 1739 and in the solar cycle No. 14 (Shikotsu eruption of 1739 and Ksudach 1907). • Of the historical 8 large volcanic eruptions with index VEI6+, recorded from 1610 to the present, 7 of these were recorded with SSN<46 and more specifically, within the three large solar minima known : Maunder (1645-1710), Dalton (1790-1830) and during the solar minimums occurred between 1880 and 1920. As the only exception, we note the eruption of Pinatubo of June 1991, recorded in the solar maximum of cycle 22. • Of the historical 6 major volcanic eruptions with index VEI5+, recorded after 1955, 5 of these were not recorded during periods of low solar activity, but rather during solar maxima, of the cycles 19,21 and 22. The significant tests, conducted with the chi-square χ ² = 7,782, detect a
NASA Technical Reports Server (NTRS)
Meador, Michael A.
2005-01-01
Single-wall carbon nanotubes have been shown to possess a combination of outstanding mechanical, electrical, and thermal properties. The use of carbon nanotubes as an additive to improve the mechanical properties of polymers and/or enhance their thermal and electrical conductivity has been a topic of intense interest. Nanotube-modified polymeric materials could find a variety of applications in NASA missions including large-area antennas, solar arrays, and solar sails; radiation shielding materials for vehicles, habitats, and extravehicular activity suits; and multifunctional materials for vehicle structures and habitats. Use of these revolutionary materials could reduce vehicle weight significantly and improve vehicle performance and capabilities.
Webb-Robertson, Bobbie-Jo M.; McCue, Lee Ann; Waters, Katrina M.; Matzke, Melissa M.; Jacobs, Jon M.; Metz, Thomas O.; Varnum, Susan M.; Pounds, Joel G.
2010-11-01
Liquid chromatography-mass spectrometry-based (LC-MS) proteomics uses peak intensities of proteolytic peptides to infer the differential abundance of peptides/proteins. However, substantial run-to-run variability in peptide intensities and observations (presence/absence) of peptides makes data analysis quite challenging. The missing abundance values in LC-MS proteomics data are difficult to address with traditional imputation-based approaches because the mechanisms by which data are missing are unknown a priori. Data can be missing due to random mechanisms such as experimental error, or non-random mechanisms such as a true biological effect. We present a statistical approach that uses a test of independence known as a G-test to test the null hypothesis of independence between the number of missing values and the experimental groups. We pair the G-test results evaluating independence of missing data (IMD) with a standard analysis of variance (ANOVA) that uses only means and variances computed from the observed data. Each peptide is therefore represented by two statistical confidence metrics, one for qualitative differential observation and one for quantitative differential intensity. We use two simulated and two real LC-MS datasets to demonstrate the robustness and sensitivity of the ANOVA-IMD approach for assigning confidence to peptides with significant differential abundance among experimental groups.
NASA Astrophysics Data System (ADS)
Wang, H. J.; Shi, W. L.; Chen, X. H.
2006-05-01
The West Development Policy being implemented in China is causing significant land use and land cover (LULC) changes in West China. With the up-to-date satellite database of the Global Land Cover Characteristics Database (GLCCD) that characterizes the lower boundary conditions, the regional climate model RIEMS-TEA is used to simulate possible impacts of the significant LULC variation. The model was run for five continuous three-month periods from 1 June to 1 September of 1993, 1994, 1995, 1996, and 1997, and the results of the five groups are examined by means of a student t-test to identify the statistical significance of regional climate variation. The main results are: (1) The regional climate is affected by the LULC variation because the equilibrium of water and heat transfer in the air-vegetation interface is changed. (2) The integrated impact of the LULC variation on regional climate is not only limited to West China where the LULC varies, but also to some areas in the model domain where the LULC does not vary at all. (3) The East Asian monsoon system and its vertical structure are adjusted by the large scale LULC variation in western China, where the consequences axe the enhancement of the westward water vapor transfer from the east east and the relevant increase of wet-hydrostatic energy in the middle-upper atmospheric layers. (4) The ecological engineering in West China affects significantly the regional climate in Northwest China, North China and the middle-lower reaches of the Yangtze River; there are obvious effects in South, Northeast, and Southwest China, but minor effects in Tibet.
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5 + , 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. ?? 1999 Elsevier
Zamboras, R.L.
1995-10-01
The Oligocene Hackberry sands of the Hackberry Embayment represent a complex and elusive exploration target. 3-D seismic evaluation along the headward erosional limits of the embayment provides a reconstructive framework of tectonic and sedimentation patterns which facilitate hydrocarbon exploration. The 3-D seismic along the Orange County, Texas portion of the Oligocene Hackberry trend indicates: (1) similarities of Hackberry structural and depositional setting to that of the underlying Eocene Yegua Formation; (2) four distinct cyclical sedimentation episodes associated with basin floor slump faulting: (3) the usefulness of seismic attributes as direct hydrocarbon indicators, and (4) the potential for significant oil and gas reserves additions in a mature trend. The Hackberry embayment represents a microcosm of the basin structural and depositional processes. Utilizing 3-D seismic to lower risk and finding cost will renew interest in trends such as the Hackberry of the Upper Texas-Louisiana Gulf Coast.
Liu, Yiwen; Ni, Bing-Jie
2015-01-01
The application of anaerobic ammonium oxidation (Anammox) process is often limited by the slow growth rate of Anammox bacteria. As the essential substrate element that required for culturing Anammox sludge, Fe (II) is expected to affect Anammox bacterial growth. This work systematically studied the effects of Fe (II) addition on Anammox activity based on the kinetic analysis of specific growth rate using data from batch tests with an enriched Anammox sludge at different dosing levels. Results clearly demonstrated that appropriate Fe (II) dosing (i.e., 0.09 mM) significantly enhanced the specific Anammox growth rate up to 0.172 d−1 compared to 0.118 d−1 at regular Fe (II) level (0.03 mM). The relationship between Fe (II) concentration and specific Anammox growth rate was found to be well described by typical substrate inhibition kinetics, which was integrated into currently well-established Anammox model to describe the enhanced Anammox growth with Fe (II) addition. The validity of the integrated Anammox model was verified using long-term experimental data from three independent Anammox reactors with different Fe (II) dosing levels. This Fe (II)-based approach could be potentially implemented to enhance the process rate for possible mainstream application of Anammox technology, in order for an energy autarchic wastewater treatment. PMID:25644239
NASA Astrophysics Data System (ADS)
Liu, Yiwen; Ni, Bing-Jie
2015-02-01
The application of anaerobic ammonium oxidation (Anammox) process is often limited by the slow growth rate of Anammox bacteria. As the essential substrate element that required for culturing Anammox sludge, Fe (II) is expected to affect Anammox bacterial growth. This work systematically studied the effects of Fe (II) addition on Anammox activity based on the kinetic analysis of specific growth rate using data from batch tests with an enriched Anammox sludge at different dosing levels. Results clearly demonstrated that appropriate Fe (II) dosing (i.e., 0.09 mM) significantly enhanced the specific Anammox growth rate up to 0.172 d-1 compared to 0.118 d-1 at regular Fe (II) level (0.03 mM). The relationship between Fe (II) concentration and specific Anammox growth rate was found to be well described by typical substrate inhibition kinetics, which was integrated into currently well-established Anammox model to describe the enhanced Anammox growth with Fe (II) addition. The validity of the integrated Anammox model was verified using long-term experimental data from three independent Anammox reactors with different Fe (II) dosing levels. This Fe (II)-based approach could be potentially implemented to enhance the process rate for possible mainstream application of Anammox technology, in order for an energy autarchic wastewater treatment.
Fisher, Aaron; Anderson, G Brooke; Peng, Roger; Leek, Jeff
2014-01-01
Scatterplots are the most common way for statisticians, scientists, and the public to visually detect relationships between measured variables. At the same time, and despite widely publicized controversy, P-values remain the most commonly used measure to statistically justify relationships identified between variables. Here we measure the ability to detect statistically significant relationships from scatterplots in a randomized trial of 2,039 students in a statistics massive open online course (MOOC). Each subject was shown a random set of scatterplots and asked to visually determine if the underlying relationships were statistically significant at the P < 0.05 level. Subjects correctly classified only 47.4% (95% CI [45.1%-49.7%]) of statistically significant relationships, and 74.6% (95% CI [72.5%-76.6%]) of non-significant relationships. Adding visual aids such as a best fit line or scatterplot smooth increased the probability a relationship was called significant, regardless of whether the relationship was actually significant. Classification of statistically significant relationships improved on repeat attempts of the survey, although classification of non-significant relationships did not. Our results suggest: (1) that evidence-based data analysis can be used to identify weaknesses in theoretical procedures in the hands of average users, (2) data analysts can be trained to improve detection of statistically significant results with practice, but (3) data analysts have incorrect intuition about what statistically significant relationships look like, particularly for small effects. We have built a web tool for people to compare scatterplots with their corresponding p-values which is available here: http://glimmer.rstudio.com/afisher/EDA/. PMID:25337457
Fisher, Aaron; Anderson, G. Brooke; Peng, Roger
2014-01-01
Scatterplots are the most common way for statisticians, scientists, and the public to visually detect relationships between measured variables. At the same time, and despite widely publicized controversy, P-values remain the most commonly used measure to statistically justify relationships identified between variables. Here we measure the ability to detect statistically significant relationships from scatterplots in a randomized trial of 2,039 students in a statistics massive open online course (MOOC). Each subject was shown a random set of scatterplots and asked to visually determine if the underlying relationships were statistically significant at the P < 0.05 level. Subjects correctly classified only 47.4% (95% CI [45.1%–49.7%]) of statistically significant relationships, and 74.6% (95% CI [72.5%–76.6%]) of non-significant relationships. Adding visual aids such as a best fit line or scatterplot smooth increased the probability a relationship was called significant, regardless of whether the relationship was actually significant. Classification of statistically significant relationships improved on repeat attempts of the survey, although classification of non-significant relationships did not. Our results suggest: (1) that evidence-based data analysis can be used to identify weaknesses in theoretical procedures in the hands of average users, (2) data analysts can be trained to improve detection of statistically significant results with practice, but (3) data analysts have incorrect intuition about what statistically significant relationships look like, particularly for small effects. We have built a web tool for people to compare scatterplots with their corresponding p-values which is available here: http://glimmer.rstudio.com/afisher/EDA/. PMID:25337457
NASA Technical Reports Server (NTRS)
Friedlander, Alan L.; Harry, David P., III
1960-01-01
An exploratory analysis of vehicle guidance during the approach to a target planet is presented. The objective of the guidance maneuver is to guide the vehicle to a specific perigee distance with a high degree of accuracy and minimum corrective velocity expenditure. The guidance maneuver is simulated by considering the random sampling of real measurements with significant error and reducing this information to prescribe appropriate corrective action. The instrumentation system assumed includes optical and/or infrared devices to indicate range and a reference angle in the trajectory plane. Statistical results are obtained by Monte-Carlo techniques and are shown as the expectation of guidance accuracy and velocity-increment requirements. Results are nondimensional and applicable to any planet within limits of two-body assumptions. The problem of determining how many corrections to make and when to make them is a consequence of the conflicting requirement of accurate trajectory determination and propulsion. Optimum values were found for a vehicle approaching a planet along a parabolic trajectory with an initial perigee distance of 5 radii and a target perigee of 1.02 radii. In this example measurement errors were less than i minute of arc. Results indicate that four corrections applied in the vicinity of 50, 16, 15, and 1.5 radii, respectively, yield minimum velocity-increment requirements. Thrust devices capable of producing a large variation of velocity-increment size are required. For a vehicle approaching the earth, miss distances within 32 miles are obtained with 90-percent probability. Total velocity increments used in guidance are less than 3300 feet per second with 90-percent probability. It is noted that the above representative results are valid only for the particular guidance scheme hypothesized in this analysis. A parametric study is presented which indicates the effects of measurement error size, initial perigee, and initial energy on the guidance
Nibhanipudi, Kumara V
2016-01-01
Objective. A study to determine if addition of palatal petechiae to Centor criteria adds more value for clinical diagnosis of acute strep pharyngitis in children. Hypothesis. In children, Centor Criteria does not cover all the symptoms and signs of acute strep pharyngitis. We hypothesize that addition of palatal petechiae to Centor Criteria will increase the possibility of clinical diagnosis of group A streptococcal pharyngitis in children. Methods. One hundred patients with a complaint of sore throat were enrolled in the study. All the patients were examined clinically using the Centor Criteria. They were also examined for other signs and symptoms like petechial lesions over the palate, abdominal pain, and skin rash. All the patients were given rapid strep tests, and throat cultures were sent. No antibiotics were given until culture results were obtained. Results. The sample size was 100 patients. All 100 had fever, sore throat, and erythema of tonsils. Twenty of the 100 patients had tonsillar exudates, 85/100 had tender anterior cervical lymph nodes, and 86/100 had no cough. In total, 9 out of the 100 patients had positive throat cultures. We observed that petechiae over the palate, a very significant sign, is not included in the Centor Criteria. Palatal petechiae were present in 8 out of the 100 patients. Six out of these 8 with palatal petechiae had positive throat culture for strep (75%). Only 7 out of 20 with exudates had positive strep culture. Sixteen out of the 100 patients had rapid strep test positive. Those 84/100 who had negative rapid strep also had negative throat culture. Statistics. We used Fisher's exact test, comparing throat culture positive and negative versus presence of exudates and palatal hemorrhages with positive and negative throat cultures and the resultant P value <.0001. Conclusion. Our study concludes that addition of petechiae over the palate to Centor Criteria will increase the possibility of diagnosing acute group A streptococcal
Nibhanipudi, Kumara V.
2016-01-01
Objective. A study to determine if addition of palatal petechiae to Centor criteria adds more value for clinical diagnosis of acute strep pharyngitis in children. Hypothesis. In children, Centor Criteria does not cover all the symptoms and signs of acute strep pharyngitis. We hypothesize that addition of palatal petechiae to Centor Criteria will increase the possibility of clinical diagnosis of group A streptococcal pharyngitis in children. Methods. One hundred patients with a complaint of sore throat were enrolled in the study. All the patients were examined clinically using the Centor Criteria. They were also examined for other signs and symptoms like petechial lesions over the palate, abdominal pain, and skin rash. All the patients were given rapid strep tests, and throat cultures were sent. No antibiotics were given until culture results were obtained. Results. The sample size was 100 patients. All 100 had fever, sore throat, and erythema of tonsils. Twenty of the 100 patients had tonsillar exudates, 85/100 had tender anterior cervical lymph nodes, and 86/100 had no cough. In total, 9 out of the 100 patients had positive throat cultures. We observed that petechiae over the palate, a very significant sign, is not included in the Centor Criteria. Palatal petechiae were present in 8 out of the 100 patients. Six out of these 8 with palatal petechiae had positive throat culture for strep (75%). Only 7 out of 20 with exudates had positive strep culture. Sixteen out of the 100 patients had rapid strep test positive. Those 84/100 who had negative rapid strep also had negative throat culture. Statistics. We used Fisher’s exact test, comparing throat culture positive and negative versus presence of exudates and palatal hemorrhages with positive and negative throat cultures and the resultant P value <.0001. Conclusion. Our study concludes that addition of petechiae over the palate to Centor Criteria will increase the possibility of diagnosing acute group A streptococcal
Greenhalgh, T.
1997-01-01
It is possible to be seriously misled by taking the statistical competence (and/or the intellectual honesty) of authors for granted. Some common errors committed (deliberately or inadvertently) by the authors of papers are given in the final box. PMID:9277611
Collar, Concha; Conte, Paola; Fadda, Costantino; Piga, Antonio
2015-10-01
The capability of different gluten-free (GF) basic formulations made of flour (rice, amaranth and chickpea) and starch (corn and cassava) blends, to make machinable and viscoelastic GF-doughs in absence/presence of single hydrocolloids (guar gum, locust bean and psyllium fibre), proteins (milk and egg white) and surfactants (neutral, anionic and vegetable oil) have been investigated. Macroscopic (high deformation) and macromolecular (small deformation) mechanical, viscometric (gelatinization, pasting, gelling) and thermal (gelatinization, melting, retrogradation) approaches were performed on the different matrices in order to (a) identify similarities and differences in GF-doughs in terms of a small number of rheological and thermal analytical parameters according to the formulations and (b) to assess single and interactive effects of basic ingredients and additives on GF-dough performance to achieve GF-flat breads. Larger values for the static and dynamic mechanical characteristics and higher viscometric profiles during both cooking and cooling corresponded to doughs formulated with guar gum and Psyllium fibre added to rice flour/starch and rice flour/corn starch/chickpea flour, while surfactant- and protein-formulated GF-doughs added to rice flour/starch/amaranth flour based GF-doughs exhibited intermediate and lower values for the mechanical parameters and poorer viscometric profiles. In addition, additive-free formulations exhibited higher values for the temperature of both gelatinization and retrogradation and lower enthalpies for the thermal transitions. Single addition of 10% of either chickpea flour or amaranth flour to rice flour/starch blends provided a large GF-dough hardening effect in presence of corn starch and an intermediate effect in presence of cassava starch (chickpea), and an intermediate reinforcement of GF-dough regardless the source of starch (amaranth). At macromolecular level, both chickpea and amaranth flours, singly added, determined
NASA Astrophysics Data System (ADS)
Buck, Henk M.
Various examples are given in which compounds are characterized as products or intermediates in a (distorted) trigonal pyramidal (TP) geometry. These observations have taken place mainly in the field of carbocation chemistry. Special attention is given to carbenium ions formed by halogen addition to 1,1-diarylsubstituted ethylenes focused on the electronic effects of the C-halogen bond as axial bond in a TP geometry with regard to the ?-distribution in the rest of the molecular system. The experimental verification is accompanied by quantum chemical calculations. We also used the TP structure as a reactive model for specific enzymatic reactions. The relevance of this geometry is shown for the dehalogenation reaction of the nucleophilic displacement in dichloroethane catalyzed by haloalkane dehalogenase and for the decarboxylation of L-ornithine with ornithine decarboxylase under loss of carbon dioxide.
Pregowski, Jerzy; Kepka, Cezary; Kruk, Mariusz; Mintz, Gary S; Kalinczuk, Lukasz; Ciszewski, Michal; Kochanowski, Lukasz; Wolny, Rafal; Chmielak, Zbigniew; Jastrzębski, Jan; Klopotowski, Mariusz; Zalewska, Joanna; Demkow, Marcin; Karcz, Maciej; Witkowski, Adam
2014-04-01
To assess the anatomical background and significance of incomplete invasive coronary angiography (ICA) and to evaluate the value of coronary computed tomography angiography (CTA) in this scenario. The current study is an analysis of high volume center experience with prospective registry of coronary CTA and ICA. The target population was identified through a review of the electronic database. We included consecutive patients referred for coronary CTA after ICA, which did not visualize at least one native coronary artery or by-pass graft. Between January 2009 and April 2013, 13,603 diagnostic ICA were performed. There were 45 (0.3 %) patients referred for coronary CTA after incomplete ICA. Patients were divided into 3 groups: angina symptoms without previous coronary artery by-pass grafting (CABG) (n = 11,212), angina symptoms with previous CABG (n = 986), and patients prior to valvular surgery (n = 925). ICA did not identify by-pass grafts in 21 (2.2 %) patients and in 24 (0.2 %) cases of native arteries. The explanations for an incomplete ICA included: 11 ostium anomalies, 2 left main spasms, 5 access site problems, 5 ascending aorta aneurysms, and 2 tortuous take-off of a subclavian artery. However, in 20 (44 %) patients no specific reason for the incomplete ICA was identified. After coronary CTA revascularization was performed in 11 (24 %) patients: 6 successful repeat ICA and percutaneous intervention and 5 CABG. Incomplete ICA constitutes rare, but a significant clinical problem. Coronary CTA provides adequate clinical information in these patients. PMID:24623270
Kurtz, S.E.; Fields, D.E.
1983-10-01
This report describes a version of the TERPED/P computer code that is very useful for small data sets. A new algorithm for determining the Kolmogorov-Smirnov (KS) statistics is used to extend program applicability. The TERPED/P code facilitates the analysis of experimental data and assists the user in determining its probability distribution function. Graphical and numerical tests are performed interactively in accordance with the user's assumption of normally or log-normally distributed data. Statistical analysis options include computation of the chi-square statistic and the KS one-sample test statistic and the corresponding significance levels. Cumulative probability plots of the user's data are generated either via a local graphics terminal, a local line printer or character-oriented terminal, or a remote high-resolution graphics device such as the FR80 film plotter or the Calcomp paper plotter. Several useful computer methodologies suffer from limitations of their implementations of the KS nonparametric test. This test is one of the more powerful analysis tools for examining the validity of an assumption about the probability distribution of a set of data. KS algorithms are found in other analysis codes, including the Statistical Analysis Subroutine (SAS) package and earlier versions of TERPED. The inability of these algorithms to generate significance levels for sample sizes less than 50 has limited their usefulness. The release of the TERPED code described herein contains algorithms to allow computation of the KS statistic and significance level for data sets of, if the user wishes, as few as three points. Values computed for the KS statistic are within 3% of the correct value for all data set sizes.
Lindmark, Anita; van Rompaye, Bart; Goetghebeur, Els; Glader, Eva-Lotta; Eriksson, Marie
2016-01-01
Background When profiling hospital performance, quality inicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke) to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance. Methods The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008–2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method. Results Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252) and high specificity (0.991). There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence. Conclusions The study emphasizes the importance of combining clinical relevance
Gaonkar, Bilwaj; Davatzikos, Christos
2013-01-01
Multivariate pattern analysis (MVPA) methods such as support vector machines (SVMs) have been increasingly applied to fMRI and sMRI analyses, enabling the detection of distinctive imaging patterns. However, identifying brain regions that significantly contribute to the classification/group separation requires computationally expensive permutation testing. In this paper we show that the results of SVM-permutation testing can be analytically approximated. This approximation leads to more than a thousand fold speed up of the permutation testing procedure, thereby rendering it feasible to perform such tests on standard computers. The speed up achieved makes SVM based group difference analysis competitive with standard univariate group difference analysis methods. PMID:23583748
Boareto, Marcelo; Caticha, Nestor
2014-01-01
Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
Fixed-ratio ray designs have been used for detecting and characterizing interactions of large numbers of chemicals in combination. Single chemical dose-response data are used to predict an “additivity curve” along an environmentally relevant ray. A “mixture curve” is estimated fr...
NASA Technical Reports Server (NTRS)
Druzhinin, I. P.; Khamyanova, N. V.; Yagodinskiy, V. N.
1974-01-01
Statistical evaluations of the significance of the relationship of abrupt changes in solar activity and discontinuities in the multi-year pattern of an epidemic process are reported. They reliably (with probability of more than 99.9%) show the real nature of this relationship and its great specific weight (about half) in the formation of discontinuities in the multi-year pattern of the processes in question.
NASA Astrophysics Data System (ADS)
Kalimeris, A.; Potirakis, S. M.; Eftaxias, K.; Antonopoulos, G.; Kopanas, J.; Nomikos, C.
2016-05-01
A multi-spectral analysis of the kHz electromagnetic time series associated with Athens' earthquake (M = 5.9, 7 September 1999) is presented here, that results to the reliable discrimination of the fracto-electromagnetic emissions from the natural geo-electromagnetic field background. Five spectral analysis methods are utilized in order to resolve the statistically significant variability modes of the studied dynamical system out of a red noise background (the revised Multi-Taper Method, the Singular Spectrum Analysis, and the Wavelet Analysis among them). The performed analysis reveals the existence of three distinct epochs in the time series for the period before the earthquake, a "quiet", a "transitional" and an "active" epoch. Towards the end of the active epoch, during a sub-period which is approximately starting two days before the earthquake, the dynamical system passes into a high activity state, where electromagnetic signal emissions become powerful and statistically significant almost in all time-scales. The temporal behavior of the studied system in each one of these epochs is further searched through mathematical reconstruction in the time domain of those spectral features that were found to be statistically significant. The transition of the system from the quiet to the active state proved to be detectable first in the long time-scales and afterwards in the short scales. Finally, a Hurst exponent analysis revealed persistent characteristics embedded in the two strong EM bursts observed during the "active" epoch.
Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Aliakbari, Massumeh; Lindlöf, Angelica; Ebrahimi, Mahdi; Ebrahimie, Esmaeil
2015-01-01
Cis regulatory elements (CREs), located within promoter regions, play a significant role in the blueprint for transcriptional regulation of genes. There is a growing interest to study the combinatorial nature of CREs including presence or absence of CREs, the number of occurrences of each CRE, as well as of their order and location relative to their target genes. Comparative promoter analysis has been shown to be a reliable strategy to test the significance of each component of promoter architecture. However, it remains unclear what level of difference in the number of occurrences of each CRE is of statistical significance in order to explain different expression patterns of two genes. In this study, we present a novel statistical approach for pairwise comparison of promoters of Arabidopsis genes in the context of number of occurrences of each CRE within the promoters. First, using the sample of 1000 Arabidopsis promoters, the results of the goodness of fit test and non-parametric analysis revealed that the number of occurrences of CREs in a promoter sequence is Poisson distributed. As a promoter sequence contained functional and non-functional CREs, we addressed the issue of the statistical distribution of functional CREs by analyzing the ChIP-seq datasets. The results showed that the number of occurrences of functional CREs over the genomic regions was determined as being Poisson distributed. In accordance with the obtained distribution of CREs occurrences, we suggested the Audic and Claverie (AC) test to compare two promoters based on the number of occurrences for the CREs. Superiority of the AC test over Chi-square (2×2) and Fisher's exact tests was also shown, as the AC test was able to detect a higher number of significant CREs. The two case studies on the Arabidopsis genes were performed in order to biologically verify the pairwise test for promoter comparison. Consequently, a number of CREs with significantly different occurrences was identified between
Aouinti, Safa; Malouche, Dhafer; Giudicelli, Véronique; Kossida, Sofia; Lefranc, Marie-Paule
2015-01-01
The adaptive immune responses of humans and of other jawed vertebrate species (gnasthostomata) are characterized by the B and T cells and their specific antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR) (up to 2.1012 different IG and TR per individual). IMGT, the international ImMunoGeneTics information system (http://www.imgt.org), was created in 1989 by Marie-Paule Lefranc (Montpellier University and CNRS) to manage the huge and complex diversity of these antigen receptors. IMGT built on IMGT-ONTOLOGY concepts of identification (keywords), description (labels), classification (gene and allele nomenclature) and numerotation (IMGT unique numbering), is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. IMGT/HighV-QUEST, the first web portal, and so far the only one, for the next generation sequencing (NGS) analysis of IG and TR, is the paradigm for immune repertoire standardized outputs and immunoprofiles of the adaptive immune responses. It provides the identification of the variable (V), diversity (D) and joining (J) genes and alleles, analysis of the V-(D)-J junction and complementarity determining region 3 (CDR3) and the characterization of the 'IMGT clonotype (AA)' (AA for amino acid) diversity and expression. IMGT/HighV-QUEST compares outputs of different batches, up to one million nucleotide sequencesfor the statistical module. These high throughput IG and TR repertoire immunoprofiles are of prime importance in vaccination, cancer, infectious diseases, autoimmunity and lymphoproliferative disorders, however their comparative statistical analysis still remains a challenge. We present a standardized statistical procedure to analyze IMGT/HighV-QUEST outputs for the evaluation of the significance of the IMGT clonotype (AA) diversity differences in proportions, per gene of a given group, between NGS IG and TR repertoire immunoprofiles. The procedure is generic and
Best, R; Harrell, A; Geesey, C; Libby, B; Wijesooriya, K
2014-06-15
Purpose: The purpose of this study is to inter-compare and find statistically significant differences between flattened field fixed-beam (FB) IMRT with flattening-filter free (FFF) volumetric modulated arc therapy (VMAT) for stereotactic body radiation therapy SBRT. Methods: SBRT plans using FB IMRT and FFF VMAT were generated for fifteen SBRT lung patients using 6 MV beams. For each patient, both IMRT and VMAT plans were created for comparison. Plans were generated utilizing RTOG 0915 (peripheral, 10 patients) and RTOG 0813 (medial, 5 patients) lung protocols. Target dose, critical structure dose, and treatment time were compared and tested for statistical significance. Parameters of interest included prescription isodose surface coverage, target dose heterogeneity, high dose spillage (location and volume), low dose spillage (location and volume), lung dose spillage, and critical structure maximum- and volumetric-dose limits. Results: For all criteria, we found equivalent or higher conformality with VMAT plans as well as reduced critical structure doses. Several differences passed a Student's t-test of significance: VMAT reduced the high dose spillage, evaluated with conformality index (CI), by an average of 9.4%±15.1% (p=0.030) compared to IMRT. VMAT plans reduced the lung volume receiving 20 Gy by 16.2%±15.0% (p=0.016) compared with IMRT. For the RTOG 0915 peripheral lesions, the volumes of lung receiving 12.4 Gy and 11.6 Gy were reduced by 27.0%±13.8% and 27.5%±12.6% (for both, p<0.001) in VMAT plans. Of the 26 protocol pass/fail criteria, VMAT plans were able to achieve an average of 0.2±0.7 (p=0.026) more constraints than the IMRT plans. Conclusions: FFF VMAT has dosimetric advantages over fixed beam IMRT for lung SBRT. Significant advantages included increased dose conformity, and reduced organs-at-risk doses. The overall improvements in terms of protocol pass/fail criteria were more modest and will require more patient data to establish difference
Wang, Hong-Qiang; Tsai, Chung-Jui
2013-01-01
With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. Software
NASA Astrophysics Data System (ADS)
Zhu, Lili; Wu, Xinhu; Zhao, Gaiqing; Wang, Xiaobo
2016-02-01
A new antiwear additive of Bisphenol AF bis(diphenyl phosphate) (BAFDP) was synthesized and characterized. The tribological behaviors of the additive for polyalkylene glycol (PAG) and polyurea grease (PG) application in steel/steel contacts were evaluated on an Optimol SRV-IV oscillating reciprocating friction and wear tester at elevated temperature. The results revealed that BAFDP could drastically reduce friction and wear of sliding pairs in both PAG and also in PG at 100 °C. The tribological properties of BAFDP are superior to the normally used zinc dialkyldithiophosphate-based additive package (ZDDP) in PAG and PG. Moreover, BAFDP as additive for PAG and PG displays relatively significant tribological properties in temperature-ramp tests by performing well at 50-300 °C, indicating the excellent high temperature friction reduction and anti-wear capacity of BAFDP. XPS results showed that boundary lubrication films composed of Fe(OH)O, Fe3O4, FePO4, FeF2, FeF3, compounds containing the Psbnd O bonds, nitrogen oxide, and so forth, were formed on the worn surface, which contributed to excellent friction reduction and antiwear performance.
NASA Astrophysics Data System (ADS)
Draper, David S.; van Westrenen, Wim
2007-12-01
As a complement to our efforts to update and revise the thermodynamic basis for predicting garnet-melt trace element partitioning using lattice-strain theory (van Westrenen and Draper in Contrib Mineral Petrol, this issue), we have performed detailed statistical evaluations of possible correlations between intensive and extensive variables and experimentally determined garnet-melt partitioning values for trivalent cations (rare earth elements, Y, and Sc) entering the dodecahedral garnet X-site. We applied these evaluations to a database containing over 300 partition coefficient determinations, compiled both from literature values and from our own work designed in part to expand that database. Available data include partitioning measurements in ultramafic to basaltic to intermediate bulk compositions, and recent studies in Fe-rich systems relevant to extraterrestrial petrogenesis, at pressures sufficiently high such that a significant component of majorite, the high-pressure form of garnet, is present. Through the application of lattice-strain theory, we obtained best-fit values for the ideal ionic radius of the dodecahedral garnet X-site, r 0(3+), its apparent Young’s modulus E(3+), and the strain-free partition coefficient D 0(3+) for a fictive REE element J of ionic radius r 0(3+). Resulting values of E, D 0, and r 0 were used in multiple linear regressions involving sixteen variables that reflect the possible influence of garnet composition and stoichiometry, melt composition and structure, major-element partitioning, pressure, and temperature. We find no statistically significant correlations between fitted r 0 and E values and any combination of variables. However, a highly robust correlation between fitted D 0 and garnet-melt Fe Mg exchange and D Mg is identified. The identification of more explicit melt-compositional influence is a first for this type of predictive modeling. We combine this statistically-derived expression for predicting D 0 with the new
Cho, Yun Hee; Cho, Kyu Ran; Park, Eun Kyung; Seo, Bo Kyoung; Woo, Ok Hee; Cho, Sung Bum; Bae, Jeoung Won
2016-01-01
Background In preoperative assessment of breast cancer, MRI has been shown to identify more additional breast lesions than are detectable using conventional imaging techniques. The characterization of additional lesions is more important than detection for optimal surgical treatment. Additional breast lesions can be included in focus, mass, and non-mass enhancement (NME) on MRI. According to the fifth edition of the breast imaging reporting and data system (BI-RADS®), which includes several changes in the NME descriptors, few studies to date have evaluated NME in preoperative assessment of breast cancer. Objectives We investigated the diagnostic accuracy of BI-RADS descriptors in predicting malignancy for additional NME lesions detected on preoperative 3T dynamic contrast enhanced MRI (DCE-MRI) in patients with newly diagnosed breast cancer. Patients and Methods Between January 2008 and December 2012, 88 patients were enrolled in our study, all with NME lesions other than the index cancer on preoperative 3T DCE-MRI and all with accompanying histopathologic examination. The MRI findings were analyzed according to the BI-RADS MRI lexicon. We evaluated the size, distribution, internal enhancement pattern, and location of NME lesions relative to the index cancer (i.e., same quadrant, different quadrant, or contralateral breast). Results On histopathologic analysis of the 88 NME lesions, 73 (83%) were malignant and 15 (17%) were benign. Lesion size did not differ significantly between malignant and benign lesions (P = 0.410). Malignancy was more frequent in linear (P = 0.005) and segmental (P = 0.011) distributions, and benignancy was more frequent in focal (P = 0.004) and regional (P < 0.001) NME lesions. The highest positive predictive value (PPV) for malignancy occurred in segmental (96.8%), linear (95.1%), clustered ring (100%), and clumped (92.0%) enhancement. Asymmetry demonstrated a high positive predictive value of 85.9%. The frequency of malignancy was higher
Stanic, Sinisa; Mathai, Mathew; Mayadev, Jyoti S.; Do, Ly V.; Purdy, James A.; Chen, Allen M.
2012-07-01
Our goal was to evaluate brachial plexus (BP) dose with and without the use of supraclavicular (SCL) irradiation in patients undergoing breast-conserving therapy with whole-breast radiation therapy (RT) after lumpectomy. Using the standardized Radiation Therapy Oncology Group (RTOG)-endorsed guidelines delineation, we contoured the BP for 10 postlumpectomy breast cancer patients. The radiation dose to the whole breast was 50.4 Gy using tangential fields in 1.8-Gy fractions, followed by a conedown to the operative bed using electrons (10 Gy). The prescription dose to the SCL field was 50.4 Gy, delivered to 3-cm depth. The mean BP volume was 14.5 {+-} 1.5 cm{sup 3}. With tangential fields alone, the median mean dose to the BP was 0.57 Gy, the median maximum dose was 1.93 Gy, and the irradiated volume of the BP receiving 40, 45, and 50 Gy was 0%. When the third (SCL field) was added, the dose to the BP was significantly increased (P = .01): the median mean dose to the BP was 40.60 Gy, and the median maximum dose was 52.22 Gy. With 3-field RT, the median irradiated volume of the BP receiving 40, 45, and 50 Gy was 83.5%, 68.5%, and 24.6%, respectively. The addition of the SCL field significantly increases dose to the BP. The possibility of increasing the risk of BP morbidity should be considered in the context of clinical decision making.
2014-01-01
Background The purpose of this study was to assess the relationship of CT-perfusion (CTP), 18F-FDG-PET/CT and histological parameters, and the possible added value of CTP to FDG-PET/CT in the initial staging of lung cancer. Methods Fifty-four consecutive patients (median age 65 years, 15 females, 39 males) with suspected lung cancer were evaluated prospectively by CT-perfusion scan and 18F-FDG-PET/CT scan. Overall, 46 tumors were identified. CTP parameters blood flow (BF), blood volume (BV), and mean transit time (MTT) of the tumor tissue were calculated. Intratumoral microvessel density (MVD) was assessed quantitatively. Differences in CTP parameters concerning tumor type, location, PET positivity of lymph nodes, TNM status, and UICC stage were analyzed. Spearman correlation analyses between CTP and 18F-FDG-PET/CT parameters (SUVmax, SUVmean, PETvol, and TLG), MVD, tumor size, and tumor stage were performed. Results The mean BF (mL/100 mL min-1), BV (mL/100 mL), and MTT (s) was 35.5, 8.4, and 14.2, respectively. The BF and BV were lower in tumors with PET-positive lymph nodes (p = 0.02). However, the CTP values were not significantly different among the N stages. The CTP values were not different, depending on tumor size and location. No significant correlation was found between CTP parameters and MVD. Conclusions Overall, the CTP information showed only little additional information for the initial staging compared with standard FDG-PET/CT. Low perfusion in lung tumors might possibly be associated with metabolically active regional lymph nodes. Apart from that, both CTP and 18F-FDG-PET/CT parameter sets may reflect different pathophysiological mechanisms in lung cancer. PMID:24450990
ERIC Educational Resources Information Center
Meyer, Donald L.
Bayesian statistical methodology and its possible uses in the behavioral sciences are discussed in relation to the solution of problems in both the use and teaching of fundamental statistical methods, including confidence intervals, significance tests, and sampling. The Bayesian model explains these statistical methods and offers a consistent…
Reddy, Samala Murali Mohan; Shanmugam, Ganesh; Duraipandy, Natarajan; Kiran, Manikantan Syamala; Mandal, Asit Baran
2015-11-01
In recent years, several fluorenylmethoxycarbonyl (Fmoc)-functionalized amino acids and peptides have been used to construct hydrogels, which find a wide range of applications. Although several hydrogels have been prepared from mono Fmoc-functionalized amino acids, herein, we demonstrate the importance of an additional Fmoc-moiety in the hydrogelation of double Fmoc-functionalized L-lysine [Fmoc(Nα)-L-lysine(NεFmoc)-OH, (Fmoc-K(Fmoc))] as a low molecular weight gelator (LMWG). Unlike other Fmoc-functionalized amino acid gelators, Fmoc-K(Fmoc) exhibits pH-controlled ambidextrous gelation (hydrogelation at different pH values as well as organogelation), which is significant among the gelators. Distinct fibrous morphologies were observed for Fmoc-K(Fmoc) hydrogels formed at different pH values, which are different from organogels in which Fmoc-K(Fmoc) showed bundles of long fibers. In both hydrogels and organogels, the self-assembly of Fmoc-K(Fmoc) was driven by aromatic π-π stacking and hydrogen bonding interactions, as evidenced from spectroscopic analyses. Characterization of Fmoc-K(Fmoc) gels using several biophysical methods indicates that Fmoc-K(Fmoc) has several advantages and significant importance as a LMWG. The advantages of Fmoc-K(Fmoc) include pH-controlled ambidextrous gelation, pH stimulus response, high thermal stability (∼100 °C) even at low minimum hydrogelation concentration (0.1 wt%), thixotropic property, high kinetic and mechanical stability, dye removal properties, cell viability to the selected cell type, and as a drug carrier. While single Fmoc-functionalized L-lysine amino acids failed to exhibit gelation under similar experimental conditions, the pH-controlled ambidextrous gelation of Fmoc-K(Fmoc) demonstrates the benefit of a second Fmoc moiety in inducing gelation in a LMWG. We thus strongly believe that the current findings provide a lead to construct or design various new synthetic Fmoc-based LMW organic gelators for several
Yabusaki, N; Komatsu, H; Tago, K; Yamada, Y; Ueno, A
1991-02-01
The efficacy of maintenance bacillus Calmette-Guerin (BCG) instillations for superficial bladder tumors was studied by prospective randomized trial. From June 1985 to October 1988, 42 newly diagnosed patients with superficial bladder carcinoma (pTa or pT1) were treated by transurethral tumor resection and subsequent five daily instillations of mitomycin C. Then they were divided into non-maintenance group (22 patients) and maintenance group (20 patients) by randomization. The patients received six weekly instillations of 80 mg of BCG. Tokyo strain (Japan BCG manufacturing Co., Tokyo, Japan), suspended in 40 ml of physiological saline, and the patients in the maintenance group received four additional instillations of BCG every three months. We could not complete the six-week course of BCG instillations in three patients due to adverse effects (two in non-maintenance group and one in maintenance group) and we lost six patients from follow-up within one year (one in non-maintenance group and five in maintenance group). The mean follow-up period of the remaining 33 patients was 28.1 months. Of these 33 patients, six patients had been found to have recurrent tumors, and the over-all three-year non-recurrence rate was 82%. Before employing BCG, when we used only mitomycin C after TUR-Bt, the three year non-recurrence rate was 58%. This indicates prophylactic effect of BCG instillations. The stage of the initial tumor of the six recurrent cases were all pT1b. The non-recurrence rate of the patients with pT1b tumor was significantly lower than that of the patients with pTa and pT1a tumor. However, multiplicity and grade of tumors did not affect the non-recurrence rate.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:1904120
Minor changes in the indicator used to measure fine PM, which cause only modest changes in Mass concentrations, can lead to dramatic changes in the statistical relationship of fine PM mass with cardiovascular mortality. An epidemiologic study in Phoenix (Mar et al., 2000), augme...
Williams, Scott G. Buyyounouski, Mark K.; Pickles, Tom; Kestin, Larry; Martinez, Alvaro; Hanlon, Alexandra L.; Duchesne, Gillian M.
2008-03-15
Purpose: To define and incorporate the impact of the percentage of positive biopsy cores (PPC) into a predictive model of prostate cancer radiotherapy biochemical outcome. Methods and Materials: The data of 3264 men with clinically localized prostate cancer treated with external beam radiotherapy at four institutions were retrospectively analyzed. Standard prognostic and treatment factors plus the number of biopsy cores collected and the number positive for malignancy by transrectal ultrasound-guided biopsy were available. The primary endpoint was biochemical failure (bF, Phoenix definition). Multivariate proportional hazards analyses were performed and expressed as a nomogram and the model's predictive ability assessed using the concordance index (c-index). Results: The cohort consisted of 21% low-, 51% intermediate-, and 28% high-risk cancer patients, and 30% had androgen deprivation with radiotherapy. The median PPC was 50% (interquartile range [IQR] 29-67%), and median follow-up was 51 months (IQR 29-71 months). Percentage of positive biopsy cores displayed an independent association with the risk of bF (p = 0.01), as did age, prostate-specific antigen value, Gleason score, clinical stage, androgen deprivation duration, and radiotherapy dose (p < 0.001 for all). Including PPC increased the c-index from 0.72 to 0.73 in the overall model. The influence of PPC varied significantly with radiotherapy dose and clinical stage (p = 0.02 for both interactions), with doses <66 Gy and palpable tumors showing the strongest relationship between PPC and bF. Intermediate-risk patients were poorly discriminated regardless of PPC inclusion (c-index 0.65 for both models). Conclusions: Outcome models incorporating PPC show only minor additional ability to predict biochemical failure beyond those containing standard prognostic factors.
Technology Transfer Automated Retrieval System (TEKTRAN)
Saccharomyces cerevisiae shows great potential for development of bioreactor systems geared towards the production of high-value lipids such as polyunsaturated omega-3 fatty acids, the yields of which are largely dependent on the activity of ectopically-expressed enzymes. Here we show that the addit...
NASA Astrophysics Data System (ADS)
Temme, F. P.
1992-12-01
Realisation of the invariance properties of the p ⩽ 2 number partitional inventory components of the 20-fold spin algebra associated with [A] 20 nuclear spin clusters under SU2 × L20 allows the mappings {[λ] → Γ} to be derived. In addition, recent general inner tensor product expressions under Ln, for n even (odd), also facilitates the evaluation of many higher [λ] ( L20; p = 3) correlative mappings onto SU3↓SO(3) × L↓20T A 5 subduced symmetry from SU2 duality, thus providing results that determine the nature of adapted NMR bases for both dodecahedrane and its d 20 analogue. The significance of this work lies in the pertinence of nuclear spin statistics to both selective MQ-NMR and to other spectroscopic aspects of cage clusters, e.g., [ 13C] n, n = 20, 60, fullerenes. Mappings onto Ln irreps sets of specific p ⩽ 3 number partitions arise in combinatorial treatment of {M iti} Rota fields, defining scalar invariants in the context of Cayley algebra. Inclusion of the Ln group in the specific Racah chain for NMR symmetry gives rise to significant further physical insight.
NASA Astrophysics Data System (ADS)
Topal, Uğur; Aksan, Mehmet Ali
2016-05-01
Magnetite nanoparticles (MNPs) are extensively investigated for biomedical applications, particularly as contrast agents for Magnetic Resonance Imaging and as drug delivery agent and heat mediators for cancer therapy. Tuning the magnetic properties of the magnetite nanoparticles with doping of foreign atoms has a crucial importance for determining the application areas of these materials and so attracts much interests. On the other hand the doping with foreign atoms requires high temperature annealing, and it causes a phase transition to the hematite phase above 400 °C. In this work the phase transition temperature from the magnetite to the hematite phase has been increased by 200 °C, which is the highest enhancement reported in literature. It was achieved by addition of the appropriate amounts of B2O3. Our experiments indicates that the 5.0 wt% of B2O3 addition stabilizes and keeps the existence of single phase magnetite up to 600 °C.
Ding, Shipeng; Liu, Fudong; Shi, Xiaoyan; Liu, Kuo; Lian, Zhihua; Xie, Lijuan; He, Hong
2015-05-13
A novel Mo-promoted Ce-Zr mixed oxide catalyst prepared by a homogeneous precipitation method was used for the selective catalytic reduction (SCR) of NO(x) with NH3. The optimal catalyst showed high NH3-SCR activity, SO2/H2O durability, and thermal stability under test conditions. The addition of Mo inhibited growth of the CeO2 particle size, improved the redox ability, and increased the amount of surface acidity, especially the Lewis acidity, all of which were favorable for the excellent NH3-SCR performance. It is believed that the catalyst is promising for the removal of NO(x) from diesel engine exhaust. PMID:25894854
ERIC Educational Resources Information Center
Cicchetti, Domenic V.; Koenig, Kathy; Klin, Ami; Volkmar, Fred R.; Paul, Rhea; Sparrow, Sara
2011-01-01
The objectives of this report are: (a) to trace the theoretical roots of the concept clinical significance that derives from Bayesian thinking, Marginal Utility/Diminishing Returns in Economics, and the "just noticeable difference", in Psychophysics. These concepts then translated into: Effect Size (ES), strength of agreement, clinical…
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2001-01-01
Since 1750, the number of cataclysmic volcanic eruptions (volcanic explosivity index (VEI)>=4) per decade spans 2-11, with 96 percent located in the tropics and extra-tropical Northern Hemisphere. A two-point moving average of the volcanic time series has higher values since the 1860's than before, being 8.00 in the 1910's (the highest value) and 6.50 in the 1980's, the highest since the 1910's peak. Because of the usual behavior of the first difference of the two-point moving averages, one infers that its value for the 1990's will measure approximately 6.50 +/- 1, implying that approximately 7 +/- 4 cataclysmic volcanic eruptions should be expected during the present decade (2000-2009). Because cataclysmic volcanic eruptions (especially those having VEI>=5) nearly always have been associated with short-term episodes of global cooling, the occurrence of even one might confuse our ability to assess the effects of global warming. Poisson probability distributions reveal that the probability of one or more events with a VEI>=4 within the next ten years is >99 percent. It is approximately 49 percent for an event with a VEI>=5, and 18 percent for an event with a VEI>=6. Hence, the likelihood that a climatically significant volcanic eruption will occur within the next ten years appears reasonably high.
Cosmic statistics of statistics
NASA Astrophysics Data System (ADS)
Szapudi, István; Colombi, Stéphane; Bernardeau, Francis
1999-12-01
The errors on statistics measured in finite galaxy catalogues are exhaustively investigated. The theory of errors on factorial moments by Szapudi & Colombi is applied to cumulants via a series expansion method. All results are subsequently extended to the weakly non-linear regime. Together with previous investigations this yields an analytic theory of the errors for moments and connected moments of counts in cells from highly non-linear to weakly non-linear scales. For non-linear functions of unbiased estimators, such as the cumulants, the phenomenon of cosmic bias is identified and computed. Since it is subdued by the cosmic errors in the range of applicability of the theory, correction for it is inconsequential. In addition, the method of Colombi, Szapudi & Szalay concerning sampling effects is generalized, adapting the theory for inhomogeneous galaxy catalogues. While previous work focused on the variance only, the present article calculates the cross-correlations between moments and connected moments as well for a statistically complete description. The final analytic formulae representing the full theory are explicit but somewhat complicated. Therefore we have made available a fortran program capable of calculating the described quantities numerically (for further details e-mail SC at colombi@iap.fr). An important special case is the evaluation of the errors on the two-point correlation function, for which this should be more accurate than any method put forward previously. This tool will be immensely useful in the future for assessing the precision of measurements from existing catalogues, as well as aiding the design of new galaxy surveys. To illustrate the applicability of the results and to explore the numerical aspects of the theory qualitatively and quantitatively, the errors and cross-correlations are predicted under a wide range of assumptions for the future Sloan Digital Sky Survey. The principal results concerning the cumulants ξ, Q3 and Q4 is that
Suite versus composite statistics
Balsillie, J.H.; Tanner, W.F.
1999-01-01
Suite and composite methodologies, two statistically valid approaches for producing statistical descriptive measures, are investigated for sample groups representing a probability distribution where, in addition, each sample is probability distribution. Suite and composite means (first moment measures) are always equivalent. Composite standard deviations (second moment measures) are always larger than suite standard deviations. Suite and composite values for higher moment measures have more complex relationships. Very seldom, however, are they equivalent, and they normally yield statistically significant but different results. Multiple samples are preferable to single samples (including composites) because they permit the investigator to examine sample-to-sample variability. These and other relationships for suite and composite probability distribution analyses are investigated and reported using granulometric data.
Rule, Simon; Smith, Paul; Johnson, Peter W.M.; Bolam, Simon; Follows, George; Gambell, Joanne; Hillmen, Peter; Jack, Andrew; Johnson, Stephen; Kirkwood, Amy A; Kruger, Anton; Pocock, Christopher; Seymour, John F.; Toncheva, Milena; Walewski, Jan; Linch, David
2016-01-01
Mantle cell lymphoma is an incurable and generally aggressive lymphoma that is more common in elderly patients. Whilst a number of different chemotherapeutic regimens are active in this disease, there is no established gold standard therapy. Rituximab has been used widely to good effect in B-cell malignancies but there is no evidence that it improves outcomes when added to chemotherapy in this disease. We performed a randomized, open-label, multicenter study looking at the addition of rituximab to the standard chemotherapy regimen of fludarabine and cyclophosphamide in patients with newly diagnosed mantle cell lymphoma. A total of 370 patients were randomized. With a median follow up of six years, rituximab improved the median progression-free survival from 14.9 to 29.8 months (P<0.001) and overall survival from 37.0 to 44.5 months (P=0.005). This equates to absolute differences of 9.0% and 22.1% for overall and progression-free survival, respectively, at two years. Overall response rates were similar, but complete response rates were significantly higher in the rituximab arm: 52.7% vs. 39.9% (P=0.014). There was no clinically significant additional toxicity observed with the addition of rituximab. Overall, approximately 18% of patients died of non-lymphomatous causes, most commonly infections. The addition of rituximab to fludarabine and cyclophosphamide chemotherapy significantly improves outcomes in patients with mantle cell lymphoma. However, these regimens have significant late toxicity and should be used with caution. This trial has been registered (ISRCTN81133184 and clinicaltrials.gov:00641095) and is supported by the UK National Cancer Research Network. PMID:26611473
Rule, Simon; Smith, Paul; Johnson, Peter W M; Bolam, Simon; Follows, George; Gambell, Joanne; Hillmen, Peter; Jack, Andrew; Johnson, Stephen; Kirkwood, Amy A; Kruger, Anton; Pocock, Christopher; Seymour, John F; Toncheva, Milena; Walewski, Jan; Linch, David
2016-02-01
Mantle cell lymphoma is an incurable and generally aggressive lymphoma that is more common in elderly patients. Whilst a number of different chemotherapeutic regimens are active in this disease, there is no established gold standard therapy. Rituximab has been used widely to good effect in B-cell malignancies but there is no evidence that it improves outcomes when added to chemotherapy in this disease. We performed a randomized, open-label, multicenter study looking at the addition of rituximab to the standard chemotherapy regimen of fludarabine and cyclophosphamide in patients with newly diagnosed mantle cell lymphoma. A total of 370 patients were randomized. With a median follow up of six years, rituximab improved the median progression-free survival from 14.9 to 29.8 months (P<0.001) and overall survival from 37.0 to 44.5 months (P=0.005). This equates to absolute differences of 9.0% and 22.1% for overall and progression-free survival, respectively, at two years. Overall response rates were similar, but complete response rates were significantly higher in the rituximab arm: 52.7% vs. 39.9% (P=0.014). There was no clinically significant additional toxicity observed with the addition of rituximab. Overall, approximately 18% of patients died of non-lymphomatous causes, most commonly infections. The addition of rituximab to fludarabine and cyclophosphamide chemotherapy significantly improves outcomes in patients with mantle cell lymphoma. However, these regimens have significant late toxicity and should be used with caution. This trial has been registered (ISRCTN81133184 and clinicaltrials.gov:00641095) and is supported by the UK National Cancer Research Network. PMID:26611473
Grossling, Bernardo F.
1975-01-01
Exploratory drilling is still in incipient or youthful stages in those areas of the world where the bulk of the potential petroleum resources is yet to be discovered. Methods of assessing resources from projections based on historical production and reserve data are limited to mature areas. For most of the world's petroleum-prospective areas, a more speculative situation calls for a critical review of resource-assessment methodology. The language of mathematical statistics is required to define more rigorously the appraisal of petroleum resources. Basically, two approaches have been used to appraise the amounts of undiscovered mineral resources in a geologic province: (1) projection models, which use statistical data on the past outcome of exploration and development in the province; and (2) estimation models of the overall resources of the province, which use certain known parameters of the province together with the outcome of exploration and development in analogous provinces. These two approaches often lead to widely different estimates. Some of the controversy that arises results from a confusion of the probabilistic significance of the quantities yielded by each of the two approaches. Also, inherent limitations of analytic projection models-such as those using the logistic and Gomperts functions --have often been ignored. The resource-assessment problem should be recast in terms that provide for consideration of the probability of existence of the resource and of the probability of discovery of a deposit. Then the two above-mentioned models occupy the two ends of the probability range. The new approach accounts for (1) what can be expected with reasonably high certainty by mere projections of what has been accomplished in the past; (2) the inherent biases of decision-makers and resource estimators; (3) upper bounds that can be set up as goals for exploration; and (4) the uncertainties in geologic conditions in a search for minerals. Actual outcomes can then
Schulte-Spechtel, Ulrike; Lehnert, Gisela; Liegl, Gaby; Fingerle, Volker; Heimerl, Christiane; Johnson, Barbara J B; Wilske, Bettina
2003-03-01
We investigated whether the recombinant Borrelia Western blot test previously described (B. Wilske, C. Habermann, V. Fingerle, B. Hillenbrand, S. Jauris-Heipke, G. Lehnert, I. Pradel, D. Rössler, and U. Schulte-Spechtel, Med. Microbiol. Immunol. 188:139-144, 1999) can be improved by the addition of VlsE and additional DbpA and OspC homologues. By using a panel of sera from 36 neuroborreliosis patients and 67 control patients, the diagnostic sensitivity of the recombinant immunoblot test was significantly increased (86.1% versus 52.7%) without loss of specificity and was higher (86.1% versus 63.8%) than that of the conventional whole-cell lysate immunoblot test (U. Hauser, G. Lehnert, R. Lobentanzer, and B. Wilske, J. Clin. Microbiol. 35:1433-1444, 1997). Improvement was mainly due to the presence of VlsE and DbpA. PMID:12624072
Shi, Runhua; McLarty, Jerry W
2009-10-01
In this article, we introduced basic concepts of statistics, type of distributions, and descriptive statistics. A few examples were also provided. The basic concepts presented herein are only a fraction of the concepts related to descriptive statistics. Also, there are many commonly used distributions not presented herein, such as Poisson distributions for rare events and exponential distributions, F distributions, and logistic distributions. More information can be found in many statistics books and publications. PMID:19891281
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2008-01-01
As a branch of knowledge, Statistics is ubiquitous and its applications can be found in (almost) every field of human endeavour. In this article, the authors track down the possible source of the link between the "Siren song" and applications of Statistics. Answers to their previous five questions and five new questions on Statistics are presented.
ERIC Educational Resources Information Center
Callamaras, Peter
1983-01-01
This buyer's guide to seven major types of statistics software packages for microcomputers reviews Edu-Ware Statistics 3.0; Financial Planning; Speed Stat; Statistics with DAISY; Human Systems Dynamics package of Stats Plus, ANOVA II, and REGRESS II; Maxistat; and Moore-Barnes' MBC Test Construction and MBC Correlation. (MBR)
Significant lexical relationships
Pedersen, T.; Kayaalp, M.; Bruce, R.
1996-12-31
Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
ERIC Educational Resources Information Center
Andrews, Ian A.
1999-01-01
Provides a crossword puzzle with an answer key corresponding to the book entitled "Significant Treasures/Tresors Parlants" that is filled with color and black-and-white prints of paintings and artifacts from 131 museums and art galleries as a sampling of the 2,200 such Canadian institutions. (CMK)
Titanic: A Statistical Exploration.
ERIC Educational Resources Information Center
Takis, Sandra L.
1999-01-01
Uses the available data about the Titanic's passengers to interest students in exploring categorical data and the chi-square distribution. Describes activities incorporated into a statistics class and gives additional resources for collecting information about the Titanic. (ASK)
Information geometry of Bayesian statistics
NASA Astrophysics Data System (ADS)
Matsuzoe, Hiroshi
2015-01-01
A survey of geometry of Bayesian statistics is given. From the viewpoint of differential geometry, a prior distribution in Bayesian statistics is regarded as a volume element on a statistical model. In this paper, properties of Bayesian estimators are studied by applying equiaffine structures of statistical manifolds. In addition, geometry of anomalous statistics is also studied. Deformed expectations and deformed independeces are important in anomalous statistics. After summarizing geometry of such deformed structues, a generalization of maximum likelihood method is given. A suitable weight on a parameter space is important in Bayesian statistics, whereas a suitable weight on a sample space is important in anomalous statistics.
Food additives are substances that become part of a food product when they are added during the processing or making of that food. "Direct" food additives are often added during processing to: Add nutrients ...
Kogalovskii, M.R.
1995-03-01
This paper presents a review of problems related to statistical database systems, which are wide-spread in various fields of activity. Statistical databases (SDB) are referred to as databases that consist of data and are used for statistical analysis. Topics under consideration are: SDB peculiarities, properties of data models adequate for SDB requirements, metadata functions, null-value problems, SDB compromise protection problems, stored data compression techniques, and statistical data representation means. Also examined is whether the present Database Management Systems (DBMS) satisfy the SDB requirements. Some actual research directions in SDB systems are considered.
Smith, Alwyn
1969-01-01
This paper is based on an analysis of questionnaires sent to the health ministries of Member States of WHO asking for information about the extent, nature, and scope of morbidity statistical information. It is clear that most countries collect some statistics of morbidity and many countries collect extensive data. However, few countries relate their collection to the needs of health administrators for information, and many countries collect statistics principally for publication in annual volumes which may appear anything up to 3 years after the year to which they refer. The desiderata of morbidity statistics may be summarized as reliability, representativeness, and relevance to current health problems. PMID:5306722
Spencer, Michael
1974-01-01
Food additives are discussed from the food technology point of view. The reasons for their use are summarized: (1) to protect food from chemical and microbiological attack; (2) to even out seasonal supplies; (3) to improve their eating quality; (4) to improve their nutritional value. The various types of food additives are considered, e.g. colours, flavours, emulsifiers, bread and flour additives, preservatives, and nutritional additives. The paper concludes with consideration of those circumstances in which the use of additives is (a) justified and (b) unjustified. PMID:4467857
ERIC Educational Resources Information Center
Petocz, Peter; Sowey, Eric
2008-01-01
In this article, the authors focus on hypothesis testing--that peculiarly statistical way of deciding things. Statistical methods for testing hypotheses were developed in the 1920s and 1930s by some of the most famous statisticians, in particular Ronald Fisher, Jerzy Neyman and Egon Pearson, who laid the foundations of almost all modern methods of…
ERIC Educational Resources Information Center
Huberty, Carl J.
An approach to statistical testing, which combines Neyman-Pearson hypothesis testing and Fisher significance testing, is recommended. The use of P-values in this approach is discussed in some detail. The author also discusses some problems which are often found in introductory statistics textbooks. The problems involve the definitions of…
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute works to provide information on cancer statistics in an effort to reduce the burden of cancer among the U.S. population.
... cancer statistics across the world. U.S. Cancer Mortality Trends The best indicator of progress against cancer is ... the number of cancer survivors has increased. These trends show that progress is being made against the ...
NASA Astrophysics Data System (ADS)
Hermann, Claudine
Statistical Physics bridges the properties of a macroscopic system and the microscopic behavior of its constituting particles, otherwise impossible due to the giant magnitude of Avogadro's number. Numerous systems of today's key technologies - such as semiconductors or lasers - are macroscopic quantum objects; only statistical physics allows for understanding their fundamentals. Therefore, this graduate text also focuses on particular applications such as the properties of electrons in solids with applications, and radiation thermodynamics and the greenhouse effect.
Berglund, F
1978-01-01
The use of additives to food fulfils many purposes, as shown by the index issued by the Codex Committee on Food Additives: Acids, bases and salts; Preservatives, Antioxidants and antioxidant synergists; Anticaking agents; Colours; Emulfifiers; Thickening agents; Flour-treatment agents; Extraction solvents; Carrier solvents; Flavours (synthetic); Flavour enhancers; Non-nutritive sweeteners; Processing aids; Enzyme preparations. Many additives occur naturally in foods, but this does not exclude toxicity at higher levels. Some food additives are nutrients, or even essential nutritents, e.g. NaCl. Examples are known of food additives causing toxicity in man even when used according to regulations, e.g. cobalt in beer. In other instances, poisoning has been due to carry-over, e.g. by nitrate in cheese whey - when used for artificial feed for infants. Poisonings also occur as the result of the permitted substance being added at too high levels, by accident or carelessness, e.g. nitrite in fish. Finally, there are examples of hypersensitivity to food additives, e.g. to tartrazine and other food colours. The toxicological evaluation, based on animal feeding studies, may be complicated by impurities, e.g. orthotoluene-sulfonamide in saccharin; by transformation or disappearance of the additive in food processing in storage, e.g. bisulfite in raisins; by reaction products with food constituents, e.g. formation of ethylurethane from diethyl pyrocarbonate; by metabolic transformation products, e.g. formation in the gut of cyclohexylamine from cyclamate. Metabolic end products may differ in experimental animals and in man: guanylic acid and inosinic acid are metabolized to allantoin in the rat but to uric acid in man. The magnitude of the safety margin in man of the Acceptable Daily Intake (ADI) is not identical to the "safety factor" used when calculating the ADI. The symptoms of Chinese Restaurant Syndrome, although not hazardous, furthermore illustrate that the whole ADI
Han, Yali; Liu, Jie; Sun, Meili; Zhang, Zongpu; Liu, Chuanyong; Sun, Yuping
2016-01-01
Background. There is no definitive conclusion so far on the predictive values of ERCC1 polymorphisms for clinical outcomes of platinum-based chemotherapy in non-small cell lung cancer (NSCLC). We updated this meta-analysis with an expectation to obtain some statistical advancement on this issue. Methods. Relevant studies were identified by searching MEDLINE, EMBASE databases from inception to April 2015. Primary outcomes included objective response rate (ORR), progression-free survival (PFS), and overall survival (OS). All analyses were performed using the Review Manager version 5.3 and the Stata version 12.0. Results. A total of 33 studies including 5373 patients were identified. ERCC1 C118T and C8092A could predict both ORR and OS for platinum-based chemotherapy in Asian NSCLC patients (CT + TT versus CC, ORR: OR = 0.80, 95% CI = 0.67–0.94; OS: HR = 1.24, 95% CI = 1.01–1.53) (CA + AA versus CC, ORR: OR = 0.76, 95% CI = 0.60–0.96; OS: HR = 1.37, 95% CI = 1.06–1.75). Conclusions. Current evidence strongly indicated the prospect of ERCC1 C118T and C8092A as predictive biomarkers for platinum-based chemotherapy in Asian NSCLC patients. However, the results should be interpreted with caution and large prospective studies are still required to further investigate these findings. PMID:27057082
NASA Astrophysics Data System (ADS)
Goodman, J. W.
This book is based on the thesis that some training in the area of statistical optics should be included as a standard part of any advanced optics curriculum. Random variables are discussed, taking into account definitions of probability and random variables, distribution functions and density functions, an extension to two or more random variables, statistical averages, transformations of random variables, sums of real random variables, Gaussian random variables, complex-valued random variables, and random phasor sums. Other subjects examined are related to random processes, some first-order properties of light waves, the coherence of optical waves, some problems involving high-order coherence, effects of partial coherence on imaging systems, imaging in the presence of randomly inhomogeneous media, and fundamental limits in photoelectric detection of light. Attention is given to deterministic versus statistical phenomena and models, the Fourier transform, and the fourth-order moment of the spectrum of a detected speckle image.
ERIC Educational Resources Information Center
Chicot, Katie; Holmes, Hilary
2012-01-01
The use, and misuse, of statistics is commonplace, yet in the printed format data representations can be either over simplified, supposedly for impact, or so complex as to lead to boredom, supposedly for completeness and accuracy. In this article the link to the video clip shows how dynamic visual representations can enliven and enhance the…
ERIC Educational Resources Information Center
Catley, Alan
2007-01-01
Following the announcement last year that there will be no more math coursework assessment at General Certificate of Secondary Education (GCSE), teachers will in the future be able to devote more time to preparing learners for formal examinations. One of the key things that the author has learned when teaching statistics is that it makes for far…
Rudolf Keller
2004-08-10
In this project, a concept to improve the performance of aluminum production cells by introducing potlining additives was examined and tested. Boron oxide was added to cathode blocks, and titanium was dissolved in the metal pool; this resulted in the formation of titanium diboride and caused the molten aluminum to wet the carbonaceous cathode surface. Such wetting reportedly leads to operational improvements and extended cell life. In addition, boron oxide suppresses cyanide formation. This final report presents and discusses the results of this project. Substantial economic benefits for the practical implementation of the technology are projected, especially for modern cells with graphitized blocks. For example, with an energy savings of about 5% and an increase in pot life from 1500 to 2500 days, a cost savings of $ 0.023 per pound of aluminum produced is projected for a 200 kA pot.
Harrup, Mason K; Rollins, Harry W
2013-11-26
An additive comprising a phosphazene compound that has at least two reactive functional groups and at least one capping functional group bonded to phosphorus atoms of the phosphazene compound. One of the at least two reactive functional groups is configured to react with cellulose and the other of the at least two reactive functional groups is configured to react with a resin, such as an amine resin of a polycarboxylic acid resin. The at least one capping functional group is selected from the group consisting of a short chain ether group, an alkoxy group, or an aryloxy group. Also disclosed are an additive-resin admixture, a method of treating a wood product, and a wood product.
Januszyk, Michael; Gurtner, Geoffrey C
2011-01-01
The scope of biomedical research has expanded rapidly during the past several decades, and statistical analysis has become increasingly necessary to understand the meaning of large and diverse quantities of raw data. As such, a familiarity with this lexicon is essential for critical appraisal of medical literature. This article attempts to provide a practical overview of medical statistics, with an emphasis on the selection, application, and interpretation of specific tests. This includes a brief review of statistical theory and its nomenclature, particularly with regard to the classification of variables. A discussion of descriptive methods for data presentation is then provided, followed by an overview of statistical inference and significance analysis, and detailed treatment of specific statistical tests and guidelines for their interpretation. PMID:21200241
ERIC Educational Resources Information Center
Peterson, Lisa S.
2008-01-01
Clinical significance is an important concept in research, particularly in education and the social sciences. The present article first compares clinical significance to other measures of "significance" in statistics. The major methods used to determine clinical significance are explained and the strengths and weaknesses of clinical significance…
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. PMID:26466186
NASA Astrophysics Data System (ADS)
Goodman, Joseph W.
2000-07-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Robert G. Bartle The Elements of Integration and Lebesgue Measure George E. P. Box & Norman R. Draper Evolutionary Operation: A Statistical Method for Process Improvement George E. P. Box & George C. Tiao Bayesian Inference in Statistical Analysis R. W. Carter Finite Groups of Lie Type: Conjugacy Classes and Complex Characters R. W. Carter Simple Groups of Lie Type William G. Cochran & Gertrude M. Cox Experimental Designs, Second Edition Richard Courant Differential and Integral Calculus, Volume I RIchard Courant Differential and Integral Calculus, Volume II Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume I Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume II D. R. Cox Planning of Experiments Harold S. M. Coxeter Introduction to Geometry, Second Edition Charles W. Curtis & Irving Reiner Representation Theory of Finite Groups and Associative Algebras Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II Cuthbert Daniel Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition Bruno de Finetti Theory of Probability, Volume I Bruno de Finetti Theory of Probability, Volume 2 W. Edwards Deming Sample Design in Business Research
"Old" tail lobes provide significant additional substorm power
NASA Astrophysics Data System (ADS)
Mishin, V.; Mishin, V. V.; Karavaev, Y.
2012-12-01
In each polar cap (PC) we mark out "old PC" observed during quiet time before the event under consideration, and "new PC" that emerges during rounding the old one and expanding the PC total area. Old and new PCs correspond in the magnetosphere to the old and new tail lobes, respectively. The new lobe variable magnetic flux Ψ1 is usually assumed to be active, i.e. it provides transport of the electromagnetic energy flux (Poynting flux) ɛ' from solar wind into the magnetosphere. The old lobe magnetic flux Ψ2 is usually supposed to be passive, i.e. it remains constant during the disturbance and does not participate in the transporting process which would mean the old PC electric field absolute screening from the convection electric field created by the magnetopause reconnection. In fact, screening is observed, but it is far from absolute. We suggest a model of screening and determine its quantitative characteristics in the selected superstorm. The coefficient of a screening is the β = Ψ2/Ψ02, where Ψ02 = const is open magnetic flux through the old PC measured prior to the substorm, and Ψ2 is variable magnetic flux during the substorm. We consider three various regimes of disturbance. In each, the coefficient β decreased during the loading phase and increased at the unloading phase, but the rates and amplitudes of variations exhibited a strong dependence on the regime. We interpreted decrease in β as a result of involving the old PC magnetic flux Ψ2, which was considered to be constant earlier, to the ' transport process of the Poynting flux from the solar wind into the magnetosphere. A weakening of the transport process at the subsequent unloading phase creates increase in β. Estimates showed that coefficient β during each regime and the computed Poynting flux varied manifolds. In general, unlike the existing substorm conception, the new scenario describes an unknown earlier tail lobe activation process during a substorm growth phase that effectively increases the accumulated tail energy for the expansion and recovery phases.
1986-01-01
Official population data for the USSR are presented for 1985 and 1986. Part 1 (pp. 65-72) contains data on capitals of union republics and cities with over one million inhabitants, including population estimates for 1986 and vital statistics for 1985. Part 2 (p. 72) presents population estimates by sex and union republic, 1986. Part 3 (pp. 73-6) presents data on population growth, including birth, death, and natural increase rates, 1984-1985; seasonal distribution of births and deaths; birth order; age-specific birth rates in urban and rural areas and by union republic; marriages; age at marriage; and divorces. PMID:12178831
Intervention for Maltreating Fathers: Statistically and Clinically Significant Change
ERIC Educational Resources Information Center
Scott, Katreena L.; Lishak, Vicky
2012-01-01
Objective: Fathers are seldom the focus of efforts to address child maltreatment and little is currently known about the effectiveness of intervention for this population. To address this gap, we examined the efficacy of a community-based group treatment program for fathers who had abused or neglected their children or exposed their children to…
Worry, Intolerance of Uncertainty, and Statistics Anxiety
ERIC Educational Resources Information Center
Williams, Amanda S.
2013-01-01
Statistics anxiety is a problem for most graduate students. This study investigates the relationship between intolerance of uncertainty, worry, and statistics anxiety. Intolerance of uncertainty was significantly related to worry, and worry was significantly related to three types of statistics anxiety. Six types of statistics anxiety were…
Lubricant and additive effects on spur gear fatigue life
NASA Technical Reports Server (NTRS)
Townsend, D. P.; Zaretsky, E. V.; Scibbe, H. W.
1985-01-01
Spur gear endurance tests were conducted with six lubricants using a single lot of consumable-electrode vacuum melted (CVM) AISI 9310 spur gears. The sixth lubricant was divided into four batches each of which had a different additive content. Lubricants tested with a phosphorus-type load carrying additive showed a statistically significant improvement in life over lubricants without this type of additive. The presence of sulfur type antiwear additives in the lubricant did not appear to affect the surface fatigue life of the gears. No statistical difference in life was produced with those lubricants of different base stocks but with similar viscosity, pressure-viscosity coefficients and antiwear additives. Gears tested with a 0.1 wt % sulfur and 0.1 wt % phosphorus EP additives in the lubricant had reactive films that were 200 to 400 (0.8 to 1.6 microns) thick.
Using scientifically and statistically sufficient statistics in comparing image segmentations.
Chi, Yueh-Yun; Muller, Keith E
2010-01-01
Automatic computer segmentation in three dimensions creates opportunity to reduce the cost of three-dimensional treatment planning of radiotherapy for cancer treatment. Comparisons between human and computer accuracy in segmenting kidneys in CT scans generate distance values far larger in number than the number of CT scans. Such high dimension, low sample size (HDLSS) data present a grand challenge to statisticians: how do we find good estimates and make credible inference? We recommend discovering and using scientifically and statistically sufficient statistics as an additional strategy for overcoming the curse of dimensionality. First, we reduced the three-dimensional array of distances for each image comparison to a histogram to be modeled individually. Second, we used non-parametric kernel density estimation to explore distributional patterns and assess multi-modality. Third, a systematic exploratory search for parametric distributions and truncated variations led to choosing a Gaussian form as approximating the distribution of a cube root transformation of distance. Fourth, representing each histogram by an individually estimated distribution eliminated the HDLSS problem by reducing on average 26,000 distances per histogram to just 2 parameter estimates. In the fifth and final step we used classical statistical methods to demonstrate that the two human observers disagreed significantly less with each other than with the computer segmentation. Nevertheless, the size of all disagreements was clinically unimportant relative to the size of a kidney. The hierarchal modeling approach to object-oriented data created response variables deemed sufficient by both the scientists and statisticians. We believe the same strategy provides a useful addition to the imaging toolkit and will succeed with many other high throughput technologies in genetics, metabolomics and chemical analysis. PMID:24967000
Candidate Assembly Statistical Evaluation
Energy Science and Technology Software Center (ESTSC)
1998-07-15
The Savannah River Site (SRS) receives aluminum clad spent Material Test Reactor (MTR) fuel from all over the world for storage and eventual reprocessing. There are hundreds of different kinds of MTR fuels and these fuels will continue to be received at SRS for approximately ten more years. SRS''s current criticality evaluation methodology requires the modeling of all MTR fuels utilizing Monte Carlo codes, which is extremely time consuming and resource intensive. Now that amore » significant number of MTR calculations have been conducted it is feasible to consider building statistical models that will provide reasonable estimations of MTR behavior. These statistical models can be incorporated into a standardized model homogenization spreadsheet package to provide analysts with a means of performing routine MTR fuel analyses with a minimal commitment of time and resources. This became the purpose for development of the Candidate Assembly Statistical Evaluation (CASE) program at SRS.« less
Smith, Kristy Breuhl; Smith, Michael Seth
2016-03-01
Obesity is a chronic disease that is strongly associated with an increase in mortality and morbidity including, certain types of cancer, cardiovascular disease, disability, diabetes mellitus, hypertension, osteoarthritis, and stroke. In adults, overweight is defined as a body mass index (BMI) of 25 kg/m(2) to 29 kg/m(2) and obesity as a BMI of greater than 30 kg/m(2). If current trends continue, it is estimated that, by the year 2030, 38% of the world's adult population will be overweight and another 20% obese. Significant global health strategies must reduce the morbidity and mortality associated with the obesity epidemic. PMID:26896205
Perception in statistical graphics
NASA Astrophysics Data System (ADS)
VanderPlas, Susan Ruth
There has been quite a bit of research on statistical graphics and visualization, generally focused on new types of graphics, new software to create graphics, interactivity, and usability studies. Our ability to interpret and use statistical graphics hinges on the interface between the graph itself and the brain that perceives and interprets it, and there is substantially less research on the interplay between graph, eye, brain, and mind than is sufficient to understand the nature of these relationships. The goal of the work presented here is to further explore the interplay between a static graph, the translation of that graph from paper to mental representation (the journey from eye to brain), and the mental processes that operate on that graph once it is transferred into memory (mind). Understanding the perception of statistical graphics should allow researchers to create more effective graphs which produce fewer distortions and viewer errors while reducing the cognitive load necessary to understand the information presented in the graph. Taken together, these experiments should lay a foundation for exploring the perception of statistical graphics. There has been considerable research into the accuracy of numerical judgments viewers make from graphs, and these studies are useful, but it is more effective to understand how errors in these judgments occur so that the root cause of the error can be addressed directly. Understanding how visual reasoning relates to the ability to make judgments from graphs allows us to tailor graphics to particular target audiences. In addition, understanding the hierarchy of salient features in statistical graphics allows us to clearly communicate the important message from data or statistical models by constructing graphics which are designed specifically for the perceptual system.
DESIGNING ENVIRONMENTAL MONITORING DATABASES FOR STATISTIC ASSESSMENT
Databases designed for statistical analyses have characteristics that distinguish them from databases intended for general use. EMAP uses a probabilistic sampling design to collect data to produce statistical assessments of environmental conditions. In addition to supporting the ...
A SIGNIFICANCE TEST FOR THE LASSO1
Lockhart, Richard; Taylor, Jonathan; Tibshirani, Ryan J.; Tibshirani, Robert
2014-01-01
In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test statistic, and show that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model). Our proof of this result for the special case of the first predictor to enter the model (i.e., testing for a single significant predictor variable against the global null) requires only weak assumptions on the predictor matrix X. On the other hand, our proof for a general step in the lasso path places further technical assumptions on X and the generative model, but still allows for the important high-dimensional case p > n, and does not necessarily require that the current lasso model achieves perfect recovery of the truly active variables. Of course, for testing the significance of an additional variable between two nested linear models, one typically uses the chi-squared test, comparing the drop in residual sum of squares (RSS) to a χ12 distribution. But when this additional variable is not fixed, and has been chosen adaptively or greedily, this test is no longer appropriate: adaptivity makes the drop in RSS stochastically much larger than χ12 under the null hypothesis. Our analysis explicitly accounts for adaptivity, as it must, since the lasso builds an adaptive sequence of linear models as the tuning parameter λ decreases. In this analysis, shrinkage plays a key role: though additional variables are chosen adaptively, the coefficients of lasso active variables are shrunken due to the l1 penalty. Therefore, the test statistic (which is based on lasso fitted values) is in a sense balanced by these two opposing properties—adaptivity and
Cosmetic Plastic Surgery Statistics
2014 Cosmetic Plastic Surgery Statistics Cosmetic Procedure Trends 2014 Plastic Surgery Statistics Report Please credit the AMERICAN SOCIETY OF PLASTIC SURGEONS when citing statistical data or using ...
Statistical Studies of Supernova Environments
NASA Astrophysics Data System (ADS)
Anderson, Joseph P.; James, Phil A.; Habergham, Stacey M.; Galbany, Lluís; Kuncarayakti, Hanindyo
2015-05-01
Mapping the diversity of SNe to progenitor properties is key to our understanding of stellar evolution and explosive stellar death. Investigations of the immediate environments of SNe allow statistical constraints to be made on progenitor properties such as mass and metallicity. Here, we review the progress that has been made in this field. Pixel statistics using tracers of e.g. star formation within galaxies show intriguing differences in the explosion sites of, in particular SNe types II and Ibc (SNe II and SNe Ibc respectively), suggesting statistical differences in population ages. Of particular interest is that SNe Ic are significantly more associated with host galaxy Hα emission than SNe Ib, implying shorter lifetimes for the former. In addition, such studies have shown (unexpectedly) that the interacting SNe IIn do not explode in regions containing the most massive stars, which suggests that at least a significant fraction of their progenitors arise from the lower end of the core-collapse SN mass range. Host H ii region spectroscopy has been obtained for a significant number of core-collapse events, however definitive conclusions on differences between distinct SN types have to-date been elusive. Single stellar evolution models predict that the relative fraction of SNe Ibc to SNe II should increase with increasing metallicity, due to the dependence of mass-loss rates on progenitor metallicity. We present a meta-analysis of all current host H ii region oxygen abundances for CC SNe. It is concluded that the SN II to SN Ibc ratio shows little variation with oxygen abundance, with only a suggestion that the ratio increases in the lowest bin. Radial distributions of different SNe are discussed, where a central excess of SNe Ibc has been observed within disturbed galaxy systems, which is difficult to ascribe to metallicity or selection effects. Environment studies are also being undertaken for SNe Ia, where constraints can be made on the shortest delay times of
Statistics Anxiety among Postgraduate Students
ERIC Educational Resources Information Center
Koh, Denise; Zawi, Mohd Khairi
2014-01-01
Most postgraduate programmes, that have research components, require students to take at least one course of research statistics. Not all postgraduate programmes are science based, there are a significant number of postgraduate students who are from the social sciences that will be taking statistics courses, as they try to complete their…
Additive Manufacturing Infrared Inspection
NASA Technical Reports Server (NTRS)
Gaddy, Darrell
2014-01-01
Additive manufacturing is a rapid prototyping technology that allows parts to be built in a series of thin layers from plastic, ceramics, and metallics. Metallic additive manufacturing is an emerging form of rapid prototyping that allows complex structures to be built using various metallic powders. Significant time and cost savings have also been observed using the metallic additive manufacturing compared with traditional techniques. Development of the metallic additive manufacturing technology has advanced significantly over the last decade, although many of the techniques to inspect parts made from these processes have not advanced significantly or have limitations. Several external geometry inspection techniques exist such as Coordinate Measurement Machines (CMM), Laser Scanners, Structured Light Scanning Systems, or even traditional calipers and gages. All of the aforementioned techniques are limited to external geometry and contours or must use a contact probe to inspect limited internal dimensions. This presentation will document the development of a process for real-time dimensional inspection technique and digital quality record of the additive manufacturing process using Infrared camera imaging and processing techniques.
Impaired Statistical Learning in Developmental Dyslexia
Thiessen, Erik D.; Holt, Lori L.
2015-01-01
Purpose Developmental dyslexia (DD) is commonly thought to arise from phonological impairments. However, an emerging perspective is that a more general procedural learning deficit, not specific to phonological processing, may underlie DD. The current study examined if individuals with DD are capable of extracting statistical regularities across sequences of passively experienced speech and nonspeech sounds. Such statistical learning is believed to be domain-general, to draw upon procedural learning systems, and to relate to language outcomes. Method DD and control groups were familiarized with a continuous stream of syllables or sine-wave tones, the ordering of which was defined by high or low transitional probabilities across adjacent stimulus pairs. Participants subsequently judged two 3-stimulus test items with either high or low statistical coherence as being the most similar to the sounds heard during familiarization. Results As with control participants, the DD group was sensitive to the transitional probability structure of the familiarization materials as evidenced by above-chance performance. However, the performance of participants with DD was significantly poorer than controls across linguistic and nonlinguistic stimuli. In addition, reading-related measures were significantly correlated with statistical learning performance of both speech and nonspeech material. Conclusion Results are discussed in light of procedural learning impairments among participants with DD. PMID:25860795
Thermodynamics of cellular statistical inference
NASA Astrophysics Data System (ADS)
Lang, Alex; Fisher, Charles; Mehta, Pankaj
2014-03-01
Successful organisms must be capable of accurately sensing the surrounding environment in order to locate nutrients and evade toxins or predators. However, single cell organisms face a multitude of limitations on their accuracy of sensing. Berg and Purcell first examined the canonical example of statistical limitations to cellular learning of a diffusing chemical and established a fundamental limit to statistical accuracy. Recent work has shown that the Berg and Purcell learning limit can be exceeded using Maximum Likelihood Estimation. Here, we recast the cellular sensing problem as a statistical inference problem and discuss the relationship between the efficiency of an estimator and its thermodynamic properties. We explicitly model a single non-equilibrium receptor and examine the constraints on statistical inference imposed by noisy biochemical networks. Our work shows that cells must balance sample number, specificity, and energy consumption when performing statistical inference. These tradeoffs place significant constraints on the practical implementation of statistical estimators in a cell.
SOCR: Statistics Online Computational Resource
Dinov, Ivo D.
2011-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build student’s intuition and enhance their learning. PMID:21451741
NASA Technical Reports Server (NTRS)
1994-01-01
Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
NASA Technical Reports Server (NTRS)
1995-01-01
NASA Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
NASA Technical Reports Server (NTRS)
1996-01-01
This booklet of pocket statistics includes the 1996 NASA Major Launch Record, NASA Procurement, Financial, and Workforce data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Luanch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
Scott, M; Flaherty, D; Currall, J
2013-03-01
This short addition to our series on clinical statistics concerns relationships, and answering questions such as "are blood pressure and weight related?" In a later article, we will answer the more interesting question about how they might be related. This article follows on logically from the previous one dealing with categorical data, the major difference being here that we will consider two continuous variables, which naturally leads to the use of a Pearson correlation or occasionally to a Spearman rank correlation coefficient. PMID:23458641
NASA Technical Reports Server (NTRS)
1993-01-01
Pocket Statistics is published for the use of NASA managers and their staff. Included herein is Administrative and Organizational information, summaries of Space Flight Activity including the NASA Major Launch Record, and NASA Procurement, Financial, and Manpower data. The NASA Major Launch Record includes all launches of Scout class and larger vehicles. Vehicle and spacecraft development flights are also included in the Major Launch Record. Shuttle missions are counted as one launch and one payload, where free flying payloads are not involved. Satellites deployed from the cargo bay of the Shuttle and placed in a separate orbit or trajectory are counted as an additional payload.
Predict! Teaching Statistics Using Informational Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie
2013-01-01
Statistics is one of the most widely used topics for everyday life in the school mathematics curriculum. Unfortunately, the statistics taught in schools focuses on calculations and procedures before students have a chance to see it as a useful and powerful tool. Researchers have found that a dominant view of statistics is as an assortment of tools…
Statistics Poker: Reinforcing Basic Statistical Concepts
ERIC Educational Resources Information Center
Leech, Nancy L.
2008-01-01
Learning basic statistical concepts does not need to be tedious or dry; it can be fun and interesting through cooperative learning in the small-group activity of Statistics Poker. This article describes a teaching approach for reinforcing basic statistical concepts that can help students who have high anxiety and makes learning and reinforcing…
Understanding British addiction statistics.
Johnson, B D
1975-01-01
The statistical data issued by the Home Office and Department of Health and Social Security are quite detailed and generally valid measures of hard core addiction in Great Britain (Judson, 1973). Since 1968, the main basis of these high quality British statistics is the routine reports filed by Drug Treatment Centres. The well-trained, experienced staff of these clinics make knowledgeable dicsions about a cleint's addiction, efficiently regulate dosage, and otherwise exert some degree of control over addicts (Judson, 1973; Johnson, 1974). The co-operation of police, courts, prison physicians, and general practitioners is also valuable in collecting data on drug addiction and convictions. Information presented in the tables above indicates that a rising problem of herion addiction between 1962 and 1967 were arrested by the introduction of the treatment clinics in 1968. Further, legally maintained heroin addiction has been reduced by almost one-third since 1968, since many herion addicts have been transferred to injectable methadone. The decline in herion prescribing and the relatively steady number of narcotics addicts has apparently occurred in the face of a continuing, and perhaps increasing, demand for heroin and other opiates. With few exceptions of a minor nature analysis of various tables suggests that the official statistics are internally consistent. There are apparently few "hidden" addicts, since few unknown addicts die of overdoses or are arrested by police (Lewis, 1973), although Blumberg (1974) indicates that some unknown users may exist. In addition, may opitate usersnot officially notified are known by clinic doctors as friends of addicts receiving prescriptions (Judson, 1973; Home Office, 1974). In brief, offical British drug statistics seem to be generally valid and demonstrate that heroin and perhaps methadone addiction has been well contained by the treatment clinics. PMID:1039283
NASA Astrophysics Data System (ADS)
Dunbar, P. K.; Furtney, M.; McLean, S. J.; Sweeney, A. D.
2014-12-01
Tsunamis have inflicted death and destruction on the coastlines of the world throughout history. The occurrence of tsunamis and the resulting effects have been collected and studied as far back as the second millennium B.C. The knowledge gained from cataloging and examining these events has led to significant changes in our understanding of tsunamis, tsunami sources, and methods to mitigate the effects of tsunamis. The most significant, not surprisingly, are often the most devastating, such as the 2011 Tohoku, Japan earthquake and tsunami. The goal of this poster is to give a brief overview of the occurrence of tsunamis and then focus specifically on several significant tsunamis. There are various criteria to determine the most significant tsunamis: the number of deaths, amount of damage, maximum runup height, had a major impact on tsunami science or policy, etc. As a result, descriptions will include some of the most costly (2011 Tohoku, Japan), the most deadly (2004 Sumatra, 1883 Krakatau), and the highest runup ever observed (1958 Lituya Bay, Alaska). The discovery of the Cascadia subduction zone as the source of the 1700 Japanese "Orphan" tsunami and a future tsunami threat to the U.S. northwest coast, contributed to the decision to form the U.S. National Tsunami Hazard Mitigation Program. The great Lisbon earthquake of 1755 marked the beginning of the modern era of seismology. Knowledge gained from the 1964 Alaska earthquake and tsunami helped confirm the theory of plate tectonics. The 1946 Alaska, 1952 Kuril Islands, 1960 Chile, 1964 Alaska, and the 2004 Banda Aceh, tsunamis all resulted in warning centers or systems being established.The data descriptions on this poster were extracted from NOAA's National Geophysical Data Center (NGDC) global historical tsunami database. Additional information about these tsunamis, as well as water level data can be found by accessing the NGDC website www.ngdc.noaa.gov/hazard/
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-10-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, however, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1) P-hacking, which is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want; 2) overemphasis on P values rather than on the actual size of the observed effect; 3) overuse of statistical hypothesis testing, and being seduced by the word "significant"; and 4) over-reliance on standard errors, which are often misunderstood. PMID:25204545
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-11-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason maybe that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood. PMID:25213136
Statistics of atmospheric correlations.
Santhanam, M S; Patra, P K
2001-07-01
For a large class of quantum systems, the statistical properties of their spectrum show remarkable agreement with random matrix predictions. Recent advances show that the scope of random matrix theory is much wider. In this work, we show that the random matrix approach can be beneficially applied to a completely different classical domain, namely, to the empirical correlation matrices obtained from the analysis of the basic atmospheric parameters that characterize the state of atmosphere. We show that the spectrum of atmospheric correlation matrices satisfy the random matrix prescription. In particular, the eigenmodes of the atmospheric empirical correlation matrices that have physical significance are marked by deviations from the eigenvector distribution. PMID:11461326
Neuroendocrine Tumor: Statistics
... Tumor > Neuroendocrine Tumor - Statistics Request Permissions Neuroendocrine Tumor - Statistics Approved by the Cancer.Net Editorial Board , 04/ ... the body. It is important to remember that statistics on how many people survive this type of ...
Antecedents of students' achievement in statistics
NASA Astrophysics Data System (ADS)
Awaludin, Izyan Syazana; Razak, Ruzanna Ab; Harris, Hezlin; Selamat, Zarehan
2015-02-01
The applications of statistics in most fields have been vast. Many degree programmes at local universities require students to enroll in at least one statistics course. The standard of these courses varies across different degree programmes. This is because of students' diverse academic backgrounds in which some comes far from the field of statistics. The high failure rate in statistics courses for non-science stream students had been concerning every year. The purpose of this research is to investigate the antecedents of students' achievement in statistics. A total of 272 students participated in the survey. Multiple linear regression was applied to examine the relationship between the factors and achievement. We found that statistics anxiety was a significant predictor of students' achievement. We also found that students' age has significant effect to achievement. Older students are more likely to achieve lowers scores in statistics. Student's level of study also has a significant impact on their achievement in statistics.
Statistical Modelling of Compound Floods
NASA Astrophysics Data System (ADS)
Bevacqua, Emanuele; Maraun, Douglas; Vrac, Mathieu; Widmann, Martin; Manning, Colin
2016-04-01
of interest. This is based on real data for River discharge (Y RIV ER') and Sea level (Y SEA), from the River Têt in south of France. The impact of the compound flood is the water level in the area between the River and Sea station, which we define here as h = αY RIV ER + (1 ‑ α)Y SEA. Here we show the sensitivity of the system to a changes in the two physical parameters. Through variations in α we can study the system in one or two dimensions which allows for the assessment of the risk associated with either of the two variables alone or with a combination of them. Varying instead the second parameter, i.e. the dependence among the variables Y RIV ER and Y SEA, we show how an apparently weak dependence can increase the risk of flooding significantly with respect to the independent case. The model can be applied to future climate inserting predictors into the statistical model as additional conditioning variables. Through conditioning the simulation of the statistical model on the predictors obtained for future projections from Climate Models, both the change of the risk and characteristics of compound floods for the future can be analysed.
Statistics for People Who (Think They) Hate Statistics. Third Edition
ERIC Educational Resources Information Center
Salkind, Neil J.
2007-01-01
This text teaches an often intimidating and difficult subject in a way that is informative, personable, and clear. The author takes students through various statistical procedures, beginning with correlation and graphical representation of data and ending with inferential techniques and analysis of variance. In addition, the text covers SPSS, and…
[Comment on] Statistical discrimination
NASA Astrophysics Data System (ADS)
Chinn, Douglas
In the December 8, 1981, issue of Eos, a news item reported the conclusion of a National Research Council study that sexual discrimination against women with Ph.D.'s exists in the field of geophysics. Basically, the item reported that even when allowances are made for motherhood the percentage of female Ph.D.'s holding high university and corporate positions is significantly lower than the percentage of male Ph.D.'s holding the same types of positions. The sexual discrimination conclusion, based only on these statistics, assumes that there are no basic psychological differences between men and women that might cause different populations in the employment group studied. Therefore, the reasoning goes, after taking into account possible effects from differences related to anatomy, such as women stopping their careers in order to bear and raise children, the statistical distributions of positions held by male and female Ph.D.'s ought to be very similar to one another. Any significant differences between the distributions must be caused primarily by sexual discrimination.
Statistical Reference Datasets
National Institute of Standards and Technology Data Gateway
Statistical Reference Datasets (Web, free access) The Statistical Reference Datasets is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.
Introductory Statistics and Fish Management.
ERIC Educational Resources Information Center
Jardine, Dick
2002-01-01
Describes how fisheries research and management data (available on a website) have been incorporated into an Introductory Statistics course. In addition to the motivation gained from seeing the practical relevance of the course, some students have participated in the data collection and analysis for the New Hampshire Fish and Game Department. (MM)
NASA Astrophysics Data System (ADS)
Hsu, Hsiao-Ping; Nadler, Walder; Grassberger, Peter
2005-07-01
The scaling behavior of randomly branched polymers in a good solvent is studied in two to nine dimensions, modeled by lattice animals on simple hypercubic lattices. For the simulations, we use a biased sequential sampling algorithm with re-sampling, similar to the pruned-enriched Rosenbluth method (PERM) used extensively for linear polymers. We obtain high statistics of animals with up to several thousand sites in all dimension 2⩽d⩽9. The partition sum (number of different animals) and gyration radii are estimated. In all dimensions we verify the Parisi-Sourlas prediction, and we verify all exactly known critical exponents in dimensions 2, 3, 4, and ⩾8. In addition, we present the hitherto most precise estimates for growth constants in d⩾3. For clusters with one site attached to an attractive surface, we verify the superuniversality of the cross-over exponent at the adsorption transition predicted by Janssen and Lyssy.
Comparison of methods for computing streamflow statistics for Pennsylvania streams
Ehlke, Marla H.; Reed, Lloyd A.
1999-01-01
Methods for computing streamflow statistics intended for use on ungaged locations on Pennsylvania streams are presented and compared to frequency distributions of gaged streamflow data. The streamflow statistics used in the comparisons include the 7-day 10-year low flow, 50-year flood flow, and the 100-year flood flow; additional statistics are presented. Streamflow statistics for gaged locations on streams in Pennsylvania were computed using three methods for the comparisons: 1) Log-Pearson type III frequency distribution (Log-Pearson) of continuous-record streamflow data, 2) regional regression equations developed by the U.S. Geological Survey in 1982 (WRI 82-21), and 3) regional regression equations developed by the Pennsylvania State University in 1981 (PSU-IV). Log-Pearson distribution was considered the reference method for evaluation of the regional regression equations. Low-flow statistics were computed using the Log-Pearson distribution and WRI 82-21, whereas flood-flow statistics were computed using all three methods. The urban adjustment for PSU-IV was modified from the recommended computation to exclude Philadelphia and the surrounding areas (region 1) from the adjustment. Adjustments for storage area for PSU-IV were also slightly modified. A comparison of the 7-day 10-year low flow computed from Log-Pearson distribution and WRI-82- 21 showed that the methods produced significantly different values for about 7 percent of the state. The same methods produced 50-year and 100-year flood flows that were significantly different for about 24 percent of the state. Flood-flow statistics computed using Log-Pearson distribution and PSU-IV were not significantly different in any regions of the state. These findings are based on a statistical comparison using the t-test on signed ranks and graphical methods.
Golombick, Terry; Diamond, Terrence H; Manoharan, Arumugam; Ramakrishna, Rajeev
2016-06-01
Hypothesis Prior studies on patients with early B-cell lymphoid malignancies suggest that early intervention with curcumin may lead to delay in progressive disease and prolonged survival. These patients are characterized by increased susceptibility to infections. Rice bran arabinoxylan (Ribraxx) has been shown to have immunostimulatory, anti-inflammatory, and proapoptotic effects. We postulated that addition of Ribraxx to curcumin therapy may be of benefit. Study design Monoclonal gammopathy of undetermined significance (MGUS)/smoldering multiple myeloma (SMM) or stage 0/1 chronic lymphocytic leukemia (CLL) patients who had been on oral curcumin therapy for a period of 6 months or more were administered both curcumin (as Curcuforte) and Ribraxx. Methods Ten MGUS/SMM patients and 10 patients with stage 0/1 CLL were administered 6 g of curcumin and 2 g Ribraxx daily. Blood samples were collected at baseline and at 2-month intervals for a period of 6 months, and various markers were monitored. MGUS/SMM patients included full blood count (FBC); paraprotein; free light chains/ratio; C-reactive protein (CRP)and erythrocyte sedimentation rate (ESR); B2 microglobulin and immunological markers. Markers monitored for stage 0/1 CLL were FBC, CRP and ESR, and immunological markers. Results Of 10 MGUS/SMM patients,5 (50%) were neutropenic at baseline, and the Curcuforte/Ribraxx combination therapy showed an increased neutrophil count, varying between 10% and 90% among 8 of the 10 (80%) MGUS/SMM patients. An additional benefit of the combination therapy was the potent effect in reducing the raised ESR in 4 (44%) of the MGUS/SMM patients. Conclusion Addition of Ribraxx to curcumin therapy may be of benefit to patients with early-stage B-cell lymphoid malignancies. PMID:27154182
Chiou, Chei-Chang; Wang, Yu-Min; Lee, Li-Tze
2014-08-01
Statistical knowledge is widely used in academia; however, statistics teachers struggle with the issue of how to reduce students' statistics anxiety and enhance students' statistics learning. This study assesses the effectiveness of a "one-minute paper strategy" in reducing students' statistics-related anxiety and in improving students' statistics-related achievement. Participants were 77 undergraduates from two classes enrolled in applied statistics courses. An experiment was implemented according to a pretest/posttest comparison group design. The quasi-experimental design showed that the one-minute paper strategy significantly reduced students' statistics anxiety and improved students' statistics learning achievement. The strategy was a better instructional tool than the textbook exercise for reducing students' statistics anxiety and improving students' statistics achievement. PMID:25153964
Statistical Seismology and Induced Seismicity
NASA Astrophysics Data System (ADS)
Tiampo, K. F.; González, P. J.; Kazemian, J.
2014-12-01
While seismicity triggered or induced by natural resources production such as mining or water impoundment in large dams has long been recognized, the recent increase in the unconventional production of oil and gas has been linked to rapid rise in seismicity in many places, including central North America (Ellsworth et al., 2012; Ellsworth, 2013). Worldwide, induced events of M~5 have occurred and, although rare, have resulted in both damage and public concern (Horton, 2012; Keranen et al., 2013). In addition, over the past twenty years, the increase in both number and coverage of seismic stations has resulted in an unprecedented ability to precisely record the magnitude and location of large numbers of small magnitude events. The increase in the number and type of seismic sequences available for detailed study has revealed differences in their statistics that previously difficult to quantify. For example, seismic swarms that produce significant numbers of foreshocks as well as aftershocks have been observed in different tectonic settings, including California, Iceland, and the East Pacific Rise (McGuire et al., 2005; Shearer, 2012; Kazemian et al., 2014). Similarly, smaller events have been observed prior to larger induced events in several occurrences from energy production. The field of statistical seismology has long focused on the question of triggering and the mechanisms responsible (Stein et al., 1992; Hill et al., 1993; Steacy et al., 2005; Parsons, 2005; Main et al., 2006). For example, in most cases the associated stress perturbations are much smaller than the earthquake stress drop, suggesting an inherent sensitivity to relatively small stress changes (Nalbant et al., 2005). Induced seismicity provides the opportunity to investigate triggering and, in particular, the differences between long- and short-range triggering. Here we investigate the statistics of induced seismicity sequences from around the world, including central North America and Spain, and
NASA Astrophysics Data System (ADS)
Holmes, Jon L.
2000-06-01
IP-number access. Current subscriptions can be upgraded to IP-number access at little additional cost. We are pleased to be able to offer to institutions and libraries this convenient mode of access to subscriber only resources at JCE Online. JCE Online Usage Statistics We are continually amazed by the activity at JCE Online. So far, the year 2000 has shown a marked increase. Given the phenomenal overall growth of the Internet, perhaps our surprise is not warranted. However, during the months of January and February 2000, over 38,000 visitors requested over 275,000 pages. This is a monthly increase of over 33% from the October-December 1999 levels. It is good to know that people are visiting, but we would very much like to know what you would most like to see at JCE Online. Please send your suggestions to JCEOnline@chem.wisc.edu. For those who are interested, JCE Online year-to-date statistics are available. Biographical Snapshots of Famous Chemists: Mission Statement Feature Editor: Barbara Burke Chemistry Department, California State Polytechnic University-Pomona, Pomona, CA 91768 phone: 909/869-3664 fax: 909/869-4616 email: baburke@csupomona.edu The primary goal of this JCE Internet column is to provide information about chemists who have made important contributions to chemistry. For each chemist, there is a short biographical "snapshot" that provides basic information about the person's chemical work, gender, ethnicity, and cultural background. Each snapshot includes links to related websites and to a biobibliographic database. The database provides references for the individual and can be searched through key words listed at the end of each snapshot. All students, not just science majors, need to understand science as it really is: an exciting, challenging, human, and creative way of learning about our natural world. Investigating the life experiences of chemists can provide a means for students to gain a more realistic view of chemistry. In addition students
[Big data in official statistics].
Zwick, Markus
2015-08-01
The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany. PMID:26077871
Ranald Macdonald and statistical inference.
Smith, Philip T
2009-05-01
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing. PMID:19351454
Photon statistics: math versus mysticism
NASA Astrophysics Data System (ADS)
Kracklauer, A. F.
2013-10-01
Critical analysis is given for mystical aspects of the current understanding of interaction between charged particles: wave-particle duality and nonlocal entanglement. A possible statistical effect concerning distribution functions for coincidences between the output channels of beam splitters is described. If this effect is observed in beam splitter data, ten significant evidence for photon splitting, i.e. , against the notion that light is ultimately packaged in finite chunks, has been found. An argument is given for the invalidity of the meaning attached to tests of Bell inequalities. Additionally, a totally classical paradigm for the calculation of the customary expression for the "quantum" coincidence coefficient pertaining to the singlet state is described. If fully accounts for the results of experimental tests of Bell inequalities taken nowadays to prove the reality of entanglement and non-locality in quantum phenomena of, inter alia, light. Described. It fully accounts for the results of experimental tests of Bell inequalities take n nowadays to prove the reality of entanglement and non-locality in quantum phenomena of inter alia, light.
... Research AMIGAS Fighting Cervical Cancer Worldwide Stay Informed Statistics for Other Kinds of Cancer Breast Cervical Colorectal ( ... Skin Vaginal and Vulvar Cancer Home Uterine Cancer Statistics Language: English Español (Spanish) Recommend on Facebook Tweet ...
Mathematical and statistical analysis
NASA Technical Reports Server (NTRS)
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Minnesota Health Statistics 1988.
ERIC Educational Resources Information Center
Minnesota State Dept. of Health, St. Paul.
This document comprises the 1988 annual statistical report of the Minnesota Center for Health Statistics. After introductory technical notes on changes in format, sources of data, and geographic allocation of vital events, an overview is provided of vital health statistics in all areas. Thereafter, separate sections of the report provide tables…
ERIC Educational Resources Information Center
Lenard, Christopher; McCarthy, Sally; Mills, Terence
2014-01-01
There are many different aspects of statistics. Statistics involves mathematics, computing, and applications to almost every field of endeavour. Each aspect provides an opportunity to spark someone's interest in the subject. In this paper we discuss some ethical aspects of statistics, and describe how an introduction to ethics has been…
ERIC Educational Resources Information Center
Strasser, Nora
2007-01-01
Avoiding statistical mistakes is important for educators at all levels. Basic concepts will help you to avoid making mistakes using statistics and to look at data with a critical eye. Statistical data is used at educational institutions for many purposes. It can be used to support budget requests, changes in educational philosophy, changes to…
Statistical quality management
NASA Astrophysics Data System (ADS)
Vanderlaan, Paul
1992-10-01
Some aspects of statistical quality management are discussed. Quality has to be defined as a concrete, measurable quantity. The concepts of Total Quality Management (TQM), Statistical Process Control (SPC), and inspection are explained. In most cases SPC is better than inspection. It can be concluded that statistics has great possibilities in the field of TQM.
Explorations in statistics: statistical facets of reproducibility.
Curran-Everett, Douglas
2016-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eleventh installment of Explorations in Statistics explores statistical facets of reproducibility. If we obtain an experimental result that is scientifically meaningful and statistically unusual, we would like to know that our result reflects a general biological phenomenon that another researcher could reproduce if (s)he repeated our experiment. But more often than not, we may learn this researcher cannot replicate our result. The National Institutes of Health and the Federation of American Societies for Experimental Biology have created training modules and outlined strategies to help improve the reproducibility of research. These particular approaches are necessary, but they are not sufficient. The principles of hypothesis testing and estimation are inherent to the notion of reproducibility in science. If we want to improve the reproducibility of our research, then we need to rethink how we apply fundamental concepts of statistics to our science. PMID:27231259
Exploring Correlation Coefficients with Golf Statistics
ERIC Educational Resources Information Center
Quinn, Robert J
2006-01-01
This article explores the relationships between several pairs of statistics kept on professional golfers on the PGA tour. Specifically, two measures related to the player's ability to drive the ball are compared as are two measures related to the player's ability to putt. An additional analysis is made between one statistic related to putting and…
Florida Library Directory with Statistics, 1998.
ERIC Educational Resources Information Center
Florida Dept. of State, Tallahassee. Div. of Library and Information Services.
This 49th annual Florida Library directory with statistics edition includes listings for over 1,000 libraries of all types in Florida, with contact named, phone numbers, addresses, and e-mail and web addresses. In addition, there is a section of library statistics, showing data on the use, resources, and financial condition of Florida's libraries.…
Statistical prediction of cyclostationary processes
Kim, K.Y.
2000-03-15
Considered in this study is a cyclostationary generalization of an EOF-based prediction method. While linear statistical prediction methods are typically optimal in the sense that prediction error variance is minimal within the assumption of stationarity, there is some room for improved performance since many physical processes are not stationary. For instance, El Nino is known to be strongly phase locked with the seasonal cycle, which suggests nonstationarity of the El Nino statistics. Many geophysical and climatological processes may be termed cyclostationary since their statistics show strong cyclicity instead of stationarity. Therefore, developed in this study is a cyclostationary prediction method. Test results demonstrate that performance of prediction methods can be improved significantly by accounting for the cyclostationarity of underlying processes. The improvement comes from an accurate rendition of covariance structure both in space and time.
Thermodynamic Limit in Statistical Physics
NASA Astrophysics Data System (ADS)
Kuzemsky, A. L.
2014-03-01
The thermodynamic limit in statistical thermodynamics of many-particle systems is an important but often overlooked issue in the various applied studies of condensed matter physics. To settle this issue, we review tersely the past and present disposition of thermodynamic limiting procedure in the structure of the contemporary statistical mechanics and our current understanding of this problem. We pick out the ingenious approach by Bogoliubov, who developed a general formalism for establishing the limiting distribution functions in the form of formal series in powers of the density. In that study, he outlined the method of justification of the thermodynamic limit when he derived the generalized Boltzmann equations. To enrich and to weave our discussion, we take this opportunity to give a brief survey of the closely related problems, such as the equipartition of energy and the equivalence and nonequivalence of statistical ensembles. The validity of the equipartition of energy permits one to decide what are the boundaries of applicability of statistical mechanics. The major aim of this work is to provide a better qualitative understanding of the physical significance of the thermodynamic limit in modern statistical physics of the infinite and "small" many-particle systems.
NASA Astrophysics Data System (ADS)
Schieve, William C.; Horwitz, Lawrence P.
2009-04-01
1. Foundations of quantum statistical mechanics; 2. Elementary examples; 3. Quantum statistical master equation; 4. Quantum kinetic equations; 5. Quantum irreversibility; 6. Entropy and dissipation: the microscopic theory; 7. Global equilibrium: thermostatics and the microcanonical ensemble; 8. Bose-Einstein ideal gas condensation; 9. Scaling, renormalization and the Ising model; 10. Relativistic covariant statistical mechanics of many particles; 11. Quantum optics and damping; 12. Entanglements; 13. Quantum measurement and irreversibility; 14. Quantum Langevin equation: quantum Brownian motion; 15. Linear response: fluctuation and dissipation theorems; 16. Time dependent quantum Green's functions; 17. Decay scattering; 18. Quantum statistical mechanics, extended; 19. Quantum transport with tunneling and reservoir ballistic transport; 20. Black hole thermodynamics; Appendix; Index.
Statistical distribution sampling
NASA Technical Reports Server (NTRS)
Johnson, E. S.
1975-01-01
Determining the distribution of statistics by sampling was investigated. Characteristic functions, the quadratic regression problem, and the differential equations for the characteristic functions are analyzed.
Significant Scales in Community Structure
Traag, V. A.; Krings, G.; Van Dooren, P.
2013-01-01
Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant. This problem shows foremost in multi-resolution methods. We here introduce an efficient method for scanning for resolutions in one such method. Additionally, we introduce the notion of “significance” of a partition, based on subgraph probabilities. Significance is independent of the exact method used, so could also be applied in other methods, and can be interpreted as the gain in encoding a graph by making use of a partition. Using significance, we can determine “good” resolution parameters, which we demonstrate on benchmark networks. Moreover, optimizing significance itself also shows excellent performance. We demonstrate our method on voting data from the European Parliament. Our analysis suggests the European Parliament has become increasingly ideologically divided and that nationality plays no role. PMID:24121597
Illustrating the practice of statistics
Hamada, Christina A; Hamada, Michael S
2009-01-01
The practice of statistics involves analyzing data and planning data collection schemes to answer scientific questions. Issues often arise with the data that must be dealt with and can lead to new procedures. In analyzing data, these issues can sometimes be addressed through the statistical models that are developed. Simulation can also be helpful in evaluating a new procedure. Moreover, simulation coupled with optimization can be used to plan a data collection scheme. The practice of statistics as just described is much more than just using a statistical package. In analyzing the data, it involves understanding the scientific problem and incorporating the scientist's knowledge. In modeling the data, it involves understanding how the data were collected and accounting for limitations of the data where possible. Moreover, the modeling is likely to be iterative by considering a series of models and evaluating the fit of these models. Designing a data collection scheme involves understanding the scientist's goal and staying within hislher budget in terms of time and the available resources. Consequently, a practicing statistician is faced with such tasks and requires skills and tools to do them quickly. We have written this article for students to provide a glimpse of the practice of statistics. To illustrate the practice of statistics, we consider a problem motivated by some precipitation data that our relative, Masaru Hamada, collected some years ago. We describe his rain gauge observational study in Section 2. We describe modeling and an initial analysis of the precipitation data in Section 3. In Section 4, we consider alternative analyses that address potential issues with the precipitation data. In Section 5, we consider the impact of incorporating additional infonnation. We design a data collection scheme to illustrate the use of simulation and optimization in Section 6. We conclude this article in Section 7 with a discussion.
Statistical comparison of dissolution profiles.
Wang, Yifan; Snee, Ronald D; Keyvan, Golshid; Muzzio, Fernando J
2016-05-01
Statistical methods to assess similarity of dissolution profiles are introduced. Sixteen groups of dissolution profiles from a full factorial design were used to demonstrate implementation details. Variables in the design include drug strength, tablet stability time, and dissolution testing condition. The 16 groups were considered similar when compared using the similarity factor f2 (f2 > 50). However, multivariate ANOVA (MANOVA) repeated measures suggested statistical differences. A modified principal component analysis (PCA) was used to describe the dissolution curves in terms of level and shape. The advantage of the modified PCA approach is that the calculated shape principal components will not be confounded by level effect. Effect size test using omega-squared was also used for dissolution comparisons. Effects indicated by omega-squared are independent of sample size and are a necessary supplement to p value reported from the MANOVA table. Methods to compare multiple groups show that product strength and dissolution testing condition had significant effects on both level and shape. For pairwise analysis, a post-hoc analysis using Tukey's method categorized three similar groups, and was consistent with level-shape analysis. All these methods provide valuable information that is missed using f2 method alone to compare average profiles. The improved statistical analysis approach introduced here enables one to better ascertain both statistical significance and clinical relevance, supporting more objective regulatory decisions. PMID:26294289
Statistical Mechanics of Zooplankton
Hinow, Peter; Nihongi, Ai; Strickler, J. Rudi
2015-01-01
Statistical mechanics provides the link between microscopic properties of many-particle systems and macroscopic properties such as pressure and temperature. Observations of similar “microscopic” quantities exist for the motion of zooplankton, as well as many species of other social animals. Herein, we propose to take average squared velocities as the definition of the “ecological temperature” of a population under different conditions on nutrients, light, oxygen and others. We test the usefulness of this definition on observations of the crustacean zooplankton Daphnia pulicaria. In one set of experiments, D. pulicaria is infested with the pathogen Vibrio cholerae, the causative agent of cholera. We find that infested D. pulicaria under light exposure have a significantly greater ecological temperature, which puts them at a greater risk of detection by visual predators. In a second set of experiments, we observe D. pulicaria in cold and warm water, and in darkness and under light exposure. Overall, our ecological temperature is a good discriminator of the crustacean’s swimming behavior. PMID:26270537
Statistical Mechanics of Zooplankton.
Hinow, Peter; Nihongi, Ai; Strickler, J Rudi
2015-01-01
Statistical mechanics provides the link between microscopic properties of many-particle systems and macroscopic properties such as pressure and temperature. Observations of similar "microscopic" quantities exist for the motion of zooplankton, as well as many species of other social animals. Herein, we propose to take average squared velocities as the definition of the "ecological temperature" of a population under different conditions on nutrients, light, oxygen and others. We test the usefulness of this definition on observations of the crustacean zooplankton Daphnia pulicaria. In one set of experiments, D. pulicaria is infested with the pathogen Vibrio cholerae, the causative agent of cholera. We find that infested D. pulicaria under light exposure have a significantly greater ecological temperature, which puts them at a greater risk of detection by visual predators. In a second set of experiments, we observe D. pulicaria in cold and warm water, and in darkness and under light exposure. Overall, our ecological temperature is a good discriminator of the crustacean's swimming behavior. PMID:26270537
The Probabilistic Admissible Region with Additional Constraints
NASA Astrophysics Data System (ADS)
Roscoe, C.; Hussein, I.; Wilkins, M.; Schumacher, P.
The admissible region, in the space surveillance field, is defined as the set of physically acceptable orbits (e.g., orbits with negative energies) consistent with one or more observations of a space object. Given additional constraints on orbital semimajor axis, eccentricity, etc., the admissible region can be constrained, resulting in the constrained admissible region (CAR). Based on known statistics of the measurement process, one can replace hard constraints with a probabilistic representation of the admissible region. This results in the probabilistic admissible region (PAR), which can be used for orbit initiation in Bayesian tracking and prioritization of tracks in a multiple hypothesis tracking framework. The PAR concept was introduced by the authors at the 2014 AMOS conference. In that paper, a Monte Carlo approach was used to show how to construct the PAR in the range/range-rate space based on known statistics of the measurement, semimajor axis, and eccentricity. An expectation-maximization algorithm was proposed to convert the particle cloud into a Gaussian Mixture Model (GMM) representation of the PAR. This GMM can be used to initialize a Bayesian filter. The PAR was found to be significantly non-uniform, invalidating an assumption frequently made in CAR-based filtering approaches. Using the GMM or particle cloud representations of the PAR, orbits can be prioritized for propagation in a multiple hypothesis tracking (MHT) framework. In this paper, the authors focus on expanding the PAR methodology to allow additional constraints, such as a constraint on perigee altitude, to be modeled in the PAR. This requires re-expressing the joint probability density function for the attributable vector as well as the (constrained) orbital parameters and range and range-rate. The final PAR is derived by accounting for any interdependencies between the parameters. Noting that the concepts presented are general and can be applied to any measurement scenario, the idea
Robot Trajectories Comparison: A Statistical Approach
Ansuategui, A.; Arruti, A.; Susperregi, L.; Yurramendi, Y.; Jauregi, E.; Lazkano, E.; Sierra, B.
2014-01-01
The task of planning a collision-free trajectory from a start to a goal position is fundamental for an autonomous mobile robot. Although path planning has been extensively investigated since the beginning of robotics, there is no agreement on how to measure the performance of a motion algorithm. This paper presents a new approach to perform robot trajectories comparison that could be applied to any kind of trajectories and in both simulated and real environments. Given an initial set of features, it automatically selects the most significant ones and performs a statistical comparison using them. Additionally, a graphical data visualization named polygraph which helps to better understand the obtained results is provided. The proposed method has been applied, as an example, to compare two different motion planners, FM2 and WaveFront, using different environments, robots, and local planners. PMID:25525618
Robot trajectories comparison: a statistical approach.
Ansuategui, A; Arruti, A; Susperregi, L; Yurramendi, Y; Jauregi, E; Lazkano, E; Sierra, B
2014-01-01
The task of planning a collision-free trajectory from a start to a goal position is fundamental for an autonomous mobile robot. Although path planning has been extensively investigated since the beginning of robotics, there is no agreement on how to measure the performance of a motion algorithm. This paper presents a new approach to perform robot trajectories comparison that could be applied to any kind of trajectories and in both simulated and real environments. Given an initial set of features, it automatically selects the most significant ones and performs a statistical comparison using them. Additionally, a graphical data visualization named polygraph which helps to better understand the obtained results is provided. The proposed method has been applied, as an example, to compare two different motion planners, FM(2) and WaveFront, using different environments, robots, and local planners. PMID:25525618
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…
Multidimensional Visual Statistical Learning
ERIC Educational Resources Information Center
Turk-Browne, Nicholas B.; Isola, Phillip J.; Scholl, Brian J.; Treat, Teresa A.
2008-01-01
Recent studies of visual statistical learning (VSL) have demonstrated that statistical regularities in sequences of visual stimuli can be automatically extracted, even without intent or awareness. Despite much work on this topic, however, several fundamental questions remain about the nature of VSL. In particular, previous experiments have not…
Deconstructing Statistical Analysis
ERIC Educational Resources Information Center
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article or making a new product or service legitimate needs to be monitored and questioned for accuracy. 1) The more complicated the statistical analysis, and research the fewer the number of learned readers can understand it. This adds a…
Explorations in Statistics: Power
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2010-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fifth installment of "Explorations in Statistics" revisits power, a concept fundamental to the test of a null hypothesis. Power is the probability that we reject the null hypothesis when it is false. Four things affect…