Mathematical and statistical analysis
NASA Technical Reports Server (NTRS)
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Deconstructing Statistical Analysis
ERIC Educational Resources Information Center
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article, or of making a new product or service appear legitimate, needs to be monitored and questioned for accuracy. (1) The more complicated the statistical analysis and research, the fewer learned readers can understand it. This adds a…
Statistical Energy Analysis Program
NASA Technical Reports Server (NTRS)
Ferebee, R. C.; Trudell, R. W.; Yano, L. I.; Nygaard, S. I.
1985-01-01
Statistical Energy Analysis (SEA) is a powerful tool for estimating high-frequency vibration spectra of complex structural systems and has been incorporated into a computer program. The basic SEA analysis procedure is divided into three steps: idealization, parameter generation, and problem solution. The SEA computer program is written in FORTRAN V for batch execution.
Statistical log analysis made practical
Mitchell, W.K.; Nelson, R.J.
1991-06-01
This paper discusses the advantages of a statistical approach to log analysis. Statistical techniques use inverse methods to calculate formation parameters. The use of statistical techniques has been limited, however, by the complexity of the mathematics and lengthy computer time required to minimize traditionally used nonlinear equations.
Hahn, A.A.
1994-11-01
The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques.
Weak additivity principle for current statistics in d dimensions.
Pérez-Espigares, C; Garrido, P L; Hurtado, P I
2016-04-01
The additivity principle (AP) allows one to compute the current distribution in many one-dimensional nonequilibrium systems. Here we extend this conjecture to general d-dimensional driven diffusive systems, and validate its predictions against both numerical simulations of rare events and microscopic exact calculations of three paradigmatic models of diffusive transport in d=2. Crucially, the existence of a structured current vector field at the fluctuating level, coupled to the local mobility, turns out to be essential to understand current statistics in d>1. We prove that, when compared to the straightforward extension of the AP to high d, the so-called weak AP always yields a better minimizer of the macroscopic fluctuation theory action for current statistics.
Statistical tests of additional plate boundaries from plate motion inversions
NASA Technical Reports Server (NTRS)
Stein, S.; Gordon, R. G.
1984-01-01
The application of the F-ratio test, a standard statistical technique, to the results of relative plate motion inversions has been investigated. The method tests whether the improvement in fit of the model to the data resulting from the addition of another plate to the model is greater than that expected purely by chance. This approach appears to be useful in determining whether additional plate boundaries are justified. Previous results have been confirmed favoring separate North American and South American plates with a boundary located between 30°N and the equator. Using Chase's global relative motion data, it is shown that in addition to separate West African and Somalian plates, separate West Indian and Australian plates, with a best-fitting boundary between 70°E and 90°E, can be resolved. These results are generally consistent with the observation that the Indian plate's internal deformation extends somewhat westward of the Ninetyeast Ridge. The relative motion pole is similar to Minster and Jordan's and predicts the NW-SE compression observed in earthquake mechanisms near the Ninetyeast Ridge.
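A minimal sketch of the F-ratio test for nested models described above. The chi-square values, data counts, and the three-parameters-per-added-plate convention are illustrative assumptions, not values from the study:

```python
def f_ratio(chi2_reduced: float, chi2_full: float,
            extra_params: int, dof_full: int) -> float:
    """F statistic for the improvement in fit when extra parameters
    (e.g. one more plate = 3 extra Euler-vector components) are added."""
    return ((chi2_reduced - chi2_full) / extra_params) / (chi2_full / dof_full)

# Hypothetical numbers: adding one plate (3 parameters) to a model with
# 200 data and 30 parameters drops chi-square from 260 to 230.
F = f_ratio(260.0, 230.0, extra_params=3, dof_full=200 - 33)
# Compare F against the critical value of an F(3, 167) distribution;
# if F exceeds it, the additional boundary is justified beyond chance.
```

The comparison against the F-distribution critical value is left to a table or a statistics library; only the statistic itself is computed here.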
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from Two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will fit data to the linear equation y = f(x) and will do an ANOVA to check its significance.
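Two of the toolset's calculations (descriptive statistics, and the normal-distribution value corresponding to a cumulative probability) can be sketched in a few lines of standard-library Python; the sample data are hypothetical:

```python
from statistics import NormalDist, mean, stdev

data = [9.8, 10.2, 10.1, 9.9, 10.0, 10.4, 9.6]  # hypothetical sample x(i)

# Descriptive statistics
m, s = mean(data), stdev(data)

# "Normal Distribution Estimates": the value corresponding to a given
# cumulative probability, for a normal with the sample mean and std. dev.
x95 = NormalDist(mu=m, sigma=s).inv_cdf(0.95)
```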
Statistical Analysis of RNA Backbone
Hershkovitz, Eli; Sapiro, Guillermo; Tannenbaum, Allen; Williams, Loren Dean
2009-01-01
Local conformation is an important determinant of RNA catalysis and binding. The analysis of RNA conformation is particularly difficult due to the large number of degrees of freedom (torsion angles) per residue. Proteins, by comparison, have many fewer degrees of freedom per residue. In this work, we use and extend classical tools from statistics and signal processing to search for clusters in RNA conformational space. Results are reported both for scalar analysis, where each torsion angle is separately studied, and for vectorial analysis, where several angles are simultaneously clustered. Adapting techniques from vector quantization and clustering to the RNA structure, we find torsion angle clusters and RNA conformational motifs. We validate the technique using well-known conformational motifs, showing that the simultaneous study of the total torsion angle space leads to results consistent with known motifs reported in the literature and also to the finding of new ones. PMID:17048391
Entropy in statistical energy analysis.
Le Bot, Alain
2009-03-01
In this paper, the second principle of thermodynamics is discussed in the framework of statistical energy analysis (SEA). It is shown that the "vibrational entropy" and the "vibrational temperature" of sub-systems depend only on the vibrational energy and the number of resonant modes. A SEA system can be described as a thermodynamic system slightly out of equilibrium. In steady-state condition, the entropy exchanged with the exterior by sources and dissipation exactly balances the production of entropy by irreversible processes at the interfaces between SEA sub-systems.
Statistical analysis of pyroshock data
NASA Astrophysics Data System (ADS)
Hughes, William O.
2002-05-01
The sample size of aerospace pyroshock test data is typically small. This often forces the engineer to make assumptions on its population distribution and to use conservative margins or methodologies in determining shock specifications. For example, the maximum expected environment is often derived by adding 3-6 dB to the maximum envelope of a limited amount of shock data. The recent availability of a large amount of pyroshock test data has allowed a rare statistical analysis to be performed. Findings and procedures from this analysis will be explained, including information on population distributions, procedures to properly combine families of test data, and methods of deriving appropriate shock specifications for a multipoint shock source.
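The 3-6 dB margin mentioned above is a multiplicative scaling of the shock amplitude. A minimal sketch of that conversion (the 20·log10 amplitude convention and the example level are assumptions for illustration):

```python
def add_db_margin(envelope_g: float, margin_db: float) -> float:
    """Scale a maximum-envelope shock level (in g) by a dB margin.
    With the 20*log10 amplitude convention, +6 dB roughly doubles
    the level and +3 dB multiplies it by about 1.41."""
    return envelope_g * 10 ** (margin_db / 20.0)

# Hypothetical 1000 g envelope with a +6 dB margin -> ~1995 g
mee = add_db_margin(1000.0, 6.0)
```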
Statistical Analysis of Protein Ensembles
NASA Astrophysics Data System (ADS)
Máté, Gabriell; Heermann, Dieter
2014-04-01
As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Statistical Analysis of Tsunami Variability
NASA Astrophysics Data System (ADS)
Zolezzi, Francesca; Del Giudice, Tania; Traverso, Chiara; Valfrè, Giulio; Poggi, Pamela; Parker, Eric J.
2010-05-01
…similar to that seen in ground motion attenuation correlations used for seismic hazard assessment. The second issue was intra-event variability. This refers to the differences in tsunami wave run-up along a section of coast during a single event. Intra-event variability was investigated directly from field observations. The tsunami events used in the statistical evaluation were selected on the basis of the completeness and reliability of the available data. Tsunamis considered for the analysis included the recent and well-surveyed Boxing Day 2004 tsunami (Great Indian Ocean Tsunami), the Java 2006, Okushiri 1993, Kocaeli 1999, and Messina 1908 events, and a case study of several historic events in Hawaii. Basic statistical analysis was performed on the field observations from these tsunamis. For events with very wide survey regions, the run-up heights have been grouped in order to maintain a homogeneous distance from the source. Where more than one survey was available for a given event, the original datasets were maintained separately to avoid combination of non-homogeneous data. The observed run-up measurements were used to evaluate the minimum, maximum, average, standard deviation and coefficient of variation for each data set. The minimum coefficient of variation was 0.12, measured for the 2004 Boxing Day tsunami at Nias Island (7 data points), while the maximum was 0.98 for the Okushiri 1993 event (93 data points). The average coefficient of variation is of the order of 0.45.
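The coefficient of variation reported per data set above is simply the sample standard deviation divided by the mean. A sketch with hypothetical run-up heights:

```python
from statistics import mean, stdev

def coefficient_of_variation(runups: list[float]) -> float:
    """CV = sample standard deviation / mean of the run-up heights."""
    return stdev(runups) / mean(runups)

# Hypothetical run-up heights (m) along one surveyed coastal segment
cv = coefficient_of_variation([4.1, 5.0, 3.6, 4.4, 5.3, 4.0, 3.9])
```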
Asymptotic modal analysis and statistical energy analysis
NASA Technical Reports Server (NTRS)
Dowell, Earl H.
1992-01-01
Asymptotic Modal Analysis (AMA) is a method which is used to model linear dynamical systems with many participating modes. The AMA method was originally developed to show the relationship between statistical energy analysis (SEA) and classical modal analysis (CMA). In the limit of a large number of modes of a vibrating system, the classical modal analysis result can be shown to be equivalent to the statistical energy analysis result. As the CMA result evolves into the SEA result, a number of systematic assumptions are made. Most of these assumptions are based upon the supposition that the number of modes approaches infinity. It is for this reason that the term 'asymptotic' is used. AMA is the asymptotic result of taking the limit of CMA as the number of modes approaches infinity. AMA refers to any of the intermediate results between CMA and SEA, as well as the SEA result which is derived from CMA. The main advantage of the AMA method is that individual modal characteristics are not required in the model or computations. By contrast, CMA requires that each modal parameter be evaluated at each frequency. In the latter, contributions from each mode are computed and the final answer is obtained by summing over all the modes in the particular band of interest. AMA evaluates modal parameters only at their center frequency and does not sum the individual contributions from each mode in order to obtain a final result. The method is similar to SEA in this respect. However, SEA is only capable of obtaining spatial averages or means, as it is a statistical method. Since AMA is systematically derived from CMA, it can obtain local spatial information as well.
Statistical quality control through overall vibration analysis
NASA Astrophysics Data System (ADS)
Carnero, M.ª Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, an evaluation of the quality results of the finished parts under different combinations of process variables is assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as the chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley, and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence…
Statistical Power in Meta-Analysis
ERIC Educational Resources Information Center
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Towards a Judgement-Based Statistical Analysis
ERIC Educational Resources Information Center
Gorard, Stephen
2006-01-01
There is a misconception among social scientists that statistical analysis is somehow a technical, essentially objective, process of decision-making, whereas other forms of data analysis are judgement-based, subjective and far from technical. This paper focuses on the former part of the misconception, showing, rather, that statistical analysis…
Asymptotic modal analysis and statistical energy analysis
NASA Technical Reports Server (NTRS)
Dowell, Earl H.
1988-01-01
Statistical Energy Analysis (SEA) is defined by considering the asymptotic limit of Classical Modal Analysis, an approach called Asymptotic Modal Analysis (AMA). The general approach is described for both structural and acoustical systems. The theoretical foundation is presented for structural systems, and experimental verification is presented for a structural plate responding to a random force. Work accomplished subsequent to the grant initiation focuses on the acoustic response of an interior cavity (i.e., an aircraft or spacecraft fuselage) with a portion of the wall vibrating in a large number of structural modes. First results were presented at the ASME Winter Annual Meeting in December 1987 and accepted for publication in the Journal of Vibration, Acoustics, Stress and Reliability in Design. It is shown that asymptotically, as the number of acoustic modes excited becomes large, the pressure level in the cavity becomes uniform except at the cavity boundaries. However, the mean square pressure at the cavity corner, edge and wall is, respectively, 8, 4, and 2 times the value in the cavity interior. It is also shown that when the portion of the wall which is vibrating is near a cavity corner or edge, the response is significantly higher.
Tsallis statistics in reliability analysis: Theory and methods
NASA Astrophysics Data System (ADS)
Zhang, Fode; Shi, Yimin; Ng, Hon Keung Tony; Wang, Ruibing
2016-10-01
Tsallis statistics, which is based on a non-additive entropy characterized by an index q, is a very useful tool in physics and statistical mechanics. This paper presents an application of Tsallis statistics in reliability analysis. We first show that the q-gamma and incomplete q-gamma functions are q-generalized. Then, three commonly used statistical distributions in reliability analysis are introduced in Tsallis statistics, and the corresponding reliability characteristics including the reliability function, hazard function, cumulative hazard function and mean time to failure are investigated. In addition, we study the statistical inference based on censored reliability data. Specifically, we investigate the point and interval estimation of the model parameters of the q-exponential distribution based on the maximum likelihood method. Simulated and real-life datasets are used to illustrate the methodologies discussed in this paper. Finally, some concluding remarks are provided.
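A sketch of the q-exponential reliability function discussed above, under one common parametrization of the q-exponential, e_q(x) = [1 + (1-q)x]^(1/(1-q)), which reduces to exp(x) as q → 1; the rate parameter `lam` and the numerical hazard-rate helper are illustrative assumptions, not the paper's exact notation:

```python
import math

def q_exp(x: float, q: float) -> float:
    """Tsallis q-exponential; reduces to exp(x) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def reliability(t: float, lam: float, q: float) -> float:
    """R(t) = e_q(-lam * t): q-exponential survival probability."""
    return q_exp(-lam * t, q)

def hazard(t: float, lam: float, q: float, h: float = 1e-6) -> float:
    """Numerical hazard rate h(t) = -R'(t)/R(t), by central difference."""
    r = reliability(t, lam, q)
    return (reliability(t - h, lam, q) - reliability(t + h, lam, q)) / (2 * h * r)
```

For q = 1 this recovers the ordinary exponential model with constant hazard `lam`; for q ≠ 1 the hazard varies with t, which is what makes the q-exponential useful for non-exponential lifetime data.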
Statistical analysis of histopathological endpoints.
Green, John W; Springer, Timothy A; Saulnier, Amy N; Swintek, Joe
2014-05-01
Histopathological assessments of fish from aquatic ecotoxicology studies are being performed with increasing frequency. Aquatic ecotoxicology studies performed for submission to regulatory agencies are usually conducted with multiple subjects (e.g., fish) in each of multiple vessels (replicates) within a water control and within each of several concentrations of a test substance. A number of histopathological endpoints are evaluated in each fish, and a severity score is generally recorded for each endpoint. The severity scores are often recorded using a nonquantitative scale of 0 to 4, with 0 indicating no effect, 1 indicating minimal effect, through 4 for severe effect. Statistical methods often used to analyze these scores suffer from several shortcomings: computing average scores as though scores were quantitative values, considering only the frequency of abnormality while ignoring severity, ignoring any concentration-response trend, and ignoring the possible correlation between responses of individuals within test vessels. A new test, the Rao-Scott Cochran-Armitage by Slices (RSCABS), is proposed that incorporates the replicate vessel experimental design and the biological expectation that the severity of the effect tends to increase with increasing doses or concentrations, while retaining the individual subject scores and taking into account the severity as well as frequency of scores. A power simulation and examples demonstrate the performance of the test. R-based software has been developed to carry out this test and is available free of charge at www.epa.gov/med/Prods_Pubs/rscabs.htm. The SAS-based RSCABS software is available from the first and third authors.
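RSCABS builds on the classical Cochran-Armitage trend test, applied by severity "slices". A sketch of that underlying trend statistic only (not the full RSCABS procedure, which adds the Rao-Scott adjustment for replicate vessels); the dose scores and counts are hypothetical:

```python
import math

def cochran_armitage_z(doses, affected, totals):
    """Z statistic for a monotone trend in incidence with dose.
    doses: score per group; affected/totals: counts per group."""
    N = sum(totals)
    p_bar = sum(affected) / N
    num = sum(d * (r - n * p_bar) for d, r, n in zip(doses, affected, totals))
    s2 = p_bar * (1 - p_bar) * (
        sum(n * d * d for d, n in zip(doses, totals))
        - sum(n * d for d, n in zip(doses, totals)) ** 2 / N)
    return num / math.sqrt(s2)

# Hypothetical incidence of a "severity >= 1" finding across 4 concentrations
z = cochran_armitage_z([0, 1, 2, 3], [1, 2, 5, 9], [20, 20, 20, 20])
# Large positive z indicates incidence increasing with concentration.
```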
Statistical Analysis of DWPF ARG-1 Data
Harris, S.P.
2001-03-02
A statistical analysis of analytical results for ARG-1, an Analytical Reference Glass, blanks, and the associated calibration and bench standards has been completed. These statistics provide a means for DWPF to review the performance of their laboratory as well as identify areas of improvement.
Explorations in Statistics: The Analysis of Change
ERIC Educational Resources Information Center
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
Statistical analysis of trypanosomes' motility
NASA Astrophysics Data System (ADS)
Zaburdaev, Vasily; Uppaluri, Sravanti; Pfohl, Thomas; Engstler, Markus; Stark, Holger; Friedrich, Rudolf
2010-03-01
The trypanosome is a parasite that causes sleeping sickness. The way it moves in the blood stream and penetrates various obstacles is an area of active research. Our goal was to investigate free trypanosome motion in a planar geometry. Our analysis of trypanosome trajectories reveals that there are two correlation times: one associated with the fast motion of the body, and a second with the slower rotational diffusion of the trypanosome as a point object. We propose a system of Langevin equations to model such motion. One of its peculiarities is the presence of multiplicative noise, which predicts a higher level of noise at higher trypanosome velocities. Theoretical and numerical results give a comprehensive description of the experimental data, such as the mean squared displacement, velocity distribution, and autocorrelation function.
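The key ingredient above, multiplicative noise (a diffusion term that grows with velocity), can be sketched with an Euler-Maruyama integration. The drift and noise form below is an illustrative assumption, not the authors' exact model:

```python
import math
import random

def simulate_velocity(n_steps=10000, dt=1e-3, gamma=1.0,
                      d0=0.2, alpha=0.5, seed=1):
    """Euler-Maruyama integration of a Langevin equation with
    multiplicative noise, dv = -gamma*v*dt + sqrt(2*(d0 + alpha*v^2))*dW.
    The v-dependent diffusion mimics 'more noise at higher speed'."""
    rng = random.Random(seed)
    v, traj = 0.0, []
    for _ in range(n_steps):
        noise = math.sqrt(2.0 * (d0 + alpha * v * v) * dt) * rng.gauss(0.0, 1.0)
        v += -gamma * v * dt + noise
        traj.append(v)
    return traj

traj = simulate_velocity()
# From traj one can estimate the velocity distribution and its
# autocorrelation, the quantities compared with experiment in the abstract.
```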
Statistical analysis principles for Omics data.
Dunkler, Daniela; Sánchez-Cabo, Fátima; Heinze, Georg
2011-01-01
In Omics experiments, typically thousands of hypotheses are tested simultaneously, each based on very few independent replicates. Traditional tests like the t-test were shown to perform poorly with this new type of data. Furthermore, simultaneous consideration of many hypotheses, each prone to a decision error, requires powerful adjustments for this multiple testing situation. After a general introduction to statistical testing, we present the moderated t-statistic, the SAM statistic, and the RankProduct statistic which have been developed to evaluate hypotheses in typical Omics experiments. We also provide an introduction to the multiple testing problem and discuss some state-of-the-art procedures to address this issue. The presented test statistics are subjected to a comparative analysis of a microarray experiment comparing tissue samples of two groups of tumors. All calculations can be done using the freely available statistical software R. Accompanying, commented code is available at: http://www.meduniwien.ac.at/msi/biometrie/MIMB.
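One widely used multiple-testing adjustment of the kind discussed above is the Benjamini-Hochberg step-up procedure, shown here as a sketch (the abstract does not name BH specifically; it is one of several FDR-controlling procedures); the p-values are hypothetical:

```python
def benjamini_hochberg(pvals):
    """Adjusted p-values controlling the false discovery rate
    (Benjamini-Hochberg step-up procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for five simultaneously tested genes
adj = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.60])
```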
Singularity analysis and robust neighborhood statistics
NASA Astrophysics Data System (ADS)
Zuo, Renguang
2015-04-01
Neighborhood statistics involving data within small neighborhoods have the advantages of revealing more detailed local structures and spatial variations of spatial patterns, and provide less biased information compared with global statistics. However, the resulting neighborhood statistics are influenced by the size of the neighborhood. Singularity analysis can be regarded as a type of robust neighborhood statistic: it measures the gradient of relative change within small neighborhoods. The value of the singularity index at a location z depends only weakly on the element concentration at that location itself; it depends instead on the changes around z. From the multifractal-theory viewpoint, the singularity index is independent of the size of the neighborhood. Singularity analysis is a powerful tool for identifying geochemical and geophysical anomalies in mineral exploration. Recent studies demonstrated that singularity analysis can detect weak geochemical anomalies related to mineralization despite the decaying and masking effects of covers.
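The singularity index is typically estimated from how the mean concentration scales with window size. A simplified one-dimensional sketch (the real method works on 2-D map windows; the least-squares slope estimator and the flat test signal are illustrative assumptions):

```python
import math

def singularity_slope(values, center, half_widths):
    """Estimate a local scaling exponent as the least-squares slope of
    log(mean value within window) versus log(window size) over a set
    of nested windows centered on one location (1-D sketch)."""
    xs, ys = [], []
    for w in half_widths:
        window = values[max(0, center - w): center + w + 1]
        xs.append(math.log(2 * w + 1))
        ys.append(math.log(sum(window) / len(window)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# A constant background gives slope ~0: a non-singular location
flat = singularity_slope([2.0] * 101, 50, [2, 4, 8, 16])
```

A sharp local enrichment would instead give a clearly negative slope, flagging the location as anomalous regardless of the absolute concentration level.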
Statistical Analysis Techniques for Small Sample Sizes
NASA Technical Reports Server (NTRS)
Navard, S. E.
1984-01-01
The small-sample-size problem encountered when dealing with the analysis of space-flight data is examined. Because only a small amount of data is available, careful analyses are essential to extract the maximum amount of information with acceptable accuracy. Statistical analysis of small samples is described. The background material necessary for understanding statistical hypothesis testing is outlined, and the various tests which can be done on small samples are explained. Emphasis is on the underlying assumptions of each test and on the considerations needed to choose the most appropriate test for a given type of analysis.
Schmidt decomposition and multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Bogdanov, Yu. I.; Bogdanova, N. A.; Fastovets, D. V.; Luckichev, V. F.
2016-12-01
A new method of multivariate data analysis, based on the complement of a classical probability distribution to a quantum state and the Schmidt decomposition, is presented. We consider the application of the Schmidt formalism to problems of statistical correlation analysis. The correlation of photons in the beam-splitter output channels, when the input photon statistics are given by a compound Poisson distribution, is examined. The developed formalism allows us to analyze multidimensional systems, and we have obtained analytical formulas for the Schmidt decomposition of multivariate Gaussian states. It is shown that the mathematical tools of quantum mechanics can significantly improve classical statistical analysis. The presented formalism is a natural approach for the analysis of both classical and quantum multivariate systems and can be applied in various tasks associated with research on dependences.
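For a bipartite pure state, the Schmidt coefficients are the singular values of the amplitude matrix. A minimal sketch for the two-qubit case with real amplitudes (solving the 2x2 eigenproblem by the quadratic formula to stay dependency-free; the example states are standard, not from the paper):

```python
import math

def schmidt_coefficients_2qubit(amplitudes):
    """Schmidt coefficients of a pure two-qubit state: the singular
    values of the 2x2 amplitude matrix A[i][j] = <ij|psi>, obtained
    from the eigenvalues of A A^T (real amplitudes assumed)."""
    (a, b), (c, d) = amplitudes
    t = a * a + b * b + c * c + d * d   # trace of A A^T
    det = (a * d - b * c) ** 2          # det(A A^T) = det(A)^2
    disc = math.sqrt(max(t * t - 4 * det, 0.0))
    lam1, lam2 = (t + disc) / 2, (t - disc) / 2
    return math.sqrt(lam1), math.sqrt(lam2)

# Bell state (|00> + |11>)/sqrt(2): two equal coefficients 1/sqrt(2),
# the maximally correlated (entangled) case.
s_bell = schmidt_coefficients_2qubit([[1 / math.sqrt(2), 0.0],
                                      [0.0, 1 / math.sqrt(2)]])
# Product state |00>: a single coefficient 1, i.e. no correlation.
s_prod = schmidt_coefficients_2qubit([[1.0, 0.0], [0.0, 0.0]])
```

The spread of the squared coefficients (the Schmidt number) is exactly the correlation measure the abstract exploits for classical multivariate data.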
Statistical Tools for Forensic Analysis of Toolmarks
David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser
2004-04-22
Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a "degree of association" between matches of similarly produced toolmarks. The basis for statistical method development relies on "discriminating criteria" that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.
Statistical Analysis Experiment for Freshman Chemistry Lab.
ERIC Educational Resources Information Center
Salzsieder, John C.
1995-01-01
Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…
Applied Behavior Analysis and Statistical Process Control?
ERIC Educational Resources Information Center
Hopkins, B. L.
1995-01-01
Incorporating statistical process control (SPC) methods into applied behavior analysis is discussed. It is claimed that SPC methods would likely reduce applied behavior analysts' intimate contacts with problems and would likely yield poor treatment and research decisions. Cases and data presented by Pfadt and Wheeler (1995) are cited as examples.…
Bayesian Statistics for Biological Data: Pedigree Analysis
ERIC Educational Resources Information Center
Stanfield, William D.; Carlton, Matthew A.
2004-01-01
The use of Bayes' formula is applied to the biological problem of pedigree analysis to show that the Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First year college students of biology can be introduced to the Bayesian statistics.
Microarray Data Analysis Using Multiple Statistical Models
Wenjun Bao, Judith E. Schmid, Amber K. Goetz, Ming Ouyang, William J. Welsh, Andrew I. Brooks, ChiYi Chu, Mitsunori Ogihara, Yinhe Cheng, David J. Dix. National Health and Environmental Effects Researc...
Comments: Statistical Analysis for Multisite Trials
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
In this article, the author shares his comments on statistical analysis for multisite trials, and focuses on the contribution of Stephen Raudenbush, Sean Reardon, and Takako Nomi to future research. Raudenbush, Reardon, and Nomi provide a major contribution to future research on variation in program impacts by showing how to use multisite trials…
Statistical analysis of extreme river flows
NASA Astrophysics Data System (ADS)
Mateus, Ayana; Caeiro, Frederico; Gomes, Dora Prata; Sequeira, Inês J.
2016-12-01
Floods are recurrent events that can have a catastrophic impact. In this work we are interested in the analysis of a data set of gauged daily flows from the Whiteadder Water, a river in Scotland. Using statistical techniques based on extreme value theory, we estimate several extreme-value parameters, including extreme quantiles and return periods of high levels.
Statistical Methods in Algorithm Design and Analysis.
ERIC Educational Resources Information Center
Weide, Bruce W.
The use of statistical methods in the design and analysis of discrete algorithms is explored. The introductory chapter contains a literature survey and background material on probability theory. In Chapter 2, probabilistic approximation algorithms are discussed with the goal of exposing and correcting some oversights in previous work. Chapter 3…
Statistical evaluation of vibration analysis techniques
NASA Technical Reports Server (NTRS)
Milner, G. Martin; Miller, Patrice S.
1987-01-01
An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.
Stork, LeAnna M.; Gennings, Chris; Carchman, Richard; Carter, Jr., Walter H.; Pounds, Joel G.; Mumtaz, Moiz
2006-12-01
Several assumptions, defined and undefined, are used in the toxicity assessment of chemical mixtures. In scientific practice, mixture components in the low-dose region, particularly subthreshold doses, are often assumed to behave additively (i.e., zero interaction) based on heuristic arguments. This assumption has important implications in the practice of risk assessment, but has not been experimentally tested. We have developed methodology to test for additivity in the sense of Berenbaum (Advances in Cancer Research, 1981), based on the statistical equivalence testing literature, where the null hypothesis of interaction is rejected for the alternative hypothesis of additivity when data support the claim. The implication of this approach is that conclusions of additivity are made with a false positive rate controlled by the experimenter. The claim of additivity is based on prespecified additivity margins, which are chosen using expert biological judgment such that small deviations from additivity, which are not considered to be biologically important, are not statistically significant. This approach is in contrast to the usual hypothesis-testing framework that assumes additivity in the null hypothesis and rejects when there is significant evidence of interaction. In this scenario, failure to reject may be due to lack of statistical power, making the claim of additivity problematic. The proposed method is illustrated in a mixture of five organophosphorus pesticides that were experimentally evaluated alone and at relevant mixing ratios. Motor activity was assessed in adult male rats following acute exposure. Four low-dose mixture groups were evaluated. Evidence of additivity is found in three of the four low-dose mixture groups. The proposed method tests for additivity of the whole mixture and does not take into account subset interactions (e.g., synergistic, antagonistic) that may have occurred and cancelled each other out.
Explorations in statistics: the analysis of change.
Curran-Everett, Douglas; Williams, Calvin L
2015-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of Explorations in Statistics explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can account for different initial values of the response. But this creates a problem: percent change is really just a ratio, and a ratio is infamous for its ability to mislead. This means we may fail to find a group difference that does exist, or we may find a group difference that does not exist. What kind of an approach to science is that? In contrast, analysis of covariance is versatile: it can accommodate an analysis of the relationship between absolute change and initial value when percent change is useless.
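The ANCOVA alternative to percent change can be sketched as a linear model of absolute change on group and initial value. This is a hedged illustration with simulated data; the effect size and noise levels are assumptions, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
baseline = rng.normal(100.0, 15.0, size=2 * n)
group = np.repeat([0, 1], n)             # 0 = control, 1 = treatment
effect = 5.0                             # assumed true treatment effect
follow = 0.8 * baseline + effect * group + rng.normal(0, 5.0, size=2 * n)

# ANCOVA as a linear model: change ~ group + baseline
change = follow - baseline
X = np.column_stack([np.ones_like(change), group, baseline])
beta, *_ = np.linalg.lstsq(X, change, rcond=None)
# beta[1] estimates the group effect adjusted for initial value,
# avoiding the ratio distortions that percent change introduces
```

Because the model conditions on the initial value directly, the group comparison is not distorted when groups start from different baselines.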
Statistical Analysis of Iberian Peninsula Megaliths Orientations
NASA Astrophysics Data System (ADS)
González-García, A. C.
2009-08-01
Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, still lack a sound interpretation. A way to classify and begin to understand such orientations is by means of statistical analysis of the data. A first attempt is made with simple statistical variables and a direct comparison between the different areas. In order to minimise the subjectivity of the process, a further, more sophisticated analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.
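A standard first step in this kind of orientation statistics is a test for non-uniformity of circular data, such as the Rayleigh test. The sketch below uses a common first-order approximation for the p-value and invented azimuth data; the clustering target (a hypothetical sunrise azimuth) is an assumption for illustration only.

```python
import numpy as np

def rayleigh_test(angles_deg):
    """Rayleigh test for non-uniformity of circular data (e.g. monument
    azimuths). Returns the mean resultant length R and an approximate
    p-value (first-order correction, clipped to [0, 1])."""
    a = np.radians(angles_deg)
    n = len(a)
    R = np.hypot(np.cos(a).sum(), np.sin(a).sum()) / n
    z = n * R**2
    p = np.exp(-z) * (1 + (2 * z - z**2) / (4 * n))
    return R, float(np.clip(p, 0.0, 1.0))

# Hypothetical monument azimuths clustered near an assumed sunrise
# direction of ~120 degrees, with 10-degree scatter
rng = np.random.default_rng(2)
az = rng.normal(120.0, 10.0, size=60) % 360
R, p = rayleigh_test(az)
```

A large R with a small p rejects uniformity, which is the minimal statistical statement one needs before comparing orientations across regions or against solar and lunar rising arcs.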
Multivariate analysis: A statistical approach for computations
NASA Astrophysics Data System (ADS)
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, educational evaluation, clustering in finance, and, more recently, in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion of factor analysis (FA) in an image retrieval method and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include the various attacks on the network, such as DDoS attacks and network scanning.
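One way to make the correlation-matrix idea concrete is to compare each traffic window's correlation matrix against a baseline learned from normal traffic. Everything below (feature set, covariances, distance measure) is a hypothetical sketch, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical per-window traffic features: packets, bytes, flows
mu = [100.0, 5000.0, 50.0]
cov = [[100.0, 4000.0, 40.0],
       [4000.0, 250000.0, 1800.0],
       [40.0, 1800.0, 25.0]]
baseline = rng.multivariate_normal(mu, cov, size=200)
ref_corr = np.corrcoef(baseline, rowvar=False)   # normal-traffic correlations

def corr_distance(window):
    """Frobenius distance between a window's correlation matrix and the
    baseline; large values flag anomalous traffic (e.g. a DDoS burst)."""
    return np.linalg.norm(np.corrcoef(window, rowvar=False) - ref_corr)

normal_window = rng.multivariate_normal(mu, cov, size=200)
# Simulated anomaly: the usual correlation structure breaks down
anomaly_window = rng.normal(mu, [10.0, 500.0, 5.0], size=(200, 3))
```

The design choice is that anomalies are detected not from the magnitude of any single feature but from a change in how the features co-vary.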
On statistical approaches to climate change analysis
NASA Astrophysics Data System (ADS)
Lee, Terry Chun Kit
Evidence for a human contribution to climatic changes during the past century is accumulating rapidly. Given the strength of the evidence, it seems natural to ask whether forcing projections can be used to forecast climate change. A Bayesian method for post-processing forced climate model simulations that produces probabilistic hindcasts of inter-decadal temperature changes on large spatial scales is proposed. Hindcasts produced for the last two decades of the 20th century are shown to be skillful. The suggestion that skillful decadal forecasts can be produced on large regional scales by exploiting the response to anthropogenic forcing provides additional evidence that anthropogenic change in the composition of the atmosphere has influenced our climate. In the absence of large negative volcanic forcing on the climate system (which cannot presently be forecast), the global mean temperature for the decade 2000-2009 is predicted to lie above the 1970-1999 normal with probability 0.94. The global mean temperature anomaly for this decade relative to 1970-1999 is predicted to be 0.35°C (5-95% confidence range: 0.21°C--0.48°C). Reconstruction of temperature variability of the past centuries using climate proxy data can also provide important information on the role of anthropogenic forcing in the observed 20th century warming. A state-space model approach that allows incorporation of additional non-temperature information, such as the estimated response to external forcing, to reconstruct historical temperature is proposed. An advantage of this approach is that it permits simultaneous reconstruction and detection analysis as well as future projection. A difficulty in using this approach is that estimation of several unknown state-space model parameters is required. To take advantage of the data structure in the reconstruction problem, the existing parameter estimation approach is modified, resulting in two new estimation approaches. The competing estimation approaches
Statistical Tolerance and Clearance Analysis for Assembly
NASA Technical Reports Server (NTRS)
Lee, S.; Yi, C.
1996-01-01
Tolerance is inevitable because manufacturing exactly equal parts is known to be impossible. Furthermore, the specification of tolerances is an integral part of product design since tolerances directly affect the assemblability, functionality, manufacturability, and cost effectiveness of a product. In this paper, we present statistical tolerance and clearance analysis for the assembly. Our proposed work is expected to make the following contributions: (i) to help the designers to evaluate products for assemblability, (ii) to provide a new perspective to tolerance problems, and (iii) to provide a tolerance analysis tool which can be incorporated into a CAD or solid modeling system.
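A minimal sketch of statistical clearance analysis is a Monte Carlo stack-up: sample each dimension from its tolerance distribution and examine the resulting clearance distribution. The assembly, nominal values, and sigma assignments below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100_000
# Hypothetical assembly: three stacked parts inside a housing (mm),
# with each tolerance interpreted as +/- 3 sigma
p1 = rng.normal(10.0, 0.02, N)
p2 = rng.normal(15.0, 0.03, N)
p3 = rng.normal(20.0, 0.02, N)
housing = rng.normal(45.3, 0.04, N)

clearance = housing - (p1 + p2 + p3)
prob_interference = np.mean(clearance < 0)           # assemblability risk
sigma_rss = np.sqrt(0.02**2 + 0.03**2 + 0.02**2 + 0.04**2)  # RSS check
```

For a linear stack the Monte Carlo standard deviation should match the root-sum-square (RSS) value, but the simulation approach extends directly to non-linear clearance functions where RSS does not apply.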
Statistical analysis in dBASE-compatible databases.
Hauer-Jensen, M
1991-01-01
Database management in clinical and experimental research often requires statistical analysis of the data in addition to the usual functions for storing, organizing, manipulating and reporting. With most database systems, transfer of data to a dedicated statistics package is a relatively simple task. However, many statistics programs lack the powerful features found in database management software. dBASE IV and compatible programs are currently among the most widely used database management programs. d4STAT is a utility program for dBASE, containing a collection of statistical functions and tests for data stored in the dBASE file format. By using d4STAT, statistical calculations may be performed directly on the data stored in the database without having to exit dBASE IV or export data. Record selection and variable transformations are performed in memory, thus obviating the need for creating new variables or data files. The current version of the program contains routines for descriptive statistics, paired and unpaired t-tests, correlation, linear regression, frequency tables, Mann-Whitney U-test, Wilcoxon signed rank test, a time-saving procedure for counting observations according to user-specified selection criteria, survival analysis (product limit estimate analysis, log-rank test, and graphics), and normal, t, and chi-squared distribution functions.
NASA Astrophysics Data System (ADS)
Valotto, Gabrio; Varin, Cristiano
2016-01-01
An additive modeling approach is employed to provide a statistical description of hourly variation in concentrations of NOx measured in proximity of the Venice "Marco Polo" International Airport, Italy. Differently from several previous studies on airport emissions based on daily time series, the paper analyzes hourly data because variations of NOx concentrations during the day are informative about the prevailing emission source. The statistical analysis is carried out using a one-year time series. Confounder effects due to seasonality, meteorology and airport traffic volume are accounted for by suitable covariates. Four different model specifications of increasing complexity are considered. The model with the aircraft source expressed as the NOx emitted near the airport is found to have the best predictive quality. Although the aircraft source is statistically significant, the comparison of model-based predictions suggests that the relative impact of aircraft emissions to ambient NOx concentrations is limited and the road traffic is the likely dominant source near the sampling point.
Statistical analysis of sleep spindle occurrences.
Panas, Dagmara; Malinowska, Urszula; Piotrowski, Tadeusz; Żygierewicz, Jarosław; Suffczyński, Piotr
2013-01-01
Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
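The two ingredients of the analysis, a gamma fit to inter-spindle intervals and a lag-1 dependence check, can be sketched as follows. The interval data are simulated, and a plain serial correlation stands in for the modified Wald-Wolfowitz test used in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Hypothetical inter-spindle intervals (s) drawn from a gamma process
intervals = rng.gamma(shape=2.0, scale=5.0, size=500)

# Fit gamma parameters with location fixed at 0, as for waiting times
shape, loc, scale = stats.gamma.fit(intervals, floc=0)

# Lag-1 serial correlation of successive intervals; a renewal process
# (independent intervals) should give a value near zero
lag1 = np.corrcoef(intervals[:-1], intervals[1:])[0, 1]
```

A shape parameter near 1 would indicate Poisson-like occurrence, while larger shapes indicate a refractory-like regularity; non-zero lag-1 correlation would argue against a simple renewal generator.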
Physical mechanism and statistics of occurrence of an additional layer in the equatorial ionosphere
NASA Astrophysics Data System (ADS)
Balan, N.; Batista, I. S.; Abdu, M. A.; MacDougall, J.; Bailey, G. J.
1998-12-01
A physical mechanism and the location and latitudinal extent of an additional layer, called the F3 layer, that exists in the equatorial ionosphere are presented. A statistical analysis of the occurrence of the layer recorded at the equatorial station Fortaleza (4°S, 38°W dip 9°S) in Brazil is also presented. The F3 layer forms during the morning-noon period in that equatorial region where the combined effect of the upward
Apparatus for statistical time-series analysis of electrical signals
NASA Technical Reports Server (NTRS)
Stewart, C. H. (Inventor)
1973-01-01
An apparatus for performing statistical time-series analysis of complex electrical signal waveforms, permitting prompt and accurate determination of statistical characteristics of the signal is presented.
Statistical Hot Channel Analysis for the NBSR
Cuadra A.; Baek J.
2014-05-27
A statistical analysis of thermal limits has been carried out for the research reactor (NBSR) at the National Institute of Standards and Technology (NIST). The objective of this analysis was to update the uncertainties of the hot channel factors with respect to previous analysis for both high-enriched uranium (HEU) and low-enriched uranium (LEU) fuels. Although uncertainties in key parameters which enter into the analysis are not yet known for the LEU core, the current analysis uses reasonable approximations instead of conservative estimates based on HEU values. Cumulative distribution functions (CDFs) were obtained for critical heat flux ratio (CHFR) and onset of flow instability ratio (OFIR). As was done previously, the Sudo-Kaminaga correlation was used for CHF and the Saha-Zuber correlation was used for OFI. Results were obtained for probability levels of 90%, 95%, and 99.9%. As an example of the analysis, the results for both the existing reactor with HEU fuel and the LEU core show that CHFR would have to be above 1.39 to assure with 95% probability that there is no CHF. For the OFIR, the results show that the ratio should be above 1.40 to assure with a 95% probability that OFI is not reached.
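Obtaining a CDF for a thermal-limit ratio typically means propagating the hot channel factor uncertainties by Monte Carlo and reading off a percentile. The sketch below is generic: the uncertainty magnitudes and the nominal ratio are placeholders, not NBSR values, and the real analysis uses the Sudo-Kaminaga correlation rather than a fixed nominal.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 200_000
# Hypothetical multiplicative hot channel uncertainties (fractional)
power = rng.normal(1.0, 0.02, N)      # local power measurement
flow = rng.normal(1.0, 0.03, N)       # channel flow rate
chf_corr = rng.normal(1.0, 0.05, N)   # CHF correlation uncertainty

chfr_nominal = 2.0                    # assumed nominal CHF ratio
chfr = chfr_nominal * chf_corr * flow / power

# 95% probability level: the CHFR value exceeded by 95% of the samples
chfr_95 = np.percentile(chfr, 5)
```

Requiring the computed CHFR to stay above this percentile value is what statements like "above 1.39 with 95% probability" express.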
Statistical addition method for external noise sources affecting HF-MF-LF systems
NASA Astrophysics Data System (ADS)
Neudegg, David
2001-01-01
The current statistical method for the addition of external component noise sources in the LF, MF, and lower HF band (100 kHz to 3 MHz) produces total median noise levels that may be less than the largest-component median in some cases. Several case studies illustrate this anomaly. Methods used to sum the components rely on their power (decibels) distributions being represented as normal by the statistical parameters. The atmospheric noise component is not correctly represented by its decile values when it is assumed to have a normal distribution, causing anomalies in the noise summation when components are similar in magnitude. A revised component summation method is proposed, and the way it provides a more physically realistic total noise median for LF, MF, and lower HF frequencies is illustrated.
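The consistency requirement behind the revised method can be demonstrated numerically: if components are summed in linear power, the total median can never fall below the largest component median. The distributions below are invented stand-ins for atmospheric-like noise, not ITU data.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 100_000
# Two hypothetical noise components with equal 30 dB medians and wide
# (atmospheric-like) power distributions
comp_a_db = rng.normal(30.0, 6.0, N)
comp_b_db = rng.normal(30.0, 6.0, N)

# Physically consistent summation: add the components in linear power,
# then take the median of the total back in dB
total_db = 10.0 * np.log10(10.0**(comp_a_db / 10.0) + 10.0**(comp_b_db / 10.0))
gain = np.median(total_db) - 30.0   # dB above the largest component median
```

Pointwise the total always lies between the larger component and that component plus 3.01 dB, so a summation formula producing a total median below the largest component median, as in the anomaly described above, cannot be physically correct.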
Recent advances in statistical energy analysis
NASA Technical Reports Server (NTRS)
Heron, K. H.
1992-01-01
Statistical Energy Analysis (SEA) has traditionally been developed using a modal summation and averaging approach, which has led to the need for many restrictive SEA assumptions. The assumption of 'weak coupling' is particularly unacceptable when attempts are made to apply SEA to structural coupling. It is now believed that this assumption is more a function of the modal formulation than a necessary part of SEA. The present analysis ignores this restriction and describes a wave approach to the calculation of plate-plate coupling loss factors. Predictions based on this method are compared with results obtained from experiments using point excitation on one side of an irregular six-sided box structure. Conclusions show that the use and calculation of infinite transmission coefficients is the way forward for the development of a purely predictive SEA code.
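Once coupling loss factors are available (however computed), SEA reduces to a linear power balance between subsystem energies. The two-subsystem sketch below uses invented loss factors and modal densities purely to show the structure of the balance equations.

```python
import numpy as np

# Minimal two-subsystem SEA power balance with assumed illustrative values
omega = 2 * np.pi * 1000.0   # band centre frequency, rad/s
eta1, eta2 = 0.01, 0.02      # damping loss factors
eta12 = 0.005                # coupling loss factor, subsystem 1 -> 2
n1, n2 = 0.1, 0.05           # modal densities (modes per rad/s)
eta21 = eta12 * n1 / n2      # SEA reciprocity (consistency) relation

P1, P2 = 1.0, 0.0            # input power only into subsystem 1 (W)
A = omega * np.array([[eta1 + eta12, -eta21],
                      [-eta12, eta2 + eta21]])
E1, E2 = np.linalg.solve(A, [P1, P2])   # band-averaged subsystem energies
```

The wave-based coupling loss factors advocated in the abstract would enter this system through eta12 and eta21; the high-frequency vibration spectra follow from the solved energies.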
Multivariate statistical analysis of wildfires in Portugal
NASA Astrophysics Data System (ADS)
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980-2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology, 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
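Of the listed techniques, principal components analysis is the most common entry point for paired fire-regime series. The sketch below simulates correlated annual series of fire counts and burned area (the numbers are invented, not PRFD values) and extracts the variance explained by each component.

```python
import numpy as np

rng = np.random.default_rng(8)
years = 30
# Hypothetical annual series: number of fires and area burned (ha),
# with an assumed linear link plus noise
n_fires = rng.normal(20000.0, 5000.0, years)
area = 2.5 * n_fires + rng.normal(0.0, 10000.0, years)
X = np.column_stack([n_fires, area])

# PCA via the eigendecomposition of the correlation matrix
eigval, eigvec = np.linalg.eigh(np.corrcoef(X, rowvar=False))
explained = eigval[::-1] / eigval.sum()   # leading component first
```

A dominant first component here reflects the strong coupling between fire counts and burned area; the residual component isolates years where the two measures diverge, which is often the more interesting signal for regime characterization.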
HistFitter software framework for statistical data analysis
NASA Astrophysics Data System (ADS)
Baak, M.; Besjes, G. J.; Côté, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-04-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface.
Additional EIPC Study Analysis. Final Report
Hadley, Stanton W; Gotham, Douglas J.; Luciani, Ralph L.
2014-12-01
Between 2010 and 2012 the Eastern Interconnection Planning Collaborative (EIPC) conducted a major long-term resource and transmission study of the Eastern Interconnection (EI). With guidance from a Stakeholder Steering Committee (SSC) that included representatives from the Eastern Interconnection States Planning Council (EISPC) among others, the project was conducted in two phases. Phase 1 involved a long-term capacity expansion analysis that involved creation of eight major futures plus 72 sensitivities. Three scenarios were selected for more extensive transmission-focused evaluation in Phase 2. Five power flow analyses, nine production cost model runs (including six sensitivities), and three capital cost estimations were developed during this second phase. The results from Phase 1 and 2 provided a wealth of data that could be examined further to address energy-related questions. A list of 14 topics was developed for further analysis. This paper brings together the earlier interim reports of the first 13 topics plus one additional topic into a single final report.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
ERIC Educational Resources Information Center
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs: mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
An R package for statistical provenance analysis
NASA Astrophysics Data System (ADS)
Vermeesch, Pieter; Resentini, Alberto; Garzanti, Eduardo
2016-05-01
This paper introduces provenance, a software package within the statistical programming environment R, which aims to facilitate the visualisation and interpretation of large amounts of sedimentary provenance data, including mineralogical, petrographic, chemical and isotopic provenance proxies, or any combination of these. provenance comprises functions to: (a) calculate the sample size required to achieve a given detection limit; (b) plot distributional data such as detrital zircon U-Pb age spectra as Cumulative Age Distributions (CADs) or adaptive Kernel Density Estimates (KDEs); (c) plot compositional data as pie charts or ternary diagrams; (d) correct the effects of hydraulic sorting on sandstone petrography and heavy mineral composition; (e) assess the settling equivalence of detrital minerals and grain-size dependence of sediment composition; (f) quantify the dissimilarity between distributional data using the Kolmogorov-Smirnov and Sircombe-Hazelton distances, or between compositional data using the Aitchison and Bray-Curtis distances; (g) interpret multi-sample datasets by means of (classical and nonmetric) Multidimensional Scaling (MDS) and Principal Component Analysis (PCA); and (h) simplify the interpretation of multi-method datasets by means of Generalised Procrustes Analysis (GPA) and 3-way MDS. All these tools can be accessed through an intuitive query-based user interface, which does not require knowledge of the R programming language. provenance is free software released under the GPL-2 licence and will be further expanded based on user feedback.
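Step (f), the Kolmogorov-Smirnov dissimilarity between distributional datasets, can be sketched independently of the R package. The age samples below are invented; in practice they would be measured detrital zircon U-Pb ages, and the resulting matrix would feed into MDS as in step (g).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
# Hypothetical detrital zircon U-Pb age samples (Ma) from three sources
samples = {
    "A": rng.normal(500.0, 50.0, 100),
    "B": rng.normal(520.0, 50.0, 100),
    "C": rng.normal(1000.0, 80.0, 100),
}
names = list(samples)
n = len(names)

# Pairwise Kolmogorov-Smirnov dissimilarity matrix
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        D[i, j] = stats.ks_2samp(samples[names[i]], samples[names[j]]).statistic
```

The KS statistic is the maximum gap between two empirical CDFs, so it is scale-free and well suited to age spectra; samples A and B, which overlap heavily, end up far less dissimilar than either is to C.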
Time Series Analysis Based on Running Mann Whitney Z Statistics
Technology Transfer Automated Retrieval System (TEKTRAN)
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
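The moving-window scheme described above can be sketched directly: compare each pair of adjacent windows with a Mann-Whitney U test and convert U to Z via the normal approximation. The window size and test data are assumptions for illustration.

```python
import numpy as np
from scipy import stats

def running_mw_z(series, window):
    """Slide two adjacent windows along the series; convert each
    Mann-Whitney U comparison to a Z statistic using the normal
    approximation for the null mean and variance of U."""
    z = []
    for t in range(window, len(series) - window + 1):
        a = series[t - window:t]
        b = series[t:t + window]
        u = stats.mannwhitneyu(a, b, alternative="two-sided").statistic
        mu = window * window / 2
        sigma = np.sqrt(window * window * (2 * window + 1) / 12)
        z.append((u - mu) / sigma)
    return np.array(z)

rng = np.random.default_rng(10)
# Series with an abrupt level shift halfway through
series = np.concatenate([rng.normal(0, 1, 100), rng.normal(2, 1, 100)])
z = running_mw_z(series, window=20)
```

Because the statistic is rank-based, the resulting Z trace flags level shifts without assuming normality of the underlying data; the abstract's Monte Carlo normalization is replaced here by the closed-form approximation.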
Statistical analysis of single-trial Granger causality spectra.
Brovelli, Andrea
2012-01-01
Granger causality analysis is becoming central for the analysis of interactions between neural populations and oscillatory networks. However, it is currently unclear whether single-trial estimates of Granger causality spectra can be used reliably to assess directional influence. We addressed this issue by combining single-trial Granger causality spectra with statistical inference based on general linear models. The approach was assessed on synthetic and neurophysiological data. Synthetic bivariate data was generated using two autoregressive processes with unidirectional coupling. We simulated two hypothetical experimental conditions: the first mimicked a constant and unidirectional coupling, whereas the second modelled a linear increase in coupling across trials. The statistical analysis of single-trial Granger causality spectra, based on t-tests and linear regression, successfully recovered the underlying pattern of directional influence. In addition, we characterised the minimum number of trials and coupling strengths required for significant detection of directionality. Finally, we demonstrated the relevance for neurophysiology by analysing two local field potentials (LFPs) simultaneously recorded from the prefrontal and premotor cortices of a macaque monkey performing a conditional visuomotor task. Our results suggest that the combination of single-trial Granger causality spectra and statistical inference provides a valuable tool for the analysis of large-scale cortical networks and brain connectivity.
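The statistical layer of the approach, t-tests and linear regression applied across single-trial causality estimates, can be sketched with simulated values. The numbers below stand in for Granger causality estimates in one frequency band; they are not computed from actual autoregressive fits.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
trials = 60
# Hypothetical single-trial Granger causality values in one band:
# condition 1 has constant coupling, condition 2 a linear increase
gc_const = 0.30 + rng.normal(0, 0.05, trials)
gc_ramp = 0.20 + 0.004 * np.arange(trials) + rng.normal(0, 0.05, trials)

# t-test: is directional influence present (mean > 0) in condition 1?
t, p_t = stats.ttest_1samp(gc_const, 0.0)

# Regression: does coupling grow across trials in condition 2?
slope, intercept, r, p_slope, se = stats.linregress(np.arange(trials), gc_ramp)
```

Treating the trial index as a regressor is what lets the general linear model recover experimentally induced changes in coupling, rather than only its average level.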
Statistical Analysis of Bus Networks in India
Chatterjee, Atanu; Manohar, Manju; Ramadurai, Gitakrishnan
2016-01-01
In this paper, we model the bus networks of six major Indian cities as graphs in L-space, and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study on the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns in the networks. We also observe that these networks although, robust and resilient to random attacks are particularly degree-sensitive. Unlike real-world networks, such as Internet, WWW and airline, that are virtual, bus networks are physically constrained. Our findings therefore, throw light on the evolution of such geographically and constrained networks that will help us in designing more efficient bus networks in the future. PMID:27992590
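The L-space construction and the degree-sensitivity claim can be sketched with a toy network: stops are nodes, consecutive stops on a route share an edge. The ring-plus-shortcuts topology below is an invented stand-in for a real city's routes.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 200
# L-space graph: one ring route plus hypothetical cross-town shortcuts
adj = np.zeros((n, n), dtype=bool)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = True
for _ in range(40):
    i, j = rng.choice(n, size=2, replace=False)
    adj[i, j] = adj[j, i] = True

degree = adj.sum(axis=0)

# Degree-sensitivity: targeting the highest-degree stops removes more
# edges than removing the same number of randomly chosen stops
hubs = np.argsort(degree)[-5:]
random5 = rng.choice(n, size=5, replace=False)
edges_at_hubs = degree[hubs].sum()
edges_at_random = degree[random5].sum()
```

Because the physical ring constrains most degrees to 2, the few shortcut-rich stops concentrate connectivity, which is exactly the robust-to-random but hub-sensitive behaviour the abstract reports.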
CORSSA: Community Online Resource for Statistical Seismicity Analysis
NASA Astrophysics Data System (ADS)
Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.
2011-12-01
Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology, especially to those aspects with great impact on public policy, statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form, along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.
Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)
NASA Astrophysics Data System (ADS)
Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee
2010-12-01
Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review
SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS
NASA Technical Reports Server (NTRS)
Brownlow, J. D.
1994-01-01
The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components. Frequency components with periods longer than the data collection interval
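The time-domain and frequency-domain estimates SPA computes have direct modern equivalents. The sketch below (not SPA itself, which is FORTRAN) applies the same two-step analysis to a simulated noisy sinusoid, using a Hamming-windowed Welch estimate for the smoothed power spectrum.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(13)
fs = 100.0                         # assumed sampling rate, Hz
t = np.arange(0, 20, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + rng.normal(0, 0.5, t.size)  # signal + noise

# Time-domain estimates
mean = x.mean()
var = x.var()
rms = np.sqrt(np.mean(x**2))

# Frequency-domain estimate: Welch power spectrum, Hamming window
f, pxx = signal.welch(x, fs=fs, window="hamming", nperseg=256)
peak_freq = f[np.argmax(pxx)]      # dominant periodicity
```

Because the data are signal plus noise over a finite record, both the RMS and the located peak are estimates, which is exactly the caveat the abstract raises.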
Statistical Power Analysis of Rehabilitation Counseling Research.
ERIC Educational Resources Information Center
Kosciulek, John F.; Szymanski, Edna Mora
1993-01-01
Provided initial assessment of the statistical power of rehabilitation counseling research published in selected rehabilitation journals. From 5 relevant journals, found 32 articles that contained statistical tests that could be power analyzed. Findings indicated that rehabilitation counselor researchers had little chance of finding small…
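The a priori power computation such a survey relies on can be sketched with a normal approximation to the two-sample t-test (an illustrative sketch, not the authors' procedure; effect sizes follow Cohen's conventions):

```python
from scipy.stats import norm

def two_sample_power(d, n, alpha=0.05):
    """Approximate power of a two-tailed two-sample t-test,
    normal approximation, effect size d (Cohen's d), n per group."""
    se = (2.0 / n) ** 0.5            # SE of the mean difference in d units
    z_crit = norm.ppf(1 - alpha / 2)
    z = d / se
    return norm.sf(z_crit - z) + norm.cdf(-z_crit - z)
```

With n = 30 per group, power is roughly 0.12 for a small effect (d = 0.2) but about 0.87 for a large effect (d = 0.8), which illustrates the abstract's point that typical studies had little chance of detecting small effects.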
Web-Based Statistical Sampling and Analysis
ERIC Educational Resources Information Center
Quinn, Anne; Larson, Karen
2016-01-01
Consistent with the Common Core State Standards for Mathematics (CCSSI 2010), the authors write that they have asked students to do statistics projects with real data. To obtain real data, their students use the free Web-based app, Census at School, created by the American Statistical Association (ASA) to help promote civic awareness among school…
[Statistical models for spatial analysis in parasitology].
Biggeri, A; Catelan, D; Dreassi, E; Lagazio, C; Cringoli, G
2004-06-01
The simplest way to study the spatial pattern of a disease is the geographical representation of its cases (or some indicator of them) on a map. Maps based on raw data are generally "wrong" since they do not take sampling error into account. Indeed, the observed differences between areas (or points on the map) are not directly interpretable, as they derive from the composition of true, structural differences and of the noise arising from the sampling process. This problem is well known in human epidemiology, and several solutions have been proposed to filter the signal from the noise. These statistical methods are usually referred to as disease mapping. In geographical analysis a first goal is to evaluate the statistical significance of the heterogeneity between areas (or points). If the test indicates rejection of the hypothesis of homogeneity, the following task is to study the spatial pattern of the disease. The spatial variability of risk is usually decomposed into two terms: a spatially structured (clustering) term and a non-spatially structured (heterogeneity) term. The heterogeneity term reflects spatial variability due to intrinsic characteristics of the sampling units (e.g. hygienic conditions of farms), while the clustering term models the association due to proximity between sampling units, which usually depends on ecological conditions that vary over the study area and affect in a similar way farms that are close to each other. Hierarchical Bayesian models are the main tool for making inference about the clustering and heterogeneity components. The results are based on the marginal posterior distributions of the model parameters, which are approximated by Markov chain Monte Carlo methods. Different models can be defined depending on the terms that are included, namely a model with only the clustering term, a model with only the heterogeneity term, and a model with both. Model selection criteria based on a compromise between
Self-Contained Statistical Analysis of Gene Sets
Cannon, Judy L.; Ricoy, Ulises M.; Johnson, Christopher
2016-01-01
Microarrays are a powerful tool for studying differential gene expression. However, lists of many differentially expressed genes are often generated, and unraveling meaningful biological processes from the lists can be challenging. For this reason, investigators have sought to quantify the statistical probability of compiled gene sets rather than individual genes. The gene sets typically are organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher’s self-contained method for gene set analysis. We improve Fisher’s differential expression analysis of a gene set by limiting the p-value of an individual gene within the gene set to prevent a small percentage of genes from determining the statistical significance of the entire set. In addition, we also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients and T-ALL and AML (Acute Myeloid Leukemia) patients. PMID:27711232
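The core of the capped Fisher approach can be sketched directly: combine the genes' p-values with Fisher's method, but floor each individual p-value so no single gene dominates the set-level statistic (the floor value here is illustrative, not the paper's threshold):

```python
import math
from scipy.stats import chi2

def fisher_combined(pvals, floor=1e-4):
    """Fisher's self-contained gene-set test with per-gene p-value capping.
    Each p-value is floored at `floor` (an illustrative choice) so that a
    small fraction of extreme genes cannot by itself determine the
    significance of the whole set."""
    capped = [max(p, floor) for p in pvals]
    stat = -2.0 * sum(math.log(p) for p in capped)
    # Under the null, the statistic is chi-square with 2k degrees of freedom
    return chi2.sf(stat, df=2 * len(capped))
```

Note that a set containing one astronomically small p-value gives the same result as the same set with that p-value at the floor, which is exactly the intended capping behaviour.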
Modern Statistical Methods for GLAST Event Analysis
Morris, Robin D.; Cohen-Tanugi, Johann; /SLAC /KIPAC, Menlo Park
2007-04-10
We describe a statistical reconstruction methodology for the GLAST LAT. The methodology incorporates in detail the statistics of the interactions of photons and charged particles with the tungsten layers in the LAT, and uses the scattering distributions to compute the full probability distribution over the energy and direction of the incident photons. It uses model selection methods to estimate the probabilities of the possible geometrical configurations of the particles produced in the detector, and numerical marginalization over the energy loss and scattering angles at each layer. Preliminary results show that it can improve on the tracker-only energy estimates for muons and electrons incident on the LAT.
Statistical Analysis of Refractivity in UAE
NASA Astrophysics Data System (ADS)
Al-Ansari, Kifah; Al-Mal, Abdulhadi Abu; Kamel, Rami
2007-07-01
This paper presents the results of the refractivity statistics in the UAE (United Arab Emirates) for a period of 14 years (1990-2003). Six sites have been considered using meteorological surface data (Abu Dhabi, Dubai, Sharjah, Al-Ain, Ras Al-Kaimah, and Al-Fujairah). Upper air (radiosonde) data were available at one site only, Abu Dhabi airport, which has been considered for the refractivity gradient statistics. Monthly and yearly averages are obtained for the two parameters, refractivity and refractivity gradient. Cumulative distributions are also provided.
Notes on numerical reliability of several statistical analysis programs
Landwehr, J.M.; Tasker, Gary D.
1999-01-01
This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
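The kind of computational-correctness failure such benchmarks expose can be reproduced directly: the one-pass "calculator" variance formula loses precision catastrophically on data with a large mean, while the algebraically identical two-pass formula does not (an illustrative sketch; the ANASTY data set itself is not reproduced here):

```python
def var_one_pass(x):
    """Textbook one-pass formula: (sum of squares - (sum)^2/n) / (n-1).
    Numerically unstable when the mean is large relative to the spread."""
    n = len(x)
    s, sq = 0.0, 0.0
    for v in x:
        s += v
        sq += v * v
    return (sq - s * s / n) / (n - 1)

def var_two_pass(x):
    """Stable two-pass formula: subtract the mean first."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)
```

For the data 10^9 + (1, 2, 3, 4, 5), whose sample variance is exactly 2.5, the two-pass formula is exact while the one-pass formula is grossly wrong in double precision.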
[Total analysis of organic rubber additives].
He, Wen-Xuan; Robert, Shanks; You, Ye-Ming
2010-03-01
In the present paper, after medium-pressure chromatographic separation under both normal-phase and reversed-phase conditions, the organic additives in ethylene-propylene rubber were identified by infrared spectrometry. At the same time, a solid-phase extraction column was used to retain the main component of the organic additives, fuel oil, to avoid its interfering with minor compounds; the other organic additives were then separated and analysed by GC/MS. In addition, the remaining active compound, benzoyl peroxide, was identified by GC/MS through direct analysis of the acetone extract. Using the above-mentioned techniques, softening agents (fuel oil, plant oil, and phthalate), a curing agent (benzoyl peroxide), vulcanizing accelerators (2-mercaptobenzothiazole, ethyl thiuram, and butyl thiuram), and antiagers (2,6-di-tert-butyl-4-methylphenol and styrenated phenol) in ethylene-propylene rubber were identified. Although the technique was established for the ethylene-propylene rubber system, it can be used in other rubber systems.
NASA Technical Reports Server (NTRS)
Smalheer, C. V.
1973-01-01
The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.
Critical analysis of adsorption data statistically
NASA Astrophysics Data System (ADS)
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in different ways using statistics. A variety of statistical tests are used to make decisions about the significance and validity of experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data were analysed statistically by hypothesis testing, applying the t test, paired t test, and chi-square test to (a) test the optimum value of the process pH, (b) verify the success of the experiment, and (c) study the effect of adsorbent dose on zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ2 showed the results in favour of the data collected from the experiment, and this has been shown on probability charts. The K value for the Langmuir isotherm was 0.8582 and the m value obtained for the Freundlich adsorption isotherm was 0.725; both are <1, indicating favourable isotherms. Karl Pearson's correlation coefficient values for the Langmuir and Freundlich adsorption isotherms were obtained as 0.99 and 0.95 respectively, which show a high degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
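The Freundlich constants and Pearson correlations quoted in such studies typically come from a linearised least-squares fit of log qe against log Ce. A sketch of that step (synthetic data in the usage below, not the paper's measurements):

```python
import numpy as np

def freundlich_fit(Ce, qe):
    """Linearised Freundlich fit: log10(qe) = log10(K) + m * log10(Ce).
    Returns (K, m, r), where r is Karl Pearson's correlation coefficient
    of the linearised data. Illustrative helper; the abstract's values
    (m = 0.725, r = 0.95) came from the actual experiment."""
    x, y = np.log10(Ce), np.log10(qe)
    m, logK = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return 10 ** logK, m, r
```

On exact synthetic data qe = 0.9 Ce^0.7 the fit recovers K = 0.9 and m = 0.7 with r = 1, confirming the estimator before applying it to noisy laboratory data.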
Image analysis and statistical inference in neuroimaging with R.
Tabelow, K; Clayden, J D; de Micheaux, P Lafaye; Polzehl, J; Schmid, V J; Whitcher, B
2011-04-15
R is a language and environment for statistical computing and graphics. It can be considered an alternative implementation of the S language developed in the 1970s and 1980s for data analysis and graphics (Becker and Chambers, 1984; Becker et al., 1988). The R language is part of the GNU project and offers versions that compile and run on almost every major operating system currently available. We highlight several R packages built specifically for the analysis of neuroimaging data in the context of functional MRI, diffusion tensor imaging, and dynamic contrast-enhanced MRI. We review their methodology and give an overview of their capabilities for neuroimaging. In addition we summarize some of the current activities in the area of neuroimaging software development in R.
A statistical package for computing time and frequency domain analysis
NASA Technical Reports Server (NTRS)
Brownlow, J.
1978-01-01
The spectrum analysis (SPA) program is a general purpose digital computer program designed to aid in data analysis. The program does time and frequency domain statistical analyses as well as some preanalysis data preparation. The capabilities of the SPA program include linear trend removal and/or digital filtering of data, plotting and/or listing of both filtered and unfiltered data, time domain statistical characterization of data, and frequency domain statistical characterization of data.
Statistical Analysis of Random Number Generators
NASA Astrophysics Data System (ADS)
Accardi, Luigi; Gäbler, Markus
2011-01-01
In many applications, for example cryptography and Monte Carlo simulation, there is a need for random numbers. Any procedure, algorithm, or device intended to produce such numbers is called a random number generator (RNG). What makes a good RNG? This paper gives an overview of empirical testing of the statistical properties of the sequences produced by RNGs and of special software packages designed for that purpose. We also present the results of applying a particular test suite, TestU01, to a family of RNGs currently being developed at the Centro Interdipartimentale Vito Volterra (CIVV), Roma, Italy.
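A single empirical test in the spirit of such suites can be sketched in a few lines (this is a toy uniformity check, not TestU01 itself):

```python
from scipy.stats import chisquare

def uniformity_test(rng, n=10000, bins=10):
    """Bin n draws from rng() in [0, 1) and chi-square test the counts
    against the uniform expectation. Returns the p-value; a very small
    p-value flags a suspect generator."""
    counts = [0] * bins
    for _ in range(n):
        counts[int(rng() * bins)] += 1
    return chisquare(counts).pvalue
```

A healthy generator yields p-values spread over (0, 1) across repeated runs, while a degenerate generator (e.g. one returning a constant) fails immediately. Real test suites apply many such tests to many statistics.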
Statistical analysis of life history calendar data.
Eerola, Mervi; Helske, Satu
2016-04-01
The life history calendar is a data-collection tool for obtaining reliable retrospective data about life events. To illustrate the analysis of such data, we compare the model-based probabilistic event history analysis and the model-free data mining method, sequence analysis. In event history analysis, we estimate, instead of transition hazards, the cumulative prediction probabilities of life events over the entire trajectory. In sequence analysis, we compare several dissimilarity metrics and contrast data-driven and user-defined substitution costs. As an example, we study young adults' transition to adulthood as a sequence of events in three life domains. The events define the multistate event history model and the parallel life domains in multidimensional sequence analysis. The relationship between life trajectories and excess depressive symptoms in middle age is further studied by their joint prediction in the multistate model and by regressing the symptom scores on individual-specific cluster indices. The two approaches complement each other in life course analysis: sequence analysis can effectively find typical and atypical life patterns, while event history analysis is needed for causal inquiries.
Statistical analysis of Contact Angle Hysteresis
NASA Astrophysics Data System (ADS)
Janardan, Nachiketa; Panchagnula, Mahesh
2015-11-01
We present the results of a new statistical approach to determining Contact Angle Hysteresis (CAH) by studying the nature of the triple line. A statistical distribution of local contact angles on a random three-dimensional drop is used as the basis for this approach. Drops with randomly shaped triple lines but of fixed volumes were deposited on a substrate and their triple line shapes were extracted by imaging. Using a solution developed by Prabhala et al. (Langmuir, 2010), the complete three dimensional shape of the sessile drop was generated. A distribution of the local contact angles for several such drops but of the same liquid-substrate pairs is generated. This distribution is a result of several microscopic advancing and receding processes along the triple line. This distribution is used to yield an approximation of the CAH associated with the substrate. This is then compared with measurements of CAH by means of a liquid infusion-withdrawal experiment. Static measurements are shown to be sufficient to measure quasistatic contact angle hysteresis of a substrate. The approach also points towards the relationship between microscopic triple line contortions and CAH.
Statistical analysis of low level atmospheric turbulence
NASA Technical Reports Server (NTRS)
Tieleman, H. W.; Chen, W. W. L.
1974-01-01
The statistical properties of low-level wind-turbulence data were obtained with the model 1080 total vector anemometer and the model 1296 dual split-film anemometer, both manufactured by Thermo Systems Incorporated. The data obtained from the above fast-response probes were compared with the results obtained from a pair of Gill propeller anemometers. The digitized time series representing the three velocity components and the temperature were each divided into a number of blocks, the length of which depended on the lowest frequency of interest and also on the storage capacity of the available computer. A moving-average and differencing high-pass filter was used to remove the trend and the low frequency components in the time series. The calculated results for each of the anemometers used are represented in graphical or tabulated form.
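The moving-average trend-removal step described above can be sketched as follows (a simplified version: subtraction of a centered moving average only, without the differencing stage the authors also used):

```python
import numpy as np

def moving_average_highpass(x, window):
    """High-pass filter by subtracting a centered moving average,
    removing the trend and low-frequency components of the series."""
    kernel = np.ones(window) / window
    trend = np.convolve(x, kernel, mode="same")
    return x - trend
```

Applied to a sinusoid riding on a linear trend, the filter returns the sinusoid (away from the edges, where `mode="same"` padding distorts the average) with the trend removed.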
Statistical Evaluation of Time Series Analysis Techniques
NASA Technical Reports Server (NTRS)
Benignus, V. A.
1973-01-01
The performance of a modified version of NASA's multivariate spectrum analysis program is discussed. A multiple regression model was used to make the revisions. Performance improvements were documented and compared to the standard fast Fourier transform by Monte Carlo techniques.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak
2016-06-01
Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions available for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-in program, which will soon be supported by MS Excel 2016, would eliminate some limitations of using statistical formulas within MS Excel.
NASA Astrophysics Data System (ADS)
Lionello, Piero; Conte, Dario; Marzo, Luigi; Scarascia, Luca
2015-04-01
In the Mediterranean Sea there are two contrasting factors affecting the maximum level that water will reach during a storm in the next decades: the increase of mean sea level and the decrease of storminess. A future reduction of storminess, associated with a decreased intensity of the Mediterranean branch of the northern hemisphere storm track, will produce a reduction of the maxima of wind wave height and storm surge levels. Changes of mean sea level are produced by regional steric effects and by net mass addition. While it is possible to compute the steric effects with regional models, mass addition is ultimately the consequence of a remote cause: the melting of the Greenland and Antarctic ice caps. This study considers four indicators of extreme water levels, ranked here in order of increasing value: the average of the 10 largest annual maxima (wlind10), the largest annual maximum (wlind1), and the 5-year (rv5) and 50-year (rv50) return levels. The analysis is based on a coordinated set of wave and storm surge simulations forced by inputs from regional climate model simulations that were carried out in the CIRCE EU-FP7 project and cover the period 1951-2050. Accounting for all factors except mass addition, along about 60% of the Mediterranean coast reduced storminess and steric expansion will compensate each other and produce no significant change in maximum water level statistics. The remaining 40% of the coastline is almost equally divided between significant positive and negative changes. However, if a supplementary sea level increase representing the effect of water mass addition is added, the fraction of the coast with significant positive/negative changes increases/decreases quickly. If mass addition contributed 10 cm, there would be no significant negative changes for any indicator. With a 20 cm addition the increase would be significant for wlind10, wlind1, and rv5 along more than 75% of the Mediterranean coastline. With a 35 cm addition the increase
Comparative analysis of positive and negative attitudes toward statistics
NASA Astrophysics Data System (ADS)
Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah
2015-02-01
Many statistics lecturers and statistics education researchers are interested to know their students' attitudes toward statistics during the statistics course. A positive attitude toward statistics is vital because it encourages students to take an interest in the course and to master the core content of the subject matter under study. Students who have negative attitudes toward statistics, by contrast, may feel depressed, especially in group assignments, are at risk of failure, are often highly emotional, and may be unable to move forward. Therefore, this study investigates students' attitudes towards learning statistics. Six latent constructs were used to measure students' attitudes toward learning statistics: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from the reliable and validated Survey of Attitudes Towards Statistics (SATS) instrument. The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students taking the applied statistics course from different faculties. From the analysis, it is found that the questionnaire is acceptable, and the relationships among the constructs have been proposed and investigated. Students show full effort to master the statistics course, find the course enjoyable, have confidence in their intellectual capacity, and hold more positive than negative attitudes towards statistics learning. In conclusion, positive attitudes were mostly exhibited in the affect, cognitive competence, value, interest, and effort constructs, while negative attitudes were mostly exhibited in the difficulty construct.
Statistical analysis of plasmaspheric EMIC waves
NASA Astrophysics Data System (ADS)
Kato, Y.; Miyoshi, Y.; Sakaguchi, K.; Kasahara, Y.; Keika, K.; Shoji, M.; Kitamura, N.; Hasegawa, S.; Kumamoto, A.; Shiokawa, K.
2014-12-01
Electromagnetic ion cyclotron (EMIC) waves in the inner magnetosphere are important since EMIC waves cause pitch angle scattering of ring current ions as well as of relativistic electrons in the radiation belts. Although the spatial distributions of EMIC waves have been investigated by several spacecraft such as CRRES, THEMIS, and AMPTE/CCE, there have been few studies of plasmaspheric EMIC waves. We statistically investigate EMIC wave data using the Akebono/VLF measurements. The plasmaspheric EMIC waves tend to be distributed at a lower L-shell region (L~2) than the slot region. There are no significant MLT dependences, unlike the EMIC waves outside the plasmapause. The plasmaspheric EMIC wave frequencies depend on the equatorial cyclotron frequency, suggesting that the plasmaspheric EMIC waves are not propagated from high L-shells but are generated near the magnetic equator at the observed L-shell. This result is consistent with the dependence of the resonance energy. Using the in-situ thermal plasma density measured by the Akebono satellite, we estimate the resonance energy of energetic ions; the resonance energies of the plasmaspheric EMIC waves are a few tens of keV to ~1 MeV. The results indicate that ring current and radiation belt ions may contribute to the generation of the plasmaspheric EMIC waves.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.
Statistical analysis of fixed income market
NASA Astrophysics Data System (ADS)
Bernaschi, Massimo; Grilli, Luca; Vergni, Davide
2002-05-01
We present cross and time series analysis of price fluctuations in the US Treasury fixed income market. Bonds have been classified according to a suitable metric based on the correlation among them. The classification shows how the correlation among fixed income securities depends strongly on their maturity. We also study the structure of price fluctuations for single time series.
Component outage data analysis methods. Volume 2: Basic statistical methods
NASA Astrophysics Data System (ADS)
Marshall, J. A.; Mazumdar, M.; McCutchan, D. A.
1981-08-01
Statistical methods for analyzing outage data on major power system components such as generating units, transmission lines, and transformers are identified. The analysis methods produce outage statistics from component failure and repair data that help in understanding the failure causes and failure modes of various types of components. Methods for forecasting outage statistics for those components used in the evaluation of system reliability are emphasized.
Statistical Analysis of Multiple Choice Testing
2001-04-01
[Fragmentary abstract recovered from footnote text: the report examines item analysis of multiple-choice tests, including the use of the point-biserial correlation (Rpbis) to identify poor distractors (incorrect answers). Attali and Fraenkel show that while it is sound to use the Rpbis for distractors, the biserial is usually preferred as a criterion measure for the correct alternative, and both measures depend heavily on question difficulty. The report also references the Test Analysis & Development System (TAD), version 5.49, by Thomas R. Renckly.]
Internet Data Analysis for the Undergraduate Statistics Curriculum
ERIC Educational Resources Information Center
Sanchez, Juana; He, Yan
2005-01-01
Statistics textbooks for undergraduates have not caught up with the enormous amount of analysis of Internet data that is taking place these days. Case studies that use Web server log data or Internet network traffic data are rare in undergraduate Statistics education. And yet these data provide numerous examples of skewed and bimodal…
Attitudes and Achievement in Statistics: A Meta-Analysis Study
ERIC Educational Resources Information Center
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data
ERIC Educational Resources Information Center
Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val
2006-01-01
Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method: Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…
Explorations in Statistics: The Analysis of Ratios and Normalized Data
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2013-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized--or standardized--data. As researchers, we compute a ratio--a numerator divided by a denominator--to compute a…
A Realistic Experimental Design and Statistical Analysis Project
ERIC Educational Resources Information Center
Muske, Kenneth R.; Myers, John A.
2007-01-01
A realistic applied chemical engineering experimental design and statistical analysis project is documented in this article. This project has been implemented as part of the professional development and applied statistics courses at Villanova University over the past five years. The novel aspects of this project are that the students are given a…
System statistical reliability model and analysis
NASA Technical Reports Server (NTRS)
Lekach, V. S.; Rood, H.
1973-01-01
A digital computer code was developed to simulate the time-dependent behavior of the 5-kwe reactor thermoelectric system. The code was used to determine lifetime sensitivity coefficients for a number of system design parameters, such as thermoelectric module efficiency and degradation rate, radiator absorptivity and emissivity, fuel element barrier defect constant, beginning-of-life reactivity, etc. A probability distribution (mean and standard deviation) was estimated for each of these design parameters. Then, error analysis was used to obtain a probability distribution for the system lifetime (mean = 7.7 years, standard deviation = 1.1 years). From this, the probability that the system will achieve the design goal of 5 years lifetime is 0.993. This value represents an estimate of the degradation reliability of the system.
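The final reliability figure follows directly from the stated lifetime distribution. Assuming normality (which the abstract's error analysis implies), the probability of exceeding the 5-year design goal is a one-line tail computation:

```python
from scipy.stats import norm

# Lifetime distribution from the error analysis: mean 7.7 years,
# standard deviation 1.1 years. The degradation reliability is the
# probability that lifetime exceeds the 5-year design goal.
p_goal = norm.sf(5.0, loc=7.7, scale=1.1)
print(round(p_goal, 3))  # → 0.993, the value quoted in the abstract
```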
A statistical analysis of UK financial networks
NASA Astrophysics Data System (ADS)
Chu, J.; Nadarajah, S.
2017-04-01
In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using the same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
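The distribution-fitting-and-comparison step can be sketched with SciPy's `fit` and Kolmogorov-Smirnov machinery (synthetic lognormal draws stand in for the market-graph degree data, and only two of the four candidate families are shown):

```python
import numpy as np
from scipy import stats

# Synthetic vertex degrees standing in for the market-graph data.
rng = np.random.default_rng(1)
degrees = rng.lognormal(mean=3.0, sigma=0.5, size=500)

fits = {}
for name, dist in [("lognorm", stats.lognorm), ("pareto", stats.pareto)]:
    params = dist.fit(degrees)          # maximum-likelihood fit
    # Kolmogorov-Smirnov statistic: smaller means a better fit
    fits[name] = stats.kstest(degrees, name, args=params).statistic

best = min(fits, key=fits.get)
```

On these data the lognormal fit wins, as expected; on real degree data one would compare all four candidate families the same way.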
Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance
NASA Astrophysics Data System (ADS)
Glascock, M. D.; Neff, H.; Vaughn, K. J.
2004-06-01
The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.
Statistical analysis of litter experiments in teratology
Williams, R.; Buschbom, R.L.
1982-11-01
Teratological data is binary response data (each fetus is either affected or not) in which the responses within a litter are usually not independent. As a result, the litter should be taken as the experimental unit. For each litter, its size, n, and the number of fetuses, x, possessing the effect of interest are recorded. The ratio p = x/n is then the basic data generated by the experiment. There are currently three general approaches to the analysis of teratological data: nonparametric, transformation followed by t-test or ANOVA, and parametric. The first two are currently in wide use by practitioners while the third is relatively new to the field. These first two also appear to possess comparable power levels while maintaining the nominal level of significance. When transformations are employed, care must be exercised to check that the transformed data has the required properties. Since the data is often highly asymmetric, there may be no transformation which renders the data nearly normal. The parametric procedures, including the beta-binomial model, offer the possibility of increased power.
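The transformation-then-t-test approach, with the litter as the experimental unit, can be sketched as follows (the litter data are hypothetical, and the arcsine-square-root transformation is one of several the literature uses):

```python
import math
from scipy import stats

def litter_proportions(litters):
    """Treat each litter (x affected out of n) as one experimental unit,
    returning arcsine-square-root transformed proportions p = x/n."""
    return [math.asin(math.sqrt(x / n)) for x, n in litters]

# Hypothetical litter data: (number of affected fetuses, litter size)
control = [(0, 10), (1, 12), (0, 8), (2, 11), (1, 9)]
treated = [(4, 10), (5, 12), (3, 8), (6, 11), (4, 9)]

t_stat, p_value = stats.ttest_ind(litter_proportions(treated),
                                  litter_proportions(control))
```

As the abstract cautions, one should check that the transformed litter proportions are reasonably symmetric before relying on the t-test; highly asymmetric data may need a nonparametric or beta-binomial analysis instead.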
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
ERIC Educational Resources Information Center
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Chen, Zhe; Ohara, Shinji; Cao, Jianting; Vialatte, François; Lenz, Fred A; Cichocki, Andrzej
2007-01-01
This article is devoted to statistical modeling and analysis of electrocorticogram (ECoG) signals induced by painful cutaneous laser stimuli, which were recorded from implanted electrodes in awake humans. Specifically, with the statistical tools of factor analysis and independent component analysis, the pain-induced laser-evoked potentials (LEPs) were extracted and investigated under different controlled conditions. With the help of wavelet analysis, quantitative and qualitative analyses were conducted on the LEPs' attributes of power, amplitude, and latency, in both averaging and single-trial experiments. Statistical hypothesis tests were applied in various experimental setups. The experimental results reported herein confirm previous findings in the neurophysiology literature. In addition, single-trial analysis revealed many new observations that might be of interest to neuroscientists or clinical neurophysiologists. These promising results suggest that advanced signal processing and statistical analysis may open new avenues for future studies of such ECoG or other relevant biomedical recordings.
Spectral signature verification using statistical analysis and text mining
NASA Astrophysics Data System (ADS)
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature, the textual meta-data and the numerical spectral data, to arrive at a final qualitative assessment. Results associated with the spectral data stored in the Signature Database (SigDB) are proposed. The numerical data comprising a sample material's spectrum are validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum are qualitatively analyzed using lexical-analysis text mining. This technique analyzes the syntax of the meta-data to find local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have been successfully implemented for security (text encryption/decryption), biomedical, and marketing applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high- and low-quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
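The SAM comparison mentioned above reduces to the angle between two spectra treated as vectors; a smaller angle indicates more similar spectral shape. A minimal Python sketch (function names are illustrative, not SigDB's API):

```python
import math

def spectral_angle(a, b):
    """Spectral angle mapper (SAM): the angle in radians between two
    spectra treated as vectors. Insensitive to overall scaling, so two
    spectra differing only in brightness have angle ~0."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point round-off
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))
```

A test spectrum would be ranked by its angle to the mean spectrum of the ideal population set.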
Saisubramanian, N; Edwinoliver, N G; Nandakumar, N; Kamini, N R; Puvanakrishnan, R
2006-08-01
The efficacy of lipase from Aspergillus niger MTCC 2594 as an additive in laundry detergent formulations was assessed using response surface methodology (RSM). A five-level four-factorial central composite design was chosen to model the washing protocol with four critical factors, viz. detergent concentration, lipase concentration, buffer pH and washing temperature. The model suggested that all the factors chosen had a significant impact on oil removal, and the optimal conditions for the removal of olive oil from cotton fabric were 1.0% detergent, 75 U of lipase, buffer pH of 9.5 and washing temperature of 25 degrees C. Under optimal conditions, the removal of olive oil from cotton fabric was 33% and 17.1% greater at 25 and 49 degrees C, respectively, with lipase than with detergent alone. Hence, lipase from A. niger could be effectively used as an additive in detergent formulations for the removal of triglyceride soil in both cold and warm wash conditions.
Analysis of Coastal Dunes: A Remote Sensing and Statistical Approach.
ERIC Educational Resources Information Center
Jones, J. Richard
1985-01-01
Remote sensing analysis and statistical methods were used to analyze the coastal dunes of Plum Island, Massachusetts. The research methodology used provides an example of a student project for remote sensing, geomorphology, or spatial analysis courses at the university level. (RM)
Data explorer: a prototype expert system for statistical analysis.
Aliferis, C.; Chao, E.; Cooper, G. F.
1993-01-01
The inadequate analysis of medical research data, due mainly to the unavailability of local statistical expertise, seriously jeopardizes the quality of new medical knowledge. Data Explorer is a prototype Expert System that builds on the versatility and power of existing statistical software, to provide automatic analyses and interpretation of medical data. The system draws much of its power by using belief network methods in place of more traditional, but difficult to automate, classical multivariate statistical techniques. Data Explorer identifies statistically significant relationships among variables, and using power-size analysis, belief network inference/learning and various explanatory techniques helps the user understand the importance of the findings. Finally the system can be used as a tool for the automatic development of predictive/diagnostic models from patient databases. PMID:8130501
A Divergence Statistics Extension to VTK for Performance Analysis
Pebay, Philippe Pierre; Bennett, Janine Camille
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
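One common divergence statistic of the kind the abstract describes, quantifying the discrepancy between an observed empirical distribution and a theoretical "ideal" one, is the Kullback-Leibler divergence. A Python sketch of the idea, not the VTK engine itself:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between an observed
    empirical distribution p and a theoretical distribution q.
    Both are sequences of probabilities over the same bins; terms
    with p_i = 0 contribute zero by convention."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

D(p || q) is zero exactly when the two distributions agree, and grows as the observed distribution departs from the theoretical one, which is the distance-like behavior the engine exploits.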
Statistical inference for the additive hazards model under outcome-dependent sampling.
Yu, Jichang; Liu, Yanyan; Sandler, Dale P; Zhou, Haibo
2015-09-01
Cost-effective study designs and proper inference procedures for the resulting data are always of particular interest to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design, for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters for the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating the relative efficiency of the proposed method against the simple random sampling design and derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and the proposed estimator is more efficient than other estimators. We apply our method to a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to assess the risk of cancer from radon exposure.
Statistical inference for the additive hazards model under outcome-dependent sampling
Yu, Jichang; Liu, Yanyan; Sandler, Dale P.; Zhou, Haibo
2015-01-01
Cost-effective study designs and proper inference procedures for the resulting data are always of particular interest to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design, for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters for the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating the relative efficiency of the proposed method against the simple random sampling design and derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and the proposed estimator is more efficient than other estimators. We apply our method to a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to assess the risk of cancer from radon exposure. PMID:26379363
Fisher statistics for analysis of diffusion tensor directional information.
Hutchinson, Elizabeth B; Rutecki, Paul A; Alexander, Andrew L; Sutula, Thomas P
2012-04-30
A statistical approach is presented for the quantitative analysis of diffusion tensor imaging (DTI) directional information using Fisher statistics, which were originally developed for the analysis of vectors in the field of paleomagnetism. In this framework, descriptive and inferential statistics have been formulated based on the Fisher probability density function, a spherical analogue of the normal distribution. The Fisher approach was evaluated for investigation of rat brain DTI maps to characterize tissue orientation in the corpus callosum, fornix, and hilus of the dorsal hippocampal dentate gyrus, and to compare directional properties in these regions following status epilepticus (SE) or traumatic brain injury (TBI) with values in healthy brains. Direction vectors were determined for each region of interest (ROI) for each brain sample and Fisher statistics were applied to calculate the mean direction vector and variance parameters in the corpus callosum, fornix, and dentate gyrus of normal rats and rats that experienced TBI or SE. Hypothesis testing was performed by calculation of Watson's F-statistic and associated p-value giving the likelihood that grouped observations were from the same directional distribution. In the fornix and midline corpus callosum, no directional differences were detected between groups; however, in the hilus, significant (p<0.0005) differences were found that robustly confirmed observations that were suggested by visual inspection of directionally encoded color DTI maps. The Fisher approach is a potentially useful analysis tool that may extend the current capabilities of DTI investigation by providing a means of statistical comparison of tissue structural orientation.
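The core Fisher-statistics computation, the mean direction of a set of unit vectors and the resultant length R that indexes how concentrated they are, can be sketched as follows (Python; an illustration of the standard formula, not the authors' code):

```python
import math

def mean_direction(vectors):
    """Fisher-statistics mean direction of unit vectors in 3D:
    normalize the vector sum. Returns (mean_unit_vector, R), where
    R is the resultant length; R close to the number of vectors
    indicates tightly clustered directions."""
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    sz = sum(v[2] for v in vectors)
    R = math.sqrt(sx * sx + sy * sy + sz * sz)
    return (sx / R, sy / R, sz / R), R
```

In the DTI setting above, the inputs would be the principal diffusion direction vectors sampled within an ROI.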
2010-01-01
Background A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM) which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e. is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics. Results This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power, though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing log-odds with distance from the point. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three cases. Conclusions The GAM permutation testing methods
Statistical inference in behavior analysis: Friend or foe?
Baron, Alan
1999-01-01
Behavior analysts are undecided about the proper role to be played by inferential statistics in behavioral research. The traditional view, as expressed in Sidman's Tactics of Scientific Research (1960), was that inferential statistics has no place within a science that focuses on the steady-state behavior of individual organisms. Despite this admonition, there have been steady inroads of statistical techniques into behavior analysis since then, as evidenced by publications in the Journal of the Experimental Analysis of Behavior. The issues raised by these developments were considered at a panel held at the 24th annual convention of the Association for Behavior Analysis, Orlando, Florida (May, 1998). The proceedings are reported in this and the following articles. PMID:22478323
Statistical inference in behavior analysis: Experimental control is better
Perone, Michael
1999-01-01
Statistical inference promises automatic, objective, reliable assessments of data, independent of the skills or biases of the investigator, whereas the single-subject methods favored by behavior analysts often are said to rely too much on the investigator's subjective impressions, particularly in the visual analysis of data. In fact, conventional statistical methods are difficult to apply correctly, even by experts, and the underlying logic of null-hypothesis testing has drawn criticism since its inception. By comparison, single-subject methods foster direct, continuous interaction between investigator and subject and development of strong forms of experimental control that obviate the need for statistical inference. Treatment effects are demonstrated in experimental designs that incorporate replication within and between subjects, and the visual analysis of data is adequate when integrated into such designs. Thus, single-subject methods are ideal for shaping—and maintaining—the kind of experimental practices that will ensure the continued success of behavior analysis. PMID:22478328
Adaptive strategy for the statistical analysis of connectomes.
Meskaldji, Djalel Eddine; Ottet, Marie-Christine; Cammoun, Leila; Hagmann, Patric; Meuli, Reto; Eliez, Stephan; Thiran, Jean Philippe; Morgenthaler, Stephan
2011-01-01
We study an adaptive statistical approach to analyze brain networks represented by brain connection matrices of interregional connectivity (connectomes). Our approach is at a middle level between a global analysis and a single-connections analysis, considering subnetworks of the global brain network. These subnetworks represent either the inter-connectivity between two brain anatomical regions or the intra-connectivity within the same brain anatomical region. An appropriate summary statistic, which characterizes a meaningful feature of the subnetwork, is evaluated. Based on this summary statistic, a statistical test is performed to derive the corresponding p-value. The reformulation of the problem in this way reduces the number of statistical tests in an orderly fashion based on our understanding of the problem. Considering the global testing problem, the p-values are corrected to control the rate of false discoveries. Finally, the procedure is followed by a local investigation within the significant subnetworks. We contrast this strategy with the one based on the individual measures in terms of power. We show that this strategy has great potential, in particular in cases where the subnetworks are well defined and the summary statistics are properly chosen. As an application example, we compare structural brain connection matrices of two groups of subjects with a 22q11.2 deletion syndrome, distinguished by their IQ scores.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output, as opposed to 7.44% from the forecasted solar irradiance.
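The NRMSE metric used above can be sketched in Python. Normalization conventions vary; here the RMSE is divided by the mean of the observed series, an assumption, since the abstract does not state the paper's exact convention:

```python
import math

def nrmse(forecast, observed):
    """Normalized root mean squared error of a forecast against
    observations, normalized by the mean of the observed series
    (one common convention; others use the observed range)."""
    n = len(observed)
    mse = sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n
    return math.sqrt(mse) / (sum(observed) / n)
```

With this metric, forecast errors from each input parameter (irradiance, cell temperature, wind speed) can be propagated through the PV model and compared on a common scale.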
Metrology optical power budgeting in SIM using statistical analysis techniques
NASA Astrophysics Data System (ADS)
Kuan, Gary M.
2008-07-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arc second resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss with each leg being monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so that an optical power margin can be budgeted. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-10-02
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output, as opposed to 7.44% from the forecasted solar irradiance.
Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques
NASA Technical Reports Server (NTRS)
Kuan, Gary M
2008-01-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arc second resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss with each leg being monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so that an optical power margin can be budgeted. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
Data analysis using the Gnu R system for statistical computation
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
Improving statistical analysis of matched case-control studies.
Conway, Aaron; Rolley, John X; Fulbrook, Paul; Page, Karen; Thompson, David R
2013-06-01
Matched case-control research designs can be useful because matching can increase power due to reduced variability between subjects. However, inappropriate statistical analysis of matched data could result in a change in the strength of association between the dependent and independent variables or a change in the significance of the findings. We sought to ascertain whether matched case-control studies published in the nursing literature utilized appropriate statistical analyses. Of 41 articles identified that met the inclusion criteria, 31 (76%) used an inappropriate statistical test for comparing data derived from case subjects and their matched controls. In response to this finding, we developed an algorithm to support decision-making regarding statistical tests for matched case-control studies.
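For a matched binary outcome, a paired test such as McNemar's uses only the discordant pairs, whereas an unmatched chi-square ignores the pairing, one of the mismatches the review above flags. A minimal Python sketch (illustrative only, not the authors' algorithm):

```python
def mcnemar_statistic(b, c):
    """McNemar chi-square statistic for matched case-control pairs.
    b = pairs where the case is exposed and its matched control is not;
    c = pairs where the control is exposed and the case is not.
    Concordant pairs carry no information about the paired difference,
    so they do not enter the statistic."""
    return (b - c) ** 2 / (b + c)
```

Under the null hypothesis of no association, the statistic is approximately chi-square with one degree of freedom, so values above roughly 3.84 are significant at the 0.05 level.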
A novel statistic for genome-wide interaction analysis.
Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao
2010-09-23
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet the challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Meta-analysis for Discovering Rare-Variant Associations: Statistical Methods and Software Programs.
Tang, Zheng-Zheng; Lin, Dan-Yu
2015-07-02
There is heightened interest in using next-generation sequencing technologies to identify rare variants that influence complex human diseases and traits. Meta-analysis is essential to this endeavor because large sample sizes are required for detecting associations with rare variants. In this article, we provide a comprehensive overview of statistical methods for meta-analysis of sequencing studies for discovering rare-variant associations. Specifically, we discuss the calculation of relevant summary statistics from participating studies, the construction of gene-level association tests, the choice of transformation for quantitative traits, the use of fixed-effects versus random-effects models, and the removal of shadow association signals through conditional analysis. We also show that meta-analysis based on properly calculated summary statistics is as powerful as joint analysis of individual-participant data. In addition, we demonstrate the performance of different meta-analysis methods by using both simulated and empirical data. We then compare four major software packages for meta-analysis of rare-variant associations (MASS, RAREMETAL, MetaSKAT, and seqMeta) in terms of the underlying statistical methodology, analysis pipeline, and software interface. Finally, we present PreMeta, a software interface that integrates the four meta-analysis packages and allows a consortium to combine otherwise incompatible summary statistics.
Revisiting the statistical analysis of pyroclast density and porosity data
NASA Astrophysics Data System (ADS)
Bernard, B.; Kueppers, U.; Ortiz, H.
2015-07-01
Explosive volcanic eruptions are commonly characterized based on a thorough analysis of the generated deposits. Amongst other characteristics in physical volcanology, density and porosity of juvenile clasts are some of the most frequently used to constrain eruptive dynamics. In this study, we evaluate the sensitivity of density and porosity data to statistical methods and introduce a weighting parameter to correct issues raised by the use of frequency analysis. Results of textural investigation can be biased by clast selection. Using statistical tools as presented here, the meaningfulness of a conclusion can be checked for any data set easily. This is necessary to define whether or not a sample has met the requirements for statistical relevance, i.e. whether a data set is large enough to allow for reproducible results. Graphical statistics are used to describe density and porosity distributions, similar to those used for grain-size analysis. This approach helps with the interpretation of volcanic deposits. To illustrate this methodology, we chose two large data sets: (1) directed blast deposits of the 3640-3510 BC eruption of Chachimbiro volcano (Ecuador) and (2) block-and-ash-flow deposits of the 1990-1995 eruption of Unzen volcano (Japan). We propose the incorporation of this analysis into future investigations to check the objectivity of results achieved by different working groups and guarantee the meaningfulness of the interpretation.
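One way to read the weighting idea above: raw frequency counts over-represent whichever clasts were most often sampled, so each density measurement can carry a weight in the summary statistics. A Python sketch (the weights, e.g. mass or volume fractions, are hypothetical; the paper's exact weighting parameter is not given in the abstract):

```python
def weighted_mean_density(densities, weights):
    """Weighted mean of juvenile clast densities (kg/m^3). A weight per
    clast (e.g., its mass or volume fraction of the sample) corrects
    the bias of treating every measured clast as equally representative
    of the deposit."""
    total = sum(weights)
    return sum(d * w for d, w in zip(densities, weights)) / total
```

The same weights can be carried through the graphical statistics (median, sorting) used to describe the density and porosity distributions.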
Introduction to Statistics and Data Analysis With Computer Applications I.
ERIC Educational Resources Information Center
Morris, Carl; Rolph, John
This document consists of unrevised lecture notes for the first half of a 20-week in-house graduate course at Rand Corporation. The chapter headings are: (1) Histograms and descriptive statistics; (2) Measures of dispersion, distance and goodness of fit; (3) Using JOSS for data analysis; (4) Binomial distribution and normal approximation; (5)…
Feasibility of voxel-based statistical analysis method for myocardial PET
NASA Astrophysics Data System (ADS)
Ram Yu, A.; Kim, Jin Su; Paik, Chang H.; Kim, Kyeong Min; Moo Lim, Sang
2014-09-01
Although statistical parametric mapping (SPM) analysis is widely used in neuroimaging studies, to the best of our knowledge it has not previously been applied to myocardial PET data analysis. In this study, we developed a voxel-based statistical analysis method for myocardial PET that provides statistical comparison results between groups in image space. PET emission data from normal and myocardial-infarction rats were acquired. For the SPM analysis, a rat heart template was created; in addition, individual PET data were spatially normalized and smoothed. Two-sample t-tests were performed to identify the myocardial infarct region. The developed SPM method was compared with conventional ROI methods. Myocardial glucose metabolism was decreased in the lateral wall of the left ventricle; in the ROI analysis, the mean value of the lateral wall was decreased by 29%. The newly developed SPM method for myocardial PET could provide quantitative information in myocardial PET studies.
A statistical model for iTRAQ data analysis.
Hill, Elizabeth G; Schwacke, John H; Comte-Walters, Susana; Slate, Elizabeth H; Oberg, Ann L; Eckel-Passow, Jeanette E; Therneau, Terry M; Schey, Kevin L
2008-08-01
We describe biological and experimental factors that induce variability in reporter ion peak areas obtained from iTRAQ experiments. We demonstrate how these factors can be incorporated into a statistical model for use in evaluating differential protein expression and highlight the benefits of using analysis of variance to quantify fold change. We demonstrate the model's utility based on an analysis of iTRAQ data derived from a spike-in study.
A Computer Aided Statistical Covariance Program for Missile System Analysis
1974-04-01
A Computer Aided Statistical Covariance Program for Missile System Analysis, by James R. Rowland and V. M. Gupta, School of Electrical Engineering, Office of Engineering Research, Oklahoma State University; prepared for the U.S. Army Missile Command. Approved for public release; distribution unlimited.
Investigation of Weibull statistics in fracture analysis of cast aluminum
NASA Technical Reports Server (NTRS)
Holland, Frederic A., Jr.; Zaretsky, Erwin V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodology based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
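Two-parameter Weibull analysis of fracture strengths is commonly performed by median-rank regression. A small illustrative sketch, assuming Bernard's approximation for the median ranks (the paper's exact estimation procedure is not specified here):

```python
import math

def weibull_fit(strengths):
    """Two-parameter Weibull fit by median-rank linear regression:
    ln(-ln(1 - F_i)) = m*ln(x_i) - m*ln(x0), with Bernard's approximation
    F_i = (i - 0.3)/(n + 0.4) for the i-th ranked strength."""
    xs = sorted(strengths)
    n = len(xs)
    pts = []
    for i, x in enumerate(xs, start=1):
        f = (i - 0.3) / (n + 0.4)
        pts.append((math.log(x), math.log(-math.log(1.0 - f))))
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    slope = (sum((px - mx) * (py - my) for px, py in pts)
             / sum((px - mx) ** 2 for px, py in pts))
    intercept = my - slope * mx
    modulus = slope                        # Weibull modulus m
    scale = math.exp(-intercept / slope)   # characteristic strength x0
    return modulus, scale

def survival(x, m, x0):
    """Probability that a specimen survives stress x."""
    return math.exp(-((x / x0) ** m))

# Demo on synthetic strengths lying exactly on a Weibull line (m=10, x0=300).
n = 20
demo = [300.0 * (-math.log(1 - (i - 0.3) / (n + 0.4))) ** (1 / 10.0)
        for i in range(1, n + 1)]
m_est, x0_est = weibull_fit(demo)
```

With the fitted modulus and characteristic strength, `survival(x, m, x0)` gives the probability that a specimen withstands stress x, which is the quantity combined with elementary stress analysis in the design methodology.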
Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data
NASA Technical Reports Server (NTRS)
Scargle, J. D.
1982-01-01
Detection of a periodic signal hidden in noise is frequently a goal in astronomical data analysis. This paper does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced. This choice was made because, of the methods in current use, it appears to have the simplest statistical behavior. A modification of the classical definition of the periodogram is necessary in order to retain the simple statistical behavior of the evenly spaced case. With this modification, periodogram analysis and least-squares fitting of sine waves to the data are exactly equivalent. Certain difficulties with the use of the periodogram are less important than commonly believed in the case of detection of strictly periodic signals. In addition, the standard method for mitigating these difficulties (tapering) can be used just as well if the sampling is uneven. An analysis of the statistical significance of signal detections is presented, with examples
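The modified periodogram described here, with the time offset tau that makes it equivalent to least-squares sine fitting, can be sketched directly from its definition; this is an illustrative stdlib version with the sample mean subtracted beforehand:

```python
import math
import random

def lomb_scargle(t, x, omega):
    """Scargle's modified periodogram at angular frequency omega for
    unevenly sampled data.  The offset tau makes the estimate invariant
    to time translation and exactly equivalent to least-squares fitting
    of a sinusoid at that frequency."""
    mean = sum(x) / len(x)
    xc = [v - mean for v in x]
    s2 = sum(math.sin(2.0 * omega * tj) for tj in t)
    c2 = sum(math.cos(2.0 * omega * tj) for tj in t)
    tau = math.atan2(s2, c2) / (2.0 * omega)
    ct = [math.cos(omega * (tj - tau)) for tj in t]
    st = [math.sin(omega * (tj - tau)) for tj in t]
    num_c = sum(xv * c for xv, c in zip(xc, ct)) ** 2
    num_s = sum(xv * s for xv, s in zip(xc, st)) ** 2
    return 0.5 * (num_c / sum(c * c for c in ct)
                  + num_s / sum(s * s for s in st))

# Noiseless sinusoid at f0 = 0.17 observed at 200 irregular times.
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
data = [math.sin(2.0 * math.pi * 0.17 * tj) for tj in times]
p_peak = lomb_scargle(times, data, 2.0 * math.pi * 0.17)
p_off = lomb_scargle(times, data, 2.0 * math.pi * 0.05)
```

The power at the true frequency stands far above the off-frequency leakage even though the sampling is uneven, which is the property the paper exploits for detection.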
Using Pre-Statistical Analysis to Streamline Monitoring Assessments
Reed, J.K.
1999-10-20
A variety of statistical methods exist to aid evaluation of groundwater quality and subsequent decision making in regulatory programs. These methods are applied because of the large temporal and spatial extrapolations commonly applied to these data. In short, statistical conclusions often serve as a surrogate for knowledge. However, facilities with mature monitoring programs that have generated abundant data have inherently less uncertainty because of the sheer quantity of analytical results. In these cases, statistical tests can be less important, and ''expert'' data analysis should assume an important screening role. The WSRC Environmental Protection Department, working with the General Separations Area BSRI Environmental Restoration project team, has developed a method for an Integrated Hydrogeological Analysis (IHA) of historical water quality data from the F and H Seepage Basins groundwater remediation project. The IHA combines common-sense analytical techniques and a GIS presentation that force direct interactive evaluation of the data. The IHA can perform multiple data analysis tasks required by the RCRA permit. These include: (1) development of a groundwater quality baseline prior to remediation startup, (2) targeting of constituents for removal from the RCRA GWPS, (3) targeting of constituents for removal from the UIC permit, (4) targeting of constituents for reduced monitoring, (5) targeting of monitoring wells not producing representative samples, (6) reduction in statistical evaluation, and (7) identification of contamination from other facilities.
Feature-Based Statistical Analysis of Combustion Simulation Data
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
Statistical model applied to motor evoked potentials analysis.
Ma, Ying; Thakor, Nitish V; Jia, Xiaofeng
2011-01-01
Motor evoked potentials (MEPs) convey information regarding the functional integrity of the descending motor pathways. Absence of the MEP has been used as a neurophysiological marker to suggest cortico-spinal abnormalities in the operating room. Due to their high variability and sensitivity, detailed quantitative studies of MEPs are lacking. This paper applies a statistical method to characterize MEPs by estimating the number of motor units and single motor unit potential amplitudes. A clearly increasing trend of single motor unit potential amplitudes in the MEPs after each pulse of the stimulation pulse train is revealed by this method. This statistical method eliminates the effects of anesthesia, and provides an objective assessment of MEPs. Consequently this statistical method has high potential to be useful in future quantitative MEPs analysis.
Adiyaman wind potential and statistical analysis, in Turkey
NASA Astrophysics Data System (ADS)
Sogukpinar, Haci; Bozkurt, Ismail
2017-02-01
In this study, the wind potential of Adiyaman is analyzed statistically and the installed wind capacity across Turkey is summarized. One year of experimental data was obtained for the district of Adiyaman. The data come from the two major stations of Sincik and Kahta, which determine the wind potential of Adiyaman. Measurements at 10 m height are used for the statistical analysis. With the data obtained, monthly average wind speeds are calculated and statistical analyses are performed using the Weibull, Gamma, and Log-normal distributions. Data received from the Sincik station represent the windy part of Adiyaman, so the average wind speed is higher; Kahta represents the windless part of Adiyaman, and the average wind speed there is lower. This study shows that the Gamma distribution gives the best fit to the measurements made.
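As a sketch of such a fit, the Weibull shape and scale parameters can be estimated from wind-speed samples with the common empirical moment method (one of several estimators; the paper does not state which it uses):

```python
import math
import random

def weibull_wind_fit(speeds):
    """Weibull shape k and scale c from wind-speed samples, using the
    empirical moment method k = (sigma/mu)^-1.086, c = mu / Gamma(1 + 1/k)."""
    n = len(speeds)
    mu = sum(speeds) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in speeds) / (n - 1))
    k = (sigma / mu) ** -1.086
    c = mu / math.gamma(1.0 + 1.0 / k)
    return k, c

# Synthetic "wind speeds" drawn from a known Weibull(k=2, c=6) by inverse CDF.
rng = random.Random(7)
sample = [6.0 * (-math.log(1.0 - rng.random())) ** 0.5 for _ in range(3000)]
k_est, c_est = weibull_wind_fit(sample)
```

The Gamma and Log-normal candidates would be fitted analogously and the three compared by a goodness-of-fit criterion.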
Geographic analysis of forest health indicators using spatial scan statistics.
Coulston, John W; Riitters, Kurt H
2003-06-01
Geographically explicit analysis tools are needed to assess forest health indicators that are measured over large regions. Spatial scan statistics can be used to detect spatial or spatiotemporal clusters of forests representing hotspots of extreme indicator values. This paper demonstrates the approach through analyses of forest fragmentation indicators in the southeastern United States and insect and pathogen indicators in the Pacific Northwest United States. The scan statistic detected four spatial clusters of fragmented forest including a hotspot in the Piedmont and Coastal Plain region. Three recurring clusters of insect and pathogen occurrence were found in the Pacific Northwest. Spatial scan statistics are a powerful new tool that can be used to identify potential forest health problems.
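A minimal circular spatial scan with Kulldorff's Poisson likelihood ratio, a standard choice for such hotspot detection, might look as follows (illustrative only; the paper's exact statistic may differ, and real applications assess the best window's significance by Monte Carlo replication, omitted here):

```python
import math

def scan_llr(obs_in, exp_in, obs_total):
    """Kulldorff's Poisson log-likelihood ratio for one scanning window.
    Expected counts are assumed normalized so they sum to obs_total."""
    if obs_in <= exp_in:
        return 0.0
    obs_out = obs_total - obs_in
    exp_out = obs_total - exp_in
    if obs_out == 0:
        return obs_in * math.log(obs_in / exp_in)
    return (obs_in * math.log(obs_in / exp_in)
            + obs_out * math.log(obs_out / exp_out))

def best_circular_cluster(points, obs, expected, radii):
    """Scan circles centred on each location over a set of radii; return
    (max LLR, (cx, cy, r)), the most likely hotspot window."""
    total = sum(obs)
    best_llr, best_win = 0.0, None
    for cx, cy in points:
        for r in radii:
            inside = [i for i, (x, y) in enumerate(points)
                      if (x - cx) ** 2 + (y - cy) ** 2 <= r * r]
            o = sum(obs[i] for i in inside)
            e = sum(expected[i] for i in inside)
            if 0.0 < e < total:
                llr = scan_llr(o, e, total)
                if llr > best_llr:
                    best_llr, best_win = llr, (cx, cy, r)
    return best_llr, best_win

# Toy 5x5 grid with an elevated corner; flat expectation normalized to the total.
pts = [(x, y) for x in range(5) for y in range(5)]
counts = [12 if (x <= 1 and y <= 1) else 4 for x, y in pts]
baseline = [sum(counts) / len(pts)] * len(pts)
llr, window = best_circular_cluster(pts, counts, baseline, [1.5])
```

The scan correctly centres the most likely cluster on the elevated corner.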
Analysis of the chaotic maps generating different statistical distributions
NASA Astrophysics Data System (ADS)
Lawnik, M.
2015-09-01
An analysis of chaotic maps enabling the generation of numbers from given statistical distributions is presented. The analyzed chaotic maps have the form xk+1 = F-1(U(F(xk))), where F is the cumulative distribution function, U is the skew tent map, and F-1 is the inverse function of F. The analysis is presented on the example of a chaotic map with the standard normal distribution, in view of its computational efficiency and accuracy. On the grounds of the conducted analysis, it should be noted that the method does not always allow generating values from the given distribution.
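The map is straightforward to implement for the standard normal case; a sketch with a bisection-based inverse CDF (the slope parameter a = 0.49 is an arbitrary illustrative choice, kept away from 0.5 so floating-point orbits do not collapse):

```python
import math

def phi(x):
    """Standard normal CDF F."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_inv(p):
    """Inverse of phi by bisection -- slow but dependency-free."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def skew_tent(u, a=0.49):
    """Skew tent map U on [0, 1]; it preserves the uniform distribution."""
    return u / a if u < a else (1.0 - u) / (1.0 - a)

def chaotic_normal_sequence(x0, n):
    """Iterate x_{k+1} = F^{-1}(U(F(x_k))) with F the standard normal CDF,
    so the orbit's marginal distribution is approximately standard normal."""
    out, x = [], x0
    for _ in range(n):
        x = phi_inv(skew_tent(phi(x)))
        out.append(x)
    return out

vals = chaotic_normal_sequence(0.3, 4000)
mean_val = sum(vals) / len(vals)
frac_within = sum(1 for v in vals if abs(v) < 1.0) / len(vals)
```

Because U preserves the uniform distribution, F(x_k) stays uniform and the x_k are (approximately) standard normal, as the empirical mean and the fraction within one standard deviation illustrate.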
3D statistical failure analysis of monolithic dental ceramic crowns.
Nasrin, Sadia; Katsube, Noriko; Seghi, Robert R; Rokhlin, Stanislav I
2016-07-05
For adhesively retained ceramic crowns of various types, it has been clinically observed that the most catastrophic failures initiate from the cement interface as a result of radial crack formation, as opposed to Hertzian contact stresses originating on the occlusal surface. In this work, a 3D failure prognosis model is developed for interface-initiated failures of monolithic ceramic crowns. The surface flaw distribution parameters are determined by biaxial flexural tests on ceramic plates, and the point-to-point variations of the multi-axial stress state at the intaglio surface are obtained by finite element stress analysis. They are combined on the basis of a fracture-mechanics-based statistical failure probability model to predict the failure probability of a monolithic crown subjected to a single-cycle indentation load. The proposed method is verified against a prior 2D axisymmetric model and experimental data. Under conditions where the crowns are completely bonded to the tooth substrate, both high flexural stress and high interfacial shear stress are shown to occur in the wall region, where the crown thickness is relatively thin, while a high interfacial normal tensile stress distribution is observed at the margin region. A significant impact of reduced cement modulus on these stress states is shown. While the analyses are limited to single-cycle load-to-failure tests, high interfacial normal tensile stress or high interfacial shear stress may contribute to degradation of the cement bond between ceramic and dentin. In addition, the crown failure probability is shown to be controlled by high flexural stress concentrations over a small area, and the proposed method may be of value in detecting initial crown design errors.
Surface analysis and evaluation of progressive addition lens
NASA Astrophysics Data System (ADS)
Li, Zhiying; Li, Dan
2016-10-01
The progressive addition lens is used increasingly widely because of its advantage of meeting the requirements of distance and near vision at the same time. Starting from the surface equations of the progressive addition lens, combined with the evaluation method for spherical power and cylinder power, the relationship equations between surface sag and optical power distribution are derived. According to the requirements on the difference between actual and nominal optical power in the Chinese National Standard, the tolerance analysis and evaluation of a prototype progressive addition surface with an addition of 2.5 m-1 (7.5 m-1, 10 m-1) is given in detail. The tolerance analysis method provides a theoretical basis for lens processing control accuracy, and the processing feasibility of the lens is evaluated much more reasonably.
Wavelet analysis in ecology and epidemiology: impact of statistical tests
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-01-01
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise that are nevertheless the methods currently used in the wide majority of wavelets applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used such as the hidden Markov model algorithm and the ‘beta-surrogate’ method. PMID:24284892
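One of the criticized but widespread null models, red noise, is easy to sketch: fit an AR(1) process to the series and simulate surrogates sharing its mean, variance, and lag-1 autocorrelation (illustrative only; the paper argues that data-driven resampling such as the hidden Markov model algorithm is often preferable):

```python
import math
import random

def ar1_params(series):
    """Lag-1 autocorrelation and innovation s.d. estimated from a series."""
    n = len(series)
    mu = sum(series) / n
    c0 = sum((v - mu) ** 2 for v in series) / n
    c1 = sum((series[i] - mu) * (series[i + 1] - mu) for i in range(n - 1)) / n
    alpha = c1 / c0
    sigma = math.sqrt(c0 * (1.0 - alpha ** 2))
    return alpha, sigma

def red_noise_surrogate(series, rng):
    """One AR(1) ('red noise') surrogate sharing the series' mean,
    variance and lag-1 autocorrelation."""
    alpha, sigma = ar1_params(series)
    mu = sum(series) / len(series)
    x, out = 0.0, []
    for _ in range(len(series)):
        x = alpha * x + rng.gauss(0.0, sigma)
        out.append(mu + x)
    return out

# Build a known AR(1) series, then check that a surrogate mimics its memory.
rng = random.Random(42)
x, series = 0.0, []
for _ in range(3000):
    x = 0.7 * x + rng.gauss(0.0, 1.0)
    series.append(x)
alpha_est, _ = ar1_params(series)
alpha_sur, _ = ar1_params(red_noise_surrogate(series, rng))
```

An ensemble of such surrogates yields the null distribution against which the observed wavelet quantities are compared.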
Computed Tomography Inspection and Analysis for Additive Manufacturing Components
NASA Technical Reports Server (NTRS)
Beshears, Ronald D.
2016-01-01
Computed tomography (CT) inspection was performed on test articles additively manufactured from metallic materials. Metallic AM and machined wrought alloy test articles with programmed flaws were inspected using a 2MeV linear accelerator based CT system. Performance of CT inspection on identically configured wrought and AM components and programmed flaws was assessed using standard image analysis techniques to determine the impact of additive manufacturing on inspectability of objects with complex geometries.
Aspects of design and statistical analysis in the Comet assay.
Wiklund, Stig Johan; Agurell, Eva
2003-03-01
Some aspects of the statistical design and analysis of the Comet (single cell gel electrophoresis) assay have been evaluated by means of a simulation study. The tail length and tail moment were selected for the quantification of DNA migration. Results from the simulation study showed that the choice of measure to summarize the cells on each slide is extremely important in order to facilitate an efficient analysis. For tail moment, the mean of log transformed data is clearly superior to the other evaluated measures, whereas using the mean of raw data without transformation can lead to very inefficient analyses. The 90th percentile, capturing the upper tail of the distribution, performs well for the tail length, with a slight improvement obtained by applying a log transformation prior to calculations. Furthermore, the simulation study has been used to assess the appropriateness of some models for statistical analysis and to address the issue of design (i.e. number of cultures or animals in each group, number of slides per animal/culture and number of cells scored per slide). Combining the results from the simulations with practical experience from the pharmaceutical industry, we conclude the paper by providing concise recommendations regarding the design and statistical analysis in the Comet assay.
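The recommended slide-level summaries are simple to compute; a sketch (the offset added before the log transform is a hypothetical choice to handle zero tail moments, not a value from the paper):

```python
import math

def slide_summary(tail_moments, offset=1.0):
    """Slide-level summary recommended for tail moment: mean of
    log-transformed cell values.  The offset guarding against zeros is a
    hypothetical choice, not a value from the paper."""
    return sum(math.log(tm + offset) for tm in tail_moments) / len(tail_moments)

def percentile90(values):
    """Nearest-rank 90th percentile, the summary that performed well for
    tail length."""
    s = sorted(values)
    return s[max(0, math.ceil(0.9 * len(s)) - 1)]

slide = [0.2, 1.5, 3.0, 0.0, 7.2]   # hypothetical tail moments for one slide
summary = slide_summary(slide)
p90 = percentile90(list(range(1, 11)))
```

The per-slide summaries, rather than individual cells, then enter the between-group statistical model.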
Revisiting the statistical analysis of pyroclast density and porosity data
NASA Astrophysics Data System (ADS)
Bernard, B.; Kueppers, U.; Ortiz, H.
2015-03-01
Explosive volcanic eruptions are commonly characterized based on a thorough analysis of the generated deposits. Amongst other characteristics in physical volcanology, the density and porosity of juvenile clasts are some of the most frequently used properties to constrain eruptive dynamics. In this study, we evaluate the sensitivity of density and porosity data and introduce a weighting parameter to correct issues raised by the use of frequency analysis. Results of textural investigation can be biased by clast selection. Using statistical tools as presented here, the meaningfulness of a conclusion can easily be checked for any dataset. This is necessary to define whether or not a sample has met the requirements for statistical relevance, i.e. whether a dataset is large enough to allow for reproducible results. Graphical statistics are used to describe density and porosity distributions, similar to those used for grain-size analysis. This approach helps with the interpretation of volcanic deposits. To illustrate this methodology we chose two large datasets: (1) directed blast deposits of the 3640-3510 BC eruption of Chachimbiro volcano (Ecuador) and (2) block-and-ash-flow deposits of the 1990-1995 eruption of Unzen volcano (Japan). We propose adding this analysis to future investigations to check the objectivity of results achieved by different working groups and to guarantee the meaningfulness of the interpretation.
STATISTICAL ANALYSIS OF THE HEAVY NEUTRAL ATOMS MEASURED BY IBEX
Park, Jeewoo; Kucharek, Harald; Möbius, Eberhard; Galli, André; Livadiotis, George; Fuselier, Steve A.; McComas, David J.
2015-10-15
We investigate the directional distribution of heavy neutral atoms in the heliosphere by using heavy neutral maps generated with the IBEX-Lo instrument over three years from 2009 to 2011. The interstellar neutral (ISN) O and Ne gas flow was found in the first-year heavy neutral map at 601 eV and its flow direction and temperature were studied. However, due to the low counting statistics, researchers have not treated the full sky maps in detail. The main goal of this study is to evaluate the statistical significance of each pixel in the heavy neutral maps to get a better understanding of the directional distribution of heavy neutral atoms in the heliosphere. Here, we examine three statistical analysis methods: the signal-to-noise filter, the confidence limit method, and the cluster analysis method. These methods allow us to exclude background from areas where the heavy neutral signal is statistically significant. These methods also allow the consistent detection of heavy neutral atom structures. The main emission feature expands toward lower longitude and higher latitude from the observational peak of the ISN O and Ne gas flow. We call this emission the extended tail. It may be an imprint of the secondary oxygen atoms generated by charge exchange between ISN hydrogen atoms and oxygen ions in the outer heliosheath.
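Of the three methods, the signal-to-noise filter is the simplest to illustrate: keep a pixel when its background-subtracted signal exceeds a chosen multiple of the estimated Poisson noise (the threshold and noise model below are assumptions for the sketch, not the paper's exact choices):

```python
import math

def snr_mask(counts, background, threshold=3.0):
    """Simple signal-to-noise filter for sky-map pixels: keep a pixel when
    (counts - background) exceeds `threshold` Poisson standard deviations,
    estimated as sqrt(counts).  Returns a boolean mask."""
    mask = []
    for c, b in zip(counts, background):
        noise = math.sqrt(c) if c > 0 else 1.0
        mask.append((c - b) / noise >= threshold)
    return mask

pixels = [4, 100, 30, 9]        # raw counts per pixel (toy numbers)
bkg = [4.0, 40.0, 28.0, 3.0]    # estimated background per pixel
keep = snr_mask(pixels, bkg)
```

Only the pixel with a clear excess over background survives the mask; cluster analysis would then group surviving pixels into coherent structures.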
A statistical analysis of mesoscale rainfall as a random cascade
NASA Technical Reports Server (NTRS)
Gupta, Vijay K.; Waymire, Edward C.
1993-01-01
The paper considers the random cascade theory for spatial rainfall. Particular attention was given to the following four areas: (1) the relationship of the random cascade theory of rainfall to the simple scaling and the hierarchical cluster-point-process theories, (2) the mathematical foundations for some of the formalisms commonly applied in the development of statistical cascade theory, (3) the empirical evidence for a random cascade theory of rainfall, and (4) the way of using data for making estimates of parameters and for making statistical inference within this theoretical framework. An analysis of space-time rainfall data is presented. Cascade simulations are carried out to provide a comparison with methods of analysis that are applied to the rainfall data.
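A discrete multiplicative cascade of the kind simulated for such comparisons takes only a few lines: each level halves every cell and multiplies each child by an i.i.d. mean-one generator (a log-normal generator is used below as one common illustrative choice; the paper's simulations may use a different generator):

```python
import math
import random

def random_cascade(levels, sigma, rng):
    """Discrete multiplicative cascade on [0, 1]: each level splits every
    cell in two; each child receives half the parent mass times an i.i.d.
    log-normal generator W = exp(sigma*Z - sigma^2/2) with E[W] = 1, so
    total mass is conserved in expectation."""
    cells = [1.0]
    for _ in range(levels):
        nxt = []
        for mass in cells:
            for _ in range(2):
                w = math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma * sigma)
                nxt.append(0.5 * mass * w)
        cells = nxt
    return cells

rng = random.Random(3)
field = random_cascade(10, 0.3, rng)   # 2**10 cells of simulated "rain"
total = sum(field)
```

Increasing sigma makes the simulated field more intermittent, mimicking the spatial clustering of rainfall intensity.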
Three-parameter probability distribution density for statistical image analysis
NASA Astrophysics Data System (ADS)
Schau, H. C.
1980-01-01
Statistical analysis of 2-D image data or data gathered from a scanning radiometer requires that both the non-Gaussian nature and finite sample size of the process be considered. To aid the statistical analysis of this data, a higher moment description density function has been defined, and parameters have been identified with the estimated moments of the data. It is shown that the first two moments may be computed from a knowledge of the Wiener spectrum, whereas all higher moments require the complex spatial frequency spectrum. Parameter identification is carried out for a three-parameter density function and applied to a scene in the IR region, 8-14 microns. Results indicate that a three-parameter distribution density generally provides different probabilities than does a two-parameter Gaussian description if maximum entropy (minimum bias) forms are sought.
Collagen morphology and texture analysis: from statistics to classification
Mostaço-Guidolin, Leila B.; Ko, Alex C.-T.; Wang, Fei; Xiang, Bo; Hewko, Mark; Tian, Ganghong; Major, Arkady; Shiomi, Masashi; Sowa, Michael G.
2013-01-01
In this study we present an image analysis methodology capable of quantifying morphological changes in tissue collagen fibril organization caused by pathological conditions. Texture analysis based on first-order statistics (FOS) and second-order statistics such as gray level co-occurrence matrix (GLCM) was explored to extract second-harmonic generation (SHG) image features that are associated with the structural and biochemical changes of tissue collagen networks. Based on these extracted quantitative parameters, multi-group classification of SHG images was performed. With combined FOS and GLCM texture values, we achieved reliable classification of SHG collagen images acquired from atherosclerosis arteries with >90% accuracy, sensitivity and specificity. The proposed methodology can be applied to a wide range of conditions involving collagen re-modeling, such as in skin disorders, different types of fibrosis and muscular-skeletal diseases affecting ligaments and cartilage. PMID:23846580
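First-order statistics and GLCM features of the kind used for classification can be sketched in plain Python (illustrative; only three of the many standard GLCM features are shown, for a single pixel offset):

```python
def first_order_stats(img):
    """First-order statistics (FOS): mean and variance of gray levels."""
    vals = [v for row in img for v in row]
    mu = sum(vals) / len(vals)
    return mu, sum((v - mu) ** 2 for v in vals) / len(vals)

def glcm(img, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    h, w = len(img), len(img[0])
    m = [[0.0] * levels for _ in range(levels)]
    pairs = 0
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                m[img[y][x]][img[y2][x2]] += 1.0
                pairs += 1
    return [[v / pairs for v in row] for row in m]

def glcm_features(m):
    """Three classical GLCM texture features: contrast, energy, homogeneity."""
    idx = range(len(m))
    contrast = sum(m[i][j] * (i - j) ** 2 for i in idx for j in idx)
    energy = sum(v * v for row in m for v in row)
    homogeneity = sum(m[i][j] / (1.0 + abs(i - j)) for i in idx for j in idx)
    return contrast, energy, homogeneity

flat = [[0] * 4 for _ in range(4)]                             # featureless patch
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]  # busy texture
flat_feats = glcm_features(glcm(flat, 2))
checker_feats = glcm_features(glcm(checker, 2))
fos_checker = first_order_stats(checker)
```

A smooth patch yields zero contrast and maximal energy, while a busy texture does the opposite; feature vectors of this kind feed the multi-group classifier.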
Statistical Analysis of speckle noise reduction techniques for echocardiographic Images
NASA Astrophysics Data System (ADS)
Saini, Kalpana; Dewal, M. L.; Rohit, Manojkumar
2011-12-01
Echocardiography is a safe, easy, and fast technology for diagnosing cardiac diseases. As in other ultrasound images, these images also contain speckle noise. In some cases this speckle noise is useful, such as in motion detection, but in general noise removal is required for better analysis of the image and proper diagnosis. Different adaptive and anisotropic filters are included for statistical analysis. Statistical parameters such as signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR), and root mean square error (RMSE) are calculated for performance measurement. One more important aspect is that there may be blurring during speckle noise removal, so it is preferred that the filter be able to enhance edges during noise removal.
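The quoted performance measures follow directly from their definitions; a small sketch on flattened pixel lists (the peak value 255 is assumed for 8-bit images):

```python
import math

def rmse(ref, test):
    """Root mean square error between reference and filtered pixels."""
    n = len(ref)
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(ref, test)) / n)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the despeckled
    image is closer to the reference."""
    e = rmse(ref, test)
    return float('inf') if e == 0 else 20.0 * math.log10(peak / e)

def snr(ref, test):
    """Classical SNR in dB: signal power over error power."""
    sig = sum(r * r for r in ref)
    err = sum((r - t) ** 2 for r, t in zip(ref, test))
    return float('inf') if err == 0 else 10.0 * math.log10(sig / err)

ref = [100.0] * 4   # toy reference pixels
deg = [110.0] * 4   # toy filtered pixels, offset by 10 gray levels
r = rmse(ref, deg)
p = psnr(ref, deg)
s = snr(ref, deg)
```

Each candidate filter would be scored this way against a reference (or simulated noise-free) image.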
Statistical Signal Models and Algorithms for Image Analysis
1984-10-25
In this report, two-dimensional stochastic linear models are used in developing algorithms for image analysis such as classification, segmentation, and object detection in images characterized by textured backgrounds. These models generate two-dimensional random processes as outputs to which statistical inference procedures can naturally be applied. A common thread throughout our algorithms is the interpretation of the inference procedures in terms of linear prediction.
Statistical Analysis of the Exchange Rate of Bitcoin
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
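The log-returns analyzed in the paper are differences of log prices. Here is a hedged sketch, with invented prices, of computing them and fitting the simplest of the candidate distributions (a normal) by maximum likelihood; the paper's best-fitting generalized hyperbolic distribution requires a specialized library, but the likelihood-based comparison idea is the same.

```python
import math

# Sketch: log-returns from a BTC/USD price series (prices are made up), plus
# a Gaussian maximum-likelihood fit and its AIC, the kind of criterion used
# to rank candidate distributions.

prices = [430.0, 433.1, 428.9, 441.2, 437.5, 450.0]
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

n = len(log_returns)
mu = sum(log_returns) / n
sigma2 = sum((r - mu) ** 2 for r in log_returns) / n  # MLE variance

# Gaussian log-likelihood at the MLE; AIC = 2k - 2*loglik with k = 2 params
loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
aic = 2 * 2 - 2 * loglik
print(n, round(mu, 5))
```

Fitting each of the fifteen candidate distributions this way and comparing AIC (or similar criteria) would reproduce the model-selection step the abstract describes.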
The statistical analysis of multivariate serological frequency data.
Reyment, Richard A
2005-11-01
Data occurring in the form of frequencies are common in genetics; examples are provided by the ABO group, the Rhesus group, and also DNA data. The statistical analysis of tables of frequencies is carried out using the available methods of multivariate analysis, usually with three principal aims. One of these is to seek meaningful relationships between the components of a data set, the second is to examine relationships between the populations from which the data were obtained, and the third is to bring about a reduction in dimensionality. This latter aim is usually realized by means of bivariate scatter diagrams using scores computed from a multivariate analysis. The multivariate statistical analysis of tables of frequencies cannot safely be carried out by standard multivariate procedures, because frequencies represent compositions and are therefore embedded in simplex space, a subspace of full space. Appropriate procedures for simplex space are compared and contrasted with simple standard methods of multivariate analysis ("raw" principal component analysis). The study shows that the differences between a log-ratio model and a simple logarithmic transformation of proportions may not be very great, particularly as regards graphical ordinations, but important discrepancies do occur. The divergences between logarithmically based analyses and raw data are, however, great. Published data on Rhesus alleles observed for Italian populations are used to exemplify the subject.
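The simplex-space issue can be illustrated with the centered log-ratio (clr) transform, one standard log-ratio device for compositional data (the specific transform Reyment compares may differ, and the allele proportions below are invented):

```python
import math

# Sketch of the centered log-ratio (clr) transform for compositional data:
# clr(x)_i = log(x_i / geometric_mean(x)). Unlike a raw log transform, clr
# coordinates are invariant to the arbitrary scale of a composition.

def clr(composition):
    logs = [math.log(x) for x in composition]
    g = sum(logs) / len(logs)  # log of the geometric mean
    return [l - g for l in logs]

freqs = [0.42, 0.09, 0.49]  # hypothetical allele proportions
y = clr(freqs)
print(round(sum(y), 12))  # clr coordinates always sum to zero
```

Multivariate methods such as principal component analysis are then applied to the clr coordinates rather than to the raw proportions.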
Defects and statistical degradation analysis of photovoltaic power plants
NASA Astrophysics Data System (ADS)
Sundarajan, Prasanna
As photovoltaic (PV) power plants age in the field, the PV modules degrade and generate visible and invisible defects. A defect and statistical degradation rate analysis of PV power plants is presented in this two-part thesis. The first part deals with the defect analysis and the second part deals with the statistical degradation rate analysis. In the first part, a detailed analysis of the performance and financial risk related to each defect found in multiple PV power plants across various climatic regions of the USA is presented by assigning a risk priority number (RPN). The RPN for all the defects in each PV plant is determined based on two databases: a degradation rate database and a defect rate database. In this analysis it is determined that the RPN for each plant is dictated by the technology type (crystalline silicon or thin-film), climate, and age. PV modules aged between 3 and 19 years in four different climates (hot-dry, hot-humid, cold-dry, and temperate) are investigated in this study. In the second part, a statistical degradation analysis is performed to determine whether the degradation rates are linear for the crystalline silicon technologies in power plants exposed to a hot-dry climate. This linearity analysis is performed using data obtained through two methods: a current-voltage method and a metered kWh method. For the current-voltage method, the annual power degradation data of hundreds of individual modules in six crystalline silicon power plants of different ages are used. For the metered kWh method, a residual plot analysis using Winters' statistical method is performed for two crystalline silicon plants of different ages. The metered kWh data typically consist of signal and noise components. Smoothers remove the noise component from the data by averaging the current and previous observations. Once this is done, a residual plot analysis of the error component is performed.
Statistical analysis of the seasonal variation in demographic data.
Fellman, J; Eriksson, A W
2000-10-01
There has been little agreement as to whether reproduction or similar demographic events occur seasonally and, especially, whether there is any universal seasonal pattern. One reason is that the seasonal pattern may vary in different populations and at different times. Another reason is that different statistical methods have been used. Every statistical model is based on certain assumed conditions and hence is designed to identify specific components of the seasonal pattern. Therefore, the statistical method applied should be chosen with due consideration. In this study we present, develop, and compare different statistical methods for the study of seasonal variation. Furthermore, we stress that the methods are applicable for the analysis of many kinds of demographic data. The first approaches in the literature were based on monthly frequencies, on the simple sine curve, and on the approximation that the months are of equal length. Later, "the population at risk" and the fact that the months have different lengths were considered. Under these later assumptions the targets of the statistical analyses are the rates. In this study we present and generalize the earlier models. Furthermore, we use trigonometric regression methods. The trigonometric regression model in its simplest form corresponds to the sine curve. We compare the regression methods with the earlier models and reanalyze some data. Our results show that models for rates eliminate the disturbing effects of the varying length of the months, including the effect of leap years, and of the seasonal pattern of the population at risk. Therefore, they give the purest analysis of the seasonal pattern of the demographic data in question, e.g., rates of general births, twin maternities, neural tube defects, and mortality. Our main finding is that the trigonometric regression methods are more flexible and easier to handle than the earlier methods, particularly when the data differ from the simple sine curve.
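The trigonometric regression the authors describe reduces, in its simplest form, to fitting a sine curve. The sketch below assumes twelve equally spaced months, in which case the trigonometric basis is orthogonal and the least-squares coefficients have closed forms; the monthly rates are synthetic, and a real analysis would use rates corrected for month length and population at risk, as the abstract stresses.

```python
import math

# Fit y_t = a + b*cos(2*pi*t/n) + c*sin(2*pi*t/n) to n equally spaced
# monthly rates. Orthogonality of the basis gives closed-form estimates.

def fit_seasonal(rates):
    n = len(rates)
    a = sum(rates) / n
    b = 2 / n * sum(y * math.cos(2 * math.pi * t / n) for t, y in enumerate(rates))
    c = 2 / n * sum(y * math.sin(2 * math.pi * t / n) for t, y in enumerate(rates))
    amplitude = math.hypot(b, c)                       # size of seasonal swing
    peak_month = (math.atan2(c, b) * n / (2 * math.pi)) % n  # phase of the peak
    return a, amplitude, peak_month

# Synthetic rates peaking in month 2 with amplitude 3 around a mean of 10
rates = [10 + 3 * math.cos(2 * math.pi * (t - 2) / 12) for t in range(12)]
a, amp, peak = fit_seasonal(rates)
print(round(a, 6), round(amp, 6), round(peak, 6))  # → 10.0 3.0 2.0
```

Higher harmonics can be added in the same way when the seasonal pattern departs from a simple sine curve.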
Optimal Multicomponent Analysis Using the Generalized Standard Addition Method.
ERIC Educational Resources Information Center
Raymond, Margaret; And Others
1983-01-01
Describes an experiment on the simultaneous determination of chromium and magnesium by spectophotometry modified to include the Generalized Standard Addition Method computer program, a multivariate calibration method that provides optimal multicomponent analysis in the presence of interference and matrix effects. Provides instructions for…
Ambiguity and nonidentifiability in the statistical analysis of neural codes.
Amarasingham, Asohan; Geman, Stuart; Harrison, Matthew T
2015-05-19
Many experimental studies of neural coding rely on a statistical interpretation of the theoretical notion of the rate at which a neuron fires spikes. For example, neuroscientists often ask, "Does a population of neurons exhibit more synchronous spiking than one would expect from the covariability of their instantaneous firing rates?" For another example, "How much of a neuron's observed spiking variability is caused by the variability of its instantaneous firing rate, and how much is caused by spike timing variability?" However, a neuron's theoretical firing rate is not necessarily well-defined. Consequently, neuroscientific questions involving the theoretical firing rate do not have a meaning in isolation but can only be interpreted in light of additional statistical modeling choices. Ignoring this ambiguity can lead to inconsistent reasoning or wayward conclusions. We illustrate these issues with examples drawn from the neural-coding literature.
Spectral Analysis of B Stars: An Application of Bayesian Statistics
NASA Astrophysics Data System (ADS)
Mugnes, J.-M.; Robert, C.
2012-12-01
To better understand the processes involved in stellar physics, it is necessary to obtain accurate stellar parameters (effective temperature, surface gravity, abundances…). Spectral analysis is a powerful tool for investigating stars, but it is also vital to reduce uncertainties at a decent computational cost. Here we present a spectral analysis method based on a combination of Bayesian statistics and grids of synthetic spectra obtained with TLUSTY. This method simultaneously constrains the stellar parameters by using all the lines accessible in observed spectra and thus greatly reduces uncertainties and improves the overall spectrum fitting. Preliminary results are shown using spectra from the Observatoire du Mont-Mégantic.
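The grid-based Bayesian idea can be sketched as scoring each synthetic spectrum with a Gaussian likelihood against the observed spectrum and normalizing over the grid (flat prior). The grid points, toy "spectra", and noise level below are invented placeholders, not TLUSTY output.

```python
import math

# Sketch: posterior over a (Teff, logg) grid from a Gaussian likelihood.
# obs and the model "spectra" are short toy flux vectors.

obs = [1.0, 0.8, 0.9, 0.7]
grid = {  # hypothetical model spectra keyed by (Teff, logg)
    (20000, 3.5): [1.0, 0.82, 0.88, 0.71],
    (22000, 4.0): [0.9, 0.7, 0.95, 0.8],
}
sigma = 0.05  # assumed flux uncertainty

def log_like(model):
    return sum(-(o - m) ** 2 / (2 * sigma ** 2) for o, m in zip(obs, model))

logs = {k: log_like(v) for k, v in grid.items()}
shift = max(logs.values())                      # avoid underflow
post = {k: math.exp(l - shift) for k, l in logs.items()}
z = sum(post.values())
post = {k: p / z for k, p in post.items()}      # normalize to a posterior
best = max(post, key=post.get)
print(best)  # → (20000, 3.5)
```

Using all accessible lines simply extends the likelihood product over more wavelength points, which is how the method tightens the parameter constraints.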
Detailed Analysis of the Interoccurrence Time Statistics in Seismic Activity
NASA Astrophysics Data System (ADS)
Tanaka, Hiroki; Aizawa, Yoji
2017-02-01
The interoccurrence time statistics of seismicity is studied theoretically as well as numerically by taking into account the conditional probability and the correlations among many earthquakes at different magnitude levels. It is known that the interoccurrence time statistics is well approximated by the Weibull distribution, but more detailed information about the interoccurrence times can be obtained from the analysis of the conditional probability. First, we propose the Embedding Equation Theory (EET), in which the conditional probability is described by two kinds of correlation coefficients: one is the magnitude correlation and the other is the inter-event time correlation. Furthermore, the scaling law of each correlation coefficient is clearly determined from numerical data analysis carried out with the Preliminary Determination of Epicenters (PDE) Catalog and the Japan Meteorological Agency (JMA) Catalog. Second, the EET is examined to derive the magnitude dependence of the interoccurrence time statistics, and the multi-fractal relation is successfully formulated. Theoretically we cannot prove the universality of the multi-fractal relation in seismic activity; nevertheless, the theoretical results reproduce all the numerical data in our analysis well, and several common features, or invariant aspects, are clearly observed. In particular, in the case of stationary ensembles the multi-fractal relation seems to obey an invariant curve, and in the case of non-stationary (moving time) ensembles for the aftershock regime the multi-fractal relation seems to satisfy a certain invariant curve at any moving time. It is emphasized that the multi-fractal relation plays an important role in unifying the statistical laws of seismicity: the Gutenberg-Richter law and the Weibull distribution are unified in the multi-fractal relation, and some universality conjectures regarding seismicity are briefly discussed.
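As a concrete example of one statistical law mentioned above, the Gutenberg-Richter b-value can be estimated from a magnitude catalog with Aki's maximum-likelihood formula, b = log10(e) / (mean(M) - Mc). The magnitudes and completeness cutoff below are synthetic, and this estimator is standard seismology practice rather than part of the EET analysis itself.

```python
import math

# Aki's maximum-likelihood estimate of the Gutenberg-Richter b-value.
# mags is a toy catalog; mc is the assumed magnitude of completeness.

mags = [4.1, 4.3, 4.0, 5.2, 4.6, 4.4, 4.9, 4.2, 4.0, 4.3]
mc = 4.0
b = math.log10(math.e) / (sum(mags) / len(mags) - mc)
print(round(b, 2))  # → 1.09
```

Values near 1 are typical for regional catalogs, which is why the synthetic example is tuned to land there.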
The Effects of Statistical Analysis Software and Calculators on Statistics Achievement
ERIC Educational Resources Information Center
Christmann, Edwin P.
2009-01-01
This study compared the effects of microcomputer-based statistical software and hand-held calculators on the statistics achievement of university males and females. The subjects, 73 graduate students enrolled in univariate statistics classes at a public comprehensive university, were randomly assigned to groups that used either microcomputer-based…
Turbo recognition: a statistical approach to layout analysis
NASA Astrophysics Data System (ADS)
Tokuyasu, Taku A.; Chou, Philip A.
2000-12-01
Turbo recognition (TR) is a communication theory approach to the analysis of rectangular layouts, in the spirit of Document Image Decoding. The TR algorithm, inspired by turbo decoding, is based on a generative model of image production in which two grammars are used simultaneously to describe structure in orthogonal (horizontal and vertical) directions. This enables TR to strictly embody non-local constraints that cannot be taken into account by local statistical methods. This basis in finite state grammars also allows TR to be quickly retargeted to new domains. We illustrate some of the capabilities of TR with two examples involving realistic images. While TR, like turbo decoding, is not guaranteed to recover the statistically optimal solution, we present an experiment that demonstrates its ability to produce optimal or near-optimal results on a simple yet nontrivial example, the recovery of a filled rectangle in the midst of noise. Unlike methods such as stochastic context-free grammars and exhaustive search, which are often intractable beyond small images, turbo recognition scales linearly with image size, suggesting TR as an efficient yet near-optimal approach to statistical layout analysis.
Agriculture, population growth, and statistical analysis of the radiocarbon record.
Zahid, H Jabran; Robinson, Erick; Kelly, Robert L
2016-01-26
The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.
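As a quick arithmetic check on the reported 0.04% long-term annual rate, the implied population doubling time under exponential growth is ln(2)/rate, on the order of seventeen centuries:

```python
import math

# Doubling time implied by a 0.04% annual growth rate: ln(2) / rate.
rate = 0.0004
doubling_years = math.log(2) / rate
print(round(doubling_years))  # → 1733
```

This slow doubling, shared by farming and foraging societies in the study, is what motivates the authors' appeal to global rather than local regulating factors.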
Introduction to statistical methods for microRNA analysis.
Zararsiz, Gökmen; Coşgun, Erdal
2014-01-01
MicroRNA profiling is an important task for investigating miRNA functions, and recent technologies such as microarrays, single nucleotide polymorphism (SNP) assays, quantitative real-time PCR (qPCR), and next-generation sequencing (NGS) have played a major role in miRNA analysis. In this chapter, we give an overview of statistical approaches for gene expression, SNP, qPCR, and NGS data, including preliminary analyses (pre-processing, differential expression, classification, clustering, exploration of interactions, and the use of ontologies). Our goal is to outline the key approaches with a brief discussion of problems and avenues for their solutions, and to give some examples of real-world use. Readers will be able to understand the different data formats (expression levels, sequences, etc.) and to choose appropriate methods for their own research and applications. We also give brief notes on the most popular tools and packages for statistical genetic analysis. This chapter aims to serve as a brief introduction to different kinds of statistical methods and also provides an extensive source of references.
A global analysis of soil acidification caused by nitrogen addition
NASA Astrophysics Data System (ADS)
Tian, Dashuan; Niu, Shuli
2015-02-01
Nitrogen (N) deposition-induced soil acidification has become a global problem. However, the response patterns of soil acidification to N addition and the underlying mechanisms remain far from clear. Here, we conducted a meta-analysis of 106 studies to reveal global patterns of soil acidification in response to N addition. We found that N addition significantly reduced soil pH by 0.26 on average globally. However, the responses of soil pH varied with ecosystem type, N addition rate, N fertilization form, and experimental duration. Soil pH decreased most in grassland, whereas no significant decrease was observed in boreal forest. Soil pH decreased linearly with N addition rate. Addition of urea and NH4NO3 contributed more to soil acidification than NH4-form fertilizer. When the experimental duration was longer than 20 years, the effects of N addition on soil acidification diminished. Environmental factors such as initial soil pH, soil carbon and nitrogen content, precipitation, and temperature all influenced the responses of soil pH. Base cations of Ca2+, Mg2+, and K+ were critically important in buffering against N-induced soil acidification at the early stage. However, N addition has shifted global soils into the Al3+ buffering phase. Overall, this study indicates that acidification in global soils is very sensitive to N deposition, and that this sensitivity is greatly modified by biotic and abiotic factors. Global soils are now at a buffering transition from base cations (Ca2+, Mg2+, and K+) to non-base cations (Mn2+ and Al3+). This calls attention to the depletion of base cations and the toxic impact of non-base cations in terrestrial ecosystems subject to N deposition.
The NIRS Analysis Package: noise reduction and statistical inference.
Fekete, Tomer; Rubin, Denis; Carlson, Joshua M; Mujica-Parodi, Lilianne R
2011-01-01
Near infrared spectroscopy (NIRS) is a non-invasive optical imaging technique that can be used to measure cortical hemodynamic responses to specific stimuli or tasks. While analyses of NIRS data are normally adapted from established fMRI techniques, there are nevertheless substantial differences between the two modalities. Here, we investigate the impact of NIRS-specific noise; e.g., systemic (physiological), motion-related artifacts, and serial autocorrelations, upon the validity of statistical inference within the framework of the general linear model. We present a comprehensive framework for noise reduction and statistical inference, which is custom-tailored to the noise characteristics of NIRS. These methods have been implemented in a public domain Matlab toolbox, the NIRS Analysis Package (NAP). Finally, we validate NAP using both simulated and actual data, showing marked improvement in the detection power and reliability of NIRS.
A Statistical Analysis of Lunisolar-Earthquake Connections
NASA Astrophysics Data System (ADS)
Rüegg, Christian Michael-André
2012-11-01
Despite over a century of study, the relationship between lunar cycles and earthquakes remains controversial and difficult to quantitatively investigate. Perhaps as a consequence, major earthquakes around the globe are frequently followed by "prediction claims" based on lunar cycles that generate media furore and pressure scientists to provide resolute answers. The 2010-2011 Canterbury earthquakes in New Zealand were no exception; significant media attention was given to lunar-derived earthquake predictions by non-scientists, even though the predictions were merely opinions and were not based on any statistically robust temporal or causal relationships. This thesis provides a framework for studying lunisolar earthquake temporal relationships by developing replicable statistical methodology based on peer-reviewed literature. Notable in the methodology are a high-accuracy ephemeris, called ECLPSE, designed specifically by the author for use on earthquake catalogs, and a model for performing phase angle analysis.
Spatial statistical analysis of tree deaths using airborne digital imagery
NASA Astrophysics Data System (ADS)
Chang, Ya-Mei; Baddeley, Adrian; Wallace, Jeremy; Canci, Michael
2013-04-01
High resolution digital airborne imagery offers unprecedented opportunities for observation and monitoring of vegetation, providing the potential to identify, locate and track individual vegetation objects over time. Analytical tools are required to quantify relevant information. In this paper, locations of trees over a large area of native woodland vegetation were identified using morphological image analysis techniques. Methods of spatial point process statistics were then applied to estimate the spatially-varying tree death risk, and to show that it is significantly non-uniform. [Tree deaths over the area were detected in our previous work (Wallace et al., 2008).] The study area is a major source of ground water for the city of Perth, and the work was motivated by the need to understand and quantify vegetation changes in the context of water extraction and drying climate. The influence of hydrological variables on tree death risk was investigated using spatial statistics (graphical exploratory methods, spatial point pattern modelling and diagnostics).
Statistical analysis of effective singular values in matrix rank determination
NASA Technical Reports Server (NTRS)
Konstantinides, Konstantinos; Yao, Kung
1988-01-01
A major problem in using SVD (singular-value decomposition) as a tool for determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values. To this end, confidence regions are derived for the perturbed singular values of matrices with noisy observation data. The analysis is based on the theory of perturbations of singular values and statistical significance testing. Threshold bounds for perturbations due to finite-precision and i.i.d. random models are evaluated. In the random models, the threshold bounds depend on the dimension of the matrix, the noise variance, and a predefined statistical level of significance. Results are applied to the problem of determining the effective order of a linear autoregressive system from the approximate rank of a sample autocorrelation matrix. Various numerical examples illustrating the usefulness of these bounds and comparisons to other previously known approaches are given.
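The thresholding idea can be sketched as follows: singular values below a noise-dependent bound are treated as effectively zero. The threshold form used here (noise standard deviation times the square root of the larger matrix dimension, scaled by a significance factor) is a common generic heuristic, not the paper's exact bound, and the singular values are invented.

```python
import math

# Effective rank by thresholding singular values against a noise bound.
# z plays the role of a significance-level multiplier.

def effective_rank(singular_values, noise_std, m, n, z=3.0):
    tau = z * noise_std * math.sqrt(max(m, n))
    return sum(1 for s in singular_values if s > tau)

svals = [12.4, 8.7, 5.1, 0.09, 0.04]  # e.g. from a noisy 100x5 data matrix
print(effective_rank(svals, noise_std=0.01, m=100, n=5))  # → 3
```

Applied to a sample autocorrelation matrix, the count of singular values surviving the threshold estimates the autoregressive model order.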
Statistical analysis of subjective preferences for video enhancement
NASA Astrophysics Data System (ADS)
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
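Pairwise preference counts of the kind described can be placed on a latent scale with a Bradley-Terry-style logistic model, which is closely related to Thurstone scaling and to the logistic-regression route the abstract mentions. The win counts below are invented, and the fit uses plain gradient ascent as a sketch rather than the authors' analysis.

```python
import math

# wins[i][j] = number of times enhancement level i was preferred over j
wins = [[0, 8, 9],
        [2, 0, 7],
        [1, 3, 0]]

def fit_bt(wins, iters=2000, lr=0.05):
    """Fit Bradley-Terry scores by gradient ascent on the log-likelihood."""
    n = len(wins)
    s = [0.0] * n  # latent preference scores
    for _ in range(iters):
        grad = [0.0] * n
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                total = wins[i][j] + wins[j][i]
                p = 1 / (1 + math.exp(s[j] - s[i]))  # P(i preferred over j)
                grad[i] += wins[i][j] - total * p
        s = [si + lr * g for si, g in zip(s, grad)]
        mean = sum(s) / n
        s = [si - mean for si in s]  # zero-mean for identifiability
    return s

scores = fit_bt(wins)
print(scores[0] > scores[1] > scores[2])  # → True
```

Standard errors on the fitted scores (e.g. from the logistic-regression formulation) are what supply the statistical significance that raw Thurstone scaling lacks.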
Statistical Analysis of Human Blood Cytometries: Potential Donors and Patients
NASA Astrophysics Data System (ADS)
Bernal-Alvarado, J.; Segovia-Olvera, P.; Mancilla-Escobar, B. E.; Palomares, P.
2004-09-01
Histograms of cell volume from human blood present valuable information for clinical evaluation. Measurements can be performed with automatic equipment and a graphical presentation of the data is available; nevertheless, a statistical and mathematical analysis of the cell volume distribution could also be useful for medical interpretation, since the numerical parameters characterizing the histograms may be correlated with healthy and patient populations. In this work, a statistical exercise was performed to find the most suitable model for fitting the cell volume histograms. Several trial functions were tested and their parameters were tabulated. Healthy people exhibited an average cell volume of 85 femtoliters while patients had 95 femtoliters. White blood cells presented small variation, and platelets preserved their average for both populations.
Statistical analysis of the particulation of shaped charge jets
Minich, R. W.; Baker, E. L.; Schwartz, A. J.
1999-08-12
A statistical analysis of shaped charge jet break-up was carried out in order to investigate the role of nonlinear instabilities leading to the particulation of the jet. Statistical methods generally used for studying fluctuations in nonlinear dynamical systems are applied to experimentally measured velocities of the individual particles. In particular, we present results suggesting non-Gaussian behavior in the interparticle velocity correlations, characteristic of nonlinear dynamical systems. Results are presented for two silver shaped charge jets that differ primarily in their material processing. We provide evidence that the particulation of a jet is not random, but has its origin in a deterministic dynamical process involving the nonlinear coupling of two oscillators, analogous to the underlying dynamics observed in Rayleigh-Bénard convection and modeled in the return map of Curry and Yorke.
STATISTICAL ANALYSIS OF TANK 19F FLOOR SAMPLE RESULTS
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by a UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
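The UCL95% described (a function of the number of samples, the average, and the standard deviation) matches the standard one-sided t-interval on a mean, sketched below. The concentration values are invented, and 2.015 is the tabulated one-sided 95% t critical value for 5 degrees of freedom (n = 6); the report's exact computation may differ.

```python
import math
import statistics

# One-sided upper 95% confidence limit on a mean concentration:
# UCL95 = mean + t_(0.95, n-1) * s / sqrt(n)

samples = [1.8, 2.1, 1.9, 2.4, 2.0, 2.2]  # hypothetical analyte conc., mg/kg
n = len(samples)
mean = statistics.mean(samples)
s = statistics.stdev(samples)          # sample standard deviation
t_crit = 2.015                         # one-sided 95%, df = 5, from a t table
ucl95 = mean + t_crit * s / math.sqrt(n)
print(round(ucl95, 3))
```

With all six scrape samples pooled, each analyte gets one such limit, which is what the report uses to bound the residual inventory.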
STATISTICAL ANALYSIS OF TANK 18F FLOOR SAMPLE RESULTS
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 18F as per the statistical sampling plan developed by Shine [1]. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL [2]. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results [3] to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 18F. The uncertainty is quantified in this report by an upper 95% confidence limit (UCL95%) on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
Statistical mechanics analysis of thresholding 1-bit compressed sensing
NASA Astrophysics Data System (ADS)
Xu, Yingying; Kabashima, Yoshiyuki
2016-08-01
The one-bit compressed sensing framework aims to reconstruct a sparse signal using only the sign information of its linear measurements. To compensate for the loss of scale information, past studies in the area have proposed recovering the signal by imposing an additional constraint on the l2-norm of the signal. Recently, an alternative strategy that captures scale information by introducing a threshold parameter into the quantization process was advanced. In this paper, we analyze the typical behavior of thresholding 1-bit compressed sensing using the replica method of statistical mechanics, to gain insight into properly setting the threshold value. Our result shows that fixing the threshold at a constant value yields better performance than varying it randomly, when the constant is optimally tuned. Unfortunately, the optimal threshold value depends on the statistical properties of the target signal, which may not be known in advance. To handle this inconvenience, we develop a heuristic that adaptively tunes the threshold parameter based on the frequency of positive (or negative) values in the binary outputs. Numerical experiments show that the heuristic exhibits satisfactory performance while incurring low computational cost.
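The adaptive idea in the abstract, tuning the threshold from the frequency of positive binary outputs, can be sketched as follows. This is an illustrative toy, not the paper's algorithm: the sparse signal, measurement matrix, and target fraction of 25% are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical setup: k-sparse signal x, m sign measurements sgn(Ax - tau)
n, m, k = 200, 400, 10
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(n)

def positive_fraction(tau):
    """Fraction of +1 outputs at threshold tau."""
    return float(np.mean(A @ x - tau > 0))

# adaptively tune tau by bisection so that ~25% of binary outputs are +1;
# positive_fraction is non-increasing in tau, so bisection applies
lo, hi = -5.0, 5.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if positive_fraction(mid) > 0.25:
        lo = mid
    else:
        hi = mid
tau = 0.5 * (lo + hi)
print(positive_fraction(tau))
```

Because the binary outputs reveal only how many measurements exceed the threshold, this frequency is the natural observable for recovering a usable scale.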
Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Background Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than that of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
[Kinetic analysis of additive effect on desulfurization activity].
Han, Kui-hua; Zhao, Jian-li; Lu, Chun-mei; Wang, Yong-zheng; Zhao, Gai-ju; Cheng, Shi-qing
2006-02-01
The additive effects of Al2O3, Fe2O3 and MnCO3 on CaO sulfation kinetics were investigated by thermogravimetric analysis and a modified grain model. The activation energy (Ea) and pre-exponential factor (k0) of the surface reaction, and the activation energy (Ep) and pre-exponential factor (D0) of product-layer diffusion, were calculated according to the model. Addition of MnCO3 enhances the initial reaction rate, product-layer diffusion, and the final CaO conversion of the sorbent; its mechanism is similar to that of Fe2O3. A method based on the isokinetic temperature Ts and activation energy alone cannot estimate the contribution of an additive to sulfation reactivity, whereas the rate constant of the surface reaction (k) and the effective diffusivity of the reactant in the product layer (Ds) under given experimental conditions can reflect the effect of additives on activation. Non-stoichiometric metal oxides may catalyze the surface reaction and promote reactant diffusivity in the product layer through crystal defects and the distinct diffusion of cations and anions. Given the mechanism and effect of additives on sulfation, the effective temperature, and the stoichiometric relation of the reaction, it should be possible to improve sorbent utilization by compounding several additives into the calcium-based sorbent.
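The rate constants named in the abstract follow the Arrhenius form k = k0 exp(-Ea/RT). A minimal sketch of how a lowered activation energy (one possible additive effect) raises the surface-reaction rate constant; all parameter values are hypothetical, not the paper's fitted values:

```python
import math

R = 8.314  # universal gas constant, J/(mol K)

def rate_constant(k0, Ea, T):
    """Arrhenius rate constant: k = k0 * exp(-Ea / (R * T))."""
    return k0 * math.exp(-Ea / (R * T))

# hypothetical parameters for a plain and an additive-doped sorbent
T = 1123.0  # ~850 degrees C, a typical sulfation temperature
k_plain = rate_constant(k0=2.0e3, Ea=6.0e4, T=T)
k_doped = rate_constant(k0=2.0e3, Ea=5.5e4, T=T)
print(k_doped > k_plain)  # lower Ea -> faster surface reaction
```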
Managing Performance Analysis with Dynamic Statistical Projection Pursuit
Vetter, J.S.; Reed, D.A.
2000-05-22
Computer systems and applications are growing more complex. Consequently, performance analysis has become more difficult due to the complex, transient interrelationships among runtime components. To diagnose these types of performance issues, developers must use detailed instrumentation to capture a large number of performance metrics. Unfortunately, this instrumentation may actually influence the performance analysis, leading the developer to an ambiguous conclusion. In this paper, we introduce a technique for focusing a performance analysis on interesting performance metrics. This technique, called dynamic statistical projection pursuit, identifies interesting performance metrics that the monitoring system should capture across some number of processors. By reducing the number of performance metrics, projection pursuit can limit the impact of instrumentation on the performance of the target system and can reduce the volume of performance data.
Statistical analysis of static shape control in space structures
NASA Technical Reports Server (NTRS)
Burdisso, Ricardo A.; Haftka, Raphael T.
1990-01-01
The article addresses the problem of efficient analysis of the statistics of initial and corrected shape distortions in space structures. Two approaches for improving efficiency are considered. One is an adjoint technique for calculating distortion shapes; the other is a modal expansion of distortion shapes in terms of pseudo-vibration modes. The two techniques are applied to the problem of optimizing actuator locations on a 55 m radiometer antenna. The adjoint analysis technique is used with a discrete-variable optimization method. The modal approximation technique is coupled with a standard conjugate-gradient continuous optimization method. The agreement between the two sets of results is good, validating both the approximate analysis and the optimality of the results.
Forensic discrimination of dyed hair color: II. Multivariate statistical analysis.
Barrett, Julie A; Siegel, Jay A; Goodpaster, John V
2011-01-01
This research is intended to assess the ability of UV-visible microspectrophotometry to successfully discriminate the color of dyed hair. Fifty-five red hair dyes were analyzed and evaluated using multivariate statistical techniques including agglomerative hierarchical clustering (AHC), principal component analysis (PCA), and discriminant analysis (DA). The spectra were grouped into three classes, which were visually consistent with different shades of red. A two-dimensional PCA observations plot was constructed, describing 78.6% of the overall variance. The wavelength regions associated with the absorbance of hair and dye were highly correlated. Principal components were selected to represent 95% of the overall variance for analysis with DA. A classification accuracy of 89% was observed for the comprehensive dye set, while external validation using 20 of the dyes resulted in a prediction accuracy of 75%. Significant color loss from successive washing of hair samples was estimated to occur within 3 weeks of dye application.
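The PCA-then-discriminant pipeline in the abstract can be sketched end to end. This toy uses synthetic "spectra" and a nearest-centroid rule as a simple stand-in for discriminant analysis; the class structure, noise level, and dimensions are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical UV-vis spectra: 3 shade classes, 20 samples each, 50 wavelengths
centers = rng.normal(size=(3, 50))
X = np.vstack([c + 0.1 * rng.normal(size=(20, 50)) for c in centers])
labels = np.repeat([0, 1, 2], 20)

# PCA via SVD; keep enough components to represent 95% of the variance,
# mirroring the component selection described in the abstract
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)
k = int(np.searchsorted(np.cumsum(explained), 0.95)) + 1
scores = Xc @ Vt[:k].T

# nearest-centroid classification in PC space (a simple stand-in for DA)
centroids = np.array([scores[labels == g].mean(axis=0) for g in range(3)])
pred = np.argmin(((scores[:, None, :] - centroids) ** 2).sum(-1), axis=1)
accuracy = (pred == labels).mean()
print(accuracy)
```

On real spectra one would also hold out samples for external validation, as the authors do with 20 of the dyes.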
Data and statistical methods for analysis of trends and patterns
Atwood, C.L.; Gentillon, C.D.; Wilson, G.E.
1992-11-01
This report summarizes topics considered at a working meeting on data and statistical methods for analysis of trends and patterns in US commercial nuclear power plants. This meeting was sponsored by the Office of Analysis and Evaluation of Operational Data (AEOD) of the Nuclear Regulatory Commission (NRC). Three data sets are briefly described: Nuclear Plant Reliability Data System (NPRDS), Licensee Event Report (LER) data, and Performance Indicator data. Two types of study are emphasized: screening studies, to see if any trends or patterns appear to be present; and detailed studies, which are more concerned with checking the analysis assumptions, modeling any patterns that are present, and searching for causes. A prescription is given for a screening study, and ideas are suggested for a detailed study, when the data take any of three forms: counts of events per time, counts of events per demand, and non-event data.
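For the "counts of events per time" case, one standard screening statistic (not necessarily the report's prescription) is the Laplace trend test, which compares the mean event time to the midpoint of the observation window. A minimal sketch with simulated event times:

```python
import math
import random

def laplace_trend(times, T):
    """Laplace test statistic for trend in event times on (0, T):
    approximately N(0, 1) under a constant-rate Poisson process; large
    positive values indicate an increasing event rate."""
    n = len(times)
    return (sum(times) / n - T / 2.0) / (T * math.sqrt(1.0 / (12.0 * n)))

random.seed(5)
T = 100.0
# hypothetical event times that cluster late in the window (rate ~ t)
times = sorted(T * math.sqrt(random.random()) for _ in range(60))
print(laplace_trend(times, T))
```

A value well outside roughly +/-2 would flag the data for one of the detailed follow-up studies the report describes.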
Error Analysis of Terrestrial Laser Scanning Data by Means of Spherical Statistics and 3D Graphs
Cuartero, Aurora; Armesto, Julia; Rodríguez, Pablo G.; Arias, Pedro
2010-01-01
This paper presents a complete analysis of the positional errors of terrestrial laser scanning (TLS) data based on spherical statistics and 3D graphs. Spherical statistics are preferred because of the 3D vectorial nature of the spatial error. Error vectors have three metric elements (one module and two angles) that were analyzed by spherical statistics. A case study is presented and discussed in detail. Errors were calculated using 53 check points (CPs) whose coordinates were measured by a digitizer with submillimetre accuracy. The positional accuracy was analyzed by both the conventional method (modular error analysis) and the proposed method (angular error analysis) using 3D graphics and numerical spherical statistics. Two packages in the R programming language were written to produce the graphics automatically. The results indicate that the proposed method is advantageous, as it offers a more complete analysis of positional accuracy, including the angular error component, the uniformity of the vector distribution, and error isotropy, in addition to the modular error component given by linear statistics. PMID:22163461
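The decomposition of each 3D error vector into one module and two angles, and a basic uniformity measure from spherical statistics, can be sketched as follows (in Python rather than the authors' R, with simulated rather than measured errors):

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical TLS positional errors at 53 check points (metres):
# measured minus reference coordinates, with a small systematic x-offset
errors = rng.normal(scale=0.002, size=(53, 3)) + np.array([0.001, 0.0, 0.0])

# decompose each error vector into one module and two angles
module = np.linalg.norm(errors, axis=1)
colatitude = np.arccos(errors[:, 2] / module)       # angle from the +z axis
longitude = np.arctan2(errors[:, 1], errors[:, 0])  # angle in the xy-plane

# mean resultant length R in [0, 1]: near 0 for uniformly scattered
# directions (isotropic error), near 1 when all errors point one way
unit = errors / module[:, None]
R = np.linalg.norm(unit.mean(axis=0))
print(round(float(R), 3))
```

Modular analysis alone would only look at `module`; the angular components are what reveal anisotropy.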
Gis-Based Spatial Statistical Analysis of College Graduates Employment
NASA Astrophysics Data System (ADS)
Tang, R.
2012-07-01
It is urgently necessary to be aware of the distribution and employment status of college graduates for the proper allocation of human resources and the overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in the distribution and employment status of college graduates, based on data from the 2004-2008 Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding using ArcGIS software, and stepwise multiple linear regression via SPSS software was used to predict employment and to identify spatially associated enterprise and professional demand in the future. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008 and tended to be distributed southeastward. Furthermore, the models built by statistical analysis suggest that a graduate's specialty has an important impact on the number of graduates employed and the number engaged in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.
Statistical energy analysis of complex structures, phase 2
NASA Technical Reports Server (NTRS)
Trudell, R. W.; Yano, L. I.
1980-01-01
A method for estimating the structural vibration properties of complex systems in high frequency environments was investigated. The structure analyzed was the Materials Experiment Assembly (MEA), a portion of the OSTA-2 payload for the space transportation system. Statistical energy analysis (SEA) techniques were used to model the structure and predict the structural element response to acoustic excitation. A comparison of the initial response predictions and measured acoustic test data is presented. The conclusions indicate that SEA predicted the response of the primary structure to acoustic excitation over a wide range of frequencies, and that the contribution of mechanically induced random vibration to the total MEA response is not significant.
Feature statistic analysis of ultrasound images of liver cancer
NASA Astrophysics Data System (ADS)
Huang, Shuqin; Ding, Mingyue; Zhang, Songgeng
2007-12-01
In this paper, a feature analysis of liver ultrasound images, including normal liver, liver cancer (especially hepatocellular carcinoma, HCC) and other hepatopathies, is discussed. According to its classification, primary hepatocellular carcinoma is divided into four types. Fifteen features from single gray-level statistics, the gray-level co-occurrence matrix (GLCM), and the gray-level run-length matrix (GLRLM) are extracted. Experiments on the discrimination of each type of HCC, normal liver, fatty liver, angioma and hepatic abscess have been conducted, and features that can potentially discriminate them are identified.
Statistical Analysis of Strength Data for an Aerospace Aluminum Alloy
NASA Technical Reports Server (NTRS)
Neergaard, Lynn; Malone, Tina; Gentz, Steven J. (Technical Monitor)
2000-01-01
Aerospace vehicles are produced in limited quantities that do not always allow development of MIL-HDBK-5 A-basis design allowables. One method of examining production and composition variations is to perform 100% lot acceptance testing for aerospace Aluminum (Al) alloys. This paper discusses statistical trends seen in strength data for one Al alloy. A four-step approach reduced the data to residuals, visualized residuals as a function of time, grouped data with quantified scatter, and conducted analysis of variance (ANOVA).
Statistical Analysis of Strength Data for an Aerospace Aluminum Alloy
NASA Technical Reports Server (NTRS)
Neergaard, L.; Malone, T.
2001-01-01
Aerospace vehicles are produced in limited quantities that do not always allow development of MIL-HDBK-5 A-basis design allowables. One method of examining production and composition variations is to perform 100% lot acceptance testing for aerospace Aluminum (Al) alloys. This paper discusses statistical trends seen in strength data for one Al alloy. A four-step approach reduced the data to residuals, visualized residuals as a function of time, grouped data with quantified scatter, and conducted analysis of variance (ANOVA).
Statistical shape analysis for face movement manifold modeling
NASA Astrophysics Data System (ADS)
Wang, Xiaokan; Mao, Xia; Caleanu, Catalin-Daniel; Ishizuka, Mitsuru
2012-03-01
The inter-frame information for analyzing the human face movement manifold is modeled by statistical shape theory. Using Riemannian geometry principles, we map a sequence of face shapes to a unified tangent space and obtain a curve corresponding to the face movement. The experimental results show that a face movement sequence forms a trajectory in a complex tangent space. Furthermore, the extent and type of facial expression can be described by the range and direction of the curve. This represents a novel approach to face movement classification using shape-based analysis.
Multi-scale statistical analysis of coronal solar activity
Gamborino, Diana; del-Castillo-Negrete, Diego; Martinell, Julio J.
2016-07-08
Multi-filter images from the solar corona are used to obtain temperature maps that are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions, we show that the multi-scale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport can also be extracted from the analysis.
Statistical analysis of personal radiofrequency electromagnetic field measurements with nondetects.
Röösli, Martin; Frei, Patrizia; Mohler, Evelyn; Braun-Fahrländer, Charlotte; Bürgi, Alfred; Fröhlich, Jürg; Neubauer, Georg; Theis, Gaston; Egger, Matthias
2008-09-01
Exposimeters are increasingly applied in bioelectromagnetic research to determine personal radiofrequency electromagnetic field (RF-EMF) exposure. The main advantages of exposimeter measurements are their convenient handling for study participants and the large amount of personal exposure data, which can be obtained for several RF-EMF sources. However, the large proportion of measurements below the detection limit is a challenge for data analysis. With the robust ROS (regression on order statistics) method, summary statistics can be calculated by fitting an assumed distribution to the observed data. We used a preliminary sample of 109 weekly exposimeter measurements from the QUALIFEX study to compare summary statistics computed by robust ROS with a naïve approach, where values below the detection limit were replaced by the value of the detection limit. For the total RF-EMF exposure, differences between the naïve approach and the robust ROS were moderate for the 90th percentile and the arithmetic mean. However, exposure contributions from minor RF-EMF sources were considerably overestimated with the naïve approach. This results in an underestimation of the exposure range in the population, which may bias the evaluation of potential exposure-response associations. We conclude from our analyses that summary statistics of exposimeter data calculated by robust ROS are more reliable and more informative than estimates based on a naïve approach. Nevertheless, estimates of source-specific medians or even lower percentiles depend on the assumed data distribution and should be considered with caution.
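The robust ROS idea in the abstract, fitting an assumed (here lognormal) distribution to the detected values and imputing the nondetects from the fitted line, can be sketched for the simple single-detection-limit case. The exposure values below are hypothetical, not QUALIFEX data, and the implementation is a deliberately simplified version of ROS:

```python
import math
from statistics import NormalDist

def simple_ros(detects, n_nondetects):
    """Simplified regression on order statistics for one detection limit:
    regress log(detected value) on normal plotting-position quantiles,
    impute the censored values from the fitted line, then summarize."""
    n = len(detects) + n_nondetects
    data = sorted(detects)
    # Weibull plotting positions; nondetects occupy the lowest ranks
    pp_detect = [(n_nondetects + i + 1) / (n + 1) for i in range(len(data))]
    z = [NormalDist().inv_cdf(p) for p in pp_detect]
    logs = [math.log(v) for v in data]
    # least-squares fit: log(value) = a + b * z
    zm, lm = sum(z) / len(z), sum(logs) / len(logs)
    b = sum((zi - zm) * (li - lm) for zi, li in zip(z, logs)) / \
        sum((zi - zm) ** 2 for zi in z)
    a = lm - b * zm
    pp_nd = [(i + 1) / (n + 1) for i in range(n_nondetects)]
    imputed = [math.exp(a + b * NormalDist().inv_cdf(p)) for p in pp_nd]
    return sum(imputed + data) / n  # robust ROS mean

# hypothetical exposure values with 4 nondetects below a 0.05 detection limit
print(simple_ros([0.06, 0.08, 0.11, 0.20, 0.35, 0.50], 4))
```

Unlike the naive approach, the imputed nondetects fall below the detection limit rather than sitting on it, so minor sources are not systematically overestimated.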
A statistical analysis of eruptive activity on Mount Etna, Sicily
NASA Astrophysics Data System (ADS)
Smethurst, Lucy; James, Mike R.; Pinkerton, Harry; Tawn, Jonathan A.
2009-10-01
A rigorous analysis of the timing and location of flank eruptions of Mount Etna on Sicily is important for the creation of hazard maps of the densely populated area surrounding the volcano. In this paper, we analyse the temporal, volumetric and spatial data on eruptive activity on Etna. Our analyses are based on the two most recent and robust historical data catalogues of flank eruption activity on Etna, with one from 1669 to 2008 and the other from 1610 to 2008. We use standard statistical methodology and modelling techniques, though a number of features are new to the analysis of eruption data. Our temporal analysis reveals that flank eruptions on Mount Etna between 1610 and 2008 follow an inhomogeneous Poisson process, with intensity of eruptions increasing nearly linearly since the mid-1900s. Our temporal analysis reveals no evidence of cyclicity over this period. An analysis of volumetric lava flow rates shows a marked increase in activity since 1971. This increase, which coincides with the formation of the Southeast Crater (SEC), appears to be related to increased activity on and around the SEC. This has significant implications for hazard analysis on Etna.
ANALYSIS OF MPC ACCESS REQUIREMENTS FOR ADDITION OF FILLER MATERIALS
W. Wallin
1996-09-03
This analysis is prepared by the Mined Geologic Disposal System (MGDS) Waste Package Development Department (WPDD) in response to a request received via a QAP-3-12 Design Input Data Request (Ref. 5.1) from WAST Design (formerly MRSMPC Design). The request is to provide: Specific MPC access requirements for the addition of filler materials at the MGDS (i.e., location and size of access required). The objective of this analysis is to provide a response to the foregoing request. The purpose of this analysis is to provide a documented record of the basis for the response. The response is stated in Section 8 herein. The response is based upon requirements from an MGDS perspective.
Detection of bearing damage by statistic vibration analysis
NASA Astrophysics Data System (ADS)
Sikora, E. A.
2016-04-01
The condition of bearings, which are essential components in mechanisms, is crucial to safety. Analysis of the bearing vibration signal, which is always contaminated by certain types of noise, is an important tool for diagnosing the mechanical condition of the bearing and mechanical failure phenomena. In this paper, a method of rolling bearing fault detection by statistical analysis of vibration is proposed that filters out the Gaussian noise contained in a raw vibration signal. The results of experiments show that the vibration signal can be significantly enhanced by application of the proposed method. In addition, the proposed method is used to analyse real acoustic signals of a bearing with inner-race and outer-race faults, respectively. The values of the attributes are determined according to the degree of the fault. The results confirm that the periods between transients, which represent bearing fault characteristics, can be successfully detected.
First statistical analysis of Geant4 quality software metrics
NASA Astrophysics Data System (ADS)
Ronchieri, Elisabetta; Grazia Pia, Maria; Giacomini, Francesco
2015-12-01
Geant4 is a simulation system of particle transport through matter, widely used in several experimental areas from high energy physics and nuclear experiments to medical studies. Some of its applications may involve critical use cases; therefore they would benefit from an objective assessment of the software quality of Geant4. In this paper, we provide a first statistical evaluation of software metrics data related to a set of Geant4 physics packages. The analysis aims at identifying risks for Geant4 maintainability, which would benefit from being addressed at an early stage. The findings of this pilot study set the grounds for further extensions of the analysis to the whole of Geant4 and to other high energy physics software systems.
Vibroacoustic optimization using a statistical energy analysis model
NASA Astrophysics Data System (ADS)
Culla, Antonio; D`Ambrogio, Walter; Fregolent, Annalisa; Milana, Silvia
2016-08-01
In this paper, an optimization technique for medium-high frequency dynamic problems based on the Statistical Energy Analysis (SEA) method is presented. In a SEA model, the subsystem energies are controlled by internal loss factors (ILFs) and coupling loss factors (CLFs), which in turn depend on the physical parameters of the subsystems. A preliminary sensitivity analysis of subsystem energy with respect to the CLFs is performed to select the CLFs that are most effective on subsystem energies. Since the injected power depends not only on the external loads but also on the physical parameters of the subsystems, it must be taken into account under certain conditions. This is accomplished in the optimization procedure, where approximate relationships between CLFs, injected power and physical parameters are derived. The approach is applied to a typical aeronautical structure: the cabin of a helicopter.
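At its core a SEA model is a linear power balance in the loss factors: the injected power equals the band centre frequency times a loss-factor matrix acting on the subsystem energies. A two-subsystem sketch with hypothetical ILF/CLF values (not the helicopter-cabin model, and ignoring the modal-density consistency relation between the CLFs):

```python
import numpy as np

# two-subsystem SEA power balance: omega * L @ E = P_in, where L collects
# internal loss factors (ILFs) and coupling loss factors (CLFs)
omega = 2.0 * np.pi * 1000.0     # analysis band centre frequency, rad/s
eta1, eta2 = 0.01, 0.02          # ILFs (hypothetical values)
eta12, eta21 = 0.005, 0.003      # CLFs (hypothetical values)

L = np.array([[eta1 + eta12, -eta21],
              [-eta12, eta2 + eta21]])
P = np.array([1.0, 0.0])         # 1 W injected into subsystem 1 only

E = np.linalg.solve(omega * L, P)   # subsystem energies, J
print(E)
```

Perturbing `eta12` and re-solving shows how energy shifts between subsystems, which is exactly the CLF sensitivity the optimization procedure exploits.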
Statistical analysis of cascading failures in power grids
Chertkov, Michael; Pfitzner, Rene; Turitsyn, Konstantin
2010-12-01
We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for the automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete-time and sequential resolution of individual failures. The model, in its simplest realization based on the Direct Current (DC) description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between the average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.
Processes and subdivisions in diogenites, a multivariate statistical analysis
NASA Technical Reports Server (NTRS)
Harriott, T. A.; Hewins, R. H.
1984-01-01
Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.
Statistical analysis of the 70 meter antenna surface distortions
NASA Technical Reports Server (NTRS)
Kiedron, K.; Chian, C. T.; Chuang, K. L.
1987-01-01
Statistical analysis of surface distortions of the 70 meter NASA/JPL antenna, located at Goldstone, was performed. The purpose of this analysis is to verify whether deviations due to gravity loading can be treated as quasi-random variables with normal distribution. Histograms of the RF pathlength error distribution for several antenna elevation positions were generated. The results indicate that the deviations from the ideal antenna surface are not normally distributed. The observed density distribution for all antenna elevation angles is taller and narrower than the normal density, which results in large positive values of kurtosis and a significant amount of skewness. The skewness of the distribution changes from positive to negative as the antenna elevation changes from zenith to horizon.
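The skewness and kurtosis diagnostics used on the pathlength-error histograms are easy to reproduce. A minimal sketch with a simulated Laplace sample, which, like the observed distribution, is taller and narrower than a normal density (the antenna data themselves are not available here):

```python
import random

def moments(xs):
    """Sample skewness and excess kurtosis (both 0 for a normal sample)."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

random.seed(42)
# Laplace sample: exponential magnitude with a random sign
sample = [random.expovariate(1.0) * random.choice((-1, 1))
          for _ in range(100000)]
skew, kurt = moments(sample)
print(kurt > 0.0)  # leptokurtic: taller and narrower than normal
```

Large positive excess kurtosis plus nonzero skewness is precisely the signature that rules out treating the distortions as normally distributed.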
Statistical analysis of magnetically soft particles in magnetorheological elastomers
NASA Astrophysics Data System (ADS)
Gundermann, T.; Cremer, P.; Löwen, H.; Menzel, A. M.; Odenbach, S.
2017-04-01
The physical properties of magnetorheological elastomers (MRE) are a complex issue and can be influenced and controlled in many ways, e.g. by applying a magnetic field, by external mechanical stimuli, or by an electric potential. In general, the response of MRE materials to these stimuli is crucially dependent on the distribution of the magnetic particles inside the elastomer. Specific knowledge of the interactions between particles or particle clusters is of high relevance for understanding the macroscopic rheological properties and provides an important input for theoretical calculations. In order to gain a better insight into the correlation between the macroscopic effects and microstructure and to generate a database for theoretical analysis, x-ray micro-computed tomography (X-μCT) investigations as a base for a statistical analysis of the particle configurations were carried out. Different MREs with quantities of 2–15 wt% (0.27–2.3 vol%) of iron powder and different allocations of the particles inside the matrix were prepared. The X-μCT results were edited by an image processing software regarding the geometrical properties of the particles with and without the influence of an external magnetic field. Pair correlation functions for the positions of the particles inside the elastomer were calculated to statistically characterize the distributions of the particles in the samples.
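The pair correlation function used to characterize the particle configurations can be sketched for point centres in a periodic box. This toy uses uniformly scattered hypothetical positions (so g(r) is flat), whereas the X-μCT data would show structure, especially under an applied field:

```python
import numpy as np

rng = np.random.default_rng(7)

def pair_correlation(pos, box, r_edges):
    """Radial distribution function g(r) for points in a periodic cube."""
    n = len(pos)
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)                 # minimum-image convention
    r = np.sqrt((d ** 2).sum(-1))[np.triu_indices(n, k=1)]
    counts, _ = np.histogram(r, bins=r_edges)
    shell = 4.0 / 3.0 * np.pi * (r_edges[1:] ** 3 - r_edges[:-1] ** 3)
    density = n / box ** 3
    return counts / (0.5 * n * density * shell)  # normalise by ideal gas

# hypothetical particle centres, uniformly scattered (no field applied)
pos = rng.uniform(0.0, 10.0, size=(500, 3))
g = pair_correlation(pos, 10.0, np.linspace(0.5, 4.0, 15))
print(np.round(g, 2))
```

Peaks in g(r) above 1 would indicate preferred inter-particle spacings, e.g. chain formation along the field direction.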
Statistical analysis of a dynamic model for dietary contaminant exposure.
Bertail, P; Clémençon, S; Tressou, J
2010-03-01
This paper is devoted to the statistical analysis of a stochastic model introduced in [P. Bertail, S. Clémençon, and J. Tressou, A storage model with random release rate for modelling exposure to food contaminants, Math. Biosci. Eng. 35 (1) (2008), pp. 35-60] for describing the phenomenon of exposure to a certain food contaminant. In this modelling, the temporal evolution of the contamination exposure is entirely determined by the accumulation phenomenon due to successive dietary intakes and the pharmacokinetics governing the elimination process in between intakes, in such a way that the exposure dynamic through time is described as a piecewise deterministic Markov process. Paths of the contamination exposure process are scarcely observable in practice, therefore intensive computer simulation methods are crucial for estimating the time-dependent or steady-state features of the process. Here we consider simulation estimators based on consumption and contamination data and investigate how to construct accurate bootstrap confidence intervals (CI) for certain quantities of considerable importance from the epidemiology viewpoint. Special attention is also paid to the problem of computing the probability of certain rare events related to the exposure process path arising in dietary risk analysis using multilevel splitting or importance sampling (IS) techniques. Applications of these statistical methods to a collection of data sets related to dietary methyl mercury contamination are discussed thoroughly.
Spectral Envelopes and Additive + Residual Analysis/Synthesis
NASA Astrophysics Data System (ADS)
Rodet, Xavier; Schwarz, Diemo
The subject of this chapter is the estimation, representation, modification, and use of spectral envelopes in the context of sinusoidal-additive-plus-residual analysis/synthesis. A spectral envelope is an amplitude-vs-frequency function, which may be obtained from the envelope of a short-time spectrum (Rodet et al., 1987; Schwarz, 1998). [Precise definitions of such an envelope and short-time spectrum (STS) are given in Section 2.] The additive-plus-residual analysis/synthesis method is based on a representation of signals in terms of a sum of time-varying sinusoids and of a non-sinusoidal residual signal [e.g., see Serra (1989), Laroche et al. (1993), McAulay and Quatieri (1995), and Ding and Qian (1997)]. Many musical sound signals may be described as a combination of a nearly periodic waveform and colored noise. The nearly periodic part of the signal can be viewed as a sum of sinusoidal components, called partials, with time-varying frequency and amplitude. Such sinusoidal components are easily observed on a spectral analysis display (Fig. 5.1) as obtained, for instance, from a discrete Fourier transform.
EBprot: Statistical analysis of labeling-based quantitative proteomics data.
Koh, Hiromi W L; Swa, Hannah L F; Fermin, Damian; Ler, Siok Ghee; Gunaratne, Jayantha; Choi, Hyungwon
2015-08-01
Labeling-based proteomics is a powerful method for detection of differentially expressed proteins (DEPs). The current data analysis platform typically relies on protein-level ratios, which is obtained by summarizing peptide-level ratios for each protein. In shotgun proteomics, however, some proteins are quantified with more peptides than others, and this reproducibility information is not incorporated into the differential expression (DE) analysis. Here, we propose a novel probabilistic framework EBprot that directly models the peptide-protein hierarchy and rewards the proteins with reproducible evidence of DE over multiple peptides. To evaluate its performance with known DE states, we conducted a simulation study to show that the peptide-level analysis of EBprot provides better receiver-operating characteristic and more accurate estimation of the false discovery rates than the methods based on protein-level ratios. We also demonstrate superior classification performance of peptide-level EBprot analysis in a spike-in dataset. To illustrate the wide applicability of EBprot in different experimental designs, we applied EBprot to a dataset for lung cancer subtype analysis with biological replicates and another dataset for time course phosphoproteome analysis of EGF-stimulated HeLa cells with multiplexed labeling. Through these examples, we show that the peptide-level analysis of EBprot is a robust alternative to the existing statistical methods for the DE analysis of labeling-based quantitative datasets. The software suite is freely available on the Sourceforge website http://ebprot.sourceforge.net/. All MS data have been deposited in the ProteomeXchange with identifier PXD001426 (http://proteomecentral.proteomexchange.org/dataset/PXD001426/).
Statistical analysis of dynamic sequences for functional imaging
NASA Astrophysics Data System (ADS)
Kao, Chien-Min; Chen, Chin-Tu; Wernick, Miles N.
2000-04-01
Factor analysis of medical image sequences (FAMIS), which concerns the simultaneous identification of homogeneous regions (factor images) and the characteristic temporal variations (factors) within those regions from a temporal sequence of images by statistical analysis, is one of the major challenges in medical imaging. In this research, we contribute to this important area by proposing a two-step approach. First, we study the use of the noise-adjusted principal component (NAPC) analysis developed by Lee et al. for identifying the characteristic temporal variations in dynamic scans acquired by PET and MRI. NAPC allows us to effectively reject data noise and substantially reduce data dimension based on signal-to-noise ratio considerations. Subsequently, a simple spatial analysis based on the criteria of minimum spatial overlap and non-negativity of the factor images is applied to extract the factors and factor images. Our preliminary simulation results indicate that the proposed approach can accurately identify the factor images; however, the factors are not completely separated.
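A minimal sketch of the noise-adjusted PCA idea, assuming the noise covariance is known (NumPy; illustrative only, not the authors' implementation):

```python
import numpy as np

def napc(data, noise_cov, n_components):
    """Noise-adjusted principal components (sketch): whiten the data by the
    noise covariance, then apply ordinary PCA, so that components come out
    ordered by signal-to-noise ratio rather than raw variance."""
    evals, evecs = np.linalg.eigh(noise_cov)
    F = evecs / np.sqrt(evals)            # F.T @ noise_cov @ F = identity
    white = (data - data.mean(axis=0)) @ F
    w, v = np.linalg.eigh(np.cov(white, rowvar=False))
    order = np.argsort(w)[::-1][:n_components]
    return white @ v[:, order]

# Synthetic "dynamic scan": 200 pixels x 6 time frames, one temporal factor
rng = np.random.default_rng(0)
signal = np.outer(rng.normal(size=200), [1, 2, 3, 3, 2, 1])
frames = signal + rng.normal(scale=0.5, size=(200, 6))
scores = napc(frames, 0.25 * np.eye(6), n_components=2)
print(scores.shape)  # (200, 2)
```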
Statistical Scalability Analysis of Communication Operations in Distributed Applications
Vetter, J S; McCracken, M O
2001-02-27
Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications, and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for individual communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.
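The paper's key measure, a non-parametric (rank) correlation between task count and an operation's share of total communication time, can be sketched in a few lines of plain Python (hypothetical timings):

```python
def spearman(x, y):
    """Spearman rank correlation (no-ties case): a nonparametric measure
    of whether y grows monotonically with x."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical scaling experiment: number of tasks vs. the fraction of
# communication time spent in one operation (e.g., an all-reduce).
tasks    = [16, 32, 64, 128, 256]
fraction = [0.05, 0.09, 0.16, 0.31, 0.52]
print(spearman(tasks, fraction))  # → 1.0: this operation scales poorly
```

A rank correlation near +1 flags an operation whose share of communication time grows reliably with scale, without assuming any particular functional form.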
Statistical methods for the analysis of climate extremes
NASA Astrophysics Data System (ADS)
Naveau, Philippe; Nogaj, Marta; Ammann, Caspar; Yiou, Pascal; Cooley, Daniel; Jomelli, Vincent
2005-08-01
Currently there is increasing research activity in the area of climate extremes because they represent a key manifestation of non-linear systems and have an enormous impact on economic and social human activities. Our understanding of the mean behavior of climate and its 'normal' variability has improved significantly during the last decades. In comparison, climate extreme events have been hard to study and even harder to predict because they are, by definition, rare and obey different statistical laws than averages. In this context, the motivation for this paper is twofold. Firstly, we recall the basic principles of Extreme Value Theory, which is used on a regular basis in finance and hydrology but has not yet had the same success in climate studies. More precisely, the theoretical distributions of maxima and large peaks are recalled. The parameters of such distributions are estimated with the maximum likelihood estimation procedure, which offers the flexibility to take explanatory variables into account in our analysis. Secondly, we detail three case studies to show that this theory can provide a solid statistical foundation, especially when assessing the uncertainty associated with extreme events in a wide range of applications linked to the study of our climate. To cite this article: P. Naveau et al., C. R. Geoscience 337 (2005).
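The block-maxima side of Extreme Value Theory can be sketched with a simple Gumbel fit (the xi = 0 member of the GEV family). The paper uses maximum likelihood on the full GEV; the method-of-moments fit below is a deliberately simpler stand-in on synthetic data:

```python
import math, random

def fit_gumbel(block_maxima):
    """Method-of-moments fit of a Gumbel distribution to block maxima:
    scale from the sample variance, location from the sample mean."""
    n = len(block_maxima)
    mean = sum(block_maxima) / n
    var = sum((x - mean) ** 2 for x in block_maxima) / (n - 1)
    beta = math.sqrt(6 * var) / math.pi          # scale
    mu = mean - 0.5772156649 * beta              # location (Euler-Mascheroni)
    return mu, beta

def return_level(mu, beta, T):
    """Level exceeded on average once every T blocks (e.g. T years)."""
    return mu - beta * math.log(-math.log(1 - 1 / T))

random.seed(0)
# Annual maxima of 50 years of synthetic daily data (illustrative only)
maxima = [max(random.gauss(20, 5) for _ in range(365)) for _ in range(50)]
mu, beta = fit_gumbel(maxima)
print(return_level(mu, beta, 100))  # estimated 100-year event
```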
Log-Normality and Multifractal Analysis of Flame Surface Statistics
NASA Astrophysics Data System (ADS)
Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.
2013-11-01
The turbulent flame surface is typically highly wrinkled and folded at a multitude of scales controlled by various flame properties. It is useful if the information contained in this complex geometry can be projected onto a simpler regular geometry for the use of spectral, wavelet or multifractal analyses. Here we investigate the local surface statistics of a turbulent flame expanding under constant pressure. First, the statistics of the local length ratio are obtained experimentally from high-speed Mie scattering images. For a spherically expanding flame, the length ratio on the measurement plane, at predefined equiangular sectors, is defined as the ratio of the actual flame length to the length of a circular arc whose radius equals the average radius of the flame. Assuming an isotropic distribution of such flame segments, we convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at the corresponding area-ratio pdfs. Both pdfs are found to be nearly log-normally distributed and show self-similar behavior with increasing radius. The near log-normality and rather intermittent behavior of the flame-length ratio suggest a similarity with dissipation-rate quantities, which motivates a multifractal analysis. Currently at Indian Institute of Science, India.
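Near log-normality of a positive quantity can be checked with a simple moment test: if the variable is log-normal, the skewness of its logarithm is zero. A self-contained sketch on synthetic data (not the experimental pdfs from the paper):

```python
import math, random

def log_moments(samples):
    """Mean, variance, and skewness of log(samples); skewness near zero
    is consistent with (near) log-normality of the original variable."""
    logs = [math.log(s) for s in samples]
    n = len(logs)
    m = sum(logs) / n
    var = sum((x - m) ** 2 for x in logs) / n
    skew = sum((x - m) ** 3 for x in logs) / (n * var ** 1.5)
    return m, var, skew

random.seed(1)
# Synthetic "length ratios": exactly log-normal by construction
lognormal = [math.exp(random.gauss(0.2, 0.1)) for _ in range(10000)]
m, var, skew = log_moments(lognormal)
print(round(m, 2), round(skew, 2))  # log-mean near 0.2, skewness near 0
```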
Constraining cosmology with shear peak statistics: tomographic analysis
NASA Astrophysics Data System (ADS)
Martinet, Nicolas; Bartlett, James G.; Kiessling, Alina; Sartoris, Barbara
2015-09-01
The abundance of peaks in weak gravitational lensing maps is a potentially powerful cosmological tool, complementary to measurements of the shear power spectrum. We study peaks detected directly in shear maps, rather than convergence maps, an approach that has the advantage of working directly with the observable quantity, the galaxy ellipticity catalog. Using large numbers of numerical simulations to accurately predict the abundance of peaks and their covariance, we quantify the cosmological constraints attainable by a large-area survey similar to that expected from the Euclid mission, focusing on the density parameter, Ωm, and on the power spectrum normalization, σ8, for illustration. We present a tomographic peak counting method that improves the conditional (marginal) constraints by a factor of 1.2 (2) over those from a two-dimensional (i.e., non-tomographic) peak-count analysis. We find that peak statistics provide constraints an order of magnitude less accurate than those from the cluster sample in the ideal situation of a perfectly known observable-mass relation; however, when the scaling relation is not known a priori, the shear-peak constraints are twice as strong and orthogonal to the cluster constraints, highlighting the value of using both clusters and shear-peak statistics.
Statistical analysis of cascaded PLC-based PMD compensator
NASA Astrophysics Data System (ADS)
Wang, Bin; Wang, Lei; Wu, Xingkun
2005-01-01
The planar lightwave circuit (PLC) on silicon substrate offers a promising on-chip integrated solution to polarization-mode dispersion (PMD) compensation for long haul high speed communications. A novel cascaded PLC based PMD compensator is proposed in this paper and a detailed statistical analysis of PMD generated by cascaded PLC circuits is presented. Using Gisin and Pellaux's approach the distributions of first-order PMD produced by various multiple-stage PLC circuits were obtained by Monte Carlo simulation with respect to the phase shift introduced by heating elements in the circuits. The generated PMD was compared with a standard Maxwell distribution and with that of a 12-stage nonlinear crystal based PMD compensator. It was found that a 3-stage cascaded PLC circuit yields a performance close to that of the crystal-based PMD compensator, while offering a significant reduction in package size and enhanced stability.
A Computer Program for Statistically-Based Decision Analysis
Polaschek, Jeanette X.; Lenert, Leslie A.; Garber, Alan M.
1990-01-01
The majority of patients with coronary artery disease do not fall into the well-defined populations from randomized clinical trials. Observational databases contain a rich source of information that could be used by practicing physicians to evaluate treatment alternatives for their patients. We describe a computer system, the CABG Kibitzer, which uses an integrated approach to evaluate the treatment alternatives for CAD patients. We combine a statistical multivariate model for calculating survival advantages with decision analysis (DA) techniques for assessing patient preferences and sensitivity analysis, to create one tool that physicians find easy to use in daily clinical practice. The development of tools of this kind is a necessary step in making the data of outcome studies accessible to practicing physicians.
Higher order statistical moment application for solar PV potential analysis
NASA Astrophysics Data System (ADS)
Basri, Mohd Juhari Mat; Abdullah, Samizee; Azrulhisham, Engku Ahmad; Harun, Khairulezuan
2016-10-01
Solar photovoltaic energy is a potential alternative to fossil fuels, which are depleting and contribute to global warming. However, this renewable resource is too variable and intermittent to be relied on alone, so knowledge of the energy potential is very important for any site before building a solar photovoltaic power generation system. Here, a higher-order statistical moment model is applied to data collected from a 5 MW grid-connected photovoltaic system. Because the skewness and kurtosis of the AC power and solar irradiance distributions of the solar farm change dynamically, the Pearson system, in which a probability distribution is selected by matching its theoretical moments to the empirical moments of the data, is well suited for this purpose. Taking advantage of the Pearson system implementation in MATLAB, software has been developed to assist in data processing, distribution fitting, and potential analysis for future projection of the available AC power and solar irradiance.
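The inputs to a Pearson-system fit are the first four standardized sample moments; computing them is a few lines of plain Python (the power readings below are hypothetical illustration values):

```python
def sample_moments(data):
    """First four quantities (mean, variance, skewness, kurtosis) used to
    select and parameterize a member of the Pearson distribution family."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    std = var ** 0.5
    skew = sum((x - mean) ** 3 for x in data) / (n * std ** 3)
    kurt = sum((x - mean) ** 4 for x in data) / (n * var ** 2)
    return mean, var, skew, kurt

# Hypothetical hourly AC power readings (MW) from a 5 MW PV farm
power = [0.0, 0.4, 1.8, 3.6, 4.7, 4.9, 4.2, 2.5, 0.9, 0.1]
print(sample_moments(power))
```

In MATLAB, the analogous step feeds these moments to the Statistics Toolbox Pearson-system routines to pick the matching distribution type.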
Statistical uncertainty analysis of radon transport in nonisothermal, unsaturated soils
Holford, D.J.; Owczarski, P.C.; Gee, G.W.; Freeman, H.D.
1990-10-01
To accurately predict radon fluxes from soils to the atmosphere, we must know more than the radium content of the soil. Radon flux from soil is affected not only by soil properties, but also by meteorological factors such as air pressure and temperature changes at the soil surface, as well as the infiltration of rainwater. Natural variations in meteorological factors and soil properties contribute to uncertainty in subsurface model predictions of radon flux, which, when coupled with a building transport model, will also add uncertainty to predictions of radon concentrations in homes. A statistical uncertainty analysis using our Rn3D finite-element numerical model was conducted to assess the relative importance of these meteorological factors and the soil properties affecting radon transport. 10 refs., 10 figs., 3 tabs.
A statistical analysis of the daily streamflow hydrograph
NASA Astrophysics Data System (ADS)
Kavvas, M. L.; Delleur, J. W.
1984-03-01
In this study a periodic statistical analysis of daily streamflow data in Indiana, U.S.A., was performed to gain new insight into the stochastic structure that describes the daily streamflow process. The analysis was performed through the periodic mean and covariance functions of the daily streamflows; the time- and peak-discharge-dependent recession limb of the daily streamflow hydrograph; the time- and discharge-exceedance-level (DEL)-dependent probability distribution of the hydrograph peak interarrival time; and the time-dependent probability distribution of the time to peak discharge. Some new statistical estimators were developed and used in this study. In general, this study has shown that: (a) the persistence properties of daily flows depend on the storage state of the basin at the specified time origin of the flow process; (b) the daily streamflow process is time irreversible; (c) the probability distribution of the daily hydrograph peak interarrival time depends both on the occurrence time of the peak from which the interarrival time originates and on the discharge exceedance level; and (d) if the daily streamflow process is modeled as the release from a linear watershed storage, this release should depend on the state of the storage and on the time of the release, as the persistence properties and the recession-limb decay rates were observed to change with the state of the watershed storage and with time. Therefore, a time-varying reservoir system needs to be considered if the daily streamflow process is to be modeled as the release from a linear watershed storage.
Statistical analysis of the uncertainty related to flood hazard appraisal
NASA Astrophysics Data System (ADS)
Notaro, Vincenza; Freni, Gabriele
2015-12-01
The estimation of flood hazard frequency statistics for an urban catchment is of great interest in practice: it provides an evaluation of potential flood risk and related damage and supports decision making for flood risk management. Flood risk is usually defined as a function of the probability that a system deficiency causes flooding (hazard) and the expected damage due to the flooding magnitude (damage), taking into account both the exposure and the vulnerability of the goods at risk. The expected flood damage can be evaluated by an a priori estimation of the potential damage caused by flooding or by interpolating real damage data. With regard to flood hazard appraisal, several procedures propose to identify a hazard indicator (HI), such as flood depth or the combination of flood depth and velocity, and to assess the flood hazard of the analyzed area by comparing the HI variables with user-defined threshold values or curves (penalty curves or matrices). However, flooding data are usually unavailable or too piecemeal to allow a reliable flood hazard analysis, so hazard analysis is often performed by means of mathematical simulations aimed at evaluating water levels and flow velocities over the catchment surface. As a result, a great part of the uncertainty intrinsic to flood risk appraisal can be attributed to the hazard evaluation, owing to the uncertainty inherent in modeling results and to the subjectivity of the user-defined hazard thresholds applied to link flood depth to a hazard level. In the present work, a statistical methodology is proposed for evaluating and reducing the uncertainties connected with hazard level estimation. The methodology has been applied to a real urban watershed as a case study.
Statistical analysis and modelling of small satellite reliability
NASA Astrophysics Data System (ADS)
Guo, Jian; Monas, Liora; Gill, Eberhard
2014-05-01
This paper attempts to characterize failure behaviour of small satellites through statistical analysis of actual in-orbit failures. A unique Small Satellite Anomalies Database comprising empirical failure data of 222 small satellites has been developed. A nonparametric analysis of the failure data has been implemented by means of a Kaplan-Meier estimation. An innovative modelling method, i.e. Bayesian theory in combination with Markov Chain Monte Carlo (MCMC) simulations, has been proposed to model the reliability of small satellites. An extensive parametric analysis using the Bayesian/MCMC method has been performed to fit a Weibull distribution to the data. The influence of several characteristics such as the design lifetime, mass, launch year, mission type and the type of satellite developers on the reliability has been analyzed. The results clearly show the infant mortality of small satellites. Compared with the classical maximum-likelihood estimation methods, the proposed Bayesian/MCMC method results in better fitting Weibull models and is especially suitable for reliability modelling where only very limited failures are observed.
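The nonparametric step of the analysis, the Kaplan-Meier estimator, is easy to sketch for right-censored satellite lifetimes (hypothetical data; the paper's Bayesian/MCMC Weibull fit is a separate, more involved step):

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimator: survival drops by a factor (1 - d/r) at each
    failure time, where d failures occur among r units still at risk.
    observed=1 marks a failure; observed=0 marks a censored lifetime."""
    pairs = sorted(zip(durations, observed))
    n = len(pairs)
    s, removed, curve, i = 1.0, 0, [], 0
    while i < n:
        t = pairs[i][0]
        at_risk = n - removed
        d = 0
        while i < n and pairs[i][0] == t:
            d += pairs[i][1]
            removed += 1
            i += 1
        if d:
            s *= 1 - d / at_risk
            curve.append((t, s))
    return curve

# Years to first anomaly; observed=0 means still operating (censored)
years    = [0.5, 1.0, 1.0, 2.0, 3.0, 4.0]
observed = [1,   1,   0,   1,   0,   1]
print(kaplan_meier(years, observed))
```

An early steep drop in the estimated survival curve is exactly the "infant mortality" signature the abstract describes.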
Helioseismology of pre-emerging active regions. III. Statistical analysis
Barnes, G.; Leka, K. D.; Braun, D. C.; Birch, A. C.
2014-05-01
The subsurface properties of active regions (ARs) prior to their appearance at the solar surface may shed light on the process of AR formation. Helioseismic holography has been applied to samples taken from two populations of regions on the Sun (pre-emergence and without emergence), each sample having over 100 members, that were selected to minimize systematic bias, as described in Paper I. Paper II showed that there are statistically significant signatures in the average helioseismic properties that precede the formation of an AR. This paper describes a more detailed analysis of the samples of pre-emergence regions and regions without emergence based on discriminant analysis. The property that is best able to distinguish the populations is found to be the surface magnetic field, even a day before the emergence time. However, after accounting for the correlations between the surface field and the quantities derived from helioseismology, there is still evidence of a helioseismic precursor to AR emergence that is present for at least a day prior to emergence, although the analysis presented cannot definitively determine the subsurface properties prior to emergence due to the small sample sizes.
Classification of Malaysia aromatic rice using multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered the best quality premium rice. These varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of the special growth conditions it requires, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography-Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: lengthy training time, proneness to fatigue as the number of samples increases, and inconsistency. The GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties based on the odour of the samples. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to recognize and classify unspecified samples. Visual observation of the PCA and LDA plots shows that the instrument was able to separate the samples into distinct clusters. The low misclassification error of LDA and KNN supports these findings, and we conclude that the e-nose can be successfully applied to the classification of aromatic rice varieties.
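The KNN-with-leave-one-out validation step can be sketched without any library, here for a 1-nearest-neighbour rule on hypothetical two-channel sensor readings (the real e-nose uses many sensor channels and variety labels):

```python
def loo_knn_accuracy(X, y):
    """Leave-one-out evaluation of a 1-nearest-neighbour classifier: each
    sample is classified by its nearest neighbour among the others."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    correct = 0
    for i in range(len(X)):
        j = min((k for k in range(len(X)) if k != i),
                key=lambda k: dist2(X[i], X[k]))
        correct += y[j] == y[i]
    return correct / len(X)

# Two well-separated hypothetical odour-sensor clusters
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 1.0), (1.0, 0.9), (0.95, 0.95)]
y = ["variety_A", "variety_A", "variety_A", "variety_B", "variety_B", "variety_B"]
print(loo_knn_accuracy(X, y))  # → 1.0
```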
Detection of viruses via statistical gene expression analysis.
Chen, Minhua; Carlson, David; Zaas, Aimee; Woods, Christopher W; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Carin, Lawrence
2011-03-01
We develop a new Bayesian construction of the elastic net (ENet), with variational Bayesian analysis. This modeling framework is motivated by analysis of gene expression data for viruses, with a focus on H3N2 and H1N1 influenza, as well as Rhino virus and RSV (respiratory syncytial virus). Our objective is to understand the biological pathways responsible for the host response to such viruses, with the ultimate objective of developing a clinical test to distinguish subjects infected by such viruses from subjects with other symptom causes (e.g., bacteria). In addition to analyzing these new datasets, we provide a detailed analysis of the Bayesian ENet and compare it to related models.
Multivariate Statistical Analysis of MSL APXS Bulk Geochemical Data
NASA Astrophysics Data System (ADS)
Hamilton, V. E.; Edwards, C. S.; Thompson, L. M.; Schmidt, M. E.
2014-12-01
We apply cluster and factor analyses to bulk chemical data of 130 soil and rock samples measured by the Alpha Particle X-ray Spectrometer (APXS) on the Mars Science Laboratory (MSL) rover Curiosity through sol 650. Multivariate approaches such as principal components analysis (PCA), cluster analysis, and factor analysis complement more traditional approaches (e.g., Harker diagrams), with the advantage of simultaneously examining the relationships between multiple variables for large numbers of samples. Principal components analysis has been applied with success to APXS, Pancam, and Mössbauer data from the Mars Exploration Rovers. Factor analysis and cluster analysis have been applied with success to thermal infrared (TIR) spectral data of Mars. Cluster analyses group the input data by similarity, where there are a number of different methods for defining similarity (hierarchical, density, distribution, etc.). For example, without any assumptions about the chemical contributions of surface dust, preliminary hierarchical and K-means cluster analyses clearly distinguish the physically adjacent rock targets Windjana and Stephen as being distinctly different than lithologies observed prior to Curiosity's arrival at The Kimberley. In addition, they are separated from each other, consistent with chemical trends observed in variation diagrams but without requiring assumptions about chemical relationships. We will discuss the variation in cluster analysis results as a function of clustering method and pre-processing (e.g., log transformation, correction for dust cover) and implications for interpreting chemical data. Factor analysis shares some similarities with PCA, and examines the variability among observed components of a dataset so as to reveal variations attributable to unobserved components. Factor analysis has been used to extract the TIR spectra of components that are typically observed in mixtures and only rarely in isolation; there is the potential for similar
Microcomputers: Statistical Analysis Software. Evaluation Guide Number 5.
ERIC Educational Resources Information Center
Gray, Peter J.
This guide discusses six sets of features to examine when purchasing a microcomputer-based statistics program: hardware requirements; data management; data processing; statistical procedures; printing; and documentation. While the current statistical packages have several negative features, they are cost saving and convenient for small to moderate…
Fu, Wenjiang J; Stromberg, Arnold J; Viele, Kert; Carroll, Raymond J; Wu, Guoyao
2010-07-01
Over the past 2 decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (Type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine growth retardation).
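The sample-size calculation the article describes has a standard closed form for comparing two group means; a sketch using the usual normal-approximation constants (alpha = 0.05 two-sided gives z = 1.96, 80% power gives z = 0.84):

```python
import math

def n_per_group(sigma, delta, z_alpha=1.96, z_beta=0.84):
    """Sample size per group for detecting a difference delta between two
    means with common standard deviation sigma, controlling the Type I
    error (z_alpha) and Type II error (z_beta, i.e. power)."""
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Detect a 0.5-unit difference when sigma = 1.0, alpha 5%, power 80%
print(n_per_group(sigma=1.0, delta=0.5))  # → 63
```

Halving the detectable effect size quadruples the required sample, which is why power analysis belongs at the design stage of a nutrition study, not after data collection.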
Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao
2009-01-01
Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
Decreasing Cloudiness Over China: An Updated Analysis Examining Additional Variables
Kaiser, D.P.
2000-01-14
As preparation of the IPCC's Third Assessment Report takes place, one of the many observed climate variables of key interest is cloud amount. For several nations of the world, there exist records of surface-observed cloud amount dating back to the middle of the 20th Century or earlier, offering valuable information on variations and trends. Studies using such databases include Sun and Groisman (1999) and Kaiser and Razuvaev (1995) for the former Soviet Union, Angell et al. (1984) for the United States, Henderson-Sellers (1986) for Europe, Jones and Henderson-Sellers (1992) for Australia, and Kaiser (1998) for China. The findings of Kaiser (1998) differ from the other studies in that much of China appears to have experienced decreased cloudiness over recent decades (1954-1994), whereas the other land regions for the most part show evidence of increasing cloud cover. This paper expands on Kaiser (1998) by analyzing trends in additional meteorological variables for China [station pressure (p), water vapor pressure (e), and relative humidity (rh)] and extending the total cloud amount (N) analysis an additional two years (through 1996).
Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan
2015-01-08
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systematically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, whether correlated, independent, continuous, or binary, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10(-8)) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10(-7)) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes.
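A minimal flavor of combining GWAS summary statistics across traits is Stouffer's weighted-z method, shown below as a simple stand-in; the paper's method additionally models trait correlation, heterogeneity, population structure, and relatedness:

```python
import math

def stouffer(z_scores, weights=None):
    """Stouffer's method: combine per-trait association z-scores for one
    variant into a single z-score and a one-sided upper-tail p-value."""
    k = len(z_scores)
    w = weights or [1.0] * k
    z = sum(wi * zi for wi, zi in zip(w, z_scores)) / math.sqrt(sum(wi ** 2 for wi in w))
    p = math.erfc(z / math.sqrt(2)) / 2   # normal upper-tail probability
    return z, p

# Hypothetical z-scores for one variant across three blood-pressure traits
z, p = stouffer([2.1, 1.8, 2.4])
print(round(z, 2), p)
```

Three individually sub-threshold signals combine into a much stronger one, which is the intuition behind multi-trait analysis gaining power over single-trait scans.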
Statistical Analysis of Tank 5 Floor Sample Results
Shine, E. P.
2013-01-31
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their MDCs.
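The UCL95 calculation named above can be sketched with the Student-t upper confidence limit, one of the UCL methods covered by the EPA guidance (Singh and others [2010]). The three replicate values are illustrative placeholders, not numbers from the report; the n = 3 critical value t(0.95, 2) = 2.920 is hard-coded.

```python
import math
from statistics import mean, stdev

def ucl95_t(values, t_crit):
    """One-sided Student-t UCL95: xbar + t(0.95, n-1) * s / sqrt(n)."""
    n = len(values)
    return mean(values) + t_crit * stdev(values) / math.sqrt(n)

# Three replicate measurements of one analyte (illustrative units)
measurements = [1.10, 1.25, 1.18]
ucl = ucl95_t(measurements, t_crit=2.920)   # t(0.95, 2) for n = 3
```

The UCL95 exceeds the sample mean by a margin that shrinks as the number of replicates grows; other UCL methods in the guidance (e.g., for non-normal or censored data) follow the same mean-plus-margin pattern with different margins.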
STATISTICAL ANALYSIS OF TANK 5 FLOOR SAMPLE RESULTS
Shine, E.
2012-03-14
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, radionuclide, inorganic, and anion concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their MDCs.
Statistical Analysis Of Tank 5 Floor Sample Results
Shine, E. P.
2012-08-01
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their MDCs.
NASA Astrophysics Data System (ADS)
Donges, Jonathan F.; Petrova, Irina; Loew, Alexander; Marwan, Norbert; Kurths, Jürgen
2015-11-01
Eigen techniques such as empirical orthogonal function (EOF) or coupled pattern (CP)/maximum covariance analysis have been frequently used for detecting patterns in multivariate climatological data sets. Recently, statistical methods originating from the theory of complex networks have been employed for the very same purpose of spatio-temporal analysis. This climate network (CN) analysis is usually based on the same set of similarity matrices as is used in classical EOF or CP analysis, e.g., the correlation matrix of a single climatological field or the cross-correlation matrix between two distinct climatological fields. In this study, formal relationships as well as conceptual differences between both eigen and network approaches are derived and illustrated using global precipitation, evaporation and surface air temperature data sets. These results allow us to pinpoint that CN analysis can complement classical eigen techniques and provides additional information on the higher-order structure of statistical interrelationships in climatological data. Hence, CNs are a valuable supplement to the statistical toolbox of the climatologist, particularly for making sense out of very large data sets such as those generated by satellite observations and climate model intercomparison exercises.
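The shared starting point described above, one similarity matrix feeding both routes, can be sketched on toy data: EOF analysis takes the eigen decomposition of the correlation matrix, while climate network analysis thresholds the same matrix into an adjacency matrix. The 4-point "field", the 0.5 threshold, and the built-in link are illustrative assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
field = rng.standard_normal((500, 4))   # 500 time steps at 4 grid points
field[:, 1] += 0.8 * field[:, 0]        # build in one strong statistical link

C = np.corrcoef(field, rowvar=False)    # the shared similarity matrix

# EOF route: eigen decomposition of C, leading pattern first
evals, evecs = np.linalg.eigh(C)
eofs = evecs[:, ::-1]

# Climate network route: adjacency by thresholding |correlation|
A = (np.abs(C) > 0.5).astype(int)
np.fill_diagonal(A, 0)
degree = A.sum(axis=0)                  # simplest network measure per node
```

Both objects derive from the same matrix C; network measures such as the degree field expose higher-order connectivity structure that the leading EOFs do not directly show.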
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
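The EM idea can be illustrated on one deliberately simplified marker configuration (a mother Aa crossed with an aa father, which is an assumption for illustration, not the paper's multinomial model): clonal offspring are always Aa, sexual offspring are Aa or aa with probability 1/2 each, and EM alternates a posterior-clone E-step with a rate-update M-step.

```python
def em_apomixis(n_het, n_hom, iters=200):
    """EM estimate of the apomixis rate a for a mother Aa x father aa cross.

    Clones are always Aa; sexual offspring are Aa or aa with prob. 1/2 each,
    so P(Aa) = a + (1 - a)/2 and P(aa) = (1 - a)/2.
    """
    n = n_het + n_hom
    a = 0.5                                   # initial guess for the rate
    for _ in range(iters):
        # E-step: posterior probability that an Aa offspring is a clone
        p_clone = a / (a + (1 - a) / 2)
        # M-step: expected fraction of clonal offspring in the progeny
        a = (n_het * p_clone) / n
    return a

# 80 heterozygous and 20 homozygous offspring (illustrative counts)
rate = em_apomixis(80, 20)
```

In this toy setting the closed-form MLE is a = 1 - 2·n_hom/n, so the EM iterations can be checked against it; the paper's design generalizes this to a full multinomial likelihood over two reciprocal crosses.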
Autotasked Performance in the NAS Workload: A Statistical Analysis
NASA Technical Reports Server (NTRS)
Carter, R. L.; Stockdale, I. E.; Kutler, Paul (Technical Monitor)
1998-01-01
A statistical analysis of the workload performance of a production-quality FORTRAN code for five different Cray Y-MP hardware and system software configurations is performed. The analysis was based on an experimental procedure designed to minimize correlations between the number of requested CPUs and the time of day the runs were initiated. Observed autotasking overheads were significantly larger for the set of jobs that requested the maximum number of CPUs. UNICOS 6 releases show consistent wall-clock speedups in the workload of around 2, which is quite good. The observed speedups were very similar for the set of jobs that requested 8 CPUs and the set that requested 4 CPUs. The original NAS algorithm for determining charges to the user discourages autotasking in the workload. A new charging algorithm to be applied to jobs run in the NQS multitasking queues also discourages NAS users from using autotasking. The new algorithm favors jobs requesting 8 CPUs over those that request fewer, although the jobs requesting 8 CPUs experienced significantly higher overhead and presumably degraded system throughput. A charging algorithm is presented that has the following desirable characteristic when applied to the data: higher-overhead jobs requesting 8 CPUs are penalized compared to moderate-overhead jobs requesting 4 CPUs, thereby providing a charging incentive for NAS users to use autotasking in a manner that gives them significantly improved turnaround while also maintaining system throughput.
Statistical analysis of plasma thermograms measured by differential scanning calorimetry.
Fish, Daniel J; Brewood, Greg P; Kim, Jong Sung; Garbett, Nichola C; Chaires, Jonathan B; Benight, Albert S
2010-11-01
Melting curves of human plasma measured by differential scanning calorimetry (DSC), known as thermograms, have the potential to markedly impact diagnosis of human diseases. A general statistical methodology is developed to analyze and classify DSC thermograms. Analysis of an acquired thermogram involves comparison with a database of empirical reference thermograms from clinically characterized diseases. Two parameters, a distance metric, P, and correlation coefficient, r, are combined to produce a 'similarity metric,' ρ, which can be used to classify unknown thermograms into pre-characterized categories. Simulated thermograms known to lie within or fall outside of the 90% quantile range around a median reference are also analyzed. Results verify the utility of the methods and establish the apparent dynamic range of the metric ρ. The methods are then applied to data obtained from a collection of plasma samples from patients clinically diagnosed with SLE (lupus). High correspondence is found between curve shapes and values of the metric ρ. In a final application, an elementary classification rule is implemented to successfully analyze and classify unlabeled thermograms. These methods constitute a set of powerful yet easy-to-implement tools for quantitative classification, analysis and interpretation of DSC plasma melting curves.
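The two ingredients named above, a distance metric P and a correlation coefficient r between a test thermogram and a reference, are easy to sketch. How the paper combines them into ρ is not specified here, so the combination ρ = r / (1 + P) below is purely illustrative, as are the synthetic Gaussian "melting peaks".

```python
import numpy as np

def thermogram_similarity(test, ref):
    """Distance P, shape correlation r, and an illustrative combined metric."""
    P = np.mean(np.abs(test - ref))        # mean absolute curve distance
    r = np.corrcoef(test, ref)[0, 1]       # correlation of curve shapes
    rho = r / (1.0 + P)                    # illustrative combination, not the paper's
    return P, r, rho

temps = np.linspace(45, 90, 200)
ref = np.exp(-((temps - 65.0) / 5.0) ** 2)       # reference melting peak
same = ref.copy()                                # identical thermogram
shifted = np.exp(-((temps - 75.0) / 5.0) ** 2)   # peak shifted by 10 degrees

rho_same = thermogram_similarity(same, ref)[2]
rho_shifted = thermogram_similarity(shifted, ref)[2]
```

An identical curve scores ρ = 1, and any shape distortion or shift lowers ρ, which is the behaviour a classification rule on ρ relies on.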
Statistical Power Flow Analysis of an Imperfect Ribbed Cylinder
NASA Astrophysics Data System (ADS)
Blakemore, M.; Woodhouse, J.; Hardie, D. J. W.
1999-05-01
Prediction of the noise transmitted from machinery and flow sources on a submarine to the sonar arrays poses a complex problem. Vibrations in the pressure hull provide the main transmission mechanism. The pressure hull is characterised by a very large number of modes over the frequency range of interest (at least 100,000) and by high modal overlap, both of which place its analysis beyond the scope of finite element or boundary element methods. A method for calculating the transmission is presented, which is broadly based on Statistical Energy Analysis, but extended in two important ways: (1) a novel subsystem breakdown which exploits the particular geometry of a submarine pressure hull; (2) explicit modelling of energy density variation within a subsystem due to damping. The method takes account of fluid-structure interaction, the underlying pass/stop band characteristics resulting from the near-periodicity of the pressure hull construction, the effect of vibration isolators such as bulkheads, and the cumulative effect of irregularities (e.g., attachments and penetrations).
FTree query construction for virtual screening: a statistical analysis
NASA Astrophysics Data System (ADS)
Gerlach, Christof; Broughton, Howard; Zaliani, Andrea
2008-02-01
FTrees (FT) is a known chemoinformatic tool able to condense molecular descriptions into a graph object and to search for actives in large databases using graph similarity. The query graph is classically derived from a known active molecule, or a set of actives, for which a similar compound has to be found. Recently, FT similarity has been extended to fragment space, widening its capabilities. If a user were able to build a knowledge-based FT query from information other than a known active structure, the similarity search could be combined with other, normally separate, fields like de-novo design or pharmacophore searches. With this aim in mind, we performed a comprehensive analysis of several databases in terms of FT description and provide a basic statistical analysis of the FT spaces so far at hand. Vendors' catalogue collections and MDDR as a source of potential or known "actives", respectively, have been used. With the results reported herein, a set of ranges, mean values and standard deviations for several query parameters are presented in order to set a reference guide for the users. Applications on how to use this information in FT query building are also provided, using a newly built 3D-pharmacophore from 57 5HT-1F agonists and a published one which was used for virtual screening for tRNA-guanine transglycosylase (TGT) inhibitors.
Higher order statistical frequency domain decomposition for operational modal analysis
NASA Astrophysics Data System (ADS)
Nita, G. M.; Mahgoub, M. A.; Sharyatpanahi, S. G.; Cretu, N. C.; El-Fouly, T. M.
2017-02-01
Experimental methods based on modal analysis under ambient vibrational excitation are often employed to detect structural damages of mechanical systems. Many such frequency domain methods, such as Basic Frequency Domain (BFD), Frequency Domain Decomposition (FDD), or Enhanced Frequency Domain Decomposition (EFDD), use as a first step a Fast Fourier Transform (FFT) estimate of the power spectral density (PSD) associated with the response of the system. In this study it is shown that higher order statistical estimators such as Spectral Kurtosis (SK) and Sample to Model Ratio (SMR) may be successfully employed not only to more reliably discriminate the response of the system against the ambient noise fluctuations, but also to better identify and separate contributions from closely spaced individual modes. It is shown that a SMR-based Maximum Likelihood curve fitting algorithm may improve the accuracy of the spectral shape and location of the individual modes and, when combined with the SK analysis, it provides efficient means to categorize such individual spectral components according to their temporal dynamics as coherent or incoherent system responses to unknown ambient excitations.
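A minimal spectral kurtosis estimator illustrates the discrimination idea: averaging over M FFT segments, Gaussian noise gives SK near 2 under the normalization used below, while a coherent constant-amplitude tone gives SK near 1. Conventions and bias corrections vary across the SK literature, so treat this as one common normalization, not the paper's estimator.

```python
import numpy as np

def spectral_kurtosis(x, nseg, seg_len):
    """SK per frequency bin: <|X|^4> / <|X|^2>^2 over nseg FFT segments."""
    X = np.fft.rfft(x.reshape(nseg, seg_len), axis=1)
    S2 = np.mean(np.abs(X) ** 2, axis=0)
    S4 = np.mean(np.abs(X) ** 4, axis=0)
    return S4 / S2 ** 2

rng = np.random.default_rng(1)
nseg, seg_len = 256, 128
t = np.arange(nseg * seg_len)
noise = rng.standard_normal(nseg * seg_len)
tone = np.sin(2 * np.pi * (16 / seg_len) * t)     # coherent tone at FFT bin 16

sk_noise = spectral_kurtosis(noise, nseg, seg_len)
sk_tone = spectral_kurtosis(noise + 10 * tone, nseg, seg_len)
```

The bin carrying the coherent tone drops toward SK = 1 while noise-only bins hover near 2, which is the statistical handle used to tell deterministic responses from ambient noise.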
Structure-based statistical analysis of transmembrane helices.
Baeza-Delgado, Carlos; Marti-Renom, Marc A; Mingarro, Ismael
2013-03-01
Recent advances in determination of the high-resolution structure of membrane proteins now enable analysis of the main features of amino acids in transmembrane (TM) segments in comparison with amino acids in water-soluble helices. In this work, we conducted a large-scale analysis of the prevalent locations of amino acids by using a data set of 170 structures of integral membrane proteins obtained from the MPtopo database and 930 structures of water-soluble helical proteins obtained from the protein data bank. Large hydrophobic amino acids (Leu, Val, Ile, and Phe) plus Gly were clearly prevalent in TM helices whereas polar amino acids (Glu, Lys, Asp, Arg, and Gln) were less frequent in this type of helix. The distribution of amino acids along TM helices was also examined. As expected, hydrophobic and slightly polar amino acids are commonly found in the hydrophobic core of the membrane whereas aromatic (Trp and Tyr), Pro, and the hydrophilic amino acids (Asn, His, and Gln) occur more frequently in the interface regions. Charged amino acids are also statistically prevalent outside the hydrophobic core of the membrane, and whereas acidic amino acids are frequently found at both cytoplasmic and extra-cytoplasmic interfaces, basic amino acids cluster at the cytoplasmic interface. These results strongly support the experimentally demonstrated biased distribution of positively charged amino acids (the so-called positive-inside rule) with structural data.
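The core counting step behind such prevalence statistics can be sketched in a few lines: tally per-residue frequencies in each sequence set and compare them. The two toy sequence lists are invented placeholders, not sequences from the MPtopo or PDB data sets.

```python
from collections import Counter

def aa_frequencies(seqs):
    """Relative frequency of each amino acid across a set of sequences."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}

# Toy examples: a hydrophobic TM-like set vs a polar soluble-helix-like set
tm = ["LLVVILFFGLLIV", "AVLILGFLVLLW"]
soluble = ["EKDQRKEDAQKE", "KDEEQRKNDSEE"]

f_tm = aa_frequencies(tm)
f_sol = aa_frequencies(soluble)
leu_enriched = f_tm.get("L", 0.0) > f_sol.get("L", 0.0)
```

With real data sets, the same frequency tables feed enrichment scores (e.g., log-ratios of TM to soluble frequencies) and positional profiles along the helix axis.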
Data Analysis & Statistical Methods for Command File Errors
NASA Technical Reports Server (NTRS)
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors and the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve-fitting and distribution-fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness-of-fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
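One of the techniques named above, multiple regression of an error rate on workload-style predictors, can be sketched with ordinary least squares. The predictor names and the synthetic data below are illustrative assumptions, not the JPL mission dataset.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 60
files_radiated = rng.uniform(10, 100, n)     # hypothetical predictor
workload = rng.uniform(0, 1, n)              # hypothetical subjective estimate

# Synthetic ground truth: error rate grows with both drivers, plus noise
error_rate = 0.02 * files_radiated + 0.5 * workload + rng.normal(0, 0.1, n)

# Multiple regression via least squares with an intercept column
X = np.column_stack([np.ones(n), files_radiated, workload])
beta, *_ = np.linalg.lstsq(X, error_rate, rcond=None)

predicted = X @ beta
ss_res = np.sum((error_rate - predicted) ** 2)
ss_tot = np.sum((error_rate - error_rate.mean()) ** 2)
r2 = 1 - ss_res / ss_tot                     # variability explained
```

The fitted coefficients recover the drivers of the synthetic rate, and R² plays the role of "how much of the variability in the error rates can be explained" in the abstract.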
Sensitivity analysis of geometric errors in additive manufacturing medical models.
Pinto, Jose Miguel; Arrieta, Cristobal; Andia, Marcelo E; Uribe, Sergio; Ramos-Grez, Jorge; Vargas, Alex; Irarrazaval, Pablo; Tejos, Cristian
2015-03-01
Additive manufacturing (AM) models are used in medical applications for surgical planning, prosthesis design and teaching. For these applications, the accuracy of the AM models is essential. Unfortunately, this accuracy is compromised due to errors introduced by each of the building steps: image acquisition, segmentation, triangulation, printing and infiltration. However, the contribution of each step to the final error remains unclear. We performed a sensitivity analysis comparing errors obtained from a reference with those obtained modifying parameters of each building step. Our analysis considered global indexes to evaluate the overall error, and local indexes to show how this error is distributed along the surface of the AM models. Our results show that the standard building process tends to overestimate the AM models, i.e. models are larger than the original structures. They also show that the triangulation resolution and the segmentation threshold are critical factors, and that the errors are concentrated at regions with high curvatures. Errors could be reduced choosing better triangulation and printing resolutions, but there is an important need for modifying some of the standard building processes, particularly the segmentation algorithms.
RNA STRAND: The RNA Secondary Structure and Statistical Analysis Database
Andronescu, Mirela; Bereg, Vera; Hoos, Holger H; Condon, Anne
2008-01-01
Background The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. Results In this paper we describe RNA STRAND – the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to what extent and with what probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. Conclusion RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at . PMID:18700982
Use of statistical analysis in the biomedical informatics literature.
Scotch, Matthew; Duggal, Mona; Brandt, Cynthia; Lin, Zhenqui; Shiffman, Richard
2010-01-01
Statistics is an essential aspect of biomedical informatics. To examine the use of statistics in informatics research, a literature review of recent articles in two high-impact-factor biomedical informatics journals, the Journal of the American Medical Informatics Association (JAMIA) and the International Journal of Medical Informatics, was conducted. The use of statistical methods in each paper was examined. Articles of original investigations from 2000 to 2007 were reviewed. For each journal, the results were analyzed by statistical method: descriptive, elementary, multivariable, other regression, machine learning, and other statistics. For both journals, descriptive statistics were most often used. Elementary statistics such as t tests, chi-square tests, and Wilcoxon tests were much more frequent in JAMIA, while machine learning approaches such as decision trees and support vector machines were similar in occurrence across the journals. Also, the use of diagnostic statistics such as sensitivity, specificity, precision, and recall was more frequent in JAMIA. These results highlight the use of statistics in informatics and the need for biomedical informatics scientists to have, as a minimum, proficiency in descriptive and elementary statistics.
ERIC Educational Resources Information Center
Petocz, Agnes; Newbery, Glenn
2010-01-01
Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…
A statistical analysis of the impact of advertising signs on road safety.
Yannis, George; Papadimitriou, Eleonora; Papantoniou, Panagiotis; Voulgari, Chrisoula
2013-01-01
This research aims to investigate the impact of advertising signs on road safety. An exhaustive review of the international literature was carried out on the effect of advertising signs on driver behaviour and safety. Moreover, a before-and-after statistical analysis with control groups was applied to several road sites with different characteristics in the Athens metropolitan area, in Greece, in order to investigate the correlation between the placement or removal of advertising signs and the related occurrence of road accidents. Road accident data for the 'before' and 'after' periods at the test sites and the control sites were extracted from the database of the Hellenic Statistical Authority, and the selected 'before' and 'after' periods vary from 2.5 to 6 years. The statistical analysis shows no statistical correlation between road accidents and advertising signs at any of the nine sites examined, as the confidence intervals of the estimated safety effects are non-significant at the 95% confidence level. This can be explained by the fact that, at the examined road sites, drivers are overloaded with information (traffic signs, direction signs, shop labels, pedestrians and other vehicles, etc.), so the additional information load from advertising signs may not further distract them.
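A before-and-after comparison with a control group can be sketched as a ratio of accident-count ratios with a 95% confidence interval on the log scale; the effect is "non-significant" exactly when that interval contains 1. The accident counts below are invented for illustration, not values from the Hellenic database.

```python
import math

def effect_ratio_ci(treat_before, treat_after, ctrl_before, ctrl_after):
    """Ratio of (after/before) at the test site vs the control site,
    with an approximate 95% CI on the log scale (Poisson-count assumption)."""
    est = (treat_after / treat_before) / (ctrl_after / ctrl_before)
    se = math.sqrt(1 / treat_before + 1 / treat_after
                   + 1 / ctrl_before + 1 / ctrl_after)
    lo = math.exp(math.log(est) - 1.96 * se)
    hi = math.exp(math.log(est) + 1.96 * se)
    return est, lo, hi

# Illustrative counts: test site 20 -> 18 accidents, control site 35 -> 30
est, lo, hi = effect_ratio_ci(20, 18, 35, 30)
significant = not (lo <= 1.0 <= hi)
```

With counts of this size the interval easily straddles 1, mirroring the abstract's finding that none of the nine sites showed a significant effect.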
Statistical analysis of mission profile parameters of civil transport airplanes
NASA Technical Reports Server (NTRS)
Buxbaum, O.
1972-01-01
The statistical analysis of flight times as well as airplane gross weights and fuel weights of jet-powered civil transport airplanes has shown that the distributions of their frequency of occurrence per flight can be presented approximately in general form. Before these results may be used during the project stage of an airplane to define a typical mission profile (the parameters of which are assumed to occur, for example, with a probability of 50 percent), however, the following points have to be taken into account. Because the individual airplanes were rotated during service, the scatter between the distributions of mission profile parameters for airplanes of the same type, which were flown with similar payload, has proven to be very small. Significant deviations from the generalized distributions may occur if an operator uses one airplane preferably on one or two specific routes. Another reason for larger deviations could be that the maintenance services of the operators of the observed airplanes are not representative of other airlines. Although there are indications that this is unlikely, similar information should be obtained from other operators. Such information would improve the reliability of the data.
Statistical Analysis of Resistivity Anomalies Caused by Underground Caves
NASA Astrophysics Data System (ADS)
Frid, V.; Averbach, A.; Frid, M.; Dudkinski, D.; Liskevich, G.
2017-03-01
Geophysical prospecting for underground caves on a construction site is often still a challenging procedure. Estimating the likelihood level of an anomaly that has been found is frequently a mandatory requirement of a project principal, owing to the necessity of risk/safety assessment. However, the methodology for such estimation has not hitherto been developed. Aiming to put forward such a methodology, the present study (performed as part of the mapping of underground caves prior to land development on the site area) applied electrical resistivity tomography (ERT) together with statistical analysis for the likelihood assessment of the underground anomalies located. The methodology was first verified via a synthetic modeling technique, applied to the in situ ERT data, and then cross-referenced with intrusive investigations (excavation and drilling) for verification. The drilling/excavation results showed that underground caves can be properly discovered if the anomaly probability level is not lower than 90%. Such a probability value was shown to be consistent with the modeling results. More than 30 underground cavities were discovered on the site using this methodology.
Statistical Analysis of the AIAA Drag Prediction Workshop CFD Solutions
NASA Technical Reports Server (NTRS)
Morrison, Joseph H.; Hemsch, Michael J.
2007-01-01
The first AIAA Drag Prediction Workshop (DPW), held in June 2001, evaluated the results from an extensive N-version test of a collection of Reynolds-Averaged Navier-Stokes CFD codes. The code-to-code scatter was more than an order of magnitude larger than desired for design and experimental validation of cruise conditions for a subsonic transport configuration. The second AIAA Drag Prediction Workshop, held in June 2003, emphasized the determination of installed pylon-nacelle drag increments and grid refinement studies. The code-to-code scatter was significantly reduced compared to the first DPW, but still larger than desired. However, grid refinement studies showed no significant improvement in code-to-code scatter with increasing grid refinement. The third AIAA Drag Prediction Workshop, held in June 2006, focused on the determination of installed side-of-body fairing drag increments and grid refinement studies for clean attached flow on wing alone configurations and for separated flow on the DLR-F6 subsonic transport model. This report compares the transonic cruise prediction results of the second and third workshops using statistical analysis.
Parameterization of 3D brain structures for statistical shape analysis
NASA Astrophysics Data System (ADS)
Zhu, Litao; Jiang, Tianzi
2004-05-01
Statistical Shape Analysis (SSA) is a powerful tool for noninvasive studies of the pathophysiology and diagnosis of brain diseases. It also provides a shape constraint for the segmentation of brain structures. There are two key problems in SSA: the representation of shapes and their alignment. The widely used parameterized representations are obtained by preserving angles or areas, and shapes are aligned by rotating the parameter net. However, representations preserving angles or areas do not really guarantee the anatomical correspondence of brain structures. In this paper, we incorporate shape-based landmarks into the parameterization of banana-like 3D brain structures to address this problem. First, we obtain the triangulated surface of the object and extract two landmarks from the mesh, i.e., the ends of the banana-like object. The surface is then parameterized by creating a continuous and bijective mapping from the surface to a spherical surface based on a heat conduction model. The correspondence of shapes is achieved by mapping the two landmarks to the north and south poles of the sphere and using an extracted origin orientation to select the dateline during parameterization. We apply our approach to the parameterization of the lateral ventricle, and a multi-resolution shape representation is obtained using the Discrete Fourier Transform.
Statistical shape analysis of subcortical structures using spectral matching.
Shakeri, Mahsa; Lombaert, Herve; Datta, Alexandre N; Oser, Nadine; Létourneau-Guillon, Laurent; Lapointe, Laurence Vincent; Martin, Florence; Malfait, Domitille; Tucholka, Alan; Lippé, Sarah; Kadoury, Samuel
2016-09-01
Morphological changes of subcortical structures are often predictive of neurodevelopmental and neurodegenerative diseases, such as Alzheimer's disease and schizophrenia. Hence, methods for quantifying morphological variations in the brain anatomy, including groupwise shape analyses, are becoming increasingly important for studying neurological disorders. In this paper, a novel groupwise shape analysis approach is proposed to detect regional morphological alterations in subcortical structures between two study groups, e.g., healthy and pathological subjects. The proposed scheme extracts smoothed triangulated surface meshes from segmented binary maps, and establishes reliable point-to-point correspondences among the population of surfaces using a spectral matching method. Mean curvature features are incorporated in the matching process, in order to increase the accuracy of the established surface correspondence. The mean shapes are created as the geometric mean of all surfaces in each group, and a distance map between these shapes is used to characterize the morphological changes between the two study groups. The resulting distance map is further analyzed to check for statistically significant differences between two populations. The performance of the proposed framework is evaluated on two separate subcortical structures (hippocampus and putamen). Furthermore, the proposed methodology is validated in a clinical application for detecting abnormal subcortical shape variations in Alzheimer's disease. Experimental results show that the proposed method is comparable to state-of-the-art algorithms, has less computational cost, and is more sensitive to small morphological variations in patients with neuropathologies.
High Statistics Analysis of Nucleon form Factor in Lattice QCD
NASA Astrophysics Data System (ADS)
Shintani, Eigo; Wittig, Hartmut
We systematically study the effect of excited-state contamination on the signal of the nucleon axial, (iso-)scalar, and tensor charges, extracted from three-point functions with various sets of source-sink separations. To enhance the statistics to O(10,000) measurements, we use the all-mode-averaging technique, approximating the observable with an optimized size of the local deflation field and block size of the Schwarz alternating procedure to reduce the computational cost. The numerical study covers source-sink separations (ts) from 0.8 fm to more than 1.5 fm at several cutoff scales (a-1 = 3-4 GeV) and pion masses (mπ = 0.19-0.45 GeV), keeping the volume at mπL > 4, on Nf = 2 Wilson-clover fermion configurations generated by the Mainz-CLS group. We find that in the measurement of the axial charge a significant effect of unsuppressed excited-state contamination appears below ts = 1.2 fm even in the light-pion region, whereas the effect is small for the scalar and tensor charges. In the analysis using ts > 1.5 fm, the axial charge approaches the experimental result near the physical point.
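The bias-correction structure of all-mode averaging can be sketched with synthetic numbers (not lattice data): a cheap approximate observable is averaged over many source positions, and the exact-minus-approximate difference at a single source restores unbiasedness. All quantities below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: each configuration i has a true value t_i; a measurement
# at one source position adds noise; the cheap approximate solver adds a
# small per-configuration error eps_i.
n_cfg, n_src = 400, 32
t = rng.normal(1.0, 0.2, n_cfg)                   # per-config signal
noise = rng.normal(0.0, 1.0, (n_cfg, n_src))      # source-position noise
eps = rng.normal(0.0, 0.05, n_cfg)                # approximation error

exact_one = t + noise[:, 0]                       # expensive: 1 source/config
approx_all = t[:, None] + noise + eps[:, None]    # cheap: many sources/config

# AMA estimator: average the cheap approximation over all sources, then
# correct the bias with the exact-minus-approximate difference at one source.
ama = approx_all.mean(axis=1) + (exact_one - approx_all[:, 0])

print("plain std error:", exact_one.std(ddof=1) / np.sqrt(n_cfg))
print("AMA   std error:", ama.std(ddof=1) / np.sqrt(n_cfg))
```

The correction term cancels eps_i exactly, so the estimator is unbiased by construction while inheriting the small source-averaged noise of the cheap measurements.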
Statistical analysis of barrier isolator/glovebox glove failure.
Park, Young H; Pines, E; Ofouku, M; Cournoyer, M E
2007-01-01
In response to new, stricter safety requirements set out by the federal government, compounding pharmacists are investigating applications and processes appropriate for their facilities. One application, currently used by many industries, was developed by Los Alamos National Laboratory for defense work. A barrier isolator or "glovebox" is a containment device that allows work within a sealed space while providing protection for people and the environment. Though knowledge of glovebox use and maintenance has grown, unplanned breaches (e.g., glove failures) remain a concern. Recognizing that effective maintenance procedures can minimize breaches, we analyzed data drawn from glove failure records of Los Alamos National Laboratory's Nuclear Materials Technology Division to evaluate current inventory strategy in light of actual performance of the various types of gloves. This report includes a description of the statistical methods employed. The results of our analysis pinpointed the most frequently occurring causes of glove failure and revealed a significant mismatch between the current glove replacement schedule and the actual rate of glove failures, which occur over a much shorter period. We concluded that, to minimize unplanned breaches, either the replacement period needs to be adjusted or causes of failure eliminated or reduced.
NASA Astrophysics Data System (ADS)
Kreinovich, Vladik; Longpre, Luc; Starks, Scott A.; Xiang, Gang; Beck, Jan; Kandathi, Raj; Nayak, Asis; Ferson, Scott; Hajagos, Janos
2007-02-01
In many areas of science and engineering, it is desirable to estimate statistical characteristics (mean, variance, covariance, etc.) under interval uncertainty. For example, we may want to use the measured values x(t) of a pollution level in a lake at different moments of time to estimate the average pollution level; however, we do not know the exact values x(t)--e.g., if one of the measurement results is 0, this simply means that the actual (unknown) value of x(t) can be anywhere between 0 and the detection limit (DL). We must, therefore, modify the existing statistical algorithms to process such interval data. Such a modification is also necessary to process data from statistical databases, where, in order to maintain privacy, we only keep interval ranges instead of the actual numeric data (e.g., a salary range instead of the actual salary). Most resulting computational problems are NP-hard--which means, crudely speaking, that in general, no computationally efficient algorithm can solve all particular cases of the corresponding problem. In this paper, we overview practical situations in which computationally efficient algorithms exist: e.g., situations when measurements are very accurate, or when all the measurements are done with one (or few) instruments. As a case study, we consider a practical problem from bioinformatics: to discover the genetic difference between the cancer cells and the healthy cells, we must process the measurement results and find the concentrations c and h of a given gene in cancer and in healthy cells. This is a particular case of a general situation in which, to estimate states or parameters which are not directly accessible by measurements, we must solve a system of equations in which coefficients are only known with interval uncertainty. We show that in general, this problem is NP-hard, and we describe new efficient algorithms for solving this problem in practically important situations.
Bayesian Analysis of Order-Statistics Models for Ranking Data.
ERIC Educational Resources Information Center
Yu, Philip L. H.
2000-01-01
Studied the order-statistics models, extending the usual normal order-statistics model into one in which the underlying random variables followed a multivariate normal distribution. Used a Bayesian approach and the Gibbs sampling technique. Applied the proposed method to analyze presidential election data from the American Psychological…
The Higher Education System in Israel: Statistical Abstract and Analysis.
ERIC Educational Resources Information Center
Herskovic, Shlomo
This edition of a statistical abstract published every few years on the higher education system in Israel presents the most recent data available through 1990-91. The data were gathered through the cooperation of the Central Bureau of Statistics and institutions of higher education. Chapter 1 presents a summary of principal findings covering the…
ERIC Educational Resources Information Center
Mascaró, Maite; Sacristán, Ana Isabel; Rufino, Marta M.
2016-01-01
For the past 4 years, we have been involved in a project that aims to enhance the teaching and learning of experimental analysis and statistics, of environmental and biological sciences students, through computational programming activities (using R code). In this project, through an iterative design, we have developed sequences of R-code-based…
Using the statistical analysis method to assess the landslide susceptibility
NASA Astrophysics Data System (ADS)
Chan, Hsun-Chuan; Chen, Bo-An; Wen, Yo-Ting
2015-04-01
This study assessed the landslide susceptibility in the Jing-Shan River upstream watershed, central Taiwan. The landslide inventories during typhoons Toraji in 2001, Mindulle in 2004, Kalmaegi and Sinlaku in 2008, Morakot in 2009, and the 0719 rainfall event in 2011, which were established by the Taiwan Central Geological Survey, were used as landslide data. This study assessed the landslide susceptibility using different statistical methods: logistic regression, the instability index method, and support vector machine (SVM). After evaluation, elevation, slope, slope aspect, lithology, terrain roughness, slope roughness, plan curvature, profile curvature, total curvature, and average rainfall were chosen as the landslide factors. The validity of the three established models was further examined by the receiver operating characteristic curve. Logistic regression showed that terrain roughness and slope roughness had the strongest impact on the susceptibility value, while the instability index method identified terrain roughness and lithology as the strongest factors. The instability index method may, however, underestimate susceptibility near the river side. In addition, the instability index method raises a potential issue with the number of factor classes: too many classes may cause an excessive variation coefficient of the factor, while too few may lump a large range of nearby cells into the same susceptibility level. Finally, the receiver operating characteristic curve was used to discriminate among the three models. SVM was the preferred method for assessing landslide susceptibility, and logistic regression performed nearly as well in recognizing the medium-high and high susceptibility levels.
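Comparing models by the receiver operating characteristic curve reduces, for the area under the curve, to the Mann-Whitney statistic: the probability that a randomly chosen landslide cell outscores a randomly chosen stable cell. The sketch below uses invented susceptibility scores, not the study's data, and the model labels are purely illustrative.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve as P(score_pos > score_neg),
    i.e. the Mann-Whitney statistic (ties count one half)."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy susceptibility scores for the same cells from two hypothetical models
# (1 = observed landslide cell, 0 = stable cell); values are illustrative.
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
model_a = np.array([0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1])
model_b = np.array([0.9, 0.5, 0.4, 0.3, 0.8, 0.7, 0.6, 0.2])

print("model A AUC:", auc(model_a, labels))   # ranks landslide cells higher
print("model B AUC:", auc(model_b, labels))   # barely better than chance
```

A model with AUC near 0.5 is no better than random ranking; the higher-AUC model discriminates landslide from stable cells more reliably.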
On the Statistical Analysis of X-Ray Polarization Measurements
NASA Astrophysics Data System (ADS)
Strohmayer, T. E.; Kallman, T. R.
2013-08-01
In many polarimetry applications, including observations in the X-ray band, the measurement of a polarization signal can be reduced to the detection and quantification of a deviation from uniformity of a distribution of measured angles of the form A + B cos 2(φ - φ0) (0 < φ < π). We explore the statistics of such polarization measurements using Monte Carlo simulations and χ² fitting methods. We compare our results to those derived using the traditional probability density used to characterize polarization measurements and quantify how they deviate as the intrinsic modulation amplitude grows. We derive relations for the number of counts required to reach a given detection level (parameterized by β, the "number of σ's" of the measurement) appropriate for measuring the modulation amplitude a by itself (single interesting parameter case) or jointly with the position angle φ0 (two interesting parameters case). We show that for the former case, when the intrinsic amplitude is equal to the well-known minimum detectable polarization (MDP), it is, on average, detected at the 3σ level. For the latter case, when one requires a joint measurement at the same confidence level, more counts are needed than were required to achieve the MDP level. This additional factor is amplitude-dependent, but is ≈2.2 for intrinsic amplitudes less than about 20%. It decreases slowly with amplitude and is ≈1.8 when the amplitude is 50%. We find that the position angle uncertainty at 1σ confidence is well described by the relation σφ = 28.5°/β.
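The quantities involved can be illustrated with a small Monte Carlo sketch (not the authors' code): angles are drawn from the modulation curve by rejection sampling, and the amplitude and position angle are recovered from Stokes-like averages, using E[2 cos 2φ] = a cos 2φ0 and E[2 sin 2φ] = a sin 2φ0.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_angles(n, a, phi0):
    """Draw photon angles in (0, pi) from 1 + a*cos(2*(phi - phi0))
    by rejection sampling (the envelope is the constant 1 + a)."""
    out = np.empty(0)
    while out.size < n:
        phi = rng.uniform(0, np.pi, 4 * n)
        keep = rng.uniform(0, 1 + a, phi.size) < 1 + a * np.cos(2 * (phi - phi0))
        out = np.concatenate([out, phi[keep]])
    return out[:n]

# Recover the modulation amplitude and position angle from the samples.
phi = sample_angles(200_000, a=0.30, phi0=np.pi / 4)
q, u = 2 * np.cos(2 * phi).mean(), 2 * np.sin(2 * phi).mean()
a_hat = np.hypot(q, u)
phi0_hat = 0.5 * np.arctan2(u, q)
print(f"a_hat = {a_hat:.3f}, phi0_hat = {phi0_hat:.3f} rad")
```

With 200,000 counts the recovered amplitude and angle land close to the inputs (a = 0.30, φ0 = π/4 ≈ 0.785), illustrating how the per-count scatter shrinks as 1/√N.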
Lamart, Stephanie; Griffiths, Nina M; Tchitchek, Nicolas; Angulo, Jaime F; Van der Meeren, Anne
2017-03-01
The aim of this work was to develop a computational tool that integrates several statistical analysis features for biodistribution data from internal contamination experiments. These data represent actinide levels in biological compartments as a function of time and are derived from activity measurements in tissues and excreta. These experiments aim at assessing the influence of different contamination conditions (e.g. intake route or radioelement) on the biological behavior of the contaminant. The ever increasing number of datasets and diversity of experimental conditions make the handling and analysis of biodistribution data difficult. This work sought to facilitate the statistical analysis of a large number of datasets and the comparison of results from diverse experimental conditions. Functional modules were developed using the open-source programming language R to facilitate specific operations: descriptive statistics, visual comparison, curve fitting, and implementation of biokinetic models. In addition, the structure of the datasets was harmonized using the same table format. Analysis outputs can be written in text files and updated data can be written in the consistent table format. Hence, a data repository is built progressively, which is essential for the optimal use of animal data. Graphical representations can be automatically generated and saved as image files. The resulting computational tool was applied using data derived from wound contamination experiments conducted under different conditions. In facilitating biodistribution data handling and statistical analyses, this computational tool ensures faster analyses and a better reproducibility compared with the use of multiple office software applications. Furthermore, re-analysis of archival data and comparison of data from different sources is made much easier. Hence this tool will help to understand better the influence of contamination characteristics on actinide biokinetics. Our approach can aid
Combined statistical analysis of landslide release and propagation
NASA Astrophysics Data System (ADS)
Mergili, Martin; Rohmaneo, Mohammad; Chu, Hone-Jay
2016-04-01
Statistical methods - often coupled with stochastic concepts - are commonly employed to relate areas affected by landslides with environmental layers, and to estimate spatial landslide probabilities by applying these relationships. However, such methods only concern the release of landslides, disregarding their motion. Conceptual models for mass flow routing are used for estimating landslide travel distances and possible impact areas. Automated approaches combining release and impact probabilities are rare. The present work attempts to fill this gap by a fully automated procedure combining statistical and stochastic elements, building on the open source GRASS GIS software: (1) The landslide inventory is subset into release and deposition zones. (2) We employ a traditional statistical approach to estimate the spatial release probability of landslides. (3) We back-calculate the probability distribution of the angle of reach of the observed landslides, employing the software tool r.randomwalk. One set of random walks is routed downslope from each pixel defined as release area. Each random walk stops when leaving the observed impact area of the landslide. (4) The cumulative probability function (cdf) derived in (3) is used as input to route a set of random walks downslope from each pixel in the study area through the DEM, assigning the probability gained from the cdf to each pixel along the path (impact probability). The impact probability of a pixel is defined as the average impact probability of all sets of random walks impacting a pixel. Further, the average release probabilities of the release pixels of all sets of random walks impacting a given pixel are stored along with the area of the possible release zone. (5) We compute the zonal release probability by increasing the release probability according to the size of the release zone - the larger the zone, the larger the probability that a landslide will originate from at least one pixel within this zone. We
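Step (5) above can be illustrated with a one-line aggregation rule. The sketch below assumes independent pixels, which may differ from the exact rule used in r.randomwalk; the per-pixel probabilities are invented for illustration.

```python
import numpy as np

# Per-pixel release probabilities for a hypothetical release zone
# (illustrative values, not from the paper's study area).
p_pixel = np.array([0.02, 0.05, 0.01, 0.03])

# Probability that at least one pixel in the zone releases, assuming
# independent pixels: 1 minus the probability that none of them does.
p_zone = 1.0 - np.prod(1.0 - p_pixel)
print(f"zonal release probability: {p_zone:.4f}")
```

The larger the zone, the more factors enter the product and the higher the zonal probability, which is exactly the monotonic behavior the text describes.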
Parallelization of the Physical-Space Statistical Analysis System (PSAS)
NASA Technical Reports Server (NTRS)
Larson, J. W.; Guo, J.; Lyster, P. M.
1999-01-01
Atmospheric data assimilation is a method of combining observations with model forecasts to produce a more accurate description of the atmosphere than the observations or forecast alone can provide. Data assimilation plays an increasingly important role in the study of climate and atmospheric chemistry. The NASA Data Assimilation Office (DAO) has developed the Goddard Earth Observing System Data Assimilation System (GEOS DAS) to create assimilated datasets. The core computational components of the GEOS DAS include the GEOS General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). The need for timely validation of scientific enhancements to the data assimilation system poses computational demands that are best met by distributed parallel software. PSAS is implemented in Fortran 90 using object-based design principles. The analysis portions of the code solve two equations. The first of these is the "innovation" equation, which is solved on the unstructured observation grid using a preconditioned conjugate gradient (CG) method. The "analysis" equation is a transformation from the observation grid back to a structured grid, and is solved by a direct matrix-vector multiplication. Use of a factored-operator formulation reduces the computational complexity of both the CG solver and the matrix-vector multiplication, rendering the matrix-vector multiplications as a successive product of operators on a vector. Sparsity is introduced to these operators by partitioning the observations using an icosahedral decomposition scheme. PSAS builds a large (approx. 128MB) run-time database of parameters used in the calculation of these operators. Implementing a message passing parallel computing paradigm into an existing yet developing computational system as complex as PSAS is nontrivial. One of the technical challenges is balancing the requirements for computational reproducibility with the need for high performance. The problem of computational
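The innovation-equation solver described above is a preconditioned conjugate gradient method. A generic numpy sketch follows, with a Jacobi (diagonal) preconditioner standing in for whatever PSAS actually uses, and a small random SPD system standing in for the innovation matrix.

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradient for an SPD matrix A;
    M_inv is an approximation to A^-1 applied to the residual."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv @ r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Small SPD test system (illustrative only) with a Jacobi preconditioner.
rng = np.random.default_rng(2)
B = rng.normal(size=(50, 50))
A = B @ B.T + 50 * np.eye(50)
b = rng.normal(size=50)
x = pcg(A, b, np.diag(1.0 / np.diag(A)))
print("residual norm:", np.linalg.norm(A @ x - b))
```

In the real system the operator A is never formed explicitly; the factored-operator formulation described in the abstract replaces the dense `A @ p` products with a chain of cheaper operator applications.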
Regression analysis of mixed recurrent-event and panel-count data with additive rate models.
Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L
2015-03-01
Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study.
A deterministic and statistical energy analysis of tyre cavity resonance noise
NASA Astrophysics Data System (ADS)
Mohamed, Zamri; Wang, Xu
2016-03-01
Tyre cavity resonance was studied using a combination of deterministic analysis and statistical energy analysis: the deterministic part was implemented with the impedance compact mobility matrix method, and the statistical part with the statistical energy analysis method. While the impedance compact mobility matrix method offers a deterministic solution for the cavity pressure response and the compliant wall vibration velocity response in the low frequency range, the statistical energy analysis method offers a statistical solution for the responses in the high frequency range. In the mid frequency range, a combination of the two methods can identify system coupling characteristics. Both methods were compared with results from commercial software packages in order to validate the results. The combined analysis result was verified by measurements on a physical tyre-cavity model. The analysis method developed in this study can be applied to other similar toroidal structural-acoustic systems.
AMA Statistical Information Based Analysis of a Compressive Imaging System
NASA Astrophysics Data System (ADS)
Hope, D.; Prasad, S.
We present an information-based analysis of a compressive imaging system, using a new, highly efficient and robust method that enables us to evaluate statistical entropies. Our method is based on the notion of density of states (DOS), which plays a major role in statistical mechanics by allowing one to express macroscopic thermal averages in terms of the number of configuration states of a system for a certain energy level. Instead of computing the number of states at a certain energy level, however, we compute the number of possible configurations (states) of a particular image scene that correspond to a certain probability value. This allows us to compute the probability for each possible state, or configuration, of the scene being imaged. We assess the performance of a single pixel compressive imaging system based on the amount of information encoded and transmitted in parameters that characterize the information in the scene. Amongst many examples, we study the problem of faint companion detection. Here, we show how information in the recorded images depends on the choice of basis for representing the scene and the amount of measurement noise. The noise creates confusion when associating a recorded image with the correct member of the ensemble that produced the image. We show that multiple measurements enable one to mitigate this confusion noise.
Navy Additive Manufacturing: Policy Analysis for Future DLA Material Support
2014-12-01
Subject terms: additive manufacturing, 3D printing, technology adoption. ...this is about to change. Additive manufacturing (AM) systems (commonly known as "3D printing") could bring the organic parts manufacturing capability
SUBMILLIMETER NUMBER COUNTS FROM STATISTICAL ANALYSIS OF BLAST MAPS
Patanchon, Guillaume; Ade, Peter A. R.; Griffin, Matthew; Hargrave, Peter C.; Mauskopf, Philip; Moncelsi, Lorenzo; Pascale, Enzo; Bock, James J.; Chapin, Edward L.; Halpern, Mark; Marsden, Gaelen; Scott, Douglas; Devlin, Mark J.; Dicker, Simon R.; Klein, Jeff; Rex, Marie; Gundersen, Joshua O.; Hughes, David H.; Netterfield, Calvin B.; Olmi, Luca
2009-12-20
We describe the application of a statistical method to estimate submillimeter galaxy number counts from confusion-limited observations by the Balloon-borne Large Aperture Submillimeter Telescope (BLAST). Our method is based on a maximum likelihood fit to the pixel histogram, sometimes called 'P(D)', an approach which has been used before to probe faint counts, the difference being that here we advocate its use even for sources with relatively high signal-to-noise ratios. This method has an advantage over standard techniques of source extraction in providing an unbiased estimate of the counts from the bright end down to flux densities well below the confusion limit. We specifically analyze BLAST observations of a roughly 10 deg² map centered on the Great Observatories Origins Deep Survey South field. We provide estimates of number counts at the three BLAST wavelengths 250, 350, and 500 μm; instead of counting sources in flux bins we estimate the counts at several flux density nodes connected with power laws. We observe a generally very steep slope for the counts of about -3.7 at 250 μm, and -4.5 at 350 and 500 μm, over the range ≈0.02-0.5 Jy, breaking to a shallower slope below about 0.015 Jy at all three wavelengths. We also describe how to estimate the uncertainties and correlations in this method so that the results can be used for model-fitting. This method should be well suited for analysis of data from the Herschel satellite.
NASA Technical Reports Server (NTRS)
Djorgovski, George
1993-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resource.
NASA Technical Reports Server (NTRS)
Djorgovski, Stanislav
1992-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
Statistical Design, Models and Analysis for the Job Change Framework.
ERIC Educational Resources Information Center
Gleser, Leon Jay
1990-01-01
Proposes statistical methodology for testing Loughead and Black's "job change thermostat." Discusses choice of target population; relationship between job satisfaction and values, perceptions, and opportunities; and determinants of job change. (SK)
Unbiased statistical analysis for multi-stage proteomic search strategies.
Everett, Logan J; Bierl, Charlene; Master, Stephen R
2010-02-05
"Multi-stage" search strategies have become widely accepted for peptide identification and are implemented in a number of available software packages. We describe limitations of these strategies for validation and decoy-based statistical analyses and demonstrate these limitations using a set of control sample spectra. We propose a solution that corrects the statistical deficiencies and describe its implementation using the open-source software X!Tandem.
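The decoy-based statistics at issue can be illustrated with the standard target-decoy estimate, where the FDR at a score threshold is the number of decoy hits above it divided by the number of target hits above it. The scores below are invented; the paper's point is precisely that multi-stage searches can distort this kind of estimate if applied naively.

```python
import numpy as np

# Hypothetical peptide-spectrum match scores (higher is better). Decoy hits
# estimate the null distribution of incorrect matches.
target = np.array([9.1, 8.4, 7.9, 7.5, 6.8, 6.1, 5.9, 5.2, 4.8, 4.1])
decoy = np.array([6.5, 5.6, 5.0, 4.9, 4.3, 4.0, 3.7, 3.2, 2.8, 2.5])

def fdr_at(t):
    """Target-decoy FDR estimate at score threshold t."""
    return (decoy >= t).sum() / max((target >= t).sum(), 1)

for t in (7.0, 6.0, 5.0):
    print(f"threshold {t}: estimated FDR = {fdr_at(t):.3f}")
```

The estimate is only unbiased if targets and decoys are searched under identical conditions; a second-stage search restricted to a subset of candidates breaks that symmetry, which is the deficiency the authors correct.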
Statistical analysis of synaptic transmission: model discrimination and confidence limits.
Stricker, C; Redman, S; Daley, D
1994-01-01
Procedures for discriminating between competing statistical models of synaptic transmission, and for providing confidence limits on the parameters of these models, have been developed. These procedures were tested against simulated data and were used to analyze the fluctuations in synaptic currents evoked in hippocampal neurones. All models were fitted to data using the Expectation-Maximization algorithm and a maximum likelihood criterion. Competing models were evaluated using the log-likelihood ratio (Wilks statistic). When the competing models were not nested, Monte Carlo sampling of the model used as the null hypothesis (H0) provided density functions against which H0 and the alternate model (H1) were tested. The statistic for the log-likelihood ratio was determined from the fit of H0 and H1 to these probability densities. This statistic was used to determine the significance level at which H0 could be rejected for the original data. When the competing models were nested, log-likelihood ratios and the χ² statistic were used to determine the confidence level for rejection. Once the model that provided the best statistical fit to the data was identified, many estimates for the model parameters were calculated by resampling the original data. Bootstrap techniques were then used to obtain the confidence limits of these parameters. PMID:7948672
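The final bootstrap step can be sketched generically (with synthetic amplitudes, not the hippocampal recordings): resample the data with replacement, re-estimate the parameter each time, and read the confidence limits off the percentiles of the resampled estimates.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic "evoked current amplitudes" (arbitrary units), standing in for
# the data to which the best-fitting model's parameter was fitted.
amps = rng.normal(120.0, 25.0, 200)

# Percentile bootstrap: resample with replacement, re-estimate the parameter
# (here simply the mean), and take the 2.5% / 97.5% quantiles.
boot = np.array([rng.choice(amps, amps.size, replace=True).mean()
                 for _ in range(5000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"mean = {amps.mean():.1f}, 95% bootstrap CI = [{lo:.1f}, {hi:.1f}]")
```

The same recipe applies to any fitted parameter (quantal size, release probability, etc.): only the estimator inside the resampling loop changes.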
Hybrid Additive Manufacturing Technologies - An Analysis Regarding Potentials and Applications
NASA Astrophysics Data System (ADS)
Merklein, Marion; Junker, Daniel; Schaub, Adam; Neubauer, Franziska
The trend toward mass customization of lightweight construction pushes conventional manufacturing processes such as forming and machining to their economic limits; more flexible processes are needed, which additive manufacturing technology provides. This tool-less production principle offers great geometrical freedom and optimized utilization of the material, so that load-adapted lightweight components can be produced economically in small lot sizes. To compensate for disadvantages such as inadequate accuracy and surface roughness, hybrid machines combining additive and subtractive manufacturing are being developed. This paper summarizes the principles of the main additive manufacturing processes for metals and their suitability for integration into a hybrid production machine. In particular, the integration of deposition processes into a CNC milling center holds high potential for manufacturing larger parts with high accuracy. Furthermore, the combination of additive and subtractive manufacturing allows the production of ready-to-use products within a single machine. Current research on integrating additive manufacturing processes into the production chain is also analyzed. Given the long build times of additive processes, combining them with conventional manufacturing processes such as sheet or bulk metal forming appears to be an effective solution: large volumes can be produced conventionally, and active elements can then be applied in an additional production step by additive manufacturing. This principle is also investigated for tool production, to reduce machining of the high-strength materials used for forming tools. The aim is to add active elements onto a geometrically simple base using Laser Metal Deposition, a process that allows the utilization of several powder materials within one process
Liu, Chen; Wu, Xin-wu
2011-04-01
A relationship between waste production and socio-economic factors is essential in waste management. In the present study, the factors influencing municipal solid waste (MSW) generation in China were investigated by multivariate statistical analysis. Twelve items were chosen for investigation: GDP, per capita GDP, urban population, the proportion of urban population, the area of urban construction, the area of paved roads, the area of urban gardens and green areas, the number of large cities, annual per capita disposable income of urban households, annual per capita consumption expenditure of urban households, total energy consumption, and annual per capita consumption for households. Two multivariate statistical methodologies were selected: principal component analysis (PCA) and cluster analysis (CA). Three new dimensions were identified by PCA: component 1, economy and urban development; component 2, energy consumption; and component 3, urban scale. The three components together accounted for 99.1% of the initial variance. The results show that economy and urban development are important items influencing MSW generation. The proportion of urban population and urban population had the highest loadings of all factors. The relationship between growth of gross domestic product (GDP) and production of MSW was not as clear-cut in China as is often assumed, a situation more likely to apply to developed countries. Energy consumption was another factor considered in our study of MSW generation. In addition, the annual variation in MSW quantity was investigated by cluster analysis.
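The PCA workflow described in this abstract (standardizing heterogeneous indicators, extracting components, reading off the variance explained) can be sketched as follows; the synthetic data stand in for the twelve real indicators, and all variable names are ours, not the study's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 12 socio-economic indicators: 30 regions x 12
# variables, generated so that a few latent factors drive most of the variance.
latent = rng.normal(size=(30, 3))
loadings = rng.normal(size=(3, 12))
X = latent @ loadings + 0.05 * rng.normal(size=(30, 12))

# Standardize first: the indicators have incompatible units (GDP, population,
# energy use, ...), so PCA should run on the correlation structure.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via eigendecomposition of the (approximate) correlation matrix
R = np.cov(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # sort components by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
print("variance explained by first 3 components: %.3f" % explained[:3].sum())

# Scores of each region on the three retained components
scores = Z @ eigvecs[:, :3]
```

With three latent factors driving the synthetic data, the first three components absorb nearly all of the variance, mirroring the 99.1% reported in the abstract.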
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include models commonly used in undergraduate statistics courses, such as linear models (simple linear regression, multiple linear regression, one-way and two-way ANOVA). In addition, we implemented tests for sample comparisons: the t-test in the parametric category, and the Wilcoxon rank-sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis-test models, such as contingency tables, Friedman's test, and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), and we hope it will contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model as well as general utilities that can be applied in various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least-squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is ongoing and more functions and tools are being added, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most recent releases.
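The sample-comparison tests listed above (t-test, Wilcoxon rank-sum, Kruskal-Wallis, Friedman, Fisher's exact) all have direct counterparts in scipy.stats; a minimal sketch on synthetic data, useful for cross-checking SOCR output against an independent implementation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, size=40)
b = rng.normal(0.8, 1.0, size=40)
c = rng.normal(0.4, 1.0, size=40)

# Parametric two-sample comparison
t, p_t = stats.ttest_ind(a, b)

# Non-parametric analogues
u, p_w = stats.ranksums(a, b)                  # Wilcoxon rank-sum
h, p_k = stats.kruskal(a, b, c)                # Kruskal-Wallis, three samples
chi2, p_f = stats.friedmanchisquare(a, b, c)   # Friedman (rows treated as
                                               # blocks, purely for illustration)

# Fisher's exact test on a made-up 2x2 contingency table
odds, p_fisher = stats.fisher_exact([[8, 2], [1, 9]])

for name, p in [("t-test", p_t), ("rank-sum", p_w), ("Kruskal-Wallis", p_k),
                ("Friedman", p_f), ("Fisher exact", p_fisher)]:
    print(f"{name}: p = {p:.4f}")
```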
Wang, Youping; Sonntag, Karin; Rudloff, Eicke; Wehling, Peter; Snowdon, Rod J
2006-02-01
Two Brassica napus-Crambe abyssinica monosomic addition lines (2n=39, AACC plus a single chromosome from C. abyssinica) were obtained from the F2 progeny of the asymmetric somatic hybrid. The alien chromosome from C. abyssinica in the addition line was clearly distinguished by genomic in situ hybridization (GISH). Twenty-seven microspore-derived plants from the addition lines were obtained. Fourteen seedlings were determined to be diploid plants (2n=38) arising from spontaneous chromosome doubling, while 13 seedlings were confirmed as haploid plants. Doubled haploid plants produced after treatment with colchicine and two disomic chromosome addition lines (2n=40, AACC plus a single pair of homologous chromosomes from C. abyssinica) could again be identified by GISH analysis. The lines are potentially useful for molecular genetic analysis of novel C. abyssinica genes or alleles contributing to traits relevant for oilseed rape (B. napus) breeding.
A multiple additive regression tree analysis of three exposure measures during Hurricane Katrina.
Curtis, Andrew; Li, Bin; Marx, Brian D; Mills, Jacqueline W; Pine, John
2011-01-01
This paper analyses structural and personal exposure to Hurricane Katrina. Structural exposure is measured by flood height and building damage; personal exposure is measured by the locations of 911 calls made during the response. Using these variables, the paper characterises the geography of exposure and also demonstrates the utility of a robust analytical approach for understanding health-related challenges to disadvantaged populations during recovery. Analysis is conducted using a contemporary statistical approach, multiple additive regression trees (MART), which displays considerable improvement over traditional regression analysis: the improvement in R-squared over standard multiple linear regression ranges from about 62 to more than 100 per cent. The most revealing finding is the modelled verification that African Americans experienced disproportionate exposure in both structural and personal contexts. Given the impact of exposure on health outcomes, this finding has implications for understanding the long-term health challenges facing this population.
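The core idea of MART, an additive ensemble in which each small tree fits the residuals of the ensemble so far, can be sketched with depth-1 trees (stumps) on synthetic data; the data and parameters below are illustrative, not from the Katrina study:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 1-D exposure variable with a nonlinear response that a
# straight line cannot capture
x = rng.uniform(0, 10, size=300)
y = np.sin(x) + 0.1 * rng.normal(size=300)

def fit_stump(x, y):
    """Best single-split (depth-1) regression tree on one feature."""
    best = None
    for s in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = y[x <= s], y[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    _, s, lmean, rmean = best
    return lambda q: np.where(q <= s, lmean, rmean)

def boost(x, y, n_trees=200, lr=0.1):
    """Additive regression trees: each stump fits the current residuals."""
    pred = np.full_like(y, y.mean())
    for _ in range(n_trees):
        stump = fit_stump(x, y - pred)
        pred = pred + lr * stump(x)
    return pred

def r2(y, yhat):
    return 1 - ((y - yhat)**2).sum() / ((y - y.mean())**2).sum()

# Compare against an ordinary linear fit, as the paper does with R-squared
slope, intercept = np.polyfit(x, y, 1)
r2_lin = r2(y, slope * x + intercept)
r2_mart = r2(y, boost(x, y))
print(f"linear R^2 = {r2_lin:.3f}, boosted-stump R^2 = {r2_mart:.3f}")
```

On this nonlinear target the additive tree ensemble improves dramatically over the linear baseline, which is the kind of R-squared gain the abstract reports.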
Fixed-ratio ray designs have been used for detecting and characterizing interactions of large numbers of chemicals in combination. Single chemical dose-response data are used to predict an “additivity curve” along an environmentally relevant ray. A “mixture curve” is estimated fr...
Statistical analysis of large-scale neuronal recording data
Reed, Jamie L.; Kaas, Jon H.
2010-01-01
Relating stimulus properties to the response properties of individual neurons and neuronal networks is a major goal of sensory research. Many investigators implant electrode arrays in multiple brain areas and record from chronically implanted electrodes over time to answer a variety of questions. The technical challenges of analyzing large-scale neuronal recording data are not trivial. Several analysis methods traditionally used by neurophysiologists do not account for dependencies in the data that are inherent in multi-electrode recordings. In addition, when neurophysiological data are not well modeled by the normal distribution and when the variables of interest may not be linearly related, extensions of linear modeling techniques are recommended. A variety of methods exist to analyze correlated data, even when the data are not normally distributed and the relationships are nonlinear. Here we review extensions of the Generalized Linear Model designed to address these data properties. Such methods are used in other research fields, and their application to large-scale neuronal recording data enables investigators to determine which variable properties convincingly contribute to the variance in the observed neuronal measures. Standard measures of neuron properties such as response magnitudes can be analyzed using these methods, as can measures of neuronal network activity such as spike timing correlations. We have done just that in recordings from 100-electrode arrays implanted in the primary somatosensory cortex of owl monkeys. Here we illustrate how one such method, Generalized Estimating Equations analysis, can usefully be applied to large-scale neuronal recordings. PMID:20472395
Statistical Analysis of CMC Constituent and Processing Data
NASA Technical Reports Server (NTRS)
Fornuff, Jonathan
2004-01-01
The ultimate purpose of this study is to determine which variations in material processing can lead to the most critical changes in material properties, observed using statistical analysis software. The work explores, in general, the key properties needed. In this study, SiC/SiC composites of varying architectures, utilizing a boron-nitride (BN)
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: Selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are examined using several medical examples. We present two clustering examples with ordinal variables, a more challenging variable type in analysis, using the Wisconsin Breast Cancer Data (WBCD). In an ordinal-to-interval scale conversion example, a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold-standard groups of malignant and benign cases that had been identified by clinical tests. Results: The sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: By using a clustering algorithm appropriate to the measurement scale of the variables in the study, high performance is achieved. Moreover, descriptive and inferential statistics, as well as the modeling approach, must be selected based on the scale of the variables. PMID:24672565
Statistical analysis of 1D HRR target features
NASA Astrophysics Data System (ADS)
Gross, David C.; Schmitz, James L.; Williams, Robert L.
2000-08-01
Automatic target recognition (ATR) and feature-aided tracking (FAT) algorithms that use one-dimensional (1-D) high range resolution (HRR) profiles require unique or distinguishable target features. This paper explores the use of statistical measures to quantify the separability and stability of ground target features found in HRR profiles. Measures of stability, such as the mean and variance, can be used to determine the stability of a target feature as a function of the target aspect and elevation angle. Statistical measures of feature predictability and separability, such as the Fisher and Bhattacharyya measures, demonstrate the capability to adequately predict the desired target feature over a specified aspect angular region. These statistical measures for separability and stability are explained in detail and their usefulness is demonstrated with measured HRR data.
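For univariate, roughly Gaussian feature distributions of two target classes, the Fisher and Bhattacharyya separability measures named above reduce to simple closed forms; a sketch with synthetic feature samples (the class means and variances are made up, not measured HRR values):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 1-D HRR feature (e.g. a range-bin peak amplitude) for two classes
f1 = rng.normal(5.0, 1.0, size=200)
f2 = rng.normal(7.0, 1.5, size=200)

def fisher_ratio(a, b):
    """Fisher discriminant ratio: between-class over within-class scatter."""
    return (a.mean() - b.mean())**2 / (a.var() + b.var())

def bhattacharyya_gauss(a, b):
    """Bhattacharyya distance between two univariate Gaussian fits."""
    m1, v1 = a.mean(), a.var()
    m2, v2 = b.mean(), b.var()
    return (0.25 * (m1 - m2)**2 / (v1 + v2)
            + 0.5 * np.log((v1 + v2) / (2 * np.sqrt(v1 * v2))))

print(f"Fisher ratio: {fisher_ratio(f1, f2):.3f}")
print(f"Bhattacharyya distance: {bhattacharyya_gauss(f1, f2):.3f}")
```

Both measures are zero for identical distributions and grow as the feature separates the classes, which is what makes them usable as stability/separability scores over aspect and elevation angle.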
Statistical parameters and analysis of local contrast gloss.
Oksman, Antti; Juuti, Mikko; Peiponen, Kai-Erik
2008-08-04
Recently, we introduced a sensor for the detection of local contrast gloss (or luster) of products. This is a new development in contrast gloss measurement, since contrast gloss has previously been measured only over a macroscopic area. As yet, however, there are no statistical parameters for the classification of contrast gloss, as there are, for example, for the classification of surface roughness. In this study, we define novel statistical parameters for the diffuse component and contrast gloss obtained by the sensor for the detection of local contrast gloss. As an example, we utilize these statistical parameters together with measured specular gloss, diffuse-component, and contrast gloss maps in the characterization of prints.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns is a natural line of attack for recognizing DNA sequences in gene identification and similar problems, yet surprisingly little significant work has been done in this direction. This paper studies statistical properties of complete-genome DNA sequences using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and the standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of the periodicity parameters is found in coding versus non-coding sequences, which can be used to distinguish between these parts. Here, DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
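The mapping-plus-Fourier approach can be illustrated with binary indicator sequences and the well-known period-3 bias of coding regions; the toy sequences below are synthetic, not Drosophila data:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy sequences: a "coding-like" sequence with a codon-position bias (period 3)
# and a uniformly random "non-coding" sequence. Both are illustrative only.
bases = np.array(list("ACGT"))
n = 3000
coding = "".join(rng.choice(bases, p=p)
                 for p in ([.4, .2, .2, .2], [.2, .4, .2, .2], [.2, .2, .4, .2]) * (n // 3))
noncoding = "".join(rng.choice(bases) for _ in range(n))

def period3_signal(seq):
    """Binary indicator mapping per base, then Fourier power at frequency 1/3."""
    total = 0.0
    for b in "ACGT":
        u = np.array([c == b for c in seq], dtype=float)
        spec = np.abs(np.fft.fft(u - u.mean()))**2
        total += spec[len(seq) // 3]   # spectral bin corresponding to period 3
    return total / len(seq)

print("coding period-3 power:    %.1f" % period3_signal(coding))
print("noncoding period-3 power: %.1f" % period3_signal(noncoding))
```

The coding-like sequence shows a strong spectral peak at period 3 while the random sequence does not, which is the statistical contrast such gene-identification methods exploit.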
Interfaces between statistical analysis packages and the ESRI geographic information system
NASA Technical Reports Server (NTRS)
Masuoka, E.
1980-01-01
Interfaces between ESRI's geographic information system (GIS) data files and real valued data files written to facilitate statistical analysis and display of spatially referenced multivariable data are described. An example of data analysis which utilized the GIS and the statistical analysis system is presented to illustrate the utility of combining the analytic capability of a statistical package with the data management and display features of the GIS.
NASA Astrophysics Data System (ADS)
Moeck, Christian; Radny, Dirk; Borer, Paul; Rothardt, Judith; Auckenthaler, Adrian; Berg, Michael; Schirmer, Mario
2016-11-01
For a study area in Switzerland where drinking water production lies close to several potential input pathways for contamination, spatial patterns of water types were assessed with a combined approach of multivariate statistical analysis, namely factor analysis (FA) and hierarchical cluster analysis (HCA), interpretation of geochemical processes, stable water isotope data, and organic micropollutants. To avoid drinking water contamination, artificial groundwater recharge with surface water into an aquifer is used to create a hydraulic barrier between the potential contamination pathways and the drinking water extraction wells. Inter-aquifer mixing is identified in the subsurface, where a large amount of artificially infiltrated surface water mixes with a smaller amount of water originating from the regional flow pathway in the vicinity of the drinking water extraction wells. The spatial distribution of the different water types can be estimated and a conceptual system understanding developed. Results of the multivariate statistical analysis are consistent with the information gained from the isotopic data and the organic micropollutant analyses. The integrated approach, using different kinds of observations, can easily be transferred to a variety of hydrological settings to synthesise and evaluate large hydrochemical datasets; combining additional data with different information content is conceivable and enables effective interpretation of hydrological processes. The applied approach leads to a sounder conceptual system understanding, the very basis for developing improved, sustainable water resources management practices.
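The HCA half of the FA/HCA workflow can be sketched with scipy's hierarchical clustering on synthetic hydrochemical samples; the ion columns and concentrations below are invented for illustration and are not from the Swiss study:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(5)

# Hypothetical samples from three water types with distinct ionic signatures
# (columns could be, e.g., Ca, Mg, Na, Cl, SO4 concentrations; values synthetic)
centers = np.array([[80, 20, 10, 15, 40],
                    [30, 10, 60, 70, 20],
                    [50, 40, 25, 30, 90]], dtype=float)
X = np.vstack([c + rng.normal(0, 2, size=(20, 5)) for c in centers])

# Standardize, then Ward-linkage HCA, a common choice for hydrochemical data
Z = (X - X.mean(axis=0)) / X.std(axis=0)
tree = linkage(Z, method="ward")
labels = fcluster(tree, t=3, criterion="maxclust")

# With well-separated water types, each block of 20 samples forms one cluster
print([set(labels[i * 20:(i + 1) * 20]) for i in range(3)])
```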
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
NASA Technical Reports Server (NTRS)
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
The Power of Statistical Tests for Moderators in Meta-Analysis
ERIC Educational Resources Information Center
Hedges, Larry V.; Pigott, Therese D.
2004-01-01
Calculation of the statistical power of statistical tests is important in planning and interpreting the results of research studies, including meta-analyses. It is particularly important in moderator analyses in meta-analysis, which are often used as sensitivity analyses to rule out moderator effects but also may have low statistical power. This…
Outliers in Statistical Analysis: Basic Methods of Detection and Accommodation.
ERIC Educational Resources Information Center
Jacobs, Robert
Researchers are often faced with the prospect of dealing with observations within a given data set that are unexpected in terms of their great distance from the concentration of observations. For their potential to influence the mean disproportionately, thus affecting many statistical analyses, outlying observations require special care on the…
Data Desk Professional: Statistical Analysis for the Macintosh.
ERIC Educational Resources Information Center
Wise, Steven L.; Kutish, Gerald W.
This review of Data Desk Professional, a statistical software package for Macintosh microcomputers, includes information on: (1) cost and the amount and allocation of memory; (2) usability (documentation quality, ease of use); (3) running programs; (4) program output (quality of graphics); (5) accuracy; and (6) user services. In conclusion, it is…
Private School Universe Survey, 1991-92. Statistical Analysis Report.
ERIC Educational Resources Information Center
Broughman, Stephen; And Others
This report on the private school universe, a data collection system developed by the National Center for Education Statistics, presents data on schools with grades kindergarten through 12 by school size, school level, religious orientation, geographical region, and program emphasis. Numbers of students and teachers are reported in the same…
Did Tanzania Achieve the Second Millennium Development Goal? Statistical Analysis
ERIC Educational Resources Information Center
Magoti, Edwin
2016-01-01
Development Goal "Achieve universal primary education", the challenges faced, along with the way forward towards achieving the fourth Sustainable Development Goal "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all". Statistics show that Tanzania has made very promising steps…
Statistical Analysis Tools for Learning in Engineering Laboratories.
ERIC Educational Resources Information Center
Maher, Carolyn A.
1990-01-01
Described are engineering programs that have used automated data acquisition systems to implement data collection and analyze experiments. Applications include a biochemical engineering laboratory, heat transfer performance, engineering materials testing, mechanical system reliability, statistical control laboratory, thermo-fluid laboratory, and a…
Bringing statistics up to speed with data in analysis of lymphocyte motility.
Letendre, Kenneth; Donnadieu, Emmanuel; Moses, Melanie E; Cannon, Judy L
2015-01-01
Two-photon (2P) microscopy provides immunologists with 3D video of the movement of lymphocytes in vivo. Motility parameters extracted from these videos allow detailed analysis of lymphocyte motility in lymph nodes and peripheral tissues. However, standard parametric statistical analyses such as Student's t-test are often used incorrectly, and fail to take into account confounds introduced by the experimental methods, potentially leading to erroneous conclusions about T cell motility. Here, we compare the motility of WT T cells versus PKCθ-/-, CARMA1-/-, CCR7-/-, and PTX-treated T cells. We show that the fluorescent dyes used to label T cells have significant effects on T cell motility, and we demonstrate the use of factorial ANOVA as a statistical tool that can control for these effects. In addition, researchers often choose between "cell-based" parameters, obtained by averaging multiple steps of a single cell over time (e.g. cell mean speed), and "step-based" parameters, in which all steps of a cell population (e.g. instantaneous speed) are grouped without regard for the cell track. Using mixed model ANOVA, we show that we can maintain cell-based analyses without losing the statistical power of step-based data. We find that as we use additional levels of statistical control, we can more accurately estimate the speed of T cells as they move in lymph nodes, as well as measure the impact of individual signaling molecules on T cell motility. As there is increasing interest in using computational modeling to understand T cell behavior in vivo, these quantitative measures not only give us a better determination of actual T cell movement, they may prove crucial for models to generate accurate predictions about T cell behavior.
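The cell-based versus step-based distinction comes down to the level at which steps are averaged; a minimal numpy sketch (synthetic tracks, not 2P data) showing how pooling all steps lets long tracks dominate the estimate:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical tracks: 20 cells, each with a different number of recorded steps.
# Each cell has its own mean speed; steps scatter around that per-cell mean.
cell_means = rng.normal(10.0, 2.0, size=20)
tracks = [m + rng.normal(0, 1.0, size=rng.integers(5, 50)) for m in cell_means]

# Step-based estimate: pool every step, so long tracks dominate
# (pseudoreplication: steps from one cell are not independent observations)
step_based = np.concatenate(tracks).mean()

# Cell-based estimate: one summary value per cell, each cell weighted equally
cell_based = np.mean([t.mean() for t in tracks])

print(f"step-based mean speed: {step_based:.2f}")
print(f"cell-based mean speed: {cell_based:.2f}")
```

A mixed-model ANOVA, as the paper advocates, goes one step further: it keeps all steps in the analysis but models the per-cell grouping explicitly, recovering step-level power without pseudoreplication.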
Research on the integrative strategy of spatial statistical analysis of GIS
NASA Astrophysics Data System (ADS)
Xie, Zhong; Han, Qi Juan; Wu, Liang
2008-12-01
At present, spatial social and natural phenomena are studied with both GIS techniques and statistical methods. However, complex practical applications strain these research methods, and the data models and technologies employed tend to be highly localized. This paper first summarizes the requirements of spatial statistical analysis. On that basis, universal spatial statistical models are transformed into function tools in a statistical GIS system, and a three-layer pyramidal structure is proposed. This makes it feasible to combine the GIS techniques of spatial data management, retrieval, and visualization with the data-processing methods of statistical analysis, forming an integrated statistical GIS environment for the management, analysis, application, and decision support of spatial statistical information.
Statistical analysis of modeling error in structural dynamic systems
NASA Technical Reports Server (NTRS)
Hasselman, T. K.; Chrostowski, J. D.
1990-01-01
The paper presents a generic statistical model of the (total) modeling error for conventional space structures in their launch configuration. Modeling error is defined as the difference between analytical prediction and experimental measurement. It is represented by the differences between predicted and measured real eigenvalues and eigenvectors. Comparisons are made between pre-test and post-test models. Total modeling error is then subdivided into measurement error, experimental error and 'pure' modeling error, and comparisons made between measurement error and total modeling error. The generic statistical model presented in this paper is based on the first four global (primary structure) modes of four different structures belonging to the generic category of Conventional Space Structures (specifically excluding large truss-type space structures). As such, it may be used to evaluate the uncertainty of predicted mode shapes and frequencies, sinusoidal response, or the transient response of other structures belonging to the same generic category.
Statistical analysis of the lithospheric magnetic anomaly data
NASA Astrophysics Data System (ADS)
Pavon-Carrasco, Fco Javier; de Santis, Angelo; Ferraccioli, Fausto; Catalán, Manuel; Ishihara, Takemi
2013-04-01
Different analyses carried out on the lithospheric magnetic anomaly data from the GEODAS DVD v5.0.10 database (World Digital Magnetic Anomaly Map, WDMAM) show that the data distribution is not Gaussian, but Laplacian. Although this behaviour has been noted in earlier work (e.g., Walker and Jackson, Geophys. J. Int., 143, 799-808, 2000), no explanation has been given for this statistical property of the magnetic anomalies. In this work, we perform several statistical tests to confirm that the lithospheric magnetic anomaly data do indeed follow a Laplacian distribution, and we offer a possible interpretation of this behaviour, providing a magnetization model that depends on the variation of the geomagnetic field and on both induced and remanent magnetization in the terrestrial lithosphere.
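One way to check Laplacian versus Gaussian behaviour, in the spirit of the tests described above, is to fit both candidate distributions and compare goodness-of-fit statistics; a sketch on synthetic Laplace-distributed values (not the actual WDMAM data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic "anomaly" sample drawn from a Laplace distribution, standing in
# for magnetic anomaly values (scale in nT is arbitrary here)
x = rng.laplace(loc=0.0, scale=50.0, size=5000)

# Fit both candidate distributions, then compare Kolmogorov-Smirnov statistics
mu_n, sd_n = stats.norm.fit(x)
loc_l, b_l = stats.laplace.fit(x)

ks_norm = stats.kstest(x, "norm", args=(mu_n, sd_n)).statistic
ks_lap = stats.kstest(x, "laplace", args=(loc_l, b_l)).statistic

print(f"KS vs fitted normal : {ks_norm:.4f}")
print(f"KS vs fitted Laplace: {ks_lap:.4f}")
```

The fitted Laplace distribution yields a markedly smaller KS statistic than the fitted normal, which is the kind of evidence such tests produce on genuinely heavy-tailed anomaly data.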
Statistical Analysis of CFD Solutions from the Drag Prediction Workshop
NASA Technical Reports Server (NTRS)
Hemsch, Michael J.
2002-01-01
A simple, graphical framework is presented for robust statistical evaluation of results obtained from N-version testing of a series of RANS CFD codes. The solutions were obtained by a variety of code developers and users for the June 2001 Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration used for the computational tests is the DLR-F4 wing-body combination, previously tested in several European wind tunnels and for which a previous N-version test had been conducted. The statistical framework is used to evaluate code results for (1) a single cruise design point, (2) drag polars and (3) drag rise. The paper concludes with a discussion of the meaning of the results, especially with respect to predictability, validation, and reporting of solutions.
Statistical analysis of motion contrast in optical coherence tomography angiography
NASA Astrophysics Data System (ADS)
Cheng, Yuxuan; Guo, Li; Pan, Cong; Lu, Tongtong; Hong, Tianyu; Ding, Zhihua; Li, Peng
2015-11-01
Optical coherence tomography angiography (Angio-OCT), mainly based on the temporal dynamics of OCT scattering signals, has found a range of potential applications in clinical and scientific research. Based on the model of random phasor sums, temporal statistics of the complex-valued OCT signals are mathematically described. Statistical distributions of the amplitude differential and complex differential Angio-OCT signals are derived. The theories are validated through the flow phantom and live animal experiments. Using the model developed, the origin of the motion contrast in Angio-OCT is mathematically explained, and the implications in the improvement of motion contrast are further discussed, including threshold determination and its residual classification error, averaging method, and scanning protocol. The proposed mathematical model of Angio-OCT signals can aid in the optimal design of the system and associated algorithms.
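The random-phasor-sum model can be simulated directly: summing many phasors with random amplitudes and phases yields an approximately circular-Gaussian complex signal whose amplitude should be Rayleigh distributed. A sketch (all parameters are illustrative, not from the paper):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

# Model each OCT scattering signal as a sum of many random phasors; by the
# central limit theorem the complex sum is circular Gaussian, so its
# amplitude should follow a Rayleigh distribution.
n_signals, n_phasors = 5000, 100
phases = rng.uniform(0, 2 * np.pi, size=(n_signals, n_phasors))
amps = rng.uniform(0.5, 1.5, size=(n_signals, n_phasors))
signal = (amps * np.exp(1j * phases)).sum(axis=1)

amplitude = np.abs(signal)

# Fit a Rayleigh distribution (location fixed at 0) and test the fit
scale = stats.rayleigh.fit(amplitude, floc=0)[1]
ks = stats.kstest(amplitude, "rayleigh", args=(0, scale))
print(f"KS statistic against Rayleigh: {ks.statistic:.4f}")
```

Differential Angio-OCT signals (amplitude or complex differences between repeated scans) inherit their statistics from this model, which is what allows the motion-contrast thresholds discussed in the abstract to be derived analytically.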
Statistical Methods for Rapid Aerothermal Analysis and Design Technology
NASA Technical Reports Server (NTRS)
Morgan, Carolyn; DePriest, Douglas; Thompson, Richard (Technical Monitor)
2002-01-01
The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to establish statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The research work was focused on establishing the suitable mathematical/statistical models for these purposes. It is anticipated that the resulting models can be incorporated into a software tool to provide rapid, variable-fidelity, aerothermal environments to predict heating along an arbitrary trajectory. This work will support development of an integrated design tool to perform automated thermal protection system (TPS) sizing and material selection.
A Statistical Framework for the Functional Analysis of Metagenomes
Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.
2008-10-01
Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. The authors present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations, which can be used for removing seemingly unreliable measurements. They tested the method on a wide range of datasets, including simulated genomes and real WGS data from whole-genome sequencing projects. Results suggest that the framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
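The Lander-Waterman-style expectation underlying such frequency estimates is simple to state: a read of length L "hits" a gene family occupying F bases if it starts within roughly F + L - 1 positions, so hit counts are approximately Poisson. A sketch with made-up numbers (none are from the paper):

```python
import numpy as np

def expected_hits(n_reads, read_len, family_bases, metagenome_size):
    """Expected number of reads overlapping a gene family (Poisson mean)."""
    return n_reads * (family_bases + read_len - 1) / metagenome_size

def estimate_family_bases(hits, n_reads, read_len, metagenome_size):
    """Invert the expectation to estimate the family's total length."""
    return hits * metagenome_size / n_reads - (read_len - 1)

# Illustrative values: 1M reads of 100 bp from a 5 Gbp metagenome,
# with a gene family truly occupying 250 kbp of that sequence
N, L, G = 1_000_000, 100, 5_000_000_000
F_true = 250_000
lam = expected_hits(N, L, F_true, G)

rng = np.random.default_rng(9)
hits = rng.poisson(lam)          # simulate the observed hit count
F_hat = estimate_family_bases(hits, N, L, G)
print(f"expected hits: {lam:.1f}, observed: {hits}, estimated size: {F_hat:.0f}")
```

The Poisson variance around the expectation is what drives the paper's reliability assessment: families with small expected hit counts yield estimates too noisy to trust.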
Integration of Advanced Statistical Analysis Tools and Geophysical Modeling
2010-12-01
... later in this section. (2) San Luis Obispo. Extracted features were also provided for the MTADS EM61, MTADS magnetics, EM61 cart, and TEMTADS data sets, from ... subsequent training of statistical classifiers using these features. Results of discrimination studies at Camp Sibert and San Luis Obispo have shown ... Comparison of classification performance: Figures 10 through 13 show receiver operating characteristics for data sets acquired at San Luis Obispo.
A statistical analysis of hard X-Ray solar flares
NASA Technical Reports Server (NTRS)
Pearce, G.; Rowe, A. K.; Yeung, J.
1993-01-01
In this study we perform a statistical analysis of 8319 X-ray solar flares observed with the Hard X-ray Burst Spectrometer (HXRBS) on the Solar Maximum Mission (SMM) satellite. The events are examined in terms of their durations, maximum intensities, and intensity profiles. It is concluded that there is no evidence for a correlation between flare intensity, flare duration, and flare asymmetry. However, we do find evidence for a rapid fall-off in the number of short-duration events.
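The abstract does not state which correlation statistic was used; a standard nonparametric choice for duration-intensity data is Spearman rank correlation, sketched here in pure Python (no tie handling) on hypothetical toy values, not the HXRBS data.

```python
# Minimal Spearman rank correlation (no tie handling), one standard way to
# test whether two flare quantities are monotonically related.

def ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

durations   = [1.0, 2.5, 4.0, 8.0, 16.0]   # hypothetical values
intensities = [3.0, 5.0, 9.0, 20.0, 41.0]
print(spearman(durations, intensities))  # 1.0 (perfectly monotone toy data)
```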
Statistical analysis of mammalian pre-mRNA splicing sites.
Gelfand, M S
1989-01-01
222 donor and 222 acceptor (including 206 pairs) non-homologous splicing sites were studied. Well-known features of these sites were confirmed and some novel observations were made: (1) a cCAGGGag signal in the (-60)-(-58) region of acceptor sites; (2) strong complementarity between regions (-69)-(-55) and (-36)-(-22) of some of the acceptor sites; and (3) a small but statistically significant correlation between the discrimination energies of corresponding donor and acceptor sites. PMID:2528123
Computational and Statistical Analysis of Protein Mass Spectrometry Data
Noble, William Stafford; MacCoss, Michael J.
2012-01-01
High-throughput proteomics experiments involving tandem mass spectrometry produce large volumes of complex data that require sophisticated computational analyses. As such, the field offers many challenges for computational biologists. In this article, we briefly introduce some of the core computational and statistical problems in the field and then describe a variety of outstanding problems that readers of PLoS Computational Biology might be able to help solve. PMID:22291580
Kotula, Paul G; Keenan, Michael R
2006-12-01
Multivariate statistical analysis methods have been applied to scanning transmission electron microscopy (STEM) energy-dispersive X-ray spectral images. The particular application of the multivariate curve resolution (MCR) technique provides a high spectral contrast view of the raw spectral image. The power of this approach is demonstrated with a microelectronics failure analysis. Specifically, an unexpected component describing a chemical contaminant was found, as well as a component consistent with a foil thickness change associated with the focused ion beam specimen preparation process. The MCR solution is compared with a conventional analysis of the same spectral image data set.
NASA Astrophysics Data System (ADS)
Daeid, N. Nic; Meier-Augenstein, W.; Kemp, H. F.
2012-04-01
The analysis of cotton fibres can be particularly challenging within a forensic science context where discrimination of one fibre from another is of importance. Normally cotton fibre analysis examines the morphological structure of the recovered material and compares this with that of a known fibre from a particular source of interest. However, the conventional microscopic and chemical analysis of fibres and any associated dyes is generally unsuccessful because of the similar morphology of the fibres. Analysis of the dyes which may have been applied to the cotton fibre can also be undertaken, though this can be difficult and unproductive in terms of discriminating one fibre from another. In the study presented here we have explored the potential for Isotope Ratio Mass Spectrometry (IRMS) to be utilised as an additional tool for cotton fibre analysis in an attempt to reveal further discriminatory information. This work has concentrated on un-dyed cotton fibres of known origin in order to expose the potential of the analytical technique. We report the results of a pilot study testing the hypothesis that multi-element stable isotope analysis of cotton fibres, combined with multivariate statistical analysis of the resulting isotopic abundance data using well-established chemometric techniques, permits sample provenancing: determining where the cotton was grown and thereby facilitating sample discrimination. To date there is no recorded literature of this type of application of IRMS to cotton samples, which may be of forensic science relevance.
Additional analysis of dendrochemical data of Fallon, Nevada.
Sheppard, Paul R; Helsel, Dennis R; Speakman, Robert J; Ridenour, Gary; Witten, Mark L
2012-04-05
Previously reported dendrochemical data showed temporal variability in concentration of tungsten (W) and cobalt (Co) in tree rings of Fallon, Nevada, US. Criticism of this work questioned the use of the Mann-Whitney test for determining change in element concentrations. Here, we demonstrate that Mann-Whitney is appropriate for comparing background element concentrations to possibly elevated concentrations in environmental media. Given that Mann-Whitney tests for differences in shapes of distributions, inter-tree variability (e.g., "coefficient of median variation") was calculated for each measured element across trees within subsites and time periods. For W and Co, the metals of highest interest in Fallon, inter-tree variability was always higher within versus outside of Fallon. For calibration purposes, this entire analysis was repeated at a different town, Sweet Home, Oregon, which has a known tungsten-powder facility, and inter-tree variability of W in tree rings confirmed the establishment date of that facility. Mann-Whitney testing of simulated data also confirmed its appropriateness for analysis of data affected by point-source contamination. This research adds important new dimensions to dendrochemistry of point-source contamination by adding analysis of inter-tree variability to analysis of central tendency. Fallon remains distinctive by a temporal increase in W beginning by the mid 1990s and by elevated Co since at least the early 1990s, as well as by high inter-tree variability for W and Co relative to comparison towns.
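The Mann-Whitney U test defended in this paper compares the full shapes of two concentration distributions rather than just their means. A minimal pure-Python version (midranks for ties, no continuity correction) is sketched below; the element concentrations are invented toy numbers, not the Fallon tree-ring data.

```python
# Minimal Mann-Whitney U statistic with midranks for ties, of the kind used
# to compare background vs. possibly elevated element concentrations.

def mann_whitney_u(xs, ys):
    combined = sorted((v, g) for g, vals in ((0, xs), (1, ys)) for v in vals)
    n = len(combined)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1                       # group of tied values occupies ranks i+1..j
        midrank = (i + 1 + j) / 2.0      # average rank of the tied group
        for k in range(i, j):
            if combined[k][1] == 0:      # observation came from xs
                rank_sum_x += midrank
        i = j
    nx = len(xs)
    return rank_sum_x - nx * (nx + 1) / 2.0   # U statistic for xs

background = [1.0, 1.2, 0.9, 1.1]   # hypothetical concentrations
elevated   = [2.0, 2.4, 1.9, 2.2]
print(mann_whitney_u(background, elevated))  # 0.0: every background value below every elevated one
```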
NASA Astrophysics Data System (ADS)
Beyer, Hans Georg; Chougule, Abhijit
2016-04-01
While the wind energy industry is growing rapidly and the siting of wind turbines onshore and offshore is increasing, many wind engineering model tools have been developed for assessing loads on wind turbines due to varying wind speeds. In order to achieve proper wind turbine design and performance analysis, it is important to have an accurate representation of the incoming wind field. To ease the analysis, tools for the generation of synthetic wind fields have been developed, e.g. the widely used TurbSim procedure. We analyse the respective synthetic data sets, on the one hand, in view of the similarity of the spectral characteristics of measured and synthetic sets. In addition, second-order characteristics with direct relevance to load assessment, as given by the statistics of increments and rainflow count results, are inspected.
Tong, Chudong; Shi, Xuhua; Lan, Ting
2016-11-01
Multivariate statistical methods have been widely applied to develop data-based process monitoring models. Recently, a multi-manifold projections (MMP) algorithm was proposed for modeling and monitoring chemical industrial processes. The MMP is an effective tool for preserving the global and local geometric structure of the original data space in the reduced feature subspace, but it does not provide orthogonal basis functions for data reconstruction. In recognition of this issue, an improved version of the MMP algorithm, named orthogonal MMP (OMMP), is formulated. Based on the OMMP model, a further processing step and a different monitoring index are proposed to model and monitor the variation in the residual subspace. Additionally, a novel variable contribution analysis is presented for fault diagnosis by integrating nearest in-control neighbor calculation with reconstruction-based contribution analysis. The validity and superiority of the proposed fault detection and diagnosis strategy are then validated through case studies on the Tennessee Eastman benchmark process.
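OMMP itself is not reproduced here, but the generic idea of monitoring a residual subspace after an orthogonal projection can be sketched with plain PCA as a stand-in: project each sample onto the retained subspace and flag samples whose squared prediction error (SPE) is large. The data, subspace dimension, and thresholds below are all illustrative assumptions.

```python
import numpy as np

# Sketch of residual-subspace monitoring with an orthogonal projection,
# using PCA as a simple stand-in for OMMP (illustrative only).

rng = np.random.default_rng(0)

# Training data lying (noisily) near a 1-D subspace of R^3.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 1e-3 * rng.normal(size=(200, 3))
mu = X.mean(axis=0)
Xc = X - mu

# Orthonormal loadings from the SVD; retain one principal direction.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:1].T                         # 3 x 1 orthonormal basis

def spe(x):
    """Squared prediction error: energy outside the retained subspace."""
    xc = x - mu
    resid = xc - P @ (P.T @ xc)      # component orthogonal to the model
    return float(resid @ resid)

normal_point = mu + 0.5 * np.array([1.0, 2.0, -1.0])   # stays in the model
faulty_point = mu + np.array([0.0, 0.0, 5.0])          # leaves the subspace
print(spe(normal_point) < 1e-4, spe(faulty_point) > 1.0)  # True True
```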
Framework for the Statistical Shape Analysis of Brain Structures using SPHARM-PDM
Styner, Martin; Oguz, Ipek; Xu, Shun; Brechbühler, Christian; Pantazis, Dimitrios; Levitt, James J; Shenton, Martha E; Gerig, Guido
2009-01-01
Shape analysis has become of increasing interest to the neuroimaging community due to its potential to precisely locate morphological changes between healthy and pathological structures. This manuscript presents a comprehensive set of tools for the computation of 3D structural statistical shape analysis. It has been applied in several studies on brain morphometry, but can potentially be employed in other 3D shape problems. Its main limitation is the necessity of spherical topology. The input of the proposed shape analysis is a set of binary segmentations of a single brain structure, such as the hippocampus or caudate. These segmentations are converted into a corresponding spherical harmonic description (SPHARM), which is then sampled into triangulated surfaces (SPHARM-PDM). After alignment, differences between groups of surfaces are computed using the Hotelling T2 two-sample metric. Statistical p-values, both raw and corrected for multiple comparisons, result in significance maps. Additional visualizations of the group tests are provided via mean difference magnitude and vector maps, as well as maps of the group covariance information. The correction for multiple comparisons is performed via two separate methods that each have a distinct view of the problem. The first aims to control the family-wise error rate (FWER), i.e. false positives, via the extrema histogram of non-parametric permutations. The second method controls the false discovery rate and results in a less conservative estimate of the false negatives. PMID:21941375
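The Hotelling T2 two-sample metric used per surface point can be computed directly: pool the two group covariances, then form a scaled Mahalanobis distance between the group means. The 2-D toy version below shows a single location with invented coordinates; the real pipeline evaluates thousands of surface points and corrects across them.

```python
# Two-sample Hotelling T^2 in 2-D, the per-point group-difference metric
# used in SPHARM-PDM (toy data, one location only).

def mean2(pts):
    n = len(pts)
    return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]

def scatter(pts, m):
    sxx = sum((p[0] - m[0]) ** 2 for p in pts)
    syy = sum((p[1] - m[1]) ** 2 for p in pts)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in pts)
    return [[sxx, sxy], [sxy, syy]]

def hotelling_t2(a, b):
    na, nb = len(a), len(b)
    ma, mb = mean2(a), mean2(b)
    Sa, Sb = scatter(a, ma), scatter(b, mb)
    f = 1.0 / (na + nb - 2)                       # pooled covariance
    S = [[f * (Sa[i][j] + Sb[i][j]) for j in range(2)] for i in range(2)]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]   # analytic 2x2 inverse
    Sinv = [[ S[1][1] / det, -S[0][1] / det],
            [-S[1][0] / det,  S[0][0] / det]]
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    quad = sum(d[i] * Sinv[i][j] * d[j] for i in range(2) for j in range(2))
    return (na * nb / (na + nb)) * quad

group_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
group_b = [(1, 1), (2, 1), (1, 2), (2, 2)]
print(hotelling_t2(group_a, group_b))  # 12.0
```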
Analysis of fluorine addition to the vanguard first stage
NASA Technical Reports Server (NTRS)
Tomazic, William A; Schmidt, Harold W; Tischler, Adelbert O
1957-01-01
The effect of adding fluorine to the Vanguard first-stage oxidant was analyzed. An increase in specific impulse of 5.74 percent may be obtained with 30 percent fluorine. This increase, coupled with increased mass ratio due to greater oxidant density, gave up to 24.6-percent increase in first-stage burnout energy with 30 percent fluorine added. However, a change in tank configuration is required to accommodate the higher oxidant-fuel ratio necessary for peak specific impulse with fluorine addition.
Hewett, Paul; Bullock, William H
2014-01-01
For more than 20 years CSX Transportation (CSXT) has collected exposure measurements from locomotive engineers and conductors who are potentially exposed to diesel emissions. The database included measurements for elemental and total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, carbon monoxide, and nitrogen dioxide. This database was statistically analyzed and summarized, and the resulting statistics and exposure profiles were compared to relevant occupational exposure limits (OELs) using both parametric and non-parametric descriptive and compliance statistics. Exposure ratings, using the American Industrial Hygiene Association (AIHA) exposure categorization scheme, were determined using both the compliance statistics and Bayesian Decision Analysis (BDA). The statistical analysis of the elemental carbon data (a marker for diesel particulate) strongly suggests that the majority of levels in the cabs of the lead locomotives (n = 156) were less than the California guideline of 0.020 mg/m(3). The sample 95th percentile was roughly half the guideline, resulting in an AIHA exposure rating of category 2/3 (determined using BDA). The elemental carbon (EC) levels in the trailing locomotives tended to be greater than those in the lead locomotive; however, locomotive crews rarely ride in the trailing locomotive. Lead locomotive EC levels were similar to those reported by other investigators studying locomotive crew exposures and to levels measured in urban areas. Lastly, both the EC sample mean and 95% UCL were less than the Environmental Protection Agency (EPA) reference concentration of 0.005 mg/m(3). With the exception of nitrogen dioxide, the overwhelming majority of the measurements for total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, and combustion gases in the cabs of CSXT locomotives were either non-detects or considerably less than the working OELs for the years represented in the database. When compared to the previous American
Porosity Measurements and Analysis for Metal Additive Manufacturing Process Control
Slotwinski, John A; Garboczi, Edward J; Hebenstreit, Keith M
2014-01-01
Additive manufacturing techniques can produce complex, high-value metal parts, with potential applications as critical metal components such as those found in aerospace engines and as customized biomedical implants. Material porosity in these parts is undesirable for aerospace parts - since porosity could lead to premature failure - and desirable for some biomedical implants - since surface-breaking pores allows for better integration with biological tissue. Changes in a part’s porosity during an additive manufacturing build may also be an indication of an undesired change in the build process. Here, we present efforts to develop an ultrasonic sensor for monitoring changes in the porosity in metal parts during fabrication on a metal powder bed fusion system. The development of well-characterized reference samples, measurements of the porosity of these samples with multiple techniques, and correlation of ultrasonic measurements with the degree of porosity are presented. A proposed sensor design, measurement strategy, and future experimental plans on a metal powder bed fusion system are also presented. PMID:26601041
Analysis of Saccharides by the Addition of Amino Acids
NASA Astrophysics Data System (ADS)
Ozdemir, Abdil; Lin, Jung-Lee; Gillig, Kent J.; Gulfen, Mustafa; Chen, Chung-Hsuan
2016-06-01
In this work, we present the improvement in detection sensitivity of electrospray ionization (ESI) mass spectrometry for neutral saccharides in positive ion mode by the addition of various amino acids. Saccharides of a broad molecular weight range were chosen as the model compounds in the present study. Saccharides form strong noncovalent interactions with amino acids, and the complex formation enhances the signal intensity and simplifies the mass spectra of saccharides. Polysaccharides provide a polymer-like ESI spectrum with a basic subunit difference between multiply charged chains. Protonated saccharide spectra are difficult to interpret because the same molecules produce different charge-state distributions. Depending on the solvent used and other ions or molecules present in the solution, noncovalent interactions with saccharides may occur. These interactions are affected by the addition of amino acids. Amino acids with polar side groups show a strong tendency to interact with saccharides. In particular, serine shows a high tendency to interact with saccharides and significantly improves the detection sensitivity of saccharide compounds.
Additional EIPC Study Analysis: Interim Report on High Priority Topics
Hadley, Stanton W
2013-11-01
Between 2010 and 2012 the Eastern Interconnection Planning Collaborative (EIPC) conducted a major long-term resource and transmission study of the Eastern Interconnection (EI). With guidance from a Stakeholder Steering Committee (SSC) that included representatives from the Eastern Interconnection States Planning Council (EISPC) among others, the project was conducted in two phases. Phase 1 involved a long-term capacity expansion analysis that involved creation of eight major futures plus 72 sensitivities. Three scenarios were selected for more extensive transmission-focused evaluation in Phase 2. Five power flow analyses, nine production cost model runs (including six sensitivities), and three capital cost estimations were developed during this second phase. The results from Phase 1 and 2 provided a wealth of data that could be examined further to address energy-related questions. A list of 13 topics was developed for further analysis; this paper discusses the first five.
Statistics of Data Fitting: Flaws and Fixes of Polynomial Analysis of Channeled Spectra
NASA Astrophysics Data System (ADS)
Karstens, William; Smith, David
2013-03-01
Starting from general statistical principles, we have critically examined Baumeister's procedure* for determining the refractive index of thin films from channeled spectra. Briefly, the method assumes that the index and interference fringe order may be approximated by polynomials quadratic and cubic in photon energy, respectively. The coefficients of the polynomials are related by differentiation, which is equivalent to comparing energy differences between fringes. However, we find that when the fringe order is calculated from the published IR index for silicon* and then analyzed with Baumeister's procedure, the results do not reproduce the original index. This problem has been traced to (1) the use of unphysical powers in the polynomials (e.g., time-reversal invariance requires that the index be an even function of photon energy), and (2) the use of insufficient terms of the correct parity. Exclusion of unphysical terms and addition of quartic and quintic terms to the index and order polynomials yields significantly better fits with fewer parameters. This represents a specific example of using statistics to determine if the assumed fitting model adequately captures the physics contained in experimental data. The use of analysis of variance (ANOVA) and the Durbin-Watson statistic to test criteria for the validity of least-squares fitting will be discussed. *D.F. Edwards and E. Ochoa, Appl. Opt. 19, 4130 (1980). Supported in part by the US Department of Energy, Office of Nuclear Physics under contract DE-AC02-06CH11357.
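The parity constraint the authors advocate amounts to fitting the index with even powers of photon energy only, which is an ordinary linear least-squares problem over a restricted basis. The sketch below uses synthetic data generated from an assumed even polynomial, not the silicon index from the cited reference.

```python
import numpy as np

# Sketch of a parity-constrained fit: model the refractive index with even
# powers of photon energy (E^0, E^2, E^4) via linear least squares.
# Synthetic data with assumed coefficients, for illustration only.

E = np.linspace(0.2, 1.2, 25)            # photon energy (arbitrary units)
true = np.array([2.0, 0.10, 0.01])       # assumed a0, a2, a4
n_index = true[0] + true[1] * E**2 + true[2] * E**4

A = np.column_stack([E**0, E**2, E**4])  # even-parity design matrix
coef, *_ = np.linalg.lstsq(A, n_index, rcond=None)
print(np.allclose(coef, true))  # True: exact synthetic data are recovered
```

Adding odd powers here would not improve the fit of physically valid data, but would add parameters that soak up noise; that is the statistical symptom the paper uses to detect an inadequate model.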
Statistical Analysis of the Nonhomogeneity Detector for Non-Gaussian Interference Backgrounds
2005-06-01
... a statistical analysis of the method. The non-Gaussian interference scenario is assumed to be modeled by a spherically invariant random process (SIRP). We ...
Risk analysis of sulfites used as food additives in China.
Zhang, Jian Bo; Zhang, Hong; Wang, Hua Li; Zhang, Ji Yue; Luo, Peng Jie; Zhu, Lei; Wang, Zhu Tian
2014-02-01
This study was to analyze the risk of sulfites in food consumed by the Chinese people and assess the health protection capability of maximum-permitted level (MPL) of sulfites in GB 2760-2011. Sulfites as food additives are overused or abused in many food categories. When the MPL in GB 2760-2011 was used as sulfites content in food, the intake of sulfites in most surveyed populations was lower than the acceptable daily intake (ADI). Excess intake of sulfites was found in all the surveyed groups when a high percentile of sulfites in food was in taken. Moreover, children aged 1-6 years are at a high risk to intake excess sulfites. The primary cause for the excess intake of sulfites in Chinese people is the overuse and abuse of sulfites by the food industry. The current MPL of sulfites in GB 2760-2011 protects the health of most populations.
Disclosure of hydraulic fracturing fluid chemical additives: analysis of regulations.
Maule, Alexis L; Makey, Colleen M; Benson, Eugene B; Burrows, Isaac J; Scammell, Madeleine K
2013-01-01
Hydraulic fracturing is used to extract natural gas from shale formations. The process involves injecting into the ground fracturing fluids that contain thousands of gallons of chemical additives. Companies are not mandated by federal regulations to disclose the identities or quantities of chemicals used during hydraulic fracturing operations on private or public lands. States have begun to regulate hydraulic fracturing fluids by mandating chemical disclosure. These laws have shortcomings including nondisclosure of proprietary or "trade secret" mixtures, insufficient penalties for reporting inaccurate or incomplete information, and timelines that allow for after-the-fact reporting. These limitations leave lawmakers, regulators, public safety officers, and the public uninformed and ill-prepared to anticipate and respond to possible environmental and human health hazards associated with hydraulic fracturing fluids. We explore hydraulic fracturing exemptions from federal regulations, as well as current and future efforts to mandate chemical disclosure at the federal and state level.
2012-10-24
... time series similarity measures for classification and change detection of ecosystem dynamics ... entropy for estimating species richness, and introduce a method based on statistical wavelet multiresolution texture analysis to quantitatively assess ...
Statistics Education Research in Malaysia and the Philippines: A Comparative Analysis
ERIC Educational Resources Information Center
Reston, Enriqueta; Krishnan, Saras; Idris, Noraini
2014-01-01
This paper presents a comparative analysis of statistics education research in Malaysia and the Philippines by modes of dissemination, research areas, and trends. An electronic search for published research papers in the area of statistics education from 2000-2012 yielded 20 for Malaysia and 19 for the Philippines. Analysis of these papers showed…
APPLICATION OF STATISTICAL ENERGY ANALYSIS TO VIBRATIONS OF MULTI-PANEL STRUCTURES.
... cylindrical shell are compared with predictions obtained from statistical energy analysis. Generally good agreement is observed. The flow of mechanical ... the coefficients of proportionality between power flow and average modal energy difference, which one must know in order to apply statistical energy analysis.
Linkage analysis of systolic blood pressure: a score statistic and computer implementation
Wang, Kai; Peng, Yingwei
2003-01-01
A genome-wide linkage analysis was conducted on systolic blood pressure using a score statistic. The randomly selected Replicate 34 of the simulated data was used. The score statistic was applied to the sibships derived from the general pedigrees. An add-on R program to GENEHUNTER was developed for this analysis and is freely available. PMID:14975145
Fatigue Crack Propagation: Probabilistic Modeling and Statistical Analysis.
1988-03-23
... In this paper, the concepts of comparison of experiments in the context ...
Statistical analysis of multivariate atmospheric variables. [cloud cover
NASA Technical Reports Server (NTRS)
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate distributions to near-normal; (5) a test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) a test of fit for continuous distributions based upon the generalized minimum chi-square; (7) the effect of correlated observations on confidence sets based upon chi-square statistics; and (8) generation of random variates from specified distributions.
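Topic (8), generating random variates from a specified distribution, is classically done by inverse-transform sampling: draw u uniformly on [0, 1) and apply the inverse CDF. The sketch below shows the exponential case, where the inverse CDF has the closed form -ln(1 - u)/rate; the rate and sample size are arbitrary choices for illustration.

```python
import math
import random

# Inverse-transform sampling for an Exponential(rate) variate:
# F(x) = 1 - exp(-rate*x), so F^{-1}(u) = -ln(1 - u) / rate.

def exponential_variate(rate, rng):
    u = rng.random()                  # uniform on [0, 1)
    return -math.log(1.0 - u) / rate

rng = random.Random(42)
samples = [exponential_variate(2.0, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(abs(mean - 0.5) < 0.01)  # True: sample mean near 1/rate = 0.5
```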
Statistical analysis of epidemiologic data of pregnancy outcomes
Butler, W.J.; Kalasinski, L.A. )
1989-02-01
In this paper, a generalized logistic regression model for correlated observations is used to analyze epidemiologic data on the frequency of spontaneous abortion among a group of women office workers. The results are compared to those obtained from the use of the standard logistic regression model that assumes statistical independence among all the pregnancies contributed by one woman. In this example, the correlation among pregnancies from the same woman is fairly small and did not have a substantial impact on the magnitude of estimates of parameters of the model. This is due at least partly to the small average number of pregnancies contributed by each woman.
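The paper's generalized model is not reproduced here, but its central caution can be gauged with the standard design-effect formula: with m observations per cluster (pregnancies per woman) and intra-cluster correlation rho, independence-based variances are understated by a factor DEFF = 1 + (m - 1)*rho. The numbers below are illustrative, not estimates from the study.

```python
# Back-of-envelope design effect for clustered observations: the factor by
# which naive-independence variances are understated.
# Illustrative values only, not from the cited study.

def design_effect(m, rho):
    """DEFF = 1 + (m - 1) * rho for m observations per cluster."""
    return 1.0 + (m - 1) * rho

# Small correlation and few pregnancies per woman: little impact,
# consistent with the authors' finding for their data.
print(design_effect(2, 0.05))   # 1.05
print(design_effect(4, 0.30))   # about 1.9: independence would mislead here
```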
JAWS data collection, analysis highlights, and microburst statistics
NASA Technical Reports Server (NTRS)
Mccarthy, J.; Roberts, R.; Schreiber, W.
1983-01-01
Organization, equipment, and the current status of the Joint Airport Weather Studies project, initiated in relation to the microburst phenomenon, are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with the majority of the events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with the differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.
Statistical Analysis of Noisy Signals Using Classification Tools
Thompson, Sandra E.; Heredia-Langner, Alejandro; Johnson, Timothy J.; Foster, Nancy S.; Valentine, Nancy B.; Amonette, James E.
2005-06-04
The potential use of chemicals, biotoxins and biological pathogens are a threat to military and police forces as well as the general public. Rapid identification of these agents is made difficult due to the noisy nature of the signal that can be obtained from portable, in-field sensors. In previously published articles, we created a flowchart that illustrated a method for triaging bacterial identification by combining standard statistical techniques for discrimination and identification with mid-infrared spectroscopic data. The present work documents the process of characterizing and eliminating the sources of the noise and outlines how multidisciplinary teams are necessary to accomplish that goal.
Statistical distributions of potential interest in ultrasound speckle analysis.
Nadarajah, Saralees
2007-05-21
Compound statistical modelling of the uncompressed envelope of the backscattered signal has received much interest recently. In this note, a comprehensive collection of models is derived for the uncompressed envelope of the backscattered signal by compounding the Nakagami distribution with 13 flexible families. The corresponding estimation procedures are derived by the method of moments and the method of maximum likelihood. The sensitivity of the models to their various parameters is examined. It is expected that this work could serve as a useful reference and lead to improved modelling of the uncompressed envelope of the backscattered signal.
NASA Astrophysics Data System (ADS)
Shouno, Hayaru; Kido, Shoji; Okada, Masato
2004-09-01
Bidirectional associative memory (BAM) is a kind of artificial neural network used to memorize and retrieve heterogeneous pattern pairs. Many efforts have been made to improve BAM from the viewpoint of computer application, but few theoretical studies have been done. We investigated the theoretical characteristics of BAM using a framework of statistical-mechanical analysis. To investigate the equilibrium state of BAM, we applied self-consistent signal-to-noise analysis (SCSNA) and obtained macroscopic parameter equations and the relative capacity. Moreover, to investigate not only the equilibrium state but also the retrieval process of reaching the equilibrium state, we applied statistical neurodynamics to the update rule of BAM and obtained evolution equations for the macroscopic parameters. These evolution equations are consistent with the results of SCSNA in the equilibrium state.
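The BAM update rule analyzed in the paper can be stated in a few lines: store bipolar pattern pairs in a weight matrix W = sum of x y^T, then retrieve by alternating thresholded updates through W and its transpose. The toy below stores a single pair (so retrieval is exact); the paper's statistical-mechanical analysis concerns what happens when many stored pairs interfere.

```python
# Minimal bipolar BAM: Hebbian storage and alternating sign-threshold recall.

def sign(v):
    return 1 if v >= 0 else -1

def train(pairs, nx, ny):
    """W[i][j] = sum over stored pairs of x[i] * y[j]."""
    W = [[0] * ny for _ in range(nx)]
    for x, y in pairs:
        for i in range(nx):
            for j in range(ny):
                W[i][j] += x[i] * y[j]
    return W

def recall(W, x):
    """One forward pass x -> y, then one backward pass y -> x."""
    ny = len(W[0])
    y = [sign(sum(W[i][j] * x[i] for i in range(len(x)))) for j in range(ny)]
    x2 = [sign(sum(W[i][j] * y[j] for j in range(ny))) for i in range(len(W))]
    return x2, y

x_pat = [1, -1, 1, -1]
y_pat = [1, 1, -1]
W = train([(x_pat, y_pat)], 4, 3)
x_back, y_out = recall(W, x_pat)
print(y_out == y_pat, x_back == x_pat)  # True True
```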
Statistical analysis of bankrupting and non-bankrupting stocks
NASA Astrophysics Data System (ADS)
Li, Qian; Wang, Fengzhong; Wei, Jianrong; Liang, Yuan; Huang, Jiping; Stanley, H. Eugene
2012-04-01
The recent financial crisis has caused extensive world-wide economic damage, affecting in particular those who invested in companies that eventually filed for bankruptcy. A better understanding of stocks that become bankrupt would be helpful in reducing risk in future investments. Economists have conducted extensive research on this topic, and here we ask whether statistical physics concepts and approaches may offer insights into pre-bankruptcy stock behavior. To this end, we study all 20092 stocks listed in US stock markets for the 20-year period 1989-2008, including 4223 (21 percent) that became bankrupt during that period. We find that, surprisingly, the distributions of the daily returns of those stocks that become bankrupt differ significantly from those that do not. Moreover, these differences are consistent for the entire period studied. We further study the relation between the distribution of returns and the length of time until bankruptcy, and observe that larger differences of the distribution of returns correlate with shorter time periods preceding bankruptcy. This behavior suggests that sharper fluctuations in the stock price occur when the stock is closer to bankruptcy. We also analyze the cross-correlations between the return and the trading volume, and find that stocks approaching bankruptcy tend to have larger return-volume cross-correlations than stocks that are not. Furthermore, the difference increases as bankruptcy approaches. We conclude that before a firm becomes bankrupt its stock exhibits unusual behavior that is statistically quantifiable.
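The paper's finding rests on quantifying how two return distributions differ. One simple, standard distance for that purpose (not necessarily the authors' exact statistic) is the two-sample Kolmogorov-Smirnov distance, the maximum gap between empirical CDFs, sketched here on toy samples.

```python
# Two-sample Kolmogorov-Smirnov distance: max gap between empirical CDFs,
# one simple way to quantify how two return distributions differ.

def ks_distance(xs, ys):
    xs, ys = sorted(xs), sorted(ys)
    nx, ny = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < nx and j < ny:
        v = min(xs[i], ys[j])
        while i < nx and xs[i] == v:   # advance past ties in xs
            i += 1
        while j < ny and ys[j] == v:   # advance past ties in ys
            j += 1
        d = max(d, abs(i / nx - j / ny))
    return d

identical = [0.1, -0.2, 0.05, 0.3]    # toy "returns"
shifted   = [1.1,  0.8, 1.05, 1.3]
print(ks_distance(identical, identical))  # 0.0
print(ks_distance(identical, shifted))    # 1.0: fully separated samples
```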
Statistical Analysis of the Uncertainty in Pre-Flight Aerodynamic Database of a Hypersonic Vehicle
NASA Astrophysics Data System (ADS)
Huh, Lynn
The objective of the present research was to develop a new method to derive the aerodynamic coefficients and the associated uncertainties for flight vehicles via post-flight inertial navigation analysis using data from the inertial measurement unit. Statistical estimates of vehicle state and aerodynamic coefficients are derived using Monte Carlo simulation. Trajectory reconstruction using the inertial navigation system (INS) is a simple and widely used method. However, deriving realistic uncertainties in the reconstructed state and any associated parameters is not so straightforward. Extended Kalman filters, batch minimum variance estimation, and other approaches have been used. However, these methods generally depend on assumed physical models, assumed statistical distributions (usually Gaussian), or have convergence issues for non-linear problems. The approach here assumes no physical models, is applicable to any statistical distribution, and does not have any convergence issues. The new approach obtains the statistics directly from a sufficient number of Monte Carlo samples using only the generally well known gyro and accelerometer specifications, and can be applied to systems of non-linear form and non-Gaussian distribution. When redundant data are available, the set of Monte Carlo simulations is constrained to satisfy the redundant data within the uncertainties specified for the additional data. The proposed method was applied to validate the uncertainty in the pre-flight aerodynamic database of the X-43A Hyper-X research vehicle. In addition to gyro and acceleration data, the actual flight data include redundant measurements of position and velocity from the global positioning system (GPS). Criteria derived from the blend of the GPS and INS accuracy were used to select valid trajectories for statistical analysis. The aerodynamic coefficients were derived from the selected trajectories by either direct extraction method based on the equations in
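The core Monte Carlo idea, drawing sensor errors from the published specifications and re-integrating the trajectory for each draw, can be sketched for a one-dimensional accelerometer-only case. This toy example is an assumption for illustration, not the X-43A analysis itself:

```python
import numpy as np

def mc_position_uncertainty(accel_meas, dt, bias_sigma, noise_sigma,
                            n_samples=2000, seed=0):
    """Monte Carlo estimate of final-position uncertainty from IMU error specs.

    For each sample, perturb the measured acceleration with a random constant
    bias and white noise drawn from the (assumed) sensor specification,
    integrate twice with a simple Euler scheme, and collect the final position.
    Returns (mean, std) of the final position over all samples."""
    rng = np.random.default_rng(seed)
    a = np.asarray(accel_meas, dtype=float)
    finals = np.empty(n_samples)
    for k in range(n_samples):
        bias = rng.normal(0.0, bias_sigma)
        noise = rng.normal(0.0, noise_sigma, a.size)
        a_k = a + bias + noise
        v = np.cumsum(a_k) * dt   # velocity (Euler integration)
        x = np.cumsum(v) * dt     # position
        finals[k] = x[-1]
    return finals.mean(), finals.std()
```

No physical model, distribution assumption, or convergence criterion enters; the spread of the ensemble is the uncertainty estimate.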
Statistical Analysis of Complexity Generators for Cost Estimation
NASA Technical Reports Server (NTRS)
Rowell, Ginger Holmes
1999-01-01
Predicting the cost of cutting edge new technologies involved with spacecraft hardware can be quite complicated. A new feature of the NASA Air Force Cost Model (NAFCOM), called the Complexity Generator, is being developed to model the complexity factors that drive the cost of space hardware. This parametric approach is also designed to account for the differences in cost, based on factors that are unique to each system and subsystem. The cost driver categories included in this model are weight, inheritance from previous missions, technical complexity, and management factors. This paper explains the Complexity Generator framework, the statistical methods used to select the best model within this framework, and the procedures used to find the region of predictability and the prediction intervals for the cost of a mission.
Analysis of surface sputtering on a quantum statistical basis
NASA Technical Reports Server (NTRS)
Wilhelm, H. E.
1975-01-01
Surface sputtering is explained theoretically by means of a 3-body sputtering mechanism involving the ion and two surface atoms of the solid. By means of quantum-statistical mechanics, a formula for the sputtering ratio S(E) is derived from first principles. At low ion energies, the theoretical sputtering rate S(E) is proportional to the square of the difference between the incident ion energy and the threshold energy for sputtering of surface atoms, in agreement with experiment. Extrapolation of the theoretical sputtering formula to larger ion energies indicates that S(E) reaches a saturation value and finally decreases at high ion energies. The theoretical sputtering ratios S(E) for wolfram, tantalum, and molybdenum are compared with the corresponding experimental sputtering curves in the low-energy region from the threshold sputtering energy to 120 eV above the respective threshold energy. Theory and experiment are shown to be in good agreement.
Statistical multi-site fatigue damage analysis model
NASA Astrophysics Data System (ADS)
Wang, G. S.
1995-02-01
A statistical model has been developed to evaluate fatigue damage at multiple sites in complex joints based on coupon test data and fracture mechanics methods. The model is similar to the USAF model, but is modified by introducing a failure criterion and a probability of fatal crack occurrence to account for the multiple-site damage phenomenon. NDI techniques have been incorporated in the model, which can be used to evaluate the structural reliability, the detectability of fatigue damage (cracks), and the risk of failure based on NDI results taken from samples. A practical example is provided for rivet fasteners and bolted fasteners. It is shown that the model can be used even if it is based on conventional S-N coupon experiments, provided that further fractographic inspections are made for cracks on the broken surfaces of specimens.
Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation
NASA Technical Reports Server (NTRS)
DePriest, Douglas; Morgan, Carolyn
2003-01-01
The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models in accurately predicting the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.
Statistical methods for the geographical analysis of rare diseases.
Gómez-Rubio, Virgilio; López-Quílez, Antonio
2010-01-01
In this chapter we provide a summary of different methods for the detection of disease clusters. First of all, we give a summary of methods for computing estimates of the relative risk. These estimates provide smoothed values of the relative risks that can account for their spatial variation. Some methods for assessing spatial autocorrelation and general clustering are also discussed to test for significant spatial variation of the risk. In order to find the actual location of the clusters, scan methods are introduced. The spatial scan statistic is discussed, as well as its extension by means of Generalised Linear Models, which allows for the inclusion of covariates and cluster effects. In this context, zero-inflated models are introduced to account for the high number of zeros that appear when studying rare diseases. Finally, two applications of these methods are shown using data of Systemic Lupus Erythematosus in Spain and brain cancer in Navarre (Spain).
Statistical analysis of loopy belief propagation in random fields
NASA Astrophysics Data System (ADS)
Yasuda, Muneki; Kataoka, Shun; Tanaka, Kazuyuki
2015-10-01
Loopy belief propagation (LBP), which is equivalent to the Bethe approximation in statistical mechanics, is a message-passing-type inference method that is widely used to analyze systems based on Markov random fields (MRFs). In this paper, we propose a message-passing-type method to analytically evaluate the quenched average of LBP in random fields by using the replica cluster variation method. The proposed analytical method is applicable to general pairwise MRFs with random fields whose distributions differ from each other and can give the quenched averages of the Bethe free energies over random fields, which are consistent with numerical results. The order of its computational cost is equivalent to that of standard LBP. In the latter part of this paper, we describe the application of the proposed method to Bayesian image restoration, in which we observed that our theoretical results are in good agreement with the numerical results for natural images.
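The message-passing scheme that LBP applies to a pairwise MRF can be illustrated on a small binary model. The sketch below implements standard sum-product LBP (not the authors' replica cluster variation method); on a tree-structured graph it reproduces the exact marginals.

```python
import numpy as np

def lbp_marginals(h, edges, J, n_iter=50):
    """Sum-product loopy belief propagation for a binary pairwise MRF.

    States s_i in {-1, +1}; p(s) ∝ exp(sum_i h_i*s_i + sum_(i,j) J_ij*s_i*s_j).
    Returns a list of per-node marginal distributions over (-1, +1)."""
    states = np.array([-1.0, 1.0])
    n = len(h)
    coup, nbrs = {}, {i: [] for i in range(n)}
    for (i, j), jij in zip(edges, J):
        coup[(i, j)] = coup[(j, i)] = jij
        nbrs[i].append(j)
        nbrs[j].append(i)
    # initialize all directed messages uniformly
    msg = {(i, j): np.ones(2) / 2 for (i, j) in coup}
    for _ in range(n_iter):                      # synchronous ("flooding") updates
        new = {}
        for (i, j) in msg:
            # m_{i->j}(s_j) = sum_{s_i} phi_i(s_i) psi_ij(s_i,s_j)
            #                 * prod_{k in N(i)\j} m_{k->i}(s_i)
            out = np.zeros(2)
            for b, sj in enumerate(states):
                for a, si in enumerate(states):
                    t = np.exp(h[i] * si + coup[(i, j)] * si * sj)
                    for k in nbrs[i]:
                        if k != j:
                            t *= msg[(k, i)][a]
                    out[b] += t
            new[(i, j)] = out / out.sum()
        msg = new
    beliefs = []
    for i in range(n):
        b = np.exp(h[i] * states)
        for k in nbrs[i]:
            b *= msg[(k, i)]
        beliefs.append(b / b.sum())
    return beliefs
```

The quenched-average analysis in the paper studies the behaviour of exactly this kind of update when the fields h are random.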
Statistical analysis of Nomao customer votes for spots of France
NASA Astrophysics Data System (ADS)
Pálovics, Róbert; Daróczy, Bálint; Benczúr, András; Pap, Julia; Ermann, Leonardo; Phan, Samuel; Chepelianskii, Alexei D.; Shepelyansky, Dima L.
2015-08-01
We investigate the statistical properties of votes of customers for spots of France collected by the startup company Nomao. The frequencies of votes per spot and per customer are characterized by a power-law distribution which remains stable on a time scale of a decade when the number of votes is varied by almost two orders of magnitude. Using computer science methods we explore the spectrum and the eigenvalues of a matrix containing user ratings of geolocalized items. Eigenvalues nicely map to large towns and regions but show a certain level of instability as we modify the interpretation of the underlying matrix. We evaluate imputation strategies that provide improved prediction performance by reaching geographically smooth eigenvectors. We point out possible links between the distribution of votes and the phenomenon of self-organized criticality.
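A power-law exponent such as the one reported for the vote frequencies is commonly estimated with the maximum-likelihood (Hill) estimator; the snippet below is a generic sketch, not Nomao's pipeline.

```python
import numpy as np

def hill_exponent(x, x_min):
    """Maximum-likelihood (Hill) estimate of the exponent alpha of a
    power law p(x) ∝ x^(-alpha), using only the samples with x >= x_min."""
    x = np.asarray(x, dtype=float)
    tail = x[x >= x_min]
    return 1.0 + tail.size / np.sum(np.log(tail / x_min))
```

In practice x_min is itself chosen by a goodness-of-fit scan rather than fixed a priori.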
Statistical analysis of a global photochemical model of the atmosphere
NASA Astrophysics Data System (ADS)
Frol'Kis, V. A.; Karol', I. L.; Kiselev, A. A.; Ozolin, Yu. E.; Zubov, V. A.
2007-08-01
This is a study of the sensitivity of model results (atmospheric content of main gas constituents and radiative characteristics of the atmosphere) to errors in emissions of a number of atmospheric gaseous pollutants. Groups of the model variables most dependent on these errors are selected. Two variants of emissions are considered: one without their evolution and the other with their variation according to the IPCC scenario. The estimates are made on the basis of standard statistical methods for the results obtained with the detailed one-dimensional radiative-photochemical model of the Main Geophysical Observatory (MGO). Some approaches to such estimations with models of higher complexity and to the solution of the inverse problem (i.e., the estimation of the necessary accuracy of external model parameters for obtaining the given accuracy of model results) are outlined.
In-Situ Statistical Analysis of Autotune Simulation Data using Graphical Processing Units
Ranjan, Niloo; Sanyal, Jibonananda; New, Joshua Ryan
2013-08-01
Developing accurate building energy simulation models to assist energy efficiency at speed and scale is one of the research goals of the Whole-Building and Community Integration group, which is a part of the Building Technologies Research and Integration Center (BTRIC) at Oak Ridge National Laboratory (ORNL). The aim of the Autotune project is to speed up the automated calibration of building energy models to match measured utility or sensor data. The workflow of this project takes input parameters and runs EnergyPlus simulations on Oak Ridge Leadership Computing Facility's (OLCF) computing resources such as Titan, the world's second fastest supercomputer. Multiple simulations run in parallel on nodes having 16 processors each and a Graphics Processing Unit (GPU). Each node produces 5.7 GB of output comprising 256 files from 64 simulations. Four types of output data, covering monthly, daily, hourly, and 15-minute time steps for each annual simulation, are produced. A total of 270 TB+ of data has been produced. In this project, the simulation data is statistically analyzed in situ using GPUs while annual simulations are being computed on the traditional processors. Titan, with its recent addition of 18,688 Compute Unified Device Architecture (CUDA) capable NVIDIA GPUs, has greatly extended its capability for massively parallel data processing. CUDA is used along with C/MPI to calculate statistical metrics such as sum, mean, variance, and standard deviation, leveraging GPU acceleration. The workflow developed in this project produces statistical summaries of the data, which reduces by multiple orders of magnitude the time and amount of data that need to be stored. These statistical capabilities are anticipated to be useful for sensitivity analysis of EnergyPlus simulations.
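The in-situ reduction described above hinges on one-pass statistics: each chunk of simulation output is summarized and merged into running totals (Welford/Chan style) so the raw data never needs to be kept. Below is a minimal CPU sketch in Python; the actual project used CUDA with C/MPI.

```python
import numpy as np

def streaming_stats(chunks):
    """One-pass accumulation of count, mean and (population) variance over
    data that arrives chunk by chunk, using Chan's parallel-merge update so
    each chunk can be reduced and then discarded."""
    n, mean, m2 = 0, 0.0, 0.0
    for chunk in chunks:
        c = np.asarray(chunk, dtype=float)
        cn, cmean = c.size, c.mean()
        cm2 = ((c - cmean) ** 2).sum()
        delta = cmean - mean
        tot = n + cn
        m2 += cm2 + delta * delta * n * cn / tot   # merge sum of squared deviations
        mean += delta * cn / tot                   # merge means
        n = tot
    return n, mean, (m2 / n if n else 0.0)
```

The merge is associative, which is what makes the same scheme work across thousands of GPU threads or MPI ranks.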
Statistical analysis of 2AFC contrast threshold measurements
NASA Astrophysics Data System (ADS)
Tchou, Philip; Flynn, Michael
2005-04-01
Most prior 2AFC experiments have been designed using a small number of signal strengths with many scenes for each strength. Percent correct is then computed for each level and fit to the assumed psychometric function. However, this introduces error because the signal strengths of individual responses are shifted. An alternative approach is to compute the statistical likelihood as a function of the threshold and width of the psychometric response curve. The best fit is then determined by finding the threshold and width that maximize the likelihood. In this paper, we discuss a method for analyzing 2AFC observer responses using maximum likelihood estimation (MLE) techniques. The logit model is used to represent the psychometric function and derive the likelihood. A conjugate gradient search algorithm is then used to find the maximum likelihood. The method is illustrated using human observer results from a previous study, while statistical characteristics of the method are examined using simulated response data. The human observer results show that the psychometric function varies between observers and from test to test. The simulations show that the variance of the threshold and width exhibits a 1/Nobs relationship (σ = 1.5201·Nobs^(-0.5236)), where Nobs is the number of observations made in a 2AFC test, ranging from 10 to 30000. The variance of the human observer data was in close agreement with the simulations. These results indicate that the method is robust over a wide range of observations and can be used to predict human responses. The results of the simulations also suggest how to minimize error in future studies.
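The per-trial maximum-likelihood fit described above can be sketched as follows. The logit psychometric form is from the abstract, but the optimizer choice (Nelder-Mead rather than the paper's conjugate gradient search) is an illustrative substitution.

```python
import numpy as np
from scipy.optimize import minimize

def fit_2afc_logit(signal, correct):
    """Maximum-likelihood fit of a 2AFC psychometric curve
    P(correct | x) = 0.5 + 0.5 / (1 + exp(-(x - t) / w)),
    with threshold t and width w. Each trial contributes its own signal
    strength, so no binning into percent-correct levels is needed."""
    x = np.asarray(signal, dtype=float)
    y = np.asarray(correct, dtype=float)

    def nll(params):                       # negative log-likelihood
        t, log_w = params                  # fit log(w) to keep w positive
        p = 0.5 + 0.5 / (1.0 + np.exp(-(x - t) / np.exp(log_w)))
        p = np.clip(p, 1e-9, 1 - 1e-9)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    res = minimize(nll, x0=[np.median(x), 0.0], method="Nelder-Mead")
    t, log_w = res.x
    return t, np.exp(log_w)
```

Note the 0.5 floor in P(correct): in a two-alternative task an observer guessing at chance is still right half the time.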
Power flow as a complement to statistical energy analysis and finite element analysis
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1987-01-01
Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies, where the number of resonances involved in the analysis is rather small. On the other hand, SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies, where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist, which makes the finite element method too costly, while SEA methods can only predict an average response level. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. The power flow technique can be extended from low to high frequencies, and can be integrated with established FE models at low frequencies and SEA models at high frequencies to provide a verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.
Statistical analysis of imperfection effect on cylindrical buckling response
NASA Astrophysics Data System (ADS)
Ismail, M. S.; Purbolaksono, J.; Muhammad, N.; Andriyana, A.; Liew, H. L.
2015-12-01
It is widely reported that no efficient guidelines for modelling imperfections in composite structures are available. In response, this work evaluates the imperfection factors of an axially compressed Carbon Fibre Reinforced Polymer (CFRP) cylinder with different ply angles through finite element (FE) analysis. The sensitivity of the imperfection factors was analysed using a design-of-experiments (factorial design) approach. The analysis identified three critical factors to which the buckling load is sensitive. Furthermore, an empirical equation is proposed for each type of cylinder. The critical buckling loads estimated by the empirical equations showed good agreement with FE analysis. The design-of-experiments methodology is useful in identifying the parameters that determine a structure's imperfection tolerance.
Rashid, Naim U; Sun, Wei; Ibrahim, Joseph G
2014-01-01
In DAE (DNA After Enrichment)-seq experiments, genomic regions related to certain biological processes are enriched/isolated by an assay and are then sequenced on a high-throughput sequencing platform to determine their genomic positions. Statistical analysis of DAE-seq data aims to detect genomic regions with significant aggregations of isolated DNA fragments ("enriched regions") versus all the other regions ("background"). However, many confounding factors may influence DAE-seq signals. In addition, the signals in adjacent genomic regions may exhibit strong correlations, which invalidate the independence assumption employed by many existing methods. To mitigate these issues, we develop a novel Autoregressive Hidden Markov Model (AR-HMM) to account for covariate effects and violations of the independence assumption. We demonstrate that our AR-HMM leads to improved performance in identifying enriched regions in both simulated and real datasets, especially in epigenetic datasets with broader regions of DAE-seq signal enrichment. We also introduce a variable selection procedure in the context of the HMM/AR-HMM, where the observations are not independent and the mean value of each state-specific emission distribution is modeled by covariates. We study the theoretical properties of this variable selection procedure and demonstrate its efficacy in simulated and real DAE-seq data. In summary, we develop several practical approaches for DAE-seq data analysis that are also applicable to more general problems in statistics.
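The HMM machinery underlying such segmentation can be illustrated with plain Viterbi decoding of a two-state ("background" vs "enriched") Gaussian-emission HMM; this sketch omits the autoregressive emissions and covariates that distinguish the authors' AR-HMM.

```python
import numpy as np

def viterbi_gaussian(obs, log_pi, log_A, means, sds):
    """Viterbi decoding for an HMM with Gaussian emissions: the
    independent-emission special case of an AR-HMM. Returns the most
    probable hidden state path (e.g. 0 = background, 1 = enriched)."""
    obs = np.asarray(obs, dtype=float)
    K, T = len(means), obs.size

    def log_emit(t):
        return -0.5 * ((obs[t] - means) / sds) ** 2 - np.log(sds)

    delta = log_pi + log_emit(0)
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + log_A        # cand[i, j]: come from i, go to j
        back[t] = np.argmax(cand, axis=0)
        delta = np.max(cand, axis=0) + log_emit(t)
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):            # backtrace
        path[t - 1] = back[t, path[t]]
    return path
```

The autoregressive extension would make each emission mean depend on the previous observation as well as on covariates.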
Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell
2012-01-01
Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.
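The PCA step used to compare the pretreated chromatograms can be sketched with an SVD-based projection; this generic snippet is not the study's code.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project mean-centred chromatograms (rows = samples, columns =
    retention-time points) onto their first principal components and
    return the scores used for visual and statistical comparison."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores = data on PC axes
```

Distance measures, cluster analysis, and t-tests such as those listed in the abstract would then operate on these score coordinates.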
Statistical analysis of suicide characteristics in Iaşi County.
Herea, Speranta-Giulia; Scripcaru, C
2012-01-01
A prospective study for the statistical analysis of suicide events occurring in Iaşi County in the 2004-2009 period was performed. Specific data emerged from the conventional investigation, focusing on the sex, age, seasonality, marital condition, occupation status, blood alcohol concentration, religious adherence, and previous suicide attempts of the persons who committed the lethal self-aggression. The results showed a male:female (M:F) ratio of 4.13:1, a central tendency towards 46 years, a mean age of 45 years, and a most frequent age of 49 years. The interquartile range extended from 33 to 56 years. The rural:urban (R:U) ratio was 1.38:1, and a statistically significant seasonal variation was found in villages. Suicide events occurred more frequently around Easter and Christmas, and Orthodox Christian believers seemed to commit suicide more often than Catholics. Additionally, a correlated analysis, based essentially on data provided by the local Institute of Legal Medicine and the Psychiatry Hospital, offered a comprehensive understanding of the mental state of the suicide victims and their psychiatric profile.
Image analysis of gunshot residue on entry wounds. II--A statistical estimation of firing range.
Brown, H; Cauchi, D M; Holden, J L; Allen, F C; Cordner, S; Thatcher, P
1999-03-29
A statistical investigation of the relationship between firing range and the amount and distribution of gunshot residue (GSR) used automated image analysis (IA) to quantify the GSR deposit resulting from firings into pig skin from distances ranging between contact and 45 cm. Overall, for a Ruger .22 semi-automatic rifle using CCI solid point, high velocity ammunition, the total area of GSR deposit on the skin sections decreased in a non-linear fashion with firing range. More specifically, there were significant differences in the amount of GSR deposited from shots fired at contact compared with shots fired from distances between 2.5 and 45 cm, and between shots fired from a distance of 20 cm or less and shots fired from a distance of 30 cm or more. In addition, GSR particles were heavily concentrated in the wound tract only for contact and close-range shots at 2.5 cm, while the particle distribution was more uniform between the wound tract and the skin surfaces for shots fired from distances greater than 2.5 cm. Consequently, for future scientific investigations of gunshot fatalities, once standards have been established for the weapon and ammunition type in question, image analysis quantification of GSR deposited in and around the gunshot wound may be capable of providing a reliable statistical basis for estimating firing range.
Application of the Statistical ICA Technique in the DANCE Data Analysis
NASA Astrophysics Data System (ADS)
Baramsai, Bayarbadrakh; Jandel, M.; Bredeweg, T. A.; Rusev, G.; Walker, C. L.; Couture, A.; Mosby, S.; Ullmann, J. L.; Dance Collaboration
2015-10-01
The Detector for Advanced Neutron Capture Experiments (DANCE) at the Los Alamos Neutron Science Center is used to improve our understanding of the neutron capture reaction. DANCE is a highly efficient 4π γ-ray detector array consisting of 160 BaF2 crystals, which makes it an ideal tool for neutron capture experiments. The (n, γ) reaction Q-value equals the total energy of all γ-rays emitted in the de-excitation cascades from the excited capture state to the ground state. The total γ-ray energy is used to identify reactions on different isotopes as well as the background. However, it is challenging to separate contributions to the Esum spectra from different isotopes with similar Q-values. Recently we have tested the applicability of modern statistical methods such as Independent Component Analysis (ICA) to identify and separate the (n, γ) reaction yields from the different isotopes present in the target material. ICA is a recently developed computational tool for separating multidimensional data into statistically independent additive subcomponents. In this conference talk, we present some results of the application of ICA algorithms and their modification for the DANCE experimental data analysis. This research is supported by the U.S. Department of Energy, Office of Science, Nuclear Physics under the Early Career Award No. LANL20135009.
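The blind-source-separation idea behind ICA can be demonstrated with scikit-learn's FastICA on synthetic mixed signals; this is a generic illustration, not the DANCE analysis code.

```python
import numpy as np
from sklearn.decomposition import FastICA

def separate_components(mixed, n_components=2, seed=0):
    """Unmix linearly mixed, statistically independent signals with FastICA.

    `mixed` has one observed mixture per row; the rows of the return value
    are the estimated independent sources (up to sign and scale)."""
    ica = FastICA(n_components=n_components, random_state=seed)
    return ica.fit_transform(mixed.T).T
```

In the DANCE setting, the "mixtures" would be measured Esum spectra and the recovered components the per-isotope reaction yields.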
Additional challenges for uncertainty analysis in river engineering
NASA Astrophysics Data System (ADS)
Berends, Koen; Warmink, Jord; Hulscher, Suzanne
2016-04-01
the proposed intervention. The implicit assumption underlying such analysis is that both models are commensurable. We hypothesize that they are commensurable only to a certain extent. In an idealised study we have demonstrated that prediction performance loss should be expected with increasingly large engineering works. When accounting for parametric uncertainty of floodplain roughness in model identification, we see uncertainty bounds for predicted effects of interventions increase with increasing intervention scale. Calibration of these types of models therefore seems to have a shelf-life, beyond which calibration no longer improves prediction. Therefore a qualification scheme for model use is required that can be linked to model validity. In this study, we characterize model use along three dimensions: extrapolation (using the model with different external drivers), extension (using the model for different output or indicators) and modification (using modified models). Such use of models is expected to have implications for the applicability of surrogate modelling for efficient uncertainty analysis as well, which is recommended for future research. Warmink, J. J.; Straatsma, M. W.; Huthoff, F.; Booij, M. J. & Hulscher, S. J. M. H. 2013. Uncertainty of design water levels due to combined bed form and vegetation roughness in the Dutch river Waal. Journal of Flood Risk Management 6, 302-318. DOI: 10.1111/jfr3.12014
Kinetic analysis of microbial respiratory response to substrate addition
NASA Astrophysics Data System (ADS)
Blagodatskaya, Evgenia; Blagodatsky, Sergey; Yuyukina, Tatayna; Kuzyakov, Yakov
2010-05-01
The heterotrophic component of CO2 emitted from soil is mainly due to the respiratory activity of soil microorganisms. Field measurements of microbial respiration can be used for estimation of the C budget in soil, while laboratory estimation of respiration kinetics allows the elucidation of mechanisms of soil C sequestration. Physiological approaches based on 1) time-dependent or 2) substrate-dependent respiratory response of soil microorganisms decomposing organic substrates allow us to relate the functional properties of the soil microbial community with decomposition rates of soil organic matter (SOM). We used a novel methodology combining (i) microbial growth kinetics and (ii) enzyme affinity to the substrate to show the shift in functional properties of the soil microbial community after amendments with substrates of contrasting availability. We combined the application of 14C-labeled glucose as an easily available C source with natural isotope labeling of old and young SOM. The possible contribution of two processes, isotopic fractionation and preferential substrate utilization, to the shifts in δ13C during SOM decomposition in soil after C3-C4 vegetation change was evaluated. The specific growth rate (µ) of soil microorganisms was estimated by fitting the parameters of the equation v(t) = A + B * exp(µ*t) to the measured CO2 evolution rate v(t) after glucose addition, where A is the initial rate of non-growth respiration and B is the initial rate of the growing fraction of total respiration. The maximal mineralization rate (Vmax), substrate affinity of microbial enzymes (Ks) and substrate availability (Sn) were determined by Michaelis-Menten kinetics. To study the effect of plant-originated C on the δ13C signature of SOM we compared the changes in isotopic composition of different C pools in a C3 soil under grassland with a C3-C4 soil where the C4 plant Miscanthus giganteus was grown for 12 years on a plot after grassland. The shift in δ13C caused by planting of M. giganteus
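Fitting the stated growth equation v(t) = A + B*exp(µ*t) to measured respiration rates is a standard non-linear least-squares problem; the sketch below uses assumed starting values, not the study's fitting protocol.

```python
import numpy as np
from scipy.optimize import curve_fit

def respiration_model(t, A, B, mu):
    """CO2 evolution rate after glucose addition: v(t) = A + B*exp(mu*t),
    with A the non-growth respiration rate and B the initial rate of the
    growing fraction."""
    return A + B * np.exp(mu * t)

def fit_growth_kinetics(t, v):
    """Least-squares estimate of (A, B, mu) from measured respiration rates.
    The starting guesses are assumptions for illustration."""
    p0 = [np.min(v), 1.0, 0.1]
    popt, _ = curve_fit(respiration_model, t, v, p0=p0, maxfev=10000)
    return popt
```

The fitted µ is the specific growth rate that the abstract uses to characterize the microbial community.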
New Statistical Approach to the Analysis of Hierarchical Data
NASA Astrophysics Data System (ADS)
Neuman, S. P.; Guadagnini, A.; Riva, M.
2014-12-01
Many variables possess a hierarchical structure reflected in how their increments vary in space and/or time. Quite commonly the increments (a) fluctuate in a highly irregular manner; (b) possess symmetric, non-Gaussian frequency distributions characterized by heavy tails that often decay with separation distance or lag; (c) exhibit nonlinear power-law scaling of sample structure functions in a midrange of lags, with breakdown in such scaling at small and large lags; (d) show extended power-law scaling (ESS) at all lags; and (e) display nonlinear scaling of power-law exponent with order of sample structure function. Some interpret this to imply that the variables are multifractal, which explains neither breakdowns in power-law scaling nor ESS. We offer an alternative interpretation consistent with all above phenomena. It views data as samples from stationary, anisotropic sub-Gaussian random fields subordinated to truncated fractional Brownian motion (tfBm) or truncated fractional Gaussian noise (tfGn). The fields are scaled Gaussian mixtures with random variances. Truncation of fBm and fGn entails filtering out components below data measurement or resolution scale and above domain scale. Our novel interpretation of the data allows us to obtain maximum likelihood estimates of all parameters characterizing the underlying truncated sub-Gaussian fields. These parameters in turn make it possible to downscale or upscale all statistical moments to situations entailing smaller or larger measurement or resolution and sampling scales, respectively. They also allow one to perform conditional or unconditional Monte Carlo simulations of random field realizations corresponding to these scales. Aspects of our approach are illustrated on field and laboratory measured porous and fractured rock permeabilities, as well as soil texture characteristics and neural network estimates of unsaturated hydraulic parameters in a deep vadose zone near Phoenix, Arizona. We also use our approach
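The sample structure functions whose scaling is analyzed above are simple to compute; the sketch below evaluates S_q(lag) = mean(|x(t+lag) - x(t)|^q) and, for ordinary Brownian motion, recovers the expected linear scaling of S_2 with lag.

```python
import numpy as np

def structure_function(x, lags, q):
    """Sample structure function of order q:
    S_q(lag) = mean over t of |x(t + lag) - x(t)|^q."""
    x = np.asarray(x, dtype=float)
    return np.array([np.mean(np.abs(x[l:] - x[:-l]) ** q) for l in lags])
```

Power-law scaling of S_q in a midrange of lags, and its breakdown at small and large lags, is exactly what the truncated sub-Gaussian model is built to reproduce.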
Does published orthodontic research account for clustering effects during statistical data analysis?
Koletsi, Despina; Pandis, Nikolaos; Polychronopoulou, Argy; Eliades, Theodore
2012-06-01
In orthodontics, multiple site observations within patients or multiple observations collected at consecutive time points are often encountered. Clustered designs require larger sample sizes compared to individually randomized trials and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this study to assess to what degree clustering effects are considered during design and data analysis in the three major orthodontic journals. The contents of the most recent 24 issues of the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO), Angle Orthodontist (AO), and European Journal of Orthodontics (EJO) from December 2010 backwards were hand searched. Articles with clustering effects and whether the authors accounted for clustering effects were identified. Additionally, information was collected on: involvement of a statistician, single or multicenter study, number of authors in the publication, geographical area, and statistical significance. From the 1584 articles, after exclusions, 1062 were assessed for clustering effects, from which 250 (23.5 per cent) were considered to have clustering effects in the design (kappa = 0.92, 95 per cent CI: 0.67-0.99 for inter-rater agreement). From the studies with clustering effects only, 63 (25.20 per cent) had indicated accounting for clustering effects. There was evidence that the studies published in the AO have higher odds of accounting for clustering effects (AO versus AJODO: odds ratio (OR) = 2.17, 95 per cent confidence interval (CI): 1.06-4.43, P = 0.03; EJO versus AJODO: OR = 1.90, 95 per cent CI: 0.84-4.24, non-significant; and EJO versus AO: OR = 1.15, 95 per cent CI: 0.57-2.33, non-significant). The results of this study indicate that only about a quarter of the studies with clustering effects account for this in statistical data analysis.
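Odds ratios with Wald confidence intervals like those quoted above come from standard 2x2-table formulas, sketched here; the counts in the example are hypothetical, not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table:
    a, b = events / non-events in group 1; c, d = the same in group 2.
    z = 1.96 gives an approximate 95 per cent interval."""
    orr = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log odds ratio
    lo = math.exp(math.log(orr) - z * se)
    hi = math.exp(math.log(orr) + z * se)
    return orr, lo, hi
```

An interval that straddles 1 corresponds to the "non-significant" comparisons in the abstract.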
Ockham's razor and Bayesian analysis [statistical theory for systems evaluation]
NASA Technical Reports Server (NTRS)
Jefferys, William H.; Berger, James O.
1992-01-01
'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
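The automatic Ockham's razor can be made concrete with a coin-flip toy example (not from the article): a fixed fair-coin hypothesis has no adjustable parameter, while an alternative with a free bias parameter spreads its prior over all values of p, diluting its marginal likelihood on near-even data.

```python
from math import comb

def bayes_factor_simple_vs_complex(k, n):
    """Marginal likelihood ratio for k heads in n flips under:
    M0: p fixed at 1/2 (no free parameter), versus
    M1: uniform prior on p (one free parameter; the binomial
    likelihood integrates to exactly 1/(n+1))."""
    m0 = comb(n, k) * 0.5 ** n
    m1 = 1.0 / (n + 1)
    return m0 / m1

print(bayes_factor_simple_vs_complex(50, 100))   # > 1: simpler model favored
print(bayes_factor_simple_vs_complex(90, 100))   # << 1: data demand the extra parameter
```

The sharper predictions of M0 earn it an enhanced posterior probability exactly when the data fall where it predicts.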
Statistical language analysis for automatic exfiltration event detection.
Robinson, David Gerald
2010-04-01
This paper discusses the recent development of a statistical approach for the automatic identification of anomalous network activity that is characteristic of exfiltration events. The approach is based on the language processing method referred to as latent Dirichlet allocation (LDA). Cyber security experts currently depend heavily on a rule-based framework for initial detection of suspect network events. The application of the rule set typically results in an extensive list of suspect network events that are then further explored manually for suspicious activity. The ability to identify anomalous network events is heavily dependent on the experience of the security personnel wading through the network log. Limitations of this approach are clear: rule-based systems only apply to exfiltration behavior that has previously been observed, and experienced cyber security personnel are rare commodities. Since the new methodology is not a discrete rule-based approach, it is more difficult for an insider to disguise exfiltration events. A further benefit is that the methodology provides a risk-based approach that can be implemented in a continuous, dynamic, or evolutionary fashion. This permits suspect network activity to be identified early, with a quantifiable risk associated with decision making when responding to suspicious activity.
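As a sketch of the general idea only (the report's actual feature extraction and scoring are not specified here), an LDA topic model can be fit to routine event text and new events flagged when the model explains them poorly; the log lines below are invented:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Invented log lines standing in for parsed network-event records.
baseline = ["dns lookup internal host", "http get internal portal",
            "smtp send internal relay", "http get internal portal"] * 25
suspect = ["dns lookup tunnel payload burst"]

vec = CountVectorizer()
X = vec.fit_transform(baseline)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Events the topic model explains poorly (high perplexity) become
# candidates for manual review as possible exfiltration.
perp = lda.perplexity(vec.transform(suspect))
print(perp)
```

Because the score is continuous, thresholds can be tuned to a risk level rather than to a fixed rule set.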
Performance analysis of LVQ algorithms: a statistical physics approach.
Ghosh, Anarta; Biehl, Michael; Hammer, Barbara
2006-01-01
Learning vector quantization (LVQ) constitutes a powerful and intuitive method for adaptive nearest prototype classification. However, original LVQ has been introduced based on heuristics and numerous modifications exist to achieve better convergence and stability. Recently, a mathematical foundation by means of a cost function has been proposed which, as a limiting case, yields a learning rule similar to classical LVQ2.1. It also motivates a modification which shows better stability. However, the exact dynamics as well as the generalization ability of many LVQ algorithms have not been thoroughly investigated so far. Using concepts from statistical physics and the theory of on-line learning, we present a mathematical framework to analyse the performance of different LVQ algorithms in a typical scenario in terms of their dynamics, sensitivity to initial conditions, and generalization ability. Significant differences in the algorithmic stability and generalization ability can be found already for slightly different variants of LVQ. We study five LVQ algorithms in detail: Kohonen's original LVQ1, unsupervised vector quantization (VQ), a mixture of VQ and LVQ, LVQ2.1, and a variant of LVQ which is based on a cost function. Surprisingly, basic LVQ1 shows very good performance in terms of stability, asymptotic generalization ability, and robustness to initializations and model parameters which, in many cases, is superior to recent alternative proposals.
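Kohonen's LVQ1 heuristic named above is simple enough to state in a few lines; the two-Gaussian toy data below are illustrative, not from the paper's statistical-physics analysis:

```python
import numpy as np

def lvq1_step(w, labels, x, y, lr=0.05):
    """One LVQ1 update: move the winning (nearest) prototype toward x
    if the class labels match, and away from x otherwise."""
    j = np.argmin(((w - x) ** 2).sum(axis=1))   # nearest prototype
    sign = 1.0 if labels[j] == y else -1.0
    w[j] += sign * lr * (x - w[j])
    return j

rng = np.random.default_rng(0)
w = np.array([[0.0, 0.0], [1.0, 1.0]])          # one prototype per class
labels = [0, 1]
for _ in range(200):
    y = int(rng.integers(2))
    x = rng.normal(loc=y, scale=0.3, size=2)    # two Gaussian classes
    lvq1_step(w, labels, x, y)
print(w)   # prototypes stay near the class means (0,0) and (1,1)
```

The repulsive term for mismatched labels is exactly the source of the stability issues the paper analyzes in LVQ2.1-type variants.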
On the Statistical Analysis of X-ray Polarization Measurements
NASA Technical Reports Server (NTRS)
Strohmayer, T. E.; Kallman, T. R.
2013-01-01
In many polarimetry applications, including observations in the X-ray band, the measurement of a polarization signal can be reduced to the detection and quantification of a deviation from uniformity of a distribution of measured angles of the form α + β cos²(φ − φ₀), with 0 < φ < π. We explore the statistics of such polarization measurements using both Monte Carlo simulations and analytic calculations based on the appropriate probability distributions. We derive relations for the number of counts required to reach a given detection level (parameterized by n_σ, the "number of sigmas" of the measurement) appropriate for measuring the modulation amplitude by itself (single interesting parameter case) or jointly with the position angle φ₀ (two interesting parameters case). We show that for the former case, when the intrinsic amplitude is equal to the well-known minimum detectable polarization (MDP), it is, on average, detected at the 3σ level. For the latter case, when one requires a joint measurement at the same confidence level, more counts are needed, by a factor of approximately 2.2, than are required to achieve the MDP level. We find that the position angle uncertainty at 1σ confidence is well described by the relation σ_{φ₀} = 28.5° / n_σ.
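The role of the MDP can be checked with a quick Monte Carlo of the kind described above; the sketch below uses the standard amplitude estimator and the classic 99% MDP formula MDP ≈ 4.29/√N for an unpolarized (uniform-angle) source, which is a textbook relation rather than a result quoted in this abstract:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000            # counts per simulated observation
trials = 2000

phi = rng.uniform(0, np.pi, size=(trials, N))   # unpolarized: uniform angles
c = np.cos(2 * phi).sum(axis=1)
s = np.sin(2 * phi).sum(axis=1)
a_hat = 2.0 * np.sqrt(c**2 + s**2) / N          # measured modulation amplitude

mdp99 = 4.29 / np.sqrt(N)    # 99% minimum detectable polarization
frac = (a_hat > mdp99).mean()
print(frac)   # ~0.01: about 1% of unpolarized trials exceed the MDP
```

This is the single-parameter false-alarm rate; the abstract's factor of ~2.2 arises when the position angle must be measured jointly.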
A statistical mechanics analysis of the set covering problem
NASA Astrophysics Data System (ADS)
Fontanari, J. F.
1996-02-01
The dependence of the average cost of the optimal solution of the set covering problem on the density of 1's of the incidence matrix and on the number of constraints (P) is investigated in the limit where the number of items (N) goes to infinity. The annealed approximation is employed to study two stochastic models: the constant density model, where the elements of the incidence matrix are statistically independent random variables, and the Karp model, where the rows of the incidence matrix possess the same number of 1's. Lower bounds for the average optimal cost are presented in the case that P scales with ln N and the density is of order 1, as well as in the case that P scales linearly with N and the density is of order 1/N. It is shown that, in the case that P scales with exp N and the density is of order 1, the annealed approximation yields exact results for both models.
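The annealed calculation concerns exact ensemble averages; for intuition about the objects involved, the sketch below generates a random constant-density instance and runs the standard greedy cover heuristic, which is an illustration and not the paper's analytical method:

```python
import numpy as np

def greedy_set_cover(A):
    """Greedy heuristic for set cover: repeatedly pick the column that
    covers the most still-uncovered rows; returns the number of columns
    used (the cost), or None if the instance is infeasible."""
    P, N = A.shape
    uncovered = np.ones(P, dtype=bool)
    cost = 0
    while uncovered.any():
        j = A[uncovered].sum(axis=0).argmax()
        if A[uncovered, j].sum() == 0:
            return None
        uncovered &= ~A[:, j].astype(bool)
        cost += 1
    return cost

rng = np.random.default_rng(2)
# Constant density model: i.i.d. entries, density of 1's = 0.2.
A = (rng.random((30, 200)) < 0.2).astype(int)
cost = greedy_set_cover(A)
print(cost)
```

The greedy cost upper-bounds the optimal cost whose average the annealed approximation bounds from below.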
Statistical analysis of geomagnetic storm driver and intensity
NASA Astrophysics Data System (ADS)
Katus, R. M.; Liemohn, M. W.
2013-05-01
Geomagnetic storms are investigated statistically with respect to the solar wind driver and the intensity of the events. The Hot Electron and Ion Drift Integrator (HEIDI) model was used to simulate all of the intense storms (minimum Dst < -100 nT) from solar cycle 23 (1996-2005). Four different configurations of HEIDI were used to investigate the outer boundary condition and the electric field description. The storms are classified as coronal mass ejection (CME)-driven or corotating interaction region (CIR)-driven events and binned by the magnitude of the minimum Dst. The simulation results, as well as solar wind and geomagnetic data sets, are then analyzed along a normalized epoch timeline. The average behavior of each storm type and the corresponding HEIDI configurations are presented and discussed. It is found that, while the self-consistent electric field better reproduces stronger CME-driven storms, the Volland-Stern electric field does well at reproducing the results for CIR-driven events.
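Analyzing events "along a normalized epoch timeline" means stretching each storm onto a common 0-to-1 axis before averaging; a minimal sketch with invented Dst-like traces (not HEIDI output):

```python
import numpy as np

def superposed_epoch(series_list, n_epoch=100):
    """Normalize each event's time axis to [0, 1] (onset to recovery
    end), resample onto a common grid, and average across events."""
    grid = np.linspace(0.0, 1.0, n_epoch)
    stacked = []
    for t, x in series_list:
        t_norm = (t - t[0]) / (t[-1] - t[0])
        stacked.append(np.interp(grid, t_norm, x))
    return np.mean(stacked, axis=0)

# Two hypothetical Dst-like traces of different durations (hours, nT).
ev1 = (np.linspace(0, 40, 50), -100 * np.sin(np.linspace(0, np.pi, 50)))
ev2 = (np.linspace(0, 60, 80), -120 * np.sin(np.linspace(0, np.pi, 80)))
avg = superposed_epoch([ev1, ev2])
print(avg.min())   # average minimum near -110 nT
```

Binning the storms by minimum Dst before stacking yields the per-intensity average behavior discussed in the abstract.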
SEDA: A software package for the Statistical Earthquake Data Analysis
Lombardi, A. M.
2017-01-01
In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to easily interact with the application, and a computational core of Fortran codes, to guarantee maximum speed. The primary factor driving the development of SEDA is research reproducibility, a growing movement among scientists that is highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs; less care has been taken over graphic appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a set of consistent tools on ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences, and the calculation of forecasts. The peculiarities of the routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package. PMID:28290482
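The ETAS model at the core of SEDA describes seismicity through a conditional intensity: a background rate plus aftershock contributions that decay in time (Omori-Utsu) and grow exponentially with magnitude. A minimal temporal version, with illustrative (uncalibrated) parameter values:

```python
import math

def etas_intensity(t, events, mu=0.2, K=0.05, alpha=1.2, c=0.01, p=1.1, m0=3.0):
    """Temporal ETAS conditional intensity:
    lambda(t) = mu + sum_i K * exp(alpha*(m_i - m0)) / (t - t_i + c)**p
    summed over past events (t_i, m_i). Parameter values are
    illustrative only, not SEDA defaults."""
    rate = mu
    for t_i, m_i in events:
        if t_i < t:
            rate += K * math.exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
    return rate

past = [(0.0, 5.0), (1.0, 4.2)]   # (time in days, magnitude)
print(etas_intensity(2.0, past))
```

Parameter estimation, catalog simulation, and forecasting in ETAS packages are all built on evaluating this intensity.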
Statistical analysis of shear cracks on rock surfaces
NASA Astrophysics Data System (ADS)
Åström, J. A.
2007-04-01
A set of 3873 cracks on exposed granite rock surfaces are analyzed in order to investigate possible fracture mechanisms. The fracture patterns are compared with the Mohr-Coulomb and the Roscoe fracture models, which can be combined into a single fracture scheme. A third model for comparison is based on interacting `penny-shaped' microcracks introduced by Healy et al. [Nature 439, 64 (2006)]. The former models predict a bimodal fracture angle distribution, with two narrow peaks separated by 60°-90° symmetrically on both sides of the direction of the largest principal stress, while the latter predicts a single broader peak in the same direction with standard deviation in the range 15°-20°. The crack length distributions seem consistent with numerical simulation, whereas the fracture patterns are Euclidean rather than fractal. The statistical analyses indicate that none of the models fully describe the fracture patterns. It seems that natural shear fractures easily become a complex combination of different fracture mechanisms.
Statistical analysis of vibration-induced bone and joint damages.
Schenk, T
1995-01-01
Vibration-induced damage to bones and joints remains an occupational disease about which knowledge of causative and moderating factors, and of the resulting damage, is insufficient. For a better understanding of these relationships, retrospective analyses of already acknowledged occupational diseases can also be used. The investigations described here are based on detailed records of 203 occupational diseases acknowledged between 1970 and 1979 in the building industry and the building materials industry of the GDR. The data were gathered from the original documents of the occupational diseases and scaled in cooperation between an industrial engineer and an industrial physician. For the purposes of this investigation, the data are divided into data describing the conditions of the workplace (e.g. material, tools, and posture), the exposure parameters (e.g. beginning of exposure and latency period), and the disease (e.g. anamnestic and radiological data). These data were prepared for use with sophisticated computerized statistical methods. The following analyses were carried out: investigation of the connections between the several characteristics that describe the occupational disease (health damage), including a comparison of the severity of the damage at the individual joints; investigation of the side dependence of the damage; investigation of the influence of the age at the beginning of exposure and the age at acknowledgement of the occupational disease, and thereby of the exposure duration; and investigation of the effect of different occupational and exposure conditions.
Statistical threshold for nonlinear Granger Causality in motor intention analysis.
Liu, MengTing; Kuo, Ching-Chang; Chiu, Alan W L
2011-01-01
Directed influence between multiple channel signal measurements is important for the understanding of large dynamic systems. This research investigates a method to analyze large, complex multivariable systems, using a directional flow measure to extract relevant information related to the functional connectivity between different units in the system. The directional flow measure was computed through nonlinear Granger Causality (GC), which is based on nonlinear predictive models using radial basis functions (RBF). In order to extract relevant information from the causality map, we propose a threshold method that can be set up through a spatial statistical process, in which only the top 20% of causality pathways are shown. We applied this approach to a brain-computer interface (BCI) application to decode the intended arm reaching movement (left, right, and forward) using 128 surface electroencephalography (EEG) electrodes. We also evaluated the importance of selecting the appropriate radius for the region of interest and found that the directions of causal influence of active brain regions were unique with respect to the intended direction.
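Granger causality asks whether adding channel x's past improves prediction of channel y beyond y's own past. The sketch below uses linear least squares as a stand-in for the paper's RBF predictive models, on simulated data where x drives y by construction:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 500
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t-1] + 0.8 * x[t-1] + 0.1 * rng.normal()

def rss(A, b):
    """Residual sum of squares of a least-squares fit of b on A."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return ((b - A @ coef) ** 2).sum()

def granger(src, dst, lag=2):
    """Log ratio of restricted to full residual sums of squares;
    values well above 0 suggest src Granger-causes dst."""
    Y = dst[lag:]
    own = np.array([dst[t-lag:t] for t in range(lag, len(dst))])
    full = np.array([np.r_[dst[t-lag:t], src[t-lag:t]]
                     for t in range(lag, len(dst))])
    return np.log(rss(own, Y) / rss(full, Y))

print(granger(x, y), granger(y, x))   # first clearly larger than second
```

Applying such a measure to all channel pairs yields the causality map to which the paper's spatial threshold (top 20% of pathways) is applied.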
Statistical analysis of AFE GN&C aeropass performance
NASA Technical Reports Server (NTRS)
Chang, Ho-Pen; French, Raymond A.
1990-01-01
Performance of the guidance, navigation, and control (GN&C) system used on the Aeroassist Flight Experiment (AFE) spacecraft has been studied with Monte Carlo techniques. The performance of the AFE GN&C is investigated with a 6-DOF numerical dynamic model which includes a Global Reference Atmospheric Model (GRAM) and a gravitational model with oblateness corrections. The study considers all the uncertainties due to the environment and the system itself. In the AFE's aeropass phase, perturbations on the system performance are caused by an error space which has over 20 dimensions of the correlated/uncorrelated error sources. The goal of this study is to determine, in a statistical sense, how much flight path angle error can be tolerated at entry interface (EI) and still have acceptable delta-V capability at exit to position the AFE spacecraft for recovery. Assuming there is fuel available to produce 380 ft/sec of delta-V at atmospheric exit, a 3-sigma standard deviation in flight path angle error of 0.04 degrees at EI would result in a 98-percent probability of mission success.
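The probabilistic logic of the study can be sketched in a few lines; note that the linear flight-path-angle-to-delta-V sensitivity below is entirely hypothetical (chosen only so the numbers land near the quoted regime), whereas the actual study used a full 6-DOF simulation:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma_fpa = 0.04 / 3        # deg: 3-sigma entry flight path angle error
dv_per_deg = 12000.0        # ft/s per deg: hypothetical linear sensitivity
dv_budget = 380.0           # ft/s of delta-V available at atmospheric exit

# Monte Carlo over the entry-interface flight path angle error.
fpa_err = rng.normal(0.0, sigma_fpa, size=100_000)
dv_needed = np.abs(fpa_err) * dv_per_deg
p_success = (dv_needed <= dv_budget).mean()
print(p_success)
```

In the real analysis the error space has over 20 correlated dimensions, so each Monte Carlo sample is a full trajectory simulation rather than a one-line mapping.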
Docking studies on NSAID/COX-2 isozyme complexes using Contact Statistics analysis
NASA Astrophysics Data System (ADS)
Ermondi, Giuseppe; Caron, Giulia; Lawrence, Raelene; Longo, Dario
2004-11-01
The selective inhibition of COX-2 isozymes should lead to a new generation of NSAIDs with significantly reduced side effects, e.g. celecoxib (Celebrex®) and rofecoxib (Vioxx®). To obtain inhibitors with higher selectivity, it has become essential to gain additional insight into the details of the interactions between COX isozymes and NSAIDs. Although X-ray structures of COX-2 complexed with a small number of ligands are available, experimental data are missing for two well-known selective COX-2 inhibitors (rofecoxib and nimesulide), and the docking results reported are controversial. We use a combination of a traditional docking procedure with a new computational tool (Contact Statistics analysis), which identifies the best orientation among a number of solutions, to shed some light on this topic.
NASA Astrophysics Data System (ADS)
Omura, Masaaki; Yoshida, Kenji; Kohta, Masushi; Kubo, Takabumi; Ishiguro, Toshimichi; Kobayashi, Kazuto; Hozumi, Naohiro; Yamaguchi, Tadashi
2016-07-01
To characterize skin ulcers for bacterial infection, quantitative ultrasound (QUS) parameters were estimated by the multiple statistical analysis of the echo amplitude envelope based on both Weibull and generalized gamma distributions and the ratio of mean to standard deviation of the echo amplitude envelope. Measurement objects were three rat models (noninfection, critical colonization, and infection models). Ultrasound data were acquired using a modified ultrasonic diagnosis system with a center frequency of 11 MHz. In parallel, histopathological images and two-dimensional map of speed of sound (SoS) were observed. It was possible to detect typical tissue characteristics such as infection by focusing on the relationship of QUS parameters and to indicate the characteristic differences that were consistent with the scatterer structure. Additionally, the histopathological characteristics and SoS of noninfected and infected tissues were matched to the characteristics of QUS parameters in each rat model.
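Two of the QUS quantities named above are easy to illustrate: a Weibull fit to the echo-amplitude envelope and the mean-to-standard-deviation ratio. The envelope samples below are simulated Rayleigh data (fully developed speckle, which corresponds to a Weibull shape parameter near 2), not measurements from the rat models:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Simulated echo-amplitude envelope: Rayleigh statistics stand in for
# a region of diffuse scatterers.
env = stats.rayleigh.rvs(scale=1.0, size=5000, random_state=rng)

# Weibull fit (location fixed at 0) and the envelope SNR parameter.
shape, loc, scale = stats.weibull_min.fit(env, floc=0)
snr = env.mean() / env.std()
print(shape, snr)   # shape near 2; SNR near 1.91 for Rayleigh speckle
```

Departures of such parameters from their speckle values are the kind of signature used to separate the noninfected, critically colonized, and infected models.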
Statistical analysis from recent abundance determinations in HgMn stars
NASA Astrophysics Data System (ADS)
Ghazaryan, S.; Alecian, G.
2016-08-01
To better understand the hot chemically peculiar group of HgMn stars, we have considered a compilation of a large number of recently published spectroscopic data for these stars. We compare these data to the previous compilation by Smith. We confirm the main trends of the abundance peculiarities, namely the increasing overabundances with increasing atomic number of heavy elements, and their large spread from star to star. For all the measured elements, we have looked for correlations between abundances and effective temperature (Teff). In addition to the known correlation for Mn, some other elements are found to show some connection between their abundances and Teff. We have also checked whether multiplicity is a determining parameter for the abundance peculiarities determined for these stars. A statistical analysis using a Kolmogorov-Smirnov test shows that the abundance anomalies in the atmospheres of HgMn stars present no significant dependence on multiplicity.
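The two-sample Kolmogorov-Smirnov comparison used above is a one-call test in SciPy; the abundance samples below are invented placeholders for single stars versus members of multiple systems:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(6)
# Invented abundance anomalies (dex) drawn from the same parent
# distribution for both groups, mimicking the "no dependence" outcome.
single = rng.normal(4.0, 1.0, size=40)
multiple = rng.normal(4.0, 1.0, size=30)

stat, p = ks_2samp(single, multiple)
print(stat, p)   # a large p-value indicates no detectable difference
```

The KS test is attractive here because it is distribution-free and sensitive to any difference between the two cumulative distributions, not just a shift in the mean.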
The Patterns of Teacher Compensation. Statistical Analysis Report.
ERIC Educational Resources Information Center
Chambers, Jay; Bobbitt, Sharon A.
This report presents information regarding the patterns of variation in the salaries paid to public and private school teachers in relation to various personal and job characteristics. Specifically, the analysis examines the relationship between compensation and variables such as public/private schools, gender, race/ethnic background, school level…
Open Access Publishing Trend Analysis: Statistics beyond the Perception
ERIC Educational Resources Information Center
Poltronieri, Elisabetta; Bravo, Elena; Curti, Moreno; Maurizio Ferri,; Mancini, Cristina
2016-01-01
Introduction: The purpose of this analysis was twofold: to track the number of open access journals acquiring impact factor, and to investigate the distribution of subject categories pertaining to these journals. As a case study, journals in which the researchers of the National Institute of Health (Istituto Superiore di Sanità) in Italy have…
Statistical Performance Analysis of Data-Driven Neural Models.
Freestone, Dean R; Layton, Kelvin J; Kuhlmann, Levin; Cook, Mark J
2017-02-01
Data-driven model-based analysis of electrophysiological data is an emerging technique for understanding the mechanisms of seizures. Model-based analysis enables tracking of hidden brain states that are represented by the dynamics of neural mass models. Neural mass models describe the mean firing rates and mean membrane potentials of populations of neurons. Various neural mass models exist with different levels of complexity and realism. An ideal data-driven model-based analysis framework will incorporate the most realistic model possible, enabling accurate imaging of the physiological variables. However, models must be sufficiently parsimonious to enable tracking of important variables using data. This paper provides a tool to inform the realism versus parsimony trade-off: the Bayesian Cramér-Rao (lower) bound (BCRB). We demonstrate how the BCRB can be used to assess the feasibility of using various popular neural mass models to track epilepsy-related dynamics via stochastic filtering methods. A series of simulations shows how optimal state estimates relate to measurement noise, model error, and initial state uncertainty. We also demonstrate that state estimation accuracy will vary between seizure-like and normal rhythms. The performance of the extended Kalman filter (EKF) is assessed against the BCRB. This work lays a foundation for assessing the feasibility of model-based analysis. We discuss how the framework can be used to design experiments to better understand epilepsy.
Granger causality--statistical analysis under a configural perspective.
von Eye, Alexander; Wiedermann, Wolfgang; Mun, Eun-Young
2014-03-01
The concept of Granger causality can be used to examine putative causal relations between two series of scores. Based on regression models, it is asked whether one series can be considered the cause for the second series. In this article, we propose extending the pool of methods available for testing hypotheses that are compatible with Granger causation by adopting a configural perspective. This perspective allows researchers to assume that effects exist for specific categories only or for specific sectors of the data space, but not for other categories or sectors. Configural Frequency Analysis (CFA) is proposed as the method of analysis from a configural perspective. CFA base models are derived for the exploratory analysis of Granger causation. These models are specified so that they parallel the regression models used for variable-oriented analysis of hypotheses of Granger causation. An example from the development of aggression in adolescence is used. The example shows that only one pattern of change in aggressive impulses over time Granger-causes change in physical aggression against peers.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems
NASA Technical Reports Server (NTRS)
He, Yuning; Davies, Misty Dawn
2014-01-01
The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
Modular reweighting software for statistical mechanical analysis of biased equilibrium data
NASA Astrophysics Data System (ADS)
Sindhikara, Daniel J.
2012-07-01
Here a simple, useful, modular approach and software suite designed for statistical reweighting and analysis of equilibrium ensembles is presented. Statistical reweighting is useful, and sometimes necessary, for the analysis of equilibrium enhanced-sampling methods, such as umbrella sampling or replica exchange, and also in experimental cases where biasing factors are explicitly known. Essentially, statistical reweighting allows extrapolation of data from one or more equilibrium ensembles to another. Here, the fundamental separable steps of statistical reweighting are broken up into modules, allowing application to the general case and avoiding the black-box nature of some "all-inclusive" reweighting programs. Additionally, the programs included are, by design, written with few dependencies: the required compilers are either pre-installed on most systems or freely available for download with minimal trouble. Examples of the use of this suite applied to umbrella sampling and replica exchange molecular dynamics simulations are shown, along with advice on how to apply it in the general case.
New version program summary
Program title: Modular reweighting version 2
Catalogue identifier: AEJH_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJH_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU General Public License, version 3
No. of lines in distributed program, including test data, etc.: 179 118
No. of bytes in distributed program, including test data, etc.: 8 518 178
Distribution format: tar.gz
Programming language: C++, Python 2.6+, Perl 5+
Computer: Any
Operating system: Any
RAM: 50-500 MB
Supplementary material: An updated version of the original manuscript (Comput. Phys. Commun. 182 (2011) 2227) is available
Classification: 4.13
Catalogue identifier of previous version: AEJH_v1_0
Journal reference of previous version: Comput. Phys. Commun. 182 (2011) 2227
Does the new
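The separable steps of single-ensemble reweighting can be shown in miniature: compute per-sample weights that divide out a known bias, then form weighted averages in the target ensemble. Direct Gaussian sampling below stands in for an actual umbrella-sampled trajectory, and this sketch is independent of the suite described in the record:

```python
import numpy as np

rng = np.random.default_rng(7)
beta, k = 1.0, 0.5
# Target (unbiased) ensemble: U(x) = x**2/2, so <x**2> = 1 at beta = 1.
# The biased ensemble adds U_bias(x) = k*x**2/2, narrowing the Gaussian.
x = rng.normal(0.0, 1.0 / np.sqrt(beta * (1.0 + k)), size=200_000)

# Module 1: per-sample weights that divide out the known bias.
w = np.exp(beta * 0.5 * k * x ** 2)
# Module 2: weighted average of an observable in the target ensemble.
x2_unbiased = (w * x ** 2).sum() / w.sum()
print(x2_unbiased)   # close to 1, the unbiased <x**2>
```

Multi-ensemble methods (WHAM, MBAR) generalize the same two modules to data pooled from several biased simulations.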
Statistical theory and methodology for remote sensing data analysis
NASA Technical Reports Server (NTRS)
Odell, P. L.
1974-01-01
A model is developed for the evaluation of acreages (proportions) of different crop types over a geographical area using a classification approach, and methods for estimating the crop acreages are given. In estimating the acreage of a specific crop type such as wheat, it is suggested to treat the problem as a two-crop problem, wheat vs. nonwheat, since this simplifies the estimation problem considerably. The error analysis and the sample size problem are investigated for the two-crop approach. Certain numerical results for sample sizes are given for a JSC-ERTS-1 data example on wheat identification performance in Hill County, Montana and Burke County, North Dakota. Lastly, for a large-area crop acreage inventory, a sampling scheme is suggested for acquiring sample data, and the problem of crop acreage estimation and the error analysis are discussed.
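In the two-crop setting, the raw fraction of pixels classified as wheat is biased by classifier errors; a standard correction (a sketch of the idea, with hypothetical accuracy figures, not necessarily the report's exact estimator) inverts the misclassification rates:

```python
def corrected_proportion(q_obs, tpr, fpr):
    """Correct the observed classified-wheat fraction q_obs for known
    classifier accuracies in the two-class (wheat vs nonwheat) setting:
    q_obs = p*tpr + (1 - p)*fpr, solved for the true proportion p."""
    return (q_obs - fpr) / (tpr - fpr)

# Hypothetical accuracies: 90% of wheat pixels and 15% of nonwheat
# pixels get labeled wheat; 40% of all pixels were labeled wheat.
print(corrected_proportion(0.40, 0.90, 0.15))   # true proportion = 1/3
```

The error analysis then propagates the sampling uncertainty of q_obs (and of the estimated accuracies) into the corrected proportion.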
Stalked protozoa identification by image analysis and multivariable statistical techniques.
Amaral, A L; Ginoris, Y P; Nicolau, A; Coelho, M A Z; Ferreira, E C
2008-06-01
Protozoa are considered good indicators of the treatment quality in activated sludge systems as they are sensitive to physical, chemical and operational processes. Therefore, it is possible to correlate the predominance of certain species or groups and several operational parameters of the plant. This work presents a semiautomatic image analysis procedure for the recognition of the stalked protozoa species most frequently found in wastewater treatment plants by determining the geometrical, morphological and signature data and subsequent processing by discriminant analysis and neural network techniques. Geometrical descriptors were found to be responsible for the best identification ability and the identification of the crucial Opercularia and Vorticella microstoma microorganisms provided some degree of confidence to establish their presence in wastewater treatment plants.
Characterization of Nuclear Fuel using Multivariate Statistical Analysis
Robel, M; Robel, M; Robel, M; Kristo, M J; Kristo, M J
2007-11-27
Various combinations of reactor type and fuel composition have been characterized using principal component analysis (PCA) of the concentrations of 9 U and Pu isotopes in the fuel as a function of burnup. The use of PCA allows the reduction of the 9-dimensional data (isotopic concentrations) into a 3-dimensional approximation, giving a visual representation of the changes in nuclear fuel composition with burnup. Real-world variation in the concentrations of {sup 234}U and {sup 236}U in the fresh (unirradiated) fuel was accounted for. The effects of reprocessing were also simulated. The results suggest that, even after reprocessing, Pu isotopes can be used to determine both the type of reactor and the initial fuel composition with good discrimination. Finally, partial least squares discriminant analysis (PLSDA) was investigated as a substitute for PCA. Our results suggest that PLSDA is a better tool for this application, where separation between known classes is most important.
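The 9-to-3 dimensional reduction described above is a direct PCA application; the sketch below uses random stand-in data in place of the actual isotopic concentrations:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(8)
# Stand-in for 9 U/Pu isotope concentrations measured on 60 fuel
# samples; real inputs would come from burnup calculations.
X = rng.normal(size=(60, 9))
X[:, 0] = 0.9 * X[:, 1] + 0.1 * rng.normal(size=60)   # correlated channels

pca = PCA(n_components=3).fit(X)
scores = pca.transform(X)   # the 3-D approximation used for visualization
print(scores.shape, pca.explained_variance_ratio_.sum())
```

Plotting the three score coordinates, colored by reactor type or burnup, gives the kind of visual discrimination the study describes; PLSDA differs in that it uses the class labels when constructing the projection.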
Practical guidance for statistical analysis of operational event data
Atwood, C.L.
1995-10-01
This report presents ways to avoid mistakes that are sometimes made in analysis of operational event data. It then gives guidance on what to do when a model is rejected, a list of standard types of models to consider, and principles for choosing one model over another. For estimating reliability, it gives advice on which failure modes to model, and moment formulas for combinations of failure modes. The issues are illustrated with many examples and case studies.
Multivariate statistical analysis of Raman images of a pharmaceutical tablet.
Lin, Haisheng; Marjanović, Ognjen; Lennox, Barry; Šašić, Slobodan; Clegg, Ian M
2012-03-01
This paper describes the application of principal component analysis (PCA) and independent component analysis (ICA) to identify the reference spectra of a pharmaceutical tablet's constituent compounds from Raman spectroscopic data. The analysis shows, first with a simulated data set and then with data collected from a pharmaceutical tablet, that both PCA and ICA are able to identify most of the features present in the reference spectra of the constituent compounds. However, the results suggest that the ICA method may be more appropriate when attempting to identify unknown reference spectra from a sample. The resulting PCA and ICA models are subsequently used to estimate the relative concentrations of the constituent compounds and to produce spatial distribution images of the analyzed tablet. These images provide a visual representation of the spatial distribution of the constituent compounds throughout the tablet. Images associated with the ICA scores are found to be more informative and not as affected by measurement noise as the PCA based score images. The paper concludes with a discussion of the future work that needs to be undertaken for ICA to gain wider acceptance in the applied spectroscopy community.
Analysis of tensile bond strengths using Weibull statistics.
Burrow, Michael F; Thomas, David; Swain, Mike V; Tyas, Martin J
2004-09-01
Tensile strength tests of restorative resins bonded to dentin, and the resultant strengths of interfaces between the two, exhibit wide variability. Many variables can affect test results, including specimen preparation and storage, test rig design and experimental technique. However, the more fundamental source of variability, that associated with the brittle nature of the materials, has received little attention. This paper analyzes results from micro-tensile tests on unfilled resins and adhesive bonds between restorative resin composite and dentin in terms of reliability using the Weibull probability of failure method. Results for the tensile strengths of Scotchbond Multipurpose Adhesive (3M) and Clearfil LB Bond (Kuraray) bonding resins showed Weibull moduli (m) of 6.17 (95% confidence interval, 5.25-7.19) and 5.01 (95% confidence interval, 4.23-5.8). Analysis of results for micro-tensile tests on bond strengths to dentin gave moduli between 1.81 (Clearfil Liner Bond 2V) and 4.99 (Gluma One Bond, Kulzer). Material systems with m in this range do not have a well-defined strength. The Weibull approach also enables the size dependence of the strength to be estimated. An example where the bonding area was changed from 3.1 to 1.1 mm diameter is shown. Weibull analysis provides a method for determining the reliability of strength measurements in the analysis of data from bond strength and tensile tests on dental restorative materials.
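Estimation of the Weibull modulus m from a batch of strength measurements is commonly done with a linearized fit; the sketch below uses hypothetical values for the modulus and scale, not the measurements reported above:

```python
import numpy as np

rng = np.random.default_rng(1)
m_true, sigma0 = 6.0, 40.0               # assumed modulus and scale (MPa)

# Simulate 200 tensile-strength measurements from a Weibull distribution.
strengths = sigma0 * rng.weibull(m_true, size=200)

# Linearized Weibull fit with a median-rank style probability estimator:
#   ln(-ln(1 - F)) = m*ln(sigma) - m*ln(sigma0)
s = np.sort(strengths)
n = len(s)
F = (np.arange(1, n + 1) - 0.5) / n
y = np.log(-np.log(1.0 - F))
m_est, intercept = np.polyfit(np.log(s), y, 1)
sigma0_est = np.exp(-intercept / m_est)
```

The slope of the fit is the Weibull modulus; a low m, as found for the dentin bonds, means a wide scatter in strength and hence no well-defined single strength value.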
Zhu, Xiaofeng
2016-01-01
Meta-analysis of a single trait across multiple cohorts has been used to increase statistical power in genome-wide association studies (GWASs). Although hundreds of variants have been identified by GWAS, these variants explain only a small fraction of phenotypic variation. Cross-phenotype association analysis (CPASSOC) can further improve statistical power by searching for variants that contribute to multiple traits, which is often relevant to pleiotropy. In this study, we performed CPASSOC analysis on the summary statistics from the Genetic Investigation of ANthropometric Traits (GIANT) consortium using a novel method recently developed by our group. Sex-specific meta-analysis data for height, body mass index (BMI), and waist-to-hip ratio adjusted for BMI (WHRadjBMI) from the discovery phase of the GIANT consortium study were combined using CPASSOC for each trait separately as well as for all 3 traits together. The conventional meta-analysis results from the discovery-phase data of the GIANT consortium studies were used for comparison with those from the CPASSOC analysis. The CPASSOC analysis identified 17 loci associated with anthropometric traits that were missed by conventional meta-analysis. Among these loci, 16 have been reported in the literature in studies including additional samples, and 1 is novel. We also demonstrated that CPASSOC is able to detect pleiotropic effects when analyzing multiple traits. PMID:27701450
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family-wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data-driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
ERIC Educational Resources Information Center
Jones, Andrew T.
2011-01-01
Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…
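For illustration, the point-biserial correlation between a dichotomous item response and the total score can be computed with SciPy; the simulated ability/score model below is an assumption, not data from the study:

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(2)

# Hypothetical exam data: higher-ability examinees score higher overall
# and are more likely to answer the item correctly.
ability = rng.standard_normal(500)
total_score = 50 + 10 * ability + rng.standard_normal(500)
item_correct = (ability + 0.5 * rng.standard_normal(500) > 0).astype(int)

# Point-biserial correlation between the 0/1 item and the total score;
# a high value flags the item as discriminating well between examinees.
r_pb, p_value = pointbiserialr(item_correct, total_score)
```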
A new statistic for the analysis of circular data in gamma-ray astronomy
NASA Technical Reports Server (NTRS)
Protheroe, R. J.
1985-01-01
A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.
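The proposed statistic itself is not reproduced here, but the classical Rayleigh test, the baseline against which such uniformity tests are usually compared, can be sketched as follows:

```python
import numpy as np

def rayleigh_test(phases):
    """Rayleigh test of uniformity for circular data (phases in radians).

    Returns the mean resultant length R-bar and the approximate p-value
    p ~ exp(-n * R-bar**2), valid for moderately large n.
    """
    n = len(phases)
    C, S = np.cos(phases).sum(), np.sin(phases).sum()
    Rbar = np.hypot(C, S) / n
    p = np.exp(-n * Rbar**2)
    return Rbar, p

rng = np.random.default_rng(3)
uniform = rng.uniform(0, 2 * np.pi, 300)              # no preferred phase
clustered = rng.normal(1.0, 0.3, 300) % (2 * np.pi)   # grouped near one phase

Rbar_u, p_u = rayleigh_test(uniform)
Rbar_c, p_c = rayleigh_test(clustered)
```

The Rayleigh test is powerful against broadly sinusoidal alternatives; the abstract's point is precisely that a different statistic is needed when only a small fraction of events is concentrated in a narrow phase range.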
Statistical analysis of 59 inspected SSME HPFTP turbine blades (uncracked and cracked)
NASA Technical Reports Server (NTRS)
Wheeler, John T.
1987-01-01
The numerical results of a statistical analysis of test data for Space Shuttle Main Engine high-pressure fuel turbopump second-stage turbine blades, including some with cracks, are presented. Several statistical methods are applied to the test data to assess differences in frequency variations between the uncracked and cracked blades.
ERIC Educational Resources Information Center
Papadimitriou, Fivos; Kidman, Gillian
2012-01-01
Certain statistical and scientometric features of articles published in the journal "International Research in Geographical and Environmental Education" (IRGEE) are examined in this paper for the period 1992-2009 by applying nonparametric statistics and Shannon's entropy (diversity) formula. The main findings of this analysis are: (a) after 2004,…
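Shannon's entropy (diversity) formula mentioned above can be computed directly from category counts; the counts below are invented for illustration:

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a frequency distribution, e.g. counts of
    articles per topic category; higher values mean greater diversity."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

even = shannon_entropy([10, 10, 10, 10])    # maximally diverse: log2(4) = 2 bits
skewed = shannon_entropy([37, 1, 1, 1])     # dominated by one category
```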
School Library Media Centers: 1993-94. Statistical Analysis Report, August 1998.
ERIC Educational Resources Information Center
Chaney, Bradford; Williams, Jeffrey
This statistical analysis report from the National Center for Education Statistics examines the current state of school libraries in the United States and how they have changed. The primary source of data in this report is the 1993-94 Library Survey, the first federally sponsored survey of library media centers and head librarians in elementary…
Statistical correlation analysis for comparing vibration data from test and analysis
NASA Technical Reports Server (NTRS)
Butler, T. G.; Strang, R. F.; Purves, L. R.; Hershfeld, D. J.
1986-01-01
A theory was developed to compare vibration modes obtained by NASTRAN analysis with those obtained experimentally. Because many more analytical modes can be obtained than experimental modes, the analytical set was treated as a set of expansion functions for putting both sources in comparative form. The theory was developed for three general cases: a nonsymmetric whole model compared with a nonsymmetric whole structural test, a symmetric analytical portion compared with a symmetric experimental portion, and a symmetric analytical portion compared with a whole experimental test. The theory was coded and a statistical correlation program was installed as a utility. The theory is demonstrated on small classical structures.
Statistical analysis of wines using a robust compositional biplot.
Hron, K; Jelínková, M; Filzmoser, P; Kreuziger, R; Bednář, P; Barták, P
2012-02-15
Eight phenolic acids (vanillic, gentisic, protocatechuic, syringic, gallic, coumaric, ferulic and caffeic) were quantitatively determined in 30 commercially available wines from South Moravia by gas chromatography-mass spectrometry. Raw (untransformed) and centered log-ratio transformed data were evaluated by classical and robust version of principal component analysis (PCA). A robust compositional biplot of the centered log-ratio transformed data gives the best resolution of particular categories of wines. Vanillic, syringic and gallic acids were identified as presumed markers occurring in relatively higher concentrations in red wines. Gentisic and caffeic acid were tentatively suggested as prospective technological markers, reflecting presumably some kinds of technological aspects of wine making.
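The centered log-ratio transform followed by classical PCA, the non-robust half of the workflow above, can be sketched in NumPy; the simulated concentration matrix is a placeholder, not the measured wine data:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical compositions: 30 wines x 8 phenolic-acid concentrations.
X = rng.lognormal(mean=0.0, sigma=0.5, size=(30, 8))

# Centered log-ratio (clr) transform: log of each part minus the log of the
# row's geometric mean, mapping compositional data into real space.
logX = np.log(X)
clr = logX - logX.mean(axis=1, keepdims=True)

# Classical PCA of the clr data via SVD of the column-centered matrix;
# scores and loadings together give the compositional biplot coordinates.
Z = clr - clr.mean(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U[:, :2] * s[:2]        # wine coordinates
loadings = Vt[:2].T              # phenolic-acid coordinates
```

The robust version replaces the classical mean/covariance estimates with robust ones (e.g. MCD-based), which is what makes the biplot resistant to outlying wines.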
Statistical Analysis of the Different Factors Affecting the Diarrhea
Zaman, Qamruz; Khan, Imtiaz
2011-01-01
Diarrhea is a worldwide problem facing both developing and developed countries, especially in the pediatric population. Because of the shortage of health facilities and lack of good food in developing countries, it is a known fact that developing countries bear more of this life-threatening problem. The main purpose of this study was to examine the various factors that affect the recovery time of diarrhea. A multiple linear regression was applied to analyze the data and to select a model. The response variable for the study was the recovery time of diarrhea. The results of the analysis show that zinc is the main factor affecting recovery time in Peshawar. PMID:23408274
Statistical methods for the forensic analysis of striated tool marks
Hoeksema, Amy Beth
2013-01-01
In forensics, fingerprints can be used to uniquely identify suspects in a crime. Similarly, a tool mark left at a crime scene can be used to identify the tool that was used. However, the current practice of identifying matching tool marks involves visual inspection of marks by forensic experts which can be a very subjective process. As a result, declared matches are often successfully challenged in court, so law enforcement agencies are particularly interested in encouraging research in more objective approaches. Our analysis is based on comparisons of profilometry data, essentially depth contours of a tool mark surface taken along a linear path. In current practice, for stronger support of a match or non-match, multiple marks are made in the lab under the same conditions by the suspect tool. We propose the use of a likelihood ratio test to analyze the difference between a sample of comparisons of lab tool marks to a field tool mark, against a sample of comparisons of two lab tool marks. Chumbley et al. (2010) point out that the angle of incidence between the tool and the marked surface can have a substantial impact on the tool mark and on the effectiveness of both manual and algorithmic matching procedures. To better address this problem, we describe how the analysis can be enhanced to model the effect of tool angle and allow for angle estimation for a tool mark left at a crime scene. With sufficient development, such methods may lead to more defensible forensic analyses.
Statistical Analysis of Shear Wave Speed in the Uterine Cervix
Carlson, Lindsey C.; Feltovich, Helen; Palmeri, Mark L.; del Rio, Alejandro Muñoz; Hall, Timothy J.
2014-01-01
Although cervical softening is critical in pregnancy, there currently is no objective method for assessing the softness of the cervix. Shear wave speed (SWS) estimation is a noninvasive tool used to measure tissue mechanical properties such as stiffness. The goal of this study was to determine the spatial variability of SWS and to assess its ability to classify ripened vs. unripened tissue samples. Ex vivo human hysterectomy samples (n = 22) were collected, of which a subset (n = 13) was ripened. SWS estimates were made at 4–5 locations along the length of the canal on both anterior and posterior halves. A linear mixed model was used for a robust multivariate analysis. Receiver operating characteristic (ROC) analysis and the area under the ROC curve (AUC) were used to describe the utility of SWS for classifying ripened vs. unripened tissue samples. All variables used in the linear mixed model were significant (p < 0.05). Estimates at the mid location were 3.45 ± 0.95 m/s (anterior) and 3.56 ± 0.92 m/s (posterior) for the unripened group, and 2.11 ± 0.45 m/s (anterior) and 2.68 ± 0.57 m/s (posterior) for the ripened group (p < 0.001). The AUCs were 0.91 (anterior) and 0.84 (posterior), suggesting that SWS estimates may be useful for quantifying cervical softening. PMID:25392863
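The ROC/AUC classification step can be illustrated by simulating two SWS groups loosely matched to the reported anterior means and SDs; the sample sizes and normality are assumptions:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)

# Hypothetical shear wave speed (m/s) samples, loosely modeled on the
# reported anterior values: unripened ~3.45 +/- 0.95, ripened ~2.11 +/- 0.45.
unripened = rng.normal(3.45, 0.95, 60)
ripened = rng.normal(2.11, 0.45, 60)

sws = np.concatenate([unripened, ripened])
label = np.concatenate([np.ones(60), np.zeros(60)])   # 1 = unripened

# AUC: probability that a randomly chosen unripened sample has a higher
# SWS than a randomly chosen ripened one.
auc = roc_auc_score(label, sws)
```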
Analysis of compressive fracture in rock using statistical techniques
Blair, S.C.
1994-12-01
Fracture of rock in compression is analyzed using a field-theory model, and the processes of crack coalescence and fracture formation and the effect of grain-scale heterogeneities on macroscopic behavior of rock are studied. The model is based on observations of fracture in laboratory compression tests, and incorporates assumptions developed using fracture mechanics analysis of rock fracture. The model represents grains as discrete sites, and uses superposition of continuum and crack-interaction stresses to create cracks at these sites. The sites are also used to introduce local heterogeneity. Clusters of cracked sites can be analyzed using percolation theory. Stress-strain curves for simulated uniaxial tests were analyzed by studying the location of cracked sites, and partitioning of strain energy for selected intervals. Results show that the model implicitly predicts both development of shear-type fracture surfaces and a strength-vs-size relation that are similar to those observed for real rocks. Results of a parameter-sensitivity analysis indicate that heterogeneity in the local stresses, attributed to the shape and loading of individual grains, has a first-order effect on strength, and that increasing local stress heterogeneity lowers compressive strength following an inverse power law. Peak strength decreased with increasing lattice size and decreasing mean site strength, and was independent of site-strength distribution. A model for rock fracture based on a nearest-neighbor algorithm for stress redistribution is also presented and used to simulate laboratory compression tests, with promising results.
An experimental statistical analysis of stress projection factors in BCC tantalum
Carroll, J. D.; Clark, B. G.; Buchheit, T. E.; Boyce, B. L.; Weinberger, C. R.
2013-10-01
Crystallographic slip planes in body centered cubic (BCC) metals are not fully understood. In polycrystals, there are additional confounding effects from grain interactions. This paper describes an experimental investigation into the effects of grain orientation and neighbors on elastic–plastic strain accumulation. In situ strain fields were obtained by performing digital image correlation (DIC) on images from a scanning electron microscope (SEM) and from optical microscopy. These strain fields were statistically compared to the grain structure measured by electron backscatter diffraction (EBSD). Spearman rank correlations were performed between effective strain and six microstructural factors including four Schmid factors associated with the <111> slip direction, grain size, and Taylor factor. Modest correlations (~10%) were found for a polycrystal tension specimen. The influence of grain neighbors was first investigated by re-correlating the polycrystal data using clusters of similarly-oriented grains identified by low grain boundary misorientation angles. Second, the experiment was repeated on a tantalum oligocrystal, with through-thickness grains. Much larger correlation coefficients were found in this multicrystal due to the dearth of grain neighbors and subsurface microstructure. Finally, a slip trace analysis indicated (in agreement with statistical correlations) that macroscopic slip often occurs on {110}<111> slip systems and sometimes by pencil glide on maximum resolved shear stress planes (MRSSP). These results suggest that Schmid factors are suitable for room temperature, quasistatic, tensile deformation in tantalum as long as grain neighbor effects are accounted for.
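A Spearman rank correlation between one microstructural factor and effective strain, the core statistical step above, might look as follows; the simulated Schmid-factor/strain relationship is invented, not the experimental data:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(6)

# Hypothetical per-grain data: a Schmid factor and a measured effective
# strain with a weak monotonic relationship plus scatter.
schmid = rng.uniform(0.3, 0.5, 400)
strain = 0.02 * schmid + 0.004 * rng.standard_normal(400)

# Rank correlation is appropriate here because it does not assume a
# linear relationship, only a monotonic trend.
rho, p_value = spearmanr(schmid, strain)
```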
Statistical analysis of the ambiguities in the asteroid period determinations
NASA Astrophysics Data System (ADS)
Butkiewicz, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.
2014-07-01
A synodic period of an asteroid can be derived from its lightcurve by standard methods like Fourier-series fitting. A problem appears when results of observations are based on less than a full coverage of a lightcurve and/or high level of noise. Also, long gaps between individual lightcurves create an ambiguity in the cycle count which leads to aliases. Excluding binary systems and objects with non-principal-axis rotation, the rotation period is usually identical to the period of the second Fourier harmonic of the lightcurve. There are cases, however, where it may be connected with the 1st, 3rd, or 4th harmonic and it is difficult to choose among them when searching for the period. To help remove such uncertainties we analysed asteroid lightcurves for a range of shapes and observing/illuminating geometries. We simulated them using a modified internal code from the ISAM service (Marciniak et al. 2012, A&A 545, A131). In our computations, shapes of asteroids were modeled as Gaussian random spheres (Muinonen 1998, A&A, 332, 1087). A combination of Lommel-Seeliger and Lambert scattering laws was assumed. For each of the 100 shapes, we randomly selected 1000 positions of the spin axis, systematically changing the solar phase angle with a step of 5°. For each lightcurve, we determined its peak-to-peak amplitude, fitted the 6th-order Fourier series and derived the amplitudes of its harmonics. Instead of the number of the lightcurve extrema, which in many cases is subjective, we characterized each lightcurve by the order of the highest-amplitude Fourier harmonic. The goal of our simulations was to derive statistically significant conclusions (based on the underlying assumptions) about the dominance of different harmonics in the lightcurves of the specified amplitude and phase angle. The results, presented in the Figure, can be used in individual cases to estimate the probability that the obtained lightcurve is dominated by a specified Fourier harmonic. Some of the
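Fitting a Fourier series to a lightcurve and ranking the harmonics by amplitude, as described above, can be sketched with least squares; the synthetic double-peaked lightcurve below is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical lightcurve: a typical double-peaked shape, dominated by the
# 2nd harmonic of the rotation phase, with some 1st-harmonic asymmetry.
phase = np.sort(rng.uniform(0.0, 1.0, 120))
mag = (0.05 * np.cos(2 * np.pi * phase)
       + 0.20 * np.cos(4 * np.pi * phase)
       + 0.01 * rng.standard_normal(120))

# Least-squares fit of a 6th-order Fourier series; per-harmonic amplitudes
# identify the highest-amplitude harmonic used to characterize the curve.
order = 6
cols = [np.ones_like(phase)]
for k in range(1, order + 1):
    cols += [np.cos(2 * np.pi * k * phase), np.sin(2 * np.pi * k * phase)]
G = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(G, mag, rcond=None)
amps = np.hypot(coef[1::2], coef[2::2])   # amplitudes of harmonics 1..6
dominant = int(np.argmax(amps)) + 1
```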
Statistical analysis of aerosol species, trace gasses, and meteorology in Chicago.
Binaku, Katrina; O'Brien, Timothy; Schmeling, Martina; Fosco, Tinamarie
2013-09-01
Both canonical correlation analysis (CCA) and principal component analysis (PCA) were applied to atmospheric aerosol and trace gas concentrations and meteorological data collected in Chicago during the summer months of 2002, 2003, and 2004. Concentrations of ammonium, calcium, nitrate, sulfate, and oxalate particulate matter, as well as the meteorological parameters temperature, wind speed, wind direction, and humidity, were subjected to CCA and PCA. Ozone and nitrogen oxide mixing ratios were also included in the data set. The purpose of the statistical analysis was to determine the extent of existing linear relationship(s), or lack thereof, between meteorological parameters and pollutant concentrations, in addition to reducing the dimensionality of the original data to determine sources of pollutants. In CCA, the first three canonical variate pairs derived were statistically significant at the 0.05 level. The canonical correlation of the first canonical variate pair was 0.821, while the correlations of the second and third canonical variate pairs were 0.562 and 0.461, respectively. The first canonical variate pair indicated that increasing temperatures resulted in high ozone mixing ratios, while the second canonical variate pair showed the influence of wind speed and humidity on local ammonium concentrations. No new information was uncovered in the third variate pair. Canonical loadings were also interpreted for information regarding relationships between data sets. Four principal components (PCs), expressing 77.0 % of the original data variance, were derived in PCA. Interpretation of the PCs suggested significant production and/or transport of secondary aerosols in the region (PC1). Furthermore, photochemical production of ozone and the influence of wind speed on pollutants were expressed (PC2), along with an overall measure of local meteorology (PC3). In summary, the combined CCA and PCA results were successful in uncovering linear relationships between meteorology and air pollutants in Chicago and
GIS application on spatial landslide analysis using statistical based models
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet; Lee, Saro; Buchroithner, Manfred F.
2009-09-01
This paper presents the assessment results of three spatially based probabilistic models using Geoinformation Techniques (GIT) for landslide susceptibility analysis at Penang Island, Malaysia. Landslide locations within the study area were identified by interpreting aerial photographs and satellite images, supported by field surveys. Maps of the topography, soil type, lineaments and land cover were constructed from the spatial data sets. Ten landslide-related factors were extracted from the spatial database, and the frequency ratio, fuzzy logic, and bivariate logistic regression coefficients of each factor were computed. Finally, landslide susceptibility maps were drawn for the study area using the frequency ratio, fuzzy logic and bivariate logistic regression models. For verification, the results of the analyses were compared with actual landslide locations in the study area. The verification results show that the bivariate logistic regression model provides slightly higher prediction accuracy than the frequency ratio and fuzzy logic models.
Ordinary chondrites - Multivariate statistical analysis of trace element contents
NASA Technical Reports Server (NTRS)
Lipschutz, Michael E.; Samuels, Stephen M.
1991-01-01
The contents of mobile trace elements (Co, Au, Sb, Ga, Se, Rb, Cs, Te, Bi, Ag, In, Tl, Zn, and Cd) in Antarctic and non-Antarctic populations of H4-6 and L4-6 chondrites were compared using standard multivariate discriminant functions borrowed from linear discriminant analysis and logistic regression. A nonstandard randomization-simulation method was developed, making it possible to carry out probability assignments on a distribution-free basis. Compositional differences were found both between the Antarctic and non-Antarctic H4-6 chondrite populations and between two L4-6 chondrite populations. It is shown that, for various types of meteorites (in particular, for the H4-6 chondrites), the Antarctic/non-Antarctic compositional difference is due to preterrestrial differences in the genesis of their parent materials.
Spectral reflectance of surface soils - A statistical analysis
NASA Technical Reports Server (NTRS)
Crouse, K. R.; Henninger, D. L.; Thompson, D. R.
1983-01-01
The relationship of the physical and chemical properties of soils to their spectral reflectance as measured at six wavebands of Thematic Mapper (TM) aboard NASA's Landsat-4 satellite was examined. The results of performing regressions of over 20 soil properties on the six TM bands indicated that organic matter, water, clay, cation exchange capacity, and calcium were the properties most readily predicted from TM data. The middle infrared bands, bands 5 and 7, were the best bands for predicting soil properties, and the near infrared band, band 4, was nearly as good. Clustering 234 soil samples on the TM bands and characterizing the clusters on the basis of soil properties revealed several clear relationships between properties and reflectance. Discriminant analysis found organic matter, fine sand, base saturation, sand, extractable acidity, and water to be significant in discriminating among clusters.
Machine processing for remotely acquired data. [using multivariate statistical analysis
NASA Technical Reports Server (NTRS)
Landgrebe, D. A.
1974-01-01
This paper is a general discussion of earth resources information systems which utilize airborne and spaceborne sensors. It points out that information may be derived by sensing and analyzing the spectral, spatial and temporal variations of electromagnetic fields emanating from the earth surface. After giving an overview of system organization, the two broad categories of system types are discussed. These are systems in which high quality imagery is essential and those more numerically oriented. Sensors are also discussed with this categorization of systems in mind. The multispectral approach and pattern recognition are described as an example data analysis procedure for numerically-oriented systems. The steps necessary in using a pattern recognition scheme are described and illustrated with data obtained from aircraft and the Earth Resources Technology Satellite (ERTS-1).
Statistical Analysis of Acoustic Wave Parameters Near Solar Active Regions
NASA Astrophysics Data System (ADS)
Rabello-Soares, M. Cristina; Bogart, Richard S.; Scherrer, Philip H.
2016-08-01
In order to quantify the influence of magnetic fields on acoustic mode parameters and flows in and around active regions, we analyze the differences in the parameters in magnetically quiet regions nearby an active region (which we call “nearby regions”), compared with those of quiet regions at the same disk locations for which there are no neighboring active regions. We also compare the mode parameters in active regions with those in comparably located quiet regions. Our analysis is based on ring-diagram analysis of all active regions observed by the Helioseismic and Magnetic Imager (HMI) during almost five years. We find that the frequency at which the mode amplitude changes from attenuation to amplification in the quiet nearby regions is around 4.2 mHz, in contrast to the active regions, for which it is about 5.1 mHz. This amplitude enhancement (the “acoustic halo effect”) is as large as that observed in the active regions, and has a very weak dependence on the wave propagation direction. The mode energy difference in nearby regions also changes from a deficit to an excess at around 4.2 mHz, but averages to zero over all modes. The frequency difference in nearby regions increases with increasing frequency until a point at which the frequency shifts turn over sharply, as in active regions. However, this turnover occurs around 4.9 mHz, which is significantly below the acoustic cutoff frequency. Inverting the horizontal flow parameters in the direction of the neighboring active regions, we find flows that are consistent with a model of the thermal energy flow being blocked directly below the active region.
Introducing Statistics to Geography Students: The Case for Exploratory Data Analysis.
ERIC Educational Resources Information Center
Burn, Christopher R.; Fox, Michael F.
1986-01-01
Exploratory data analysis (EDA) gives students a feel for the data being considered. Four applications of EDA are discussed: the use of displays, resistant statistics, transformations, and smoothing. (RM)
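The "resistant statistics" idea can be shown in a few lines: the median and interquartile range barely move when one wild value enters a batch, while the mean shifts substantially. The rainfall numbers here are invented for illustration:

```python
import numpy as np

# A small batch of observations, then the same batch with one wild value
# (e.g. a transcription error) appended.
rainfall = np.array([12.0, 15.0, 9.0, 14.0, 11.0, 13.0, 10.0, 16.0])
with_outlier = np.append(rainfall, 120.0)

# Non-resistant vs. resistant location summaries.
mean_shift = abs(with_outlier.mean() - rainfall.mean())
median_shift = abs(np.median(with_outlier) - np.median(rainfall))

# A resistant spread summary: the interquartile range.
iqr = np.subtract(*np.percentile(with_outlier, [75, 25]))
```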
Multiple outcomes are often measured on each experimental unit in toxicology experiments. These multiple observations typically imply the existence of correlation between endpoints, and a statistical analysis that incorporates it may result in improved inference. When both disc...
Orthogonal separations: Comparison of orthogonality metrics by statistical analysis.
Schure, Mark R; Davis, Joe M
2015-10-02
Twenty orthogonality metrics (OMs) derived from convex hull, information theory, fractal dimension, correlation coefficients, nearest neighbor distances and bin-density techniques were calculated from a diverse group of 47 experimental two-dimensional (2D) chromatograms. These chromatograms comprise two datasets; one dataset is a collection of 2D chromatograms from Peter Carr's laboratory at the University of Minnesota, and the other dataset is based on pairs of one-dimensional chromatograms previously published by Martin Gilar and coworkers (Waters Corp.). The chromatograms were pooled to make a third or combined dataset. Cross-correlation results suggest that specific OMs are correlated within families of nearest neighbor methods, correlation coefficients and the information theory methods. Principal component analysis of the OMs shows that no OM stands out as clearly better at explaining the data variance than any other OM. Principal component analysis of individual chromatograms shows that different OMs favor certain chromatograms. The chromatograms exhibit a range of quality, as subjectively graded by nine experts experienced in 2D chromatography. The subjective (grading) evaluations were taken at two intervals per expert and demonstrated excellent consistency for each expert. Excellent agreement for both very good and very bad chromatograms was seen across the range of experts. However, evaluation uncertainty increased for chromatograms that were judged as average to mediocre. The grades were converted to numbers (percentages) for numerical computations. The percentages were correlated with OMs to establish good OMs for evaluating the quality of 2D chromatograms. Certain metrics correlate better than others. However, these results are not consistent across all chromatograms examined. Most of the nearest neighbor methods were observed to correlate poorly with the percentages. However, one method, devised by Clark and Evans, appeared to work
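One of the nearest-neighbor metrics mentioned, the Clark-Evans index, can be sketched as follows; the grid and cluster point patterns are toy stand-ins for 2D chromatogram peak positions:

```python
import numpy as np
from scipy.spatial import cKDTree

def clark_evans(points, area):
    """Clark-Evans nearest-neighbor index R: observed mean nearest-neighbor
    distance divided by the value expected under complete spatial randomness.
    R ~ 1 for random patterns, R > 1 for dispersed, R < 1 for clustered."""
    pts = np.asarray(points)
    d, _ = cKDTree(pts).query(pts, k=2)   # k=2: nearest neighbor besides self
    observed = d[:, 1].mean()
    expected = 0.5 / np.sqrt(len(pts) / area)
    return observed / expected

# Peaks on a regular grid: good, dispersed use of the separation space.
g = np.linspace(0.05, 0.95, 10)
grid = np.array([(x, y) for x in g for y in g])

# Clustered peaks: poor use of the separation space.
rng = np.random.default_rng(9)
clustered = 0.5 + 0.05 * rng.standard_normal((100, 2))

R_grid = clark_evans(grid, area=1.0)
R_clustered = clark_evans(clustered, area=1.0)
```

Note this sketch ignores edge corrections, which matter for small peak counts.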
Wheat signature modeling and analysis for improved training statistics
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Malila, W. A.; Cicone, R. C.; Gleason, J. M.
1976-01-01
The author has identified the following significant results. The spectral, spatial, and temporal characteristics of wheat and other signatures in LANDSAT multispectral scanner data were examined through empirical analysis and simulation. Irrigation patterns varied widely within Kansas; 88 percent of wheat acreage in Finney County was irrigated and 24 percent in Morton County, as opposed to less than 3 percent for the western two-thirds of the state. The irrigation practice was definitely correlated with the observed spectral response; wheat variety differences produced observable spectral differences due to leaf coloration and different dates of maturation. Between-field differences were generally greater than within-field differences, and boundary pixels produced spectral features distinct from those within field centers. Multiclass boundary pixels contributed much of the observed bias in proportion estimates. The variability between signatures obtained by different draws of training data decreased as the sample size became larger; also, the resulting signatures became more robust and the particular decision threshold value became less important.
A Statistical Analysis of the Determinants of Naval Flight Officer Training Attrition
1998-03-01
variables utilized in the model include commissioning source, race, and undergraduate major. The statistical analysis sought to determine the effect of each of these demographic factors on the probability of attrition by reason. The results show that commissioning source has a significant effect on...
Analysis/forecast experiments with a multivariate statistical analysis scheme using FGGE data
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1985-01-01
A three-dimensional, multivariate statistical analysis method, optimal interpolation (OI), is described for modeling meteorological data from widely dispersed sites. The model was developed to analyze FGGE data at the NASA Goddard Laboratory of Atmospherics. The model features a multivariate surface analysis over the oceans, including maintenance of the Ekman balance and a geographically dependent correlation function. Preliminary comparisons are made between the OI model and similar schemes employed at the European Centre for Medium-Range Weather Forecasts and the National Meteorological Center. The OI scheme is used to provide input to a GCM, and model error correlations are calculated for forecasts of 500 mb vertical water mixing ratios and wind profiles. Comparisons are made between the predictions and measured data. The model is shown to be as accurate as a successive corrections model out to 4.5 days.
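The core OI update (analysis = background plus gain-weighted innovation) reduces, in the scalar case, to a few lines. This is a generic textbook sketch, not the NASA Goddard implementation, and the variances are illustrative.

```python
def oi_scalar(background, obs, b_var, r_var):
    """Scalar optimal-interpolation (OI) update: the analysis equals the
    background plus a gain-weighted innovation, where the gain depends on
    the background and observation error variances."""
    gain = b_var / (b_var + r_var)          # optimal weight in [0, 1]
    return background + gain * (obs - background)

# Equal error variances weight background and observation equally:
print(oi_scalar(10.0, 12.0, 1.0, 1.0))      # 11.0
```

With zero background error variance the gain vanishes and the observation is ignored, which is the expected limiting behaviour.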
Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets
Sandstrom, Mary M.; Brown, Geoffrey W.; Preston, Daniel N.; Pollard, Colin J.; Warner, Kirstin F.; Sorensen, Daniel N.; Remmers, Daniel L.; Phillips, Jason J.; Shelley, Timothy J.; Reyes, Jose A.; Hsu, Peter C.; Reynolds, John G.
2015-10-30
The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small-Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes when assessing the potential variability that arises when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment, as well as the use of different instruments of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.
NASA Astrophysics Data System (ADS)
Zhang, Jun; Guo, Fan
2015-11-01
Tooth modification is widely used in the gear industry to improve the meshing performance of gearings. However, few of the present studies on tooth modification consider the influence of inevitable random errors on modification effects. In order to investigate the effect of tooth modification amount variations on the dynamic behavior of a helical planetary gear system, an analytical dynamic model including tooth modification parameters is proposed to carry out a deterministic analysis of the dynamics of a helical planetary gear. The dynamic meshing forces as well as the dynamic transmission errors of the sun-planet 1 gear pair with and without tooth modifications are computed and compared to show the effectiveness of tooth modifications in enhancing gear dynamics. Using the response surface method, a fitted regression model for the dynamic transmission error (DTE) fluctuations is established to quantify the relationship between modification amounts and DTE fluctuations. By shifting the inevitable random errors arising from the manufacturing and installation processes onto tooth modification amount variations, a statistical tooth modification model is developed, and a methodology combining Monte Carlo simulation with the response surface method is presented for uncertainty analysis of tooth modifications. The uncertainty analysis reveals that the system's dynamic behaviors do not obey the normal distribution even though the design variables are normally distributed. In addition, a deterministic modification amount will not necessarily achieve an optimal result for both static and dynamic transmission error fluctuation reduction simultaneously.
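The Monte Carlo / response-surface combination described above can be sketched as follows. The quadratic surface coefficients and the input distributions are invented for illustration; they are not values from the study.

```python
import random
import statistics

# Hypothetical fitted response surface: DTE fluctuation as a quadratic
# function of two tooth-modification amounts (coefficients are invented
# for illustration, not taken from the study).
def dte_fluctuation(x1, x2):
    return (5.0 - 0.8 * x1 - 0.5 * x2
            + 0.12 * x1 ** 2 + 0.09 * x2 ** 2 + 0.05 * x1 * x2)

random.seed(0)
# Shift manufacturing/installation errors onto the modification amounts:
# sample them as normal variables and propagate through the surface.
samples = [dte_fluctuation(random.gauss(3.0, 0.5), random.gauss(2.0, 0.5))
           for _ in range(10_000)]

mean = statistics.mean(samples)
sd = statistics.pstdev(samples)
# A quadratic surface maps normal inputs to a skewed output, which is
# why the response need not follow a normal distribution.
skewness = statistics.mean(((s - mean) / sd) ** 3 for s in samples)
```

Because the nominal point sits near the minimum of this convex surface, the sampled output distribution is right-skewed, illustrating the paper's observation that normally distributed design variables need not give normally distributed responses.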
DHLAS: A web-based information system for statistical genetic analysis of HLA population data.
Thriskos, P; Zintzaras, E; Germenis, A
2007-03-01
DHLAS (database HLA system) is a user-friendly, web-based information system for the analysis of human leukocyte antigen (HLA) data from population studies. DHLAS has been developed using Java and the R system; it runs on a Java Virtual Machine, and its web-based user interface is powered by the servlet engine Tomcat. It utilizes STRUTS, a Model-View-Controller framework, and uses several GNU packages to perform several of its tasks. The database engine it relies upon for fast access is MySQL, but others can be used as well. The system estimates metrics, performs statistical testing, and produces graphs required for HLA population studies: (i) Hardy-Weinberg equilibrium (calculated using both asymptotic and exact tests), (ii) genetic distances (Euclidean or Nei), (iii) phylogenetic trees using the unweighted pair group method with averages and the neighbor-joining method, (iv) linkage disequilibrium (pairwise and overall, including variance estimates), (v) haplotype frequencies (estimated using the expectation-maximization algorithm), and (vi) discriminant analysis. The main merit of DHLAS is the incorporation of a database, so the data can be stored and manipulated along with integrated genetic data analysis procedures. In addition, it has an open architecture allowing the inclusion of other functions and procedures.
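As a sketch of the first of these tasks, an asymptotic (chi-square) Hardy-Weinberg test for a single biallelic locus can be written from genotype counts alone; this is a generic illustration in Python, not DHLAS's Java/R code.

```python
import math

def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Asymptotic (chi-square, 1 df) Hardy-Weinberg equilibrium test
    for a biallelic locus, from observed genotype counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)            # frequency of allele A
    q = 1 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Chi-square survival function with 1 df via the complementary
    # error function: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts exactly at equilibrium give chi2 = 0 and p = 1:
print(hwe_chi_square(25, 50, 25))              # (0.0, 1.0)
```

The exact test DHLAS also offers would enumerate genotype tables conditional on allele counts; the asymptotic version above is adequate for large samples.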
Processing and statistical analysis of soil-root images
NASA Astrophysics Data System (ADS)
Razavi, Bahar S.; Hoang, Duyen; Kuzyakov, Yakov
2016-04-01
The importance of hotspots such as the rhizosphere, the small soil volume that surrounds and is influenced by plant roots, calls for spatially explicit methods to visualize the distribution of microbial activities in this active site (Kuzyakov and Blagodatskaya, 2015). The zymography technique has previously been adapted to visualize the spatial dynamics of enzyme activities in the rhizosphere (Spohn and Kuzyakov, 2014). After further developing soil zymography to obtain a higher resolution of enzyme activities, we aimed to 1) quantify the images and 2) determine whether the pattern (e.g. the distribution of hotspots in space) is clumped (aggregated) or regular (dispersed). To this end, we incubated soil-filled rhizoboxes with maize (Zea mays L.) and without maize (control box) for two weeks. In situ soil zymography was applied to visualize the enzymatic activity of β-glucosidase and phosphatase at the soil-root interface. The spatial resolution of the fluorescent images was improved by direct application of a substrate-saturated membrane to the soil-root system. Furthermore, we applied spatial point pattern analysis to classify the hotspot patterns. Our results demonstrated that the distribution of hotspots in the rhizosphere is clumped (aggregated), whereas the control box without plants showed a regular (dispersed) pattern. These patterns were similar in all three replicates and for both enzymes. We conclude that improved zymography is a promising in situ technique to identify, analyze, visualize, and quantify the spatial distribution of enzyme activities in the rhizosphere. Moreover, such differences in pattern should be considered in assessments and modeling of rhizosphere extension and the corresponding effects on soil properties and functions. Key words: rhizosphere, spatial point pattern, enzyme activity, zymography, maize.
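One standard way to quantify the clumped-versus-dispersed distinction in spatial point pattern analysis is the Clark-Evans nearest-neighbour index; the sketch below is a generic illustration (edge corrections omitted), not the authors' pipeline.

```python
import math

def clark_evans_index(points, area):
    """Clark-Evans aggregation index R for a 2-D point pattern:
    R < 1 suggests clumping, R ~ 1 complete spatial randomness,
    R > 1 a regular (dispersed) pattern. Edge corrections omitted."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        d = min(math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points) if j != i)
        nearest.append(d)
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)   # mean NN distance under CSR
    return observed / expected

# A regular grid is dispersed (R > 1); a tight cluster is clumped (R < 1):
grid = [(i / 3, j / 3) for i in range(4) for j in range(4)]
print(clark_evans_index(grid, area=1.0))   # ~2.67
```

A hotspot map thresholded into point coordinates could be fed to such an index to make the clumped/regular classification reproducible.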
NASA Astrophysics Data System (ADS)
Shirvani-Mahdavi, Hamidreza; Shafiee, Parisa
2016-12-01
Matrix mismatching in the quantitative analysis of materials through calibration-based laser-induced breakdown spectroscopy (LIBS) is a serious problem. In this paper, to overcome matrix mismatching, two distinct approaches, named addition standardization (AS) and addition-internal combinatorial standardization (A-ICS), are demonstrated for LIBS experiments. Furthermore, in order to examine the efficiency of these methods, the concentration of calcium in ordinary garden soil without any fertilizer is measured individually by each of the two procedures. To this end, ten standard samples with different concentrations of calcium (as the analyte) and copper (as the internal standard) were prepared in the form of cylindrical tablets, with the soil playing the role of the matrix in all of them. The measurements indicate that the relative error of concentration, compared with a certified value obtained by inductively coupled plasma optical emission spectroscopy, is 3.97% and 2.23% for the AS and A-ICS methods, respectively. Furthermore, calculations of the standard deviation indicate that the A-ICS method may be more accurate than the AS one.
Signal analysis applications of nonlinear dynamics and higher-order statistics
NASA Astrophysics Data System (ADS)
Solinsky, James C.; Feeney, John J.
1994-03-01
The use of higher-order statistics (HOS) in acoustic and financial signal analysis applications is outlined in theory and followed with specific data examples. HOS analysis is used to identify data regions of interest, and nonlinear dynamics (ND) analysis is used in a 4D embedded space to show structural density changes resulting from the HOS regions. A second-order statistical comparison is made with the same data processed to have random Fourier phase, since the HOS information is contained in this nonrandom phase. These empirical results indicate that HOS data regions are structural distortions of a second-order planar disk in the 4D ND analysis space.
NASA Technical Reports Server (NTRS)
Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.
2011-01-01
Comet Enflow is a commercially available, high-frequency vibroacoustic analysis software package based on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). EFEA was validated on a floor-equipped composite cylinder by comparing its vibroacoustic response predictions with Statistical Energy Analysis (SEA) predictions and experimental results. The SEA predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.
Martin, David; Boyle, Fergal
2015-09-01
Several clinical studies have identified a strong correlation between neointimal hyperplasia following coronary stent deployment and both stent-induced arterial injury and altered vessel hemodynamics. As such, the sequential structural and fluid dynamics analysis of balloon-expandable stent deployment should provide a comprehensive indication of stent performance. Despite this observation, very few numerical studies of balloon-expandable coronary stents have considered both the mechanical and hemodynamic impact of stent deployment. Furthermore, in the few studies that have considered both phenomena, only a small number of stents have been considered. In this study, a sequential structural and fluid dynamics analysis methodology was employed to compare both the mechanical and hemodynamic impact of six balloon-expandable coronary stents. To investigate the relationship between stent design and performance, several common stent design properties were then identified and the dependence between these properties and both the mechanical and hemodynamic variables of interest was evaluated using statistical measures of correlation. Following the completion of the numerical analyses, stent strut thickness was identified as the only common design property that demonstrated a strong dependence with either the mean equivalent stress predicted in the artery wall or the mean relative residence time predicted on the luminal surface of the artery. These results corroborate the findings of the large-scale ISAR-STEREO clinical studies and highlight the crucial role of strut thickness in coronary stent design. The sequential structural and fluid dynamics analysis methodology and the multivariable statistical treatment of the results described in this study should prove useful in the design of future balloon-expandable coronary stents.
Methods of learning in statistical education: Design and analysis of a randomized trial
NASA Astrophysics Data System (ADS)
Boyd, Felicity Turner
Background. Recent psychological and technological advances suggest that active learning may enhance understanding and retention of statistical principles. A randomized trial was designed to evaluate the addition of innovative instructional methods within didactic biostatistics courses for public health professionals. Aims. The primary objectives were to evaluate and compare the addition of two active learning methods (cooperative and internet) on students' performance; assess their impact on performance after adjusting for differences in students' learning style; and examine the influence of learning style on trial participation. Methods. Consenting students enrolled in a graduate introductory biostatistics course were randomized to cooperative learning, internet learning, or control after completing a pretest survey. The cooperative learning group participated in eight small group active learning sessions on key statistical concepts, while the internet learning group accessed interactive mini-applications on the same concepts. Controls received no intervention. Students completed evaluations after each session and a post-test survey. Study outcome was performance quantified by examination scores. Intervention effects were analyzed by generalized linear models using intent-to-treat analysis and marginal structural models accounting for reported participation. Results. Of 376 enrolled students, 265 (70%) consented to randomization; 69, 100, and 96 students were randomized to the cooperative, internet, and control groups, respectively. Intent-to-treat analysis showed no differences between study groups; however, 51% of students in the intervention groups had dropped out after the second session. After accounting for reported participation, expected examination scores were 2.6 points higher (of 100 points) after completing one cooperative learning session (95% CI: 0.3, 4.9) and 2.4 points higher after one internet learning session (95% CI: 0.0, 4.7), versus
Statistical Analysis of Large Simulated Yield Datasets for Studying Climate Effects
NASA Technical Reports Server (NTRS)
Makowski, David; Asseng, Senthold; Ewert, Frank; Bassu, Simona; Durand, Jean-Louis; Martre, Pierre; Adam, Myriam; Aggarwal, Pramod K.; Angulo, Carlos; Baron, Chritian; Basso, Bruno; Bertuzzi, Patrick; Biemath, Christian; Boogaard, Hendrik; Boote, Kenneth J.; Brisson, Nadine; Cammarano, Davide; Challinor, Andrew J.; Conijn, Sjakk J. G.; Corbeels, Marc; Deryng, Delphine; De Sanctis, Giacomo; Doltra, Jordi; Gayler, Sebastian; Goldberg, Richard A.; Grassini, Patricio; Hatfield, Jerry L.; Heng, Lee; Hoek, Steven; Hooker, Josh; Hunt, Tony L. A.; Ingwersen, Joachim; Izaurralde, Cesar; Jongschaap, Raymond E. E.; Rosenzweig, Cynthia
2015-01-01
Many studies have been carried out during the last decade to study the effect of climate change on crop yields and other key crop characteristics. In these studies, one or several crop models were used to simulate crop growth and development for different climate scenarios that correspond to different projections of atmospheric CO2 concentration, temperature, and rainfall changes (Semenov et al., 1996; Tubiello and Ewert, 2002; White et al., 2011). The Agricultural Model Intercomparison and Improvement Project (AgMIP; Rosenzweig et al., 2013) builds on these studies with the goal of using an ensemble of multiple crop models in order to assess effects of climate change scenarios for several crops in contrasting environments. These studies generate large datasets, including thousands of simulated crop yield data. They include series of yield values obtained by combining several crop models with different climate scenarios that are defined by several climatic variables (temperature, CO2, rainfall, etc.). Such datasets potentially provide useful information on the possible effects of different climate change scenarios on crop yields. However, it is sometimes difficult to analyze these datasets and to summarize them in a useful way due to their structural complexity; simulated yield data can differ among contrasting climate scenarios, sites, and crop models. Another issue is that it is not straightforward to extrapolate the results obtained for the scenarios to alternative climate change scenarios not initially included in the simulation protocols. Additional dynamic crop model simulations for new climate change scenarios are an option but this approach is costly, especially when a large number of crop models are used to generate the simulated data, as in AgMIP. Statistical models have been used to analyze responses of measured yield data to climate variables in past studies (Lobell et al., 2011), but the use of a statistical model to analyze yields simulated by complex
NASA Astrophysics Data System (ADS)
Chan, Kwai S.
2015-12-01
Rectangular plates of Ti-6Al-4V with extra low interstitial (ELI) were fabricated by layer-by-layer deposition techniques that included electron beam melting (EBM) and laser beam melting (LBM). The surface conditions of these plates were characterized using x-ray micro-computed tomography. The depth and radius of surface notch-like features on the LBM and EBM plates were measured from sectional images of individual virtual slices of the rectangular plates. The stress concentration factors of individual surface notches were computed and analyzed statistically to determine the appropriate distributions for the notch depth, notch radius, and stress concentration factor. These results were correlated with the fatigue life of the Ti-6Al-4V ELI alloys from an earlier investigation. A surface notch analysis was performed to assess the debit in the fatigue strength due to the surface notches. The assessment revealed that the fatigue lives of the additively manufactured plates with rough surface topographies and notch-like features are dominated by the fatigue crack growth of large cracks for both the LBM and EBM materials. The fatigue strength reduction due to the surface notches can be as large as 60%-75%. It is concluded that for better fatigue performance, the surface notches on EBM and LBM materials need to be removed by machining and the surface roughness be improved to a surface finish of about 1 μm.
Adams, James; Kruger, Uwe; Geis, Elizabeth; Gehn, Eva; Fimbres, Valeria; Pollard, Elena; Mitchell, Jessica; Ingram, Julie; Hellmers, Robert; Quig, David; Hahn, Juergen
2017-01-01
Introduction: A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods: In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum and neurotypical participants based upon the UTM data alone. The analysis also included examining the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. "Leave-one-out" cross-validation was used to ensure statistical independence of results. Results and Discussion: Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, but nonlinear multivariate statistical analysis significantly improved ASD classification with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods determined a significantly stronger association between the behavioral measures and toxic metal excretion. The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate
NASA Astrophysics Data System (ADS)
Chakravarty, T.; Chowdhury, A.; Ghose, A.; Bhaumik, C.; Balamuralidhar, P.
2014-03-01
Telematics is an important technology enabler for intelligent transportation systems. By deploying on-board diagnostic devices, the signatures of vehicle vibration, along with location and time, are recorded. Detailed analyses of the collected signatures offer deep insights into the state of the objects under study. Towards that objective, we carried out experiments by deploying a telematics device in one of the office buses that ferry employees to the office and back. Data were collected from a 3-axis accelerometer and GPS, together with speed and time, for all journeys. In this paper, we present initial results of the above exercise by applying statistical methods to derive information through systematic analysis of the data collected over four months. It is demonstrated that the higher-order derivative of the measured Z-axis acceleration samples displays the properties of a Weibull distribution when the time axis is replaced by the amplitude of the processed acceleration data. This observation offers a method to predict future behaviour, where deviations from the prediction are classified as context-based aberrations or progressive degradation of the system. In addition, we capture the relationship between the speed of the vehicle and the median of the jerk-energy samples using regression analysis. Such results offer an opportunity to develop a robust method for modeling road-vehicle interaction, thereby enabling prediction of driving behaviour, condition-based maintenance, and the like.
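Two of the processing steps described, differentiating the acceleration samples and regressing a summary of the jerk energy on speed, can be sketched generically; the sample values below are hypothetical.

```python
import statistics

def jerk(accel, dt):
    """Finite-difference derivative of acceleration samples (the 'jerk')."""
    return [(a2 - a1) / dt for a1, a2 in zip(accel, accel[1:])]

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept, as used here to relate
    vehicle speed to the median of the jerk-energy samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Median jerk energy for one window of Z-axis samples (hypothetical data):
accel_z = [0.0, 0.2, 0.5, 0.4, 0.9]
energy = [j ** 2 for j in jerk(accel_z, dt=0.1)]
median_energy = statistics.median(energy)
```

Repeating the median-energy computation per journey window and fitting `linear_fit(speeds, median_energies)` would reproduce the regression step in outline.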
NASA Astrophysics Data System (ADS)
Gardezi, Syed Jamal Safdar; Faye, Ibrahima; Kamel, Nidal; Eltoukhy, Mohamed Meselhy; Hussain, Muhammad
2014-10-01
Early detection of breast cancer helps reduce mortality rates. Mammography is a very useful tool in breast cancer detection, but it is very difficult to separate different morphological features in mammographic images. In this study, the Morphological Component Analysis (MCA) method is used to extract different morphological aspects of mammographic images while effectively preserving the morphological characteristics of regions. MCA decomposes the mammogram into a piecewise-smooth part and a texture part using the Local Discrete Cosine Transform (LDCT) and the Curvelet Transform via wrapping (CURVwrap). A simple performance comparison is made, using some statistical features, between the original image and the piecewise-smooth part obtained from the MCA decomposition. The results show that MCA suppresses structural noise and blood vessels in the mammogram and enhances performance for mass detection.
Wang, Yi; Ma, Xiang; Wen, Ya-Dong; Zou, Quan; Wang, Jun; Tu, Jia-Run; Cai, Wen-Sheng; Shao, Xue-Guang
2013-05-01
Near-infrared diffuse reflectance spectroscopy has been applied in on-site and on-line analysis owing to its speed, its non-destructive character, and its feasibility for real, complex samples. The present work reports a real-time monitoring method for industrial production using near-infrared spectroscopy and multivariate statistical process analysis. In the method, real-time near-infrared spectra of the materials are collected on the production line, and the production process is then evaluated through a Hotelling T2 statistic calculated with an established model. Principal component analysis (PCA) is adopted for building the model, and the statistic is calculated by projecting the real-time spectra onto the PCA model. In an application of the method to a practical production process, it was demonstrated that variations in production can be evaluated in real time by monitoring changes in the statistic, and that products from different batches can be compared by further summarizing the statistic. The proposed method may therefore provide a practical route to quality assurance of production processes.
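The monitoring statistic described, obtained by projecting each new spectrum onto a previously fitted PCA model and summing the normalized squared scores, can be sketched as follows; the loadings and score variances are assumed to come from an already-built model.

```python
def hotelling_t2(spectrum, mean, loadings, score_var):
    """Hotelling T^2 for a new spectrum projected onto a PCA model.
    `loadings` is a list of principal-component vectors and `score_var`
    the variance of the training scores on each retained PC (both
    assumed to come from a previously fitted model)."""
    centered = [x - m for x, m in zip(spectrum, mean)]
    t2 = 0.0
    for pc, var in zip(loadings, score_var):
        score = sum(c * w for c, w in zip(centered, pc))  # projection
        t2 += score ** 2 / var
    return t2

# Identity loadings make the arithmetic easy to follow:
t2 = hotelling_t2([2.0, 1.0], mean=[0.0, 0.0],
                  loadings=[[1.0, 0.0], [0.0, 1.0]], score_var=[4.0, 1.0])
print(t2)   # 2**2/4 + 1**2/1 = 2.0
```

In a monitoring setting, T2 values exceeding a control limit derived from the training data would flag a process deviation.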
ROOT — A C++ framework for petabyte data storage, statistical analysis and visualization
NASA Astrophysics Data System (ADS)
Antcheva, I.; Ballintijn, M.; Bellenot, B.; Biskup, M.; Brun, R.; Buncic, N.; Canal, Ph.; Casadei, D.; Couet, O.; Fine, V.; Franco, L.; Ganis, G.; Gheata, A.; Maline, D. Gonzalez; Goto, M.; Iwaszkiewicz, J.; Kreshuk, A.; Segura, D. Marcos; Maunder, R.; Moneta, L.; Naumann, A.; Offermann, E.; Onuchin, V.; Panacek, S.; Rademakers, F.; Russo, P.; Tadel, M.
2009-12-01
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored in a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, the RooFit package allows the user to perform complex data modeling and fitting, while the RooStats library provides abstractions and implementations for advanced statistical tools. Multivariate classification methods based on machine learning techniques are available via the TMVA package. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks (e.g. data mining in HEP) by using PROOF, which will take care of optimally
NASA Astrophysics Data System (ADS)
Li, Hui-Chuan
2014-10-01
This study examines students' procedural and conceptual achievement in fraction addition in England and Taiwan. A total of 1209 participants (561 British students and 648 Taiwanese students) aged 12 and 13 were recruited to take part in the study. A quantitative design, by means of a self-designed written test, is adopted as central to the methodological considerations. The test has two major parts: a concept part and a skill part. The former is concerned with students' conceptual knowledge of fraction addition, and the latter with students' procedural competence when adding fractions. There were statistically significant differences in both the concept and skill parts between the British and Taiwanese groups, with the latter scoring higher. The analysis of the students' responses to the skill section indicates that the superiority of the Taiwanese students' procedural achievement over that of their British peers arises because most of the former are able to apply algorithms for adding fractions far more successfully than the latter. Earlier, Hart [1] reported that around 30% of the British students in that study used an erroneous strategy (adding tops and bottoms; for example, 2/3 + 1/7 = 3/10) when adding fractions. This study finds that nearly the same percentage of the British group still used this erroneous strategy as Hart found in 1981. The study also provides evidence that students' understanding of fractions is confused and incomplete, even among those who can successfully perform the operations. More research is needed to help students make sense of the operations and eventually attain computational competence with meaningful grounding in the domain of fractions.
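The erroneous "add tops and bottoms" strategy cited from Hart is easy to contrast with the correct common-denominator algorithm using Python's exact-arithmetic fractions module:

```python
from fractions import Fraction

# Correct algorithm: rewrite over a common denominator, then add.
correct = Fraction(2, 3) + Fraction(1, 7)     # 14/21 + 3/21 = 17/21

# Erroneous strategy reported by Hart: add numerators and denominators.
erroneous = Fraction(2 + 1, 3 + 7)            # 3/10

print(correct, erroneous)                     # 17/21 3/10
```

The gap between 17/21 (about 0.81) and 3/10 (0.3) makes plain how far the erroneous strategy strays from the true sum.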
Liu, Na; Li, Jun; Li, Bao-Guo
2014-11-01
The study of quality control of traditional Chinese medicine (TCM) has always been both a focus and a difficulty in the development of TCM, and it is one of the key problems restricting the modernization and internationalization of Chinese medicine. Multivariate statistical analysis is an analytical method well suited to the characteristics of TCM data and has been widely used in quality-control studies. It is applied to the multiple, mutually correlated indicators and variables that arise in quality control in order to uncover hidden laws and relationships in the data, which can then serve decision-making and enable effective quality evaluation of TCM. This paper summarizes the application of multivariate statistical analysis in the quality control of Chinese medicine and provides a basis for its further study.
Airwaves and Microblogs: A Statistical Analysis of Al-Shabaab’s Propaganda Effectiveness
2014-12-01
Somalia, Westgate, Kismayo, propaganda, jihad, ideology, data analysis, statistical analysis, counterterrorism, counter violent extremist messaging...Somalia that challenge U.S. interests. As with many violent organizations, al-Shabaab has made extensive use of the information environment as...Not all Radicals are the Same: Implications for Counter-Radicalization Strategy," in Countering Violent Extremism, Scientific Methods & Strategies
Use of the Jackknife Statistic To Establish the External Validity of Discriminant Analysis Results.
ERIC Educational Resources Information Center
Daniel, Larry G.
That the jackknifing technique is superior to traditional techniques for assessing the external validity of statistical results of discriminant analysis is defended. Traditional approaches assessed include: (1) the empirical method, in which the discriminant function coefficients (DFCs) obtained in a given analysis are applied to predict group…
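The jackknife idea, classifying each case with functions derived from the remaining n-1 cases, can be illustrated with a simple nearest-class-mean rule standing in for the discriminant functions (a generic sketch, not the article's procedure):

```python
def loo_nearest_mean(samples, labels):
    """Leave-one-out (jackknife) hit rate for a nearest-class-mean
    classifier, a simple stand-in for discriminant functions: each case
    is classified by a rule fitted without it."""
    hits = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        rest = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        means = {}
        for lab in set(l for _, l in rest):
            pts = [s for s, l in rest if l == lab]
            means[lab] = [sum(c) / len(pts) for c in zip(*pts)]
        pred = min(means, key=lambda lab: sum((a - b) ** 2
                                              for a, b in zip(x, means[lab])))
        hits += pred == y
    return hits / len(samples)

# Two well-separated groups: every left-out case is classified correctly.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["a", "a", "a", "b", "b", "b"]
print(loo_nearest_mean(samples, labels))   # 1.0
```

Because each case is scored by a rule it did not help fit, the resulting hit rate is a less optimistic, more externally valid estimate than resubstitution accuracy.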
Analysis of Variance with Summary Statistics in Microsoft® Excel®
ERIC Educational Resources Information Center
Larson, David A.; Hsu, Ko-Cheng
2010-01-01
Students regularly are asked to solve Single Factor Analysis of Variance problems given only the sample summary statistics (number of observations per category, category means, and corresponding category standard deviations). Most undergraduate students today use Excel for data analysis of this type. However, Excel, like all other statistical…
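The computation itself is straightforward to reproduce outside Excel; this generic sketch derives the one-way ANOVA F statistic from the same three summary inputs (group sizes, means, and sample standard deviations):

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA F statistic from per-group summary statistics:
    sample sizes, group means, and sample standard deviations."""
    k = len(ns)
    n_total = sum(ns)
    grand = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-group sum of squares from the group means,
    # within-group sum of squares from the standard deviations.
    ss_between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Two groups of 3 with means 1 and 3 and unit standard deviations:
print(anova_from_summary([3, 3], [1.0, 3.0], [1.0, 1.0]))   # 6.0
```

The same arithmetic is what an Excel worksheet built on summary statistics carries out cell by cell.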
Poucheret, Patrick; Fons, Françoise; Doré, Jean Christophe; Michelot, Didier; Rapior, Sylvie
2010-06-15
Ninety percent of fatal higher-fungus poisonings are due to amatoxin-containing mushroom species. In addition to the absence of an antidote, no chemotherapeutic consensus has been reported. The aim of the present study was to perform a retrospective multidimensional multivariate statistical analysis of 2110 amatoxin poisoning clinical cases in order to optimize therapeutic decision-making. Our results allowed us to classify drugs as a function of their influence on one major parameter: patient survival. Active principles were classified as first-intention, second-intention, adjuvant or controversial pharmaco-therapeutic clinical interventions. We conclude that (1) retrospective multidimensional multivariate statistical analysis of complex clinical datasets can help future therapeutic decision-making and (2) drugs such as silybin, N-acetylcysteine and putatively ceftazidime are clearly associated, in the amatoxin poisoning context, with a higher level of patient survival.
Kamal, Ghulam Mustafa; Wang, Xiaohua; Bin Yuan; Wang, Jie; Sun, Peng; Zhang, Xu; Liu, Maili
2016-09-01
Soy sauce, a seasoning well known all over the world and especially in Asia, is available on the global market in a wide range of types depending on its purpose and processing method. Its composition varies with the fermentation process and with the addition of additives, preservatives and flavor enhancers. A comprehensive (1)H NMR based study of the metabonomic variations among different types of soy sauce available on the global market has been limited by the complexity of the mixture. In the present study, (13)C NMR spectroscopy coupled with multivariate statistical data analysis, namely principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), was applied to investigate metabonomic variations among different types of soy sauce, namely super light, super dark, red cooking and mushroom soy sauce. The main additives in soy sauce, such as glutamate, sucrose and glucose, were easily distinguished and quantified using (13)C NMR spectroscopy, whereas they are difficult to assign and quantify in (1)H NMR spectra because of serious signal overlap. The significantly higher concentration of sucrose in dark, red cooking and mushroom flavored soy sauce can be linked directly to the addition of caramel. Similarly, the significantly higher level of glutamate in super light soy sauce compared with super dark and mushroom flavored soy sauce may come from the addition of monosodium glutamate. The study highlights the potential of (13)C NMR based metabonomics coupled with multivariate statistical data analysis for differentiating types of soy sauce on the basis of additive levels, raw materials and fermentation procedures.
NASA Astrophysics Data System (ADS)
Li, Hongxin; Jiang, Haodong; Gao, Ming; Ma, Zhi; Ma, Chuangui; Wang, Wei
2015-12-01
The statistical fluctuation problem is a critical factor in all quantum key distribution (QKD) protocols under finite-key conditions. Current statistical fluctuation analysis is mainly based on independent random samples; however, this precondition cannot always be satisfied because of different choices of samples and actual parameters. As a result, proper statistical fluctuation methods are required to solve this problem. Taking after-pulse contributions into consideration, this paper gives the expression for the secure key rate and the mathematical model for statistical fluctuations, focusing on a decoy-state QKD protocol [Z.-C. Wei et al., Sci. Rep. 3, 2453 (2013), 10.1038/srep02453] with a biased basis choice. On this basis, a classified analysis of statistical fluctuation is presented according to the mutual relationship between random samples. First, for independent identical relations, a deviation comparison is made between the law of large numbers and standard error analysis. Second, a sufficient condition is given under which the Chernoff bound achieves a better result than Hoeffding's inequality based on only independent relations. Third, by constructing a proper martingale, a stringent way is proposed to deal with issues based on dependent random samples by making use of Azuma's inequality. In numerical optimization, the impact on the secure key rate, the comparison of secure key rates, and the respective deviations under various kinds of statistical fluctuation analysis are depicted.
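The Chernoff-versus-Hoeffding comparison the abstract alludes to can be illustrated numerically. The sketch below uses the standard textbook forms of the two bounds for the upper tail of a binomial mean; the parameter values are invented and are not taken from the finite-key QKD analysis itself.

```python
# Compare two concentration bounds for P(mean of n Bernoulli(p) trials >= p + t).
from math import exp, log

def hoeffding(n, t):
    """Hoeffding: P(mean - p >= t) <= exp(-2 n t^2), independent of p."""
    return exp(-2 * n * t * t)

def chernoff_kl(n, p, t):
    """Chernoff bound in KL form: exp(-n * D(p+t || p)),
    with D(a||b) the Bernoulli Kullback-Leibler divergence."""
    a = p + t
    d = a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))
    return exp(-n * d)

n, t = 10_000, 0.01
for p in (0.5, 0.1, 0.01):
    h, c = hoeffding(n, t), chernoff_kl(n, p, t)
    print(f"p={p:<5} Hoeffding={h:.3e}  Chernoff(KL)={c:.3e}")
```

For p near 1/2 the two bounds nearly coincide, but for small success probabilities (such as low detection rates) the p-aware Chernoff bound is dramatically tighter, which is the kind of regime-dependent advantage the paper formalizes.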
The linear statistical d.c. model of GaAs MESFET using factor analysis
NASA Astrophysics Data System (ADS)
Dobrzanski, Lech
1995-02-01
The linear statistical model of the GaAs MESFET's current generator is obtained by means of factor analysis. Three different MESFET deterministic models are taken into account in the analysis: the Statz model (ST), the Materka-type model (MT) and a new proprietary model of a MESFET with an implanted channel (PLD). It is shown that statistical models obtained using factor analysis provide excellent generation of the multidimensional random variable representing the drain current of the MESFET. The method of implementing the statistical model in the SPICE program is presented. It is proved that, for a strongly limited number of Monte Carlo analysis runs in that program, the statistical models considered in each case (ST, MT and PLD) enable good reconstruction of the empirical factor structure. The empirical correlation matrix of model parameters is not reconstructed exactly by statistical modelling, but the values of correlation matrix elements obtained from simulated data are within the confidence intervals for the small sample. This paper proves that a formal approach to statistical modelling using factor analysis is the right path to follow, in spite of the fact that CAD systems (PSpice [MicroSim Corp.], Microwave Harmonica [Compact Software]) are not properly designed for generating the multidimensional random variable. It is obvious that further progress in the implementation of statistical methods in CAD software is required. Furthermore, a new approach to the MESFET's d.c. model is presented. The separate functions describing the linear and saturated regions of the MESFET output characteristics are combined in a single equation. This way of modelling is particularly suitable for transistors with an implanted channel.
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Bassari, Jinous; Triantafyllopoulos, Spiros
1984-01-01
The University of Southwestern Louisiana (USL) NASA PC R and D statistical analysis support package is designed to be a three-level package to allow statistical analysis for a variety of applications within the USL Data Base Management System (DBMS) contract work. The design addresses usage of the statistical facilities as a library package, as an interactive statistical analysis system, and as a batch processing package.
Lee, L.; Helsel, D.
2005-01-01
Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
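A minimal single-detection-limit version of the ROS idea can be sketched as follows: fit a line of log concentration against normal quantiles of plotting positions using the detected values, then impute the censored observations from the fitted lognormal line. The concentrations and detection limit here are invented, and the real S/R tools described above, which handle multiple detection limits, should be preferred in practice.

```python
# Sketch of regression on order statistics (ROS) for one detection limit.
from math import exp, log
from statistics import NormalDist, mean

detected = [1.2, 2.0, 3.5, 6.2, 10.0]   # measured concentrations
n_censored = 3                           # values reported only as "<1.0"
n = len(detected) + n_censored

# Blom plotting positions for all n ranks; censored values take the lowest ranks.
pp = [(i + 1 - 0.375) / (n + 0.25) for i in range(n)]
z = [NormalDist().inv_cdf(p) for p in pp]

# OLS fit of log(concentration) on normal quantiles, using detected values only.
zd, yd = z[n_censored:], [log(v) for v in sorted(detected)]
zbar, ybar = mean(zd), mean(yd)
slope = sum((a - zbar) * (b - ybar) for a, b in zip(zd, yd)) / sum((a - zbar) ** 2 for a in zd)
intercept = ybar - slope * zbar

# Impute the censored observations from the fitted lognormal line.
imputed = [exp(intercept + slope * q) for q in z[:n_censored]]
estimates = imputed + detected
print(f"imputed values: {[round(v, 2) for v in imputed]}")
print(f"ROS mean estimate: {mean(estimates):.2f}")
```

Because the censored values are imputed from the modeled distribution rather than substituted with zero, half the limit, or the limit itself, the summary statistics avoid the bias those ad hoc substitutions introduce.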
Clarke, David C; Morris, Melody K; Lauffenburger, Douglas A
2013-01-01
Multiplexed bead-based flow cytometric immunoassays are a powerful experimental tool for investigating cellular communication networks, yet their widespread adoption is limited in part by challenges in robust quantitative analysis of the measurements. Here we report our application of mixed-effects modeling for the normalization and statistical analysis of bead-based immunoassay data. Our data set consisted of bead-based immunoassay measurements of 16 phospho-proteins in lysates of HepG2 cells treated with ligands that regulate acute-phase protein secretion. Mixed-effects modeling provided estimates for the effects of both the technical and biological sources of variance, and normalization was achieved by subtracting the technical effects from the measured values. This approach allowed us to detect ligand effects on signaling with greater precision and sensitivity and to more accurately characterize the HepG2 cell signaling network using constrained fuzzy logic. Mixed-effects modeling analysis of our data was vital for ascertaining that IL-1α and TGF-α treatment increased the activities of more pathways than IL-6 and TNF-α and that TGF-α and TNF-α increased p38 MAPK and c-Jun N-terminal kinase (JNK) phospho-protein levels in a synergistic manner. Moreover, we used mixed-effects modeling-based technical effect estimates to reveal the substantial variance contributed by batch effects along with the absence of loading order and assay plate position effects. We conclude that mixed-effects modeling enabled additional insights to be gained from our data than would otherwise be possible and we discuss how this methodology can play an important role in enhancing the value of experiments employing multiplexed bead-based immunoassays.
Huang, Huei-Chung; Niu, Yi; Qin, Li-Xuan
2015-01-01
Deep sequencing has recently emerged as a powerful alternative to microarrays for the high-throughput profiling of gene expression. To account for the discrete nature of RNA sequencing data, new statistical methods and computational tools have been developed for differential expression analysis to identify genes that are relevant to a disease such as cancer. This paper therefore provides a timely overview of these analysis methods and tools. For readers with a statistical background, we also review the parameter estimation algorithms and hypothesis testing strategies used in these methods. PMID:26688660
Statistical Learning in Specific Language Impairment and Autism Spectrum Disorder: A Meta-Analysis
Obeid, Rita; Brooks, Patricia J.; Powers, Kasey L.; Gillespie-Lynch, Kristen; Lum, Jarrad A. G.
2016-01-01
Impairments in statistical learning might be a common deficit among individuals with Specific Language Impairment (SLI) and Autism Spectrum Disorder (ASD). Using meta-analysis, we examined statistical learning in SLI (14 studies, 15 comparisons) and ASD (13 studies, 20 comparisons) to evaluate this hypothesis. Effect sizes were examined as a function of diagnosis across multiple statistical learning tasks (Serial Reaction Time, Contextual Cueing, Artificial Grammar Learning, Speech Stream, Observational Learning, and Probabilistic Classification). Individuals with SLI showed deficits in statistical learning relative to age-matched controls. In contrast, statistical learning was intact in individuals with ASD relative to controls. Effect sizes did not vary as a function of task modality or participant age. Our findings inform debates about overlapping social-communicative difficulties in children with SLI and ASD by suggesting distinct underlying mechanisms. In line with the procedural deficit hypothesis (Ullman and Pierpont, 2005), impaired statistical learning may account for phonological and syntactic difficulties associated with SLI. In contrast, impaired statistical learning fails to account for the social-pragmatic difficulties associated with ASD. PMID:27602006
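The random-effects pooling that typically underlies a meta-analysis of effect sizes like the one above can be sketched with the DerSimonian-Laird estimator. The effect sizes and variances below are hypothetical, not values from this study.

```python
# Random-effects meta-analysis pooling (DerSimonian-Laird) on invented data.
from math import sqrt

effects = [0.9, 0.2, 1.1, 0.1, 0.6]     # per-study standardized mean differences
variances = [0.04, 0.09, 0.06, 0.12, 0.05]

w = [1 / v for v in variances]          # fixed-effect (inverse-variance) weights
fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)

# Cochran's Q and the DerSimonian-Laird between-study variance tau^2.
q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (q - (len(effects) - 1)) / c)

# Random-effects weights fold tau^2 into each study's variance.
w_re = [1 / (v + tau2) for v in variances]
pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
se = sqrt(1 / sum(w_re))
print(f"pooled effect = {pooled:.3f} (SE {se:.3f}), tau^2 = {tau2:.3f}")
```

When the studies are heterogeneous (tau^2 > 0), the random-effects weights are more equal than the fixed-effect weights, so no single large study dominates the pooled estimate.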
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
NASA Astrophysics Data System (ADS)
Bochanski, John J.; Hawley, Suzanne L.; West, Andrew A.
2011-03-01
We present a statistical parallax analysis of low-mass dwarfs from the Sloan Digital Sky Survey. We calculate absolute r-band magnitudes (Mr ) as a function of color and spectral type and investigate changes in Mr with location in the Milky Way. We find that magnetically active M dwarfs are intrinsically brighter in Mr than their inactive counterparts at the same color or spectral type. Metallicity, as traced by the proxy ζ, also affects Mr , with metal-poor stars having fainter absolute magnitudes than higher metallicity M dwarfs at the same color or spectral type. Additionally, we measure the velocity ellipsoid and solar reflex motion for each subsample of M dwarfs. We find good agreement between our measured solar peculiar motion and previous results for similar populations, as well as some evidence for differing motions of early and late M-type populations in U and W velocities that cannot be attributed to asymmetric drift. The reflex solar motion and the velocity dispersions both show that younger populations, as traced by magnetic activity and location near the Galactic plane, have experienced less dynamical heating. We introduce a new parameter, the independent position altitude (IPA), to investigate populations as a function of vertical height from the Galactic plane. M dwarfs at all types exhibit an increase in velocity dispersion when analyzed in comparable IPA subgroups.
Statistical analysis of the MODIS atmosphere products for the Tomsk region
NASA Astrophysics Data System (ADS)
Afonin, Sergey V.; Belov, Vladimir V.; Engel, Marina V.
2005-10-01
The paper presents the results of using MODIS Atmosphere Products satellite information to study atmospheric characteristics (aerosol and water vapor) in the Tomsk Region (56-61°N, 75-90°E) in 2001-2004. The satellite data were received from the NASA Goddard Distributed Active Archive Center (DAAC) through the Internet. To use satellite data to solve scientific and applied problems, it is very important to know their accuracy. Although results of validation of the MODIS data are already available in the literature, we decided to carry out additional investigations for the Tomsk Region. The paper presents the results of validation of the aerosol optical thickness (AOT) and total column precipitable water (TCPW), which are in good agreement with the test data. The statistical analysis revealed some interesting facts. For example, analyzing the data on the spatial distribution of the average seasonal values of AOT and TCPW for 2001-2003 in the Tomsk Region, we established that, instead of the expected spatial homogeneity, these distributions have similar spatial structures.
Singh, Sunil Kumar; Jha, Sunil Kumar; Chaudhary, Anand; Yadava, R D S; Rai, S B
2010-02-01
Herbal medicines play an important role in modern human life and have significant effects on treating diseases; however, the quality and safety of these herbal products has now become a serious issue due to increasing pollution in air, water, soil, etc. The present study proposes Fourier transform infrared spectroscopy (FTIR) along with the statistical method principal component analysis (PCA) to identify and discriminate herbal medicines for quality control. Herbal plants have been characterized using FTIR spectroscopy. Characteristic peaks (strong and weak) have been marked for each herbal sample in the fingerprint region (400-2000 cm(-1)). The ratio of the areas of any two marked characteristic peaks was found to be nearly consistent for the same plant from different regions, and thus the present idea suggests an additional discrimination method for herbal medicines. PCA clusters herbal medicines into different groups, clearly showing that this method can adequately discriminate different herbal medicines using FTIR data. Toxic metal contents (Cd, Pb, Cr, and As) have been determined and the results compared with the higher permissible daily intake limit of heavy metals proposed by the World Health Organization (WHO).
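The peak-area-ratio check described above can be sketched by integrating two marked fingerprint-region bands with the trapezoid rule and comparing their ratio across samples. The "spectra" below are synthetic stand-ins for real FTIR data, with the second sample simulating a doubled concentration of the same plant material.

```python
# Peak-area ratio as a concentration-robust fingerprint for synthetic spectra.
def trapezoid_area(xs, ys):
    """Trapezoid-rule area under y(x) over the sampled points."""
    return sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2 for i in range(len(xs) - 1))

def band_area(wavenumbers, absorbance, lo, hi):
    pts = [(x, y) for x, y in zip(wavenumbers, absorbance) if lo <= x <= hi]
    return trapezoid_area([p[0] for p in pts], [p[1] for p in pts])

wn = list(range(400, 2001, 10))  # fingerprint region, cm^-1
sample1 = [0.1 + 0.8 * (1000 <= x <= 1100) + 0.4 * (1600 <= x <= 1700) for x in wn]
sample2 = [0.1 + 1.6 * (1000 <= x <= 1100) + 0.8 * (1600 <= x <= 1700) for x in wn]

ratios = []
for name, spec in (("sample 1", sample1), ("sample 2", sample2)):
    r = band_area(wn, spec, 1000, 1100) / band_area(wn, spec, 1600, 1700)
    ratios.append(r)
    print(f"{name}: area ratio = {r:.2f}")
```

The ratio stays nearly constant even though the absolute band intensities doubled, which is why the ratio of two characteristic peak areas can discriminate plants independently of sample concentration.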
Statistical analysis of data from dilution assays with censored correlated counts.
Quiroz, Jorge; Wilson, Jeffrey R; Roychoudhury, Satrajit
2012-01-01
Frequently, count data obtained from dilution assays are subject to an upper detection limit, and as such, data obtained from these assays are usually censored. Also, counts from the same subject at different dilution levels are correlated. Ignoring the censoring and the correlation may provide unreliable and misleading results. Therefore, any meaningful data modeling requires that the censoring and the correlation be simultaneously addressed. Such comprehensive approaches to modeling censoring and correlation are not widely used in the analysis of dilution assay data. Traditionally, these data are analyzed using a general linear model on the logarithm-transformed average count per subject. However, this traditional approach ignores the between-subject variability and risks providing inconsistent results and unreliable conclusions. In this paper, we propose the use of a censored negative binomial model with normal random effects to analyze such data. This model addresses, in addition to the censoring and the correlation, any overdispersion that may be present in count data. The model is shown to be widely accessible through several modern statistical software packages.
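The likelihood construction the abstract argues for can be sketched in miniature: a negative binomial pmf with right-censored counts entering through the survival probability P(Y >= limit) rather than as if they were exact values. The random effects are omitted here, and the counts, censoring limit, and parameter values are all invented for illustration.

```python
# Right-censored negative binomial log-likelihood, without random effects.
from math import exp, lgamma, log

def nb_logpmf(y, mu, k):
    """log P(Y = y) for NB with mean mu and dispersion k (variance = mu + mu^2/k)."""
    return (lgamma(y + k) - lgamma(k) - lgamma(y + 1)
            + k * log(k / (k + mu)) + y * log(mu / (k + mu)))

def censored_loglik(counts, n_censored, mu, k, limit):
    """Exact log-pmf terms for observed counts; log P(Y >= limit) for censored ones."""
    ll = sum(nb_logpmf(y, mu, k) for y in counts)
    p_below = sum(exp(nb_logpmf(y, mu, k)) for y in range(limit))
    ll += n_censored * log(1 - p_below)
    return ll

counts = [3, 7, 12, 5, 9]      # fully observed counts
n_censored = 2                 # counts reported only as ">= 30"
ll = censored_loglik(counts, n_censored, mu=10.0, k=4.0, limit=30)
naive = sum(nb_logpmf(y, 10.0, 4.0) for y in counts + [30] * n_censored)
print(f"censored log-likelihood: {ll:.2f}, naive (treat >=30 as exactly 30): {naive:.2f}")
```

Because P(Y >= 30) is always at least P(Y = 30), the two likelihoods differ, and maximizing the wrong one biases the fitted mean and dispersion; this is the censoring distortion the proposed model corrects.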
de Jong, Simone; van Eijk, Kristel R; Zeegers, Dave W L H; Strengman, Eric; Janson, Esther; Veldink, Jan H; van den Berg, Leonard H; Cahn, Wiepke; Kahn, René S; Boks, Marco P M; Ophoff, Roel A
2012-09-01
There is genetic evidence that schizophrenia is a polygenic disorder with a large number of loci of small effect on disease susceptibility. Genome-wide association studies (GWASs) of schizophrenia have had limited success, with the best finding at the MHC locus at chromosome 6p. A recent effort of the Psychiatric GWAS consortium (PGC) yielded five novel loci for schizophrenia. In this study, we aim to highlight additional schizophrenia susceptibility loci from the PGC study by combining the top association findings from the discovery stage (9394 schizophrenia cases and 12 462 controls) with expression QTLs (eQTLs) and differential gene expression in whole blood of schizophrenia patients and controls. We examined the 6192 single-nucleotide polymorphisms (SNPs) with significance threshold at P<0.001. eQTLs were calculated for these SNPs in a sample of healthy controls (n=437). The transcripts significantly regulated by the top SNPs from the GWAS meta-analysis were subsequently tested for differential expression in an independent set of schizophrenia cases and controls (n=202). After correction for multiple testing, the eQTL analysis yielded 40 significant cis-acting effects of the SNPs. Seven of these transcripts show differential expression between cases and controls. Of these, the effect of three genes (RNF5, TRIM26 and HLA-DRB3) coincided with the direction expected from meta-analysis findings and were all located within the MHC region. Our results identify new genes of interest and highlight again the involvement of the MHC region in schizophrenia susceptibility.
A new statistical analysis of rare earth element diffusion data in garnet
NASA Astrophysics Data System (ADS)
Chu, X.; Ague, J. J.
2015-12-01
The incorporation of rare earth elements (REE) in garnet, Sm and Lu in particular, links garnet chemical zoning to absolute age determinations. The application of REE-based geochronology depends critically on the diffusion behaviors of the parent and daughter isotopes. Previous experimental studies on REE diffusion in garnet, however, exhibit significant discrepancies that impact interpretations of garnet Sm/Nd and Lu/Hf ages. We present a new statistical framework to analyze diffusion data for REE using an Arrhenius relationship that accounts for oxygen fugacity, cation radius and garnet unit-cell dimensions [1]. Our approach is based on Bayesian statistics and is implemented by the Markov chain Monte Carlo method. A similar approach has recently been applied to model diffusion of divalent cations in garnet [2]. The analysis incorporates recent data [3] in addition to the data compilation in ref. [1]. We also include the inter-run bias that helps reconcile the discrepancies among data sets. This additional term estimates the reproducibility and other experimental variabilities not explicitly incorporated in the Arrhenius relationship [2] (e.g., compositional dependence [3] and water content). The fitted Arrhenius relationships are consistent with the models in ref. [3], as well as refs. [1] and [4] at high temperatures. Down-temperature extrapolation leads to >0.5 order of magnitude faster diffusion coefficients than in refs. [1] and [4] at <750 °C. The predicted diffusion coefficients are significantly slower than in ref. [5]. The fast diffusion [5] was supported by a field test of the Pikwitonei Granulite: the garnet Sm/Nd age postdates the metamorphic peak (750 °C) by ~30 Myr [6], suggesting considerable resetting of the Sm/Nd system during cooling. However, the Pikwitonei Granulite is a recently recognized UHT terrane with peak temperature exceeding 900 °C [7]. The revised closure temperature (~730 °C) is consistent with our new diffusion model. [1] Carlson (2012) Am
The statistical analysis techniques to support the NGNP fuel performance experiments
Binh T. Pham; Jeffrey J. Einerson
2013-10-01
This paper describes the development and application of statistical analysis techniques to support the Advanced Gas Reactor (AGR) experimental program on Next Generation Nuclear Plant (NGNP) fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel temperature) is regulated by the He–Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the NGNP Data Management and Analysis System for automated processing and qualification of the AGR measured data. The neutronic and thermal code simulation results are used for comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the fuel temperature within a given range.
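The control-charting idea used to warn of thermocouple failures can be sketched as follows: establish 3-sigma limits from an in-control baseline period, then flag readings that fall outside them. The temperatures below are synthetic; the NGNP system applies this logic (alongside correlation and regression analysis) to the AGR measured channels.

```python
# Shewhart-style control chart flagging out-of-limit thermocouple readings.
from statistics import mean, stdev

baseline = [1001.2, 999.8, 1000.5, 998.9, 1000.1, 1001.0, 999.4, 1000.3]  # deg C
center, sigma = mean(baseline), stdev(baseline)
ucl, lcl = center + 3 * sigma, center - 3 * sigma  # upper/lower control limits

new_readings = [1000.4, 999.9, 1000.8, 1004.9, 1005.6]  # drift begins mid-stream
for i, t in enumerate(new_readings):
    status = "OK" if lcl <= t <= ucl else "ALARM: possible thermocouple failure"
    print(f"reading {i}: {t:7.1f}  [{status}]")
```

A chart like this catches sudden sensor failures; the complementary regression models described in the paper are what allow the gas-mixture control to track slower, systematic changes.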
Subramanyam, Busetty; Das, Ashutosh
2014-01-01
In adsorption studies, describing the sorption process and identifying the best-fitting isotherm model are key steps in testing the theoretical hypothesis. Hence, numerous statistical analyses have been used to compare experimental equilibrium adsorption values with predicted equilibrium values. In the present study, several statistical error analyses were carried out to evaluate the fit of adsorption isotherm models: the Pearson correlation, the coefficient of determination and the Chi-square test. An ANOVA test was carried out to evaluate the significance of the various error functions, and the coefficient of dispersion was evaluated for linearized and non-linearized models. The adsorption of phenol onto natural soil (local name: Kalathur soil) was carried out in batch mode at 30 ± 2 °C. To obtain a holistic view of the analysis, linear and non-linear isotherm models were compared when estimating the isotherm parameters. The results revealed which of the above-mentioned error and statistical functions determined the best-fitting isotherm.
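Two of the error functions named above can be sketched directly: the coefficient of determination and the non-linear Chi-square statistic, applied to hypothetical experimental versus model-predicted equilibrium uptake values. The numbers are illustrative, not the Kalathur-soil data.

```python
# Goodness-of-fit error functions for comparing isotherm models.
from statistics import mean

q_exp   = [12.1, 18.4, 22.9, 26.0, 28.3]   # measured equilibrium uptake (mg/g)
q_model = [11.5, 18.9, 23.5, 26.2, 27.8]   # e.g. Langmuir-predicted uptake (mg/g)

# Coefficient of determination R^2
ss_res = sum((e - m) ** 2 for e, m in zip(q_exp, q_model))
ss_tot = sum((e - mean(q_exp)) ** 2 for e in q_exp)
r2 = 1 - ss_res / ss_tot

# Non-linear Chi-square statistic: sum of (q_exp - q_model)^2 / q_model
chi2 = sum((e - m) ** 2 / m for e, m in zip(q_exp, q_model))

print(f"R^2 = {r2:.4f}, chi^2 = {chi2:.4f}")
```

Computing these for each candidate isotherm and preferring the model with higher R^2 and lower chi^2 is the comparison logic the study applies across its linearized and non-linearized forms.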
NASA Astrophysics Data System (ADS)
Oliveira Mendes, Thiago de; Pinto, Liliane Pereira; Santos, Laurita dos; Tippavajhala, Vamshi Krishna; Téllez Soto, Claudio Alberto; Martin, Airton Abrahão
2016-07-01
The analysis of biological systems by spectroscopic techniques involves the evaluation of hundreds to thousands of variables. Hence, different statistical approaches are used to elucidate regions that discriminate classes of samples and to propose new vibrational markers for explaining various phenomena like disease monitoring, mechanisms of action of drugs, food, and so on. However, the underlying statistical techniques are not always widely discussed in the applied sciences. In this context, this work presents a detailed discussion including the various steps necessary for proper statistical analysis. It includes univariate parametric and nonparametric tests, as well as multivariate unsupervised and supervised approaches. The main objective of this study is to promote proper understanding of the application of various statistical tools in these spectroscopic methods used for the analysis of biological samples. The discussion of these methods is performed on a set of in vivo confocal Raman spectra of human skin, analyzed with the aim of identifying skin aging markers. In the Appendix, a complete routine of data analysis is executed in free software that can be used by the scientific community involved in these studies.
2011-01-01
Background Verbal autopsies provide valuable information for studying mortality patterns in populations that lack reliable vital registration data. Methods for transforming verbal autopsy results into meaningful information for health workers and policymakers, however, are often costly or complicated to use. We present a simple additive algorithm, the Tariff Method (termed Tariff), which can be used for assigning individual cause of death and for determining cause-specific mortality fractions (CSMFs) from verbal autopsy data. Methods Tariff calculates a score, or "tariff," for each cause, for each sign/symptom, across a pool of validated verbal autopsy data. The tariffs are summed for a given response pattern in a verbal autopsy, and this sum (score) provides the basis for predicting the cause of death in a dataset. We implemented this algorithm and evaluated the method's predictive ability, both in terms of chance-corrected concordance at the individual cause assignment level and in terms of CSMF accuracy at the population level. The analysis was conducted separately for adult, child, and neonatal verbal autopsies across 500 pairs of train-test validation verbal autopsy data. Results Tariff is capable of outperforming physician-certified verbal autopsy in most cases. In terms of chance-corrected concordance, the method achieves 44.5% in adults, 39% in children, and 23.9% in neonates. CSMF accuracy was 0.745 in adults, 0.709 in children, and 0.679 in neonates. Conclusions Verbal autopsies can be an efficient means of obtaining cause of death data, and Tariff provides an intuitive, reliable method for generating individual cause assignment and CSMFs. The method is transparent and flexible and can be readily implemented by users without training in statistics or computer science. PMID:21816107
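The additive scoring at the heart of Tariff can be sketched in a few lines: each cause carries a tariff (score) per sign/symptom, the tariffs of the endorsed symptoms are summed per cause, and the highest-scoring cause is assigned. The causes, symptoms, and tariff values below are invented for illustration; real tariffs are derived from validated verbal autopsy data.

```python
# Sketch of the additive Tariff scoring described above.
# Tariff values here are made up for illustration only.

TARIFFS = {
    "pneumonia": {"cough": 5.0, "fever": 3.0, "chest_pain": 2.0},
    "drowning": {"found_in_water": 9.0, "cough": -1.0},
    "ischemic_heart_disease": {"chest_pain": 6.0, "sweating": 2.0},
}

def assign_cause(endorsed_symptoms):
    """Return (best_cause, scores) for a set of endorsed symptoms."""
    scores = {
        cause: sum(t for s, t in tariffs.items() if s in endorsed_symptoms)
        for cause, tariffs in TARIFFS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

cause, scores = assign_cause({"cough", "fever"})
print(cause, scores)  # pneumonia scores 5 + 3 = 8
```

The transparency claimed in the abstract is visible here: the per-symptom contributions to each cause's score can be read off directly.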
The Grenoble Analysis Toolkit (GreAT)-A statistical analysis framework
NASA Astrophysics Data System (ADS)
Putze, A.; Derome, L.
2014-12-01
The field of astroparticle physics is currently the focus of prolific scientific activity. In the last decade, this field has undergone significant developments thanks to several experimental results from CREAM, PAMELA, Fermi, and H.E.S.S. Moreover, the next generation of instruments, such as AMS-02 (launched on 16 May 2011) and CTA, will undoubtedly facilitate more sensitive and precise measurements of the cosmic-ray and γ-ray fluxes. To fully exploit the wealth of high precision data generated by these experiments, robust and efficient statistical tools such as Markov Chain Monte Carlo algorithms or evolutionary algorithms, able to handle the complexity of joint parameter spaces and datasets, are necessary for a phenomenological interpretation. The Grenoble Analysis Toolkit (GreAT) is a user-friendly, modular, object-oriented C++ framework which samples the user-defined parameter space with a pre-defined or user-defined algorithm. The functionality of GreAT is presented in the context of cosmic-ray physics, where the boron-to-carbon (B/C) ratio is used to constrain cosmic-ray propagation models.
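A toy version of the Markov Chain Monte Carlo sampling that frameworks like GreAT provide can illustrate the idea. Here the "posterior" is just a standard normal log-density; in a real analysis it would come from a cosmic-ray propagation model confronted with data. This is a sketch of the generic Metropolis-Hastings algorithm, not GreAT's actual implementation.

```python
import math
import random

# Minimal Metropolis-Hastings sampler over a one-dimensional parameter space.
# The target density is a standard normal, standing in for a real posterior.

def log_posterior(theta):
    return -0.5 * theta * theta  # standard normal, up to a constant

def metropolis(n_samples, step=1.0, seed=42):
    random.seed(seed)
    theta = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = theta + random.gauss(0.0, step)
        # Accept with probability min(1, p(proposal)/p(theta)).
        if math.log(random.random()) < log_posterior(proposal) - log_posterior(theta):
            theta = proposal
        samples.append(theta)
    return samples

chain = metropolis(20000)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
print(round(mean, 2), round(var, 2))  # should be close to 0 and 1
```

The chain's mean and variance approximate those of the target distribution; for a multi-parameter propagation model, the same loop runs over a parameter vector and the chain yields joint posterior samples.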
Crossett, Ben; Edwards, Alistair V G; White, Melanie Y; Cordwell, Stuart J
2008-01-01
Standardized methods for the solubilization of proteins prior to proteomics analyses incorporating two-dimensional gel electrophoresis (2-DE) are essential for providing reproducible data that can be subjected to rigorous statistical interrogation for comparative studies investigating disease-genesis. In this chapter, we discuss the imaging and image analysis of proteins separated by 2-DE, in the context of determining protein abundance alterations related to a change in biochemical or biophysical conditions. We then describe the principles behind 2-DE gel statistical analysis, including subtraction of background noise, spot detection, gel matching, spot quantitation for data comparison, and statistical requirements to create meaningful gel data sets. We also emphasize the need to develop reproducible and robust protocols for protein sample preparation and 2-DE itself.
Landing Site Dispersion Analysis and Statistical Assessment for the Mars Phoenix Lander
NASA Technical Reports Server (NTRS)
Bonfiglio, Eugene P.; Adams, Douglas; Craig, Lynn; Spencer, David A.; Strauss, William; Seelos, Frank P.; Seelos, Kimberly D.; Arvidson, Ray; Heet, Tabatha
2008-01-01
The Mars Phoenix Lander launched on August 4, 2007 and successfully landed on Mars 10 months later on May 25, 2008. Landing ellipse predictions and hazard maps were key in selecting safe surface targets for Phoenix. Hazard maps were based on terrain slopes, geomorphology maps and automated rock counts from MRO's High Resolution Imaging Science Experiment (HiRISE) images. The expected landing dispersion which led to the selection of Phoenix's surface target is discussed, as well as the actual landing dispersion predictions determined during operations in the weeks, days, and hours before landing. A statistical assessment of these dispersions is performed, comparing the actual landing-safety probabilities to criteria levied by the project. Also discussed are applications of this statistical analysis which were used by the Phoenix project. These include using the statistical analysis to verify the effectiveness of a pre-planned maneuver menu and to calculate the probability of future maneuvers.
A statistical analysis of the effect of warfare on the human secondary sex ratio.
Graffelman, J; Hoekstra, R F
2000-06-01
Many factors have been hypothesized to affect the human secondary sex ratio (the annual percentage of males among all live births), among them race, parental ages, and birth order. Some authors have even proposed warfare as a factor influencing live birth sex ratios. The hypothesis that during and shortly after periods of war the human secondary sex ratio is higher has received little statistical treatment. In this paper we evaluate the war hypothesis using 3 statistical methods: linear regression, randomization, and time-series analysis. Live birth data from 10 different countries were included. Although we cannot speak of a general phenomenon, statistical evidence for an association between warfare and live birth sex ratio was found for several countries. Regression and randomization test results were in agreement. Time-series analysis showed that most human sex-ratio time series can be described by a common model. The results obtained using intervention models differed somewhat from results obtained by regression methods.
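The randomization method mentioned above can be sketched as a permutation test: compare the mean sex ratio in war years against other years, then repeatedly re-shuffle the war/non-war labels to see how often a difference at least as large arises by chance. The yearly values below are synthetic, chosen only to illustrate the mechanics, not taken from the paper's data.

```python
import random

# Permutation (randomization) test for a difference in mean sex ratio
# between hypothetical "war years" and "other years".

random.seed(1)
war_years = [0.5170, 0.5165, 0.5172, 0.5168, 0.5175]        # hypothetical
other_years = [0.5140, 0.5138, 0.5145, 0.5142, 0.5139,
               0.5144, 0.5141, 0.5143, 0.5137, 0.5146]      # hypothetical

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(war_years) - mean(other_years)

pooled = war_years + other_years
n_war = len(war_years)
n_perm = 10000
count = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = mean(pooled[:n_war]) - mean(pooled[n_war:])
    if diff >= observed:
        count += 1
p_value = count / n_perm

print(round(observed, 4), p_value)
```

A small p-value indicates that the elevated war-year sex ratio is unlikely under random labeling; the paper complements this with regression and time-series analysis of the same series.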
Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop
NASA Technical Reports Server (NTRS)
Morrison, Joseph H.
2010-01-01
A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.
Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis
Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun
2014-01-01
As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687
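The bounded-support point can be made concrete with a beta distribution, a standard choice for data confined to [0, 1] such as methylation levels. The sketch below fits a beta distribution by the method of moments to synthetic bounded data; the paper's actual non-Gaussian models and clustering pipeline are more elaborate, so this only illustrates why a bounded model is preferable to a Gaussian here.

```python
import random

# Method-of-moments fit of a beta distribution to bounded (0, 1) data,
# standing in for methylation values. Data are synthetic.

def beta_method_of_moments(data):
    n = len(data)
    m = sum(data) / n
    v = sum((x - m) ** 2 for x in data) / (n - 1)
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common   # (alpha, beta)

random.seed(7)
alpha_true, beta_true = 2.0, 5.0
data = [random.betavariate(alpha_true, beta_true) for _ in range(20000)]

alpha_hat, beta_hat = beta_method_of_moments(data)
print(round(alpha_hat, 2), round(beta_hat, 2))  # near (2, 5)
```

A Gaussian fitted to the same data would assign probability mass outside [0, 1]; the beta model captures the bounded, skewed shape that the abstract argues drives the better clustering performance.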
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a co-orbital stream which intersected Earth's orbit in May from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that by a totally different criterion, the labile trace element contents - hence thermal histories - of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
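The first of the two techniques named above can be illustrated with a minimal two-class Fisher linear discriminant: project samples onto w = Sw⁻¹(m1 − m2), where Sw is the pooled within-class scatter, and classify by which class mean's projection is nearer. The 2-D data below are synthetic, not meteorite measurements.

```python
# Two-class linear discriminant analysis (Fisher) on synthetic 2-D data.

def mean_vec(xs):
    n = len(xs)
    return [sum(x[i] for x in xs) / n for i in range(2)]

def scatter(xs, m):
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        d = [x[0] - m[0], x[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

class1 = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)]
class2 = [(3.0, 0.5), (3.2, 0.4), (2.9, 0.7), (3.1, 0.6)]

m1, m2 = mean_vec(class1), mean_vec(class2)
s1, s2 = scatter(class1, m1), scatter(class2, m2)
sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]

# Invert the 2x2 pooled scatter matrix and form the discriminant direction.
det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
dm = [m1[0] - m2[0], m1[1] - m2[1]]
w = [inv[0][0] * dm[0] + inv[0][1] * dm[1],
     inv[1][0] * dm[0] + inv[1][1] * dm[1]]

def project(x):
    return w[0] * x[0] + w[1] * x[1]

threshold = (project(m1) + project(m2)) / 2.0

def predict(x):
    return 1 if project(x) > threshold else 2

print([predict(x) for x in class1 + class2])
```

With well-separated clusters the discriminant classifies every point correctly; for meteorite data the inputs would be trace element contents and the classes the Cluster 1 versus non-Cluster 1 falls.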
A statistical analysis to assess the maturity and stability of six composts.
Komilis, Dimitrios P; Tziouvaras, Ioannis S
2009-05-01
Despite the long-time application of organic waste derived composts to crops, there is still no universally accepted index to assess compost maturity and stability. The research presented in this article investigated the suitability of seven types of seeds for use in germination bioassays to assess the maturity and phytotoxicity of six composts. The composts used in the study were derived from cow manure, sea weeds, olive pulp, poultry manure and municipal solid waste. The seeds used in the germination bioassays were radish, pepper, spinach, tomato, cress, cucumber and lettuce. Data were analyzed with an analysis of variance at two levels and with pair-wise comparisons. The analysis revealed that composts found phytotoxic to one type of seed could enhance the growth of another type of seed. Therefore, germination indices, which ranged from 0% to 262%, were highly dependent on the type of seed used in the germination bioassay. The poultry manure compost was highly phytotoxic to all seeds. At the 99% confidence level, the type of seed and the interaction between the seeds and the composts were found to significantly affect germination. In addition, the stability of composts was assessed by their microbial respiration, which ranged from approximately 4 to 16 g O2/kg organic matter and from 2.6 to approximately 11 g CO2-C/kg C, after seven days. Initial average oxygen uptake rates were all less than approximately 0.35 g O2/kg organic matter/h for all six composts. A high statistically significant correlation coefficient was calculated between the cumulative carbon dioxide production, over a 7-day period, and the radish seed germination index. It appears that a germination bioassay with radish can be a valid test to assess both compost stability and compost phytotoxicity.
Buttigieg, Pier Luigi; Ramette, Alban
2014-12-01
The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community.
Analysis methods for the determination of anthropogenic additions of P to agricultural soils
Technology Transfer Automated Retrieval System (TEKTRAN)
Phosphorus additions and their measurement in soil are of concern on lands where biosolids have been applied. Colorimetric analysis for plant-available P may be inadequate for the accurate assessment of soil P. Phosphate additions in a regulatory environment need to be accurately assessed as the reported...
ERIC Educational Resources Information Center
Curtis, Deborah A.; Araki, Cheri J.
The purpose of this research was to analyze recent statistics textbooks in the behavioral sciences in terms of their coverage of exploratory data analysis (EDA) philosophy and techniques. Twenty popular texts were analyzed. EDA philosophy was not addressed in the vast majority of texts. Only three texts had an entire chapter on EDA. None of the…
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis
ERIC Educational Resources Information Center
Lin, Johnny; Bentler, Peter M.
2012-01-01
Goodness-of-fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square, but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's (1984) asymptotically distribution-free method and Satorra Bentler's…
ERIC Educational Resources Information Center
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
A Statistical Analysis of Infrequent Events on Multiple-Choice Tests that Indicate Probable Cheating
ERIC Educational Resources Information Center
Sundermann, Michael J.
2008-01-01
A statistical analysis of multiple-choice answers is performed to identify anomalies that can be used as evidence of student cheating. The ratio of exact errors in common (EEIC: two students put the same wrong answer for a question) to differences (D: two students get different answers) was found to be a good indicator of cheating under a wide…
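The EEIC/D ratio described above is simple to compute: for a pair of students, count exact errors in common (same wrong answer on a question) and differences (different answers on a question), then take the ratio. The answer strings below are hypothetical.

```python
# Sketch of the EEIC-to-D index for a pair of multiple-choice answer sheets.

def eeic_to_d(answers_a, answers_b, key):
    """Return (EEIC, D, ratio) for two answer strings against the key."""
    eeic = sum(1 for a, b, k in zip(answers_a, answers_b, key)
               if a == b and a != k)
    d = sum(1 for a, b in zip(answers_a, answers_b) if a != b)
    return eeic, d, (eeic / d if d else float("inf"))

key       = "ABCDABCDAB"
student_1 = "ABCDABCDAB"  # all correct
student_2 = "ABCCABCCAB"  # two wrong answers
student_3 = "ABCCABCCAB"  # identical wrong answers to student 2

print(eeic_to_d(student_1, student_2, key))  # (0, 2, 0.0)
print(eeic_to_d(student_2, student_3, key))  # (2, 0, inf)
```

Independent students tend to produce many differences and few shared wrong answers (small ratio); copying produces shared wrong answers with few differences (large ratio), which is the anomaly the analysis flags.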
The Impact of Training and Demographics in WIA Program Performance: A Statistical Analysis
ERIC Educational Resources Information Center
Moore, Richard W.; Gorman, Philip C.
2009-01-01
The Workforce Investment Act (WIA) measures participant labor market outcomes to drive program performance. This article uses statistical analysis to examine the relationship between participant characteristics and key outcome measures in one large California local WIA program. This study also measures the impact of different training…
ERIC Educational Resources Information Center
Lau, Joann M.; Korn, Robert W.
2007-01-01
In this article, the authors present a laboratory exercise in data collection and statistical analysis in biological space using clustered stomates on leaves of "Begonia" plants. The exercise can be done in middle school classes by students making their own slides and seeing imprints of cells, or at the high school level through collecting data of…
ERIC Educational Resources Information Center
Zhou, Ping; Wang, Qinwen; Yang, Jie; Li, Jingqiu; Guo, Junming; Gong, Zhaohui
2015-01-01
This study aimed to investigate the statuses on the publishing and usage of college biochemistry textbooks in China. A textbook database was constructed and the statistical analysis was adopted to evaluate the textbooks. The results showed that there were 945 (~57%) books for theory teaching, 379 (~23%) books for experiment teaching and 331 (~20%)…
ERIC Educational Resources Information Center
Long, Mike; Frigo, Tracey; Batten, Margaret
This report describes the current educational and employment situation of Australian Indigenous youth in terms of their pathways from school to work. A literature review and analysis of statistical data identify barriers to successful transition from school to work, including forms of teaching, curriculum, and assessment that pose greater…
ERIC Educational Resources Information Center
Peterlin, Primoz
2010-01-01
Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…
Statistical analysis of relative labeled mass spectrometry data from complex samples using ANOVA
Oberg, Ann L.; Mahoney, Douglas W.; Eckel-Passow, Jeanette E.; Malone, Christopher J.; Wolfinger, Russell D.; Hill, Elizabeth G.; Cooper, Leslie T.; Onuma, Oyere K.; Spiro, Craig; Therneau, Terry M.; Bergen, H. Robert
2008-01-01
Statistical tools enable unified analysis of data from multiple global proteomic experiments, producing unbiased estimates of normalization terms despite the missing data problem inherent in these studies. The modeling approach, implementation and useful visualization tools are demonstrated via case study of complex biological samples assessed using the iTRAQ™ relative labeling protocol. PMID:18173221
Bayesian Statistical Analysis Applied to NAA Data for Neutron Flux Spectrum Determination
NASA Astrophysics Data System (ADS)
Chiesa, D.; Previtali, E.; Sisti, M.
2014-04-01
In this paper, we present a statistical method, based on Bayesian statistics, to evaluate the neutron flux spectrum from the activation data of different isotopes. The experimental data were acquired during a neutron activation analysis (NAA) experiment [A. Borio di Tigliole et al., Absolute flux measurement by NAA at the Pavia University TRIGA Mark II reactor facilities, ENC 2012 - Transactions Research Reactors, ISBN 978-92-95064-14-0, 22 (2012)] performed at the TRIGA Mark II reactor of Pavia University (Italy). In order to evaluate the neutron flux spectrum, subdivided into energy groups, we must solve a system of linear equations containing the grouped cross sections and the activation rate data. We solve this problem with Bayesian statistical analysis, including the uncertainties of the coefficients and the a priori information about the neutron flux. A program for the analysis of Bayesian hierarchical models, based on Markov Chain Monte Carlo (MCMC) simulations, is used to define the statistical model of the problem and solve it. The energy group fluxes and their uncertainties are then determined with great accuracy and the correlations between the groups are analyzed. Finally, the dependence of the results on the prior distribution choice and on the group cross section data is investigated to confirm the reliability of the analysis.
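A one-energy-group toy version of this inverse problem shows the Bayesian mechanics: an activation rate R = σφ is measured with Gaussian noise, and a Gaussian prior on the flux φ is updated analytically (the conjugate normal-normal case). The full analysis uses MCMC over many coupled groups; all numbers below are made up for illustration.

```python
# Toy one-group Bayesian flux evaluation: conjugate normal-normal update.
# All values are hypothetical, chosen only to illustrate the update.

sigma = 2.0e-24        # effective cross section (cm^2), hypothetical
r_meas = 1.0e-10       # measured activation rate, hypothetical
r_sd = 1.0e-11         # measurement standard deviation

phi_prior_mean = 4.0e13   # a priori flux estimate (n cm^-2 s^-1)
phi_prior_sd = 2.0e13

# The likelihood R ~ N(sigma*phi, r_sd) is equivalent to an observation
# of phi itself with mean r_meas/sigma and sd r_sd/sigma.
obs_mean = r_meas / sigma
obs_sd = r_sd / sigma

# Conjugate update: precision-weighted average of prior and observation.
w_prior = 1.0 / phi_prior_sd ** 2
w_obs = 1.0 / obs_sd ** 2
phi_post_mean = (w_prior * phi_prior_mean + w_obs * obs_mean) / (w_prior + w_obs)
phi_post_sd = (w_prior + w_obs) ** -0.5

print(f"{phi_post_mean:.3e}", f"{phi_post_sd:.3e}")
```

The posterior pulls the prior toward the measurement in proportion to their precisions, and its standard deviation is smaller than either input; with many groups and correlated cross sections, MCMC replaces this closed-form update.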
Statistical analysis of the strength and lifetime under tension of crystalline polymeric solids
NASA Astrophysics Data System (ADS)
Li, C. Y.; Nitta, K. H.
2015-06-01
The ductile fracture behavior under uniaxial tension of melt-crystallized isotactic polypropylene specimens at room temperature was investigated from a statistical point of view. Each tensile test was performed more than one hundred times and statistical data for the breaking point were obtained under each tensile condition. The probability distribution curves of the fracture time and strength approximately followed Gaussian statistics at lower tensile speeds, but changed to a Weibull function at higher-speed tests. Additionally, with increasing tensile speed the mean and standard deviation of the fracture time decreased linearly. The toughness, which is the total area under the stress-strain curves, was found to be independent of the tensile conditions, indicating that fracture toughness is a criterion for fracture under tension.
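The Gaussian-versus-Weibull distinction drawn above can be illustrated by fitting a Weibull model to simulated fracture times via median-rank regression (a Weibull plot), which recovers the shape k and scale λ from the linear relation ln(-ln(1-F)) = k·ln(t) - k·ln(λ). The data are synthetic, generated from a known Weibull distribution, not the paper's measurements.

```python
import math
import random

# Median-rank regression fit of a Weibull distribution to synthetic
# "fracture time" data generated from known parameters.

def weibull_rank_regression(times):
    ts = sorted(times)
    n = len(ts)
    xs, ys = [], []
    for i, t in enumerate(ts, start=1):
        f = (i - 0.3) / (n + 0.4)            # median-rank estimate of F(t)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    k = slope
    lam = math.exp(-intercept / k)
    return k, lam

random.seed(3)
k_true, lam_true = 2.0, 10.0
# Inverse-CDF sampling: t = lam * (-ln U)^(1/k)
data = [lam_true * (-math.log(random.random())) ** (1.0 / k_true)
        for _ in range(5000)]

k_hat, lam_hat = weibull_rank_regression(data)
print(round(k_hat, 2), round(lam_hat, 2))  # near (2, 10)
```

Repeating such fits across tensile speeds, and comparing them with Gaussian fits, is one way to quantify the crossover in the distribution of fracture times that the abstract reports.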
Lubetzky-Vilnai, Anat; Ciol, Marcia; McCoy, Sarah Westcott
2014-01-01
Deriving clinical prediction rules (CPRs) to identify specific characteristics of patients who would likely respond to certain interventions has become a research priority in physical rehabilitation. Understanding the appropriate statistical principles and methods of analyses underlying the derivation of CPRs is important for future rehabilitation research and clinical applications. In this article, we aimed to provide an overview of statistical techniques used for the derivation of CPRs to predict success following physical therapy interventions and to generate recommendations for improvements in CPR derivation research and statistical analysis in rehabilitation. We have summarized the current state of CPR intervention-related research by reviewing 26 studies. A common technique was found in most studies and included univariate association of factors with treatment success, stepwise logistic regression to determine the most parsimonious set of predictors for success, and calculation of accuracy statistics (focusing on positive likelihood ratios). We identified several shortcomings related to an inadequate ratio of events to the number of predictors, lack of standardization regarding acceptable interobserver reliability of predictors, questionable handling of predictors, including reliance on univariate analysis and early categorization, and failure to account for dependence and collinearity of predictors in multivariable model construction. Interpretation of the derived CPRs was found to be difficult due to lack of precision of estimates and paradoxical findings when a subset of the predictors yielded a larger positive likelihood ratio than did the full set of predictors. Finally, we make recommendations regarding how to strengthen the use of statistical principles and methods to create consistency across rehabilitation research for CPR derivations.
Olavarría, Verónica V; Arima, Hisatomi; Anderson, Craig S; Brunser, Alejandro; Muñoz-Venturelli, Paula; Billot, Laurent; Lavados, Pablo M
2017-02-01
Background The HEADPOST Pilot is a proof-of-concept, open, prospective, multicenter, international, cluster randomized, phase IIb controlled trial, with masked outcome assessment. The trial will test if lying flat head position initiated in patients within 12 h of onset of acute ischemic stroke involving the anterior circulation increases cerebral blood flow in the middle cerebral arteries, as measured by transcranial Doppler. The study will also assess the safety and feasibility of patients lying flat for ≥24 h. The trial was conducted in centers in three countries, with ability to perform early transcranial Doppler. A feature of this trial was that patients were randomized to a certain position according to the month of admission to hospital. Objective To outline in detail the predetermined statistical analysis plan for HEADPOST Pilot study. Methods All data collected by participating researchers will be reviewed and formally assessed. Information pertaining to the baseline characteristics of patients, their process of care, and the delivery of treatments will be classified, and for each item, appropriate descriptive statistical analyses are planned with comparisons made between randomized groups. For the outcomes, statistical comparisons to be made between groups are planned and described. Results This statistical analysis plan was developed for the analysis of the results of the HEADPOST Pilot study to be transparent, available, verifiable, and predetermined before data lock. Conclusions We have developed a statistical analysis plan for the HEADPOST Pilot study which is to be followed to avoid analysis bias arising from prior knowledge of the study findings. Trial registration The study is registered under HEADPOST-Pilot, ClinicalTrials.gov Identifier NCT01706094.
Quantitative shape analysis with weighted covariance estimates for increased statistical efficiency
2013-01-01
Background The introduction and statistical formalisation of landmark-based methods for analysing biological shape has made a major impact on comparative morphometric analyses. However, a satisfactory solution for including information from 2D/3D shapes represented by ‘semi-landmarks’ alongside well-defined landmarks into the analyses is still missing. Also, there has not been an integration of a statistical treatment of measurement error in the current approaches. Results We propose a procedure based upon the description of landmarks with measurement covariance, which extends statistical linear modelling processes to semi-landmarks for further analysis. Our formulation is based upon a self consistent approach to the construction of likelihood-based parameter estimation and includes corrections for parameter bias, induced by the degrees of freedom within the linear model. The method has been implemented and tested on measurements from 2D fly wing, 2D mouse mandible and 3D mouse skull data. We use these data to explore possible advantages and disadvantages over the use of standard Procrustes/PCA analysis via a combination of Monte-Carlo studies and quantitative statistical tests. In the process we show how appropriate weighting provides not only greater stability but also more efficient use of the available landmark data. The set of new landmarks generated in our procedure (‘ghost points’) can then be used in any further downstream statistical analysis. Conclusions Our approach provides a consistent way of including different forms of landmarks into an analysis and reduces instabilities due to poorly defined points. Our results suggest that the method has the potential to be utilised for the analysis of 2D/3D data, and in particular, for the inclusion of information from surfaces represented by multiple landmark points. PMID:23548043
NASA Technical Reports Server (NTRS)
Baker, K. B.; Sturrock, P. A.
1975-01-01
The question of whether pulsars form a single group or whether pulsars come in two or more different groups is discussed. It is proposed that such groups might be related to several factors such as the initial creation of the neutron star, or the orientation of the magnetic field axis with the spin axis. Various statistical models are examined.
NASA Astrophysics Data System (ADS)
Barré, Anthony; Suard, Frédéric; Gérard, Mathias; Montaru, Maxime; Riu, Delphine
2014-01-01
This paper describes the statistical analysis of data recorded during electric vehicle use to characterize electrical battery ageing. These data permit a traditional battery ageing investigation based on the evolution of capacity fade and resistance rise. The measured variables are examined in order to explain the correlation between battery ageing and operating conditions during the experiments. Such a study enables us to identify the main ageing factors. Detailed statistical dependency explorations then identify the factors responsible for battery ageing phenomena, and predictive battery ageing models are built from this approach. The results thereby demonstrate and quantify a relationship between the measured variables and global observations of battery ageing, and also allow accurate battery ageing diagnosis through the predictive models.
A statistical analysis of the beam position measurement in the Los Alamos proton storage ring
Kolski, Jeff S; Macek, Robert J; Mc Crady, Rodney C
2010-01-01
The beam position monitors (BPMs) are the main diagnostic in the Los Alamos Proton Storage Ring (PSR). They are used in several applications during operations and tuning, including orbit bumps and measurements of the tune, closed orbit (CO), and injection offset. However, the BPM data acquisition system makes use of older technologies, such as matrix switches, that could lead to faulty measurements. This is the first statistical study of PSR BPM performance using BPM measurements. In this study, 101 consecutive CO measurements are analyzed. Reported here are the results of the statistical analysis, the tune and CO measurement spreads, the BPM single-turn measurement error, and examples of the observed data acquisition errors.
NASA Technical Reports Server (NTRS)
Koch, Steven E.; Golus, Robert E.
1988-01-01
This paper presents a statistical analysis of the characteristics of the wavelike activity that occurred over the north-central United States on July 11-12, 1981, using data from the Cooperative Convective Precipitation Experiment in Montana. In particular, two distinct wave episodes of about 8-h duration within a longer (33 h) period of wave activity were studied in detail. It is demonstrated that the observed phenomena display features consistent with those of mesoscale gravity waves. The principles of statistical methods used to detect and track mesoscale gravity waves are discussed together with their limitations.
An Application of Multivariate Statistical Analysis for Query-Driven Visualization
Gosink, Luke J.; Garth, Christoph; Anderson, John C.; Bethel, E. Wes; Joy, Kenneth I.
2010-03-01
Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.
NASA Astrophysics Data System (ADS)
Piersanti, Mirko; Materassi, Massimo; Spogli, Luca; Cicone, Antonio; Alberti, Tommaso
2016-04-01
Highly irregular fluctuations of the power of trans-ionospheric GNSS signals, namely radio power scintillation, are, at least to a large extent, the effect of ionospheric plasma turbulence, a by-product of the non-linear and non-stationary evolution of the plasma fields defining the Earth's upper atmosphere. One could expect the ionospheric turbulence characteristics of inter-scale coupling, local randomness and high time variability to be inherited by the scintillation on radio signals crossing the medium. On this basis, remote sensing of local features of the turbulent plasma should be feasible by studying radio scintillation. The dependence of the statistical properties of the medium fluctuations on the space- and time-scale is the distinctive character of intermittent turbulent media. In this paper, a multi-scale statistical analysis of some samples of GPS radio scintillation is presented: the idea is that assessing how the statistics of signal fluctuations vary with time scale under different helio-geophysical conditions will help in understanding the corresponding multi-scale statistics of the turbulent medium causing that scintillation. In particular, two techniques are tested as multi-scale decomposition schemes of the signals: discrete wavelet analysis and Empirical Mode Decomposition. The results of the two analyses are discussed and compared, highlighting the benefits and limits of each scheme, also under suitably different helio-geophysical conditions.
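The flavor of such a multi-scale statistical analysis can be reproduced with the simplest discrete wavelet, the Haar cascade: split the signal into detail coefficients scale by scale and compute moments at each scale. The sketch below is illustrative only; the synthetic signal and the choice of four levels are assumptions, and the paper's actual schemes (a general discrete wavelet analysis and Empirical Mode Decomposition) are richer than this.

```python
import math
import random

def haar_step(signal):
    """One Haar step: pairwise means (approximation) and half-differences
    (detail coefficients) at half the resolution."""
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / 2.0 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2.0 for i in range(half)]
    return approx, detail

def scale_statistics(signal, levels=4):
    """Variance and excess kurtosis of the detail coefficients per scale;
    scale-dependent kurtosis is a standard signature of intermittency."""
    stats = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        n = len(detail)
        mean = sum(detail) / n
        var = sum((d - mean) ** 2 for d in detail) / n
        if var > 0:
            kurt = (sum((d - mean) ** 4 for d in detail) / n) / var ** 2 - 3.0
        else:
            kurt = 0.0
        stats.append({"n": n, "variance": var, "kurtosis": kurt})
    return stats

# A synthetic stand-in signal: slow oscillation plus noise (illustrative).
random.seed(1)
sig = [math.sin(0.1 * t) + 0.2 * random.gauss(0.0, 1.0) for t in range(256)]
per_scale = scale_statistics(sig)
```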
Statistical object data analysis of taxonomic trees from human microbiome data.
La Rosa, Patricio S; Shands, Berkley; Deych, Elena; Zhou, Yanjiao; Sodergren, Erica; Weinstock, George; Shannon, William D
2012-01-01
Human microbiome research characterizes the microbial content of samples from human habitats to learn how interactions between bacteria and their host might impact human health. In this work a novel parametric statistical inference method based on object-oriented data analysis (OODA) for analyzing Human Microbiome Project (HMP) data is proposed. OODA is an emerging area of statistical inference where the goal is to apply statistical methods to objects such as functions, images, and graphs or trees. The data objects that pertain to this work are taxonomic trees of bacteria built from analysis of 16S rRNA gene sequences (e.g., using RDP); there is one such object for each biological sample analyzed. Our goal is to model and formally compare a set of trees. The contribution of our work is threefold: first, a weighted tree structure to analyze RDP data is introduced; second, using a probability measure to model a set of taxonomic trees, we introduce an approximate MLE procedure for estimating model parameters and we derive LRT statistics for comparing the distributions of two metagenomic populations; and third, the Jumpstart HMP data are analyzed using the proposed model, providing novel insights and suggesting future directions of analysis.
Statistical models for the analysis and design of digital polymerase chain reaction (dPCR) experiments
Dorazio, Robert; Hunter, Margaret
2015-01-01
Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
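The link between the Poisson partition model and the complementary log-log GLM described above can be made concrete: if each partition of volume v goes positive with probability p = 1 - exp(-lam * v), then cloglog(p) = log(lam) + log(v), so log-volume enters as the offset. For a single sample, inverting the link at the observed proportion gives the maximum-likelihood estimate directly, as in this sketch (the counts and partition volume are invented; a covariate model would be fit with standard GLM software, as the abstract notes).

```python
import math

def dpcr_concentration(positives, total, volume):
    """MLE of target concentration from one dPCR sample.

    Poisson partitioning gives p = 1 - exp(-lam * v) for the probability
    that a partition is positive, i.e. cloglog(p) = log(lam) + log(v):
    a binomial GLM with complementary log-log link and log-volume offset.
    Inverting the link at the observed positive fraction yields the
    estimate; the standard error comes from the delta method.
    """
    p_hat = positives / total
    lam = -math.log(1.0 - p_hat) / volume
    se = math.sqrt(p_hat / (total * (1.0 - p_hat))) / volume
    return lam, se

# Hypothetical run: 5000 positives out of 20000 partitions of 0.85 nL
# (volume expressed in microliters, so lam is in copies per microliter).
lam, se = dpcr_concentration(5000, 20000, 0.00085)
```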
Shadish, William R; Hedges, Larry V; Pustejovsky, James E
2014-04-01
This article presents a d-statistic for single-case designs (SCDs) that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments, and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. The statistic requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail; we also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between-case variance to total variance (between-case plus within-case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs.
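The meta-analytic step mentioned above, pooling study effect sizes under fixed- and random-effects models, can be sketched without the R syntax in the appendix. The fragment below uses the standard inverse-variance weights and the DerSimonian-Laird heterogeneity estimate; it is an illustrative stand-in, not the authors' macro.

```python
def meta_analyze(effects, variances):
    """Fixed- and random-effects pooled estimates for k study effect sizes.

    Uses inverse-variance weights for the fixed effect and the
    DerSimonian-Laird moment estimate of between-study variance (tau^2)
    for the random-effects weights.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (v + tau2) for v in variances]
    random_mean = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return fixed, random_mean, tau2
```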
Belianinov, Alex; Panchapakesan, G.; Lin, Wenzhi; Sales, Brian C.; Sefat, Athena Safa; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.
2014-12-02
Atomic level spatial variability of electronic structure in the Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near-neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of the calculated density of states of chemically inhomogeneous FeTe1-xSex model structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data, including separation of atomic identities, proximity, and local configuration effects, and can be universally applicable to chemically and electronically inhomogeneous surfaces.
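The clustering step, grouping locations whose spectra behave alike, can be mimicked with plain k-means over feature vectors. The toy 2-D points below stand in for per-pixel spectral features; this generic sketch is not the authors' pipeline, which involves further multivariate dimensionality reduction.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: partition feature vectors into k groups by
    nearest (squared Euclidean) centroid, then recompute centroids."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[j].append(p)
        # Recompute centroids; keep the old one if a group went empty.
        centers = [[sum(dim) / len(g) for dim in zip(*g)] if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers, groups

# Two well-separated "electronic behaviors" (hypothetical feature vectors):
pts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
       [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers, groups = kmeans(pts, 2)
```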
Statistical shape analysis using 3D Poisson equation--A quantitatively validated approach.
Gao, Yi; Bouix, Sylvain
2016-05-01
Statistical shape analysis has been an important area of research with applications in biology, anatomy, neuroscience, agriculture, paleontology, etc. Unfortunately, the proposed methods are rarely quantitatively evaluated and, as shown in recent studies, when they are evaluated, significant discrepancies exist in their outputs. In this work, we concentrate on the problem of finding the consistent location of deformation between two populations of shapes. We propose a new shape analysis algorithm along with a framework to perform a quantitative evaluation of its performance. Specifically, the algorithm constructs a Signed Poisson Map (SPoM) by solving two Poisson equations on the volumetric shapes of arbitrary topology, and statistical analysis is then carried out on the SPoMs. The method is quantitatively evaluated on synthetic shapes and applied to real shape data sets in brain structures.
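The building block of the method, solving a Poisson equation on a volumetric shape, can be illustrated in two dimensions with plain Jacobi iteration. The authors work in 3-D and combine two such solves into the Signed Poisson Map; this fragment only shows a basic solve of -laplace(u) = 1 on an assumed 7x7 square mask with zero boundary values.

```python
def solve_poisson(mask, rhs=1.0, iters=500):
    """Jacobi iteration for -laplace(u) = rhs on a binary shape mask,
    with u = 0 outside the shape (grid spacing h = 1)."""
    rows, cols = len(mask), len(mask[0])
    u = [[0.0] * cols for _ in range(rows)]
    for _ in range(iters):
        nxt = [[0.0] * cols for _ in range(rows)]
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if mask[i][j]:
                    # Five-point stencil: average of neighbors plus h^2 * rhs / 4.
                    nxt[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                        + u[i][j - 1] + u[i][j + 1] + rhs)
        u = nxt
    return u

# A small square "shape"; the solution peaks at the shape's center,
# which is what makes Poisson solutions useful shape descriptors.
mask = [[1 if 0 < i < 6 and 0 < j < 6 else 0 for j in range(7)]
        for i in range(7)]
u = solve_poisson(mask)
```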
Statistical Analysis of Human Body Movement and Group Interactions in Response to Music
NASA Astrophysics Data System (ADS)
Desmet, Frank; Leman, Marc; Lesaffre, Micheline; de Bruyn, Leen
Quantification of time series that relate to physiological data is challenging for empirical music research. Up to now, most studies have focused on time-dependent responses of individual subjects in controlled environments. However, little is known about time-dependent responses of between-subject interactions in an ecological context. This paper provides new findings on the statistical analysis of group synchronicity in response to musical stimuli. Different statistical techniques were applied to time-dependent data obtained from an experiment on embodied listening in individual and group settings, and analyses of inter-group synchronicity are described. Dynamic Time Warping (DTW) and the Cross-Correlation Function (CCF) were found to be valid methods to estimate group coherence of the resulting movements. It was found that synchronicity of movements between individuals (human-human interactions) increases significantly in the social context. Moreover, Analysis of Variance (ANOVA) revealed that the type of music is the predominant factor in both the individual and the social context.
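Of the two coherence measures found valid, Dynamic Time Warping is easy to state compactly: align two movement time series by a minimal-cost warping path, so small timing offsets between subjects do not inflate the distance. A minimal sketch of textbook DTW (not the authors' exact implementation, and without the windowing constraints often used in practice):

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D time series,
    via the standard dynamic-programming recurrence."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

A series compared to a time-stretched copy of itself scores zero, which is exactly the tolerance to tempo differences that makes DTW attractive for between-subject movement data.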
On the Interpretation of Running Trends as Summary Statistics for Time Series Analysis
NASA Astrophysics Data System (ADS)
Vigo, Isabel M.; Trottini, Mario; Belda, Santiago
2016-04-01
In recent years, running trends analysis (RTA) has been widely used in climate applied research as summary statistics for time series analysis. There is no doubt that RTA might be a useful descriptive tool, but despite its general use in applied research, precisely what it reveals about the underlying time series is unclear and, as a result, its interpretation is unclear too. This work contributes to such interpretation in two ways: 1) an explicit formula is obtained for the set of time series with a given series of running trends, making it possible to show that running trends, alone, perform very poorly as summary statistics for time series analysis; and 2) an equivalence is established between RTA and the estimation of a (possibly nonlinear) trend component of the underlying time series using a weighted moving average filter. Such equivalence provides a solid ground for RTA implementation and interpretation/validation.
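For concreteness, a running trend is just the OLS slope of the series against the time index within each sliding window; the sketch below (window length and series invented) computes the sequence of running trends that the paper treats as summary statistics.

```python
def running_trends(y, window):
    """OLS slope of y against the time index in each sliding window."""
    slopes = []
    t = list(range(window))
    t_mean = sum(t) / window
    denom = sum((ti - t_mean) ** 2 for ti in t)
    for start in range(len(y) - window + 1):
        seg = y[start:start + window]
        y_mean = sum(seg) / window
        num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, seg))
        slopes.append(num / denom)
    return slopes
```

The equivalence result in the abstract says this sequence of slopes can also be obtained by applying a weighted moving-average filter to the series, which is what grounds its interpretation as a (possibly nonlinear) trend estimate.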
Advanced statistical methods for improved data analysis of NASA astrophysics missions
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.
1992-01-01
The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.
Effect of the absolute statistic on gene-sampling gene-set analysis methods.
Nam, Dougu
2015-03-02
Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of the absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating characteristic (ROC) curves for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
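A gene-sampling test built on the absolute gene statistic can be sketched in a few lines: score a set by the mean |statistic| of its genes, and form the null by repeatedly drawing random gene sets of the same size. This is a generic illustration of the approach, not the specific methods benchmarked in the paper; the gene names and statistic values below are invented.

```python
import random

def gene_set_pvalue(gene_stats, set_genes, n_perm=500, seed=0):
    """Gene-sampling test with an ABSOLUTE gene statistic.

    gene_stats: dict mapping gene -> per-gene statistic (e.g. a t-score).
    The set score is the mean |statistic| over the set; the null
    distribution comes from random gene sets of the same size.
    """
    rng = random.Random(seed)
    genes = list(gene_stats)
    size = len(set_genes)
    observed = sum(abs(gene_stats[g]) for g in set_genes) / size
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, size)
        null = sum(abs(gene_stats[g]) for g in sample) / size
        if null >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# 100 background genes with small statistics, plus a 5-gene set with
# large statistics of mixed sign (captured because of the absolute value).
stats = {"g%d" % i: 0.1 for i in range(100)}
for i in range(5):
    stats["s%d" % i] = 5.0 if i % 2 == 0 else -5.0
p = gene_set_pvalue(stats, ["s%d" % i for i in range(5)])
```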
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between them. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analyses with a small number of component studies. The use of REML also requires iterative estimation of the parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of the U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from the MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. The easy implementation of all three methods is illustrated by applying them to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on the U-statistic for testing the significance of between-study heterogeneity and for extending the work to the meta-regression setting.
MacKinnon, David P; Pirlott, Angela G
2015-02-01
Statistical mediation methods provide valuable information about underlying mediating psychological processes, but the ability to infer that the mediator variable causes the outcome variable is more complex than widely known. Researchers have recently emphasized how violating assumptions about confounder bias severely limits causal inference of the mediator to dependent variable relation. Our article describes and addresses these limitations by drawing on new statistical developments in causal mediation analysis. We first review the assumptions underlying causal inference and discuss three ways to examine the effects of confounder bias when assumptions are violated. We then describe four approaches to address the influence of confounding variables and enhance causal inference, including comprehensive structural equation models, instrumental variable methods, principal stratification, and inverse probability weighting. Our goal is to further the adoption of statistical methods to enhance causal inference in mediation studies.
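As a point of reference for the mediation model under discussion, the classical product-of-coefficients estimate can be computed from two regressions. The sketch below is the textbook statistical estimator only; as the article stresses, reading a*b causally requires the no-unmeasured-confounding assumptions that the four listed approaches try to relax. The toy data are invented.

```python
def mediation_ab(x, m, y):
    """Product-of-coefficients mediation estimate for a single mediator.

    a: OLS slope of M on X; b: OLS slope of Y on M adjusting for X
    (closed-form two-predictor regression). A causal reading of a*b
    rests on no-unmeasured-confounding assumptions.
    """
    n = len(x)
    def centered(v):
        mu = sum(v) / n
        return [vi - mu for vi in v]
    xc, mc, yc = centered(x), centered(m), centered(y)
    sxx = sum(v * v for v in xc)
    smm = sum(v * v for v in mc)
    sxm = sum(p * q for p, q in zip(xc, mc))
    smy = sum(p * q for p, q in zip(mc, yc))
    sxy = sum(p * q for p, q in zip(xc, yc))
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a, b, a * b

# Toy data: M tracks X with alternating disturbance; Y is exactly 3M + 1.
x = list(range(20))
m = [2 * xi + (-1) ** i for i, xi in enumerate(x)]
y = [3 * mi + 1 for mi in m]
a, b, ab = mediation_ab(x, m, y)
```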
Statistical Analysis of Current Sheets in Three-dimensional Magnetohydrodynamic Turbulence
NASA Astrophysics Data System (ADS)
Zhdankin, Vladimir; Uzdensky, Dmitri A.; Perez, Jean C.; Boldyrev, Stanislav
2013-07-01
We develop a framework for studying the statistical properties of current sheets in numerical simulations of magnetohydrodynamic (MHD) turbulence with a strong guide field, as modeled by reduced MHD. We describe an algorithm that identifies current sheets in a simulation snapshot and then determines their geometrical properties (including length, width, and thickness) and intensities (peak current density and total energy dissipation rate). We then apply this procedure to simulations of reduced MHD and perform a statistical analysis on the obtained population of current sheets. We evaluate the role of reconnection by separately studying the populations of current sheets which contain magnetic X-points and those which do not. We find that the statistical properties of the two populations are different in general. We compare the scaling of these properties to phenomenological predictions obtained for the inertial range of MHD turbulence. Finally, we test whether the reconnecting current sheets are consistent with the Sweet-Parker model.
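The identification step of such an algorithm, before any geometry or X-point analysis, reduces to finding connected regions where |j| exceeds a threshold and recording each region's size and peak current density. A deliberately simplified 2-D sketch (the paper works with 3-D reduced-MHD snapshots and extracts far more detailed geometry and dissipation measures):

```python
def find_current_sheets(j, threshold):
    """Label connected regions where |current density| exceeds the
    threshold (4-connectivity, depth-first flood fill), returning
    each region's cell count and peak |j|."""
    rows, cols = len(j), len(j[0])
    seen = [[False] * cols for _ in range(rows)]
    sheets = []
    for i in range(rows):
        for k in range(cols):
            if abs(j[i][k]) > threshold and not seen[i][k]:
                stack, cells, peak = [(i, k)], 0, 0.0
                seen[i][k] = True
                while stack:
                    a, b = stack.pop()
                    cells += 1
                    peak = max(peak, abs(j[a][b]))
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        na, nb = a + da, b + db
                        if (0 <= na < rows and 0 <= nb < cols
                                and abs(j[na][nb]) > threshold
                                and not seen[na][nb]):
                            seen[na][nb] = True
                            stack.append((na, nb))
                sheets.append({"cells": cells, "peak": peak})
    return sheets
```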
Statistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech
NASA Astrophysics Data System (ADS)
Přibil, J.; Přibilová, A.
2009-01-01
The paper addresses reflection of microintonation and spectral properties in male and female acted emotional speech. The microintonation component of speech melody is analyzed with regard to its spectral and statistical parameters. According to psychological research on emotional speech, different emotions are accompanied by different spectral noise. We control its amount by spectral flatness, according to which high-frequency noise is mixed into voiced frames during cepstral speech synthesis. Our experiments are aimed at statistical analysis of cepstral coefficient values and ranges of spectral flatness in three emotions (joy, sadness, anger), and a neutral state for comparison. Calculated histograms of the spectral flatness distribution are visually compared and modelled by a Gamma probability distribution. Histograms of the cepstral coefficient distribution are evaluated and compared using skewness and kurtosis. The statistical results show good agreement between male and female voices for all emotional states portrayed by several Czech and Slovak professional actors.
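Spectral flatness, the quantity used above to control the amount of high-frequency noise, is the ratio of the geometric to the arithmetic mean of the power spectrum: 1.0 for white noise, near 0 for a pure tone. A minimal sketch (assumes strictly positive spectral values; the example spectra are invented):

```python
import math

def spectral_flatness(power):
    """Spectral flatness: geometric mean / arithmetic mean of a power
    spectrum. Computed via log-domain averaging for numerical stability."""
    n = len(power)
    log_gm = sum(math.log(p) for p in power) / n
    am = sum(power) / n
    return math.exp(log_gm) / am
```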