On an Additive Semigraphoid Model for Statistical Networks With Application to Pathway Analysis
Li, Bing; Chun, Hyonho; Zhao, Hongyu
2014-01-01
We introduce a nonparametric method for estimating non-Gaussian graphical models based on a new statistical relation called additive conditional independence, which is a three-way relation among random vectors that resembles the logical structure of conditional independence. Additive conditional independence allows us to use a one-dimensional kernel regardless of the dimension of the graph, which not only avoids the curse of dimensionality but also simplifies computation. It also gives rise to a parallel structure to the Gaussian graphical model that replaces the precision matrix by an additive precision operator. The estimators derived from additive conditional independence cover the recently introduced nonparanormal graphical model as a special case, but outperform it when the Gaussian copula assumption is violated. We compare the new method with existing ones by simulations and in genetic pathway analysis. PMID:26401064
Mathematical and statistical analysis
NASA Technical Reports Server (NTRS)
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Deconstructing Statistical Analysis
ERIC Educational Resources Information Center
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article or making a new product or service legitimate needs to be monitored and questioned for accuracy. 1) The more complicated the statistical analysis and research, the fewer the learned readers who can understand it. This adds a…
Statistical log analysis made practical
Mitchell, W.K.; Nelson, R.J.
1991-06-01
This paper discusses the advantages of a statistical approach to log analysis. Statistical techniques use inverse methods to calculate formation parameters. The use of statistical techniques has been limited, however, by the complexity of the mathematics and lengthy computer time required to minimize traditionally used nonlinear equations.
Hahn, A.A.
1994-11-01
The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques.
Weak additivity principle for current statistics in d dimensions.
Pérez-Espigares, C; Garrido, P L; Hurtado, P I
2016-04-01
The additivity principle (AP) allows one to compute the current distribution in many one-dimensional nonequilibrium systems. Here we extend this conjecture to general d-dimensional driven diffusive systems, and validate its predictions against both numerical simulations of rare events and microscopic exact calculations of three paradigmatic models of diffusive transport in d=2. Crucially, the existence of a structured current vector field at the fluctuating level, coupled to the local mobility, turns out to be essential to understand current statistics in d>1. We prove that, when compared to the straightforward extension of the AP to high d, the so-called weak AP always yields a better minimizer of the macroscopic fluctuation theory action for current statistics.
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
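The "Normal Distribution Estimates" calculation described above (finding the value that corresponds to a cumulative probability, given a mean and standard deviation) can be sketched outside a spreadsheet. The following is a minimal illustration in Python, not the toolset's own code; the function name `normal_value_at` and the input numbers are hypothetical.

```python
from statistics import NormalDist


def normal_value_at(cum_prob, mean, std_dev):
    """Return x such that P(X <= x) = cum_prob for X ~ Normal(mean, std_dev)."""
    return NormalDist(mu=mean, sigma=std_dev).inv_cdf(cum_prob)


# Hypothetical inputs: sample mean 100, sample standard deviation 15.
median = normal_value_at(0.5, 100.0, 15.0)    # value at 50% cumulative probability
upper = normal_value_at(0.975, 100.0, 15.0)   # value at 97.5% cumulative probability
print(median, upper)
```

The inverse cumulative distribution function does the work here; the same call with a different probability gives any other percentile of the fitted normal distribution.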
Statistical Analysis of RNA Backbone
Hershkovitz, Eli; Sapiro, Guillermo; Tannenbaum, Allen; Williams, Loren Dean
2009-01-01
Local conformation is an important determinant of RNA catalysis and binding. The analysis of RNA conformation is particularly difficult due to the large number of degrees of freedom (torsion angles) per residue. Proteins, by comparison, have many fewer degrees of freedom per residue. In this work, we use and extend classical tools from statistics and signal processing to search for clusters in RNA conformational space. Results are reported both for scalar analysis, where each torsion angle is separately studied, and for vectorial analysis, where several angles are simultaneously clustered. Adapting techniques from vector quantization and clustering to the RNA structure, we find torsion angle clusters and RNA conformational motifs. We validate the technique using well-known conformational motifs, showing that the simultaneous study of the total torsion angle space leads to results consistent with known motifs reported in the literature and also to the finding of new ones. PMID:17048391
Statistical Analysis of Tsunami Variability
NASA Astrophysics Data System (ADS)
Zolezzi, Francesca; Del Giudice, Tania; Traverso, Chiara; Valfrè, Giulio; Poggi, Pamela; Parker, Eric J.
2010-05-01
similar to that seen in ground motion attenuation correlations used for seismic hazard assessment. The second issue was intra-event variability. This refers to the differences in tsunami wave run-up along a section of coast during a single event. Intra-event variability was investigated directly using field observations. The tsunami events used in the statistical evaluation were selected on the basis of the completeness and reliability of the available data. Tsunamis considered for the analysis included the recent and well-surveyed tsunami of Boxing Day 2004 (Great Indian Ocean Tsunami), Java 2006, Okushiri 1993, Kocaeli 1999, Messina 1908 and a case study of several historic events in Hawaii. Basic statistical analysis was performed on the field observations from these tsunamis. For events with very wide survey regions, the run-up heights were grouped in order to maintain a homogeneous distance from the source. Where more than one survey was available for a given event, the original datasets were kept separate to avoid combining non-homogeneous data. The observed run-up measurements were used to evaluate the minimum, maximum, average, standard deviation and coefficient of variation for each data set. The minimum coefficient of variation was 0.12, measured for the 2004 Boxing Day tsunami at Nias Island (7 data), while the maximum was 0.98 for the Okushiri 1993 event (93 data). The average coefficient of variation is of the order of 0.45.
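The per-dataset summary described above (minimum, maximum, average, standard deviation and coefficient of variation of run-up heights) can be sketched as follows. This is an illustrative example only: the function name `runup_summary` and the run-up values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not say which form was used.

```python
from statistics import mean, stdev


def runup_summary(heights_m):
    """Summarize one survey's run-up heights (metres)."""
    avg = mean(heights_m)
    sd = stdev(heights_m)  # sample standard deviation (n - 1); an assumption
    return {
        "n": len(heights_m),
        "min": min(heights_m),
        "max": max(heights_m),
        "mean": avg,
        "std": sd,
        "cv": sd / avg,  # coefficient of variation = std / mean
    }


# Hypothetical run-up heights for a single survey, not data from the study.
summary = runup_summary([2.0, 4.0, 4.0, 6.0])
print(summary)
```

Because the coefficient of variation is dimensionless, it allows surveys with very different mean run-up heights to be compared on the same scale.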
Asymptotic modal analysis and statistical energy analysis
NASA Technical Reports Server (NTRS)
Dowell, Earl H.
1992-01-01
Asymptotic Modal Analysis (AMA) is a method which is used to model linear dynamical systems with many participating modes. The AMA method was originally developed to show the relationship between statistical energy analysis (SEA) and classical modal analysis (CMA). In the limit of a large number of modes of a vibrating system, the classical modal analysis result can be shown to be equivalent to the statistical energy analysis result. As the CMA result evolves into the SEA result, a number of systematic assumptions are made. Most of these assumptions are based upon the supposition that the number of modes approaches infinity. It is for this reason that the term 'asymptotic' is used. AMA is the asymptotic result of taking the limit of CMA as the number of modes approaches infinity. AMA refers to any of the intermediate results between CMA and SEA, as well as the SEA result which is derived from CMA. The main advantage of the AMA method is that individual modal characteristics are not required in the model or computations. By contrast, CMA requires that each modal parameter be evaluated at each frequency. In the latter, contributions from each mode are computed and the final answer is obtained by summing over all the modes in the particular band of interest. AMA evaluates modal parameters only at their center frequency and does not sum the individual contributions from each mode in order to obtain a final result. The method is similar to SEA in this respect. However, SEA is only capable of obtaining spatial averages or means, as it is a statistical method. Since AMA is systematically derived from CMA, it can obtain local spatial information as well.
Statistical quality control through overall vibration analysis
NASA Astrophysics Data System (ADS)
Carnero, Mª Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, the quality results of the finished parts under different combinations of process variables are evaluated. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence
Statistical Analysis of Zebrafish Locomotor Response.
Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai
2015-01-01
Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure. PMID:26437184
Statistical analysis of planetary surfaces
NASA Astrophysics Data System (ADS)
Schmidt, Frederic; Landais, Francois; Lovejoy, Shaun
2015-04-01
In the last decades, a huge amount of topographic data has been obtained by several techniques (laser and radar altimetry, DTM…) for different bodies in the solar system, including Earth, Mars, the Moon, etc. In each case, topographic fields exhibit an extremely high variability with details at each scale, from millimeters to thousands of kilometers. This complexity seems to prohibit global descriptions or global topography models. Nevertheless, this topographic complexity is well known to exhibit scaling laws that establish a similarity between scales and permit simpler descriptions and models. Indeed, efficient simulations can be made using the statistical properties of scaling fields (fractals). But realistic simulations of global topographic fields must exhibit multi (not mono) scaling behaviour, reflecting the extreme variability and intermittency observed in real fields that cannot be generated by simple scaling models. A multiscaling theory has been developed in order to model high variability and intermittency. This theory is a good statistical candidate to model the topography field with a limited number of parameters (called the multifractal parameters). In our study, we show that the statistical properties of the Martian topography are accurately reproduced by this model, leading to a new interpretation of geomorphological processes.
Statistical Power in Meta-Analysis
ERIC Educational Resources Information Center
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Asymptotic modal analysis and statistical energy analysis
NASA Technical Reports Server (NTRS)
Dowell, Earl H.
1988-01-01
Statistical Energy Analysis (SEA) is defined by considering the asymptotic limit of Classical Modal Analysis, an approach called Asymptotic Modal Analysis (AMA). The general approach is described for both structural and acoustical systems. The theoretical foundation is presented for structural systems, and experimental verification is presented for a structural plate responding to a random force. Work accomplished subsequent to the grant initiation focusses on the acoustic response of an interior cavity (i.e., an aircraft or spacecraft fuselage) with a portion of the wall vibrating in a large number of structural modes. First results were presented at the ASME Winter Annual Meeting in December, 1987, and accepted for publication in the Journal of Vibration, Acoustics, Stress and Reliability in Design. It is shown that asymptotically as the number of acoustic modes excited becomes large, the pressure level in the cavity becomes uniform except at the cavity boundaries. However, the mean square pressure at the cavity corner, edge and wall is, respectively, 8, 4, and 2 times the value in the cavity interior. Also it is shown that when the portion of the wall which is vibrating is near a cavity corner or edge, the response is significantly higher.
Collecting operational event data for statistical analysis
Atwood, C.L.
1994-09-01
This report gives guidance for collecting operational data to be used for statistical analysis, especially analysis of event counts. It discusses how to define the purpose of the study, the unit (system, component, etc.) to be studied, events to be counted, and demand or exposure time. Examples are given of classification systems for events in the data sources. A checklist summarizes the essential steps in data collection for statistical analysis.
Statistical Survey and Analysis Handbook.
ERIC Educational Resources Information Center
Smith, Kenneth F.
The National Food and Agriculture Council of the Philippines regularly requires rapid feedback data for analysis, which will assist in monitoring programs to improve and increase the production of selected crops by small scale farmers. Since many other development programs in various subject matter areas also require similar statistical…
Statistical Analysis For Nucleus/Nucleus Collisions
NASA Technical Reports Server (NTRS)
Mcguire, Stephen C.
1989-01-01
Report describes use of several statistical techniques to characterize angular distributions of secondary particles emitted in collisions of atomic nuclei in energy range of 24 to 61 GeV per nucleon. Purpose of statistical analysis to determine correlations between intensities of emitted particles and angles confirming existence of quark/gluon plasma.
Explorations in Statistics: The Analysis of Change
ERIC Educational Resources Information Center
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
A Statistical Analysis of Cotton Fiber Properties
NASA Astrophysics Data System (ADS)
Ghosh, Anindya; Das, Subhasis; Majumder, Asha
2016-04-01
This paper reports a statistical analysis of different cotton fiber properties, such as strength, breaking elongation, upper half mean length, length uniformity index, short fiber index, micronaire, reflectance and yellowness, measured from 1200 cotton bales. Univariate, bivariate and multivariate statistical analyses have been used to elicit the interrelationships between the above-mentioned properties, taking them singly, pairwise and jointly, respectively. In the multivariate analysis, all cotton fiber properties are considered simultaneously using the multi-dimensional technique of principal factor analysis.
Statistics analysis embedded in spatial DBMS
NASA Astrophysics Data System (ADS)
Chen, Rongguo; Chen, Siqing
2006-10-01
This article sets forth the principle and methodology for implementing a spatial database management system (DBMS) using the open source object-relational DBMS PostgreSQL. The geospatial data model and the spatial analysis and processing operations for spatial objects and datasets can be added to the DBMS through extended SQL. To embed statistical analysis in the spatial DBMS, the open source statistical program R is introduced to extend the capability of the spatial DBMS. R is a language and environment for statistical computing and graphics. A large number of statistical methods are available in R in the form of packages. Many classical and modern spatial statistical techniques are implemented in the R environment. PL/R is a loadable procedural language that exposes most of the capabilities of the R language; it is extensible and enables users to write DBMS functions and triggers in R. Therefore, PL/R gains spatial statistics and geostatistics capabilities when the corresponding two kinds of packages are loaded into R. Because PL/R can be extended without limit, embedding any new statistical analysis method into the spatial DBMS becomes very convenient.
Statistical Analysis Techniques for Small Sample Sizes
NASA Technical Reports Server (NTRS)
Navard, S. E.
1984-01-01
The small-sample-size problem encountered when analyzing space-flight data is examined. Because of the small amount of data available, careful analyses are essential to extract the maximum amount of information with acceptable accuracy. Statistical analysis of small samples is described. The background material necessary for understanding statistical hypothesis testing is outlined and the various tests which can be done on small samples are explained. Emphasis is on the underlying assumptions of each test and on considerations needed to choose the most appropriate test for a given type of analysis.
Statistical Tools for Forensic Analysis of Toolmarks
David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser
2004-04-22
Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a "degree of association" between matches of similarly produced toolmarks. The basis for statistical method development relies on "discriminating criteria" that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.
Mar, Raymond A; Spreng, R Nathan; Deyoung, Colin G
2013-09-01
Personality neuroscience involves examining relations between cognitive or behavioral variability and neural variables like brain structure and function. Such studies have uncovered a number of fascinating associations but require large samples, which are expensive to collect. Here, we propose a system that capitalizes on neuroimaging data commonly collected for separate purposes and combines it with new behavioral data to test novel hypotheses. Specifically, we suggest that groups of researchers compile a database of structural (i.e., anatomical) and resting-state functional scans produced for other task-based investigations and pair these data with contact information for the participants who contributed the data. This contact information can then be used to collect additional cognitive, behavioral, or individual-difference data that are then reassociated with the neuroimaging data for analysis. This would allow for novel hypotheses regarding brain-behavior relations to be tested on the basis of large sample sizes (with adequate statistical power) for low additional cost. This idea can be implemented at small scales at single institutions, among a group of collaborating researchers, or perhaps even within a single lab. It can also be implemented at a large scale across institutions, although doing so would entail a number of additional complications.
[Statistical analysis using freely-available "EZR (Easy R)" software].
Kanda, Yoshinobu
2015-10-01
Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical function of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and ensure that the statistical process is overseen by a supervisor.
Statistical Analysis Experiment for Freshman Chemistry Lab.
ERIC Educational Resources Information Center
Salzsieder, John C.
1995-01-01
Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…
MICROARRAY DATA ANALYSIS USING MULTIPLE STATISTICAL MODELS
Wenjun Bao1, Judith E. Schmid1, Amber K. Goetz1, Ming Ouyang2, William J. Welsh2,Andrew I. Brooks3,4, ChiYi Chu3,Mitsunori Ogihara3,4, Yinhe Cheng5, David J. Dix1. 1National Health and Environmental Effects Researc...
Applied Behavior Analysis and Statistical Process Control?
ERIC Educational Resources Information Center
Hopkins, B. L.
1995-01-01
Incorporating statistical process control (SPC) methods into applied behavior analysis is discussed. It is claimed that SPC methods would likely reduce applied behavior analysts' intimate contacts with problems and would likely yield poor treatment and research decisions. Cases and data presented by Pfadt and Wheeler (1995) are cited as examples.…
Statistical shape analysis: From landmarks to diffeomorphisms.
Zhang, Miaomiao; Golland, Polina
2016-10-01
We offer a blazingly brief review of evolution of shape analysis methods in medical imaging. As the representations and the statistical models grew more sophisticated, the problem of shape analysis has been gradually redefined to accept images rather than binary segmentations as a starting point. This transformation enabled shape analysis to take its rightful place in the arsenal of tools for extracting and understanding patterns in large clinical image sets. We speculate on the future developments in shape analysis and potential applications that would bring this mathematically rich area to bear on clinical practice. PMID:27377332
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
Statistical Analysis of Thermal Analysis Margin
NASA Technical Reports Server (NTRS)
Garrison, Matthew B.
2011-01-01
NASA Goddard Space Flight Center requires that each project demonstrate a minimum of 5 °C margin between temperature predictions and hot and cold flight operational limits. The bounding temperature predictions include worst-case environment and thermal optical properties. The purpose of this work is to assess how current missions are performing against their pre-launch bounding temperature predictions and to suggest any possible changes to the thermal analysis margin rules.
Stork, LeAnna M.; Gennings, Chris; Carchman, Richard; Carter, Jr., Walter H.; Pounds, Joel G.; Mumtaz, Moiz
2006-12-01
Several assumptions, defined and undefined, are used in the toxicity assessment of chemical mixtures. In scientific practice mixture components in the low-dose region, particularly subthreshold doses, are often assumed to behave additively (i.e., zero interaction) based on heuristic arguments. This assumption has important implications in the practice of risk assessment, but has not been experimentally tested. We have developed methodology to test for additivity in the sense of Berenbaum (Advances in Cancer Research, 1981), based on the statistical equivalence testing literature where the null hypothesis of interaction is rejected for the alternative hypothesis of additivity when data support the claim. The implication of this approach is that conclusions of additivity are made with a false positive rate controlled by the experimenter. The claim of additivity is based on prespecified additivity margins, which are chosen using expert biological judgment such that small deviations from additivity, which are not considered to be biologically important, are not statistically significant. This approach is in contrast to the usual hypothesis-testing framework that assumes additivity in the null hypothesis and rejects when there is significant evidence of interaction. In this scenario, failure to reject may be due to lack of statistical power, making the claim of additivity problematic. The proposed method is illustrated in a mixture of five organophosphorus pesticides that were experimentally evaluated alone and at relevant mixing ratios. Motor activity was assessed in adult male rats following acute exposure. Four low-dose mixture groups were evaluated. Evidence of additivity is found in three of the four low-dose mixture groups. The proposed method tests for additivity of the whole mixture and does not take into account subset interactions (e.g., synergistic, antagonistic) that may have occurred and cancelled each other out.
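The reversal of hypotheses described above, where interaction is the null and additivity is claimed only when the data support it, can be illustrated with a generic two one-sided tests (TOST) procedure. The sketch below is not the authors' methodology: it assumes a single estimated deviation from additivity with a known standard error and a normal approximation, and the names `tost_additivity`, `delta_hat`, `std_err` and `margin`, as well as all numbers, are hypothetical.

```python
from statistics import NormalDist


def tost_additivity(delta_hat, std_err, margin, alpha=0.05):
    """Two one-sided tests of H0: |delta| >= margin vs H1: |delta| < margin.

    delta_hat is the estimated deviation from additivity (0 means additive).
    Returns (p_value, claim_additivity). Normal approximation; an assumption.
    """
    z = NormalDist()
    # Test 1, H0: delta <= -margin. Small p-value supports delta > -margin.
    p_lower = 1.0 - z.cdf((delta_hat + margin) / std_err)
    # Test 2, H0: delta >= +margin. Small p-value supports delta < +margin.
    p_upper = z.cdf((delta_hat - margin) / std_err)
    p_value = max(p_lower, p_upper)  # both one-sided tests must reject
    return p_value, p_value < alpha


# Hypothetical case A: precise estimate close to zero, so additivity is claimed.
p_a, claim_a = tost_additivity(delta_hat=0.1, std_err=0.2, margin=1.0)
# Hypothetical case B: same estimate but noisy, so additivity is not claimed.
p_b, claim_b = tost_additivity(delta_hat=0.1, std_err=1.0, margin=1.0)
print(p_a, claim_a, p_b, claim_b)
```

Case B shows the point made in the abstract: with low precision the equivalence test declines to claim additivity, whereas a conventional test with additivity as the null would simply fail to reject it.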
Comparative statistical analysis of planetary surfaces
NASA Astrophysics Data System (ADS)
Schmidt, Frédéric; Landais, Francois; Lovejoy, Shaun
2016-04-01
In the present study, we aim to provide a statistical and comparative description of topographic fields by using the huge amount of topographic data available for different bodies in the solar system, including Earth, Mars, the Moon, etc. Our goal is to characterize and quantify the geophysical processes involved by a relevant statistical description. In each case, topographic fields exhibit an extremely high variability with details at each scale, from millimeters to thousands of kilometers. This complexity seems to prohibit global descriptions or global topography models. Nevertheless, this topographic complexity is well known to exhibit scaling laws that establish a similarity between scales and permit simpler descriptions and models. Indeed, efficient simulations can be made using the statistical properties of scaling fields (fractals). But realistic simulations of global topographic fields must exhibit multi (not mono) scaling behaviour, reflecting the extreme variability and intermittency observed in real fields that cannot be generated by simple scaling models. A multiscaling theory has been developed in order to model high variability and intermittency. This theory is a good statistical candidate to model the topography field with a limited number of parameters (called the multifractal parameters). After a global analysis of Mars (Landais et al., 2015), we have performed similar analyses on different bodies in the solar system, including the Moon, Venus and Mercury, indicating that the multifractal parameters might be relevant to explain the competition between several processes operating on multiple scales.
Statistical Analysis of Iberian Peninsula Megaliths Orientations
NASA Astrophysics Data System (ADS)
González-García, A. C.
2009-08-01
Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, lack a sound understanding. A way to classify and start to understand such orientations is by means of statistical analysis of the data. A first attempt is done with simple statistical variables and a mere comparison between the different areas. In order to minimise the subjectivity in the process a further more complicated analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.
Protein Sectors: Statistical Coupling Analysis versus Conservation
Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas
2015-01-01
Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535
Applied behavior analysis and statistical process control?
Hopkins, B L
1995-01-01
This paper examines Pfadt and Wheeler's (1995) suggestions that the methods of statistical process control (SPC) be incorporated into applied behavior analysis. The research strategies of SPC are examined and compared to those of applied behavior analysis. I argue that the statistical methods that are a part of SPC would likely reduce applied behavior analysts' intimate contacts with the problems with which they deal and would, therefore, likely yield poor treatment and research decisions. Examples of these kinds of results and decisions are drawn from the cases and data Pfadt and Wheeler present. This paper also describes and clarifies many common misconceptions about SPC, including W. Edwards Deming's involvement in its development, its relationship to total quality management, and its confusion with various other methods designed to detect sources of unwanted variability. PMID:7592156
Multivariate analysis: A statistical approach for computations
NASA Astrophysics Data System (ADS)
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, evaluating clusters in finance, etc., and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval methods and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.
Analysis and modeling of resistive switching statistics
NASA Astrophysics Data System (ADS)
Long, Shibing; Cagli, Carlo; Ielmini, Daniele; Liu, Ming; Suñé, Jordi
2012-04-01
The resistive random access memory (RRAM), based on the reversible switching between different resistance states, is a promising candidate for next-generation nonvolatile memories. One of the most important challenges to foster the practical application of RRAM is the control of the statistical variation of switching parameters to gain low variability and high reliability. In this work, starting from the well-known percolation model of dielectric breakdown (BD), we establish a framework of analysis and modeling of the resistive switching statistics in RRAM devices, which are based on the formation and disconnection of a conducting filament (CF). One key aspect of our proposal is the relation between the CF resistance and the switching statistics. Hence, establishing the correlation between SET and RESET switching variables and the initial resistance of the device in the OFF and ON states, respectively, is a fundamental issue. Our modeling approach to the switching statistics is fully analytical and contains two main elements: (i) a geometrical cell-based description of the CF and (ii) a deterministic model for the switching dynamics. Both ingredients might be slightly different for the SET and RESET processes, for the type of switching (bipolar or unipolar), and for the kind of considered resistive structure (oxide-based, conductive bridge, etc.). However, the basic structure of our approach is thought to be useful for all the cases and should provide a framework for the physics-based understanding of the switching mechanisms and the associated statistics, for the trustful estimation of RRAM performance, and for the successful forecast of reliability. As a first application example, we start by considering the case of the RESET statistics of NiO-based RRAM structures. In particular, we statistically analyze the RESET transitions of a statistically significant number of switching cycles of Pt/NiO/W devices. In the RESET transition, the ON-state resistance (RON) is a
Statistical analysis of diversification with species traits.
Paradis, Emmanuel
2005-01-01
Testing whether some species traits have a significant effect on diversification rates is central in the assessment of macroevolutionary theories. However, we still lack a powerful method to tackle this objective. I present a new method for the statistical analysis of diversification with species traits. The required data are observations of the traits on recent species, the phylogenetic tree of these species, and reconstructions of ancestral values of the traits. Several traits, either continuous or discrete, and in some cases their interactions, can be analyzed simultaneously. The parameters are estimated by the method of maximum likelihood. The statistical significance of the effects in a model can be tested with likelihood ratio tests. A simulation study showed that past random extinction events do not affect the Type I error rate of the tests, whereas statistical power is decreased, though some power is still kept if the effect of the simulated trait on speciation is strong. The use of the method is illustrated by the analysis of published data on primates. The analysis of these data showed that the apparent overall positive relationship between body mass and species diversity is actually an artifact due to a clade-specific effect. Within each clade the effect of body mass on speciation rate was in fact negative. The present method allows both effects (clade and body mass) to be taken into account simultaneously.
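The likelihood ratio test mentioned in this abstract can be illustrated with a minimal sketch. It covers only the generic case of two nested models that differ by one parameter (one degree of freedom) and is not the author's implementation. The log-likelihood values are hypothetical.

```python
import math

def lrt_p_value(loglik_null, loglik_alt):
    """Likelihood ratio test for ONE extra parameter (chi-square, 1 df).

    Returns (test statistic, p value).
    """
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # For 1 df, P(chi2 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Hypothetical fits: a model without and with a trait effect on speciation.
stat, p = lrt_p_value(loglik_null=-250.0, loglik_alt=-246.5)
print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

The closed form with `math.erfc` holds only for one degree of freedom; testing several traits at once would need the general chi-square survival function.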
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-11-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1. P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. 2. Overemphasis on P values rather than on the actual size of the observed effect. 3. Overuse of statistical hypothesis testing, and being seduced by the word "significant". 4. Overreliance on standard errors, which are often misunderstood. PMID:25213136
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2014-10-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, however, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1) P-hacking, which is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want; 2) overemphasis on P values rather than on the actual size of the observed effect; 3) overuse of statistical hypothesis testing, and being seduced by the word "significant"; and 4) over-reliance on standard errors, which are often misunderstood. PMID:25204545
Two statistical tests for meiotic breakpoint analysis.
Plaetke, R; Schachtel, G A
1995-01-01
Meiotic breakpoint analysis (BPA), a statistical method for ordering genetic markers, is increasing in importance as a method for building genetic maps of human chromosomes. Although BPA does not provide estimates of genetic distances between markers, it efficiently locates new markers on already defined dense maps, when likelihood analysis becomes cumbersome or the sample size is small. However, until now no assessments of statistical significance have been available for evaluating the possibility that the results of a BPA were produced by chance. In this paper, we propose two statistical tests to determine whether the size of a sample and its genetic information content are sufficient to distinguish between "no linkage" and "linkage" of a marker mapped by BPA to a certain region. Both tests are exact and should be conducted after a BPA has assigned the marker to an interval on the map. Applications of the new tests are demonstrated by three examples: (1) a synthetic data set, (2) a data set of five markers on human chromosome 8p, and (3) a data set of four markers on human chromosome 17q. PMID:7847387
Statistical Uncertainty Analysis Applied to Criticality Calculation
Hartini, Entin; Andiwijayakusuma, Dinan; Susmikanti, Mike; Nursinta, A. W.
2010-06-22
In this paper, we present an uncertainty methodology based on a statistical approach for assessing uncertainties in criticality prediction using the Monte Carlo method due to uncertainties in the isotopic composition of the fuel. The methodology has been applied to criticality calculations with MCNP5 with additional stochastic input of the isotopic fuel composition. The stochastic inputs were generated using the Latin hypercube sampling method based on the probability density function of each nuclide composition. The automatic passing of the stochastic input to MCNP and the repeated criticality calculation are made possible by using a Python script to link MCNP and our Latin hypercube sampling code.
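A minimal sketch of Latin hypercube sampling of nuclide compositions is given below. It is not the authors' code: normal probability density functions are assumed purely for illustration, the means and standard deviations are hypothetical, and the step that writes each sample into an MCNP input deck is omitted.

```python
import random
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in the unit cube, one per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)  # pair the strata randomly across dimensions
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def sample_compositions(means, sds, n_samples, seed=0):
    """Map LHS points through each nuclide's inverse CDF (normal PDFs assumed)."""
    rng = random.Random(seed)
    unit = latin_hypercube(n_samples, len(means), rng)
    dists = [NormalDist(m, s) for m, s in zip(means, sds)]
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [
        [d.inv_cdf(min(max(u, eps), 1.0 - eps)) for d, u in zip(dists, point)]
        for point in unit
    ]

# Hypothetical weight fractions for two nuclides, five stochastic inputs.
for row in sample_compositions([0.045, 0.955], [0.001, 0.001], n_samples=5):
    print(row)
```

Each row would then be substituted into a criticality input and run once, so that the spread of the resulting multiplication factors reflects the composition uncertainty.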
Statistical Considerations for Analysis of Microarray Experiments
Owzar, Kouros; Barry, William T.; Jung, Sin-Ho
2014-01-01
Microarray technologies enable the simultaneous interrogation of expressions from thousands of genes from a biospecimen sample taken from a patient. This large set of expressions generates a genetic profile of the patient that may be used to identify potential prognostic or predictive genes or genetic models for clinical outcomes. The aim of this article is to provide a broad overview of some of the major statistical considerations for the design and analysis of microarray experiments conducted as correlative science studies to clinical trials. An emphasis will be placed on how the lack of understanding and improper use of statistical concepts and methods will lead to noise discovery and misinterpretation of experimental results. PMID:22212230
Statistical analysis of extreme auroral electrojet indices
NASA Astrophysics Data System (ADS)
Nakamura, Masao; Yoneda, Asato; Oda, Mitsunobu; Tsubouchi, Ken
2015-09-01
Extreme auroral electrojet activities can damage electrical power grids due to large induced currents in the Earth, degrade radio communications and navigation systems due to the ionospheric disturbances and cause polar-orbiting satellite anomalies due to the enhanced auroral electron precipitation. Statistical estimation of extreme auroral electrojet activities is an important factor in space weather research. For this estimation, we utilize extreme value theory (EVT), which focuses on the statistical behavior in the tail of a distribution. As a measure of auroral electrojet activities, auroral electrojet indices AL, AU, and AE, are used, which describe the maximum current strength of the westward and eastward auroral electrojets and the sum of the two oppositely directed in the auroral latitude ionosphere, respectively. We provide statistical evidence for finite upper limits to AL and AU and estimate the annual expected number and probable intensity of their extreme events. We detect two different types of extreme AE events; therefore, application of the appropriate EVT analysis to AE is difficult.
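One way to see how extreme value theory can imply a finite upper limit is the generalized Pareto distribution (GPD) fitted to excesses over a high threshold: a negative shape parameter means the tail is bounded. The sketch below uses a simple method-of-moments fit, which is an assumption made for brevity and not necessarily the estimator used in the study. The threshold and data are synthetic.

```python
from statistics import fmean, pvariance

def gpd_moment_fit(values, threshold):
    """Method-of-moments GPD fit to the excesses over `threshold`.

    Uses mean = sigma / (1 - xi) and var = sigma^2 / ((1 - xi)^2 (1 - 2 xi)),
    valid for xi < 1/2. Returns (xi, sigma, upper_limit or None).
    """
    excesses = [v - threshold for v in values if v > threshold]
    m = fmean(excesses)
    ratio = m * m / pvariance(excesses)   # equals 1 - 2 * xi
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    # A negative shape gives a bounded tail ending at threshold - sigma / xi.
    upper = threshold - sigma / xi if xi < 0 else None
    return xi, sigma, upper

# Synthetic index values spread evenly between 1000 and 2000 (bounded tail).
data = [1000.0 + (i + 0.5) for i in range(1000)]
print(gpd_moment_fit(data, threshold=1000.0))
```

For the evenly spread synthetic data the fitted shape is close to -1 and the implied limit is close to 2000, the true upper end of the data.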
Statistical Hot Channel Analysis for the NBSR
Cuadra A.; Baek J.
2014-05-27
A statistical analysis of thermal limits has been carried out for the research reactor (NBSR) at the National Institute of Standards and Technology (NIST). The objective of this analysis was to update the uncertainties of the hot channel factors with respect to previous analysis for both high-enriched uranium (HEU) and low-enriched uranium (LEU) fuels. Although uncertainties in key parameters which enter into the analysis are not yet known for the LEU core, the current analysis uses reasonable approximations instead of conservative estimates based on HEU values. Cumulative distribution functions (CDFs) were obtained for critical heat flux ratio (CHFR), and onset of flow instability ratio (OFIR). As was done previously, the Sudo-Kaminaga correlation was used for CHF and the Saha-Zuber correlation was used for OFI. Results were obtained for probability levels of 90%, 95%, and 99.9%. As an example of the analysis, the results for both the existing reactor with HEU fuel and the LEU core show that CHFR would have to be above 1.39 to assure with 95% probability that there is no CHF. For the OFIR, the results show that the ratio should be above 1.40 to assure with a 95% probability that OFI is not reached.
Additivity in the Analysis and Design of HIV Protease Inhibitors
Jorissen, Robert N.; Kiran Kumar Reddy, G. S.; Ali, Akbar; Altman, Michael D.; Chellappan, Sripriya; Anjum, Saima G.; Tidor, Bruce; Schiffer, Celia A.; Rana, Tariq M.; Gilson, Michael K.
2009-01-01
We explore the applicability of an additive treatment of substituent effects to the analysis and design of HIV protease inhibitors. Affinity data for a set of inhibitors with a common chemical framework were analyzed to provide estimates of the free energy contribution of each chemical substituent. These estimates were then used to design new inhibitors, whose high affinities were confirmed by synthesis and experimental testing. Derivations of additive models by least-squares and ridge-regression methods were found to yield statistically similar results. The additivity approach was also compared with standard molecular descriptor-based QSAR; the latter was not found to provide superior predictions. Crystallographic studies of HIV protease-inhibitor complexes help explain the perhaps surprisingly high degree of substituent additivity in this system, and allow some of the additivity coefficients to be rationalized on a structural basis. PMID:19193159
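The additive treatment described in this abstract can be sketched as a regression with one free-energy term per substituent at each position. The code below is an illustrative ridge-regression fit in plain Python, not the authors' method or data. The substituent labels, positions and affinities are invented.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_additive(compounds, affinities, ridge=1e-6):
    """Ridge fit of one contribution per (position, substituent) pair."""
    features = sorted({(pos, sub) for c in compounds for pos, sub in enumerate(c)})
    index = {f: j for j, f in enumerate(features)}
    p = len(features)
    # Normal equations (X^T X + ridge * I) w = X^T y for one-hot features.
    xtx = [[ridge if i == j else 0.0 for j in range(p)] for i in range(p)]
    xty = [0.0] * p
    for comp, y in zip(compounds, affinities):
        cols = [index[(pos, sub)] for pos, sub in enumerate(comp)]
        for i in cols:
            xty[i] += y
            for j in cols:
                xtx[i][j] += 1.0
    return dict(zip(features, solve(xtx, xty)))

def predict(model, compound):
    """Predicted free energy: the sum of the substituent contributions."""
    return sum(model[(pos, sub)] for pos, sub in enumerate(compound))

# Hypothetical binding free energies (kcal/mol) for three inhibitors.
train = [("H", "Me"), ("H", "Cl"), ("OH", "Me")]
dg = [-4.0, -5.0, -5.5]
model = fit_additive(train, dg)
print(predict(model, ("OH", "Cl")))  # untested combination of known substituents
```

The ridge term keeps the one-hot system solvable even though the individual contributions are only determined up to a constant shift between positions.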
Recent advances in statistical energy analysis
NASA Technical Reports Server (NTRS)
Heron, K. H.
1992-01-01
Statistical Energy Analysis (SEA) has traditionally been developed using a modal summation and averaging approach, which has led to the need for many restrictive SEA assumptions. The assumption of 'weak coupling' is particularly unacceptable when attempts are made to apply SEA to structural coupling. It is now believed that this assumption is more a function of the modal formulation than a necessary formulation of SEA. The present analysis ignores this restriction and describes a wave approach to the calculation of plate-plate coupling loss factors. Predictions based on this method are compared with results obtained from experiments using point excitation on one side of an irregular six-sided box structure. Conclusions show that the use and calculation of infinite transmission coefficients is the way forward for the development of a purely predictive SEA code.
Methods of the computer-aided statistical analysis of microcircuits
NASA Astrophysics Data System (ADS)
Beliakov, Iu. N.; Kurmaev, F. A.; Batalov, B. V.
Methods that are currently used for the computer-aided statistical analysis of microcircuits at the design stage are summarized. In particular, attention is given to methods for solving problems in statistical analysis, statistical planning, and factorial model synthesis by means of irregular experimental design. Efficient ways of reducing the computer time required for statistical analysis and numerical methods of microcircuit analysis are proposed. The discussion also covers various aspects of the organization of computer-aided microcircuit modeling and analysis systems.
Multivariate statistical analysis of wildfires in Portugal
NASA Astrophysics Data System (ADS)
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980-2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M. G., Trigo, R. M., DaCamara, C. C., Pereira, J. M. C., Leite, S. M.: Synoptic patterns associated with large summer forest fires in Portugal, Agricultural and Forest Meteorology, 129, 11-25, 2005. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
HistFitter software framework for statistical data analysis
NASA Astrophysics Data System (ADS)
Baak, M.; Besjes, G. J.; Côté, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-04-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface.
On intracluster Faraday rotation. II - Statistical analysis
NASA Technical Reports Server (NTRS)
Lawler, J. M.; Dennison, B.
1982-01-01
The comparison of a reliable sample of radio source Faraday rotation measurements seen through rich clusters of galaxies, with sources seen through the outer parts of clusters and therefore having little intracluster Faraday rotation, indicates that the distribution of rotation in the former population is broadened, but only at the 80% level of statistical confidence. Employing a physical model for the intracluster medium in which the square root of magnetic field strength/turbulent cell per gas core radius number ratio equals approximately 0.07 microgauss, a Monte Carlo simulation is able to reproduce the observed broadening. An upper-limit analysis figure of less than 0.20 microgauss for the field strength/turbulent cell ratio, combined with lower limits on field strength imposed by limitations on the Compton-scattered flux, shows that intracluster magnetic fields must be tangled on scales greater than about 20 kpc.
Statistical Analysis of Cardiovascular Data from FAP
NASA Technical Reports Server (NTRS)
Sealey, Meghan
2016-01-01
pressure, etc.) to see which could best predict how long the subjects could tolerate the tilt tests. With this I plan to analyze an artificial gravity study in order to determine the effects of orthostatic intolerance during spaceflight. From these projects, I became efficient in using the statistical software Stata, which I had never used before. I learned new statistical methods, such as mixed-effects linear regression, maximum likelihood estimation on longitudinal data, and post model-fitting tests to see if certain parameters contribute significantly to the model, all of which will improve my understanding when I continue studying for my master's degree. I was also able to demonstrate my knowledge of statistics by helping other students run statistical analyses for their own projects. The experience and knowledge gained from completing this analysis exemplify the type of work that I would like to pursue in the future. After completing my master's degree, I plan to pursue a career in biostatistics, which is exactly the position in which I interned, and I plan to use this experience to contribute to that goal.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
ERIC Educational Resources Information Center
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
R: a statistical environment for hydrological analysis
NASA Astrophysics Data System (ADS)
Zambrano-Bigiarini, Mauricio; Bellin, Alberto
2010-05-01
The free software environment for statistical computing and graphics "R" has been developed and is maintained by statistical programmers, with the support of a growing community of users with many different backgrounds, which allows access to both well-established and experimental techniques. Hydrological modelling practitioners spend a large amount of time pre- and post-processing data and results with traditional instruments. In this work "R" and some of its packages are presented as powerful tools to explore and extract patterns from raw information, to pre-process input data of hydrological models, and to post-process their results. In particular, examples are taken from analysing 30 years of daily data for a basin of 85000 km², saving a large amount of time that could be better spent doing analysis. In doing so, vector and raster GIS files were imported for carrying out spatial and geostatistical analysis. Thousands of raw text files with time series of precipitation, temperature and streamflow were summarized and organized. Gauging stations to be used in the modelling process are selected according to the number of days with information, and missing time series data are filled in using spatial interpolation. Time series at the gauging stations are summarized through daily, monthly and annual plots. Input files in dBase format are automatically created in a batch process. Results of a hydrological model are compared with observed values through plots and numerical goodness-of-fit indexes. Two packages specifically developed to assist hydrologists in the previous tasks are briefly presented. Finally, we think the "R" environment would be a valuable tool to support undergraduate and graduate education in hydrology, because it is helpful for capturing the main features of large amounts of data; it is a flexible and fully functional programming language, able to be interfaced to existing Fortran and C code and well suited to the ever growing demands
An R package for statistical provenance analysis
NASA Astrophysics Data System (ADS)
Vermeesch, Pieter; Resentini, Alberto; Garzanti, Eduardo
2016-05-01
This paper introduces provenance, a software package within the statistical programming environment R, which aims to facilitate the visualisation and interpretation of large amounts of sedimentary provenance data, including mineralogical, petrographic, chemical and isotopic provenance proxies, or any combination of these. provenance comprises functions to: (a) calculate the sample size required to achieve a given detection limit; (b) plot distributional data such as detrital zircon U-Pb age spectra as Cumulative Age Distributions (CADs) or adaptive Kernel Density Estimates (KDEs); (c) plot compositional data as pie charts or ternary diagrams; (d) correct the effects of hydraulic sorting on sandstone petrography and heavy mineral composition; (e) assess the settling equivalence of detrital minerals and grain-size dependence of sediment composition; (f) quantify the dissimilarity between distributional data using the Kolmogorov-Smirnov and Sircombe-Hazelton distances, or between compositional data using the Aitchison and Bray-Curtis distances; (g) interpret multi-sample datasets by means of (classical and nonmetric) Multidimensional Scaling (MDS) and Principal Component Analysis (PCA); and (h) simplify the interpretation of multi-method datasets by means of Generalised Procrustes Analysis (GPA) and 3-way MDS. All these tools can be accessed through an intuitive query-based user interface, which does not require knowledge of the R programming language. provenance is free software released under the GPL-2 licence and will be further expanded based on user feedback.
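To make the Kolmogorov-Smirnov dissimilarity mentioned in this abstract concrete, the sketch below computes it for two samples as the largest vertical gap between their empirical cumulative distributions. It is a standalone Python illustration, not the API of the provenance R package. The age values are hypothetical.

```python
def ks_distance(sample_a, sample_b):
    """Kolmogorov-Smirnov distance: largest gap between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Hypothetical detrital zircon U-Pb ages (Ma) from two sand samples.
ages_1 = [310, 325, 540, 610, 1050, 1100]
ages_2 = [300, 330, 1020, 1080, 1900, 2700]
print(ks_distance(ages_1, ages_2))
```

A matrix of such pairwise distances between samples is the kind of input that multidimensional scaling takes.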
Additional EIPC Study Analysis. Final Report
Hadley, Stanton W; Gotham, Douglas J.; Luciani, Ralph L.
2014-12-01
Between 2010 and 2012 the Eastern Interconnection Planning Collaborative (EIPC) conducted a major long-term resource and transmission study of the Eastern Interconnection (EI). With guidance from a Stakeholder Steering Committee (SSC) that included representatives from the Eastern Interconnection States Planning Council (EISPC) among others, the project was conducted in two phases. Phase 1 involved a long-term capacity expansion analysis that involved creation of eight major futures plus 72 sensitivities. Three scenarios were selected for more extensive transmission-focused evaluation in Phase 2. Five power flow analyses, nine production cost model runs (including six sensitivities), and three capital cost estimations were developed during this second phase. The results from Phase 1 and 2 provided a wealth of data that could be examined further to address energy-related questions. A list of 14 topics was developed for further analysis. This paper brings together the earlier interim reports of the first 13 topics plus one additional topic into a single final report.
Statistical analysis of regulatory ecotoxicity tests.
Isnard, P; Flammarion, P; Roman, G; Babut, M; Bastien, P; Bintein, S; Esserméant, L; Férard, J F; Gallotti-Schmitt, S; Saouter, E; Saroli, M; Thiébaud, H; Tomassone, R; Vindimian, E
2001-11-01
ANOVA-type data analysis, i.e., determination of lowest-observed-effect concentrations (LOECs) and no-observed-effect concentrations (NOECs), has been widely used for statistical analysis of chronic ecotoxicity data. However, it is increasingly criticised for several reasons, among which the most important is probably the fact that the NOEC depends on the choice of test concentrations and number of replications and rewards poor experiments, i.e., high variability, with high NOEC values. Thus, a recent OECD workshop concluded that the use of the NOEC should be phased out and that a regression-based estimation procedure should be used. Following this workshop, a working group was established at the French level between government, academia and industry representatives. Twenty-seven sets of chronic data (algae, daphnia, fish) were collected and analysed by ANOVA and regression procedures. Several regression models were compared and relations between NOECs and ECx, for different values of x, were established in order to find an alternative summary parameter to the NOEC. Biological arguments are scarce to help in defining a negligible level of effect x for the ECx. With regard to their use in the risk assessment procedures, a convenient methodology would be to choose x so that ECx are on average similar to the present NOEC. This would lead to no major change in the risk assessment procedure. However, experimental data show that the ECx depend on the regression models and that their accuracy decreases in the low effect zone. This disadvantage could probably be reduced by adapting existing experimental protocols but it could mean more experimental effort and higher cost. ECx (derived with existing test guidelines, e.g., regarding the number of replicates) whose lower bounds of the confidence interval are on average similar to present NOEC would improve this approach by a priori encouraging more precise experiments. However, narrow confidence intervals are not only
Statistical analysis of single-trial Granger causality spectra.
Brovelli, Andrea
2012-01-01
Granger causality analysis is becoming central for the analysis of interactions between neural populations and oscillatory networks. However, it is currently unclear whether single-trial estimates of Granger causality spectra can be used reliably to assess directional influence. We addressed this issue by combining single-trial Granger causality spectra with statistical inference based on general linear models. The approach was assessed on synthetic and neurophysiological data. Synthetic bivariate data was generated using two autoregressive processes with unidirectional coupling. We simulated two hypothetical experimental conditions: the first mimicked a constant and unidirectional coupling, whereas the second modelled a linear increase in coupling across trials. The statistical analysis of single-trial Granger causality spectra, based on t-tests and linear regression, successfully recovered the underlying pattern of directional influence. In addition, we characterised the minimum number of trials and coupling strengths required for significant detection of directionality. Finally, we demonstrated the relevance for neurophysiology by analysing two local field potentials (LFPs) simultaneously recorded from the prefrontal and premotor cortices of a macaque monkey performing a conditional visuomotor task. Our results suggest that the combination of single-trial Granger causality spectra and statistical inference provides a valuable tool for the analysis of large-scale cortical networks and brain connectivity.
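The synthetic data described in the abstract above (two autoregressive processes with unidirectional coupling, with coupling either constant or increasing linearly across trials) can be sketched roughly as follows. This is an illustrative Python sketch only: the first-order model, the coefficient values, and the function name `simulate_coupled_ar` are assumptions, since the abstract does not state the model order or parameters.

```python
import random


def simulate_coupled_ar(n_samples, coupling, seed=0, ar_coeff=0.5, noise_sd=1.0):
    """Simulate x -> y: y depends on past x, but x never depends on y.

    Assumed first-order model (not taken from the paper):
        x[t] = ar_coeff * x[t-1] + noise
        y[t] = ar_coeff * y[t-1] + coupling * x[t-1] + noise
    """
    rng = random.Random(seed)
    x = [0.0]
    y = [0.0]
    for _ in range(1, n_samples):
        noise_x = rng.gauss(0.0, noise_sd)
        noise_y = rng.gauss(0.0, noise_sd)
        x_prev, y_prev = x[-1], y[-1]
        x.append(ar_coeff * x_prev + noise_x)
        y.append(ar_coeff * y_prev + coupling * x_prev + noise_y)
    return x, y


# Condition 1: constant coupling in every trial.
constant_trials = [simulate_coupled_ar(200, 0.4, seed=t) for t in range(20)]

# Condition 2: coupling increases linearly from 0 to 0.4 across trials.
n_trials = 20
ramp_trials = [
    simulate_coupled_ar(200, 0.4 * t / (n_trials - 1), seed=t)
    for t in range(n_trials)
]
```

Because `x` is generated without reference to `y`, any directional measure estimated from such trials should favour the x-to-y direction, which gives a known ground truth for checking a single-trial analysis.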
Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)
NASA Astrophysics Data System (ADS)
Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee
2010-12-01
Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review
CORSSA: Community Online Resource for Statistical Seismicity Analysis
NASA Astrophysics Data System (ADS)
Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.
2011-12-01
Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology, especially to those aspects with great impact on public policy, statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology; basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.
Statistical Analysis of Nondisjunction Assays in Drosophila
Zeng, Yong; Li, Hua; Schweppe, Nicole M.; Hawley, R. Scott; Gilliland, William D.
2010-01-01
Many advances in the understanding of meiosis have been made by measuring how often errors in chromosome segregation occur. This process of nondisjunction can be studied by counting experimental progeny, but direct measurement of nondisjunction rates is complicated by not all classes of nondisjunctional progeny being viable. For X chromosome nondisjunction in Drosophila female meiosis, all of the normal progeny survive, while nondisjunctional eggs produce viable progeny only if fertilized by sperm that carry the appropriate sex chromosome. The rate of nondisjunction has traditionally been estimated by assuming a binomial process and doubling the number of observed nondisjunctional progeny, to account for the inviable classes. However, the correct way to derive statistics (such as confidence intervals or hypothesis testing) by this approach is far from clear. Instead, we use the multinomial-Poisson hierarchy model and demonstrate that the old estimator is in fact the maximum-likelihood estimator (MLE). Under more general assumptions, we derive asymptotic normality of this estimator and construct confidence interval and hypothesis testing formulae. Confidence intervals under this framework are always larger than under the binomial framework, and application to published data shows that use of the multinomial approach can avoid an apparent type 1 error made by use of the binomial assumption. The current study provides guidance for researchers designing genetic experiments on nondisjunction and improves several methods for the analysis of genetic data. PMID:20660647
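The traditional point estimate described in the abstract above (doubling the observed nondisjunctional progeny to account for the inviable classes) can be written out as a short Python sketch. The exact form of the ratio is an assumption based on the conventional calculation, the function name `nondisjunction_rate` is hypothetical, and the counts are invented; the paper's confidence interval and test formulae are not reproduced here because the abstract does not give them.

```python
def nondisjunction_rate(normal_progeny, ndj_progeny):
    """Estimate the nondisjunction rate from progeny counts.

    Observed nondisjunctional progeny are doubled because, under the
    assumption stated in the abstract, only some of the nondisjunctional
    classes are viable and therefore countable.
    """
    if normal_progeny < 0 or ndj_progeny < 0:
        raise ValueError("counts must be non-negative")
    adjusted_ndj = 2 * ndj_progeny
    adjusted_total = normal_progeny + adjusted_ndj
    if adjusted_total == 0:
        raise ValueError("no progeny were scored")
    return adjusted_ndj / adjusted_total


# Hypothetical counts, for illustration only.
print(nondisjunction_rate(normal_progeny=980, ndj_progeny=10))
```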
Statistical approach to partial equilibrium analysis
NASA Astrophysics Data System (ADS)
Wang, Yougui; Stanley, H. E.
2009-04-01
A statistical approach to market equilibrium and efficiency analysis is proposed in this paper. One factor that governs the exchange decisions of traders in a market, named willingness price, is highlighted and constitutes the whole theory. The supply and demand functions are formulated as the distributions of corresponding willing exchange over the willingness price. The laws of supply and demand can be derived directly from these distributions. The characteristics of excess demand function are analyzed and the necessary conditions for the existence and uniqueness of equilibrium point of the market are specified. The rationing rates of buyers and sellers are introduced to describe the ratio of realized exchange to willing exchange, and their dependence on the market price is studied in the cases of shortage and surplus. The realized market surplus, which is the criterion of market efficiency, can be written as a function of the distributions of willing exchange and the rationing rates. With this approach we can strictly prove that a market is efficient in the state of equilibrium.
Statistical energy analysis of nonlinear vibrating systems.
Spelman, G M; Langley, R S
2015-09-28
Nonlinearities in practical systems can arise in contacts between components, possibly from friction or impacts. However, it is also known that quadratic and cubic nonlinearity can occur in the stiffness of structural elements undergoing large amplitude vibration, without the need for local contacts. Nonlinearity due purely to large amplitude vibration can then result in significant energy being found in frequency bands other than those being driven by external forces. To analyse this phenomenon, a method is developed here in which the response of the structure in the frequency domain is divided into frequency bands, and the energy flow between the frequency bands is calculated. The frequency bands are assigned an energy variable to describe the mean response and the nonlinear coupling between bands is described in terms of weighted summations of the convolutions of linear modal transfer functions. This represents a nonlinear extension to an established linear theory known as statistical energy analysis (SEA). The nonlinear extension to SEA theory is presented for the case of a plate structure with quadratic and cubic nonlinearity. PMID:26303923
Web-Based Statistical Sampling and Analysis
ERIC Educational Resources Information Center
Quinn, Anne; Larson, Karen
2016-01-01
Consistent with the Common Core State Standards for Mathematics (CCSSI 2010), the authors write that they have asked students to do statistics projects with real data. To obtain real data, their students use the free Web-based app, Census at School, created by the American Statistical Association (ASA) to help promote civic awareness among school…
Multivariate statistical analysis of environmental monitoring data
Ross, D.L.
1997-11-01
EPA requires statistical procedures to determine whether soil or ground water adjacent to or below waste units is contaminated. These statistical procedures are often based on comparisons between two sets of data: one representing background conditions, and one representing site conditions. Since statistical requirements were originally promulgated in the 1980s, EPA has made several improvements and modifications. There are, however, problems which remain. One problem is that the regulations do not require a minimum probability that contaminated sites will be correctly identified. Another problem is that the effect of testing several correlated constituents on the probable outcome of the statistical tests has not been quantified. Results from computer simulations to determine power functions for realistic monitoring situations are presented here. Power functions for two different statistical procedures, the Student's t-test and the multivariate Hotelling's T² test, are compared. The comparisons indicate that the multivariate test is often more powerful when the tests are applied with significance levels to control the probability of falsely identifying clean sites as contaminated. This program could also be used to verify that statistical procedures achieve some minimum power standard at a regulated waste unit.
Predicting typology of landslide occurrences by statistical GIS analysis
NASA Astrophysics Data System (ADS)
Mancini, Francesco; Ceppi, Claudia; Ritrovato, Giuliano
2010-05-01
This study aims at landslide susceptibility mapping by multivariate statistical methods with the additional capability to distinguish among typologies of landslide occurrences. The methodology is being tested in a hilly area of the Daunia Region (Apulia, southern Italy) where small settlements are historically threatened by landslide phenomena. In the multivariate statistical analysis used, all the variables were managed in a GIS in addition to the landslide inventory, where geometric and descriptive properties have to be implemented in a suitable data structure in order to relate the independent set of variables to the typology of landslide occurrences. The independent variables selected as possible triggering factors of slope instability phenomena are: elevation, slope, aspect, planform and profile curvature, drained area, lithology, land use, and distance from the road and river networks. The implementation of the landslide inventory was more demanding than in a usual multivariate analysis, such as multiple regression analysis, where only the simple presence/absence status of occurrences is required. According to the classification proposed by Cruden and Varnes, three main landslide typologies were included in the inventory after recognition by geomorphological survey: a) intermediate to deep-seated compound landslides with failure surface depth > 30m; b) mudslides of shallow to intermediate depth sliding surface; c) deep-seated to intermediate depth rotational landslides with depth of sliding surface < 30m. The inventory implementation constitutes a significant effort, supported by the project "Landslide risk assessment for the planning of small urban settlements within chain areas: the case of Daunia" through several areas of expertise. The outcomes of the analysis provide the proneness to landslide, as a predicted level of probability, by considering in addition the failure mechanism introduced in the landslide inventory. A map of landslide susceptibility along
[Statistical models for spatial analysis in parasitology].
Biggeri, A; Catelan, D; Dreassi, E; Lagazio, C; Cringoli, G
2004-06-01
The simplest way to study the spatial pattern of a disease is the geographical representation of its cases (or some indicators of them) over a map. Maps based on raw data are generally "wrong" since they do not take sampling errors into consideration. Indeed, the observed differences between areas (or points in the map) are not directly interpretable, as they derive from the composition of true, structural differences and of the noise deriving from the sampling process. This problem is well known in human epidemiology, and several solutions have been proposed to filter the signal from the noise. These statistical methods are usually referred to as Disease Mapping. In geographical analysis a first goal is to evaluate the statistical significance of the heterogeneity between areas (or points). If the test indicates rejection of the hypothesis of homogeneity, the following task is to study the spatial pattern of the disease. The spatial variability of risk is usually decomposed into two terms: a spatially structured one (clustering) and a non-spatially structured one (heterogeneity). The heterogeneity term reflects spatial variability due to intrinsic characteristics of the sampling units (e.g. hygienic conditions of farms), while the clustering term models the association due to proximity between sampling units, which usually depends on ecological conditions that vary over the study area and that affect in a similar way farms that are close to each other. Hierarchical Bayesian models are the main tool to make inference over the clustering and heterogeneity components. The results are based on the marginal posterior distributions of the parameters of the model, which are approximated by Markov chain Monte Carlo methods. Different models can be defined depending on the terms that are considered, namely a model with only the clustering term, a model with only the heterogeneity term and a model where both are included. Model selection criteria based on a compromise between
Self-Contained Statistical Analysis of Gene Sets
Cannon, Judy L.; Ricoy, Ulises M.; Johnson, Christopher
2016-01-01
Microarrays are a powerful tool for studying differential gene expression. However, lists of many differentially expressed genes are often generated, and unraveling meaningful biological processes from the lists can be challenging. For this reason, investigators have sought to quantify the statistical probability of compiled gene sets rather than individual genes. The gene sets typically are organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher’s self-contained method for gene set analysis. We improve Fisher’s differential expression analysis of a gene set by limiting the p-value of an individual gene within the gene set to prevent a small percentage of genes from determining the statistical significance of the entire set. In addition, we also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients and T-ALL and AML (Acute Myeloid Leukemia) patients. PMID:27711232
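Fisher's method for combining per-gene p-values, together with a lower limit on each individual p-value, might look roughly like the Python sketch below. It illustrates the general idea only: the function name `fisher_combined_p`, the `floor` parameter, and the example values are assumptions, the abstract does not state which limit the authors use, and the chi-square reference distribution assumes the p-values are independent.

```python
import math


def fisher_combined_p(p_values, floor=None):
    """Combine p-values with Fisher's method, optionally flooring each one.

    Returns (statistic, combined_p). The statistic -2 * sum(ln p) is compared
    with a chi-square distribution with 2k degrees of freedom, whose upper
    tail has a closed form for even degrees of freedom.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    limited = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        # Flooring stops one extreme gene from dominating the whole set.
        limited.append(max(p, floor) if floor is not None else p)
    statistic = -2.0 * sum(math.log(p) for p in limited)
    half = statistic / 2.0
    term = 1.0
    series = 1.0
    for j in range(1, len(limited)):
        term *= half / j
        series += term
    return statistic, math.exp(-half) * series


# Hypothetical per-gene p-values for one gene set (illustrative only).
gene_p_values = [1e-12, 0.5, 0.5]
print(fisher_combined_p(gene_p_values))
print(fisher_combined_p(gene_p_values, floor=1e-3))
```

With the floor applied, the single extreme gene contributes less to the statistic, so the set is called significant only if several genes support it.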
Notes on numerical reliability of several statistical analysis programs
Landwehr, J.M.; Tasker, Gary D.
1999-01-01
This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
Statistical Analysis of Refractivity in UAE
NASA Astrophysics Data System (ADS)
Al-Ansari, Kifah; Al-Mal, Abdulhadi Abu; Kamel, Rami
2007-07-01
This paper presents the results of the refractivity statistics in the UAE (United Arab Emirates) for a period of 14 years (1990-2003). Six sites have been considered using meteorological surface data (Abu Dhabi, Dubai, Sharjah, Al-Ain, Ras Al-Kaimah, and Al-Fujairah). Upper air (radiosonde) data were available at one site only, Abu Dhabi airport, which has been considered for the refractivity gradient statistics. Monthly and yearly averages are obtained for the two parameters, refractivity and refractivity gradient. Cumulative distributions are also provided.
χ² versus median statistics in supernova type Ia data analysis
Barreira, A.; Avelino, P. P.
2011-10-15
In this paper we compare the performances of the χ² and median likelihood analysis in the determination of cosmological constraints using type Ia supernovae data. We perform a statistical analysis using the 307 supernovae of the Union 2 compilation of the Supernova Cosmology Project and find that the χ² statistical analysis yields tighter cosmological constraints than the median statistic if only supernovae data is taken into account. We also show that when additional measurements from the cosmic microwave background and baryonic acoustic oscillations are considered, the combined cosmological constraints are not strongly dependent on whether one applies the χ² statistic or the median statistic to the supernovae data. This indicates that, when complementary information from other cosmological probes is taken into account, the performances of the χ² and median statistics are very similar, demonstrating the robustness of the statistical analysis.
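One common formulation of median statistics treats each measurement as equally likely to fall above or below the true value, so the number of residuals above a model's prediction follows a binomial distribution with probability 1/2. The Python sketch below illustrates that idea; whether the paper uses exactly this likelihood is an assumption, and the function name `median_statistic_likelihood` and the residual values are hypothetical.

```python
import math


def median_statistic_likelihood(residuals):
    """Binomial probability of the observed count of positive residuals.

    Under the assumed model, each data point independently lies above the
    prediction with probability 1/2, so a model that splits the data evenly
    receives the highest likelihood. Measurement errors are not used.
    """
    n = len(residuals)
    if n == 0:
        raise ValueError("at least one residual is required")
    above = sum(1 for r in residuals if r > 0)
    return math.comb(n, above) / 2 ** n


# Hypothetical distance-modulus residuals (data minus model), illustrative only.
balanced = [0.1, -0.2, 0.05, -0.07, 0.3, -0.15]
skewed = [0.1, 0.2, 0.05, 0.07, 0.3, -0.15]
print(median_statistic_likelihood(balanced), median_statistic_likelihood(skewed))
```

Because only the signs of the residuals enter, this statistic ignores the quoted error bars, which is consistent with it yielding looser constraints than χ² when the errors are well characterised.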
Acid Rain Analysis by Standard Addition Titration.
ERIC Educational Resources Information Center
Ophardt, Charles E.
1985-01-01
The standard addition titration is a precise and rapid method for the determination of the acidity in rain or snow samples. The method requires use of a standard buret, a pH meter, and Gran's plot to determine the equivalence point. Experimental procedures used and typical results obtained are presented. (JN)
Statistical analysis of pitting corrosion in condenser tubes
Ault, J.P.; Gehring, G.A. Jr.
1997-12-31
Condenser tube failure via wall penetration allows cooling water to contaminate the working fluid (steam). Contamination, especially from brackish or saltwater, will lower steam quality and thus lower overall plant efficiency. Because of the importance of minimizing leakages, power plant engineers are primarily concerned with the maximum localized corrosion in a unit rather than average corrosion values or rates. Extreme value analysis is a useful tool for evaluating the condition of condenser tubing. Extreme value statistical techniques allow the prediction of the most probable deepest pit in a given surface area based upon data acquired from a smaller surface area. Data is gathered from a physical examination of actual tubes (either in-service or from a sidestream unit) rather than small sample coupons. Three distinct applications of extreme value statistics to condenser tube evaluation are presented in this paper: (1) condition assessment of an operating condenser, (2) design data for material selection, and (3) research tool for assessing impact of various factors on condenser tube corrosion. The projections for operating units based on extreme value analysis are shown to be more useful than those made on the basis of other techniques such as eddy current or electrochemical measurements. Extreme value analysis would benefit from advances in two key areas: (1) development of an accurate and economical method for the measurement of maximum pit depths of condenser tubes in-situ would enhance the application of extreme value statistical analysis to the assessment of condenser tubing corrosion pitting and (2) development of methodologies to predict pit depth-time relationship in addition to pit depth-area relationship would be useful for modeling purposes.
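The area extrapolation described in the abstract above can be illustrated with a Gumbel (type I extreme value) model, a common choice for maximum pit depths. The Python sketch below is an assumption-laden illustration, not the authors' procedure: it fits the distribution by the method of moments, the function names and the depth values are hypothetical, and the abstract does not say which fitting method was used.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(max_depths):
    """Method-of-moments Gumbel fit to per-sample maximum pit depths."""
    mean = statistics.mean(max_depths)
    stdev = statistics.stdev(max_depths)
    scale = stdev * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def most_probable_deepest_pit(location, scale, area_ratio):
    """Mode of the deepest pit over an area `area_ratio` times the sampled one.

    The maximum of T independent Gumbel(location, scale) values is again
    Gumbel with location shifted by scale * ln(T), and its mode equals
    that shifted location.
    """
    if area_ratio < 1:
        raise ValueError("area_ratio must be at least 1")
    return location + scale * math.log(area_ratio)


# Hypothetical maximum pit depths (mm) from six examined tube sections.
depths_mm = [0.31, 0.42, 0.38, 0.55, 0.47, 0.36]
location, scale = fit_gumbel_moments(depths_mm)
# Extrapolate to a surface 500 times larger than one examined section.
print(most_probable_deepest_pit(location, scale, area_ratio=500))
```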
On Statistical Analysis of Neuroimages with Imperfect Registration
Kim, Won Hwa; Ravi, Sathya N.; Johnson, Sterling C.; Okonkwo, Ozioma C.; Singh, Vikas
2016-01-01
A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on the acquired brain image scans for diagnosis as well as understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic: they either decrease the statistical power or make the follow-up inference tasks less effective/accurate. In this paper, we derive a novel algorithm which offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation invariant representation of the image, the downstream analysis can be made more robust as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially, non-Euclidean wavelets) yield strategies for designing deformation and additive noise invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal. PMID:27042168
Critical analysis of adsorption data statistically
NASA Astrophysics Data System (ADS)
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data was analysed statistically by hypothesis testing applying t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of experiment and (c) study the effect of adsorbent dose in zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ² showed the results in favour of the data collected from the experiment and this has been shown on probability charts. The K value for the Langmuir isotherm was 0.8582 and the m value obtained for the Freundlich adsorption isotherm was 0.725; both are <1, indicating favourable isotherms. Karl Pearson's correlation coefficient values for Langmuir and Freundlich adsorption isotherms were obtained as 0.99 and 0.95 respectively, which show a high degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
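Pearson's correlation coefficient, used above to judge how well the isotherms describe the data, can be computed as in the following Python sketch. The function name `pearson_r` and the concentration and uptake values are hypothetical and were invented for illustration; the example applies the coefficient to the usual log-log linearised form of the Freundlich isotherm, which is an assumption about how the fit was assessed.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical equilibrium data: concentration Ce (mg/L) and uptake qe (mg/g).
ce = [5.0, 10.0, 20.0, 40.0, 80.0]
qe = [1.9, 3.1, 5.2, 8.4, 13.9]

# Linearised Freundlich form: log(qe) is plotted against log(Ce).
log_ce = [math.log10(c) for c in ce]
log_qe = [math.log10(q) for q in qe]
print(pearson_r(log_ce, log_qe))
```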
Electrophoretic analysis of Allium alien addition lines.
Peffley, E B; Corgan, J N; Horak, K E; Tanksley, S D
1985-12-01
Meiotic pairing in an interspecific triploid of Allium cepa and A. fistulosum, 'Delta Giant', exhibits preferential pairing between the two A. cepa genomes, leaving the A. fistulosum genome as univalents. Multivalent pairing involving A. fistulosum chromosomes occurs at a low level, allowing for recombination between the genomes. Ten trisomies were recovered from the backcross of 'Delta Giant' x A. cepa cv., 'Temprana', representing a minimum of four of the eight possible alien addition lines. The alien addition lines possessed different A. fistulosum enzyme markers. Those markers, Adh-1, Idh-1 and Pgm-1 reside on different A. fistulosum chromosomes, whereas Pgi-1 and Idh-1 may be linked. Diploid, trisomic and hyperploid progeny were recovered that exhibited putative pink root resistance. The use of interspecific plants as a means to introgress A. fistulosum genes into A. cepa appears to be successful at both the trisomic and the diploid levels. If introgression can be accomplished using an interspecific triploid such as 'Delta Giant' to generate fertile alien addition lines and subsequent fertile diploids, or if introgression can be accomplished directly at the diploid level, this will have accomplished gene flow that has not been possible at the interspecific diploid level.
GROUNDWATER INFORMATION TRACKING SYSTEM/STATISTICAL ANALYSIS SYSTEM
The Groundwater Information Tracking System with STATistical analysis capability (GRITS/STAT) is a tool designed to facilitate the storage, analysis, and reporting of data collected through groundwater monitoring programs at RCRA, CERCLA, and other regulated facilities an...
NASA Technical Reports Server (NTRS)
Smalheer, C. V.
1973-01-01
The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.
Statistical analysis of Contact Angle Hysteresis
NASA Astrophysics Data System (ADS)
Janardan, Nachiketa; Panchagnula, Mahesh
2015-11-01
We present the results of a new statistical approach to determining Contact Angle Hysteresis (CAH) by studying the nature of the triple line. A statistical distribution of local contact angles on a random three-dimensional drop is used as the basis for this approach. Drops with randomly shaped triple lines but of fixed volumes were deposited on a substrate and their triple line shapes were extracted by imaging. Using a solution developed by Prabhala et al. (Langmuir, 2010), the complete three dimensional shape of the sessile drop was generated. A distribution of the local contact angles for several such drops but of the same liquid-substrate pairs is generated. This distribution is a result of several microscopic advancing and receding processes along the triple line. This distribution is used to yield an approximation of the CAH associated with the substrate. This is then compared with measurements of CAH by means of a liquid infusion-withdrawal experiment. Static measurements are shown to be sufficient to measure quasistatic contact angle hysteresis of a substrate. The approach also points towards the relationship between microscopic triple line contortions and CAH.
Statistical analysis of life history calendar data.
Eerola, Mervi; Helske, Satu
2016-04-01
The life history calendar is a data-collection tool for obtaining reliable retrospective data about life events. To illustrate the analysis of such data, we compare the model-based probabilistic event history analysis and the model-free data mining method, sequence analysis. In event history analysis, instead of transition hazards, we estimate the cumulative prediction probabilities of life events over the entire trajectory. In sequence analysis, we compare several dissimilarity metrics and contrast data-driven and user-defined substitution costs. As an example, we study young adults' transition to adulthood as a sequence of events in three life domains. The events define the multistate event history model and the parallel life domains in multidimensional sequence analysis. The relationship between life trajectories and excess depressive symptoms in middle age is further studied by their joint prediction in the multistate model and by regressing the symptom scores on individual-specific cluster indices. The two approaches complement each other in life course analysis; sequence analysis can effectively find typical and atypical life patterns while event history analysis is needed for causal inquiries.
Statistics over features: EEG signals analysis.
Derya Ubeyli, Elif
2009-08-01
This paper presented the usage of statistics over the set of the features representing the electroencephalogram (EEG) signals. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Multilayer perceptron neural network (MLPNN) architectures were formulated and used as basis for detection of electroencephalographic changes. Three types of EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures) were classified. The selected Lyapunov exponents, wavelet coefficients and the power levels of power spectral density (PSD) values obtained by eigenvector methods of the EEG signals were used as inputs of the MLPNN trained with Levenberg-Marquardt algorithm. The classification results confirmed that the proposed MLPNN has potential in detecting the electroencephalographic changes. PMID:19555931
Statistical analysis of low level atmospheric turbulence
NASA Technical Reports Server (NTRS)
Tieleman, H. W.; Chen, W. W. L.
1974-01-01
The statistical properties of low-level wind-turbulence data were obtained with the model 1080 total vector anemometer and the model 1296 dual split-film anemometer, both manufactured by Thermo Systems Incorporated. The data obtained from the above fast-response probes were compared with the results obtained from a pair of Gill propeller anemometers. The digitized time series representing the three velocity components and the temperature were each divided into a number of blocks, the length of which depended on the lowest frequency of interest and also on the storage capacity of the available computer. A moving-average and differencing high-pass filter was used to remove the trend and the low frequency components in the time series. The calculated results for each of the anemometers used are represented in graphical or tabulated form.
Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak
2016-06-01
Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistical formulas within MS Excel.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Comparative analysis of positive and negative attitudes toward statistics
NASA Astrophysics Data System (ADS)
Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah
2015-02-01
Many statistics lecturers and statistics education researchers want to know how their students' attitudes toward statistics develop during a statistics course. A positive attitude toward statistics is vital in such a course because it encourages students to take an interest in the course and to master the core content of the subject matter under study. Students who have negative attitudes toward statistics, by contrast, may feel depressed, especially in group assignments, are at risk of failure, are often highly emotional, and may be unable to move forward. Therefore, this study investigates students' attitudes toward learning statistics. Six latent constructs were used to measure students' attitudes toward learning statistics: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from a reliable and validated instrument, the Survey of Attitudes Toward Statistics (SATS). The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students from different faculties who were taking the applied statistics course. The analysis found that the questionnaire is acceptable, and the relationships among the constructs were proposed and investigated. In this case, students show full effort to master the statistics course, find the statistics course enjoyable, are confident in their intellectual capacity, and have more positive than negative attitudes toward learning statistics. In conclusion, positive attitudes toward statistics were mostly exhibited in the affect, cognitive competence, value, interest, and effort constructs, while negative attitudes were mostly exhibited in the difficulty construct.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.
Statistical analysis of the 'Almagest' star catalog
NASA Astrophysics Data System (ADS)
Kalashnikov, V. V.; Nosovskii, G. V.; Fomenko, A. T.
The star catalog contained in the 'Almagest', Ptolemy's classical work of astronomy, is examined. An analysis method is proposed which allows the identification of various types of errors committed by the observer. This method not only removes many of the contradictions contained in the catalog but also makes it possible to determine the time period during which the catalog was compiled.
Improved Statistics for Genome-Wide Interaction Analysis
Ueki, Masao; Cordell, Heather J.
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al
Importance of data management with statistical analysis set division.
Wang, Ling; Li, Chan-juan; Jiang, Zhi-wei; Xia, Jie-lai
2015-11-01
Hypothesis testing is affected by the division of statistical analysis sets, which is an important data management task before database lock. Objective division of the statistical analysis sets under blinding guarantees a scientific trial conclusion. All subjects who received at least one trial treatment after randomization should be included in the safety set. The full analysis set should be as close as possible to the intention-to-treat principle. Division of the per-protocol set is the most difficult to control in blinded review because it involves more subjectivity than the other two. The objectivity of statistical analysis set division must be guaranteed by accurate raw data, comprehensive data checks, and scientific discussion, all of which are strict requirements of data management. Dividing the statistical analysis sets objectively and scientifically is an important way to improve data management quality. PMID:26911044
Internet Data Analysis for the Undergraduate Statistics Curriculum
ERIC Educational Resources Information Center
Sanchez, Juana; He, Yan
2005-01-01
Statistics textbooks for undergraduates have not caught up with the enormous amount of analysis of Internet data that is taking place these days. Case studies that use Web server log data or Internet network traffic data are rare in undergraduate Statistics education. And yet these data provide numerous examples of skewed and bimodal…
Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data
ERIC Educational Resources Information Center
Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val
2006-01-01
Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method: Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…
A Realistic Experimental Design and Statistical Analysis Project
ERIC Educational Resources Information Center
Muske, Kenneth R.; Myers, John A.
2007-01-01
A realistic applied chemical engineering experimental design and statistical analysis project is documented in this article. This project has been implemented as part of the professional development and applied statistics courses at Villanova University over the past five years. The novel aspects of this project are that the students are given a…
Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results
NASA Technical Reports Server (NTRS)
Meuer, Hans-Werner; Simon, Horst D.; Strohmeier, Erich; Lasinski, T. A. (Technical Monitor)
1994-01-01
In the last three years extensive performance data have been reported for parallel machines both based on the NAS Parallel Benchmarks, and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS parallel benchmarks, we have also included peak performance of the machine, and the LINPACK n and n(sub 1/2) values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP each have a unique signature. 3) The remaining NPB can be grouped into three groups as follows: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize the overall NPB performance. Our poster presentation will follow a standard poster format, and will present the data of our statistical analysis in detail.
System statistical reliability model and analysis
NASA Technical Reports Server (NTRS)
Lekach, V. S.; Rood, H.
1973-01-01
A digital computer code was developed to simulate the time-dependent behavior of the 5-kwe reactor thermoelectric system. The code was used to determine lifetime sensitivity coefficients for a number of system design parameters, such as thermoelectric module efficiency and degradation rate, radiator absorptivity and emissivity, fuel element barrier defect constant, beginning-of-life reactivity, etc. A probability distribution (mean and standard deviation) was estimated for each of these design parameters. Then, error analysis was used to obtain a probability distribution for the system lifetime (mean = 7.7 years, standard deviation = 1.1 years). From this, the probability that the system will achieve the design goal of 5 years lifetime is 0.993. This value represents an estimate of the degradation reliability of the system.
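The reliability figure quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the system lifetime is normally distributed (the abstract gives only a mean and standard deviation, so the normal form is an assumption); the function names are illustrative and not taken from the report.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_lifetime_exceeds(goal_years: float, mean_years: float, sd_years: float) -> float:
    """P(lifetime > goal) for a normally distributed lifetime (assumed form)."""
    z = (goal_years - mean_years) / sd_years
    return 1.0 - normal_cdf(z)


# Values quoted in the abstract: mean 7.7 years, standard deviation 1.1 years,
# design goal 5 years.
p = prob_lifetime_exceeds(5.0, 7.7, 1.1)
print(round(p, 3))  # 0.993
```

Under the normality assumption the design goal lies about 2.45 standard deviations below the mean, which reproduces the stated probability of 0.993.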
Applications of statistics to medical science, IV survival analysis.
Watanabe, Hiroshi
2012-01-01
The fundamental principles of survival analysis are reviewed. In particular, the Kaplan-Meier method and a proportional hazard model are discussed. This work is the last part of a series in which medical statistics are surveyed.
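As an illustration of the Kaplan-Meier method named above, the following is a minimal sketch of the product-limit estimator. The function name and the toy data are hypothetical and are not drawn from the article; censoring is encoded as an event flag of 0.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed follow-up times
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical data: the subject observed until time 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

A censored subject contributes to the risk set up to its censoring time but never lowers the curve, which is the feature that distinguishes this estimator from a naive proportion of survivors.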
Statistical models and NMR analysis of polymer microstructure
Technology Transfer Automated Retrieval System (TEKTRAN)
Statistical models can be used in conjunction with NMR spectroscopy to study polymer microstructure and polymerization mechanisms. Thus, Bernoullian, Markovian, and enantiomorphic-site models are well known. Many additional models have been formulated over the years for additional situations. Typica...
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
ERIC Educational Resources Information Center
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Statistical analysis of litter experiments in teratology
Williams, R.; Buschbom, R.L.
1982-11-01
Teratological data is binary response data (each fetus is either affected or not) in which the responses within a litter are usually not independent. As a result, the litter should be taken as the experimental unit. For each litter, its size, n, and the number of fetuses, x, possessing the effect of interest are recorded. The ratio p = x/n is then the basic data generated by the experiment. There are currently three general approaches to the analysis of teratological data: nonparametric, transformation followed by t-test or ANOVA, and parametric. The first two are currently in wide use by practitioners while the third is relatively new to the field. These first two also appear to possess comparable power levels while maintaining the nominal level of significance. When transformations are employed, care must be exercised to check that the transformed data has the required properties. Since the data is often highly asymmetric, there may be no transformation which renders the data nearly normal. The parametric procedures, including the beta-binomial model, offer the possibility of increased power.
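The litter-as-unit idea and the "transformation followed by t-test" approach can be sketched as follows. This is an illustration under stated assumptions: the abstract does not say which transformation is used, so the arcsine square-root transform is an assumed choice, Welch's form of the t statistic is likewise an assumption, and the litter counts are hypothetical.

```python
import math
import statistics


def litter_proportion(affected: int, litter_size: int) -> float:
    """Per-litter response p = x / n, with the litter as experimental unit."""
    if litter_size <= 0 or not 0 <= affected <= litter_size:
        raise ValueError("need 0 <= affected <= litter_size and litter_size > 0")
    return affected / litter_size


def arcsine_sqrt(p: float) -> float:
    """Arcsine square-root transform (an assumed choice of transformation)."""
    return math.asin(math.sqrt(p))


def welch_t(group_a, group_b) -> float:
    """Welch's two-sample t statistic (unequal variances)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    if se == 0.0:
        raise ValueError("both groups have zero variance")
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


# Hypothetical litters as (affected fetuses x, litter size n).
control = [(1, 10), (0, 12), (2, 9), (1, 11)]
treated = [(4, 10), (3, 8), (6, 12), (2, 9)]

t_stat = welch_t(
    [arcsine_sqrt(litter_proportion(x, n)) for x, n in treated],
    [arcsine_sqrt(litter_proportion(x, n)) for x, n in control],
)
print(t_stat)
```

Each litter contributes one transformed value, so the test's sample size is the number of litters rather than the number of fetuses. As the abstract cautions, the transformed values should still be checked for approximate normality before the statistic is interpreted.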
Chen, Zhe; Ohara, Shinji; Cao, Jianting; Vialatte, François; Lenz, Fred A.; Cichocki, Andrzej
2007-01-01
This article is devoted to statistical modeling and analysis of electrocorticogram (ECoG) signals induced by painful cutaneous laser stimuli, which were recorded from implanted electrodes in awake humans. Specifically, with statistical tools of factor analysis and independent component analysis, the pain-induced laser-evoked potentials (LEPs) were extracted and investigated under different controlled conditions. With the help of wavelet analysis, quantitative and qualitative analyses were conducted regarding the LEPs' attributes of power, amplitude, and latency, in both averaging and single-trial experiments. Statistical hypothesis tests were also applied in various experimental setups. Experimental results reported herein also confirm previous findings in the neurophysiology literature. In addition, single-trial analysis has also revealed many new observations that might be interesting to the neuroscientists or clinical neurophysiologists. These promising results show convincing validation that advanced signal processing and statistical analysis may open new avenues for future studies of such ECoG or other relevant biomedical recordings. PMID:18369410
Statistical Signal Analysis for Systems with Interferenced Inputs
NASA Technical Reports Server (NTRS)
Bai, R. M.; Mielnicka-Pate, A. L.
1985-01-01
A new approach is introduced, based on statistical signal analysis, which overcomes the error due to input signal interference. The model analyzed is given. The input signals u sub 1 (t) and u sub 2 (t) are assumed to be unknown. The measurable signals x sub 1 (t) and x sub 2 (t) are subject to mutual interference according to the frequency response functions, H sub 12 (f) and H sub 21 (f). The goal of the analysis was to evaluate the power output due to each input, u sub 1 (t) and u sub 2 (t), for the case where both are applied at the same time. In addition, all frequency response functions are calculated. The interferenced system is described by a set of five equations with six unknown functions. An IBM XT Personal Computer, which was interfaced with the FFT, was used to solve the set of equations. The software was tested on an electrical two-input, one-output system. The results were excellent. The research presented includes the analysis of the acoustic radiation from a rectangular plate with two force inputs and the sound pressure as an output signal.
Basic statistical tools in research and data analysis
Ali, Zulfiqar; Bhaskar, S Bala
2016-01-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis. PMID:27729694
Spectral signature verification using statistical analysis and text mining
NASA Astrophysics Data System (ADS)
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment: the textual meta-data and the numerical spectral data. Results associated with the spectral data stored in the Signature Database (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security (text encryption/decryption), biomedical, and marketing applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
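The spectral angle mapper (SAM) comparison mentioned above treats two spectra as vectors and measures the angle between them. The following is a minimal sketch, assuming spectra are sampled on a common set of bands and stored as plain lists; the function names and the toy spectra are hypothetical and do not reflect the SigDB implementation.

```python
import math


def mean_spectrum(population):
    """Band-wise mean of a list of equal-length spectra."""
    n = len(population)
    return [sum(band) / n for band in zip(*population)]


def spectral_angle(test, reference) -> float:
    """Angle in radians between two spectra treated as vectors."""
    if len(test) != len(reference):
        raise ValueError("spectra must share the same bands")
    dot = sum(a * b for a, b in zip(test, reference))
    norm_t = math.sqrt(sum(a * a for a in test))
    norm_r = math.sqrt(sum(b * b for b in reference))
    if norm_t == 0.0 or norm_r == 0.0:
        raise ValueError("spectra must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_t * norm_r)))
    return math.acos(cosine)


# Hypothetical population set and test spectrum (three bands each).
population = [[0.10, 0.40, 0.80], [0.12, 0.38, 0.82], [0.09, 0.41, 0.79]]
reference = mean_spectrum(population)
print(spectral_angle([0.11, 0.39, 0.81], reference))
```

Because the angle ignores vector length, a spectrum that differs from the mean only by an overall scale factor scores an angle near zero. Smaller angles would rank a test spectrum as closer to the population mean.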
Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.
2014-01-01
Parametric and nonparametric methods have been developed for purposes of predicting phenotypes. These methods are based on retrospective analyses of empirical data consisting of genotypic and phenotypic scores. Recent reports have indicated that parametric methods are unable to predict phenotypes of traits with known epistatic genetic architectures. Herein, we review parametric methods including least squares regression, ridge regression, Bayesian ridge regression, least absolute shrinkage and selection operator (LASSO), Bayesian LASSO, best linear unbiased prediction (BLUP), Bayes A, Bayes B, Bayes C, and Bayes Cπ. We also review nonparametric methods including Nadaraya-Watson estimator, reproducing kernel Hilbert space, support vector machine regression, and neural networks. We assess the relative merits of these 14 methods in terms of accuracy and mean squared error (MSE) using simulated genetic architectures consisting of completely additive or two-way epistatic interactions in an F2 population derived from crosses of inbred lines. Each simulated genetic architecture explained either 30% or 70% of the phenotypic variability. The greatest impact on estimates of accuracy and MSE was due to genetic architecture. Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis. Parametric methods were slightly better than nonparametric methods for additive genetic architectures. Distinctions among parametric methods for additive genetic architectures were incremental. Heritability, i.e., proportion of phenotypic variability, had the second greatest impact on estimates of accuracy and MSE. PMID:24727289
A Divergence Statistics Extension to VTK for Performance Analysis.
Pebay, Philippe Pierre; Bennett, Janine Camille
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
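The idea of a divergence statistic, a distance-like measure between an observed empirical distribution and a theoretical one, can be sketched in a few lines. This is an illustration only: it uses the Kullback-Leibler divergence as one assumed example of such a statistic, it is written in Python rather than the report's C++, and it does not use or represent the VTK API. All names and data are hypothetical.

```python
import math
from collections import Counter


def empirical_distribution(samples):
    """Relative frequency of each observed value."""
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in counts.items()}


def kl_divergence(empirical, theoretical) -> float:
    """Kullback-Leibler divergence D(empirical || theoretical), in nats.

    Returns infinity if the empirical distribution puts mass on a value
    to which the theoretical one assigns zero probability.
    """
    divergence = 0.0
    for value, p in empirical.items():
        q = theoretical.get(value, 0.0)
        if q == 0.0:
            return math.inf
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical observations compared against a uniform "ideal" distribution.
observed = empirical_distribution([0, 0, 0, 1])
ideal = {0: 0.5, 1: 0.5}
print(kl_divergence(observed, ideal))
```

The result is zero when the empirical frequencies match the theoretical probabilities exactly and grows as they diverge. Unlike a true distance it is not symmetric in its two arguments.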
Analysis of Coastal Dunes: A Remote Sensing and Statistical Approach.
ERIC Educational Resources Information Center
Jones, J. Richard
1985-01-01
Remote sensing analysis and statistical methods were used to analyze the coastal dunes of Plum Island, Massachusetts. The research methodology used provides an example of a student project for remote sensing, geomorphology, or spatial analysis courses at the university level. (RM)
Saisubramanian, N; Edwinoliver, N G; Nandakumar, N; Kamini, N R; Puvanakrishnan, R
2006-08-01
The efficacy of lipase from Aspergillus niger MTCC 2594 as an additive in laundry detergent formulations was assessed using response surface methodology (RSM). A five-level four-factorial central composite design was chosen to explain the washing protocol with four critical factors, viz. detergent concentration, lipase concentration, buffer pH and washing temperature. The model suggested that all the factors chosen had a significant impact on oil removal and the optimal conditions for the removal of olive oil from cotton fabric were 1.0% detergent, 75 U of lipase, buffer pH of 9.5 and washing temperature of 25 degrees C. Under optimal conditions, the removal of olive oil from cotton fabric was 33 and 17.1% at 25 and 49 degrees C, respectively, in the presence of lipase over treatment with detergent alone. Hence, lipase from A. niger could be effectively used as an additive in detergent formulation for the removal of triglyceride soil both in cold and warm wash conditions.
Additive interaction in survival analysis: use of the additive hazards model.
Rod, Naja Hulvej; Lange, Theis; Andersen, Ingelise; Marott, Jacob Louis; Diderichsen, Finn
2012-09-01
It is a widely held belief in public health and clinical decision-making that interventions or preventive strategies should be aimed at patients or population subgroups where most cases could potentially be prevented. To identify such subgroups, deviation from additivity of absolute effects is the relevant measure of interest. Multiplicative survival models, such as the Cox proportional hazards model, are often used to estimate the association between exposure and risk of disease in prospective studies. In Cox models, deviations from additivity have usually been assessed by surrogate measures of additive interaction derived from multiplicative models, an approach that is both counter-intuitive and sometimes invalid. This paper presents a straightforward and intuitive way of assessing deviation from additivity of effects in survival analysis by use of the additive hazards model. The model directly estimates the absolute size of the deviation from additivity and provides confidence intervals. In addition, the model can accommodate both continuous and categorical exposures and models both exposures and potential confounders on the same underlying scale. To illustrate the approach, we present an empirical example of interaction between education and smoking on risk of lung cancer. We argue that deviations from additivity of effects are important for public health interventions and clinical decision-making, and such estimations should be encouraged in prospective studies on health. A detailed implementation guide of the additive hazards model is provided in the appendix.
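Deviation from additivity of absolute effects can be illustrated with crude event rates in the four combinations of two binary exposures. The sketch below is a simplified stand-in and not the additive hazards regression described in the paper: it assumes constant hazards within each stratum, ignores confounders and confidence intervals, and uses hypothetical counts.

```python
def rate(events: int, person_years: float) -> float:
    """Crude event rate, assuming a constant hazard within the stratum."""
    return events / person_years


def interaction_contrast(r00: float, r10: float, r01: float, r11: float) -> float:
    """Deviation from additivity on the absolute (rate) scale.

    r00 : rate with neither exposure
    r10 : rate with the first exposure only
    r01 : rate with the second exposure only
    r11 : rate with both exposures
    Zero means the joint effect equals the sum of the separate effects.
    """
    return (r11 - r00) - ((r10 - r00) + (r01 - r00))


# Hypothetical counts: (events, person-years) per stratum.
r00 = rate(10, 10000.0)
r10 = rate(30, 10000.0)
r01 = rate(20, 10000.0)
r11 = rate(70, 10000.0)

excess = interaction_contrast(r00, r10, r01, r11)
print(excess * 10000)  # excess cases per 10,000 person-years
```

A positive contrast indicates that the doubly exposed group has more cases than the two separate effects would predict, which is the quantity of interest when targeting interventions at subgroups.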
Statistical inference in behavior analysis: Friend or foe?
Baron, Alan
1999-01-01
Behavior analysts are undecided about the proper role to be played by inferential statistics in behavioral research. The traditional view, as expressed in Sidman's Tactics of Scientific Research (1960), was that inferential statistics has no place within a science that focuses on the steady-state behavior of individual organisms. Despite this admonition, there have been steady inroads of statistical techniques into behavior analysis since then, as evidenced by publications in the Journal of the Experimental Analysis of Behavior. The issues raised by these developments were considered at a panel held at the 24th annual convention of the Association for Behavior Analysis, Orlando, Florida (May, 1998). The proceedings are reported in this and the following articles. PMID:22478323
Statistical inference in behavior analysis: Experimental control is better
Perone, Michael
1999-01-01
Statistical inference promises automatic, objective, reliable assessments of data, independent of the skills or biases of the investigator, whereas the single-subject methods favored by behavior analysts often are said to rely too much on the investigator's subjective impressions, particularly in the visual analysis of data. In fact, conventional statistical methods are difficult to apply correctly, even by experts, and the underlying logic of null-hypothesis testing has drawn criticism since its inception. By comparison, single-subject methods foster direct, continuous interaction between investigator and subject and development of strong forms of experimental control that obviate the need for statistical inference. Treatment effects are demonstrated in experimental designs that incorporate replication within and between subjects, and the visual analysis of data is adequate when integrated into such designs. Thus, single-subject methods are ideal for shaping—and maintaining—the kind of experimental practices that will ensure the continued success of behavior analysis. PMID:22478328
Cost-effectiveness analysis: a proposal of new reporting standards in statistical analysis.
Bang, Heejung; Zhao, Hongwei
2014-01-01
Cost-effectiveness analysis (CEA) is a method for evaluating the outcomes and costs of competing strategies designed to improve health, and has been applied to a variety of different scientific fields. Yet there are inherent complexities in cost estimation and CEA from statistical perspectives (e.g., skewness, bidimensionality, and censoring). The incremental cost-effectiveness ratio that represents the additional cost per unit of outcome gained by a new strategy has served as the most widely accepted methodology in the CEA. In this article, we call for expanded perspectives and reporting standards reflecting a more comprehensive analysis that can elucidate different aspects of available data. Specifically, we propose that mean- and median-based incremental cost-effectiveness ratios and average cost-effectiveness ratios be reported together, along with relevant summary and inferential statistics, as complementary measures for informed decision making.
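The ratios named above can be sketched in a few lines. This is a minimal sketch under assumptions: the abstract does not give formulas, so the definitions below (difference in central cost divided by difference in central effect, and central cost divided by central effect) are the usual textbook forms, and the function names and patient-level numbers are hypothetical. Censoring, which the abstract lists as a complication, is not handled.

```python
from statistics import mean, median


def icer(cost_new, effect_new, cost_old, effect_old, center=mean):
    """Incremental cost-effectiveness ratio using a chosen central measure."""
    d_cost = center(cost_new) - center(cost_old)
    d_effect = center(effect_new) - center(effect_old)
    if d_effect == 0:
        raise ZeroDivisionError("no difference in effect; ICER is undefined")
    return d_cost / d_effect


def acer(cost, effect, center=mean):
    """Average cost-effectiveness ratio for a single strategy."""
    return center(cost) / center(effect)


# Hypothetical per-patient costs and effects; costs are right-skewed on purpose.
cost_old, effect_old = [1000, 1200, 900, 5000], [1.0, 1.2, 0.8, 1.0]
cost_new, effect_new = [2000, 2200, 1800, 9000], [1.5, 1.7, 1.3, 1.5]

print("mean-based ICER:  ", icer(cost_new, effect_new, cost_old, effect_old, mean))
print("median-based ICER:", icer(cost_new, effect_new, cost_old, effect_old, median))
print("ACER (old, new):  ", acer(cost_old, effect_old), acer(cost_new, effect_new))
```

With skewed costs the mean- and median-based ratios can differ substantially, which is one reason for reporting them side by side.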
Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques
NASA Technical Reports Server (NTRS)
Kuan, Gary M
2008-01-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arc second resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss with each leg being monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so an optical power margin can be book kept. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
Plutonium metal exchange program : current status and statistical analysis
Tandon, L.; Eglin, J. L.; Michalak, S. E.; Picard, R. R.; Temer, D. J.
2004-01-01
The Rocky Flats Plutonium (Pu) Metal Sample Exchange program was conducted to ensure the quality and intercomparability of measurements such as Pu assay, Pu isotopics, and impurity analyses. The Rocky Flats program was discontinued in 1989 after more than 30 years. In 2001, Los Alamos National Laboratory (LANL) reestablished the Pu Metal Exchange program. In addition to the Atomic Weapons Establishment (AWE) at Aldermaston, six Department of Energy (DOE) facilities (Argonne East, Argonne West, Livermore, Los Alamos, New Brunswick Laboratory, and Savannah River) are currently participating in the program. Plutonium metal samples are prepared and distributed to the sites for destructive measurements to determine elemental concentration, isotopic abundance, and both metallic and nonmetallic impurity levels. The program provides independent verification of analytical measurement capabilities for each participating facility and allows problems in analytical methods to be identified. The current status of the program will be discussed with emphasis on the unique statistical analysis and modeling of the data developed for the program. The discussion includes the definition of the consensus values for each analyte (in the presence and absence of anomalous values and/or censored values), and interesting features of the data and the results.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-10-02
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
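For readers unfamiliar with the metric, a minimal sketch of a normalized root mean squared error follows. The abstract does not say how the RMSE was normalized; dividing by plant capacity (51 kW here) is an assumption made for illustration, and the forecast and observed values are hypothetical.

```python
import math


def nrmse(forecast, observed, norm):
    """Root mean squared error divided by a normalizing constant.

    `norm` is an assumption of this sketch: plant capacity is used below, but
    the observed range or mean are also common choices.
    """
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("forecast and observed must be non-empty and equal length")
    mse = sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast)
    return math.sqrt(mse) / norm


# Hypothetical hourly power values in kW for a 51-kW plant.
forecast_kw = [0.0, 12.0, 30.0, 41.0, 25.0]
observed_kw = [0.0, 10.0, 33.0, 38.0, 27.0]
print(f"NRMSE = {nrmse(forecast_kw, observed_kw, norm=51.0):.4f}")
```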
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
Data analysis using the Gnu R system for statistical computation
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
Spectral signature verification using statistical analysis and text mining
NASA Astrophysics Data System (ADS)
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment: the textual meta-data and the numerical spectral data. Results associated with the spectral data stored in the Signature Database (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security (text encryption/decryption), biomedical, and marketing applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
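The spectral angle mapper comparison mentioned above can be sketched as follows. This is an illustrative implementation of the standard SAM angle (the arccosine of the normalized dot product), not the authors' code; the function names and the three-band spectra are hypothetical.

```python
import math


def mean_spectrum(population):
    """Band-wise mean of a list of equally sampled spectra."""
    n = len(population)
    return [sum(band) / n for band in zip(*population)]


def spectral_angle(test, reference):
    """Angle in radians between two spectra treated as vectors."""
    if len(test) != len(reference):
        raise ValueError("spectra must be sampled on the same bands")
    dot = sum(t * r for t, r in zip(test, reference))
    norm_t = math.sqrt(sum(t * t for t in test))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_t == 0 or norm_r == 0:
        raise ValueError("a zero spectrum has no defined angle")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_t * norm_r)))
    return math.acos(cosine)


# Hypothetical reflectance values on three bands.
population = [[0.20, 0.40, 0.60], [0.22, 0.38, 0.62], [0.18, 0.42, 0.58]]
reference = mean_spectrum(population)
print(spectral_angle([0.21, 0.41, 0.59], reference))
```

Because the angle ignores overall scale, two spectra that differ only in brightness score as identical; a smaller angle indicates a closer match to the population mean.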
A κ-generalized statistical mechanics approach to income analysis
NASA Astrophysics Data System (ADS)
Clementi, F.; Gallegati, M.; Kaniadakis, G.
2009-02-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
A novel statistic for genome-wide interaction analysis.
Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao
2010-09-23
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Meta-analysis for Discovering Rare-Variant Associations: Statistical Methods and Software Programs
Tang, Zheng-Zheng; Lin, Dan-Yu
2015-01-01
There is heightened interest in using next-generation sequencing technologies to identify rare variants that influence complex human diseases and traits. Meta-analysis is essential to this endeavor because large sample sizes are required for detecting associations with rare variants. In this article, we provide a comprehensive overview of statistical methods for meta-analysis of sequencing studies for discovering rare-variant associations. Specifically, we discuss the calculation of relevant summary statistics from participating studies, the construction of gene-level association tests, the choice of transformation for quantitative traits, the use of fixed-effects versus random-effects models, and the removal of shadow association signals through conditional analysis. We also show that meta-analysis based on properly calculated summary statistics is as powerful as joint analysis of individual-participant data. In addition, we demonstrate the performance of different meta-analysis methods by using both simulated and empirical data. We then compare four major software packages for meta-analysis of rare-variant associations—MASS, RAREMETAL, MetaSKAT, and seqMeta—in terms of the underlying statistical methodology, analysis pipeline, and software interface. Finally, we present PreMeta, a software interface that integrates the four meta-analysis packages and allows a consortium to combine otherwise incompatible summary statistics. PMID:26094574
Meta-analysis for Discovering Rare-Variant Associations: Statistical Methods and Software Programs.
Tang, Zheng-Zheng; Lin, Dan-Yu
2015-07-01
There is heightened interest in using next-generation sequencing technologies to identify rare variants that influence complex human diseases and traits. Meta-analysis is essential to this endeavor because large sample sizes are required for detecting associations with rare variants. In this article, we provide a comprehensive overview of statistical methods for meta-analysis of sequencing studies for discovering rare-variant associations. Specifically, we discuss the calculation of relevant summary statistics from participating studies, the construction of gene-level association tests, the choice of transformation for quantitative traits, the use of fixed-effects versus random-effects models, and the removal of shadow association signals through conditional analysis. We also show that meta-analysis based on properly calculated summary statistics is as powerful as joint analysis of individual-participant data. In addition, we demonstrate the performance of different meta-analysis methods by using both simulated and empirical data. We then compare four major software packages for meta-analysis of rare-variant associations (MASS, RAREMETAL, MetaSKAT, and seqMeta) in terms of the underlying statistical methodology, analysis pipeline, and software interface. Finally, we present PreMeta, a software interface that integrates the four meta-analysis packages and allows a consortium to combine otherwise incompatible summary statistics.
Statistical Analysis of Tsunamis of the Italian Coasts
Caputo, M.; Faita, G.F.
1982-01-20
A study of a catalog of 138 tsunamis of the Italian coasts has been made. Intensities have been assigned to 106 of the tsunamis and cataloged. The statistical analysis of these data fits a density distribution of the form log n = 3.00 - 0.425 I, where n is the number of tsunamis of intensity I per thousand years.
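The fitted relation can be evaluated directly. The sketch below assumes that "log" denotes the base-10 logarithm, which the abstract does not state, and the function name is hypothetical. It makes no claim about the range of intensities over which the fit is valid.

```python
def tsunamis_per_thousand_years(intensity):
    """Evaluate log n = 3.00 - 0.425 I, assuming a base-10 logarithm."""
    return 10 ** (3.00 - 0.425 * intensity)


# Each unit increase in intensity multiplies the expected count by 10**-0.425.
for intensity in range(1, 7):
    print(intensity, round(tsunamis_per_thousand_years(intensity), 1))
```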
Introduction to Statistics and Data Analysis With Computer Applications I.
ERIC Educational Resources Information Center
Morris, Carl; Rolph, John
This document consists of unrevised lecture notes for the first half of a 20-week in-house graduate course at Rand Corporation. The chapter headings are: (1) Histograms and descriptive statistics; (2) Measures of dispersion, distance and goodness of fit; (3) Using JOSS for data analysis; (4) Binomial distribution and normal approximation; (5)…
Feasibility of voxel-based statistical analysis method for myocardial PET
NASA Astrophysics Data System (ADS)
Ram Yu, A.; Kim, Jin Su; Paik, Chang H.; Kim, Kyeong Min; Moo Lim, Sang
2014-09-01
Although statistical parametric mapping (SPM) analysis is widely used in neuroimaging studies, to the best of our knowledge it has not been applied to myocardial PET data analysis. In this study, we developed a voxel-based statistical analysis method for myocardial PET which provides statistical comparison results between groups in image space. PET emission data of normal and myocardial infarction rats were acquired. For the SPM analysis, a rat heart template was created. In addition, individual PET data were spatially normalized and smoothed. Two-sample t-tests were performed to identify the myocardial infarct region. The developed SPM method was compared with conventional ROI methods. Myocardial glucose metabolism was decreased in the lateral wall of the left ventricle. In the ROI analysis, the mean value of the lateral wall was decreased by 29%. The newly developed SPM method for myocardial PET could provide quantitative information in myocardial PET studies.
Investigation of Weibull statistics in fracture analysis of cast aluminum
NASA Technical Reports Server (NTRS)
Holland, Frederic A., Jr.; Zaretsky, Erwin V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodology based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
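A two-parameter Weibull fit of fracture strengths can be sketched as below. The abstract does not say how the parameters were estimated; this sketch assumes median-rank regression with Benard's approximation, a common textbook choice, and the function names and strength values are hypothetical.

```python
import math


def weibull_failure_probability(stress, scale, shape):
    """Two-parameter Weibull CDF: 1 - exp(-(stress / scale) ** shape)."""
    return 1.0 - math.exp(-((stress / scale) ** shape))


def fit_weibull_median_rank(strengths):
    """Estimate (shape, scale) by least squares on the linearized CDF.

    Uses ln(-ln(1 - F)) = shape * ln(stress) - shape * ln(scale), with
    F_i = (i - 0.3) / (n + 0.4) assigned to the i-th smallest strength.
    """
    xs = sorted(strengths)
    n = len(xs)
    if n < 2 or xs[0] == xs[-1]:
        raise ValueError("need at least two distinct strengths")
    lx = [math.log(s) for s in xs]
    ly = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    shape = sxy / sxx                      # Weibull modulus (slope)
    intercept = my - shape * mx            # equals -shape * ln(scale)
    scale = math.exp(-intercept / shape)   # characteristic strength
    return shape, scale


# Hypothetical fracture strengths in MPa.
shape, scale = fit_weibull_median_rank([280.0, 295.0, 301.0, 310.0, 322.0, 330.0])
print(f"shape = {shape:.2f}, scale = {scale:.1f} MPa")
print(f"P(failure at 290 MPa) = {weibull_failure_probability(290.0, scale, shape):.3f}")
```

In a design setting, the fitted curve is typically inverted to find the stress that keeps the failure probability below a chosen target.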
Statistical Software for spatial analysis of stratigraphic data sets
2003-04-08
Stratistics is a tool for statistical analysis of spatially explicit data sets and model output for description and for model-data comparisons. It is intended for the analysis of data sets commonly used in geology, such as gamma ray logs and lithologic sequences, as well as 2-D data such as maps. Stratistics incorporates a far wider range of spatial analysis methods, drawn from multiple disciplines, than are currently available in other packages. These include incorporation of techniques from spatial and landscape ecology, fractal analysis, and mathematical geology. Its use should substantially reduce the risk associated with the use of predictive models.
Systematic misregistration and the statistical analysis of surface data.
Gee, Andrew H; Treece, Graham M
2014-02-01
Spatial normalisation is a key element of statistical parametric mapping and related techniques for analysing cohort statistics on voxel arrays and surfaces. The normalisation process involves aligning each individual specimen to a template using some sort of registration algorithm. Any misregistration will result in data being mapped onto the template at the wrong location. At best, this will introduce spatial imprecision into the subsequent statistical analysis. At worst, when the misregistration varies systematically with a covariate of interest, it may lead to false statistical inference. Since misregistration generally depends on the specimen's shape, we investigate here the effect of allowing for shape as a confound in the statistical analysis, with shape represented by the dominant modes of variation observed in the cohort. In a series of experiments on synthetic surface data, we demonstrate how allowing for shape can reveal true effects that were previously masked by systematic misregistration, and also guard against misinterpreting systematic misregistration as a true effect. We introduce some heuristics for disentangling misregistration effects from true effects, and demonstrate the approach's practical utility in a case study of the cortical bone distribution in 268 human femurs.
Hydrogeochemical characteristics of groundwater in Latvia using multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Retike, Inga; Kalvans, Andis; Bikse, Janis; Popovs, Konrads; Babre, Alise
2015-04-01
product between the two previously named clusters. Groundwater in clusters 2, 6 and 7 is considered to be a result of carbonate weathering with some addition of sea salts or gypsum dissolution. In conclusion, the highest or lowest concentrations of some trace elements in groundwater were found to be strongly associated with certain clusters. For example, Cluster 9 represents gypsum dissolution and has the highest concentrations of F, Sr, Rb, Li and the lowest levels of Ba. It can also be concluded that multivariate statistical analysis of major components can be used as an exploratory and predictive tool to identify groundwater objects with a high possibility of elevated or reduced concentrations of harmful or essential trace elements. The research is supported by the European Union through the ESF Mobilitas grant No MJD309 and the European Regional Development Fund project Nr.2013/0054/2DP/2.1.1.1.0/13/APIA/VIAA/007 and NRP project EVIDENnT project "Groundwater and climate scenarios" subproject "Groundwater Research".
Statistical analysis of flight times for space shuttle ferry flights
NASA Technical Reports Server (NTRS)
Graves, M. E.; Perlmutter, M.
1974-01-01
Markov chain and Monte Carlo analysis techniques are applied to the simulated Space Shuttle Orbiter Ferry flights to obtain statistical distributions of flight time duration between Edwards Air Force Base and Kennedy Space Center. The two methods are compared, and are found to be in excellent agreement. The flights are subjected to certain operational and meteorological requirements, or constraints, which cause eastbound and westbound trips to yield different results. Persistence of events theory is applied to the occurrence of inclement conditions to find their effect upon the statistical flight time distribution. In a sensitivity test, some of the constraints are varied to observe the corresponding changes in the results.
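The comparison of an analytic Markov chain result with a Monte Carlo estimate can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: a two-state (good or bad) daily weather chain, one leg flown on each good day, and invented transition probabilities. It is not the report's model and ignores its operational constraints.

```python
import random


def expected_days_markov(legs, p_good_after_good, p_good_after_bad, start_good=True):
    """Exact expected number of days to fly `legs` legs, one per good day."""
    wait_from_bad = 1.0 / p_good_after_bad
    # After a good day: fly tomorrow if it is good again, else wait out bad weather.
    wait_from_good = 1.0 + (1.0 - p_good_after_good) * wait_from_bad
    first_leg = wait_from_good if start_good else wait_from_bad
    return first_leg + (legs - 1) * wait_from_good


def simulate_days(legs, p_good_after_good, p_good_after_bad,
                  start_good=True, trials=20000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    total_days = 0
    for _ in range(trials):
        good, remaining, days = start_good, legs, 0
        while remaining > 0:
            days += 1
            p_good = p_good_after_good if good else p_good_after_bad
            good = rng.random() < p_good   # today's weather depends on yesterday's
            if good:
                remaining -= 1             # a leg is flown on every good day
        total_days += days
    return total_days / trials


exact = expected_days_markov(legs=4, p_good_after_good=0.8, p_good_after_bad=0.4)
estimate = simulate_days(legs=4, p_good_after_good=0.8, p_good_after_bad=0.4)
print(f"Markov chain: {exact:.3f} days, Monte Carlo: {estimate:.3f} days")
```

Making bad weather more persistent (a lower probability of a good day after a bad one) lengthens the expected trip in both calculations, which is the kind of effect the persistence analysis above addresses.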
[Some basic aspects in statistical analysis of visual acuity data].
Ren, Ze-Qin
2007-06-01
All visual acuity charts currently in use have their own shortcomings. Therefore, it is difficult for ophthalmologists to evaluate visual acuity data. Many problems arise in the use of statistical methods for handling visual acuity data in clinical research. The quantitative relationship between visual acuity and visual angle varies among visual acuity charts. The type of visual acuity and visual angle are different from each other. Therefore, different statistical methods should be used for different data sources. A correct understanding and analysis of visual acuity data can be obtained only after these aspects have been elucidated.
AstroStat-A VO tool for statistical analysis
NASA Astrophysics Data System (ADS)
Kembhavi, A. K.; Mahabal, A. A.; Kale, T.; Jagade, S.; Vibhute, A.; Garg, P.; Vaghmare, K.; Navelkar, S.; Agrawal, T.; Chattopadhyay, A.; Nandrekar, D.; Shaikh, M.
2015-06-01
AstroStat is an easy-to-use tool for performing statistical analysis on data. It has been designed to be compatible with Virtual Observatory (VO) standards, thus enabling it to become an integral part of the currently available collection of VO tools. A user can load data in a variety of formats into AstroStat and perform various statistical tests using a menu-driven interface. Behind the scenes, all analyses are done using the public domain statistical software R, and the output is presented to the user in a neatly formatted form. The analyses performable include exploratory tests, visualizations, distribution fitting, correlation & causation, hypothesis testing, multivariate analysis and clustering. The tool is available in two versions with identical interface and features: as a web service that can be run using any standard browser and as an offline application. AstroStat will provide an easy-to-use interface which can allow for both fetching data and performing powerful statistical analysis on them.
Using Pre-Statistical Analysis to Streamline Monitoring Assessments
Reed, J.K.
1999-10-20
A variety of statistical methods exist to aid evaluation of groundwater quality and subsequent decision making in regulatory programs. These methods are applied because of large temporal and spatial extrapolations commonly applied to these data. In short, statistical conclusions often serve as a surrogate for knowledge. However, facilities with mature monitoring programs that have generated abundant data have inherently less uncertainty because of the sheer quantity of analytical results. In these cases, statistical tests can be less important, and "expert" data analysis should assume an important screening role. The WSRC Environmental Protection Department, working with the General Separations Area BSRI Environmental Restoration project team, has developed a method for an Integrated Hydrogeological Analysis (IHA) of historical water quality data from the F and H Seepage Basins groundwater remediation project. The IHA combines common sense analytical techniques and a GIS presentation that force direct interactive evaluation of the data. The IHA can perform multiple data analysis tasks required by the RCRA permit. These include: (1) Development of a groundwater quality baseline prior to remediation startup, (2) Targeting of constituents for removal from RCRA GWPS, (3) Targeting of constituents for removal from UIC permit, (4) Targeting of constituents for reduced, (5) Targeting of monitoring wells not producing representative samples, (6) Reduction in statistical evaluation, and (7) Identification of contamination from other facilities.
Multivariate statistical analysis of atom probe tomography data.
Parish, Chad M; Miller, Michael K
2010-10-01
The application of spectrum imaging multivariate statistical analysis methods, specifically principal component analysis (PCA), to atom probe tomography (APT) data has been investigated. The mathematical method of analysis is described and the results for two example datasets are analyzed and presented. The first dataset is from the analysis of a PM 2000 Fe-Cr-Al-Ti steel containing two different ultrafine precipitate populations. PCA properly describes the matrix and precipitate phases in a simple and intuitive manner. A second APT example is from the analysis of an irradiated reactor pressure vessel steel. Fine, nm-scale Cu-enriched precipitates having a core-shell structure were identified and qualitatively described by PCA. Advantages, disadvantages, and future prospects for implementing these data analysis methodologies for APT datasets, particularly with regard to quantitative analysis, are also discussed. PMID:20650566
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-01
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
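The constant-sum constraint and the invariance requirement described above can be made concrete with log-ratios. The abstract does not name the specific transforms used; the closure operation and centered log-ratio (clr) transform below are standard tools of the log-ratio approach and are shown here as an assumption, with hypothetical atomic fractions. Zero parts are not handled, since the logarithm is undefined there.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centered log-ratio transform: log of each part over the geometric mean."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical three-element composition (atomic fractions summing to 1).
parent = [0.5, 0.3, 0.2]
# Subcomposition: only the first two elements were measured, then re-closed.
sub = closure(parent[:2])

print("clr(parent):", [round(v, 4) for v in clr(parent)])
# The log-ratio of two parts is unchanged by dropping the third element.
print(math.log(parent[0] / parent[1]), math.log(sub[0] / sub[1]))
```

Raw fractions change when a subcomposition is re-closed (0.5 becomes 0.625 here), whereas the ratio between two parts does not, which is why log-ratio quantities are preferred for such data.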
Feature-Based Statistical Analysis of Combustion Simulation Data
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
2008-01-01
There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9% improvement, p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study. PMID:18765754
SMART: Statistical Metabolomics Analysis-An R Tool.
Liang, Yu-Jen; Lin, Yu-Ting; Chen, Chia-Wei; Lin, Chien-Wei; Chao, Kun-Mao; Pan, Wen-Harn; Yang, Hsin-Chou
2016-06-21
Metabolomics data provide unprecedented opportunities to decipher metabolic mechanisms by analyzing hundreds to thousands of metabolites. Data quality concerns and complex batch effects in metabolomics must be appropriately addressed through statistical analysis. This study developed an integrated analysis tool for metabolomics studies to streamline the complete analysis flow from initial data preprocessing to downstream association analysis. We developed Statistical Metabolomics Analysis-An R Tool (SMART), which can analyze input files with different formats, visually represent various types of data features, implement peak alignment and annotation, conduct quality control for samples and peaks, explore batch effects, and perform association analysis. A pharmacometabolomics study of antihypertensive medication was conducted and data were analyzed using SMART. Neuromedin N was identified as a metabolite significantly associated with angiotensin-converting-enzyme inhibitors in our metabolome-wide association analysis (p = 1.56 × 10^-4 in an analysis of covariance (ANCOVA) with an adjustment for unknown latent groups and p = 1.02 × 10^-4 in an ANCOVA with an adjustment for hidden substructures). This endogenous neuropeptide is highly related to neurotensin and neuromedin U, which are involved in blood pressure regulation and smooth muscle contraction. The SMART software, a user guide, and example data can be downloaded from http://www.stat.sinica.edu.tw/hsinchou/metabolomics/SMART.htm .
Wavelet analysis in ecology and epidemiology: impact of statistical tests.
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-02-01
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from the original time series. When creating synthetic datasets, different resampling techniques result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied to different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise, which are nevertheless the methods currently used in the wide majority of wavelet applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used, such as the hidden Markov model algorithm and the 'beta-surrogate' method.
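To make the two conventional null models mentioned in the abstract concrete, the following is a minimal sketch of white-noise and red-noise surrogates. It is not the authors' code: the function names are hypothetical, the white-noise null is implemented here as a random shuffle, and the red-noise null is assumed to be a Gaussian AR(1) process matched to the sample mean, variance, and lag-1 autocorrelation. The data-driven methods the abstract recommends (hidden Markov model, 'beta-surrogate') are not shown.

```python
import math
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation (undefined for a constant series)."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def white_noise_surrogate(x, rng):
    """Shuffle the values: keeps the distribution, destroys all time structure."""
    s = list(x)
    rng.shuffle(s)
    return s

def red_noise_surrogate(x, rng):
    """Gaussian AR(1) series with the mean, variance and lag-1 autocorrelation of x."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / n
    phi = lag1_autocorr(x)
    # Innovation s.d. chosen so the stationary variance equals var.
    innov_sd = math.sqrt(max(0.0, var * (1.0 - phi ** 2)))
    s = [rng.gauss(0.0, math.sqrt(var))]
    for _ in range(n - 1):
        s.append(phi * s[-1] + rng.gauss(0.0, innov_sd))
    return [m + v for v in s]

# Hypothetical periodic series, used only to exercise the functions.
rng = random.Random(0)
series = [math.sin(0.3 * i) for i in range(60)]
white = white_noise_surrogate(series, rng)
red = red_noise_surrogate(series, rng)
```

In a test of this kind, a wavelet statistic computed on the original series would be compared with its distribution over many such surrogates; the abstract's point is that the conclusion can change with the surrogate generator chosen.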
Thermal hydraulic limits analysis using statistical propagation of parametric uncertainties
Chiang, K. Y.; Hu, L. W.; Forget, B.
2012-07-01
The MIT Research Reactor (MITR) is evaluating the conversion from highly enriched uranium (HEU) to low enrichment uranium (LEU) fuel. In addition to the fuel element re-design, a reactor power upgrade from 6 MW to 7 MW is proposed in order to maintain the same reactor performance as the HEU core. The previous approach to analyzing the impact of engineering uncertainties on thermal hydraulic limits, via the use of engineering hot channel factors (EHCFs), was unable to explicitly quantify the uncertainty and confidence level in reactor parameters. The objective of this study is to develop a methodology for MITR thermal hydraulic limits analysis by statistically combining engineering uncertainties, with the aim of eliminating unnecessary conservatism inherent in traditional analyses. This method was employed to analyze the Limiting Safety System Settings (LSSS) for the MITR, for which the criterion is the avoidance of the onset of nucleate boiling (ONB). Key parameters, such as coolant channel tolerances and heat transfer coefficients, were treated as normal distributions using Oracle Crystal Ball to calculate ONB. The LSSS power is determined with a 99.7% confidence level. The LSSS power calculated using this new methodology is 9.1 MW, based on a core outlet coolant temperature of 60 deg. C and a primary coolant flow rate of 1800 gpm, compared to 8.3 MW obtained from the analytical method using the EHCFs under the same operating conditions. The same methodology was also used to calculate the safety limit (SL) for the MITR, conservatively determined using onset of flow instability (OFI) as the criterion, to verify that an adequate safety margin exists between LSSS and SL. The calculated SL is 10.6 MW, which is 1.5 MW higher than the LSSS power. (authors)
3D statistical failure analysis of monolithic dental ceramic crowns.
Nasrin, Sadia; Katsube, Noriko; Seghi, Robert R; Rokhlin, Stanislav I
2016-07-01
For adhesively retained ceramic crowns of various types, it has been clinically observed that the most catastrophic failures initiate from the cement interface as a result of radial crack formation, as opposed to Hertzian contact stresses originating on the occlusal surface. In this work, a 3D failure prognosis model is developed for interface-initiated failures of monolithic ceramic crowns. The surface flaw distribution parameters are determined by biaxial flexural tests on ceramic plates, and point-to-point variations of the multi-axial stress state at the intaglio surface are obtained by finite element stress analysis. They are combined on the basis of a fracture-mechanics-based statistical failure probability model to predict the failure probability of a monolithic crown subjected to a single-cycle indentation load. The proposed method is verified against a prior 2D axisymmetric model and experimental data. Under conditions where the crowns are completely bonded to the tooth substrate, both high flexural stress and high interfacial shear stress are shown to occur in the wall region where the crown thickness is relatively thin, while a high interfacial normal tensile stress distribution is observed at the margin region. A significant impact of reduced cement modulus on these stress states is shown. While the analyses are limited to single-cycle load-to-failure tests, high interfacial normal tensile stress or high interfacial shear stress may contribute to degradation of the cement bond between ceramic and dentin. In addition, the crown failure probability is shown to be controlled by high flexural stress concentrations over a small area, and the proposed method might be of some value in detecting initial crown design errors. PMID:27215334
Bayesian Statistical Analysis of Circadian Oscillations in Fibroblasts
Cohen, Andrew L.; Leise, Tanya L.; Welsh, David K.
2012-01-01
Precise determination of a noisy biological oscillator’s period from limited experimental data can be challenging. The common practice is to calculate a single number (a point estimate) for the period of a particular time course. Uncertainty is inherent in any statistical estimator applied to noisy data, so our confidence in such point estimates depends on the quality and quantity of the data. Ideally, a period estimation method should both produce an accurate point estimate of the period and measure the uncertainty in that point estimate. A variety of period estimation methods are known, but few assess the uncertainty of the estimates, and a measure of uncertainty is rarely reported in the experimental literature. We compare the accuracy of point estimates using six common methods, only one of which can also produce uncertainty measures. We then illustrate the advantages of a new Bayesian method for estimating period, which outperforms the other six methods in accuracy of point estimates for simulated data and also provides a measure of uncertainty. We apply this method to analyze circadian oscillations of gene expression in individual mouse fibroblast cells and compute the number of cells and sampling duration required to reduce the uncertainty in period estimates to a desired level. This analysis indicates that, due to the stochastic variability of noisy intracellular oscillators, achieving a narrow margin of error can require an impractically large number of cells. In addition, we use a hierarchical model to determine the distribution of intrinsic cell periods, thereby separating the variability due to stochastic gene expression within each cell from the variability in period across the population of cells. PMID:22982138
STATISTICAL ANALYSIS OF THE HEAVY NEUTRAL ATOMS MEASURED BY IBEX
Park, Jeewoo; Kucharek, Harald; Möbius, Eberhard; Galli, André; Livadiotis, George; Fuselier, Steve A.; McComas, David J.
2015-10-15
We investigate the directional distribution of heavy neutral atoms in the heliosphere by using heavy neutral maps generated with the IBEX-Lo instrument over three years from 2009 to 2011. The interstellar neutral (ISN) O and Ne gas flow was found in the first-year heavy neutral map at 601 keV and its flow direction and temperature were studied. However, due to the low counting statistics, researchers have not treated the full sky maps in detail. The main goal of this study is to evaluate the statistical significance of each pixel in the heavy neutral maps to get a better understanding of the directional distribution of heavy neutral atoms in the heliosphere. Here, we examine three statistical analysis methods: the signal-to-noise filter, the confidence limit method, and the cluster analysis method. These methods allow us to exclude background from areas where the heavy neutral signal is statistically significant. These methods also allow the consistent detection of heavy neutral atom structures. The main emission feature expands toward lower longitude and higher latitude from the observational peak of the ISN O and Ne gas flow. We call this emission the extended tail. It may be an imprint of the secondary oxygen atoms generated by charge exchange between ISN hydrogen atoms and oxygen ions in the outer heliosheath.
Statistical Analysis of speckle noise reduction techniques for echocardiographic Images
NASA Astrophysics Data System (ADS)
Saini, Kalpana; Dewal, M. L.; Rohit, Manojkumar
2011-12-01
Echocardiography is a safe, easy and fast technology for diagnosing cardiac diseases. As with other ultrasound images, echocardiographic images contain speckle noise. In some cases this speckle noise is useful, such as in motion detection, but in general noise removal is required for better analysis of the image and proper diagnosis. Different adaptive and anisotropic filters are included in the statistical analysis. Statistical parameters such as Signal-to-Noise Ratio (SNR), Peak Signal-to-Noise Ratio (PSNR), and Root Mean Square Error (RMSE) are calculated for performance measurement. Another important aspect is that blurring may occur during speckle noise removal, so it is preferred that the filter be able to enhance edges during noise removal.
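The three performance measures named in the abstract can be sketched as follows. The abstract does not give its exact definitions, so the formulas below are common conventions assumed for illustration: RMSE against a reference image, SNR as reference power over residual power in dB, and PSNR with an assumed 8-bit peak value of 255. The function names and pixel values are hypothetical.

```python
import math

def rmse(reference, filtered):
    """Root mean square error between two equally sized pixel sequences."""
    n = len(reference)
    return math.sqrt(sum((r - f) ** 2 for r, f in zip(reference, filtered)) / n)

def snr_db(reference, filtered):
    """SNR in dB: power of the reference over power of the residual."""
    signal = sum(r ** 2 for r in reference)
    noise = sum((r - f) ** 2 for r, f in zip(reference, filtered))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)

def psnr_db(reference, filtered, peak=255.0):
    """PSNR in dB relative to the peak pixel value (assumed 255 for 8-bit images)."""
    err = rmse(reference, filtered)
    return float("inf") if err == 0 else 20.0 * math.log10(peak / err)

# Hypothetical flattened pixel values: a reference patch and a filtered patch.
reference = [100, 150, 200, 250]
filtered = [102, 148, 203, 249]
scores = (rmse(reference, filtered),
          snr_db(reference, filtered),
          psnr_db(reference, filtered))
```

Lower RMSE and higher SNR and PSNR indicate a filtered image closer to the reference. None of the three measures captures edge preservation directly, which is the separate concern the abstract raises.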
Collagen morphology and texture analysis: from statistics to classification
NASA Astrophysics Data System (ADS)
Mostaço-Guidolin, Leila B.; Ko, Alex C.-T.; Wang, Fei; Xiang, Bo; Hewko, Mark; Tian, Ganghong; Major, Arkady; Shiomi, Masashi; Sowa, Michael G.
2013-07-01
In this study we present an image analysis methodology capable of quantifying morphological changes in tissue collagen fibril organization caused by pathological conditions. Texture analysis based on first-order statistics (FOS) and second-order statistics such as gray level co-occurrence matrix (GLCM) was explored to extract second-harmonic generation (SHG) image features that are associated with the structural and biochemical changes of tissue collagen networks. Based on these extracted quantitative parameters, multi-group classification of SHG images was performed. With combined FOS and GLCM texture values, we achieved reliable classification of SHG collagen images acquired from atherosclerosis arteries with >90% accuracy, sensitivity and specificity. The proposed methodology can be applied to a wide range of conditions involving collagen re-modeling, such as in skin disorders, different types of fibrosis and muscular-skeletal diseases affecting ligaments and cartilage.
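As a rough illustration of the two feature families named in the abstract, the sketch below computes first-order statistics (mean and variance of gray levels) and one second-order feature (contrast) from a gray level co-occurrence matrix. The pixel offset, the number of gray levels, the choice of contrast as the GLCM feature, and all function names are assumptions made for this example; the abstract does not state which settings or features the authors used.

```python
def first_order_stats(image):
    """Mean and variance of the gray levels in a 2D list of integers."""
    pixels = [v for row in image for v in row]
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((v - mean) ** 2 for v in pixels) / n
    return mean, var

def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def glcm_contrast(p):
    """Contrast: co-occurrence probability weighted by squared gray-level difference."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))

# Hypothetical 2-level image, used only to exercise the functions.
tiny = [[0, 0, 1],
        [1, 1, 0]]
p = glcm(tiny, levels=2)
features = first_order_stats(tiny) + (glcm_contrast(p),)
```

Feature vectors of this kind, computed per image, would then be passed to a classifier; that step is not shown here.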
Statistics in experimental design, preprocessing, and analysis of proteomics data.
Jung, Klaus
2011-01-01
High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), usually yield high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to obtain tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, the focus is on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels, and methods for data preprocessing are covered.
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
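The quantity analyzed in the abstract, the log-return of an exchange rate, can be sketched as below. The prices are hypothetical, and the Gaussian fit is included only as a simple baseline showing how a parametric fit can be scored by log-likelihood. Whether a Gaussian is among the fifteen distributions the authors fitted is not stated here, and the abstract reports the generalized hyperbolic distribution as the best fit.

```python
import math

def log_returns(prices):
    """Log-returns r_t = ln(P_t / P_{t-1}) of a positive price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def gaussian_fit(returns):
    """Maximum-likelihood Gaussian fit: returns (mu, sigma, log-likelihood)."""
    n = len(returns)
    mu = sum(returns) / n
    sigma = math.sqrt(sum((r - mu) ** 2 for r in returns) / n)
    loglik = sum(-0.5 * math.log(2.0 * math.pi * sigma ** 2)
                 - (r - mu) ** 2 / (2.0 * sigma ** 2) for r in returns)
    return mu, sigma, loglik

# Hypothetical daily BTC/USD closing prices, not real data.
prices = [230.0, 235.5, 228.1, 240.3, 238.9, 245.0]
returns = log_returns(prices)
mu, sigma, loglik = gaussian_fit(returns)
```

A useful property of log-returns is that they add over time: the sum of the series equals the log of the ratio of the last price to the first. Competing distributions fitted to the same returns can be compared through their log-likelihoods or criteria derived from them.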
The statistical analysis of multivariate serological frequency data.
Reyment, Richard A
2005-11-01
Data occurring in the form of frequencies are common in genetics, for example in serology. Examples are provided by the AB0 group, the Rhesus group, and also DNA data. The statistical analysis of tables of frequencies is carried out using the available methods of multivariate analysis, usually with three principal aims. One of these is to seek meaningful relationships between the components of a data set, the second is to examine relationships between populations from which the data have been obtained, and the third is to bring about a reduction in dimensionality. This latter aim is usually realized by means of bivariate scatter diagrams using scores computed from a multivariate analysis. The multivariate statistical analysis of tables of frequencies cannot safely be carried out by standard multivariate procedures because they represent compositions and are therefore embedded in simplex space, a subspace of full space. Appropriate procedures for simplex space are compared and contrasted with simple standard methods of multivariate analysis ("raw" principal component analysis). The study shows that the differences between a log-ratio model and a simple logarithmic transformation of proportions may not be very great, particularly as regards graphical ordinations, but important discrepancies do occur. The divergencies between logarithmically based analyses and raw data are, however, great. Published data on Rhesus alleles observed for Italian populations are used to exemplify the subject. PMID:16024067
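The contrast drawn in the abstract between a log-ratio model and a simple logarithm of proportions can be illustrated with the centered log-ratio (clr) transform. The abstract does not say which log-ratio transform was used, so clr is an assumption here, and the allele counts and function names are hypothetical.

```python
import math

def closure(counts):
    """Rescale non-negative counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def simple_log(proportions):
    """Plain logarithm of each proportion (requires all parts > 0)."""
    return [math.log(p) for p in proportions]

def clr(proportions):
    """Centered log-ratio: log of each part minus the mean log of all parts."""
    logs = simple_log(proportions)
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical allele counts for one population.
counts = [40, 35, 15, 10]
props = closure(counts)
plain = simple_log(props)
centered = clr(props)
```

The clr values always sum to zero and do not change if the counts are multiplied by a constant, which reflects the fact that only ratios between parts carry information in a composition. Zero counts need separate treatment before either logarithm can be taken.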
Computed Tomography Inspection and Analysis for Additive Manufacturing Components
NASA Technical Reports Server (NTRS)
Beshears, Ronald D.
2016-01-01
Computed tomography (CT) inspection was performed on test articles additively manufactured from metallic materials. Metallic AM and machined wrought alloy test articles with programmed flaws were inspected using a 2MeV linear accelerator based CT system. Performance of CT inspection on identically configured wrought and AM components and programmed flaws was assessed using standard image analysis techniques to determine the impact of additive manufacturing on inspectability of objects with complex geometries.
Along-tract statistics allow for enhanced tractography analysis
Colby, John B.; Soderberg, Lindsay; Lebel, Catherine; Dinov, Ivo D.; Thompson, Paul M.; Sowell, Elizabeth R.
2011-01-01
Diffusion imaging tractography is a valuable tool for neuroscience researchers because it allows the generation of individualized virtual dissections of major white matter tracts in the human brain. It facilitates between-subject statistical analyses tailored to the specific anatomy of each participant. There is prominent variation in diffusion imaging metrics (e.g., fractional anisotropy, FA) within tracts, but most tractography studies use a “tract-averaged” approach to analysis by averaging the scalar values from the many streamline vertices in a tract dissection into a single point-spread estimate for each tract. Here we describe a complete workflow needed to conduct an along-tract analysis of white matter streamline tract groups. This consists of 1) A flexible MATLAB toolkit for generating along-tract data based on B-spline resampling and compilation of scalar data at different collections of vertices along the curving tract spines, and 2) Statistical analysis and rich data visualization by leveraging tools available through the R platform for statistical computing. We demonstrate the effectiveness of such an along-tract approach over the tract-averaged approach in an example analysis of 10 major white matter tracts in a single subject. We also show that these techniques easily extend to between-group analyses typically used in neuroscience applications, by conducting an along-tract analysis of differences in FA between 9 individuals with fetal alcohol spectrum disorders (FASDs) and 11 typically-developing controls. This analysis reveals localized differences between FASD and control groups that were not apparent using a tract-averaged method. Finally, to validate our approach and highlight the strength of this extensible software framework, we implement 2 other methods from the literature and leverage the existing workflow tools to conduct a comparison study. PMID:22094644
Common misconceptions about data analysis and statistics
Motulsky, Harvey J
2015-01-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word “significant”. (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
Ambiguity and nonidentifiability in the statistical analysis of neural codes
Amarasingham, Asohan; Geman, Stuart; Harrison, Matthew T.
2015-01-01
Many experimental studies of neural coding rely on a statistical interpretation of the theoretical notion of the rate at which a neuron fires spikes. For example, neuroscientists often ask, “Does a population of neurons exhibit more synchronous spiking than one would expect from the covariability of their instantaneous firing rates?” For another example, “How much of a neuron’s observed spiking variability is caused by the variability of its instantaneous firing rate, and how much is caused by spike timing variability?” However, a neuron’s theoretical firing rate is not necessarily well-defined. Consequently, neuroscientific questions involving the theoretical firing rate do not have a meaning in isolation but can only be interpreted in light of additional statistical modeling choices. Ignoring this ambiguity can lead to inconsistent reasoning or wayward conclusions. We illustrate these issues with examples drawn from the neural-coding literature. PMID:25934918
Multivariate statistical analysis of low-voltage EDS spectrum images
Anderson, I.M.
1998-03-01
Whereas energy-dispersive X-ray spectrometry (EDS) has been used for compositional analysis in the scanning electron microscope for 30 years, the benefits of using low operating voltages for such analyses have been explored only during the last few years. This paper couples low-voltage EDS with two other emerging areas of characterization: spectrum imaging and multivariate statistical analysis. The specimen analyzed for this study was a finished Intel Pentium processor, with the polyimide protective coating stripped off to expose the final active layers.
HistFitter: a flexible framework for statistical data analysis
NASA Astrophysics Data System (ADS)
Besjes, G. J.; Baak, M.; Côté, D.; Koutsman, A.; Lorenz, J. M.; Short, D.
2015-12-01
HistFitter is a software framework for statistical data analysis that has been used extensively in the ATLAS Collaboration to analyze data of proton-proton collisions produced by the Large Hadron Collider at CERN. Most notably, HistFitter has become a de-facto standard in searches for supersymmetric particles since 2012, with some usage for Exotic and Higgs boson physics. HistFitter coherently combines several statistics tools in a programmable and flexible framework that is capable of bookkeeping hundreds of data models under study using thousands of generated input histograms. HistFitter interfaces with the statistics tools HistFactory and RooStats to construct parametric models and to perform statistical tests of the data, and extends these tools in four key areas. The key innovations are to weave the concepts of control, validation and signal regions into the very fabric of HistFitter, and to treat these with rigorous methods. Multiple tools to visualize and interpret the results through a simple configuration interface are also provided.
Statistical analysis of heartbeat data with wavelet techniques
NASA Astrophysics Data System (ADS)
Pazsit, Imre
2004-05-01
The purpose of this paper is to demonstrate the use of some methods of signal analysis, performed on ECG and in some cases blood pressure signals, for the classification of the health status of the heart of mice and rats. Spectral and wavelet analysis were performed on the raw signals. FFT-based coherence and phase were also calculated between blood pressure and raw ECG signals. Finally, RR-intervals were deduced from the ECG signals and an analysis of the fractal dimensions was performed. The analysis was made on data from mice and rats. A correlation was found between the health status of the mice and the rats and some of the statistical descriptors, most notably the phase of the cross-spectra between ECG and blood pressure, and the fractal properties and dimensions of the interbeat series (RR-interval fluctuations).
Bayesian statistical analysis of protein side-chain rotamer preferences.
Dunbrack, R. L.; Cohen, F. E.
1997-01-01
We present a Bayesian statistical analysis of the conformations of side chains in proteins from the Protein Data Bank. This is an extension of the backbone-dependent rotamer library, and includes rotamer populations and average chi angles for a full range of phi, psi values. The Bayesian analysis used here provides a rigorous statistical method for taking account of varying amounts of data. Bayesian statistics requires the assumption of a prior distribution for parameters over their range of possible values. This prior distribution can be derived from previous data or from pooling some of the present data. The prior distribution is combined with the data to form the posterior distribution, which is a compromise between the prior distribution and the data. For the chi 2, chi 3, and chi 4 rotamer prior distributions, we assume that the probability of each rotamer type is dependent only on the previous chi rotamer in the chain. For the backbone-dependence of the chi 1 rotamers, we derive prior distributions from the product of the phi-dependent and psi-dependent probabilities. Molecular mechanics calculations with the CHARMM22 potential show a strong similarity with the experimental distributions, indicating that proteins attain their lowest energy rotamers with respect to local backbone-side-chain interactions. The new library is suitable for use in homology modeling, protein folding simulations, and the refinement of X-ray and NMR structures. PMID:9260279
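The statement that the posterior is a compromise between the prior distribution and the data can be illustrated with a conjugate Dirichlet-multinomial update. This is a generic sketch, not the authors' procedure: the Dirichlet form of the prior, the prior weight, the counts, and the function name are all assumptions made for the example.

```python
def posterior_rotamer_probs(counts, prior_probs, prior_weight):
    """Posterior mean probabilities under a Dirichlet prior.

    counts:       observed number of side chains in each rotamer
    prior_probs:  prior probability of each rotamer (sums to 1)
    prior_weight: strength of the prior, in pseudo-observations
    """
    alphas = [prior_weight * p for p in prior_probs]
    total = sum(counts) + sum(alphas)
    return [(c + a) / total for c, a in zip(counts, alphas)]

# Hypothetical example: three chi-1 rotamers in one (phi, psi) bin.
prior = [0.5, 0.3, 0.2]
sparse = posterior_rotamer_probs([8, 1, 1], prior, prior_weight=10)
empty = posterior_rotamer_probs([0, 0, 0], prior, prior_weight=10)
dense = posterior_rotamer_probs([800, 100, 100], prior, prior_weight=10)
```

With no observations the estimate equals the prior; with few observations it sits between the prior and the raw frequencies; with many observations the data dominate. This is the sense in which such an analysis accounts for varying amounts of data across backbone conformations.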
Agriculture, population growth, and statistical analysis of the radiocarbon record
Zahid, H. Jabran; Robinson, Erick; Kelly, Robert L.
2016-01-01
The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide. PMID:26699457
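To give a sense of scale for the reported long-term annual growth rate of 0.04%, the short calculation below converts it into a doubling time and a growth factor over a chosen span. Only the rate comes from the abstract; the function names and the 1,000-year span are illustrative choices, and constant compound growth is assumed.

```python
import math

ANNUAL_RATE = 0.0004  # 0.04% per year, the long-term rate reported in the abstract

def doubling_time(rate):
    """Years needed for a population to double at a constant annual rate."""
    return math.log(2.0) / math.log(1.0 + rate)

def growth_factor(rate, years):
    """Multiplicative population change after the given number of years."""
    return (1.0 + rate) ** years

years_to_double = doubling_time(ANNUAL_RATE)
factor_per_millennium = growth_factor(ANNUAL_RATE, 1000)
```

At this rate a population doubles in roughly 1,700 years and grows by a little under 50% per millennium, so a rate that looks negligible from year to year still compounds substantially over the Holocene.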
Agriculture, population growth, and statistical analysis of the radiocarbon record.
Zahid, H Jabran; Robinson, Erick; Kelly, Robert L
2016-01-26
The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.
Optimal Multicomponent Analysis Using the Generalized Standard Addition Method.
ERIC Educational Resources Information Center
Raymond, Margaret; And Others
1983-01-01
Describes an experiment on the simultaneous determination of chromium and magnesium by spectrophotometry, modified to include the Generalized Standard Addition Method computer program, a multivariate calibration method that provides optimal multicomponent analysis in the presence of interference and matrix effects. Provides instructions for…
Spatial statistical analysis of tree deaths using airborne digital imagery
NASA Astrophysics Data System (ADS)
Chang, Ya-Mei; Baddeley, Adrian; Wallace, Jeremy; Canci, Michael
2013-04-01
High resolution digital airborne imagery offers unprecedented opportunities for observation and monitoring of vegetation, providing the potential to identify, locate and track individual vegetation objects over time. Analytical tools are required to quantify relevant information. In this paper, locations of trees over a large area of native woodland vegetation were identified using morphological image analysis techniques. Methods of spatial point process statistics were then applied to estimate the spatially-varying tree death risk, and to show that it is significantly non-uniform. [Tree deaths over the area were detected in our previous work (Wallace et al., 2008).] The study area is a major source of ground water for the city of Perth, and the work was motivated by the need to understand and quantify vegetation changes in the context of water extraction and drying climate. The influence of hydrological variables on tree death risk was investigated using spatial statistics (graphical exploratory methods, spatial point pattern modelling and diagnostics).
[Statistical analysis of DNA sequences nearby splicing sites].
Korzinov, O M; Astakhova, T V; Vlasov, P K; Roĭtberg, M A
2008-01-01
Recognition of coding regions within eukaryotic genomes is one of the oldest yet still unsolved problems of bioinformatics. New high-accuracy methods for splice site recognition are needed to solve this problem. A question of current interest is how to identify specific features of nucleotide sequences near splice sites and to recognize sites in their sequence context. We performed a statistical analysis of a database of human gene fragments and revealed some characteristics of nucleotide sequences in the neighborhood of splice sites. Frequencies of all nucleotides and dinucleotides in the vicinity of splice sites were computed, and nucleotides and dinucleotides with extremely high or low occurrences were identified. The statistical information obtained in this work can be used in further development of methods for splice site annotation and exon-intron structure recognition.
Analysis of the Spatial Organization of Molecules with Robust Statistics
Lagache, Thibault; Lang, Gabriel; Sauvonnet, Nathalie; Olivo-Marin, Jean-Christophe
2013-01-01
One major question in molecular biology is whether the spatial distribution of observed molecules is random or organized in clusters. Indeed, this analysis gives information about molecules’ interactions and physical interplay with their environment. The standard tool for analyzing molecules’ distribution statistically is the Ripley’s K function, which tests spatial randomness through the computation of its critical quantiles. However, quantiles’ computation is very cumbersome, hindering its use. Here, we present an analytical expression of these quantiles, leading to a fast and robust statistical test, and we derive the characteristic clusters’ size from the maxima of the Ripley’s K function. Subsequently, we analyze the spatial organization of endocytic spots at the cell membrane and we report that clathrin spots are randomly distributed while clathrin-independent spots are organized in clusters with a radius of , which suggests distinct physical mechanisms and cellular functions for each pathway. PMID:24349021
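For readers unfamiliar with Ripley's K function, the following is a minimal sketch of the naive estimator with no edge correction. It does not reproduce the analytical quantile expression from the paper; the point coordinates and the observation window area are made up for illustration.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at radius r (no edge correction).

    Counts ordered pairs of distinct points within distance r and
    scales by area / (n * (n - 1)).
    """
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1
    return area * pairs / (n * (n - 1))


# Four tightly grouped points inside a hypothetical 10 x 10 window.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_hat = ripley_k(pts, r=1.0, area=100.0)

# Under complete spatial randomness, K(r) = pi * r^2.
csr_value = math.pi * 1.0 ** 2
```

An estimate well above the value expected under complete spatial randomness, as in this toy example, points toward clustering at that radius; a formal test compares the estimate with critical quantiles.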
Statistical analysis of subjective preferences for video enhancement
NASA Astrophysics Data System (ADS)
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
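The Thurstone scaling mentioned in this abstract can be sketched in its classical Case V form, in which each item's scale value is the mean of the inverse-normal transforms of its preference proportions. This is the textbook scaling step only, not the authors' logistic regression analysis, and the preference matrix below is made up.

```python
from statistics import NormalDist


def thurstone_case_v(pref):
    """Thurstone Case V scale values.

    pref[i][j] is the proportion of trials in which item i was preferred
    over item j (diagonal entries are 0.5). All proportions must lie
    strictly between 0 and 1.
    """
    inv = NormalDist().inv_cdf
    n = len(pref)
    return [sum(inv(pref[i][j]) for j in range(n)) / n for i in range(n)]


# Hypothetical preference proportions for three enhancement levels.
pref = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.7],
    [0.1, 0.3, 0.5],
]
scale = thurstone_case_v(pref)
```

The resulting values order the items on an arbitrary perceptual scale but, as the abstract notes, carry no statement about statistical significance.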
Noise removing in encrypted color images by statistical analysis
NASA Astrophysics Data System (ADS)
Islam, N.; Puech, W.
2012-03-01
Cryptographic techniques are used to secure confidential data from unauthorized access, but these techniques are very sensitive to noise. A single bit change in encrypted data can have a catastrophic impact on the decrypted data. This paper addresses the problem of removing bit errors in visual data encrypted using the AES algorithm in CBC mode. To remove the noise, a method is proposed that is based on the statistical analysis of each block during decryption. The proposed method exploits local statistics of the visual data and the confusion/diffusion properties of the encryption algorithm to remove the errors. Experimental results show that the proposed method can be used at the receiving end as a possible solution for noise removal in visual data in the encrypted domain.
Statistical methods for the detection and analysis of radioactive sources
NASA Astrophysics Data System (ADS)
Klumpp, John
We consider four topics from areas of radioactive statistical analysis in the present study: Bayesian methods for the analysis of count rate data, analysis of energy data, a model for non-constant background count rate distributions, and a zero-inflated model of the sample count rate. The study begins with a review of Bayesian statistics and techniques for analyzing count rate data. Next, we consider a novel system for incorporating energy information into count rate measurements which searches for elevated count rates in multiple energy regions simultaneously. The system analyzes time-interval data in real time to sequentially update a probability distribution for the sample count rate. We then consider a "moving target" model of background radiation in which the instantaneous background count rate is a function of time, rather than being fixed. Unlike the sequential update system, this model assumes a large body of pre-existing data which can be analyzed retrospectively. Finally, we propose a novel Bayesian technique which allows for simultaneous source detection and count rate analysis. This technique is fully compatible with, but independent of, the sequential update system and moving target model.
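A sequential Bayesian update of a count rate from time-interval data can be sketched with a conjugate Gamma prior and an exponential likelihood for each inter-event interval. The abstract does not specify the model used in the study, so the Gamma-exponential pairing, the prior parameters and the intervals below are assumptions chosen for illustration.

```python
def update_rate_posterior(alpha, beta, interval):
    """One conjugate update of a Gamma(alpha, beta) posterior on the count rate.

    alpha is the shape and beta the rate parameter. Each observed
    inter-event interval (exponential likelihood) adds one event to the
    shape and its duration to the rate parameter.
    """
    return alpha + 1.0, beta + interval


alpha, beta = 1.0, 1.0  # hypothetical weak prior
for dt in [0.5, 0.25, 0.25]:  # made-up intervals between counts, in seconds
    alpha, beta = update_rate_posterior(alpha, beta, dt)

posterior_mean_rate = alpha / beta  # counts per second
```

Because each interval is folded in as it arrives, the posterior is available in real time without storing the full history.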
Bayesian Sensitivity Analysis of Statistical Models with Missing Data
ZHU, HONGTU; IBRAHIM, JOSEPH G.; TANG, NIANSHENG
2013-01-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures. PMID:24753718
STATISTICAL ANALYSIS OF TANK 18F FLOOR SAMPLE RESULTS
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 18F as per the statistical sampling plan developed by Shine [1]. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL [2]. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results [3] to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 18F. The uncertainty is quantified in this report by an upper 95% confidence limit (UCL95%) on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
STATISTICAL ANALYSIS OF TANK 19F FLOOR SAMPLE RESULTS
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by an UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
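The tank reports state that the upper 95% confidence limit depends on the number of samples, the average and the standard deviation. One common formula with exactly those inputs is the one-sided Student-t limit, sketched below for six samples. The reports do not give the formula, so the t-based form is an assumption, and the concentration values are made up.

```python
import math
import statistics

# One-sided 95% Student-t quantile for 5 degrees of freedom (n = 6 samples).
T_95_DF5 = 2.015


def ucl95_six_samples(values):
    """Upper 95% confidence limit on the mean: mean + t * s / sqrt(n)."""
    if len(values) != 6:
        raise ValueError("the hard-coded t quantile is valid only for n = 6")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean + T_95_DF5 * sd / math.sqrt(len(values))


concentrations = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]  # made-up analyte results
ucl = ucl95_six_samples(concentrations)
```

The limit tightens as the sample count grows or the scatter shrinks, which is why pooling compatible samples matters for the quantification.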
Statistical mechanics analysis of thresholding 1-bit compressed sensing
NASA Astrophysics Data System (ADS)
Xu, Yingying; Kabashima, Yoshiyuki
2016-08-01
The one-bit compressed sensing framework aims to reconstruct a sparse signal by only using the sign information of its linear measurements. To compensate for the loss of scale information, past studies in the area have proposed recovering the signal by imposing an additional constraint on the ℓ2-norm of the signal. Recently, an alternative strategy that captures scale information by introducing a threshold parameter to the quantization process was advanced. In this paper, we analyze the typical behavior of thresholding 1-bit compressed sensing utilizing the replica method of statistical mechanics, so as to gain an insight for properly setting the threshold value. Our result shows that fixing the threshold at a constant value yields better performance than varying it randomly when the constant is optimally tuned, statistically. Unfortunately, the optimal threshold value depends on the statistical properties of the target signal, which may not be known in advance. In order to handle this inconvenience, we develop a heuristic that adaptively tunes the threshold parameter based on the frequency of positive (or negative) values in the binary outputs. Numerical experiments show that the heuristic exhibits satisfactory performance while incurring low computational cost.
A global analysis of soil acidification caused by nitrogen addition
NASA Astrophysics Data System (ADS)
Tian, Dashuan; Niu, Shuli
2015-02-01
Nitrogen (N) deposition-induced soil acidification has become a global problem. However, the response patterns of soil acidification to N addition and the underlying mechanisms remain far from clear. Here, we conducted a meta-analysis of 106 studies to reveal global patterns of soil acidification in response to N addition. We found that N addition significantly reduced soil pH by 0.26 on average globally. However, the responses of soil pH varied with ecosystem type, N addition rate, N fertilization form, and experimental duration. Soil pH decreased most in grassland, whereas no decrease in soil pH under N addition was observed in boreal forest. Soil pH decreased linearly with N addition rate. Addition of urea and NH4NO3 contributed more to soil acidification than NH4-form fertilizer. When experimental duration was longer than 20 years, N addition effects on soil acidification diminished. Environmental factors such as initial soil pH, soil carbon and nitrogen content, precipitation, and temperature all influenced the responses of soil pH. The base cations Ca2+, Mg2+ and K+ were critically important in buffering against N-induced soil acidification at the early stage. However, N addition has shifted global soils into the Al3+ buffering phase. Overall, this study indicates that acidification in global soils is very sensitive to N deposition, which is greatly modified by biotic and abiotic factors. Global soils are now at a buffering transition from base cations (Ca2+, Mg2+ and K+) to non-base cations (Mn2+ and Al3+). This calls for attention to the limitation of base cations and the toxic impact of non-base cations in terrestrial ecosystems under N deposition.
Statistical energy analysis of complex structures, phase 2
NASA Technical Reports Server (NTRS)
Trudell, R. W.; Yano, L. I.
1980-01-01
A method for estimating the structural vibration properties of complex systems in high frequency environments was investigated. The structure analyzed was the Materials Experiment Assembly (MEA), which is a portion of the OST-2A payload for the space transportation system. Statistical energy analysis (SEA) techniques were used to model the structure and predict the structural element response to acoustic excitation. A comparison of the initial response predictions and measured acoustic test data is presented. The conclusions indicate that the SEA predicted the response of primary structure to acoustic excitation over a wide range of frequencies, and that the contribution of mechanically induced random vibration to the total MEA is not significant.
Statistical Analysis of Strength Data for an Aerospace Aluminum Alloy
NASA Technical Reports Server (NTRS)
Neergaard, L.; Malone, T.
2001-01-01
Aerospace vehicles are produced in limited quantities that do not always allow development of MIL-HDBK-5 A-basis design allowables. One method of examining production and composition variations is to perform 100% lot acceptance testing for aerospace Aluminum (Al) alloys. This paper discusses statistical trends seen in strength data for one Al alloy. A four-step approach reduced the data to residuals, visualized residuals as a function of time, grouped data with quantified scatter, and conducted analysis of variance (ANOVA).
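The final step of the four-step approach, analysis of variance, can be sketched as a one-way ANOVA F statistic computed across groups such as production lots. The abstract gives no data or grouping details, so the group structure and the values below are made up.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups of values."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical residuals from three lots.
lots = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_anova_f(lots)
```

The F statistic would then be compared with the F distribution for the corresponding degrees of freedom to judge whether lot-to-lot differences exceed the within-lot scatter.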
Statistical design and analysis of RNA sequencing data.
Auer, Paul L; Doerge, R W
2010-06-01
Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collecting replicated RNA sequencing data according to well known statistical designs that partition the sources of biological and technical variation. Examples of these designs and their corresponding models are presented with the goal of testing differential expression.
Statistical Analysis in Genetic Studies of Mental Illnesses
Zhang, Heping
2011-01-01
Identifying the risk factors for mental illnesses is of significant public health importance. Diagnosis, stigma associated with mental illnesses, comorbidity, and complex etiologies, among others, make it very challenging to study mental disorders. Genetic studies of mental illnesses date back at least a century, beginning with descriptive studies based on Mendelian laws of inheritance. A variety of study designs including twin studies, family studies, linkage analysis, and more recently, genomewide association studies have been employed to study the genetics of mental illnesses, or complex diseases in general. In this paper, I will present the challenges and methods from a statistical perspective and focus on genetic association studies. PMID:21909187
Statistical Analysis of Strength Data for an Aerospace Aluminum Alloy
NASA Technical Reports Server (NTRS)
Neergaard, Lynn; Malone, Tina; Gentz, Steven J. (Technical Monitor)
2000-01-01
Aerospace vehicles are produced in limited quantities that do not always allow development of MIL-HDBK-5 A-basis design allowables. One method of examining production and composition variations is to perform 100% lot acceptance testing for aerospace Aluminum (Al) alloys. This paper discusses statistical trends seen in strength data for one Al alloy. A four-step approach reduced the data to residuals, visualized residuals as a function of time, grouped data with quantified scatter, and conducted analysis of variance (ANOVA).
Multi-scale statistical analysis of coronal solar activity
Gamborino, Diana; del-Castillo-Negrete, Diego; Martinell, Julio J.
2016-07-08
Multi-filter images from the solar corona are used to obtain temperature maps that are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions, we show that the multi-scale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport can also be extracted from the analysis.
[Kinetic analysis of additive effect on desulfurization activity].
Han, Kui-hua; Zhao, Jian-li; Lu, Chun-mei; Wang, Yong-zheng; Zhao, Gai-ju; Cheng, Shi-qing
2006-02-01
The additive effects of Al2O3, Fe2O3 and MnCO3 on CaO sulfation kinetics were investigated by thermogravimetric analysis and a modified grain model. The activation energy (Ea) and the pre-exponential factor (k0) of the surface reaction, and the activation energy (Ep) and the pre-exponential factor (D0) of the product layer diffusion reaction, were calculated according to the model. Addition of MnCO3 can enhance the initial reaction rate, product layer diffusion and the final CaO conversion of sorbents, with an effect mechanism similar to that of Fe2O3. The method based on the isokinetic temperature Ts and activation energy cannot estimate the contribution of an additive to the sulfation reactivity; instead, the rate constant of the surface reaction (k) and the effective diffusivity of the reactant in the product layer (Ds) under given experimental conditions can reflect the effect of additives on the activation. Nonstoichiometric metal oxides may catalyze the surface reaction and promote the diffusivity of the reactant in the product layer through crystal defects and the distinct diffusion of cations and anions. Based on the mechanism and effect of additives on sulfation, the effective temperature and the stoichiometric relation of the reaction, it is possible to improve sorbent utilization by compounding more additives into the calcium-based sorbent.
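The activation energy and pre-exponential factor named in this abstract are the two parameters of an Arrhenius rate law, which is presumably how the surface reaction rate constant k relates to Ea and k0. The sketch below assumes the standard Arrhenius form; the parameter values are hypothetical and are not taken from the study.

```python
import math

R = 8.314  # universal gas constant, J/(mol K)


def arrhenius_rate(k0, ea, temperature):
    """Rate constant k = k0 * exp(-Ea / (R * T)).

    k0          -- pre-exponential factor (same units as k)
    ea          -- activation energy in J/mol
    temperature -- absolute temperature in K
    """
    return k0 * math.exp(-ea / (R * temperature))


# Hypothetical parameters: compare two temperatures for the same sorbent.
k_low = arrhenius_rate(k0=1.0e3, ea=80_000.0, temperature=1000.0)
k_high = arrhenius_rate(k0=1.0e3, ea=80_000.0, temperature=1200.0)
```

Comparing k at a fixed temperature, rather than Ea alone, reflects the abstract's point that the rate constant under given conditions is the more informative measure of an additive's effect.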
Statistical characterization of life drivers for a probabilistic design analysis
NASA Technical Reports Server (NTRS)
Fox, Eric P.; Safie, Fayssal
1992-01-01
This paper discusses the issue of statistical characterization of life drivers for a probabilistic design analysis (PDA) approach to support the conventional deterministic structural design methods that are currently used. The probabilistic approach takes into consideration the modeling inadequacies and uncertainties in many design variables such as loads, environments, and material properties. The importance of the distributional assumption is motivated by illustrating an example where the results differ substantially due to the distribution selected. Different types of distributions are discussed and techniques for estimating the parameters are given. Given this information, procedures are outlined for selecting the appropriate distribution based on the particular type of variable (i.e., dimensional, performance) as well as the information that is available (i.e., test data, engineering analysis). Finally, techniques are given for generating random numbers from these selected distributions within the PDA process.
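Generating random numbers from a selected distribution, the last step this abstract mentions, is often done by inverse transform sampling. The sketch below applies it to a two-parameter Weibull distribution, a frequent choice for life data; the abstract does not name the distributions used, so the Weibull choice and its parameters are assumptions.

```python
import math
import random


def sample_weibull(shape, scale, rng):
    """Draw one Weibull variate by inverting the CDF.

    The CDF is F(x) = 1 - exp(-(x / scale)^shape), so solving F(x) = u
    gives x = scale * (-ln(1 - u))^(1 / shape).
    """
    u = rng.random()  # uniform on [0, 1)
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


rng = random.Random(0)  # fixed seed for reproducibility
draws = [sample_weibull(shape=2.0, scale=100.0, rng=rng) for _ in range(1000)]
sample_mean = sum(draws) / len(draws)
```

The same pattern works for any distribution with an invertible CDF, which makes it convenient inside a Monte Carlo loop.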
Detection of bearing damage by statistic vibration analysis
NASA Astrophysics Data System (ADS)
Sikora, E. A.
2016-04-01
The condition of bearings, which are essential components in mechanisms, is crucial to safety. The analysis of the bearing vibration signal, which is always contaminated by certain types of noise, is a very important standard for mechanical condition diagnosis of the bearing and mechanical failure phenomenon. In this paper the method of rolling bearing fault detection by statistical analysis of vibration is proposed to filter out Gaussian noise contained in a raw vibration signal. The results of experiments show that the vibration signal can be significantly enhanced by application of the proposed method. Besides, the proposed method is used to analyse real acoustic signals of a bearing with inner race and outer race faults, respectively. The values of attributes are determined according to the degree of the fault. The results confirm that the periods between the transients, which represent bearing fault characteristics, can be successfully detected.
Vibroacoustic optimization using a statistical energy analysis model
NASA Astrophysics Data System (ADS)
Culla, Antonio; D`Ambrogio, Walter; Fregolent, Annalisa; Milana, Silvia
2016-08-01
In this paper, an optimization technique for medium-high frequency dynamic problems based on the Statistical Energy Analysis (SEA) method is presented. Using a SEA model, the subsystem energies are controlled by internal loss factors (ILF) and coupling loss factors (CLF), which in turn depend on the physical parameters of the subsystems. A preliminary sensitivity analysis of subsystem energy to CLFs is performed to select the CLFs that have the greatest effect on subsystem energies. Since the injected power depends not only on the external loads but on the physical parameters of the subsystems as well, it must be taken into account under certain conditions. This is accomplished in the optimization procedure, where approximate relationships between CLFs, injected power and physical parameters are derived. The approach is applied to a typical aeronautical structure: the cabin of a helicopter.
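The way loss factors control subsystem energies can be seen in the standard SEA power balance, written here for two coupled subsystems. The abstract does not give its equations, so this is the textbook form with assumed notation: P_i is the power injected into subsystem i, E_i its energy, η_i its internal loss factor, η_ij the coupling loss factor from subsystem i to j, and ω the band center angular frequency.

```latex
\begin{aligned}
P_1 &= \omega\,\eta_1 E_1 + \omega\left(\eta_{12} E_1 - \eta_{21} E_2\right),\\
P_2 &= \omega\,\eta_2 E_2 + \omega\left(\eta_{21} E_2 - \eta_{12} E_1\right).
\end{aligned}
```

Each equation balances injected power against internal dissipation plus the net power exchanged with the other subsystem, so changing a coupling loss factor redistributes the energies E_1 and E_2.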
Statistical learning analysis in neuroscience: aiming for transparency.
Hanke, Michael; Halchenko, Yaroslav O; Haxby, James V; Pollmann, Stefan
2010-01-01
Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires "neuroscience-aware" technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270
First statistical analysis of Geant4 quality software metrics
NASA Astrophysics Data System (ADS)
Ronchieri, Elisabetta; Grazia Pia, Maria; Giacomini, Francesco
2015-12-01
Geant4 is a simulation system of particle transport through matter, widely used in several experimental areas from high energy physics and nuclear experiments to medical studies. Some of its applications may involve critical use cases; therefore they would benefit from an objective assessment of the software quality of Geant4. In this paper, we provide a first statistical evaluation of software metrics data related to a set of Geant4 physics packages. The analysis aims at identifying risks for Geant4 maintainability, which would benefit from being addressed at an early stage. The findings of this pilot study set the grounds for further extensions of the analysis to the whole of Geant4 and to other high energy physics software systems.
Statistical analysis of cascading failures in power grids
Chertkov, Michael; Pfitzner, Rene; Turitsyn, Konstantin
2010-12-01
We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete time and sequential resolution of individual failures. The model, in its simplest realization based on the Direct Current (DC) description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.
FRATS: Functional Regression Analysis of DTI Tract Statistics
Zhu, Hongtu; Styner, Martin; Tang, Niansheng; Liu, Zhexing; Lin, Weili; Gilmore, John H.
2010-01-01
Diffusion tensor imaging (DTI) provides important information on the structure of white matter fiber bundles as well as detailed tissue properties along these fiber bundles in vivo. This paper presents a functional regression framework, called FRATS, for the analysis of multiple diffusion properties along fiber bundles as functions in an infinite dimensional space and their association with a set of covariates of interest, such as age, diagnostic status and gender, in real applications. The functional regression framework consists of four integrated components: the local polynomial kernel method for smoothing multiple diffusion properties along individual fiber bundles, a functional linear model for characterizing the association between fiber bundle diffusion properties and a set of covariates, a global test statistic for testing hypotheses of interest, and a resampling method for approximating the p-value of the global test statistic. The proposed methodology is applied to characterizing the development of five diffusion properties including fractional anisotropy, mean diffusivity, and the three eigenvalues of the diffusion tensor along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment. Significant age and gestational age effects on the five diffusion properties were found in both tracts. The resulting analysis pipeline can be used for understanding normal brain development, the neural bases of neuropsychiatric disorders, and the joint effects of environmental and genetic factors on white matter fiber bundles. PMID:20335089
Design and statistical analysis of oral medicine studies: common pitfalls.
Baccaglini, L; Shuster, J J; Cheng, J; Theriaque, D W; Schoenbach, V J; Tomar, S L; Poole, C
2010-04-01
A growing number of articles are emerging in the medical and statistics literature that describe epidemiologic and statistical flaws of research studies. Many examples of these deficiencies are encountered in the oral, craniofacial, and dental literature. However, only a handful of methodologic articles have been published in the oral literature warning investigators of potential errors that may arise early in the study and that can irreparably bias the final results. In this study, we briefly review some of the most common pitfalls that our team of epidemiologists and statisticians has identified during the review of submitted or published manuscripts and research grant applications. We use practical examples from the oral medicine and dental literature to illustrate potential shortcomings in the design and analysis of research studies, and how these deficiencies may affect the results and their interpretation. A good study design is essential, because errors in the analysis can be corrected if the design was sound, but flaws in study design can lead to data that are not salvageable. We recommend consultation with an epidemiologist or a statistician during the planning phase of a research study to optimize study efficiency, minimize potential sources of bias, and document the analytic plan.
ANALYSIS OF MPC ACCESS REQUIREMENTS FOR ADDITION OF FILLER MATERIALS
W. Wallin
1996-09-03
This analysis is prepared by the Mined Geologic Disposal System (MGDS) Waste Package Development Department (WPDD) in response to a request received via a QAP-3-12 Design Input Data Request (Ref. 5.1) from WAST Design (formerly MRSMPC Design). The request is to provide: Specific MPC access requirements for the addition of filler materials at the MGDS (i.e., location and size of access required). The objective of this analysis is to provide a response to the foregoing request. The purpose of this analysis is to provide a documented record of the basis for the response. The response is stated in Section 8 herein. The response is based upon requirements from an MGDS perspective.
NASA Astrophysics Data System (ADS)
Miranda, M.; Dorrío, B. V.; Blanco, J.; Diz-Bugarín, J.; Ribas, F.
2011-01-01
Several metrological applications base their measurement principle on the phase sum or difference between two patterns, one original, s(r,phi), and another modified, t(r,phi+Δphi). Additive or differential phase shifting algorithms directly recover the sum 2phi+Δphi or the difference Δphi of phases without requiring prior calculation of the individual phases. These algorithms can be constructed, for example, from a suitable combination of known phase shifting algorithms. Little has been written on the design, analysis and error compensation of these new two-stage algorithms. Previously we have used computer simulation to study, in a linear approach or with a filter process in reciprocal space, the response of several families of them to the main error sources. In this work we present an error analysis that uses Monte Carlo simulation to achieve results in good agreement with those obtained with spatial and temporal methods.
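As an illustration of how a differential or additive algorithm can be assembled from a known phase shifting algorithm, the sketch below combines two copies of the standard four-step algorithm (shifts of 0, π/2, π and 3π/2). The choice of the four-step algorithm, the fringe model and all function names are assumptions made for this example; the abstract does not say which algorithm families the authors studied.

```python
import math

# Assumed phase shifts of the four-step algorithm.
SHIFTS = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

def fringe_samples(phase, background=1.0, contrast=0.5):
    """Ideal intensities I_k = background + contrast * cos(phase + shift_k)."""
    return [background + contrast * math.cos(phase + d) for d in SHIFTS]

def num_den(samples):
    """Four-step numerator and denominator, proportional to sin and cos of the phase."""
    i1, i2, i3, i4 = samples
    return i4 - i2, i1 - i3

def differential_phase(s_samples, t_samples):
    """Wrapped phase difference (t minus s), without computing either phase itself."""
    ns, ds = num_den(s_samples)
    nt, dt = num_den(t_samples)
    # sin(a - b) = sin a cos b - cos a sin b ; cos(a - b) = cos a cos b + sin a sin b
    return math.atan2(nt * ds - dt * ns, dt * ds + nt * ns)

def additive_phase(s_samples, t_samples):
    """Wrapped phase sum (t plus s), without computing either phase itself."""
    ns, ds = num_den(s_samples)
    nt, dt = num_den(t_samples)
    # sin(a + b) = sin a cos b + cos a sin b ; cos(a + b) = cos a cos b - sin a sin b
    return math.atan2(nt * ds + dt * ns, dt * ds - nt * ns)

# Hypothetical noise-free example: phi = 0.7 rad, delta_phi = 0.4 rad.
phi, delta_phi = 0.7, 0.4
s = fringe_samples(phi)
t = fringe_samples(phi + delta_phi)
print(differential_phase(s, t), additive_phase(s, t))
```

A Monte Carlo error analysis in the spirit of the abstract would perturb the samples (or the shifts) with random errors and collect the statistics of the recovered sum or difference over many trials.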
Statistical Analysis of Surface Water Quality Data of Eastern Massachusetts
NASA Astrophysics Data System (ADS)
Andronache, C.; Hon, R.; Tedder, N.; Xian, Q.; Schaudt, B.
2008-05-01
We present a characterization of the current state of surface water, its changes in time and its dependence on land use, precipitation regime, and possible other natural and human influences based on data from the USGS National Water Quality Assessment (NAWQA) Program for New England streams. Time series analysis is used to detect changes and relationships with discharge and precipitation regime. Statistical techniques are employed to analyze relationships among the multiple chemical variables monitored. Analysis of ion concentrations reveals information about possible natural sources and processes, and anthropogenic influences. A notable example is the increase in salt concentration in ground and surface waters, with impact on drinking water quality. Salt concentration increase in water can be linked to road salt usage during winters with heavy snowfall and other factors. Road salt enters water supplies by percolation through soil into groundwater or runoff and drainage into reservoirs. After entering fast-flowing streams, rivers and lakes, salt runoff concentrations are rapidly diluted. Road salt infiltration is more common for groundwater-based supplies, such as wells, springs, and reservoirs that are recharged mainly by groundwater. We use principal component analysis and other statistical procedures to obtain a description of the dominant independent variables that influence the observed chemical compositional range. In most cases, over 85 percent of the total variation can be explained by 3 to 4 components. The overwhelming variation is attributed to a large compositional range of Na and Cl seen even if all data are combined into a single dataset. Na versus Cl correlation coefficients are commonly greater than 0.9. Second components are typically associated with dilutions by overland flows (non-winter months) and/or increased concentrations due to evaporation (summer season) or overland flows (winter season) if a snow storm is followed by the application of deicers on road
Statistical Scalability Analysis of Communication Operations in Distributed Applications
Vetter, J S; McCracken, M O
2001-02-27
Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications, and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for individual communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.
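A minimal sketch of the kind of non-parametric correlation described above. It assumes Spearman's rank correlation as the statistic (the abstract does not name one), and the operation names and timings are hypothetical values invented for illustration, not data from the paper.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical scalability experiment: task counts, total communication
# time, and time spent in two individual operations (seconds).
tasks = [2, 4, 8, 16, 32, 64]
total_comm = [10.0, 12.0, 15.0, 20.0, 28.0, 40.0]
op_time = {
    "allreduce": [1.0, 1.8, 3.0, 5.0, 9.8, 18.0],
    "send": [5.0, 5.4, 6.0, 7.0, 8.4, 10.0],
}

# Ratio of each operation's time to overall communication time.
ratio_by_op = {
    op: [t / total for t, total in zip(times, total_comm)]
    for op, times in op_time.items()
}

for op, ratios in ratio_by_op.items():
    print(op, spearman(tasks, ratios))
```

Under this reading, an operation whose share of communication time rises steadily with the task count gets a coefficient near +1 and would be flagged as scaling poorly, while a coefficient near -1 indicates a shrinking share.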
Statistical analysis of the autoregressive modeling of reverberant speech.
Gaubitch, Nikolay D; Ward, Darren B; Naylor, Patrick A
2006-12-01
Hands-free speech input is required in many modern telecommunication applications that employ autoregressive (AR) techniques such as linear predictive coding. When the hands-free input is obtained in enclosed reverberant spaces such as typical office rooms, the speech signal is distorted by the room transfer function. This paper utilizes theoretical results from statistical room acoustics to analyze the AR modeling of speech under these reverberant conditions. Three cases are considered: (i) AR coefficients calculated from a single observation; (ii) AR coefficients calculated jointly from an M-channel observation (M > 1); and (iii) AR coefficients calculated from the output of a delay-and-sum beamformer. The statistical analysis, with supporting simulations, shows that the spatial expectation of the AR coefficients for cases (i) and (ii) is approximately equal to that of the original speech, while for case (iii) there is a discrepancy due to spatial correlation between the microphones which can be significant. It is subsequently demonstrated that at each individual source-microphone position (without spatial expectation), the M-channel AR coefficients from case (ii) provide the best approximation to the clean speech coefficients when microphones are closely spaced (<0.3m). PMID:17225429
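For readers unfamiliar with how AR (linear prediction) coefficients are obtained in the single-observation case (i), the sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion. This is a generic illustration, not the paper's estimator; the function names and the example autocorrelation values are assumptions.

```python
def autocorrelation(x, max_lag):
    """Biased sample autocorrelation r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for AR coefficients.

    Returns (a, err) where the predictor is
        x[t] ~ a[0]*x[t-1] + a[1]*x[t-2] + ... + a[order-1]*x[t-order]
    and err is the final prediction error power.
    """
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        # Reflection coefficient for model order m + 1.
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        k_m = acc / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(m):
            new_a[k] = a[k] - k_m * a[m - 1 - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a, err

# Hypothetical check: exact normalized autocorrelation of an AR(2)
# process with coefficients (0.5, -0.3).
a1, a2 = 0.5, -0.3
rho1 = a1 / (1.0 - a2)
rho2 = a1 * rho1 + a2
coeffs, err = levinson_durbin([1.0, rho1, rho2], order=2)
print(coeffs, err)
```

With measured speech, `autocorrelation(frame, order)` would supply `r`; in the reverberant setting discussed above the frame would be the microphone signal rather than the clean speech.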
Constraining cosmology with shear peak statistics: tomographic analysis
NASA Astrophysics Data System (ADS)
Martinet, Nicolas; Bartlett, James G.; Kiessling, Alina; Sartoris, Barbara
2015-09-01
The abundance of peaks in weak gravitational lensing maps is a potentially powerful cosmological tool, complementary to measurements of the shear power spectrum. We study peaks detected directly in shear maps, rather than convergence maps, an approach that has the advantage of working directly with the observable quantity, the galaxy ellipticity catalog. Using large numbers of numerical simulations to accurately predict the abundance of peaks and their covariance, we quantify the cosmological constraints attainable by a large-area survey similar to that expected from the Euclid mission, focusing on the density parameter, Ωm, and on the power spectrum normalization, σ8, for illustration. We present a tomographic peak counting method that improves the conditional (marginal) constraints by a factor of 1.2 (2) over those from a two-dimensional (i.e., non-tomographic) peak-count analysis. We find that peak statistics provide constraints an order of magnitude less accurate than those from the cluster sample in the ideal situation of a perfectly known observable-mass relation; however, when the scaling relation is not known a priori, the shear-peak constraints are twice as strong and orthogonal to the cluster constraints, highlighting the value of using both clusters and shear-peak statistics.
Statistical analysis of test data for APM rod issue
Edwards, T.B.; Harris, S.P.; Reeve, C.P.
1992-05-01
The uncertainty associated with the use of the K-Reactor axial power monitors (APMs) to measure roof-top-ratios is investigated in this report. Internal heating test data acquired under both DC-flow conditions and AC-flow conditions have been analyzed. These tests were conducted to simulate gamma heating at the lower power levels planned for reactor operation. The objective of this statistical analysis is to investigate the relationship between the observed and true roof-top-ratio (RTR) values and associated uncertainties at power levels within this lower operational range. Conditional on a given, known power level, a prediction interval for the true RTR value corresponding to a new, observed RTR is given. This is done for a range of power levels. Estimates of total system uncertainty are also determined by combining the analog-to-digital converter uncertainty with the results from the test data.
Statistical models of video structure for content analysis and characterization.
Vasconcelos, N; Lippman, A
2000-01-01
Content structure plays an important role in the understanding of video. In this paper, we argue that knowledge about structure can be used both as a means to improve the performance of content analysis and to extract features that convey semantic information about the content. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models with two practical applications. First, we develop a Bayesian formulation for the shot segmentation problem that is shown to extend the standard thresholding model in an adaptive and intuitive way, leading to improved segmentation accuracy. Second, by applying the transformation into the shot duration/activity feature space to a database of movie clips, we also illustrate how the Bayesian model captures semantic properties of the content. We suggest ways in which these properties can be used as a basis for intuitive content-based access to movie libraries.
Statistical analysis of a carcinogen mixture experiment. I. Liver carcinogens.
Elashoff, R M; Fears, T R; Schneiderman, M A
1987-09-01
This paper describes factorial experiments designed to determine whether 2 liver carcinogens act synergistically to produce liver cancers in Fischer 344 rats. Four hepatocarcinogens, cycad flour, lasiocarpine (CAS: 303-34-4), aflatoxin B1 (CAS: 1162-65-8), and dipentylnitrosamine (CAS: 13256-06-9), were studied in pairwise combinations. Each of the 6 possible pairs was studied by means of a 4 × 4 factorial experiment, each agent being fed at zero and at 3 non-zero doses. Methods of analysis designed explicitly for this study were derived to study interaction. These methods were supplemented by standard statistical methods appropriate for one-at-a-time studies. Antagonism was not discovered in any chemical mixture. Some chemical mixtures did interact synergistically. Findings for male and female animals were generally, but not always, in agreement.
Barcode localization with region based gradient statistical analysis
NASA Astrophysics Data System (ADS)
Chen, Zhiyuan; Zhao, Yuming
2015-03-01
Barcodes, as a data representation method, have been adopted in a wide range of areas. With the rise of smartphones and hand-held devices equipped with high-resolution cameras and substantial computing power, barcode techniques have found even more extensive applications. In industrial settings, barcode reading systems are required to be robust to blur, illumination change, pitch, rotation, and scale change. This paper proposes a new approach to localizing barcodes using region-based gradient statistical analysis. On this basis, four algorithms have been developed for Linear, PDF417, Stacked 1D1D and Stacked 1D2D barcodes, respectively. Evaluated on our challenging dataset of more than 17000 images, the methods achieve an average localization accuracy of 82.17% across 8 kinds of distortions, with an average time of 12 ms.
Statistical analysis of arch shape with conic sections.
Sampson, P D
1983-06-01
Arcs of conic sections are used to model the shapes of human dental arches and to provide a basis for the statistical and graphical analysis of a population of shapes. The Bingham distribution, an elliptical distribution on a hypersphere, is applied in order to model the coefficients of the conic arcs. It provides a definition of an 'average shape' and it quantifies variation in shape. Geometric envelopes of families of conic arcs whose coefficients satisfy a quadratic constraint are used to depict the distribution of shapes in the plane and to make graphical inferences about the average shape. The methods are demonstrated with conic arcs fitted to a sample of 66 maxillary dental arches.
Dental arch shape: a statistical analysis using conic sections.
Sampson, P D
1981-05-01
This report addresses two problems in the study of the shape of human dental arches; (1) the description of arch shape by mathematical functions and (2) the description of variation among the dental arch shapes in a population. A new algorithm for fitting conic sections is used to model the maxillary dental arches of a sample of sixty-six subjects. A statistical model for shapes represented by arcs of conic sections is demonstrated on the sample of sixty-six dental arches. It permits the definition of an "average shape" and the graphic representation of variation in shape. The model and methods of analysis presented should help dental scientists to better define and quantify "normal" or "ideal" shapes and "normal ranges of variation" for the shape of the dental arch.
A statistical analysis of the daily streamflow hydrograph
NASA Astrophysics Data System (ADS)
Kavvas, M. L.; Delleur, J. W.
1984-03-01
In this study a periodic statistical analysis of daily streamflow data in Indiana, U.S.A., was performed to gain some new insight into the stochastic structure which describes the daily streamflow process. This analysis was performed by means of the periodic mean and covariance functions of the daily streamflows, the time- and peak discharge-dependent recession limb of the daily streamflow hydrograph, the time- and discharge exceedance level (DEL)-dependent probability distribution of the hydrograph peak interarrival time, and the time-dependent probability distribution of the time to peak discharge. Some new statistical estimators were developed and used in this study. In general, this study has shown that: (a) the persistence properties of daily flows depend on the storage state of the basin at the specified time origin of the flow process; (b) the daily streamflow process is time irreversible; (c) the probability distribution of the daily hydrograph peak interarrival time depends both on the occurrence time of the peak from which the interarrival time originates and on the discharge exceedance level; and (d) if the daily streamflow process is modeled as the release from a linear watershed storage, this release should depend on the state of the storage and on the time of the release, as the persistence properties and the recession limb decay rates were observed to change with the state of the watershed storage and time. Therefore, a time-varying reservoir system needs to be considered if the daily streamflow process is to be modeled as the release from a linear watershed storage.
Statistical analysis and modelling of small satellite reliability
NASA Astrophysics Data System (ADS)
Guo, Jian; Monas, Liora; Gill, Eberhard
2014-05-01
This paper attempts to characterize failure behaviour of small satellites through statistical analysis of actual in-orbit failures. A unique Small Satellite Anomalies Database comprising empirical failure data of 222 small satellites has been developed. A nonparametric analysis of the failure data has been implemented by means of a Kaplan-Meier estimation. An innovative modelling method, i.e. Bayesian theory in combination with Markov Chain Monte Carlo (MCMC) simulations, has been proposed to model the reliability of small satellites. An extensive parametric analysis using the Bayesian/MCMC method has been performed to fit a Weibull distribution to the data. The influence of several characteristics such as the design lifetime, mass, launch year, mission type and the type of satellite developers on the reliability has been analyzed. The results clearly show the infant mortality of small satellites. Compared with the classical maximum-likelihood estimation methods, the proposed Bayesian/MCMC method results in better fitting Weibull models and is especially suitable for reliability modelling where only very limited failures are observed.
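The nonparametric step mentioned above can be illustrated with a small Kaplan-Meier (product-limit) estimator. This is a generic sketch of the standard estimator; the satellite lifetimes and censoring flags below are hypothetical values, not entries from the Small Satellite Anomalies Database.

```python
def kaplan_meier(times, failed):
    """Kaplan-Meier survival estimate.

    times  : time of failure or of censoring (e.g. still operating at
             the end of the observation window) for each unit
    failed : True if the unit failed at that time, False if censored
    Returns a list of (t, S(t)) at each distinct observed failure time.
    """
    data = sorted(zip(times, failed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all units that fail or are censored at the same time t.
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical lifetimes in years; False marks a censored observation.
years = [1, 2, 2, 3, 5]
failed = [True, True, False, True, False]
for t, s in kaplan_meier(years, failed):
    print(t, round(s, 3))
```

A steep early drop in such a curve is the signature of the infant mortality reported in the abstract; the parametric Weibull fit would then be compared against this nonparametric estimate.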
Helioseismology of pre-emerging active regions. III. Statistical analysis
Barnes, G.; Leka, K. D.; Braun, D. C.; Birch, A. C.
2014-05-01
The subsurface properties of active regions (ARs) prior to their appearance at the solar surface may shed light on the process of AR formation. Helioseismic holography has been applied to samples taken from two populations of regions on the Sun (pre-emergence and without emergence), each sample having over 100 members, that were selected to minimize systematic bias, as described in Paper I. Paper II showed that there are statistically significant signatures in the average helioseismic properties that precede the formation of an AR. This paper describes a more detailed analysis of the samples of pre-emergence regions and regions without emergence based on discriminant analysis. The property that is best able to distinguish the populations is found to be the surface magnetic field, even a day before the emergence time. However, after accounting for the correlations between the surface field and the quantities derived from helioseismology, there is still evidence of a helioseismic precursor to AR emergence that is present for at least a day prior to emergence, although the analysis presented cannot definitively determine the subsurface properties prior to emergence due to the small sample sizes.
Classification of Malaysia aromatic rice using multivariate statistical analysis
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-15
Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: lengthy training time, proneness to fatigue as the number of samples increases, and inconsistency. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis, and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN, with low misclassification error, support the above findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
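The Leave-One-Out evaluation of a KNN classifier described above can be sketched as follows. The two-feature "sensor responses", the variety labels and the choice of k = 3 are hypothetical values chosen for illustration; the abstract does not give the e-nose feature set or the value of k.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_error(X, y, k):
    """Leave-One-Out misclassification rate: each sample is held out once
    and classified using all remaining samples."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k) != y[i]:
            errors += 1
    return errors / len(X)

# Hypothetical two-sensor responses for two rice varieties.
X = [
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [1.0, 0.95],   # variety A
    [3.0, 3.1], [3.2, 2.9], [2.9, 3.0], [3.1, 3.05],   # variety B
]
y = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(loo_error(X, y, k=3))
```

In practice the inputs to KNN would typically be the PCA or LDA scores of the sensor array readings rather than raw values, but that pipeline detail is not specified in the abstract.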
Multivariate Statistical Analysis of MSL APXS Bulk Geochemical Data
NASA Astrophysics Data System (ADS)
Hamilton, V. E.; Edwards, C. S.; Thompson, L. M.; Schmidt, M. E.
2014-12-01
We apply cluster and factor analyses to bulk chemical data of 130 soil and rock samples measured by the Alpha Particle X-ray Spectrometer (APXS) on the Mars Science Laboratory (MSL) rover Curiosity through sol 650. Multivariate approaches such as principal components analysis (PCA), cluster analysis, and factor analysis complement more traditional approaches (e.g., Harker diagrams), with the advantage of simultaneously examining the relationships between multiple variables for large numbers of samples. Principal components analysis has been applied with success to APXS, Pancam, and Mössbauer data from the Mars Exploration Rovers. Factor analysis and cluster analysis have been applied with success to thermal infrared (TIR) spectral data of Mars. Cluster analyses group the input data by similarity, where there are a number of different methods for defining similarity (hierarchical, density, distribution, etc.). For example, without any assumptions about the chemical contributions of surface dust, preliminary hierarchical and K-means cluster analyses clearly distinguish the physically adjacent rock targets Windjana and Stephen as being distinctly different from lithologies observed prior to Curiosity's arrival at The Kimberley. In addition, they are separated from each other, consistent with chemical trends observed in variation diagrams but without requiring assumptions about chemical relationships. We will discuss the variation in cluster analysis results as a function of clustering method and pre-processing (e.g., log transformation, correction for dust cover) and implications for interpreting chemical data. Factor analysis shares some similarities with PCA, and examines the variability among observed components of a dataset so as to reveal variations attributable to unobserved components. Factor analysis has been used to extract the TIR spectra of components that are typically observed in mixtures and only rarely in isolation; there is the potential for similar
Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan
2015-01-01
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple (even distinct) traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10^-8) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10^-7) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes. PMID:25500260
Statistical Methods for Analysis of High-Throughput RNA Interference Screens
Birmingham, Amanda; Selfors, Laura M.; Forster, Thorsten; Wrobel, David; Kennedy, Caleb J.; Shanks, Emma; Santoyo-Lopez, Javier; Dunican, Dara J.; Long, Aideen; Kelleher, Dermot; Smith, Queta; Beijersbergen, Roderick L.; Ghazal, Peter; Shamu, Caroline E.
2009-01-01
RNA interference (RNAi) has become a powerful technique for reverse genetics and drug discovery and, in both of these areas, large-scale high-throughput RNAi screens are commonly performed. The statistical techniques used to analyze these screens are frequently borrowed directly from small-molecule screening; however small-molecule and RNAi data characteristics differ in meaningful ways. We examine the similarities and differences between RNAi and small-molecule screens, highlighting particular characteristics of RNAi screen data that must be addressed during analysis. Additionally, we provide guidance on selection of analysis techniques in the context of a sample workflow. PMID:19644458
ERIC Educational Resources Information Center
Pearson, Kathryn
2008-01-01
Macquarie University Library was concerned about the length of time that elapsed between placement of an interlibrary loan request and the satisfaction of that request. Taking advantage of improved statistical information available to them through membership of the CLIC Consortium, library staff investigated the reasons for delivery delay. This led to…
Statistical Analysis of Tank 5 Floor Sample Results
Shine, E. P.
2013-01-31
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
Statistical Analysis Of Tank 5 Floor Sample Results
Shine, E. P.
2012-08-01
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
STATISTICAL ANALYSIS OF TANK 5 FLOOR SAMPLE RESULTS
Shine, E.
2012-03-14
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, radionuclide, inorganic, and anion concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their
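As a rough illustration of the UCL95 computation named in the Tank 5 records above, the following Python sketch computes a one-sided Student's-t upper confidence limit for the mean of three composite-sample results. The choice of the Student's-t formula is an assumption made for illustration; the abstracts only say that the procedures were consistent with EPA guidance, which covers several UCL methods. The function name and the example values are hypothetical.

```python
import math
import statistics

# One-sided 95% Student's t quantile for 2 degrees of freedom (n = 3 samples).
T_95_DF2 = 2.920


def ucl95_three_samples(values):
    """Student's-t UCL95 for the mean of exactly three measurements.

    Assumption: a t-based UCL is used here for illustration only; it is
    not confirmed as the method applied in the report.
    """
    if len(values) != 3:
        raise ValueError("this sketch hardcodes the t quantile for n = 3")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean + T_95_DF2 * sd / math.sqrt(3)


# Hypothetical concentrations for three composite samples.
print(round(ucl95_three_samples([10.0, 12.0, 14.0]), 4))
```

With only two degrees of freedom the t quantile is large, so the upper bound sits well above the sample mean even for modest scatter.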
NASA Astrophysics Data System (ADS)
Donges, Jonathan; Petrova, Irina; Löw, Alexander; Marwan, Norbert; Kurths, Jürgen
2015-04-01
Eigen techniques such as empirical orthogonal function (EOF) or coupled pattern (CP) / maximum covariance analysis have been frequently used for detecting patterns in multivariate climatological data sets. Recently, statistical methods originating from the theory of complex networks have been employed for the very same purpose of spatio-temporal analysis. This climate network (CN) analysis is usually based on the same set of similarity matrices as is used in classical EOF or CP analysis, e.g., the correlation matrix of a single climatological field or the cross-correlation matrix between two distinct climatological fields. In this study, formal relationships as well as conceptual differences between both eigen and network approaches are derived and illustrated using global precipitation, evaporation and surface air temperature data sets. These results allow us to pinpoint that CN analysis can complement classical eigen techniques and provides additional information on the higher-order structure of statistical interrelationships in climatological data. Hence, CNs are a valuable supplement to the statistical toolbox of the climatologist, particularly for making sense out of very large data sets such as those generated by satellite observations and climate model intercomparison exercises.
NASA Astrophysics Data System (ADS)
Donges, Jonathan F.; Petrova, Irina; Loew, Alexander; Marwan, Norbert; Kurths, Jürgen
2015-11-01
Eigen techniques such as empirical orthogonal function (EOF) or coupled pattern (CP)/maximum covariance analysis have been frequently used for detecting patterns in multivariate climatological data sets. Recently, statistical methods originating from the theory of complex networks have been employed for the very same purpose of spatio-temporal analysis. This climate network (CN) analysis is usually based on the same set of similarity matrices as is used in classical EOF or CP analysis, e.g., the correlation matrix of a single climatological field or the cross-correlation matrix between two distinct climatological fields. In this study, formal relationships as well as conceptual differences between both eigen and network approaches are derived and illustrated using global precipitation, evaporation and surface air temperature data sets. These results allow us to pinpoint that CN analysis can complement classical eigen techniques and provides additional information on the higher-order structure of statistical interrelationships in climatological data. Hence, CNs are a valuable supplement to the statistical toolbox of the climatologist, particularly for making sense out of very large data sets such as those generated by satellite observations and climate model intercomparison exercises.
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
A statistical method for draft tube pressure pulsation analysis
NASA Astrophysics Data System (ADS)
Doerfler, P. K.; Ruchonnet, N.
2012-11-01
Draft tube pressure pulsation (DTPP) in Francis turbines is composed of various components originating from different physical phenomena. These components may be separated because they differ by their spatial relationships and by their propagation mechanism. The first step for such an analysis was to distinguish between so-called synchronous and asynchronous pulsations; only approximately periodic phenomena could be described in this manner. However, less regular pulsations are always present, and these become important when turbines have to operate in the far off-design range, in particular at very low load. The statistical method described here makes it possible to separate the stochastic (random) component from the two traditional 'regular' components. It works in connection with the standard technique of model testing with several pressure signals measured in the draft tube cone. The difference between the individual signals and the averaged pressure signal, together with the coherence between the individual pressure signals, is used for analysis. An example reveals that a generalized, non-periodic version of the asynchronous pulsation is important at low load.
Data Analysis & Statistical Methods for Command File Errors
NASA Technical Reports Server (NTRS)
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
Autotasked Performance in the NAS Workload: A Statistical Analysis
NASA Technical Reports Server (NTRS)
Carter, R. L.; Stockdale, I. E.; Kutler, Paul (Technical Monitor)
1998-01-01
A statistical analysis of the workload performance of a production quality FORTRAN code for five different Cray Y-MP hardware and system software configurations is performed. The analysis was based on an experimental procedure that was designed to minimize correlations between the number of requested CPUs and the time of day the runs were initiated. Observed autotasking overheads were significantly larger for the set of jobs that requested the maximum number of CPUs. The UNICOS 6 releases show consistent wall clock speedups in the workload of around 2, which is quite good. The observed speedups were very similar for the set of jobs that requested 8 CPUs and the set that requested 4 CPUs. The original NAS algorithm for determining charges to the user discourages autotasking in the workload. A new charging algorithm to be applied to jobs run in the NQS multitasking queues also discourages NAS users from using autotasking. The new algorithm favors jobs requesting 8 CPUs over those that request fewer, although the jobs requesting 8 CPUs experienced significantly higher overhead and presumably degraded system throughput. A charging algorithm is presented that has the following desirable characteristics when applied to the data: higher overhead jobs requesting 8 CPUs are penalized when compared to moderate overhead jobs requesting 4 CPUs, thereby providing a charging incentive to NAS users to use autotasking in a manner that provides them with significantly improved turnaround while also maintaining system throughput.
External quality assessment in water microbiology: statistical analysis of performance.
Tillett, H E; Lightfoot, N F; Eaton, S
1993-04-01
A UK-based scheme of water microbiology assessment requires participants to record counts of relevant organisms. Not every sample will contain the target number of organisms because of natural variation and therefore a range of results is acceptable. Results which are tail-end (i.e. at the extreme low or high end of this range) could occasionally be reported by any individual laboratory by chance. Several tail-end results might imply a laboratory problem. Statistical assessment is done in two stages. A non-parametric test of the distribution of tail-end counts amongst laboratories is performed (Cochran's Q) and, if they are not random, then observed and expected frequencies of tail-end counts are compared to identify participants who may have reported excessive numbers of low or high results. Analyses so far have shown that laboratories find high counts no more frequently than would be expected by chance, but that significant clusters of low counts can be detected among participants. These findings have been observed both in short-term and in long-term assessments, thus allowing detection of new episodes of poor performance and intermittent problems. The analysis relies on an objective definition of tail-end results. Working definitions are presented which should identify poor performance in terms of microbiological significance, and which allow fair comparison between membrane-filtration and multiple-tube techniques. Smaller differences between laboratories, which may be statistically significant, will not be detected. Different definitions of poor performance could be incorporated into future assessments.
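The first stage described above, Cochran's Q test on tail-end counts, can be sketched in a few lines of Python. The layout is an assumption made for illustration: each row is one distributed sample, each column is one laboratory, and an entry is 1 if that laboratory reported a tail-end result for that sample and 0 otherwise. The function name and the example table are hypothetical.

```python
def cochran_q(table):
    """Return (Q, degrees of freedom) for a rows-by-columns table of 0/1 values.

    Rows are blocks (assumed here: samples) and columns are the groups
    being compared (assumed here: laboratories).
    """
    n_cols = len(table[0])
    if any(len(row) != n_cols for row in table):
        raise ValueError("all rows must have the same length")
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    denominator = n_cols * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    numerator = n_cols * sum(c * c for c in col_totals) - grand_total ** 2
    return (n_cols - 1) * numerator / denominator, n_cols - 1


# Hypothetical data: 6 samples (rows) reported by 3 laboratories (columns).
tail_end = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
q, df = cochran_q(tail_end)
print(q, df)  # 6.0 2
```

Under the null hypothesis that tail-end results are spread evenly across laboratories, Q is approximately chi-square distributed with the returned degrees of freedom, so it would be compared against that reference distribution.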
Spectroscopic analysis and DFT calculations of a food additive Carmoisine
NASA Astrophysics Data System (ADS)
Snehalatha, M.; Ravikumar, C.; Hubert Joe, I.; Sekar, N.; Jayakumar, V. S.
2009-04-01
FT-IR and Raman techniques were employed for the vibrational characterization of the food additive Carmoisine (E122). The equilibrium geometry, various bonding features, and harmonic vibrational wavenumbers have been investigated with the help of density functional theory (DFT) calculations. A good correlation was found between the computed and experimental wavenumbers. Azo stretching wavenumbers have been lowered due to conjugation and π-electron delocalization. Predicted electronic absorption spectra from TD-DFT calculation have been analysed comparing with the UV-vis spectrum. The first hyperpolarizability of the molecule is calculated. Intramolecular charge transfer (ICT) responsible for the optical nonlinearity of the dye molecule has been discussed theoretically and experimentally. Stability of the molecule arising from hyperconjugative interactions, charge delocalization and C-H⋯O, improper, blue shifted hydrogen bonds have been analysed using natural bond orbital (NBO) analysis.
[Analysis of constituents in urushi wax, a natural food additive].
Jin, Zhe-Long; Tada, Atsuko; Sugimoto, Naoki; Sato, Kyoko; Masuda, Aino; Yamagata, Kazuo; Yamazaki, Takeshi; Tanamoto, Kenichi
2006-08-01
Urushi wax is a natural gum base used as a food additive. In order to evaluate the quality of urushi wax as a food additive and to obtain information useful for setting official standards, we investigated the constituents and their concentrations in urushi wax, using the same sample as scheduled for toxicity testing. After methanolysis of urushi wax, the composition of fatty acids was analyzed by GC/MS. The results indicated that the main fatty acids were palmitic acid, oleic acid and stearic acid. LC/MS analysis of urushi wax provided molecular-related ions of the main constituents. The main constituents were identified as triglycerides, namely glyceryl tripalmitate (30.7%), glyceryl dipalmitate monooleate (21.2%), glyceryl dioleate monopalmitate (2.1%), glyceryl monooleate monopalmitate monostearate (2.6%), glyceryl dipalmitate monostearate (5.6%), glyceryl distearate monopalmitate (1.4%). Glyceryl dipalmitate monooleate isomers differing in the binding sites of each constituent fatty acid could be separately determined by LC/MS/MS. PMID:16984037
Decreasing Cloudiness Over China: An Updated Analysis Examining Additional Variables
Kaiser, D.P.
2000-01-14
As preparation of the IPCC's Third Assessment Report takes place, one of the many observed climate variables of key interest is cloud amount. For several nations of the world, there exist records of surface-observed cloud amount dating back to the middle of the 20th Century or earlier, offering valuable information on variations and trends. Studies using such databases include Sun and Groisman (1999) and Kaiser and Razuvaev (1995) for the former Soviet Union, Angell et al. (1984) for the United States, Henderson-Sellers (1986) for Europe, Jones and Henderson-Sellers (1992) for Australia, and Kaiser (1998) for China. The findings of Kaiser (1998) differ from the other studies in that much of China appears to have experienced decreased cloudiness over recent decades (1954-1994), whereas the other land regions for the most part show evidence of increasing cloud cover. This paper expands on Kaiser (1998) by analyzing trends in additional meteorological variables for China [station pressure (p), water vapor pressure (e), and relative humidity (rh)] and extending the total cloud amount (N) analysis an additional two years (through 1996).
Statistical Analysis of Data with Non-Detectable Values
Frome, E.L.
2004-08-26
Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a "limit of detection". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine if the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two-parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or "missed") value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical data analysis and graphics has greatly
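To make the maximum likelihood idea for left-censored lognormal data concrete, the Python sketch below writes out the log-likelihood that such a fit would maximize. Detected values contribute the lognormal density, and each non-detect contributes the probability of falling below its limit of detection. This is a generic textbook form given for illustration, not the report's own code (the report refers to R); the function names and example inputs are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_lognormal_loglik(mu, sigma, detects, detection_limits):
    """Log-likelihood of (mu, sigma) on the log scale.

    detects: measured values above their limit of detection.
    detection_limits: one limit per non-detect (left-censored) result.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    loglik = 0.0
    for x in detects:
        z = (math.log(x) - mu) / sigma
        # Log of the lognormal density at x.
        loglik += -math.log(x) - math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
    for limit in detection_limits:
        # Log of P(X < limit) for a censored observation.
        loglik += math.log(normal_cdf((math.log(limit) - mu) / sigma))
    return loglik


def lognormal_mean(mu, sigma):
    """Mean exposure level implied by the fitted log-scale parameters."""
    return math.exp(mu + 0.5 * sigma * sigma)


# Hypothetical inputs: one detect at 1.0 and one non-detect with limit 1.0.
print(censored_lognormal_loglik(0.0, 1.0, [1.0], [1.0]))
```

A fit would pass this function to a numerical optimizer over `mu` and `sigma`; the optimizer is left out here to keep the sketch short.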
ERIC Educational Resources Information Center
Petocz, Agnes; Newbery, Glenn
2010-01-01
Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…
Sensitivity analysis of geometric errors in additive manufacturing medical models.
Pinto, Jose Miguel; Arrieta, Cristobal; Andia, Marcelo E; Uribe, Sergio; Ramos-Grez, Jorge; Vargas, Alex; Irarrazaval, Pablo; Tejos, Cristian
2015-03-01
Additive manufacturing (AM) models are used in medical applications for surgical planning, prosthesis design and teaching. For these applications, the accuracy of the AM models is essential. Unfortunately, this accuracy is compromised due to errors introduced by each of the building steps: image acquisition, segmentation, triangulation, printing and infiltration. However, the contribution of each step to the final error remains unclear. We performed a sensitivity analysis comparing errors obtained from a reference with those obtained modifying parameters of each building step. Our analysis considered global indexes to evaluate the overall error, and local indexes to show how this error is distributed along the surface of the AM models. Our results show that the standard building process tends to overestimate the AM models, i.e. models are larger than the original structures. They also show that the triangulation resolution and the segmentation threshold are critical factors, and that the errors are concentrated at regions with high curvatures. Errors could be reduced choosing better triangulation and printing resolutions, but there is an important need for modifying some of the standard building processes, particularly the segmentation algorithms.
Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard
2013-01-01
Purpose: With emergence of clinical outcomes databases as tools utilized routinely within institutions, comes need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. Methods: A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code was evaluated using benchmark data sets. Results: The approach provides data needed to evaluate combinations of statistical measurements for ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operator characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. Conclusions: The work demonstrates the viability of the design approach and the software tool for analysis of large data sets. PMID:24320426
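One of the building blocks named in the Results above, a Fisher exact test on a 2×2 contingency table, can be sketched in pure Python as follows. The scenario is an assumption made for illustration: patients are split by whether their dose metric exceeds a candidate threshold and by whether the outcome occurred. The authors' tool combines C#.Net and R, so this is not their implementation; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    low = max(0, col1 - row2)
    high = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in (prob(x) for x in range(low, high + 1))
                  if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts: rows = above/below threshold, columns = event/no event.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

In a threshold search, a test like this would be repeated for each candidate cut point, which is why the abstract pairs it with other filters rather than relying on a single p-value.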
A statistical analysis of the impact of advertising signs on road safety.
Yannis, George; Papadimitriou, Eleonora; Papantoniou, Panagiotis; Voulgari, Chrisoula
2013-01-01
This research aims to investigate the impact of advertising signs on road safety. An exhaustive review of international literature was carried out on the effect of advertising signs on driver behaviour and safety. Moreover, a before-and-after statistical analysis with control groups was applied on several road sites with different characteristics in the Athens metropolitan area, in Greece, in order to investigate the correlation between the placement or removal of advertising signs and the related occurrence of road accidents. Road accident data for the 'before' and 'after' periods on the test sites and the control sites were extracted from the database of the Hellenic Statistical Authority, and the selected 'before' and 'after' periods vary from 2.5 to 6 years. The statistical analysis shows no statistical correlation between road accidents and advertising signs in any of the nine sites examined, as the confidence intervals of the estimated safety effects are non-significant at the 95% confidence level. This can be explained by the fact that, in the examined road sites, drivers are overloaded with information (traffic signs, direction signs, labels of shops, pedestrians and other vehicles, etc.) so that the additional information load from advertising signs may not further distract them. PMID:22587341
Common pitfalls in statistical analysis: Intention-to-treat versus per-protocol analysis
Ranganathan, Priya; Pramesh, C. S.; Aggarwal, Rakesh
2016-01-01
During the conduct of clinical trials, it is not uncommon to have protocol violations or inability to assess outcomes. This article in our series on common pitfalls in statistical analysis explains the complexities of analyzing results from such trials and highlights the importance of “intention-to-treat” analysis. PMID:27453832
Statistical Analysis of the AIAA Drag Prediction Workshop CFD Solutions
NASA Technical Reports Server (NTRS)
Morrison, Joseph H.; Hemsch, Michael J.
2007-01-01
The first AIAA Drag Prediction Workshop (DPW), held in June 2001, evaluated the results from an extensive N-version test of a collection of Reynolds-Averaged Navier-Stokes CFD codes. The code-to-code scatter was more than an order of magnitude larger than desired for design and experimental validation of cruise conditions for a subsonic transport configuration. The second AIAA Drag Prediction Workshop, held in June 2003, emphasized the determination of installed pylon-nacelle drag increments and grid refinement studies. The code-to-code scatter was significantly reduced compared to the first DPW, but still larger than desired. However, grid refinement studies showed no significant improvement in code-to-code scatter with increasing grid refinement. The third AIAA Drag Prediction Workshop, held in June 2006, focused on the determination of installed side-of-body fairing drag increments and grid refinement studies for clean attached flow on wing alone configurations and for separated flow on the DLR-F6 subsonic transport model. This report compares the transonic cruise prediction results of the second and third workshops using statistical analysis.
Statistical analysis of mission profile parameters of civil transport airplanes
NASA Technical Reports Server (NTRS)
Buxbaum, O.
1972-01-01
The statistical analysis of flight times as well as airplane gross weights and fuel weights of jet-powered civil transport airplanes has shown that the distributions of their frequency of occurrence per flight can be presented approximately in general form. Before, however, these results may be used during the project stage of an airplane for defining a typical mission profile (the parameters of which are assumed to occur, for example, with a probability of 50 percent), the following points have to be taken into account. Because the individual airplanes were rotated during service, the scatter between the distributions of mission profile parameters for airplanes of the same type, which were flown with similar payload, has proven to be very small. Significant deviations from the generalized distributions may occur if an operator uses one airplane preferably on one or two specific routes. Another reason for larger deviations could be that the maintenance services of the operators of the observed airplanes are not representative of other airlines. Although there are indications that this is unlikely, similar information should be obtained from other operators. Such information would improve the reliability of the data.
Statistical Analysis of Resistivity Anomalies Caused by Underground Caves
NASA Astrophysics Data System (ADS)
Frid, V.; Averbach, A.; Frid, M.; Dudkinski, D.; Liskevich, G.
2015-05-01
Geophysical prospecting for underground caves on a construction site is often still a challenging procedure. Estimating the likelihood level of a detected anomaly is frequently a mandatory requirement of the project principal because of the need for risk/safety assessment. However, a methodology for such estimation has not hitherto been developed. To put forward such a methodology, the present study (performed as part of underground cave mapping prior to land development on the site) applied electrical resistivity tomography (ERT) together with statistical analysis for the likelihood assessment of the underground anomalies located. The methodology was first verified via synthetic modeling, then applied to the ERT data collected in situ, and finally cross-referenced with intrusive investigations (excavation and drilling) for data verification. The drilling/excavation results showed that underground caves can be properly detected if the anomaly probability level is not lower than 90%. Such a probability value was shown to be consistent with the modeling results. More than 30 underground cavities were discovered on the site using the methodology.
Measurement of Plethysmogram and Statistical Method for Analysis
NASA Astrophysics Data System (ADS)
Shimizu, Toshihiro
The plethysmogram, which depends sensitively on the physical and mental state of the human body, is measured at different points of the body using a photo interrupter. In this paper, statistical methods of data analysis are investigated to discuss the dependence of the plethysmogram on stress and aging. The first is a representation method based on the return map, which provides useful information on the waveform, the fluctuation in phase, and the fluctuation in amplitude. The return map method makes it possible to understand the fluctuation of the plethysmogram in amplitude and in phase more clearly and globally than the conventional power spectrum method. The second is the Lissajous plot and the correlation function, used to analyze the phase difference between the plethysmograms of the right and left fingertips. The third is the R-index, from which we can estimate “the age of the blood flow”. The R-index is defined by the global character of the plethysmogram, which distinguishes it from the usual APG-index. The stress and age dependence of the plethysmogram is discussed using these methods.
Slow and fast solar wind - data selection and statistical analysis
NASA Astrophysics Data System (ADS)
Wawrzaszek, Anna; Macek, Wiesław M.; Bruno, Roberto; Echim, Marius
2014-05-01
In this work we consider the important problem of selecting slow and fast solar wind data measured in situ by the Ulysses spacecraft during two solar minima (1995-1997, 2007-2008) and a solar maximum (1999-2001). To recognise different types of solar wind we use the following set of parameters: radial velocity, proton density, proton temperature, the distribution of charge states of oxygen ions, and compressibility of the magnetic field. We show how this approach to data selection works on Ulysses data. In the next step we consider the chosen intervals of fast and slow solar wind and perform a statistical analysis of the fluctuating magnetic field components. In particular, we check whether the inertial range can be identified by considering the scale dependence of the third- and fourth-order scaling exponents of the structure function. We try to verify how the size of the inertial range depends on heliographic latitude, heliocentric distance and the phase of the solar cycle. Research supported by the European Community's Seventh Framework Programme (FP7/2007 - 2013) under grant agreement no. 313038/STORM.
A Statistical Aggregation Engine for Climatology and Trend Analysis
NASA Astrophysics Data System (ADS)
Chapman, D. R.; Simon, T. A.; Halem, M.
2014-12-01
Fundamental climate data records (FCDRs) from satellite instruments often span tens to hundreds of terabytes or even petabytes in scale. These large volumes make it difficult to aggregate or summarize their climatology and climate trends. It is especially cumbersome to supply the full derivation (provenance) of these aggregate calculations. We present a lightweight and resilient software platform, Gridderama, that simplifies the calculation of climatology by exploiting the "Data-Cube" topology often present in earth observing satellite records. By using the large array storage (LAS) paradigm, Gridderama allows the analyst to more easily produce a series of aggregate climate data products at progressively coarser spatial and temporal resolutions. Furthermore, provenance tracking and extensive visualization capabilities allow the analyst to track down and correct for data problems such as missing data and outliers that may impact the scientific results. We have developed and applied Gridderama to calculate a trend analysis of 55 terabytes of AIRS Level 1b infrared radiances, and show statistically significant trending in the greenhouse gas absorption bands as observed by AIRS over the 2003-2012 decade. We will extend this calculation to show regional changes in CO2 concentration from AIRS over the 2003-2012 decade by using a neural network retrieval algorithm.
Statistical analysis of the seasonal variation in the twinning rate.
Fellman, J; Eriksson, A W
1999-03-01
There have been few secular analyses of the seasonal variation in human twinning and the results are conflicting. One reason for this is that the seasonal pattern of twinning varies in different populations and at different periods. Another reason is that the statistical methods used are different. The changing pattern of seasonal variation in twinning rates and total maternities in Denmark was traced for three periods (1855-69, 1870-94, and 1937-84). Two alternative methods of analysis are considered. The method of Walter and Elwood and a trigonometric regression model give closely similar results. The seasonal distribution of twin maternities for the periods in the 19th century showed highly significant departures. For both twin and general maternities, the main peaks can be seen from March to June and a local peak in September. During the spring-summer season the twinning rates were higher than the total birth rates, indicating a stronger seasonal variation for the twin maternities than for the general maternities. For 1937-84, there was a similar, but less accentuated, pattern. Studies of other populations are compared with the Danish results. The more accentuated seasonal variation of twinning in the past indicates that some factors in the past affected women during summer-autumn and around Christmas time, making them more fecund and, in particular, more prone to polyovulation and/or more able to complete a gestation with multiple embryos.
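A trigonometric regression of the kind mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' model: it assumes a single annual harmonic fitted to twelve equally weighted monthly rates, and the function name `fit_annual_harmonic` and the synthetic rates are hypothetical.

```python
import math

def fit_annual_harmonic(monthly_rates):
    """Least-squares fit of rate_m = a + b*cos(w*m) + c*sin(w*m), w = 2*pi/12.

    Assumes exactly 12 equally spaced monthly values (m = 0 for January),
    in which case the regressors are orthogonal and the fit is closed-form.
    """
    if len(monthly_rates) != 12:
        raise ValueError("expected 12 monthly values")
    w = 2.0 * math.pi / 12.0
    a = sum(monthly_rates) / 12.0
    b = sum(y * math.cos(w * m) for m, y in enumerate(monthly_rates)) / 6.0
    c = sum(y * math.sin(w * m) for m, y in enumerate(monthly_rates)) / 6.0
    amplitude = math.hypot(b, c)
    peak_month = (math.atan2(c, b) / w) % 12.0  # 0 = January
    return a, amplitude, peak_month

# Synthetic (made-up) twinning rates per 1000 maternities, peaking in April (m = 3).
rates = [14.0 + 0.5 * math.cos(2.0 * math.pi * (m - 3) / 12.0) for m in range(12)]
mean_rate, amplitude, peak_month = fit_annual_harmonic(rates)
```

The ratio of the fitted amplitude to the mean rate gives a simple measure of how accentuated the seasonal variation is; real monthly data would also need adjustment for unequal month lengths, which this sketch ignores.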
NASA Astrophysics Data System (ADS)
Allen, Kirk
The Statistics Concept Inventory (SCI) is a multiple choice test designed to assess students' conceptual understanding of topics typically encountered in an introductory statistics course. This dissertation documents the development of the SCI from Fall 2002 up to Spring 2006. The first phase of the project essentially sought to answer the question: "Can you write a test to assess topics typically encountered in introductory statistics?" Book One presents the results utilized in answering this question in the affirmative. The bulk of the results present the development and evolution of the items, primarily relying on objective metrics to gauge effectiveness but also incorporating student feedback. The second phase boils down to: "Now that you have the test, what else can you do with it?" This includes an exploration of Cronbach's alpha, the most commonly-used measure of test reliability in the literature. An online version of the SCI was designed, and its equivalency to the paper version is assessed. Adding an extra wrinkle to the online SCI, subjects rated their answer confidence. These results show a general positive trend between confidence and correct responses. However, some items buck this trend, revealing potential sources of misunderstandings, with comparisons offered to the extant statistics and probability educational research. The third phase is a re-assessment of the SCI: "Are you sure?" A factor analytic study favored a uni-dimensional structure for the SCI, although maintaining the likelihood of a deeper structure if more items can be written to tap similar topics. A shortened version of the instrument is proposed, demonstrated to be able to maintain a reliability nearly identical to that of the full instrument. Incorporating student feedback and a faculty topics survey, improvements to the items and recommendations for further research are proposed. The state of the concept inventory movement is assessed, to offer a comparison to the work presented
Bayesian Analysis of Order-Statistics Models for Ranking Data.
ERIC Educational Resources Information Center
Yu, Philip L. H.
2000-01-01
Studied the order-statistics models, extending the usual normal order-statistics model into one in which the underlying random variables followed a multivariate normal distribution. Used a Bayesian approach and the Gibbs sampling technique. Applied the proposed method to analyze presidential election data from the American Psychological…
The Higher Education System in Israel: Statistical Abstract and Analysis.
ERIC Educational Resources Information Center
Herskovic, Shlomo
This edition of a statistical abstract published every few years on the higher education system in Israel presents the most recent data available through 1990-91. The data were gathered through the cooperation of the Central Bureau of Statistics and institutions of higher education. Chapter 1 presents a summary of principal findings covering the…
A statistical analysis of icing prediction in complex terrains
NASA Astrophysics Data System (ADS)
Terborg, Amanda M.
The issue of icing has been around for decades in the aviation industry, and while notable improvements have been made in the study of the formation and process of icing, the prediction of icing events is a challenge that has yet to be completely overcome. Low-level icing prediction, particularly in complex terrain, has been bumped to the back burner in an attempt to perfect the models created for in-flight icing. However, over the years there have been a number of different, non-model methods used to better refine the variables involved in low-level icing prediction. One of those methods comes through statistical analysis and modeling, particularly through the use of the Classification and Regression Tree (CART) techniques. These techniques examine the statistical significance of each predictor within a data set to determine various decision rules. Those rules in which the overall misclassification error is the smallest are then used to construct a decision tree and can be used to create a forecast for icing events. Using adiabatically adjusted Rapid Update Cycle (RUC) interpolated sounding data, these CART techniques are used in this study to examine icing events in the White Mountains of New Hampshire, specifically on the summit of Mount Washington. The Mount Washington Observatory (MWO), which sits on the summit and is manned year-round by weather observers, is no stranger to icing occurrences. In fact, the summit sees icing events from October all the way until April, and occasionally even into May. In this study, these events are examined in detail for the October 2010 to April 2011 season, and five CART models are generated for icing in general, rime icing, and glaze icing in an attempt to create a decision tree or trees with a high predictive accuracy. Also examined in this study for the October 2010 to April 2011 icing season is the Air Weather Service Pamphlet (AWSP) algorithm, a decision tree model currently in use by the Air Force to predict icing events. Producing
Using the statistical analysis method to assess the landslide susceptibility
NASA Astrophysics Data System (ADS)
Chan, Hsun-Chuan; Chen, Bo-An; Wen, Yo-Ting
2015-04-01
This study assessed landslide susceptibility in the upstream watershed of the Jing-Shan River, central Taiwan. The landslide inventories for typhoons Toraji in 2001, Mindulle in 2004, Kalmaegi and Sinlaku in 2008, and Morakot in 2009, and for the 0719 rainfall event in 2011, established by the Taiwan Central Geological Survey, were used as landslide data. Landslide susceptibility was assessed using three statistical methods: logistic regression, the instability index method, and the support vector machine (SVM). After evaluation, elevation, slope, slope aspect, lithology, terrain roughness, slope roughness, plan curvature, profile curvature, total curvature, and average rainfall were chosen as the landslide factors. The validity of the three models was further examined using the receiver operating characteristic (ROC) curve. The logistic regression results showed that terrain roughness and slope roughness had the stronger impact on the susceptibility value. The instability index method showed that terrain roughness and lithology had the stronger impact on the susceptibility value. However, the instability index method may underestimate susceptibility near the riverside. In addition, the results indicated a potential issue with the number of classes used for each factor in the instability index method: increasing the number of classes may cause an excessive coefficient of variation of the factor, whereas decreasing it may cause a large range of nearby cells to be classified into the same susceptibility level. Finally, the three models were compared using the ROC curve. SVM was preferable to the other methods for assessing landslide susceptibility. Moreover, SVM is further suggested to perform similarly to logistic regression in recognizing the medium-high and high susceptibility levels.
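Comparing susceptibility models by their receiver operating characteristic curves usually comes down to comparing the area under each curve (AUC). The sketch below is an illustration under stated assumptions, not the study's procedure: it computes the AUC in its rank (Mann-Whitney) form, and the function name `roc_auc`, the two models, and all scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    scores: susceptibility values per cell; labels: 1 = landslide, 0 = stable.
    Ties between a positive and a negative count as half a correct ordering.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one landslide and one stable cell")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up susceptibility scores from two hypothetical models on six cells.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
model_b = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
auc_a = roc_auc(model_a, labels)
auc_b = roc_auc(model_b, labels)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of landslide and stable cells, so the model with the larger value discriminates better on the given inventory. The pairwise loop is quadratic in the number of cells; a rank-based implementation would be preferable for full raster data.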
Combined statistical analysis of landslide release and propagation
NASA Astrophysics Data System (ADS)
Mergili, Martin; Rohmaneo, Mohammad; Chu, Hone-Jay
2016-04-01
Statistical methods - often coupled with stochastic concepts - are commonly employed to relate areas affected by landslides with environmental layers, and to estimate spatial landslide probabilities by applying these relationships. However, such methods only concern the release of landslides, disregarding their motion. Conceptual models for mass flow routing are used for estimating landslide travel distances and possible impact areas. Automated approaches combining release and impact probabilities are rare. The present work attempts to fill this gap by a fully automated procedure combining statistical and stochastic elements, building on the open source GRASS GIS software: (1) The landslide inventory is subset into release and deposition zones. (2) We employ a traditional statistical approach to estimate the spatial release probability of landslides. (3) We back-calculate the probability distribution of the angle of reach of the observed landslides, employing the software tool r.randomwalk. One set of random walks is routed downslope from each pixel defined as release area. Each random walk stops when leaving the observed impact area of the landslide. (4) The cumulative probability function (cdf) derived in (3) is used as input to route a set of random walks downslope from each pixel in the study area through the DEM, assigning the probability gained from the cdf to each pixel along the path (impact probability). The impact probability of a pixel is defined as the average impact probability of all sets of random walks impacting a pixel. Further, the average release probabilities of the release pixels of all sets of random walks impacting a given pixel are stored along with the area of the possible release zone. (5) We compute the zonal release probability by increasing the release probability according to the size of the release zone - the larger the zone, the larger the probability that a landslide will originate from at least one pixel within this zone. We
Parallelization of the Physical-Space Statistical Analysis System (PSAS)
NASA Technical Reports Server (NTRS)
Larson, J. W.; Guo, J.; Lyster, P. M.
1999-01-01
Atmospheric data assimilation is a method of combining observations with model forecasts to produce a more accurate description of the atmosphere than the observations or forecast alone can provide. Data assimilation plays an increasingly important role in the study of climate and atmospheric chemistry. The NASA Data Assimilation Office (DAO) has developed the Goddard Earth Observing System Data Assimilation System (GEOS DAS) to create assimilated datasets. The core computational components of the GEOS DAS include the GEOS General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). The need for timely validation of scientific enhancements to the data assimilation system poses computational demands that are best met by distributed parallel software. PSAS is implemented in Fortran 90 using object-based design principles. The analysis portions of the code solve two equations. The first of these is the "innovation" equation, which is solved on the unstructured observation grid using a preconditioned conjugate gradient (CG) method. The "analysis" equation is a transformation from the observation grid back to a structured grid, and is solved by a direct matrix-vector multiplication. Use of a factored-operator formulation reduces the computational complexity of both the CG solver and the matrix-vector multiplication, rendering the matrix-vector multiplications as a successive product of operators on a vector. Sparsity is introduced to these operators by partitioning the observations using an icosahedral decomposition scheme. PSAS builds a large (approx. 128MB) run-time database of parameters used in the calculation of these operators. Implementing a message passing parallel computing paradigm into an existing yet developing computational system as complex as PSAS is nontrivial. One of the technical challenges is balancing the requirements for computational reproducibility with the need for high performance. The problem of computational
Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).
Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W
2016-07-20
Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance for both continuous and binary outcomes, exceeding that of their competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022
Statistical analysis of simple repeats in the human genome
NASA Astrophysics Data System (ADS)
Piazza, F.; Liò, P.
2005-03-01
The human genome contains repetitive DNA at different levels of sequence length, number and dispersion. Highly repetitive DNA is particularly rich in homo- and di-nucleotide repeats, while middle repetitive DNA is rich in families of interspersed, mobile elements hundreds of base pairs (bp) long, among which are the Alu families. A link between homo- and di-polymeric tracts and mobile elements has been recently highlighted. In particular, the mobility of Alu repeats, which form 10% of the human genome, has been correlated with the length of poly(A) tracts located at one end of the Alu. These tracts have a rigid and non-bendable structure and have an inhibitory effect on nucleosomes, which normally compact the DNA. We performed a statistical analysis of the genome-wide distribution of lengths and inter-tract separations of poly(X) and poly(XY) tracts in the human genome. Our study shows that in humans the length distributions of these sequences reflect the dynamics of their expansion and DNA replication. By means of general tools from linguistics, we show that the latter play the role of highly-significant content-bearing terms in the DNA text. Furthermore, we find that such tracts are positioned in a non-random fashion, with an apparent periodicity of 150 bases. This allows us to extend the link between repetitive, highly mobile elements such as Alus and low-complexity words in human DNA. More precisely, we show that Alus are sources of poly(X) tracts, which in turn affect in a subtle way the combination and diversification of gene expression and the fixation of multigene families.
SUBMILLIMETER NUMBER COUNTS FROM STATISTICAL ANALYSIS OF BLAST MAPS
Patanchon, Guillaume; Ade, Peter A. R.; Griffin, Matthew; Hargrave, Peter C.; Mauskopf, Philip; Moncelsi, Lorenzo; Pascale, Enzo; Bock, James J.; Chapin, Edward L.; Halpern, Mark; Marsden, Gaelen; Scott, Douglas; Devlin, Mark J.; Dicker, Simon R.; Klein, Jeff; Rex, Marie; Gundersen, Joshua O.; Hughes, David H.; Netterfield, Calvin B.; Olmi, Luca
2009-12-20
We describe the application of a statistical method to estimate submillimeter galaxy number counts from confusion-limited observations by the Balloon-borne Large Aperture Submillimeter Telescope (BLAST). Our method is based on a maximum likelihood fit to the pixel histogram, sometimes called 'P(D)', an approach which has been used before to probe faint counts, the difference being that here we advocate its use even for sources with relatively high signal-to-noise ratios. This method has an advantage over standard techniques of source extraction in providing an unbiased estimate of the counts from the bright end down to flux densities well below the confusion limit. We specifically analyze BLAST observations of a roughly 10 deg² map centered on the Great Observatories Origins Deep Survey South field. We provide estimates of number counts at the three BLAST wavelengths 250, 350, and 500 μm; instead of counting sources in flux bins we estimate the counts at several flux density nodes connected with power laws. We observe a generally very steep slope for the counts of about -3.7 at 250 μm, and -4.5 at 350 and 500 μm, over the range of approximately 0.02-0.5 Jy, breaking to a shallower slope below about 0.015 Jy at all three wavelengths. We also describe how to estimate the uncertainties and correlations in this method so that the results can be used for model-fitting. This method should be well suited for analysis of data from the Herschel satellite.
Statistical Design, Models and Analysis for the Job Change Framework.
ERIC Educational Resources Information Center
Gleser, Leon Jay
1990-01-01
Proposes statistical methodology for testing Loughead and Black's "job change thermostat." Discusses choice of target population; relationship between job satisfaction and values, perceptions, and opportunities; and determinants of job change. (SK)
NASA Technical Reports Server (NTRS)
Djorgovski, Stanislav
1992-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
NASA Technical Reports Server (NTRS)
Djorgovski, George
1993-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
Analysis of statistical model properties from discrete nuclear structure data
NASA Astrophysics Data System (ADS)
Firestone, Richard B.
2012-02-01
Experimental M1, E1, and E2 photon strengths have been compiled from experimental data in the Evaluated Nuclear Structure Data File (ENSDF) and the Evaluated Gamma-ray Activation File (EGAF). Over 20,000 Weisskopf reduced transition probabilities were recovered from the ENSDF and EGAF databases. These transition strengths have been analyzed for their dependence on transition energies, initial and final level energies, spin/parity dependence, and nuclear deformation. ENSDF BE1W values were found to increase exponentially with energy, possibly consistent with the Axel-Brink hypothesis, although considerable excess strength was observed for transitions between 4-8 MeV. No similar energy dependence was observed in EGAF or ARC data. BM1W average values were nearly constant at all energies above 1 MeV, with substantial excess strength below 1 MeV and between 4-8 MeV. BE2W values decreased exponentially by a factor of 1000 from 0 to 16 MeV. The distribution of ENSDF transition probabilities for all multipolarities could be described by a lognormal statistical distribution. BE1W, BM1W, and BE2W strengths all increased substantially for initial transition level energies between 4-8 MeV, possibly due to dominance of spin-flip and Pygmy resonance transitions at those excitations. Analysis of the average resonance capture data indicated no transition probability dependence on final level spins or energies between 0-3 MeV. The comparison of favored to unfavored transition probabilities for odd-A or odd-Z targets indicated only partial support for the expected branching intensity ratios, with many unfavored transitions having nearly the same strength as favored ones. Average resonance capture BE2W transition strengths generally increased with greater deformation. Analysis of ARC data suggests that there is a large E2 admixture in M1 transitions with the mixing ratio δ ≈ 1.0. The ENSDF reduced transition strengths were considerably stronger than those derived from capture gamma ray
Statistical analysis of synaptic transmission: model discrimination and confidence limits.
Stricker, C; Redman, S; Daley, D
1994-01-01
Procedures for discriminating between competing statistical models of synaptic transmission, and for providing confidence limits on the parameters of these models, have been developed. These procedures were tested against simulated data and were used to analyze the fluctuations in synaptic currents evoked in hippocampal neurones. All models were fitted to data using the Expectation-Maximization algorithm and a maximum likelihood criterion. Competing models were evaluated using the log-likelihood ratio (Wilks statistic). When the competing models were not nested, Monte Carlo sampling of the model used as the null hypothesis (H0) provided density functions against which H0 and the alternate model (H1) were tested. The statistic for the log-likelihood ratio was determined from the fit of H0 and H1 to these probability densities. This statistic was used to determine the significance level at which H0 could be rejected for the original data. When the competing models were nested, log-likelihood ratios and the χ² statistic were used to determine the confidence level for rejection. Once the model that provided the best statistical fit to the data was identified, many estimates for the model parameters were calculated by resampling the original data. Bootstrap techniques were then used to obtain the confidence limits of these parameters. PMID:7948672
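The resampling step described at the end of this abstract can be illustrated with a percentile bootstrap. This is a minimal sketch under stated assumptions, not the authors' procedure: the sample mean stands in for the refitted model parameters, the percentile method is one of several bootstrap interval constructions, and the function name `bootstrap_ci` and the amplitude values are hypothetical.

```python
import random
import statistics

def bootstrap_ci(data, estimator, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence limits for estimator(data)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement and re-estimate each time.
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1.0 - tail) * n_resamples))]
    return lower, upper

# Made-up synaptic current amplitudes (pA), for illustration only.
amplitudes = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.8, 4.9, 5.2]
low, high = bootstrap_ci(amplitudes, statistics.mean)
```

In a setting like the one in the abstract, `estimator` would be replaced by a function that refits the chosen model to each resampled data set and returns the parameter of interest.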
Precessing rotating flows with additional shear: Stability analysis
NASA Astrophysics Data System (ADS)
Salhi, A.; Cambon, C.
2009-03-01
We consider unbounded precessing rotating flows in which vertical or horizontal shear is induced by the interaction between the solid-body rotation (with angular velocity Ω0) and the additional “precessing” Coriolis force (with angular velocity -ɛΩ0), normal to it. A “weak” shear flow, with rate 2ɛ of the same order as the Poincaré “small” ratio ɛ, is needed for balancing the gyroscopic torque, so that the whole flow satisfies Euler’s equations in the precessing frame (the so-called admissibility conditions). The base flow case with vertical shear (its cross-gradient direction is aligned with the main angular velocity) corresponds to Mahalov’s [Phys. Fluids A 5, 891 (1993)] precessing infinite cylinder base flow (ignoring boundary conditions), while the base flow case with horizontal shear (its cross-gradient direction is normal to both main and precessing angular velocities) corresponds to the unbounded precessing rotating shear flow considered by Kerswell [Geophys. Astrophys. Fluid Dyn. 72, 107 (1993)]. We show that both these base flows satisfy the admissibility conditions and can support disturbances in terms of advected Fourier modes. Because the admissibility conditions cannot select one case with respect to the other, a more physical derivation is sought: both flows are deduced from Poincaré’s [Bull. Astron. 27, 321 (1910)] basic state of a precessing spheroidal container, in the limit of small ɛ. A rapid distortion theory (RDT) type of stability analysis is then performed for the previously mentioned disturbances, for both base flows. The stability analysis of the Kerswell base flow, using Floquet’s theory, is recovered, and its counterpart for the Mahalov base flow is presented. Typical growth rates are found to be the same for both flows at very small ɛ, but significant differences are obtained regarding growth rates and widths of instability bands, if larger ɛ values, up to 0.2, are considered. Finally, both flow cases
Dutra, Rosilene L; Cantos, Geny A; Carasek, Eduardo
2006-01-01
The quantification of target analytes in complex matrices requires special calibration approaches to compensate for additional capacity or activity in the matrix samples. Standard addition is one of the most important calibration procedures for quantification of analytes in such matrices. However, this technique requires large amounts of reagents and material, and it consumes a considerable amount of time throughout the analysis. In this work, a new calibration procedure to analyze biological samples is proposed. The proposed calibration, called the addition calibration technique, was used for the determination of zinc (Zn) in blood serum and erythrocyte samples. The results obtained were compared with those obtained using conventional calibration techniques (standard addition and standard calibration). The proposed addition calibration was validated by recovery tests using blood samples spiked with Zn. The recovery ranges for blood serum and erythrocyte samples were 90-132% and 76-112%, respectively. Statistical comparisons of the results obtained by the addition technique and the conventional techniques, using a paired two-tailed Student's t-test and linear regression, demonstrated good agreement among them. PMID:16943611
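For readers unfamiliar with the conventional standard addition method that this abstract uses as a baseline, the following sketch shows the usual calculation. It illustrates the conventional technique only, not the proposed addition calibration technique, whose details the abstract does not give. It assumes a linear response and negligible dilution on spiking; the function name `standard_addition` and all data are hypothetical.

```python
def standard_addition(added, signals):
    """Estimate the analyte concentration of the unspiked sample.

    Fits signal = slope * added + intercept by ordinary least squares and
    returns intercept / slope, the magnitude of the x-intercept.
    """
    n = len(added)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two matching (added, signal) pairs")
    mean_x = sum(added) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in added)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(added, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / slope

# Made-up data: Zn added (mg/L) and instrument response (arbitrary units).
added_zn = [0.0, 0.5, 1.0, 1.5, 2.0]
response = [0.16, 0.26, 0.36, 0.46, 0.56]
zn_in_sample = standard_addition(added_zn, response)
```

Because every sample needs its own series of spiked measurements, the cost in reagents and time grows with the number of samples, which is the drawback the abstract points to.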
SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit
Chu, Annie; Cui, Jenny; Dinov, Ivo D.
2011-01-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most
Statistical Analysis of CMC Constituent and Processing Data
NASA Technical Reports Server (NTRS)
Fornuff, Jonathan
2004-01-01
observed using statistical analysis software. The ultimate purpose of this study is to determine what variations in material processing can lead to the most critical changes in the materials property. The work I have taken part in this summer explores, in general, the key properties needed In this study SiC/SiC composites of varying architectures, utilizing a boron-nitride (BN)
Statistical analysis of large-scale neuronal recording data
Reed, Jamie L.; Kaas, Jon H.
2010-01-01
Relating stimulus properties to the response properties of individual neurons and neuronal networks is a major goal of sensory research. Many investigators implant electrode arrays in multiple brain areas and record from chronically implanted electrodes over time to answer a variety of questions. Technical challenges related to analyzing large-scale neuronal recording data are not trivial. Several analysis methods traditionally used by neurophysiologists do not account for dependencies in the data that are inherent in multi-electrode recordings. In addition, when neurophysiological data are not best modeled by the normal distribution and when the variables of interest may not be linearly related, extensions of the linear modeling techniques are recommended. A variety of methods exist to analyze correlated data, even when data are not normally distributed and the relationships are nonlinear. Here we review expansions of the Generalized Linear Model designed to address these data properties. Such methods are used in other research fields, and the application to large-scale neuronal recording data will enable investigators to determine the variable properties that convincingly contribute to the variances in the observed neuronal measures. Standard measures of neuron properties such as response magnitudes can be analyzed using these methods, and measures of neuronal network activity such as spike timing correlations can be analyzed as well. We have done just that in recordings from 100-electrode arrays implanted in the primary somatosensory cortex of owl monkeys. Here we illustrate how one example method, Generalized Estimating Equations analysis, is a useful method to apply to large-scale neuronal recordings. PMID:20472395
The sensitivity analysis of the economic and economic statistical designs of the synthetic X¯ chart
NASA Astrophysics Data System (ADS)
Yeong, Wai Chung; Khoo, Michael Boon Chong; Chong, Jia Kit; Lim, Shun Jinn; Teoh, Wei Lin
2014-12-01
The economic and economic statistical designs allow the practitioner to implement the control chart in an economically optimal manner. For the economic design, the optimal chart parameters are obtained to minimize the cost, while for the economic statistical design, additional constraints in terms of the average run length are imposed. However, these designs involve the estimation of quite a number of input parameters. Some of these input parameters are difficult to estimate accurately. Thus, a sensitivity analysis is required in order to identify which parameters need to be estimated accurately, and which require only a rough estimate. This study focuses on the significance of 11 input parameters toward the optimal cost and average run lengths of the synthetic X¯ chart. The significant input parameters are identified through a two-level fractional factorial design, which allows interaction effects to be identified. An analysis of variance is performed to obtain the P-values by using the Minitab software. The significant input parameters and interactions on the optimal cost and average run lengths are identified based on a 5% significance level. The results of this study show that the input parameters which are significant towards the economic design may not be significant for the economic statistical design, and vice versa. This study also shows that there are quite a number of significant interaction effects, which may mask the significance of the main effects.
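The idea of a two-level fractional factorial design can be sketched as below. This is an illustrative sketch only: the abstract does not state which fraction or generators were used for the 11 parameters, and the study obtained P-values from an analysis of variance in Minitab, which this sketch does not reproduce. The function names and generators are assumptions.

```python
from itertools import product


def fractional_factorial(n_base, generators):
    """Two-level design: full factorial in n_base factors plus generated columns.

    Each generator is a tuple of base-factor indices; the product of those
    base columns defines one additional factor column.
    """
    runs = []
    for levels in product((-1, 1), repeat=n_base):
        row = list(levels)
        for gen in generators:
            value = 1
            for idx in gen:
                value *= levels[idx]
            row.append(value)
        runs.append(row)
    return runs


def main_effects(design, response):
    """Mean response at the high level minus mean response at the low level."""
    effects = []
    for j in range(len(design[0])):
        high = [y for row, y in zip(design, response) if row[j] == 1]
        low = [y for row, y in zip(design, response) if row[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects
```

For example, fractional_factorial(3, [(0, 1, 2)]) gives an 8-run design for four factors in which the fourth column is the product of the first three. Because that column is aliased with the three-factor interaction, an estimated effect may reflect either source.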
Hybrid Additive Manufacturing Technologies - An Analysis Regarding Potentials and Applications
NASA Astrophysics Data System (ADS)
Merklein, Marion; Junker, Daniel; Schaub, Adam; Neubauer, Franziska
With the industrial trend toward mass customization of lightweight construction, conventional manufacturing processes like forming technology and chipping production are pushed to their limits for economical manufacturing. More flexible processes are needed, and additive manufacturing technology has provided them. This toolless production principle offers high geometrical freedom and optimized material utilization. Thus, load-adjusted lightweight components can be produced economically in small lot sizes. To compensate for disadvantages such as inadequate accuracy and surface roughness, hybrid machines combining additive and subtractive manufacturing are being developed. This paper summarizes the principles of the most commonly used additive manufacturing processes for metals and the possibility of integrating them into a hybrid production machine. In particular, integrating deposition processes into a CNC milling center is shown to offer high potential for manufacturing larger parts with high accuracy. Furthermore, the combination of additive and subtractive manufacturing allows ready-to-use products to be produced within a single machine. Additionally, current research on integrating additive manufacturing processes into the production chain is analyzed. Given the long manufacturing time of additive production processes, combining them with conventional manufacturing processes such as sheet or bulk metal forming seems an effective solution. Large volumes in particular can be produced by conventional processes. In an additional production step, active elements can be applied by additive manufacturing. This principle is also investigated for tool production to reduce chipping of the high-strength material used for forming tools. The aim is to add active elements onto a geometrically simple base using Laser Metal Deposition. That process allows the utilization of several powder materials during one process what
Interfaces between statistical analysis packages and the ESRI geographic information system
NASA Technical Reports Server (NTRS)
Masuoka, E.
1980-01-01
Interfaces between ESRI's geographic information system (GIS) data files and real valued data files written to facilitate statistical analysis and display of spatially referenced multivariable data are described. An example of data analysis which utilized the GIS and the statistical analysis system is presented to illustrate the utility of combining the analytic capability of a statistical package with the data management and display features of the GIS.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns would be expected to be a standard line of attack for recognizing genes in DNA sequences and for similar problems. Surprisingly, however, very little significant work has been done in this direction. This paper studies statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
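The general approach of mapping a DNA sequence to numbers and examining its Fourier spectrum can be sketched as follows. The abstract does not say which mappings or frequencies the authors used. The indicator (one sequence per base) mapping and the focus on period 3, a periodicity commonly examined in coding regions, are assumptions made for illustration, as are the function names.

```python
import cmath


def indicator_sequence(dna, base):
    """Map a DNA string to 1.0 where the given base occurs and 0.0 elsewhere."""
    return [1.0 if ch == base else 0.0 for ch in dna.upper()]


def dft_power(signal, k):
    """Squared magnitude of the k-th discrete Fourier coefficient."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
    return abs(coeff) ** 2


def period3_power(dna):
    """Spectral power at frequency index N // 3, summed over the four bases."""
    k = len(dna) // 3
    return sum(dft_power(indicator_sequence(dna, base), k) for base in "ACGT")
```

A perfectly period-3 toy sequence such as "ATG" repeated 20 times concentrates its power at this frequency, whereas a constant sequence such as 60 copies of "A" has none there. Real sequences fall between these extremes, and the sequence length should be a multiple of 3 for the frequency index to correspond exactly to period 3.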
Re-analysis of survival data of cancer patients utilizing additive homeopathy.
Gleiss, Andreas; Frass, Michael; Gaertner, Katharina
2016-08-01
In this short communication we present a re-analysis of homeopathic patient data in comparison to control patient data from the same Outpatient's Unit "Homeopathy in malignant diseases" of the Medical University of Vienna. In this analysis we took account of a probable immortal time bias. For patients suffering from advanced stages of cancer and surviving the first 6 or 12 months after diagnosis, respectively, the results show that utilizing homeopathy gives a statistically significant (p<0.001) advantage over control patients regarding survival time. In conclusion, bearing in mind all limitations, the results of this retrospective study suggest that patients with advanced stages of cancer might benefit from additional homeopathic treatment until a survival time of up to 12 months after diagnosis. PMID:27515878
A multiple additive regression tree analysis of three exposure measures during Hurricane Katrina.
Curtis, Andrew; Li, Bin; Marx, Brian D; Mills, Jacqueline W; Pine, John
2011-01-01
This paper analyses structural and personal exposure to Hurricane Katrina. Structural exposure is measured by flood height and building damage; personal exposure is measured by the locations of 911 calls made during the response. Using these variables, this paper characterises the geography of exposure and also demonstrates the utility of a robust analytical approach in understanding health-related challenges to disadvantaged populations during recovery. Analysis is conducted using a contemporary statistical approach, a multiple additive regression tree (MART), which displays considerable improvement over traditional regression analysis. By using MART, the percentage of improvement in R-squares over standard multiple linear regression ranges from about 62 to more than 100 per cent. The most revealing finding is the modelled verification that African Americans experienced disproportionate exposure in both structural and personal contexts. Given the impact of exposure to health outcomes, this finding has implications for understanding the long-term health challenges facing this population.
Wang, Youping; Sonntag, Karin; Rudloff, Eicke; Wehling, Peter; Snowdon, Rod J
2006-02-01
Two Brassica napus-Crambe abyssinica monosomic addition lines (2n=39, AACC plus a single chromosome from C. abyssinica) were obtained from the F(2) progeny of the asymmetric somatic hybrid. The alien chromosome from C. abyssinica in the addition line was clearly distinguished by genomic in situ hybridization (GISH). Twenty-seven microspore-derived plants from the addition lines were obtained. Fourteen seedlings were determined to be diploid plants (2n=38) arising from spontaneous chromosome doubling, while 13 seedlings were confirmed as haploid plants. Doubled haploid plants produced after treatment with colchicine and two disomic chromosome addition lines (2n=40, AACC plus a single pair of homologous chromosomes from C. abyssinica) could again be identified by GISH analysis. The lines are potentially useful for molecular genetic analysis of novel C. abyssinica genes or alleles contributing to traits relevant for oilseed rape (B. napus) breeding.
Estimating Reliability of Disturbances in Satellite Time Series Data Based on Statistical Analysis
NASA Astrophysics Data System (ADS)
Zhou, Z.-G.; Tang, P.; Zhou, M.
2016-06-01
Normally, the status of land cover is inherently dynamic and changes continuously over time. However, disturbances or abnormal changes of land cover, caused by events such as forest fire, flood, deforestation, and plant diseases, occur worldwide at unknown times and locations. Timely detection and characterization of these disturbances is important for land cover monitoring. Recently, many time-series-analysis methods have been developed for near real-time or online disturbance detection using satellite image time series. However, most of the present methods label the detection results only as "Change/No change", while few methods focus on estimating the reliability (or confidence level) of the detected disturbances in image time series. To this end, this paper proposes a statistical analysis method for estimating the reliability of disturbances in newly available remote sensing image time series, through analysis of the full temporal information contained in the time series data. The method consists of three main steps. (1) Segmenting and modelling of historical time series data based on Breaks for Additive Seasonal and Trend (BFAST). (2) Forecasting and detecting disturbances in new time series data. (3) Estimating reliability of each detected disturbance using statistical analysis based on Confidence Interval (CI) and Confidence Levels (CL). The method was validated by estimating the reliability of disturbance regions caused by a recent severe flood that occurred around the border of Russia and China. Results demonstrated that the method can estimate the reliability of disturbances detected in satellite images with an estimation error of less than 5% and an overall accuracy of up to 90%.
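Step (3) above can be illustrated with a deliberately simplified sketch. It assumes that a model of the historical series is already available (the BFAST fit itself is not implemented here) and that forecast errors are approximately normal. Under those assumptions, the confidence level at which the interval around the forecast just reaches a new observation can serve as a reliability score. The function names and the normality assumption are illustrative and are not taken from the paper.

```python
import math


def residual_std(observed, fitted):
    """Sample standard deviation of the historical model residuals."""
    residuals = [o - f for o, f in zip(observed, fitted)]
    n = len(residuals)
    mean = sum(residuals) / n
    return math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))


def disturbance_confidence(new_value, forecast, sigma):
    """Two-sided confidence level whose interval boundary equals new_value.

    Values near 0 mean the observation agrees with the forecast; values
    near 1 mean it lies far outside the expected range (normal errors assumed).
    """
    z = abs(new_value - forecast) / sigma
    return math.erf(z / math.sqrt(2.0))
```

Under these assumptions, an observation 1.96 residual standard deviations away from its forecast receives a confidence level of about 0.95.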
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
NASA Technical Reports Server (NTRS)
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
Statistical Power Analysis in Education Research. NCSER 2010-3006
ERIC Educational Resources Information Center
Hedges, Larry V.; Rhoads, Christopher
2010-01-01
This paper provides a guide to calculating statistical power for the complex multilevel designs that are used in most field studies in education research. For multilevel evaluation studies in the field of education, it is important to account for the impact of clustering on the standard errors of estimates of treatment effects. Using ideas from…
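The effect of clustering on standard errors is often summarized by the textbook design effect, 1 + (n - 1) * ICC, for clusters of equal size n and intraclass correlation ICC. The sketch below shows only this general formula. It is not claimed to be the specific procedure of the report, whose abstract is truncated, and the function names are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation of a mean under equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(total_n, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return total_n / design_effect(cluster_size, icc)


def se_inflation(cluster_size, icc):
    """Factor by which clustering multiplies the naive standard error."""
    return math.sqrt(design_effect(cluster_size, icc))
```

With hypothetical clusters of 21 students and an intraclass correlation of 0.2, the design effect is 5, so 2,100 students carry roughly the information of 420 independent observations.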
Did Tanzania Achieve the Second Millennium Development Goal? Statistical Analysis
ERIC Educational Resources Information Center
Magoti, Edwin
2016-01-01
Development Goal "Achieve universal primary education", the challenges faced, along with the way forward towards achieving the fourth Sustainable Development Goal "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all". Statistics show that Tanzania has made very promising steps…
Statistical Analysis Tools for Learning in Engineering Laboratories.
ERIC Educational Resources Information Center
Maher, Carolyn A.
1990-01-01
Described are engineering programs that have used automated data acquisition systems to implement data collection and analyze experiments. Applications include a biochemical engineering laboratory, heat transfer performance, engineering materials testing, mechanical system reliability, statistical control laboratory, thermo-fluid laboratory, and a…
Private School Universe Survey, 1991-92. Statistical Analysis Report.
ERIC Educational Resources Information Center
Broughman, Stephen; And Others
This report on the private school universe, a data collection system developed by the National Center for Education Statistics, presents data on schools with grades kindergarten through 12 by school size, school level, religious orientation, geographical region, and program emphasis. Numbers of students and teachers are reported in the same…
The PRIME System: Computer Programs for Statistical Analysis.
ERIC Educational Resources Information Center
Veldman, Donald J.
PRIME is a library of 44 batch-oriented computer routines: 20 major package programs, which use 12 statistical utility routines, and 12 other utility routines for input/output and data manipulation. This manual contains a general description of data preparation and coding, standard control cards, input deck arrangement, standard options, and…
Fixed-ratio ray designs have been used for detecting and characterizing interactions of large numbers of chemicals in combination. Single chemical dose-response data are used to predict an “additivity curve” along an environmentally relevant ray. A “mixture curve” is estimated fr...
Trend Analysis of Tropical Ozone From the Southern Hemisphere Additional Ozonesondes (SHADOZ) Data
NASA Astrophysics Data System (ADS)
Morioka, H.; Fujiwara, M.; Shiotani, M.; Thompson, A. M.; Witte, J. C.; Oltmans, S. J.
2007-12-01
Linear trends of ozone for 1998-2007 are estimated for the troposphere through the lower stratosphere at ten tropical ozonesonde stations participating in the Southern Hemisphere Additional Ozonesondes (SHADOZ) project. Most stations cover the period from early 1998 to the end of 2006, but some stations have a shorter or longer record. Soundings are made one to four times per month, varying by station and year, but cover essentially all seasons. The total number of soundings ranges from 102 for Malindi to 429 for Ascension Island. Trends are calculated for vertically averaged values in each 1-km bin from 0-1 km to 30-31 km, and expressed as percent per year. A statistical test is also performed. Around the tropopause, between 15 and 20 km, negative trends are seen for most stations. At San Cristobal (in the eastern Pacific) at 16-17 km, the trend is -4.3 ± 3.0 percent per year, and at Watukosek (in Indonesia) at 17-18 km, it is -4.8 ± 3.9 percent per year, both statistically significant. However, at Ascension (in the Atlantic) and at Natal (in South America), the tropopause trend is near zero and not statistically significant. At Natal at 12-13 km, the trend is +3.7 ± 3.0 percent per year, and at Malindi (in Africa) at 11-12 km, it is +5.0 ± 4.6 percent per year, both statistically significant. Generally in the free troposphere, positive trends are seen, but they are not statistically significant for most regions. In the planetary boundary layer, statistically significant positive trends are seen at Kuala Lumpur (in Southeast Asia) and at Fiji (in the southwestern Pacific), and a statistically significant negative trend is seen at Paramaribo (in South America). Trend analysis is also performed for the four seasons. Around the tropopause, seasonality in trend is small for all stations. In the upper troposphere, at Fiji and at Samoa, negative trends are seen in SON, but positive trends are seen in DJF.
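A linear trend expressed in percent per year, with an uncertainty, can be computed along the following lines. This is a generic least-squares sketch. The abstract does not state how the trends were normalized, what the ± values represent, or which statistical test was applied, so normalizing by the series mean and reporting the standard error of the slope are assumptions, as are the function names.

```python
import math


def linear_trend(times, values):
    """Least-squares slope and its standard error for values versus times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    slope = sum((t - mean_t) * (v - mean_v)
                for t, v in zip(times, values)) / stt
    intercept = mean_v - slope * mean_t
    sse = sum((v - (intercept + slope * t)) ** 2
              for t, v in zip(times, values))
    slope_se = math.sqrt(sse / (n - 2) / stt)
    return slope, slope_se


def percent_per_year(times_in_years, values):
    """Trend and standard error as a percentage of the series mean per year."""
    slope, slope_se = linear_trend(times_in_years, values)
    mean_v = sum(values) / len(values)
    return 100.0 * slope / mean_v, 100.0 * slope_se / mean_v
```

For irregularly spaced soundings, the times can be given as decimal years. Autocorrelation and seasonal cycles, which this sketch ignores, would widen the uncertainty in practice.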
Interactive statistical-distribution-analysis program utilizing numerical and graphical methods
Glandon, S. R.; Fields, D. E.
1982-04-01
The TERPED/P program is designed to facilitate the quantitative analysis of experimental data, determine the distribution function that best describes the data, and provide graphical representations of the data. This code differs from its predecessors, TEDPED and TERPED, in that a printer-plotter has been added for graphical output flexibility. The addition of the printer-plotter provides TERPED/P with a method of generating graphs that is not dependent on DISSPLA, Integrated Software Systems Corporation's confidential proprietary graphics package. This makes it possible to use TERPED/P on systems not equipped with DISSPLA. In addition, the printer plot is usually produced more rapidly than a high-resolution plot can be generated. Graphical and numerical tests are performed on the data in accordance with the user's assumption of normality or lognormality. Statistical analysis options include computation of the chi-squared statistic and its significance level and the Kolmogorov-Smirnov one-sample test confidence level for data sets of more than 80 points. Plots can be produced on a Calcomp paper plotter, a FR80 film plotter, or a graphics terminal using the high-resolution, DISSPLA-dependent plotter or on a character-type output device by the printer-plotter. The plots are of cumulative probability (abscissa) versus user-defined units (ordinate). The program was developed on a Digital Equipment Corporation (DEC) PDP-10 and consists of 1500 statements. The language used is FORTRAN-10, DEC's extended version of FORTRAN-IV.
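The Kolmogorov-Smirnov one-sample statistic mentioned above can be sketched in a few lines. This is a generic Python illustration, not a transcription of the FORTRAN-10 program. The function names are hypothetical, and the conversion of the statistic into a confidence level is omitted.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF of data and a reference CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_lognormal(data, mu_log, sigma_log):
    """Test lognormality by applying the normal test to log-transformed data."""
    logs = [math.log(x) for x in data]
    return ks_statistic(logs, lambda v: normal_cdf(v, mu_log, sigma_log))
```

A small statistic indicates that the assumed normal or lognormal distribution describes the data well. If the mean and standard deviation are estimated from the same data, the usual critical values no longer apply exactly.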
Mazumdar, Madhu; Banerjee, Samprit; Van Epps, Heather L
2010-01-01
A majority of original articles published in biomedical journals include some form of statistical analysis. Unfortunately, many of the articles contain errors in statistical design and/or analysis. These errors are worrisome, as the misuse of statistics jeopardizes the process of scientific discovery and the accumulation of scientific knowledge. To help avoid these errors and improve statistical reporting, four approaches are suggested: (1) development of guidelines for statistical reporting that could be adopted by all journals, (2) improvement in statistics curricula in biomedical research programs with an emphasis on hands-on teaching by biostatisticians, (3) expansion and enhancement of biomedical science curricula in statistics programs, and (4) increased participation of biostatisticians in the peer review process along with the adoption of more rigorous journal editorial policies regarding statistics. In this chapter, we provide an overview of these issues with emphasis on the field of molecular biology and highlight the need for continuing efforts on all fronts.
Research on the integrative strategy of spatial statistical analysis of GIS
NASA Astrophysics Data System (ADS)
Xie, Zhong; Han, Qi Juan; Wu, Liang
2008-12-01
Presently, spatial social and natural phenomena are studied with both GIS techniques and statistical methods. However, many complex practical applications restrict these research methods, and the data models and technologies used are highly application-specific. This paper first summarizes the requirements of spatial statistical analysis. On the basis of these requirements, universal spatial statistical models are transformed into function tools in a statistical GIS system. A pyramidal structure of three layers is proposed. It thus becomes feasible to combine the techniques of spatial data management, search and visualization in GIS with the data processing methods of statistical analysis. The result is an integrated statistical GIS environment for the management, analysis, application and decision support of spatial statistical information.
A Statistical Framework for the Functional Analysis of Metagenomes
Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.
2008-10-01
Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
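The Lander-Waterman theory referred to above relates the number of reads, the read length, and the genome length to the expected coverage. The sketch below shows only the textbook expectations (coverage c = N * L / G and covered fraction 1 - exp(-c)). It is not the authors' estimator for gene family frequencies, and the function names are hypothetical.

```python
import math


def coverage(num_reads, read_length, genome_length):
    """Expected sequencing depth c = N * L / G."""
    return num_reads * read_length / genome_length


def fraction_covered(num_reads, read_length, genome_length):
    """Expected fraction of bases covered by at least one read."""
    return 1.0 - math.exp(-coverage(num_reads, read_length, genome_length))


def expected_reads_on_gene(num_reads, gene_length, genome_length):
    """Expected reads starting within a gene, assuming uniform sampling."""
    return num_reads * gene_length / genome_length
```

For instance, 1,000 reads of 100 bases from a 100,000-base genome give a coverage of 1 and an expected covered fraction of about 63 per cent. At such low coverage, many genes are sampled by few or no reads.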
Statistical Analysis of CFD Solutions from the Drag Prediction Workshop
NASA Technical Reports Server (NTRS)
Hemsch, Michael J.
2002-01-01
A simple, graphical framework is presented for robust statistical evaluation of results obtained from N-Version testing of a series of RANS CFD codes. The solutions were obtained by a variety of code developers and users for the June 2001 Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration used for the computational tests is the DLR-F4 wing-body combination previously tested in several European wind tunnels and for which a previous N-Version test had been conducted. The statistical framework is used to evaluate code results for (1) a single cruise design point, (2) drag polars and (3) drag rise. The paper concludes with a discussion of the meaning of the results, especially with respect to predictability, validation, and reporting of solutions.
Statistical Methods for Rapid Aerothermal Analysis and Design Technology
NASA Technical Reports Server (NTRS)
Morgan, Carolyn; DePriest, Douglas; Thompson, Richard (Technical Monitor)
2002-01-01
The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to establish statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The research work was focused on establishing the suitable mathematical/statistical models for these purposes. It is anticipated that the resulting models can be incorporated into a software tool to provide rapid, variable-fidelity, aerothermal environments to predict heating along an arbitrary trajectory. This work will support development of an integrated design tool to perform automated thermal protection system (TPS) sizing and material selection.
Statistical analysis of motion contrast in optical coherence tomography angiography
NASA Astrophysics Data System (ADS)
Cheng, Yuxuan; Guo, Li; Pan, Cong; Lu, Tongtong; Hong, Tianyu; Ding, Zhihua; Li, Peng
2015-11-01
Optical coherence tomography angiography (Angio-OCT), mainly based on the temporal dynamics of OCT scattering signals, has found a range of potential applications in clinical and scientific research. Based on the model of random phasor sums, temporal statistics of the complex-valued OCT signals are mathematically described. Statistical distributions of the amplitude differential and complex differential Angio-OCT signals are derived. The theories are validated through the flow phantom and live animal experiments. Using the model developed, the origin of the motion contrast in Angio-OCT is mathematically explained, and the implications in the improvement of motion contrast are further discussed, including threshold determination and its residual classification error, averaging method, and scanning protocol. The proposed mathematical model of Angio-OCT signals can aid in the optimal design of the system and associated algorithms.
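The random phasor sum picture, and the difference between amplitude differential and complex differential signals, can be sketched as follows. This is a toy illustration under simple assumptions (unit-amplitude scatterers with uniformly distributed phases). The function names are hypothetical, and the paper's derived distributions are not reproduced.

```python
import cmath
import random


def random_phasor_sum(n_scatterers, rng):
    """Complex signal formed by unit phasors with uniform random phases."""
    total = sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi))
                for _ in range(n_scatterers))
    return total / n_scatterers ** 0.5


def amplitude_differential(a1, a2):
    """Motion contrast from the change in signal magnitude only."""
    return abs(abs(a1) - abs(a2))


def complex_differential(a1, a2):
    """Motion contrast from the change in the full complex signal."""
    return abs(a1 - a2)
```

A pure phase change between two repeated measurements leaves the amplitude differential at zero but produces a nonzero complex differential. By the triangle inequality, the complex differential is never smaller than the amplitude differential.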
Statistical analysis of modeling error in structural dynamic systems
NASA Technical Reports Server (NTRS)
Hasselman, T. K.; Chrostowski, J. D.
1990-01-01
The paper presents a generic statistical model of the (total) modeling error for conventional space structures in their launch configuration. Modeling error is defined as the difference between analytical prediction and experimental measurement. It is represented by the differences between predicted and measured real eigenvalues and eigenvectors. Comparisons are made between pre-test and post-test models. Total modeling error is then subdivided into measurement error, experimental error and 'pure' modeling error, and comparisons made between measurement error and total modeling error. The generic statistical model presented in this paper is based on the first four global (primary structure) modes of four different structures belonging to the generic category of Conventional Space Structures (specifically excluding large truss-type space structures). As such, it may be used to evaluate the uncertainty of predicted mode shapes and frequencies, sinusoidal response, or the transient response of other structures belonging to the same generic category.
Introduction to the statistical analysis of two-color microarray data.
Bremer, Martina; Himelblau, Edward; Madlung, Andreas
2010-01-01
Microarray experiments have become routine in the past few years in many fields of biology. Analysis of array hybridizations is often performed with the help of commercial software programs, which produce gene lists, graphs, and sometimes provide values for the statistical significance of the results. Exactly what is computed by many of the available programs is often not easy to reconstruct or may even be impossible to know for the end user. It is therefore not surprising that many biology students and some researchers using microarray data do not fully understand the nature of the underlying statistics used to arrive at the results. We have developed a module that we have used successfully in undergraduate biology and statistics education that allows students to get a better understanding of both the basic biological and statistical theory needed to comprehend primary microarray data. The module is intended for the undergraduate level but may be useful to anyone who is new to the field of microarray biology. Additional course material that was developed for classroom use can be found at http://www.polyploidy.org/ . In our undergraduate classrooms we encourage students to manipulate microarray data using Microsoft Excel to reinforce some of the concepts they learn. We have included instructions for some of these manipulations throughout this chapter (see the "Do this..." boxes). However, it should be noted that while Excel can effectively analyze our small sample data set, more specialized software would typically be used to analyze full microarray data sets. Nevertheless, we believe that manipulating a small data set with Excel can provide insights into the workings of more advanced analysis software. PMID:20652509
Computational and Statistical Analysis of Protein Mass Spectrometry Data
Noble, William Stafford; MacCoss, Michael J.
2012-01-01
High-throughput proteomics experiments involving tandem mass spectrometry produce large volumes of complex data that require sophisticated computational analyses. As such, the field offers many challenges for computational biologists. In this article, we briefly introduce some of the core computational and statistical problems in the field and then describe a variety of outstanding problems that readers of PLoS Computational Biology might be able to help solve. PMID:22291580
NASA Astrophysics Data System (ADS)
Daeid, N. Nic; Meier-Augenstein, W.; Kemp, H. F.
2012-04-01
The analysis of cotton fibres can be particularly challenging within a forensic science context where discrimination of one fibre from another is of importance. Normally cotton fibre analysis examines the morphological structure of the recovered material and compares this with that of a known fibre from a particular source of interest. However, the conventional microscopic and chemical analysis of fibres and any associated dyes is generally unsuccessful because of the similar morphology of the fibres. Analysis of the dyes which may have been applied to the cotton fibre can also be undertaken though this can be difficult and unproductive in terms of discriminating one fibre from another. In the study presented here we have explored the potential for Isotope Ratio Mass Spectrometry (IRMS) to be utilised as an additional tool for cotton fibre analysis in an attempt to reveal further discriminatory information. This work has concentrated on un-dyed cotton fibres of known origin in order to expose the potential of the analytical technique. We report the results of a pilot study aimed at testing the hypothesis that multi-element stable isotope analysis of cotton fibres in conjunction with multivariate statistical analysis of the resulting isotopic abundance data using well established chemometric techniques permits sample provenancing based on the determination of where the cotton was grown and as such will facilitate sample discrimination. To date there is no recorded literature of this type of application of IRMS to cotton samples, which may be of forensic science relevance.
Real-time areal precipitation determination from radar by means of statistical objective analysis
NASA Astrophysics Data System (ADS)
Gerstner, E.-M.; Heinemann, G.
2008-05-01
Precipitation measurement by radar allows for areal rainfall determination with a high spatial and temporal resolution. However, hydrological applications require an accuracy of the precipitation quantification which cannot be obtained by today's weather radar devices. The quality of the radar-derived precipitation can be significantly improved with the aid of ground measurements. In this paper, a complete processing pipeline for real-time radar precipitation determination using a modified statistical objective analysis method is presented. Several additional algorithms, such as a dynamical use of Z-R relationships, a bias correction and an advection correction scheme, are employed. The performance of the algorithms is tested for several case studies. For an error analysis, an eight-month data set of X-band radar scans and rain gauge precipitation measurements is used. We show a reduction in the radar-rain gauge RMS difference of up to 59% for the optimal combination of the different algorithms.
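As a rough illustration of two of the steps named in this abstract, the sketch below converts radar reflectivity to rain rate through a power-law Z-R relationship, Z = a R^b, and computes a simple gauge-to-radar bias factor. The abstract does not give its coefficients or its bias scheme: the Marshall-Palmer values a = 200, b = 1.6 and the mean-field ratio used here are assumptions, and the function names are hypothetical.

```python
import math


def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    """Invert Z = a * R**b for the rain rate R (mm/h).

    dbz is reflectivity in dBZ; a and b default to the Marshall-Palmer
    coefficients, which are an assumption, not the paper's values.
    """
    z = 10.0 ** (dbz / 10.0)  # linear reflectivity, mm^6 / m^3
    return (z / a) ** (1.0 / b)


def mean_field_bias(gauge_mm, radar_mm):
    """Ratio of summed gauge totals to summed radar totals.

    Multiplying radar estimates by this factor is one simple form of
    bias correction (assumed here for illustration only).
    """
    return sum(gauge_mm) / sum(radar_mm)


# Usage: correct a radar-derived rain rate with a gauge-based factor.
factor = mean_field_bias([2.0, 4.0], [1.0, 2.0])
corrected = factor * rain_rate_from_dbz(30.0)
```

A "dynamical" use of Z-R relationships, as the abstract puts it, would presumably vary a and b with the situation rather than fix them as this sketch does.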
NASA Astrophysics Data System (ADS)
Beyer, Hans Georg; Chougule, Abhijit
2016-04-01
While the wind energy industry is growing rapidly and the siting of wind turbines onshore as well as offshore is increasing, many wind engineering modeling tools have been developed for the assessment of loads on wind turbines due to varying wind speeds. In order to have proper wind turbine design and performance analysis, it is important to have an accurate representation of the incoming wind field. To ease the analysis, tools for the generation of synthetic wind fields have been developed, e.g. the widely used TurbSim procedure. We analyse the respective synthetic data sets, on the one hand, with respect to the similarity of the spectral characteristics of measured and synthetic sets. In addition, second-order characteristics with direct relevance to load assessment, as given by the statistics of increments and rainflow count results, are inspected.
Hewett, Paul; Bullock, William H
2014-01-01
For more than 20 years CSX Transportation (CSXT) has collected exposure measurements from locomotive engineers and conductors who are potentially exposed to diesel emissions. The database included measurements for elemental and total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, carbon monoxide, and nitrogen dioxide. This database was statistically analyzed and summarized, and the resulting statistics and exposure profiles were compared to relevant occupational exposure limits (OELs) using both parametric and non-parametric descriptive and compliance statistics. Exposure ratings, using the American Industrial Hygiene Association (AIHA) exposure categorization scheme, were determined using both the compliance statistics and Bayesian Decision Analysis (BDA). The statistical analysis of the elemental carbon data (a marker for diesel particulate) strongly suggests that the majority of levels in the cabs of the lead locomotives (n = 156) were less than the California guideline of 0.020 mg/m³. The sample 95th percentile was roughly half the guideline, resulting in an AIHA exposure rating of category 2/3 (determined using BDA). The elemental carbon (EC) levels in the trailing locomotives tended to be greater than those in the lead locomotive; however, locomotive crews rarely ride in the trailing locomotive. Lead locomotive EC levels were similar to those reported by other investigators studying locomotive crew exposures and to levels measured in urban areas. Lastly, both the EC sample mean and 95%UCL were less than the Environmental Protection Agency (EPA) reference concentration of 0.005 mg/m³. With the exception of nitrogen dioxide, the overwhelming majority of the measurements for total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, and combustion gases in the cabs of CSXT locomotives were either non-detects or considerably less than the working OELs for the years represented in the database. When compared to the previous American
Statistical model and error analysis of a proposed audio fingerprinting algorithm
NASA Astrophysics Data System (ADS)
McCarthy, E. P.; Balado, F.; Silvestre, G. C. M.; Hurley, N. J.
2006-01-01
In this paper we present a statistical analysis of a particular audio fingerprinting method proposed by Haitsma et al. [1] Due to the excellent robustness and synchronisation properties of this particular fingerprinting method, we would like to examine its performance for varying values of the parameters involved in the computation and ascertain its capabilities. For this reason, we pursue a statistical model of the fingerprint (also known as a hash, message digest or label). Initially we follow the work of a previous attempt made by Doets and Lagendijk [2-4] to obtain such a statistical model. By reformulating the representation of the fingerprint as a quadratic form, we present a model in which the parameters derived by Doets and Lagendijk may be obtained more easily. Furthermore, our model allows further insight into certain aspects of the behaviour of the fingerprinting algorithm not previously examined. Using our model, we then analyse the probability of error (P_e) of the hash. We identify two particular error scenarios and obtain an expression for the probability of error in each case. We present three methods of varying accuracy to approximate P_e following Gaussian noise addition to the signal of interest. We then analyse the probability of error following desynchronisation of the signal at the input of the hashing system and provide an approximation to P_e for different parameters of the algorithm under varying degrees of desynchronisation.
Zhukovsky, Michael; Varaksin, Anatole; Pakholkina, Olga
2014-07-01
An observational study is a type of epidemiological study in which the researcher observes the situation but cannot change the conditions of the experiment. A statistical analysis of an observational study of the population of Lermontov city (North Caucasus) was conducted. The initial group comprised 121 people with a lung cancer diagnosis and 196 controls. Statistical analysis was performed only for men (95 cases and 76 controls). Logistic regression with adjustment for age gives an odds ratio of 1.95 (90% CI: 0.87-4.37) per 100 working level months of combined (occupational and domestic) radon exposure. It was demonstrated that chronic lung diseases are an additional risk factor for uranium miners but not a significant risk factor for the general population. Thus, the possibility of obtaining statistically reliable results in observational studies when correct methods of analysis are used is demonstrated.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-05
... HUMAN SERVICES Food and Drug Administration Guidance for Industry on Documenting Statistical Analysis...: The Food and Drug Administration (FDA) is announcing the availability of a guidance for industry 197 entitled ``Documenting Statistical Analysis Programs and Data Files.'' This guidance is provided to...
Statistics Education Research in Malaysia and the Philippines: A Comparative Analysis
ERIC Educational Resources Information Center
Reston, Enriqueta; Krishnan, Saras; Idris, Noraini
2014-01-01
This paper presents a comparative analysis of statistics education research in Malaysia and the Philippines by modes of dissemination, research areas, and trends. An electronic search for published research papers in the area of statistics education from 2000-2012 yielded 20 for Malaysia and 19 for the Philippines. Analysis of these papers showed…
Statistical analysis of wing/fin buffeting response
NASA Astrophysics Data System (ADS)
Lee, B. H. K.
2002-05-01
The random nature of the aerodynamic loading on the wing and tail structures of an aircraft makes it necessary to adopt a statistical approach in the prediction of the buffeting response. This review describes a buffeting prediction technique based on rigid model pressure measurements that is commonly used in North America, and also the buffet excitation parameter technique favored by many researchers in the UK. It is shown that the two models are equivalent and have their origin in a statistical theory of the response of a mechanical system to a random load. In formulating the model for predicting aircraft response at flight conditions using rigid model wind tunnel pressure measurements, the wing (fin) is divided into panels, and the load is computed from measured pressure fluctuations at the center of each panel. The methods used to model pressure correlation between panels are discussed. The coupling between the wing (fin) motion and the induced aerodynamics using a doublet-lattice unsteady aerodynamics code is described. The buffet excitation parameter approach to predict flight test response using wind tunnel model data is derived from the equations for the pressure model formulation. Examples of flight correlation with prediction based on wind tunnel measurements for wing and vertical tail buffeting response are presented for a number of aircraft. For rapid maneuvers inside the buffet regime, the statistical properties of the buffet load are usually non-stationary because of the short time records and difficulties in maintaining constant flight conditions. The time history of the applied load is segmented into a number of time intervals. In each time segment, the non-stationary load is represented as a product of a deterministic shaping function and a random function. Various forms of the load power spectral density that permit analytical solution of the mean square displacement and acceleration response are considered. Illustrations are given using F
Statistical analysis of epidemiologic data of pregnancy outcomes
Butler, W.J.; Kalasinski, L.A.
1989-02-01
In this paper, a generalized logistic regression model for correlated observations is used to analyze epidemiologic data on the frequency of spontaneous abortion among a group of women office workers. The results are compared to those obtained from the use of the standard logistic regression model that assumes statistical independence among all the pregnancies contributed by one woman. In this example, the correlation among pregnancies from the same woman is fairly small and did not have a substantial impact on the magnitude of estimates of parameters of the model. This is due at least partly to the small average number of pregnancies contributed by each woman.
Statistical Analysis of Noisy Signals Using Classification Tools
Thompson, Sandra E.; Heredia-Langner, Alejandro; Johnson, Timothy J.; Foster, Nancy S.; Valentine, Nancy B.; Amonette, James E.
2005-06-04
The potential use of chemicals, biotoxins and biological pathogens is a threat to military and police forces as well as the general public. Rapid identification of these agents is made difficult by the noisy nature of the signal that can be obtained from portable, in-field sensors. In previously published articles, we created a flowchart that illustrated a method for triaging bacterial identification by combining standard statistical techniques for discrimination and identification with mid-infrared spectroscopic data. The present work documents the process of characterizing and eliminating the sources of the noise and outlines how multidisciplinary teams are necessary to accomplish that goal.
Period04: Statistical analysis of large astronomical time series
NASA Astrophysics Data System (ADS)
Lenz, Patrick; Breger, Michel
2014-07-01
Period04 statistically analyzes large astronomical time series containing gaps. It calculates formal uncertainties, can extract the individual frequencies from the multiperiodic content of time series, and provides a flexible interface to perform multiple-frequency fits with a combination of least-squares fitting and the discrete Fourier transform algorithm. Period04, written in Java/C++, supports the SAMP communication protocol to provide interoperability with other applications of the Virtual Observatory. It is a reworked and extended version of Period98 (Sperl 1998) and PERIOD/PERDET (Breger 1990).
Statistical analysis of multivariate atmospheric variables. [cloud cover
NASA Technical Reports Server (NTRS)
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate to near-normal; (5) test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) test of fit for continuous distributions based upon the generalized minimum chi-square; (7) effect of correlated observations on confidence sets based upon chi-square statistics; and (8) generation of random variates from specified distributions.
[Statistical analysis of fabrication of indirect single restorations].
Sato, T; Kawawa, A; Okada, D; Ohno, S; Akiba, H; Watanabe, Y; Endo, K; Mayanagi, A; Miura, H; Hasegawa, S
1999-09-01
A statistical survey based on laboratory records was performed on the number of indirect restorations fabricated at the dental hospital of Tokyo Medical and Dental University from April 1 to September 30, 1997. A comparison was also carried out with a previous survey, which had been carried out in 1986, in order to detect any change and possible alterations in the near future. Based on the results of this statistical survey, the conclusions were as follows: 1. A total of 9,126 indirect restorations were fabricated during the six month period in 1997; among them, 8,007 (87.7%) restorations were covered by health insurance and 1,119 (12.3%) restorations were not. 2. The most common restoration was the cast post and core (28.6%), followed by full crowns (18.5%) and removable partial dentures (15.6%). On the other hand, the least number were post crowns (0.03%) and resin jacket crowns (0.2%). 3. When making a comparison with the data in 1986, an increase in the number of removable partial dentures and a decrease in the number of inlays were the most distinctive features. 4. For anterior teeth, resin-veneered crowns were most common, especially for lower teeth. The percentage of restorations, which were not covered by health insurance, decreased from 45.0% (in 1986) to 12.3% (in 1997).
Statistical analysis of bankrupting and non-bankrupting stocks
NASA Astrophysics Data System (ADS)
Li, Qian; Wang, Fengzhong; Wei, Jianrong; Liang, Yuan; Huang, Jiping; Stanley, H. Eugene
2012-04-01
The recent financial crisis has caused extensive world-wide economic damage, affecting in particular those who invested in companies that eventually filed for bankruptcy. A better understanding of stocks that become bankrupt would be helpful in reducing risk in future investments. Economists have conducted extensive research on this topic, and here we ask whether statistical physics concepts and approaches may offer insights into pre-bankruptcy stock behavior. To this end, we study all 20092 stocks listed in US stock markets for the 20-year period 1989-2008, including 4223 (21 percent) that became bankrupt during that period. We find that, surprisingly, the distributions of the daily returns of those stocks that become bankrupt differ significantly from those that do not. Moreover, these differences are consistent for the entire period studied. We further study the relation between the distribution of returns and the length of time until bankruptcy, and observe that larger differences of the distribution of returns correlate with shorter time periods preceding bankruptcy. This behavior suggests that sharper fluctuations in the stock price occur when the stock is closer to bankruptcy. We also analyze the cross-correlations between the return and the trading volume, and find that stocks approaching bankruptcy tend to have larger return-volume cross-correlations than stocks that are not. Furthermore, the difference increases as bankruptcy approaches. We conclude that before a firm becomes bankrupt its stock exhibits unusual behavior that is statistically quantifiable.
Statistical analysis of biotissues Mueller matrix images in cancer diagnostics
NASA Astrophysics Data System (ADS)
Yermolenko, Sergey; Ivashko, Pavlo; Goudail, François; Gruia, Ion
2010-11-01
This work investigates the scope of laser polarimetry and polarization spectrometry for detecting oncological changes in human prostate tissue under conditions of multiple scattering. The third statistical moment of the intensity distribution proved to be the most sensitive to pathological changes in orientation structure: its value for the polarization image I(0 - 90) of oncologically changed tissue is 21 times higher than the corresponding parameter for healthy tissue. Results of studies of the magnitude of linear dichroism of the prostate gland, both healthy and affected by malignant tumor at different stages of its development, are presented. Significant differences in the values of linear dichroism and its spectral dependence in the spectral range λ = 280 - 840 nm were shown both between the objects of study and between biotissues, healthy (or affected by benign tumors) and cancerous. These results may have diagnostic value for the detection of cancer and the assessment of its development.
Texture analysis with statistical methods for wheat ear extraction
NASA Astrophysics Data System (ADS)
Bakhouche, M.; Cointault, F.; Gouton, P.
2007-01-01
In the agronomic domain, simplifying crop counting, which is necessary for yield prediction and agronomic studies, is an important project for technical institutes such as Arvalis. Although the main objective of our overall project is to design a mobile robot for acquiring natural images directly in the field, Arvalis first asked us to detect wheat ears in images by image processing so that they can be counted, which gives the first component of the yield. In this paper we compare different texture image segmentation techniques based on feature extraction by first- and higher-order statistical methods, applied to our images. The extracted features are used for unsupervised pixel classification to obtain the different classes in the image: the K-means algorithm is applied, followed by the choice of a threshold to highlight the ears. Three methods have been tested in this feasibility study, with an average error of 6%. Although the quality of the detection is currently evaluated visually, automatic evaluation algorithms are being implemented. Moreover, other higher-order statistical methods will be implemented in the future, together with methods based on spatio-frequential transforms and specific filtering.
Statistical analysis of the Indus script using n-grams.
Yadav, Nisha; Joglekar, Hrishikesh; Rao, Rajesh P N; Vahia, Mayank N; Adhikari, Ronojoy; Mahadevan, Iravatham
2010-03-19
The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language have frustrated efforts at decipherment since the discovery of the remains of the Indus civilization. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyze the syntax of the Indus script. We find that unigrams follow a Zipf-Mandelbrot distribution. Text beginner and ender distributions are unequal, providing internal evidence for syntax. We see clear evidence of strong bigram correlations and extract significant pairs and triplets using a log-likelihood measure of association. Highly frequent pairs and triplets are not always highly significant. The model performance is evaluated using information-theoretic measures and cross-validation. The model can restore doubtfully read texts with an accuracy of about 75%. We find that a quadrigram Markov chain saturates information-theoretic measures against a held-out corpus. Our work forms the basis for the development of a stochastic grammar which may be used to explore the syntax of the Indus script in greater detail.
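To make the idea of restoring a doubtfully read sign with an n-gram Markov chain concrete, here is a minimal bigram sketch. It is not the authors' model: the toy corpus, the add-one smoothing, and the function names `train_bigrams` and `restore` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(texts):
    """Count sign-to-sign transitions, with start and end markers."""
    counts = defaultdict(Counter)
    for signs in texts:
        padded = ["<s>"] + list(signs) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts


def restore(counts, prev, nxt, vocab):
    """Pick the sign s maximizing P(s | prev) * P(nxt | s).

    Probabilities use add-one smoothing (an assumed choice); the +1 in
    the denominator accounts for the end marker as a possible successor.
    """
    def prob(a, b):
        total = sum(counts[a].values())
        return (counts[a][b] + 1) / (total + len(vocab) + 1)

    # sorted() makes tie-breaking deterministic.
    return max(sorted(vocab), key=lambda s: prob(prev, s) * prob(s, nxt))


# Usage on a hypothetical corpus of three short "texts".
corpus = [["A", "B", "C"], ["A", "B", "C"], ["A", "D", "C"]]
model = train_bigrams(corpus)
best = restore(model, "A", "C", {"A", "B", "C", "D"})
```

The abstract reports that a quadrigram chain saturates the information-theoretic measures; extending this sketch to longer contexts would mean keying the counts on tuples of preceding signs.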
How Many Studies Do You Need? A Primer on Statistical Power for Meta-Analysis
ERIC Educational Resources Information Center
Valentine, Jeffrey C.; Pigott, Therese D.; Rothstein, Hannah R.
2010-01-01
In this article, the authors outline methods for using fixed and random effects power analysis in the context of meta-analysis. Like statistical power analysis for primary studies, power analysis for meta-analysis can be done either prospectively or retrospectively and requires assumptions about parameters that are unknown. The authors provide…
Porosity Measurements and Analysis for Metal Additive Manufacturing Process Control.
Slotwinski, John A; Garboczi, Edward J; Hebenstreit, Keith M
2014-01-01
Additive manufacturing techniques can produce complex, high-value metal parts, with potential applications as critical metal components such as those found in aerospace engines and as customized biomedical implants. Material porosity in these parts is undesirable for aerospace parts - since porosity could lead to premature failure - and desirable for some biomedical implants - since surface-breaking pores allow for better integration with biological tissue. Changes in a part's porosity during an additive manufacturing build may also be an indication of an undesired change in the build process. Here, we present efforts to develop an ultrasonic sensor for monitoring changes in the porosity in metal parts during fabrication on a metal powder bed fusion system. The development of well-characterized reference samples, measurements of the porosity of these samples with multiple techniques, and correlation of ultrasonic measurements with the degree of porosity are presented. A proposed sensor design, measurement strategy, and future experimental plans on a metal powder bed fusion system are also presented. PMID:26601041
Additional EIPC Study Analysis: Interim Report on High Priority Topics
Hadley, Stanton W
2013-11-01
Between 2010 and 2012 the Eastern Interconnection Planning Collaborative (EIPC) conducted a major long-term resource and transmission study of the Eastern Interconnection (EI). With guidance from a Stakeholder Steering Committee (SSC) that included representatives from the Eastern Interconnection States Planning Council (EISPC) among others, the project was conducted in two phases. Phase 1 involved a long-term capacity expansion analysis that involved creation of eight major futures plus 72 sensitivities. Three scenarios were selected for more extensive transmission-focused evaluation in Phase 2. Five power flow analyses, nine production cost model runs (including six sensitivities), and three capital cost estimations were developed during this second phase. The results from Phase 1 and 2 provided a wealth of data that could be examined further to address energy-related questions. A list of 13 topics was developed for further analysis; this paper discusses the first five.
Statistical group differences in anatomical shape analysis using Hotelling T2 metric
NASA Astrophysics Data System (ADS)
Styner, Martin; Oguz, Ipek; Xu, Shun; Pantazis, Dimitrios; Gerig, Guido
2007-03-01
Shape analysis has become of increasing interest to the neuroimaging community due to its potential to precisely locate morphological changes between healthy and pathological structures. This manuscript presents a comprehensive set of tools for the computation of 3D structural statistical shape analysis. It has been applied in several studies on brain morphometry, but can potentially be employed in other 3D shape problems. Its main limitation is the necessity of spherical topology. The input of the proposed shape analysis is a set of binary segmentations of a single brain structure, such as the hippocampus or caudate. These segmentations are converted into a corresponding spherical harmonic description (SPHARM), which is then sampled into triangulated surfaces (SPHARM-PDM). After alignment, differences between groups of surfaces are computed using the Hotelling T2 two-sample metric. Statistical p-values, both raw and corrected for multiple comparisons, result in significance maps. Additional visualizations of the group tests are provided via mean difference magnitude and vector maps, as well as maps of the group covariance information. The correction for multiple comparisons is performed via two separate methods that each have a distinct view of the problem. The first one aims to control the family-wise error rate (FWER) or false-positives via the extrema histogram of non-parametric permutations. The second method controls the false discovery rate and results in a less conservative estimate of the false-negatives. Prior versions of this shape analysis framework have been applied already to clinical studies on hippocampus and lateral ventricle shape in adult schizophrenics. The novelty of this submission is the use of the Hotelling T2 two-sample group difference metric for the computation of a template free statistical shape analysis. Template free group testing allowed this framework to become independent of any template choice, as well as it improved the
Disclosure of hydraulic fracturing fluid chemical additives: analysis of regulations.
Maule, Alexis L; Makey, Colleen M; Benson, Eugene B; Burrows, Isaac J; Scammell, Madeleine K
2013-01-01
Hydraulic fracturing is used to extract natural gas from shale formations. The process involves injecting into the ground fracturing fluids that contain thousands of gallons of chemical additives. Companies are not mandated by federal regulations to disclose the identities or quantities of chemicals used during hydraulic fracturing operations on private or public lands. States have begun to regulate hydraulic fracturing fluids by mandating chemical disclosure. These laws have shortcomings including nondisclosure of proprietary or "trade secret" mixtures, insufficient penalties for reporting inaccurate or incomplete information, and timelines that allow for after-the-fact reporting. These limitations leave lawmakers, regulators, public safety officers, and the public uninformed and ill-prepared to anticipate and respond to possible environmental and human health hazards associated with hydraulic fracturing fluids. We explore hydraulic fracturing exemptions from federal regulations, as well as current and future efforts to mandate chemical disclosure at the federal and state level. PMID:23552653
Risk analysis of sulfites used as food additives in China.
Zhang, Jian Bo; Zhang, Hong; Wang, Hua Li; Zhang, Ji Yue; Luo, Peng Jie; Zhu, Lei; Wang, Zhu Tian
2014-02-01
This study analyzed the risk of sulfites in food consumed by the Chinese people and assessed the health protection capability of the maximum-permitted level (MPL) of sulfites in GB 2760-2011. Sulfites as food additives are overused or abused in many food categories. When the MPL in GB 2760-2011 was used as the sulfite content in food, the intake of sulfites in most surveyed populations was lower than the acceptable daily intake (ADI). Excess intake of sulfites was found in all the surveyed groups when a high percentile of sulfites in food was taken in. Moreover, children aged 1-6 years are at a high risk of excess sulfite intake. The primary cause of the excess intake of sulfites in Chinese people is the overuse and abuse of sulfites by the food industry. The current MPL of sulfites in GB 2760-2011 protects the health of most populations.
In-Situ Statistical Analysis of Autotune Simulation Data using Graphical Processing Units
Ranjan, Niloo; Sanyal, Jibonananda; New, Joshua Ryan
2013-08-01
Developing accurate building energy simulation models to assist energy efficiency at speed and scale is one of the research goals of the Whole-Building and Community Integration group, which is a part of the Building Technologies Research and Integration Center (BTRIC) at Oak Ridge National Laboratory (ORNL). The aim of the Autotune project is to speed up the automated calibration of building energy models to match measured utility or sensor data. The workflow of this project takes input parameters and runs EnergyPlus simulations on the Oak Ridge Leadership Computing Facility's (OLCF) computing resources such as Titan, the world's second fastest supercomputer. Multiple simulations run in parallel on nodes having 16 processors each and a Graphics Processing Unit (GPU). Each node produces a 5.7 GB output file comprising 256 files from 64 simulations. Four types of output data covering monthly, daily, hourly, and 15-minute time steps are produced for each annual simulation. A total of 270TB+ of data has been produced. In this project, the simulation data is statistically analyzed in-situ using GPUs while annual simulations are being computed on the traditional processors. Titan, with its recent addition of 18,688 Compute Unified Device Architecture (CUDA) capable NVIDIA GPUs, has greatly extended its capability for massively parallel data processing. CUDA is used along with C/MPI to calculate statistical metrics such as sum, mean, variance, and standard deviation leveraging GPU acceleration. The workflow developed in this project produces statistical summaries of the data, which reduces by multiple orders of magnitude the time and amount of data that needs to be stored. These statistical capabilities are anticipated to be useful for sensitivity analysis of EnergyPlus simulations.
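The abstract does not say how the sum, mean, variance, and standard deviation are accumulated across GPU threads or MPI ranks. One common way to make such summaries mergeable is a one-pass update combined with a pairwise merge of partial results; the following is a minimal CPU-side sketch of that idea in Python, not the project's CUDA/C code. The names `Summary`, `summarize`, and `summarize_in_chunks` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Summary:
    """Count, mean, and sum of squared deviations (m2) of a data stream."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> None:
        # One-pass (Welford-style) update with a single new value.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "Summary") -> "Summary":
        # Combine two partial summaries without revisiting the raw data.
        if other.n == 0:
            return Summary(self.n, self.mean, self.m2)
        if self.n == 0:
            return Summary(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return Summary(n, mean, m2)

    @property
    def total(self) -> float:
        return self.mean * self.n

    @property
    def variance(self) -> float:
        # Population variance; NaN for an empty summary.
        return self.m2 / self.n if self.n else float("nan")

    @property
    def std(self) -> float:
        return math.sqrt(self.variance)


def summarize(chunk):
    """Summarize one chunk (stands in for the work of one worker)."""
    s = Summary()
    for x in chunk:
        s.add(x)
    return s


def summarize_in_chunks(values, chunk_size):
    """Summarize chunk by chunk, then merge the partial summaries."""
    out = Summary()
    for start in range(0, len(values), chunk_size):
        out = out.merge(summarize(values[start:start + chunk_size]))
    return out
```

Because only three numbers per chunk are kept, the stored summary is far smaller than the raw time series, which is the kind of data reduction the abstract describes.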
Statistical analysis of loopy belief propagation in random fields
NASA Astrophysics Data System (ADS)
Yasuda, Muneki; Kataoka, Shun; Tanaka, Kazuyuki
2015-10-01
Loopy belief propagation (LBP), which is equivalent to the Bethe approximation in statistical mechanics, is a message-passing-type inference method that is widely used to analyze systems based on Markov random fields (MRFs). In this paper, we propose a message-passing-type method to analytically evaluate the quenched average of LBP in random fields by using the replica cluster variation method. The proposed analytical method is applicable to general pairwise MRFs with random fields whose distributions differ from each other and can give the quenched averages of the Bethe free energies over random fields, which are consistent with numerical results. The order of its computational cost is equivalent to that of standard LBP. In the latter part of this paper, we describe the application of the proposed method to Bayesian image restoration, in which we observed that our theoretical results are in good agreement with the numerical results for natural images.
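For readers unfamiliar with the standard LBP that the abstract builds on, here is a minimal sketch of sum-product message passing on a pairwise binary MRF with random fields. It assumes an Ising-type parameterization with weight exp(h_i x_i + J x_i x_j) and x_i in {-1, +1}; that parameterization and the function names `loopy_bp` and `exact_marginals` are illustrative choices. The replica cluster variation analysis proposed in the paper is not reproduced here.

```python
import math
from itertools import product

STATES = (-1, 1)


def loopy_bp(n, edges, J, h, iters=200, tol=1e-10):
    """Synchronous sum-product updates; returns the beliefs P(x_i = +1)."""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # msg[(i, j)][s] is the message from node i to node j at x_j = STATES[s].
    msg = {(i, j): [0.5, 0.5] for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            out = []
            for xj in STATES:
                total = 0.0
                for s, xi in enumerate(STATES):
                    term = math.exp(h[i] * xi + J * xi * xj)
                    for k in nbrs[i]:
                        if k != j:
                            term *= msg[(k, i)][s]
                    total += term
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        diff = max(abs(new[e][s] - msg[e][s]) for e in msg for s in (0, 1))
        msg = new
        if diff < tol:
            break
    beliefs = []
    for i in range(n):
        b = []
        for s, xi in enumerate(STATES):
            v = math.exp(h[i] * xi)
            for k in nbrs[i]:
                v *= msg[(k, i)][s]
            b.append(v)
        beliefs.append(b[1] / (b[0] + b[1]))
    return beliefs


def exact_marginals(n, edges, J, h):
    """Brute-force P(x_i = +1); only feasible for very small n."""
    z = 0.0
    p = [0.0] * n
    for x in product(STATES, repeat=n):
        energy = sum(h[i] * x[i] for i in range(n))
        energy += sum(J * x[i] * x[j] for i, j in edges)
        w = math.exp(energy)
        z += w
        for i in range(n):
            if x[i] == 1:
                p[i] += w
    return [v / z for v in p]
```

On a tree-structured graph such as a chain the beliefs agree with the exact marginals; on graphs with cycles they are the Bethe approximation mentioned in the abstract.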
Statistical Analysis of Complexity Generators for Cost Estimation
NASA Technical Reports Server (NTRS)
Rowell, Ginger Holmes
1999-01-01
Predicting the cost of cutting edge new technologies involved with spacecraft hardware can be quite complicated. A new feature of the NASA Air Force Cost Model (NAFCOM), called the Complexity Generator, is being developed to model the complexity factors that drive the cost of space hardware. This parametric approach is also designed to account for the differences in cost, based on factors that are unique to each system and subsystem. The cost driver categories included in this model are weight, inheritance from previous missions, technical complexity, and management factors. This paper explains the Complexity Generator framework, the statistical methods used to select the best model within this framework, and the procedures used to find the region of predictability and the prediction intervals for the cost of a mission.
Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation
NASA Technical Reports Server (NTRS)
DePriest, Douglas; Morgan, Carolyn
2003-01-01
The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models to accurately predict the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.
Statistical Analysis of Haralick Texture Features to Discriminate Lung Abnormalities
Zayed, Nourhan; Elnemr, Heba A.
2015-01-01
The Haralick texture features are a well-known mathematical method to detect lung abnormalities and give the physician the opportunity to localize the abnormal tissue type, either lung tumor or pulmonary edema. In this paper, statistical evaluation of the different features represents the reported performance of the proposed method. CT datasets from thirty-seven patients with either lung tumor or pulmonary edema were included in this study. The CT images are first preprocessed for noise reduction and image enhancement, followed by segmentation techniques to segment the lungs, and finally Haralick texture features to detect the type of the abnormality within the lungs. In spite of the presence of low contrast and high noise in the images, the proposed algorithms give promising results in detecting lung abnormalities in most of the patients in comparison with the normal, and suggest that some of the features are significantly more recommended than others. PMID:26557845
Statistical analysis of Nomao customer votes for spots of France
NASA Astrophysics Data System (ADS)
Pálovics, Róbert; Daróczy, Bálint; Benczúr, András; Pap, Julia; Ermann, Leonardo; Phan, Samuel; Chepelianskii, Alexei D.; Shepelyansky, Dima L.
2015-08-01
We investigate the statistical properties of votes of customers for spots of France collected by the startup company Nomao. The frequencies of votes per spot and per customer are characterized by a power law distribution which remains stable on a time scale of a decade when the number of votes is varied by almost two orders of magnitude. Using computer science methods we explore the spectrum and the eigenvalues of a matrix containing user ratings of geolocalized items. Eigenvalues nicely map to large towns and regions but show a certain level of instability as we modify the interpretation of the underlying matrix. We evaluate imputation strategies that provide improved prediction performance by reaching geographically smooth eigenvectors. We point to possible links between the distribution of votes and the phenomenon of self-organized criticality.
Analysis of surface sputtering on a quantum statistical basis
NASA Technical Reports Server (NTRS)
Wilhelm, H. E.
1975-01-01
Surface sputtering is explained theoretically by means of a 3-body sputtering mechanism involving the ion and two surface atoms of the solid. By means of quantum-statistical mechanics, a formula for the sputtering ratio S(E) is derived from first principles. The theoretical sputtering rate S(E) was found experimentally to be proportional to the square of the difference between incident ion energy and the threshold energy for sputtering of surface atoms at low ion energies. Extrapolation of the theoretical sputtering formula to larger ion energies indicates that S(E) reaches a saturation value and finally decreases at high ion energies. The theoretical sputtering ratios S(E) for wolfram, tantalum, and molybdenum are compared with the corresponding experimental sputtering curves in the low energy region from threshold sputtering energy to 120 eV above the respective threshold energy. Theory and experiment are shown to be in good agreement.
Power flow as a complement to statistical energy analysis and finite element analysis
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1987-01-01
Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies where the number of resonances involved in the analysis is rather small. On the other hand, SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist, which makes the finite element method too costly. On the other hand, SEA methods can only predict an average level form. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies, and it can be integrated with established FE models at low frequencies and SEA models at high frequencies to form a verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.
Application of the Statistical ICA Technique in the DANCE Data Analysis
NASA Astrophysics Data System (ADS)
Baramsai, Bayarbadrakh; Jandel, M.; Bredeweg, T. A.; Rusev, G.; Walker, C. L.; Couture, A.; Mosby, S.; Ullmann, J. L.; Dance Collaboration
2015-10-01
The Detector for Advanced Neutron Capture Experiments (DANCE) at the Los Alamos Neutron Science Center is used to improve our understanding of the neutron capture reaction. DANCE is a highly efficient 4π γ-ray detector array consisting of 160 BaF2 crystals, which makes it an ideal tool for neutron capture experiments. The (n, γ) reaction Q-value equals the sum energy of all γ-rays emitted in the de-excitation cascades from the excited capture state to the ground state. The total γ-ray energy is used to identify reactions on different isotopes as well as the background. However, it is challenging to identify contributions in the Esum spectra from different isotopes with similar Q-values. Recently we have tested the applicability of modern statistical methods such as Independent Component Analysis (ICA) to identify and separate different (n, γ) reaction yields on different isotopes that are present in the target material. ICA is a recently developed computational tool for separating multidimensional data into statistically independent additive subcomponents. In this conference talk, we present some results of the application of ICA algorithms and their modification for the DANCE experimental data analysis. This research is supported by the U. S. Department of Energy, Office of Science, Nuclear Physics under the Early Career Award No. LANL20135009.
Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell
2012-01-01
Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.
Rashid, Naim U.; Sun, Wei; Ibrahim, Joseph G.
2014-01-01
In DAE (DNA After Enrichment)-seq experiments, genomic regions related with certain biological processes are enriched/isolated by an assay and are then sequenced on a high-throughput sequencing platform to determine their genomic positions. Statistical analysis of DAE-seq data aims to detect genomic regions with significant aggregations of isolated DNA fragments (“enriched regions”) versus all the other regions (“background”). However, many confounding factors may influence DAE-seq signals. In addition, the signals in adjacent genomic regions may exhibit strong correlations, which invalidate the independence assumption employed by many existing methods. To mitigate these issues, we develop a novel Autoregressive Hidden Markov Model (AR-HMM) to account for covariates effects and violations of the independence assumption. We demonstrate that our AR-HMM leads to improved performance in identifying enriched regions in both simulated and real datasets, especially in those in epigenetic datasets with broader regions of DAE-seq signal enrichment. We also introduce a variable selection procedure in the context of the HMM/AR-HMM where the observations are not independent and the mean value of each state-specific emission distribution is modeled by some covariates. We study the theoretical properties of this variable selection procedure and demonstrate its efficacy in simulated and real DAE-seq data. In summary, we develop several practical approaches for DAE-seq data analysis that are also applicable to more general problems in statistics. PMID:24678134
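To illustrate the background versus enriched state segmentation that an HMM performs on read counts, here is a minimal sketch of Viterbi decoding for a plain two-state HMM with Poisson emissions. This is a deliberate simplification: it has no autoregressive term and no covariates, so it is not the AR-HMM of the abstract. The emission rates, the `stay` probability, and the function names are hypothetical values chosen for illustration.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability mass function at count k."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi(counts, rates=(2.0, 10.0), stay=0.9, start=(0.5, 0.5)):
    """Most likely state path; state 0 = background, state 1 = enriched."""
    if not counts:
        return []
    n_states = len(rates)
    switch = (1.0 - stay) / (n_states - 1)
    log_trans = [
        [math.log(stay if a == b else switch) for b in range(n_states)]
        for a in range(n_states)
    ]
    score = [
        math.log(start[s]) + poisson_logpmf(counts[0], rates[s])
        for s in range(n_states)
    ]
    back = []  # back[t][b] = best predecessor of state b at step t + 1
    for c in counts[1:]:
        ptr, new = [], []
        for b in range(n_states):
            best = max(range(n_states), key=lambda a: score[a] + log_trans[a][b])
            ptr.append(best)
            new.append(score[best] + log_trans[best][b] + poisson_logpmf(c, rates[b]))
        back.append(ptr)
        score = new
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

For example, `viterbi([1, 2, 1, 12, 11, 9, 13, 2, 1, 0])` labels the four high-count windows as enriched and the rest as background. Correlated signals in adjacent regions, which the abstract identifies as a problem for independence-based methods, are exactly what this simplified emission model ignores.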
Statistical analysis of suicide characteristics in Iaşi County.
Herea, Speranta-Giulia; Scripcaru, C
2012-01-01
A prospective study intended for statistical analysis of suicide events occurring in the 2004-2009 period in Iaşi County was performed. Specific data emerged from the conventional investigation, focusing on the sex, age, seasonality, marital condition, occupation status, blood alcohol concentration, religious adherence, and previous suicide attempts of the persons who committed the lethal self-aggression. The results showed a male:female (M:F) ratio of 4.13:1, a central tendency to suicide towards 46 years, and a mean age of the self-murderers series of 45 years, while the most frequent age was 49 years. The interquartile range extended from 33 to 56 years. The rural:urban (R:U) ratio was 1.38:1, whereas a statistically significant seasonal variation was found in villages. Suicide events occurred more frequently around Easter and Christmas, whereas Orthodox Christian believers seemed to commit suicide more than Catholics. Additionally, a correlated analysis, based essentially on data provided by the local Institute of Legal Medicine and Psychiatry Hospital, offered a comprehensive understanding of the mental state of the self-murderers and their psychiatric profile.
Statistical analysis of imperfection effect on cylindrical buckling response
NASA Astrophysics Data System (ADS)
Ismail, M. S.; Purbolaksono, J.; Muhammad, N.; Andriyana, A.; Liew, H. L.
2015-12-01
It is widely reported that no efficient guidelines for modelling imperfections in composite structures are available. In response, this work evaluates the imperfection factors of axially compressed Carbon Fibre Reinforced Polymer (CFRP) cylinders with different ply angles through finite element (FE) analysis. The sensitivity of the imperfection factors was analysed using a design of experiment approach, namely factorial design. The analysis identified three critical factors to which the buckling load is sensitive. Furthermore, an empirical equation is proposed for each type of cylinder. The critical buckling loads estimated by the empirical equations showed good agreement with the FE analysis. The design of experiment methodology is useful in identifying parameters that govern a structure's imperfection tolerance.
Condition of America's Public School Facilities, 1999. Statistical Analysis Report.
ERIC Educational Resources Information Center
Lewis, Laurie; Snow, Kyle; Farris, Elizabeth; Smerdon, Becky; Cronen, Stephanie; Kaplan, Jessica
This report provides national data for 903 U.S. public elementary and secondary schools on the condition of public schools in 1999 and the costs to bring them into good condition. Additionally provided are school plans for repairs, renovations, and replacements; data on the age of public schools; and overcrowding and practices used to address…
New Statistical Approach to the Analysis of Hierarchical Data
NASA Astrophysics Data System (ADS)
Neuman, S. P.; Guadagnini, A.; Riva, M.
2014-12-01
Many variables possess a hierarchical structure reflected in how their increments vary in space and/or time. Quite commonly the increments (a) fluctuate in a highly irregular manner; (b) possess symmetric, non-Gaussian frequency distributions characterized by heavy tails that often decay with separation distance or lag; (c) exhibit nonlinear power-law scaling of sample structure functions in a midrange of lags, with breakdown in such scaling at small and large lags; (d) show extended power-law scaling (ESS) at all lags; and (e) display nonlinear scaling of power-law exponent with order of sample structure function. Some interpret this to imply that the variables are multifractal, which explains neither breakdowns in power-law scaling nor ESS. We offer an alternative interpretation consistent with all above phenomena. It views data as samples from stationary, anisotropic sub-Gaussian random fields subordinated to truncated fractional Brownian motion (tfBm) or truncated fractional Gaussian noise (tfGn). The fields are scaled Gaussian mixtures with random variances. Truncation of fBm and fGn entails filtering out components below data measurement or resolution scale and above domain scale. Our novel interpretation of the data allows us to obtain maximum likelihood estimates of all parameters characterizing the underlying truncated sub-Gaussian fields. These parameters in turn make it possible to downscale or upscale all statistical moments to situations entailing smaller or larger measurement or resolution and sampling scales, respectively. They also allow one to perform conditional or unconditional Monte Carlo simulations of random field realizations corresponding to these scales. Aspects of our approach are illustrated on field and laboratory measured porous and fractured rock permeabilities, as well as soil texture characteristics and neural network estimates of unsaturated hydraulic parameters in a deep vadose zone near Phoenix, Arizona. We also use our approach
Statistical analysis of properties of dwarf novae outbursts
NASA Astrophysics Data System (ADS)
Otulakowska-Hypka, Magdalena; Olech, Arkadiusz; Patterson, Joseph
2016-08-01
We present a statistical study of all measurable photometric features of a large sample of dwarf novae during their outbursts and superoutbursts. We used all accessible photometric data for all our objects to make the study as complete and up to date as possible. Our aim was to check correlations between these photometric features in order to constrain theoretical models which try to explain the nature of dwarf novae outbursts. We managed to confirm a few of the known correlations, that is the Stolz and Schoembs relation, the Bailey relation for long outbursts above the period gap, the relations between the cycle and supercycle lengths, amplitudes of normal and superoutbursts, amplitude and duration of superoutbursts, outburst duration and orbital period, outburst duration and mass ratio for short and normal outbursts, as well as the relation between the rise and decline rates of superoutbursts. However, we question the existence of the Kukarkin-Parenago relation but we found an analogous relation for superoutbursts. We also failed to find one presumed relation between outburst duration and mass ratio for superoutbursts. This study should help to direct theoretical work dedicated to dwarf novae.
Statistical analysis of AFE GN&C aeropass performance
NASA Technical Reports Server (NTRS)
Chang, Ho-Pen; French, Raymond A.
1990-01-01
Performance of the guidance, navigation, and control (GN&C) system used on the Aeroassist Flight Experiment (AFE) spacecraft has been studied with Monte Carlo techniques. The performance of the AFE GN&C is investigated with a 6-DOF numerical dynamic model which includes a Global Reference Atmospheric Model (GRAM) and a gravitational model with oblateness corrections. The study considers all the uncertainties due to the environment and the system itself. In the AFE's aeropass phase, perturbations on the system performance are caused by an error space which has over 20 dimensions of the correlated/uncorrelated error sources. The goal of this study is to determine, in a statistical sense, how much flight path angle error can be tolerated at entry interface (EI) and still have acceptable delta-V capability at exit to position the AFE spacecraft for recovery. Assuming there is fuel available to produce 380 ft/sec of delta-V at atmospheric exit, a 3-sigma standard deviation in flight path angle error of 0.04 degrees at EI would result in a 98-percent probability of mission success.
A statistical mechanics analysis of the set covering problem
NASA Astrophysics Data System (ADS)
Fontanari, J. F.
1996-02-01
The dependence of the optimal solution average cost of the set covering problem on the density of 1's of the incidence matrix and on the number of constraints (P) is investigated in the limit where the number of items (N) goes to infinity. The annealed approximation is employed to study two stochastic models: the constant density model, where the elements of the incidence matrix are statistically independent random variables, and the Karp model, where the rows of the incidence matrix possess the same number of 1's. Lower bounds for the average cost are presented in the case that P scales with ln N and the density is of order 1, as well as in the case that P scales linearly with N and the density is of order 1/N. It is shown that in the case that P scales with exp N and the density is of order 1 the annealed approximation yields exact results for both models.
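As a concrete picture of the problem setting, the following sketch draws an incidence matrix from the constant density model and finds a cover with the classical greedy heuristic. The layout is an assumption made for illustration: rows are the P constraints, columns are the N items, and the cost is the number of items selected. The greedy result is only an upper bound on the optimal cost and is unrelated to the annealed lower bounds of the abstract. The function names are hypothetical.

```python
import random


def random_incidence(p, n, density, rng):
    """Constant density model: each entry is 1 independently with probability density."""
    return [[1 if rng.random() < density else 0 for _ in range(n)] for _ in range(p)]


def greedy_cover(matrix):
    """Repeatedly pick the item (column) that covers the most uncovered constraints (rows)."""
    p = len(matrix)
    n = len(matrix[0]) if p else 0
    uncovered = set(range(p))
    chosen = []
    while uncovered:
        best = max(range(n), key=lambda j: sum(matrix[i][j] for i in uncovered))
        gained = {i for i in uncovered if matrix[i][best]}
        if not gained:
            raise ValueError("a constraint has no 1's and cannot be covered")
        chosen.append(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    m = random_incidence(p=20, n=30, density=0.5, rng=rng)
    cover = greedy_cover(m)
    print("items chosen:", len(cover))
```

Averaging `len(cover)` over many random matrices gives an empirical upper estimate of the average cost for a given P, N, and density.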
Statistical analysis of dendritic spine distributions in rat hippocampal cultures
2013-01-01
Background: Dendritic spines serve as key computational structures in brain plasticity. Much remains to be learned about their spatial and temporal distribution among neurons. Our aim in this study was to perform exploratory analyses based on the population distributions of dendritic spines with regard to their morphological characteristics and period of growth in dissociated hippocampal neurons. We fit a log-linear model to the contingency table of spine features such as spine type and distance from the soma to first determine which features were important in modeling the spines, as well as the relationships between such features. A multinomial logistic regression was then used to predict the spine types using the features suggested by the log-linear model, along with neighboring spine information. Finally, an important variant of Ripley’s K-function applicable to linear networks was used to study the spatial distribution of spines along dendrites. Results: Our study indicated that in the culture system, (i) dendritic spine densities were "completely spatially random", (ii) spine type and distance from the soma were independent quantities, and most importantly, (iii) spines had a tendency to cluster with other spines of the same type. Conclusions: Although these results may vary with other systems, our primary contribution is the set of statistical tools for morphological modeling of spines which can be used to assess neuronal cultures following gene manipulation such as RNAi, and to study induced pluripotent stem cells differentiated to neurons. PMID:24088199
Statistical analysis of the properties of foreshock density cavitons
NASA Astrophysics Data System (ADS)
Kajdič, P.; Blanco-Cano, X.; Omidi, N.; Russell, C. T.
2008-12-01
Global hybrid simulations (kinetic ions, fluid electrons) have shown the existence of foreshock density cavitons immersed in regions permeated by ULF waves (Omidi, 2007, Blanco-Cano et al., 2008). These cavitons are characterized by large depressions in magnetic field magnitude and density, and are bounded by regions with enhanced field and density. In this work we study statistical properties of foreshock cavitons observed by Cluster spacecraft between the years 2001 and 2005. We have identified approximately 90 foreshock cavitons and use magnetic field and plasma data to analyze their durations, sizes, amplitude, and orientation. We compare caviton B and n values with ambient values. We also study the foreshock conditions in which the cavitons are detected, i.e. θBV, the angle between the incoming solar wind flow and the IMF, and Mach number, among others. We also determine the characteristics of the waves that surround the cavitons or even appear within them. We find that the foreshock cavitons can be observed in various ways - some are found as single cavitons immersed in ULF waves, others appear in groups, separated temporally only by a few minutes. In some cases we find two or three cavitons that are in the process of merging into a larger structure, and still developing.
On the Statistical Analysis of X-ray Polarization Measurements
NASA Technical Reports Server (NTRS)
Strohmayer, T. E.; Kallman, T. R.
2013-01-01
In many polarimetry applications, including observations in the X-ray band, the measurement of a polarization signal can be reduced to the detection and quantification of a deviation from uniformity of a distribution of measured angles of the form α + β cos²(φ − φ₀) (0 < φ < π). We explore the statistics of such polarization measurements using both Monte Carlo simulations as well as analytic calculations based on the appropriate probability distributions. We derive relations for the number of counts required to reach a given detection level (parameterized by β, the "number of sigma's" of the measurement) appropriate for measuring the modulation amplitude α by itself (single interesting parameter case) or jointly with the position angle φ (two interesting parameters case). We show that for the former case when the intrinsic amplitude is equal to the well known minimum detectable polarization (MDP) it is, on average, detected at the 3σ level. For the latter case, when one requires a joint measurement at the same confidence level, then more counts are needed, by a factor of ≈2.2, than that required to achieve the MDP level. We find that the position angle uncertainty at 1σ confidence is well described by the relation σ_φ = 28.5°/β.
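The two numerical results quoted in the abstract (the factor of about 2.2 in counts for a joint measurement, and the 28.5 degrees divided by β rule for the position angle uncertainty) can be turned into a small calculator. The sketch below also uses the commonly quoted background-free expression MDP(99%) = 4.29/(μ√N), with μ the modulation factor and N the number of counts; that expression is not stated in the abstract and is included here as an assumption. All function names are hypothetical.

```python
def counts_for_mdp(mdp, modulation_factor=1.0):
    """Counts N such that 4.29 / (mu * sqrt(N)) equals the requested MDP.

    Assumes the background-free 99% confidence MDP expression (not from the abstract).
    """
    if mdp <= 0 or modulation_factor <= 0:
        raise ValueError("mdp and modulation_factor must be positive")
    return (4.29 / (modulation_factor * mdp)) ** 2


def counts_for_joint_measurement(counts_single, factor=2.2):
    """Counts for a joint amplitude and angle measurement (factor of about 2.2 from the abstract)."""
    return factor * counts_single


def position_angle_sigma_deg(n_sigma):
    """1-sigma position angle uncertainty in degrees: 28.5 / beta (relation from the abstract)."""
    if n_sigma <= 0:
        raise ValueError("n_sigma must be positive")
    return 28.5 / n_sigma


if __name__ == "__main__":
    n_single = counts_for_mdp(mdp=0.10)
    print("counts for a 10% MDP:", round(n_single))
    print("counts for the joint measurement:", round(counts_for_joint_measurement(n_single)))
    print("angle uncertainty at 3 sigma (deg):", position_angle_sigma_deg(3.0))
```

Here `n_sigma` plays the role of β, the "number of sigma's" of the measurement.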
Statistical language analysis for automatic exfiltration event detection.
Robinson, David Gerald
2010-04-01
This paper discusses the recent development of a statistical approach for the automatic identification of anomalous network activity that is characteristic of exfiltration events. This approach is based on the language processing method referred to as latent Dirichlet allocation (LDA). Cyber security experts currently depend heavily on a rule-based framework for initial detection of suspect network events. The application of the rule set typically results in an extensive list of suspect network events that are then further explored manually for suspicious activity. The ability to identify anomalous network events is heavily dependent on the experience of the security personnel wading through the network log. Limitations of this approach are clear: rule-based systems only apply to exfiltration behavior that has previously been observed, and experienced cyber security personnel are rare commodities. Since the new methodology is not a discrete rule-based approach, it is more difficult for an insider to disguise the exfiltration events. A further benefit is that the methodology provides a risk-based approach that can be implemented in a continuous, dynamic or evolutionary fashion. This permits suspect network activity to be identified early with a quantifiable risk associated with decision making when responding to suspicious activity.
Lachowiec, Jennifer; Shen, Xia; Queitsch, Christine; Carlborg, Örjan
2015-01-01
Efforts to identify loci underlying complex traits generally assume that most genetic variance is additive. Here, we examined the genetics of Arabidopsis thaliana root length and found that the genomic narrow-sense heritability for this trait in the examined population was statistically zero. The low amount of additive genetic variance that could be captured by the genome-wide genotypes likely explains why no associations to root length could be found using standard additive-model-based genome-wide association (GWA) approaches. However, as the broad-sense heritability for root length was significantly larger, and primarily due to epistasis, we also performed an epistatic GWA analysis to map loci contributing to the epistatic genetic variance. Four interacting pairs of loci were revealed, involving seven chromosomal loci that passed a standard multiple-testing corrected significance threshold. The genotype-phenotype maps for these pairs revealed epistasis that cancelled out the additive genetic variance, explaining why these loci were not detected in the additive GWA analysis. Small population sizes, such as in our experiment, increase the risk of identifying false epistatic interactions due to testing for associations with very large numbers of multi-marker genotypes in few phenotyped individuals. Therefore, we estimated the false-positive risk using a new statistical approach that suggested half of the associated pairs to be true positive associations. Our experimental evaluation of candidate genes within the seven associated loci suggests that this estimate is conservative; we identified functional candidate genes that affected root development in four loci that were part of three of the pairs. The statistical epistatic analyses were thus indispensable for confirming known, and identifying new, candidate genes for root length in this population of wild-collected A. thaliana accessions. We also illustrate how epistatic cancellation of the additive genetic variance
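The statement that epistasis can cancel out the additive genetic variance is easy to see in a toy example. The sketch below uses a hypothetical genotype-phenotype map (not the one estimated in the paper): two biallelic loci in homozygous lines, coded 0/1, with a phenotype that is high only when the two loci carry different alleles. With balanced genotype frequencies, each locus alone has zero covariance with the phenotype, yet the pair of loci explains all of the phenotypic variance.

```python
from itertools import product
from statistics import fmean, pvariance


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# 100 hypothetical inbred lines with all four two-locus genotypes equally frequent.
genotypes = list(product((0, 1), repeat=2)) * 25
locus_a = [g[0] for g in genotypes]
locus_b = [g[1] for g in genotypes]

# Hypothetical map: phenotype is 1.0 when the alleles differ, 0.0 otherwise.
phenotype = [float(a != b) for a, b in genotypes]

# Marginal (additive) signal of each locus: covariance with the phenotype.
additive_cov_a = covariance(locus_a, phenotype)
additive_cov_b = covariance(locus_b, phenotype)

# Variance explained by the two-locus genotype: variance of the genotype means.
pair_mean = {
    g: fmean([y for h, y in zip(genotypes, phenotype) if h == g])
    for g in set(genotypes)
}
explained_by_pair = pvariance([pair_mean[g] for g in genotypes])
total_variance = pvariance(phenotype)

if __name__ == "__main__":
    print("cov(locus A, phenotype):", additive_cov_a)
    print("cov(locus B, phenotype):", additive_cov_b)
    print("variance explained by the pair:", explained_by_pair, "of", total_variance)
```

A single-marker additive scan would find nothing at either locus in this toy population, while a test on the two-locus genotype would, which mirrors the contrast between the additive and epistatic GWA analyses described in the abstract.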
Ockham's razor and Bayesian analysis. [statistical theory for systems evaluation]
NASA Technical Reports Server (NTRS)
Jefferys, William H.; Berger, James O.
1992-01-01
'Ockham's razor', the ad hoc principle enjoining the greatest possible simplicity in theoretical explanations, is presently shown to be justifiable as a consequence of Bayesian inference; Bayesian analysis can, moreover, clarify the nature of the 'simplest' hypothesis consistent with the given data. By choosing the prior probabilities of hypotheses, it becomes possible to quantify the scientific judgment that simpler hypotheses are more likely to be correct. Bayesian analysis also shows that a hypothesis with fewer adjustable parameters intrinsically possesses an enhanced posterior probability, due to the clarity of its predictions.
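The claim that a hypothesis with fewer adjustable parameters gains posterior probability can be illustrated with a small sketch. The coin-flip setting below is a hypothetical textbook example, not the one used by the authors: a "simple" hypothesis fixes the heads probability at 0.5, while a "flexible" hypothesis leaves it free with a uniform prior (an assumption made here for illustration). The function name is likewise invented for this sketch.

```python
from math import comb


def bayes_factor_simple_vs_flexible(n, k):
    """Bayes factor for 'p = 0.5' against 'p unknown, uniform prior on [0, 1]'.

    n is the number of flips, k the number of heads. Values above 1
    favour the simple hypothesis, values below 1 the flexible one.
    """
    # Marginal likelihood of the simple hypothesis: a plain binomial term.
    evidence_simple = comb(n, k) * 0.5 ** n
    # Marginal likelihood of the flexible hypothesis: the binomial likelihood
    # integrated over a uniform prior on p, which equals 1 / (n + 1).
    evidence_flexible = 1.0 / (n + 1)
    return evidence_simple / evidence_flexible


# Data that the sharp prediction fits well favour the simpler hypothesis.
print(bayes_factor_simple_vs_flexible(20, 10))
# Data far from the sharp prediction favour the flexible hypothesis.
print(bayes_factor_simple_vs_flexible(20, 18))
```

The flexible hypothesis spreads its prior probability over many possible outcomes, so it is penalised automatically whenever the data land where the simple hypothesis predicted they would.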
Additional challenges for uncertainty analysis in river engineering
NASA Astrophysics Data System (ADS)
Berends, Koen; Warmink, Jord; Hulscher, Suzanne
2016-04-01
the proposed intervention. The implicit assumption underlying such analysis is that both models are commensurable. We hypothesize that they are commensurable only to a certain extent. In an idealised study we have demonstrated that prediction performance loss should be expected with increasingly large engineering works. When accounting for parametric uncertainty of floodplain roughness in model identification, we see uncertainty bounds for predicted effects of interventions increase with increasing intervention scale. Calibration of these types of models therefore seems to have a shelf-life, beyond which calibration no longer improves prediction. Therefore a qualification scheme for model use is required that can be linked to model validity. In this study, we characterize model use along three dimensions: extrapolation (using the model with different external drivers), extension (using the model for different output or indicators) and modification (using modified models). Such use of models is expected to have implications for the applicability of surrogate modelling for efficient uncertainty analysis as well, which is recommended for future research. Warmink, J. J.; Straatsma, M. W.; Huthoff, F.; Booij, M. J. & Hulscher, S. J. M. H. 2013. Uncertainty of design water levels due to combined bed form and vegetation roughness in the Dutch river Waal. Journal of Flood Risk Management 6, 302-318. DOI: 10.1111/jfr3.12014
NASA Astrophysics Data System (ADS)
Omura, Masaaki; Yoshida, Kenji; Kohta, Masushi; Kubo, Takabumi; Ishiguro, Toshimichi; Kobayashi, Kazuto; Hozumi, Naohiro; Yamaguchi, Tadashi
2016-07-01
To characterize skin ulcers for bacterial infection, quantitative ultrasound (QUS) parameters were estimated by multiple statistical analyses of the echo amplitude envelope based on both Weibull and generalized gamma distributions and on the ratio of mean to standard deviation of the echo amplitude envelope. Measurement objects were three rat models (noninfection, critical colonization, and infection models). Ultrasound data were acquired using a modified ultrasonic diagnosis system with a center frequency of 11 MHz. In parallel, histopathological images and a two-dimensional map of speed of sound (SoS) were observed. It was possible to detect typical tissue characteristics such as infection by focusing on the relationship of QUS parameters and to indicate the characteristic differences that were consistent with the scatterer structure. Additionally, the histopathological characteristics and SoS of noninfected and infected tissues were matched to the characteristics of QUS parameters in each rat model.
Statistical analysis from recent abundance determinations in HgMn stars
NASA Astrophysics Data System (ADS)
Ghazaryan, S.; Alecian, G.
2016-08-01
To better understand the hot chemically peculiar group of HgMn stars, we have considered a compilation of a large number of recently published data obtained for these stars from spectroscopy. We compare these data to the previous compilation by Smith. We confirm the main trends of the abundance peculiarities, namely the increasing overabundances with increasing atomic number of heavy elements, and their large spread from star to star. For all the measured elements, we have looked for correlations between abundances and effective temperature (Teff). In addition to the known correlation for Mn, some other elements are found to show some connection between their abundances and Teff. We have also checked whether multiplicity is a determining parameter for the abundance peculiarities found in these stars. A statistical analysis using a Kolmogorov-Smirnov test shows that the abundance anomalies in the atmospheres of HgMn stars do not present a significant dependence on multiplicity.
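A two-sample Kolmogorov-Smirnov comparison of the kind mentioned above can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function names are hypothetical, the p-value uses the standard large-sample (asymptotic) Kolmogorov series, and the numbers in the usage lines are invented placeholders rather than measured abundances.

```python
import math


def ks_two_sample_statistic(a, b):
    """Largest absolute gap between the two empirical distribution functions."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_asymptotic_p_value(d, n, m):
    """Large-sample approximation to the two-sided p-value."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The alternating series converges poorly here; p is effectively 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


# Invented placeholder values standing in for two groups of stars.
single = [0.1, 0.4, 0.5, 0.9, 1.2, 1.3]
binary = [0.2, 0.3, 0.6, 0.8, 1.1, 1.4]
d = ks_two_sample_statistic(single, binary)
print(d, ks_asymptotic_p_value(d, len(single), len(binary)))
```

A large p-value means the two samples are compatible with a common parent distribution, which is the sense in which a test of this type reports no significant dependence on a grouping variable.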
New Test Statistics for MANOVA/Descriptive Discriminant Analysis.
ERIC Educational Resources Information Center
Coombs, William T.; Algina, James
1996-01-01
Univariate procedures proposed by M. Brown and A. Forsythe (1974) and the multivariate procedures from D. Nel and C. van der Merwe (1986) were generalized to form five new multivariate alternatives to one-way multivariate analysis of variance (MANOVA) for use when dispersion matrices are heteroscedastic. These alternatives are evaluated for Type I…
The Patterns of Teacher Compensation. Statistical Analysis Report.
ERIC Educational Resources Information Center
Chambers, Jay; Bobbitt, Sharon A.
This report presents information regarding the patterns of variation in the salaries paid to public and private school teachers in relation to various personal and job characteristics. Specifically, the analysis examines the relationship between compensation and variables such as public/private schools, gender, race/ethnic background, school level…
Using Neural Networks for Descriptive Statistical Analysis of Educational Data.
ERIC Educational Resources Information Center
Tirri, Henry; And Others
Methodological issues of using a class of neural networks called Mixture Density Networks (MDN) for discriminant analysis are discussed. MDN models have the advantage of having a rigorous probabilistic interpretation, and they have proven to be a viable alternative as a classification procedure in discrete domains. Both classification and…
Statistical analysis of unsolicited thermal sensation complaints in commercial buildings
Federspiel, C.C.
1998-10-01
Unsolicited complaints from 23,500 occupants in 690 commercial buildings were examined with regard to absolute and relative frequency of complaints, temperatures at which thermal sensation complaints (too hot or too cold) occurred, and response times and actions. The analysis shows that thermal sensation complaints are the single most common complaint of any type and that they are the overwhelming majority of environmental complaints. The analysis indicates that thermal sensation complaints are mostly the result of poor control performance and HVAC system faults rather than inter-individual differences in preferred temperatures. The analysis also shows that the neutral temperature in summer is greater than in winter, and the difference between summer and winter neutral temperatures is smaller than the difference between the midpoints of the summer and winter ASHRAE comfort zones. On average, women complain that it is cold at a higher temperature than men, and the temperature at which men complain that it is hot is more variable than for women. Analysis of response times and actions provides information that may be useful for designing a dispatching policy, and it also demonstrates that there is potential to reduce the labor cost of HVAC maintenance by 20% by reducing the frequency of thermal sensation complaints.
Granger causality--statistical analysis under a configural perspective.
von Eye, Alexander; Wiedermann, Wolfgang; Mun, Eun-Young
2014-03-01
The concept of Granger causality can be used to examine putative causal relations between two series of scores. Based on regression models, it is asked whether one series can be considered the cause for the second series. In this article, we propose extending the pool of methods available for testing hypotheses that are compatible with Granger causation by adopting a configural perspective. This perspective allows researchers to assume that effects exist for specific categories only or for specific sectors of the data space, but not for other categories or sectors. Configural Frequency Analysis (CFA) is proposed as the method of analysis from a configural perspective. CFA base models are derived for the exploratory analysis of Granger causation. These models are specified so that they parallel the regression models used for variable-oriented analysis of hypotheses of Granger causation. An example from the development of aggression in adolescence is used. The example shows that only one pattern of change in aggressive impulses over time Granger-causes change in physical aggression against peers.
Open Access Publishing Trend Analysis: Statistics beyond the Perception
ERIC Educational Resources Information Center
Poltronieri, Elisabetta; Bravo, Elena; Curti, Moreno; Ferri, Maurizio; Mancini, Cristina
2016-01-01
Introduction: The purpose of this analysis was twofold: to track the number of open access journals acquiring impact factor, and to investigate the distribution of subject categories pertaining to these journals. As a case study, journals in which the researchers of the National Institute of Health (Istituto Superiore di Sanità) in Italy have…
Statistical analysis of geodetic networks for detecting regional events
NASA Technical Reports Server (NTRS)
Granat, Robert
2004-01-01
We present an application of hidden Markov models (HMMs) to analysis of geodetic time series in Southern California. Our model fitting method uses a regularized version of the deterministic annealing expectation-maximization algorithm to ensure that model solutions are both robust and of high quality.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems
NASA Technical Reports Server (NTRS)
He, Yuning; Davies, Misty Dawn
2014-01-01
The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
Modular reweighting software for statistical mechanical analysis of biased equilibrium data
NASA Astrophysics Data System (ADS)
Sindhikara, Daniel J.
2012-07-01
Here a simple, useful, modular approach and software suite designed for statistical reweighting and analysis of equilibrium ensembles is presented. Statistical reweighting is useful and sometimes necessary for analysis of equilibrium enhanced sampling methods, such as umbrella sampling or replica exchange, and also in experimental cases where biasing factors are explicitly known. Essentially, statistical reweighting allows extrapolation of data from one or more equilibrium ensembles to another. Here, the fundamental separable steps of statistical reweighting are broken up into modules - allowing for application to the general case and avoiding the black-box nature of some “all-inclusive” reweighting programs. Additionally, the programs included are, by design, written with few dependencies. The compilers required are either pre-installed on most systems, or freely available for download with minimal trouble. Examples of the use of this suite applied to umbrella sampling and replica exchange molecular dynamics simulations will be shown along with advice on how to apply it in the general case. New version program summary: Program title: Modular reweighting version 2; Catalogue identifier: AEJH_v2_0; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJH_v2_0.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: GNU General Public License, version 3; No. of lines in distributed program, including test data, etc.: 179 118; No. of bytes in distributed program, including test data, etc.: 8 518 178; Distribution format: tar.gz; Programming language: C++, Python 2.6+, Perl 5+; Computer: Any; Operating system: Any; RAM: 50-500 MB; Supplementary material: An updated version of the original manuscript (Comput. Phys. Commun. 182 (2011) 2227) is available; Classification: 4.13; Catalogue identifier of previous version: AEJH_v1_0; Journal reference of previous version: Comput. Phys. Commun. 182 (2011) 2227; Does the new
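The core idea of statistical reweighting, recovering an unbiased average from samples drawn under a known bias, can be sketched in a few lines. This is a generic single-ensemble illustration and not the interface of the software suite described above; the function name, argument names, and the toy numbers are assumptions made for this sketch.

```python
import math


def reweighted_average(observable, bias_energy, beta):
    """Unbiased ensemble average from samples drawn with a known bias.

    Each sample carries an observable value and the bias energy that was
    added to the potential when it was generated. Samples from the biased
    ensemble are weighted by exp(+beta * bias) to undo the bias.
    """
    log_weights = [beta * v for v in bias_energy]
    # Subtract the largest log-weight before exponentiating to avoid overflow.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / total


# Toy example: two samples, the second generated under a bias of ln(2)/beta,
# so it counts twice as much once the bias is removed.
print(reweighted_average([0.0, 1.0], [0.0, math.log(2.0)], beta=1.0))
```

Combining several biased ensembles (for example, many umbrella windows) requires additional bookkeeping of relative free energies between windows, which this single-ensemble sketch deliberately leaves out.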
An overview of the mathematical and statistical analysis component of RICIS
NASA Technical Reports Server (NTRS)
Hallum, Cecil R.
1987-01-01
Mathematical and statistical analysis components of RICIS (Research Institute for Computing and Information Systems) can be used in the following problem areas: (1) quantification and measurement of software reliability; (2) assessment of changes in software reliability over time (reliability growth); (3) analysis of software-failure data; and (4) decision logic for whether to continue or stop testing software. Other areas of interest to NASA/JSC where mathematical and statistical analysis can be successfully employed include: math modeling of physical systems, simulation, statistical data reduction, evaluation methods, optimization, algorithm development, and mathematical methods in signal processing.
Kinetic analysis of microbial respiratory response to substrate addition
NASA Astrophysics Data System (ADS)
Blagodatskaya, Evgenia; Blagodatsky, Sergey; Yuyukina, Tatayna; Kuzyakov, Yakov
2010-05-01
The heterotrophic component of CO2 emitted from soil is mainly due to the respiratory activity of soil microorganisms. Field measurements of microbial respiration can be used for estimation of the C-budget in soil, while laboratory estimation of respiration kinetics allows the elucidation of mechanisms of soil C sequestration. Physiological approaches based on 1) time-dependent or 2) substrate-dependent respiratory response of soil microorganisms decomposing organic substrates make it possible to relate the functional properties of the soil microbial community to decomposition rates of soil organic matter (SOM). We used a novel methodology combining (i) microbial growth kinetics and (ii) enzyme affinity to the substrate to show the shift in functional properties of the soil microbial community after amendments with substrates of contrasting availability. We combined the application of 14C labeled glucose as an easily available C source to soil with natural isotope labeling of old and young SOM. The possible contribution of two processes, isotopic fractionation and preferential substrate utilization, to the shifts in δ13C during SOM decomposition in soil after C3-C4 vegetation change was evaluated. Specific growth rate (µ) of soil microorganisms was estimated by fitting the parameters of the equation v(t) = A + B * exp(µ*t) to the measured CO2 evolution rate (v(t)) after glucose addition, where A is the initial rate of non-growth respiration and B is the initial rate of the growing fraction of total respiration. Maximal mineralization rate (Vmax), substrate affinity of microbial enzymes (Ks) and substrate availability (Sn) were determined by Michaelis-Menten kinetics. To study the effect of plant originated C on the δ13C signature of SOM we compared the changes in isotopic composition of different C pools in C3 soil under grassland with C3-C4 soil where the C4 plant Miscanthus giganteus was grown for 12 years on the plot after grassland. The shift in δ13C caused by planting of M. giganteus
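Fitting v(t) = A + B * exp(µ*t) to a respiration curve can be sketched without any external library by noting that, for a fixed µ, the model is linear in A and B. The code below is only an illustration under that observation: it scans a grid of trial µ values (the range and step are assumptions), solves the linear part in closed form, and keeps the best fit. The function names and the synthetic data are hypothetical and are not taken from the study.

```python
import math


def fit_linear_part(t, v, mu):
    """For a fixed mu, least-squares A and B in v = A + B * exp(mu * t)."""
    e = [math.exp(mu * ti) for ti in t]
    n = len(t)
    mean_e = sum(e) / n
    mean_v = sum(v) / n
    s_ee = sum((ei - mean_e) ** 2 for ei in e)
    s_ev = sum((ei - mean_e) * (vi - mean_v) for ei, vi in zip(e, v))
    b = s_ev / s_ee
    a = mean_v - b * mean_e
    sse = sum((a + b * ei - vi) ** 2 for ei, vi in zip(e, v))
    return a, b, sse


def fit_growth_respiration(t, v, mu_min=0.01, mu_max=1.0, steps=991):
    """Grid search over mu; returns (A, B, mu) with the smallest squared error."""
    best = None
    for k in range(steps):
        mu = mu_min + (mu_max - mu_min) * k / (steps - 1)
        a, b, sse = fit_linear_part(t, v, mu)
        if best is None or sse < best[3]:
            best = (a, b, mu, sse)
    return best[0], best[1], best[2]


# Synthetic, noise-free curve generated from known parameters.
hours = [float(h) for h in range(0, 21)]
rates = [1.0 + 0.5 * math.exp(0.2 * h) for h in hours]
print(fit_growth_respiration(hours, rates))
```

With real, noisy measurements a proper nonlinear least-squares routine with uncertainty estimates would normally be preferred; the grid search is used here only to keep the sketch dependency-free.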
Characterization of Nuclear Fuel using Multivariate Statistical Analysis
Robel, M; Kristo, M J
2007-11-27
Various combinations of reactor type and fuel composition have been characterized using principal components analysis (PCA) of the concentrations of 9 U and Pu isotopes in the fuel as a function of burnup. The use of PCA allows the reduction of the 9-dimensional data (isotopic concentrations) into a 3-dimensional approximation, giving a visual representation of the changes in nuclear fuel composition with burnup. Real-world variation in the concentrations of 234U and 236U in the fresh (unirradiated) fuel was accounted for. The effects of reprocessing were also simulated. The results suggest that, even after reprocessing, Pu isotopes can be used to determine both the type of reactor and the initial fuel composition with good discrimination. Finally, partial least squares discriminant analysis (PLSDA) was investigated as a substitute for PCA. Our results suggest that PLSDA is a better tool for this application where separation between known classes is most important.
Three dimensional graphics in the statistical analysis of scientific data
Grotch, S.L.
1986-05-01
In scientific data analysis, the two-dimensional plot has become an indispensable tool. As the scientist more commonly encounters multivariate data, three dimensional graphics will form the natural extension of these more traditional representations. There can be little doubt that as the accessibility to ever more powerful graphics tools increases, their use will expand dramatically. Having been used in routine data analysis for nearly a decade, three dimensional graphics have proved to be a powerful means for obtaining insights into data that are simply not available with traditional 2D methods. Examples of this work, taken primarily from chemistry and meteorology, are presented to illustrate a variety of 3D graphics found to be practically useful. Some approaches for improving these presentations are also highlighted.
Statistical theory and methodology for remote sensing data analysis
NASA Technical Reports Server (NTRS)
Odell, P. L.
1974-01-01
A model is developed for the evaluation of acreages (proportions) of different crop types over a geographical area using a classification approach, and methods for estimating the crop acreages are given. In estimating the acreage of a specific crop type such as wheat, it is suggested to treat the problem as a two-crop problem, wheat vs. nonwheat, since this simplifies the estimation problem considerably. The error analysis and the sample size problem are investigated for the two-crop approach. Certain numerical results for sample sizes are given for a JSC-ERTS-1 data example on wheat identification performance in Hill County, Montana and Burke County, North Dakota. Lastly, for a large area crop acreage inventory, a sampling scheme is suggested for acquiring sample data, and the problem of crop acreage estimation and the error analysis are discussed.
Practical guidance for statistical analysis of operational event data
Atwood, C.L.
1995-10-01
This report presents ways to avoid mistakes that are sometimes made in analysis of operational event data. It then gives guidance on what to do when a model is rejected, a list of standard types of models to consider, and principles for choosing one model over another. For estimating reliability, it gives advice on which failure modes to model, and moment formulas for combinations of failure modes. The issues are illustrated with many examples and case studies.
Zhu, Xiaofeng
2016-01-01
Meta-analysis of a single trait across multiple cohorts has been used to increase statistical power in genome-wide association studies (GWASs). Although hundreds of variants have been identified by GWAS, these variants only explain a small fraction of phenotypic variation. Cross-phenotype association analysis (CPASSOC) can further improve statistical power by searching for variants that contribute to multiple traits, which is often relevant to pleiotropy. In this study, we performed CPASSOC analysis on the summary statistics from the Genetic Investigation of ANthropometric Traits (GIANT) consortium using a novel method recently developed by our group. Sex-specific meta-analysis data for height, body mass index (BMI), and waist-to-hip ratio adjusted for BMI (WHRadjBMI) from the discovery phase of the GIANT consortium study were combined using CPASSOC for each trait as well as for the 3 traits together. The conventional meta-analysis results from the discovery phase data of the GIANT consortium studies were compared with those from the CPASSOC analysis. The CPASSOC analysis was able to identify 17 loci associated with anthropometric traits that were missed by conventional meta-analysis. Among these loci, 16 have been reported in the literature by including additional samples and 1 is novel. We also demonstrated that CPASSOC is able to detect pleiotropic effects when analyzing multiple traits. PMID:27701450
Measuring the Success of an Academic Development Programme: A Statistical Analysis
ERIC Educational Resources Information Center
Smith, L. C.
2009-01-01
This study uses statistical analysis to estimate the impact of first-year academic development courses in microeconomics, statistics, accountancy, and information systems, offered by the University of Cape Town's Commerce Academic Development Programme, on students' graduation performance relative to that achieved by mainstream students. The data…
A new statistic for the analysis of circular data in gamma-ray astronomy
NASA Technical Reports Server (NTRS)
Protheroe, R. J.
1985-01-01
A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.
ERIC Educational Resources Information Center
Papadimitriou, Fivos; Kidman, Gillian
2012-01-01
Certain statistical and scientometric features of articles published in the journal "International Research in Geographical and Environmental Education" (IRGEE) are examined in this paper for the period 1992-2009 by applying nonparametric statistics and Shannon's entropy (diversity) formula. The main findings of this analysis are: (a) after 2004,…
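Shannon's entropy (diversity) formula, H = -Σ p_i ln p_i over category shares p_i, can be computed as in the sketch below. The function name and the example categories are hypothetical; the natural logarithm is an assumption, since the abstract does not say which base was used.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon diversity (natural log) of a list of category labels."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Invented example: article topics in one hypothetical journal volume.
topics = ["fieldwork", "curriculum", "curriculum", "GIS", "fieldwork", "GIS"]
print(shannon_entropy(topics))
```

The value is zero when every item falls in one category and reaches ln(k) when items are spread evenly over k categories, so higher values indicate a more diverse distribution.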
ERIC Educational Resources Information Center
Jones, Andrew T.
2011-01-01
Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…
ERIC Educational Resources Information Center
Huston, Holly L.
This paper begins with a general discussion of statistical significance, effect size, and power analysis; and concludes by extending the discussion to the multivariate case (MANOVA). Historically, traditional statistical significance testing has guided researchers' thinking about the meaningfulness of their data. The use of significance testing…
Statistical correlation analysis for comparing vibration data from test and analysis
NASA Technical Reports Server (NTRS)
Butler, T. G.; Strang, R. F.; Purves, L. R.; Hershfeld, D. J.
1986-01-01
A theory was developed to compare vibration modes obtained by NASTRAN analysis with those obtained experimentally. Because many more analytical modes can be obtained than experimental modes, the analytical set was treated as expansion functions for putting both sources in comparative form. The dimensional symmetry was developed for three general cases: nonsymmetric whole model compared with a nonsymmetric whole structural test, symmetric analytical portion compared with a symmetric experimental portion, and analytical symmetric portion with a whole experimental test. The theory was coded and a statistical correlation program was installed as a utility. The theory is established with small classical structures.
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family-wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available from all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
Analysis of compressive fracture in rock using statistical techniques
Blair, S.C.
1994-12-01
Fracture of rock in compression is analyzed using a field-theory model, and the processes of crack coalescence and fracture formation and the effect of grain-scale heterogeneities on macroscopic behavior of rock are studied. The model is based on observations of fracture in laboratory compression tests, and incorporates assumptions developed using fracture mechanics analysis of rock fracture. The model represents grains as discrete sites, and uses superposition of continuum and crack-interaction stresses to create cracks at these sites. The sites are also used to introduce local heterogeneity. Clusters of cracked sites can be analyzed using percolation theory. Stress-strain curves for simulated uniaxial tests were analyzed by studying the location of cracked sites, and partitioning of strain energy for selected intervals. Results show that the model implicitly predicts both development of shear-type fracture surfaces and a strength-vs-size relation that are similar to those observed for real rocks. Results of a parameter-sensitivity analysis indicate that heterogeneity in the local stresses, attributed to the shape and loading of individual grains, has a first-order effect on strength, and that increasing local stress heterogeneity lowers compressive strength following an inverse power law. Peak strength decreased with increasing lattice size and decreasing mean site strength, and was independent of site-strength distribution. A model for rock fracture based on a nearest-neighbor algorithm for stress redistribution is also presented and used to simulate laboratory compression tests, with promising results.
Statistical methods for the forensic analysis of striated tool marks
Hoeksema, Amy Beth
2013-01-01
In forensics, fingerprints can be used to uniquely identify suspects in a crime. Similarly, a tool mark left at a crime scene can be used to identify the tool that was used. However, the current practice of identifying matching tool marks involves visual inspection of marks by forensic experts which can be a very subjective process. As a result, declared matches are often successfully challenged in court, so law enforcement agencies are particularly interested in encouraging research in more objective approaches. Our analysis is based on comparisons of profilometry data, essentially depth contours of a tool mark surface taken along a linear path. In current practice, for stronger support of a match or non-match, multiple marks are made in the lab under the same conditions by the suspect tool. We propose the use of a likelihood ratio test to analyze the difference between a sample of comparisons of lab tool marks to a field tool mark, against a sample of comparisons of two lab tool marks. Chumbley et al. (2010) point out that the angle of incidence between the tool and the marked surface can have a substantial impact on the tool mark and on the effectiveness of both manual and algorithmic matching procedures. To better address this problem, we describe how the analysis can be enhanced to model the effect of tool angle and allow for angle estimation for a tool mark left at a crime scene. With sufficient development, such methods may lead to more defensible forensic analyses.
An experimental statistical analysis of stress projection factors in BCC tantalum
Carroll, J. D.; Clark, B. G.; Buchheit, T. E.; Boyce, B. L.; Weinberger, C. R.
2013-10-01
Crystallographic slip planes in body centered cubic (BCC) metals are not fully understood. In polycrystals, there are additional confounding effects from grain interactions. This paper describes an experimental investigation into the effects of grain orientation and neighbors on elastic–plastic strain accumulation. In situ strain fields were obtained by performing digital image correlation (DIC) on images from a scanning electron microscope (SEM) and from optical microscopy. These strain fields were statistically compared to the grain structure measured by electron backscatter diffraction (EBSD). Spearman rank correlations were performed between effective strain and six microstructural factors including four Schmid factors associated with the <111> slip direction, grain size, and Taylor factor. Modest correlations (~10%) were found for a polycrystal tension specimen. The influence of grain neighbors was first investigated by re-correlating the polycrystal data using clusters of similarly-oriented grains identified by low grain boundary misorientation angles. Second, the experiment was repeated on a tantalum oligocrystal, with through-thickness grains. Much larger correlation coefficients were found in this multicrystal due to the dearth of grain neighbors and subsurface microstructure. Finally, a slip trace analysis indicated (in agreement with statistical correlations) that macroscopic slip often occurs on {110}<111> slip systems and sometimes by pencil glide on maximum resolved shear stress planes (MRSSP). These results suggest that Schmid factors are suitable for room temperature, quasistatic, tensile deformation in tantalum as long as grain neighbor effects are accounted for.
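The Spearman rank correlation used above to relate strain to microstructural factors is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below illustrates that definition only; the function names and the example values are hypothetical and do not come from the experiment.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented per-grain values: a factor and the strain measured in each grain.
factor = [0.31, 0.42, 0.45, 0.38, 0.49]
strain = [0.010, 0.018, 0.015, 0.012, 0.030]
print(spearman_rho(factor, strain))
```

Because only ranks enter the calculation, the coefficient is insensitive to any monotonic rescaling of either variable, which makes it a reasonable choice when the relation between a factor and strain need not be linear.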
Statistical analysis of the ambiguities in the asteroid period determinations
NASA Astrophysics Data System (ADS)
Butkiewicz, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.
2014-07-01
A synodic period of an asteroid can be derived from its lightcurve by standard methods like Fourier-series fitting. A problem appears when results of observations are based on less than a full coverage of a lightcurve and/or high level of noise. Also, long gaps between individual lightcurves create an ambiguity in the cycle count which leads to aliases. Excluding binary systems and objects with non-principal-axis rotation, the rotation period is usually identical to the period of the second Fourier harmonic of the lightcurve. There are cases, however, where it may be connected with the 1st, 3rd, or 4th harmonic and it is difficult to choose among them when searching for the period. To help remove such uncertainties we analysed asteroid lightcurves for a range of shapes and observing/illuminating geometries. We simulated them using a modified internal code from the ISAM service (Marciniak et al. 2012, A&A 545, A131). In our computations, shapes of asteroids were modeled as Gaussian random spheres (Muinonen 1998, A&A, 332, 1087). A combination of Lommel-Seeliger and Lambert scattering laws was assumed. For each of the 100 shapes, we randomly selected 1000 positions of the spin axis, systematically changing the solar phase angle with a step of 5°. For each lightcurve, we determined its peak-to-peak amplitude, fitted the 6th-order Fourier series and derived the amplitudes of its harmonics. Instead of the number of the lightcurve extrema, which in many cases is subjective, we characterized each lightcurve by the order of the highest-amplitude Fourier harmonic. The goal of our simulations was to derive statistically significant conclusions (based on the underlying assumptions) about the dominance of different harmonics in the lightcurves of the specified amplitude and phase angle. The results, presented in the Figure, can be used in individual cases to estimate the probability that the obtained lightcurve is dominated by a specified Fourier harmonic. Some of the
Statistical analysis of aerosol species, trace gasses, and meteorology in Chicago.
Binaku, Katrina; O'Brien, Timothy; Schmeling, Martina; Fosco, Tinamarie
2013-09-01
Both canonical correlation analysis (CCA) and principal component analysis (PCA) were applied to atmospheric aerosol and trace gas concentrations and meteorological data collected in Chicago during the summer months of 2002, 2003, and 2004. Concentrations of ammonium, calcium, nitrate, sulfate, and oxalate particulate matter, as well as the meteorological parameters temperature, wind speed, wind direction, and humidity, were subjected to CCA and PCA. Ozone and nitrogen oxide mixing ratios were also included in the data set. The purpose of the statistical analysis was to determine the extent of existing linear relationship(s), or lack thereof, between meteorological parameters and pollutant concentrations, in addition to reducing the dimensionality of the original data to determine sources of pollutants. In CCA, the first three canonical variate pairs derived were statistically significant at the 0.05 level. Canonical correlation between the first canonical variate pair was 0.821, while correlations of the second and third canonical variate pairs were 0.562 and 0.461, respectively. The first canonical variate pair indicated that increasing temperatures resulted in high ozone mixing ratios, while the second canonical variate pair showed wind speed and humidity's influence on local ammonium concentrations. No new information was uncovered in the third variate pair. Canonical loadings were also interpreted for information regarding relationships between data sets. Four principal components (PCs), expressing 77.0% of original data variance, were derived in PCA. Interpretation of PCs suggested significant production and/or transport of secondary aerosols in the region (PC1). Furthermore, photochemical production of ozone and wind speed's influence on pollutants were expressed (PC2) along with overall measure of local meteorology (PC3). In summary, CCA and PCA results combined were successful in uncovering linear relationships between meteorology and air pollutants in Chicago and
Statistical Analysis Strategies for Association Studies Involving Rare Variants
Bansal, Vikas; Libiger, Ondrej; Torkamani, Ali; Schork, Nicholas J.
2013-01-01
The limitations of genome-wide association (GWA) studies that focus on the phenotypic influence of common genetic variants have motivated human geneticists to consider the contribution of rare variants to phenotypic expression. The increasing availability of high-throughput sequencing technology has enabled studies of rare variants, but will not be sufficient for their success since appropriate analytical methods are also needed. We consider data analysis approaches to testing associations between a phenotype and collections of rare variants in a defined genomic region or set of regions. Ultimately, although a wide variety of analytical approaches exist, more work is needed to refine them and determine their properties and power in different contexts. PMID:20940738
STATISTICAL ANALYSIS OF THE VERY QUIET SUN MAGNETISM
Martinez Gonzalez, M. J.; Manso Sainz, R.; Asensio Ramos, A.
2010-03-10
The behavior of the observed polarization amplitudes with spatial resolution is a strong constraint on the nature and organization of solar magnetic fields below the resolution limit. We study the polarization of the very quiet Sun at different spatial resolutions using ground- and space-based observations. It is shown that 80% of the observed polarization signals do not change with spatial resolution, suggesting that, observationally, the very quiet Sun magnetism remains the same despite the high spatial resolution of space-based observations. Our analysis also reveals a cascade of spatial scales for the magnetic field within the resolution element. It is manifest that the Zeeman effect is sensitive to the microturbulent field usually associated with Hanle diagnostics. This demonstrates that Zeeman and Hanle studies show complementary perspectives of the same magnetism.
Statistical Analysis of Factors Affecting Child Mortality in Pakistan.
Ahmed, Zoya; Kamal, Asifa; Kamal, Asma
2016-06-01
Child mortality is a composite indicator reflecting the economic, social, and environmental conditions of a country, as well as its healthcare services and their delivery. Globally, Pakistan has the third highest burden of fetal, maternal, and child mortality. Factors affecting child mortality in Pakistan are investigated by using Binary Logistic Regression Analysis. Region, education of mother, birth order, preceding birth interval (the period between the previous child birth and the index child birth), size of child at birth, breastfeeding, and family size were found to be significantly associated with child mortality in Pakistan. Child mortality decreased as level of mother's education, preceding birth interval, size of child at birth, and family size increased. Child mortality was found to be significantly higher in Balochistan as compared to other regions. Child mortality was low for low birth orders. Child survival was significantly higher for children who were breastfed as compared to those who were not.
Statistical analysis of the temporal properties of BL Lacertae
NASA Astrophysics Data System (ADS)
Guo, Yu Cheng; Hu, Shao Ming; Li, Yu Tong; Chen, Xu
2016-08-01
A comprehensive temporal analysis has been performed on optical light curves of BL Lacertae in the B, V and R bands. The light curves were denoised by Gaussian smoothing and decomposed into individual flares using an exponential profile. The asymmetry, duration, peak flux and equivalent energy output of flares were measured and the frequency distributions presented. Most optical flares of BL Lacertae are highly symmetric, with a weak tendency towards gradual rises and rapid decays. The distribution of flare durations is not random, but consistent with a gamma distribution. Peak fluxes and energy outputs of flares all follow a log-normal distribution. A positive correlation is detected between flare durations and peak fluxes. The temporal properties of BL Lacertae provide evidence of the stochastic magnetohydrodynamic process in the accretion disc and jet. The results presented here can serve as constraints on physical models attempting to interpret blazar variations.
Spectral reflectance of surface soils - A statistical analysis
NASA Technical Reports Server (NTRS)
Crouse, K. R.; Henninger, D. L.; Thompson, D. R.
1983-01-01
The relationship of the physical and chemical properties of soils to their spectral reflectance as measured at six wavebands of Thematic Mapper (TM) aboard NASA's Landsat-4 satellite was examined. The results of performing regressions of over 20 soil properties on the six TM bands indicated that organic matter, water, clay, cation exchange capacity, and calcium were the properties most readily predicted from TM data. The middle infrared bands, bands 5 and 7, were the best bands for predicting soil properties, and the near infrared band, band 4, was nearly as good. Clustering 234 soil samples on the TM bands and characterizing the clusters on the basis of soil properties revealed several clear relationships between properties and reflectance. Discriminant analysis found organic matter, fine sand, base saturation, sand, extractable acidity, and water to be significant in discriminating among clusters.
Fine needle aspiration biopsy diagnosis of mucoepidermoid carcinoma. Statistical analysis.
Cohen, M B; Fisher, P E; Holly, E A; Ljung, B M; Löwhagen, T; Bottles, K
1990-01-01
Fine needle aspiration (FNA) biopsy is an increasingly popular method for the evaluation of salivary gland tumors. Of the common salivary gland tumors, mucoepidermoid carcinoma is probably the most difficult to diagnose accurately by this means. A series of 96 FNA biopsy specimens of salivary gland masses, including 34 mucoepidermoid carcinomas, 51 other benign and malignant neoplasms, 7 nonneoplastic lesions and 4 normal salivary glands, were analyzed in order to identify the most useful criteria for diagnosing mucoepidermoid carcinoma. Thirteen cytologic criteria were evaluated in the FNA specimens, and a stepwise logistic regression analysis was performed. The three cytologic features selected as most predictive of mucoepidermoid carcinoma were intermediate cells, squamous cells and overlapping epithelial groups. Using these three features together, the sensitivity and specificity of accurately diagnosing mucoepidermoid carcinoma were 97% and 100%, respectively.
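As a worked illustration of the sensitivity and specificity figures quoted above, the sketch below computes both from a 2x2 confusion table. The abstract does not report the individual counts; the counts used here are an assumption, chosen only so that they are consistent with 34 carcinomas, 62 other specimens (51 + 7 + 4), and the rounded percentages reported.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased specimens correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-diseased specimens correctly identified."""
    return true_neg / (true_neg + false_pos)


# ASSUMED counts (not given in the abstract): 33 of 34 carcinomas detected,
# and none of the 62 other specimens misclassified as carcinoma.
sens = sensitivity(true_pos=33, false_neg=1)
spec = specificity(true_neg=62, false_pos=0)

print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Under these assumed counts a single missed carcinoma already accounts for the gap between 97% and 100%, which shows how coarse such percentages are for a series of this size.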
Ordinary chondrites - Multivariate statistical analysis of trace element contents
NASA Technical Reports Server (NTRS)
Lipschutz, Michael E.; Samuels, Stephen M.
1991-01-01
The contents of mobile trace elements (Co, Au, Sb, Ga, Se, Rb, Cs, Te, Bi, Ag, In, Tl, Zn, and Cd) in Antarctic and non-Antarctic populations of H4-6 and L4-6 chondrites were compared using standard multivariate discriminant functions borrowed from linear discriminant analysis and logistic regression. A nonstandard randomization-simulation method was developed, making it possible to carry out probability assignments on a distribution-free basis. Compositional differences were found both between the Antarctic and non-Antarctic H4-6 chondrite populations and between two L4-6 chondrite populations. It is shown that, for various types of meteorites (in particular, for the H4-6 chondrites), the Antarctic/non-Antarctic compositional difference is due to preterrestrial differences in the genesis of their parent materials.
Statistical Analysis of Temple Orientation in Ancient India
NASA Astrophysics Data System (ADS)
Aller, Alba; Belmonte, Juan Antonio
2015-05-01
The great diversity of religions that have been followed in India for over 3000 years is the reason why there are hundreds of temples built to worship dozens of different divinities. In this work, more than one hundred temples geographically distributed over the whole Indian land have been analyzed, obtaining remarkable results. For this purpose, a deep analysis of the main deities who are worshipped in each of them, as well as of the different dynasties (or cultures) who built them has also been conducted. As a result, we have found that the main axes of the temples dedicated to Shiva seem to be oriented to the east cardinal point while those temples dedicated to Vishnu would be oriented to both the east and west cardinal points. To explain these cardinal directions we propose to look back to the origins of Hinduism. Besides these cardinal orientations, clear solar orientations have also been found, especially at the equinoctial declination.
Statistical Analysis of Acoustic Wave Parameters Near Solar Active Regions
NASA Astrophysics Data System (ADS)
Rabello-Soares, M. Cristina; Bogart, Richard S.; Scherrer, Philip H.
2016-08-01
In order to quantify the influence of magnetic fields on acoustic mode parameters and flows in and around active regions, we analyze the differences in the parameters in magnetically quiet regions nearby an active region (which we call “nearby regions”), compared with those of quiet regions at the same disk locations for which there are no neighboring active regions. We also compare the mode parameters in active regions with those in comparably located quiet regions. Our analysis is based on ring-diagram analysis of all active regions observed by the Helioseismic and Magnetic Imager (HMI) during almost five years. We find that the frequency at which the mode amplitude changes from attenuation to amplification in the quiet nearby regions is around 4.2 mHz, in contrast to the active regions, for which it is about 5.1 mHz. This amplitude enhancement (the “acoustic halo effect”) is as large as that observed in the active regions, and has a very weak dependence on the wave propagation direction. The mode energy difference in nearby regions also changes from a deficit to an excess at around 4.2 mHz, but averages to zero over all modes. The frequency difference in nearby regions increases with increasing frequency until a point at which the frequency shifts turn over sharply, as in active regions. However, this turnover occurs around 4.9 mHz, which is significantly below the acoustic cutoff frequency. Inverting the horizontal flow parameters in the direction of the neighboring active regions, we find flows that are consistent with a model of the thermal energy flow being blocked directly below the active region.
Multiple outcomes are often measured on each experimental unit in toxicology experiments. These multiple observations typically imply the existence of correlation between endpoints, and a statistical analysis that incorporates it may result in improved inference. When both disc...
Shokes, T.; Einerson, J.
2007-07-01
One goal of characterizing, processing, and shipping waste to the Waste Isolation Pilot Plant (WIPP) is to make all activities as efficient as possible. Data management and repetitive calculations are a critical part of the process that can be automated, thereby increasing the accuracy and rate at which work is completed and reducing costs. This paper presents the tools developed to automate statistical analysis and other calculations required by the WIPP Hazardous Waste Facility Permit (HWFP). Statistical analyses are performed on the analytical results on gas samples from the headspace of waste containers and solid samples from the core of the waste container. The calculations include determining the number of samples, test for the shape of the distribution of the analytical results, mean, standard deviation, upper 90-percent confidence limit of the mean, and the minimum required Waste Acceptance Plan (WAP) sample size. The input data for these calculations are from the batch data reports for headspace gas analytical results and solids analysis, which must also be obtained and collated for proper use. The most challenging component of the statistical analysis, if performed manually, is the determination of the distribution shape; therefore, the distribution testing is typically performed using a certified software tool. All other calculations can be completed manually, with a spreadsheet, custom developed software, and/or certified software tool. Out of the options available, manually performing the calculations or using a spreadsheet are the least desirable. These methods rely heavily on the availability of an expert, such as a statistician, to perform the calculation. These methods are also more open to human error such as transcription or 'cut and paste' errors. A SAS program is in the process of being developed to perform the calculations. Due to the potential size of the data input files and the need to archive the data in an accessible format, the SAS
New Statistical Methods for the Analysis of the Cratering on Venus
NASA Astrophysics Data System (ADS)
Xie, M.; Smrekar, S. E.; Handcock, M. S.
2014-12-01
The sparse crater population (~1000 craters) on Venus is the most important clue for determining the planet's surface age and aids in understanding its geologic history. What processes (volcanism, tectonism, weathering, etc.) modify the total impact crater population? Are the processes regional or global in occurrence? The heated debate on these questions points to the need for better approaches. We present new statistical methods for the analysis of the crater locations and characteristics. Specifically: 1) We produce a map of crater density and the proportion of no halo craters (inferred to be modified) by using generalized additive models, and smoothing splines with a spherical spline basis set. Based on this map, we are able to predict the probability that a crater has no halo given that there is a crater at that point. We also obtain a continuous representation of the ratio of craters with no halo as a function of crater density. This approach allows us to look for regions that appear to have experienced more or less modification, and are thus potentially older or younger. 2) We examine the randomness or clustering of distributions of craters by type (e.g. dark floored, intermediate). For example, for dark floored craters we consider two hypotheses: i) the dark floored craters are randomly distributed on the surface; ii) the dark floored craters are random given the locations of the crater population. Instead of only using a single measure such as average nearest neighbor distance, we use the probability density function of these distances, and compare it to complete spatial randomness to get the relative probability density function. This function gives us a clearer picture of how and where the nearest neighbor distances differ from complete spatial randomness. We also conduct statistical tests of these hypotheses. Confidence intervals with specified global coverage are constructed. Software to reproduce the methods is available in the open source statistics
Statistical analysis of low-voltage EDS spectrum images
Anderson, I.M.
1998-03-01
The benefits of using low (≤5 kV) operating voltages for energy-dispersive X-ray spectrometry (EDS) of bulk specimens have been explored only during the last few years. This paper couples low-voltage EDS with two other emerging areas of characterization: spectrum imaging of a computer chip manufactured by a major semiconductor company. Data acquisition was performed with a Philips XL30-FEG SEM operated at 4 kV and equipped with an Oxford super-ATW detector and XP3 pulse processor. The specimen was normal to the electron beam and the take-off angle for acquisition was 35°. The microscope was operated with a 150 µm diameter final aperture at spot size 3, which yielded an X-ray count rate of ~2,000 s⁻¹. EDS spectrum images were acquired as Adobe Photoshop files with the 4pi plug-in module. (The spectrum images could also be stored as NIH Image files, but the raw data are automatically rescaled as maximum-contrast (0-255) 8-bit TIFF images, even at 16-bit resolution, which poses an inconvenience for quantitative analysis.) The 4pi plug-in module is designed for EDS X-ray mapping and allows simultaneous acquisition of maps from 48 elements plus an SEM image. The spectrum image was acquired by re-defining the energy intervals of 48 elements to form a series of contiguous 20 eV windows from 1.25 kV to 2.19 kV. A spectrum image of 450 x 344 pixels was acquired from the specimen with a sampling density of 50 nm/pixel and a dwell time of 0.25 live seconds per pixel, for a total acquisition time of ~14 h. The binary data files were imported into Mathematica for analysis with software developed by the author at Oak Ridge National Laboratory. A 400 x 300 pixel section of the original image was analyzed. MSA required ~185 Mbytes of memory and ~18 h of CPU time on a 300 MHz Power Macintosh 9600.
Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets
Sandstrom, Mary M.; Brown, Geoffrey W.; Preston, Daniel N.; Pollard, Colin J.; Warner, Kirstin F.; Sorensen, Daniel N.; Remmers, Daniel L.; Phillips, Jason J.; Shelley, Timothy J.; Reyes, Jose A.; Hsu, Peter C.; Reynolds, John G.
2015-10-30
The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small-Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes to assess potential variability when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment as well as the use of different instruments that are also of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.
Wheat signature modeling and analysis for improved training statistics
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Malila, W. A.; Cicone, R. C.; Gleason, J. M.
1976-01-01
The author has identified the following significant results. The spectral, spatial, and temporal characteristics of wheat and other signatures in LANDSAT multispectral scanner data were examined through empirical analysis and simulation. Irrigation patterns varied widely within Kansas; 88 percent of wheat acreage in Finney was irrigated and 24 percent in Morton, as opposed to less than 3 percent for western 2/3's of the State. The irrigation practice was definitely correlated with the observed spectral response; wheat variety differences produced observable spectral differences due to leaf coloration and different dates of maturation. Between-field differences were generally greater than within-field differences, and boundary pixels produced spectral features distinct from those within field centers. Multiclass boundary pixels contributed much of the observed bias in proportion estimates. The variability between signatures obtained by different draws of training data decreased as the sample size became larger; also, the resulting signatures became more robust and the particular decision threshold value became less important.
Performance analysis of the Alliant FX/8 multiprocessor using statistical clustering
NASA Technical Reports Server (NTRS)
Dimpsey, Robert Tod
1988-01-01
Results for two distinct, real, scientific workloads executed on an Alliant FX/8 are discussed. A combination of user concurrency and system overhead measurements was taken for both workloads. Preliminary analysis shows that the first sampled workload is comprised of consistently high user concurrency, low system overhead, and little paging. The second sample has much less user concurrency, but significant paging and system overhead. Statistical cluster analysis is used to extract a state transition model to jointly characterize user concurrency and system overhead. A skewness factor is introduced and used to bring out the effects of unbalanced clustering when determining states with important transitions. The results from the models show that during the collection of the first sample, the system was operating in states of high user concurrency approximately 75 percent of the time. The second workload sample shows the system in high user concurrency states only 26 percent of the time. In addition, it is ascertained that high system overhead is usually accompanied by low user concurrency. The analysis also shows a high predictability of system behavior for both workloads.
Statistical analysis of molecular nanotemplate driven DNA adsorption on graphite.
Dubrovin, E V; Speller, S; Yaminsky, I V
2014-12-30
In this work, we have studied the conformation of DNA molecules aligned on the nanotemplates of octadecylamine, stearyl alcohol, and stearic acid on highly oriented pyrolytic graphite (HOPG). For this purpose, fluctuations of contours of adsorbed biopolymers obtained from atomic force microscopy (AFM) images were analyzed using the wormlike chain model. Moreover, the conformations of adsorbed biopolymer molecules were characterized by the analysis of the scaling exponent ν, which relates the mean squared end-to-end distance and contour length of the polymer. During adsorption on octadecylamine and stearyl alcohol nanotemplates, DNA forms straight segments, which order along crystallographic axes of graphite. In this case, the conformation of DNA molecules can be described using two different length scales. On a large length scale (at contour lengths l > 200-400 nm), aligned DNA molecules have either 2D compact globule or partially relaxed 2D conformation, whereas on a short length scale (at l ≤ 200-400 nm) their conformation is close to that of rigid rods. The latter type of conformation can be also assigned to DNA adsorbed on a stearic acid nanotemplate. The different conformation of DNA molecules observed on the studied monolayers is connected with the different DNA-nanotemplate interactions associated with the nature of the functional group of the alkane derivative in the nanotemplate (amine, alcohol, or acid). The persistence length of λ-DNA adsorbed on octadecylamine nanotemplates is 31 ± 2 nm indicating the loss of DNA rigidity in comparison with its native state. Similar values of the persistence length (34 ± 2 nm) obtained for 24-times shorter DNA molecules adsorbed on an octadecylamine nanotemplate demonstrate that this rigidity change does not depend on biopolymer length. Possible reasons for the reduction of DNA persistence length are discussed in view of the internal DNA structure and DNA-surface interaction.
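The scaling exponent ν mentioned above relates the mean squared end-to-end distance to the contour length as ⟨R²⟩ ∝ l^(2ν), so it can be estimated as half the slope of a log-log fit. The sketch below is a minimal illustration under that assumption; the function name and the synthetic input values are hypothetical and are not measurements from the study.

```python
import math


def fit_scaling_exponent(contour_lengths, mean_sq_end_to_end):
    """Estimate nu from <R^2> ~ l^(2*nu) by least squares in log-log space."""
    xs = [math.log(l) for l in contour_lengths]
    ys = [math.log(r2) for r2 in mean_sq_end_to_end]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope / 2  # the slope equals 2*nu


# Synthetic check 1: a rigid rod has R = l, so <R^2> = l^2 and nu = 1.
lengths_nm = [50.0, 100.0, 150.0, 200.0]
nu_rod = fit_scaling_exponent(lengths_nm, [l ** 2 for l in lengths_nm])

# Synthetic check 2: <R^2> proportional to l gives nu = 0.5.
nu_half = fit_scaling_exponent(lengths_nm, [3.0 * l for l in lengths_nm])
```

Fitting separately below and above a crossover length would be one way to expose the two length scales described in the abstract, although the abstract does not state how the authors performed the fit.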
Processing and statistical analysis of soil-root images
NASA Astrophysics Data System (ADS)
Razavi, Bahar S.; Hoang, Duyen; Kuzyakov, Yakov
2016-04-01
The importance of hotspots such as the rhizosphere, the small soil volume that surrounds and is influenced by plant roots, calls for spatially explicit methods to visualize the distribution of microbial activities in this active site (Kuzyakov and Blagodatskaya, 2015). The zymography technique has previously been adapted to visualize the spatial dynamics of enzyme activities in the rhizosphere (Spohn and Kuzyakov, 2014). Following further development of soil zymography to obtain a higher resolution of enzyme activities, we aimed to 1) quantify the images and 2) determine whether the pattern (e.g. distribution of hotspots in space) is clumped (aggregated) or regular (dispersed). To this end, we incubated soil-filled rhizoboxes with maize (Zea mays L.) and without maize (control box) for two weeks. In situ soil zymography was applied to visualize the enzymatic activity of β-glucosidase and phosphatase at the soil-root interface. The spatial resolution of fluorescent images was improved by direct application of a substrate-saturated membrane to the soil-root system. Furthermore, we applied "spatial point pattern analysis" to classify the pattern as clumped or regular. Our results demonstrated that the distribution of hotspots in the rhizosphere is clumped (aggregated) compared to the control box without a plant, which showed a regular (dispersed) pattern. These patterns were similar in all three replicates and for both enzymes. We conclude that improved zymography is a promising in situ technique to identify, analyze, visualize and quantify the spatial distribution of enzyme activities in the rhizosphere. Moreover, such different patterns should be considered in assessments and modeling of rhizosphere extension and the corresponding effects on soil properties and functions. Key words: rhizosphere, spatial point pattern, enzyme activity, zymography, maize.
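One simple way to tell a clumped pattern from a regular one is the Clark-Evans nearest-neighbour index, sketched below. This is a swapped-in illustration: the abstract does not say which point-pattern statistic the authors used, the sketch applies no edge correction, and the hotspot coordinates are synthetic.

```python
import math


def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    return total / len(points)


def clark_evans_index(points, area):
    """Observed mean NN distance divided by its expectation under
    complete spatial randomness, 1 / (2 * sqrt(density)).
    R < 1 suggests clumping, R > 1 suggests regularity."""
    density = len(points) / area
    expected = 1.0 / (2.0 * math.sqrt(density))
    return mean_nearest_neighbor_distance(points) / expected


# Synthetic hotspots in a 5 x 5 window (illustration only).
regular = [(i + 0.5, j + 0.5) for i in range(5) for j in range(5)]
clumped = [(2.5 + 0.01 * i, 2.5 + 0.01 * j) for i in range(5) for j in range(5)]

r_regular = clark_evans_index(regular, area=25.0)
r_clumped = clark_evans_index(clumped, area=25.0)
```

A single index like this summarizes only the shortest distances; distance-function summaries are usually more informative when the pattern varies across scales.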
R: A Software Environment for Comprehensive Statistical Analysis of Astronomical Data
NASA Astrophysics Data System (ADS)
Feigelson, E. D.
2012-09-01
R is the largest public domain software language for statistical analysis of data. Together with CRAN, its rapidly growing collection of >3000 add-on specialized packages, it implements around 60,000 statistical functionalities in a cohesive software environment. Extensive graphical capabilities and interfaces with other programming languages are also available. The scope and language of R/CRAN are briefly described, along with efforts to promulgate its use in astronomy. R can become an important tool for advanced statistical analysis of astronomical data.
Convertino, Matteo; Mangoubi, Rami S.; Linkov, Igor; Lowry, Nathan C.; Desai, Mukund
2012-01-01
Background. The quantification of species-richness and species-turnover is essential to effective monitoring of ecosystems. Wetland ecosystems are particularly in need of such monitoring due to their sensitivity to rainfall, water management and other external factors that affect hydrology, soil, and species patterns. A key challenge for environmental scientists is determining the linkage between natural and human stressors, and the effect of that linkage at the species level in space and time. We propose pixel intensity based Shannon entropy for estimating species-richness, and introduce a method based on statistical wavelet multiresolution texture analysis to quantitatively assess interseasonal and interannual species turnover. Methodology/Principal Findings. We model satellite images of regions of interest as textures. We define a texture in an image as a spatial domain where the variations in pixel intensity across the image are both stochastic and multiscale. To compare two textures quantitatively, we first obtain a multiresolution wavelet decomposition of each. Either an appropriate probability density function (pdf) model for the coefficients at each subband is selected, and its parameters estimated, or a non-parametric approach using histograms is adopted. We choose the former, where the wavelet coefficients of the multiresolution decomposition at each subband are modeled as samples from the generalized Gaussian pdf. We then obtain the joint pdf for the coefficients for all subbands, assuming independence across subbands; an approximation that simplifies the computational burden significantly without sacrificing the ability to statistically distinguish textures. We measure the difference between two textures' representative pdf's via the Kullback-Leibler divergence (KL). Species turnover, or diversity, is estimated using both this KL divergence and the difference in Shannon entropy. Additionally, we predict species richness, or diversity, based on the
NASA Technical Reports Server (NTRS)
Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.
2011-01-01
Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.
Methods of learning in statistical education: Design and analysis of a randomized trial
NASA Astrophysics Data System (ADS)
Boyd, Felicity Turner
Background. Recent psychological and technological advances suggest that active learning may enhance understanding and retention of statistical principles. A randomized trial was designed to evaluate the addition of innovative instructional methods within didactic biostatistics courses for public health professionals. Aims. The primary objectives were to evaluate and compare the addition of two active learning methods (cooperative and internet) on students' performance; assess their impact on performance after adjusting for differences in students' learning style; and examine the influence of learning style on trial participation. Methods. Consenting students enrolled in a graduate introductory biostatistics course were randomized to cooperative learning, internet learning, or control after completing a pretest survey. The cooperative learning group participated in eight small group active learning sessions on key statistical concepts, while the internet learning group accessed interactive mini-applications on the same concepts. Controls received no intervention. Students completed evaluations after each session and a post-test survey. Study outcome was performance quantified by examination scores. Intervention effects were analyzed by generalized linear models using intent-to-treat analysis and marginal structural models accounting for reported participation. Results. Of 376 enrolled students, 265 (70%) consented to randomization; 69, 100, and 96 students were randomized to the cooperative, internet, and control groups, respectively. Intent-to-treat analysis showed no differences between study groups; however, 51% of students in the intervention groups had dropped out after the second session. After accounting for reported participation, expected examination scores were 2.6 points higher (of 100 points) after completing one cooperative learning session (95% CI: 0.3, 4.9) and 2.4 points higher after one internet learning session (95% CI: 0.0, 4.7), versus
Statistical Analysis of Large Simulated Yield Datasets for Studying Climate Effects
NASA Technical Reports Server (NTRS)
Makowski, David; Asseng, Senthold; Ewert, Frank; Bassu, Simona; Durand, Jean-Louis; Martre, Pierre; Adam, Myriam; Aggarwal, Pramod K.; Angulo, Carlos; Baron, Chritian; Basso, Bruno; Bertuzzi, Patrick; Biemath, Christian; Boogaard, Hendrik; Boote, Kenneth J.; Brisson, Nadine; Cammarano, Davide; Challinor, Andrew J.; Conijn, Sjakk J. G.; Corbeels, Marc; Deryng, Delphine; De Sanctis, Giacomo; Doltra, Jordi; Gayler, Sebastian; Goldberg, Richard A.; Grassini, Patricio; Hatfield, Jerry L.; Heng, Lee; Hoek, Steven; Hooker, Josh; Hunt, Tony L. A.; Ingwersen, Joachim; Izaurralde, Cesar; Jongschaap, Raymond E. E.; Rosenzweig, Cynthia
2015-01-01
Many studies have been carried out during the last decade to study the effect of climate change on crop yields and other key crop characteristics. In these studies, one or several crop models were used to simulate crop growth and development for different climate scenarios that correspond to different projections of atmospheric CO2 concentration, temperature, and rainfall changes (Semenov et al., 1996; Tubiello and Ewert, 2002; White et al., 2011). The Agricultural Model Intercomparison and Improvement Project (AgMIP; Rosenzweig et al., 2013) builds on these studies with the goal of using an ensemble of multiple crop models in order to assess effects of climate change scenarios for several crops in contrasting environments. These studies generate large datasets, including thousands of simulated crop yield data. They include series of yield values obtained by combining several crop models with different climate scenarios that are defined by several climatic variables (temperature, CO2, rainfall, etc.). Such datasets potentially provide useful information on the possible effects of different climate change scenarios on crop yields. However, it is sometimes difficult to analyze these datasets and to summarize them in a useful way due to their structural complexity; simulated yield data can differ among contrasting climate scenarios, sites, and crop models. Another issue is that it is not straightforward to extrapolate the results obtained for the scenarios to alternative climate change scenarios not initially included in the simulation protocols. Additional dynamic crop model simulations for new climate change scenarios are an option but this approach is costly, especially when a large number of crop models are used to generate the simulated data, as in AgMIP. Statistical models have been used to analyze responses of measured yield data to climate variables in past studies (Lobell et al., 2011), but the use of a statistical model to analyze yields simulated by complex
ROOT — A C++ framework for petabyte data storage, statistical analysis and visualization
NASA Astrophysics Data System (ADS)
Antcheva, I.; Ballintijn, M.; Bellenot, B.; Biskup, M.; Brun, R.; Buncic, N.; Canal, Ph.; Casadei, D.; Couet, O.; Fine, V.; Franco, L.; Ganis, G.; Gheata, A.; Maline, D. Gonzalez; Goto, M.; Iwaszkiewicz, J.; Kreshuk, A.; Segura, D. Marcos; Maunder, R.; Moneta, L.; Naumann, A.; Offermann, E.; Onuchin, V.; Panacek, S.; Rademakers, F.; Russo, P.; Tadel, M.
2009-12-01
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, the RooFit package allows the user to perform complex data modeling and fitting while the RooStats library provides abstractions and implementations for advanced statistical tools. Multivariate classification methods based on machine learning techniques are available via the TMVA package. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like Postscript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks (e.g. data mining in HEP) by using PROOF, which will take care of optimally
NASA Astrophysics Data System (ADS)
Plaschke, F.; Glassmeier, K.-H.; Constantinescu, O. D.; Mann, I. R.; Milling, D. K.; Motschmann, U.; Rae, I. J.
2008-11-01
In this paper we introduce the field line resonance detector (FLRD), a wave telescope technique which has been specially adapted to estimate the spectral energy density of field line resonance (FLR) phase structures in a superposed wave field. The field line resonance detector is able to detect and correctly characterize several superposed FLR structures of a wave field and therefore constitutes a new and powerful tool in ULF pulsation studies. In our work we derive the technique from the classical wave telescope beamformer and present a statistical analysis of one year of ground based magnetometer data from the Canadian magnetometer network CANOPUS, now known as CARISMA. The statistical analysis shows that the FLRD is capable of detecting and characterizing superposed or hidden FLR structures in most of the detected ULF pulsation events; the one year statistical database is therefore extraordinarily comprehensive. The results of this analysis confirm the results of previous FLR characterizations and furthermore allow a detailed generalized dispersion analysis of FLRs.
Liu, Na; Li, Jun; Li, Bao-Guo
2014-11-01
The study of quality control of Chinese medicine has always been both a hot topic and a difficulty in the development of traditional Chinese medicine (TCM), and it is also one of the key problems restricting the modernization and internationalization of Chinese medicine. Multivariate statistical analysis is an analytical method well suited to analyzing the characteristics of TCM, and it has been widely used in the study of quality control of TCM. It is applied to the multiple, mutually correlated indicators and variables that arise in quality-control studies in order to uncover hidden patterns or relationships in the data, which can then support decision-making and enable effective quality evaluation of TCM. In this paper, the application of multivariate statistical analysis in the quality control of Chinese medicine is summarized, which could provide a basis for further study. PMID:25775806
Application of multivariate statistical methods to the analysis of ancient Turkish potsherds
Martin, R.C.
1986-01-01
Three hundred ancient Turkish potsherds were analyzed by instrumental neutron activation analysis, and the resulting data analyzed by several techniques of multivariate statistical analysis, some only recently developed. The programs AGCLUS, MASLOC, and SIMCA were sequentially employed to characterize and group the samples by type of pottery and site of excavation. Comparison of the statistical analyses by each method provided archaeological insight into the site/type relationships of the samples and ultimately evidence relevant to the commercial relations between the ancient communities and specialization of pottery production over time. The techniques used for statistical analysis were found to be of significant potential utility in the future analysis of other archaeometric data sets. 25 refs., 33 figs.
Analysis of Variance with Summary Statistics in Microsoft® Excel®
ERIC Educational Resources Information Center
Larson, David A.; Hsu, Ko-Cheng
2010-01-01
Students regularly are asked to solve Single Factor Analysis of Variance problems given only the sample summary statistics (number of observations per category, category means, and corresponding category standard deviations). Most undergraduate students today use Excel for data analysis of this type. However, Excel, like all other statistical…
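The calculation this abstract refers to can be sketched directly from the summary statistics. The example below is a minimal illustration in Python rather than Excel, assuming the reported standard deviations are sample standard deviations (n - 1 denominator). The function name and the numbers are hypothetical, and the p-value is left out because it requires an F-distribution routine.

```python
def anova_from_summary(ns, means, sds):
    """Single-factor ANOVA table entries from group sizes, means and sample SDs."""
    k = len(ns)                      # number of categories
    total_n = sum(ns)                # total number of observations
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n

    # Between-group and within-group sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

    df_between = k - 1
    df_within = total_n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f_statistic": ms_between / ms_within,
    }


# Hypothetical summary statistics for three categories.
table = anova_from_summary(ns=[5, 5, 5], means=[10.0, 12.0, 14.0], sds=[2.0, 2.0, 2.0])
```

For these inputs the between-group sum of squares is 40 on 2 degrees of freedom and the within-group sum of squares is 48 on 12, giving F = 20 / 4 = 5.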
NASA Astrophysics Data System (ADS)
Chan, Kwai S.
2015-12-01
Rectangular plates of Ti-6Al-4V with extra low interstitial (ELI) were fabricated by layer-by-layer deposition techniques that included electron beam melting (EBM) and laser beam melting (LBM). The surface conditions of these plates were characterized using x-ray micro-computed tomography. The depth and radius of surface notch-like features on the LBM and EBM plates were measured from sectional images of individual virtual slices of the rectangular plates. The stress concentration factors of individual surface notches were computed and analyzed statistically to determine the appropriate distributions for the notch depth, notch radius, and stress concentration factor. These results were correlated with the fatigue life of the Ti-6Al-4V ELI alloys from an earlier investigation. A surface notch analysis was performed to assess the debit in the fatigue strength due to the surface notches. The assessment revealed that the fatigue lives of the additively manufactured plates with rough surface topographies and notch-like features are dominated by the fatigue crack growth of large cracks for both the LBM and EBM materials. The fatigue strength reduction due to the surface notches can be as large as 60%-75%. It is concluded that for better fatigue performance, the surface notches on EBM and LBM materials need to be removed by machining and the surface roughness be improved to a surface finish of about 1 μm.
Kamal, Ghulam Mustafa; Wang, Xiaohua; Bin Yuan; Wang, Jie; Sun, Peng; Zhang, Xu; Liu, Maili
2016-09-01
Soy sauce, a well-known seasoning all over the world, especially in Asia, is available on the global market in a wide range of types based on its purpose and processing methods. Its composition varies with the fermentation process and the addition of additives, preservatives and flavor enhancers. Comprehensive ¹H NMR based study of the metabonomic variations among the different types of soy sauce available on the global market has been limited by the complexity of the mixture. In the present study, ¹³C NMR spectroscopy coupled with multivariate statistical data analysis, such as principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), was applied to investigate metabonomic variations among different types of soy sauce, namely super light, super dark, red cooking and mushroom soy sauce. The main additives in soy sauce, such as glutamate, sucrose and glucose, were easily distinguished and quantified using ¹³C NMR spectroscopy; they were otherwise difficult to assign and quantify because of serious signal overlap in ¹H NMR spectra. The significantly higher concentration of sucrose in dark, red cooking and mushroom flavored soy sauce can be linked directly to the addition of caramel. Similarly, the significantly higher level of glutamate in super light as compared with super dark and mushroom flavored soy sauce may come from the addition of monosodium glutamate. The study highlights the potential of ¹³C NMR based metabonomics coupled with multivariate statistical data analysis for differentiating between types of soy sauce on the basis of the levels of additives, raw materials and fermentation procedures. PMID:27343582
NASA Astrophysics Data System (ADS)
Li, Hongxin; Jiang, Haodong; Gao, Ming; Ma, Zhi; Ma, Chuangui; Wang, Wei
2015-12-01
The statistical fluctuation problem is a critical factor in all quantum key distribution (QKD) protocols under finite-key conditions. Current statistical fluctuation analysis is mainly based on independent random samples; however, this precondition cannot always be satisfied because of different choices of samples and actual parameters. As a result, proper statistical fluctuation methods are required to solve this problem. Taking the after-pulse contributions into consideration, this paper gives the expression for the secure key rate and the mathematical model for statistical fluctuations, focusing on a decoy-state QKD protocol [Z.-C. Wei et al., Sci. Rep. 3, 2453 (2013), 10.1038/srep02453] with a biased basis choice. On this basis, a classified analysis of statistical fluctuation is presented according to the mutual relationship between random samples. First, for independent identical relations, a deviation comparison is made between the law of large numbers and standard error analysis. Second, a sufficient condition is given under which the Chernoff bound achieves a better result than Hoeffding's inequality based on only independent relations. Third, by constructing the proper martingale, a stringent way is proposed to deal with issues based on dependent random samples by making use of Azuma's inequality. In numerical optimization, the impact on the secure key rate, the comparison of secure key rates, and the respective deviations under various kinds of statistical fluctuation analyses are depicted.
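To make the Chernoff versus Hoeffding comparison concrete, the sketch below evaluates two textbook bounds for the mean of n independent samples in [0, 1]. These are standard forms, not the expressions derived in the paper: Hoeffding's two-sided bound 2·exp(-2nε²), and the multiplicative Chernoff bound 2·exp(-δ²μ/3) for 0 < δ < 1 with μ = np. The function names and the parameter values are assumptions chosen for illustration.

```python
import math


def hoeffding_deviation(n, failure_prob):
    """Deviation eps with P(|mean - p| >= eps) <= failure_prob, from 2*exp(-2*n*eps^2)."""
    return math.sqrt(math.log(2.0 / failure_prob) / (2.0 * n))


def chernoff_deviation(n, p, failure_prob):
    """Deviation of the sample mean from the multiplicative Chernoff bound
    P(|X - mu| >= delta*mu) <= 2*exp(-delta^2 * mu / 3), valid for 0 < delta < 1."""
    mu = n * p
    delta = math.sqrt(3.0 * math.log(2.0 / failure_prob) / mu)
    if delta >= 1.0:
        raise ValueError("this form of the Chernoff bound requires delta < 1")
    return delta * p  # delta * mu / n


# Hypothetical finite-key setting: one million pulses, failure probability 1e-10.
n_samples = 10**6
eps_fail = 1e-10

hoeffding = hoeffding_deviation(n_samples, eps_fail)
chernoff_rare = chernoff_deviation(n_samples, 0.01, eps_fail)    # rare events
chernoff_common = chernoff_deviation(n_samples, 0.5, eps_fail)   # common events
```

With these particular forms, the Chernoff deviation is the smaller of the two whenever p < 1/6, which is one simple example of a sufficient condition of the kind the abstract describes; the paper's own condition may differ.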
NASA Astrophysics Data System (ADS)
Li, Hui-Chuan
2014-10-01
This study examines students' procedural and conceptual achievement in fraction addition in England and Taiwan. A total of 1209 participants (561 British students and 648 Taiwanese students) at ages 12 and 13 were recruited from England and Taiwan to take part in the study. A quantitative design by means of a self-designed written test is adopted as central to the methodological considerations. The test has two major parts: the concept part and the skill part. The former is concerned with students' conceptual knowledge of fraction addition and the latter with students' procedural competence when adding fractions. There were statistically significant differences in both the concept and skill parts between the British and Taiwanese groups, with the latter having a higher score. The analysis of the students' responses to the skill section indicates that the superiority of Taiwanese students' procedural achievements over those of their British peers arises because most of the former are able to apply algorithms to adding fractions far more successfully than the latter. Earlier, Hart [1] reported that around 30% of the British students in their study used an erroneous strategy (adding tops and bottoms, for example, 2/3 + 1/7 = 3/10) while adding fractions. This study also finds that nearly the same percentage of the British group still used this erroneous strategy to add fractions as Hart found in 1981. The study also provides evidence to show that students' understanding of fractions is confused and incomplete, even among those who are able to perform the operations successfully. More research is needed to help students make sense of the operations and eventually attain computational competence with meaningful grounding in the domain of fractions.
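The erroneous strategy quoted in this abstract can be contrasted with the common-denominator algorithm in a few lines. This is only a worked illustration of the arithmetic, with fractions written as hypothetical (numerator, denominator) pairs so that the "tops and bottoms" rule acts on the digits as a student would see them.

```python
from fractions import Fraction


def add_with_common_denominator(a, b):
    """Correct sum of two fractions given as (numerator, denominator) pairs."""
    (an, ad), (bn, bd) = a, b
    return Fraction(an * bd + bn * ad, ad * bd)


def add_tops_and_bottoms(a, b):
    """The erroneous strategy: add numerators and add denominators."""
    (an, ad), (bn, bd) = a, b
    return (an + bn, ad + bd)


correct = add_with_common_denominator((2, 3), (1, 7))   # 14/21 + 3/21 = 17/21
erroneous = add_tops_and_bottoms((2, 3), (1, 7))        # (3, 10), i.e. 3/10
```

A quick sanity check exposes the error: 3/10 is smaller than 2/3, one of the two positive fractions being added, whereas the correct sum 17/21 is larger than both.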
The linear statistical d.c. model of GaAs MESFET using factor analysis
NASA Astrophysics Data System (ADS)
Dobrzanski, Lech
1995-02-01
The linear statistical model of the GaAs MESFET's current generator is obtained by means of factor analysis. Three different MESFET deterministic models are taken into account in the analysis: the Statz model (ST), the Materka-type model (MT) and a new proprietary model of MESFET with implanted channel (PLD). It is shown that statistical models obtained using factor analysis provide excellent generation of the multidimensional random variable representing the drain current of MESFET. The method of implementation of the statistical model into the SPICE program is presented. It is proved that for a strongly limited number of Monte Carlo analysis runs in that program, the statistical models considered in each case (ST, MT and PLD) enable good reconstruction of the empirical factor structure. The empirical correlation matrix of model parameters is not reconstructed exactly by statistical modelling, but values of correlation matrix elements obtained from simulated data are within the confidence intervals for the small sample. This paper proves that a formal approach to statistical modelling using factor analysis is the right path to follow, in spite of the fact, that CAD systems (PSpice[MicroSim Corp.], Microwave Harmonica[Compact Software]) are not designed properly for generation of the multidimensional random variable. It is obvious that further progress in implementation of statistical methods in CAD software is required. Furthermore, a new approach to the MESFET's d.c. model is presented. The separate functions, describing the linear as well as the saturated region of MESFET output characteristics, are combined in the single equation. This way of modelling is particularly suitable for transistors with an implanted channel.
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Bassari, Jinous; Triantafyllopoulos, Spiros
1984-01-01
The University of Southwestern Louisiana (USL) NASA PC R and D statistical analysis support package is designed to be a three-level package to allow statistical analysis for a variety of applications within the USL Data Base Management System (DBMS) contract work. The design addresses usage of the statistical facilities as a library package, as an interactive statistical analysis system, and as a batch processing package.
A longitudinal functional analysis framework for analysis of white matter tract statistics.
Yuan, Ying; Gilmore, John H; Geng, Xiujuan; Styner, Martin A; Chen, Kehui; Wang, Jane-Ling; Zhu, Hongtu
2013-01-01
Many longitudinal imaging studies have used, or are using, diffusion tensor imaging (DTI) to better understand white matter maturation in normal controls and diseased subjects. There is an urgent demand for the development of statistical methods for analyzing diffusion properties along major fiber tracts obtained from longitudinal DTI studies. Jointly analyzing fiber-tract diffusion properties and covariates from longitudinal studies raises several major challenges including (i) infinite-dimensional functional response data, (ii) complex spatial-temporal correlation structure, and (iii) complex spatial smoothness. To address these challenges, this article develops a longitudinal functional analysis framework (LFAF) to delineate the dynamic changes of diffusion properties along major fiber tracts and their association with a set of covariates of interest (e.g., age and group status) and the structure of the variability of these white matter tract properties in various longitudinal studies. Our LFAF consists of a functional mixed effects model for addressing all three challenges, an efficient method for spatially smoothing varying coefficient functions, an estimation method for estimating the spatial-temporal correlation structure, a test procedure with a global test statistic for testing hypotheses of interest associated with functional response, and a simultaneous confidence band for quantifying the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of LFAF and to demonstrate that LFAF significantly outperforms a voxel-wise mixed model method. We apply LFAF to study the spatial-temporal dynamics of white-matter fiber tracts in a clinical study of neurodevelopment.
Dried blood spot analysis of creatinine with LC-MS/MS in addition to immunosuppressants analysis.
Koster, Remco A; Greijdanus, Ben; Alffenaar, Jan-Willem C; Touw, Daan J
2015-02-01
In order to monitor creatinine levels or to adjust the dosage of renally excreted or nephrotoxic drugs, the analysis of creatinine in dried blood spots (DBS) could be a useful addition to DBS analysis. We developed a LC-MS/MS method for the analysis of creatinine in the same DBS extract that was used for the analysis of tacrolimus, sirolimus, everolimus, and cyclosporine A in transplant patients with the use of Whatman FTA DMPK-C cards. The method was validated using three different strategies: a seven-point calibration curve using the intercept of the calibration to correct for the natural presence of creatinine in reference samples, a one-point calibration curve at an extremely high concentration in order to diminish the contribution of the natural presence of creatinine, and the use of creatinine-[²H₃] with an eight-point calibration curve. The validated range for creatinine was 120 to 480 μmol/L (seven-point calibration curve), 116 to 7000 μmol/L (1-point calibration curve), and 1.00 to 400.0 μmol/L for creatinine-[²H₃] (eight-point calibration curve). The precision and accuracy results for all three validations showed a maximum CV of 14.0% and a maximum bias of -5.9%. Creatinine in DBS was found stable at ambient temperature and 32 °C for 1 week and at -20 °C for 29 weeks. Good correlations were observed between patient DBS samples and routine enzymatic plasma analysis and showed the capability of the DBS method to be used as an alternative for creatinine plasma measurement.
Lee, L.; Helsel, D.
2005-01-01
Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
New ordering principle for the classical statistical analysis of Poisson processes with background
NASA Astrophysics Data System (ADS)
Giunti, C.
1999-03-01
Inspired by the recent proposal by Feldman and Cousins of a "unified approach to the classical statistical analysis of small signals" based on a choice of ordering in Neyman's construction of classical confidence intervals, I propose a new ordering principle for the classical statistical analysis of Poisson processes with a background which minimizes the effect on the resulting confidence intervals of the observation of fewer background events than expected. The new ordering principle is applied to the calculation of the confidence region implied by the recent null result of the KARMEN neutrino oscillation experiment.
Huang, Huei-Chung; Niu, Yi; Qin, Li-Xuan
2015-01-01
Deep sequencing has recently emerged as a powerful alternative to microarrays for the high-throughput profiling of gene expression. In order to account for the discrete nature of RNA sequencing data, new statistical methods and computational tools have been developed for the analysis of differential expression to identify genes that are relevant to a disease such as cancer. In this paper, it is thus timely to provide an overview of these analysis methods and tools. For readers with statistical background, we also review the parameter estimation algorithms and hypothesis testing strategies used in these methods. PMID:26688660
Ribes, Delphine; Parafita, Julia; Charrier, Rémi; Magara, Fulvio; Magistretti, Pierre J; Thiran, Jean-Philippe
2010-11-23
In this article we introduce JULIDE, a software toolkit developed to perform the 3D reconstruction, intensity normalization, volume standardization by 3D image registration and voxel-wise statistical analysis of autoradiographs of mouse brain sections. This software tool has been developed in the open-source ITK software framework and is freely available under a GPL license. The article presents the complete image processing chain from raw data acquisition to 3D statistical group analysis. Results of the group comparison in the context of a study on spatial learning are shown as an illustration of the data that can be obtained with this tool.
Cacuci, Dan G.; Ionescu-Bujor, Mihaela
2004-07-15
Part II of this review paper highlights the salient features of the most popular statistical methods currently used for local and global sensitivity and uncertainty analysis of both large-scale computational models and indirect experimental measurements. These statistical procedures represent sampling-based methods (random sampling, stratified importance sampling, and Latin Hypercube sampling), first- and second-order reliability algorithms (FORM and SORM, respectively), variance-based methods (correlation ratio-based methods, the Fourier Amplitude Sensitivity Test, and the Sobol Method), and screening design methods (classical one-at-a-time experiments, global one-at-a-time design methods, systematic fractional replicate designs, and sequential bifurcation designs). It is emphasized that all statistical uncertainty and sensitivity analysis procedures first commence with the 'uncertainty analysis' stage and only subsequently proceed to the 'sensitivity analysis' stage; this path is the exact reverse of the conceptual path underlying the methods of deterministic sensitivity and uncertainty analysis where the sensitivities are determined prior to using them for uncertainty analysis. By comparison to deterministic methods, statistical methods for uncertainty and sensitivity analysis are relatively easier to develop and use but cannot yield exact values of the local sensitivities. Furthermore, current statistical methods have two major inherent drawbacks as follows: 1. Since many thousands of simulations are needed to obtain reliable results, statistical methods are at best expensive (for small systems) or, at worst, impracticable (e.g., for large time-dependent systems). 2. Since the response sensitivities and parameter uncertainties are inherently and inseparably amalgamated in the results produced by these methods, improvements in parameter uncertainties cannot be directly propagated to improve response uncertainties; rather, the entire set of simulations and
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
Statistical Learning in Specific Language Impairment and Autism Spectrum Disorder: A Meta-Analysis
Obeid, Rita; Brooks, Patricia J.; Powers, Kasey L.; Gillespie-Lynch, Kristen; Lum, Jarrad A. G.
2016-01-01
Impairments in statistical learning might be a common deficit among individuals with Specific Language Impairment (SLI) and Autism Spectrum Disorder (ASD). Using meta-analysis, we examined statistical learning in SLI (14 studies, 15 comparisons) and ASD (13 studies, 20 comparisons) to evaluate this hypothesis. Effect sizes were examined as a function of diagnosis across multiple statistical learning tasks (Serial Reaction Time, Contextual Cueing, Artificial Grammar Learning, Speech Stream, Observational Learning, and Probabilistic Classification). Individuals with SLI showed deficits in statistical learning relative to age-matched controls. In contrast, statistical learning was intact in individuals with ASD relative to controls. Effect sizes did not vary as a function of task modality or participant age. Our findings inform debates about overlapping social-communicative difficulties in children with SLI and ASD by suggesting distinct underlying mechanisms. In line with the procedural deficit hypothesis (Ullman and Pierpont, 2005), impaired statistical learning may account for phonological and syntactic difficulties associated with SLI. In contrast, impaired statistical learning fails to account for the social-pragmatic difficulties associated with ASD. PMID:27602006
Long-term Statistical Analysis of the Simultaneity of Forbush Decrease Events at Middle Latitudes
NASA Astrophysics Data System (ADS)
Lee, Seongsuk; Oh, Suyeon; Yi, Yu; Evenson, Paul; Jee, Geonhwa; Choi, Hwajin
2015-03-01
Forbush Decreases (FD) are transient, sudden reductions of cosmic ray (CR) intensity lasting a few days to a week. Such events are observed globally using ground neutron monitors (NMs). Most studies of FD events indicate that an FD event is observed simultaneously at NM stations located all over the Earth. However, using statistical analysis, previous researchers verified that while FD events could occur simultaneously, in some cases they could occur non-simultaneously. Previous studies confirmed the statistical reality of non-simultaneous FD events and the mechanism by which they occur, using data from high-latitude and middle-latitude NM stations. In this study, we used long-term data (1971-2006) from middle-latitude NM stations (Irkutsk, Climax, and Jungfraujoch) to enhance statistical reliability. According to the results of this analysis, the variation of cosmic ray intensity during the main phase is larger (statistically significant) for simultaneous FD events than for non-simultaneous ones. Moreover, the distribution of main-phase-onset time shows differences that are statistically significant. While the onset times for the simultaneous FDs are distributed evenly over 24-hour intervals (day and night), those of non-simultaneous FDs are mostly distributed over 12-hour intervals, in daytime. Thus, the existence of the two kinds of FD events, distinguished by differences in their statistical properties, was verified based on data from middle-latitude NM stations.
Statistical analysis of data from dilution assays with censored correlated counts.
Quiroz, Jorge; Wilson, Jeffrey R; Roychoudhury, Satrajit
2012-01-01
Frequently, count data obtained from dilution assays are subject to an upper detection limit, and as such, data obtained from these assays are usually censored. Also, counts from the same subject at different dilution levels are correlated. Ignoring the censoring and the correlation may provide unreliable and misleading results. Therefore, any meaningful data modeling requires that the censoring and the correlation be simultaneously addressed. Such comprehensive approaches to modeling censoring and correlation are not widely used in the analysis of dilution assay data. Traditionally, these data are analyzed using a general linear model on a logarithmic-transformed average count per subject. However, this traditional approach ignores the between-subject variability and risks providing inconsistent results and unreliable conclusions. In this paper, we propose the use of a censored negative binomial model with normal random effects to analyze such data. This model addresses, in addition to the censoring and the correlation, any overdispersion that may be present in count data. The model is shown to be widely accessible through several modern statistical software packages.
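The censoring idea in the abstract above can be sketched in a few lines. This is a simplified illustration and not the authors' model: it omits the normal random effects entirely, and the function names, the mean/dispersion parameterisation (`mu`, `r`) and the convention that any count at or above `limit` is recorded as censored are all assumptions made here.

```python
import math


def nb_logpmf(k, mu, r):
    """Log-probability of count k for a negative binomial with mean mu and dispersion r."""
    p = r / (r + mu)  # success probability implied by the mean/dispersion form
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def nb_log_upper_tail(limit, mu, r):
    """log P(X >= limit), obtained by subtracting the mass below the limit from 1."""
    below = sum(math.exp(nb_logpmf(k, mu, r)) for k in range(limit))
    return math.log(max(1.0 - below, 1e-300))  # guard against log(0)


def censored_nb_loglik(counts, limit, mu, r):
    """Log-likelihood where counts at or above the detection limit are right-censored."""
    total = 0.0
    for y in counts:
        if y >= limit:
            total += nb_log_upper_tail(limit, mu, r)  # only know that X >= limit
        else:
            total += nb_logpmf(y, mu, r)              # exact count observed
    return total
```

Maximising `censored_nb_loglik` over `mu` and `r` with any numerical optimiser would give censoring-aware estimates; handling the within-subject correlation would additionally require integrating over a subject-level random effect.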
Statistical analysis of gait maturation in children based on probability density functions.
Wu, Yunfeng; Zhong, Zhangting; Lu, Meng; He, Jia
2011-01-01
Analysis of gait patterns in children is useful for the study of maturation of locomotor control. In this paper, we utilized the Parzen-window method to estimate the probability density functions (PDFs) of the stride interval for 50 children. With the estimated PDFs, the statistical measures, i.e., averaged stride interval (ASI), variation of stride interval (VSI), PDF skewness (SK), and PDF kurtosis (KU), were computed for the gait maturation in three age groups (aged 3-5 years, 6-8 years, and 10-14 years) of young children. The results indicated that the ASI and VSI values are significantly different between the three age groups. The VSI decreases rapidly until 8 years of age and then continues to decrease at a slower rate. The SK values of the PDFs for all three age groups are positive, which shows a slight imbalance in the stride interval distribution within each age group. In addition, the decrease of the KU values of the PDFs is age-dependent, which suggests the effects of musculo-skeletal growth on gait maturation in young children. PMID:22254641
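A Parzen-window estimate and the four summary measures named above can be sketched as follows. The Gaussian kernel, the bandwidth `h`, the uniform grid and the function names are assumptions for illustration; the abstract does not state which kernel or bandwidth the authors used, and the kurtosis here is the non-excess form (3 for a normal density).

```python
import math


def parzen_pdf(samples, grid, h):
    """Gaussian Parzen-window density estimate evaluated at each grid point."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
            for x in grid]


def pdf_moments(grid, pdf):
    """Mean, variance, skewness and kurtosis of a density sampled on a uniform grid."""
    dx = grid[1] - grid[0]
    mass = sum(pdf) * dx
    weights = [p * dx / mass for p in pdf]  # renormalise so the weights sum to 1
    mean = sum(x * w for x, w in zip(grid, weights))
    var = sum((x - mean) ** 2 * w for x, w in zip(grid, weights))
    skew = sum((x - mean) ** 3 * w for x, w in zip(grid, weights)) / var ** 1.5
    kurt = sum((x - mean) ** 4 * w for x, w in zip(grid, weights)) / var ** 2
    return mean, var, skew, kurt
```

Applied to a list of stride intervals in seconds, `pdf_moments` would return quantities playing the roles of ASI, VSI, SK and KU, provided the grid comfortably covers the range of the data.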
NASA Astrophysics Data System (ADS)
Um, Myoung-Jin; Kim, Hanbeen; Heo, Jun-Haeng
2016-08-01
A general circulation model (GCM) can be applied to project future climate factors, such as precipitation and atmospheric temperature, to study hydrological and environmental climate change. Although many improvements in GCMs have been proposed recently, projected climate data still need to be bias-corrected before the model is used in practical applications. In this study, a new hybrid process was proposed, and its ability to perform bias correction for the prediction of annual precipitation and annual daily maxima was tested. The hybrid process in this study was based on quantile mapping with the gamma and generalized extreme value (GEV) distributions and a spline technique to correct the bias of projected daily precipitation. The observed and projected daily precipitation values from the selected stations were analyzed using three bias correction methods, namely, linear scaling, quantile mapping, and hybrid methods. The performances of these methods were analyzed to find the optimal method for prediction of annual precipitation and annual daily maxima. The linear scaling method yielded the best results for estimating the annual average precipitation, while the hybrid method was optimal for predicting the variation in annual precipitation. The hybrid method described the statistical characteristics of the annual maximum series (AMS) similarly to the observed data. In addition, this method demonstrated the lowest root mean squared error (RMSE) and the highest coefficient of determination (R²) for predicting the quantiles of the AMS for the extreme value analysis of precipitation.
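Two of the three bias-correction methods compared above can be illustrated compactly. The sketch below uses an empirical (non-parametric) form of quantile mapping rather than the gamma/GEV fits and spline of the hybrid method, so it is a stand-in for the general idea only; the function names and argument order are assumptions.

```python
from bisect import bisect_right


def linear_scaling(model_hist, obs_hist, model_future):
    """Multiply projections by the ratio of observed to modelled historical means."""
    factor = (sum(obs_hist) / len(obs_hist)) / (sum(model_hist) / len(model_hist))
    return [factor * x for x in model_future]


def empirical_quantile(sorted_values, q):
    """Linearly interpolated quantile of sorted data for q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def quantile_mapping(model_hist, obs_hist, model_future):
    """Map each projected value to the observed value with the same empirical quantile."""
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(obs_hist)
    corrected = []
    for x in model_future:
        # Non-exceedance probability of x in the historical model run
        q = bisect_right(model_sorted, x) / len(model_sorted)
        corrected.append(empirical_quantile(obs_sorted, q))
    return corrected
```

Linear scaling only matches the mean, whereas quantile mapping reshapes the whole distribution; an empirical mapping cannot extrapolate beyond the largest observed value, which is one reason parametric tails such as the GEV are used for extremes.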
Bochanski, John J.; Hawley, Suzanne L.; West, Andrew A.
2011-03-15
We present a statistical parallax analysis of low-mass dwarfs from the Sloan Digital Sky Survey. We calculate absolute r-band magnitudes (M_r) as a function of color and spectral type and investigate changes in M_r with location in the Milky Way. We find that magnetically active M dwarfs are intrinsically brighter in M_r than their inactive counterparts at the same color or spectral type. Metallicity, as traced by the proxy ζ, also affects M_r, with metal-poor stars having fainter absolute magnitudes than higher metallicity M dwarfs at the same color or spectral type. Additionally, we measure the velocity ellipsoid and solar reflex motion for each subsample of M dwarfs. We find good agreement between our measured solar peculiar motion and previous results for similar populations, as well as some evidence for differing motions of early and late M-type populations in U and W velocities that cannot be attributed to asymmetric drift. The reflex solar motion and the velocity dispersions both show that younger populations, as traced by magnetic activity and location near the Galactic plane, have experienced less dynamical heating. We introduce a new parameter, the independent position altitude (IPA), to investigate populations as a function of vertical height from the Galactic plane. M dwarfs at all types exhibit an increase in velocity dispersion when analyzed in comparable IPA subgroups.
Statistical analysis of the MODIS atmosphere products for the Tomsk region
NASA Astrophysics Data System (ADS)
Afonin, Sergey V.; Belov, Vladimir V.; Engel, Marina V.
2005-10-01
The paper presents the results of using the MODIS Atmosphere Products satellite information to study the atmospheric characteristics (aerosol and water vapor) in the Tomsk Region (56-61°N, 75-90°E) in 2001-2004. The satellite data were received from the NASA Goddard Distributed Active Archive Center (DAAC) through the Internet. To use satellite data for the solution of scientific and applied problems, it is very important to know their accuracy. Although validation results for the MODIS data are already available in the literature, we decided to carry out additional investigations for the Tomsk Region. The paper presents the results of validation of the aerosol optical thickness (AOT) and total column precipitable water (TCPW), which are in good agreement with the test data. The statistical analysis revealed some interesting facts. For example, analyzing the data on the spatial distribution of the average seasonal values of AOT or TCPW for 2001-2003 in the Tomsk Region, we established that instead of the expected spatial homogeneity of these distributions, they have similar spatial structures.
Chatzisymeon, Efthalia; Xekoukoulotakis, Nikolaos P; Diamadopoulos, Evan; Katsaounis, Alexandros; Mantzavinos, Dionissios
2009-09-01
The electrochemical treatment of olive mill wastewaters (OMW) over boron-doped diamond (BDD) electrodes was investigated. A factorial design methodology was implemented to evaluate the statistically important operating parameters, amongst initial COD load (1000-5000 mg/L), treatment time (1-4 h), current intensity (10-20 A), initial pH (4-6) and the use of 500 mg/L H2O2 as an additional oxidant, on treatment efficiency; the latter was assessed in terms of COD, phenols, aromatics and color removal. Of the five parameters tested, the first two had a considerable effect on COD removal. Hence, analysis was repeated at more intense conditions, i.e. initial COD values up to 10,000 mg/L and reaction times up to 7 h, and a simple model was developed and validated to predict COD evolution profiles. The model suggests that the rate of COD degradation is zero order regarding its concentration and agrees well with an electrochemical model for the anodic oxidation of organics over BDD developed elsewhere. The treatability of the undiluted effluent (40,000 mg/L COD) was tested at 20 A for 15 h, yielding 19% COD and 36% phenols removal, respectively, with a specific energy consumption of 96 kWh/kg COD removed. Aerobic biodegradability and ecotoxicity assays were also performed to assess the respective effects of electrochemical treatment. PMID:19423147
Gene set analysis for GWAS: assessing the use of modified Kolmogorov-Smirnov statistics.
Debrabant, Birgit; Soerensen, Mette
2014-10-01
We discuss the use of modified Kolmogorov-Smirnov (KS) statistics in the context of gene set analysis and review corresponding null and alternative hypotheses. In particular, we show that, when enhancing the impact of highly significant genes in the calculation of the test statistic, the corresponding test can be considered to infer the classical self-contained null hypothesis. We use simulations to estimate the power for different kinds of alternatives, and to assess the impact of the weight parameter of the modified KS statistic on the power. Finally, we show the analogy between the weight parameter and the genesis and distribution of the gene-level statistics, and illustrate the effects of differential weighting in a real-life example.
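One widely used weighted KS-type statistic for gene sets is the running-sum enrichment score, in which genes in the set contribute steps proportional to a power of their gene-level statistic. The sketch below assumes that form; the abstract does not spell out the exact definition the authors analyse, and the function name and arguments are hypothetical.

```python
def weighted_ks(scores, in_set, weight=1.0):
    """Signed maximum deviation of a weighted running sum over a ranked gene list.

    scores: gene-level statistics, already ordered from most to least significant.
    in_set: booleans marking which ranked genes belong to the gene set.
    weight: exponent on |score|; weight=0 gives the classical unweighted KS form.
    """
    n_miss = sum(1 for member in in_set if not member)
    hit_total = sum(abs(s) ** weight for s, member in zip(scores, in_set) if member)
    running = 0.0
    best = 0.0
    for s, member in zip(scores, in_set):
        if member:
            running += abs(s) ** weight / hit_total  # step up, scaled by significance
        else:
            running -= 1.0 / n_miss                  # uniform step down for non-members
        if abs(running) > abs(best):
            best = running
    return best
```

Raising `weight` makes the statistic more sensitive to a few highly significant genes near the top of the ranking, which is the effect of the weight parameter the abstract refers to. Significance would normally be assessed by permutation, which is not shown.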
The statistical analysis techniques to support the NGNP fuel performance experiments
Binh T. Pham; Jeffrey J. Einerson
2013-10-01
This paper describes the development and application of statistical analysis techniques to support the Advanced Gas Reactor (AGR) experimental program on Next Generation Nuclear Plant (NGNP) fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel temperature) is regulated by the He–Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the NGNP Data Management and Analysis System for automated processing and qualification of the AGR measured data. The neutronic and thermal code simulation results are used for comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. They also suggest that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content) to effectively maintain the fuel temperature within a given range.
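Control charting, the first of the three techniques listed above, can be illustrated with a basic Shewhart-style check. The abstract does not say which chart type the NGNP system implements, so the mean ± 3 standard deviation limits, the function names and the example readings below are assumptions for illustration only.

```python
import statistics


def control_limits(baseline, n_sigma=3.0):
    """Lower and upper control limits from an in-control baseline period."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - n_sigma * spread, center + n_sigma * spread


def out_of_control(readings, limits):
    """Flag each reading that falls outside the control limits."""
    lower, upper = limits
    return [not (lower <= r <= upper) for r in readings]


# Hypothetical thermocouple readings: a stable baseline, then one drifting value.
limits = control_limits([10.0, 10.1, 9.9, 10.0, 10.05, 9.95])
flags = out_of_control([10.0, 12.0], limits)
```

A flagged reading would prompt a closer look at the sensor, for example by comparing it with correlated neighbours or with regression predictions.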
NASA Astrophysics Data System (ADS)
Duan, Ji; Liu, Yu; Shen, Yuandeng; Song, Tengfei; Liu, Shunqing; Zhang, Xuefei; Wen, Xiao; Yang, Lei; Lin, Jun; Liu, Zhong; Wang, Jiancheng
2012-04-01
This paper statistically analyzes the meteorological conditions of Mt. Dashanbao by using the meteorological data from the year 1960 to the year 1988, which were collected by the Meteorologic Bureau of the Zhaoyang District, Zhaotong City, Yunnan Province. We focus on meteorological parameters particularly relevant to the selection of an excellent solar observation site. We investigate the annual variation patterns of the parameters with wavelet analysis. We have found that the monthly relative humidity, average sunshine time, and average temperature all have nearly-fixed variation cycles of about one year, which is very important for ground-based solar observation. In addition, these parameters show two distinct steady stages in each year, and we call them the dry season (from November to April of the next year) and the wet season (from May to October). In the dry season, the monthly relative humidity, mean temperature, and average cloud amount are low, while the monthly average sunshine-time and wind-speed values are relatively large. The average wind speed over a long time is still low, even though on some occasions high wind speed can influence observation. In addition, the wind direction is very stable there, so we can easily overcome the drawback of the occurrences of high wind speeds. In the wet season, the site is not well suited to solar observation due to the poor meteorological conditions. However, we do not rule out short-lasting meteorological conditions for observation in the wet season. Moreover, the site on Mt. Dashanbao is not far from the Yunnan Observatory, Chinese Academy of Sciences, Kunming, China. The convenient transportation can save a significant amount of financial resources in the installation, operation, and maintenance of telescope(s). Based on our statistical results, we conclude that Mt. Dashanbao is a potentially excellent solar observation site in western China. In summary the site has long sunshine time, low coverage of clouds, low wind
A new statistical analysis of rare earth element diffusion data in garnet
NASA Astrophysics Data System (ADS)
Chu, X.; Ague, J. J.
2015-12-01
The incorporation of rare earth elements (REE) in garnet, Sm and Lu in particular, links garnet chemical zoning to absolute age determinations. The application of REE-based geochronology depends critically on the diffusion behaviors of the parent and daughter isotopes. Previous experimental studies on REE diffusion in garnet, however, exhibit significant discrepancies that impact interpretations of garnet Sm/Nd and Lu/Hf ages. We present a new statistical framework to analyze diffusion data for REE using an Arrhenius relationship that accounts for oxygen fugacity, cation radius and garnet unit-cell dimensions [1]. Our approach is based on Bayesian statistics and is implemented by the Markov chain Monte Carlo method. A similar approach has been recently applied to model diffusion of divalent cations in garnet [2]. The analysis incorporates recent data [3] in addition to the data compilation in ref. [1]. We also include the inter-run bias that helps reconcile the discrepancies among data sets. This additional term estimates the reproducibility and other experimental variabilities not explicitly incorporated in the Arrhenius relationship [2] (e.g., compositional dependence [3] and water content). The fitted Arrhenius relationships are consistent with the models in ref. [3], as well as refs. [1] and [4] at high temperatures. Down-temperature extrapolation leads to >0.5 order of magnitude faster diffusion coefficients than in refs. [1] and [4] at <750 °C. The predicted diffusion coefficients are significantly slower than ref. [5]. The fast diffusion [5] was supported by a field test of the Pikwitonei Granulite: the garnet Sm/Nd age postdates the metamorphic peak (750 °C) by ~30 Myr [6], suggesting considerable resetting of the Sm/Nd system during cooling. However, the Pikwitonei Granulite is a recently recognized UHT terrane with peak temperature exceeding 900 °C [7]. The revised closure temperature (~730 °C) is consistent with our new diffusion model. [1] Carlson (2012) Am
Subramanyam, Busetty; Das, Ashutosh
2014-01-01
In adsorption studies, describing the sorption process and identifying the best-fitting isotherm model are key steps in testing the theoretical hypothesis. Numerous statistical analyses have therefore been used to compare experimental equilibrium adsorption values with predicted equilibrium values. In the present study, several statistical error analyses were carried out to evaluate the fit of adsorption isotherm models, including the Pearson correlation, the coefficient of determination, and the Chi-square test. An ANOVA test was carried out to evaluate the significance of the various error functions, and the coefficient of dispersion was evaluated for linearised and non-linearised models. The adsorption of phenol onto natural soil (local name Kalathur soil) was carried out in batch mode at 30 ± 2 °C. To obtain a holistic view when estimating the isotherm parameters, linear and non-linear isotherm models were compared. The results showed that the above-mentioned error functions and statistical functions can be used to determine the best-fitting isotherm. PMID:25018878
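The kind of comparison described above can be sketched for a single isotherm. The abstract does not name the isotherm models tested, so the Langmuir form, the particular linearisation (C/q against C), the function names and the synthetic data in the usage lines are all assumptions made for illustration.

```python
def langmuir(c, qm, k):
    """Langmuir isotherm: equilibrium uptake q at equilibrium concentration c."""
    return qm * k * c / (1.0 + k * c)


def fit_langmuir_linearised(conc, q):
    """Least-squares line through (C, C/q): slope = 1/qm, intercept = 1/(qm*K)."""
    y = [c / qi for c, qi in zip(conc, q)]
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(y) / n
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(conc, y))
    sxx = sum((x - mean_x) ** 2 for x in conc)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, slope / intercept  # (qm, K)


def chi_square(q_exp, q_pred):
    """Chi-square error function: squared residuals scaled by the predicted value."""
    return sum((e - p) ** 2 / p for e, p in zip(q_exp, q_pred))


def r_squared(q_exp, q_pred):
    """Coefficient of determination between experimental and predicted uptake."""
    mean_exp = sum(q_exp) / len(q_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(q_exp, q_pred))
    ss_tot = sum((e - mean_exp) ** 2 for e in q_exp)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from known parameters (not from the study).
conc = [1.0, 2.0, 5.0, 10.0, 20.0]
q_obs = [langmuir(c, 10.0, 0.5) for c in conc]
qm_fit, k_fit = fit_langmuir_linearised(conc, q_obs)
q_fit = [langmuir(c, qm_fit, k_fit) for c in conc]
```

With real, noisy data the linearised fit and a direct non-linear fit generally give different parameters, and error functions such as `chi_square` and `r_squared` evaluated on the original (untransformed) scale are what allow the two to be compared fairly.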
NASA Astrophysics Data System (ADS)
Oliveira Mendes, Thiago de; Pinto, Liliane Pereira; Santos, Laurita dos; Tippavajhala, Vamshi Krishna; Téllez Soto, Claudio Alberto; Martin, Airton Abrahão
2016-07-01
The analysis of biological systems by spectroscopic techniques involves the evaluation of hundreds to thousands of variables. Hence, different statistical approaches are used to elucidate regions that discriminate classes of samples and to propose new vibrational markers for explaining various phenomena like disease monitoring, mechanisms of action of drugs, food, and so on. However, the statistical techniques are not always widely discussed in applied sciences. In this context, this work presents a detailed discussion including the various steps necessary for proper statistical analysis. It includes univariate parametric and nonparametric tests, as well as multivariate unsupervised and supervised approaches. The main objective of this study is to promote proper understanding of the application of various statistical tools in these spectroscopic methods used for the analysis of biological samples. The discussion of these methods is performed on a set of in vivo confocal Raman spectra of human skin, with the aim of identifying skin aging markers. In the Appendix, a complete data analysis routine is executed in free software that can be used by the scientific community involved in these studies.
Statistical analysis of temperature extremes in long-time series from Uppsala
NASA Astrophysics Data System (ADS)
Rydén, Jesper
2011-08-01
Temperature records in Uppsala, Sweden, during the period 1840-2001, are analysed. More precisely, yearly maxima and minima are studied in order to investigate possible trends. Extreme-value distributions are fitted, and a nonstationary model is introduced by allowing for a time-dependent location parameter. Comparisons are made with an estimated trend for mean temperature. In addition, a Mann-Kendall test is performed in order to test for the presence of a trend. The results obtained from the statistical models agree with those found earlier by descriptive statistics, in particular an increasing trend for the coldest days of the year.
Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop
NASA Technical Reports Server (NTRS)
Morrison, Joseph H.
2010-01-01
A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.
Landing Site Dispersion Analysis and Statistical Assessment for the Mars Phoenix Lander
NASA Technical Reports Server (NTRS)
Bonfiglio, Eugene P.; Adams, Douglas; Craig, Lynn; Spencer, David A.; Strauss, William; Seelos, Frank P.; Seelos, Kimberly D.; Arvidson, Ray; Heet, Tabatha
2008-01-01
The Mars Phoenix Lander launched on August 4, 2007 and successfully landed on Mars 10 months later on May 25, 2008. Landing ellipse predictions and hazard maps were key in selecting safe surface targets for Phoenix. Hazard maps were based on terrain slopes, geomorphology maps and automated rock counts of MRO's High Resolution Imaging Science Experiment (HiRISE) images. The expected landing dispersion which led to the selection of Phoenix's surface target is discussed, as well as the actual landing dispersion predictions determined during operations in the weeks, days, and hours before landing. A statistical assessment of these dispersions is performed, comparing the actual landing-safety probabilities to criteria levied by the project. Also discussed are applications of this statistical analysis which were used by the Phoenix project. These include using the statistical analysis to verify the effectiveness of a pre-planned maneuver menu and calculating the probability of future maneuvers.
Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis
Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun
2014-01-01
As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that, by a totally different criterion, the labile trace element contents (and hence thermal histories) of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Buttigieg, Pier Luigi; Ramette, Alban
2014-12-01
The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. PMID:25314312
NASA Astrophysics Data System (ADS)
Bachmann, Michael
2013-05-01
Simulating biomolecular structural transitions such as folding and aggregation requires more than adequate models that reflect the key aspects of the cooperative transition behaviour. It is equally important to employ thermodynamically correct simulation methods and to perform an accurate subsequent statistical analysis of the data obtained in the simulation. Combining methodology and analysis efficiently can be quite sophisticated, but it is also very instructive, feeding back into a better understanding of the physics of the underlying cooperative processes that drive the conformational transition. Here we show that the density of states, which is the central result of multicanonical sampling and any other generalized-ensemble simulation, serves as the optimal basis for the microcanonical statistical analysis of transitions. The microcanonical inflection-point analysis method, which was recently introduced for this purpose, is a perfect tool for a precise, unique identification and classification of all structural transitions.
Statistical analysis of interaction between lake seepage rates and groundwater and lake levels
NASA Astrophysics Data System (ADS)
Ala-aho, P.; Rossi, P. M.; Klöve, B.
2012-04-01
measurement locations. Results suggested that underlying hydrogeological conditions dictated the variation of seepage rates to some extent at all of the measurement locations. Correlation analysis of seepage meter measurements against water levels and precipitation indicated that seepage rates at a specific measurement location were influenced by different parts of the hydrogeological system. The location and rate of inseepage were dictated by a different groundwater flow system than at locations where outseepage was measured. In addition, locations of outseepage responded differently to changes in the lake and groundwater levels. Variation of seepage rate in some locations reflected changes in the regional groundwater system, whereas others responded to changes in the local flow system. The study shows that a simple statistical analysis of the temporal variability of lake seepage rates and of lake and groundwater level recordings can give valuable insight into the dynamics of lake-groundwater interaction. Such understanding is of crucial importance when the effects of changes in climate, land use or water extraction need to be understood and managed in lake-aquifer systems. ACKNOWLEDGEMENTS The project was financed by the 7th framework project GENESIS (226536).
Exstatix: Expandable Statistical Analysis System for the Macintosh. A Software Review.
ERIC Educational Resources Information Center
Ferrell, Barbara G.
The Exstatix statistical analysis software package by K. C. Killion for use with Macintosh computers is evaluated. In evaluating the package, the framework developed by C. J. Ansorge et al. (1986) was used. This framework encompasses features such as transportability of files, compatibility of files with other Macintosh software, and ability to…
ERIC Educational Resources Information Center
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
1977-78 Cost Analysis for Florida Schools and Districts. Statistical Report. Series 79-01.
ERIC Educational Resources Information Center
Florida State Dept. of Education, Tallahassee. Div. of Public Schools.
This statistical report describes some of the cost analysis information available from computer reports produced by the Florida Department of Education. It reproduces examples of Florida school and school district financial data that can be used by state, district, and school-level administrators as they analyze program costs and expenditures. The…
ERIC Educational Resources Information Center
Zhou, Ping; Wang, Qinwen; Yang, Jie; Li, Jingqiu; Guo, Junming; Gong, Zhaohui
2015-01-01
This study aimed to investigate the status of the publishing and use of college biochemistry textbooks in China. A textbook database was constructed and statistical analysis was used to evaluate the textbooks. The results showed that there were 945 (~57%) books for theory teaching, 379 (~23%) books for experiment teaching and 331 (~20%)…
Indexing Combined with Statistical Deflation as a Tool for Analysis of Longitudinal Data.
ERIC Educational Resources Information Center
Babcock, Judith A.
Indexing is a tool that can be used with longitudinal, quantitative data for analysis of relative changes and for comparisons of changes among items. For greater accuracy, raw financial data should be deflated into constant dollars prior to indexing. This paper demonstrates the procedures for indexing, statistical deflation, and the use of…
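The two steps described above, deflating to constant dollars and then indexing, can be sketched as follows. This is a minimal illustration only: the function names, the choice of base year, and all figures are assumptions made for the example and do not come from the paper.

```python
def deflate(nominal, price_index, base_year):
    """Convert nominal amounts to constant dollars of base_year.

    nominal and price_index are dicts keyed by year (hypothetical data).
    """
    base_price = price_index[base_year]
    return {
        year: amount * base_price / price_index[year]
        for year, amount in nominal.items()
    }


def to_index(series, base_year):
    """Express each value relative to the base year (base year = 100)."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Made-up spending figures and a made-up price index, for illustration only.
nominal = {2000: 100.0, 2001: 110.0, 2002: 121.0}
price_index = {2000: 100.0, 2001: 105.0, 2002: 110.0}

real = deflate(nominal, price_index, base_year=2000)
index = to_index(real, base_year=2000)
```

Deflating first means the resulting index reflects changes in purchasing power rather than changes in the price level.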
ERIC Educational Resources Information Center
Lau, Joann M.; Korn, Robert W.
2007-01-01
In this article, the authors present a laboratory exercise in data collection and statistical analysis in biological space using clustered stomates on leaves of "Begonia" plants. The exercise can be done in middle school classes by students making their own slides and seeing imprints of cells, or at the high school level through collecting data of…
Bayesian Statistical Analysis Applied to NAA Data for Neutron Flux Spectrum Determination
NASA Astrophysics Data System (ADS)
Chiesa, D.; Previtali, E.; Sisti, M.
2014-04-01
In this paper, we present a statistical method, based on Bayesian statistics, to evaluate the neutron flux spectrum from the activation data of different isotopes. The experimental data were acquired during a neutron activation analysis (NAA) experiment [A. Borio di Tigliole et al., Absolute flux measurement by NAA at the Pavia University TRIGA Mark II reactor facilities, ENC 2012 - Transactions Research Reactors, ISBN 978-92-95064-14-0, 22 (2012)] performed at the TRIGA Mark II reactor of Pavia University (Italy). In order to evaluate the neutron flux spectrum, subdivided in energy groups, we must solve a system of linear equations containing the grouped cross sections and the activation rate data. We solve this problem with Bayesian statistical analysis, including the uncertainties of the coefficients and the a priori information about the neutron flux. A program for the analysis of Bayesian hierarchical models, based on Markov Chain Monte Carlo (MCMC) simulations, is used to define the problem statistical model and solve it. The energy group fluxes and their uncertainties are then determined with great accuracy and the correlations between the groups are analyzed. Finally, the dependence of the results on the prior distribution choice and on the group cross section data is investigated to confirm the reliability of the analysis.
Development of Statistically Parallel Tests by Analysis of Unique Item Variance.
ERIC Educational Resources Information Center
Ree, Malcolm James
A method for developing statistically parallel tests based on the analysis of unique item variance was developed. A test population of 907 basic airmen trainees were required to estimate the angle at which an object in a photograph was viewed, selecting from eight possibilities. A FORTRAN program known as VARSEL was used to rank all the test items…
The Impact of Training and Demographics in WIA Program Performance: A Statistical Analysis
ERIC Educational Resources Information Center
Moore, Richard W.; Gorman, Philip C.
2009-01-01
The Workforce Investment Act (WIA) measures participant labor market outcomes to drive program performance. This article uses statistical analysis to examine the relationship between participant characteristics and key outcome measures in one large California local WIA program. This study also measures the impact of different training…
NASA Astrophysics Data System (ADS)
Peterlin, Primož
2010-07-01
Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared.
A Statistical Analysis of Infrequent Events on Multiple-Choice Tests that Indicate Probable Cheating
ERIC Educational Resources Information Center
Sundermann, Michael J.
2008-01-01
A statistical analysis of multiple-choice answers is performed to identify anomalies that can be used as evidence of student cheating. The ratio of exact errors in common (EEIC: two students put the same wrong answer for a question) to differences (D: two students get different answers) was found to be a good indicator of cheating under a wide…
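The EEIC-to-D ratio defined above can be computed for a pair of answer sheets as in the sketch below. The function name, the string encoding of the answers, and the handling of a zero denominator are assumptions for illustration; the abstract does not say what threshold the author treats as suspicious.

```python
import math


def eeic_to_d_ratio(answers_a, answers_b, key):
    """Ratio of exact errors in common (EEIC) to differences (D).

    EEIC counts questions where both students gave the same wrong answer.
    D counts questions where the two students gave different answers.
    """
    eeic = 0
    differences = 0
    for a, b, correct in zip(answers_a, answers_b, key):
        if a != b:
            differences += 1
        elif a != correct:
            eeic += 1
    if differences == 0:
        # Assumption: identical sheets give an unbounded ratio when they
        # share any wrong answer, and 0.0 when both are fully correct.
        return math.inf if eeic else 0.0
    return eeic / differences


# Hypothetical five-question test.
ratio = eeic_to_d_ratio("ABDDB", "ABDCB", key="ABCDA")
```

In this made-up pair, two questions share the same wrong answer and one question differs, so the ratio is 2.0.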
ERIC Educational Resources Information Center
Peterlin, Primoz
2010-01-01
Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…
2011-01-01
Background: Verbal autopsies provide valuable information for studying mortality patterns in populations that lack reliable vital registration data. Methods for transforming verbal autopsy results into meaningful information for health workers and policymakers, however, are often costly or complicated to use. We present a simple additive algorithm, the Tariff Method (termed Tariff), which can be used for assigning individual cause of death and for determining cause-specific mortality fractions (CSMFs) from verbal autopsy data. Methods: Tariff calculates a score, or "tariff," for each cause, for each sign/symptom, across a pool of validated verbal autopsy data. The tariffs are summed for a given response pattern in a verbal autopsy, and this sum (score) provides the basis for predicting the cause of death in a dataset. We implemented this algorithm and evaluated the method's predictive ability, both in terms of chance-corrected concordance at the individual cause assignment level and in terms of CSMF accuracy at the population level. The analysis was conducted separately for adult, child, and neonatal verbal autopsies across 500 pairs of train-test validation verbal autopsy data. Results: Tariff is capable of outperforming physician-certified verbal autopsy in most cases. In terms of chance-corrected concordance, the method achieves 44.5% in adults, 39% in children, and 23.9% in neonates. CSMF accuracy was 0.745 in adults, 0.709 in children, and 0.679 in neonates. Conclusions: Verbal autopsies can be an efficient means of obtaining cause of death data, and Tariff provides an intuitive, reliable method for generating individual cause assignment and CSMFs. The method is transparent and flexible and can be readily implemented by users without training in statistics or computer science. PMID:21816107
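The additive scoring step can be sketched as below. The abstract does not say how the tariffs themselves are derived or how the summed scores are turned into a final assignment, so the tariff values here are invented and the "highest score wins" rule is a simplifying assumption, not the published procedure.

```python
def score_causes(symptoms_present, tariffs):
    """Sum the tariffs of the endorsed signs/symptoms for each cause."""
    return {
        cause: sum(by_symptom.get(symptom, 0.0) for symptom in symptoms_present)
        for cause, by_symptom in tariffs.items()
    }


def predict_cause(symptoms_present, tariffs):
    """Assumed decision rule: pick the cause with the largest summed score."""
    scores = score_causes(symptoms_present, tariffs)
    return max(scores, key=scores.get)


# Entirely hypothetical tariff table: cause -> {symptom: tariff}.
tariffs = {
    "pneumonia": {"cough": 4.0, "fever": 2.0},
    "injury": {"fall": 6.0, "fever": -1.0},
}

scores = score_causes({"cough", "fever"}, tariffs)
predicted = predict_cause({"cough", "fever"}, tariffs)
```

Because the score is a plain sum, the contribution of each sign or symptom to a prediction can be read directly from the table, which is consistent with the transparency the abstract emphasizes.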
HYPOTHESIS SETTING AND ORDER STATISTIC FOR ROBUST GENOMIC META-ANALYSIS.
Song, Chi; Tseng, George C
2014-01-01
Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values (rth ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.
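A minimal sketch of the rOP test statistic and its null p-value follows. It assumes K independent studies whose p-values are uniform on [0, 1] under the null, in which case the r-th smallest p-value follows a Beta(r, K - r + 1) distribution; the function name is hypothetical, and the paper's procedure for choosing r and its one-sided correction are not reproduced here.

```python
from math import comb


def rop_pvalue(pvalues, r):
    """P-value of the r-th ordered p-value across K independent studies.

    For integer r the Beta(r, K - r + 1) CDF equals a binomial tail sum:
    P(at least r of K uniform p-values fall at or below x).
    """
    k = len(pvalues)
    if not 1 <= r <= k:
        raise ValueError("r must satisfy 1 <= r <= number of studies")
    x = sorted(pvalues)[r - 1]  # the rOP statistic itself
    return sum(comb(k, j) * x**j * (1.0 - x) ** (k - j) for j in range(r, k + 1))


# Hypothetical p-values for one gene in five studies; r = 3 targets
# "differentially expressed in the majority of studies".
p_majority = rop_pvalue([0.001, 0.02, 0.03, 0.40, 0.75], r=3)
```

With r = 1 the formula reduces to the minimum p-value method and with r = K to the maximum p-value method, the two extremes the abstract compares against.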
Quantitative shape analysis with weighted covariance estimates for increased statistical efficiency
2013-01-01
Background: The introduction and statistical formalisation of landmark-based methods for analysing biological shape has made a major impact on comparative morphometric analyses. However, a satisfactory solution for including information from 2D/3D shapes represented by ‘semi-landmarks’ alongside well-defined landmarks into the analyses is still missing. Also, there has not been an integration of a statistical treatment of measurement error in the current approaches. Results: We propose a procedure based upon the description of landmarks with measurement covariance, which extends statistical linear modelling processes to semi-landmarks for further analysis. Our formulation is based upon a self consistent approach to the construction of likelihood-based parameter estimation and includes corrections for parameter bias, induced by the degrees of freedom within the linear model. The method has been implemented and tested on measurements from 2D fly wing, 2D mouse mandible and 3D mouse skull data. We use these data to explore possible advantages and disadvantages over the use of standard Procrustes/PCA analysis via a combination of Monte-Carlo studies and quantitative statistical tests. In the process we show how appropriate weighting provides not only greater stability but also more efficient use of the available landmark data. The set of new landmarks generated in our procedure (‘ghost points’) can then be used in any further downstream statistical analysis. Conclusions: Our approach provides a consistent way of including different forms of landmarks into an analysis and reduces instabilities due to poorly defined points. Our results suggest that the method has the potential to be utilised for the analysis of 2D/3D data, and in particular, for the inclusion of information from surfaces represented by multiple landmark points. PMID:23548043
SEM/EDX spectrum imaging and statistical analysis of a metal/ceramic braze
KOTULA,PAUL G.; KEENAN,MICHAEL R.; ANDERSON,IAN M.
2000-01-25
Energy dispersive x-ray (EDX) spectrum imaging has been performed in a scanning electron microscope (SEM) on a metal/ceramic braze to characterize the elemental distribution near the interface. Statistical methods were utilized to extract the relevant information (i.e., chemical phases and their distributions) from the spectrum image data set in a robust and unbiased way. The raw spectrum image was over 15 Mbytes (7500 spectra) while the statistical analysis resulted in five spectra and five images which describe the phases resolved above the noise level and their distribution in the microstructure.
NASA Technical Reports Server (NTRS)
Koch, Steven E.; Golus, Robert E.
1988-01-01
This paper presents a statistical analysis of the characteristics of the wavelike activity that occurred over the north-central United States on July 11-12, 1981, using data from the Cooperative Convective Precipitation Experiment in Montana. In particular, two distinct wave episodes of about 8-h duration within a longer (33 h) period of wave activity were studied in detail. It is demonstrated that the observed phenomena display features consistent with those of mesoscale gravity waves. The principles of statistical methods used to detect and track mesoscale gravity waves are discussed together with their limitations.
Analysis methods for the determination of anthropogenic additions of P to agricultural soils
Technology Transfer Automated Retrieval System (TEKTRAN)
Phosphorus additions and measurement in soil is of concern on lands where biosolids have been applied. Colorimetric analysis for plant-available P may be inadequate for the accurate assessment of soil P. Phosphate additions in a regulatory environment need to be accurately assessed as the reported...
Statistical models for the analysis and design of digital polymerase chain reaction (dPCR) experiments
Dorazio, Robert; Hunter, Margaret
2015-01-01
Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
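The link between the complementary log-log function and concentration can be illustrated for the simplest case of one sample and no covariates. The sketch assumes the usual Poisson argument (a partition of volume v is positive with probability 1 - exp(-c v) at concentration c) and takes the offset to be log(v); both are assumptions about the details, since the abstract only states that the offset depends on partition volume. Function names and numbers are hypothetical.

```python
import math


def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


def estimate_concentration(positives, partitions, volume):
    """Intercept-only estimate of concentration (copies per unit volume).

    Under the assumed model, cloglog(p) = log(concentration) + log(volume),
    so subtracting the offset log(volume) leaves log(concentration).
    """
    if not 0 < positives < partitions:
        raise ValueError("need at least one positive and one negative partition")
    p_hat = positives / partitions
    return math.exp(cloglog(p_hat) - math.log(volume))


# Hypothetical run: 5000 positive partitions out of 20000, each 0.00085 uL.
conc = estimate_concentration(5000, 20000, volume=0.00085)
```

Adding covariates turns this closed-form estimate into a regression on the same scale, which is where a generalized linear model fitted by standard software takes over.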
NASA Astrophysics Data System (ADS)
Piersanti, Mirko; Materassi, Massimo; Spogli, Luca; Cicone, Antonio; Alberti, Tommaso
2016-04-01
Highly irregular fluctuations of the power of trans-ionospheric GNSS signals, namely radio power scintillation, are, at least to a large extent, the effect of ionospheric plasma turbulence, a by-product of the non-linear and non-stationary evolution of the plasma fields defining the Earth's upper atmosphere. One could expect the ionospheric turbulence characteristics of inter-scale coupling, local randomness and high time variability to be inherited by the scintillation on radio signals crossing the medium. On this basis, remote sensing of local features of the turbulent plasma by studying radio scintillation could be expected to be feasible. The dependence of the statistical properties of the medium fluctuations on the space- and time-scale is the distinctive character of intermittent turbulent media. In this paper, a multi-scale statistical analysis of some samples of GPS radio scintillation is presented: the idea is that assessing how the statistics of signal fluctuations vary with time scale under different helio-geophysical conditions will help in understanding the corresponding multi-scale statistics of the turbulent medium causing that scintillation. In particular, two techniques are tested as multi-scale decomposition schemes of the signals: the discrete wavelet analysis and the Empirical Mode Decomposition. The results of the two analyses are compared, highlighting the benefits and limits of each scheme, also under suitably different helio-geophysical conditions.
An Application of Multivariate Statistical Analysis for Query-Driven Visualization
Gosink, Luke J.; Garth, Christoph; Anderson, John C.; Bethel, E. Wes; Joy, Kenneth I.
2010-03-01
Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.
Shadish, William R; Hedges, Larry V; Pustejovsky, James E
2014-04-01
This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs.
Belianinov, Alex; Panchapakesan, G.; Lin, Wenzhi; Sales, Brian C.; Sefat, Athena Safa; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.
2014-12-02
Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1-xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.
Advanced statistical methods for improved data analysis of NASA astrophysics missions
NASA Astrophysics Data System (ADS)
Feigelson, Eric D.
The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.
NASA Astrophysics Data System (ADS)
Belianinov, Alex; Ganesh, Panchapakesan; Lin, Wenzhi; Sales, Brian C.; Sefat, Athena S.; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.
2014-12-01
Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1-xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.
On the Interpretation of Running Trends as Summary Statistics for Time Series Analysis
NASA Astrophysics Data System (ADS)
Vigo, Isabel M.; Trottini, Mario; Belda, Santiago
2016-04-01
In recent years, running trends analysis (RTA) has been widely used in climate applied research as summary statistics for time series analysis. There is no doubt that RTA might be a useful descriptive tool, but despite its general use in applied research, precisely what it reveals about the underlying time series is unclear and, as a result, its interpretation is unclear too. This work contributes to such interpretation in two ways: 1) an explicit formula is obtained for the set of time series with a given series of running trends, making it possible to show that running trends, alone, perform very poorly as summary statistics for time series analysis; and 2) an equivalence is established between RTA and the estimation of a (possibly nonlinear) trend component of the underlying time series using a weighted moving average filter. Such equivalence provides a solid ground for RTA implementation and interpretation/validation.
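The connection between running trends and moving-average filtering can be seen in a small sketch. It shows only the elementary fact that an ordinary least-squares slope over a sliding window equals a weighted moving average of the data, with weights proportional to the centred time index; the paper's explicit formula and its exact equivalence result are not reproduced, and all names and data below are hypothetical.

```python
def ols_slope(window):
    """Least-squares slope of the values against time steps 0, 1, ..., n-1."""
    n = len(window)
    t_mean = (n - 1) / 2
    y_mean = sum(window) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def running_trends(series, width):
    """Slope fitted separately in every window of the given width (>= 2)."""
    return [ols_slope(series[i:i + width]) for i in range(len(series) - width + 1)]


def trend_filter_weights(width):
    """Moving-average weights that reproduce the windowed OLS slope."""
    t_mean = (width - 1) / 2
    den = sum((t - t_mean) ** 2 for t in range(width))
    return [(t - t_mean) / den for t in range(width)]


def filtered(series, width):
    """Apply the weights as a moving-average filter over the series."""
    w = trend_filter_weights(width)
    return [
        sum(wi * yi for wi, yi in zip(w, series[i:i + width]))
        for i in range(len(series) - width + 1)
    ]


series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]  # made-up data
trends = running_trends(series, width=3)
smoothed = filtered(series, width=3)
```

The two lists agree because the weights sum to zero, so the window mean drops out of the slope formula.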
Advanced statistical methods for improved data analysis of NASA astrophysics missions
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.
1992-01-01
The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.
Statistical Analysis of Current Sheets in Three-dimensional Magnetohydrodynamic Turbulence
NASA Astrophysics Data System (ADS)
Zhdankin, Vladimir; Uzdensky, Dmitri A.; Perez, Jean C.; Boldyrev, Stanislav
2013-07-01
We develop a framework for studying the statistical properties of current sheets in numerical simulations of magnetohydrodynamic (MHD) turbulence with a strong guide field, as modeled by reduced MHD. We describe an algorithm that identifies current sheets in a simulation snapshot and then determines their geometrical properties (including length, width, and thickness) and intensities (peak current density and total energy dissipation rate). We then apply this procedure to simulations of reduced MHD and perform a statistical analysis on the obtained population of current sheets. We evaluate the role of reconnection by separately studying the populations of current sheets which contain magnetic X-points and those which do not. We find that the statistical properties of the two populations are different in general. We compare the scaling of these properties to phenomenological predictions obtained for the inertial range of MHD turbulence. Finally, we test whether the reconnecting current sheets are consistent with the Sweet-Parker model.
Statistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech
NASA Astrophysics Data System (ADS)
Přibil, J.; Přibilová, A.
2009-01-01
The paper addresses how microintonation and spectral properties are reflected in acted emotional speech of male and female speakers. The microintonation component of speech melody is analyzed with regard to its spectral and statistical parameters. According to psychological research on emotional speech, different emotions are accompanied by different spectral noise. We control its amount by spectral flatness, according to which the high frequency noise is mixed in voiced frames during cepstral speech synthesis. Our experiments are aimed at statistical analysis of cepstral coefficient values and ranges of spectral flatness in three emotions (joy, sadness, anger), and a neutral state for comparison. Calculated histograms of spectral flatness distribution are visually compared and modelled by a Gamma probability distribution. Histograms of cepstral coefficient distribution are evaluated and compared using skewness and kurtosis. The statistical results show good correlation between male and female voices for all emotional states portrayed by several Czech and Slovak professional actors.
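Spectral flatness can be computed as in the sketch below. The abstract does not give the authors' exact definition, so the common one is assumed here: the ratio of the geometric mean to the arithmetic mean of the power spectrum of a frame. The function name and the sample values are hypothetical.

```python
import math


def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of the power values.

    Values near 1 indicate a noise-like (flat) spectrum; values near 0
    indicate a spectrum dominated by a few strong components.
    """
    if not power_spectrum or any(p <= 0 for p in power_spectrum):
        raise ValueError("power values must be positive and non-empty")
    n = len(power_spectrum)
    log_mean = sum(math.log(p) for p in power_spectrum) / n
    arithmetic_mean = sum(power_spectrum) / n
    return math.exp(log_mean) / arithmetic_mean


flat = spectral_flatness([2.0, 2.0, 2.0, 2.0])    # noise-like frame
peaky = spectral_flatness([100.0, 1.0, 1.0, 1.0])  # tonal frame
```

Working in the log domain avoids overflow when the spectrum has many bins.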
Statistical analysis of CMEs' geoeffectiveness over one year of solar maximum during cycle 23
NASA Astrophysics Data System (ADS)
Schmieder, Brigitte; Bocchialini, Karine; Menvielle, Michel
2016-07-01
Using different propagation models from the Sun to the Earth, we performed a statistical analysis over the year 2002 of the geoeffectiveness of CMEs linked to sudden storm commencements (ssc). We also classified the perturbations of the interplanetary medium that trigger the sscs. For each CME, the sources of the CME on the Sun are identified, as are the properties of the parameters deduced from spacecraft measurements along the path of the CME-related event, in the solar atmosphere, the interplanetary medium, and the Earth's ionized (magnetosphere and ionosphere) and neutral (thermosphere) environments. The set of observations is statistically analysed so as to evaluate the geoeffectiveness of CMEs in terms of ionospheric and thermospheric signatures, with attention to possible differences related to different kinds of solar sources. The observed Sun-to-Earth travel times are compared to those estimated using the existing models of propagation in the interplanetary medium, and this comparison is used to statistically assess the performance of the various models.
Quantitative Analysis of Polymer Additives with MALDI-TOF MS Using an Internal Standard Approach
NASA Astrophysics Data System (ADS)
Schwarzinger, Clemens; Gabriel, Stefan; Beißmann, Susanne; Buchberger, Wolfgang
2012-06-01
MALDI-TOF MS is used for the qualitative analysis of seven different polymer additives directly from the polymer without tedious sample pretreatment. Additionally, by using a solid sample preparation technique, which avoids the concentration gradient problems known to occur with dried droplets and by adding tetraphenylporphyrine as an internal standard to the matrix, it is possible to perform quantitative analysis of additives directly from the polymer sample. Calibration curves for Tinuvin 770, Tinuvin 622, Irganox 1024, Irganox 1010, Irgafos 168, and Chimassorb 944 are presented, showing coefficients of determination between 0.911 and 0.990.
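The coefficient of determination reported for such calibration curves can be sketched as follows. This is a minimal example assuming a straight-line calibration of the analyte-to-internal-standard signal ratio against concentration; the helper names `fit_line` and `r_squared` and all numbers are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical calibration points: additive concentration versus the
# ratio of additive signal to internal-standard signal.
concentration = [0.1, 0.2, 0.4, 0.8]
signal_ratio = [0.21, 0.39, 0.83, 1.58]
slope, intercept = fit_line(concentration, signal_ratio)
r2 = r_squared(concentration, signal_ratio, slope, intercept)
```

Dividing by the internal-standard signal before fitting is what compensates for shot-to-shot intensity variation; the regression itself is ordinary.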
Properties of some statistics for AR-ARCH model with application to technical analysis
NASA Astrophysics Data System (ADS)
Huang, Xudong; Liu, Wei
2009-03-01
In this paper, we investigate some popular technical analysis indexes under an AR-ARCH model of a real stock market. Under the given conditions, we show that the corresponding statistics are asymptotically stationary and that the law of large numbers holds for the frequencies with which stock prices fall outside the normal range of these technical analysis indexes under AR-ARCH. We also give the rate of convergence in the case of nonstationary initial values, which provides a mathematical rationale for these methods of technical analysis in monitoring security trends.
ERIC Educational Resources Information Center
Green, Jeffrey J.; Stone, Courtenay C.; Zegeye, Abera; Charles, Thomas A.
2009-01-01
Because statistical analysis requires the ability to use mathematics, students typically are required to take one or more prerequisite math courses prior to enrolling in the business statistics course. Despite these math prerequisites, however, many students find it difficult to learn business statistics. In this study, we use an ordered probit…
Weck, P J; Schaffner, D A; Brown, M R; Wicks, R T
2015-02-01
The Bandt-Pompe permutation entropy and the Jensen-Shannon statistical complexity are used to analyze fluctuating time series of three different turbulent plasmas: the magnetohydrodynamic (MHD) turbulence in the plasma wind tunnel of the Swarthmore Spheromak Experiment (SSX), drift-wave turbulence of ion saturation current fluctuations in the edge of the Large Plasma Device (LAPD), and fully developed turbulent magnetic fluctuations of the solar wind taken from the Wind spacecraft. The entropy and complexity values are presented as coordinates on the CH plane for comparison among the different plasma environments and other fluctuation models. The solar wind is found to have the highest permutation entropy and lowest statistical complexity of the three data sets analyzed. Both laboratory data sets have larger values of statistical complexity, suggesting that these systems have fewer degrees of freedom in their fluctuations, with SSX magnetic fluctuations having slightly less complexity than the LAPD edge I(sat). The CH plane coordinates are compared to the shape and distribution of a spectral decomposition of the wave forms. These results suggest that fully developed turbulence (solar wind) occupies the lower-right region of the CH plane, and that other plasma systems considered to be turbulent have less permutation entropy and more statistical complexity. This paper presents use of this statistical analysis tool on solar wind plasma, as well as on an MHD turbulent experimental plasma. PMID:25768612
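The Bandt-Pompe permutation entropy used in this abstract can be sketched in a few lines. The example below is an illustrative implementation, not the authors' code: the function name, the default embedding `order=3`, and the tie-breaking rule (ties resolved by position) are assumptions, and the Jensen-Shannon statistical complexity is not computed here.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1):
    """Normalized Bandt-Pompe permutation entropy in [0, 1].

    Each window of `order` samples is replaced by the permutation that
    sorts it; the Shannon entropy of the permutation frequencies is then
    divided by log(order!) so that 1 means all patterns are equally likely.
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series is too short for this order and delay")
    counts = Counter()
    for i in range(n_windows):
        window = [series[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices of the window sorted by value
        # (stable sort, so equal values keep their time order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    probabilities = [c / n_windows for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probabilities)
    return entropy / math.log(math.factorial(order))


# A monotonic ramp has a single ordinal pattern, hence zero entropy.
ramp_entropy = permutation_entropy(list(range(20)))
# A short irregular series yields an intermediate value.
mixed_entropy = permutation_entropy([4, 7, 9, 10, 6, 11, 3])
```

On the CH plane described above, this normalized entropy would be the horizontal coordinate; the complexity coordinate requires an additional Jensen-Shannon divergence term.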
NASA Astrophysics Data System (ADS)
Weck, P. J.; Schaffner, D. A.; Brown, M. R.; Wicks, R. T.
2015-02-01
The Bandt-Pompe permutation entropy and the Jensen-Shannon statistical complexity are used to analyze fluctuating time series of three different turbulent plasmas: the magnetohydrodynamic (MHD) turbulence in the plasma wind tunnel of the Swarthmore Spheromak Experiment (SSX), drift-wave turbulence of ion saturation current fluctuations in the edge of the Large Plasma Device (LAPD), and fully developed turbulent magnetic fluctuations of the solar wind taken from the Wind spacecraft. The entropy and complexity values are presented as coordinates on the CH plane for comparison among the different plasma environments and other fluctuation models. The solar wind is found to have the highest permutation entropy and lowest statistical complexity of the three data sets analyzed. Both laboratory data sets have larger values of statistical complexity, suggesting that these systems have fewer degrees of freedom in their fluctuations, with SSX magnetic fluctuations having slightly less complexity than the LAPD edge Isat. The CH plane coordinates are compared to the shape and distribution of a spectral decomposition of the wave forms. These results suggest that fully developed turbulence (solar wind) occupies the lower-right region of the CH plane, and that other plasma systems considered to be turbulent have less permutation entropy and more statistical complexity. This paper presents use of this statistical analysis tool on solar wind plasma, as well as on an MHD turbulent experimental plasma.
TEGS-CN: A Statistical Method for Pathway Analysis of Genome-wide Copy Number Profile.
Huang, Yen-Tsung; Hsu, Thomas; Christiani, David C
2014-01-01
The effects of copy number alterations make up a significant part of the tumor genome profile, but pathway analyses of these alterations are still not well established. We proposed a novel method to analyze multiple copy numbers of genes within a pathway, termed Test for the Effect of a Gene Set with Copy Number data (TEGS-CN). TEGS-CN was adapted from TEGS, a method that we previously developed for gene expression data using a variance component score test. With additional development, we extend the method to analyze DNA copy number data, accounting for different sizes and thus various numbers of copy number probes in genes. The test statistic follows a mixture of χ² distributions that can be obtained using permutation with scaled χ² approximation. We conducted simulation studies to evaluate the size and the power of TEGS-CN and to compare its performance with TEGS. We analyzed genome-wide copy number data from 264 patients with non-small-cell lung cancer. With the Molecular Signatures Database (MSigDB) pathway database, the genome-wide copy number data can be classified into 1814 biological pathways or gene sets. We investigated associations of the copy number profile of the 1814 gene sets with pack-years of cigarette smoking. Our analysis revealed five pathways with significant P values after Bonferroni adjustment (<2.8 × 10⁻⁵), including the PTEN pathway (7.8 × 10⁻⁷), the gene set up-regulated under heat shock (3.6 × 10⁻⁶), the gene sets involved in the immune profile for rejection of kidney transplantation (9.2 × 10⁻⁶) and for transcriptional control of leukocytes (2.2 × 10⁻⁵), and the ganglioside biosynthesis pathway (2.7 × 10⁻⁵). In conclusion, we present a new method for pathway analyses of copy number data, and causal mechanisms of the five pathways require further study.
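The Bonferroni threshold quoted in this abstract can be checked with simple arithmetic. The sketch below assumes a familywise significance level of 0.05, which the abstract does not state explicitly; the dictionary keys are shorthand labels for the five reported pathways.

```python
# Number of gene sets tested, as stated in the abstract.
n_tests = 1814
# Assumed familywise error rate; not stated in the abstract.
alpha = 0.05

# Bonferroni-adjusted per-test threshold.
threshold = alpha / n_tests

# P values reported in the abstract (labels are shorthand).
reported_p = {
    "PTEN pathway": 7.8e-7,
    "heat shock up-regulated": 3.6e-6,
    "kidney transplant rejection": 9.2e-6,
    "leukocyte transcriptional control": 2.2e-5,
    "ganglioside biosynthesis": 2.7e-5,
}
significant = {name: p < threshold for name, p in reported_p.items()}
```

Under that assumption the threshold is roughly 2.76 × 10⁻⁵, which is consistent with the rounded bound given in the abstract, and all five reported P values fall below it.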
Temporal Correlations In Natural Time Analysis and Tsallis Non Extensive Statistical Mechanics
NASA Astrophysics Data System (ADS)
Sarlis, N. V.; Varotsos, P.; Skordas, E. S.
2015-12-01
Upon analyzing the seismic catalog in a new time domain termed natural time[1-3] and employing a sliding natural time window comprising a number of events that would occur in a few months, we find that the fluctuations β of the order parameter of seismicity[4] show a minimum βmin a few months before major earthquakes (EQs)[5,6]. Such a minimum appears simultaneously[7] with the initiation of Seismic Electric Signals activity[8] being the first time in which two geophysical observables of different nature exhibit simultaneous anomalous behavior before major EQs. In addition, we show[9] that each precursory βmin is preceded as well as followed by characteristic changes of temporal correlations between EQ magnitudes identified by the celebrated Detrended Fluctuation Analysis of magnitude time series. We indicate that Tsallis non extensive statistical mechanics[10], in the frame of which kappa distributions arise[11], can capture temporal correlations between EQ magnitudes if complemented with natural time analysis[12]. References: [1] P. A. Varotsos, N. V. Sarlis, and E. S. Skordas, Phys Rev E 66 (2002) 011902. [2] P. A. Varotsos et al., Phys Rev E 72 (2005) 041103. [3] P. A. Varotsos, N. V. Sarlis, and E. S. Skordas, Natural Time Analysis: The new view of time (Springer-Verlag, Berlin Heidelberg) 2011. [4] N. V. Sarlis, E. S. Skordas, and P. A. Varotsos, EPL 91 (2010) 59001. [5] N. V. Sarlis et al., Proc Natl Acad Sci USA 110 (2013) 13734. [6] N. V. Sarlis et al., Proc Natl Acad Sci USA 112 (2015) 986. [7] P. A. Varotsos et al., Tectonophysics 589 (2013) 116. [8] P. Varotsos and M. Lazaridou, Tectonophysics 188 (1991) 321. [9] P. A. Varotsos, N. V. Sarlis, and E. S. Skordas, J Geophys Res Space Physics 119 (2014) 9192, doi: 10.1002/2014JA0205800. [10] C. Tsallis, J Stat Phys 52 (1988) 479. [11] G. Livadiotis and D. J. McComas, J Geophys Res 114 (2009) A11105, doi:10.1029/2009JA014352. [12] N. V. Sarlis, E. S. Skordas, and P. A. Varotsos, Phys Rev E 82 (2010) 021110.
Yang, Jinzhong; Woodward, Wendy A.; Reed, Valerie K.; Strom, Eric A.; Perkins, George H.; Tereffe, Welela; Buchholz, Thomas A.; Zhang, Lifei; Balter, Peter; Court, Laurence E.; Li, X. Allen; Dong, Lei
2014-05-01
Purpose: To develop a new approach for interobserver variability analysis. Methods and Materials: Eight radiation oncologists specializing in breast cancer radiation therapy delineated a patient's left breast “from scratch” and from a template that was generated using deformable image registration. Three of the radiation oncologists had previously received training in Radiation Therapy Oncology Group consensus contouring for breast cancer atlas. The simultaneous truth and performance level estimation algorithm was applied to the 8 contours delineated “from scratch” to produce a group consensus contour. Individual Jaccard scores were fitted to a beta distribution model. We also applied this analysis to 2 or more patients, which were contoured by 9 breast radiation oncologists from 8 institutions. Results: The beta distribution model had a mean of 86.2%, standard deviation (SD) of ±5.9%, a skewness of −0.7, and excess kurtosis of 0.55, exemplifying broad interobserver variability. The 3 RTOG-trained physicians had higher agreement scores than average, indicating that their contours were close to the group consensus contour. One physician had high sensitivity but lower specificity than the others, which implies that this physician tended to contour a structure larger than those of the others. Two other physicians had low sensitivity but specificity similar to the others, which implies that they tended to contour a structure smaller than the others. With this information, they could adjust their contouring practice to be more consistent with others if desired. When contouring from the template, the beta distribution model had a mean of 92.3%, SD ± 3.4%, skewness of −0.79, and excess kurtosis of 0.83, which indicated a much better consistency among individual contours. Similar results were obtained for the analysis of 2 additional patients. Conclusions: The proposed statistical approach was able to measure interobserver variability quantitatively and to
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
Bihn T. Pham; Jeffrey J. Einerson
2010-06-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
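Of the three techniques named in this abstract, control charting is the simplest to sketch. The example below assumes a basic Shewhart-style chart with limits at three standard deviations around a baseline mean; NDMAS is SAS-based and its actual charting rules are not given in the abstract, so the function names and thermocouple readings here are hypothetical.

```python
import statistics


def control_limits(baseline, n_sigma=3.0):
    """Return (lower, center, upper) limits from baseline readings."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation
    return center - n_sigma * sigma, center, center + n_sigma * sigma


def out_of_control(readings, limits):
    """Indices of readings that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, r in enumerate(readings) if r < lower or r > upper]


# Hypothetical thermocouple readings in degrees Celsius.
baseline = [1000.0, 1002.0, 998.0, 1001.0, 999.0]
limits = control_limits(baseline)
flagged = out_of_control([1001.0, 1010.0, 999.0, 980.0], limits)
```

A reading flagged this way would only be a warning of possible thermocouple drift or failure; confirming it would rely on the complementary correlation and regression checks the abstract describes.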
2010-01-01
Background: Exploration of DNA methylation and its impact on various regulatory mechanisms has become a very active field of research. At the same time, there is a growing need for tools to process and analyse the data together with statistical investigation and visualisation. Findings: MethVisual is a new application that enables exploratory analysis and intuitive visualization of DNA methylation data as is typically generated by bisulfite sequencing. The package allows the import of DNA methylation sequences, aligns them and performs quality control comparison. It comprises basic analysis steps such as lollipop visualization, co-occurrence display of methylation of neighbouring and distant CpG sites, summary statistics on methylation status, clustering and correspondence analysis. The package has been developed for methylation data but can also be used for other data types for which binary coding can be inferred. The application of the package, as well as a comparison to existing DNA methylation analysis tools and its workflow based on two datasets, is presented in this paper. Conclusions: The R package MethVisual offers various analysis procedures for data that can be binarized, in particular for bisulfite sequenced methylation data. R/Bioconductor has become one of the most important environments for statistical analysis of various types of biological and medical data. Therefore, any data analysis within R that allows the integration of various data types as provided from different technological platforms is convenient. It is the first and so far the only specific package for DNA methylation analysis, in particular for bisulfite sequenced data, available in the R/Bioconductor environment. The package is available for free at http://methvisual.molgen.mpg.de/ and from the Bioconductor Consortium http://www.bioconductor.org. PMID:21159174
Anomalous heat transfer modes of nanofluids: a review based on statistical analysis.
Sergis, Antonis; Hardalupas, Yannis
2011-05-19
This paper contains the results of a concise statistical review analysis of a large amount of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practise with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids.
Lindsey, David A.
2001-01-01
Pebble count data from Quaternary gravel deposits north of Denver, Colo., were analyzed by multivariate statistical methods to identify lithologic factors that might affect aggregate quality. The pebble count data used in this analysis were taken from the map by Colton and Fitch (1974) and are supplemented by data reported by the Front Range Infrastructure Resources Project. This report provides data tables and results of the statistical analysis. The multivariate statistical analysis used here consists of log-contrast principal components analysis (method of Reyment and Savazzi, 1999) followed by rotation of principal components and factor interpretation. Three lithologic factors that might affect aggregate quality were identified: 1) granite and gneiss versus pegmatite, 2) quartz + quartzite versus total volcanic rocks, and 3) total sedimentary rocks (mainly sandstone) versus granite. Factor 1 (grain size of igneous and metamorphic rocks) may represent destruction during weathering and transport or varying proportions of rocks in source areas. Factor 2 (resistant source rocks) represents the dispersion shadow of metaquartzite detritus, perhaps enhanced by resistance of quartz and quartzite during weathering and transport. Factor 3 (proximity to sandstone source) represents dilution of gravel by soft sedimentary rocks (mainly sandstone), which are exposed mainly in hogbacks near the mountain front. Factor 1 probably does not affect aggregate quality. Factor 2 would be expected to enhance aggregate quality as measured by the Los Angeles degradation test. Factor 3 may diminish aggregate quality.
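Pebble counts are compositional data, so a log-ratio transform is normally applied before principal components analysis. The sketch below shows the centered log-ratio (clr) transform as one common choice; whether it matches the Reyment and Savazzi procedure cited in the abstract is an assumption, and the lithology percentages are hypothetical.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition.

    Each part is replaced by its log minus the mean log of all parts,
    so the transformed values always sum to zero.
    """
    if any(part <= 0 for part in composition):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical pebble count percentages for one sample:
# granite, quartzite, sandstone.
sample_percent = [60.0, 30.0, 10.0]
transformed = clr(sample_percent)
```

Because the transform depends only on ratios between parts, expressing the same sample as counts, fractions, or percentages gives identical transformed values, which is the property that makes the subsequent principal components meaningful for closed data.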
Anomalous heat transfer modes of nanofluids: a review based on statistical analysis
2011-01-01
This paper contains the results of a concise statistical review analysis of a large amount of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practise with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids. PMID:21711932
Anomalous heat transfer modes of nanofluids: a review based on statistical analysis
NASA Astrophysics Data System (ADS)
Sergis, Antonis; Hardalupas, Yannis
2011-05-01
This paper contains the results of a concise statistical review analysis of a large amount of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practise with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids.
Badran, M; Morsy, R; Soliman, H; Elnimr, T
2016-01-01
The metabolism of trace elements has been reported to play specific roles in the pathogenesis and progression of diabetes mellitus. Given the continuous increase in the number of patients with Type 2 diabetes (T2D), this study aims to assess the levels and inter-relationships of fasting blood glucose (FBG) and serum trace elements in Type 2 diabetic patients. This study was conducted on 40 Egyptian Type 2 diabetic patients and 36 healthy volunteers (Hospital of Tanta University, Tanta, Egypt). The blood serum was digested and then used to determine the levels of 24 trace elements using inductively coupled plasma mass spectrometry (ICP-MS). Multivariate statistical analysis, based on correlation coefficients, cluster analysis (CA) and principal component analysis (PCA), was used to analyze the data. The results showed significant changes in the levels of FBG and eight trace elements (Zn, Cu, Se, Fe, Mn, Cr, Mg, and As) in the blood serum of Type 2 diabetic patients relative to those of healthy controls. The multivariate statistical techniques reduced the number of experimental variables and grouped the trace elements in patients into three clusters. The application of PCA revealed a distinct difference in the associations of trace elements and their clustering patterns between the control and patient groups, in particular for Mg, Fe, Cu, and Zn, which appeared to be the most crucial factors related to Type 2 diabetes. Therefore, on the basis of this study, the contributors to trace element content in Type 2 diabetic patients can be determined and specified using correlation and multivariate statistical analysis, which confirms that the alteration of some essential trace metals may play a role in the development of diabetes mellitus. PMID:26653752
A statistical method for the analysis of nonlinear temperature time series from compost.
Yu, Shouhai; Clark, O Grant; Leonard, Jerry J
2008-04-01
Temperature is widely accepted as a critical indicator of aerobic microbial activity during composting but, to date, little effort has been made to devise an appropriate statistical approach for the analysis of temperature time series. Nonlinear, time-correlated effects have not previously been considered in the statistical analysis of temperature data from composting, despite their importance and the ubiquity of such features. A novel mathematical model is proposed here, based on a modified Gompertz function, which includes nonlinear, time-correlated effects. Methods are shown to estimate initial values for the model parameter. Algorithms in SAS are used to fit the model to different sets of temperature data from passively aerated compost. Methods are then shown for testing the goodness-of-fit of the model to data. Next, a method is described to determine, in a statistically rigorous manner, the significance of differences among the time-correlated characteristics of the datasets as described using the proposed model. An extra-sum-of-squares method was selected for this purpose. Finally, the model and methods are used to analyze a sample dataset and are shown to be useful tools for the statistical comparison of temperature data in composting. PMID:17997302
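The extra-sum-of-squares comparison mentioned in this abstract reduces to a standard F statistic for nested models. The sketch below shows that statistic only; the authors' modified Gompertz model and SAS fitting code are not reproduced, and the function name and numbers are hypothetical.

```python
def extra_sum_of_squares_f(ss_reduced, df_reduced, ss_full, df_full):
    """F statistic comparing a reduced (pooled) model with a full model.

    ss_* are residual sums of squares and df_* the residual degrees of
    freedom. The reduced model has fewer parameters, so it must have
    more residual degrees of freedom than the full model.
    """
    if df_reduced <= df_full:
        raise ValueError("reduced model must have more residual df")
    extra_per_df = (ss_reduced - ss_full) / (df_reduced - df_full)
    residual_per_df = ss_full / df_full
    return extra_per_df / residual_per_df


# Hypothetical example: one shared curve for two compost datasets
# (reduced) versus separate parameters for each dataset (full).
f_value = extra_sum_of_squares_f(
    ss_reduced=120.0, df_reduced=48, ss_full=100.0, df_full=45
)
```

The resulting value would be compared with an F distribution on (df_reduced - df_full, df_full) degrees of freedom; a large value suggests the datasets differ in their fitted curve parameters.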
NASA Astrophysics Data System (ADS)
Wilms, Matthias; Ehrhardt, Jan; Werner, René; Marx, Mirko; Handels, Heinz
2014-03-01
Respiratory motion and its variability lead to location uncertainties in radiation therapy (RT) of thoracic and abdominal tumors. Current approaches for motion compensation in RT are usually driven by respiratory surrogate signals, e.g., spirometry. In this contribution, we present an approach for statistical analysis, modeling and subsequent simulation of surrogate signals on a cycle-by-cycle basis. The simulated signals represent typical patient-specific variations of, e.g., breathing amplitude and cycle period. For the underlying statistical analysis, all breathing cycles of an observed signal are consistently parameterized using approximating B-spline curves. Statistics on breathing cycles are then performed by using the parameters of the B-spline approximations. Assuming that these parameters follow a multivariate Gaussian distribution, realistic time-continuous surrogate signals of arbitrary length can be generated and used to simulate the internal motion of tumors and organs based on a patient-specific diffeomorphic correspondence model. As an example, we show how this approach can be employed in RT treatment planning to calculate tumor appearance probabilities and to statistically assess the impact of respiratory motion and its variability on planned dose distributions.
Confirmatory Factor Analysis of the Statistical Anxiety Rating Scale With Online Graduate Students.
DeVaney, Thomas A
2016-04-01
The Statistical Anxiety Rating Scale was examined using data from a convenience sample of 450 female and 65 male students enrolled in online, graduate-level introductory statistics courses. The mean age of the students was 33.1 (SD = 8.2), and 58.3% had completed six or fewer online courses. The majority of students were enrolled in education or counseling degree programs. Confirmatory factor analysis using unweighted least squares estimation was used to test three proposed models, and alpha coefficients were used to examine the internal consistency. The confirmatory factor analysis results supported the six-factor structure and indicated that proper models should include correlations among the six factors or two second-order factors (anxiety and attitude). Internal consistency estimates ranged from .82 to .95 and were consistent with values reported by previous researchers. The findings suggest that, when measuring statistics anxiety of online students using Statistical Anxiety Rating Scale, researchers and instructors can use scores from the individual subscales or generate two composite scores, anxiety and attitude, instead of a total score. PMID:27154380
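The alpha coefficients used here for internal consistency are usually Cronbach's alpha. The sketch below assumes that standard formula; the function name and the toy item responses are hypothetical and unrelated to the study data.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    `item_scores` holds one list per item, each with one score per
    respondent, all in the same respondent order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    sum_item_variance = sum(statistics.variance(item) for item in item_scores)
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1.0 - sum_item_variance / total_variance)


# Hypothetical subscale: three items answered by five respondents.
subscale = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(subscale)
```

Applied per subscale, this gives values comparable to the range the abstract reports; it says nothing about factor structure, which is what the confirmatory factor analysis addresses.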
NASA Astrophysics Data System (ADS)
Song, L. Y.; Wang, H. Q.; Gao, J. J.; Yang, J. F.; Liu, W. B.; Chen, P.
2012-05-01
Condition diagnosis of roller bearings depends largely on the feature analysis of vibration signals. The spectrum statistics filter (SSF) method can adaptively reduce noise. This method is based on hypothesis testing in the frequency domain to eliminate the identical component between the reference signal and the primary signal. This paper presents a statistical parameter, namely the similarity factor, to evaluate the filtering performance. The performance of the method is compared with that of the classical method, the band pass filter (BPF). Results show that the statistics filter is preferable to BPF in vibration signal processing. Moreover, the significance level α would be optimized by genetic algorithms. However, it is very difficult to identify fault states from the time domain waveform or frequency spectrum alone when the effect of the noise is strong or the fault feature is not obvious. Pattern recognition is therefore applied to fault diagnosis in this study through a system clustering method. This paper processes experimental rig data with the statistics filter, after which the accuracy of the clustering analysis increases substantially.
Spouge, J L
1992-08-15
Reports on retroviral primate trials rarely publish any statistical analysis. Present statistical methodology lacks appropriate tests for these trials and effectively discourages quantitative assessment. This paper describes the theory behind VACMAN, a user-friendly computer program that calculates statistics for in vitro and in vivo infectivity data. VACMAN's analysis applies to many retroviral trials using i.v. challenges and is valid whenever the viral dose-response curve has a particular shape. Statistics from actual i.v. retroviral trials illustrate some unappreciated principles of effective animal use: dilutions other than 1:10 can improve titration accuracy; infecting titration animals at the lowest doses possible can lower challenge doses; and finally, challenging test animals in small trials with more virus than controls safeguards against false successes, "reuses" animals, and strengthens experimental conclusions. The theory presented also explains the important concept of viral saturation, a phenomenon that may cause in vitro and in vivo titrations to agree for some retroviral strains and disagree for others.
Tenan, Matthew S; Marti, C Nathan; Griffin, Lisa
2014-12-01
Statistical analysis of motor unit discharge rate commonly uses ordinary least squares based ANOVA and regression analyses, or a repeated-measures ANOVA to account for within motor unit variance when the same motor unit is assessed multiple times. Both of these methods assume statistical independence of multiple motor units assessed within an individual. This investigation details two studies which quantify the statistical dependence of motor units within an individual. During a ramp contraction, motor unit initial discharge rate is mildly correlated within an individual (ICC: 0.11), though accounting for this effect significantly impacts regression analysis (p=0.01). When a contraction is held at constant force and multiple observations are made on a motor unit, the motor unit discharges are more highly correlated (ICC: 0.41), even after accounting for the effects of multiple motor unit observations. A subject-level ICC of 0.01 can increase the Type 1 error rate to 3.9-19.7%, depending on the number of motor units and study subjects. The increase in Type 1 error due to subject-level effects can be mitigated through the use of multilevel modeling techniques. This study details the use and benefit of multilevel models when statistically analyzing motor unit discharge data.
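To illustrate why even a small subject-level ICC matters, the sketch below computes a one-way random-effects ICC for balanced groups and the Kish design effect, 1 + (m - 1) * ICC, which approximates how much the variance of a mean is inflated when m correlated observations come from each subject. This is a generic textbook calculation under the assumption of equal group sizes, not the multilevel model fitted in the study, and all names and numbers are illustrative.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way random-effects ICC(1) for balanced groups.

    groups: list of subjects, each a list of m observations
    (e.g. discharge rates of m motor units; hypothetical layout).
    Assumes the data are not all identical, otherwise the ratio is undefined.
    """
    n, m = len(groups), len(groups[0])
    if n < 2 or m < 2 or any(len(g) != m for g in groups):
        raise ValueError("need >= 2 balanced groups with >= 2 observations")
    grand = mean(x for g in groups for x in g)
    # Between-subject and within-subject mean squares.
    msb = m * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    msw = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)


def design_effect(icc, cluster_size):
    """Kish design effect: variance inflation from within-cluster correlation."""
    return 1 + (cluster_size - 1) * icc


# With 21 motor units per subject, an ICC of only 0.01 already inflates
# the variance of a subject-pooled mean by about 20 percent.
deff = design_effect(0.01, 21)
```

A design effect above 1 means that tests which treat all motor units as independent understate the standard error, which is consistent with the inflated Type 1 error rates described in the abstract.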
Spouge, J L
1992-01-01
Reports on retroviral primate trials rarely publish any statistical analysis. Present statistical methodology lacks appropriate tests for these trials and effectively discourages quantitative assessment. This paper describes the theory behind VACMAN, a user-friendly computer program that calculates statistics for in vitro and in vivo infectivity data. VACMAN's analysis applies to many retroviral trials using i.v. challenges and is valid whenever the viral dose-response curve has a particular shape. Statistics from actual i.v. retroviral trials illustrate some unappreciated principles of effective animal use: dilutions other than 1:10 can improve titration accuracy; infecting titration animals at the lowest doses possible can lower challenge doses; and finally, challenging test animals in small trials with more virus than controls safeguards against false successes, "reuses" animals, and strengthens experimental conclusions. The theory presented also explains the important concept of viral saturation, a phenomenon that may cause in vitro and in vivo titrations to agree for some retroviral strains and disagree for others. PMID:1323844
NASA Astrophysics Data System (ADS)
Ohyanagi, S.; Dileonardo, C.
2013-12-01
As a natural phenomenon, earthquake occurrence is difficult to predict. Statistical analysis of earthquake data was performed using candlestick chart and Bollinger Band methods. These statistical methods, commonly used in the financial world to analyze market trends, were tested against earthquake data. Earthquakes above Mw 4.0 located on shore of Sanriku (37.75°N ~ 41.00°N, 143.00°E ~ 144.50°E) from February 1973 to May 2013 were selected for analysis. Two specific patterns in earthquake occurrence were recognized through the analysis. One is a spread of the candlestick prior to the occurrence of events greater than Mw 6.0. A second pattern shows convergence in the Bollinger Band, which implies a positive or negative change in the trend of earthquakes. Both patterns match general models for the buildup and release of strain through the earthquake cycle, and agree with the characteristics of both the candlestick chart and Bollinger Band analysis. These results show there is a high correlation between patterns in earthquake occurrence and trend analysis by these two statistical methods. The results of this study support the application of these financial analysis methods to the analysis of earthquake occurrence.
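For readers unfamiliar with Bollinger Bands, the following is a minimal sketch of the usual definition: a moving average over a fixed window, with upper and lower bands placed k standard deviations away. The abstract does not say which earthquake series, window length, or multiplier the authors used, so the window of 20, k = 2, and the input series here are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev


def bollinger_bands(series, window=20, k=2.0):
    """Return (middle, lower, upper) tuples for each full window.

    series: ordered numeric observations, e.g. a hypothetical sequence of
    event counts or magnitudes per time step.
    window, k: conventional defaults from financial charting (assumed here).
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    bands = []
    for i in range(window - 1, len(series)):
        chunk = series[i - window + 1 : i + 1]
        mid = mean(chunk)
        sd = pstdev(chunk)  # population standard deviation of the window
        bands.append((mid, mid - k * sd, mid + k * sd))
    return bands


def band_widths(bands):
    """Upper minus lower band; a shrinking width is the 'convergence' pattern."""
    return [upper - lower for _, lower, upper in bands]


# Hypothetical series: variability shrinks over time, so the band narrows.
example = [4, 8, 3, 9, 5, 6, 5, 6, 5, 5]
widths = band_widths(bollinger_bands(example, window=4))
```

In this toy series the width falls as the values settle, which is the kind of band convergence the abstract associates with an upcoming change in trend.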
ROOT — A C++ framework for petabyte data storage, statistical analysis and visualization
NASA Astrophysics Data System (ADS)
Antcheva, I.; Ballintijn, M.; Bellenot, B.; Biskup, M.; Brun, R.; Buncic, N.; Canal, Ph.; Casadei, D.; Couet, O.; Fine, V.; Franco, L.; Ganis, G.; Gheata, A.; Maline, D. Gonzalez; Goto, M.; Iwaszkiewicz, J.; Kreshuk, A.; Segura, D. Marcos; Maunder, R.; Moneta, L.; Naumann, A.; Offermann, E.; Onuchin, V.; Panacek, S.; Rademakers, F.; Russo, P.; Tadel, M.
2011-06-01
new TEfficiency class has been provided to handle the calculation of efficiencies and their uncertainties, TH2Poly for polygon-shaped bins (e.g. maps), TKDE for kernel density estimation, and TSVDUnfold for singular value decomposition. Graphics: Kerning is now supported in TLatex, PostScript and PDF; a table of contents can be added to PDF files. A new font provides italic symbols. A TPad containing GL can be stored in a binary (i.e. non-vector) image file; support for full-scene anti-aliasing was added. Usability enhancements to EVE. Math: New interfaces for generating random numbers according to a given distribution, goodness of fit tests of unbinned data, binning multidimensional data, and several advanced statistical functions were added. RooFit: Introduction of HistFactory; major additions to RooStats. TMVA: Updated to version 4.1.0, adding e.g. the support for simultaneous classification of multiple output classes for several multivariate methods. PROOF: Many new features, adding to PROOF's usability, plus improvements and fixes. PyROOT: Support of Python 3 has been added. Tutorials: Several new tutorials were provided for the above new features (notably RooStats). A detailed list of all the changes is available at http://root.cern.ch/root/htmldoc/examples/V5. Additional comments: For an up-to-date author list see: http://root.cern.ch/drupal/content/root-development-team and http://root.cern.ch/drupal/content/former-root-developers. The distribution file for this program is over 30 Mbytes and therefore is not delivered directly when download or E-mail is requested. Instead, an HTML file giving details of how the program can be obtained is sent. Running time: Depending on the data size and complexity of analysis algorithms. References: http://root.cern.ch. http://root.cern.ch/drupal/content/production-version-528. I. Antcheva, M. Ballintijn, B. Bellenot, M. Biskup, R. Brun, N. Buncic, Ph. Canal, D. Casadei, O. Couet, V. Fine, L. Franco, G. Ganis, A. Gheata, D
Statistical assessment on a combined analysis of GRYN-ROMN-UCBN upland vegetation vital signs
Irvine, Kathryn M.; Rodhouse, Thomas J.
2014-01-01
As of 2013, the Rocky Mountain and Upper Columbia Basin Inventory and Monitoring Networks have multiple years of vegetation data, the Greater Yellowstone Network has three years of vegetation data, and monitoring is ongoing in all three networks. Our primary objective is to assess whether a combined analysis of these data aimed at exploring correlations with climate and weather data is feasible. We summarize the core survey design elements across protocols and point out the major statistical challenges for a combined analysis at present. The dissimilarity in response designs between ROMN and UCBN-GRYN network protocols presents a statistical challenge that has not been resolved yet. However, the UCBN and GRYN data are compatible as they implement a similar response design; therefore, a combined analysis is feasible and will be pursued in the future. When data collected by different networks are combined, the survey design describing the merged dataset is (likely) a complex survey design. A complex survey design is the result of combining datasets from different sampling designs. A complex survey design is characterized by unequal probability sampling, varying stratification, and clustering (see Lohr 2010 Chapter 7 for a general overview). Statistical analysis of complex survey data requires modifications to standard methods, one of which is to include survey design weights within a statistical model. We focus on this issue for a combined analysis of upland vegetation from these networks, leaving other topics for future research. We conduct a simulation study on the possible effects of equal versus unequal probability selection of points on parameter estimates of temporal trend using available packages within the R statistical computing environment. We find that, as written, using lmer or lm for trend detection in a continuous response and clm and clmm for visually estimated cover classes with “raw” GRTS design weights specified for the weight argument leads to substantially