Information Measures for Statistical Orbit Determination
ERIC Educational Resources Information Center
Mashiku, Alinda K.
2013-01-01
The current Situational Space Awareness (SSA) is faced with a huge task of tracking the increasing number of space objects. The tracking of space objects requires frequent and accurate monitoring for orbit maintenance and collision avoidance using methods for statistical orbit determination. Statistical orbit determination enables us to obtain…
Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses
Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy
2015-01-01
Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. 
Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579
Characterizing the D2 statistic: word matches in biological sequences.
Forêt, Sylvain; Wilson, Susan R; Burden, Conrad J
2009-01-01
Word matches are often used in sequence comparison methods, either as a measure of sequence similarity or in the first search steps of algorithms such as BLAST or BLAT. The D2 statistic is the number of matches of words of k letters between two sequences. Recent advances have been made in the characterization of this statistic and in the approximation of its distribution. Here, these results are extended to the case of approximate word matches. We compute the exact value of the variance of the D2 statistic for the case of a uniform letter distribution, and introduce a method to provide accurate approximations of the variance in the remaining cases. This enables the distribution of D2 to be approximated for typical situations arising in biological research. We apply these results to the identification of cis-regulatory modules, and show that this method detects such sequences with a high accuracy. The ability to approximate the distribution of D2 for both exact and approximate word matches will enable the use of this statistic in a more precise manner for sequence comparison, database searches, and identification of transcription factor binding sites.
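As a minimal sketch of the definition in the abstract above, D2 for exact matches can be computed by counting the k-letter words in each sequence and summing the products of the counts. The function name `d2_statistic` is hypothetical, and the approximate-match extension and the variance results described in the abstract are not covered here.

```python
from collections import Counter


def d2_statistic(seq_a: str, seq_b: str, k: int) -> int:
    """Count exact k-word matches over all position pairs of two sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Tally every overlapping word of length k in each sequence.
    words_a = Counter(seq_a[i:i + k] for i in range(len(seq_a) - k + 1))
    words_b = Counter(seq_b[j:j + k] for j in range(len(seq_b) - k + 1))
    # A word seen m times in A and n times in B contributes m * n matches.
    return sum(count * words_b[word] for word, count in words_a.items())
```

Counting words first avoids the quadratic loop over all position pairs, which matters for long sequences.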
Evaluation of assumptions in soil moisture triple collocation analysis
USDA-ARS's Scientific Manuscript database
Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually-independent methods. Several statistical assumptions regarding the statistical nature of errors (e.g., mutual independence and ort...
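The covariance form of triple collocation can be sketched as follows, assuming exactly three products whose errors are zero-mean, mutually independent and orthogonal to the truth (the assumptions the abstract says need evaluating). The function names and the synthetic-data helper are hypothetical and are not taken from the manuscript.

```python
import random


def covariance(u, v):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def tca_error_variances(x, y, z):
    """Error variance of each product from the pairwise covariances."""
    c_xy, c_xz, c_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    return (
        covariance(x, x) - c_xy * c_xz / c_yz,
        covariance(y, y) - c_xy * c_yz / c_xz,
        covariance(z, z) - c_xz * c_yz / c_xy,
    )


def synthetic_products(n, error_sds, seed=0):
    """Hypothetical test data: one shared truth plus independent Gaussian errors."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [[t + rng.gauss(0.0, sd) for t in truth] for sd in error_sds]
```

With real retrievals, correlated errors between any two products would bias these estimates, which is why the independence assumption matters.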
2010-02-28
implemented a fast method to enable the statistical characterization of electromagnetic interference and compatibility (EMI/EMC) phenomena on electrically...higher accuracy is needed, e.g., to compute higher moment statistics. To address this problem, we have developed adaptive stochastic collocation methods... Sponsoring/monitoring agency: AF Office of Scientific Research, 875 N. Randolph St., Room 3112, Arlington, VA 22203.
Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics
NASA Astrophysics Data System (ADS)
Abe, Sumiyoshi
2014-11-01
The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.
Does daily nurse staffing match ward workload variability? Three hospitals' experiences.
Gabbay, Uri; Bukchin, Michael
2009-01-01
Nurse shortage and rising healthcare resource burdens mean that appropriate workforce use is imperative. This paper aims to evaluate whether daily nursing staffing meets ward workload needs. Nurse attendance and daily nurses' workload capacity in three hospitals were evaluated. Statistical process control was used to evaluate intra-ward nurse workload capacity and day-to-day variations. Statistical process control is a statistics-based method for process monitoring that uses charts with predefined target measure and control limits. Standardization was performed for inter-ward analysis by converting ward-specific crude measures to ward-specific relative measures by dividing observed/expected. Two charts: acceptable and tolerable daily nurse workload intensity, were defined. Appropriate staffing indicators were defined as those exceeding predefined rates within acceptable and tolerable limits (50 percent and 80 percent respectively). A total of 42 percent of the overall days fell within acceptable control limits and 71 percent within tolerable control limits. Appropriate staffing indicators were met in only 33 percent of wards regarding acceptable nurse workload intensity and in only 45 percent of wards regarding tolerable workloads. The study work did not differentiate crude nurse attendance and it did not take into account patient severity since crude bed occupancy was used. Double statistical process control charts and certain staffing indicators were used, which is open to debate. Wards that met appropriate staffing indicators prove the method's feasibility. Wards that did not meet appropriate staffing indicators prove the importance and the need for process evaluations and monitoring. 
Methods presented for monitoring daily staffing appropriateness are simple to implement either for intra-ward day-to-day variation by using nurse workload capacity statistical process control charts or for inter-ward evaluation using standardized measure of nurse workload intensity. The real challenge will be to develop planning systems and implement corrective interventions such as dynamic and flexible daily staffing, which will face difficulties and barriers. The paper fulfils the need for workforce utilization evaluation. A simple method using available data for daily staffing appropriateness evaluation, which is easy to implement and operate, is presented. The statistical process control method enables intra-ward evaluation, while standardization by converting crude into relative measures enables inter-ward analysis. The staffing indicator definitions enable performance evaluation. This original study uses statistical process control to develop simple standardization methods and applies straightforward statistical tools. This method is not limited to crude measures, rather it uses weighted workload measures such as nursing acuity or weighted nurse level (i.e. grade/band).
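The standardization and the "share of days within limits" indicator described above can be sketched in a few lines. This is only an illustration: the function names and the limits passed in are hypothetical, and the paper's actual chart limits are not given in the abstract.

```python
def relative_workload(observed, expected):
    """Convert crude daily measures to relative ones (observed / expected)."""
    return [o / e for o, e in zip(observed, expected)]


def share_within_limits(values, lower, upper):
    """Fraction of days whose relative workload lies inside the control limits."""
    inside = sum(1 for v in values if lower <= v <= upper)
    return inside / len(values)


def meets_indicator(values, lower, upper, required_share):
    """Staffing indicator: enough days fall inside the given limits."""
    return share_within_limits(values, lower, upper) >= required_share
```

A ward would then be checked twice, once against the acceptable limits with a required share of 0.50 and once against the tolerable limits with 0.80, following the rates quoted in the abstract.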
Some connections between importance sampling and enhanced sampling methods in molecular dynamics
NASA Astrophysics Data System (ADS)
Lie, H. C.; Quer, J.
2017-11-01
In molecular dynamics, enhanced sampling methods enable the collection of better statistics of rare events from a reference or target distribution. We show that a large class of these methods is based on the idea of importance sampling from mathematical statistics. We illustrate this connection by comparing the Hartmann-Schütte method for rare event simulation (J. Stat. Mech. Theor. Exp. 2012, P11004) and the Valsson-Parrinello method of variationally enhanced sampling [Phys. Rev. Lett. 113, 090601 (2014)]. We use this connection in order to discuss how recent results from the Monte Carlo methods literature can guide the development of enhanced sampling methods.
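The importance sampling idea underlying these methods can be illustrated with a toy rare-event problem: estimating P(X > c) for a standard normal X by sampling from a normal shifted to c and reweighting with the likelihood ratio. This is a generic textbook sketch with hypothetical names, not the Hartmann-Schütte or Valsson-Parrinello method.

```python
import math
import random


def rare_event_probability(threshold, n_samples, seed=0):
    """Estimate P(X > threshold), X ~ N(0, 1), by importance sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw from the proposal N(threshold, 1), where the event is common.
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio N(0, 1) / N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def exact_tail(threshold):
    """Reference value from the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))
```

Plain Monte Carlo would need millions of samples to see even a few events beyond c = 4; the reweighted estimator reaches percent-level accuracy with tens of thousands.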
Langoju, Rajesh; Patil, Abhijit; Rastogi, Pramod
2007-11-20
Signal processing methods based on maximum-likelihood theory, discrete chirp Fourier transform, and spectral estimation methods have enabled accurate measurement of phase in phase-shifting interferometry in the presence of nonlinear response of the piezoelectric transducer to the applied voltage. We present the statistical study of these generalized nonlinear phase step estimation methods to identify the best method by deriving the Cramér-Rao bound. We also address important aspects of these methods for implementation in practical applications and compare the performance of the best-identified method with other benchmarking algorithms in the presence of harmonics and noise.
Kawata, Masaaki; Sato, Chikara
2007-06-01
In determining the three-dimensional (3D) structure of macromolecular assemblies in single particle analysis, a large representative dataset of two-dimensional (2D) average images from a huge number of raw images is key to high resolution. Because alignments prior to averaging are computationally intensive, currently available multireference alignment (MRA) software does not survey every possible alignment. This leads to misaligned images, creating blurred averages and reducing the quality of the final 3D reconstruction. We present a new method, in which multireference alignment is harmonized with classification (multireference multiple alignment: MRMA). This method enables a statistical comparison of multiple alignment peaks, reflecting the similarities between each raw image and a set of reference images. Among the selected alignment candidates for each raw image, misaligned images are statistically excluded, based on the principle that aligned raw images of similar projections have a dense distribution around the correctly aligned coordinates in image space. This newly developed method was examined for accuracy and speed using model image sets with various signal-to-noise ratios, and with electron microscope images of the Transient Receptor Potential C3 and the sodium channel. In every data set, the newly developed method outperformed conventional methods in robustness against noise and in speed, creating 2D average images of higher quality. This statistically harmonized alignment-classification combination should greatly improve the quality of single particle analysis.
Methods for the evaluation of alternative disaster warning systems
NASA Technical Reports Server (NTRS)
Agnew, C. E.; Anderson, R. J., Jr.; Lanen, W. N.
1977-01-01
For each of the methods identified, a theoretical basis is provided and an illustrative example is described. The example includes sufficient realism and detail to enable an analyst to conduct an evaluation of other systems. The methods discussed in the study include equal capability cost analysis, consumers' surplus, and statistical decision theory.
ERIC Educational Resources Information Center
Peterson, Ivars
1991-01-01
A method that enables people to obtain the benefits of statistics and probability theory without the shortcomings of conventional methods because it is free of mathematical formulas and is easy to understand and use is described. A resampling technique called the "bootstrap" is discussed in terms of application and development. (KR)
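A percentile bootstrap for a confidence interval can be sketched as below: resample the data with replacement many times, recompute the statistic each time, and read the interval off the sorted results. The function name and defaults are hypothetical; the article itself is described as formula-free, so this is only one common way to code the idea.

```python
import random
import statistics


def bootstrap_interval(data, stat=statistics.mean, n_resamples=2000,
                       alpha=0.05, seed=0):
    """Percentile bootstrap interval for an arbitrary statistic."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample draws n points with replacement from the original data.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

Passing `stat=statistics.median` gives an interval for the median with no change to the procedure, which is the main appeal of the technique.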
Using data warehousing and OLAP in public health care.
Hristovski, D.; Rogac, M.; Markota, M.
2000-01-01
The paper describes the possibilities of using data warehousing and OLAP technologies in public health care in general and then our own experience with these technologies gained during the implementation of a data warehouse of outpatient data at the national level. Such a data warehouse serves as a basis for advanced decision support systems based on statistical, OLAP or data mining methods. We used OLAP to enable interactive exploration and analysis of the data. We found out that data warehousing and OLAP are suitable for the domain of public health and that they enable new analytical possibilities in addition to the traditional statistical approaches. PMID:11079907
Gooding, Owen W
2004-06-01
The use of parallel synthesis techniques with statistical design of experiment (DoE) methods is a powerful combination for the optimization of chemical processes. Advances in parallel synthesis equipment and easy to use software for statistical DoE have fueled a growing acceptance of these techniques in the pharmaceutical industry. As drug candidate structures become more complex at the same time that development timelines are compressed, these enabling technologies promise to become more important in the future.
Virtualising the Quantitative Research Methods Course: An Island-Based Approach
ERIC Educational Resources Information Center
Baglin, James; Reece, John; Baker, Jenalle
2015-01-01
Many recent improvements in pedagogical practice have been enabled by the rapid development of innovative technologies, particularly for teaching quantitative research methods and statistics. This study describes the design, implementation, and evaluation of a series of specialised computer laboratory sessions. The sessions combined the use of an…
Automatic identification of bacterial types using statistical imaging methods
NASA Astrophysics Data System (ADS)
Trattner, Sigal; Greenspan, Hayit; Tepper, Gapi; Abboud, Shimon
2003-05-01
The objective of the current study is to develop an automatic tool to identify bacterial types using computer-vision and statistical modeling techniques. Bacteriophage (phage)-typing methods are used to identify and extract representative profiles of bacterial types, such as Staphylococcus aureus. Current systems rely on the subjective reading of plaque profiles by a human expert. This process is time-consuming and prone to errors, especially as technology is enabling an increase in the number of phages used for typing. The statistical methodology presented in this work provides for an automated, objective and robust analysis of visual data, along with the ability to cope with increasing data volumes.
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-01-01
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell. DOI: http://dx.doi.org/10.7554/eLife.26580.001 PMID:28678007
Active Subspace Methods for Data-Intensive Inverse Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qiqi
2017-04-27
The project has developed theory and computational tools to exploit active subspaces to reduce the dimension in statistical calibration problems. This dimension reduction enables MCMC methods to calibrate otherwise intractable models. The same theoretical and computational tools can also reduce the measurement dimension for calibration problems that use large stores of data.
Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications.
du Prel, Jean-Baptist; Hommel, Gerhard; Röhrig, Bernd; Blettner, Maria
2009-05-01
An understanding of p-values and confidence intervals is necessary for the evaluation of scientific articles. This article will inform the reader of the meaning and interpretation of these two statistical concepts. The uses of these two statistical concepts and the differences between them are discussed on the basis of a selective literature search concerning the methods employed in scientific articles. P-values in scientific studies are used to determine whether a null hypothesis formulated before the performance of the study is to be accepted or rejected. In exploratory studies, p-values enable the recognition of any statistically noteworthy findings. Confidence intervals provide information about a range in which the true value lies with a certain degree of probability, as well as about the direction and strength of the demonstrated effect. This enables conclusions to be drawn about the statistical plausibility and clinical relevance of the study findings. It is often useful for both statistical measures to be reported in scientific articles, because they provide complementary types of information.
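The complementary roles of the two measures can be seen by computing both from the same sample. The sketch below uses a normal approximation for the mean of a single sample; for small samples a t-distribution would usually be preferred, so treat the numbers as illustrative. The function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def z_test_mean(sample, null_value=0.0, confidence=0.95):
    """Two-sided p-value and confidence interval for a sample mean."""
    estimate = mean(sample)
    std_error = stdev(sample) / math.sqrt(len(sample))
    z = (estimate - null_value) / std_error
    # p-value: probability of a result at least this extreme under the null.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    # Interval: range of values compatible with the data at this confidence.
    critical = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    interval = (estimate - critical * std_error, estimate + critical * std_error)
    return p_value, interval
```

The p-value answers only whether the null value is compatible with the data, while the interval also shows the direction and size of the effect; under this approximation a two-sided p-value below 0.05 corresponds to the null value lying outside the 95% interval.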
Chung, Chi-Jung; Kuo, Yu-Chen; Hsieh, Yun-Yu; Li, Tsai-Chung; Lin, Cheng-Chieh; Liang, Wen-Miin; Liao, Li-Na; Li, Chia-Ing; Lin, Hsueh-Chun
2017-11-01
This study applied open source technology to establish a subject-enabled analytics model that can enhance measurement statistics of case studies with the public health data in cloud computing. The infrastructure of the proposed model comprises three domains: 1) the health measurement data warehouse (HMDW) for the case study repository, 2) the self-developed modules of online health risk information statistics (HRIStat) for cloud computing, and 3) the prototype of a Web-based process automation system in statistics (PASIS) for the health risk assessment of case studies with subject-enabled evaluation. The system design employed freeware including Java applications, MySQL, and R packages to drive a health risk expert system (HRES). In the design, the HRIStat modules implement the typical analytics methods for biomedical statistics, and the PASIS interfaces enable process automation of the HRES for cloud computing. The Web-based model supports two modes, step-by-step analysis and an auto-computing process, for preliminary evaluation and real-time computation respectively. The proposed model was evaluated by recomputing prior research on the epidemiological measurement of diseases caused by either heavy metal exposures in the environment or clinical complications in hospital. The results were validated against commercial statistics software. The model was installed on a stand-alone computer and on a cloud-server workstation to verify computing performance for more than 230K data sets. Both setups reached an efficiency of about 10^5 sets per second. The Web-based PASIS interface can be used for cloud computing, and the HRIStat module can be flexibly expanded with advanced subjects for measurement statistics. The analytics procedure of the HRES prototype is capable of providing assessment criteria prior to estimating the potential risk to public health. Copyright © 2017 Elsevier B.V. All rights reserved.
An observational method for fast stochastic X-ray polarimetry timing
NASA Astrophysics Data System (ADS)
Ingram, Adam R.; Maccarone, Thomas J.
2017-11-01
The upcoming launch of the first space based X-ray polarimeter in ~40 yr will provide powerful new diagnostic information to study accreting compact objects. In particular, analysis of rapid variability of the polarization degree and angle will provide the opportunity to probe the relativistic motions of material in the strong gravitational fields close to the compact objects, and enable new methods to measure black hole and neutron star parameters. However, polarization properties are measured in a statistical sense, and a statistically significant polarization detection requires a fairly long exposure, even for the brightest objects. Therefore, the sub-minute time-scales of interest are not accessible using a direct time-resolved analysis of polarization degree and angle. Phase-folding can be used for coherent pulsations, but not for stochastic variability such as quasi-periodic oscillations. Here, we introduce a Fourier method that enables statistically robust detection of stochastic polarization variability for arbitrarily short variability time-scales. Our method is analogous to commonly used spectral-timing techniques. We find that it should be possible in the near future to detect the quasi-periodic swings in polarization angle predicted by Lense-Thirring precession of the inner accretion flow. This is contingent on the mean polarization degree of the source being greater than ~4-5 per cent, which is consistent with the best current constraints on Cygnus X-1 from the late 1970s.
Methods for Evaluating Mammography Imaging Techniques
1999-06-01
This Department of Defense Breast Cancer Research Program Career...Development Award is enabling Dr. Rutter to develop biostatistical methods for breast cancer research. Dr. Rutter is focusing on methods for...evaluating the accuracy of breast cancer screening. This four year program includes advanced training in the epidemiology of breast cancer, training in
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vidal-Codina, F., E-mail: fvidal@mit.edu; Nguyen, N.C., E-mail: cuongng@mit.edu; Giles, M.B., E-mail: mike.giles@maths.ox.ac.uk
We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method.
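The idea of shifting work from an expensive model to a cheap, correlated surrogate can be shown with a two-level Monte Carlo estimator. In this toy sketch the "costly" model is exp(x) on a uniform input and the surrogate is its second-order Taylor polynomial; both are hypothetical stand-ins for the HDG solve and the reduced basis approximation, which are not reproduced here.

```python
import math
import random


def costly_model(x):
    """Stand-in for a high-fidelity solve."""
    return math.exp(x)


def cheap_surrogate(x):
    """Stand-in for a reduced-order approximation of costly_model."""
    return 1.0 + x + 0.5 * x * x


def two_level_mean(n_cheap, n_costly, seed=0):
    """Estimate E[costly_model(U)], U ~ Uniform(0, 1), with two levels."""
    rng = random.Random(seed)
    # Level 0: many samples of the cheap surrogate.
    level0 = sum(cheap_surrogate(rng.random()) for _ in range(n_cheap)) / n_cheap
    # Level 1: few samples of the small-variance correction term.
    correction = 0.0
    for _ in range(n_costly):
        x = rng.random()
        correction += costly_model(x) - cheap_surrogate(x)
    return level0 + correction / n_costly
```

Because the correction term has a much smaller variance than the model itself, a few hundred costly evaluations suffice where plain Monte Carlo would need far more.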
Statistical methods for change-point detection in surface temperature records
NASA Astrophysics Data System (ADS)
Pintar, A. L.; Possolo, A.; Zhang, N. F.
2013-09-01
We describe several statistical methods to detect possible change-points in a time series of values of surface temperature measured at a meteorological station, and to assess the statistical significance of such changes, taking into account the natural variability of the measured values, and the autocorrelations between them. These methods serve to determine whether the record may suffer from biases unrelated to the climate signal, hence whether there may be a need for adjustments as considered by M. J. Menne and C. N. Williams (2009) "Homogenization of Temperature Series via Pairwise Comparisons", Journal of Climate 22 (7), 1700-1717. We also review methods to characterize patterns of seasonality (seasonal decomposition using monthly medians or robust local regression), and explain the role they play in the imputation of missing values, and in enabling robust decompositions of the measured values into a seasonal component, a possible climate signal, and a station-specific remainder. The methods for change-point detection that we describe include statistical process control, wavelet multi-resolution analysis, adaptive weights smoothing, and a Bayesian procedure, all of which are applicable to single station records.
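One of the simplest tools in the statistical process control family is a cumulative-sum (CUSUM) scan for a single shift in the mean. The sketch below is a generic illustration with a hypothetical function name; it ignores autocorrelation and seasonality, which the abstract stresses must be accounted for when judging significance in real temperature records.

```python
def cusum_changepoint(series):
    """Locate the most likely single mean shift in a series.

    Returns (index, score): index is the first position of the second
    segment and score is the largest absolute cumulative deviation.
    """
    n = len(series)
    overall_mean = sum(series) / n
    best_index, best_score, running = 0, 0.0, 0.0
    # Accumulate deviations from the overall mean; the sum peaks (in
    # absolute value) at the boundary between the two regimes.
    for i, value in enumerate(series[:-1]):
        running += value - overall_mean
        if abs(running) > best_score:
            best_index, best_score = i + 1, abs(running)
    return best_index, best_score
```

A score of zero with index 0 means no shift was found, for example in a constant series.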
Line identification studies using traditional techniques and wavelength coincidence statistics
NASA Technical Reports Server (NTRS)
Cowley, Charles R.; Adelman, Saul J.
1990-01-01
Traditional line identification techniques result in the assignment of individual lines to an atomic or ionic species. These methods may be supplemented by wavelength coincidence statistics (WCS). The strengths and weaknesses of these methods are discussed using spectra of a number of normal and peculiar B and A stars that have been studied independently by both methods. The present results support the overall findings of some earlier studies. WCS would be most useful in a first survey, before traditional methods have been applied. WCS can quickly make a global search for all species and in this way may enable identifications of an unexpected spectrum that could easily be omitted entirely from a traditional study. This is illustrated by O I. WCS is subject to the well-known weaknesses of any statistical technique; for example, a predictable number of spurious results is to be expected. The dangers of small-number statistics are illustrated. WCS is at its best relative to traditional methods in finding a line-rich atomic species that is only weakly present in a complicated stellar spectrum.
Improved dynamical scaling analysis using the kernel method for nonequilibrium relaxation.
Echinaka, Yuki; Ozeki, Yukiyasu
2016-10-01
The dynamical scaling analysis for the Kosterlitz-Thouless transition in the nonequilibrium relaxation method is improved by the use of Bayesian statistics and the kernel method. This allows data to be fitted to a scaling function without using any parametric model function, which makes the results more reliable and reproducible and enables automatic and faster parameter estimation. Applying this method, the bootstrap method is introduced and a numerical discrimination for the transition type is proposed.
Inferring Demographic History Using Two-Locus Statistics.
Ragsdale, Aaron P; Gutenkunst, Ryan N
2017-06-01
Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.
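For readers unfamiliar with two-locus summaries, the classic pairwise LD measures can be computed directly from phased haplotypes. This sketch shows the standard D and r² statistics only; it is not the authors' composite-likelihood framework, and the function name and 0/1 allele coding are assumptions made for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for a list of (allele_a, allele_b) pairs.

    Alleles are coded 0/1, where 1 marks the derived allele at each locus.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D measures the excess of the joint frequency over independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom
```

Aggregating such statistics over many locus pairs, binned by recombination distance, gives the kind of two-locus summary that the abstract contrasts with the single-locus AFS.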
Synchronization for Optical PPM Signals
NASA Technical Reports Server (NTRS)
Vilnrotter, V. A.
1985-01-01
Method based on statistical properties of weak pulse-position-modulated (PPM) signal enables synchronization of receiver clock with received-signal time base. Method applies to weak optical M-ary PPM signals, for which there is only one pulse of length Tp transmitted during one of timeslots of length T in each successive interval of M timeslots. Method requires small dead time, Td, at beginning and end of each timeslot, during which pulse amplitude is zero.
NASA Astrophysics Data System (ADS)
Chakravarty, T.; Chowdhury, A.; Ghose, A.; Bhaumik, C.; Balamuralidhar, P.
2014-03-01
Telematics form an important technology enabler for intelligent transportation systems. By deploying on-board diagnostic devices, the signatures of vehicle vibration along with its location and time are recorded. Detailed analyses of the collected signatures offer deep insights into the state of the objects under study. Towards that objective, we carried out experiments by deploying a telematics device in one of the office buses that ferries employees to the office and back. Data are collected from a 3-axis accelerometer and GPS, together with speed and time, for all the journeys. In this paper, we present initial results of the above exercise by applying statistical methods to derive information through systematic analysis of the data collected over four months. It is demonstrated that the higher-order derivative of the measured Z-axis acceleration samples displays the properties of a Weibull distribution when the time axis is replaced by the amplitude of such processed acceleration data. Such an observation offers us a method to predict future behaviour, where deviations from prediction are classified as context-based aberrations or progressive degradation of the system. In addition, we capture the relationship between the speed of the vehicle and the median of the jerk energy samples using regression analysis. Such results offer an opportunity to develop a robust method to model road-vehicle interaction, thereby enabling us to predict driving behaviour and to support condition-based maintenance.
A robust Bayesian estimate of the concordance correlation coefficient.
Feng, Dai; Baumgartner, Richard; Svetnik, Vladimir
2015-01-01
A need for assessment of agreement arises in many situations including statistical biomarker qualification or assay or method validation. Concordance correlation coefficient (CCC) is one of the most popular scaled indices reported in evaluation of agreement. Robust methods for CCC estimation currently present an important statistical challenge. Here, we propose a novel Bayesian method of robust estimation of CCC based on multivariate Student's t-distribution and compare it with its alternatives. Furthermore, we extend the method to practically relevant settings, enabling incorporation of confounding covariates and replications. The superiority of the new approach is demonstrated using simulation as well as real datasets from biomarker application in electroencephalography (EEG). This biomarker is relevant in neuroscience for development of treatments for insomnia.
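As a point of reference for the abstract above, the following minimal sketch computes the classical moment-based sample CCC (Lin's formula), 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²). It is not the robust Bayesian Student's t estimator that the authors propose; the function name `concordance_correlation` is illustrative only.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Classical sample concordance correlation coefficient (Lin's CCC).

    Uses population-style (divide by n) variances and covariance.
    This is the plain moment estimator, not a robust or Bayesian one.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference penalises systematic bias between raters.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Perfect agreement gives 1; a constant offset lowers the CCC even though
# the Pearson correlation would still be 1.
same = concordance_correlation([1, 2, 3], [1, 2, 3])
shifted = concordance_correlation([1, 2, 3], [2, 3, 4])
```

Because the estimator is built from sample means and variances, a single outlying pair can move it substantially, which is the sensitivity that robust approaches aim to reduce.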
Measurement in Physical Education. 5th Edition.
ERIC Educational Resources Information Center
Mathews, Donald K.
Concepts of measurement in physical education are presented in this college-level text to enable the preservice physical education major to develop skills in determining pupil status, designing effective physical activity programs, and measuring student progress. Emphasis is placed upon discussion of essential statistical methods, test…
Blood detection in wireless capsule endoscope images based on salient superpixels.
Iakovidis, Dimitris K; Chatzis, Dimitris; Chrysanthopoulos, Panos; Koulaouzidis, Anastasios
2015-08-01
Wireless capsule endoscopy (WCE) enables screening of the gastrointestinal (GI) tract with a miniature, optical endoscope packed within a small swallowable capsule, wirelessly transmitting color images. In this paper we propose a novel method for automatic blood detection in contemporary WCE images. Blood is an alarming indication for the presence of pathologies requiring further treatment. The proposed method is based on a new definition of superpixel saliency. The saliency of superpixels is assessed upon their color, enabling the identification of image regions that are likely to contain blood. The blood patterns are recognized by their color features using a supervised learning machine. Experiments performed on a public dataset using automatically selected first-order statistical features from various color components indicate that the proposed method outperforms state-of-the-art methods.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Applications of quantum entropy to statistics
NASA Astrophysics Data System (ADS)
Silver, R. N.; Martz, H. F.
This paper develops two generalizations of the maximum entropy (ME) principle. First, Shannon classical entropy is replaced by von Neumann quantum entropy to yield a broader class of information divergences (or penalty functions) for statistics applications. Negative relative quantum entropy enforces convexity, positivity, non-local extensivity and prior correlations such as smoothness. This enables the extension of ME methods from their traditional domain of ill-posed inverse problems to new applications such as non-parametric density estimation. Second, given a choice of information divergence, a combination of ME and Bayes rule is used to assign both prior and posterior probabilities. Hyperparameters are interpreted as Lagrange multipliers enforcing constraints. Conservation principles, such as conservation of information and smoothness, are proposed to set statistical regularization and other hyperparameters. ME provides an alternative to hierarchical Bayes methods.
Text mining by Tsallis entropy
NASA Astrophysics Data System (ADS)
Jamaati, Maryam; Mehri, Ali
2018-01-01
Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new, powerful word ranking metric in order to extract keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, at the same level as the best previous ranking methods.
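For orientation, the standard definition of Tsallis entropy for a discrete distribution is S_q = (1 − Σ p_i^q) / (q − 1), which reduces to Shannon entropy as q → 1. The sketch below implements only that definition; how the authors build a distribution from a word's positions in a text is not given in the abstract, so the function name and the example inputs are assumptions.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q of a discrete probability distribution.

    For q == 1 the Shannon entropy (natural log) is returned as the limit.
    Zero-probability entries are skipped.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


# Hypothetical example: a term spread evenly over four text segments versus
# one concentrated in a single segment.
evenly_spread = tsallis_entropy([0.25, 0.25, 0.25, 0.25], q=2)
concentrated = tsallis_entropy([1.0, 0.0, 0.0, 0.0], q=2)
```

A uniform distribution maximises S_q and a fully concentrated one gives zero, so the entropy distinguishes evenly scattered terms from clustered ones once a suitable distribution has been defined.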
Statistical Relational Learning (SRL) as an Enabling Technology for Data Acquisition and Data Fusion in Video
2013-05-02
...particular, it is important to reason about which portions of video require expensive analysis and storage. This project aims to make these...inferences using new and existing tools from Statistical Relational Learning (SRL). SRL is a recently emerging technology that enables the effective...
Non-Earth-centric life detection
NASA Technical Reports Server (NTRS)
Conrad, P. G.; Nealson, K. H.
2000-01-01
Our hope is that life will, bit by bit, reveal the clues that will allow us to piece together enough evidence to recognize it whenever and however it presents itself. Indisputable evidence is measurable, statistically meaningful and independent of the nature of the life it defines. That the evidence for life be measurable is a fundamental requirement of the scientific method, as is the requirement for statistical significance, and this quantitation is what enables us to differentiate the measurable criteria of candidate biosignatures from a background (host environment).
Evaluation of Models of the Reading Process.
ERIC Educational Resources Information Center
Balajthy, Ernest
A variety of reading process models have been proposed and evaluated in reading research. Traditional approaches to model evaluation specify the workings of a system in a simplified fashion to enable organized, systematic study of the system's components. Following are several statistical methods of model evaluation: (1) empirical research on…
Statistical considerations in the analysis of data from replicated bioassays
USDA-ARS's Scientific Manuscript database
Multiple-dose bioassay is generally the preferred method for characterizing virulence of insect pathogens. Linear regression of probit mortality on log dose enables estimation of LD50/LC50 and slope, the latter having substantial effect on LD90/95s (doses of considerable interest in pest management)...
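To make the role of the slope concrete, the sketch below inverts a fitted probit line z = a + b·log10(dose) to obtain the dose expected to kill a fraction p of the test population. The intercept and slope values are hypothetical, z is taken as the standard normal quantile (without the historical +5 probit offset), and the function name `lethal_dose` is illustrative.

```python
from statistics import NormalDist


def lethal_dose(p, intercept, slope):
    """Dose at which expected mortality equals p under a fitted probit line.

    Assumes the model  z = intercept + slope * log10(dose),  where z is the
    standard normal quantile of the mortality proportion.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if slope <= 0:
        raise ValueError("slope must be positive")
    z = NormalDist().inv_cdf(p)
    return 10 ** ((z - intercept) / slope)


# Two hypothetical fits with the same LD50 (dose 10) but different slopes.
ld50_shallow = lethal_dose(0.5, intercept=-2.0, slope=2.0)
ld90_shallow = lethal_dose(0.9, intercept=-2.0, slope=2.0)
ld90_steep = lethal_dose(0.9, intercept=-4.0, slope=4.0)
```

With the LD50 held fixed, the shallower line pushes the LD90 to a much higher dose, which illustrates why the slope matters for the upper lethal doses mentioned in the abstract.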
Crown, William; Chang, Jessica; Olson, Melvin; Kahler, Kristijan; Swindle, Jason; Buzinec, Paul; Shah, Nilay; Borah, Bijan
2015-09-01
Missing data, particularly missing variables, can create serious analytic challenges in observational comparative effectiveness research studies. Statistical linkage of datasets is a potential method for incorporating missing variables. Prior studies have focused upon the bias introduced by imperfect linkage. This analysis uses a case study of hepatitis C patients to estimate the net effect of statistical linkage on bias, also accounting for the potential reduction in missing variable bias. The results show that statistical linkage can reduce bias while also enabling parameter estimates to be obtained for the formerly missing variables. The usefulness of statistical linkage will vary depending upon the strength of the correlations of the missing variables with the treatment variable, as well as the outcome variable of interest.
Representation of Probability Density Functions from Orbit Determination using the Particle Filter
NASA Technical Reports Server (NTRS)
Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell
2012-01-01
Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence we propose using the Independent Component Analysis (ICA) as a non-Gaussian dimensional reduction method that is capable of maintaining higher order statistical information obtained using the PF. Methods such as the Principal Component Analysis (PCA) are based on utilizing up to second order statistics, hence will not suffice in maintaining maximum information content. Both the PCA and the ICA are applied to two scenarios that involve a highly eccentric orbit with a lower a priori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of the ICA in relation to the PCA.
Data-Enabled Quantification of Aluminum Microstructural Damage Under Tensile Loading
NASA Astrophysics Data System (ADS)
Wayne, Steven F.; Qi, G.; Zhang, L.
2016-08-01
The study of material failure with digital analytics is in its infancy and offers a new perspective to advance our understanding of damage initiation and evolution in metals. In this article, we study the failure of aluminum using data-enabled methods, statistics and data mining. Through the use of tension tests, we establish a multivariate acoustic-data matrix of random damage events, which typically are not visible and are very difficult to measure due to their variability, diversity and interactivity during damage processes. Aluminium alloy 6061-T651 and single crystal aluminium with a (111) orientation were evaluated by comparing the collection of acoustic signals from damage events caused primarily by slip in the single crystal and multimode fracture of the alloy. We found the resulting acoustic damage-event data to be large semi-structured volumes of Big Data with the potential to be mined for information that describes the material's damage state under strain. Our data-enabled analyses have allowed us to determine statistical distributions of multiscale random damage that provide a means to quantify the material damage state.
Direct evaluation of free energy for large system through structure integration approach.
Takeuchi, Kazuhito; Tanaka, Ryohei; Yuge, Koretaka
2015-09-30
We propose a new approach, 'structure integration', enabling direct evaluation of configurational free energy for large systems. The present approach is based on the statistical information of the lattice. Through first-principles-based simulation, we find that the present method evaluates configurational free energy accurately in disordered states above the critical temperature.
High-throughput electrical measurement and microfluidic sorting of semiconductor nanowires.
Akin, Cevat; Feldman, Leonard C; Durand, Corentin; Hus, Saban M; Li, An-Ping; Hui, Ho Yee; Filler, Michael A; Yi, Jingang; Shan, Jerry W
2016-05-24
Existing nanowire electrical characterization tools not only are expensive and require sophisticated facilities, but are far too slow to enable statistical characterization of highly variable samples. They are also generally not compatible with further sorting and processing of nanowires. Here, we demonstrate a high-throughput, solution-based electro-orientation-spectroscopy (EOS) method, which is capable of automated electrical characterization of individual nanowires by direct optical visualization of their alignment behavior under spatially uniform electric fields of different frequencies. We demonstrate that EOS can quantitatively characterize the electrical conductivities of nanowires over a 6-order-of-magnitude range (10^-5 to 10 S m^-1, corresponding to typical carrier densities of 10^10 to 10^16 cm^-3), with different fluids used to suspend the nanowires. By implementing EOS in a simple microfluidic device, continuous electrical characterization is achieved, and the sorting of nanowires is demonstrated as a proof-of-concept. With measurement speeds two orders of magnitude faster than direct-contact methods, the automated EOS instrument enables for the first time the statistical characterization of highly variable 1D nanomaterials.
An M-estimator for reduced-rank system identification.
Chen, Shaojie; Liu, Kai; Yang, Yuguang; Xu, Yuting; Lee, Seonjoo; Lindquist, Martin; Caffo, Brian S; Vogelstein, Joshua T
2017-01-15
High-dimensional time-series data from a wide variety of domains, such as neuroscience, are being generated every day. Fitting statistical models to such data, to enable parameter estimation and time-series prediction, is an important computational primitive. Existing methods, however, are unable to cope with the high-dimensional nature of these data, due to both computational and statistical reasons. We mitigate both kinds of issues by proposing an M-estimator for Reduced-rank System IDentification (MR. SID). A combination of low-rank approximations, ℓ1 and ℓ2 penalties, and some numerical linear algebra tricks, yields an estimator that is computationally efficient and numerically stable. Simulations and real data examples demonstrate the usefulness of this approach in a variety of problems. In particular, we demonstrate that MR. SID can accurately estimate spatial filters, connectivity graphs, and time-courses from native resolution functional magnetic resonance imaging data. MR. SID therefore enables big time-series data to be analyzed using standard methods, readying the field for further generalizations including non-linear and non-Gaussian state-space models.
An M-estimator for reduced-rank system identification
Chen, Shaojie; Liu, Kai; Yang, Yuguang; Xu, Yuting; Lee, Seonjoo; Lindquist, Martin; Caffo, Brian S.; Vogelstein, Joshua T.
2018-01-01
High-dimensional time-series data from a wide variety of domains, such as neuroscience, are being generated every day. Fitting statistical models to such data, to enable parameter estimation and time-series prediction, is an important computational primitive. Existing methods, however, are unable to cope with the high-dimensional nature of these data, due to both computational and statistical reasons. We mitigate both kinds of issues by proposing an M-estimator for Reduced-rank System IDentification (MR. SID). A combination of low-rank approximations, ℓ1 and ℓ2 penalties, and some numerical linear algebra tricks, yields an estimator that is computationally efficient and numerically stable. Simulations and real data examples demonstrate the usefulness of this approach in a variety of problems. In particular, we demonstrate that MR. SID can accurately estimate spatial filters, connectivity graphs, and time-courses from native resolution functional magnetic resonance imaging data. MR. SID therefore enables big time-series data to be analyzed using standard methods, readying the field for further generalizations including non-linear and non-Gaussian state-space models. PMID:29391659
Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation
NASA Technical Reports Server (NTRS)
DePriest, Douglas; Morgan, Carolyn
2003-01-01
The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models to accurately predict the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.
NASA Technical Reports Server (NTRS)
Lewis, Michael
1994-01-01
Statistical encoding techniques enable the reduction of the number of bits required to encode a set of symbols, and are derived from their probabilities. Huffman encoding is an example of statistical encoding that has been used for error-free data compression. The degree of compression given by Huffman encoding in this application can be improved by the use of prediction methods. These replace the set of elevations by a set of corrections that have a more advantageous probability distribution. In particular, the method of Lagrange Multipliers for minimization of the mean square error has been applied to local geometrical predictors. Using this technique, an 8-point predictor achieved about a 7 percent improvement over an existing simple triangular predictor.
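The effect described above can be illustrated with a small sketch: Huffman code lengths are computed for a row of elevations, and again for the corrections left after predicting each elevation from its neighbour. The previous-neighbour predictor and the sample elevations are simplifying assumptions made for this example; the abstract's 8-point predictor fitted with Lagrange multipliers is not reproduced here.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code of `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (count, unique tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1  # every merge adds one bit to the subtree
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths


def encoded_bits(symbols):
    """Total payload bits when `symbols` are Huffman coded (table excluded)."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


# Hypothetical elevation row on a smooth slope.
elevations = [100, 101, 102, 103, 104, 105, 106, 107]
# Corrections: first value kept, then the difference from the previous one.
corrections = [elevations[0]] + [b - a for a, b in zip(elevations, elevations[1:])]

raw_bits = encoded_bits(elevations)
predicted_bits = encoded_bits(corrections)
```

The corrections concentrate on a few values, so their Huffman code is shorter than that of the raw elevations; the decoder can rebuild the elevations exactly by accumulating the corrections.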
An Automated Blur Detection Method for Histological Whole Slide Imaging
Moles Lopez, Xavier; D'Andrea, Etienne; Barbot, Paul; Bridoux, Anne-Sophie; Rorive, Sandrine; Salmon, Isabelle; Debeir, Olivier; Decaestecker, Christine
2013-01-01
Whole slide scanners are novel devices that enable high-resolution imaging of an entire histological slide. Furthermore, the imaging is achieved in only a few minutes, which enables image rendering of large-scale studies involving multiple immunohistochemistry biomarkers. Although whole slide imaging has improved considerably, locally poor focusing causes blurred regions of the image. These artifacts may strongly affect the quality of subsequent analyses, making a slide review process mandatory. This tedious and time-consuming task requires the scanner operator to carefully assess the virtual slide and to manually select new focus points. We propose a statistical learning method that provides early image quality feedback and automatically identifies regions of the image that require additional focus points. PMID:24349343
Data exploration systems for databases
NASA Technical Reports Server (NTRS)
Greene, Richard J.; Hield, Christopher
1992-01-01
Data exploration systems apply machine learning techniques, multivariate statistical methods, information theory, and database theory to databases to identify significant relationships among the data and summarize information. The result of applying data exploration systems should be a better understanding of the structure of the data and a perspective of the data enabling an analyst to form hypotheses for interpreting the data. This paper argues that data exploration systems need a minimum amount of domain knowledge to guide both the statistical strategy and the interpretation of the resulting patterns discovered by these systems.
Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines
NASA Astrophysics Data System (ADS)
Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.
2016-12-01
Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model, while satisfying all the necessities of flexibility, accuracy and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation. Our approach allows us to perform global sensitivity analysis with ease. We also present an approach to calibration of simulators using field data.
A statistical method for measuring activation of gene regulatory networks.
Esteves, Gustavo H; Reis, Luiz F L
2018-06-13
Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measuring of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the Bioconductor project website under the name maigesPack.
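The idea of evaluating a test statistic's distribution by permutation can be sketched generically as follows. The statistic used here, an absolute difference in group means, is a stand-in chosen for brevity; the network-activation statistic of the paper is not specified in the abstract, and the function name and sample values are hypothetical.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Sample labels are shuffled repeatedly and the statistic is recomputed,
    which yields an empirical null distribution without assuming a
    parametric form.
    """
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical expression scores for two tissue types.
p_separated = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
p_identical = permutation_p_value([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
```

Any statistic that can be recomputed after relabelling the samples can be dropped in place of the mean difference.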
Application of Ontology Technology in Health Statistic Data Analysis.
Guo, Minjiang; Hu, Hongpu; Lei, Xingyun
2017-01-01
Research Purpose: to establish a health management ontology for the analysis of health statistic data. Proposed Methods: this paper established a health management ontology based on an analysis of the concepts in the China Health Statistics Yearbook, and used Protégé to define the syntactic and semantic structure of health statistical data. Six classes of top-level ontology concepts and their subclasses were extracted, and the object properties and data properties were defined to establish the construction of these classes. By ontology instantiation, we can integrate multi-source heterogeneous data and enable administrators to have an overall understanding and analysis of the health statistic data. Ontology technology provides a comprehensive and unified information integration structure for the health management domain and lays a foundation for the efficient analysis of multi-source and heterogeneous health system management data and the enhancement of management efficiency.
A Method for Search Engine Selection using Thesaurus for Selective Meta-Search Engine
NASA Astrophysics Data System (ADS)
Goto, Shoji; Ozono, Tadachika; Shintani, Toramatsu
In this paper, we propose a new method for selecting search engines on the WWW for a selective meta-search engine. In a selective meta-search engine, a method is needed that enables the selection of appropriate search engines for users' queries. Most existing methods use statistical data such as document frequency. These methods may select inappropriate search engines if a query contains polysemous words. In this paper, we describe a search engine selection method based on a thesaurus. In our method, a thesaurus is constructed from documents in a search engine and is used as a source description of the search engine. The form of a particular thesaurus depends on the documents used for its construction. Our method enables search engine selection by considering the relationship between terms and overcomes the problems caused by polysemous words. Further, our method does not have a centralized broker maintaining data, such as document frequency for all search engines. As a result, it is easy to add a new search engine, and meta-search engines become more scalable with our method compared to other existing methods.
Israel, Yonatan; Tenne, Ron; Oron, Dan; Silberberg, Yaron
2017-01-01
Despite advances in low-light-level detection, single-photon methods such as photon correlation have rarely been used in the context of imaging. The few demonstrations, for example of subdiffraction-limited imaging utilizing quantum statistics of photons, have remained in the realm of proof-of-principle demonstrations. This is primarily due to a combination of low values of fill factors, quantum efficiencies, frame rates and signal-to-noise characteristic of most available single-photon sensitive imaging detectors. Here we describe an imaging device based on a fibre bundle coupled to single-photon avalanche detectors that combines a large fill factor, a high quantum efficiency, a low noise and scalable architecture. Our device enables localization-based super-resolution microscopy in a non-sparse non-stationary scene, utilizing information on the number of active emitters, as gathered from non-classical photon statistics. PMID:28287167
NASA Technical Reports Server (NTRS)
Sprowls, D. O.; Bucci, R. J.; Ponchel, B. M.; Brazill, R. L.; Bretz, P. E.
1984-01-01
A technique is demonstrated for accelerated stress corrosion testing of high strength aluminum alloys. The method offers better precision and shorter exposure times than traditional pass-fail procedures. The approach uses data from tension tests performed on replicate groups of smooth specimens after various lengths of exposure to static stress. The breaking strength measures degradation in the test specimen load carrying ability due to the environmental attack. Analysis of breaking load data by extreme value statistics enables the calculation of survival probabilities and a statistically defined threshold stress applicable to the specific test conditions. A fracture mechanics model is given which quantifies depth of attack in the stress corroded specimen by an effective flaw size calculated from the breaking stress and the material strength and fracture toughness properties. Comparisons are made with experimental results from three tempers of 7075 alloy plate tested by the breaking load method and by traditional tests of statically loaded smooth tension bars and conventional precracked specimens.
Markovic, Gabriela; Schult, Marie-Louise; Bartfai, Aniko; Elg, Mattias
2017-01-31
Progress in early cognitive recovery after acquired brain injury is uneven and unpredictable, and thus the evaluation of rehabilitation is complex. The use of time-series measurements is susceptible to statistical change due to process variation. The aim of this study was to evaluate the feasibility of using a time-series method, statistical process control, in early cognitive rehabilitation. Participants were 27 patients with acquired brain injury undergoing interdisciplinary rehabilitation of attention within 4 months post-injury. The outcome measure, the Paced Auditory Serial Addition Test, was analysed using statistical process control. Statistical process control identifies if and when change occurs in the process according to 3 patterns: rapid, steady or stationary performers. The statistical process control method was adjusted, in terms of constructing the baseline and the total number of measurement points, in order to measure a process in change. Statistical process control methodology is feasible for use in early cognitive rehabilitation, since it provides information about change in a process, thus enabling adjustment of the individual treatment response. Together with the results indicating discernible subgroups that respond differently to rehabilitation, statistical process control could be a valid tool in clinical decision-making. This study is a starting point in understanding the rehabilitation process using a real-time-measurements approach.
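A textbook form of statistical process control for one measurement per occasion is the individuals (XmR) chart, sketched below: control limits are the baseline mean ± 2.66 times the mean moving range, and later points outside the limits signal a change in the process. The study adjusted how the baseline was constructed, and those adjustments are not described in the abstract, so this sketch shows only the standard chart; the scores are hypothetical.

```python
from statistics import fmean


def xmr_limits(baseline):
    """Lower limit, centre line and upper limit of an individuals chart.

    2.66 is the standard XmR constant (3 / 1.128) that converts the mean
    moving range of consecutive points into three-sigma limits.
    """
    if len(baseline) < 2:
        raise ValueError("at least two baseline points are required")
    centre = fmean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * fmean(moving_ranges)
    return centre - spread, centre, centre + spread


def out_of_control(baseline, new_points):
    """Indices of new points that fall outside the baseline control limits."""
    lower, _, upper = xmr_limits(baseline)
    return [i for i, x in enumerate(new_points) if x < lower or x > upper]


# Hypothetical test scores: six baseline sessions, then three follow-ups.
baseline_scores = [10, 11, 10, 11, 10, 11]
signals = out_of_control(baseline_scores, [12, 14, 7])
```

Points inside the limits are treated as ordinary process variation, so only the second and third follow-up scores would be read as evidence of change in this example.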
NASA Astrophysics Data System (ADS)
Reis, Wieland G.; Tomović, Željko; Weitz, R. Thomas; Krupke, Ralph; Mikhael, Jules
2017-03-01
The potential of single-walled carbon nanotubes (SWCNTs) to outperform silicon in electronic applications was finally enabled through selective separation of semiconducting nanotubes from the as-synthesized statistical mix with polymeric dispersants. Such separation methods typically provide samples of high semiconducting purity with a narrow diameter distribution, i.e. almost single chiralities. But for a wide range of applications, high purity mixtures of small and large diameters are sufficient or even required. Here we prove that weak field centrifugation is a diameter-independent method for enrichment of semiconducting nanotubes. We show that the non-selective and strong adsorption of polyarylether dispersants on nanostructured carbon surfaces enables simple separation of diverse raw materials with different SWCNT diameters. In addition, and for the first time, we demonstrate that increased temperature enables higher purity separation. Furthermore, we show that the mode of action behind this electronic enrichment is strongly connected to both colloidal stability and protonation. By giving simple access to electronically sorted SWCNTs of any diameter, the wide dynamic range of weak field centrifugation can provide economical relevance to SWCNTs.
A segmentation editing framework based on shape change statistics
NASA Astrophysics Data System (ADS)
Mostapha, Mahmoud; Vicory, Jared; Styner, Martin; Pizer, Stephen
2017-02-01
Segmentation is a key task in medical image analysis because its accuracy significantly affects successive steps. Automatic segmentation methods often produce inadequate segmentations, which require the user to manually edit the produced segmentation slice by slice. Because editing is time-consuming, an editing tool that enables the user to produce accurate segmentations by only drawing a sparse set of contours would be needed. This paper describes such a framework as applied to a single object. Constrained by the additional information enabled by the manually segmented contours, the proposed framework utilizes object shape statistics to transform the failed automatic segmentation to a more accurate version. Instead of modeling the object shape, the proposed framework utilizes shape change statistics that were generated to capture the object deformation from the failed automatic segmentation to its corresponding correct segmentation. An optimization procedure was used to minimize an energy function that consists of two terms, an external contour match term and an internal shape change regularity term. The high accuracy of the proposed segmentation editing approach was confirmed by testing it on a simulated data set based on 10 in-vivo infant magnetic resonance brain data sets using four similarity metrics. Segmentation results indicated that our method can provide efficient and adequately accurate segmentations (Dice segmentation accuracy increase of 10%), with very sparse contours (only 10%), which is promising in greatly decreasing the work expected from the user.
Reproducibility-optimized test statistic for ranking genes in microarray studies.
Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero
2008-01-01
A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on the Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.
WEC Design Response Toolbox v. 1.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coe, Ryan; Michelen, Carlos; Eckert-Gallup, Aubrey
2016-03-30
The WEC Design Response Toolbox (WDRT) is a numerical toolbox for design-response analysis of wave energy converters (WECs). The WDRT was developed during a series of efforts to better understand WEC survival design. The WDRT has been designed as a tool for researchers and developers, enabling the straightforward application of statistical and engineering methods. The toolbox includes methods for short-term extreme response, environmental characterization, long-term extreme response and risk analysis, fatigue, and design wave composition.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
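To make the distinction between constant and proportional error concrete, the following is a minimal Bland-Altman sketch. The paired values and the function name are hypothetical, and the regression of differences on means is only one simple way to screen for proportional error; it is not claimed to be the authors' exact procedure.

```python
from statistics import mean, stdev


def bland_altman(reference, test):
    """Return (bias, limits_of_agreement, slope) for paired measurements.

    bias   : mean difference (test - reference), i.e. the constant error
    limits : bias +/- 1.96 standard deviations of the differences
    slope  : least-squares slope of difference on pair mean; a value far
             from zero suggests proportional error
    """
    diffs = [t - r for r, t in zip(reference, test)]
    means = [(t + r) / 2 for r, t in zip(reference, test)]
    bias = mean(diffs)
    sd = stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    sxy = sum((m - m_bar) * (d - bias) for m, d in zip(means, diffs))
    return bias, limits, sxy / sxx


# Hypothetical paired readings (for example, variant allele fractions in %).
reference_values = [10, 20, 30, 40, 50]
test_values = [12, 23, 34, 45, 56]  # constructed as 1.1 * reference + 1
bias, limits, slope = bland_altman(reference_values, test_values)
```

In this constructed example the bias is 4 units and the positive slope reflects the 10% proportional inflation built into the test values, which R² alone would not reveal because the two series are perfectly correlated.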
Defining the best quality-control systems by design and inspection.
Hinckley, C M
1997-05-01
Not all of the many approaches to quality control are equally effective. Nonconformities in laboratory testing are caused basically by excessive process variation and mistakes. Statistical quality control can effectively control process variation, but it cannot detect or prevent most mistakes. Because mistakes or blunders are frequently the dominant source of nonconformities, we conclude that statistical quality control by itself is not effective. I explore the 100% inspection methods essential for controlling mistakes. Unlike the inspection techniques that Deming described as ineffective, the new "source" inspection methods can detect mistakes and enable corrections before nonconformities are generated, achieving the highest degree of quality at a fraction of the cost of traditional methods. Key relationships between task complexity and nonconformity rates are also described, along with cultural changes that are essential for implementing the best quality-control practices.
Statistical Detection of Atypical Aircraft Flights
NASA Technical Reports Server (NTRS)
Statler, Irving; Chidester, Thomas; Shafto, Michael; Ferryman, Thomas; Amidan, Brett; Whitney, Paul; White, Amanda; Willse, Alan; Cooley, Scott; Jay, Joseph;
2006-01-01
A computational method and software to implement the method have been developed to sift through vast quantities of digital flight data to alert human analysts to aircraft flights that are statistically atypical in ways that signify that safety may be adversely affected. On a typical day, there are tens of thousands of flights in the United States and several times that number throughout the world. Depending on the specific aircraft design, the volume of data collected by sensors and flight recorders can range from a few dozen to several thousand parameters per second during a flight. Whereas these data have long been utilized in investigating crashes, the present method is oriented toward helping to prevent crashes by enabling routine monitoring of flight operations to identify portions of flights that may be of interest with respect to safety issues.
Probabilistic numerical methods for PDE-constrained Bayesian inverse problems
NASA Astrophysics Data System (ADS)
Cockayne, Jon; Oates, Chris; Sullivan, Tim; Girolami, Mark
2017-06-01
This paper develops meshless methods for probabilistically describing discretisation error in the numerical solution of partial differential equations. This construction enables the solution of Bayesian inverse problems while accounting for the impact of the discretisation of the forward problem. In particular, this drives statistical inferences to be more conservative in the presence of significant solver error. Theoretical results are presented describing rates of convergence for the posteriors in both the forward and inverse problems. This method is tested on a challenging inverse problem with a nonlinear forward model.
Visualizing statistical significance of disease clusters using cartograms.
Kronenfeld, Barry J; Wong, David W S
2017-05-15
Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, we do not have existing guidelines for visual assessment of statistical uncertainty. To address this shortcoming, we develop techniques for visual determination of statistical significance of clusters spanning one or more districts on a cartogram. We developed the techniques within a geovisual analytics framework that does not rely on automated significance testing, and can therefore facilitate visual analysis to detect clusters that automated techniques might miss. On a cartogram of the at-risk population, the statistical significance of a disease cluster can be determined from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference of aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence analysis in California demonstrates the ability to visually distinguish between statistically significant and insignificant regions. The proposed geovisual analytics approach enables intuitive visual assessment of statistical significance of arbitrarily defined regions on a cartogram. 
Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.
Estimation of Global Network Statistics from Incomplete Data
Bliss, Catherine A.; Danforth, Christopher M.; Dodds, Peter Sheridan
2014-01-01
Complex networks underlie an enormous variety of social, biological, physical, and virtual systems. A profound complication for the science of complex networks is that in most cases, observing all nodes and all network interactions is impossible. Previous work addressing the impacts of partial network data is surprisingly limited, focuses primarily on missing nodes, and suggests that network statistics derived from subsampled data are not suitable estimators for the same network statistics describing the overall network topology. We generate scaling methods to predict true network statistics, including the degree distribution, from only partial knowledge of nodes, links, or weights. Our methods are transparent and do not assume a known generating process for the network, thus enabling prediction of network statistics for a wide variety of applications. We validate analytical results on four simulated network classes and empirical data sets of various sizes. We perform subsampling experiments by varying proportions of sampled data and demonstrate that our scaling methods can provide very good estimates of true network statistics while acknowledging limits. Lastly, we apply our techniques to a set of rich and evolving large-scale social networks, Twitter reply networks. Based on 100 million tweets, we use our scaling techniques to propose a statistical characterization of the Twitter Interactome from September 2008 to November 2008. Our treatment allows us to find support for Dunbar's hypothesis in detecting an upper threshold for the number of active social contacts that individuals maintain over the course of one week. PMID:25338183
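The idea of rescaling a subsampled statistic can be illustrated with the simplest case, the number of links. The sketch below assumes nodes are sampled uniformly at random without replacement and only links between sampled nodes are observed; the network, sizes and function name are hypothetical, and this is not claimed to be the authors' full set of scaling methods.

```python
import random


def estimate_edge_count(sampled_edges, n_sampled, n_total):
    """Scale the edge count of a node-induced subgraph up to the full network.

    When n_sampled of n_total nodes are chosen uniformly at random, each edge
    survives with probability n(n-1) / (N(N-1)), so dividing the observed
    count by that probability gives an unbiased estimate of the true count.
    """
    keep_prob = n_sampled * (n_sampled - 1) / (n_total * (n_total - 1))
    return sampled_edges / keep_prob


rng = random.Random(0)
N, M = 2000, 10000  # hypothetical network: 2000 nodes, 10000 distinct links

# Build a simple random graph with exactly M distinct undirected edges.
edges = set()
while len(edges) < M:
    u, v = rng.sample(range(N), 2)
    edges.add((min(u, v), max(u, v)))

# Observe only the subgraph induced by half of the nodes.
sampled_nodes = set(rng.sample(range(N), N // 2))
observed = sum(1 for u, v in edges if u in sampled_nodes and v in sampled_nodes)
estimate = estimate_edge_count(observed, len(sampled_nodes), N)
```

The raw observed count is only about a quarter of the true number of links, which is why the unscaled statistic is a poor estimator; the rescaled value should land close to M.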
NASA Astrophysics Data System (ADS)
Bolodurina, I. P.; Parfenov, D. I.
2018-01-01
We have developed a neural network model for identifying virtual network flows, based on the statistical properties of flows circulating in the data center network and on characteristics that describe the content of packets transmitted through network objects. This enabled us to establish the optimal set of attributes for identifying virtual network functions. Using the data obtained in our research, we have established an algorithm for optimizing the placement of virtual data functions. Our approach uses a hybrid method of virtualization using virtual machines and containers, which makes it possible to reduce the infrastructure load and the response time in the network of the virtual data center. The algorithmic solution is based on neural networks, which allows it to scale to any number of network function copies.
Scanning probe recognition microscopy investigation of tissue scaffold properties
Fan, Yuan; Chen, Qian; Ayres, Virginia M; Baczewski, Andrew D; Udpa, Lalita; Kumar, Shiva
2007-01-01
Scanning probe recognition microscopy is a new scanning probe microscopy technique which enables selective scanning along individual nanofibers within a tissue scaffold. Statistically significant data for multiple properties can be collected by repetitively fine-scanning an identical region of interest. The results of a scanning probe recognition microscopy investigation of the surface roughness and elasticity of a series of tissue scaffolds are presented. Deconvolution and statistical methods were developed and used for data accuracy along curved nanofiber surfaces. Nanofiber features were also independently analyzed using transmission electron microscopy, with results that supported the scanning probe recognition microscopy-based analysis. PMID:18203431
NASA Astrophysics Data System (ADS)
Avakyan, L. A.; Heinz, M.; Skidanenko, A. V.; Yablunovski, K. A.; Ihlemann, J.; Meinertz, J.; Patzig, C.; Dubiel, M.; Bugaev, L. A.
2018-01-01
The formation of a localized surface plasmon resonance (SPR) spectrum of randomly distributed gold nanoparticles in the surface layer of silicate float glass, generated and implanted by UV ArF-excimer laser irradiation of a thin gold layer sputter-coated on the glass surface, was studied by the T-matrix method, which enables particle agglomeration to be taken into account. The experimental technique used is promising for the production of submicron patterns of plasmonic nanoparticles (given by laser masks or gratings) without damage to the glass surface. Analysis of the applicability of the multi-sphere T-matrix (MSTM) method to the studied material was performed through calculations of SPR characteristics for differently arranged and structured gold nanoparticles (gold nanoparticles in solution, particle pairs, and core-shell silver-gold nanoparticles) for which either experimental data or results of modeling by other methods are available. For the studied gold nanoparticles in glass, it was revealed that the theoretical description of their SPR spectrum requires consideration of the plasmon coupling between particles, which can be done effectively by MSTM calculations. The obtained statistical distributions over particle sizes and over interparticle distances demonstrated saturation behavior with respect to the number of particles under consideration, which enabled us to determine the effective aggregate of particles sufficient to form the SPR spectrum. The suggested technique for fitting an experimental SPR spectrum of gold nanoparticles in glass, by varying the geometrical parameters of the particle aggregate in repeated calculations of the spectrum by the MSTM method, enabled us to determine statistical characteristics of the aggregate: the average distance between particles, average size, and size distribution of the particles. 
The fitting strategy of the SPR spectrum presented here can be applied to nanoparticles of any nature and in various substances, and, in principle, can be extended for particles with non-spherical shapes, like ellipsoids, rod-like and other T-matrix-solvable shapes.
ConvAn: a convergence analyzing tool for optimization of biochemical networks.
Kostromins, Andrejs; Mozga, Ivars; Stalidzans, Egils
2012-01-01
Dynamic models of biochemical networks are usually described as systems of nonlinear differential equations. When such models are optimized for parameter estimation or for the design of new properties, mainly numerical methods are used. This makes the optimization hard to predict, because most numerical optimization methods have stochastic properties and the convergence of the objective function to the global optimum is hardly predictable. Determining a suitable optimization method and the necessary duration of optimization becomes critical when a large number of combinations of adjustable parameters must be evaluated or when the dynamic models are large. This task is complex due to the variety of optimization methods and software tools and the nonlinearity features of models in different parameter spaces. A software tool, ConvAn, was developed to analyze statistical properties of convergence dynamics for optimization runs with a particular optimization method, model, software tool, set of optimization method parameters and number of adjustable parameters of the model. The convergence curves can be normalized automatically to enable comparison of different methods and models on the same scale. With the help of the biochemistry-adapted graphical user interface of ConvAn, it is possible to compare different optimization methods in terms of their ability to find the global optima, or values close to them, as well as the computational time needed to reach them. It is possible to estimate the optimization performance for different numbers of adjustable parameters. The functionality of ConvAn enables statistical assessment of the necessary optimization time depending on the required optimization accuracy. Optimization methods that are not suitable for a particular optimization task can be rejected if they have poor repeatability or convergence properties. The software ConvAn is freely available on www.biosystems.lv/convan. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Precipitate statistics in an Al-Mg-Si-Cu alloy from scanning precession electron diffraction data
NASA Astrophysics Data System (ADS)
Sunde, J. K.; Paulsen, Ø.; Wenner, S.; Holmestad, R.
2017-09-01
The key microstructural feature providing strength to age-hardenable Al alloys is nanoscale precipitates. Alloy development requires a reliable statistical assessment of these precipitates, in order to link the microstructure with material properties. Here, it is demonstrated that scanning precession electron diffraction combined with computational analysis enables the semi-automated extraction of precipitate statistics in an Al-Mg-Si-Cu alloy. Among the main findings is the precipitate number density, which agrees well with a conventional method based on manual counting and measurements. By virtue of its data analysis objectivity, our methodology is therefore seen as an advantageous alternative to existing routines, offering reproducibility and efficiency in alloy statistics. Additional results include improved qualitative information on phase distributions. The developed procedure is generic and applicable to any material containing nanoscale precipitates.
Understanding spatial organizations of chromosomes via statistical analysis of Hi-C data
Hu, Ming; Deng, Ke; Qin, Zhaohui; Liu, Jun S.
2015-01-01
Understanding how chromosomes fold provides insights into the transcription regulation, hence, the functional state of the cell. Using the next generation sequencing technology, the recently developed Hi-C approach enables a global view of spatial chromatin organization in the nucleus, which substantially expands our knowledge about genome organization and function. However, due to multiple layers of biases, noises and uncertainties buried in the protocol of Hi-C experiments, analyzing and interpreting Hi-C data poses great challenges, and requires novel statistical methods to be developed. This article provides an overview of recent Hi-C studies and their impacts on biomedical research, describes major challenges in statistical analysis of Hi-C data, and discusses some perspectives for future research. PMID:26124977
Statistical Characterization and Classification of Edge-Localized Plasma Instabilities
NASA Astrophysics Data System (ADS)
Webster, A. J.; Dendy, R. O.
2013-04-01
The statistics of edge-localized plasma instabilities (ELMs) in toroidal magnetically confined fusion plasmas are considered. From first principles, standard experimentally motivated assumptions are shown to determine a specific probability distribution for the waiting times between ELMs: the Weibull distribution. This is confirmed empirically by a statistically rigorous comparison with a large data set from the Joint European Torus. The successful characterization of ELM waiting times enables future work to progress in various ways. Here we present a quantitative classification of ELM types, complementary to phenomenological approaches. It also informs us about the nature of ELM processes, such as whether they are random or deterministic. The methods are extremely general and can be applied to numerous other quasiperiodic intermittent phenomena.
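As an illustration of how a Weibull distribution can be fitted to a set of waiting times, here is a minimal maximum-likelihood sketch. The synthetic data, parameter values and function name are hypothetical, and the abstract does not state which fitting or comparison procedure the authors used.

```python
import math
import random


def fit_weibull(times, lo=0.05, hi=50.0, iterations=100):
    """Maximum-likelihood Weibull fit; returns (shape, scale).

    Solves the profile-likelihood equation for the shape k by bisection:
        sum(t^k ln t) / sum(t^k) - 1/k - mean(ln t) = 0
    whose left-hand side increases monotonically with k.
    Assumes the true shape lies between lo and hi and all times are positive.
    """
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / len(logs)

    def score(k):
        powers = [t ** k for t in times]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / k - mean_log

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(t ** shape for t in times) / len(times)) ** (1.0 / shape)
    return shape, scale


# Synthetic waiting times drawn from a Weibull with scale 2.0 and shape 3.0.
rng = random.Random(1)
waiting_times = [rng.weibullvariate(2.0, 3.0) for _ in range(5000)]
shape, scale = fit_weibull(waiting_times)
```

A fitted shape above 1 corresponds to a hazard rate that grows with the time elapsed since the previous event, whereas a shape of exactly 1 reduces to the exponential distribution of a purely random (memoryless) process.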
Bootstrapping under constraint for the assessment of group behavior in human contact networks
NASA Astrophysics Data System (ADS)
Tremblay, Nicolas; Barrat, Alain; Forest, Cary; Nornberg, Mark; Pinton, Jean-François; Borgnat, Pierre
2013-11-01
The increasing availability of time- and space-resolved data describing human activities and interactions gives insights into both static and dynamic properties of human behavior. In practice, nevertheless, real-world data sets can often be considered as only one realization of a particular event. This highlights a key issue in social network analysis: the statistical significance of estimated properties. In this context, we focus here on the assessment of quantitative features of specific subset of nodes in empirical networks. We present a method of statistical resampling based on bootstrapping groups of nodes under constraints within the empirical network. The method enables us to define acceptance intervals for various null hypotheses concerning relevant properties of the subset of nodes under consideration in order to characterize by a statistical test its behavior as “normal” or not. We apply this method to a high-resolution data set describing the face-to-face proximity of individuals during two colocated scientific conferences. As a case study, we show how to probe whether colocating the two conferences succeeded in bringing together the two corresponding groups of scientists.
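The resampling idea can be sketched as follows. The constraint assumed here, that every resampled group keeps the same number of nodes per class as the group under study, is only one plausible example; the toy network, class labels and function names are hypothetical, and the abstract does not specify the constraints actually used.

```python
import random


def constrained_bootstrap(nodes_by_class, group, statistic,
                          n_draws=1000, level=0.95, seed=0):
    """Acceptance interval for `statistic` over random groups that keep the
    same number of nodes per class as `group` (the assumed constraint)."""
    rng = random.Random(seed)
    class_of = {n: c for c, members in nodes_by_class.items() for n in members}
    quota = {}
    for n in group:
        quota[class_of[n]] = quota.get(class_of[n], 0) + 1
    draws = []
    for _ in range(n_draws):
        resampled = []
        for c, k in quota.items():
            resampled.extend(rng.sample(nodes_by_class[c], k))
        draws.append(statistic(resampled))
    draws.sort()
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * n_draws)]
    upper = draws[min(n_draws - 1, int((1.0 - tail) * n_draws))]
    return lower, upper


# Toy contact network: 40 people in two classes, sparse random contacts,
# plus a fully connected group of interest (4 members from each class).
rng = random.Random(1)
nodes_by_class = {"A": list(range(20)), "B": list(range(20, 40))}
group = [0, 1, 2, 3, 20, 21, 22, 23]
contacts = {frozenset(rng.sample(range(40), 2)) for _ in range(60)}
contacts |= {frozenset((u, v)) for i, u in enumerate(group) for v in group[i + 1:]}


def internal_contacts(members):
    """Number of contacts with both endpoints inside the given group."""
    chosen = set(members)
    return sum(1 for pair in contacts if pair <= chosen)


observed = internal_contacts(group)
lower, upper = constrained_bootstrap(nodes_by_class, group, internal_contacts)
is_typical = lower <= observed <= upper
```

Here the group of interest was built to be unusually cohesive, so its observed value falls outside the acceptance interval and the null hypothesis of "normal" behavior would be rejected for this statistic.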
Ladd, David E.; Law, George S.
2007-01-01
The U.S. Geological Survey (USGS) provides streamflow and other stream-related information needed to protect people and property from floods, to plan and manage water resources, and to protect water quality in the streams. Streamflow statistics provided by the USGS, such as the 100-year flood and the 7-day 10-year low flow, frequently are used by engineers, land managers, biologists, and many others to help guide decisions in their everyday work. In addition to streamflow statistics, resource managers often need to know the physical and climatic characteristics (basin characteristics) of the drainage basins for locations of interest to help them understand the mechanisms that control water availability and water quality at these locations. StreamStats is a Web-enabled geographic information system (GIS) application that makes it easy for users to obtain streamflow statistics, basin characteristics, and other information for USGS data-collection stations and for ungaged sites of interest. If a user selects the location of a data-collection station, StreamStats will provide previously published information for the station from a database. If a user selects a location where no data are available (an ungaged site), StreamStats will run a GIS program to delineate a drainage basin boundary, measure basin characteristics, and estimate streamflow statistics based on USGS streamflow prediction methods. A user can download a GIS feature class of the drainage basin boundary with attributes including the measured basin characteristics and streamflow estimates.
Mansourian, Robert; Mutch, David M; Antille, Nicolas; Aubert, Jerome; Fogel, Paul; Le Goff, Jean-Marc; Moulin, Julie; Petrov, Anton; Rytz, Andreas; Voegel, Johannes J; Roberts, Matthew-Alan
2004-11-01
Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often results in the use of a low number of replicates (k). Under circumstances where k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. Other more advanced statistical tests have been developed; however, their use and interpretation often remain difficult to implement in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining as an intuitive and computationally efficient procedure. The present study describes a Global Error Assessment (GEA) methodology to select differentially expressed genes in microarray datasets, and was developed using an in vitro experiment that compared control and interferon-gamma treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Gene expression results of a similar absolute expression are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates variability of gene expression in each bin to absolute expression levels and uses this in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates an increased stability, robustness and confidence in gene selection. A subset of the selected genes were validated by real-time reverse transcription-polymerase chain reaction (RT-PCR). All these results suggest that GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. 
The GEA code for R software is freely available upon request to authors.
Automatic stage identification of Drosophila egg chamber based on DAPI images
Jia, Dongyu; Xu, Qiuping; Xie, Qian; Mio, Washington; Deng, Wu-Min
2016-01-01
The Drosophila egg chamber, whose development is divided into 14 stages, is a well-established model for developmental biology. However, visual stage determination can be a tedious, subjective and time-consuming task prone to errors. Our study presents an objective, reliable and repeatable automated method for quantifying cell features and classifying egg chamber stages based on DAPI images. The proposed approach is composed of two steps: 1) a feature extraction step and 2) a statistical modeling step. The egg chamber features used are egg chamber size, oocyte size, egg chamber ratio and distribution of follicle cells. Methods for determining the on-site of the polytene stage and centripetal migration are also discussed. The statistical model uses linear and ordinal regression to explore the stage-feature relationships and classify egg chamber stages. Combined with machine learning, our method has great potential to enable discovery of hidden developmental mechanisms. PMID:26732176
Multi-fidelity machine learning models for accurate bandgap predictions of solids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab
Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.
Multi-fidelity machine learning models for accurate bandgap predictions of solids
Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab
2016-12-28
Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.
Ye, Feng; Liu, Yaohua; Whitfield, Ross; Osborn, Ray; Rosenkranz, Stephan
2018-04-01
The CORELLI instrument at Oak Ridge National Laboratory is a statistical chopper spectrometer designed and optimized to probe complex disorder in crystalline materials through diffuse scattering experiments. On CORELLI, the high efficiency of white-beam Laue diffraction combined with elastic discrimination has enabled an unprecedented data collection rate to obtain both the total and the elastic-only scattering over a large volume of reciprocal space from a single measurement. To achieve this, CORELLI is equipped with a statistical chopper to modulate the incoming neutron beam quasi-randomly, and then the cross-correlation method is applied to reconstruct the elastic component from the scattering data. Details of the implementation of the cross-correlation method on CORELLI are given and its performance is discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ye, Feng; Liu, Yaohua; Whitfield, Ross
The CORELLI instrument at Oak Ridge National Laboratory is a statistical chopper spectrometer designed and optimized to probe complex disorder in crystalline materials through diffuse scattering experiments. On CORELLI, the high efficiency of white-beam Laue diffraction combined with elastic discrimination has enabled an unprecedented data collection rate to obtain both the total and the elastic-only scattering over a large volume of reciprocal space from a single measurement. To achieve this, CORELLI is equipped with a statistical chopper to modulate the incoming neutron beam quasi-randomly, and then the cross-correlation method is applied to reconstruct the elastic component from the scattering data. Lastly, details of the implementation of the cross-correlation method on CORELLI are given and its performance is discussed.
Ye, Feng; Liu, Yaohua; Whitfield, Ross; ...
2018-03-26
The CORELLI instrument at Oak Ridge National Laboratory is a statistical chopper spectrometer designed and optimized to probe complex disorder in crystalline materials through diffuse scattering experiments. On CORELLI, the high efficiency of white-beam Laue diffraction combined with elastic discrimination has enabled an unprecedented data collection rate to obtain both the total and the elastic-only scattering over a large volume of reciprocal space from a single measurement. To achieve this, CORELLI is equipped with a statistical chopper to modulate the incoming neutron beam quasi-randomly, and then the cross-correlation method is applied to reconstruct the elastic component from the scattering data. Lastly, details of the implementation of the cross-correlation method on CORELLI are given and its performance is discussed.
De Spiegelaere, Ward; Malatinkova, Eva; Lynch, Lindsay; Van Nieuwerburgh, Filip; Messiaen, Peter; O'Doherty, Una; Vandekerckhove, Linos
2014-06-01
Quantification of integrated proviral HIV DNA by repetitive-sampling Alu-HIV PCR is a candidate virological tool to monitor the HIV reservoir in patients. However, the experimental procedures and data analysis of the assay are complex and hinder its widespread use. Here, we provide an improved and simplified data analysis method by adopting binomial and Poisson statistics. A modified analysis method on the basis of Poisson statistics was used to analyze the binomial data of positive and negative reactions from a 42-replicate Alu-HIV PCR by use of dilutions of an integration standard and on samples of 57 HIV-infected patients. Results were compared with the quantitative output of the previously described Alu-HIV PCR method. Poisson-based quantification of the Alu-HIV PCR was linearly correlated with the standard dilution series, indicating that absolute quantification with the Poisson method is a valid alternative for data analysis of repetitive-sampling Alu-HIV PCR data. Quantitative outputs of patient samples assessed by the Poisson method correlated with the previously described Alu-HIV PCR analysis, indicating that this method is a valid alternative for quantifying integrated HIV DNA. Poisson-based analysis of the Alu-HIV PCR data enables absolute quantification without the need of a standard dilution curve. Implementation of the CI estimation permits improved qualitative analysis of the data and provides a statistical basis for the required minimal number of technical replicates. © 2014 The American Association for Clinical Chemistry.
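The abstract above does not spell out its formula, so the following is only a minimal sketch of the standard Poisson zero-term estimator that such replicate-based quantification commonly relies on: if targets are distributed randomly across reactions, the fraction of negative reactions estimates exp(-lambda), where lambda is the mean number of detectable targets per reaction. The function name and arguments are hypothetical, and whether the authors used exactly this estimator (or which confidence-interval method they applied) is an assumption not confirmed by the source.

```python
import math


def poisson_targets_per_reaction(n_positive: int, n_replicates: int) -> float:
    """Estimate the mean number of detectable targets per reaction.

    Assumes (hypothetically) that targets follow a Poisson distribution
    across replicates, so P(negative reaction) = exp(-lambda).
    """
    if not 0 <= n_positive <= n_replicates or n_replicates == 0:
        raise ValueError("need 0 <= n_positive <= n_replicates and n_replicates > 0")
    n_negative = n_replicates - n_positive
    if n_negative == 0:
        # With no negative reactions the estimate is unbounded; the sample
        # would have to be diluted and re-run.
        raise ValueError("all replicates positive: estimate is unbounded")
    return -math.log(n_negative / n_replicates)


if __name__ == "__main__":
    # Illustrative numbers only: 21 positive reactions out of 42 replicates.
    print(poisson_targets_per_reaction(21, 42))
```

Because the estimate depends only on the count of positive and negative reactions, no standard dilution curve is needed, which matches the benefit described in the abstract.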
Revising the lower statistical limit of x-ray grating-based phase-contrast computed tomography.
Marschner, Mathias; Birnbacher, Lorenz; Willner, Marian; Chabior, Michael; Herzen, Julia; Noël, Peter B; Pfeiffer, Franz
2017-01-01
Phase-contrast x-ray computed tomography (PCCT) is currently being investigated as an interesting extension of conventional CT, providing high soft-tissue contrast even when examining weakly absorbing specimens. Until now, the potential for dose reduction was thought to be limited compared to attenuation CT, since meaningful phase retrieval fails for scans with very low photon counts when using the conventional phase retrieval method via phase stepping. In this work, we examine the statistical behaviour of the reverse projection method, an alternative phase retrieval approach, and compare the results to the conventional phase retrieval technique. We investigate the noise levels in the projections as well as the image quality and quantitative accuracy of the reconstructed tomographic volumes. The results of our study show that this method performs better in a low-dose scenario than the conventional phase retrieval approach, resulting in lower noise levels, enhanced image quality and more accurate quantitative values. Overall, we demonstrate that the lower statistical limit of the phase stepping procedure as proposed by recent literature does not apply to this alternative phase retrieval technique. However, further development is necessary to overcome experimental challenges posed by this method, which would enable mainstream or even clinical application of PCCT.
Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.
Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J
2016-10-03
Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank (up to 26 mm); this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models, our proposed method is an alternative to linear scaling methods. Copyright © 2016 The Author. Published by Elsevier Ltd. All rights reserved.
Statistical Deconvolution for Superresolution Fluorescence Microscopy
Mukamel, Eran A.; Babcock, Hazen; Zhuang, Xiaowei
2012-01-01
Superresolution microscopy techniques based on the sequential activation of fluorophores can achieve image resolution of ∼10 nm but require a sparse distribution of simultaneously activated fluorophores in the field of view. Image analysis procedures for this approach typically discard data from crowded molecules with overlapping images, wasting valuable image information that is only partly degraded by overlap. A data analysis method that exploits all available fluorescence data, regardless of overlap, could increase the number of molecules processed per frame and thereby accelerate superresolution imaging speed, enabling the study of fast, dynamic biological processes. Here, we present a computational method, referred to as deconvolution-STORM (deconSTORM), which uses iterative image deconvolution in place of single- or multiemitter localization to estimate the sample. DeconSTORM approximates the maximum likelihood sample estimate under a realistic statistical model of fluorescence microscopy movies comprising numerous frames. The model incorporates Poisson-distributed photon-detection noise, the sparse spatial distribution of activated fluorophores, and temporal correlations between consecutive movie frames arising from intermittent fluorophore activation. We first quantitatively validated this approach with simulated fluorescence data and showed that deconSTORM accurately estimates superresolution images even at high densities of activated fluorophores where analysis by single- or multiemitter localization methods fails. We then applied the method to experimental data of cellular structures and demonstrated that deconSTORM enables an approximately fivefold or greater increase in imaging speed by allowing a higher density of activated fluorophores/frame. PMID:22677393
Army Logistician. Volume 39, Issue 1, January-February 2007
2007-02-01
of electronic systems using statistical methods. P&C, however, requires advanced prognostic capabilities not only to detect the early onset of...patterns. Entities operating in a P&C-enabled environment will sense and understand contextual meaning, communicate their state and mission, and act to...accessing of historical and simulation patterns; on-board prognostics capabilities; physics of failure analyses; and predictive modeling. P&C also
Developing Confidence Limits For Reliability Of Software
NASA Technical Reports Server (NTRS)
Hayhurst, Kelly J.
1991-01-01
Technique developed for estimating reliability of software by use of Moranda geometric de-eutrophication model. Pivotal method enables straightforward construction of exact bounds with associated degree of statistical confidence about reliability of software. Confidence limits thus derived provide precise means of assessing quality of software. Limits take into account number of bugs found while testing and effects of sampling variation associated with random order of discovering bugs.
Quantification of heterogeneity observed in medical images.
Brooks, Frank J; Grigsby, Perry W
2013-03-02
There has been much recent interest in the quantification of visually evident heterogeneity within functional grayscale medical images, such as those obtained via magnetic resonance or positron emission tomography. In the case of images of cancerous tumors, variations in grayscale intensity imply variations in crucial tumor biology. Despite these considerable clinical implications, there is as yet no standardized method for measuring the heterogeneity observed via these imaging modalities. In this work, we motivate and derive a statistical measure of image heterogeneity. This statistic measures the distance-dependent average deviation from the smoothest intensity gradation feasible. We show how this statistic may be used to automatically rank images of in vivo human tumors in order of increasing heterogeneity. We test this method against the current practice of ranking images via expert visual inspection. We find that this statistic provides a means of heterogeneity quantification beyond that given by other statistics traditionally used for the same purpose. We demonstrate the effect of tumor shape upon our ranking method and find the method applicable to a wide variety of clinically relevant tumor images. We find that the automated heterogeneity rankings agree very closely with those performed visually by experts. These results indicate that our automated method may be used reliably to rank, in order of increasing heterogeneity, tumor images whether or not object shape is considered to contribute to that heterogeneity. Automated heterogeneity ranking yields objective results which are more consistent than visual rankings. Reducing variability in image interpretation will enable more researchers to better study potential clinical implications of observed tumor heterogeneity.
A flexibly shaped space-time scan statistic for disease outbreak detection and monitoring.
Takahashi, Kunihiko; Kulldorff, Martin; Tango, Toshiro; Yih, Katherine
2008-04-11
Early detection of disease outbreaks enables public health officials to implement disease control and prevention measures at the earliest possible time. A time periodic geographical disease surveillance system based on a cylindrical space-time scan statistic has been used extensively for disease surveillance along with the SaTScan software. In the purely spatial setting, many different methods have been proposed to detect spatial disease clusters. In particular, some spatial scan statistics are aimed at detecting irregularly shaped clusters which may not be detected by the circular spatial scan statistic. Based on the flexible purely spatial scan statistic, we propose a flexibly shaped space-time scan statistic for early detection of disease outbreaks. The performance of the proposed space-time scan statistic is compared with that of the cylindrical scan statistic using benchmark data. In order to compare their performances, we have developed a space-time power distribution by extending the purely spatial bivariate power distribution. Daily syndromic surveillance data in Massachusetts, USA, are used to illustrate the proposed test statistic. The flexible space-time scan statistic is well suited for detecting and monitoring disease outbreaks in irregularly shaped areas.
Kernel-based whole-genome prediction of complex traits: a review.
Morota, Gota; Gianola, Daniel
2014-01-01
Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
General Aviation Avionics Statistics : 1975
DOT National Transportation Integrated Search
1978-06-01
This report presents avionics statistics for the 1975 general aviation (GA) aircraft fleet and updates a previous publication, General Aviation Avionics Statistics: 1974. The statistics are presented in a capability group framework which enables one ...
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Frollo, Ivan
2017-12-01
The paper focuses on two methods for evaluating the success of enhancement of speech signals recorded in an open-air magnetic resonance imager during phonation for 3D human vocal tract modeling. The first approach provides a comparison based on statistical analysis by ANOVA and hypothesis tests. The second method is based on classification by Gaussian mixture models (GMM). The experiments confirmed that the proposed ANOVA and GMM classifiers for automatic evaluation of speech quality are functional and produce results fully comparable with the standard evaluation based on the listening test method.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
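As a rough illustration of the Bland-Altman comparison named in the abstract above, the sketch below computes the mean difference between two assays (the constant error, or bias) and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). It covers only the constant-error part; the proportional-error and Deming regression analyses mentioned in the abstract are not shown. The function name and the sample values are hypothetical and are not taken from the paper.

```python
import statistics


def bland_altman(assay_a, assay_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(assay_a) != len(assay_b) or len(assay_a) < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(assay_a, assay_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Made-up paired values from a hypothetical new assay and a reference assay.
    new_assay = [10.0, 20.0, 30.0, 40.0]
    reference = [8.0, 19.0, 29.0, 36.0]
    print(bland_altman(new_assay, reference))
```

Unlike R², which can stay high even when one assay reads consistently above the other, a non-zero bias here exposes that constant offset directly.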
NASA Astrophysics Data System (ADS)
Denis, Vincent
2008-09-01
This paper presents a statistical method for determining the dimensions, tolerance and specifications of components for the Laser MegaJoule (LMJ). Numerous constraints inherent to a large facility require specific tolerances: the huge number of optical components; the interdependence of these components between the beams of the same bundle; angular multiplexing for the amplifier section; distinct operating modes between the alignment and firing phases; the definition and use of alignment software in place of classic optimization. This method provides greater flexibility to determine the positioning and manufacturing specifications of the optical components. Given the enormous power of the Laser MegaJoule (over 18 kJ in the infrared and 9 kJ in the ultraviolet), one of the major risks is damage to the optical mounts and pollution of the installation by mechanical ablation. This method enables estimation of the beam occultation probabilities and quantification of the risks for the facility. All the simulations were run using the ZEMAX-EE optical design software.
NASA Astrophysics Data System (ADS)
Barreiro, Andrea K.; Ly, Cheng
2017-08-01
Rapid experimental advances now enable simultaneous electrophysiological recording of neural activity at single-cell resolution across large regions of the nervous system. Models of this neural network activity will necessarily increase in size and complexity, thus increasing the computational cost of simulating them and the challenge of analyzing them. Here we present a method to approximate the activity and firing statistics of a general firing rate network model (of the Wilson-Cowan type) subject to noisy correlated background inputs. The method requires solving a system of transcendental equations and is fast compared to Monte Carlo simulations of coupled stochastic differential equations. We implement the method with several examples of coupled neural networks and show that the results are quantitatively accurate even with moderate coupling strengths and an appreciable amount of heterogeneity in many parameters. This work should be useful for investigating how various neural attributes qualitatively affect the spiking statistics of coupled neural networks.
Variability aware compact model characterization for statistical circuit design optimization
NASA Astrophysics Data System (ADS)
Qiao, Ying; Qian, Kun; Spanos, Costas J.
2012-03-01
Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose an efficient variability-aware compact model characterization methodology based on the linear propagation of variance. Hierarchical spatial variability patterns of selected compact model parameters are directly calculated from transistor array test structures. This methodology has been implemented and tested using transistor I-V measurements and the EKV-EPFL compact model. Calculation results compare well to full-wafer direct model parameter extractions. Further studies are done on the proper selection of both compact model parameters and electrical measurement metrics used in the method.
Mathematics in modern immunology
Castro, Mario; Lythe, Grant; Molina-París, Carmen; ...
2016-02-19
Mathematical and statistical methods enable multidisciplinary approaches that catalyse discovery. Together with experimental methods, they identify key hypotheses, define measurable observables and reconcile disparate results. Here, we collect a representative sample of studies in T-cell biology that illustrate the benefits of modelling–experimental collaborations and that have proven valuable or even groundbreaking. Furthermore, we conclude that it is possible to find excellent examples of synergy between mathematical modelling and experiment in immunology, which have brought significant insight that would not be available without these collaborations, but that much remains to be discovered.
Mathematics in modern immunology
Castro, Mario; Lythe, Grant; Molina-París, Carmen; Ribeiro, Ruy M.
2016-01-01
Mathematical and statistical methods enable multidisciplinary approaches that catalyse discovery. Together with experimental methods, they identify key hypotheses, define measurable observables and reconcile disparate results. We collect a representative sample of studies in T-cell biology that illustrate the benefits of modelling–experimental collaborations and that have proven valuable or even groundbreaking. We conclude that it is possible to find excellent examples of synergy between mathematical modelling and experiment in immunology, which have brought significant insight that would not be available without these collaborations, but that much remains to be discovered. PMID:27051512
Saad, Ahmed S; Abo-Talib, Nisreen F; El-Ghobashy, Mohamed R
2016-01-05
Different methods have been introduced to enhance the selectivity of UV-spectrophotometry, thus enabling accurate determination of co-formulated components; however, mixtures whose components exhibit wide variation in absorptivities have been an obstacle to the application of UV-spectrophotometry. The developed ratio difference at coabsorptive point method (RDC) represents a simple, effective solution to this problem: the additive property of light absorbance enables the two components to be considered as multiples of the lower-absorptivity component at a certain wavelength (the coabsorptive point), at which their total concentration multiples can be determined, whereas the other component is selectively determined by applying the ratio difference method in a single step. A mixture of perindopril arginine (PA) and amlodipine besylate (AM) exemplifies that problem, where the low absorptivity of PA relative to AM hinders selective spectrophotometric determination of PA. The developed method successfully determined both components in the overlapped region of their spectra with accuracy 99.39±1.60 and 100.51±1.21 for PA and AM, respectively. The method was validated as per the USP guidelines and showed no significant difference upon statistical comparison with a reported chromatographic method. Copyright © 2015 Elsevier B.V. All rights reserved.
Software Analytical Instrument for Assessment of the Process of Casting Slabs
NASA Astrophysics Data System (ADS)
Franěk, Zdeněk; Kavička, František; Štětina, Josef; Masarik, Miloš
2010-06-01
The paper describes an original proposal for the design and function of software for assessing the process of casting slabs. The program system LITIOS was developed and implemented at EVRAZ Vitkovice Steel Ostrava on the equipment for continuous casting of steel (hereafter ECC). The system works on a data warehouse of technological parameters of casting and quality parameters of slabs. It enables an ECC technologist to analyze the course of casting of a melt and, using statistical methods, to determine the influence of individual technological parameters on the quality of the final slabs. The system also enables long-term monitoring and optimization of production.
Modelling Complexity: Making Sense of Leadership Issues in 14-19 Education
ERIC Educational Resources Information Center
Briggs, Ann R. J.
2008-01-01
Modelling of statistical data is a well established analytical strategy. Statistical data can be modelled to represent, and thereby predict, the forces acting upon a structure or system. For the rapidly changing systems in the world of education, modelling enables the researcher to understand, to predict and to enable decisions to be based upon…
General Aviation Avionics Statistics : 1976
DOT National Transportation Integrated Search
1979-11-01
This report presents avionics statistics for the 1976 general aviation (GA) aircraft fleet and is the third in a series titled "General Aviation Avionics Statistics." The statistics are presented in a capability group framework which enables one to r...
General Aviation Avionics Statistics : 1978 Data
DOT National Transportation Integrated Search
1980-12-01
The report presents avionics statistics for the 1978 general aviation (GA) aircraft fleet and is the fifth in a series titled "General Aviation Statistics." The statistics are presented in a capability group framework which enables one to relate airb...
General Aviation Avionics Statistics : 1979 Data
DOT National Transportation Integrated Search
1981-04-01
This report presents avionics statistics for the 1979 general aviation (GA) aircraft fleet and is the sixth in a series titled General Aviation Avionics Statistics. The statistics are presented in a capability group framework which enables one to relate...
High-Throughput, Data-Rich Cellular RNA Device Engineering
Townshend, Brent; Kennedy, Andrew B.; Xiang, Joy S.; Smolke, Christina D.
2015-01-01
Methods for rapidly assessing sequence-structure-function landscapes and developing conditional gene-regulatory devices are critical to our ability to manipulate and interface with biology. We describe a framework for engineering RNA devices from preexisting aptamers that exhibit ligand-responsive ribozyme tertiary interactions. Our methodology utilizes cell sorting, high-throughput sequencing, and statistical data analyses to enable parallel measurements of the activities of hundreds of thousands of sequences from RNA device libraries in the absence and presence of ligands. Our tertiary interaction RNA devices exhibit improved performance in terms of gene silencing, activation ratio, and ligand sensitivity as compared to optimized RNA devices that rely on secondary structure changes. We apply our method to building biosensors for diverse ligands and determine consensus sequences that enable ligand-responsive tertiary interactions. These methods advance our ability to develop broadly applicable genetic tools and to elucidate understanding of the underlying sequence-structure-function relationships that empower rational design of complex biomolecules. PMID:26258292
A Geometrical-Statistical Approach to Outlier Removal for TDOA Measurements
NASA Astrophysics Data System (ADS)
Compagnoni, Marco; Pini, Alessia; Canclini, Antonio; Bestagini, Paolo; Antonacci, Fabio; Tubaro, Stefano; Sarti, Augusto
2017-08-01
The curse of outlier measurements in estimation problems is a well-known issue in a variety of fields. Therefore, outlier removal procedures, which enable the identification of spurious measurements within a set, have been developed for many different scenarios and applications. In this paper, we propose a statistically motivated outlier removal algorithm for time differences of arrival (TDOAs), or equivalently range differences (RD), acquired at sensor arrays. The method exploits the TDOA-space formalism and works by only knowing relative sensor positions. As the proposed method is completely independent from the application for which measurements are used, it can be reliably used to identify outliers within a set of TDOA/RD measurements in different fields (e.g. acoustic source localization, sensor synchronization, radar, remote sensing, etc.). The proposed outlier removal algorithm is validated by means of synthetic simulations and real experiments.
Statistical Reference Datasets
National Institute of Standards and Technology Data Gateway
Statistical Reference Datasets (Web, free access) The Statistical Reference Datasets is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.
Multiple-Point statistics for stochastic modeling of aquifers, where do we stand?
NASA Astrophysics Data System (ADS)
Renard, P.; Julien, S.
2017-12-01
In the last 20 years, multiple-point statistics have been a focus of much research, successes and disappointments. The aim of this geostatistical approach was to integrate geological information into stochastic models of aquifer heterogeneity to better represent the connectivity of high or low permeability structures in the underground. Many different algorithms (ENESIM, SNESIM, SIMPAT, CCSIM, QUILTING, IMPALA, DEESSE, FILTERSIM, HYPPS, etc.) have been and are still proposed. They are all based on the concept of a training data set from which spatial statistics are derived and used in a further step to generate conditional realizations. Some of these algorithms evaluate the statistics of the spatial patterns for every pixel, other techniques consider the statistics at the scale of a patch or a tile. While the method clearly succeeded in enabling modelers to generate realistic models, several issues are still the topic of debate both from a practical and theoretical point of view, and some issues such as training data set availability are often hindering the application of the method in practical situations. In this talk, the aim is to present a review of the status of these approaches both from a theoretical and practical point of view using several examples at different scales (from pore network to regional aquifer).
Statistical power analysis of cardiovascular safety pharmacology studies in conscious rats.
Bhatt, Siddhartha; Li, Dingzhou; Flynn, Declan; Wisialowski, Todd; Hemkens, Michelle; Steidl-Nichols, Jill
2016-01-01
Cardiovascular (CV) toxicity and related attrition are a major challenge for novel therapeutic entities and identifying CV liability early is critical for effective derisking. CV safety pharmacology studies in rats are a valuable tool for early investigation of CV risk. Thorough understanding of data analysis techniques and statistical power of these studies is currently lacking and is imperative for enabling sound decision-making. Data from 24 crossover and 12 parallel design CV telemetry rat studies were used for statistical power calculations. Average values of telemetry parameters (heart rate, blood pressure, body temperature, and activity) were logged every 60s (from 1h predose to 24h post-dose) and reduced to 15min mean values. These data were subsequently binned into super intervals for statistical analysis. A repeated measure analysis of variance was used for statistical analysis of crossover studies and a repeated measure analysis of covariance was used for parallel studies. Statistical power analysis was performed to generate power curves and establish relationships between detectable CV (blood pressure and heart rate) changes and statistical power. Additionally, data from a crossover CV study with phentolamine at 4, 20 and 100mg/kg are reported as a representative example of data analysis methods. Phentolamine produced a CV profile characteristic of alpha adrenergic receptor antagonism, evidenced by a dose-dependent decrease in blood pressure and reflex tachycardia. Detectable blood pressure changes at 80% statistical power for crossover studies (n=8) were 4-5mmHg. For parallel studies (n=8), detectable changes at 80% power were 6-7mmHg. Detectable heart rate changes for both study designs were 20-22bpm. Based on our results, the conscious rat CV model is a sensitive tool to detect and mitigate CV risk in early safety studies. Furthermore, these results will enable informed selection of appropriate models and study design for early stage CV studies. 
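To make the link between sample size, variability and detectable change concrete, the sketch below computes the smallest difference detectable at a given power under a two-sided normal approximation. It is a simplification and not the repeated-measures ANOVA/ANCOVA power analysis used in the study; the function name and the standard deviation of 4.0 mmHg are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(sd, n, alpha=0.05, power=0.80, paired=True):
    """Smallest mean difference detectable with the requested power.

    Normal approximation, two-sided test.
    paired=True : sd is the SD of within-animal differences, n animals.
    paired=False: sd is the common per-group SD, n animals per group.
    """
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    standard_error = sd / sqrt(n) if paired else sd * sqrt(2.0 / n)
    return z_total * standard_error


# Hypothetical SD of 4.0 mmHg with n = 8, as in the study designs.
d_crossover = detectable_difference(sd=4.0, n=8, paired=True)
d_parallel = detectable_difference(sd=4.0, n=8, paired=False)
print(round(d_crossover, 2), round(d_parallel, 2))
```

For the same SD and group size, the parallel design needs a larger true effect than the crossover design in this approximation, which is consistent in direction with the abstract's 4-5 mmHg versus 6-7 mmHg figures.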
Copyright © 2016 Elsevier Inc. All rights reserved.
The ADE scorecards: a tool for adverse drug event detection in electronic health records.
Chazard, Emmanuel; Băceanu, Adrian; Ferret, Laurie; Ficheur, Grégoire
2011-01-01
Although several methods exist for detecting adverse drug events (ADEs) in past hospitalizations, no tool yet exists to display those ADEs to physicians. This article presents the ADE Scorecards, a Web tool for screening past hospitalizations extracted from Electronic Health Records (EHR) using a set of ADE detection rules, presently rules discovered by data mining. The tool enables physicians to (1) get contextualized statistics about the ADEs that happen in their medical department, (2) see the rules that are useful in their department, i.e. the rules that could have helped prevent those ADEs, and (3) review the ADE cases in detail, through a comprehensive interface displaying the diagnoses, procedures, lab results, administered drugs and anonymized records. The article demonstrates the tool through a use case.
VALUE - A Framework to Validate Downscaling Approaches for Climate Change Studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilke, Renate A. I.
2015-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. Here, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
VALUE: A framework to validate downscaling approaches for climate change studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.
2015-01-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. In this paper, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
DETECTING UNSPECIFIED STRUCTURE IN LOW-COUNT IMAGES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stein, Nathan M.; Dyk, David A. van; Kashyap, Vinay L.
Unexpected structure in images of astronomical sources often presents itself upon visual inspection of the image, but such apparent structure may either correspond to true features in the source or be due to noise in the data. This paper presents a method for testing whether inferred structure in an image with Poisson noise represents a significant departure from a baseline (null) model of the image. To infer image structure, we conduct a Bayesian analysis of a full model that uses a multiscale component to allow flexible departures from the posited null model. As a test statistic, we use a tail probability of the posterior distribution under the full model. This choice of test statistic allows us to estimate a computationally efficient upper bound on a p-value that enables us to draw strong conclusions even when there are limited computational resources that can be devoted to simulations under the null model. We demonstrate the statistical performance of our method on simulated images. Applying our method to an X-ray image of the quasar 0730+257, we find significant evidence against the null model of a single point source and uniform background, lending support to the claim of an X-ray jet.
Antoszewska-Smith, Joanna; Sarul, Michał; Łyczek, Jan; Konopka, Tomasz; Kawala, Beata
2017-03-01
The aim of this systematic review was to compare the effectiveness of orthodontic miniscrew implants (temporary intraoral skeletal anchorage devices, TISADs) in anchorage reinforcement during en-masse retraction in relation to conventional methods of anchorage. A search of PubMed, Embase, Cochrane Central Register of Controlled Trials, and Web of Science was performed. The keywords were orthodontic, mini-implants, miniscrews, miniplates, and temporary anchorage device. Relevant articles were assessed for quality according to Cochrane guidelines and the data extracted for statistical analysis. A meta-analysis of raw mean differences concerning anchorage loss, tipping of molars, retraction of incisors, tipping of incisors, and treatment duration was carried out. Initially, we retrieved 10,038 articles. The selection process finally resulted in 14 articles including 616 patients (451 female, 165 male) for detailed analysis. Quality of the included studies was assessed as moderate. Meta-analysis showed that use of TISADs facilitates better anchorage reinforcement compared with conventional methods. On average, TISADs enabled 1.86 mm more anchorage preservation than did conventional methods (P < 0.001). The results of the meta-analysis showed that TISADs are more effective than conventional methods of anchorage reinforcement. The average difference of 2 mm seems not only statistically but also clinically significant. However, the results should be interpreted with caution because of the moderate quality of the included studies. More high-quality studies on this issue are necessary to enable drawing more reliable conclusions. Copyright © 2016 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
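A meta-analysis of raw mean differences typically combines per-study estimates with inverse-variance weights. The sketch below shows the fixed-effect version of that calculation. Whether the review used a fixed-effect or random-effects model is not stated in the abstract, and the three study values are made up for illustration.

```python
from math import sqrt


def pooled_mean_difference(studies):
    """Fixed-effect inverse-variance pooling.

    studies: list of (mean_difference, standard_error) tuples.
    Returns (pooled estimate, pooled standard error, 95% interval).
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * md for w, (md, _) in zip(weights, studies)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    interval = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, interval


# Hypothetical anchorage-preservation differences in mm (not from the review).
example_studies = [(2.0, 0.5), (1.5, 0.5), (2.5, 1.0)]
pooled, pooled_se, interval = pooled_mean_difference(example_studies)
print(round(pooled, 3), round(pooled_se, 3), interval)
```

More precise studies (smaller standard errors) receive larger weights, so the pooled estimate sits closer to them.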
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Statistical framework and noise sensitivity of the amplitude radial correlation contrast method.
Kipervaser, Zeev Gideon; Pelled, Galit; Goelman, Gadi
2007-09-01
A statistical framework for the amplitude radial correlation contrast (RCC) method, which integrates a conventional pixel threshold approach with cluster-size statistics, is presented. The RCC method uses functional MRI (fMRI) data to group neighboring voxels in terms of their degree of temporal cross correlation and compares coherences in different brain states (e.g., stimulation OFF vs. ON). By defining the RCC correlation map as the difference between two RCC images, the map distribution of two OFF states is shown to be normal, enabling the definition of the pixel cutoff. The empirical cluster-size null distribution obtained after the application of the pixel cutoff is used to define a cluster-size cutoff that allows 5% false positives. Assuming that the fMRI signal equals the task-induced response plus noise, an analytical expression of amplitude-RCC dependency on noise is obtained and used to define the pixel threshold. In vivo and ex vivo data obtained during rat forepaw electric stimulation are used to fine-tune this threshold. Calculating the spatial coherences within in vivo and ex vivo images shows enhanced coherence in the in vivo data, but no dependency on the anesthesia method, magnetic field strength, or depth of anesthesia, strengthening the generality of the proposed cutoffs. Copyright (c) 2007 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Adams, T.; Batra, P.; Bugel, L.; Camilleri, L.; Conrad, J. M.; de Gouvêa, A.; Fisher, P. H.; Formaggio, J. A.; Jenkins, J.; Karagiorgi, G.; Kobilarcik, T. R.; Kopp, S.; Kyle, G.; Loinaz, W. A.; Mason, D. A.; Milner, R.; Moore, R.; Morfín, J. G.; Nakamura, M.; Naples, D.; Nienaber, P.; Olness, F. I.; Owens, J. F.; Pate, S. F.; Pronin, A.; Seligman, W. G.; Shaevitz, M. H.; Schellman, H.; Schienbein, I.; Syphers, M. J.; Tait, T. M. P.; Takeuchi, T.; Tan, C. Y.; van de Water, R. G.; Yamamoto, R. K.; Yu, J. Y.
We extend the physics case for a new high-energy, ultra-high statistics neutrino scattering experiment, NuSOnG (Neutrino Scattering On Glass) to address a variety of issues including precision QCD measurements, extraction of structure functions, and the derived Parton Distribution Functions (PDF's). This experiment uses a Tevatron-based neutrino beam to obtain a sample of Deep Inelastic Scattering (DIS) events which is over two orders of magnitude larger than past samples. We outline an innovative method for fitting the structure functions using a parametrized energy shift which yields reduced systematic uncertainties. High statistics measurements, in combination with improved systematics, will enable NuSOnG to perform discerning tests of fundamental Standard Model parameters as we search for deviations which may hint of "Beyond the Standard Model" physics.
Radiomic analysis in prediction of Human Papilloma Virus status.
Yu, Kaixian; Zhang, Youyi; Yu, Yang; Huang, Chao; Liu, Rongjie; Li, Tengfei; Yang, Liuqing; Morris, Jeffrey S; Baladandayuthapani, Veerabhadran; Zhu, Hongtu
2017-12-01
Human Papilloma Virus (HPV) has been associated with oropharyngeal cancer prognosis. Traditionally, HPV status is tested through an invasive lab test. Recently, the rapid development of statistical image analysis techniques has enabled precise quantitative analysis of medical images. The quantitative analysis of Computed Tomography (CT) provides a non-invasive way to assess HPV status for oropharynx cancer patients. We designed a statistical radiomics approach analyzing CT images to predict HPV status. Various radiomics features were extracted from CT scans and analyzed using statistical feature selection and prediction methods. Our approach ranked the highest in the 2016 Medical Image Computing and Computer Assisted Intervention (MICCAI) grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction. Further analysis of the most relevant radiomic features distinguishing HPV positive and negative subjects suggested that HPV positive patients usually have smaller and simpler tumors.
Deterministic annealing for density estimation by multivariate normal mixtures
NASA Astrophysics Data System (ADS)
Kloppenburg, Martin; Tavan, Paul
1997-03-01
An approach to maximum-likelihood density estimation by mixtures of multivariate normal distributions for large high-dimensional data sets is presented. Conventionally that problem is tackled by notoriously unstable expectation-maximization (EM) algorithms. We remove these instabilities by the introduction of soft constraints, enabling deterministic annealing. Our developments are motivated by the proof that algorithmically stable fuzzy clustering methods that are derived from statistical physics analogs are special cases of EM procedures.
Power-law statistics of neurophysiological processes analyzed using short signals
NASA Astrophysics Data System (ADS)
Pavlova, Olga N.; Runnova, Anastasiya E.; Pavlov, Alexey N.
2018-04-01
We discuss the problem of quantifying power-law statistics of complex processes from short signals. Based on the analysis of electroencephalograms (EEG), we compare three interrelated approaches which enable characterization of the power spectral density (PSD) and show that an application of the detrended fluctuation analysis (DFA) or the wavelet-transform modulus maxima (WTMM) method represents a useful way of indirect characterization of the PSD features from short data sets. We conclude that although DFA- and WTMM-based measures can be obtained from the estimated PSD, these tools outperform the standard spectral analysis when characterization of the analyzed regime should be provided based on a very limited amount of data.
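For readers unfamiliar with DFA, the following is a minimal sketch of the standard first-order procedure: integrate the mean-removed signal, detrend it linearly in non-overlapping windows, and fit the slope of the fluctuation function on a log-log scale. It is a textbook version written for illustration, not the authors' implementation; the window sizes and the synthetic test signals are assumptions.

```python
import math
import random


def dfa_exponent(signal, scales):
    """Estimate the DFA scaling exponent using linear detrending."""
    mean = sum(signal) / len(signal)
    profile, running = [], 0.0
    for x in signal:  # cumulative sum of the mean-removed signal
        running += x - mean
        profile.append(running)

    log_s, log_f = [], []
    for s in scales:
        n_win = len(profile) // s
        t_mean = (s - 1) / 2.0
        sxx = sum((t - t_mean) ** 2 for t in range(s))
        sq_sum = 0.0
        for w in range(n_win):
            seg = profile[w * s:(w + 1) * s]
            y_mean = sum(seg) / s
            sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
            slope = sxy / sxx  # least-squares line within the window
            sq_sum += sum((y - y_mean - slope * (t - t_mean)) ** 2
                          for t, y in enumerate(seg))
        log_s.append(math.log(s))
        log_f.append(math.log(math.sqrt(sq_sum / (n_win * s))))

    # Slope of log F(s) against log s is the scaling exponent.
    ms = sum(log_s) / len(log_s)
    mf = sum(log_f) / len(log_f)
    num = sum((a - ms) * (b - mf) for a, b in zip(log_s, log_f))
    den = sum((a - ms) ** 2 for a in log_s)
    return num / den


rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # uncorrelated noise
walk, position = [], 0.0
for step in white:  # random walk built from the same noise
    position += step
    walk.append(position)

scales = [16, 32, 64, 128, 256]
alpha_white = dfa_exponent(white, scales)
alpha_walk = dfa_exponent(walk, scales)
print(alpha_white, alpha_walk)
```

In theory the exponent is about 0.5 for uncorrelated noise and about 1.5 for a random walk, so the two test signals should be clearly separated.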
Big Data Analytics for Scanning Transmission Electron Microscopy Ptychography
NASA Astrophysics Data System (ADS)
Jesse, S.; Chi, M.; Belianinov, A.; Beekman, C.; Kalinin, S. V.; Borisevich, A. Y.; Lupini, A. R.
2016-05-01
Electron microscopy is undergoing a transition: from the model of producing only a few micrographs, through the current state where many images and spectra can be digitally recorded, to a new mode where very large volumes of data (movies, ptychographic and multi-dimensional series) can be rapidly obtained. Here, we discuss the application of so-called “big-data” methods to high dimensional microscopy data, using unsupervised multivariate statistical techniques, in order to explore salient image features in a specific example of BiFeO3 domains. Remarkably, k-means clustering reveals domain differentiation despite the fact that the algorithm is purely statistical in nature and does not require any prior information regarding the material, any coexisting phases, or any differentiating structures. While this is a somewhat trivial case, this example signifies the extraction of useful physical and structural information without any prior bias regarding the sample or the instrumental modality. Further interpretation of these types of results may still require human intervention. However, the open nature of this algorithm and its wide availability enable the broad collaborations and exploratory work necessary for efficient data analysis in electron microscopy.
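The point that k-means needs no prior information about the material can be seen from how little the algorithm uses: only distances between feature vectors. The sketch below is a plain Lloyd-style k-means on toy 2-D points. The data, the initial centers and the function name are hypothetical, and real ptychographic data would have far more dimensions.

```python
def kmeans(points, centers, n_iter=20):
    """Plain Lloyd iterations; returns final centers and point labels."""
    def nearest(p, current):
        return min(
            range(len(current)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, current[c])),
        )

    centers = list(centers)
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    labels = [nearest(p, centers) for p in points]
    return centers, labels


# Two well-separated toy groups standing in for two domain types.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, labels)
```

The labels carry no physical meaning by themselves; as the abstract notes, interpreting what each cluster represents may still require human intervention.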
Brain vascular image segmentation based on fuzzy local information C-means clustering
NASA Astrophysics Data System (ADS)
Hu, Chaoen; Liu, Xia; Liang, Xiao; Hui, Hui; Yang, Xin; Tian, Jie
2017-02-01
Light sheet fluorescence microscopy (LSFM) is a powerful optical resolution fluorescence microscopy technique that enables observation of the mouse brain vascular network at cellular resolution. However, micro-vessel structures show intensity inhomogeneity in LSFM images, which complicates the extraction of line structures. In this work, we developed a vascular image segmentation method that enhances vessel details and should be useful for estimating statistics such as micro-vessel density. Since the eigenvalues of the Hessian matrix and their signs describe different geometric structures in images, they make it possible to construct a vascular similarity function and enhance line signals; the main idea of our method is to cluster the pixel values of the enhanced image. Our method contains three steps: 1) calculate the multiscale gradients and the differences between the eigenvalues of the Hessian matrix; 2) to generate the enhanced micro-vessel structures, train a feed-forward neural network on 2.26 million pixels to handle the correlations between the multiscale gradients and the eigenvalue differences; 3) use fuzzy local information c-means clustering (FLICM) to cluster the pixel values of the enhanced line signals. To verify the feasibility and effectiveness of this method, mouse brain vascular images were acquired with a commercial light-sheet microscope in our lab. The segmentation experiment showed that the Dice similarity coefficient can reach up to 85%. The results illustrate that our approach to extracting line structures of blood vessels dramatically improves the vascular image and enables accurate extraction of blood vessels in LSFM images.
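The Dice similarity coefficient used to score the segmentation is twice the overlap between the predicted and reference masks divided by the sum of their sizes. A minimal sketch, assuming masks flattened to lists of 0/1 values (the example masks are made up):

```python
def dice_coefficient(predicted, reference):
    """Dice similarity between two binary masks given as flat 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


predicted_mask = [1, 1, 0, 0, 1, 0]
reference_mask = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(predicted_mask, reference_mask)
print(round(score, 3))
```

A value of 1 means the masks coincide and 0 means they do not overlap at all.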
Unicomb, Rachael; Colyvas, Kim; Harrison, Elisabeth; Hewat, Sally
2015-06-01
Case-study methodology studying change is often used in the field of speech-language pathology, but it can be criticized for not being statistically robust. Yet with the heterogeneous nature of many communication disorders, case studies allow clinicians and researchers to closely observe and report on change. Such information is valuable and can further inform large-scale experimental designs. In this research note, a statistical analysis for case-study data is outlined that employs a modification to the Reliable Change Index (Jacobson & Truax, 1991). The relationship between reliable change and clinical significance is discussed. Example data are used to guide the reader through the use and application of this analysis. A method of analysis is detailed that is suitable for assessing change in measures with binary categorical outcomes. The analysis is illustrated using data from one individual, measured before and after treatment for stuttering. The application of this approach to assess change in categorical, binary data has potential application in speech-language pathology. It enables clinicians and researchers to analyze results from case studies for their statistical and clinical significance. This new method addresses a gap in the research design literature, that is, the lack of analysis methods for noncontinuous data (such as counts, rates, proportions of events) that may be used in case-study designs.
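For orientation, the classic Reliable Change Index of Jacobson & Truax (1991) divides the pre/post difference by the standard error of the difference, which is derived from the measure's standard deviation and test-retest reliability. The sketch below implements that classic form only. The modification for binary categorical outcomes described in the research note is not reproduced, and all numbers are hypothetical.

```python
from math import sqrt


def reliable_change_index(pre, post, sd, reliability):
    """Classic RCI: (post - pre) / S_diff.

    S_diff = sqrt(2 * SE^2), with SE = sd * sqrt(1 - reliability).
    """
    standard_error = sd * sqrt(1.0 - reliability)
    s_diff = sqrt(2.0 * standard_error ** 2)
    return (post - pre) / s_diff


# Hypothetical scores: 40 before and 30 after treatment.
rci = reliable_change_index(pre=40.0, post=30.0, sd=7.5, reliability=0.8)
is_reliable = abs(rci) > 1.96  # conventional 5% two-sided criterion
print(round(rci, 3), is_reliable)
```

An absolute RCI above 1.96 is conventionally read as change unlikely to be due to measurement error alone. Clinical significance is a separate judgment.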
SAFE: SPARQL Federation over RDF Data Cubes with Access Control.
Khan, Yasar; Saleem, Muhammad; Mehdi, Muntazir; Hogan, Aidan; Mehmood, Qaiser; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh
2017-02-01
Several query federation engines have been proposed for accessing public Linked Open Data sources. However, in many domains, resources are sensitive and access to these resources is tightly controlled by stakeholders; consequently, privacy is a major concern when federating queries over such datasets. In the Healthcare and Life Sciences (HCLS) domain, real-world datasets contain sensitive statistical information: strict ownership is granted to individuals working in hospitals, research labs, clinical trial organisers, etc. Therefore, the legal and ethical concerns of (i) preserving the anonymity of patients (or clinical subjects) and (ii) respecting data ownership through access control are key challenges faced by the data analytics community working within the HCLS domain. Likewise, statistical data play a key role in the domain, where the RDF Data Cube Vocabulary has been proposed as a standard format to enable the exchange of such data. However, to the best of our knowledge, no existing approach has looked to optimise federated queries over such statistical data. We present SAFE: a query federation engine that enables policy-aware access to sensitive statistical datasets represented as RDF data cubes. SAFE is designed specifically to query statistical RDF data cubes in a distributed setting, where access control is coupled with source selection, user profiles and their access rights. SAFE proposes a join-aware source selection method that avoids wasteful requests to irrelevant and unauthorised data sources. In order to preserve anonymity and enforce stricter access control, SAFE's indexing system does not hold any data instances; it stores only predicates and endpoints. The resulting data summary has a significantly lower index generation time and size compared to existing engines, which allows for faster updates when sources change.
We validate the performance of the system with experiments over real-world datasets provided by three clinical organisations as well as legacy linked datasets. We show that SAFE enables granular graph-level access control over distributed clinical RDF data cubes and efficiently reduces the source selection and overall query execution time when compared with general-purpose SPARQL query federation engines in the targeted setting.
Sobel, E.; Lange, K.
1996-01-01
The introduction of stochastic methods in pedigree analysis has enabled geneticists to tackle computations intractable by standard deterministic methods. Until now these stochastic techniques have worked by running a Markov chain on the set of genetic descent states of a pedigree. Each descent state specifies the paths of gene flow in the pedigree and the founder alleles dropped down each path. The current paper follows up on a suggestion by Elizabeth Thompson that genetic descent graphs offer a more appropriate space for executing a Markov chain. A descent graph specifies the paths of gene flow but not the particular founder alleles traveling down the paths. This paper explores algorithms for implementing Thompson's suggestion for codominant markers in the context of automatic haplotyping, estimating location scores, and computing gene-clustering statistics for robust linkage analysis. Realistic numerical examples demonstrate the feasibility of the algorithms. PMID:8651310
Composite Bloom Filters for Secure Record Linkage.
Durham, Elizabeth Ashley; Kantarcioglu, Murat; Xue, Yuan; Toth, Csaba; Kuzu, Mehmet; Malin, Bradley
2014-12-01
The process of record linkage seeks to integrate instances that correspond to the same entity. Record linkage has traditionally been performed through the comparison of identifying field values (e.g., Surname), however, when databases are maintained by disparate organizations, the disclosure of such information can breach the privacy of the corresponding individuals. Various private record linkage (PRL) methods have been developed to obscure such identifiers, but they vary widely in their ability to balance competing goals of accuracy, efficiency and security. The tokenization and hashing of field values into Bloom filters (BF) enables greater linkage accuracy and efficiency than other PRL methods, but the encodings may be compromised through frequency-based cryptanalysis. Our objective is to adapt a BF encoding technique to mitigate such attacks with minimal sacrifices in accuracy and efficiency. To accomplish these goals, we introduce a statistically-informed method to generate BF encodings that integrate bits from multiple fields, the frequencies of which are provably associated with a minimum number of fields. Our method enables a user-specified tradeoff between security and accuracy. We compare our encoding method with other techniques using a public dataset of voter registration records and demonstrate that the increases in security come with only minor losses to accuracy.
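The basic field-level Bloom filter encoding that the paper builds on can be sketched as follows: a field value is split into character bigrams, each bigram is hashed several times into a bit array, and two encodings are compared with a Dice score. This shows only the general idea. The composite multi-field encoding proposed by the authors is not reproduced, and the filter length, the number of hash functions, the padding character and the seeded SHA-256 scheme are all assumptions.

```python
import hashlib


def bigrams(value):
    """Character bigrams of a lower-cased, padded field value."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, n_bits=1024, n_hashes=2):
    """Hash every bigram n_hashes times into a bit list of length n_bits."""
    bits = [0] * n_bits
    for gram in bigrams(value):
        for seed in range(n_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            bits[int.from_bytes(digest[:8], "big") % n_bits] = 1
    return bits


def dice_similarity(bits_a, bits_b):
    """Dice score between two bit lists of equal length."""
    common = sum(a & b for a, b in zip(bits_a, bits_b))
    total = sum(bits_a) + sum(bits_b)
    return 2.0 * common / total if total else 1.0


smith = bloom_encode("Smith")
smyth = bloom_encode("Smyth")
jones = bloom_encode("Jones")
print(dice_similarity(smith, smyth), dice_similarity(smith, jones))
```

Similar spellings share most bigrams and therefore most set bits, which is what makes approximate matching possible. The same regularity is what frequency-based cryptanalysis exploits, hence the paper's interest in mixing bits from several fields.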
Composite Bloom Filters for Secure Record Linkage
Durham, Elizabeth Ashley; Kantarcioglu, Murat; Xue, Yuan; Toth, Csaba; Kuzu, Mehmet; Malin, Bradley
2014-01-01
The process of record linkage seeks to integrate instances that correspond to the same entity. Record linkage has traditionally been performed through the comparison of identifying field values (e.g., Surname), however, when databases are maintained by disparate organizations, the disclosure of such information can breach the privacy of the corresponding individuals. Various private record linkage (PRL) methods have been developed to obscure such identifiers, but they vary widely in their ability to balance competing goals of accuracy, efficiency and security. The tokenization and hashing of field values into Bloom filters (BF) enables greater linkage accuracy and efficiency than other PRL methods, but the encodings may be compromised through frequency-based cryptanalysis. Our objective is to adapt a BF encoding technique to mitigate such attacks with minimal sacrifices in accuracy and efficiency. To accomplish these goals, we introduce a statistically-informed method to generate BF encodings that integrate bits from multiple fields, the frequencies of which are provably associated with a minimum number of fields. Our method enables a user-specified tradeoff between security and accuracy. We compare our encoding method with other techniques using a public dataset of voter registration records and demonstrate that the increases in security come with only minor losses to accuracy. PMID:25530689
A Linguistic Truth-Valued Temporal Reasoning Formalism and Its Implementation
NASA Astrophysics Data System (ADS)
Lu, Zhirui; Liu, Jun; Augusto, Juan C.; Wang, Hui
Temporality and uncertainty are important features of many real-world systems. Solving problems in such systems requires the use of formal mechanisms such as logic systems, statistical methods or other reasoning and decision-making methods. In this paper, we propose a linguistic truth-valued temporal reasoning formalism to enable the management of both features concurrently using a linguistic truth-valued logic and a temporal logic. We also provide a backward reasoning algorithm which allows the answering of user queries. A simple but realistic scenario in a smart home application is used to illustrate our work.
Driven-dissipative quantum Monte Carlo method for open quantum systems
NASA Astrophysics Data System (ADS)
Nagy, Alexandra; Savona, Vincenzo
2018-05-01
We develop a real-time full configuration-interaction quantum Monte Carlo approach to model driven-dissipative open quantum systems with Markovian system-bath coupling. The method enables stochastic sampling of the Liouville-von Neumann time evolution of the density matrix thanks to a massively parallel algorithm, thus providing estimates of observables on the nonequilibrium steady state. We present the underlying theory and introduce an initiator technique and importance sampling to reduce the statistical error. Finally, we demonstrate the efficiency of our approach by applying it to the driven-dissipative two-dimensional XYZ spin-1/2 model on a lattice.
General aviation avionics statistics : 1977.
DOT National Transportation Integrated Search
1980-06-01
This report presents avionics statistics for the 1977 general aviation (GA) aircraft fleet and is the fourth in a series. The statistics are presented in a capability group framework which enables one to relate airborne avionics equipment to the capa...
Force and Conductance Spectroscopy of Single Molecule Junctions
NASA Astrophysics Data System (ADS)
Frei, Michael
Investigation of mechanical properties of single molecule junctions is crucial to develop an understanding and enable control of single molecular junctions. This work presents an experimental and analytical approach that enables the statistical evaluation of force and simultaneous conductance data of metallic atomic point contacts and molecular junctions. A conductive atomic force microscope based break junction technique is developed to form single molecular junctions and collect conductance and force data simultaneously. Improvements of the optical components have been achieved through the use of a super-luminescent diode, enabling tremendous increases in force resolution. An experimental procedure to collect data for various molecular junctions has been developed and includes deposition, calibration, and analysis methods. For the statistical analysis of force, novel approaches based on two dimensional histograms and a direct force identification method are presented. The two dimensional method allows for an unbiased evaluation of force events that are identified using corresponding conductance signatures. This is not always possible, however, and in these situations the force based identification of junction rearrangement events is an attractive alternative method. This combined experimental and analytical approach is then applied to three studies: First, the impact of molecular backbones on the mechanical behavior of single molecule junctions is investigated, and it is found that junctions formed with identical linkers but different backbone structures have varying breaking forces. All molecules used show a clear molecular signature and force data can be evaluated using the 2D method. Second, the effects of the linker group used to attach molecules to gold electrodes are investigated.
A study of four alkane molecules with different linkers finds a drastic difference in the evolution of donor-acceptor and covalently bonded molecules respectively. In fact, the covalent bond is found to significantly distort the metal electrode rearrangement such that junction rearrangement events can no longer be identified with a clean and well defined conductance signature. For this case, the force based identification process is used. Third, results for break junction measurements with different metals are presented. It is found that silver and palladium junctions rupture with forces different from those of gold contacts. In the case of silver experiments in ambient conditions, we can also identify oxygen impurities in the silver contact formation process, leading to force and conductance measurements of silver-oxygen structures. For the future, this work provides an experimental and analytical foundation that will enable insights into single molecule systems not previously accessible.
Identifying when weather influences life-history traits of grazing herbivores.
Sims, Michelle; Elston, David A; Larkham, Ann; Nussey, Daniel H; Albon, Steve D
2007-07-01
1. There is increasing evidence that density-independent weather effects influence life-history traits and hence the dynamics of populations of animals. Here, we present a novel statistical approach to estimate when such influences are strongest. The method is demonstrated by analyses investigating the timing of the influence of weather on the birth weight of sheep and deer. 2. The statistical technique allowed for the pattern of temporal correlation in the weather data enabling the effects of weather in many fine-scale time intervals to be investigated simultaneously. Thus, while previous studies have typically considered weather averaged across a single broad time interval during pregnancy, our approach enabled examination simultaneously of the relationships with weekly and fortnightly averages throughout the whole of pregnancy. 3. We detected a positive effect of temperature on the birth weight of deer, which is strongest in late pregnancy (mid-March to mid-April), and a negative effect of rainfall on the birthweight of sheep, which is strongest during mid-pregnancy (late January to early February). The possible mechanisms underlying these weather-birth weight relationships are discussed. 4. This study enhances our insight into the pattern of the timing of influence of weather on early development. The method is of much more general application and could provide valuable insights in other areas of ecology in which sequences of intercorrelated explanatory variables have been collected in space or in time.
Teaching Statistics to Social Science Students: Making It Valuable
ERIC Educational Resources Information Center
North, D.; Zewotir, T.
2006-01-01
In this age of rapid information expansion and technology, statistics is playing an ever increasing role in education, particularly also in the training of social scientists. Statistics enables the social scientist to obtain a quantitative awareness of socio-economic phenomena hence is essential in their training. Statistics, however, is becoming…
NASA Astrophysics Data System (ADS)
Lievens, Klaus; Van Nimmen, Katrien; Lombaert, Geert; De Roeck, Guido; Van den Broeck, Peter
2016-09-01
In civil engineering and architecture, the availability of high strength materials and advanced calculation techniques enables the construction of slender footbridges, generally highly sensitive to human-induced excitation. Due to the inherent random character of the human-induced walking load, variability in the pedestrian characteristics must be considered in the response simulation. To assess the vibration serviceability of the footbridge, the statistics of the stochastic dynamic response are evaluated by considering the instantaneous peak responses in a time range. Therefore, a large number of time windows are needed to calculate the mean value and standard deviation of the instantaneous peak values. An alternative method to evaluate the statistics is based on the standard deviation of the response and a characteristic frequency as proposed in wind engineering applications. In this paper, the accuracy of this method is evaluated for human-induced vibrations. The methods are first compared for a group of pedestrians crossing a lightly damped footbridge. Small differences in the instantaneous peak value were found by the method using second order statistics. Afterwards, a tuned mass damper (TMD), tuned to reduce the peak acceleration to a comfort value, was added to the structure. The comparison between both methods is made and the accuracy is verified. It is found that the TMD parameters are tuned sufficiently and good agreement between the two methods is found for the estimation of the instantaneous peak response for a strongly damped structure.
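The alternative method mentioned in this abstract (a peak estimate from the response standard deviation and a characteristic frequency) can be illustrated with a small sketch. It assumes the widely used Davenport peak factor from wind engineering for a stationary Gaussian process; whether the paper uses exactly this expression is an assumption, and the function names and numerical values below are hypothetical.

```python
import math

EULER_GAMMA = 0.5772  # Euler-Mascheroni constant, rounded as in common usage

def davenport_peak_factor(nu_hz, duration_s):
    """Expected peak factor g for a stationary Gaussian process.

    nu_hz      -- characteristic (mean crossing) frequency in Hz
    duration_s -- length of the observation window in seconds
    """
    n_cycles = nu_hz * duration_s
    if n_cycles <= 1.0:
        raise ValueError("nu_hz * duration_s must exceed 1 for the formula to apply")
    root = math.sqrt(2.0 * math.log(n_cycles))
    return root + EULER_GAMMA / root

def expected_peak(sigma, nu_hz, duration_s):
    """Expected peak response = peak factor times the standard deviation."""
    return davenport_peak_factor(nu_hz, duration_s) * sigma

# Hypothetical example: 2 Hz characteristic frequency, 60 s window,
# response standard deviation of 0.15 m/s^2.
g = davenport_peak_factor(2.0, 60.0)
peak = expected_peak(0.15, 2.0, 60.0)
print(f"peak factor {g:.3f}, expected peak {peak:.3f} m/s^2")
```

The appeal of this kind of estimate is that it needs only second-order statistics of one response record, rather than peak values collected from a large number of time windows.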
Catto, James W F; Linkens, Derek A; Abbod, Maysam F; Chen, Minyou; Burton, Julian L; Feeley, Kenneth M; Hamdy, Freddie C
2003-09-15
New techniques for the prediction of tumor behavior are needed, because statistical analysis has a poor accuracy and is not applicable to the individual. Artificial intelligence (AI) may provide these suitable methods. Whereas artificial neural networks (ANN), the best-studied form of AI, have been used successfully, its hidden networks remain an obstacle to its acceptance. Neuro-fuzzy modeling (NFM), another AI method, has a transparent functional layer and is without many of the drawbacks of ANN. We have compared the predictive accuracies of NFM, ANN, and traditional statistical methods, for the behavior of bladder cancer. Experimental molecular biomarkers, including p53 and the mismatch repair proteins, and conventional clinicopathological data were studied in a cohort of 109 patients with bladder cancer. For all three of the methods, models were produced to predict the presence and timing of a tumor relapse. Both methods of AI predicted relapse with an accuracy ranging from 88% to 95%. This was superior to statistical methods (71-77%; P < 0.0006). NFM appeared better than ANN at predicting the timing of relapse (P = 0.073). The use of AI can accurately predict cancer behavior. NFM has a similar or superior predictive accuracy to ANN. However, unlike the impenetrable "black-box" of a neural network, the rules of NFM are transparent, enabling validation from clinical knowledge and the manipulation of input variables to allow exploratory predictions. This technique could be used widely in a variety of areas of medicine.
Evaluating the decision accuracy and speed of clinical data visualizations.
Pieczkiewicz, David S; Finkelstein, Stanley M
2010-01-01
Clinicians face an increasing volume of biomedical data. Assessing the efficacy of systems that enable accurate and timely clinical decision making merits corresponding attention. This paper discusses the multiple-reader multiple-case (MRMC) experimental design and linear mixed models as means of assessing and comparing decision accuracy and latency (time) for decision tasks in which clinician readers must interpret visual displays of data. These experimental and statistical techniques, used extensively in radiology imaging studies, offer a number of practical and analytic advantages over more traditional quantitative methods such as percent-correct measurements and ANOVAs, and are recommended for their statistical efficiency and generalizability. An example analysis using readily available, free, and commercial statistical software is provided as an appendix. While these techniques are not appropriate for all evaluation questions, they can provide a valuable addition to the evaluative toolkit of medical informatics research.
Sempa, Joseph B; Ujeneza, Eva L; Nieuwoudt, Martin
2017-01-01
In Sub-Saharan African (SSA) resource limited settings, Cluster of Differentiation 4 (CD4) counts continue to be used for clinical decision making in antiretroviral therapy (ART). Here, HIV-infected people often remain with CD4 counts <350 cells/μL even after 5 years of viral load suppression. Ongoing immunological monitoring is necessary. Due to varying statistical modeling methods, comparing immune response to ART across different cohorts is difficult. We systematically review such models and detail the similarities, differences and problems. 'Preferred Reporting Items for Systematic Review and Meta-Analyses' guidelines were used. Only studies of immune-response after ART initiation from SSA in adults were included. Data were extracted from each study and tabulated. Outcomes were categorized into 3 groups: 'slope', 'survival', and 'asymptote' models. Wordclouds were drawn wherein the frequency of variables occurring in the reviewed models is indicated by their size and color. 69 covariates were identified in the final models of 35 studies. Effect sizes of covariates were not directly quantitatively comparable in view of the combination of differing variables and scale transformation methods across models. Wordclouds enabled the identification of qualitative and semi-quantitative covariate sets for each outcome category. Comparison across categories identified sex, baseline age, baseline log viral load, baseline CD4, ART initiation regimen and ART duration as a minimal consensus set. Most models were different with respect to covariates included, variable transformations and scales, model assumptions, modelling strategies and reporting methods, even for the same outcomes. To enable comparison across cohorts, statistical models would benefit from the application of more uniform modelling techniques. Historic efforts have produced results that are anecdotal to individual cohorts only. This study was able to define 'prior' knowledge in the Bayesian sense.
Such information has value for prospective modelling efforts.
Salgado-Petinal, Carmen; Lamas, J Pablo; Garcia-Jares, Carmen; Llompart, Maria; Cela, Rafael
2005-07-01
In this paper a solid-phase microextraction-gas chromatography-mass spectrometry (SPME-GC-MS) method is proposed for a rapid analysis of some frequently prescribed selective serotonin re-uptake inhibitors (SSRI)-venlafaxine, fluvoxamine, mirtazapine, fluoxetine, citalopram, and sertraline-in urine samples. The SPME-based method enables simultaneous determination of the target SSRI after simple in-situ derivatization of some of the target compounds. Calibration curves in water and in urine were validated and statistically compared. This revealed the absence of matrix effect and, in consequence, the possibility of quantifying SSRI in urine samples by external water calibration. Intra-day and inter-day precision was satisfactory for all the target compounds (relative standard deviation, RSD, <14%) and the detection limits achieved were <0.4 ng mL(-1) urine. The time required for the SPME step and for GC analysis (30 min each) enables high throughput. The method was applied to real urine samples from different patients being treated with some of these pharmaceuticals. Some SSRI metabolites were also detected and tentatively identified.
New gap-filling and partitioning technique for H2O eddy fluxes measured over forests
NASA Astrophysics Data System (ADS)
Kang, Minseok; Kim, Joon; Malla Thakuri, Bindu; Chun, Junghwa; Cho, Chunho
2018-01-01
The continuous measurement of H2O fluxes using the eddy covariance (EC) technique is still challenging for forests because of large amounts of wet canopy evaporation (EWC), which occur during and following rain events when the EC systems rarely work correctly. We propose a new gap-filling and partitioning technique for the H2O fluxes: a model-statistics hybrid (MSH) method. It enables the recovery of the missing EWC in the traditional gap-filling method and the partitioning of the evapotranspiration (ET) into transpiration and (wet canopy) evaporation. We tested and validated the new method using the data sets from two flux towers, which are located at forests in hilly and complex terrains. The MSH reasonably recovered the missing EWC of 16-41 mm yr-1 and separated it from the ET (14-23 % of the annual ET). Additionally, we illustrated certain advantages of the proposed technique which enable us to understand better how ET responds to environmental changes and how the water cycle is connected to the carbon cycle in a forest ecosystem.
Emerging technologies for pediatric and adult trauma care.
Moulton, Steven L; Haley-Andrews, Stephanie; Mulligan, Jane
2010-06-01
Current Emergency Medical Service protocols rely on provider-directed care for evaluation, management and triage of injured patients from the field to a trauma center. New methods to quickly diagnose, support and coordinate the movement of trauma patients from the field to the most appropriate trauma center are in development. These methods will enhance trauma care and promote trauma system development. Recent advances in machine learning, statistical methods, device integration and wireless communication are giving rise to new methods for vital sign data analysis and a new generation of transport monitors. These monitors will collect and synchronize exponentially growing amounts of vital sign data with electronic patient care information. The application of advanced statistical methods to these complex clinical data sets has the potential to reveal many important physiological relationships and treatment effects. Several emerging technologies are converging to yield a new generation of smart sensors and tightly integrated transport monitors. These technologies will assist prehospital providers in quickly identifying and triaging the most severely injured children and adults to the most appropriate trauma centers. They will enable the development of real-time clinical support systems of increasing complexity, able to provide timelier, more cost-effective, autonomous care.
Hefron, Ryan; Borghetti, Brett; Schubert Kabban, Christine; Christensen, James; Estepp, Justin
2018-04-26
Applying deep learning methods to electroencephalograph (EEG) data for cognitive state assessment has yielded improvements over previous modeling methods. However, research focused on cross-participant cognitive workload modeling using these techniques is underrepresented. We study the problem of cross-participant state estimation in a non-stimulus-locked task environment, where a trained model is used to make workload estimates on a new participant who is not represented in the training set. Using experimental data from the Multi-Attribute Task Battery (MATB) environment, a variety of deep neural network models are evaluated in the trade-space of computational efficiency, model accuracy, variance and temporal specificity yielding three important contributions: (1) The performance of ensembles of individually-trained models is statistically indistinguishable from group-trained methods at most sequence lengths. These ensembles can be trained for a fraction of the computational cost compared to group-trained methods and enable simpler model updates. (2) While increasing temporal sequence length improves mean accuracy, it is not sufficient to overcome distributional dissimilarities between individuals’ EEG data, as it results in statistically significant increases in cross-participant variance. (3) Compared to all other networks evaluated, a novel convolutional-recurrent model using multi-path subnetworks and bi-directional, residual recurrent layers resulted in statistically significant increases in predictive accuracy and decreases in cross-participant variance.
Hefron, Ryan; Borghetti, Brett; Schubert Kabban, Christine; Christensen, James; Estepp, Justin
2018-01-01
Applying deep learning methods to electroencephalograph (EEG) data for cognitive state assessment has yielded improvements over previous modeling methods. However, research focused on cross-participant cognitive workload modeling using these techniques is underrepresented. We study the problem of cross-participant state estimation in a non-stimulus-locked task environment, where a trained model is used to make workload estimates on a new participant who is not represented in the training set. Using experimental data from the Multi-Attribute Task Battery (MATB) environment, a variety of deep neural network models are evaluated in the trade-space of computational efficiency, model accuracy, variance and temporal specificity yielding three important contributions: (1) The performance of ensembles of individually-trained models is statistically indistinguishable from group-trained methods at most sequence lengths. These ensembles can be trained for a fraction of the computational cost compared to group-trained methods and enable simpler model updates. (2) While increasing temporal sequence length improves mean accuracy, it is not sufficient to overcome distributional dissimilarities between individuals’ EEG data, as it results in statistically significant increases in cross-participant variance. (3) Compared to all other networks evaluated, a novel convolutional-recurrent model using multi-path subnetworks and bi-directional, residual recurrent layers resulted in statistically significant increases in predictive accuracy and decreases in cross-participant variance. PMID:29701668
Laksmana, F L; Van Vliet, L J; Hartman Kok, P J A; Vromans, H; Frijlink, H W; Van der Voort Maarschalk, K
2009-04-01
This study aims to develop a characterization method for coating structure based on image analysis, which is particularly promising for the rational design of coated particles in the pharmaceutical industry. The method applies the MATLAB image processing toolbox to images of coated particles taken with Confocal Laser Scanning Microscopy (CSLM). The coating thicknesses have been determined along the particle perimeter, from which a statistical analysis could be performed to obtain relevant thickness properties, e.g. the minimum coating thickness and the span of the thickness distribution. The characterization of the pore structure involved a proper segmentation of pores from the coating and a granulometry operation. The presented method facilitates the quantification of porosity, thickness and pore size distribution of a coating. These parameters are considered the important coating properties, which are critical to coating functionality. Additionally, the effect of the coating process variations on coating quality can be assessed straightforwardly. By enabling a good characterization of the coating qualities, the presented method can be used as a fast and effective tool to predict coating functionality. This approach also enables the influence of different process conditions on coating properties to be effectively monitored, which ultimately leads to process tailoring.
NASA Astrophysics Data System (ADS)
Mengis, Nadine; Keller, David P.; Oschlies, Andreas
2018-01-01
This study introduces the Systematic Correlation Matrix Evaluation (SCoMaE) method, a bottom-up approach which combines expert judgment and statistical information to systematically select transparent, nonredundant indicators for a comprehensive assessment of the state of the Earth system. The method consists of two basic steps: (1) the calculation of a correlation matrix among variables relevant for a given research question and (2) the systematic evaluation of the matrix, to identify clusters of variables with similar behavior and respective mutually independent indicators. Optional further analysis steps include (3) the interpretation of the identified clusters, enabling a learning effect from the selection of indicators, (4) testing the robustness of identified clusters with respect to changes in forcing or boundary conditions, (5) enabling a comparative assessment of varying scenarios by constructing and evaluating a common correlation matrix, and (6) the inclusion of expert judgment, for example, to prescribe indicators, to allow for considerations other than statistical consistency. The example application of the SCoMaE method to Earth system model output forced by different CO2 emission scenarios reveals the necessity of reevaluating indicators identified in a historical scenario simulation for an accurate assessment of an intermediate-high, as well as a business-as-usual, climate change scenario simulation. This necessity arises from changes in prevailing correlations in the Earth system under varying climate forcing. For a comparative assessment of the three climate change scenarios, we construct and evaluate a common correlation matrix, in which we identify robust correlations between variables across the three considered scenarios.
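The two basic steps named in this abstract (build a correlation matrix, then evaluate it to find clusters and one indicator per cluster) can be sketched as follows. This is only an illustrative greedy procedure under stated assumptions, not the published SCoMaE algorithm: the 0.8 threshold, the "most strongly correlated partners first" rule, the function names and the toy data are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_indicators(series, threshold=0.8):
    """Greedy sketch: returns a list of (indicator, cluster_members) pairs.

    series    -- dict mapping variable name to a list of values
    threshold -- assumed cutoff on |r| for "similar behavior"
    """
    names = list(series)
    # Step 1: correlation matrix among all variables.
    corr = {(a, b): pearson(series[a], series[b]) for a in names for b in names}
    # Step 2: repeatedly pick the variable with the most strong partners,
    # declare it the indicator of that cluster, and remove the cluster.
    remaining = set(names)
    result = []
    while remaining:
        def cluster_of(v):
            return {w for w in remaining if abs(corr[(v, w)]) >= threshold}
        best = max(sorted(remaining), key=lambda v: len(cluster_of(v)))
        members = cluster_of(best)
        result.append((best, sorted(members)))
        remaining -= members
    return result

# Toy data (hypothetical, not model output).
series = {
    "temp": [1, 2, 3, 4, 5],
    "sea_level": [2.1, 3.9, 6.2, 8.0, 9.9],
    "ice": [5, 4, 3, 2, 1],
    "precip": [3, 1, 4, 1, 5],
}
result = select_indicators(series)
for indicator, members in result:
    print(indicator, "represents", members)
```

Ties are broken alphabetically here, which is exactly the kind of arbitrary choice that the optional expert-judgment step in the abstract would be expected to override.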
Empirical intrinsic geometry for nonlinear modeling and time series filtering.
Talmon, Ronen; Coifman, Ronald R
2013-07-30
In this paper, we present a method for time series analysis based on empirical intrinsic geometry (EIG). EIG enables one to reveal the low-dimensional parametric manifold as well as to infer the underlying dynamics of high-dimensional time series. By incorporating concepts of information geometry, this method extends existing geometric analysis tools to support stochastic settings and parametrizes the geometry of empirical distributions. However, the statistical models are not required as priors; hence, EIG may be applied to a wide range of real signals without existing definitive models. We show that the inferred model is noise-resilient and invariant under different observation and instrumental modalities. In addition, we show that it can be extended efficiently to newly acquired measurements in a sequential manner. These two advantages enable us to revisit the Bayesian approach and incorporate empirical dynamics and intrinsic geometry into a nonlinear filtering framework. We show applications to nonlinear and non-Gaussian tracking problems as well as to acoustic signal localization.
Biosynthesis and genetic encoding of phosphothreonine through parallel selection and deep sequencing
Huguenin-Dezot, Nicolas; Liang, Alexandria D.; Schmied, Wolfgang H.; Rogerson, Daniel T.; Chin, Jason W.
2017-01-01
The phosphorylation of threonine residues in proteins regulates diverse processes in eukaryotic cells, and thousands of threonine phosphorylations have been identified. An understanding of how threonine phosphorylation regulates biological function will be accelerated by general methods to bio-synthesize defined phospho-proteins. Here we address limitations in current methods for discovering aminoacyl-tRNA synthetase/tRNA pairs for incorporating non-natural amino acids into proteins, by combining parallel positive selections with deep sequencing and statistical analysis, to create a rapid approach for directly discovering aminoacyl-tRNA synthetase/tRNA pairs that selectively incorporate non-natural substrates. Our approach is scalable and enables the direct discovery of aminoacyl-tRNA synthetase/tRNA pairs with mutually orthogonal substrate specificity. We biosynthesize phosphothreonine in cells, and use our new selection approach to discover a phosphothreonyl-tRNA synthetase/tRNACUA pair. By combining these advances we create an entirely biosynthetic route to incorporating phosphothreonine in proteins and biosynthesize several phosphoproteins; enabling phosphoprotein structure determination and synthetic protein kinase activation. PMID:28553966
Infrared thermography for wood density estimation
NASA Astrophysics Data System (ADS)
López, Gamaliel; Basterra, Luis-Alfonso; Acuña, Luis
2018-03-01
Infrared thermography (IRT) is becoming a commonly used technique to non-destructively inspect and evaluate wood structures. Based on the radiation emitted by all objects, this technique enables the remote visualization of the surface temperature without making contact using a thermographic device. The process of transforming radiant energy into temperature depends on many parameters, and interpreting the results is usually complicated. However, some works have analyzed the operation of IRT and expanded its applications, as found in the latest literature. This work analyzes the effect of density on the thermodynamic behavior of timber to be determined by IRT. The cooling of various wood samples has been registered, and a statistical procedure that enables one to quantitatively estimate the density of timber has been designed. This procedure represents a new method to physically characterize this material.
NASA Astrophysics Data System (ADS)
Abid, Najmul; Mirkhalaf, Mohammad; Barthelat, Francois
2018-03-01
Natural materials such as nacre, collagen, and spider silk are composed of staggered stiff and strong inclusions in a softer matrix. This type of hybrid microstructure results in remarkable combinations of stiffness, strength, and toughness and it now inspires novel classes of high-performance composites. However, the analytical and numerical approaches used to predict and optimize the mechanics of staggered composites often neglect statistical variations and inhomogeneities, which may have significant impacts on modulus, strength, and toughness. Here we present an analysis of localization using small representative volume elements (RVEs) and large scale statistical volume elements (SVEs) based on the discrete element method (DEM). DEM is an efficient numerical method which enabled the evaluation of more than 10,000 microstructures in this study, each including about 5,000 inclusions. The models explore the combined effects of statistics, inclusion arrangement, and interface properties. We find that statistical variations have a negative effect on all properties, in particular on the ductility and energy absorption because randomness precipitates the localization of deformations. However, the results also show that the negative effects of random microstructures can be offset by interfaces with large strain at failure accompanied by strain hardening. More specifically, this quantitative study reveals an optimal range of interface properties where the interfaces are the most effective at delaying localization. These findings show how carefully designed interfaces in bioinspired staggered composites can offset the negative effects of microstructural randomness, which is inherent to most current fabrication methods.
NASA Astrophysics Data System (ADS)
Kaleva Oikarinen, Juho; Järvelä, Sanna; Kaasila, Raimo
2014-04-01
This design-based research project focuses on documenting statistical learning among 16-17-year-old Finnish upper secondary school students (N = 78) in a computer-supported collaborative learning (CSCL) environment. One novel value of this study is in reporting the shift from teacher-led mathematical teaching to autonomous small-group learning in statistics. The main aim of this study is to examine how student collaboration occurs in learning statistics in a CSCL environment. The data include material from videotaped classroom observations and the researcher's notes. In this paper, the inter-subjective phenomena of students' interactions in a CSCL environment are analysed by using a contact summary sheet (CSS). The development of the multi-dimensional coding procedure of the CSS instrument is presented. Aptly selected video episodes were transcribed and coded in terms of conversational acts, which were divided into non-task-related and task-related categories to depict students' levels of collaboration. The results show that collaborative learning (CL) can facilitate cohesion and responsibility and reduce students' feelings of detachment in our classless, periodic school system. The interactive .pdf material and collaboration in small groups enable statistical learning. It is concluded that CSCL is one possible method of promoting statistical teaching. CL using interactive materials seems to foster and facilitate statistical learning processes.
Titrimetric and photometric methods for determination of hypochlorite in commercial bleaches.
Jonnalagadda, Sreekanth B; Gengan, Prabhashini
2010-01-01
Two methods, a simple titration and a photometric method, are developed for the determination of hypochlorite, based on its reaction with hydrogen peroxide and titration of the residual peroxide by acidic permanganate. In the titration method, the residual hydrogen peroxide is estimated by titration with standard permanganate solution to estimate the hypochlorite concentration. The photometric method is devised to measure the concentration of remaining permanganate, after the reaction with residual hydrogen peroxide. It employs 4 ranges of calibration curves to enable the determination of hypochlorite accurately. The new photometric method measures hypochlorite in the range 1.90 x 10(-3) to 1.90 x 10(-2) M, with high accuracy and with low variance. The concentrations of hypochlorite in diverse commercial bleach samples and in seawater which is enriched with hypochlorite were estimated using the proposed method and compared with the arsenite method. The statistical analysis validates the superiority of the proposed method.
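The arithmetic behind the titration method can be sketched as a back-titration calculation. It assumes the standard stoichiometries (hypochlorite reacts 1:1 with hydrogen peroxide; in acid, 2 mol of permanganate consume 5 mol of peroxide); the function name, argument names and example volumes are hypothetical and are not taken from the paper.

```python
def hypochlorite_molarity(sample_ml, h2o2_molar, h2o2_ml, kmno4_molar, kmno4_ml):
    """Back-titration estimate of hypochlorite concentration in mol/L.

    Assumed reactions:
      OCl- + H2O2 -> Cl- + H2O + O2                        (1 : 1)
      2 MnO4- + 5 H2O2 + 6 H+ -> 2 Mn2+ + 5 O2 + 8 H2O      (2 : 5)
    """
    h2o2_added_mol = h2o2_molar * h2o2_ml / 1000.0
    # Each mole of permanganate at the endpoint accounts for 2.5 mol of H2O2.
    h2o2_residual_mol = 2.5 * kmno4_molar * kmno4_ml / 1000.0
    ocl_mol = h2o2_added_mol - h2o2_residual_mol
    if ocl_mol < 0:
        raise ValueError("titre implies more residual peroxide than was added")
    return ocl_mol / (sample_ml / 1000.0)

# Hypothetical example: 10 mL sample, 10 mL of 0.05 M H2O2 added in excess,
# residual peroxide titrated with 6 mL of 0.02 M KMnO4.
conc = hypochlorite_molarity(10.0, 0.05, 10.0, 0.02, 6.0)
print(f"{conc:.4f} M hypochlorite")
```

In this made-up example, 0.5 mmol of peroxide is added, 0.3 mmol is found unreacted, so 0.2 mmol reacted with hypochlorite in the 10 mL sample.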
Analyzing longitudinal data with the linear mixed models procedure in SPSS.
West, Brady T
2009-09-01
Many applied researchers analyzing longitudinal data share a common misconception: that specialized statistical software is necessary to fit hierarchical linear models (also known as linear mixed models [LMMs], or multilevel models) to longitudinal data sets. Although several specialized statistical software programs of high quality are available that allow researchers to fit these models to longitudinal data sets (e.g., HLM), rapid advances in general purpose statistical software packages have recently enabled analysts to fit these same models when using preferred packages that also enable other more common analyses. One of these general purpose statistical packages is SPSS, which includes a very flexible and powerful procedure for fitting LMMs to longitudinal data sets with continuous outcomes. This article aims to present readers with a practical discussion of how to analyze longitudinal data using the LMMs procedure in the SPSS statistical software package.
Comparative study of signalling methods for high-speed backplane transceiver
NASA Astrophysics Data System (ADS)
Wu, Kejun
2017-11-01
A combined analysis of transient simulation and statistical method is proposed for comparative study of signalling methods applied to high-speed backplane transceivers. This method enables fast and accurate signal-to-noise ratio and symbol error rate estimation of a serial link based on a four-dimension design space, including channel characteristics, noise scenarios, equalisation schemes, and signalling methods. The proposed combined analysis method chooses an efficient sampling size for performance evaluation. A comparative study of non-return-to-zero (NRZ), PAM-4, and four-phase shifted sinusoid symbol (PSS-4) using parameterised behaviour-level simulation shows that PAM-4 and PSS-4 have substantial advantages over conventional NRZ in most cases. A comparison between PAM-4 and PSS-4 shows that PAM-4 suffers significant bit error rate degradation when the noise level increases.
NASA Technical Reports Server (NTRS)
Chilingaryan, A. A.; Galfayan, S. K.; Zazyan, M. Z.; Dunaevsky, A. M.
1985-01-01
Nonparametric statistical methods are used to carry out the quantitative comparison of the model and the experimental data. The same methods enable one to select the events initiated by the heavy nuclei and to calculate the portion of the corresponding events. For this purpose it is necessary to have the data on artificial events describing the experiment sufficiently well established. At present, the model with the small scaling violation in the fragmentation region is the closest to the experiments. Therefore, the treatment of gamma families obtained in the Pamir experiment is being carried out at present with the application of these models.
On estimation of secret message length in LSB steganography in spatial domain
NASA Astrophysics Data System (ADS)
Fridrich, Jessica; Goljan, Miroslav
2004-06-01
In this paper, we present a new method for estimating the secret message length of bit-streams embedded using the Least Significant Bit embedding (LSB) at random pixel positions. We introduce the concept of a weighted stego image and then formulate the problem of determining the unknown message length as a simple optimization problem. The methodology is further refined to obtain more stable and accurate results for a wide spectrum of natural images. One of the advantages of the new method is its modular structure and a clean mathematical derivation that enables elegant estimator accuracy analysis using statistical image models.
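To make the embedding model in the abstract above concrete, here is a minimal sketch of LSB embedding at pseudo-random pixel positions. It illustrates only the embedding operation whose message length the paper estimates; it is not the weighted stego image estimator itself. The function names, the flat list-of-integers image representation, and the use of a shared seed as the key are assumptions made for illustration.

```python
import random


def embed_lsb(pixels, bits, seed=0):
    """Write each message bit into the least significant bit of a
    pseudo-randomly chosen pixel (hypothetical helper, not from the paper)."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover image")
    stego = list(pixels)
    rng = random.Random(seed)  # the seed plays the role of a shared key
    positions = rng.sample(range(len(pixels)), len(bits))
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit  # clear the LSB, then set it
    return stego


def extract_lsb(stego, n_bits, seed=0):
    """Regenerate the same positions from the seed and read the LSBs back."""
    rng = random.Random(seed)
    positions = rng.sample(range(len(stego)), n_bits)
    return [stego[pos] & 1 for pos in positions]


cover = list(range(100, 164))           # toy 64-pixel "image"
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, message, seed=42)
recovered = extract_lsb(stego, len(message), seed=42)
```

Each pixel changes by at most one grey level, and on average only about half of the visited pixels change at all, which is why the message length has to be estimated statistically rather than read off directly.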
Dynamic whole body PET parametric imaging: II. Task-oriented statistical estimation
Karakatsanis, Nicolas A.; Lodge, Martin A.; Zhou, Y.; Wahl, Richard L.; Rahmim, Arman
2013-01-01
In the context of oncology, dynamic PET imaging coupled with standard graphical linear analysis has been previously employed to enable quantitative estimation of tracer kinetic parameters of physiological interest at the voxel level, thus, enabling quantitative PET parametric imaging. However, dynamic PET acquisition protocols have been confined to the limited axial field-of-view (~15–20cm) of a single bed position and have not been translated to the whole-body clinical imaging domain. On the contrary, standardized uptake value (SUV) PET imaging, considered as the routine approach in clinical oncology, commonly involves multi-bed acquisitions, but is performed statically, thus not allowing for dynamic tracking of the tracer distribution. Here, we pursue a transition to dynamic whole body PET parametric imaging, by presenting, within a unified framework, clinically feasible multi-bed dynamic PET acquisition protocols and parametric imaging methods. In a companion study, we presented a novel clinically feasible dynamic (4D) multi-bed PET acquisition protocol as well as the concept of whole body PET parametric imaging employing Patlak ordinary least squares (OLS) regression to estimate the quantitative parameters of tracer uptake rate Ki and total blood distribution volume V. In the present study, we propose an advanced hybrid linear regression framework, driven by Patlak kinetic voxel correlations, to achieve superior trade-off between contrast-to-noise ratio (CNR) and mean squared error (MSE) than provided by OLS for the final Ki parametric images, enabling task-based performance optimization. Overall, whether the observer's task is to detect a tumor or quantitatively assess treatment response, the proposed statistical estimation framework can be adapted to satisfy the specific task performance criteria, by adjusting the Patlak correlation-coefficient (WR) reference value. 
The multi-bed dynamic acquisition protocol, as optimized in the preceding companion study, was employed along with extensive Monte Carlo simulations and an initial clinical FDG patient dataset to validate and demonstrate the potential of the proposed statistical estimation methods. Both simulated and clinical results suggest that hybrid regression in the context of whole-body Patlak Ki imaging considerably reduces MSE without compromising high CNR. Alternatively, for a given CNR, hybrid regression enables larger reductions than OLS in the number of dynamic frames per bed, allowing for even shorter acquisitions of ~30min, thus further contributing to the clinical adoption of the proposed framework. Compared to the SUV approach, whole body parametric imaging can provide better tumor quantification, and can act as a complement to SUV, for the task of tumor detection. PMID:24080994
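The Patlak OLS baseline mentioned in the abstract above can be sketched as follows, assuming the standard Patlak model C_T(t)/C_p(t) = Ki * (integral of C_p from 0 to t)/C_p(t) + V for a single voxel. This is only the ordinary least squares fit, not the hybrid regression the authors propose. The function names, the trapezoidal integration, and the synthetic input values are assumptions for illustration, and the first time point is assumed to coincide with tracer injection.

```python
def cumulative_trapezoid(times, values):
    """Running trapezoidal integral of values over times, starting at 0."""
    integral = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        integral.append(integral[-1] + 0.5 * dt * (values[i] + values[i - 1]))
    return integral


def patlak_ols(times, plasma, tissue):
    """Fit y = Ki * x + V by ordinary least squares, where
    x = integral(plasma) / plasma and y = tissue / plasma."""
    integ = cumulative_trapezoid(times, plasma)
    x = [s / cp for s, cp in zip(integ, plasma)]
    y = [ct / cp for ct, cp in zip(tissue, plasma)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ki = sxy / sxx              # slope: tracer uptake rate
    v = mean_y - ki * mean_x    # intercept: distribution volume
    return ki, v


# Synthetic, noise-free example with made-up values.
times = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
plasma = [10.0, 8.0, 6.0, 5.0, 4.0, 3.5]
integ = cumulative_trapezoid(times, plasma)
tissue = [0.05 * s + 0.4 * cp for s, cp in zip(integ, plasma)]
ki, v = patlak_ols(times, plasma, tissue)
```

In practice the fit is usually restricted to late frames, after the reversible compartments have equilibrated; the sketch fits all supplied frames and leaves that selection to the caller.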
Dynamic whole-body PET parametric imaging: II. Task-oriented statistical estimation.
Karakatsanis, Nicolas A; Lodge, Martin A; Zhou, Y; Wahl, Richard L; Rahmim, Arman
2013-10-21
In the context of oncology, dynamic PET imaging coupled with standard graphical linear analysis has been previously employed to enable quantitative estimation of tracer kinetic parameters of physiological interest at the voxel level, thus, enabling quantitative PET parametric imaging. However, dynamic PET acquisition protocols have been confined to the limited axial field-of-view (~15-20 cm) of a single-bed position and have not been translated to the whole-body clinical imaging domain. On the contrary, standardized uptake value (SUV) PET imaging, considered as the routine approach in clinical oncology, commonly involves multi-bed acquisitions, but is performed statically, thus not allowing for dynamic tracking of the tracer distribution. Here, we pursue a transition to dynamic whole-body PET parametric imaging, by presenting, within a unified framework, clinically feasible multi-bed dynamic PET acquisition protocols and parametric imaging methods. In a companion study, we presented a novel clinically feasible dynamic (4D) multi-bed PET acquisition protocol as well as the concept of whole-body PET parametric imaging employing Patlak ordinary least squares (OLS) regression to estimate the quantitative parameters of tracer uptake rate Ki and total blood distribution volume V. In the present study, we propose an advanced hybrid linear regression framework, driven by Patlak kinetic voxel correlations, to achieve superior trade-off between contrast-to-noise ratio (CNR) and mean squared error (MSE) than provided by OLS for the final Ki parametric images, enabling task-based performance optimization. Overall, whether the observer's task is to detect a tumor or quantitatively assess treatment response, the proposed statistical estimation framework can be adapted to satisfy the specific task performance criteria, by adjusting the Patlak correlation-coefficient (WR) reference value. 
The multi-bed dynamic acquisition protocol, as optimized in the preceding companion study, was employed along with extensive Monte Carlo simulations and an initial clinical (18)F-deoxyglucose patient dataset to validate and demonstrate the potential of the proposed statistical estimation methods. Both simulated and clinical results suggest that hybrid regression in the context of whole-body Patlak Ki imaging considerably reduces MSE without compromising high CNR. Alternatively, for a given CNR, hybrid regression enables larger reductions than OLS in the number of dynamic frames per bed, allowing for even shorter acquisitions of ~30 min, thus further contributing to the clinical adoption of the proposed framework. Compared to the SUV approach, whole-body parametric imaging can provide better tumor quantification, and can act as a complement to SUV, for the task of tumor detection.
Mathematical and Statistical Software Index. Final Report.
ERIC Educational Resources Information Center
Black, Doris E., Comp.
Brief descriptions are provided of general-purpose mathematical and statistical software, including 27 "stand-alone" programs, three subroutine systems, and two nationally recognized statistical packages, which are available in the Air Force Human Resources Laboratory (AFHRL) software library. This index was created to enable researchers…
Fostering Students' Statistical Literacy through Significant Learning Experience
ERIC Educational Resources Information Center
Krishnan, Saras
2015-01-01
A major objective of statistics education is to develop students' statistical literacy that enables them to be educated users of data in context. Teaching statistics in today's educational settings is not an easy feat because teachers have a huge task in keeping up with the demands of the new generation of learners. The present day students have…
Wardrop, N. A.; Jochem, W. C.; Bird, T. J.; Chamberlain, H. R.; Clarke, D.; Kerr, D.; Bengtsson, L.; Juran, S.; Seaman, V.; Tatem, A. J.
2018-01-01
Population numbers at local levels are fundamental data for many applications, including the delivery and planning of services, election preparation, and response to disasters. In resource-poor settings, recent and reliable demographic data at subnational scales can often be lacking. National population and housing census data can be outdated, inaccurate, or missing key groups or areas, while registry data are generally lacking or incomplete. Moreover, at local scales accurate boundary data are often limited, and high rates of migration and urban growth make existing data quickly outdated. Here we review past and ongoing work aimed at producing spatially disaggregated local-scale population estimates, and discuss how new technologies are now enabling robust and cost-effective solutions. Recent advances in the availability of detailed satellite imagery, geopositioning tools for field surveys, statistical methods, and computational power are enabling the development and application of approaches that can estimate population distributions at fine spatial scales across entire countries in the absence of census data. We outline the potential of such approaches as well as their limitations, emphasizing the political and operational hurdles for acceptance and sustainable implementation of new approaches, and the continued importance of traditional sources of national statistical data. PMID:29555739
Big Data Analytics for Scanning Transmission Electron Microscopy Ptychography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jesse, S.; Chi, M.; Belianinov, A.
Electron microscopy is undergoing a transition; from the model of producing only a few micrographs, through the current state where many images and spectra can be digitally recorded, to a new mode where very large volumes of data (movies, ptychographic and multi-dimensional series) can be rapidly obtained. In this paper, we discuss the application of so-called “big-data” methods to high dimensional microscopy data, using unsupervised multivariate statistical techniques, in order to explore salient image features in a specific example of BiFeO3 domains. Remarkably, k-means clustering reveals domain differentiation despite the fact that the algorithm is purely statistical in nature and does not require any prior information regarding the material, any coexisting phases, or any differentiating structures. While this is a somewhat trivial case, this example signifies the extraction of useful physical and structural information without any prior bias regarding the sample or the instrumental modality. Further interpretation of these types of results may still require human intervention. Finally, however, the open nature of this algorithm and its wide availability, enable broad collaborations and exploratory work necessary to enable efficient data analysis in electron microscopy.
Big Data Analytics for Scanning Transmission Electron Microscopy Ptychography
Jesse, S.; Chi, M.; Belianinov, A.; ...
2016-05-23
Electron microscopy is undergoing a transition; from the model of producing only a few micrographs, through the current state where many images and spectra can be digitally recorded, to a new mode where very large volumes of data (movies, ptychographic and multi-dimensional series) can be rapidly obtained. In this paper, we discuss the application of so-called “big-data” methods to high dimensional microscopy data, using unsupervised multivariate statistical techniques, in order to explore salient image features in a specific example of BiFeO3 domains. Remarkably, k-means clustering reveals domain differentiation despite the fact that the algorithm is purely statistical in nature and does not require any prior information regarding the material, any coexisting phases, or any differentiating structures. While this is a somewhat trivial case, this example signifies the extraction of useful physical and structural information without any prior bias regarding the sample or the instrumental modality. Further interpretation of these types of results may still require human intervention. Finally, however, the open nature of this algorithm and its wide availability, enable broad collaborations and exploratory work necessary to enable efficient data analysis in electron microscopy.
A Sensor Driven Probabilistic Method for Enabling Hyper Resolution Flood Simulations
NASA Astrophysics Data System (ADS)
Fries, K. J.; Salas, F.; Kerkez, B.
2016-12-01
A reduction in the cost of sensors and wireless communications is now enabling researchers and local governments to make flow, stage and rain measurements at locations that are not covered by existing USGS or state networks. We ask the question: how should these new sources of densified, street-level sensor measurements be used to make improved forecasts using the National Water Model (NWM)? Assimilating these data "into" the NWM can be challenging due to computational complexity, as well as heterogeneity of sensor and other input data. Instead, we introduce a machine learning and statistical framework that layers these data "on top" of the NWM outputs to improve high-resolution hydrologic and hydraulic forecasting. By generalizing our approach into a post-processing framework, a rapidly repeatable blueprint is generated for decision makers who want to improve local forecasts by coupling sensor data with the NWM. We present preliminary results based on case studies in highly instrumented watersheds in the US. Through the use of statistical learning tools and hydrologic routing schemes, we demonstrate the ability of our approach to improve forecasts while simultaneously characterizing bias and uncertainty in the NWM.
Big Data Analytics for Scanning Transmission Electron Microscopy Ptychography
Jesse, S.; Chi, M.; Belianinov, A.; Beekman, C.; Kalinin, S. V.; Borisevich, A. Y.; Lupini, A. R.
2016-01-01
Electron microscopy is undergoing a transition; from the model of producing only a few micrographs, through the current state where many images and spectra can be digitally recorded, to a new mode where very large volumes of data (movies, ptychographic and multi-dimensional series) can be rapidly obtained. Here, we discuss the application of so-called “big-data” methods to high dimensional microscopy data, using unsupervised multivariate statistical techniques, in order to explore salient image features in a specific example of BiFeO3 domains. Remarkably, k-means clustering reveals domain differentiation despite the fact that the algorithm is purely statistical in nature and does not require any prior information regarding the material, any coexisting phases, or any differentiating structures. While this is a somewhat trivial case, this example signifies the extraction of useful physical and structural information without any prior bias regarding the sample or the instrumental modality. Further interpretation of these types of results may still require human intervention. However, the open nature of this algorithm and its wide availability, enable broad collaborations and exploratory work necessary to enable efficient data analysis in electron microscopy. PMID:27211523
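The abstract above notes that k-means needs no prior information about the material. A minimal sketch of the standard Lloyd iteration shows why: it only alternates nearest-centroid assignment and centroid averaging. This is a generic textbook version, not the authors' pipeline. The function name, the tuple-based point representation, and the explicit initial centroids are assumptions for illustration.

```python
def kmeans(points, centroids, n_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its members, until stable."""
    centroids = [tuple(c) for c in centroids]
    k = len(centroids)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Toy feature vectors standing in for per-pixel measurements (made-up values).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The result depends on the initial centroids and on the chosen number of clusters, which is one reason the interpretation step may still require human intervention.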
Dutheil, Julien; Gaillard, Sylvain; Bazin, Eric; Glémin, Sylvain; Ranwez, Vincent; Galtier, Nicolas; Belkhir, Khalid
2006-04-04
A large number of bioinformatics applications in the fields of bio-sequence analysis, molecular evolution and population genetics typically share input/output methods, data storage requirements and data analysis algorithms. Such common features may be conveniently bundled into re-usable libraries, which enable the rapid development of new methods and robust applications. We present Bio++, a set of Object Oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, Markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus. Implementation of methods aims at being both efficient and user-friendly. A special concern was given to the library design to enable easy extension and new methods development. We defined a general hierarchy of classes that allow developers to implement their own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimations for seismic sources and structures in Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khromov, K. Yu.; Vaks, V. G., E-mail: vaks@mbslab.kiae.ru; Zhuravlev, I. A.
2013-02-15
The previously developed ab initio model and the kinetic Monte Carlo method (KMCM) are used to simulate precipitation in a number of iron-copper alloys with different copper concentrations x and temperatures T. The same simulations are also made using an improved version of the previously suggested stochastic statistical method (SSM). The results obtained enable us to make a number of general conclusions about the dependences of the decomposition kinetics in Fe-Cu alloys on x and T. We also show that the SSM usually describes the precipitation kinetics in good agreement with the KMCM, and using the SSM in conjunction with the KMCM allows extending the KMC simulations to the longer evolution times. The results of simulations seem to agree with available experimental data for Fe-Cu alloys within statistical errors of simulations and the scatter of experimental results. Comparison of simulation results with experiments for some multicomponent Fe-Cu-based alloys allows making certain conclusions about the influence of alloying elements in these alloys on the precipitation kinetics at different stages of evolution.
Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods
NASA Technical Reports Server (NTRS)
Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan
2016-01-01
The goal of this work is, through computational simulations, to provide statistically-based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.
Compositional data analysis for physical activity, sedentary time and sleep research.
Dumuid, Dorothea; Stanford, Tyman E; Martin-Fernández, Josep-Antoni; Pedišić, Željko; Maher, Carol A; Lewis, Lucy K; Hron, Karel; Katzmarzyk, Peter T; Chaput, Jean-Philippe; Fogelholm, Mikael; Hu, Gang; Lambert, Estelle V; Maia, José; Sarmiento, Olga L; Standage, Martyn; Barreira, Tiago V; Broyles, Stephanie T; Tudor-Locke, Catrine; Tremblay, Mark S; Olds, Timothy
2017-01-01
The health effects of daily activity behaviours (physical activity, sedentary time and sleep) are widely studied. While previous research has largely examined activity behaviours in isolation, recent studies have adjusted for multiple behaviours. However, the inclusion of all activity behaviours in traditional multivariate analyses has not been possible due to the perfect multicollinearity of 24-h time budget data. The ensuing lack of adjustment for known effects on the outcome undermines the validity of study findings. We describe a statistical approach that enables the inclusion of all daily activity behaviours, based on the principles of compositional data analysis. Using data from the International Study of Childhood Obesity, Lifestyle and the Environment, we demonstrate the application of compositional multiple linear regression to estimate adiposity from children's daily activity behaviours expressed as isometric log-ratio coordinates. We present a novel method for predicting change in a continuous outcome based on relative changes within a composition, and for calculating associated confidence intervals to allow for statistical inference. The compositional data analysis presented overcomes the lack of adjustment that has plagued traditional statistical methods in the field, and provides robust and reliable insights into the health effects of daily activity behaviours.
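The isometric log-ratio (ilr) coordinates mentioned in the abstract above can be sketched with one common choice of basis, the so-called pivot coordinates. The source does not state which ilr basis the authors use, so this choice, the function name, and the example time budget are assumptions for illustration.

```python
import math


def ilr_pivot(parts):
    """Map a D-part composition with strictly positive parts to D-1 ilr
    coordinates. Coordinate i contrasts part i with the geometric mean
    of all parts that come after it."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(len(parts) - 1):
        rest = logs[i + 1:]
        r = len(rest)
        mean_rest = sum(rest) / r  # log of the geometric mean of later parts
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - mean_rest))
    return coords


# Hypothetical 24-h budget in hours: sleep, sedentary, light activity, MVPA.
day = [9.0, 11.0, 3.0, 1.0]
z = ilr_pivot(day)  # three unconstrained coordinates usable as regressors
```

Because the coordinates depend only on ratios between parts, expressing the same day in minutes or as proportions gives identical values. The D-1 coordinates are not bound by the sum-to-24-h constraint, which is what removes the perfect multicollinearity described in the abstract.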
Baseline Assessment and Prioritization Framework for IVHM Integrity Assurance Enabling Capabilities
NASA Technical Reports Server (NTRS)
Cooper, Eric G.; DiVito, Benedetto L.; Jacklin, Stephen A.; Miner, Paul S.
2009-01-01
Fundamental to vehicle health management is the deployment of systems incorporating advanced technologies for predicting and detecting anomalous conditions in highly complex and integrated environments. Integrated structural integrity health monitoring, statistical algorithms for detection, estimation, prediction, and fusion, and diagnosis supporting adaptive control are examples of advanced technologies that present considerable verification and validation challenges. These systems necessitate interactions between physical and software-based systems that are highly networked with sensing and actuation subsystems, and incorporate technologies that are, in many respects, different from those employed in civil aviation today. A formidable barrier to deploying these advanced technologies in civil aviation is the lack of enabling verification and validation tools, methods, and technologies. The development of new verification and validation capabilities will not only enable the fielding of advanced vehicle health management systems, but will also provide new assurance capabilities for verification and validation of current generation aviation software which has been implicated in anomalous in-flight behavior. This paper describes the research focused on enabling capabilities for verification and validation underway within NASA's Integrated Vehicle Health Management project, discusses the state of the art of these capabilities, and includes a framework for prioritizing activities.
SECIMTools: a suite of metabolomics data analysis tools.
Kirpich, Alexander S; Ibarra, Miguel; Moskalenko, Oleksandr; Fear, Justin M; Gerken, Joseph; Mi, Xinlei; Ashrafi, Ali; Morse, Alison M; McIntyre, Lauren M
2018-04-20
Metabolomics has the promise to transform the area of personalized medicine with the rapid development of high throughput technology for untargeted analysis of metabolites. Open access, easy to use, analytic tools that are broadly accessible to the biological community need to be developed. While technology used in metabolomics varies, most metabolomics studies have a set of features identified. Galaxy is an open access platform that enables scientists at all levels to interact with big data. Galaxy promotes reproducibility by saving histories and enabling the sharing of workflows among scientists. SECIMTools (SouthEast Center for Integrated Metabolomics) is a set of Python applications that are available both as standalone tools and wrapped for use in Galaxy. The suite includes a comprehensive set of quality control metrics (retention time window evaluation and various peak evaluation tools), visualization techniques (hierarchical cluster heatmap, principal component analysis, modular modularity clustering), basic statistical analysis methods (partial least squares - discriminant analysis, analysis of variance, t-test, Kruskal-Wallis non-parametric test), advanced classification methods (random forest, support vector machines), and advanced variable selection tools (least absolute shrinkage and selection operator LASSO and Elastic Net). SECIMTools leverages the Galaxy platform and enables integrated workflows for metabolomics data analysis made from building blocks designed for easy use and interpretability. Standard data formats and a set of utilities allow arbitrary linkages between tools to encourage novel workflow designs. The Galaxy framework enables future data integration for metabolomics studies with other omics data.
Source-Device-Independent Ultrafast Quantum Random Number Generation.
Marangon, Davide G; Vallone, Giuseppe; Villoresi, Paolo
2017-02-10
Secure random numbers are a fundamental element of many applications in science, statistics, cryptography and, more generally, in security protocols. We present a method that enables the generation of high-speed unpredictable random numbers from the quadratures of an electromagnetic field without any assumption on the input state. The method allows us to eliminate the numbers that can be predicted due to the presence of classical and quantum side information. In particular, we introduce a procedure to estimate a bound on the conditional min-entropy based on the entropic uncertainty principle for position and momentum observables of infinite dimensional quantum systems. By the above method, we experimentally demonstrated the generation of secure true random bits at a rate greater than 1.7 Gbit/s.
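As background for the min-entropy bound mentioned in the abstract above, the sketch below computes the plain (unconditional) min-entropy, H_min = -log2(max_x p(x)), of a known or empirically estimated distribution. It does not implement the paper's conditional bound derived from the entropic uncertainty principle, and the function names are assumptions for illustration.

```python
import math
from collections import Counter


def min_entropy(probabilities):
    """Min-entropy in bits: determined by the single most likely outcome."""
    return -math.log2(max(probabilities))


def empirical_min_entropy(samples):
    """Plug-in estimate from observed frequencies. It tends to overstate
    the entropy of a real device when the sample is small."""
    counts = Counter(samples)
    return -math.log2(max(counts.values()) / len(samples))


uniform_bits = min_entropy([0.25, 0.25, 0.25, 0.25])  # 2 bits per symbol
biased_bits = min_entropy([0.7, 0.1, 0.1, 0.1])       # roughly 0.51 bits
```

Min-entropy, rather than Shannon entropy, is the relevant quantity for randomness extraction because it bounds an adversary's best single guess; conditioning on side information can only lower it.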
Sergé, Arnauld; Bernard, Anne-Marie; Phélipot, Marie-Claire; Bertaux, Nicolas; Fallet, Mathieu; Grenot, Pierre; Marguet, Didier; He, Hai-Tao; Hamon, Yannick
2013-01-01
We introduce a series of experimental procedures enabling sensitive calcium monitoring in T cell populations by confocal video-microscopy. Tracking and post-acquisition analysis was performed using Methods for Automated and Accurate Analysis of Cell Signals (MAAACS), a fully customized program that associates a high throughput tracking algorithm, an intuitive reconnection routine and a statistical platform to provide, at a glance, the calcium barcode of a population of individual T-cells. Combined with a sensitive calcium probe, this method allowed us to unravel the heterogeneity in shape and intensity of the calcium response in T cell populations and especially in naive T cells, which display intracellular calcium oscillations upon stimulation by antigen presenting cells. PMID:24086124
Custovic, Adnan; Ainsworth, John; Arshad, Hasan; Bishop, Christopher; Buchan, Iain; Cullinan, Paul; Devereux, Graham; Henderson, John; Holloway, John; Roberts, Graham; Turner, Steve; Woodcock, Ashley; Simpson, Angela
2015-01-01
We created Asthma e-Lab, a secure web-based research environment to support consistent recording, description and sharing of data, computational/statistical methods and emerging findings across the five UK birth cohorts. The e-Lab serves as a data repository for our unified dataset and provides the computational resources and a scientific social network to support collaborative research. All activities are transparent, and emerging findings are shared via the e-Lab, linked to explanations of analytical methods, thus enabling knowledge transfer. eLab facilitates the iterative interdisciplinary dialogue between clinicians, statisticians, computer scientists, mathematicians, geneticists and basic scientists, capturing collective thought behind the interpretations of findings. PMID:25805205
A comparison of linear interpolation models for iterative CT reconstruction.
Hahn, Katharina; Schöndube, Harald; Stierstorfer, Karl; Hornegger, Joachim; Noo, Frédéric
2016-12-01
Recent reports indicate that model-based iterative reconstruction methods may improve image quality in computed tomography (CT). One difficulty with these methods is the number of options available to implement them, including the selection of the forward projection model and the penalty term. Currently, the literature is fairly scarce in terms of guidance regarding this selection step, whereas these options impact image quality. Here, the authors investigate the merits of three forward projection models that rely on linear interpolation: the distance-driven method, Joseph's method, and the bilinear method. The authors' selection is motivated by three factors: (1) in CT, linear interpolation is often seen as a suitable trade-off between discretization errors and computational cost, (2) the first two methods are popular with manufacturers, and (3) the third method enables assessing the importance of a key assumption in the other methods. One approach to evaluate forward projection models is to inspect their effect on discretized images, as well as the effect of their transpose on data sets, but significance of such studies is unclear since the matrix and its transpose are always jointly used in iterative reconstruction. Another approach is to investigate the models in the context they are used, i.e., together with statistical weights and a penalty term. Unfortunately, this approach requires the selection of a preferred objective function and does not provide clear information on features that are intrinsic to the model. The authors adopted the following two-stage methodology. First, the authors analyze images that progressively include components of the singular value decomposition of the model in a reconstructed image without statistical weights and penalty term. Next, the authors examine the impact of weights and penalty on observed differences. 
Image quality metrics were investigated for 16 different fan-beam imaging scenarios that enabled probing various aspects of all models. The metrics include a surrogate for computational cost, as well as bias, noise, and an estimation task, all at matched resolution. The analysis revealed fundamental differences in terms of both bias and noise. Task-based assessment appears to be required to appreciate the differences in noise; the estimation task the authors selected showed that these differences balance out to yield similar performance. Some scenarios highlighted merits for the distance-driven method in terms of bias but with an increase in computational cost. Three combinations of statistical weights and penalty term showed that the observed differences remain the same, but strong edge-preserving penalty can dramatically reduce the magnitude of these differences. In many scenarios, Joseph's method seems to offer an interesting compromise between cost and computational effort. The distance-driven method offers the possibility to reduce bias but with an increase in computational cost. The bilinear method indicated that a key assumption in the other two methods is highly robust. Last, strong edge-preserving penalty can act as a compensator for insufficiencies in the forward projection model, bringing all models to similar levels in the most challenging imaging scenarios. Also, the authors find that their evaluation methodology helps appreciating how model, statistical weights, and penalty term interplay together.
Bessey, Cindy; Vanderklift, Mathew A
2014-02-15
Stable isotope analysis (SIA) is a powerful tool in many fields of research that enables quantitative comparisons among studies, if similar methods have been used. The goal of this study was to determine if three different drying methods commonly used to prepare samples for SIA yielded different δ(15)N and δ(13)C values. Muscle subsamples from 10 individuals each of three teleost species were dried using three methods: (i) oven, (ii) food dehydrator, and (iii) freeze-dryer. All subsamples were analysed for δ(15)N and δ(13)C values, and nitrogen and carbon content, using a continuous flow system consisting of a Delta V Plus mass spectrometer and a Flush 1112 elemental analyser via a Conflo IV universal interface. The δ(13)C values were normalized to constant lipid content using the equations proposed by McConnaughey and McRoy. Although statistically significant, the differences in δ(15)N values between the drying methods were small (mean differences ≤0.21‰). The differences in δ(13)C values between the drying methods were not statistically significant, and normalising the δ(13)C values to constant lipid content reduced the mean differences for all treatments to ≤0.65‰. A statistically significant difference of ~2% in C content existed between tissues dried in a food dehydrator and those dried in a freeze-dryer for two fish species. There was no significant effect of fish size on the differences between methods. No substantial effect of drying method was found on the δ(15)N or δ(13)C values of teleost muscle tissue. Copyright © 2013 John Wiley & Sons, Ltd.
Tang, Jie; Nett, Brian E; Chen, Guang-Hong
2009-10-07
Of all available reconstruction methods, statistical iterative reconstruction algorithms appear particularly promising since they enable accurate physical noise modeling. The newly developed compressive sampling/compressed sensing (CS) algorithm has shown the potential to accurately reconstruct images from highly undersampled data. The CS algorithm can be implemented in the statistical reconstruction framework as well. In this study, we compared the performance of two standard statistical reconstruction algorithms (penalized weighted least squares and q-GGMRF) to the CS algorithm. In assessing the image quality using these iterative reconstructions, it is critical to utilize realistic background anatomy as the reconstruction results are object dependent. A cadaver head was scanned on a Varian Trilogy system at different dose levels. Several figures of merit including the relative root mean square error and a quality factor which accounts for the noise performance and the spatial resolution were introduced to objectively evaluate reconstruction performance. A comparison is presented between the three algorithms for a constant undersampling factor comparing different algorithms at several dose levels. To facilitate this comparison, the original CS method was formulated in the framework of the statistical image reconstruction algorithms. Important conclusions of the measurements from our studies are that (1) for realistic neuro-anatomy, over 100 projections are required to avoid streak artifacts in the reconstructed images even with CS reconstruction, (2) regardless of the algorithm employed, it is beneficial to distribute the total dose to more views as long as each view remains quantum noise limited and (3) the total variation-based CS method is not appropriate for very low dose levels because while it can mitigate streaking artifacts, the images exhibit patchy behavior, which is potentially harmful for medical diagnosis.
First principles statistical mechanics of alloys and magnetism
NASA Astrophysics Data System (ADS)
Eisenbach, Markus; Khan, Suffian N.; Li, Ying Wai
Modern high performance computing resources are enabling the exploration of the statistical physics of phase spaces with increasing size and higher fidelity of the Hamiltonian of the systems. For selected systems, this now allows the combination of Density Functional based first principles calculations with classical Monte Carlo methods for parameter free, predictive thermodynamics of materials. We combine our locally self-consistent real space multiple scattering method for solving the Kohn-Sham equation with Wang-Landau Monte Carlo calculations (WL-LSMS). In the past we have applied this method to the calculation of Curie temperatures in magnetic materials. Here we will present direct calculations of the chemical order-disorder transitions in alloys. We present our calculated transition temperature for the chemical ordering in CuZn and the temperature dependence of the short-range order parameter and specific heat. Finally we will present the extension of the WL-LSMS method to magnetic alloys, thus allowing the investigation of the interplay of magnetism, structure and chemical order in ferrous alloys. This research was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Science and Engineering Division and it used Oak Ridge Leadership Computing Facility resources at Oak Ridge National Laboratory.
A statistical approach to combining multisource information in one-class classifiers
Simonson, Katherine M.; Derek West, R.; Hansen, Ross L.; ...
2017-06-08
A new method is introduced in this paper for combining information from multiple sources to support one-class classification. The contributing sources may represent measurements taken by different sensors of the same physical entity, repeated measurements by a single sensor, or numerous features computed from a single measured image or signal. The approach utilizes the theory of statistical hypothesis testing, and applies Fisher's technique for combining p-values, modified to handle nonindependent sources. Classifier outputs take the form of fused p-values, which may be used to gauge the consistency of unknown entities with one or more class hypotheses. The approach enables rigorous assessment of classification uncertainties, and allows for traceability of classifier decisions back to the constituent sources, both of which are important for high-consequence decision support. Application of the technique is illustrated in two challenge problems, one for skin segmentation and the other for terrain labeling. Finally, the method is seen to be particularly effective for relatively small training samples.
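As a rough illustration of the p-value fusion step, the sketch below implements the classical Fisher combination for independent sources only. The abstract describes a modification for nonindependent sources that is not reproduced here; the function name `fisher_combined_pvalue` is hypothetical and not taken from the paper.

```python
import math


def fisher_combined_pvalue(p_values):
    """Fuse p-values with classical Fisher's method (assumes independent sources)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    # Fisher statistic X = -2 * sum(ln p_i); under the null hypothesis it is
    # chi-square distributed with 2k degrees of freedom. We keep X / 2.
    half_x = -sum(math.log(p) for p in p_values)
    # For an even number of degrees of freedom (2k) the chi-square survival
    # function has a closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total
```

A small fused p-value suggests that the entity is inconsistent with the class hypothesis across the sources taken together. With a single source the function returns that source's p-value unchanged.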
A statistical approach to combining multisource information in one-class classifiers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simonson, Katherine M.; Derek West, R.; Hansen, Ross L.
A new method is introduced in this paper for combining information from multiple sources to support one-class classification. The contributing sources may represent measurements taken by different sensors of the same physical entity, repeated measurements by a single sensor, or numerous features computed from a single measured image or signal. The approach utilizes the theory of statistical hypothesis testing, and applies Fisher's technique for combining p-values, modified to handle nonindependent sources. Classifier outputs take the form of fused p-values, which may be used to gauge the consistency of unknown entities with one or more class hypotheses. The approach enables rigorous assessment of classification uncertainties, and allows for traceability of classifier decisions back to the constituent sources, both of which are important for high-consequence decision support. Application of the technique is illustrated in two challenge problems, one for skin segmentation and the other for terrain labeling. Finally, the method is seen to be particularly effective for relatively small training samples.
Advanced building energy management system demonstration for Department of Defense buildings.
O'Neill, Zheng; Bailey, Trevor; Dong, Bing; Shashanka, Madhusudana; Luo, Dong
2013-08-01
This paper presents an advanced building energy management system (aBEMS) that employs advanced methods of whole-building performance monitoring combined with statistical methods of learning and data analysis to enable identification of both gradual and discrete performance erosion and faults. This system assimilated data collected from multiple sources, including blueprints, reduced-order models (ROM) and measurements, and employed advanced statistical learning algorithms to identify patterns of anomalies. The results were presented graphically in a manner understandable to facilities managers. A demonstration of aBEMS was conducted in buildings at Naval Station Great Lakes. The facility building management systems were extended to incorporate the energy diagnostics and analysis algorithms, producing systematic identification of more efficient operation strategies. At Naval Station Great Lakes, greater than 20% savings were demonstrated for building energy consumption by improving facility manager decision support to diagnose energy faults and prioritize alternative, energy-efficient operation strategies. The paper concludes with recommendations for widespread aBEMS success. © 2013 New York Academy of Sciences.
Synaptic dynamics contribute to long-term single neuron response fluctuations.
Reinartz, Sebastian; Biro, Istvan; Gal, Asaf; Giugliano, Michele; Marom, Shimon
2014-01-01
Firing rate variability at the single neuron level is characterized by long-memory processes and complex statistics over a wide range of time scales (from milliseconds up to several hours). Here, we focus on the contribution of the non-stationary efficacy of the ensemble of synapses (activated in response to a given stimulus) to single neuron response variability. We present and validate a method tailored for controlled and specific long-term activation of a single cortical neuron in vitro via synaptic or antidromic stimulation, enabling a clear separation between two determinants of neuronal response variability: membrane excitability dynamics vs. synaptic dynamics. Applying this method we show that, within the range of physiological activation frequencies, the synaptic ensemble of a given neuron is a key contributor to the neuronal response variability, long-memory processes and complex statistics observed over extended time scales. Synaptic transmission dynamics impact response variability at stimulation rates that are substantially lower than the stimulation rates that drive excitability resources to fluctuate. Implications for network-embedded neurons are discussed.
Partitioning heritability by functional annotation using genome-wide association summary statistics.
Finucane, Hilary K; Bulik-Sullivan, Brendan; Gusev, Alexander; Trynka, Gosia; Reshef, Yakir; Loh, Po-Ru; Anttila, Verneri; Xu, Han; Zang, Chongzhi; Farh, Kyle; Ripke, Stephan; Day, Felix R; Purcell, Shaun; Stahl, Eli; Lindstrom, Sara; Perry, John R B; Okada, Yukinori; Raychaudhuri, Soumya; Daly, Mark J; Patterson, Nick; Neale, Benjamin M; Price, Alkes L
2015-11-01
Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.
Chi-Square Statistics, Tests of Hypothesis and Technology.
ERIC Educational Resources Information Center
Rochowicz, John A.
The use of technology such as computers and programmable calculators enables students to find p-values and conduct tests of hypotheses in many different ways. Comprehension and interpretation of a research problem become the focus for statistical analysis. This paper describes how to calculate chi-square statistics and p-values for statistical…
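To make the calculation concrete, here is a minimal sketch of a Pearson chi-square test of independence on a 2x2 table, using only the Python standard library. It is not taken from the paper; the function name `chi_square_2x2` is an assumption, no continuity correction is applied, and the p-value formula is specific to one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the statistic is the square of a standard
    # normal variable, so P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Calling `chi_square_2x2([[10, 20], [20, 10]])` returns the statistic together with its p-value, which can then be compared against a chosen significance level.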
Olokundun, Maxwell; Iyiola, Oluwole; Ibidunni, Stephen; Ogbari, Mercy; Falola, Hezekiah; Salau, Odunayo; Peter, Fred; Borishade, Taiye
2018-06-01
The article presented data on the effectiveness of entrepreneurship curriculum contents on university students' entrepreneurial interest and knowledge. The study focused on the perceptions of Nigerian university students. Emphasis was laid on the first four universities in Nigeria to offer a degree programme in entrepreneurship. The study adopted a quantitative approach with a descriptive research design to establish trends related to the objective of the study. A survey was used as the quantitative research method. The population of this study included all students in the selected universities. Data were analyzed with the Statistical Package for Social Sciences (SPSS). The mean score was used as the statistical tool of analysis. The field data set is made widely accessible to enable critical or more comprehensive investigation.
ZERODUR: deterministic approach for strength design
NASA Astrophysics Data System (ADS)
Hartmann, Peter
2012-12-01
There is an increasing demand for zero-expansion glass ceramic ZERODUR substrates capable of enduring higher operational static loads or accelerations. The integrity of structures such as optical or mechanical elements for satellites surviving rocket launches, filigree lightweight mirrors, wobbling mirrors, and reticle and wafer stages in microlithography must be guaranteed with low failure probability. Their design requires statistically relevant strength data. The traditional approach using the statistical two-parameter Weibull distribution suffered from two problems: the data sets were too small to obtain distribution parameters with sufficient accuracy, and also too small to decide on the validity of the model. This holds especially for the low failure probability levels that are required for reliable applications. Extrapolation to 0.1% failure probability and below led to design strengths so low that higher load applications seemed infeasible. New data have been collected with numbers per set large enough to enable tests on the applicability of the three-parameter Weibull distribution. This distribution proved to fit the data much better. Moreover, it delivers a lower threshold value, that is, a minimum value for breakage stress, which allows statistical uncertainty to be removed by introducing a deterministic method to calculate design strength. Considerations taken from the theory of fracture mechanics, which have proven reliable in proof test qualifications of delicate structures made from brittle materials, enable fatigue due to stress corrosion to be included in a straightforward way. With the formulae derived, either lifetime can be calculated from given stress or allowable stress from minimum required lifetime. The data, distributions, and design strength calculations for several practically relevant surface conditions of ZERODUR are given. The values obtained are significantly higher than those resulting from the two-parameter Weibull distribution approach and no longer subject to statistical uncertainty.
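The role of the threshold can be illustrated with the textbook form of the three-parameter Weibull distribution, F(s) = 1 - exp(-((s - s_T) / eta)^beta) for s > s_T and F(s) = 0 otherwise. The sketch below is an assumption-laden illustration: the function names are hypothetical, the paper's own formulae and fitted parameters are not reproduced, and fatigue due to stress corrosion is not modeled.

```python
import math


def weibull3_failure_probability(stress, threshold, scale, shape):
    """Failure probability at a given stress; zero at or below the threshold."""
    if stress <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((stress - threshold) / scale) ** shape))


def weibull3_stress_at_probability(prob, threshold, scale, shape):
    """Stress at which the failure probability equals prob (inverse of the CDF)."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    # Solve 1 - exp(-((s - threshold) / scale) ** shape) = prob for s.
    return threshold + scale * (-math.log(1.0 - prob)) ** (1.0 / shape)
```

However small the target failure probability, the returned stress never falls below `threshold`. Setting `threshold` to zero recovers the two-parameter form, whose extrapolated stress goes to zero as the target probability shrinks.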
Interlaced X-ray diffraction computed tomography
Vamvakeros, Antonios; Jacques, Simon D. M.; Di Michiel, Marco; Senecal, Pierre; Middelkoop, Vesna; Cernik, Robert J.; Beale, Andrew M.
2016-01-01
An X-ray diffraction computed tomography data-collection strategy that allows, post experiment, a choice between temporal and spatial resolution is reported. This strategy enables time-resolved studies on comparatively short timescales, or alternatively allows for improved spatial resolution if the system under study, or components within it, appear to be unchanging. The application of the method for studying an Mn–Na–W/SiO2 fixed-bed reactor in situ is demonstrated. Additionally, the opportunities to improve the data-collection strategy further, enabling post-collection tuning between statistical, temporal and spatial resolutions, are discussed. In principle, the interlaced scanning approach can also be applied to other pencil-beam tomographic techniques, like X-ray fluorescence computed tomography, X-ray absorption fine structure computed tomography, pair distribution function computed tomography and tomographic scanning transmission X-ray microscopy. PMID:27047305
Quantitative single-molecule imaging by confocal laser scanning microscopy.
Vukojevic, Vladana; Heidkamp, Marcus; Ming, Yu; Johansson, Björn; Terenius, Lars; Rigler, Rudolf
2008-11-25
A new approach to quantitative single-molecule imaging by confocal laser scanning microscopy (CLSM) is presented. It relies on fluorescence intensity distribution to analyze the molecular occurrence statistics captured by digital imaging and enables direct determination of the number of fluorescent molecules and their diffusion rates without resorting to temporal or spatial autocorrelation analyses. Digital images of fluorescent molecules were recorded by using fast scanning and avalanche photodiode detectors. In this way the signal-to-background ratio was significantly improved, enabling direct quantitative imaging by CLSM. The potential of the proposed approach is demonstrated by using standard solutions of fluorescent dyes, fluorescently labeled DNA molecules, quantum dots, and the Enhanced Green Fluorescent Protein in solution and in live cells. The method was verified by using fluorescence correlation spectroscopy. The relevance for biological applications, in particular, for live cell imaging, is discussed.
Use of the World Wide Web for multisite data collection.
Subramanian, A K; McAfee, A T; Getzinger, J P
1997-08-01
As access to the Internet becomes increasingly available, research applications in medicine will increase. This paper describes the use of the Internet, and, more specifically, the World Wide Web (WWW), as a channel of communication between EDs throughout the world and investigators who are interested in facilitating the collection of data from multiple sites. Data entered into user-friendly electronic surveys can be transmitted over the Internet to a database located at the site of the study, rendering geographic separation less of a barrier to the conduction of multisite studies. The electronic format of the data can enable real-time statistical processing while data are stored using existing database technologies. In theory, automated processing of variables within such a database enables early identification of data trends. Methods of ensuring validity, security, and compliance are discussed.
A concept for holistic whole body MRI data analysis, Imiomics
Malmberg, Filip; Johansson, Lars; Lind, Lars; Sundbom, Magnus; Ahlström, Håkan; Kullberg, Joel
2017-01-01
Purpose: To present and evaluate a whole-body image analysis concept, Imiomics (imaging–omics) and an image registration method that enables Imiomics analyses by deforming all image data to a common coordinate system, so that the information in each voxel can be compared between persons or within a person over time and integrated with non-imaging data. Methods: The presented image registration method utilizes relative elasticity constraints of different tissue obtained from whole-body water-fat MRI. The registration method is evaluated by inverse consistency and Dice coefficients and the Imiomics concept is evaluated by example analyses of importance for metabolic research using non-imaging parameters where we know what to expect. The example analyses include whole body imaging atlas creation, anomaly detection, and cross-sectional and longitudinal analysis. Results: The image registration method evaluation on 128 subjects shows low inverse consistency errors and high Dice coefficients. Also, the statistical atlas with fat content intensity values shows low standard deviation values, indicating successful deformations to the common coordinate system. The example analyses show expected associations and correlations which agree with explicit measurements, and thereby illustrate the usefulness of the proposed Imiomics concept. Conclusions: The registration method is well-suited for Imiomics analyses, which enable analyses of relationships to non-imaging data, e.g. clinical data, in new types of holistic targeted and untargeted big-data analysis. PMID:28241015
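For readers unfamiliar with the overlap measure used in the evaluation, the following sketch computes a Dice coefficient between two binary segmentations. Representing each mask as a collection of voxel coordinates is an assumption made for brevity; the function name is hypothetical and the authors' implementation is not known from the abstract.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two masks given as voxel coordinates."""
    a = set(mask_a)
    b = set(mask_b)
    if not a and not b:
        # Convention chosen here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

A value of 1 means the deformed and reference segmentations coincide voxel for voxel, and 0 means they do not overlap at all.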
Möltgen, C-V; Herdling, T; Reich, G
2013-11-01
This study demonstrates an approach, using science-based calibration (SBC), for direct coating thickness determination on heart-shaped tablets in real-time. Near-Infrared (NIR) spectra were collected during four full industrial pan coating operations. The tablets were coated with a thin hydroxypropyl methylcellulose (HPMC) film up to a film thickness of 28 μm. The application of SBC permits the calibration of the NIR spectral data without using costly determined reference values. This is due to the fact that SBC combines classical methods to estimate the coating signal and statistical methods for the noise estimation. The approach enabled the use of NIR for the measurement of the film thickness increase from around 8 to 28 μm of four independent batches in real-time. The developed model provided a spectroscopic limit of detection for the coating thickness of 0.64 ± 0.03 μm root-mean square (RMS). In the commonly used statistical methods for calibration, such as Partial Least Squares (PLS), sufficiently varying reference values are needed for calibration. For thin non-functional coatings this is a challenge because the quality of the model depends on the accuracy of the selected calibration standards. The obvious and simple approach of SBC eliminates many of the problems associated with the conventional statistical methods and offers an alternative for multivariate calibration. Copyright © 2013 Elsevier B.V. All rights reserved.
Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.
Xu, Lijing; Furlotte, Nicholas; Lin, Yunyue; Heinrich, Kevin; Berry, Michael W; George, Ebenezer O; Homayouni, Ramin
2011-04-14
High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.
Quan, Phenix-Lan; Sauzade, Martin
2018-01-01
Digital Polymerase Chain Reaction (dPCR) is a novel method for the absolute quantification of target nucleic acids. Quantification by dPCR hinges on the fact that the random distribution of molecules in many partitions follows a Poisson distribution. Each partition acts as an individual PCR microreactor and partitions containing amplified target sequences are detected by fluorescence. The proportion of PCR-positive partitions suffices to determine the concentration of the target sequence without a need for calibration. Advances in microfluidics enabled the current revolution of digital quantification by providing efficient partitioning methods. In this review, we compare the fundamental concepts behind the quantification of nucleic acids by dPCR and quantitative real-time PCR (qPCR). We detail the underlying statistics of dPCR and explain how it defines its precision and performance metrics. We review the different microfluidic digital PCR formats, present their underlying physical principles, and analyze the technological evolution of dPCR platforms. We present the novel multiplexing strategies enabled by dPCR and examine how isothermal amplification could be an alternative to PCR in digital assays. Finally, we determine whether the theoretical advantages of dPCR over qPCR hold true by perusing studies that directly compare assays implemented with both methods. PMID:29677144
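The Poisson argument can be sketched in a few lines. If molecules are distributed randomly, the chance that a partition is empty is exp(-lambda), so the mean number of copies per partition follows from the fraction of positive partitions as lambda = -ln(1 - positive / total). The function names and the partition volume below are hypothetical, and the sketch ignores partition-volume variation and false positives or negatives.

```python
import math


def dpcr_copies_per_partition(positive, total):
    """Mean target copies per partition from the count of positive partitions."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total (saturated runs are not quantifiable)")
    # P(partition is negative) = exp(-lambda)  =>  lambda = -ln(1 - positive / total)
    return -math.log(1.0 - positive / total)


def dpcr_concentration(positive, total, partition_volume_ul):
    """Target concentration in copies per microliter of partitioned reaction."""
    return dpcr_copies_per_partition(positive, total) / partition_volume_ul
```

For example, when half of the partitions are positive the estimate is ln 2 (about 0.69) copies per partition, which is higher than the naive 0.5 because some partitions hold more than one molecule.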
NASA Astrophysics Data System (ADS)
Inclan, Eric; Lassester, Jack; Geohegan, David; Yoon, Mina
Optimization algorithms (OA) coupled with numerical methods enable researchers to identify and study (meta)stable nanoclusters without the control restrictions of empirical methods. An algorithm's performance is governed by two factors: (1) its compatibility with an objective function, and (2) the dimension of a design space, which increases with cluster size. Although researchers often tune an algorithm's user-defined parameters (UDP), tuning is not guaranteed to improve performance. In this research, Particle Swarm (PSO) and Differential Evolution (DE) are compared by tuning their UDP in a multi-objective optimization environment (MOE). Combined with a Kolmogorov-Smirnov test for statistical significance, the MOE enables the study of the Pareto Front (PF), made of the UDP settings that trade off between best performance in energy minimization ("effectiveness") based on force-field potential energy, and best convergence rate ("efficiency"). By studying the PF, this research finds that UDP values frequently suggested in the literature do not provide best effectiveness for these methods. Additionally, monotonic convergence is found to significantly improve efficiency without sacrificing effectiveness for very small systems, suggesting better compatibility. Work is supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division.
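The notion of a Pareto front over parameter settings can be sketched as a simple non-dominated filter. Here each setting is scored by a tuple of objectives that are all to be minimized, for instance (final energy, iterations to converge); this scoring and the function name `pareto_front` are illustrative assumptions, not the authors' environment.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimized."""
    front = []
    for i, p in enumerate(points):
        dominated = False
        for j, q in enumerate(points):
            if i == j:
                continue
            # q dominates p if it is no worse in every objective
            # and strictly better in at least one.
            no_worse = all(qk <= pk for qk, pk in zip(q, p))
            strictly_better = any(qk < pk for qk, pk in zip(q, p))
            if no_worse and strictly_better:
                dominated = True
                break
        if not dominated:
            front.append(p)
    return front
```

Settings left on the front are those for which effectiveness cannot be improved without giving up efficiency, or the reverse. The quadratic scan is adequate for the modest number of settings in a tuning study.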
Uncertainty of quantitative microbiological methods of pharmaceutical analysis.
Gunar, O V; Sakhno, N G
2015-12-30
The total uncertainty of quantitative microbiological methods used in pharmaceutical analysis consists of several components. Analysis of the most important sources of variability in quantitative microbiological methods demonstrated no effect of culture media and plate-count techniques on the estimation of microbial count, while a highly significant effect of other factors (type of microorganism, pharmaceutical product, and individual reading and interpreting errors) was established. The most appropriate method of statistical analysis of such data was ANOVA, which enabled not only the effect of individual factors to be estimated but also their interactions. Considering all the elements of uncertainty and combining them mathematically, the combined relative uncertainty of the test results was estimated both for the method of quantitative examination of non-sterile pharmaceuticals and for the microbial count technique without any product. These values did not exceed 35%, which is appropriate for traditional plate count methods. Copyright © 2015 Elsevier B.V. All rights reserved.
Paavolainen, Lassi; Acar, Erman; Tuna, Uygar; Peltonen, Sari; Moriya, Toshio; Soonsawad, Pan; Marjomäki, Varpu; Cheng, R Holland; Ruotsalainen, Ulla
2014-01-01
Electron tomography (ET) of biological samples is used to study the organization and the structure of the whole cell and subcellular complexes in great detail. However, projections cannot be acquired over full tilt angle range with biological samples in electron microscopy. ET image reconstruction can be considered an ill-posed problem because of this missing information. This results in artifacts, seen as the loss of three-dimensional (3D) resolution in the reconstructed images. The goal of this study was to achieve isotropic resolution with a statistical reconstruction method, sequential maximum a posteriori expectation maximization (sMAP-EM), using no prior morphological knowledge about the specimen. The missing wedge effects on sMAP-EM were examined with a synthetic cell phantom to assess the effects of noise. An experimental dataset of a multivesicular body was evaluated with a number of gold particles. An ellipsoid fitting based method was developed to realize the quantitative measures elongation and contrast in an automated, objective, and reliable way. The method statistically evaluates the sub-volumes containing gold particles randomly located in various parts of the whole volume, thus giving information about the robustness of the volume reconstruction. The quantitative results were also compared with reconstructions made with widely-used weighted backprojection and simultaneous iterative reconstruction technique methods. The results showed that the proposed sMAP-EM method significantly suppresses the effects of the missing information producing isotropic resolution. Furthermore, this method improves the contrast ratio, enhancing the applicability of further automatic and semi-automatic analysis. These improvements in ET reconstruction by sMAP-EM enable analysis of subcellular structures with higher three-dimensional resolution and contrast than conventional methods.
Alexanderian, Alen; Zhu, Liang; Salloum, Maher; Ma, Ronghui; Yu, Meilin
2017-09-01
In this study, statistical models are developed for modeling uncertain heterogeneous permeability and porosity in tumors, and the resulting uncertainties in pressure and velocity fields during an intratumoral injection are quantified using a nonintrusive spectral uncertainty quantification (UQ) method. Specifically, the uncertain permeability is modeled as a log-Gaussian random field, represented using a truncated Karhunen-Loève (KL) expansion, and the uncertain porosity is modeled as a log-normal random variable. The efficacy of the developed statistical models is validated by simulating the concentration fields with permeability and porosity of different uncertainty levels. The irregularity in the concentration field bears reasonable visual agreement with that in MicroCT images from experiments. The pressure and velocity fields are represented using polynomial chaos (PC) expansions to enable efficient computation of their statistical properties. The coefficients in the PC expansion are computed using a nonintrusive spectral projection method with the Smolyak sparse quadrature. The developed UQ approach is then used to quantify the uncertainties in the random pressure and velocity fields. A global sensitivity analysis is also performed to assess the contribution of individual KL modes of the log-permeability field to the total variance of the pressure field. It is demonstrated that the developed UQ approach can effectively quantify the flow uncertainties induced by uncertain material properties of the tumor.
The discounting model selector: Statistical software for delay discounting applications.
Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A
2017-05-01
Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executed approximate Bayesian model selection methods from user-supplied temporal discounting data and computed the effective delay 50 (ED50) from the best performing model. Software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). The results of independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to those of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that the true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature in addition to the data from the source paper. Comparisons of model selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-sourced software are discussed and a review of possible expansions of this software is provided. © 2017 Society for the Experimental Analysis of Behavior.
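To make the ED50 concrete: it is the delay at which a reward retains half of its undiscounted value. The sketch below works this out for two commonly used discounting models, the hyperbolic form V = A / (1 + kD) and the exponential form V = A * exp(-kD). The abstract does not list the candidate models the software compares, so the choice of these two, the function names, and the parameter values are assumptions made purely for illustration.

```python
import math

def hyperbolic_value(amount, k, delay):
    """Discounted value under the hyperbolic model V = A / (1 + k*D)."""
    return amount / (1.0 + k * delay)

def exponential_value(amount, k, delay):
    """Discounted value under the exponential model V = A * exp(-k*D)."""
    return amount * math.exp(-k * delay)

def ed50_hyperbolic(k):
    """Delay at which the hyperbolic model halves the value: 1 / k."""
    return 1.0 / k

def ed50_exponential(k):
    """Delay at which the exponential model halves the value: ln(2) / k."""
    return math.log(2.0) / k

if __name__ == "__main__":
    k = 0.02  # hypothetical discounting rate per day
    for name, ed50, value in (
        ("hyperbolic", ed50_hyperbolic(k), hyperbolic_value),
        ("exponential", ed50_exponential(k), exponential_value),
    ):
        print(name, ed50, value(100.0, k, ed50))
```

Because ED50 is expressed in units of delay, it allows discounting to be compared across models whose rate parameters are not directly comparable.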
Separation and confirmation of showers
NASA Astrophysics Data System (ADS)
Neslušan, L.; Hajduková, M.
2017-02-01
Aims: Using IAU MDC photographic, IAU MDC CAMS video, SonotaCo video, and EDMOND video databases, we aim to separate all provable annual meteor showers from each of these databases. We intend to reveal the problems inherent in this procedure and answer the question whether the databases are complete and the methods of separation used are reliable. We aim to evaluate the statistical significance of each separated shower. In this respect, we intend to give a list of reliably separated showers rather than a list of the maximum possible number of showers. Methods: To separate the showers, we simultaneously used two methods. The use of two methods enables us to compare their results, and this can indicate the reliability of the methods. To evaluate the statistical significance, we suggest a new method based on the ideas of the break-point method. Results: We give a compilation of the showers from all four databases using both methods. Using the first (second) method, we separated 107 (133) showers, which are in at least one of the databases used. These relatively low numbers are a consequence of discarding any candidate shower with a poor statistical significance. Most of the separated showers were identified as meteor showers from the IAU MDC list of all showers. Many of them were identified as several of the showers in the list. This proves that many showers have been named multiple times with different names. Conclusions: At present, a prevailing share of existing annual showers can be found in the data and confirmed when we use a combination of results from large databases. However, to gain a complete list of showers, we need more-complete meteor databases than the most extensive databases currently are. We also still need a more sophisticated method to separate showers and evaluate their statistical significance. 
Tables A.1 and A.2 are also available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/598/A40
DEIVA: a web application for interactive visual analysis of differential gene expression profiles.
Harshbarger, Jayson; Kratz, Anton; Carninci, Piero
2017-01-07
Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.
Stochastic Analysis and Design of Heterogeneous Microstructural Materials System
NASA Astrophysics Data System (ADS)
Xu, Hongyi
Advanced materials system refers to new materials that are comprised of multiple traditional constituents but complex microstructure morphologies, which lead to superior properties over the conventional materials. To accelerate the development of new advanced materials system, the objective of this dissertation is to develop a computational design framework and the associated techniques for design automation of microstructure materials systems, with an emphasis on addressing the uncertainties associated with the heterogeneity of microstructural materials. Five key research tasks are identified: design representation, design evaluation, design synthesis, material informatics and uncertainty quantification. Design representation of microstructure includes statistical characterization and stochastic reconstruction. This dissertation develops a new descriptor-based methodology, which characterizes 2D microstructures using descriptors of composition, dispersion and geometry. Statistics of 3D descriptors are predicted based on 2D information to enable 2D-to-3D reconstruction. An efficient sequential reconstruction algorithm is developed to reconstruct statistically equivalent random 3D digital microstructures. In design evaluation, a stochastic decomposition and reassembly strategy is developed to deal with the high computational costs and uncertainties induced by material heterogeneity. The properties of Representative Volume Elements (RVE) are predicted by stochastically reassembling SVE elements with stochastic properties into a coarse representation of the RVE. In design synthesis, a new descriptor-based design framework is developed, which integrates computational methods of microstructure characterization and reconstruction, sensitivity analysis, Design of Experiments (DOE), metamodeling and optimization to enable parametric optimization of the microstructure for achieving the desired material properties.
Material informatics is studied to efficiently reduce the dimension of microstructure design space. This dissertation develops a machine learning-based methodology to identify the key microstructure descriptors that highly impact properties of interest. In uncertainty quantification, a comparative study on data-driven random process models is conducted to provide guidance for choosing the most accurate model in statistical uncertainty quantification. Two new goodness-of-fit metrics are developed to provide quantitative measurements of random process models' accuracy. The benefits of the proposed methods are demonstrated by the example of designing the microstructure of polymer nanocomposites. This dissertation provides material-generic, intelligent modeling/design methodologies and techniques to accelerate the process of analyzing and designing new microstructural materials system.
Data Analysis for the Behavioral Sciences Using SPSS
NASA Astrophysics Data System (ADS)
Lawner Weinberg, Sharon; Knapp Abramowitz, Sarah
2002-04-01
This book is written from the perspective that statistics is an integrated set of tools used together to uncover the story contained in numerical data. Accordingly, the book comes with a disk containing a series of real data sets to motivate discussions of appropriate methods of analysis. The presentation is based on a conceptual approach supported by an understanding of underlying mathematical foundations. Students learn that more than one method of analysis is typically needed and that an ample characterization of results is a critical component of any data analytic plan. The use of real data and SPSS to perform computations and create graphical summaries enables a greater emphasis on conceptual understanding and interpretation.
Satellite disintegration dynamics
NASA Technical Reports Server (NTRS)
Dasenbrock, R. R.; Kaufman, B.; Heard, W. B.
1975-01-01
The subject of satellite disintegration is examined in detail. Elements of the orbits of individual fragments, determined by DOD space surveillance systems, are used to accurately predict the time and place of fragmentation. Dual time independent and time dependent analyses are performed for simulated and real breakups. Methods of statistical mechanics are used to study the evolution of the fragment clouds. The fragments are treated as an ensemble of non-interacting particles. A solution of Liouville's equation is obtained which enables the spatial density to be calculated as a function of position, time and initial velocity distribution.
Schloss, Patrick D; Handelsman, Jo
2006-10-01
The recent advent of tools enabling statistical inferences to be drawn from comparisons of microbial communities has enabled the focus of microbial ecology to move from characterizing biodiversity to describing the distribution of that biodiversity. Although statistical tools have been developed to compare community structures across a phylogenetic tree, we lack tools to compare the memberships and structures of two communities at a particular operational taxonomic unit (OTU) definition. Furthermore, current tests of community structure do not indicate the similarity of the communities but only report the probability of a statistical hypothesis. Here we present a computer program, SONS, which implements nonparametric estimators for the fraction and richness of OTUs shared between two communities.
EHME: a new word database for research in Basque language.
Acha, Joana; Laka, Itziar; Landa, Josu; Salaburu, Pello
2014-11-14
This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli, based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens, and properties available include morphological structure frequency and word-similarity measures, apart from classical indexes: word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures. Measures are indexed at the lemma, morpheme and word level. We include reliability and validation analysis. The application is freely available, and enables the user to extract words based on concrete statistical criteria, as well as to obtain statistical characteristics from a list of words.
P-MartCancer-Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets.
Webb-Robertson, Bobbie-Jo M; Bramer, Lisa M; Jensen, Jeffrey L; Kobold, Markus A; Stratton, Kelly G; White, Amanda M; Rodland, Karin D
2017-11-01
P-MartCancer is an interactive web-based software environment that enables statistical analyses of peptide or protein data, quantitated from mass spectrometry-based global proteomics experiments, without requiring in-depth knowledge of statistical programming. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification, and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access and the capability to analyze multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium at the peptide, gene, and protein levels. P-MartCancer is deployed as a web service (https://pmart.labworks.org/cptac.html), alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/). Cancer Res; 77(21); e47-50. ©2017 American Association for Cancer Research.
Evaluation of variability in high-resolution protein structures by global distance scoring.
Anzai, Risa; Asami, Yoshiki; Inoue, Waka; Ueno, Hina; Yamada, Koya; Okada, Tetsuji
2018-01-01
Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most of the current structural comparisons are pairwise-based, which hampers the global analysis of increasing contents in the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it frequently accompanies other settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins, for each of which at least several high-resolution structures are available. As a pilot study, we have tested 300 human proteins and showed that the method is comprehensively used to overview advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.
RAId_aPS: MS/MS Analysis with Multiple Scoring Functions and Spectrum-Specific Statistics
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2010-01-01
Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given a MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given a MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page. PMID:21103371
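Mode (i), counting all possible peptides within a mass range, lends itself to a short dynamic-programming sketch. The version below is a toy under stated assumptions: it uses integer nominal residue masses for only three amino acids (G = 57, A = 71, S = 87), counts ordered residue sequences, and ignores terminal groups and modifications. It is not RAId_aPS's actual algorithm, whose details the abstract does not give, and the function names are hypothetical.

```python
def count_peptides_by_mass(residue_masses, max_mass):
    """Count ordered residue sequences for every integer mass up to max_mass.

    counts[m] is the number of sequences whose residue masses sum to m.
    Recurrence: the last residue r leaves a prefix of mass m - r, so
    counts[m] = sum over r of counts[m - r].
    """
    counts = [0] * (max_mass + 1)
    counts[0] = 1  # the empty sequence
    for m in range(1, max_mass + 1):
        counts[m] = sum(counts[m - r] for r in residue_masses if r <= m)
    return counts

def count_in_mass_range(counts, low, high):
    """Total number of sequences with low <= mass <= high."""
    return sum(counts[low:high + 1])

if __name__ == "__main__":
    # Toy alphabet with nominal residue masses: G, A, S (assumed subset).
    toy_masses = [57, 71, 87]
    counts = count_peptides_by_mass(toy_masses, 300)
    print(counts[128])                           # sequences GA and AG
    print(count_in_mass_range(counts, 100, 200))
```

The same table-filling idea extends from counting sequences to accumulating a score histogram, by tracking a score dimension alongside the mass.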
78 FR 63458 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-24
..., access to conduct research involving DoDEA students, staff, parents or data. Additionally will establish researcher accountability, enable future contact with researchers, and support preparation of statistical and... students, staff, parents or data. To establish researcher accountability, enable future contact with...
Transmission overhaul estimates for partial and full replacement at repair
NASA Technical Reports Server (NTRS)
Savage, M.; Lewicki, D. G.
1991-01-01
Timely transmission overhauls increase in-flight service reliability beyond the calculated design reliabilities of the individual aircraft transmission components. Although necessary for aircraft safety, transmission overhauls contribute significantly to aircraft expense. Predictions of a transmission's maintenance needs at the design stage should enable the development of more cost-effective and reliable transmissions in the future. The frequency of overhaul is estimated, along with the number of transmissions or components needed to support the overhaul schedule. Two methods based on the two-parameter Weibull statistical distribution for component life are used to estimate the time between transmission overhauls. These methods predict transmission lives for maintenance schedules which repair the transmission with a complete system replacement or repair only failed components of the transmission. An example illustrates the methods.
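The two-parameter Weibull model mentioned above gives a component reliability of R(t) = exp(-(t/theta)^beta), with shape beta and characteristic life theta. The sketch below is not a reconstruction of the report's two methods. It only shows, under the assumption that the transmission fails when any component fails (a series system of independent components), how a system life at a target reliability can be computed. All component parameters are hypothetical.

```python
import math

def weibull_reliability(t, shape, scale):
    """Two-parameter Weibull survival probability exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))

def system_reliability(t, components):
    """Series system: the product of independent component reliabilities."""
    r = 1.0
    for shape, scale in components:
        r *= weibull_reliability(t, shape, scale)
    return r

def life_at_reliability(components, target, t_hi=1e7, tol=1e-6):
    """Bisection for the time at which system reliability equals target.

    Assumes system_reliability(t_hi) is below target; reliability is
    monotonically decreasing in t, so bisection is safe.
    """
    lo, hi = 0.0, t_hi
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if system_reliability(mid, components) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Hypothetical (shape, characteristic life in hours) pairs.
    gears_and_bearings = [(2.5, 12000.0), (1.5, 20000.0), (1.1, 30000.0)]
    print(life_at_reliability(gears_and_bearings, target=0.9))
```

For a single component the result reduces to the closed form theta * (-ln R)^(1/beta), which gives a convenient check of the bisection.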
Zheng, Jie; Rodriguez, Santiago; Laurin, Charles; Baird, Denis; Trela-Larsen, Lea; Erzurumluoglu, Mesut A; Zheng, Yi; White, Jon; Giambartolomei, Claudia; Zabaneh, Delilah; Morris, Richard; Kumari, Meena; Casas, Juan P; Hingorani, Aroon D; Evans, David M; Gaunt, Tom R; Day, Ian N M
2017-01-01
Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients ([Formula: see text]) of the variants. However, haplotypes rather than pairwise [Formula: see text], are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulation with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits height data, HAPRAP performs well with a small training sample size (N < 2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previous reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization). 
The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/ Contact: jie.zheng@bristol.ac.uk or tom.gaunt@bristol.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Vertical integration of basic science in final year of medical education
Rajan, Sudha Jasmine; Jacob, Tripti Meriel; Sathyendra, Sowmya
2016-01-01
Background: Development of health professionals with the ability to integrate, synthesize, and apply knowledge gained through medical college is greatly hampered by the system of delivery that is compartmentalized and piecemeal. There is a need to integrate basic sciences with clinical teaching to enable application in clinical care. Aim: To study the benefit and acceptance of vertical integration of basic science in final year MBBS undergraduate curriculum. Materials and Methods: After Institutional Ethics Clearance, neuroanatomy refresher classes with clinical application to neurological diseases were held as part of the final year posting in two medical units. Feedback was collected. Pre- and post-tests which tested application and synthesis were conducted. Summative assessment was compared with the control group of students who had standard teaching in other two medical units. In-depth interviews were conducted with 2 willing participants and 2 teachers who did neurology bedside teaching. Results: The majority (>80%) found the classes useful and interesting. There was statistically significant improvement in the post-test scores. There was a statistically significant difference between the intervention and control groups' scores during summative assessment (76.2 vs. 61.8, P < 0.01). Students felt that it reinforced, motivated self-directed learning, enabled correlations, improved understanding, put things in perspective, gave confidence, aided application, and enabled them to follow discussions during clinical teaching. Conclusion: Vertical integration of basic science in final year was beneficial and resulted in knowledge gain and improved summative scores. The classes were found to be useful, interesting and thought to help in clinical care and application by majority of students. PMID:27563584
Beyond δ: Tailoring marked statistics to reveal modified gravity
NASA Astrophysics Data System (ADS)
Valogiannis, Georgios; Bean, Rachel
2018-01-01
Models which attempt to explain the accelerated expansion of the universe through large-scale modifications to General Relativity (GR), must satisfy the stringent experimental constraints of GR in the solar system. Viable candidates invoke a “screening” mechanism, that dynamically suppresses deviations in high density environments, making their overall detection challenging even for ambitious future large-scale structure surveys. We present methods to efficiently simulate the non-linear properties of such theories, and consider how a series of statistics that reweight the density field to accentuate deviations from GR can be applied to enhance the overall signal-to-noise ratio in differentiating the models from GR. Our results demonstrate that the cosmic density field can yield additional, invaluable cosmological information, beyond the simple density power spectrum, that will enable surveys to more confidently discriminate between modified gravity models and ΛCDM.
Privacy-preserving Kruskal-Wallis test.
Guo, Suxin; Zhong, Sheng; Zhang, Aidong
2013-10-01
Statistical tests are powerful tools for data analysis. The Kruskal-Wallis test is a non-parametric statistical test that evaluates whether two or more samples are drawn from the same distribution. It is commonly used in various areas. However, its use is sometimes impeded by privacy issues raised in fields such as biomedical research and clinical data analysis because of the confidential information contained in the data. In this work, we give a privacy-preserving solution for the Kruskal-Wallis test which enables two or more parties to jointly perform the test on the union of their data without compromising their data privacy. To the best of our knowledge, this is the first work that solves the privacy issues in the use of the Kruskal-Wallis test on distributed data. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
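For reference, the statistic that the parties would need to evaluate jointly is the Kruskal-Wallis H: pool all observations, rank them (tied values share the average rank), and compare per-group rank sums. The sketch below computes the ordinary, non-private H with the standard tie correction. It does not implement the paper's privacy-preserving protocol, and the function name and sample data are illustrative assumptions.

```python
def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more samples.

    H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), divided by
    C = 1 - sum(t^3 - t) / (N^3 - N), where t runs over tie-group sizes.
    """
    # Pool the data, remembering which group each value came from.
    pooled = sorted((value, gi)
                    for gi, group in enumerate(groups)
                    for value in group)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0

    i = 0
    while i < n_total:
        # Find the run of equal values pooled[i:j].
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j

    h = (12.0 / (n_total * (n_total + 1))
         * sum(rs * rs / len(g) for rs, g in zip(rank_sums, groups))
         - 3.0 * (n_total + 1))
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    if correction == 0.0:
        return 0.0  # every observation is identical; no evidence of difference
    return h / correction

if __name__ == "__main__":
    print(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

Under the null hypothesis H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom, which is how a p-value is usually obtained.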
Statistical Modeling of Single Target Cell Encapsulation
Moon, SangJun; Ceyhan, Elvan; Gurkan, Umut Atakan; Demirci, Utkan
2011-01-01
High-throughput drop-on-demand systems for separation and encapsulation of individual target cells from heterogeneous mixtures of multiple cell types are an emerging method in biotechnology with broad applications in tissue engineering and regenerative medicine, genomics, and cryobiology. However, cell encapsulation in droplets is a random process that is hard to control. Statistical models can provide an understanding of the underlying processes and estimation of the relevant parameters, and enable reliable and repeatable control over the encapsulation of cells in droplets during the isolation process with a high confidence level. We have modeled and experimentally verified a microdroplet-based cell encapsulation process for various combinations of cell loading and target cell concentrations. Here, we explain theoretically and validate experimentally a model to isolate and pattern single target cells from heterogeneous mixtures without using complex peripheral systems. PMID:21814548
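One common way to model random encapsulation is to treat the number of cells per droplet as Poisson distributed. The abstract does not state which distribution the authors used, so the Poisson assumption, the function names, and the numbers below are all illustrative. Under that assumption, the probability that a droplet holds exactly one cell, and that this cell is a target, is lambda * exp(-lambda) * p, where lambda is the mean number of cells per droplet and p is the target-cell fraction.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet with mean loading lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def prob_single_target_only(mean_cells, target_fraction):
    """Probability that a droplet contains exactly one cell, a target.

    Assumes Poisson loading and that each cell is independently a target
    with probability target_fraction.
    """
    return poisson_pmf(1, mean_cells) * target_fraction

if __name__ == "__main__":
    # Hypothetical loadings: mean cells per droplet, with 10% target cells.
    for lam in (0.5, 1.0, 2.0):
        print(lam, prob_single_target_only(lam, 0.10))
```

Under this model the single-cell probability lambda * exp(-lambda) peaks at lambda = 1, so raising the cell loading beyond one cell per droplet on average reduces the yield of single-cell droplets.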
Robust synthesis and continuous manufacturing of carbon nanotube forests and graphene films
NASA Astrophysics Data System (ADS)
Polsen, Erik S.
Successful translation of the outstanding properties of carbon nanotubes (CNTs) and graphene to commercial applications requires highly consistent methods of synthesis, using scalable and cost-effective machines. This thesis presents robust process conditions and a series of process operations that will enable integrated roll-to-roll (R2R) CNT and graphene growth on flexible substrates. First, a comprehensive study was undertaken to establish the sources of variation in laboratory CVD growth of CNT forests. Statistical analysis identified factors that contribute to variation in forest height and density including ambient humidity, sample position in the reactor, and barometric pressure. Implementation of system modifications and user procedures reduced the variation in height and density by 50% and 54% respectively. With improved growth, two new methods for continuous deposition and patterning of catalyst nanoparticles for CNT forest growth were developed, enabling the diameter, density and pattern geometry to be tailored through the control of process parameters. Convective assembly of catalyst nanoparticles in solution enables growth of CNT forests with density 3-fold higher than using sputtered catalyst films with the same growth parameters. Additionally, laser printing of magnetic ink character recognition toner provides a large scale patterning method, with digital control of the pattern density and tunable CNT density via laser intensity. A concentric tube CVD reactor was conceptualized, designed and built for R2R growth of CNT forests and graphene on flexible substrates helically fed through the annular gap. The design enables downstream injection of the hydrocarbon source, and gas consumption is reduced 90% compared to a standard tube furnace. Multi-wall CNT forests are grown continuously on metallic and ceramic fiber substrates at 33 mm/min. High quality, uniform bi- and multi-layer graphene is grown on Cu and Ni foils at 25 - 495 mm/min. 
A second machine for continuous forest growth and delamination was developed; and forest-substrate adhesion strength was controlled through CVD parameters. Taken together, these methods enable uniform R2R processing of CNT forests and graphene with engineered properties. Last, it is projected that foreseeable improvements in CNT forest quality and density using these methods will result in electrical and thermal properties that exceed state-of-the-art bulk materials.
Modality-Driven Classification and Visualization of Ensemble Variance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bensema, Kevin; Gosink, Luke; Obermaier, Harald
Advances in computational power now enable domain scientists to address conceptual and parametric uncertainty by running simulations multiple times in order to sufficiently sample the uncertain input space. While this approach helps address conceptual and parametric uncertainties, the ensemble datasets produced by this technique present a special challenge to visualization researchers as the ensemble dataset records a distribution of possible values for each location in the domain. Contemporary visualization approaches that rely solely on summary statistics (e.g., mean and variance) cannot convey the detailed information encoded in ensemble distributions that are paramount to ensemble analysis; summary statistics provide no information about modality classification and modality persistence. To address this problem, we propose a novel technique that classifies high-variance locations based on the modality of the distribution of ensemble predictions. Additionally, we develop a set of confidence metrics to inform the end-user of the quality of fit between the distribution at a given location and its assigned class. We apply a similar method to time-varying ensembles to illustrate the relationship between peak variance and bimodal or multimodal behavior. These classification schemes enable a deeper understanding of the behavior of the ensemble members by distinguishing between distributions that can be described by a single tendency and distributions which reflect divergent trends in the ensemble.
Wardrop, N A; Jochem, W C; Bird, T J; Chamberlain, H R; Clarke, D; Kerr, D; Bengtsson, L; Juran, S; Seaman, V; Tatem, A J
2018-04-03
Population numbers at local levels are fundamental data for many applications, including the delivery and planning of services, election preparation, and response to disasters. In resource-poor settings, recent and reliable demographic data at subnational scales can often be lacking. National population and housing census data can be outdated, inaccurate, or missing key groups or areas, while registry data are generally lacking or incomplete. Moreover, at local scales accurate boundary data are often limited, and high rates of migration and urban growth make existing data quickly outdated. Here we review past and ongoing work aimed at producing spatially disaggregated local-scale population estimates, and discuss how new technologies are now enabling robust and cost-effective solutions. Recent advances in the availability of detailed satellite imagery, geopositioning tools for field surveys, statistical methods, and computational power are enabling the development and application of approaches that can estimate population distributions at fine spatial scales across entire countries in the absence of census data. We outline the potential of such approaches as well as their limitations, emphasizing the political and operational hurdles for acceptance and sustainable implementation of new approaches, and the continued importance of traditional sources of national statistical data. Copyright © 2018 the Author(s). Published by PNAS.
SBCDDB: Sleeping Beauty Cancer Driver Database for gene discovery in mouse models of human cancers
Mann, Michael B
2018-01-01
Abstract Large-scale oncogenomic studies have identified few frequently mutated cancer drivers and hundreds of infrequently mutated drivers. Defining the biological context for rare driving events is fundamentally important to increasing our understanding of the druggable pathways in cancer. Sleeping Beauty (SB) insertional mutagenesis is a powerful gene discovery tool used to model human cancers in mice. Our lab and others have published a number of studies that identify cancer drivers from these models using various statistical and computational approaches. Here, we have integrated SB data from primary tumor models into an analysis and reporting framework, the Sleeping Beauty Cancer Driver DataBase (SBCDDB, http://sbcddb.moffitt.org), which identifies drivers in individual tumors or tumor populations. Unique to this effort, the SBCDDB utilizes a single, scalable, statistical analysis method that enables data to be grouped by different biological properties. This allows for SB drivers to be evaluated (and re-evaluated) under different contexts. The SBCDDB provides visual representations highlighting the spatial attributes of transposon mutagenesis and couples this functionality with analysis of gene sets, enabling users to interrogate relationships between drivers. The SBCDDB is a powerful resource for comparative oncogenomic analyses with human cancer genomics datasets for driver prioritization. PMID:29059366
Statistical technique for analysing functional connectivity of multiple spike trains.
Masud, Mohammad Shahed; Borisyuk, Roman
2011-03-15
A new statistical technique, the Cox method, used for analysing functional connectivity of simultaneously recorded multiple spike trains is presented. This method is based on the theory of modulated renewal processes and it estimates a vector of influence strengths from multiple spike trains (called reference trains) to the selected (target) spike train. Selecting another target spike train and repeating the calculation of the influence strengths from the reference spike trains enables researchers to find all functional connections among multiple spike trains. In order to study functional connectivity an "influence function" is identified. This function recognises the specificity of neuronal interactions and reflects the dynamics of postsynaptic potential. In comparison to existing techniques, the Cox method has the following advantages: it does not use bins (binless method); it is applicable to cases where the sample size is small; it is sufficiently sensitive such that it estimates weak influences; it supports the simultaneous analysis of multiple influences; it is able to identify a correct connectivity scheme in difficult cases of "common source" or "indirect" connectivity. The Cox method has been thoroughly tested using multiple sets of data generated by the neural network model of the leaky integrate and fire neurons with a prescribed architecture of connections. The results suggest that this method is highly successful for analysing functional connectivity of simultaneously recorded multiple spike trains. Copyright © 2011 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Spampinato, A.; Axinte, D. A.
2017-12-01
The mechanisms of interaction between bodies with statistically arranged features present characteristics common to different abrasive processes, such as dressing of abrasive tools. In contrast with the current empirical approach used to estimate the results of operations based on attritive interactions, the method we present in this paper allows us to predict the output forces and the topography of a simulated grinding wheel for a set of specific operational parameters (speed ratio and radial feed-rate), providing a thorough understanding of the complex mechanisms regulating these processes. In modelling the dressing mechanisms, the abrasive characteristics of both bodies (grain size, geometry, inter-space and protrusion) are first simulated; thus, their interaction is simulated in terms of grain collisions. Exploiting a specifically designed contact/impact evaluation algorithm, the model simulates the collisional effects of the dresser abrasives on the grinding wheel topography (grain fracture/break-out). The method has been tested for the case of a diamond rotary dresser, predicting output forces within less than 10% error and obtaining experimentally validated grinding wheel topographies. The study provides a fundamental understanding of the dressing operation, enabling the improvement of its performance in an industrial scenario, while being of general interest in modelling collision-based processes involving statistically distributed elements.
Bayesian Orbit Computation Tools for Objects on Geocentric Orbits
NASA Astrophysics Data System (ADS)
Virtanen, J.; Granvik, M.; Muinonen, K.; Oszkiewicz, D.
2013-08-01
We consider the space-debris orbital inversion problem via the concept of Bayesian inference. The methodology was put forward for the orbital analysis of solar system small bodies in the early 1990s [7] and results in a full solution of the statistical inverse problem given in terms of an a posteriori probability density function (PDF) for the orbital parameters. We demonstrate the applicability of our statistical orbital analysis software to Earth orbiting objects, using both well-established Monte Carlo (MC) techniques (for a review, see e.g. [13]) and recently developed Markov-chain MC (MCMC) techniques (e.g., [9]). In particular, we exploit the novel virtual observation MCMC method [8], which is based on the characterization of the phase-space volume of orbital solutions before the actual MCMC sampling. Our statistical methods and the resulting PDFs immediately enable probabilistic impact predictions to be carried out. Furthermore, this can also be readily done for very sparse data sets and data sets of poor quality, provided that some a priori information on the observational uncertainty is available. For asteroids, impact probabilities with the Earth from the discovery night onwards have been provided, e.g., by [11] and [10]; the latter study includes the sampling of the observational-error standard deviation as a random variable.
Bingi, Jayachandra; Murukeshan, Vadakke Matham
2015-12-18
A laser speckle pattern is a granular structure formed by random coherent wavelet interference and is generally considered noise in optical systems, including photolithography. In contrast, in this paper we use the speckle pattern to generate predictable and controlled Gaussian random structures and quasi-random structures photo-lithographically. The random structures made using the proposed speckle lithography technique are quantified based on speckle statistics, the radial distribution function (RDF) and the fast Fourier transform (FFT). Control over the speckle size, density and speckle clustering facilitates the successful fabrication of black silicon with different surface structures. The controllability and tunability of randomness make this technique a robust method for fabricating predictable 2D Gaussian random structures and black silicon structures. These structures can significantly enhance light trapping in solar cells and hence enable improved energy harvesting. Further, this technique can enable efficient fabrication of disordered photonic structures and random-media-based devices.
Autonomy and Housing Accessibility Among Powered Mobility Device Users
Brandt, Åse; Lexell, Eva Månsson; Iwarsson, Susanne
2015-01-01
OBJECTIVE. To describe environmental barriers, accessibility problems, and powered mobility device (PMD) users’ autonomy indoors and outdoors; to determine the home environmental barriers that generated the most housing accessibility problems indoors, at entrances, and in the close exterior surroundings; and to examine personal factors and environmental components and their association with indoor and outdoor autonomy. METHOD. This cross-sectional study was based on data collected from a sample of 48 PMD users with a spinal cord injury (SCI) using the Impact of Participation and Autonomy and the Housing Enabler instruments. Descriptive statistics and logistic regression were used. RESULTS. More years living with SCI predicted less restriction in autonomy indoors, whereas more functional limitations and accessibility problems related to entrance doors predicted more restriction in autonomy outdoors. CONCLUSION. To enable optimized PMD use, practitioners must pay attention to the relationship between client autonomy and housing accessibility problems. PMID:26356666
Predicting hospital visits from geo-tagged Internet search logs.
Agarwal, Vibhu; Han, Lichy; Madan, Isaac; Saluja, Shaurya; Shidham, Aaditya; Shah, Nigam H
2016-01-01
The steady rise in healthcare costs has deprived over 45 million Americans of healthcare services (1, 2) and has encouraged healthcare providers to look for opportunities to improve their operational efficiency. Prior studies have shown that evidence of healthcare seeking intent in Internet searches correlates well with healthcare resource utilization. Given the ubiquitous nature of mobile Internet search, we hypothesized that analyzing geo-tagged mobile search logs could enable us to machine-learn predictors of future patient visits. Using a de-identified dataset of geo-tagged mobile Internet search logs, we mined text and location patterns that are predictors of healthcare resource utilization and built statistical models that predict the probability of a user's future visit to a medical facility. Our efforts will enable the development of innovative methods for modeling and optimizing the use of healthcare resources, a crucial prerequisite for securing healthcare access for everyone in the days to come.
Ryu, Jihye; Torres, Elizabeth B.
2018-01-01
The field of enacted/embodied cognition has emerged as a contemporary attempt to connect the mind and body in the study of cognition. However, there has been a paucity of methods that enable a multi-layered approach tapping into different levels of functionality within the nervous systems (e.g., continuously capturing in tandem multi-modal biophysical signals in naturalistic settings). The present study introduces a new theoretical and statistical framework to characterize the influences of cognitive demands on biophysical rhythmic signals harnessed from deliberate, spontaneous and autonomic activities. In this study, nine participants performed a basic pointing task to communicate a decision while they were exposed to different levels of cognitive load. Within these decision-making contexts, we examined the moment-by-moment fluctuations in the peak amplitude and timing of the biophysical time series data (e.g., continuous waveforms extracted from hand kinematics and heart signals). These spike-trains data offered high statistical power for personalized empirical statistical estimation and were well-characterized by a Gamma process. Our approach enabled the identification of different empirically estimated families of probability distributions to facilitate inference regarding the continuous physiological phenomena underlying cognitively driven decision-making. We found that the same pointing task revealed shifts in the probability distribution functions (PDFs) of the hand kinematic signals under study and were accompanied by shifts in the signatures of the heart inter-beat-interval timings. Within the time scale of an experimental session, marked changes in skewness and dispersion of the distributions were tracked on the Gamma parameter plane with 95% confidence. The results suggest that traditional theoretical assumptions of stationarity and normality in biophysical data from the nervous systems are incongruent with the true statistical nature of empirical data. 
This work offers a unifying platform for personalized statistical inference that goes far beyond those used in conventional studies, often assuming a “one size fits all model” on data drawn from discrete events such as mouse clicks, and observations that leave out continuously co-occurring spontaneous activity taking place largely beneath awareness. PMID:29681805
PCTO-SIM: Multiple-point geostatistical modeling using parallel conditional texture optimization
NASA Astrophysics Data System (ADS)
Pourfard, Mohammadreza; Abdollahifard, Mohammad J.; Faez, Karim; Motamedi, Sayed Ahmad; Hosseinian, Tahmineh
2017-05-01
Multiple-point Geostatistics is a well-known general statistical framework by which complex geological phenomena have been modeled efficiently. Pixel-based and patch-based methods are its two major categories. In this paper, we use the optimization-based category, whose counterpart in texture synthesis is texture optimization. Our extended version of texture optimization uses the energy concept to model geological phenomena. While honoring the hard points, the minimization of our proposed cost function forces simulation grid pixels to be as similar as possible to training images. Our algorithm has a self-enrichment capability and creates a richer training database from a sparser one by mixing the information of all surrounding patches of the simulation nodes. Therefore, it preserves pattern continuity in both continuous and categorical variables very well. Each realization also shows a fuzzy result similar to the expected result of multiple realizations of other statistical models. While the main core of most previous Multiple-point Geostatistics methods is sequential, the parallel main core of our algorithm enables it to use the GPU efficiently to reduce the CPU time. A new validation method for MPS is also proposed in this paper.
Assessing the fit of site-occupancy models
MacKenzie, D.I.; Bailey, L.L.
2004-01-01
Few species are likely to be so evident that they will always be detected at a site when present. Recently a model has been developed that enables estimation of the proportion of area occupied, when the target species is not detected with certainty. Here we apply this modeling approach to data collected on terrestrial salamanders in the Plethodon glutinosus complex in the Great Smoky Mountains National Park, USA, and wish to address the question 'how accurately does the fitted model represent the data?' The goodness-of-fit of the model needs to be assessed in order to make accurate inferences. This article presents a method where a simple Pearson chi-square statistic is calculated and a parametric bootstrap procedure is used to determine whether the observed statistic is unusually large. We found evidence that the most global model considered provides a poor fit to the data, hence estimated an overdispersion factor to adjust model selection procedures and inflate standard errors. Two hypothetical datasets with known assumption violations are also analyzed, illustrating that the method may be used to guide researchers to making appropriate inferences. The results of a simulation study are presented to provide a broader view of the method's properties.
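The Pearson chi-square plus parametric bootstrap procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: to stay short it replaces the site-occupancy model with a plain binomial detection model (a constant detection probability and no occupancy parameter), and the function names `pearson_chi2` and `bootstrap_gof` are hypothetical. The overdispersion estimate shown (observed statistic divided by the mean bootstrap statistic) is one common choice and is assumed here.

```python
import math
import random


def pearson_chi2(detections, n_visits):
    """Fit a constant detection probability and return (chi-square, p_hat).

    detections: number of detections at each site, each between 0 and n_visits.
    Simplifying assumption: detections per site are Binomial(n_visits, p).
    """
    n_sites = len(detections)
    p_hat = sum(detections) / (n_sites * n_visits)
    observed = [0] * (n_visits + 1)
    for d in detections:
        observed[d] += 1
    chi2 = 0.0
    for d in range(n_visits + 1):
        expected = (n_sites * math.comb(n_visits, d)
                    * p_hat ** d * (1 - p_hat) ** (n_visits - d))
        if expected > 0:
            chi2 += (observed[d] - expected) ** 2 / expected
    return chi2, p_hat


def bootstrap_gof(detections, n_visits, n_boot=500, seed=0):
    """Parametric bootstrap: simulate from the fitted model, refit, recompute."""
    rng = random.Random(seed)
    chi2_obs, p_hat = pearson_chi2(detections, n_visits)
    boot = []
    for _ in range(n_boot):
        sim = [sum(rng.random() < p_hat for _ in range(n_visits))
               for _ in detections]
        boot.append(pearson_chi2(sim, n_visits)[0])
    # Share of simulated statistics at least as large as the observed one.
    p_value = sum(b >= chi2_obs for b in boot) / n_boot
    mean_boot = sum(boot) / n_boot
    c_hat = chi2_obs / mean_boot if mean_boot > 0 else float("nan")
    return chi2_obs, p_value, c_hat


if __name__ == "__main__":
    # Hypothetical data: sites are either never or always detected (overdispersed).
    data = [0] * 20 + [5] * 20
    print(bootstrap_gof(data, n_visits=5, n_boot=200, seed=1))
```

A small bootstrap p-value indicates that the observed statistic is unusually large under the fitted model; a `c_hat` well above 1 would then be used to inflate standard errors, as the abstract describes.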
Some Dimensions of Data Quality in Statistical Systems
DOT National Transportation Integrated Search
1997-07-01
An important objective of a statistical data system is to enable users of the data to recommend (an organization to take) rational action for solving problems or for improving quality of service of manufactured product. With this view in mind, this ...
NASA Astrophysics Data System (ADS)
Borycki, Dawid; Kholiqov, Oybek; Zhou, Wenjun; Srinivasan, Vivek J.
2017-03-01
Sensing and imaging methods based on the dynamic scattering of coherent light, including laser speckle, laser Doppler, and diffuse correlation spectroscopy quantify scatterer motion using light intensity (speckle) fluctuations. The underlying optical field autocorrelation (OFA), rather than being measured directly, is typically inferred from the intensity autocorrelation (IA) through the Siegert relationship, by assuming that the scattered field obeys Gaussian statistics. In this work, we demonstrate interferometric near-infrared spectroscopy (iNIRS) for measurement of time-of-flight (TOF) resolved field and intensity autocorrelations in fluid tissue phantoms and in vivo. In phantoms, we find a breakdown of the Siegert relationship for short times-of-flight due to a contribution from static paths whose optical field does not decorrelate over experimental time scales, and demonstrate that eliminating such paths by polarization gating restores the validity of the Siegert relationship. Inspired by these results, we developed a method, called correlation gating, for separating the OFA into static and dynamic components. Correlation gating enables more precise quantification of tissue dynamics. To prove this, we show that iNIRS and correlation gating can be applied to measure cerebral hemodynamics of the nude mouse in vivo using dynamically scattered (ergodic) paths and not static (non-ergodic) paths, which may not be impacted by blood. More generally, correlation gating, in conjunction with TOF resolution, enables more precise separation of diffuse and non-diffusive contributions to OFA than is possible with TOF resolution alone. Finally, we show that direct measurements of OFA are statistically more efficient than indirect measurements based on IA.
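For reference, the Siegert relationship mentioned above is commonly written as shown below. The notation is assumed here and not taken from the abstract: g_1 is the normalized optical field autocorrelation, g_2 the normalized intensity autocorrelation, tau the time lag, and beta a coherence factor between 0 and 1 that depends on the detection setup.

```latex
g_2(\tau) = 1 + \beta \left| g_1(\tau) \right|^2
```

The relation presupposes a zero-mean Gaussian scattered field. A field component from static paths that does not decorrelate violates that assumption, which is consistent with the breakdown at short times-of-flight reported in the abstract.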
Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu
2017-12-07
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Bakal, Tomas; Janata, Jiri; Sabova, Lenka; Grabic, Roman; Zlabek, Vladimir; Najmanova, Lucie
2018-06-16
A robust and widely applicable method for sampling of aquatic microbial biofilm and further sample processing is presented. The method is based on next-generation sequencing of V4-V5 variable regions of 16S rRNA gene and further statistical analysis of sequencing data, which could be useful not only to investigate taxonomic composition of biofilm bacterial consortia but also to assess aquatic ecosystem health. Five artificial materials commonly used for biofilm growth (glass, stainless steel, aluminum, polypropylene, polyethylene) were tested to determine the one giving the most robust and reproducible results. The effect of the sampler material used on total microbial composition was not statistically significant; however, the non-plastic materials (glass, metal) gave more stable outputs without irregularities among sample parallels. The bias of the method is assessed with respect to the employment of a non-quantitative step (PCR amplification) to obtain quantitative results (relative abundance of identified taxa). This aspect is often overlooked in ecological and medical studies. We document that sequencing a mixture of three merged primary PCR reactions for each sample, and then evaluating median values from three technical replicates for each sample, makes it possible to overcome this bias and gives robust and repeatable results that distinguish well among sampling localities and seasons.
Advances in Time Estimation Methods for Molecular Data.
Kumar, Sudhir; Hedges, S Blair
2016-04-01
Molecular dating has become central to placing a temporal dimension on the tree of life. Methods for estimating divergence times have been developed for over 50 years, beginning with the proposal of the molecular clock in 1962. We categorize the chronological development of these methods into four generations based on the timing of their origin. In the first generation approaches (1960s-1980s), a strict molecular clock was assumed to date divergences. In the second generation approaches (1990s), the equality of evolutionary rates between species was first tested and then a strict molecular clock applied to estimate divergence times. The third generation approaches (since ∼2000) account for differences in evolutionary rates across the tree by using a statistical model, obviating the need to assume a clock or to test the equality of evolutionary rates among species. Bayesian methods in the third generation require a specific or uniform prior on the speciation process and enable the inclusion of uncertainty in clock calibrations. The fourth generation approaches (since 2012) allow rates to vary from branch to branch, but do not need prior selection of a statistical model to describe the rate variation or the specification of a speciation model. With high accuracy, comparable to Bayesian approaches, and speeds that are orders of magnitude faster, fourth generation methods are able to produce reliable timetrees of thousands of species using genome scale data. We found that early time estimates from second generation studies are similar to those of third and fourth generation studies, indicating that methodological advances have not fundamentally altered the timetree of life, but rather have facilitated time estimation by enabling the inclusion of more species. 
Nonetheless, we feel an urgent need for testing the accuracy and precision of third and fourth generation methods, including their robustness to misspecification of priors in the analysis of large phylogenies and data sets. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
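As a worked illustration of the strict molecular clock assumed by the first-generation approaches: if two lineages accumulate substitutions at the same constant rate r (substitutions per site per unit time) and d is their pairwise sequence distance (substitutions per site), the divergence time t follows from the fact that both lineages contribute to d. The symbols and the numbers below are illustrative assumptions, not values from the abstract.

```latex
t = \frac{d}{2r},
\qquad \text{e.g. } d = 0.02,\; r = 10^{-9}\ \text{per site per year}
\;\Rightarrow\; t = \frac{0.02}{2 \times 10^{-9}} = 10^{7}\ \text{years}
```

The later generations described above relax the single-rate assumption, so this formula applies only to the strict-clock case.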
Smith, W Brad; Cuenca Lara, Rubí Angélica; Delgado Caballero, Carina Edith; Godínez Valdivia, Carlos Isaías; Kapron, Joseph S; Leyva Reyes, Juan Carlos; Meneses Tovar, Carmen Lourdes; Miles, Patrick D; Oswalt, Sonja N; Ramírez Salgado, Mayra; Song, Xilong Alex; Stinson, Graham; Villela Gaytán, Sergio Armando
2018-05-21
Forests cannot be managed sustainably without reliable data to inform decisions. National Forest Inventories (NFI) tend to report national statistics, with sub-national stratification based on domestic ecological classification systems. It is becoming increasingly important to be able to report statistics on ecosystems that span international borders, as global change and globalization expand stakeholders' spheres of concern. The state of a transnational ecosystem can only be properly assessed by examining the entire ecosystem. In global forest resource assessments, it may be useful to break national statistics down by ecosystem, especially for large countries. The Inventory and Monitoring Working Group (IMWG) of the North American Forest Commission (NAFC) has begun developing a harmonized North American Forest Database (NAFD) for managing forest inventory data, enabling consistent, continental-scale forest assessment supporting ecosystem-level reporting and relational queries. The first iteration of the database contains data describing 1.9 billion ha, including 677.5 million ha of forest. Data harmonization is made challenging by the existence of definitions and methodologies tailored to suit national circumstances, emerging from each country's professional forestry development. This paper reports the methods used to synchronize three national forest inventories, starting with a small suite of variables and attributes.
Enhancing pediatric clinical trial feasibility through the use of Bayesian statistics.
Huff, Robin A; Maca, Jeff D; Puri, Mala; Seltzer, Earl W
2017-11-01
Background: Pediatric clinical trials commonly experience recruitment challenges including limited number of patients and investigators, inclusion/exclusion criteria that further reduce the patient pool, and a competitive research landscape created by pediatric regulatory commitments. To overcome these challenges, innovative approaches are needed. Methods: This article explores the use of Bayesian statistics to improve pediatric trial feasibility, using pediatric Type-2 diabetes as an example. Data for six therapies approved for adults were used to perform simulations to determine the impact on pediatric trial size. Results: When the number of adult patients contributing to the simulation was assumed to be the same as the number of patients to be enrolled in the pediatric trial, the pediatric trial size was reduced by 75-78% when compared with a frequentist statistical approach, but was associated with a 34-45% false-positive rate. In subsequent simulations, greater control was exerted over the false-positive rate by decreasing the contribution of the adult data. A 30-33% reduction in trial size was achieved when false-positives were held to less than 10%. Conclusion: Reducing the trial size through the use of Bayesian statistics would facilitate completion of pediatric trials, enabling drugs to be labeled appropriately for children.
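The idea of borrowing adult data while controlling how much it contributes can be sketched with a power prior on a binary response rate. This is an assumption-laden toy, not the article's simulation: the abstract does not state the endpoint or the prior form, and the function names, the discount parameter `a0`, and the example counts are all hypothetical.

```python
def power_prior_posterior(x_adult, n_adult, x_ped, n_ped, a0,
                          prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for a response rate.

    Adult data (x_adult responders out of n_adult) are down-weighted by a0
    in [0, 1]; a0 = 0 ignores them, a0 = 1 pools them fully with the
    pediatric data (x_ped responders out of n_ped).
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    alpha = prior_a + a0 * x_adult + x_ped
    beta = prior_b + a0 * (n_adult - x_adult) + (n_ped - x_ped)
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # Hypothetical counts: 60/100 adult responders, 10/20 pediatric responders.
    for a0 in (0.0, 0.5, 1.0):
        a, b = power_prior_posterior(60, 100, 10, 20, a0)
        print(a0, a, b, round(beta_mean(a, b), 3))
```

Lowering `a0` reduces the effective amount of adult information, which mirrors the trade-off in the abstract: less borrowing means a larger pediatric trial but a lower false-positive rate.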
Enabling a Comprehensive Teaching Strategy: Video Lectures
ERIC Educational Resources Information Center
Brecht, H. David; Ogilby, Suzanne M.
2008-01-01
This study empirically tests the feasibility and effectiveness of video lectures as a form of video instruction that enables a comprehensive teaching strategy used throughout a traditional classroom course. It examines student use patterns and the videos' effects on student learning, using qualitative and nonparametric statistical analyses of…
ERIC Educational Resources Information Center
Lovett, Jennifer N.; Lee, Hollylynne S.
2017-01-01
The purpose of this paper is to present a multitechnology-enabled lesson used with secondary preservice mathematics teachers to develop their technological pedagogical statistical knowledge. This lesson engages preservice teachers in a statistics lesson aimed at developing their reasoning about the measurement units of data using TinkerPlots and…
ERIC Educational Resources Information Center
Riskowski, Jody L.; Olbricht, Gayla; Wilson, Jennifer
2010-01-01
Statistics is the art and science of gathering, analyzing, and making conclusions from data. However, many people do not fully understand how to interpret statistical results and conclusions. Placing students in a collaborative environment involving project-based learning may enable them to overcome misconceptions of probability and enhance the…
Heterogeneous variances in multi-environment yield trials for corn hybrids
USDA-ARS's Scientific Manuscript database
Recent developments in statistics and computing have enabled much greater levels of complexity in statistical models of multi-environment yield trial data. One particular feature of interest to breeders is simultaneously modeling heterogeneity of variances among environments and cultivars. Our obj...
2017-01-01
Statistical approaches to emergent knowledge have tended to focus on the process by which experience of individual episodes accumulates into generalizable experience across episodes. However, there is a seemingly opposite, but equally critical, process that such experience affords: the process by which, from a space of types (e.g. onions—a semantic class that develops through exposure to individual episodes involving individual onions), we can perceive or create, on-the-fly, a specific token (a specific onion, perhaps one that is chopped) in the absence of any prior perceptual experience with that specific token. This article reviews a selection of statistical learning studies that lead to the speculation that this process—the generation, on the basis of semantic memory, of a novel episodic representation—is itself an instance of a statistical, in fact associative, process. The article concludes that the same processes that enable statistical abstraction across individual episodes to form semantic memories also enable the generation, from those semantic memories, of representations that correspond to individual tokens, and of novel episodic facts about those tokens. Statistical learning is a window onto these deeper processes that underpin cognition. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’. PMID:27872378
Altmann, Gerry T M
2017-01-05
Statistical approaches to emergent knowledge have tended to focus on the process by which experience of individual episodes accumulates into generalizable experience across episodes. However, there is a seemingly opposite, but equally critical, process that such experience affords: the process by which, from a space of types (e.g. onions-a semantic class that develops through exposure to individual episodes involving individual onions), we can perceive or create, on-the-fly, a specific token (a specific onion, perhaps one that is chopped) in the absence of any prior perceptual experience with that specific token. This article reviews a selection of statistical learning studies that lead to the speculation that this process-the generation, on the basis of semantic memory, of a novel episodic representation-is itself an instance of a statistical, in fact associative, process. The article concludes that the same processes that enable statistical abstraction across individual episodes to form semantic memories also enable the generation, from those semantic memories, of representations that correspond to individual tokens, and of novel episodic facts about those tokens. Statistical learning is a window onto these deeper processes that underpin cognition. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).
[Sem: a suitable statistical software adapted for research in oncology].
Kwiatkowski, F; Girard, M; Hacene, K; Berlie, J
2000-10-01
Many software packages have been adapted for medical use, but they rarely handle both data management and statistics conveniently. A recent cooperative effort has produced a new package, Sem (Statistics Epidemiology Medicine), which supports data management for trials as well as statistical analysis of the data. It can be used by non-statisticians (biologists, doctors, researchers, data managers) because, except for multivariate models, the software usually selects the most appropriate test itself, after which complementary tests can be requested if needed. The Sem database manager (DBM) is not compatible with common DBMs, which constitutes a first protection against loss of privacy. Other safeguards (passwords, encryption...) strengthen data security, which is all the more necessary now that Sem can be run on computer networks. The data organization allows multiple records: forms can be duplicated for each patient. Dates are handled in a special but transparent manner (sorting, date and delay calculations...). Sem communicates with common desktop software, often through a simple copy/paste. Statistics can thus easily be performed on data stored in external spreadsheets, and slides can be produced by pasting graphs (survival curves...) with a single mouse click. Already in use at more than fifty sites in different hospitals for daily work, this product, combining data management and statistics, appears to be a convenient and innovative solution.
Bayesian evaluation of effect size after replicating an original study
van Aert, Robbie C. M.; van Assen, Marcel A. L. M.
2017-01-01
The vast majority of published results in the literature is statistically significant, which raises concerns about their reliability. The Reproducibility Project Psychology (RPP) and Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. The original study and replication were statistically significant in 36.1% of cases in RPP and 68.8% in EE-RP, suggesting many null effects among the replicated studies. However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method called snapshot hybrid that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant. We first analytically approximate the method's performance, and demonstrate the necessity to control for the original study's significance to enable the accumulation of evidence for a true zero effect. Then we applied the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the included studies in EE-RP are generally larger than in RPP, but that the sample sizes of especially the included studies in RPP are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of the replication akin to power analysis in null hypothesis significance testing and present an easy to use web application (https://rvanaert.shinyapps.io/snapshot/) and R code for applying the method. PMID:28388646
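The core idea described above, posterior probabilities over four candidate effect sizes with the original study's likelihood conditioned on its significance, can be sketched as follows. This is a rough illustration, not the authors' R code or web application. Several choices are assumptions made here: correlations as the effect measure, Fisher z-transformation, candidate effects of 0, 0.1, 0.3 and 0.5, equal prior probabilities, and a cut-off of 1.96 standard errors in the positive direction. The function names are hypothetical.

```python
import math


def norm_pdf(x, mu, sd):
    """Normal density."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def norm_sf(x, mu, sd):
    """Normal upper-tail probability P(X > x)."""
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2.0)))


def snapshot_probs(r_orig, n_orig, r_rep, n_rep,
                   rhos=(0.0, 0.1, 0.3, 0.5), z_crit=1.96):
    """Posterior probability of each candidate effect, with equal priors.

    Assumes the original study was significant in the positive direction,
    so its likelihood is a normal density truncated at the critical value.
    """
    y_o, se_o = math.atanh(r_orig), 1.0 / math.sqrt(n_orig - 3)
    y_r, se_r = math.atanh(r_rep), 1.0 / math.sqrt(n_rep - 3)
    cutoff = z_crit * se_o
    likelihoods = []
    for rho in rhos:
        theta = math.atanh(rho)
        # Probability that the original study would have been significant.
        p_signif = norm_sf(cutoff, theta, se_o)
        lik_orig = norm_pdf(y_o, theta, se_o) / p_signif
        lik_rep = norm_pdf(y_r, theta, se_r)
        likelihoods.append(lik_orig * lik_rep)
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


if __name__ == "__main__":
    # Hypothetical input: a barely significant original, a large null replication.
    print(snapshot_probs(r_orig=0.3, n_orig=50, r_rep=0.0, n_rep=400))
```

Dividing the original study's density by its probability of reaching significance is what lets evidence accumulate for a zero effect; without that correction, a significant original study always pulls the posterior away from zero.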
The Earthquake‐Source Inversion Validation (SIV) Project
Mai, P. Martin; Schorlemmer, Danijel; Page, Morgan T.; Ampuero, Jean-Paul; Asano, Kimiyuki; Causse, Mathieu; Custodio, Susana; Fan, Wenyuan; Festa, Gaetano; Galis, Martin; Gallovic, Frantisek; Imperatori, Walter; Käser, Martin; Malytskyy, Dmytro; Okuwaki, Ryo; Pollitz, Fred; Passone, Luca; Razafindrakoto, Hoby N. T.; Sekiguchi, Haruko; Song, Seok Goo; Somala, Surendra N.; Thingbaijam, Kiran K. S.; Twardzik, Cedric; van Driel, Martin; Vyas, Jagdish C.; Wang, Rongjiang; Yagi, Yuji; Zielke, Olaf
2016-01-01
Finite‐fault earthquake source inversions infer the (time‐dependent) displacement on the rupture surface from geophysical data. The resulting earthquake source models document the complexity of the rupture process. However, multiple source models for the same earthquake, obtained by different research teams, often exhibit remarkable dissimilarities. To address the uncertainties in earthquake‐source inversion methods and to understand strengths and weaknesses of the various approaches used, the Source Inversion Validation (SIV) project conducts a set of forward‐modeling exercises and inversion benchmarks. In this article, we describe the SIV strategy, the initial benchmarks, and current SIV results. Furthermore, we apply statistical tools for quantitative waveform comparison and for investigating source‐model (dis)similarities that enable us to rank the solutions, and to identify particularly promising source inversion approaches. All SIV exercises (with related data and descriptions) and statistical comparison tools are available via an online collaboration platform, and we encourage source modelers to use the SIV benchmarks for developing and testing new methods. We envision that the SIV efforts will lead to new developments for tackling the earthquake‐source imaging problem.
NASA Astrophysics Data System (ADS)
Aldrin, John C.; Lindgren, Eric A.
2018-04-01
This paper expands on the objective and motivation for NDE-based characterization and includes a discussion of the current approach using model-assisted inversion being pursued within the Air Force Research Laboratory (AFRL). This includes a discussion of the multiple model-based methods that can be used, including physics-based models, deep machine learning, and heuristic approaches. The benefits and drawbacks of each method are reviewed and the potential to integrate multiple methods is discussed. Initial successes are included to highlight the ability to obtain quantitative values of damage. Additional steps remaining to realize this capability with statistical metrics of accuracy are discussed, and how these results can be used to enable probabilistic life management is addressed. The outcome of this initiative will realize the long-term desired capability of NDE methods to provide quantitative characterization to accelerate certification of new materials and enhance life management of engineered systems.
Reconstructing metastatic seeding patterns of human cancers
Reiter, Johannes G.; Makohon-Moore, Alvin P.; Gerold, Jeffrey M.; Bozic, Ivana; Chatterjee, Krishnendu; Iacobuzio-Donahue, Christine A.; Vogelstein, Bert; Nowak, Martin A.
2017-01-01
Reconstructing the evolutionary history of metastases is critical for understanding their basic biological principles and has profound clinical implications. Genome-wide sequencing data has enabled modern phylogenomic methods to accurately dissect subclones and their phylogenies from noisy and impure bulk tumour samples at unprecedented depth. However, existing methods are not designed to infer metastatic seeding patterns. Here we develop a tool, called Treeomics, to reconstruct the phylogeny of metastases and map subclones to their anatomic locations. Treeomics infers comprehensive seeding patterns for pancreatic, ovarian, and prostate cancers. Moreover, Treeomics correctly disambiguates true seeding patterns from sequencing artifacts; 7% of variants were misclassified by conventional statistical methods. These artifacts can skew phylogenies by creating illusory tumour heterogeneity among distinct samples. In silico benchmarking on simulated tumour phylogenies across a wide range of sample purities (15–95%) and sequencing depths (25–800×) demonstrates the accuracy of Treeomics compared with existing methods. PMID:28139641
Metaplot: A Novel Stata Graph for Assessing Heterogeneity at a Glance
Poorolajal, J; Mahmoodi, M; Majdzadeh, R; Fotouhi, A
2010-01-01
Background: Heterogeneity is usually a major concern in meta-analysis. Although there are some statistical approaches for assessing variability across studies, here we present a new approach to heterogeneity using “MetaPlot” that investigates the influence of a single study on the overall heterogeneity. Methods: MetaPlot is a two-way (x, y) graph, which can be considered a complementary graphical approach for testing heterogeneity. This method shows graphically as well as numerically the results of an influence analysis, in which Higgins’ I² statistic with its 95% confidence interval (CI) is computed omitting one study in each turn and then plotted against the reciprocal of the standard error (1/SE), or “precision”. In this graph, “1/SE” lies on the x-axis and the “I² results” lie on the y-axis. Results: At first glance at a MetaPlot, one can predict to what extent omission of a single study may influence the overall heterogeneity. The precision on the x-axis enables us to distinguish the size of each trial. The graph describes the I² statistic with 95% CI graphically as well as numerically in one view for prompt comparison. It is possible to implement MetaPlot for meta-analysis of different types of outcome data and summary measures. Conclusion: This method presents a simple graphical approach to identify an outlier and its effect on overall heterogeneity at a glance. We wish to suggest MetaPlot to Stata experts to prepare its module for the software. PMID:23113013
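The influence analysis described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' Stata implementation: it assumes fixed-effect inverse-variance weights, computes only the point estimate of I² from Cochran's Q (the 95% CI is omitted), and the function names and the example data are hypothetical.

```python
def i_squared(effects, std_errors):
    """Higgins' I^2 (in percent) from study effects and their standard errors."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def leave_one_out_i_squared(effects, std_errors):
    """For each study, return (precision of the omitted study, I^2 without it)."""
    points = []
    for k in range(len(effects)):
        rest_effects = effects[:k] + effects[k + 1:]
        rest_ses = std_errors[:k] + std_errors[k + 1:]
        points.append((1.0 / std_errors[k], i_squared(rest_effects, rest_ses)))
    return points


# Hypothetical effect sizes; the last study is a deliberate outlier.
effects = [0.10, 0.12, 0.08, 0.11, 0.90]
std_errors = [0.10, 0.12, 0.09, 0.11, 0.10]

overall = i_squared(effects, std_errors)
loo = leave_one_out_i_squared(effects, std_errors)
for precision, i2 in loo:
    print(f"1/SE = {precision:6.2f}   I^2 without this study = {i2:5.1f}%")
```

Plotting the second element of each pair against the first would give the kind of scatter the abstract describes; a study whose omission makes I² drop sharply is a candidate outlier.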
Deployment of paired pushnets from jet-propelled kayaks to sample ichthyoplankton
Acre, Matthew R.; Grabowski, Timothy B.
2015-01-01
Accessing and effectively sampling the off-channel habitats that are considered crucial for early life stages of freshwater fishes constitute a difficult challenge when common ichthyoplankton survey methods, such as push nets, are used. We describe a new method of deploying push nets from jet-propelled kayaks to enable the sampling of previously inaccessible off-channel habitats. The described rig is also functional in more open and accessible habitats, such as the main channel of rivers or reservoirs. Although further evaluation is necessary to ensure that results are comparable across studies, the described push-net system offers a statistically rigorous methodology that generates replicate samples from a wide range of freshwater habitats that were previously inaccessible to this gear type.
Displaying R spatial statistics on Google dynamic maps with web applications created by Rwui
2012-01-01
Background The R project includes a large variety of packages designed for spatial statistics. Google dynamic maps provide web based access to global maps and satellite imagery. We describe a method for displaying directly the spatial output from an R script on to a Google dynamic map. Methods This is achieved by creating a Java based web application which runs the R script and then displays the results on the dynamic map. In order to make this method easy to implement by those unfamiliar with programming Java based web applications, we have added the method to the options available in the R Web User Interface (Rwui) application. Rwui is an established web application for creating web applications for running R scripts. A feature of Rwui is that all the code for the web application being created is generated automatically so that someone with no knowledge of web programming can make a fully functional web application for running an R script in a matter of minutes. Results Rwui can now be used to create web applications that will display the results from an R script on a Google dynamic map. Results may be displayed as discrete markers and/or as continuous overlays. In addition, users of the web application may select regions of interest on the dynamic map with mouse clicks and the coordinates of the region of interest will automatically be made available for use by the R script. Conclusions This method of displaying R output on dynamic maps is designed to be of use in a number of areas. Firstly it allows statisticians, working in R and developing methods in spatial statistics, to easily visualise the results of applying their methods to real world data. Secondly, it allows researchers who are using R to study health geographics data, to display their results directly onto dynamic maps. 
Thirdly, by creating a web application for running an R script, a statistician can enable users entirely unfamiliar with R to run R coded statistical analyses of health geographics data. Fourthly, we envisage an educational role for such applications. PMID:22998945
Management Ratios 1. For Colleges & Universities.
ERIC Educational Resources Information Center
Minter, John, Ed.
Ratios that enable colleges and universities to select other institutions for comparison are presented. The ratios and underlying data also enable colleges to rank order institutions and to calculate means, quartiles, and ranges for these groups. The data are based on FY 1983 U.S. Department of Education Statistics. The ratios summarize the…
A new quantitative approach to measure perceived work-related stress in Italian employees.
Cevenini, Gabriele; Fratini, Ilaria; Gambassi, Roberto
2012-09-01
We propose a method for a reliable quantitative measure of subjectively perceived occupational stress applicable in any company to enhance occupational safety and psychosocial health, to enable precise prevention policies and intervention and to improve work quality and efficiency. A suitable questionnaire was telephonically administered to a stratified sample of the whole Italian population of employees. Combined multivariate statistical methods, including principal component, cluster and discriminant analyses, were used to identify risk factors and to design a causal model for understanding work-related stress. The model explained the causal links of stress through employee perception of imbalance between job demands and resources for responding appropriately, by supplying a reliable U-shaped nonlinear stress index, expressed in terms of values of human systolic arterial pressure. Low, intermediate and high values indicated demotivation (or inefficiency), well-being and distress, respectively. Costs for stress-dependent productivity shortcomings were estimated to about 3.7% of national income from employment. The method identified useful structured information able to supply a simple and precise interpretation of employees' well-being and stress risk. Results could be compared with estimated national benchmarks to enable targeted intervention strategies to protect the health and safety of workers, and to reduce unproductive costs for firms.
A Semi-Automatic Method for Image Analysis of Edge Dynamics in Living Cells
Huang, Lawrence; Helmke, Brian P.
2011-01-01
Spatial asymmetry of actin edge ruffling contributes to the process of cell polarization and directional migration, but mechanisms by which external cues control actin polymerization near cell edges remain unclear. We designed a quantitative image analysis strategy to measure the spatiotemporal distribution of actin edge ruffling. Time-lapse images of endothelial cells (ECs) expressing mRFP-actin were segmented using an active contour method. In intensity line profiles oriented normal to the cell edge, peak detection identified the angular distribution of polymerized actin within 1 µm of the cell edge, which was localized to lamellipodia and edge ruffles. Edge features associated with filopodia and peripheral stress fibers were removed. Circular statistical analysis enabled detection of cell polarity, indicated by a unimodal distribution of edge ruffles. To demonstrate the approach, we detected a rapid, nondirectional increase in edge ruffling in serum-stimulated ECs and a change in constitutive ruffling orientation in quiescent, nonpolarized ECs. Error analysis using simulated test images demonstrates robustness of the method to variations in image noise levels, edge ruffle arc length, and edge intensity gradient. These quantitative measurements of edge ruffling dynamics enable investigation at the cellular length scale of the underlying molecular mechanisms regulating actin assembly and cell polarization. PMID:21643526
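As an illustration of the kind of circular statistics mentioned in this abstract, the sketch below computes the mean direction and the mean resultant length of a set of angles, plus a first-order large-sample approximation to the Rayleigh test p-value, exp(-nR²). The abstract does not say which circular tests the authors used, so the choice of the Rayleigh test, the function name and the example angles are all assumptions.

```python
import math


def circular_summary(angles_deg):
    """Return (mean direction in degrees, mean resultant length R, approx. p-value)."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = math.hypot(c, s)                       # 0 = no preferred direction, 1 = identical angles
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    # First-order approximation to the Rayleigh test against uniformity;
    # it is rough for small n and is used here only for illustration.
    p_approx = math.exp(-n * r * r)
    return mean_dir, r, p_approx


# Hypothetical ruffle angles: one polarized cell, one without a preferred direction.
polarized = [80.0, 85.0, 90.0, 95.0, 100.0]
nonpolarized = [0.0, 90.0, 180.0, 270.0]

mean_dir, r_pol, p_pol = circular_summary(polarized)
_, r_non, p_non = circular_summary(nonpolarized)
print(f"polarized:    mean = {mean_dir:.1f} deg, R = {r_pol:.3f}, p ~ {p_pol:.3f}")
print(f"nonpolarized: R = {r_non:.3f}, p ~ {p_non:.3f}")
```

A large R with a small p-value is consistent with a unimodal (polarized) distribution; when R is near zero the mean direction is not meaningful and should not be interpreted.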
P-MartCancer–Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.; Bramer, Lisa M.; Jensen, Jeffrey L.
P-MartCancer is a new interactive web-based software environment that enables biomedical and biological scientists to perform in-depth analyses of global proteomics data without requiring direct interaction with the data or with statistical software. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access to multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium (CPTAC) at the peptide, gene and protein levels. P-MartCancer is deployed using Azure technologies (http://pmart.labworks.org/cptac.html), the web-service is alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/) and many statistical functions can be utilized directly from an R package available on GitHub (https://github.com/pmartR).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilson, Andrew; Haass, Michael; Rintoul, Mark Daniel
GazeAppraise advances the state of the art of gaze pattern analysis using methods that simultaneously analyze spatial and temporal characteristics of gaze patterns. GazeAppraise enables novel research in visual perception and cognition; for example, using shape features as distinguishing elements to assess individual differences in visual search strategy. Given a set of point-to-point gaze sequences, hereafter referred to as scanpaths, the method constructs multiple descriptive features for each scanpath. Once the scanpath features have been calculated, they are used to form a multidimensional vector representing each scanpath and cluster analysis is performed on the set of vectors from all scanpaths. An additional benefit of this method is the identification of causal or correlated characteristics of the stimuli, subjects, and visual task through statistical analysis of descriptive metadata distributions within and across clusters.
NASA Astrophysics Data System (ADS)
Wong, Kin-Yiu; Gao, Jiali
2007-12-01
Based on Kleinert's variational perturbation (KP) theory [Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 3rd ed. (World Scientific, Singapore, 2004)], we present an analytic path-integral approach for computing the effective centroid potential. The approach enables the KP theory to be applied to any realistic systems beyond the first-order perturbation (i.e., the original Feynman-Kleinert [Phys. Rev. A 34, 5080 (1986)] variational method). Accurate values are obtained for several systems in which exact quantum results are known. Furthermore, the computed kinetic isotope effects for a series of proton transfer reactions, in which the potential energy surfaces are evaluated by density-functional theory, are in good accordance with experiments. We hope that our method could be used by non-path-integral experts or experimentalists as a "black box" for any given system.
A basket two-part model to analyze medical expenditure on interdependent multiple sectors.
Sugawara, Shinya; Wu, Tianyi; Yamanishi, Kenji
2018-05-01
This study proposes a novel statistical methodology to analyze expenditure on multiple medical sectors using consumer data. Conventionally, medical expenditure has been analyzed by two-part models, which separately consider purchase decision and amount of expenditure. We extend the traditional two-part models by adding the step of basket analysis for dimension reduction. This new step enables us to analyze complicated interdependence between multiple sectors without an identification problem. As an empirical application for the proposed method, we analyze data of 13 medical sectors from the Medical Expenditure Panel Survey. In comparison with the results of previous studies that analyzed the multiple sectors independently, our method provides more detailed implications of the impacts of individual socioeconomic status on the composition of joint purchases from multiple medical sectors; our method also has a better prediction performance.
Multifractal analysis of mobile social networks
NASA Astrophysics Data System (ADS)
Zheng, Wei; Zhang, Zifeng; Deng, Yufan
2017-09-01
As Wireless Fidelity (Wi-Fi)-enabled handheld devices have become widely used, mobile social networks (MSNs) have been attracting extensive attention. Fractal approaches have also been widely applied to characterize natural networks as useful tools to depict their spatial distribution and scaling properties. Moreover, when the complexity of the spatial distribution of MSNs cannot be properly characterized by a single fractal dimension, multifractal analysis is required. For further research, we introduced a multifractal analysis method based on a box-covering algorithm to describe the structure of MSNs. Using this method, we find that the networks are multifractal at different time intervals. The simulation results demonstrate that the proposed method is efficient for analyzing the multifractal characteristics of MSNs, which provides a distribution of singularities adequately describing both the heterogeneity of fractal patterns and the statistics of measurements across spatial scales in MSNs.
Translation of shuttle operations simulation from GPSS 2 to GPSS 1100
NASA Technical Reports Server (NTRS)
Marshall, A. J.
1972-01-01
A method has been developed which enables a programmer to convert the General Purpose Systems Simulator (GPSS) 2 simulation language into the GPSS 1100 language. To accomplish the conversion, a translator deck is used in addition to hand changes made by the analyst after translation. The conversion of a particular GPSS 2 program used at the Marshall Space Flight Center (MSFC) is reported and major changes required for compatibility of the two languages are summarized. Validation of the GPSS 1100 model was completed by comparing the results of the GPSS 2 statistics to the converted 1100 model.
NASA Astrophysics Data System (ADS)
Azila Che Musa, Nor; Mahmud, Zamalia; Baharun, Norhayati
2017-09-01
One of the important skills required of any student who is learning statistics is knowing how to solve statistical problems correctly using appropriate statistical methods. This will enable them to arrive at a conclusion and make a significant contribution and decision for society. In this study, a group of 22 students majoring in statistics at UiTM Shah Alam were given problems on hypothesis testing that required them to solve the problems using the confidence interval, traditional and p-value approaches. Hypothesis testing is one of the techniques used in solving real problems and it is listed as one of the difficult concepts for students to grasp. The objectives of this study are to explore students’ perceived and actual ability in solving statistical problems and to determine which items in statistical problem solving students find difficult to grasp. Students’ perceived and actual ability were measured based on the instruments developed from the respective topics. Rasch measurement tools such as the Wright map and item measures for fit statistics were used to accomplish the objectives. Data were collected and analysed using Winsteps 3.90 software, which is developed based on the Rasch measurement model. The results showed that students perceived themselves as moderately competent in solving the statistical problems using the confidence interval and p-value approaches even though their actual performance showed otherwise. Item measures for fit statistics also showed that the maximum estimated measures were found on two problems. These measures indicate that none of the students answered these problems correctly, for reasons that include their lack of understanding of confidence intervals and probability values.
Jancey, Jonine; Howat, Peter; Ledger, Melissa; Lee, Andy H.
2013-01-01
Introduction Workplace health promotion programs to prevent overweight and obesity in office-based employees should be evidence-based and comprehensive and should consider behavioral, social, organizational, and environmental factors. The objective of this study was to identify barriers to and enablers of physical activity and nutrition as well as intervention strategies for health promotion in office-based workplaces in the Perth, Western Australia, metropolitan area in 2012. Methods We conducted an online survey of 111 employees from 55 organizations. The online survey investigated demographics, individual and workplace characteristics, barriers and enablers, intervention-strategy preferences, and physical activity and nutrition behaviors. We used χ2 and Mann–Whitney U statistics to test for differences between age and sex groups for barriers and enablers, intervention-strategy preferences, and physical activity and nutrition behaviors. Stepwise multiple regression analysis determined factors that affect physical activity and nutrition behaviors. Results We identified several factors that affected physical activity and nutrition behaviors, including the most common barriers (“too tired” and “access to unhealthy food”) and enablers (“enjoy physical activity” and “nutrition knowledge”). Intervention-strategy preferences demonstrated employee support for health promotion in the workplace. Conclusion The findings provide useful insights into employees’ preferences for interventions; they can be used to develop comprehensive programs for evidence-based workplace health promotion that consider environmental and policy influences as well as the individual. PMID:24028834
FabricS: A user-friendly, complete and robust software for particle shape-fabric analysis
NASA Astrophysics Data System (ADS)
Moreno Chávez, G.; Castillo Rivera, F.; Sarocchi, D.; Borselli, L.; Rodríguez-Sedano, L. A.
2018-06-01
Shape-fabric is a textural parameter related to the spatial arrangement of elongated particles in geological samples. Its usefulness spans a range from sedimentary petrology to igneous and metamorphic petrology. Independently of the process being studied, when a material flows, the elongated particles are oriented with the major axis in the direction of flow. In sedimentary petrology this information has been used for studies of paleo-flow direction of turbidites, the origin of quartz sediments, and locating ignimbrite vents, among others. In addition to flow direction and its polarity, the method enables flow rheology to be inferred. The use of shape-fabric has been limited due to the difficulties of automatically measuring particles and analyzing them with reliable circular statistics programs. This has dampened interest in the method for a long time. Shape-fabric measurement has increased in popularity since the 1980s thanks to the development of new image analysis techniques and circular statistics software. However, the programs currently available are unreliable, old and are incompatible with newer operating systems, or require programming skills. The goal of our work is to develop a user-friendly program, in the MATLAB environment, with a graphical user interface, that can process images and includes editing functions, and thresholds (elongation and size) for selecting a particle population and analyzing it with reliable circular statistics algorithms. Moreover, the method also has to produce rose diagrams, orientation vectors, and a complete series of statistical parameters. All these requirements are met by our new software. In this paper, we briefly explain the methodology from collection of oriented samples in the field to the minimum number of particles needed to obtain reliable fabric data. We obtained the data using specific statistical tests and taking into account the degree of iso-orientation of the samples and the required degree of reliability. 
The program has been verified by means of several simulations performed using appropriately designed features and by analyzing real samples.
NASA Astrophysics Data System (ADS)
Bouhaj, M.; von Estorff, O.; Peiffer, A.
2017-09-01
In the application of Statistical Energy Analysis (SEA) to complex assembled structures, a purely predictive model often exhibits errors. These errors are mainly due to a lack of accurate modelling of the power transmission mechanism described through the Coupling Loss Factors (CLF). Experimental SEA (ESEA) is used in practice by the automotive and aerospace industries to verify and update the model or to derive the CLFs for use in an SEA predictive model when analytical estimates cannot be made. This work is particularly motivated by the lack of procedures that allow an estimate to be made of the variance and confidence intervals of the statistical quantities when using the ESEA technique. The aim of this paper is to introduce procedures enabling a statistical description of measured power input, vibration energies and the derived SEA parameters. Particular emphasis is placed on the identification of structural CLFs of complex built-up structures comparing different methods. By adopting a Stochastic Energy Model (SEM), the ensemble average in ESEA is also addressed. For this purpose, expressions are obtained to randomly perturb the energy matrix elements and generate individual samples for the Monte Carlo (MC) technique applied to derive the ensemble averaged CLF. From results of ESEA tests conducted on an aircraft fuselage section, the SEM approach provides a better performance of estimated CLFs compared to classical matrix inversion methods. The expected range of CLF values and the synthesized energy are used as quality criteria of the matrix inversion, making it possible to assess critical SEA subsystems, which might require a more refined statistical description of the excitation and the response fields. Moreover, the impact of the variance of the normalized vibration energy on uncertainty of the derived CLFs is outlined.
Statistical significance of combinatorial regulations
Terada, Aika; Okada-Hatakeyama, Mariko; Tsuda, Koji; Sese, Jun
2013-01-01
More than three transcription factors often work together to enable cells to respond to various signals. The detection of combinatorial regulation by multiple transcription factors, however, is not only computationally nontrivial but also extremely unlikely because of multiple testing correction. The exponential growth in the number of tests forces us to set a strict limit on the maximum arity. Here, we propose an efficient branch-and-bound algorithm called the “limitless arity multiple-testing procedure” (LAMP) to count the exact number of testable combinations and calibrate the Bonferroni factor to the smallest possible value. LAMP lists significant combinations without any limit, whereas the family-wise error rate is rigorously controlled under the threshold. In the human breast cancer transcriptome, LAMP discovered statistically significant combinations of as many as eight binding motifs. This method may contribute to uncover pathways regulated in a coordinated fashion and find hidden associations in heterogeneous data. PMID:23882073
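The idea of calibrating the Bonferroni factor by counting only "testable" combinations can be sketched as follows. This is a Tarone-style simplification written for illustration, not the LAMP branch-and-bound algorithm itself: it assumes one-sided Fisher exact tests, takes the support of each candidate combination as already known instead of enumerating combinations, and uses hypothetical function names and data.

```python
from math import comb


def min_p_value(support, n_pos, n_total):
    """Smallest one-sided Fisher p-value attainable by a combination that
    occurs in `support` of `n_total` samples, `n_pos` of which are positive."""
    a_max = min(support, n_pos)  # most extreme table: as many hits as possible are positive
    return (comb(n_pos, a_max) * comb(n_total - n_pos, support - a_max)
            / comb(n_total, support))


def tarone_factor(supports, n_pos, n_total, alpha=0.05):
    """Return (k, number of testable combinations) for the smallest correction
    factor k such that at most k combinations can reach alpha / k."""
    min_ps = [min_p_value(x, n_pos, n_total) for x in supports]
    for k in range(1, len(min_ps) + 1):
        testable = sum(1 for p in min_ps if p <= alpha / k)
        if testable <= k:
            return k, testable
    return len(min_ps), len(min_ps)


# Hypothetical supports of eight candidate motif combinations in 20 genes,
# 10 of which are up-regulated.
supports = [1, 1, 2, 2, 3, 8, 9, 10]
factor, n_testable = tarone_factor(supports, n_pos=10, n_total=20)
print(f"correction factor {factor} instead of Bonferroni's {len(supports)}")
```

Combinations whose support is too small can never reach significance, so leaving them out of the correction factor does not inflate the family-wise error rate; this is the property that the sketch relies on.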
Massive parallelization of serial inference algorithms for a complex generalized linear model
Suchard, Marc A.; Simpson, Shawn E.; Zorych, Ivan; Ryan, Patrick; Madigan, David
2014-01-01
Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety. PMID:25328363
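For readers unfamiliar with cyclic coordinate descent, the serial version of the idea can be sketched on a much simpler problem than the one in this abstract. The sketch below fits a linear model with a ridge penalty (the posterior mode under a normal prior); it is not the authors' conditioned generalized linear model and contains no parallelization. The function name, penalty value and data are hypothetical.

```python
def ccd_ridge(X, y, penalty=1.0, sweeps=100):
    """Minimize 0.5*||y - X b||^2 + 0.5*penalty*||b||^2 one coordinate at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # y - X @ beta, kept up to date after every update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n))
            new_value = rho / (col_sq[j] + penalty)
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_value
    return beta


# Hypothetical design with two orthogonal indicator columns.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 3.0, 2.0, 3.0]

beta_unpenalized = ccd_ridge(X, y, penalty=0.0)
beta_ridge = ccd_ridge(X, y, penalty=2.0)
print(beta_unpenalized, beta_ridge)
```

Each coordinate update touches one column of the design matrix and the residual vector, and the inner sums over observations are independent reductions, which is the sort of structure that lends itself to parallel hardware.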
Practicable group testing method to evaluate weight/weight GMO content in maize grains.
Mano, Junichi; Yanaka, Yuka; Ikezu, Yoko; Onishi, Mari; Futo, Satoshi; Minegishi, Yasutaka; Ninomiya, Kenji; Yotsuyanagi, Yuichi; Spiegelhalter, Frank; Akiyama, Hiroshi; Teshima, Reiko; Hino, Akihiro; Naito, Shigehiro; Koiwa, Tomohiro; Takabatake, Reona; Furui, Satoshi; Kitta, Kazumi
2011-07-13
Because of the increasing use of maize hybrids with genetically modified (GM) stacked events, the established and commonly used bulk sample methods for PCR quantification of GM maize in non-GM maize are prone to overestimate the GM organism (GMO) content, compared to the actual weight/weight percentage of GM maize in the grain sample. As an alternative method, we designed and assessed a group testing strategy in which the GMO content is statistically evaluated based on qualitative analyses of multiple small pools, consisting of 20 maize kernels each. This approach enables the GMO content evaluation on a weight/weight basis, irrespective of the presence of stacked-event kernels. To enhance the method's user-friendliness in routine application, we devised an easy-to-use PCR-based qualitative analytical method comprising a sample preparation step in which 20 maize kernels are ground in a lysis buffer and a subsequent PCR assay in which the lysate is directly used as a DNA template. This method was validated in a multilaboratory collaborative trial.
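How qualitative pool results translate into a content estimate can be illustrated with the standard group-testing maximum-likelihood estimator. The abstract does not give the authors' exact statistical procedure, so this is a sketch under stated assumptions: kernels are independent, the qualitative assay makes no errors, and kernels have roughly equal weight so that the kernel fraction approximates the weight/weight fraction. The function name and the example counts are hypothetical.

```python
def gmo_fraction_from_pools(positive_pools, total_pools, kernels_per_pool=20):
    """Estimate the per-kernel GM fraction from the number of positive pools.

    A pool is negative only if all of its kernels are non-GM, so
    P(pool negative) = (1 - p) ** kernels_per_pool; solving for p gives the estimate.
    """
    if total_pools <= 0 or not 0 <= positive_pools <= total_pools:
        raise ValueError("need 0 <= positive_pools <= total_pools and total_pools > 0")
    negative_fraction = 1.0 - positive_pools / total_pools
    return 1.0 - negative_fraction ** (1.0 / kernels_per_pool)


# Hypothetical result: 3 of 20 pools (20 kernels each) test positive.
estimate = gmo_fraction_from_pools(positive_pools=3, total_pools=20)
print(f"estimated GM kernel fraction: {estimate:.2%}")
```

Because each pool only reports presence or absence, a kernel carrying several stacked events counts once, which is why this kind of estimate does not depend on the number of events per kernel. When every pool is positive the estimator saturates at 1, so the number and size of pools have to suit the content range of interest.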
Time, frequency, and time-varying Granger-causality measures in neuroscience.
Cekic, Sezen; Grandjean, Didier; Renaud, Olivier
2018-05-20
This article proposes a systematic methodological review and an objective criticism of existing methods enabling the derivation of time, frequency, and time-varying Granger-causality statistics in neuroscience. The capacity to describe the causal links between signals recorded at different brain locations during a neuroscience experiment is indeed of primary interest for neuroscientists, who often have very precise prior hypotheses about the relationships between recorded brain signals. The increasing interest and the huge number of publications related to this topic call for this systematic review, which describes the very complex methodological aspects underlying the derivation of these statistics. In this article, we first present a general framework that allows us to review and compare Granger-causality statistics in the time domain, and the link with transfer entropy. Then, the spectral and the time-varying extensions are presented and discussed together with their estimation and distributional properties. Although not the focus of this article, partial and conditional Granger causality, dynamical causal modelling, directed transfer function, directed coherence, partial directed coherence, and their variants are also mentioned. Copyright © 2018 John Wiley & Sons, Ltd.
MacLean, Adam L; Harrington, Heather A; Stumpf, Michael P H; Byrne, Helen M
2016-01-01
The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical systems approaches such as nondimensionalization, steady state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine.
NASA Astrophysics Data System (ADS)
Friedel, M. J.; Daughney, C.
2016-12-01
The development of a successful surface-groundwater management strategy depends on the quality of data provided for analysis. This study evaluates the statistical robustness when using a modified self-organizing map (MSOM) technique to estimate missing values for three hypersurface models: synoptic groundwater-surface water hydrochemistry, time-series of groundwater-surface water hydrochemistry, and mixed-survey (combination of groundwater-surface water hydrochemistry and lithologies) hydrostratigraphic unit data. These models of increasing complexity are developed and validated based on observations from the Southland region of New Zealand. In each case, the estimation method is sufficiently robust to cope with groundwater-surface water hydrochemistry vagaries due to sample size and extreme data insufficiency, even when >80% of the data are missing. The estimation of surface water hydrochemistry time series values enabled the evaluation of seasonal variation, and the imputation of lithologies facilitated the evaluation of hydrostratigraphic controls on groundwater-surface water interaction. The robust statistical results for groundwater-surface water models of increasing data complexity provide justification to apply the MSOM technique in other regions of New Zealand and abroad.
Eutrophication risk assessment in coastal embayments using simple statistical models.
Arhonditsis, G; Eleftheriadou, M; Karydis, M; Tsirtsis, G
2003-09-01
A statistical methodology is proposed for assessing the risk of eutrophication in marine coastal embayments. The procedure followed was the development of regression models relating the levels of chlorophyll a (Chl) with the concentration of the limiting nutrient--usually nitrogen--and the renewal rate of the systems. The method was applied in the Gulf of Gera, Island of Lesvos, Aegean Sea and a surrogate for renewal rate was created using the Canberra metric as a measure of the resemblance between the Gulf and the oligotrophic waters of the open sea in terms of their physical, chemical and biological properties. The Chl-total dissolved nitrogen-renewal rate regression model was the most significant, accounting for 60% of the variation observed in Chl. Predicted distributions of Chl for various combinations of the independent variables, based on Bayesian analysis of the models, enabled comparison of the outcomes of specific scenarios of interest as well as further analysis of the system dynamics. The present statistical approach can be used as a methodological tool for testing the resilience of coastal ecosystems under alternative managerial schemes and levels of exogenous nutrient loading.
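The Canberra metric named above can be sketched as follows. This is only the textbook definition of the distance; how the authors turned it into a renewal-rate surrogate, and which variables they compared, is not given in the abstract, so the function name and the sample vectors are hypothetical.

```python
def canberra_distance(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over paired components.

    Components where both values are zero are skipped (their term is
    treated as 0), a common convention for this metric.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denominator = abs(a) + abs(b)
        if denominator > 0:
            total += abs(a - b) / denominator
    return total


# Hypothetical property vectors for a gulf station and an open-sea station.
gulf = [1.0, 2.0]
open_sea = [3.0, 6.0]
distance = canberra_distance(gulf, open_sea)
```

Each term is scaled by the magnitude of the two values being compared, so variables measured in very different units contribute on a comparable footing.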
Reaction Event Counting Statistics of Biopolymer Reaction Systems with Dynamic Heterogeneity.
Lim, Yu Rim; Park, Seong Jun; Park, Bo Jung; Cao, Jianshu; Silbey, Robert J; Sung, Jaeyoung
2012-04-10
We investigate the reaction event counting statistics (RECS) of an elementary biopolymer reaction in which the rate coefficient is dependent on states of the biopolymer and the surrounding environment and discover a universal kinetic phase transition in the RECS of the reaction system with dynamic heterogeneity. From an exact analysis for a general model of elementary biopolymer reactions, we find that the variance in the number of reaction events is dependent on the square of the mean number of the reaction events when the size of measurement time is small on the relaxation time scale of rate coefficient fluctuations, which does not conform to renewal statistics. On the other hand, when the size of the measurement time interval is much greater than the relaxation time of rate coefficient fluctuations, the variance becomes linearly proportional to the mean reaction number in accordance with renewal statistics. Gillespie's stochastic simulation method is generalized for the reaction system with a rate coefficient fluctuation. The simulation results confirm the correctness of the analytic results for the time dependent mean and variance of the reaction event number distribution. On the basis of the obtained results, we propose a method of quantitative analysis for the reaction event counting statistics of reaction systems with rate coefficient fluctuations, which enables one to extract information about the magnitude and the relaxation times of the fluctuating reaction rate coefficient, without a bias that can be introduced by assuming a particular kinetic model of conformational dynamics and the conformation dependent reactivity. An exact relationship is established between a higher moment of the reaction event number distribution and the multitime correlation of the reaction rate for the reaction system with a nonequilibrium initial state distribution as well as for the system with the equilibrium initial state distribution.
Cardiac arrest risk standardization using administrative data compared to registry data
Gaieski, David F.; Donnino, Michael W.; Nelson, Joshua I. M.; Mutter, Eric L.; Carr, Brendan G.; Abella, Benjamin S.; Wiebe, Douglas J.
2017-01-01
Background: Methods for comparing hospitals regarding cardiac arrest (CA) outcomes, vital for improving resuscitation performance, rely on data collected by cardiac arrest registries. However, most CA patients are treated at hospitals that do not participate in such registries. This study aimed to determine whether CA risk standardization modeling based on administrative data could perform as well as that based on registry data. Methods and results: Two risk standardization logistic regression models were developed using 2453 patients treated from 2000–2015 at three hospitals in an academic health system. Registry and administrative data were accessed for all patients. The outcome was death at hospital discharge. The registry model was considered the “gold standard” with which to compare the administrative model, using metrics including comparing areas under the curve, calibration curves, and Bland-Altman plots. The administrative risk standardization model had a c-statistic of 0.891 (95% CI: 0.876–0.905) compared to a registry c-statistic of 0.907 (95% CI: 0.895–0.919). When limited to only non-modifiable factors, the administrative model had a c-statistic of 0.818 (95% CI: 0.799–0.838) compared to a registry c-statistic of 0.810 (95% CI: 0.788–0.831). All models were well-calibrated. There was no significant difference between c-statistics of the models, providing evidence that valid risk standardization can be performed using administrative data. Conclusions: Risk standardization using administrative data performs comparably to standardization using registry data. This methodology represents a new tool that can enable opportunities to compare hospital performance in specific hospital systems or across the entire US in terms of survival after CA. PMID:28783754
Nakanishi, Rine; Sankaran, Sethuraman; Grady, Leo; Malpeso, Jenifer; Yousfi, Razik; Osawa, Kazuhiro; Ceponiene, Indre; Nazarat, Negin; Rahmani, Sina; Kissel, Kendall; Jayawardena, Eranthi; Dailing, Christopher; Zarins, Christopher; Koo, Bon-Kwon; Min, James K; Taylor, Charles A; Budoff, Matthew J
2018-03-23
Our goal was to evaluate the efficacy of a fully automated method for assessing the image quality (IQ) of coronary computed tomography angiography (CCTA). The machine learning method was trained using 75 CCTA studies by mapping features (noise, contrast, misregistration scores, and un-interpretability index) to an IQ score based on manual ground truth data. The automated method was validated on a set of 50 CCTA studies and subsequently tested on a new set of 172 CCTA studies against visual IQ scores on a 5-point Likert scale. The area under the curve in the validation set was 0.96. In the 172 CCTA studies, our method yielded a Cohen's kappa statistic for the agreement between automated and visual IQ assessment of 0.67 (p < 0.01). In the group where good to excellent (n = 163), fair (n = 6), and poor visual IQ scores (n = 3) were graded, 155, 5, and 2 of the patients received an automated IQ score > 50 %, respectively. Fully automated assessment of the IQ of CCTA data sets by machine learning was reproducible and provided similar results compared with visual analysis within the limits of inter-operator variability. • The proposed method enables automated and reproducible image quality assessment. • Machine learning and visual assessments yielded comparable estimates of image quality. • Automated assessment potentially allows for more standardised image quality. • Image quality assessment enables standardization of clinical trial results across different datasets.
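For readers unfamiliar with the agreement statistic reported above, a minimal sketch of unweighted Cohen's kappa follows. The abstract does not say whether a weighted or unweighted kappa was used on the 5-point scale, so the unweighted form here is an assumption, and the function name and example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Hypothetical automated vs. visual labels for four studies.
automated = ["good", "good", "poor", "poor"]
visual = ["good", "good", "poor", "good"]
kappa = cohens_kappa(automated, visual)
```

Kappa discounts the agreement expected by chance, which matters when one category (here, good to excellent image quality) dominates the sample.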
Design, Development and Testing of Web Services for Multi-Sensor Snow Cover Mapping
NASA Astrophysics Data System (ADS)
Kadlec, Jiri
This dissertation presents the design, development and validation of new data integration methods for mapping the extent of snow cover based on open access ground station measurements, remote sensing images, volunteer observer snow reports, and cross country ski track recordings from location-enabled mobile devices. The first step of the data integration procedure includes data discovery, data retrieval, and data quality control of snow observations at ground stations. The WaterML R package developed in this work enables hydrologists to retrieve and analyze data from multiple organizations that are listed in the Consortium of Universities for the Advancement of Hydrologic Sciences Inc (CUAHSI) Water Data Center catalog directly within the R statistical software environment. Using the WaterML R package is demonstrated by running an energy balance snowpack model in R with data inputs from CUAHSI, and by automating uploads of real time sensor observations to CUAHSI HydroServer. The second step of the procedure requires efficient access to multi-temporal remote sensing snow images. The Snow Inspector web application developed in this research enables the users to retrieve a time series of fractional snow cover from the Moderate Resolution Imaging Spectroradiometer (MODIS) for any point on Earth. The time series retrieval method is based on automated data extraction from tile images provided by a Web Map Tile Service (WMTS). The average required time for retrieving 100 days of data using this technique is 5.4 seconds, which is significantly faster than other methods that require the download of large satellite image files. The presented data extraction technique and space-time visualization user interface can be used as a model for working with other multi-temporal hydrologic or climate data WMTS services. The third, final step of the data integration procedure is generating continuous daily snow cover maps. 
A custom inverse distance weighting method has been developed to combine volunteer snow reports, cross-country ski track reports and station measurements to fill cloud gaps in the MODIS snow cover product. The method is demonstrated by producing a continuous daily time step snow presence probability map dataset for the Czech Republic region. The ability of the presented methodology to reconstruct MODIS snow cover under cloud is validated by simulating cloud cover datasets and comparing estimated snow cover to actual MODIS snow cover. The percent correctly classified indicator showed accuracy between 80 and 90% using this method. Using crowdsourcing data (volunteer snow reports and ski tracks) improves the map accuracy by 0.7--1.2%. The output snow probability map data sets are published online using web applications and web services. Keywords: crowdsourcing, image analysis, interpolation, MODIS, R statistical software, snow cover, snowpack probability, Tethys platform, time series, WaterML, web services, winter sports.
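The dissertation's custom weighting scheme is not described in the abstract. The sketch below shows only plain inverse distance weighting of binary snow observations, under the assumption of planar coordinates and a power of 2. The function name, parameters, and sample points are hypothetical.

```python
import math


def idw_snow_probability(target, observations, power=2.0):
    """Plain inverse-distance-weighted snow presence probability.

    target: (x, y) location to estimate.
    observations: iterable of ((x, y), value), with value 1.0 for snow
    reported and 0.0 for no snow (stations, volunteer reports, ski tracks).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in observations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0:
            return float(value)  # an observation at the target location wins
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0:
        raise ValueError("at least one observation is required")
    return weighted_sum / weight_total


# A snow report 1 unit away and a no-snow report 2 units away.
reports = [((1.0, 0.0), 1.0), ((2.0, 0.0), 0.0)]
probability = idw_snow_probability((0.0, 0.0), reports)
```

With a power of 2 the nearer report carries four times the weight of the farther one, so the estimate leans toward snow.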
Ghaffar, Abdul; Pongpanich, Sathirakorn; Ghaffar, Najma; Chapman, Robert Sedgwick; Mureed, Sheh
2015-01-01
Objectives: To identify, and compare the relative importance of, factors associated with antenatal care (ANC) utilization in rural Balochistan, toward framing a policy to increase such utilization. Methods: This cross-sectional study was conducted among 513 pregnant women in Jhal Magsi District, Balochistan, in 2011. A standardized interviewer-administered questionnaire was used. Predisposing, enabling, and reinforcing factors were evaluated with generalized linear models (Poisson distribution and log link). Results: Prevalence of any ANC was only 14.4%. Predisposing, enabling, and reinforcing factors were all important determinants of ANC utilization. Reinforcing factors were clearly most important; husband’s support for ANC was more important than support from other community members. Among predisposing factors, higher income, education, occupation, and better knowledge regarding benefits of ANC were positively and statistically significantly associated with ANC; however, an increased number of children showed a negative association. Among enabling factors, a complication-free pregnancy showed a significant positive association with ANC at a public health facility. Conclusion: It is very important to increase antenatal care utilization in the study area and similar areas. Policy to achieve this should focus on enhancing support from the husband. PMID:26150867
Displaying R spatial statistics on Google dynamic maps with web applications created by Rwui.
Newton, Richard; Deonarine, Andrew; Wernisch, Lorenz
2012-09-24
The R project includes a large variety of packages designed for spatial statistics. Google dynamic maps provide web based access to global maps and satellite imagery. We describe a method for displaying directly the spatial output from an R script on to a Google dynamic map. This is achieved by creating a Java based web application which runs the R script and then displays the results on the dynamic map. In order to make this method easy to implement by those unfamiliar with programming Java based web applications, we have added the method to the options available in the R Web User Interface (Rwui) application. Rwui is an established web application for creating web applications for running R scripts. A feature of Rwui is that all the code for the web application being created is generated automatically so that someone with no knowledge of web programming can make a fully functional web application for running an R script in a matter of minutes. Rwui can now be used to create web applications that will display the results from an R script on a Google dynamic map. Results may be displayed as discrete markers and/or as continuous overlays. In addition, users of the web application may select regions of interest on the dynamic map with mouse clicks and the coordinates of the region of interest will automatically be made available for use by the R script. This method of displaying R output on dynamic maps is designed to be of use in a number of areas. Firstly it allows statisticians, working in R and developing methods in spatial statistics, to easily visualise the results of applying their methods to real world data. Secondly, it allows researchers who are using R to study health geographics data, to display their results directly onto dynamic maps. Thirdly, by creating a web application for running an R script, a statistician can enable users entirely unfamiliar with R to run R coded statistical analyses of health geographics data. 
Fourthly, we envisage an educational role for such applications.
Inlet Flow Control and Prediction Technologies for Embedded Propulsion Systems
NASA Technical Reports Server (NTRS)
McMillan, Michelle L.; Mackie, Scott A.; Gissen, Abe; Vukasinovic, Bojan; Lakebrink, Matthew T.; Glezer, Ari; Mani, Mori; Mace, James L.
2011-01-01
Fail-safe, hybrid flow control (HFC) is a promising technology for meeting high-speed cruise efficiency, low-noise signature, and reduced fuel-burn goals for future Hybrid-Wing-Body (HWB) aircraft with embedded engines. This report details the development of HFC technology that enables improved inlet performance in HWB vehicles with highly integrated inlets and embedded engines without adversely affecting vehicle performance. In addition, new test techniques for evaluating Boundary-Layer-Ingesting (BLI)-inlet flow-control technologies developed and demonstrated through this program are documented, including the ability to generate a BLI-like inlet-entrance flow in a direct-connect wind-tunnel facility, as well as the use of D-optimal, statistically designed experiments to optimize test efficiency and enable interpretation of results. Validated improvements in numerical analysis tools and methods accomplished through this program are also documented, including Reynolds-Averaged Navier-Stokes CFD simulations of steady-state flow physics for baseline BLI-inlet diffuser flow, as well as for the flow created by flow-control devices. Finally, numerical methods were employed in a ground-breaking attempt to directly simulate dynamic distortion. The advances in inlet technologies and prediction tools will help to meet and exceed "N+2" project goals for future HWB aircraft.
Suemitsu, Yoshikazu; Nara, Shigetoshi
2004-09-01
Chaotic dynamics introduced into a neural network model is applied to solving two-dimensional mazes, which are ill-posed problems. A moving object moves from its position at time t to its position at t + 1 according to a simply defined motion function calculated from the firing patterns of the neural network model at each time step t. We have embedded several prototype attractors in our neural network model that correspond to simple motions of the object oriented toward several directions in two-dimensional space. Introducing chaotic dynamics into the network gives outputs sampled from intermediate state points between embedded attractors in a state space, and these dynamics enable the object to move in various directions. System parameter switching between a chaotic and an attractor regime in the state space of the neural network enables the object to move to a set target in a two-dimensional maze. Results of computer simulations show that the success rate for this method over 300 trials is higher than that of random walk. To investigate why the proposed method gives better performance, we calculate and discuss statistical data with respect to dynamical structure.
ChIPWig: a random access-enabling lossless and lossy compression method for ChIP-seq data.
Ravanmehr, Vida; Kim, Minji; Wang, Zhiying; Milenkovic, Olgica
2018-03-15
Chromatin immunoprecipitation sequencing (ChIP-seq) experiments are inexpensive and time-efficient, and result in massive datasets that introduce significant storage and maintenance challenges. To address the resulting Big Data problems, we propose a lossless and lossy compression framework specifically designed for ChIP-seq Wig data, termed ChIPWig. ChIPWig enables random access and summary statistics lookups, and it is based on the asymptotic theory of optimal point density design for nonuniform quantizers. We tested the ChIPWig compressor on 10 ChIP-seq datasets generated by the ENCODE consortium. On average, lossless ChIPWig reduced the file sizes to merely 6% of the original, and offered 6-fold compression rate improvement compared to bigWig. The lossy feature further reduced file sizes 2-fold compared to the lossless mode, with little or no effects on peak calling and motif discovery using specialized NarrowPeaks methods. The compression and decompression speed rates are of the order of 0.2 sec/MB using general purpose computers. The source code and binaries, implemented in C++, are freely available for download at https://github.com/vidarmehr/ChIPWig-v2. Contact: milenkov@illinois.edu. Supplementary data are available at Bioinformatics online.
Monitoring tigers with confidence.
Linkie, Matthew; Guillera-Arroita, Gurutzeta; Smith, Joseph; Rayan, D Mark
2010-12-01
With only 5% of the world's wild tigers (Panthera tigris Linnaeus, 1758) remaining since the last century, conservationists urgently need to know whether or not the management strategies currently being employed are effectively protecting these tigers. This knowledge is contingent on the ability to reliably monitor tiger populations, or subsets, over space and time. In this paper, we focus on the 2 seminal methodologies (camera trap and occupancy surveys) that have enabled the monitoring of tiger populations with greater confidence. Specifically, we: (i) describe their statistical theory and application in the field; (ii) discuss issues associated with their survey designs and state variable modeling; and (iii) discuss their future directions. These methods have had an unprecedented influence on increasing statistical rigor within tiger surveys and, also, surveys of other carnivore species. Nevertheless, only 2 published camera trap studies have gone beyond single baseline assessments and actually monitored population trends. For low density tiger populations (e.g. <1 adult tiger/100 km²) obtaining sufficient precision for state variable estimates from camera trapping remains a challenge because of insufficient detection probabilities and/or sample sizes. Occupancy surveys have overcome this problem by redefining the sampling unit (e.g. grid cells and not individual tigers). Current research is focusing on developing spatially explicit capture-mark-recapture models and estimating abundance indices from landscape-scale occupancy surveys, as well as the use of genetic information for identifying and monitoring tigers. The widespread application of these monitoring methods in the field now enables complementary studies on the impact of the different threats to tiger populations and their response to varying management intervention. © 2010 ISZS, Blackwell Publishing and IOZ/CAS.
Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal
2012-04-01
Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli.
Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal
2012-01-01
Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli. PMID:22139924
Green, Esther; Yuen, Dora; Chasen, Martin; Amernic, Heidi; Shabestari, Omid; Brundage, Michael; Krzyzanowska, Monika K; Klinger, Christopher; Ismail, Zahra; Pereira, José
2017-01-01
To examine oncology nurses' attitudes toward and reported use of the Edmonton Symptom Assessment System (ESAS) and to determine whether the length of work experience and presence of oncology certification are associated with their attitudes and reported usage. Exploratory, mixed-methods study employing a questionnaire approach. 14 regional cancer centers (RCCs) in Ontario, Canada. Oncology nurses who took part in a larger province-wide study that surveyed 960 interdisciplinary providers in oncology care settings at all of Ontario's 14 RCCs. Oncology nurses' attitudes and use of ESAS were measured using a 21-item investigator-developed questionnaire. Descriptive statistics and Kendall's tau-b or tau-c test were used for data analyses. Qualitative responses were analyzed using content analysis. Attitudes toward and self-reported use of standardized symptom screening and ESAS. More than half of the participants agreed that ESAS improves symptom screening, most said they would encourage their patients to complete ESAS, and most felt that managing symptoms is within their scope of practice and clinical responsibilities. Qualitative comments provided additional information elucidating the quantitative responses. Statistical analyses revealed that oncology nurses who have 10 years or less of work experience were more likely to agree that the use of standardized, valid instruments to screen for and assess symptoms should be considered best practice, ESAS improves symptom screening, and ESAS enables them to better manage patients' symptoms. No statistically significant difference was found between oncology-certified RNs and noncertified RNs on attitudes or reported use of ESAS. Implementing a population-based symptom screening approach is a major undertaking. The current study found that oncology nurses recognize the value of standardized screening, as demonstrated by their attitudes toward ESAS.
Oncology nurses are integral to providing high-quality person-centered care. Using standardized approaches that enable patients to self-report symptoms and understanding barriers and enablers to optimal use of patient-reported outcome tools can improve the quality of patient care.
Rogue waves in terms of multi-point statistics and nonequilibrium thermodynamics
NASA Astrophysics Data System (ADS)
Hadjihosseini, Ali; Lind, Pedro; Mori, Nobuhito; Hoffmann, Norbert P.; Peinke, Joachim
2017-04-01
Ocean waves, which lead to rogue waves, are investigated against the background of complex systems. In contrast to deterministic approaches based on the nonlinear Schrödinger equation or focusing effects, we analyze this system in terms of a noisy stochastic system. In particular we present a statistical method that maps the complexity of multi-point data into the statistics of hierarchically ordered height increments for different time scales. We show that the stochastic cascade process with Markov properties is governed by a Fokker-Planck equation. Conditional probabilities as well as the Fokker-Planck equation itself can be estimated directly from the available observational data. This stochastic description enables us to show several new aspects of wave states. Surrogate data sets can in turn be generated, allowing one to work out different statistical features of the complex sea state in general and extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics. As a new outlook, the ocean wave states will be considered in terms of nonequilibrium thermodynamics, for which the entropy production of different wave heights will be considered. We show evidence that rogue waves are characterized by negative entropy production. The statistics of the entropy production can be used to distinguish different wave states.
Systematic and fully automated identification of protein sequence patterns.
Hart, R K; Royyuru, A K; Stolovitzky, G; Califano, A
2000-01-01
We present an efficient algorithm to systematically and automatically identify patterns in protein sequence families. The procedure is based on the Splash deterministic pattern discovery algorithm and on a framework to assess the statistical significance of patterns. We demonstrate its application to the fully automated discovery of patterns in 974 PROSITE families (the complete subset of PROSITE families which are defined by patterns and contain DR records). Splash generates patterns with better specificity and undiminished sensitivity, or vice versa, in 28% of the families; identical statistics were obtained in 48% of the families, worse statistics in 15%, and mixed behavior in the remaining 9%. In about 75% of the cases, Splash patterns identify sequence sites that overlap more than 50% with the corresponding PROSITE pattern. The procedure is sufficiently rapid to enable its use for daily curation of existing motif and profile databases. Third, our results show that the statistical significance of discovered patterns correlates well with their biological significance. The trypsin subfamily of serine proteases is used to illustrate this method's ability to exhaustively discover all motifs in a family that are statistically and biologically significant. Finally, we discuss applications of sequence patterns to multiple sequence alignment and the training of more sensitive score-based motif models, akin to the procedure used by PSI-BLAST. All results are available at http://www.research.ibm.com/spat/.
Mapping irrigated lands at 250-m scale by merging MODIS data and National Agricultural Statistics
Pervez, Md Shahriar; Brown, Jesslyn F.
2010-01-01
Accurate geospatial information on the extent of irrigated land improves our understanding of agricultural water use, local land surface processes, conservation or depletion of water resources, and components of the hydrologic budget. We have developed a method in a geospatial modeling framework that assimilates irrigation statistics with remotely sensed parameters describing vegetation growth conditions in areas with agricultural land cover to spatially identify irrigated lands at 250-m cell size across the conterminous United States for 2002. The geospatial model result, known as the Moderate Resolution Imaging Spectroradiometer (MODIS) Irrigated Agriculture Dataset (MIrAD-US), identified irrigated lands with reasonable accuracy in California and semiarid Great Plains states with overall accuracies of 92% and 75% and kappa statistics of 0.75 and 0.51, respectively. A quantitative accuracy assessment of MIrAD-US for the eastern region has not yet been conducted, and qualitative assessment shows that model improvements are needed for the humid eastern regions where the distinction in annual peak NDVI between irrigated and non-irrigated crops is minimal and county sizes are relatively small. This modeling approach enables consistent mapping of irrigated lands based upon USDA irrigation statistics and should lead to better understanding of spatial trends in irrigated lands across the conterminous United States. An improved version of the model with revised datasets is planned and will employ 2007 USDA irrigation statistics.
Treatment of Outliers via Interpolation Method with Neural Network Forecast Performances
NASA Astrophysics Data System (ADS)
Wahir, N. A.; Nor, M. E.; Rusiman, M. S.; Gopal, K.
2018-04-01
Outliers often lurk in many datasets, especially in real data. Such anomalous data can negatively affect statistical analyses, primarily normality, variance, and estimation aspects. Hence, handling the occurrence of outliers requires special attention, and it is important to determine suitable ways of treating outliers so as to ensure that the quality of the analyzed data is high. This paper discusses an alternative method to treat outliers via linear interpolation. Treating an outlier as a missing value in the dataset allows the interpolation method to be applied to the outliers, thus enabling the comparison of data series using forecast accuracy before and after outlier treatment. The monthly time series of Malaysian tourist arrivals from January 1998 until December 2015 was used to interpolate the new series. The results indicated that the improved time series produced by the linear interpolation method gave better forecasting results than the original time series data with both the Box-Jenkins and neural network approaches.
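The idea of treating an outlier as a missing value and filling it by linear interpolation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `interpolate_outliers` is hypothetical, and the detection rule (a modified z-score based on the median absolute deviation, with a cutoff of 3.5) is an assumption, since the abstract does not say how outliers were flagged.

```python
from statistics import median


def interpolate_outliers(series, threshold=3.5):
    """Replace flagged outliers by linear interpolation between neighbours.

    Detection uses a modified z-score (median / MAD); this rule is an
    assumption made for the sketch, not the method used in the paper.
    """
    med = median(series)
    mad = median(abs(x - med) for x in series)
    if mad == 0:
        return list(series)  # no spread, so nothing can be flagged

    flags = [abs(0.6745 * (x - med) / mad) > threshold for x in series]
    good = [i for i, flagged in enumerate(flags) if not flagged]

    out = list(series)
    for i, flagged in enumerate(flags):
        if not flagged:
            continue
        # Nearest non-outlier observations on each side of position i.
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            out[i] = series[right]  # outlier at the start: carry back
        elif right is None:
            out[i] = series[left]   # outlier at the end: carry forward
        else:
            weight = (i - left) / (right - left)
            out[i] = series[left] + weight * (series[right] - series[left])
    return out


cleaned = interpolate_outliers([10, 11, 12, 100, 14, 15, 16])
```

In this toy series only the value 100 is flagged, and it is replaced by the midpoint of its two neighbours. Forecast accuracy before and after treatment could then be compared on `cleaned` versus the raw series.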
Final Report: Quantification of Uncertainty in Extreme Scale Computations (QUEST)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marzouk, Youssef; Conrad, Patrick; Bigoni, Daniele
QUEST (www.quest-scidac.org) is a SciDAC Institute that is focused on uncertainty quantification (UQ) in large-scale scientific computations. Our goals are to (1) advance the state of the art in UQ mathematics, algorithms, and software; and (2) provide modeling, algorithmic, and general UQ expertise, together with software tools, to other SciDAC projects, thereby enabling and guiding a broad range of UQ activities in their respective contexts. QUEST is a collaboration among six institutions (Sandia National Laboratories, Los Alamos National Laboratory, the University of Southern California, Massachusetts Institute of Technology, the University of Texas at Austin, and Duke University) with a history of joint UQ research. Our vision encompasses all aspects of UQ in leadership-class computing. This includes the well-founded setup of UQ problems; characterization of the input space given available data/information; local and global sensitivity analysis; adaptive dimensionality and order reduction; forward and inverse propagation of uncertainty; handling of application code failures, missing data, and hardware/software fault tolerance; and model inadequacy, comparison, validation, selection, and averaging. The nature of the UQ problem requires the seamless combination of data, models, and information across this landscape in a manner that provides a self-consistent quantification of requisite uncertainties in predictions from computational models. Accordingly, our UQ methods and tools span an interdisciplinary space across applied math, information theory, and statistics. The MIT QUEST effort centers on statistical inference and methods for surrogate or reduced-order modeling. MIT personnel have been responsible for the development of adaptive sampling methods, methods for approximating computationally intensive models, and software for both forward uncertainty propagation and statistical inverse problems. A key software product of the MIT QUEST effort is the MIT Uncertainty Quantification library, called MUQ (muq.mit.edu).
Bouallègue, Fayçal Ben; Crouzet, Jean-François; Comtat, Claude; Fourcade, Marjolaine; Mohammadi, Bijan; Mariano-Goulart, Denis
2007-07-01
This paper presents an extended 3-D exact rebinning formula in the Fourier space that leads to an iterative reprojection algorithm (iterative FOREPROJ), which enables the estimation of unmeasured oblique projection data on the basis of the whole set of measured data. In first approximation, this analytical formula also leads to an extended Fourier rebinning equation that is the basis for an approximate reprojection algorithm (extended FORE). These algorithms were evaluated on numerically simulated 3-D positron emission tomography (PET) data for the solution of the truncation problem, i.e., the estimation of the missing portions in the oblique projection data, before the application of algorithms that require complete projection data such as some rebinning methods (FOREX) or 3-D reconstruction algorithms (3DRP or direct Fourier methods). By taking advantage of all the 3-D data statistics, the iterative FOREPROJ reprojection provides a reliable alternative to the classical FOREPROJ method, which only exploits the low-statistics nonoblique data. It significantly improves the quality of the external reconstructed slices without loss of spatial resolution. As for the approximate extended FORE algorithm, it clearly exhibits limitations due to axial interpolations, but will require clinical studies with more realistic measured data in order to decide on its pertinence.
Sulcal depth-based cortical shape analysis in normal healthy control and schizophrenia groups
NASA Astrophysics Data System (ADS)
Lyu, Ilwoo; Kang, Hakmook; Woodward, Neil D.; Landman, Bennett A.
2018-03-01
Sulcal depth is an important marker of brain anatomy in neuroscience/neurological function. Previously, sulcal depth has been explored at the region-of-interest (ROI) level to increase statistical sensitivity to group differences. In this paper, we present a fully automated method that enables inferences of ROI properties from a sulcal region-focused perspective consisting of two main components: 1) sulcal depth computation and 2) sulcal curve-based refined ROIs. In conventional statistical analysis, the average sulcal depth measurements are employed in several ROIs of the cortical surface. However, taking the average sulcal depth over the full ROI blurs overall sulcal depth measurements, which may result in reduced sensitivity to detect sulcal depth changes in neurological and psychiatric disorders. To overcome such a blurring effect, we focus on sulcal fundic regions in each ROI by filtering out other gyral regions. Consequently, the proposed method is more sensitive to group differences than a traditional ROI approach. In the experiment, we focused on a cortical morphological analysis of sulcal depth reduction in schizophrenia with a comparison to the normal healthy control group. We show that the proposed method is more sensitive to abnormalities of sulcal depth in schizophrenia; sulcal depth is significantly smaller in most cortical lobes in schizophrenia compared to healthy controls (p < 0.05).
Interactive searching of facial image databases
NASA Astrophysics Data System (ADS)
Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean
1995-09-01
A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness's verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm that can predict human descriptive values. Alternatives to human-derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method that does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations
Zhu, Yicheng; Neeman, Teresa; Yap, Von Bing; Huttley, Gavin A.
2017-01-01
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within ±2 of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif. PMID:27974498
Comparison of statistical sampling methods with ScannerBit, the GAMBIT scanning module
NASA Astrophysics Data System (ADS)
Martinez, Gregory D.; McKay, James; Farmer, Ben; Scott, Pat; Roebber, Elinore; Putze, Antje; Conrad, Jan
2017-11-01
We introduce ScannerBit, the statistics and sampling module of the public, open-source global fitting framework GAMBIT. ScannerBit provides a standardised interface to different sampling algorithms, enabling the use and comparison of multiple computational methods for inferring profile likelihoods, Bayesian posteriors, and other statistical quantities. The current version offers random, grid, raster, nested sampling, differential evolution, Markov Chain Monte Carlo (MCMC) and ensemble Monte Carlo samplers. We also announce the release of a new standalone differential evolution sampler, Diver, and describe its design, usage and interface to ScannerBit. We subject Diver and three other samplers (the nested sampler MultiNest, the MCMC GreAT, and the native ScannerBit implementation of the ensemble Monte Carlo algorithm T-Walk) to a battery of statistical tests. For this we use a realistic physical likelihood function, based on the scalar singlet model of dark matter. We examine the performance of each sampler as a function of its adjustable settings, and the dimensionality of the sampling problem. We evaluate performance on four metrics: optimality of the best fit found, completeness in exploring the best-fit region, number of likelihood evaluations, and total runtime. For Bayesian posterior estimation at high resolution, T-Walk provides the most accurate and timely mapping of the full parameter space. For profile likelihood analysis in less than about ten dimensions, we find that Diver and MultiNest score similarly in terms of best fit and speed, outperforming GreAT and T-Walk; in ten or more dimensions, Diver substantially outperforms the other three samplers on all metrics.
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both the biomedical engineer and the clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
Benchmarking routine psychological services: a discussion of challenges and methods.
Delgadillo, Jaime; McMillan, Dean; Leach, Chris; Lucock, Mike; Gilbody, Simon; Wood, Nick
2014-01-01
Policy developments in recent years have led to important changes in the level of access to evidence-based psychological treatments. Several methods have been used to investigate the effectiveness of these treatments in routine care, with different approaches to outcome definition and data analysis. This report presents a review of challenges and methods for the evaluation of evidence-based treatments delivered in routine mental healthcare. This is followed by a case example of a benchmarking method applied in primary care. High, average and poor performance benchmarks were calculated through a meta-analysis of published data from services working under the Improving Access to Psychological Therapies (IAPT) Programme in England. Pre-post treatment effect sizes (ES) and confidence intervals were estimated to illustrate a benchmarking method enabling services to evaluate routine clinical outcomes. High, average and poor performance ES for routine IAPT services were estimated to be 0.91, 0.73 and 0.46 for depression (using PHQ-9) and 1.02, 0.78 and 0.52 for anxiety (using GAD-7). Data from one specific IAPT service exemplify how to evaluate and contextualize routine clinical performance against these benchmarks. The main contribution of this report is to summarize key recommendations for the selection of an adequate set of psychometric measures, the operational definition of outcomes, and the statistical evaluation of clinical performance. A benchmarking method is also presented, which may enable a robust evaluation of clinical performance against national benchmarks. Limitations include significant heterogeneity among data sources, and wide variations in ES and data completeness.
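A service-level comparison against the depression benchmarks quoted above might be sketched as follows. The abstract does not state which pre-post effect size formula was used; the sketch assumes a common definition (mean pre-treatment score minus mean post-treatment score, divided by the pre-treatment standard deviation). The function names and the banding rule are hypothetical, and the scores in the usage lines are made-up illustrative numbers, not data from the study.

```python
from statistics import mean, stdev


def pre_post_effect_size(pre, post):
    """Pre-post effect size: (mean pre - mean post) / SD of pre scores.

    Assumed definition; the abstract does not specify the exact formula.
    """
    return (mean(pre) - mean(post)) / stdev(pre)


def benchmark_band(es, high=0.91, average=0.73, poor=0.46):
    """Place an effect size relative to the PHQ-9 benchmarks in the abstract.

    The thresholding rule itself is a hypothetical illustration.
    """
    if es >= high:
        return "high"
    if es >= average:
        return "average"
    if es >= poor:
        return "poor"
    return "below poor"


# Illustrative (invented) PHQ-9 scores for five patients.
pre_scores = [20, 18, 16, 14, 12]
post_scores = [10, 9, 8, 7, 6]
es = pre_post_effect_size(pre_scores, post_scores)
band = benchmark_band(es)
```

For anxiety (GAD-7), the same function could be called with the thresholds 1.02, 0.78 and 0.52 reported in the abstract.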
Masud, Mohammad Shahed; Borisyuk, Roman; Stuart, Liz
2017-07-15
This study analyses multiple spike trains (MST) data, defines its functional connectivity and subsequently visualises an accurate diagram of connections. This is a challenging problem. For example, it is difficult to distinguish the common input and the direct functional connection of two spike trains. The new method presented in this paper is based on the traditional pairwise cross-correlation function (CCF) and a new combination of statistical techniques. First, the CCF is used to create the Advanced Correlation Grid (ACG) correlation where both the significant peak of the CCF and the corresponding time delay are used for detailed analysis of connectivity. Second, these two features of functional connectivity are used to classify connections. Finally, the visualization technique is used to represent the topology of functional connections. Examples are presented in the paper to demonstrate the new Advanced Correlation Grid method and to show how it enables discrimination between (i) influence from one spike train to another through an intermediate spike train and (ii) influence from one common spike train to another pair of analysed spike trains. The ACG method enables scientists to automatically distinguish direct connections from spurious connections such as common source connections and indirect connections, whereas existing methods require in-depth analysis to identify such connections. The ACG is a new and effective method for studying functional connectivity of multiple spike trains. This method can accurately identify all the direct connections and can distinguish common source and indirect connections automatically. Copyright © 2017 Elsevier B.V. All rights reserved.
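The two CCF features the abstract relies on, the peak and its time delay, can be illustrated with a raw cross-correlogram of two binned spike trains. This sketch covers only that first ingredient: it does not implement the significance testing, the classification of connections, or the Advanced Correlation Grid itself, and the function names and binary-bin representation are assumptions.

```python
def cross_correlogram(train_a, train_b, max_lag):
    """Count coincident spikes of train_b relative to train_a at each lag.

    Both trains are equal-length lists of 0/1 values, one per time bin.
    A positive lag means train_b fires after train_a.
    """
    n = len(train_a)
    counts = {}
    for lag in range(-max_lag, max_lag + 1):
        counts[lag] = sum(
            train_a[t] * train_b[t + lag]
            for t in range(n)
            if 0 <= t + lag < n
        )
    return counts


def peak_and_delay(counts):
    """Return (peak count, lag at which the peak occurs)."""
    lag = max(counts, key=counts.get)
    return counts[lag], lag


# Toy data: train_b repeats train_a's spikes three bins later.
n_bins = 30
train_a = [1 if t in (2, 10, 20) else 0 for t in range(n_bins)]
train_b = [1 if t in (5, 13, 23) else 0 for t in range(n_bins)]
peak, delay = peak_and_delay(cross_correlogram(train_a, train_b, max_lag=5))
```

In the toy data the correlogram peaks at a delay of three bins. Deciding whether such a peak is significant, and whether it reflects a direct, indirect or common-source connection, is the part the ACG method adds.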
Study for online range monitoring with the interaction vertex imaging method.
Finck, Ch; Karakaya, Y; Reithinger, V; Rescigno, R; Baudot, J; Constanzo, J; Juliani, D; Krimmer, J; Rinaldi, I; Rousseau, M; Testa, E; Vanstalle, M; Ray, C
2017-11-21
Ion beam therapy enables a highly accurate dose conformation delivery to the tumor due to the finite range of charged ions in matter (i.e. Bragg peak (BP)). Consequently, the dose profile is very sensitive to patients' anatomical changes as well as minor mispositioning, and so it requires improved dose control techniques. Proton interaction vertex imaging (IVI) could offer an online range control in carbon ion therapy. In this paper, a statistical method was used to study the sensitivity of the IVI technique on experimental data obtained from the Heidelberg Ion-Beam Therapy Center. The vertices of secondary protons were reconstructed with pixelized silicon detectors. The statistical study used the χ² test of the reconstructed vertex distributions for a given displacement of the BP position as a function of the impinging carbon ions. Different phantom configurations were used with or without bone equivalent tissue and air inserts. The inflection points in the fall-off region of the longitudinal vertex distribution were computed using different methods, while the relation with the BP position was established. In the present setup, the resolution of the BP position was about 4-5 mm in the homogeneous phantom under clinical conditions (10^6 incident carbon ions). Our results show that the IVI method could therefore monitor the BP position with a promising resolution in clinical conditions.
Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José C; Mota-Sanchez, David; Estrada-González, Fermín; Gillberg, Jussi; Singh, Ravi; Mondal, Suchismita; Juliana, Philomin
2018-01-04
In genomic-enabled prediction, the task of improving the accuracy of the prediction of lines in environments is difficult because the available information is generally sparse and usually has low correlations between traits. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, there is still limited computing efficiency to do so. Although some statistical models are usually mathematically elegant, many of them are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from huge normal multivariate distributions. For these reasons, this study explores two recommender systems: item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF) in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results of the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment-trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets. Copyright © 2018 Montesinos-Lopez et al.
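Item-based collaborative filtering, as named in the abstract, predicts a missing entry for a line from that line's observed values on other items (here, environment-trait combinations), weighted by item-to-item similarity. The sketch below uses cosine similarity on raw values and a plain weighted average; those choices, the function names, and the toy matrix are assumptions for illustration and are not taken from the study, which may standardize the data or use a different similarity measure.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity over positions where both vectors are observed."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    num = sum(a * b for a, b in pairs)
    den = sqrt(sum(a * a for a, _ in pairs)) * sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0


def predict_ibcf(matrix, row, col):
    """Predict matrix[row][col] from the same row's other observed items.

    Rows are lines; columns are items (environment-trait combinations);
    None marks a missing observation.
    """
    columns = list(zip(*matrix))
    num = 0.0
    den = 0.0
    for j, column in enumerate(columns):
        if j == col or matrix[row][j] is None:
            continue
        sim = cosine(columns[col], column)
        num += sim * matrix[row][j]
        den += abs(sim)
    return num / den if den else None


# Toy phenotype matrix: 3 lines x 3 items, one value missing.
phenotypes = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, None, 9.0],
]
prediction = predict_ibcf(phenotypes, row=2, col=1)
```

The method needs only pairwise item similarities and no sampling from a large multivariate normal distribution, which is consistent with the computational feasibility the abstract emphasizes.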
Study for online range monitoring with the interaction vertex imaging method
NASA Astrophysics Data System (ADS)
Finck, Ch; Karakaya, Y.; Reithinger, V.; Rescigno, R.; Baudot, J.; Constanzo, J.; Juliani, D.; Krimmer, J.; Rinaldi, I.; Rousseau, M.; Testa, E.; Vanstalle, M.; Ray, C.
2017-12-01
Ion beam therapy enables a highly accurate dose conformation delivery to the tumor due to the finite range of charged ions in matter (i.e. Bragg peak (BP)). Consequently, the dose profile is very sensitive to patients' anatomical changes as well as minor mispositioning, and so it requires improved dose control techniques. Proton interaction vertex imaging (IVI) could offer an online range control in carbon ion therapy. In this paper, a statistical method was used to study the sensitivity of the IVI technique on experimental data obtained from the Heidelberg Ion-Beam Therapy Center. The vertices of secondary protons were reconstructed with pixelized silicon detectors. The statistical study used the χ² test of the reconstructed vertex distributions for a given displacement of the BP position as a function of the impinging carbon ions. Different phantom configurations were used with or without bone equivalent tissue and air inserts. The inflection points in the fall-off region of the longitudinal vertex distribution were computed using different methods, while the relation with the BP position was established. In the present setup, the resolution of the BP position was about 4-5 mm in the homogeneous phantom under clinical conditions (10^6 incident carbon ions). Our results show that the IVI method could therefore monitor the BP position with a promising resolution in clinical conditions.
Cross-domain question classification in community question answering via kernel mapping
NASA Astrophysics Data System (ADS)
Su, Lei; Hu, Zuoliang; Yang, Bin; Li, Yiyang; Chen, Jun
2015-10-01
An increasingly popular method for retrieving information is via community question answering (CQA) systems such as Yahoo! Answers and Baidu Knows. In CQA, question classification plays an important role in finding the answers. However, labeled training examples for a statistical question classifier are fairly expensive to obtain, as they require experienced human effort. Meanwhile, unlabeled data are readily available. This paper employs the method of domain adaptation via kernel mapping to solve this problem. In detail, the kernel approach is utilized to map the target-domain data and the source-domain data into a common space, where the question classifiers are trained under the closer conditional probabilities. The kernel mapping function is constructed by domain knowledge. Therefore, domain knowledge can be transferred from the labeled examples in the source domain to the unlabeled ones in the target domain. The statistical training model can be improved by using a large number of unlabeled data. Meanwhile, the Hadoop Platform is used to construct the mapping mechanism to reduce the time complexity. Map/Reduce enables kernel mapping for domain adaptation in parallel in the Hadoop Platform. Experimental results show that the accuracy of question classification can be improved by the method of kernel mapping. Furthermore, the parallel method in the Hadoop Platform can effectively schedule the computing resources to reduce the running time.
Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems
Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José C.; Mota-Sanchez, David; Estrada-González, Fermín; Gillberg, Jussi; Singh, Ravi; Mondal, Suchismita; Juliana, Philomin
2018-01-01
In genomic-enabled prediction, the task of improving the accuracy of the prediction of lines in environments is difficult because the available information is generally sparse and usually has low correlations between traits. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, there is still limited computing efficiency to do so. Although some statistical models are usually mathematically elegant, many of them are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from huge normal multivariate distributions. For these reasons, this study explores two recommender systems: item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF) in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results of the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment–trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets. PMID:29097376
Meta-analysis using Dirichlet process.
Muthukumarana, Saman; Tiwari, Ram C
2016-02-01
This article develops a Bayesian approach for meta-analysis using the Dirichlet process. The key aspect of the Dirichlet process in meta-analysis is the ability to assess evidence of statistical heterogeneity or variation in the underlying effects across studies while relaxing the distributional assumptions. We assume that the study effects are generated from a Dirichlet process. Under a Dirichlet process model, the study effect parameters have support on a discrete space and enable borrowing of information across studies while facilitating clustering among studies. We illustrate the proposed method by applying it to a dataset on the Program for International Student Assessment on 30 countries. Results from the data analysis, simulation studies, and the log pseudo-marginal likelihood model selection procedure indicate that the Dirichlet process model performs better than conventional alternative methods. © The Author(s) 2012.
Barratt, Dean C; Chan, Carolyn S K; Edwards, Philip J; Penney, Graeme P; Slomczykowski, Mike; Carter, Timothy J; Hawkes, David J
2008-06-01
Statistical shape modelling potentially provides a powerful tool for generating patient-specific, 3D representations of bony anatomy for computer-aided orthopaedic surgery (CAOS) without the need for a preoperative CT scan. Furthermore, freehand 3D ultrasound (US) provides a non-invasive method for digitising bone surfaces in the operating theatre that enables a much greater region to be sampled compared with conventional direct-contact (i.e., pointer-based) digitisation techniques. In this paper, we describe how these approaches can be combined to simultaneously generate and register a patient-specific model of the femur and pelvis to the patient during surgery. In our implementation, a statistical deformation model (SDM) was constructed for the femur and pelvis by performing a principal component analysis on the B-spline control points that parameterise the freeform deformations required to non-rigidly register a training set of CT scans to a carefully segmented template CT scan. The segmented template bone surface, represented by a triangulated surface mesh, is instantiated and registered to a cloud of US-derived surface points using an iterative scheme in which the weights corresponding to the first five principal modes of variation of the SDM are optimised in addition to the rigid-body parameters. The accuracy of the method was evaluated using clinically realistic data obtained on three intact human cadavers (three whole pelves and six femurs). For each bone, a high-resolution CT scan and rigid-body registration transformation, calculated using bone-implanted fiducial markers, served as the gold standard bone geometry and registration transformation, respectively. After aligning the final instantiated model and CT-derived surfaces using the iterative closest point (ICP) algorithm, the average root-mean-square distance between the surfaces was 3.5mm over the whole bone and 3.7mm in the region of surgical interest. 
The corresponding distances after aligning the surfaces using the marker-based registration transformation were 4.6 and 4.5mm, respectively. We conclude that despite limitations on the regions of bone accessible using US imaging, this technique has potential as a cost-effective and non-invasive method to enable surgical navigation during CAOS procedures, without the additional radiation dose associated with performing a preoperative CT scan or intraoperative fluoroscopic imaging. However, further development is required to investigate errors using error measures relevant to specific surgical procedures.
NASA Astrophysics Data System (ADS)
Victor, Rodolfo A.; Prodanović, Maša.; Torres-Verdín, Carlos
2017-12-01
We develop a new Monte Carlo-based inversion method for estimating electron density and effective atomic number from 3-D dual-energy computed tomography (CT) core scans. The method accounts for uncertainties in X-ray attenuation coefficients resulting from the polychromatic nature of X-ray beam sources of medical and industrial scanners, in addition to delivering uncertainty estimates of inversion products. Estimation of electron density and effective atomic number from CT core scans enables direct deterministic or statistical correlations with salient rock properties for improved petrophysical evaluation; this condition is specifically important in media such as vuggy carbonates where CT resolution better captures core heterogeneity that dominates fluid flow properties. Verification tests of the inversion method performed on a set of highly heterogeneous carbonate cores yield very good agreement with in situ borehole measurements of density and photoelectric factor.
PDF modeling of near-wall turbulent flows
NASA Astrophysics Data System (ADS)
Dreeben, Thomas David
1997-06-01
PDF methods are extended to include modeling of wall-bounded turbulent flows. For flows in which resolution of the viscous sublayer is desired, a PDF near-wall model is developed in which the Generalized Langevin model is combined with an exact model for viscous transport. Durbin's method of elliptic relaxation is used to incorporate the wall effects into the governing equations without the use of wall functions or damping functions. Close to the wall, the Generalized Langevin model provides an analogy to the effect of the fluctuating continuity equation. This enables accurate modeling of the near-wall turbulent statistics. Demonstrated accuracy for fully-developed channel flow is achieved with a PDF/Monte Carlo simulation, and with its related Reynolds-stress closure. For flows in which the details of the viscous sublayer are not important, a PDF wall-function method is developed with the Simplified Langevin model.
Geometry and Dynamics for Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Barp, Alessandro; Briol, François-Xavier; Kennedy, Anthony D.; Girolami, Mark
2018-03-01
Markov Chain Monte Carlo methods have revolutionised mathematical computation and enabled statistical inference within many previously intractable models. In this context, Hamiltonian dynamics have been proposed as an efficient way of building chains which can explore probability densities efficiently. The method emerges from physics and geometry and these links have been extensively studied by a series of authors through the last thirty years. However, there is currently a gap between the intuitions and knowledge of users of the methodology and our deep understanding of these theoretical foundations. The aim of this review is to provide a comprehensive introduction to the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians, machine learners and other users of the methodology with only a basic understanding of Monte Carlo methods. This will be complemented with some discussion of the most recent advances in the field which we believe will become increasingly relevant to applied scientists.
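For readers unfamiliar with the basic algorithm that the review builds on, here is a minimal sketch of Hamiltonian Monte Carlo with a leapfrog integrator on a flat (Euclidean) space with an identity mass matrix. The function names, step size and trajectory length are assumptions chosen for illustration; the geometric generalisations discussed in the review are not covered.

```python
import numpy as np

def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate Hamiltonian dynamics for H(q, p) = U(q) + p.p / 2."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step * grad_u(q)          # initial half step in momentum
    for i in range(n_steps):
        q += step * p                    # full step in position
        if i < n_steps - 1:
            p -= step * grad_u(q)        # full step in momentum
    p -= 0.5 * step * grad_u(q)          # final half step in momentum
    return q, p

def hmc_sample(u, grad_u, q0, n_samples, step=0.1, n_steps=20, seed=0):
    """Draw samples from the density proportional to exp(-U(q))."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = []
    for _ in range(n_samples):
        p = rng.normal(size=q.shape)     # resample momentum
        q_new, p_new = leapfrog(q, p, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p @ p
        h_new = u(q_new) + 0.5 * p_new @ p_new
        # Metropolis correction for the integration error.
        if np.log(rng.uniform()) < h_old - h_new:
            q = q_new
        samples.append(q.copy())
    return np.array(samples)

# Standard normal target in two dimensions: U(q) = q.q / 2, grad U = q.
samples = hmc_sample(lambda q: 0.5 * q @ q, lambda q: q, np.zeros(2), 2000)
```

The leapfrog scheme is used because it is time-reversible and volume-preserving, which is what makes the simple accept/reject step valid.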
NASA Astrophysics Data System (ADS)
Gagatsos, Christos N.; Karanikas, Alexandros I.; Kordas, Georgios; Cerf, Nicolas J.
2016-02-01
In spite of their simple description in terms of rotations or symplectic transformations in phase space, quadratic Hamiltonians such as those modelling the most common Gaussian operations on bosonic modes remain poorly understood in terms of entropy production. For instance, determining the quantum entropy generated by a Bogoliubov transformation is notably a hard problem, with generally no known analytical solution, while it is vital to the characterisation of quantum communication via bosonic channels. Here we overcome this difficulty by adapting the replica method, a tool borrowed from statistical physics and quantum field theory. We exhibit a first application of this method to continuous-variable quantum information theory, where it enables accessing entropies in an optical parametric amplifier. As an illustration, we determine the entropy generated by amplifying a binary superposition of the vacuum and a Fock state, which yields a surprisingly simple, yet unknown analytical expression.
Smith, Gillian E; Elliot, Alex J; Ibbotson, Sue; Morbey, Roger; Edeghere, Obaghe; Hawker, Jeremy; Catchpole, Mike; Endericks, Tina; Fisher, Paul; McCloskey, Brian
2017-09-01
Syndromic surveillance aims to provide early warning and real time estimates of the extent of incidents; and reassurance about lack of impact of mass gatherings. We describe a novel public health risk assessment process to ensure those leading the response to the 2012 Olympic Games were alerted to unusual activity that was of potential public health importance, and not inundated with multiple statistical 'alarms'. Statistical alarms were assessed to identify those which needed to result in 'alerts' as reliably as possible. There was no previously developed method for this. We identified factors that increased our concern about an alarm suggesting that an 'alert' should be made. Between 2 July and 12 September 2012, 350 674 signals were analysed resulting in 4118 statistical alarms. Using the risk assessment process, 122 'alerts' were communicated to Olympic incident directors. Use of a novel risk assessment process enabled the interpretation of a large number of statistical alarms in a manageable way for the period of a sustained mass gathering. This risk assessment process guided the prioritization and could be readily adapted to other surveillance systems. The process, which is novel to our knowledge, continues as a legacy of the Games. © Crown copyright 2016.
Anna, Bluszcz
Methods for measuring and assessing the level of sustainable development at the international, national and regional levels are a current research problem that requires multi-dimensional analysis. The aim of the studies reported in the article was a relative assessment of the sustainability level of the European Union member states and a comparative analysis of the position of Poland relative to other countries. EU member states were treated as objects in a multi-dimensional space. The dimensions of the space were specified by ten diagnostic variables describing the sustainability level of EU countries in three dimensions, i.e., social, economic and environmental. Because the compiled statistical data were expressed in different units of measure, taxonomic methods were used to build an aggregated measure of the level of sustainable development of EU member states, which, through normalisation of variables, enabled comparative analysis between countries. The methodology consisted of eight stages, which included, among others: defining data matrices; calculating the variability coefficient for all variables and identifying those whose variability coefficient was under 10%; dividing variables into stimulants and destimulants; selecting the method of variable normalisation; developing matrices of normalised data; selecting the formula and calculating the aggregated indicator of the relative level of sustainable development of the EU countries; calculating partial development indicators for the three studied dimensions (social, economic and environmental); and classifying the EU countries according to the relative level of sustainable development. Statistical data were collected from the Polish Central Statistical Office publication.
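The stages listed above can be sketched as follows. This is an illustration under stated assumptions, not the article's procedure: the abstract does not name the normalisation formula or the aggregation formula, so min-max scaling and an unweighted mean are assumed here, variables with a variability coefficient under 10% are assumed to be dropped, and the data matrix is invented.

```python
import numpy as np

def aggregate_indicator(data, is_stimulant, cv_threshold=0.10):
    """Return one aggregated score per object (row) and the kept-variable mask.

    data: (n_objects, n_variables) matrix of diagnostic variables.
    is_stimulant: True where higher values are better, False for destimulants.
    """
    data = np.asarray(data, dtype=float)
    is_stimulant = np.asarray(is_stimulant, dtype=bool)
    # Coefficient of variation per variable; near-constant variables are
    # dropped (assumed interpretation of the 10% rule).
    cv = data.std(axis=0, ddof=1) / np.abs(data.mean(axis=0))
    keep = cv >= cv_threshold
    x, stim = data[:, keep], is_stimulant[keep]
    # Min-max normalisation to [0, 1] (assumed formula).
    z = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))
    # Reverse destimulants so that 1 is always the best value.
    z[:, ~stim] = 1.0 - z[:, ~stim]
    # Unweighted mean across variables (assumed aggregation formula).
    return z.mean(axis=1), keep

# Invented data: 4 objects, 3 variables; the third is almost constant.
data = [[10, 5, 100], [20, 3, 101], [30, 1, 99], [40, 4, 100]]
scores, kept = aggregate_indicator(data, is_stimulant=[True, False, True])
ranking = np.argsort(-scores)  # object indices from best to worst
```

Partial indicators for the social, economic and environmental dimensions could be obtained by calling the same function on the corresponding column subsets.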
Statistical Learning as a Basis for Social Understanding in Children
ERIC Educational Resources Information Center
Ruffman, Ted; Taumoepeau, Mele; Perkins, Chris
2012-01-01
Many authors have argued that infants understand goals, intentions, and beliefs. We posit that infants' success on such tasks might instead reveal an understanding of behaviour, that infants' proficient statistical learning abilities might enable such insights, and that maternal talk scaffolds children's learning about the social world as well. We…
Some Experience with Interactive Computing in Teaching Introductory Statistics.
ERIC Educational Resources Information Center
Diegert, Carl
Students in two biostatistics courses at the Cornell Medical College and in a course in applications of computer science given in Cornell's School of Industrial Engineering were given access to an interactive package of computer programs enabling them to perform statistical analysis without the burden of hand computation. After a general…
GPS: Geometry, Probability, and Statistics
ERIC Educational Resources Information Center
Field, Mike
2012-01-01
It might be said that for most occupations there is now less of a need for mathematics than there was say fifty years ago. But, the author argues, geometry, probability, and statistics constitute essential knowledge for everyone. Maybe not the geometry of Euclid, but certainly geometrical ways of thinking that might enable us to describe the world…
Bingi, Jayachandra; Murukeshan, Vadakke Matham
2015-01-01
A laser speckle pattern is a granular structure formed by random coherent wavelet interference and is generally considered noise in optical systems, including photolithography. In contrast, in this paper we use the speckle pattern to generate predictable and controlled Gaussian random structures and quasi-random structures photolithographically. The random structures made using the proposed speckle lithography technique are quantified based on speckle statistics, the radial distribution function (RDF) and the fast Fourier transform (FFT). Control over the speckle size, density and speckle clustering facilitates the successful fabrication of black silicon with different surface structures. The controllability and tunability of randomness make this technique a robust method for fabricating predictable 2D Gaussian random structures and black silicon structures. These structures can significantly enhance light trapping in solar cells and hence enable improved energy harvesting. Further, this technique can enable efficient fabrication of disordered photonic structures and random-media-based devices. PMID:26679513
NASA Technical Reports Server (NTRS)
Bednarcyk, Brett A.; Arnold, Steven M.
2006-01-01
A framework is presented that enables coupled multiscale analysis of composite structures. The recently developed, free, Finite Element Analysis - Micromechanics Analysis Code (FEAMAC) software couples the Micromechanics Analysis Code with Generalized Method of Cells (MAC/GMC) with ABAQUS to perform micromechanics based FEA such that the nonlinear composite material response at each integration point is modeled at each increment by MAC/GMC. As a result, the stochastic nature of fiber breakage in composites can be simulated through incorporation of an appropriate damage and failure model that operates within MAC/GMC on the level of the fiber. Results are presented for the progressive failure analysis of a titanium matrix composite tensile specimen that illustrate the power and utility of the framework and address the techniques needed to model the statistical nature of the problem properly. In particular, it is shown that incorporating fiber strength randomness on multiple scales improves the quality of the simulation by enabling failure at locations other than those associated with structural level stress risers.
Attendance at NHS mandatory training sessions.
Brand, Darren
2015-02-17
To identify factors that affect NHS healthcare professionals' attendance at mandatory training sessions. A quantitative approach was used, with a questionnaire sent to 400 randomly selected participants. A total of 122 responses were received, providing a mix of qualitative and quantitative data. Quantitative data were analysed using statistical methods. Open-ended responses were reviewed using thematic analysis. Clinical staff value mandatory training sessions highly. They are aware of the requirement to keep practice up-to-date and ensure patient safety remains a priority. However, changes to the delivery format of mandatory training sessions are required to enable staff to participate more easily, as staff are often unable to attend. The delivery of mandatory training should move from classroom-based sessions into the clinical area to maximise participation. Delivery should be assisted by local 'experts' who are able to customise course content to meet local requirements and the requirements of different staff groups. Improved arrangements to provide staff cover, for those attending training, would enable more staff to attend training sessions.
NASA Technical Reports Server (NTRS)
Bednarcyk, Brett A.; Arnold, Steven M.
2007-01-01
A framework is presented that enables coupled multiscale analysis of composite structures. The recently developed, free, Finite Element Analysis-Micromechanics Analysis Code (FEAMAC) software couples the Micromechanics Analysis Code with Generalized Method of Cells (MAC/GMC) with ABAQUS to perform micromechanics based FEA such that the nonlinear composite material response at each integration point is modeled at each increment by MAC/GMC. As a result, the stochastic nature of fiber breakage in composites can be simulated through incorporation of an appropriate damage and failure model that operates within MAC/GMC on the level of the fiber. Results are presented for the progressive failure analysis of a titanium matrix composite tensile specimen that illustrate the power and utility of the framework and address the techniques needed to model the statistical nature of the problem properly. In particular, it is shown that incorporating fiber strength randomness on multiple scales improves the quality of the simulation by enabling failure at locations other than those associated with structural level stress risers.
Network analysis for the visualization and analysis of qualitative data.
Pokorny, Jennifer J; Norman, Alex; Zanesco, Anthony P; Bauer-Wu, Susan; Sahdra, Baljinder K; Saron, Clifford D
2018-03-01
We present a novel manner in which to visualize the coding of qualitative data that enables representation and analysis of connections between codes using graph theory and network analysis. Network graphs are created from codes applied to a transcript or audio file using the code names and their chronological location. The resulting network is a representation of the coding data that characterizes the interrelations of codes. This approach enables quantification of qualitative codes using network analysis and facilitates examination of associations of network indices with other quantitative variables using common statistical procedures. Here, as a proof of concept, we applied this method to a set of interview transcripts that had been coded in 2 different ways and the resultant network graphs were examined. The creation of network graphs allows researchers an opportunity to view and share their qualitative data in an innovative way that may provide new insights and enhance transparency of the analytical process by which they reach their conclusions. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Warton, David I; Thibaut, Loïc; Wang, Yi Alice
2017-01-01
Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
Bae, Jong-Myon
2016-01-01
A common method for conducting a quantitative systematic review (QSR) for observational studies related to nutritional epidemiology is the "highest versus lowest intake" method (HLM), in which only the information concerning the effect size (ES) of the highest category of a food item is collected on the basis of its lowest category. However, in the interval collapsing method (ICM), a method suggested to enable a maximum utilization of all available information, the ES information is collected by collapsing all categories into a single category. This study aimed to compare the ES and summary effect size (SES) between the HLM and ICM. A QSR for evaluating the citrus fruit intake and risk of pancreatic cancer and calculating the SES by using the HLM was selected. The ES and SES were estimated by performing a meta-analysis using the fixed-effect model. The directionality and statistical significance of the ES and SES were used as criteria for determining the concordance between the HLM and ICM outcomes. No significant differences were observed in the directionality of SES extracted by using the HLM or ICM. The application of the ICM, which uses a broader information base, yielded more-consistent ES and SES, and narrower confidence intervals than the HLM. The ICM is advantageous over the HLM owing to its higher statistical accuracy in extracting information for QSR on nutritional epidemiology. The application of the ICM should hence be recommended for future studies.
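The summary effect size under a fixed-effect model, as used in the comparison above, is conventionally an inverse-variance weighted mean of the study-level log effect sizes. The sketch below shows that standard calculation; the function name is an assumption and the three study values are invented for illustration, not taken from the review.

```python
import math

def fixed_effect_summary(log_es, se):
    """Pool log effect sizes with inverse-variance weights.

    log_es: study-level log effect sizes (e.g. log odds ratios).
    se: their standard errors.
    Returns the summary effect size and its 95% confidence interval
    on the original (ratio) scale.
    """
    weights = [1.0 / s**2 for s in se]
    pooled = sum(w * y for w, y in zip(weights, log_es)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    lower = pooled - 1.96 * pooled_se
    upper = pooled + 1.96 * pooled_se
    return math.exp(pooled), (math.exp(lower), math.exp(upper))

# Invented example: three studies with effect sizes 0.80, 0.90 and 1.10.
ses, ci = fixed_effect_summary(
    [math.log(0.80), math.log(0.90), math.log(1.10)],
    [0.10, 0.20, 0.20],
)
```

Because the pooled standard error shrinks as more information is included, a method that uses every intake category would be expected to give narrower intervals, which is consistent with what the abstract reports for the ICM.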
Thibaut, Loïc; Wang, Yi Alice
2017-01-01
Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071
Approximate Single-Diode Photovoltaic Model for Efficient I-V Characteristics Estimation
Ting, T. O.; Zhang, Nan; Guan, Sheng-Uei; Wong, Prudence W. H.
2013-01-01
Precise photovoltaic (PV) behavior models are normally described by nonlinear analytical equations. To solve such equations, it is necessary to use iterative procedures. Aiming to make the computation easier, this paper proposes an approximate single-diode PV model that enables high-speed predictions for the electrical characteristics of commercial PV modules. Based on the experimental data, statistical analysis is conducted to validate the approximate model. Simulation results show that the calculated current-voltage (I-V) characteristics fit the measured data with high accuracy. Furthermore, compared with the existing modeling methods, the proposed model reduces the simulation time by approximately 30% in this work. PMID:24298205
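For context, the single-diode model that approximations of this kind usually start from is commonly written as below. The symbols are the conventional ones and are assumptions here, since the abstract does not give the equations: $I_{ph}$ is the photocurrent, $I_0$ the diode saturation current, $R_s$ and $R_{sh}$ the series and shunt resistances, $n$ the ideality factor, $N_s$ the number of cells in series, and $V_T = kT/q$ the thermal voltage.

```latex
I = I_{ph} - I_0\left[\exp\!\left(\frac{V + I R_s}{n N_s V_T}\right) - 1\right] - \frac{V + I R_s}{R_{sh}},
\qquad V_T = \frac{kT}{q}
```

The current $I$ appears on both sides, so the equation is implicit in $I$; this is why iterative procedures are normally needed, and why an explicit approximation can reduce computation time. The specific approximation proposed in the paper is not reproduced here.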
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software called ABrox is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Last but not least, throughout the paper, we introduce ABrox using the accompanying graphical user interface.
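The simplest member of the ABC family is rejection sampling, sketched below for parameter estimation. This does not use ABrox or its API; the function name, the uniform prior, the sample mean as summary statistic and the tolerance are all assumptions made for the example.

```python
import numpy as np

def abc_rejection(observed, simulate, prior_sample, summary, eps, n_draws, seed=0):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    rng = np.random.default_rng(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                  # draw from the prior
        s_sim = summary(simulate(theta, rng))      # simulate, no likelihood needed
        if abs(s_sim - s_obs) <= eps:
            accepted.append(theta)
    return np.array(accepted)

# Toy problem: infer the mean of a normal distribution with known unit variance.
data_rng = np.random.default_rng(1)
observed = data_rng.normal(loc=2.0, scale=1.0, size=50)
posterior = abc_rejection(
    observed,
    simulate=lambda mu, rng: rng.normal(loc=mu, scale=1.0, size=50),
    prior_sample=lambda rng: rng.uniform(-5.0, 5.0),
    summary=np.mean,
    eps=0.1,
    n_draws=20000,
)
```

For model comparison, the same idea is applied with the model index drawn from a prior as well; the ratio of acceptance counts between two models then approximates the Bayes factor when the prior model probabilities are equal.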
An architecture for a brain-image database
NASA Technical Reports Server (NTRS)
Herskovits, E. H.
2000-01-01
The widespread availability of methods for noninvasive assessment of brain structure has enabled researchers to investigate neuroimaging correlates of normal aging, cerebrovascular disease, and other processes; we designate such studies as image-based clinical trials (IBCTs). We propose an architecture for a brain-image database, which integrates image processing and statistical operators, and thus supports the implementation and analysis of IBCTs. The implementation of this architecture is described and results from the analysis of image and clinical data from two IBCTs are presented. We expect that systems such as this will play a central role in the management and analysis of complex research data sets.
NASA Astrophysics Data System (ADS)
Klein, Kristopher; Kasper, Justin; Korreck, Kelly; Alterman, Benjamin
2017-04-01
The role of free-energy driven instabilities in governing heating and acceleration processes in the heliosphere has been studied for over half a century, with significant recent advancements enabled by the statistical analysis of decades' worth of observations from missions such as WIND. Typical studies focus on marginal stability boundaries in a reduced parameter space, such as the canonical plasma beta versus temperature anisotropy plane, due to a single source of free energy. We present a more general method of determining stability, accounting for all possible sources of free energy in the constituent plasma velocity distributions. Through this novel implementation, we can efficiently determine if the plasma is linearly unstable, and if so, how many normal modes are growing. Such identification will enable us to better pinpoint the dominant heating or acceleration processes in solar wind plasma. The theory behind this approach is reviewed, followed by a discussion of our methods for a robust numerical implementation, and an initial application to portions of the WIND data set. Further application of this method to velocity distribution measurements from current missions, including WIND, upcoming missions, including Solar Probe Plus and Solar Orbiter, and missions currently in preliminary phases, such as ESA's THOR and NASA's IMAP, will help elucidate how instabilities shape the evolution of the heliosphere.
Heart rate sensitive optical coherence angiography
NASA Astrophysics Data System (ADS)
Alvarez, Karl; Lopez-Tremoleda, Jordi; Donnan, Rob; Michael-Titus, Adina T.; Tomlins, Peter H.
2018-02-01
Optical coherence angiography (OCA) enables visualisation of three-dimensional micro-vasculature from optical coherence tomography data volumes. Typically, various statistical methods are used to discriminate static tissue from blood flow within vessels. In this paper, we introduce a new method that relies upon the beating heart frequency to isolate blood vessels from the surrounding tissue. Vascular blood flow is assumed to be more strongly modulated by the heart-beat compared to surrounding tissue and therefore short-time Fourier transform of sequential measurements can discriminate the two. Furthermore, it is demonstrated that adjacent B-Scans within an OCT data volume can provide the required sampling frequency. As such, the technique can be considered to be a spatially mapped variation of photoplethysmography (PPG), whereby each image voxel operates as a PPG detector. This principle is demonstrated using both a model system and in vivo for monitoring the vascular changes effected by traumatic brain injury in mice. In vivo measurements were acquired at an A-Scan rate of 10kHz to form a 500x500x512 (lateral x lateral x axial) pixel volume, enabling sequential sampling of the mouse heart rate in an expected range of 300-600 bpm. One of the advantages of this new OCA processing method is that it can be used in conjunction with existing algorithms as an additional filter for signal to noise enhancement.
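The sampling figures quoted above can be checked with a short calculation. It assumes that each B-scan consists of 500 A-scans (one lateral dimension of the volume) and that there is no dead time between A-scans or B-scans; neither assumption is stated in the abstract.

```python
A_SCAN_RATE_HZ = 10_000        # from the abstract
A_SCANS_PER_B_SCAN = 500       # assumption: one lateral dimension of the volume

# Rate at which the same lateral line is revisited by adjacent B-scans.
b_scan_rate_hz = A_SCAN_RATE_HZ / A_SCANS_PER_B_SCAN

# Highest frequency that this sampling rate can represent.
nyquist_hz = b_scan_rate_hz / 2
nyquist_bpm = nyquist_hz * 60

def heart_rate_resolvable(bpm):
    """True if the heart rate lies at or below the Nyquist limit.

    A rate exactly at the limit is only borderline resolvable.
    """
    return bpm / 60 <= nyquist_hz
```

Under these assumptions the B-scan rate is 20 Hz and the Nyquist limit is 10 Hz, or 600 beats per minute, which coincides with the upper end of the 300-600 bpm range given in the abstract.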
Gao, Bin; Li, Xiaoqing; Woo, Wai Lok; Tian, Gui Yun
2018-05-01
Thermographic inspection has been widely applied to non-destructive testing and evaluation owing to its capabilities for rapid, contactless, large-surface-area detection. Image segmentation is considered essential for identifying and sizing defects. To attain high-level performance, specific physics-based models that describe defect generation and enable the precise extraction of the target region are of crucial importance. In this paper, an effective genetic first-order statistical image segmentation algorithm is proposed for quantitative crack detection. The proposed method automatically extracts valuable spatial-temporal patterns using an unsupervised feature extraction algorithm and avoids a range of issues associated with human intervention in the laborious manual selection of specific thermal video frames for processing. An internal genetic functionality is built into the proposed algorithm to automatically control the segmentation threshold to render enhanced accuracy in sizing the cracks. Eddy current pulsed thermography is implemented as a platform to demonstrate surface crack detection. Experimental tests and comparisons have been conducted to verify the efficacy of the proposed method. In addition, a global quantitative assessment index, the F-score, has been adopted to objectively evaluate the performance of different segmentation algorithms.
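The F-score mentioned above is conventionally computed from pixel-wise precision and recall of a segmentation mask against a reference mask. The sketch below shows that standard definition; the function name and the tiny example masks are assumptions, and the paper's exact evaluation protocol is not known from the abstract.

```python
import numpy as np

def f_score(predicted, truth, beta=1.0):
    """F-beta score of a binary segmentation mask against a reference mask."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(predicted & truth)       # defect pixels correctly detected
    fp = np.sum(predicted & ~truth)      # background marked as defect
    fn = np.sum(~predicted & truth)      # defect pixels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float((1 + beta**2) * precision * recall / (beta**2 * precision + recall))

# Invented 2x3 masks: two true positives, one false positive, one false negative.
truth = [[0, 1, 1], [0, 1, 0]]
predicted = [[0, 1, 0], [1, 1, 0]]
score = f_score(predicted, truth)
```

With beta equal to 1 the score is the harmonic mean of precision and recall, so it penalises both over-segmentation and under-segmentation of a crack.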
A Physics-Inspired Mechanistic Model of Migratory Movement Patterns in Birds.
Revell, Christopher; Somveille, Marius
2017-08-29
In this paper, we introduce a mechanistic model of migratory movement patterns in birds, inspired by ideas and methods from physics. Previous studies have shed light on the factors influencing bird migration but have mainly relied on statistical correlative analysis of tracking data. Our novel method offers a bottom-up explanation of population-level migratory movement patterns. It differs from previous mechanistic models of animal migration and enables predictions of pathways and destinations from a given starting location. We define an environmental potential landscape from environmental data and simulate bird movement within this landscape based on simple decision rules drawn from statistical mechanics. We explore the capacity of the model by qualitatively comparing simulation results to the non-breeding migration patterns of a seabird species, the Black-browed Albatross (Thalassarche melanophris). This minimal, two-parameter model was able to capture remarkably well the previously documented migration patterns of the Black-browed Albatross, with the best combination of parameter values conserved across multiple geographically separate populations. Our physics-inspired mechanistic model could be applied to other birds and highly mobile species, improving our understanding of the relative importance of various factors driving migration and making predictions that could be useful for conservation.
Framework for making better predictions by directly estimating variables' predictivity.
Lo, Adeline; Chernoff, Herman; Zheng, Tian; Lo, Shaw-Hwa
2016-12-13
We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity that enables assessing variable sets for, preferably high, predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, due to its inflated bias for moderate sample size and its sensitivity to noisy useless variables. We demonstrate that the [Formula: see text]-score of the PR method of VS yields a relatively unbiased estimate of a parameter that is not sensitive to noisy variables and is a lower bound to the parameter of interest. Thus, the PR method using the [Formula: see text]-score provides an effective approach to selecting highly predictive variables. We offer simulations and an application of the [Formula: see text]-score on real data to demonstrate the statistic's predictive performance on sample data. We conjecture that using the partition retention and [Formula: see text]-score can aid in finding variable sets with promising prediction rates; however, further research in the avenue of sample-based measures of predictivity is much desired.
Non-ad-hoc decision rule for the Dempster-Shafer method of evidential reasoning
NASA Astrophysics Data System (ADS)
Cheaito, Ali; Lecours, Michael; Bosse, Eloi
1998-03-01
This paper is concerned with the fusion of identity information through the use of statistical analysis rooted in the Dempster-Shafer theory of evidence to provide automatic identification aboard a platform. An identity information process for a baseline Multi-Source Data Fusion (MSDF) system is defined. The MSDF system is applied to information sources which include a number of radars, IFF systems, an ESM system, and a remote track source. We use a comprehensive Platform Data Base (PDB) containing all the possible identity values that the potential target may take, and we use fuzzy logic strategies which enable the fusion of subjective attribute information from sensors and the PDB to derive target identity more quickly, more precisely, and with statistically quantifiable measures of confidence. The conventional Dempster-Shafer method lacks a formal basis upon which decisions can be made in the face of ambiguity. We define a non-ad hoc decision rule based on the expected utility interval for pruning the 'unessential' propositions which would otherwise overload real-time data fusion systems. An example has been selected to demonstrate the implementation of our modified Dempster-Shafer method of evidential reasoning.
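To make the underlying evidence-combination step concrete, the sketch below implements the classical Dempster rule of combination for two mass functions. It does not implement the paper's expected-utility decision rule or its pruning of propositions; the function name, the three-element frame and the mass values are invented for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dictionaries."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

FRAME = frozenset({"friend", "hostile", "neutral"})   # hypothetical identities
radar = {frozenset({"friend"}): 0.6, FRAME: 0.4}
esm = {
    frozenset({"hostile"}): 0.3,
    frozenset({"friend", "neutral"}): 0.5,
    FRAME: 0.2,
}
fused = dempster_combine(radar, esm)
```

In this invented example the conflicting mass is 0.18, so the fused mass on "friend" is 0.42 / 0.82. Each combination can create new focal sets, which is the growth in propositions that a pruning rule of the kind described in the abstract is meant to contain.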
Leeseberg Stamler, L; Cole, M M; Patrick, L J
2001-08-01
Strategies to delay or prevent complications from diabetes include diabetes patient education. Diabetes educators seek to provide education that meets the needs of clients and influences positive health outcomes. The objectives were (1) to expand prior research exploring an enablement framework for patient education by examining perceptions of patient education by persons with diabetes and (2) to test the mastery of stress instrument (MSI) as a potential evaluative instrument for patient education. Triangulated data collection was used with a convenience sample of adults taking diabetes education classes. Half the sample completed audio-taped semi-structured interviews pre-, during and post-education, and all completed the MSI post-education. Qualitative data were analysed using latent content analysis; descriptive statistics were completed. Qualitative analysis revealed content categories similar to previous work with prenatal participants, supporting the enablement framework. Statistical analyses noted congruence with psychometric findings from development of the MSI; secondary qualitative analyses revealed congruency between MSI scores and patient perceptions. Mastery is an outcome congruent with the enablement framework for patient education across content areas. The mastery of stress instrument may be an instrument for identifying patients who are coping well with diabetes self-management, as well as those who are not and who require further nursing interventions.
NASA Astrophysics Data System (ADS)
Fan, Daidu; Tu, Junbiao; Cai, Guofu; Shang, Shuai
2015-06-01
Grain-size analysis is a basic routine in sedimentology and related fields, but diverse methods of sample collection, processing and statistical analysis often make direct comparisons and interpretations difficult or even impossible. In this paper, 586 published grain-size datasets from the Qiantang Estuary (East China Sea) sampled and analyzed by the same procedures were merged and their textural parameters calculated by a percentile and two moment methods. The aim was to explore which of the statistical procedures performed best in the discrimination of three distinct sedimentary units on the tidal flats of the middle Qiantang Estuary. A Gaussian curve-fitting method served to simulate mixtures of two normal populations having different modal sizes, sorting values and size distributions, enabling a better understanding of the impact of finer tail components on textural parameters, as well as the proposal of a unifying descriptive nomenclature. The results show that percentile and moment procedures yield almost identical results for mean grain size, and that sorting values are also highly correlated. However, more complex relationships exist between percentile and moment skewness (kurtosis), changing from positive to negative correlations when the proportions of the finer populations decrease below 35% (10%). This change results from the overweighting of tail components in moment statistics, which stands in sharp contrast to the underweighting or complete amputation of small tail components by the percentile procedure. Intercomparisons of bivariate plots suggest an advantage of the Friedman & Johnson moment procedure over the McManus moment method in terms of the description of grain-size distributions, and over the percentile method by virtue of a greater sensitivity to small variations in tail components. 
The textural parameter scalings of Folk & Ward were translated into their Friedman & Johnson moment counterparts by application of mathematical functions derived by regression analysis of measured and modeled grain-size data, or by determining the abscissa values of intersections between auxiliary lines running parallel to the x-axis and vertical lines corresponding to the descriptive percentile limits along the ordinate of representative bivariate plots. Twofold limits were extrapolated for the moment statistics in relation to single descriptive terms in the cases of skewness and kurtosis by considering both positive and negative correlations between percentile and moment statistics. The extrapolated descriptive scalings were further validated by examining entire size-frequency distributions simulated by mixing two normal populations of designated modal size and sorting values, but varying in mixing ratios. These were found to match well in most of the proposed scalings, although platykurtic and very platykurtic categories were questionable when the proportion of the finer population was below 5%. Irrespective of the statistical procedure, descriptive nomenclatures should therefore be cautiously used when tail components contribute less than 5% to grain-size distributions.
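To make the contrast between percentile and moment statistics concrete, the sketch below simulates a mixture of two normal populations on a phi grid and computes both the Folk & Ward graphical measures and plain frequency-weighted moments. The modal sizes, sorting values, and grid are hypothetical, and the moment formulas are the generic ones; they are not claimed to match the exact Friedman & Johnson or McManus procedures compared in the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def mixture(phi, fine_fraction, coarse=(3.0, 0.5), fine=(6.0, 1.0)):
    """Relative frequencies of a two-population mixture (hypothetical modes)."""
    raw = [(1 - fine_fraction) * normal_pdf(x, *coarse)
           + fine_fraction * normal_pdf(x, *fine) for x in phi]
    total = sum(raw)
    return [r / total for r in raw]


def moment_stats(phi, freq):
    """Frequency-weighted mean, sorting, skewness and kurtosis."""
    mean = sum(f * x for f, x in zip(freq, phi))
    sd = math.sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freq, phi)))
    skew = sum(f * (x - mean) ** 3 for f, x in zip(freq, phi)) / sd ** 3
    kurt = sum(f * (x - mean) ** 4 for f, x in zip(freq, phi)) / sd ** 4
    return mean, sd, skew, kurt


def percentile(phi, freq, p):
    """Phi value at cumulative percentage p (linear interpolation in a bin)."""
    target, cum, step = p / 100.0, 0.0, phi[1] - phi[0]
    for centre, f in zip(phi, freq):
        if cum + f >= target:
            frac = (target - cum) / f if f > 0 else 0.0
            return centre - step / 2 + frac * step
        cum += f
    return phi[-1]


def folk_ward_stats(phi, freq):
    """Folk & Ward graphical mean, sorting, skewness and kurtosis."""
    p = {q: percentile(phi, freq, q) for q in (5, 16, 25, 50, 75, 84, 95)}
    mean = (p[16] + p[50] + p[84]) / 3
    sorting = (p[84] - p[16]) / 4 + (p[95] - p[5]) / 6.6
    skew = ((p[16] + p[84] - 2 * p[50]) / (2 * (p[84] - p[16]))
            + (p[5] + p[95] - 2 * p[50]) / (2 * (p[95] - p[5])))
    kurt = (p[95] - p[5]) / (2.44 * (p[75] - p[25]))
    return mean, sorting, skew, kurt


phi_grid = [-1 + 0.05 * i for i in range(281)]      # -1 to 13 phi
with_tail = mixture(phi_grid, fine_fraction=0.03)   # 3% fine tail
moments = moment_stats(phi_grid, with_tail)
graphical = folk_ward_stats(phi_grid, with_tail)
```

With a fine tail of only a few percent, the moment skewness reacts strongly while the percentile skewness barely moves, because most of the tail lies beyond the 95th percentile used by the graphical formulas.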
Atlas-based liver segmentation and hepatic fat-fraction assessment for clinical trials.
Yan, Zhennan; Zhang, Shaoting; Tan, Chaowei; Qin, Hongxing; Belaroussi, Boubakeur; Yu, Hui Jing; Miller, Colin; Metaxas, Dimitris N
2015-04-01
Automated assessment of hepatic fat-fraction is clinically important. A robust and precise segmentation would enable accurate, objective and consistent measurement of hepatic fat-fraction for disease quantification, therapy monitoring and drug development. However, segmenting the liver in clinical trials is a challenging task due to the variability of liver anatomy as well as the diverse sources the images were acquired from. In this paper, we propose an automated and robust framework for liver segmentation and assessment. It uses single statistical atlas registration to initialize a robust deformable model to obtain fine segmentation. A fat-fraction map is computed using a chemical-shift-based method in the delineated region of the liver. The proposed method is validated on 14 abdominal magnetic resonance (MR) volumetric scans. The qualitative and quantitative comparisons show that our proposed method can achieve better segmentation accuracy with less variance compared with two other atlas-based methods. Experimental results demonstrate the promise of our assessment framework. Copyright © 2014 Elsevier Ltd. All rights reserved.
Structured Matrix Completion with Applications to Genomic Data Integration.
Cai, Tianxi; Cai, T Tony; Zhang, Anru
2016-01-01
Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive a lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.
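The row-and-column missingness pattern can be illustrated for the idealised, exactly low-rank case, where the unobserved block equals A21 A11^+ A12 whenever the observed corner A11 has the full rank of the matrix. The sketch below uses NumPy and a truncated pseudo-inverse; the function name and the choice of rank are assumptions for illustration, and this is not the authors' SMC estimator, which addresses approximately low-rank matrices.

```python
import numpy as np


def complete_block(a11, a12, a21, rank):
    """Estimate the missing block A22 from the observed blocks.

    a11: observed rows x observed columns
    a12: observed rows x missing columns
    a21: missing rows x observed columns
    """
    u, s, vt = np.linalg.svd(a11, full_matrices=False)
    # Rank-truncated pseudo-inverse of the observed corner block.
    a11_pinv = (vt[:rank].T / s[:rank]) @ u[:, :rank].T
    return a21 @ a11_pinv @ a12


# Hypothetical rank-2 matrix with the lower-right block treated as missing.
rng = np.random.default_rng(0)
full = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 7))
estimate = complete_block(full[:4, :3], full[:4, 3:], full[4:, :3], rank=2)
```

In the noise-free case the estimate matches the held-out block up to rounding; with noisy or only approximately low-rank data the truncation rank becomes a tuning choice.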
Apparent Yield Strength of Hot-Pressed SiCs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daloz, William L; Wereszczak, Andrew A; Jadaan, Osama M.
2008-01-01
Apparent yield strengths (YApp) of four hot-pressed silicon carbides (SiC-B, SiC-N, SiC-HPN, and SiC-SC-1RN) were estimated using diamond spherical or Hertzian indentation. The von Mises and Tresca criteria were considered. The developed test method was robust, simple and quick to execute, and thus enabled the acquisition of confident sampling statistics. The choice of indenter size, test method, and method of analysis are described. The compressive force necessary to initiate apparent yielding was identified postmortem using differential interference contrast (or Nomarski) imaging with an optical microscope. It was found that the YApp of SiC-HPN (14.0 GPa) was approximately 10% higher than the equivalently valued YApp of SiC-B, SiC-N, and SiC-SC-1RN. This discrimination in YApp shows that the use of this test method could be insightful because there were no differences among the average Knoop hardnesses of the four SiC grades.
Quantitative characterization of genetic parts and circuits for plant synthetic biology.
Schaumberg, Katherine A; Antunes, Mauricio S; Kassaw, Tessema K; Xu, Wenlong; Zalewski, Christopher S; Medford, June I; Prasad, Ashok
2016-01-01
Plant synthetic biology promises immense technological benefits, including the potential development of a sustainable bio-based economy through the predictive design of synthetic gene circuits. Such circuits are built from quantitatively characterized genetic parts; however, this characterization is a significant obstacle in work with plants because of the time required for stable transformation. We describe a method for rapid quantitative characterization of genetic plant parts using transient expression in protoplasts and dual luciferase outputs. We observed experimental variability in transient-expression assays and developed a mathematical model to describe, as well as statistical normalization methods to account for, this variability, which allowed us to extract quantitative parameters. We characterized >120 synthetic parts in Arabidopsis and validated our method by comparing transient expression with expression in stably transformed plants. We also tested >100 synthetic parts in sorghum (Sorghum bicolor) protoplasts, and the results showed that our method works in diverse plant groups. Our approach enables the construction of tunable gene circuits in complex eukaryotic organisms.
A sensitive continuum analysis method for gamma ray spectra
NASA Technical Reports Server (NTRS)
Thakur, Alakh N.; Arnold, James R.
1993-01-01
In this work we examine ways to improve the sensitivity of the analysis procedure for gamma ray spectra with respect to small differences in the continuum (Compton) spectra. The method developed is applied to analyze gamma ray spectra obtained from planetary mapping by the Mars Observer spacecraft launched in September 1992. Calculated Mars simulation spectra and actual thick target bombardment spectra have been taken as test cases. The principle of the method rests on the extraction of continuum information from Fourier transforms of the spectra. We study how a better estimate of the spectrum from larger regions of the Mars surface will improve the analysis for smaller regions with poorer statistics. Estimation of signal within the continuum is done in the frequency domain which enables efficient and sensitive discrimination of subtle differences between two spectra. The process is compared to other methods for the extraction of information from the continuum. Finally we explore briefly the possible uses of this technique in other applications of continuum spectra.
Impact of feature saliency on visual category learning
Hammer, Rubi
2015-01-01
People have to sort numerous objects into a large number of meaningful categories while operating in varying contexts. This requires identifying the visual features that best predict the ‘essence’ of objects (e.g., edibility), rather than categorizing objects based on the most salient features in a given context. To gain this capacity, visual category learning (VCL) relies on multiple cognitive processes. These may include unsupervised statistical learning, that requires observing multiple objects for learning the statistics of their features. Other learning processes enable incorporating different sources of supervisory information, alongside the visual features of the categorized objects, from which the categorical relations between few objects can be deduced. These deductions enable inferring that objects from the same category may differ from one another in some high-saliency feature dimensions, whereas lower-saliency feature dimensions can best differentiate objects from distinct categories. Here I illustrate how feature saliency affects VCL, by also discussing kinds of supervisory information enabling reflective categorization. Arguably, principles debated here are often being ignored in categorization studies. PMID:25954220
Asking Sensitive Questions: A Statistical Power Analysis of Randomized Response Models
ERIC Educational Resources Information Center
Ulrich, Rolf; Schroter, Hannes; Striegel, Heiko; Simon, Perikles
2012-01-01
This article derives the power curves for a Wald test that can be applied to randomized response models when small prevalence rates must be assessed (e.g., detecting doping behavior among elite athletes). These curves enable the assessment of the statistical power that is associated with each model (e.g., Warner's model, crosswise model, unrelated…
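As a rough illustration of how such power curves arise, the sketch below covers Warner's model only, under a normal approximation and a one-sided test of zero prevalence. The function names are hypothetical, and the formulas are the textbook Warner estimator and a generic Wald-type approximation, not necessarily the exact derivation in the article.

```python
from math import sqrt
from statistics import NormalDist


def warner_estimate(yes, n, p):
    """Prevalence estimate and standard error under Warner's model.

    p is the probability that a respondent is asked the sensitive
    statement rather than its negation (p must differ from 0.5).
    """
    lam = yes / n                              # observed "yes" proportion
    pi_hat = (lam - (1 - p)) / (2 * p - 1)
    se = sqrt(lam * (1 - lam) / n) / abs(2 * p - 1)
    return pi_hat, se


def warner_power(pi, n, p, alpha=0.05):
    """Approximate power of a one-sided Wald test of H0: prevalence = 0."""
    lam = pi * p + (1 - pi) * (1 - p)          # expected "yes" proportion
    se = sqrt(lam * (1 - lam) / n) / abs(2 * p - 1)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(pi / se - z_crit)


pi_hat, se = warner_estimate(yes=40, n=100, p=0.7)
power_small = warner_power(pi=0.10, n=500, p=0.75)
power_large = warner_power(pi=0.10, n=2000, p=0.75)
```

Evaluating `warner_power` over a range of prevalence values and sample sizes traces out a power curve; values of `p` close to 0.5 protect respondents more but inflate the standard error.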
Python package for model STructure ANalysis (pySTAN)
NASA Astrophysics Data System (ADS)
Van Hoey, Stijn; van der Kwast, Johannes; Nopens, Ingmar; Seuntjens, Piet
2013-04-01
The selection and identification of a suitable hydrological model structure is more than fitting parameters of a model structure to reproduce a measured hydrograph. The procedure is highly dependent on various criteria, i.e. the modelling objective, the characteristics and the scale of the system under investigation as well as the available data. Rigorous analysis of the candidate model structures is needed to support and objectify the selection of the most appropriate structure for a specific case (or eventually justify the use of a proposed ensemble of structures). This holds both in the situation of choosing between a limited set of different structures as well as in the framework of flexible model structures with interchangeable components. Many different methods to evaluate and analyse model structures exist. This leads to a sprawl of available methods, all characterized by different assumptions, changing conditions of application and various code implementations. Methods typically focus on optimization, sensitivity analysis or uncertainty analysis, with backgrounds from optimization, machine-learning or statistics amongst others. These methods also need an evaluation metric (objective function) to compare the model outcome with some observed data. However, for current methods described in literature, implementations are not always transparent and reproducible (if available at all). No standard procedures exist to share code and the popularity (and amount of applications) of the methods is sometimes more dependent on the availability than the merits of the method. Moreover, new implementations of existing methods are difficult to verify and the different theoretical backgrounds make it difficult for environmental scientists to decide about the usefulness of a specific method. A common and open framework with a large set of methods can support users in deciding about the most appropriate method. 
Hence, it enables different methods to be applied and compared simultaneously on a fair basis. We developed and present pySTAN (python framework for STructure Analysis), a python package containing a set of functions for the evaluation and analysis of (hydrological) model structures. A selected set of algorithms for optimization, uncertainty and sensitivity analysis is currently available, together with a set of evaluation (objective) functions and input distributions to sample from. The methods are implemented in a model-independent way, and the python language provides the wrapper functions to administer external model codes. Different objective functions can be considered simultaneously, with both statistical metrics and more hydrology-specific metrics. By using reStructuredText (with the Sphinx documentation generator) and Python documentation strings (docstrings), the generation of manual pages is semi-automated and a specific environment is available to enhance both the readability and transparency of the code. It thereby enables a larger group of users to apply and compare these methods and to extend the functionalities.
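The idea of model-independent evaluation with several objective functions at once can be sketched in a few lines. None of the names below come from pySTAN; the toy reservoir model, the `evaluate` helper, and the parameter values are hypothetical and only illustrate the design.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error (a general statistical metric)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated))
                     / len(observed))


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency (a hydrology-specific metric); 1 is perfect."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def evaluate(model, parameters, forcing, observed, objectives):
    """Run any callable model and score it with every objective function."""
    simulated = model(parameters, forcing)
    return {name: fn(observed, simulated) for name, fn in objectives.items()}


def linear_reservoir(parameters, rainfall):
    """Toy model: a single storage that releases a fixed fraction per step."""
    storage, flows = 0.0, []
    for rain in rainfall:
        storage += rain
        outflow = parameters["k"] * storage
        storage -= outflow
        flows.append(outflow)
    return flows


rainfall = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0, 0.0, 0.0]
observed = linear_reservoir({"k": 0.3}, rainfall)   # synthetic "truth"
objectives = {"rmse": rmse, "nse": nash_sutcliffe}
scores = evaluate(linear_reservoir, {"k": 0.1}, rainfall, observed, objectives)
```

Because `evaluate` only requires a callable, the same objective functions can be reused for an external model code wrapped in a Python function.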
Factors that enable and hinder the implementation of projects in the alcohol and other drug field.
MacLean, Sarah; Berends, Lynda; Hunter, Barbara; Roberts, Bridget; Mugavin, Janette
2012-02-01
Few studies systematically explore elements of successful project implementation across a range of alcohol and other drug (AOD) activities. This paper provides an evidence base to inform project implementation in the AOD field. We accessed records for 127 completed projects funded by the Alcohol, Education and Rehabilitation Foundation from 2002 to 2008. An adapted realist synthesis methodology enabled us to develop categories of enablers and barriers to successful project implementation, and to identify factors statistically associated with successful project implementation, defined as meeting all funding objectives. Thematic analysis of eight case study projects allowed detailed exploration of findings. Nine enabler and 10 barrier categories were identified. Those most frequently reported as both barriers and enablers concerned partnerships with external agencies and communities, staffing and project design. Achieving supportive relationships with partner agencies and communities, employing skilled staff and implementing consumer or participant input mechanisms were statistically associated with successful project implementation. The framework described here will support development of evidence-based project funding guidelines and project performance indicators. The study provides evidence that investing project hours and resources to develop robust relationships with project partners and communities, implementing mechanisms for consumer or participant input and attracting skilled staff are legitimate and important activities, not just in themselves but because they potentially influence achievement of project funding objectives. © 2012 The Authors. ANZJPH © 2012 Public Health Association of Australia.
León, Larry F; Cai, Tianxi
2012-04-01
In this paper we develop model checking techniques for assessing functional form specifications of covariates in censored linear regression models. These procedures are based on a censored data analog to taking cumulative sums of "robust" residuals over the space of the covariate under investigation. These cumulative sums are formed by integrating certain Kaplan-Meier estimators and may be viewed as "robust" censored data analogs to the processes considered by Lin, Wei & Ying (2002). The null distributions of these stochastic processes can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be generated by computer simulation. Each observed process can then be graphically compared with a few realizations from the Gaussian process. We also develop formal test statistics for numerical comparison. Such comparisons enable one to assess objectively whether an apparent trend seen in a residual plot reflects model misspecification or natural variation. We illustrate the methods with a well known dataset. In addition, we examine the finite sample performance of the proposed test statistics in simulation experiments. In our simulation experiments, the proposed test statistics have good power for detecting misspecification while at the same time controlling the size of the test.
Connectopic mapping with resting-state fMRI.
Haak, Koen V; Marquand, Andre F; Beckmann, Christian F
2018-04-15
Brain regions are often topographically connected: nearby locations within one brain area connect with nearby locations in another area. Mapping these connection topographies, or 'connectopies' in short, is crucial for understanding how information is processed in the brain. Here, we propose principled, fully data-driven methods for mapping connectopies using functional magnetic resonance imaging (fMRI) data acquired at rest by combining spectral embedding of voxel-wise connectivity 'fingerprints' with a novel approach to spatial statistical inference. We apply the approach in human primary motor and visual cortex, and show that it can trace biologically plausible, overlapping connectopies in individual subjects that follow these regions' somatotopic and retinotopic maps. As a generic mechanism to perform inference over connectopies, the new spatial statistics approach enables rigorous statistical testing of hypotheses regarding the fine-grained spatial profile of functional connectivity and whether that profile is different between subjects or between experimental conditions. The combined framework offers a fundamental alternative to existing approaches to investigating functional connectivity in the brain, from voxel- or seed-pair wise characterizations of functional association, towards a full, multivariate characterization of spatial topography. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Estimation of gene induction enables a relevance-based ranking of gene sets.
Bartholomé, Kilian; Kreutz, Clemens; Timmer, Jens
2009-07-01
In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either require threshold values to be chosen for the analysis, or they need some reference set for the determination of significance. We present a method that estimates the number of differentially expressed genes in a gene set without requiring a threshold value for significance of genes. The method is self-contained (i.e., it does not require a reference set for comparison). In contrast to other methods which are focused on significance, our approach emphasizes the relevance of the regulation of gene sets. The presented method measures the degree of regulation of a gene set and is a useful tool to compare the induction of different gene sets and place the results of microarray experiments into the biological context. An R-package is available.
NASA Astrophysics Data System (ADS)
Field, S. N.; Glassom, D.; Bythell, J.
2007-06-01
The choice of substrata and the methods of deployment in analyses of settlement in benthic communities are often driven by the cost of materials and their local availability, and comparisons are often made between studies using different methodologies. The effects of varying artificial substratum, size of replicates and method of deployment were determined on a shallow reef in Eilat, Israel, while the effect of size of replicates was also investigated on a shallow reef in Sharm El Sheikh, Egypt. When statistical power was high enough, that is, when sufficient numbers of settlers were recorded, significant differences were found between materials used, tile size and methods of deployment. Significant differences were detected in total coral settlement rates and for the two dominant taxonomic groups, acroporids and pocilloporids. Standardisation of tile materials, dimensions, and method of deployment is needed for comparison between coral and other epibiont settlement studies. However, a greater understanding of the effects of these experimental variables on settlement processes may enable retrospective comparisons between studies utilising a range of materials and methods.
NASA Astrophysics Data System (ADS)
Kataoka, Norio; Kasama, Kiyonobu; Zen, Kouki; Chen, Guangqi
This paper presents a probabilistic method for assessing the liquefaction risk of cement-treated ground, which is an anti-liquefaction ground improved by cement-mixing. In this study, the liquefaction potential of cement-treated ground is analyzed statistically using Monte Carlo simulation based on nonlinear earthquake response analysis considering the spatial variability of soil properties. The seismic bearing capacity of partially liquefied ground is analyzed in order to estimate damage costs induced by partial liquefaction. Finally, the annual liquefaction risk is calculated by multiplying the liquefaction potential by the damage costs. The results indicate that the proposed method makes it possible to evaluate the probability of liquefaction and to estimate the damage costs using the hazard curve, the liquefaction-induced fragility curve, and the liquefaction risk curve.
Uncertainty-enabled design of electromagnetic reflectors with integrated shape control
NASA Astrophysics Data System (ADS)
Haque, Samiul; Kindrat, Laszlo P.; Zhang, Li; Mikheev, Vikenty; Kim, Daewa; Liu, Sijing; Chung, Jooyeon; Kuian, Mykhailo; Massad, Jordan E.; Smith, Ralph C.
2018-03-01
We implemented a computationally efficient model for a corner-supported, thin, rectangular, orthotropic polyvinylidene fluoride (PVDF) laminate membrane, actuated by a two-dimensional array of segmented electrodes. The laminate can be used as a shape-controlled electromagnetic reflector, and the model estimates the reflector's shape given an array of control voltages. In this paper, we describe a model to determine the shape of the laminate for a given distribution of control voltages. Then, we investigate the surface shape error and its sensitivity to the model parameters. Subsequently, we analyze the simulated deflection of the actuated bimorph using a Zernike polynomial decomposition. Finally, we provide a probabilistic description of reflector performance using statistical methods to quantify uncertainty. We make design recommendations for nominal parameter values and their tolerances based on optimization under uncertainty using multiple methods.
Genomic Selection in Plant Breeding: Methods, Models, and Perspectives.
Crossa, José; Pérez-Rodríguez, Paulino; Cuevas, Jaime; Montesinos-López, Osval; Jarquín, Diego; de Los Campos, Gustavo; Burgueño, Juan; González-Camacho, Juan M; Pérez-Elizalde, Sergio; Beyene, Yoseph; Dreisigacker, Susanne; Singh, Ravi; Zhang, Xuecai; Gowda, Manje; Roorkiwal, Manish; Rutkoski, Jessica; Varshney, Rajeev K
2017-11-01
Genomic selection (GS) facilitates the rapid selection of superior genotypes and accelerates the breeding cycle. In this review, we discuss the history, principles, and basis of GS and genomic-enabled prediction (GP) as well as the genetics and statistical complexities of GP models, including genomic genotype×environment (G×E) interactions. We also examine the accuracy of GP models and methods for two cereal crops and two legume crops based on random cross-validation. GS applied to maize breeding has shown tangible genetic gains. Based on GP results, we speculate how GS in germplasm enhancement (i.e., prebreeding) programs could accelerate the flow of genes from gene bank accessions to elite lines. Recent advances in hyperspectral image technology could be combined with GS and pedigree-assisted breeding. Copyright © 2017 Elsevier Ltd. All rights reserved.
Assessment of HIV testing among young methamphetamine users in Muse, Northern Shan State, Myanmar
2014-01-01
Background: Methamphetamine (MA) use has a strong correlation with risky sexual behaviors, and thus may be triggering the growing HIV epidemic in Myanmar. Although methamphetamine use is a serious public health concern, only a few studies have examined HIV testing among young drug users. This study aimed to examine how predisposing, enabling and need factors affect HIV testing among young MA users. Methods: A cross-sectional study was conducted from January to March 2013 in Muse city in the Northern Shan State of Myanmar. Using a respondent-driven sampling method, 776 MA users aged 18-24 years were recruited. The main outcome of interest was whether participants had ever been tested for HIV. Descriptive statistics and multivariate logistic regression were applied in this study. Results: Approximately 14.7% of young MA users had ever been tested for HIV. Significant positive predictors of HIV testing included predisposing factors such as being a female MA user, having had higher education, and currently living with one’s spouse/sexual partner. Significant enabling factors included being employed and having ever visited NGO clinics or met NGO workers. Significant need factors were having ever been diagnosed with an STI and having ever wanted to receive help to stop drug use. Conclusions: Predisposing, enabling and need factors were significant contributors affecting uptake of HIV testing among young MA users. Integrating HIV testing into STI treatment programs, alongside general expansion of HIV testing services, may be effective in increasing HIV testing uptake among young MA users. PMID:25042697
Giordano, Bruno L.; Kayser, Christoph; Rousselet, Guillaume A.; Gross, Joachim; Schyns, Philippe G.
2016-01-01
Abstract We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open‐source Matlab and Python code implementing the new methods accompanies this article. Hum Brain Mapp 38:1541–1573, 2017. © 2016 Wiley Periodicals, Inc. PMID:27860095
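The core of the estimator described above (a copula transform followed by the closed-form Gaussian expression) can be sketched for two one-dimensional continuous variables. This is a simplified illustration, not the accompanying Matlab or Python code: it omits any bias correction, assumes there are no ties and no perfectly monotone relationship, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def copula_normalize(values):
    """Replace each value by the standard-normal score of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        scores[index] = NormalDist().inv_cdf(rank / (n + 1))
    return scores


def gaussian_copula_mi(x, y):
    """Mutual information estimate in bits for two 1-D variables."""
    gx, gy = copula_normalize(x), copula_normalize(y)
    n = len(gx)
    mean_x, mean_y = sum(gx) / n, sum(gy) / n
    sxx = sum((a - mean_x) ** 2 for a in gx)
    syy = sum((b - mean_y) ** 2 for b in gy)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(gx, gy))
    r = sxy / math.sqrt(sxx * syy)
    # Closed form for bivariate Gaussians: I = -0.5 * log2(1 - r^2).
    return -0.5 * math.log2(1.0 - r * r)
```

Because only ranks enter the calculation, the estimate is unchanged by any strictly increasing transformation of either variable, which is part of what puts effect sizes from different recording modalities on a common scale.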
Pauli structures arising from confined particles interacting via a statistical potential
NASA Astrophysics Data System (ADS)
Batle, Josep; Ciftja, Orion; Farouk, Ahmed; Alkhambashi, Majid; Abdalla, Soliman
2017-09-01
There have been suggestions that the Pauli exclusion principle alone can lead a non-interacting (free) system of identical fermions to form crystalline structures dubbed Pauli crystals. Single-shot imaging experiments for the case of ultra-cold systems of free spin-polarized fermionic atoms in a two-dimensional harmonic trap appear to show geometric arrangements that cannot be characterized as Wigner crystals. This work explores this idea and considers a well-known approach that enables one to treat a quantum system of free fermions as a system of classical particles interacting with a statistical interaction potential. The model under consideration, though classical in nature, incorporates the quantum statistics by endowing the classical particles with an effective interaction potential. The reasonable expectation is that possible Pauli crystal features seen in experiments may manifest in this model that captures the correct quantum statistics as a first order correction. We use the Monte Carlo simulated annealing method to obtain the most stable configurations of finite two-dimensional systems of confined particles that interact with an appropriate statistical repulsion potential. We consider both an isotropic harmonic and a hard-wall confinement potential. Despite minor differences, the most stable configurations observed in our model correspond to the reported Pauli crystals in single-shot imaging experiments of free spin-polarized fermions in a harmonic trap. The crystalline configurations observed appear to be different from the expected classical Wigner crystal structures that would emerge had the confined classical particles interacted via a pair-wise Coulomb repulsion.
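A minimal sketch of this kind of calculation follows, assuming the textbook first-order statistical potential for ideal fermions, v(r) = -kT ln(1 - exp(-2πr²/λ²)), in units where kT = 1 and the thermal wavelength λ = 1, with an isotropic harmonic trap of hypothetical stiffness. The annealing schedule, step size, and particle number are illustrative choices, not the authors' settings.

```python
import math
import random


def statistical_potential(r, wavelength=1.0):
    """Effective fermionic repulsion between two classical particles."""
    x = 2.0 * math.pi * (r / wavelength) ** 2
    if x == 0.0:
        return math.inf
    return -math.log(-math.expm1(-x))


def total_energy(points, trap=1.0):
    """Harmonic confinement plus pairwise statistical repulsion."""
    energy = sum(0.5 * trap * (x * x + y * y) for x, y in points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            energy += statistical_potential(math.dist(points[i], points[j]))
    return energy


def anneal(n_particles, steps=4000, t_start=1.0, t_end=0.001, seed=0):
    """Simulated annealing; returns the lowest-energy configuration seen."""
    rng = random.Random(seed)
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_particles)]
    energy = total_energy(points)
    best_points, best_energy = list(points), energy
    for step in range(steps):
        # Geometric cooling; this temperature is only an optimisation device.
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(n_particles)
        x, y = points[i]
        trial = list(points)
        trial[i] = (x + rng.gauss(0, 0.1), y + rng.gauss(0, 0.1))
        trial_energy = total_energy(trial)
        delta = trial_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            points, energy = trial, trial_energy
            if energy < best_energy:
                best_points, best_energy = list(points), energy
    return best_points, best_energy


configuration, energy = anneal(n_particles=5)
```

Replacing the harmonic term with an infinite penalty outside a fixed radius would give a hard-wall variant of the same search.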
Desensitized Optimal Filtering and Sensor Fusion Toolkit
NASA Technical Reports Server (NTRS)
Karlgaard, Christopher D.
2015-01-01
Analytical Mechanics Associates, Inc., has developed a software toolkit that filters and processes navigational data from multiple sensor sources. A key component of the toolkit is a trajectory optimization technique that reduces the sensitivity of Kalman filters with respect to model parameter uncertainties. The sensor fusion toolkit also integrates recent advances in adaptive Kalman and sigma-point filters for non-Gaussian problems with error statistics. This Phase II effort provides new filtering and sensor fusion techniques in a convenient package that can be used as a stand-alone application for ground support and/or onboard use. Its modular architecture enables ready integration with existing tools. A suite of sensor models and noise distributions, as well as a Monte Carlo analysis capability, is included to enable statistical performance evaluations.
Interactive (statistical) visualisation and exploration of a billion objects with vaex
NASA Astrophysics Data System (ADS)
Breddels, M. A.
2017-06-01
With new catalogues arriving such as the Gaia DR1, containing more than a billion objects, new methods of handling and visualizing these data volumes are needed. We show that by calculating statistics on a regular (N-dimensional) grid, visualizations of a billion objects can be done within a second on a modern desktop computer. This is achieved using memory mapping of hdf5 files together with a simple binning algorithm, which are part of a Python library called vaex. This enables efficient interactive exploration of large datasets, making science exploration of large catalogues feasible. Vaex is a Python library and an application, which allows for interactive exploration and visualization. The motivation for developing vaex is the catalogue of the Gaia satellite; however, vaex can also be used on SPH or N-body simulations, any other (future) catalogues such as SDSS, Pan-STARRS, LSST, etc., or other tabular data. The homepage for vaex is http://vaex.astro.rug.nl.
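The idea of computing statistics on a regular grid can be illustrated with a small pure-Python sketch for the two-dimensional case. It does not use or reproduce the vaex API: the function `bin_statistics_2d` and its arguments are hypothetical, and a real implementation would stream memory-mapped columns through compiled code rather than a Python loop.

```python
def bin_statistics_2d(xs, ys, values, limits, shape):
    """Count samples and average `values` in each cell of a regular 2-D grid.

    limits is ((xmin, xmax), (ymin, ymax)); shape is (nx, ny).
    Samples outside the half-open limits are skipped.
    """
    (xmin, xmax), (ymin, ymax) = limits
    nx, ny = shape
    counts = [[0] * ny for _ in range(nx)]
    sums = [[0.0] * ny for _ in range(nx)]
    for x, y, v in zip(xs, ys, values):
        if not (xmin <= x < xmax and ymin <= y < ymax):
            continue
        # Scale each coordinate to a cell index; clamp to guard against rounding
        i = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        j = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[i][j] += 1
        sums[i][j] += v
    means = [
        [s / c if c else float("nan") for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]
    return counts, means


# Tiny hypothetical catalogue: two objects in the lower-left cell, two in the upper-right
xs = [0.10, 0.20, 0.90, 0.95]
ys = [0.10, 0.15, 0.80, 0.90]
values = [1.0, 3.0, 10.0, 20.0]
counts, means = bin_statistics_2d(xs, ys, values, ((0.0, 1.0), (0.0, 1.0)), (2, 2))
print(counts)
print(means)
```

A single pass over the rows yields a fixed-size grid, so the cost of drawing the result depends on the grid shape rather than on the number of objects.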
Complex networks as a unified framework for descriptive analysis and predictive modeling in climate
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steinhaeuser, Karsten J K; Chawla, Nitesh; Ganguly, Auroop R
The analysis of climate data has relied heavily on hypothesis-driven statistical methods, while projections of future climate are based primarily on physics-based computational models. However, in recent years a wealth of new datasets has become available. Therefore, we take a more data-centric approach and propose a unified framework for studying climate, with an aim towards characterizing observed phenomena as well as discovering new knowledge in the climate domain. Specifically, we posit that complex networks are well-suited for both descriptive analysis and predictive modeling tasks. We show that the structural properties of climate networks have useful interpretation within the domain. Further, we extract clusters from these networks and demonstrate their predictive power as climate indices. Our experimental results establish that the network clusters are statistically significantly better predictors than clusters derived using a more traditional clustering approach. Using complex networks as data representation thus enables the unique opportunity for descriptive and predictive modeling to inform each other.
Safe and effective nursing shift handover with NURSEPASS: An interrupted time series.
Smeulers, Marian; Dolman, Christine D; Atema, Danielle; van Dieren, Susan; Maaskant, Jolanda M; Vermeulen, Hester
2016-11-01
Implementation of a locally developed evidence based nursing shift handover blueprint with a bedside-safety-check to determine the effect size on quality of handover. A mixed methods design with: (1) an interrupted time series analysis to determine the effect on handover quality in six domains; (2) descriptive statistics to analyze the intercepted discrepancies by the bedside-safety-check; (3) evaluation sessions to gather experiences with the new handover process. We observed a continued trend of improvement in handover quality and a significant improvement in two domains of handover: organization/efficiency and contents. The bedside-safety-check successfully identified discrepancies on drains, intravenous medications, bandages or general condition and was highly appreciated. Use of the nursing shift handover blueprint showed promising results on effectiveness as well as on feasibility and acceptability. However, to enable long term measurement on effectiveness, evaluation with large scale interrupted time series or statistical process control is needed. Copyright © 2016 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blume-Kohout, Robin J; Scholten, Travis L.
Quantum state tomography on a d-dimensional system demands resources that grow rapidly with d. They may be reduced by using model selection to tailor the number of parameters in the model (i.e., the size of the density matrix). Most model selection methods typically rely on a test statistic and a null theory that describes its behavior when two models are equally good. Here, we consider the loglikelihood ratio. Because of the positivity constraint ρ ≥ 0, quantum state space does not generally satisfy local asymptotic normality (LAN), meaning the classical null theory for the loglikelihood ratio (the Wilks theorem) should not be used. Thus, understanding and quantifying how positivity affects the null behavior of this test statistic is necessary for its use in model selection for state tomography. We define a new generalization of LAN, metric-projected LAN, show that quantum state space satisfies it, and derive a replacement for the Wilks theorem. In addition to enabling reliable model selection, our results shed more light on the qualitative effects of the positivity constraint on state tomography.
Mitigating Provider Uncertainty in Service Provision Contracts
NASA Astrophysics Data System (ADS)
Smith, Chris; van Moorsel, Aad
Uncertainty is an inherent property of open, distributed and multiparty systems. The viability of the mutually beneficial relationships which motivate these systems relies on rational decision-making by each constituent party under uncertainty. Service provision in distributed systems is one such relationship. Uncertainty is experienced by the service provider in his ability to deliver a service with selected quality level guarantees due to inherent non-determinism, such as load fluctuations and hardware failures. Statistical estimators utilized to model this non-determinism introduce additional uncertainty through sampling error. Inability of the provider to accurately model and analyze uncertainty in the quality level guarantees can result in the formation of sub-optimal service provision contracts. Emblematic consequences include loss of revenue, inefficient resource utilization and erosion of reputation and consumer trust. We propose a utility model for contract-based service provision to provide a systematic approach to optimal service provision contract formation under uncertainty. Performance prediction methods to enable the derivation of statistical estimators for quality level are introduced, with analysis of their resultant accuracy and cost.
Conesa, Claudia; García-Breijo, Eduardo; Loeff, Edwin; Seguí, Lucía; Fito, Pedro; Laguarda-Miró, Nicolás
2015-01-01
Electrochemical Impedance Spectroscopy (EIS) has been used to develop a methodology able to identify and quantify fermentable sugars present in the enzymatic hydrolysis phase of second-generation bioethanol production from pineapple waste. Thus, a low-cost non-destructive system consisting of a stainless double needle electrode connected to electronic equipment that allows the implementation of EIS was developed. In order to validate the system, different concentrations of glucose, fructose and sucrose were added to the pineapple waste and analyzed both individually and in combination. Next, statistical data treatment enabled the design of specific Artificial Neural Networks-based mathematical models for each one of the studied sugars and their respective combinations. The obtained prediction models are robust and reliable and they are considered statistically valid (CCR% > 93.443%). These results allow us to introduce this EIS-based technique as an easy, fast, non-destructive, and in-situ alternative to the traditional laboratory methods for enzymatic hydrolysis monitoring. PMID:26378537
Optimized design and analysis of preclinical intervention studies in vivo
Laajala, Teemu D.; Jumppanen, Mikael; Huhtaniemi, Riikka; Fey, Vidal; Kaur, Amanpreet; Knuuttila, Matias; Aho, Eija; Oksala, Riikka; Westermarck, Jukka; Mäkelä, Sari; Poutanen, Matti; Aittokallio, Tero
2016-01-01
Recent reports have called into question the reproducibility, validity and translatability of the preclinical animal studies due to limitations in their experimental design and statistical analysis. To this end, we implemented a matching-based modelling approach for optimal intervention group allocation, randomization and power calculations, which takes full account of the complex animal characteristics at baseline prior to interventions. In prostate cancer xenograft studies, the method effectively normalized the confounding baseline variability, and resulted in animal allocations which were supported by RNA-seq profiling of the individual tumours. The matching information increased the statistical power to detect true treatment effects at smaller sample sizes in two castration-resistant prostate cancer models, thereby leading to saving of both animal lives and research costs. The novel modelling approach and its open-source and web-based software implementations enable the researchers to conduct adequately-powered and fully-blinded preclinical intervention studies, with the aim to accelerate the discovery of new therapeutic interventions. PMID:27480578
Optimized design and analysis of preclinical intervention studies in vivo.
Laajala, Teemu D; Jumppanen, Mikael; Huhtaniemi, Riikka; Fey, Vidal; Kaur, Amanpreet; Knuuttila, Matias; Aho, Eija; Oksala, Riikka; Westermarck, Jukka; Mäkelä, Sari; Poutanen, Matti; Aittokallio, Tero
2016-08-02
Recent reports have called into question the reproducibility, validity and translatability of the preclinical animal studies due to limitations in their experimental design and statistical analysis. To this end, we implemented a matching-based modelling approach for optimal intervention group allocation, randomization and power calculations, which takes full account of the complex animal characteristics at baseline prior to interventions. In prostate cancer xenograft studies, the method effectively normalized the confounding baseline variability, and resulted in animal allocations which were supported by RNA-seq profiling of the individual tumours. The matching information increased the statistical power to detect true treatment effects at smaller sample sizes in two castration-resistant prostate cancer models, thereby leading to saving of both animal lives and research costs. The novel modelling approach and its open-source and web-based software implementations enable the researchers to conduct adequately-powered and fully-blinded preclinical intervention studies, with the aim to accelerate the discovery of new therapeutic interventions.
Selmke, Markus; Khadka, Utsab; Bregulla, Andreas P; Cichos, Frank; Yang, Haw
2018-04-18
Photon nudging is a new experimental method which enables the force-free manipulation and localization of individual self-propelled artificial micro-swimmers in fluidic environments. It uses a weak laser to stochastically and adaptively turn on and off the swimmer's propulsion when the swimmer, through rotational diffusion, points towards or away from its target, respectively. This contribution presents a theoretical framework for the statistics of both 2D and 3D controls. The main results are: the on- and off-time distributions for the controlling laser, the arrival time statistics for the swimmer to reach a remote target, and how the experimentally accessible control parameters influence the control, e.g., the optimal acceptance angle for directed transport. The results are general in that they are independent of the propulsion or the actuation mechanisms. They provide a concrete physical picture for how a single artificial micro-swimmer could be navigated under thermal fluctuations, insights that could also be useful for understanding biological micro-swimmers.
Eddy, Sean R.
2008-01-01
Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments. PMID:18516236
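Under the conjectures stated in this abstract, turning a bit score into a P-value and an E-value reduces to two short formulas. The sketch below assumes that the location parameters (called `mu` for the Gumbel distribution and `tau` for the exponential tail here) have already been fitted for the model in question; those names, the helper functions, and the numbers in the example are hypothetical and are not taken from any particular software package.

```python
import math

LAMBDA = math.log(2)  # conjectured constant slope for bit scores


def viterbi_pvalue(score, mu):
    """P(S >= score) for a Gumbel distribution with location mu and slope log 2."""
    # 1 - exp(-t) written with expm1 so that tiny tail probabilities keep precision
    return -math.expm1(-math.exp(-LAMBDA * (score - mu)))


def forward_pvalue(score, tau):
    """P(S >= score) for the exponential high-scoring tail with offset tau."""
    return min(1.0, math.exp(-LAMBDA * (score - tau)))


def evalue(pvalue, n_targets):
    """Expected number of target sequences scoring at least this well by chance."""
    return pvalue * n_targets


# Hypothetical numbers: a hit 25 bits above the fitted offset, one million targets
p = forward_pvalue(score=15.0, tau=-10.0)
print(p, evalue(p, 1_000_000))
```

With the slope fixed at log 2, each additional bit of score halves the tail probability, so only the location parameter remains to be determined for each model.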
Analysis of S-box in Image Encryption Using Root Mean Square Error Method
NASA Astrophysics Data System (ADS)
Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan
2012-07-01
The use of substitution boxes (S-boxes) in encryption applications has proven to be an effective nonlinear component in creating confusion and randomness. The S-box is evolving and many variants appear in the literature, which include the advanced encryption standard (AES) S-box, affine power affine (APA) S-box, Skipjack S-box, Gray S-box, Lui J S-box, residue prime number S-box, Xyi S-box, and S8 S-box. These S-boxes have algebraic and statistical properties which distinguish them from each other in terms of encryption strength. In some circumstances, the parameters from algebraic and statistical analysis yield results which do not provide clear evidence in distinguishing an S-box for an application to a particular set of data. In image encryption applications, the use of S-boxes needs special care because the visual analysis and perception of a viewer can sometimes identify artifacts embedded in the image. In addition to existing algebraic and statistical analysis already used for image encryption applications, we propose an application of the root mean square error technique, which further elaborates the results and enables the analyst to vividly distinguish between the performances of various S-boxes. While the use of the root mean square error analysis in statistics has proven to be effective in determining the difference between the original data and the processed data, its use in image encryption has shown promising results in estimating the strength of the encryption method. In this paper, we show the application of the root mean square error analysis to S-box image encryption. The parameters from this analysis are used in determining the strength of S-boxes.
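The root mean square error between an original and a substituted image can be computed in a few lines. The sketch below is an illustration only: `TOY_SBOX` is a simple affine byte permutation chosen for brevity, not one of the S-boxes named in the abstract, and the 2×2 "image" is made-up data.

```python
import math


def rmse(original, processed):
    """Root mean square error between two equally sized grayscale images."""
    diffs = [
        (p - o) ** 2
        for row_o, row_p in zip(original, processed)
        for o, p in zip(row_o, row_p)
    ]
    return math.sqrt(sum(diffs) / len(diffs))


# Toy byte substitution (hypothetical): an affine permutation of 0..255.
# It is a bijection because 7 is coprime with 256; it is NOT a secure S-box.
TOY_SBOX = [(7 * v + 3) % 256 for v in range(256)]


def substitute(image, sbox):
    """Apply a byte-substitution table to every pixel."""
    return [[sbox[pixel] for pixel in row] for row in image]


image = [[0, 64], [128, 255]]
encrypted = substitute(image, TOY_SBOX)
print(encrypted, rmse(image, encrypted))
```

Running the same comparison with several substitution tables on the same image gives one number per table, which is the kind of side-by-side comparison the abstract describes.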
Testing for voter rigging in small polling stations
Jimenez, Raúl; Hidalgo, Manuel; Klimek, Peter
2017-01-01
Nowadays, a large number of countries combine formal democratic institutions with authoritarian practices. Although in these countries the ruling elites may receive considerable voter support, they often use several manipulation tools to control election outcomes. A common practice of these regimes is the coercion and mobilization of large numbers of voters. This electoral irregularity is known as voter rigging, distinguishing it from vote rigging, which involves ballot stuffing or stealing. We develop a statistical test to quantify the extent to which the results of a particular election display traces of voter rigging. Our key hypothesis is that small polling stations are more susceptible to voter rigging because it is easier to identify opposing individuals, there are fewer eyewitnesses, and interested parties might reasonably expect fewer visits from election observers. We devise a general statistical method for testing whether voting behavior in small polling stations is significantly different from the behavior in their neighbor stations in a way that is consistent with the widespread occurrence of voter rigging. On the basis of a comparative analysis, the method enables third parties to conclude that an explanation other than simple variability is needed to explain geographic heterogeneities in vote preferences. We analyze 21 elections in 10 countries and find significant statistical anomalies compatible with voter rigging in Russia from 2007 to 2011, in Venezuela from 2006 to 2013, and in Uganda in 2011. Particularly disturbing is the case of Venezuela, where the smallest polling stations were decisive to the outcome of the 2013 presidential elections. PMID:28695193
Testing for voter rigging in small polling stations.
Jimenez, Raúl; Hidalgo, Manuel; Klimek, Peter
2017-06-01
Nowadays, a large number of countries combine formal democratic institutions with authoritarian practices. Although in these countries the ruling elites may receive considerable voter support, they often use several manipulation tools to control election outcomes. A common practice of these regimes is the coercion and mobilization of large numbers of voters. This electoral irregularity is known as voter rigging, distinguishing it from vote rigging, which involves ballot stuffing or stealing. We develop a statistical test to quantify the extent to which the results of a particular election display traces of voter rigging. Our key hypothesis is that small polling stations are more susceptible to voter rigging because it is easier to identify opposing individuals, there are fewer eyewitnesses, and interested parties might reasonably expect fewer visits from election observers. We devise a general statistical method for testing whether voting behavior in small polling stations is significantly different from the behavior in their neighbor stations in a way that is consistent with the widespread occurrence of voter rigging. On the basis of a comparative analysis, the method enables third parties to conclude that an explanation other than simple variability is needed to explain geographic heterogeneities in vote preferences. We analyze 21 elections in 10 countries and find significant statistical anomalies compatible with voter rigging in Russia from 2007 to 2011, in Venezuela from 2006 to 2013, and in Uganda in 2011. Particularly disturbing is the case of Venezuela, where the smallest polling stations were decisive to the outcome of the 2013 presidential elections.
Metaplot: a novel stata graph for assessing heterogeneity at a glance.
Poorolajal, J; Mahmoodi, M; Majdzadeh, R; Fotouhi, A
2010-01-01
Heterogeneity is usually a major concern in meta-analysis. Although there are some statistical approaches for assessing variability across studies, here we present a new approach to heterogeneity using "MetaPlot", which investigates the influence of a single study on the overall heterogeneity. MetaPlot is a two-way (x, y) graph that can be considered a complementary graphical approach for testing heterogeneity. This method shows graphically as well as numerically the results of an influence analysis, in which Higgins' I² statistic with its 95% confidence interval (CI) is computed omitting one study at a time and then plotted against the reciprocal of the standard error (1/SE), or "precision". In this graph, 1/SE lies on the x-axis and the I² results lie on the y-axis. At first glance at a MetaPlot, one can predict to what extent omission of a single study may influence the overall heterogeneity. The precision on the x-axis enables us to distinguish the size of each trial. The graph describes the I² statistic with 95% CI graphically as well as numerically in one view for prompt comparison. It is possible to implement MetaPlot for meta-analysis of different types of outcome data and summary measures. This method presents a simple graphical approach to identify an outlier and its effect on overall heterogeneity at a glance. We wish to suggest MetaPlot to Stata experts to prepare its module for the software.
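The influence analysis behind the plot can be sketched as a leave-one-out loop over the studies. The code below uses the standard definitions of Cochran's Q with inverse-variance weights and Higgins' I² = max(0, (Q − df)/Q); it omits the confidence interval and the plotting itself, and the function names and the four-study dataset are hypothetical.

```python
def i_squared(effects, ses):
    """Higgins' I^2 (in percent) from study effects and their standard errors."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def leave_one_out(effects, ses):
    """Return (precision of the omitted study, I^2 of the remaining studies)."""
    points = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_ses = ses[:i] + ses[i + 1:]
        points.append((1.0 / ses[i], i_squared(rest_effects, rest_ses)))
    return points


# Hypothetical meta-analysis: three consistent studies and one outlier
effects = [0.10, 0.12, 0.11, 0.90]
ses = [0.05, 0.05, 0.05, 0.05]
overall = i_squared(effects, ses)
points = leave_one_out(effects, ses)
for precision, i2 in points:
    print(f"1/SE = {precision:.1f}  I^2 without this study = {i2:.1f}%")
```

In this made-up dataset, omitting the fourth study drops I² to zero while omitting any other study leaves it high, which is the outlier pattern the graph is meant to reveal at a glance.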
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
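Two of the four rate methods listed in this abstract, the endpoint rate and simple linear regression, are easy to state in code for a single transect. The sketch below is not DSAS code: the function names are hypothetical, positions are assumed to be distances in metres from the baseline along one transect, and the survey dates and positions are invented for illustration.

```python
def end_point_rate(years, positions):
    """Net movement between the oldest and youngest shoreline divided by elapsed time."""
    pairs = sorted(zip(years, positions))
    (t_old, d_old), (t_new, d_new) = pairs[0], pairs[-1]
    return (d_new - d_old) / (t_new - t_old)


def linear_regression_rate(years, positions):
    """Slope of the ordinary least-squares line through all shoreline positions."""
    n = len(years)
    t_mean = sum(years) / n
    d_mean = sum(positions) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    sxy = sum((t - t_mean) * (d - d_mean) for t, d in zip(years, positions))
    return sxy / sxx


# Hypothetical transect: distance from baseline (m) at four survey dates
years = [1950, 1970, 1990, 2010]
positions = [0.0, -12.0, -18.0, -30.0]
epr = end_point_rate(years, positions)
lrr = linear_regression_rate(years, positions)
print(f"endpoint rate = {epr:.2f} m/yr, regression rate = {lrr:.2f} m/yr")
```

The endpoint rate uses only the first and last shorelines, whereas the regression slope uses every position, so the two differ whenever the intermediate shorelines do not lie on a straight line.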
Inverse tissue mechanics of cell monolayer expansion.
Kondo, Yohei; Aoki, Kazuhiro; Ishii, Shin
2018-03-01
Living tissues undergo deformation during morphogenesis. In this process, cells generate mechanical forces that drive the coordinated cell motion and shape changes. Recent advances in experimental and theoretical techniques have enabled in situ measurement of the mechanical forces, but the characterization of mechanical properties that determine how these forces quantitatively affect tissue deformation remains challenging, and this represents a major obstacle for the complete understanding of morphogenesis. Here, we proposed a non-invasive reverse-engineering approach for the estimation of the mechanical properties, by combining tissue mechanics modeling and statistical machine learning. Our strategy is to model the tissue as a continuum mechanical system and to use passive observations of spontaneous tissue deformation and force fields to statistically estimate the model parameters. This method was applied to the analysis of the collective migration of Madin-Darby canine kidney cells, and the tissue flow and force were simultaneously observed by the phase contrast imaging and traction force microscopy. We found that our monolayer elastic model, whose elastic moduli were reverse-engineered, enabled a long-term forecast of the traction force fields when given the tissue flow fields, indicating that the elasticity contributes to the evolution of the tissue stress. Furthermore, we investigated the tissues in which myosin was inhibited by blebbistatin treatment, and observed a several-fold reduction in the elastic moduli. The obtained results validate our framework, which paves the way to the estimation of mechanical properties of living tissues during morphogenesis.
NASA Astrophysics Data System (ADS)
Sauchyn, David; Ilich, Nesa
2017-11-01
We combined the methods and advantages of stochastic hydrology and paleohydrology to estimate 900 years of weekly flows for the North and South Saskatchewan Rivers at Edmonton and Medicine Hat, Alberta, respectively. Regression models of water-year streamflow were constructed using historical naturalized flow data and a pool of 196 tree-ring (earlywood, latewood, and annual) ring-width chronologies from 76 sites. The tree-ring models accounted for up to 80% of the interannual variability in historical naturalized flows. We developed a new algorithm for generating stochastic time series of weekly flows constrained by the statistical properties of both the historical record and proxy streamflow data, and by the necessary condition that weekly flows correlate between the end of a year and the start of the next. A second innovation, enabled by the density of our tree-ring network, is to derive the paleohydrology from an ensemble of 100 statistically significant reconstructions at each gauge. Using paleoclimatic data to generate long series of weekly flow estimates augments the short historical record with an expanded range of hydrologic variability, including sequences of wet and dry years of greater length and severity. This unique hydrometric time series will enable evaluation of the reliability of current water supply and management systems given the range of hydroclimatic variability and extremes contained in the stochastic paleohydrology. It also could inform evaluation of the uncertainty in climate model projections, given that internal hydroclimatic variability is the dominant source of uncertainty.
NASA Astrophysics Data System (ADS)
Rougier, Simon; Puissant, Anne; Stumpf, André; Lachiche, Nicolas
2016-09-01
Vegetation monitoring is becoming a major issue in the urban environment because of the services vegetation provides, and it requires accurate and up-to-date mapping. Very High Resolution satellite images enable detailed mapping of urban tree and herbaceous vegetation. Several supervised classifications with statistical learning techniques have provided good results for the detection of urban vegetation but require a large amount of training data. In this context, this study investigates the performance of different sampling strategies in order to reduce the number of examples needed. Two state-of-the-art window-based active learning algorithms are compared to classical stratified random sampling, and a third strategy combining active learning and stratified sampling is proposed. The efficiency of these strategies is evaluated on two medium-sized French cities, Strasbourg and Rennes, associated with different datasets. Results demonstrate that classical stratified random sampling can in some cases be just as effective as active learning methods and that it should be used more frequently to evaluate new active learning methods. Moreover, the active learning strategies proposed in this work reduce the computational runtime by selecting multiple windows at each iteration without increasing the number of windows needed.
Developing and validating a nutrition knowledge questionnaire: key methods and considerations.
Trakman, Gina Louise; Forsyth, Adrienne; Hoye, Russell; Belski, Regina
2017-10-01
To outline key statistical considerations and detailed methodologies for the development and evaluation of a valid and reliable nutrition knowledge questionnaire. Literature on questionnaire development in a range of fields was reviewed and a set of evidence-based guidelines specific to the creation of a nutrition knowledge questionnaire have been developed. The recommendations describe key qualitative methods and statistical considerations, and include relevant examples from previous papers and existing nutrition knowledge questionnaires. Where details have been omitted for the sake of brevity, the reader has been directed to suitable references. We recommend an eight-step methodology for nutrition knowledge questionnaire development as follows: (i) definition of the construct and development of a test plan; (ii) generation of the item pool; (iii) choice of the scoring system and response format; (iv) assessment of content validity; (v) assessment of face validity; (vi) purification of the scale using item analysis, including item characteristics, difficulty and discrimination; (vii) evaluation of the scale including its factor structure and internal reliability, or Rasch analysis, including assessment of dimensionality and internal reliability; and (viii) gathering of data to re-examine the questionnaire's properties, assess temporal stability and confirm construct validity. Several of these methods have previously been overlooked. The measurement of nutrition knowledge is an important consideration for individuals working in the nutrition field. Improved methods in the development of nutrition knowledge questionnaires, such as the use of factor analysis or Rasch analysis, will enable more confidence in reported measures of nutrition knowledge.
Mihic, Marko M; Todorovic, Marija Lj; Obradovic, Vladimir Lj; Mitrovic, Zorica M
2016-01-01
Background: Social services aimed at the elderly are facing great challenges caused by progressive aging of the global population but also by the constant pressure to spend funds in a rational manner. Purpose: This paper focuses on analyzing the investments into human resources aimed at enhancing home care for the elderly, since many countries have recorded progress in the area over the past years. The goal of this paper is to stress the significance of performing an economic analysis of the investment. Methods: This paper combines statistical analysis methods such as correlation and regression analysis, methods of economic analysis, and the scenario method. Results: The economic analysis of investing in human resources for home care service in Serbia showed that both scenarios, investing in additional home care hours or in more beneficiaries, are cost-efficient. However, the optimal solution with the positive (and the highest) value of the economic net present value criterion is to invest in human resources to boost the number of home care hours from 6 to 8 hours per week and increase the number of the beneficiaries to 33%. Conclusion: This paper shows how the statistical and economic analysis results can be used to evaluate different scenarios and enable quality decision-making based on exact data in order to improve health and quality of life of the elderly and spend funds in a rational manner. PMID:26869778
Deconstructing multivariate decoding for the study of brain function.
Hebart, Martin N; Baker, Chris I
2017-08-04
Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function. Copyright © 2017. Published by Elsevier Inc.
Temporal slow-growth formulation for direct numerical simulation of compressible wall-bounded flows
NASA Astrophysics Data System (ADS)
Topalian, Victor; Oliver, Todd A.; Ulerich, Rhys; Moser, Robert D.
2017-08-01
A slow-growth formulation for DNS of wall-bounded turbulent flow is developed and demonstrated to enable extension of slow-growth modeling concepts to wall-bounded flows with complex physics. As in previous slow-growth approaches, the formulation assumes scale separation between the fast scales of turbulence and the slow evolution of statistics such as the mean flow. This separation enables the development of approaches where the fast scales of turbulence are directly simulated while the forcing provided by the slow evolution is modeled. The resulting model admits periodic boundary conditions in the streamwise direction, which avoids the need for extremely long domains and complex inflow conditions that typically accompany spatially developing simulations. Further, it enables the use of efficient Fourier numerics. Unlike previous approaches [Guarini, Moser, Shariff, and Wray, J. Fluid Mech. 414, 1 (2000), 10.1017/S0022112000008466; Maeder, Adams, and Kleiser, J. Fluid Mech. 429, 187 (2001), 10.1017/S0022112000002718; Spalart, J. Fluid Mech. 187, 61 (1988), 10.1017/S0022112088000345], the present approach is based on a temporally evolving boundary layer and is specifically tailored to give results for calibration and validation of Reynolds-averaged Navier-Stokes (RANS) turbulence models. The use of a temporal homogenization simplifies the modeling, enabling straightforward extension to flows with complicating features, including cold and blowing walls. To generate data useful for calibration and validation of RANS models, special care is taken to ensure that the mean slow-growth forcing is closed in terms of the mean and other quantities that appear in standard RANS models, ensuring that there is no confounding between typical RANS closures and additional closures required for the slow-growth problem. 
The performance of the method is demonstrated on two problems: an essentially incompressible, zero-pressure-gradient boundary layer and a transonic boundary layer over a cooled, transpiring wall. The results show that the approach produces flows that are qualitatively similar to other slow-growth methods as well as spatially developing simulations and that the method can be a useful tool in investigating wall-bounded flows with complex physics.
Estimating short-run and long-run interaction mechanisms in interictal state.
Ozkaya, Ata; Korürek, Mehmet
2010-04-01
We address the issue of analyzing electroencephalogram (EEG) recordings from seizure patients in order to test, model and determine the statistical properties that distinguish between EEG states (interictal, pre-ictal, ictal) by introducing a new class of time series analysis methods. In the present study, we first employ statistical methods to determine the non-stationary behavior of focal interictal epileptiform series within very short time intervals; second, for intervals that are deemed non-stationary, we suggest Autoregressive Integrated Moving Average (ARIMA) process modelling, well known in time series analysis. Finally, we address the question of causal relationships between epileptic states and between brain areas during epileptiform activity. We estimate the interaction between different EEG series (channels) in short time intervals by performing Granger-causality analysis and estimate such interaction in long time intervals by employing Cointegration analysis; both methods are well known in econometrics. Here we find, first, that the causal relationship between neuronal assemblies can be identified according to the duration and the direction of their possible mutual influences; second, that although the estimated bidirectional causality in short time intervals indicates that the neuronal ensembles positively affect each other, in long time intervals neither of them is affected (in the sense of increasing amplitudes) by this relationship. Moreover, Cointegration analysis of the EEG series enables us to identify whether there is a causal link from the interictal state to the ictal state.
Mihic, Marko M; Todorovic, Marija Lj; Obradovic, Vladimir Lj; Mitrovic, Zorica M
2016-01-01
Social services aimed at the elderly are facing great challenges caused by the progressive aging of the global population and by constant pressure to spend funds in a rational manner. This paper focuses on analyzing the investments into human resources aimed at enhancing home care for the elderly, since many countries have recorded progress in the area over the past years. The goal of this paper is to stress the significance of performing an economic analysis of the investment. This paper combines statistical analysis methods such as correlation and regression analysis, methods of economic analysis, and the scenario method. The economic analysis of investing in human resources for home care service in Serbia showed that both scenarios, investing in either additional home care hours or more beneficiaries, are cost-efficient. However, the optimal solution, with a positive (and the highest) value of the economic net present value criterion, is to invest in human resources to boost the number of home care hours from 6 to 8 hours per week and increase the number of the beneficiaries to 33%. This paper shows how the statistical and economic analysis results can be used to evaluate different scenarios and enable quality decision-making based on exact data in order to improve health and quality of life of the elderly and spend funds in a rational manner.
NASA Astrophysics Data System (ADS)
Bhakat, Soumendranath; Söderhjelm, Pär
2017-01-01
The funnel metadynamics method enables rigorous calculation of the potential of mean force along an arbitrary binding path and thereby evaluation of the absolute binding free energy. A problem of such physical paths is that the mechanism characterizing the binding process is not always obvious. In particular, it might involve reorganization of the solvent in the binding site, which is not easily captured with a few geometrically defined collective variables that can be used for biasing. In this paper, we propose and test a simple method to resolve this trapped-water problem by dividing the process into an artificial host-desolvation step and an actual binding step. We show that, under certain circumstances, the contribution from the desolvation step can be calculated without introducing further statistical errors. We apply the method to the problem of predicting host-guest binding free energies in the SAMPL5 blind challenge, using two octa-acid hosts and six guest molecules. For one of the hosts, well-converged results are obtained and the prediction of relative binding free energies is the best among all the SAMPL5 submissions. For the other host, which has a narrower binding pocket, the statistical uncertainties are slightly higher; longer simulations would therefore be needed to obtain conclusive results.
Salman, A; Shufan, E; Zeiri, L; Huleihel, M
2014-07-01
Herpes viruses are involved in a variety of human disorders. Herpes Simplex Virus type 1 (HSV-1) is the most common among the herpes viruses and is primarily involved in human cutaneous disorders. Although the symptoms of infection by this virus are usually minimal, in some cases HSV-1 might cause serious infections in the eyes and the brain leading to blindness and even death. A drug, acyclovir, is available to counter this virus. The drug is most effective when used during the early stages of the infection, which makes early detection and identification of these viral infections highly important for successful treatment. In the present study we evaluated the potential of Raman spectroscopy as a sensitive, rapid, and reliable method for the detection and identification of HSV-1 viral infections in cell cultures. Using Raman spectroscopy followed by advanced statistical methods enabled us, with sensitivity approaching 100%, to differentiate between a control group of Vero cells and another group of Vero cells that had been infected with HSV-1. Cell sites that were "rich in membrane" gave the best results in the differentiation between the two categories. The major changes were observed in the 1195-1726 cm(-1) range of the Raman spectrum. The features in this range are attributed mainly to proteins, lipids, and nucleic acids. Copyright © 2014. Published by Elsevier Inc.
Mechanistic analysis of challenge-response experiments.
Shotwell, M S; Drake, K J; Sidorov, V Y; Wikswo, J P
2013-09-01
We present an application of mechanistic modeling and nonlinear longitudinal regression in the context of biomedical response-to-challenge experiments, a field where these methods are underutilized. In this type of experiment, a system is studied by imposing an experimental challenge, and then observing its response. The combination of mechanistic modeling and nonlinear longitudinal regression has brought new insight, and revealed an unexpected opportunity for optimal design. Specifically, the mechanistic aspect of our approach enables the optimal design of experimental challenge characteristics (e.g., intensity, duration). This article lays some groundwork for this approach. We consider a series of experiments wherein an isolated rabbit heart is challenged with intermittent anoxia. The heart responds to the challenge onset, and recovers when the challenge ends. The mean response is modeled by a system of differential equations that describe a candidate mechanism for cardiac response to anoxia challenge. The cardiac system behaves more variably when challenged than when at rest. Hence, observations arising from this experiment exhibit complex heteroscedasticity and sharp changes in central tendency. We present evidence that an asymptotic statistical inference strategy may fail to adequately account for statistical uncertainty. Two alternative methods are critiqued qualitatively (i.e., for utility in the current context), and quantitatively using an innovative Monte-Carlo method. We conclude with a discussion of the exciting opportunities in optimal design of response-to-challenge experiments. © 2013, The International Biometric Society.
Comparison of four statistical and machine learning methods for crash severity prediction.
Iranitalab, Amirfarrokh; Khattak, Aemal
2017-11-01
Crash severity prediction models enable different agencies to predict the severity of a reported crash with unknown severity or the severity of crashes that may be expected to occur sometime in the future. This paper had three main objectives: comparing the performance of four statistical and machine learning methods, namely Multinomial Logit (MNL), Nearest Neighbor Classification (NNC), Support Vector Machines (SVM) and Random Forests (RF), in predicting traffic crash severity; developing a crash costs-based approach for comparison of crash severity prediction methods; and investigating the effects of data clustering methods, comprising K-means Clustering (KC) and Latent Class Clustering (LCC), on the performance of crash severity prediction models. The 2012-2015 reported crash data from Nebraska, United States were obtained and two-vehicle crashes were extracted as the analysis data. The dataset was split into training/estimation (2012-2014) and validation (2015) subsets. The four prediction methods were trained/estimated using the training/estimation dataset, and the correct prediction rates for each crash severity level, the overall correct prediction rate and a proposed crash costs-based accuracy measure were obtained for the validation dataset. The correct prediction rates and the proposed approach showed that NNC had the best prediction performance overall and in more severe crashes. RF and SVM had the next-best performances and MNL was the weakest method. Data clustering did not affect the prediction results of SVM, but KC improved the prediction performance of MNL, NNC and RF, while LCC improved MNL and RF but weakened the performance of NNC. The overall correct prediction rate gave almost the exact opposite results compared to the proposed approach, showing that neglecting crash costs can lead to misjudgment in choosing the right prediction method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Crispin, Ndedda; Wamae, Annah; Ndirangu, Meshack; Wamalwa, David; Wangalwa, Gilbert; Watako, Patrick; Mbiti, Elijah
2012-01-01
Objective: Appropriate performance of home visits facilitates adoption of best practices at home and increased demand for facility based services. Methods: This was a cross-sectional study in which community health workers (CHWs) were observed conducting home visits during pregnancy. Data were collected using a structured questionnaire and the Consultant Quality Index (CQI-2 tool) on record keeping, use of job aids, counselling, client satisfaction and client enablement. Descriptive and inferential statistics were used. Relationships were determined using chi square and odds ratios. Results: The study showed significant relationships of age with good record keeping (p = 0.0001), appropriate use of job aids (p = 0.0001), client satisfaction (p = 0.018) and client enablement (p = 0.001). Male CHWs were 1.6 times more likely to keep better records than females (OR 1.64, CI 1.02-2.63), while females were more likely to counsel and enable their clients (OR 0.42, CI 0.25-0.71 and OR 0.29, CI 0.12-0.70, respectively) when compared to men. Moreover, higher levels of education were associated with good record keeping (OR 0.30, CI 0.19-0.49, p = 0.0001), appropriate use of job aids (OR 0.30, CI 0.15-0.61) and appropriate counselling of clients (OR 0.34, CI 0.20-0.58) relative to lower literacy level counterparts. Experience of CHWs was associated with appropriate use of job aids (p = 0.049), client satisfaction (p = 0.0001) and client enablement (p = 0.032). Conclusions: Socio-demographic characteristics of community health workers affect the performance of home visits in various ways. The study also confirmed that CHWs with lower literacy levels satisfy and enable their clients effectively. PMID:22980380
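The odds ratios and confidence intervals reported in the abstract above follow a standard calculation, which can be sketched as below. This is a minimal illustration using the usual Wald interval on the log odds ratio; the function name `odds_ratio_ci` and the cell counts are hypothetical and are not taken from the study, whose exact procedure is not stated.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: group 1 with / without the outcome
    c, d: group 2 with / without the outcome
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
odds_ratio, lower, upper = odds_ratio_ci(40, 60, 25, 75)
print(f"OR = {odds_ratio:.2f}, CI = ({lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level, which is how intervals such as 1.02-2.63 in the abstract are usually read.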
Statistical image-domain multimaterial decomposition for dual-energy CT.
Xue, Yi; Ruan, Ruoshui; Hu, Xiuhua; Kuang, Yu; Wang, Jing; Long, Yong; Niu, Tianye
2017-03-01
Dual-energy CT (DECT) enhances tissue characterization because of its basis material decomposition capability. In addition to conventional two-material decomposition from DECT measurements, multimaterial decomposition (MMD) is required in many clinical applications. To solve the ill-posed problem of reconstructing multi-material images from dual-energy measurements, additional constraints are incorporated into the formulation, including volume and mass conservation and the assumptions that there are at most three materials in each pixel and various material types among pixels. The recently proposed flexible image-domain MMD method decomposes pixels sequentially into multiple basis materials using a direct inversion scheme, which leads to magnified noise in the material images. In this paper, we propose a statistical image-domain MMD method for DECT to suppress the noise. The proposed method applies penalized weighted least-squares (PWLS) reconstruction with a negative log-likelihood term and edge-preserving regularization for each material. The statistical weight is determined by a data-based method accounting for the noise variance of high- and low-energy CT images. We apply optimization transfer principles to design a series of pixel-wise separable quadratic surrogate (PWSQS) functions that monotonically decrease the cost function. The separability in each pixel enables the simultaneous update of all pixels. The proposed method is evaluated on a digital phantom, a Catphan©600 phantom and three patients (pelvis, head, and thigh). We also implement the direct inversion and low-pass filtration methods for comparison. Compared with the direct inversion method, the proposed method reduces noise standard deviation (STD) in soft tissue by 95.35% in the digital phantom study, by 88.01% in the Catphan©600 phantom study, by 92.45% in the pelvis patient study, by 60.21% in the head patient study, and by 81.22% in the thigh patient study, respectively. 
The overall volume fraction accuracy is improved by around 6.85%. Compared with the low-pass filtration method, the root-mean-square percentage error (RMSE(%)) of electron densities in the Catphan©600 phantom is decreased by 20.89%. As modulation transfer function (MTF) magnitude decreased to 50%, the proposed method increases the spatial resolution by an overall factor of 1.64 on the digital phantom, and 2.16 on the Catphan©600 phantom. The overall volume fraction accuracy is increased by 6.15%. We proposed a statistical image-domain MMD method using DECT measurements. The method successfully suppresses the magnified noise while faithfully retaining the quantification accuracy and anatomical structure in the decomposed material images. The proposed method is practical and promising for advanced clinical applications using DECT imaging. © 2017 American Association of Physicists in Medicine.
Generalized Full-Information Item Bifactor Analysis
Cai, Li; Yang, Ji Seung; Hansen, Mark
2011-01-01
Full-information item bifactor analysis is an important statistical method in psychological and educational measurement. Current methods are limited to single group analysis and inflexible in the types of item response models supported. We propose a flexible multiple-group item bifactor analysis framework that supports a variety of multidimensional item response theory models for an arbitrary mixing of dichotomous, ordinal, and nominal items. The extended item bifactor model also enables the estimation of latent variable means and variances when data from more than one group are present. Generalized user-defined parameter restrictions are permitted within or across groups. We derive an efficient full-information maximum marginal likelihood estimator. Our estimation method achieves substantial computational savings by extending Gibbons and Hedeker’s (1992) bifactor dimension reduction method so that the optimization of the marginal log-likelihood only requires two-dimensional integration regardless of the dimensionality of the latent variables. We use simulation studies to demonstrate the flexibility and accuracy of the proposed methods. We apply the model to study cross-country differences, including differential item functioning, using data from a large international education survey on mathematics literacy. PMID:21534682
Chan, Robin F.; Shabalin, Andrey A.; Xie, Lin Y.; Adkins, Daniel E.; Zhao, Min; Turecki, Gustavo; Clark, Shaunna L.; Aberg, Karolina A.
2017-01-01
Methylome-wide association studies are typically performed using microarray technologies that only assay a very small fraction of the CG methylome and entirely miss two forms of methylation that are common in brain and likely of particular relevance for neuroscience and psychiatric disorders. The alternative is to use whole genome bisulfite (WGB) sequencing but this approach is not yet practically feasible with sample sizes required for adequate statistical power. We argue for revisiting methylation enrichment methods that, provided optimal protocols are used, enable comprehensive, adequately powered and cost-effective genome-wide investigations of the brain methylome. To support our claim we use data showing that enrichment methods approximate the sensitivity obtained with WGB methods and with slightly better specificity. However, this performance is achieved at <5% of the reagent costs. Furthermore, because many more samples can be sequenced simultaneously, projects can be completed about 15 times faster. Currently the only viable option available for comprehensive brain methylome studies, enrichment methods may be critical for moving the field forward. PMID:28334972
Probabilistic determination of probe locations from distance data
Xu, Xiao-Ping; Slaughter, Brian D.; Volkmann, Niels
2013-01-01
Distance constraints, in principle, can be employed to determine information about the location of probes within a three-dimensional volume. Traditional methods for locating probes from distance constraints involve optimization of scoring functions that measure how well the probe location fits the distance data, exploring only a small subset of the scoring function landscape in the process. These methods are not guaranteed to find the global optimum and provide no means to relate the identified optimum to all other optima in scoring space. Here, we introduce a method for the location of probes from distance information that is based on probability calculus. This method allows exploration of the entire scoring space by directly combining probability functions representing the distance data and information about attachment sites. The approach is guaranteed to identify the global optimum and enables the derivation of confidence intervals for the probe location as well as statistical quantification of ambiguities. We apply the method to determine the location of a fluorescence probe using distances derived by FRET and show that the resulting location matches that independently derived by electron microscopy. PMID:23770585
Stanisavljevic, Dejana; Trajkovic, Goran; Marinkovic, Jelena; Bukumiric, Zoran; Cirkovic, Andja; Milic, Natasa
2014-01-01
Background: Medical statistics has become important and relevant for future doctors, enabling them to practice evidence based medicine. Recent studies report that students’ attitudes towards statistics play an important role in their statistics achievements. The aim of the study was to test the psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS) in order to acquire a valid instrument to measure attitudes inside the Serbian educational context. Methods: The validation study was performed on a cohort of 417 medical students who were enrolled in an obligatory introductory statistics course. The SATS adaptation was based on an internationally accepted methodology for translation and cultural adaptation. Psychometric properties of the Serbian version of the SATS were analyzed through the examination of factorial structure and internal consistency. Results: Most medical students held positive attitudes towards statistics. The average total SATS score was above neutral (4.3±0.8), and varied from 1.9 to 6.2. Confirmatory factor analysis validated the six-factor structure of the questionnaire (Affect, Cognitive Competence, Value, Difficulty, Interest and Effort). Values for fit indices TLI (0.940) and CFI (0.961) were above the cut-off of ≥0.90. The RMSEA value of 0.064 (0.051–0.078) was below the suggested value of ≤0.08. Cronbach’s alpha of the entire scale was 0.90, indicating scale reliability. In a multivariate regression model, self-rating of ability in mathematics and current grade point average were significantly associated with the total SATS score after adjusting for age and gender. Conclusion: The present study provided evidence for the appropriate metric properties of the Serbian version of SATS. Confirmatory factor analysis validated the six-factor structure of the scale. The SATS might be a reliable and valid instrument for identifying medical students’ attitudes towards statistics in the Serbian educational context. PMID:25405489
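The internal-consistency figure quoted in the abstract above (Cronbach's alpha) is computed from item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / variance of totals), for k items. The sketch below is a minimal standard-library illustration; the function name and the response matrix are hypothetical and unrelated to the SATS data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items.
responses = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [3, 3, 2]]
print(round(cronbach_alpha(responses), 3))
```

Values near 1 indicate that the items vary together; the abstract's 0.90 for the full scale is conventionally read as high reliability.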
Performance Comparison of SDN Solutions for Switching Dedicated Long-Haul Connections
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rao, Nageswara S
2016-01-01
We consider scenarios with two sites connected over a dedicated, long-haul connection that must quickly fail over in response to degradations in host-to-host application performance. We present two methods for path fail-over using OpenFlow-enabled switches: (a) a light-weight method that utilizes host scripts to monitor the application performance and the dpctl API for switching, and (b) a generic method that uses two OpenDaylight (ODL) controllers and REST interfaces. The restoration dynamics of the application contain significant statistical variations due to the controllers, north interfaces and switches; in addition, the variety of vendor implementations further complicates the choice between different solutions. We present the impulse-response method to estimate the regressions of performance parameters, which enables a rigorous and objective comparison of different solutions. We describe testing results of the two methods, using TCP throughput and connection RTT as the main parameters, over a testbed consisting of HP and Cisco switches connected over long-haul connections emulated in hardware by ANUE devices. The combination of analytical and experimental results demonstrates that the dpctl method responds seconds faster than the ODL method on average, while both methods restore TCP throughput.
Healing X-ray scattering images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Jiliang; Lhermitte, Julien; Tian, Ye; ...
2017-05-24
X-ray scattering images contain numerous gaps and defects arising from detector limitations and experimental configuration. Here, we present a method to heal X-ray scattering images, filling gaps in the data and removing defects in a physically meaningful manner. Unlike generic inpainting methods, this method is closely tuned to the expected structure of reciprocal-space data. In particular, we exploit statistical tests and symmetry analysis to identify the structure of an image; we then copy, average and interpolate measured data into gaps in a way that respects the identified structure and symmetry. Importantly, the underlying analysis methods provide useful characterization of structures present in the image, including the identification of diffuse versus sharp features, anisotropy and symmetry. The presented method leverages known characteristics of reciprocal space, enabling physically reasonable reconstruction even with large image gaps. The method will correspondingly fail for images that violate these underlying assumptions. The method assumes point symmetry and is thus applicable to small-angle X-ray scattering (SAXS) data, but only to a subset of wide-angle data. Our method succeeds in filling gaps and healing defects in experimental images, including extending data beyond the original detector borders.
Real-Time Ultrasound Segmentation, Analysis and Visualisation of Deep Cervical Muscle Structure.
Cunningham, Ryan J; Harding, Peter J; Loram, Ian D
2017-02-01
Despite widespread availability of ultrasound and a need for personalised muscle diagnosis (neck/back pain-injury, work related disorder, myopathies, neuropathies), robust, online segmentation of muscles within complex groups remains unsolved by existing methods. For example, Cervical Dystonia (CD) is a prevalent neurological condition causing painful spasticity in one or multiple muscles in the cervical muscle system. Clinicians currently have no method for targeting/monitoring treatment of deep muscles. Automated methods of muscle segmentation would enable clinicians to study, target, and monitor the deep cervical muscles via ultrasound. We have developed a method for segmenting five bilateral cervical muscles and the spine via ultrasound alone, in real-time. Magnetic Resonance Imaging (MRI) and ultrasound data were collected from 22 participants (age: 29.0±6.6, male: 12). To acquire ultrasound muscle segment labels, a novel multimodal registration method was developed, involving MRI image annotation, and shape registration to MRI-matched ultrasound images, via approximation of the tissue deformation. We then applied polynomial regression to transform our annotations and textures into a mean space, before using shape statistics to generate a texture-to-shape dictionary. For segmentation, test images were compared to dictionary textures giving an initial segmentation, and then we used a customized Active Shape Model to refine the fit. Using ultrasound alone, on unseen participants, our technique currently segments a single image in [Formula: see text] to over 86% accuracy (Jaccard index). We propose this approach is applicable generally to segment, extrapolate and visualise deep muscle structure, and analyse statistical features online.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chihi, Hayet; Galli, Alain; Ravenne, Christian
2000-03-15
The object of this study is to build a three-dimensional (3D) geometric model of the stratigraphic units of the margin of the Rhone River on the basis of geophysical investigations by a network of seismic profiles at sea. The geometry of these units is described by depth charts of each surface identified by seismic profiling, which is done by geostatistics. The modeling starts by a statistical analysis by which we determine the parameters that enable us to calculate the variograms of the identified surfaces. After having determined the statistical parameters, we calculate the variograms of the variable Depth. By analyzing the behavior of the variogram we then can deduce whether the situation is stationary and if the variable has an anisotropic behavior. We tried the following two nonstationary methods to obtain our estimates: (a) The method of universal kriging if the underlying variogram was directly accessible. (b) The method of increments if the underlying variogram was not directly accessible. After having modeled the variograms of the increments and of the variable itself, we calculated the surfaces by kriging the variable Depth on a small-mesh estimation grid. The two methods then are compared and their respective advantages and disadvantages are discussed, as well as their fields of application. These methods are capable of being used widely in earth sciences for automatic mapping of geometric surfaces or for variables such as a piezometric surface or a concentration, which are not 'stationary,' that is, essentially, possess a gradient or a tendency to develop systematically in space.
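The variogram step described in the abstract above can be illustrated with a minimal sketch of the classical (Matheron) empirical semivariogram, in which the semivariance at lag h is the sum of squared differences over all pairs separated by roughly h, divided by twice the number of such pairs. The function name, the binning scheme and the sample depths are illustrative assumptions and are not taken from the study.

```python
import math
from collections import defaultdict


def empirical_variogram(points, values, lag_width, max_lag):
    """Return {lag: semivariance}, with pairs binned to the nearest multiple of lag_width.

    points: list of coordinate tuples (all of the same dimension)
    values: measured variable at each point (e.g. depth)
    """
    sq_sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])
            if h > max_lag:
                continue
            lag_bin = int(h / lag_width + 0.5)  # nearest lag bin
            sq_sums[lag_bin] += (values[i] - values[j]) ** 2
            counts[lag_bin] += 1
    # Semivariance: half the mean squared difference within each bin.
    return {b * lag_width: sq_sums[b] / (2 * counts[b]) for b in sorted(counts)}


# Hypothetical depths with a linear trend (depth = 2 * x) along one profile.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
depth = [0.0, 2.0, 4.0, 6.0]
print(empirical_variogram(pts, depth, lag_width=1.0, max_lag=3.0))
```

With a pure trend like this, the semivariance keeps growing with the square of the lag instead of levelling off at a sill, which is the kind of behavior that signals non-stationarity and motivates methods such as universal kriging.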
Propagation of terahertz pulses in random media.
Pearce, Jeremy; Jian, Zhongping; Mittleman, Daniel M
2004-02-15
We describe measurements of single-cycle terahertz pulse propagation in a random medium. The unique capabilities of terahertz time-domain spectroscopy permit the characterization of a multiply scattered field with unprecedented spatial and temporal resolution. With these results, we can develop a framework for understanding the statistics of broadband laser speckle. Also, the ability to extract information on the phase of the field opens up new possibilities for characterizing multiply scattered waves. We illustrate this with a simple example, which involves computing a time-windowed temporal correlation between fields measured at different spatial locations. This enables the identification of individual scattering events, and could lead to a new method for imaging in random media.
Gui, Jiang; Moore, Jason H.; Williams, Scott M.; Andrews, Peter; Hillege, Hans L.; van der Harst, Pim; Navis, Gerjan; Van Gilst, Wiek H.; Asselbergs, Folkert W.; Gilbert-Diamond, Diane
2013-01-01
We present an extension of the two-class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP-SNP interactions in the context of a quantitative trait. The proposed Quantitative MDR (QMDR) method handles continuous data by modifying MDR’s constructive induction algorithm to use a T-test. QMDR replaces the balanced accuracy metric with a T-test statistic as the score to determine the best interaction model. We used a simulation to identify the empirical distribution of QMDR’s testing score. We then applied QMDR to genetic data from the ongoing prospective Prevention of Renal and Vascular End-Stage Disease (PREVEND) study. PMID:23805232
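One plausible reading of the scoring step described in this abstract can be sketched as follows. The assumptions are ours, not taken from the abstract: each genotype combination is labeled "high" when its mean trait value exceeds the overall mean, and the score is a Welch t statistic between the pooled high and low groups. The function name `qmdr_like_score` and the toy data are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, variance

def qmdr_like_score(genotype_pairs, trait):
    """Label genotype cells high/low by mean trait, then score with a t statistic.

    genotype_pairs: one (snp1, snp2) tuple per individual.
    trait: quantitative trait values in the same order.
    Returns (t_statistic, set_of_high_cells).
    """
    overall = mean(trait)
    cells = defaultdict(list)
    for g, y in zip(genotype_pairs, trait):
        cells[g].append(y)
    # Assumed labeling rule: a cell is "high" if its mean exceeds the overall mean.
    high_cells = {g for g, ys in cells.items() if mean(ys) > overall}
    high = [y for g, y in zip(genotype_pairs, trait) if g in high_cells]
    low = [y for g, y in zip(genotype_pairs, trait) if g not in high_cells]
    if len(high) < 2 or len(low) < 2:
        raise ValueError("need at least two individuals in each group")
    # Welch's t statistic (unequal variances); an illustrative choice.
    se = math.sqrt(variance(high) / len(high) + variance(low) / len(low))
    if se == 0.0:
        raise ValueError("zero variance in both groups; t statistic undefined")
    return (mean(high) - mean(low)) / se, high_cells

# Toy data: two individuals per genotype combination.
pairs = [(0, 0), (0, 0), (1, 1), (1, 1)]
y = [1.0, 2.0, 5.0, 6.0]
t_stat, high = qmdr_like_score(pairs, y)
print(f"t = {t_stat:.3f}, high-level cells = {sorted(high)}")
```

In a full search, a score like this would be computed for every candidate SNP pair, ideally under cross-validation, and the pair with the largest statistic retained.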
Very large scale characterization of graphene mechanical devices using a colorimetry technique.
Cartamil-Bueno, Santiago Jose; Centeno, Alba; Zurutuza, Amaia; Steeneken, Peter Gerard; van der Zant, Herre Sjoerd Jan; Houri, Samer
2017-06-08
We use a scalable optical technique to characterize more than 21 000 circular nanomechanical devices made of suspended single- and double-layer graphene on cavities with different diameters (D) and depths (g). To maximize the contrast between suspended and broken membranes we used a model for selecting the optimal color filter. The method enables parallel and automatized image processing for yield statistics. We find the survival probability to be correlated with a structural mechanics scaling parameter given by D^4/g^3. Moreover, we extract a median adhesion energy of Γ = 0.9 J m^-2 between the membrane and the native SiO2 at the bottom of the cavities.
The basis function approach for modeling autocorrelation in ecological data
Hefley, Trevor J.; Broms, Kristin M.; Brost, Brian M.; Buderman, Frances E.; Kay, Shannon L.; Scharf, Henry; Tipton, John; Williams, Perry J.; Hooten, Mevin B.
2017-01-01
Analyzing ecological data often requires modeling the autocorrelation created by spatial and temporal processes. Many seemingly disparate statistical methods used to account for autocorrelation can be expressed as regression models that include basis functions. Basis functions also enable ecologists to modify a wide range of existing ecological models in order to account for autocorrelation, which can improve inference and predictive accuracy. Furthermore, understanding the properties of basis functions is essential for evaluating the fit of spatial or time-series models, detecting a hidden form of collinearity, and analyzing large data sets. We present important concepts and properties related to basis functions and illustrate several tools and techniques ecologists can use when modeling autocorrelation in ecological data.
Wide-Field Imaging of Single-Nanoparticle Extinction with Sub-nm^2 Sensitivity
NASA Astrophysics Data System (ADS)
Payne, Lukas M.; Langbein, Wolfgang; Borri, Paola
2018-03-01
We report on a highly sensitive wide-field imaging technique for quantitative measurement of the optical extinction cross section σext of single nanoparticles. The technique is simple and high speed, and it enables the simultaneous acquisition of hundreds of nanoparticles for statistical analysis. Using rapid referencing, fast acquisition, and a deconvolution analysis, a shot-noise-limited sensitivity down to 0.4 nm^2 is achieved. Measurements on a set of individual gold nanoparticles of 5 nm diameter using this method yield σext = (10.0 ± 3.1) nm^2, which is consistent with theoretical expectations and well above the background fluctuations of 0.9 nm^2.
Hung, Man Yui; Wright, David John; Blacklock, Jeanette; Needle, Richard John
2017-01-01
Introduction: A high nurse-vacancy rate combined with high numbers of applications for junior pharmacist roles resulted in Colchester Hospital University National Health System Foundation Trust trial employing junior pharmacists into traditional nursing posts with the aim of integrating pharmacists into the ward team and enhancing local medicines optimization. The aim of the evaluation was to describe the implementation process and practice of the integrated care pharmacists (ICPs) in order to inform future innovations of a similar nature. Methods: Four band 6 ward-based ICPs were employed on two wards funded within current ward staffing expenditure. With ethical committee approval, interviews were undertaken with the ICPs and focus groups with ward nurses, senior ward nurses and members of the medical team. Data were analyzed thematically to identify service benefits, barriers and enablers. Routine ward performance data were obtained from the two ICP wards and two wards selected as comparators. Appropriate statistical tests were performed to identify differences in performance. Results: Four ICPs were interviewed, and focus groups were undertaken with three junior nurses, four senior nurses and three medical practitioners. Service enablers were continuous ward time, undertaking drug administration, positive feedback and use of effective communication methods. Barriers were planning, funding model, career development, and interprofessional working and social isolation. ICPs were believed to save nurse time and improve medicines safety. The proportion of patients receiving medicine reconciliation within 24 hours increased significantly in the ICP wards. All ICPs had resigned from their role within 12 months. Discussion: It was believed that by locating pharmacists on the ward full time and allowing them to undertake medicines administration and medicines reconciliation, the nursing time would be saved and medicines safety improved. 
There was however significant learning to be derived from the implementation process, which may enable similar future models to be introduced more successfully. PMID:29354565
46 CFR 502.157 - Written evidence.
Code of Federal Regulations, 2010 CFR
2010-10-01
... objecting. Statistical exhibits shall contain a short commentary explaining the conclusions which the... such written rebuttal statements and exhibits in advance of the hearing to enable study by the parties...
Friedman, David B
2012-01-01
All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition to enable the distinction between a relevant biological signal from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.
NASA Technical Reports Server (NTRS)
Safford, Robert R.; Jackson, Andrew E.; Swart, William W.; Barth, Timothy S.
1994-01-01
Successful ground processing at KSC requires that flight hardware and ground support equipment conform to specifications at tens of thousands of checkpoints. Knowledge of conformance is an essential requirement for launch. That knowledge of conformance at every requisite point does not, however, enable identification of past problems with equipment, or potential problem areas. This paper describes how the introduction of Statistical Process Control and Process Capability Analysis identification procedures into existing shuttle processing procedures can enable identification of potential problem areas and candidates for improvements to increase processing performance measures. Results of a case study describing application of the analysis procedures to Thermal Protection System processing are used to illustrate the benefits of the approaches described in the paper.
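The process capability idea mentioned in this abstract can be illustrated with the standard indices Cp = (USL - LSL) / (6 sigma) and Cpk = min(USL - mean, mean - LSL) / (3 sigma). This is a minimal sketch only; the measurements and specification limits are invented and have nothing to do with actual shuttle data.

```python
import statistics

def process_capability(measurements, lsl, usl):
    """Return (Cp, Cpk) for measurements against lower/upper spec limits."""
    mu = statistics.mean(measurements)
    sigma = statistics.stdev(measurements)  # sample standard deviation
    if sigma == 0.0:
        raise ValueError("zero spread; capability indices are undefined")
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk

# Every hypothetical reading is inside the limits [7, 16], so each one
# "conforms", yet the process sits close to the lower limit.
readings = [9.0, 10.0, 11.0]
cp, cpk = process_capability(readings, lsl=7.0, usl=16.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Here every reading passes its check, but Cpk is well below Cp because the process is off-center. That gap is the type of signal that checkpoint-by-checkpoint conformance alone does not reveal.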
BREAST: a novel method to improve the diagnostic efficacy of mammography
NASA Astrophysics Data System (ADS)
Brennan, P. C.; Tapia, K.; Ryan, J.; Lee, W.
2013-03-01
High quality breast imaging and accurate image assessment are critical to the early diagnoses, treatment and management of women with breast cancer. Breast Screen Reader Assessment Strategy (BREAST) provides a platform, accessible by researchers and clinicians world-wide, which will contain image data bases, algorithms to assess reader performance and on-line systems for image evaluation. The platform will contribute to the diagnostic efficacy of breast imaging in Australia and beyond on two fronts: reducing errors in mammography, and transforming our assessment of novel technologies and techniques. Mammography is the primary diagnostic tool for detecting breast cancer with over 800,000 women X-rayed each year in Australia, however, it fails to detect 30% of breast cancers with a number of missed cancers being visible on the image [1-6]. BREAST will monitor the mistakes, identify reasons for mammographic errors, and facilitate innovative solutions to reduce error rates. The BREAST platform has the potential to enable expert assessment of breast imaging innovations, anywhere in the world where experts or innovations are located. Currently, innovations are often being assessed by limited numbers of individuals who happen to be geographically located close to the innovation, resulting in equivocal studies with low statistical power. BREAST will transform this current paradigm by enabling large numbers of experts to assess any new method or technology using our embedded evaluation methods. We are confident that this world-first system will play an important part in the future efficacy of breast imaging.
Remans, Tony; Keunen, Els; Bex, Geert Jan; Smeets, Karen; Vangronsveld, Jaco; Cuypers, Ann
2014-10-01
Reverse transcription-quantitative PCR (RT-qPCR) has been widely adopted to measure differences in mRNA levels; however, biological and technical variation strongly affects the accuracy of the reported differences. RT-qPCR specialists have warned that, unless researchers minimize this variability, they may report inaccurate differences and draw incorrect biological conclusions. The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines describe procedures for conducting and reporting RT-qPCR experiments. The MIQE guidelines enable others to judge the reliability of reported results; however, a recent literature survey found low adherence to these guidelines. Additionally, even experiments that use appropriate procedures remain subject to individual variation that statistical methods cannot correct. For example, since ideal reference genes do not exist, the widely used method of normalizing RT-qPCR data to reference genes generates background noise that affects the accuracy of measured changes in mRNA levels. However, current RT-qPCR data reporting styles ignore this source of variation. In this commentary, we direct researchers to appropriate procedures, outline a method to present the remaining uncertainty in data accuracy, and propose an intuitive way to select reference genes to minimize uncertainty. Reporting the uncertainty in data accuracy also serves for quality assessment, enabling researchers and peer reviewers to confidently evaluate the reliability of gene expression data. © 2014 American Society of Plant Biologists. All rights reserved.
NASA Technical Reports Server (NTRS)
Pokorny, P.; Janches, D.; Brown, P. G.; Hormaechea, J. L.
2017-01-01
Over a million individually measured meteoroid orbits were collected with the Southern Argentina Agile MEteor Radar (SAAMER) between 2012-2015. This provides a robust statistical database to perform an initial orbital survey of meteor showers in the Southern Hemisphere via the application of a 3D wavelet transform. The method results in a composite year from all 4 years of data, enabling us to obtain an undisturbed year of meteor activity with more than one thousand meteors per day. Our automated meteor shower search methodology identified 58 showers. Of these showers, 24 were associated with previously reported showers from the IAU catalogue while 34 showers are new and not listed in the catalogue. Our searching method combined with our large data sample provides unprecedented accuracy in measuring meteor shower activity and description of shower characteristics in the Southern Hemisphere. Using simple modeling and clustering methods we also propose potential parent bodies for the newly discovered showers.
NASA Astrophysics Data System (ADS)
Pokorný, P.; Janches, D.; Brown, P. G.; Hormaechea, J. L.
2017-07-01
Over a million individually measured meteoroid orbits were collected with the Southern Argentina Agile MEteor Radar (SAAMER) between 2012-2015. This provides a robust statistical database to perform an initial orbital survey of meteor showers in the Southern Hemisphere via the application of a 3D wavelet transform. The method results in a composite year from all 4 years of data, enabling us to obtain an undisturbed year of meteor activity with more than one thousand meteors per day. Our automated meteor shower search methodology identified 58 showers. Of these showers, 24 were associated with previously reported showers from the IAU catalogue while 34 showers are new and not listed in the catalogue. Our searching method combined with our large data sample provides unprecedented accuracy in measuring meteor shower activity and description of shower characteristics in the Southern Hemisphere. Using simple modeling and clustering methods we also propose potential parent bodies for the newly discovered showers.
Dumbryte, Irma; Linkeviciene, Laura; Linkevicius, Tomas; Malinauskas, Mangirdas
2017-07-26
The study aimed at introducing currently available techniques for enamel microcracks (EMCs) detection, and presenting a method for direct quantitative analysis of an individual EMC. Measurements of the detailed EMC characteristics (location, length, and width) were taken from the reconstructed images of the buccal tooth surface (teeth extracted from two age groups of patients) employing scanning electron microscopy (SEM) and our derived formulas before and after ceramic bracket removal. Measured parameters of EMCs for the younger age group were 2.41 µm (width) and 3.68 mm (length) before debonding and 2.73 µm and 3.90 mm after; for the older group, 4.03 µm and 4.35 mm before and 4.80 µm and 4.37 mm after bracket removal. Following debonding, EMCs increased for both groups; however, the changes in width and length were statistically insignificant. Regardless of the age group, the proposed method enabled precise detection of the same EMC before and after debonding, and quantitative examination of its characteristics.
Examining Neuronal Connectivity and Its Role in Learning and Memory
NASA Astrophysics Data System (ADS)
Gala, Rohan
Learning and long-term memory formation are accompanied by changes in the patterns and weights of synaptic connections in the underlying neuronal network. However, the fundamental rules that drive connectivity changes, and the precise structure-function relationships within neuronal networks, remain elusive. Technological improvements over the last few decades have enabled the observation of large but specific subsets of neurons and their connections in unprecedented detail. Devising robust and automated computational methods is critical to distill information from ever-increasing volumes of raw experimental data. Moreover, statistical models and theoretical frameworks are required to interpret the data and assemble evidence into understanding of brain function. In this thesis, I first describe computational methods to reconstruct connectivity based on light microscopy imaging experiments. Next, I use these methods to quantify structural changes in connectivity based on in vivo time-lapse imaging experiments. Finally, I present a theoretical model of associative learning that can explain many stereotypical features of experimentally observed connectivity.
Won, Sungho; Choi, Hosik; Park, Suyeon; Lee, Juyoung; Park, Changyi; Kwon, Sunghoon
2015-01-01
Owing to recent improvements in genotyping technology, large-scale genetic data can be used to identify disease susceptibility loci, and these findings have substantially improved our understanding of complex diseases. However, in spite of these successes, most of the genetic effects for many complex diseases were found to be very small, which has been a major hurdle to building disease prediction models. Recently, many statistical methods based on penalized regression have been proposed to tackle the so-called "large P and small N" problem. Penalized regressions, including the least absolute shrinkage and selection operator (LASSO) and ridge regression, limit the space of parameters, and this constraint enables the estimation of effects for a very large number of SNPs. Various extensions have been suggested, and, in this report, we compare their accuracy by applying them to several complex diseases. Our results show that penalized regressions are usually robust and provide better accuracy than the existing methods, at least for the diseases under consideration.
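How the two penalties constrain the parameter space is easiest to see in the textbook special case of an orthonormal design, where both estimators have closed forms. With objective (1/2)||y - Xb||^2 plus a penalty, the L1 penalty lam * |b| soft-thresholds each least-squares coefficient, while the L2 penalty (lam / 2) * b^2 rescales it by 1 / (1 + lam). The sketch below is only that special case, not the procedure compared in the report; the coefficient values are invented.

```python
def soft_threshold(b, lam):
    """LASSO solution for one coefficient under an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

def ridge_shrink(b, lam):
    """Ridge solution for one coefficient under an orthonormal design."""
    return b / (1.0 + lam)

# Hypothetical least-squares effect estimates for five SNPs.
beta_ols = [3.0, -0.5, 0.2, -3.0, 1.2]
lam = 1.0

lasso = [soft_threshold(b, lam) for b in beta_ols]
ridge = [ridge_shrink(b, lam) for b in beta_ols]

for b, bl, br in zip(beta_ols, lasso, ridge):
    print(f"OLS {b:+.2f} -> LASSO {bl:+.2f}, ridge {br:+.2f}")
```

The LASSO sets small effects exactly to zero, which amounts to variable selection, whereas ridge keeps every SNP but shrinks all effects proportionally.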
Fine-scale patterns of population stratification confound rare variant association tests.
O'Connor, Timothy D; Kiezun, Adam; Bamshad, Michael; Rich, Stephen S; Smith, Joshua D; Turner, Emily; Leal, Suzanne M; Akey, Joshua M
2013-01-01
Advances in next-generation sequencing technology have enabled systematic exploration of the contribution of rare variation to Mendelian and complex diseases. Although it is well known that population stratification can generate spurious associations with common alleles, its impact on rare variant association methods remains poorly understood. Here, we performed exhaustive coalescent simulations with demographic parameters calibrated from exome sequence data to evaluate the performance of nine rare variant association methods in the presence of fine-scale population structure. We find that all methods have an inflated spurious association rate for parameter values that are consistent with levels of differentiation typical of European populations. For example, at a nominal significance level of 5%, some test statistics have a spurious association rate as high as 40%. Finally, we empirically assess the impact of population stratification in a large data set of 4,298 European American exomes. Our results have important implications for the design, analysis, and interpretation of rare variant genome-wide association studies.
Bhavnani, Suresh K.; Chen, Tianlong; Ayyaswamy, Archana; Visweswaran, Shyam; Bellala, Gowtham; Divekar, Rohit; Bassler, Kevin E.
2017-01-01
A primary goal of precision medicine is to identify patient subgroups based on their characteristics (e.g., comorbidities or genes) with the goal of designing more targeted interventions. While network visualization methods such as Fruchterman-Reingold have been used to successfully identify such patient subgroups in small to medium sized data sets, they often fail to reveal comprehensible visual patterns in large and dense networks despite having significant clustering. We therefore developed an algorithm called ExplodeLayout, which exploits the existence of significant clusters in bipartite networks to automatically “explode” a traditional network layout with the goal of separating overlapping clusters, while at the same time preserving key network topological properties that are critical for the comprehension of patient subgroups. We demonstrate the utility of ExplodeLayout by visualizing a large dataset extracted from Medicare consisting of readmitted hip-fracture patients and their comorbidities, demonstrate its statistically significant improvement over a traditional layout algorithm, and discuss how the resulting network visualization enabled clinicians to infer mechanisms precipitating hospital readmission in specific patient subgroups. PMID:28815099
Li, Yan; Andrade, Jorge
2017-01-01
A growing trend in the biomedical community is the use of Next Generation Sequencing (NGS) technologies in genomics research. The complexity of downstream differential expression (DE) analysis is however still challenging, as it requires sufficient computer programming and command-line knowledge. Furthermore, researchers often need to evaluate and visualize interactively the effect of using differential statistical and error models, assess the impact of selecting different parameters and cutoffs, and finally explore the overlapping consensus of cross-validated results obtained with different methods. This represents a bottleneck that slows down or impedes the adoption of NGS technologies in many labs. We developed DEApp, an interactive and dynamic web application for differential expression analysis of count based NGS data. This application enables model selection, parameter tuning, cross validation and visualization of results in a user-friendly interface. DEApp enables labs with no access to full time bioinformaticians to exploit the advantages of NGS applications in biomedical research. This application is freely available at https://yanli.shinyapps.io/DEApp and https://gallery.shinyapps.io/DEApp.
Regression without truth with Markov chain Monte-Carlo
NASA Astrophysics Data System (ADS)
Madan, Hennadii; Pernuš, Franjo; Likar, Boštjan; Špiclin, Žiga
2017-03-01
Regression without truth (RWT) is a statistical technique for estimating error model parameters of each method in a group of methods used for measurement of a certain quantity. A very attractive aspect of RWT is that it does not rely on a reference method or "gold standard" data, which is otherwise difficult to obtain. RWT was used for a reference-free performance comparison of several methods for measuring left ventricular ejection fraction (EF), i.e. a percentage of blood leaving the ventricle each time the heart contracts, and has since been applied for various other quantitative imaging biomarkers (QIBs). Herein, we show how Markov chain Monte-Carlo (MCMC), a computational technique for drawing samples from a statistical distribution with probability density function known only up to a normalizing coefficient, can be used to augment RWT to gain a number of important benefits compared to the original approach based on iterative optimization. For instance, the proposed MCMC-based RWT enables the estimation of joint posterior distribution of the parameters of the error model, straightforward quantification of uncertainty of the estimates, and estimation of true value of the measurand and corresponding credible intervals (CIs); it does not require a finite support for prior distribution of the measurand and generally has a much improved robustness against convergence to non-global maxima. The proposed approach is validated using synthetic data that emulate the EF data for 45 patients measured with 8 different methods. The obtained results show that 90% CI of the corresponding parameter estimates contain the true values of all error model parameters and the measurand. A potential real-world application is to take measurements of a certain QIB with several different methods and then use the proposed framework to compute the estimates of the true values and their uncertainty, vital information for diagnosis based on QIB.
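The core MCMC idea mentioned in this abstract, sampling from a density known only up to a normalizing constant and reading credible intervals off the samples, can be sketched with a generic random-walk Metropolis sampler. This is not the RWT error model: the target below is a stand-in (an unnormalized Gaussian with mean 2 and standard deviation 0.5), and all function names are hypothetical.

```python
import math
import random

def metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1D unnormalized log density."""
    rng = random.Random(seed)
    x, logp = x0, log_density(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old); the unknown
        # normalizing constant cancels in this ratio.
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        samples.append(x)
    return samples

def credible_interval(samples, level=0.90):
    """Equal-tailed credible interval from posterior samples."""
    s = sorted(samples)
    n = len(s)
    lo = s[int((1.0 - level) / 2.0 * n)]
    hi = s[min(n - 1, int((1.0 + level) / 2.0 * n))]
    return lo, hi

def log_target(x):
    # Stand-in posterior: Gaussian with mean 2.0 and sd 0.5, unnormalized.
    return -0.5 * ((x - 2.0) / 0.5) ** 2

draws = metropolis(log_target, x0=0.0, n_samples=20000, seed=1)[2000:]  # drop burn-in
low, high = credible_interval(draws, 0.90)
print(f"posterior mean ~ {sum(draws) / len(draws):.2f}, 90% CI ~ ({low:.2f}, {high:.2f})")
```

In a real application the log density would be the log posterior of the error-model parameters and the state would be multidimensional, but the accept/reject step and the way intervals are extracted stay the same.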
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnett, Alex H.; Betcke, Timo; School of Mathematics, University of Manchester, Manchester, M13 9PL
2007-12-15
We report the first large-scale statistical study of very high-lying eigenmodes (quantum states) of the mushroom billiard proposed by L. A. Bunimovich [Chaos 11, 802 (2001)]. The phase space of this mixed system is unusual in that it has a single regular region and a single chaotic region, and no KAM hierarchy. We verify Percival's conjecture to high accuracy (1.7%). We propose a model for dynamical tunneling and show that it predicts well the chaotic components of predominantly regular modes. Our model explains our observed density of such superpositions dying as E^{-1/3} (E is the eigenvalue). We compare eigenvalue spacing distributions against Random Matrix Theory expectations, using 16 000 odd modes (an order of magnitude more than any existing study). We outline new variants of mesh-free boundary collocation methods which enable us to achieve high accuracy and high mode numbers (~10^5) orders of magnitude faster than with competing methods.
The DEVELOP National Program's Strategy for Communicating Applied Science Outcomes
NASA Astrophysics Data System (ADS)
Childs-Gleason, L. M.; Ross, K. W.; Crepps, G.; Favors, J.; Kelley, C.; Miller, T. N.; Allsbrook, K. N.; Rogers, L.; Ruiz, M. L.
2016-12-01
NASA's DEVELOP National Program conducts rapid feasibility projects that enable the future workforce and current decision makers to collaborate and build capacity to use Earth science data to enhance environmental management and policy. The program communicates its results and applications to a broad spectrum of audiences through a variety of methods: "virtual poster sessions" that engage the general public through short project videos and interactive dialogue periods, a "Campus Ambassador Corps" that communicates about the program and its projects to academia, scientific and policy conference presentations, community engagement activities and end-of-project presentations, project "hand-offs" providing results and tools to project partners, traditional publications (both gray literature and peer-reviewed), an interactive website project gallery, targeted brochures, and through multiple social media venues and campaigns. This presentation will describe the various methods employed by DEVELOP to communicate the program's scientific outputs, target audiences, general statistics, community response and best practices.
All-atom calculation of protein free-energy profiles
NASA Astrophysics Data System (ADS)
Orioli, S.; Ianeselli, A.; Spagnolli, G.; Faccioli, P.
2017-10-01
The Bias Functional (BF) approach is a variational method which enables one to efficiently generate ensembles of reactive trajectories for complex biomolecular transitions, using ordinary computer clusters. For example, this scheme was applied to simulate in atomistic detail the folding of proteins consisting of several hundreds of amino acids and with experimental folding time of several minutes. A drawback of the BF approach is that it produces trajectories which do not satisfy microscopic reversibility. Consequently, this method cannot be used to directly compute equilibrium observables, such as free energy landscapes or equilibrium constants. In this work, we develop a statistical analysis which permits us to compute the potential of mean-force (PMF) along an arbitrary collective coordinate, by exploiting the information contained in the reactive trajectories calculated with the BF approach. We assess the accuracy and computational efficiency of this scheme by comparing its results with the PMF obtained for a small protein by means of plain molecular dynamics.
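For orientation, the reference quantity in this comparison, a PMF from plain molecular dynamics, is conventionally estimated from a histogram of the collective coordinate as F(x) = -kT ln p(x). The sketch below shows only that standard estimate and assumes equilibrium-distributed samples; it does not reproduce the paper's analysis of BF trajectories. The sample values are invented.

```python
import math

def pmf_from_samples(samples, x_min, x_max, n_bins, kT=1.0):
    """Histogram estimate of the potential of mean force, F = -kT ln p.

    Returns one value per bin, shifted so the minimum is zero; empty bins
    get math.inf. Assumes the samples follow the equilibrium distribution.
    """
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    for s in samples:
        k = math.floor((s - x_min) / width)
        if 0 <= k < n_bins:
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [x_min, x_max)")
    free = [-kT * math.log(c / total) if c > 0 else math.inf for c in counts]
    f_min = min(free)
    return [f - f_min for f in free]

# Hypothetical collective-coordinate values: three visits to one state, one to another.
coordinate = [0.1, 0.2, 0.3, 0.6]
profile = pmf_from_samples(coordinate, 0.0, 1.0, n_bins=2)
print([round(f, 3) for f in profile])
```

A state visited three times as often as another lies kT ln 3 lower in free energy. Reactive trajectories that violate microscopic reversibility do not visit states with these equilibrium weights, which is why a separate statistical analysis is needed for them.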
Perthold, Jan Walther; Oostenbrink, Chris
2018-05-17
Enveloping distribution sampling (EDS) is an efficient approach to calculate multiple free-energy differences from a single molecular dynamics (MD) simulation. However, the construction of an appropriate reference-state Hamiltonian that samples all states efficiently is not straightforward. We propose a novel approach for the construction of the EDS reference-state Hamiltonian, related to a previously described procedure to smoothen energy landscapes. In contrast to previously suggested EDS approaches, our reference-state Hamiltonian preserves local energy minima of the combined end-states. Moreover, we propose an intuitive, robust and efficient parameter optimization scheme to tune EDS Hamiltonian parameters. We demonstrate the proposed method with established and novel test systems and conclude that our approach allows for the automated calculation of multiple free-energy differences from a single simulation. Accelerated EDS promises to be a robust and user-friendly method to compute free-energy differences based on solid statistical mechanics.
Toward Personalized Control of Human Gut Bacterial Communities.
David, Lawrence A
2018-01-01
A key challenge in microbiology will be developing tools for manipulating human gut bacterial communities. Our ability to predict and control the dynamics of these communities is now in its infancy. To manage human gut microbiota, I am developing methods in three research domains. First, I am refining in vitro tools to experimentally study gut microbes at high throughput and in controlled settings. Second, I am adapting "big data" techniques to overcome statistical challenges confronting microbiota modeling. Third, I am testing study designs that can streamline human testing of microbiota manipulations. Assembling these methods creates new challenges, including training scientists who can work across disciplines such as engineering, ecology, and medicine. Nevertheless, I envision that overcoming these obstacles will enable my group to construct platforms that can personalize microbiota treatments, particularly ones based on diet. More broadly, I anticipate that such platforms will have applications across fields such as agriculture, biotechnology, and environmental management.
smallWig: parallel compression of RNA-seq WIG files.
Wang, Zhiying; Weissman, Tsachy; Milenkovic, Olgica
2016-01-15
We developed a new lossless compression method for WIG data, named smallWig, offering the best known compression rates for RNA-seq data and featuring random access functionalities that enable visualization, summary statistics analysis and fast queries from the compressed files. Our approach results in order of magnitude improvements compared with bigWig and ensures compression rates only a fraction of those produced by cWig. The key features of the smallWig algorithm are statistical data analysis and a combination of source coding methods that ensure high flexibility and make the algorithm suitable for different applications. Furthermore, for general-purpose file compression, the compression rate of smallWig approaches the empirical entropy of the tested WIG data. For compression with random query features, smallWig uses a simple block-based compression scheme that introduces only a minor overhead in the compression rate. For archival or storage space-sensitive applications, the method relies on context mixing techniques that lead to further improvements of the compression rate. Implementations of smallWig can be executed in parallel on different sets of chromosomes using multiple processors, thereby enabling desirable scaling for future transcriptome Big Data platforms. The development of next-generation sequencing technologies has led to a dramatic decrease in the cost of DNA/RNA sequencing and expression profiling. RNA-seq has emerged as an important and inexpensive technology that provides information about whole transcriptomes of various species and organisms, as well as different organs and cellular communities. The vast volume of data generated by RNA-seq experiments has significantly increased data storage costs and communication bandwidth requirements. Current compression tools for RNA-seq data such as bigWig and cWig either use general-purpose compressors (gzip) or suboptimal compression schemes that leave significant room for improvement. 
To substantiate this claim, we performed a statistical analysis of expression data in different transform domains and developed accompanying entropy coding methods that bridge the gap between theoretical and practical WIG file compression rates. We tested different variants of the smallWig compression algorithm on a number of integer- and real- (floating point) valued RNA-seq WIG files generated by the ENCODE project. The results reveal that, on average, smallWig offers 18-fold compression rate improvements, up to 2.5-fold compression time improvements, and 1.5-fold decompression time improvements when compared with bigWig. On the tested files, the memory usage of the algorithm never exceeded 90 KB. When more elaborate context mixing compressors were used within smallWig, the obtained compression rates were as much as 23 times better than those of bigWig. For smallWig used in the random query mode, which also supports retrieval of the summary statistics, an overhead in the compression rate of roughly 3-17% was introduced depending on the chosen system parameters. An increase in encoding and decoding time of 30% and 55% represents an additional performance loss caused by enabling random data access. We also implemented smallWig using multi-processor programming. This parallelization feature decreases the encoding delay 2-3.4 times compared with that of a single-processor implementation, with the number of processors used ranging from 2 to 8; in the same parameter regime, the decoding delay decreased 2-5.2 times. The smallWig software can be downloaded from: http://stanford.edu/~zhiyingw/smallWig/smallwig.html, http://publish.illinois.edu/milenkovic/, http://web.stanford.edu/~tsachy/. Contact: zhiyingw@stanford.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
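The notion of empirical entropy in a transform domain, used in this abstract as the yardstick for compression rate, can be illustrated with a small sketch. It computes the first-order empirical entropy (bits per symbol) of a toy coverage track and of its first differences. The track values are invented, and the difference transform is only an illustrative choice, not necessarily one of the transforms used by smallWig.

```python
import math
from collections import Counter

def empirical_entropy_bits(symbols):
    """First-order empirical entropy in bits per symbol."""
    n = len(symbols)
    counts = Counter(symbols)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def first_differences(values):
    """Keep the first value, then store successive differences (invertible)."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]

# Hypothetical per-base coverage with long runs, as is common in WIG tracks.
track = [0, 0, 0, 5, 5, 5, 5, 9, 9, 9, 0, 0]
diffs = first_differences(track)

print(f"raw:         {empirical_entropy_bits(track):.3f} bits/symbol")
print(f"differences: {empirical_entropy_bits(diffs):.3f} bits/symbol")
```

Because the track consists of runs, most differences are zero and the entropy in the difference domain is lower, so an entropy coder applied after the transform has less to encode per symbol.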
Zhang, X; Patel, L A; Beckwith, O; Schneider, R; Weeden, C J; Kindt, J T
2017-11-14
Micelle cluster distributions from molecular dynamics simulations of a solvent-free coarse-grained model of sodium octyl sulfate (SOS) were analyzed using an improved method to extract equilibrium association constants from small-system simulations containing one or two micelle clusters at equilibrium with free surfactants and counterions. The statistical-thermodynamic and mathematical foundations of this partition-enabled analysis of cluster histograms (PEACH) approach are presented. A dramatic reduction in computational time for analysis was achieved through a strategy similar to the selector variable method to circumvent the need for exhaustive enumeration of the possible partitions of surfactants and counterions into clusters. Using statistics from a set of small-system (up to 60 SOS molecules) simulations as input, equilibrium association constants for micelle clusters were obtained as a function of both number of surfactants and number of associated counterions through a global fitting procedure. The resulting free energies were able to accurately predict micelle size and charge distributions in a large (560 molecule) system. The evolution of micelle size and charge with SOS concentration as predicted by the PEACH-derived free energies and by a phenomenological four-parameter model fit, along with the sensitivity of these predictions to variations in cluster definitions, are analyzed and discussed.
Zhang, Guosheng; Huang, Kuan-Chieh; Xu, Zheng; Tzeng, Jung-Ying; Conneely, Karen N; Guan, Weihua; Kang, Jian; Li, Yun
2016-05-01
DNA methylation is a key epigenetic mark involved in both normal development and disease progression. Recent advances in high-throughput technologies have enabled genome-wide profiling of DNA methylation. However, DNA methylation profiling often employs different designs and platforms with varying resolution, which hinders joint analysis of methylation data from multiple platforms. In this study, we propose a penalized functional regression model to impute missing methylation data. By incorporating functional predictors, our model utilizes information from nonlocal probes to improve imputation quality. Here, we compared the performance of our functional model to linear regression and the best single probe surrogate in real data and via simulations. Specifically, we applied different imputation approaches to an acute myeloid leukemia dataset consisting of 194 samples and our method showed higher imputation accuracy, manifested, for example, by a 94% relative increase in information content and up to 86% more CpG sites passing post-imputation filtering. Our simulated association study further demonstrated that our method substantially improves the statistical power to identify trait-associated methylation loci. These findings indicate that the penalized functional regression model is a convenient and valuable imputation tool for methylation data, and it can boost statistical power in downstream epigenome-wide association study (EWAS). © 2016 WILEY PERIODICALS, INC.
NASA Astrophysics Data System (ADS)
Kissick, David J.; Muir, Ryan D.; Sullivan, Shane Z.; Oglesbee, Robert A.; Simpson, Garth J.
2013-02-01
Despite the ubiquitous use of multi-photon and confocal microscopy measurements in biology, the core techniques typically suffer from fundamental compromises between signal to noise (S/N) and linear dynamic range (LDR). In this study, direct synchronous digitization of voltage transients coupled with statistical analysis is shown to allow S/N approaching the theoretical maximum throughout an LDR spanning more than 8 decades, limited only by the dark counts of the detector on the low end and by the intrinsic nonlinearities of the photomultiplier tube (PMT) detector on the high end. Synchronous digitization of each voltage transient represents a fundamental departure from established methods in confocal/multi-photon imaging, which are currently based on either photon counting or signal averaging. High information-density data acquisition (up to 3.2 GB/s of raw data) enables the smooth transition between the two modalities on a pixel-by-pixel basis and the ultimate writing of much smaller files (few kB/s). Modeling of the PMT response allows extraction of key sensor parameters from the histogram of voltage peak-heights. Applications in second harmonic generation (SHG) microscopy are described demonstrating S/N approaching the shot-noise limit of the detector over large dynamic ranges.
Wavelet investigation of preferential concentration in particle-laden turbulence
NASA Astrophysics Data System (ADS)
Bassenne, Maxime; Urzay, Javier; Schneider, Kai; Moin, Parviz
2017-11-01
Direct numerical simulations of particle-laden homogeneous-isotropic turbulence are employed in conjunction with wavelet multi-resolution analyses to study preferential concentration in both physical and spectral spaces. Spatially-localized energy spectra for velocity, vorticity and particle-number density are computed, along with their spatial fluctuations that enable the quantification of scale-dependent probability density functions, intermittency and inter-phase conditional statistics. The main result is that particles are found in regions of lower turbulence spectral energy than the corresponding mean. This suggests that modeling the subgrid-scale turbulence intermittency is required for capturing the small-scale statistics of preferential concentration in large-eddy simulations. Additionally, a method is defined that decomposes a particle number-density field into the sum of a coherent and an incoherent component. The coherent component representing the clusters can be sparsely described by at most 1.6% of the total number of wavelet coefficients. An application of the method, motivated by radiative-heat-transfer simulations, is illustrated in the form of a grid-adaptation algorithm that results in non-uniform meshes refined around particle clusters. It leads to a reduction of the number of control volumes by one to two orders of magnitude. PSAAP-II Center at Stanford (Grant DE-NA0002373).
Acar, Nihat; Karakasli, Ahmet; Karaarslan, Ahmet; Mas, Nermin Ng; Hapa, Onur
2017-01-01
Volumetric measurements of benign tumors enable surgeons to trace volume changes during follow-up periods. For a volumetric measurement technique to be applicable, it should be easy, rapid, and inexpensive and should have high interobserver reliability. We aimed to assess the interobserver reliability of a volumetric measurement technique using Cavalieri's principle of stereological methods. The computerized tomography (CT) scans of 15 patients with a histopathologically confirmed diagnosis of enchondroma with varying tumor sizes and localizations were retrospectively reviewed for interobserver reliability evaluation of the volumetric stereological measurement with Cavalieri's principle, V = t × [(SU × d)/SL]^2 × ΣP. The volumes of the 15 tumors collected by the observers are demonstrated in Table 1. There was no statistically significant difference between the first and second observers (p = 0.000 and intraclass correlation coefficient = 0.970) or between the first and third observers (p = 0.000 and intraclass correlation coefficient = 0.981). No statistically significant difference was detected between the second and third observers (p = 0.000 and intraclass correlation coefficient = 0.976). Cavalieri's principle with the stereological technique using CT scans is an easy, rapid, and inexpensive technique for the volumetric evaluation of enchondromas, with high interobserver reliability.
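The volume formula quoted in the abstract above can be evaluated directly. The sketch below is a minimal illustration; the abstract does not define its symbols, so the reading used here is an assumption based on common point-counting practice: t is the slice thickness, SU the scale unit printed on the image, d the spacing of the counting grid, SL the measured length of that scale on the image, and ΣP the total number of grid points hitting the tumor over all slices. All numbers are hypothetical.

```python
def cavalieri_volume(t, su, d, sl, point_counts):
    """Point-counting volume estimate: V = t * ((su * d) / sl) ** 2 * sum(P).

    t            -- slice thickness (assumed meaning)
    su           -- scale unit printed on the image (assumed meaning)
    d            -- distance between neighbouring grid points (assumed meaning)
    sl           -- measured length of the printed scale (assumed meaning)
    point_counts -- grid points hitting the tumor, one count per slice
    """
    area_per_point = ((su * d) / sl) ** 2  # real-world area one grid point stands for
    return t * area_per_point * sum(point_counts)


# Hypothetical example: 5 mm slices, 2 mm grid spacing, and a 10 mm scale
# bar that measures 10 mm on the image (no magnification).
volume_mm3 = cavalieri_volume(t=5.0, su=10.0, d=2.0, sl=10.0,
                              point_counts=[3, 7, 9, 6, 2])
print(volume_mm3)
```

Each observer would supply a separate list of per-slice point counts; the interobserver comparison reported in the abstract is then made on the resulting volumes.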
NASA Astrophysics Data System (ADS)
Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud
2012-01-01
The early diagnosis of phytopathogens is of great importance; it could prevent large economic losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides and thus prevent considerable environmental pollution. In this study, 18 isolates of three different fungi genera were investigated: six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungi samples on the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), and k-means, were applied to the spectra after manipulation. Our results showed significant spectral differences between the various fungi genera examined. The use of k-means enabled classification between the genera with a 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA achieved a 99.7% success rate. However, on the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm^-1), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.
Temperature and magnetic-field driven dynamics in artificial magnetic square ice
Drouhin, Henri-Jean; Wegrowe, Jean-Eric; Razeghi, Manijeh; ...
2015-09-08
Artificial spin ices are often spoken of as being realisations of some of the celebrated vertex models of statistical mechanics, where the exact microstate of the system can be imaged using advanced magnetic microscopy methods. The fact that a stable image can be formed means that the system is in fact athermal and not undergoing the usual finite-temperature fluctuations of a statistical mechanical system. In this paper we report on the preparation of artificial spin ices with islands that are thermally fluctuating due to their very small size. The relaxation rate of these islands was determined using variable frequency focused magneto-optic Kerr measurements. We performed magnetic imaging of artificial spin ice under varied temperature and magnetic field using X-ray transmission microscopy which uses X-ray magnetic circular dichroism to generate magnetic contrast. Furthermore, we have developed an on-membrane heater in order to apply temperatures in excess of 700 K and have shown increased dynamics due to higher temperature. Due to the ‘photon-in, photon-out’ method employed here, it is the first report where it is possible to image the microstates of an ASI system under the simultaneous application of temperature and magnetic field, enabling the determination of relaxation rates, coercivities, and the analysis of vertex population during reversal.
kruX: matrix-based non-parametric eQTL discovery.
Qi, Jianlong; Asl, Hassan Foroughi; Björkegren, Johan; Michoel, Tom
2014-01-14
The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com.
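The idea in the abstract above, computing the Kruskal-Wallis statistic for many marker-trait pairs through matrix products, can be sketched as follows. This is an illustrative reconstruction, not the kruX code: it uses plain Python lists so that it runs without dependencies (a real implementation would use an optimized matrix library), it omits the tie correction, and the function names and toy data are hypothetical. For each genotype group g, the product of a marker-by-sample indicator matrix with the transposed trait-by-sample rank matrix yields the rank sums R_g for every pair at once, and H = 12/(N(N+1)) × Σ_g R_g²/n_g − 3(N+1).

```python
def average_ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def matmul_t(a, b):
    """Return a @ b^T for two lists of equal-length rows."""
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]


def kruskal_wallis_matrix(genotypes, traits, n_groups=3):
    """H statistic for every marker-trait pair (no tie correction).

    genotypes -- markers x samples, entries in 0..n_groups-1
    traits    -- traits x samples, expression values
    Returns a markers x traits table of H values.
    """
    n = len(traits[0])
    ranks = [average_ranks(t) for t in traits]
    h = [[0.0] * len(traits) for _ in genotypes]
    for g in range(n_groups):
        indicator = [[1.0 if x == g else 0.0 for x in row] for row in genotypes]
        rank_sums = matmul_t(indicator, ranks)      # markers x traits
        sizes = [sum(row) for row in indicator]     # group size per marker
        for m, size in enumerate(sizes):
            if size > 0:                            # skip empty genotype groups
                for t in range(len(traits)):
                    h[m][t] += rank_sums[m][t] ** 2 / size
    scale = 12.0 / (n * (n + 1))
    return [[scale * v - 3 * (n + 1) for v in row] for row in h]


# Toy data: 2 markers, 1 trait, 6 samples (hypothetical).
genotypes = [[0, 0, 1, 1, 2, 2],
             [0, 1, 0, 1, 0, 1]]
traits = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]]
print(kruskal_wallis_matrix(genotypes, traits))
```

The per-pair loop is replaced by one matrix product per genotype group, which is what makes the approach amenable to fast linear-algebra routines.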
NASA Astrophysics Data System (ADS)
Baram, S.; Ronen, Z.; Kurtzman, D.; Peeters, A.; Dahan, O.
2013-12-01
Land cultivation and dairy waste lagoons are considered to be nonpoint and point sources of groundwater contamination by chloride (Cl-) and nitrate (NO3-). The objective of this work is to introduce a methodology to assess the past and future impacts of such agricultural activities on regional groundwater quality. The method is based on mass balances and on spatial statistical analysis of Cl- and NO3- concentration distributions in the saturated and unsaturated zones. The method enables quantitative analysis of the relation between the locations of pollution point sources and the spatial variability in Cl- and NO3- concentrations in groundwater. The method was applied to the Beer-Tuvia region, Israel, where intensive dairy farming along with land cultivation has been practiced for over 50 years above the local phreatic aquifer. Mass balance calculations accounted for the various groundwater recharge and abstraction sources and sinks in the entire region. The mass balances showed that leachates from lagoons and the cultivated land have contributed 6.0% and 89.4%, respectively, of the total mass of Cl- added to the aquifer and 12.6% and 77.4%, respectively, of the total mass of NO3-. The chemical composition of the aquifer and vadose zone water suggested that irrigated agricultural activity in the region is the main contributor of Cl- and NO3- to the groundwater. A low spatial correlation between the Cl- and NO3- concentrations in the groundwater and the on-land location of the dairy farms strengthened this assumption, despite the dairy waste lagoon being a point source for groundwater contamination by Cl- and NO3-. Results demonstrate that analyzing vadose zone and groundwater data by spatial statistical analysis methods can significantly contribute to the understanding of the relations between groundwater contaminating sources, and to assessing appropriate remediation steps.
Identification of Water Bodies in a Landsat 8 OLI Image Using a J48 Decision Tree.
Acharya, Tri Dev; Lee, Dong Ha; Yang, In Tae; Lee, Jae Kang
2016-07-12
Water bodies are essential to humans and other forms of life. Identification of water bodies can be useful in various ways, including estimation of water availability, demarcation of flooded regions, change detection, and so on. In past decades, Landsat satellite sensors have been used for land use classification and water body identification. Due to the introduction of the new Operational Land Imager (OLI) sensor on Landsat 8 with a high spectral resolution and improved signal-to-noise ratio, the quality of imagery sensed by Landsat 8 has improved, enabling better characterization of land cover and increased data size. Therefore, it is necessary to explore the most appropriate and practical water identification methods that take advantage of the improved image quality and use the fewest inputs based on the original OLI bands. The objective of the study is to explore the potential of a J48 decision tree (JDT) in identifying water bodies using reflectance bands from Landsat 8 OLI imagery. J48 is an open-source decision tree. The test site for the study is in the Northern Han River Basin, which is located in Gangwon province, Korea. Training data with individual bands were used to develop the JDT model, which was later applied to the whole study area. The performance of the model was statistically analysed using the kappa statistic and area under the curve (AUC). The results were compared with five other known water identification methods using a confusion matrix and related statistics. Almost all the methods showed high accuracy, and the JDT was successfully applied to the OLI image using only four bands, where the new additional deep blue band of OLI was found to have the third highest information gain. Thus, the JDT can be a good method for water body identification based on images with improved resolution and increased size.
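The abstract above ranks OLI bands by information gain, the quantity a decision tree uses to judge how well a threshold on one band separates water from non-water pixels. The sketch below shows that calculation for a single band and a single threshold. The reflectance values, labels, and threshold are hypothetical and chosen only to make the arithmetic easy to follow; this is not the study's training data or its tree-building code.

```python
import math


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the pixels at `values <= threshold`."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


# Hypothetical near-infrared reflectances; water pixels (label 1) are dark.
nir = [0.02, 0.03, 0.04, 0.05, 0.20, 0.25, 0.30, 0.35]
is_water = [1, 1, 1, 1, 0, 0, 0, 0]

good_split = information_gain(nir, is_water, threshold=0.10)
poor_split = information_gain(nir, is_water, threshold=0.03)
print(good_split, poor_split)
```

A band whose best threshold yields a higher gain is more useful near the root of the tree, which is the sense in which the abstract reports a band as having the "third highest information gain".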
Fantin, Valentina; Scalbi, Simona; Ottaviano, Giuseppe; Masoni, Paolo
2014-04-01
The purpose of this study is to propose a method for harmonising Life Cycle Assessment (LCA) literature studies on the same product or on different products fulfilling the same function for a reliable and meaningful comparison of their life-cycle environmental impacts. The method is divided into six main steps which aim to rationalize and speed up the efforts needed to carry out the comparison. The steps include: 1) a clear definition of the goal and scope of the review; 2) critical review of the references; 3) identification of significant parameters that have to be harmonised; 4) harmonisation of the parameters; 5) statistical analysis to support the comparison; 6) results and discussion. This approach was then applied to the comparative analysis of the published LCA studies on tap and bottled water production, focussing on Global Warming Potential (GWP) results, with the aim of identifying the environmentally preferable alternative. A statistical analysis with Wilcoxon's test confirmed that the difference between harmonised GWP values of tap and bottled water was significant. The results obtained from the comparison of the harmonised mean GWP results showed that tap water always has the best environmental performance, even in case of high energy-consuming technologies for drinking water treatments. The strength of the method is that it enables both a deep analysis of the LCA literature and more consistent comparisons across the published LCAs. For these reasons, it can be a valuable tool which provides useful information for both practitioners and decision makers. Finally, its application to the case study made it possible both to describe the variability of the systems and to evaluate the importance of several key parameters for tap and bottled water production. The comparative review of LCA studies, with the inclusion of a statistical decision test, can validate and strengthen the final statements of the comparison. Copyright © 2014 Elsevier B.V.
Coscollà, Clara; Navarro-Olivares, Santiago; Martí, Pedro; Yusà, Vicent
2014-02-01
When attempting to discover the important factors and then optimise a response by tuning these factors, experimental design (design of experiments, DoE) provides a powerful suite of statistical methodology. In method development, DoE identifies significant factors and then optimises a response with respect to them. In this work, a headspace solid-phase micro-extraction (HS-SPME) combined with gas chromatography tandem mass spectrometry (GC-MS/MS) methodology for the simultaneous determination of six important organotin compounds, namely monobutyltin (MBT), dibutyltin (DBT), tributyltin (TBT), monophenyltin (MPhT), diphenyltin (DPhT) and triphenyltin (TPhT), has been optimized using a statistical design of experiments (DoE). The analytical method is based on ethylation with NaBEt4 and simultaneous headspace solid-phase micro-extraction of the derivative compounds followed by GC-MS/MS analysis. The main experimental parameters influencing the extraction efficiency selected for optimization were pre-incubation time, incubation temperature, agitator speed, extraction time, desorption temperature, buffer (pH, concentration and volume), headspace volume, sample salinity, preparation of standards, ultrasonic time and desorption time in the injector. The main factors (excitation voltage, excitation time, ion source temperature, isolation time and electron energy) affecting the GC-IT-MS/MS response were also optimized using the same statistical design of experiments. The proposed method presented good linearity (coefficient of determination R^2 > 0.99) and repeatability (1-25%) for all the compounds under study. The accuracy of the method, measured as the average percentage recovery of the compounds in spiked surface and marine waters, was higher than 70% for all compounds studied. Finally, the optimized methodology was applied to real aqueous samples, enabling the simultaneous determination of all compounds under study in surface and marine water samples obtained from the Valencia region (Spain). © 2013 Elsevier B.V.
Calculation of streamflow statistics for Ontario and the Great Lakes states
Piggott, Andrew R.; Neff, Brian P.
2005-01-01
Basic, flow-duration, and n-day frequency statistics were calculated for 779 current and historical streamflow gages in Ontario and 3,157 streamflow gages in the Great Lakes states with length-of-record daily mean streamflow data ending on December 31, 2000 and September 30, 2001, respectively. The statistics were determined using the U.S. Geological Survey’s SWSTAT and IOWDM, ANNIE, and LIBANNE software and Linux shell and PERL programming that enabled the mass processing of the data and calculation of the statistics. Verification exercises were performed to assess the accuracy of the processing and calculations. The statistics and descriptions, longitudes and latitudes, and drainage areas for each of the streamflow gages are summarized in ASCII text files and ESRI shapefiles.
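As a rough illustration of two of the statistic families named in the abstract above, the sketch below computes a flow-duration value (the flow equalled or exceeded a given percentage of the time) and the lowest n-day mean flow from a daily series. It is a simplified stand-in and does not reproduce SWSTAT: the simple rank method for exceedance, the function names, and the toy flows are assumptions, and a full n-day frequency analysis would additionally fit a distribution to annual n-day extremes.

```python
import math


def exceedance_flow(daily_flows, percent):
    """Flow equalled or exceeded `percent` % of the time (simple rank method)."""
    ordered = sorted(daily_flows, reverse=True)
    # Position of the requested exceedance percentage in the ranked series.
    index = max(0, math.ceil(percent * len(ordered) / 100) - 1)
    return ordered[index]


def min_n_day_mean(daily_flows, n):
    """Lowest mean flow over any n consecutive days (sliding-window sum)."""
    window = sum(daily_flows[:n])
    best = window
    for i in range(n, len(daily_flows)):
        window += daily_flows[i] - daily_flows[i - n]
        best = min(best, window)
    return best / n


# Hypothetical daily mean streamflow record (arbitrary units).
flows = [10, 8, 6, 4, 2, 2, 3, 5, 9, 12]

q50 = exceedance_flow(flows, 50)      # median-type flow-duration value
q90 = exceedance_flow(flows, 90)      # low-flow index
low3 = min_n_day_mean(flows, 3)       # lowest 3-day mean flow
print(q50, q90, low3)
```

Applied gage by gage in a loop, functions of this kind are what a mass-processing script would call for each record.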
Variability-aware compact modeling and statistical circuit validation on SRAM test array
NASA Astrophysics Data System (ADS)
Qiao, Ying; Spanos, Costas J.
2016-03-01
Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose a variability-aware compact model characterization methodology based on stepwise parameter selection. Transistor I-V measurements are obtained from a bit-transistor-accessible SRAM test array fabricated using a collaborating foundry's 28nm FDSOI technology. Our in-house customized Monte Carlo simulation bench can incorporate these statistical compact models, and simulation results on SRAM writability performance are very close to measurements in distribution estimation. Our proposed statistical compact model parameter extraction methodology also has the potential to predict non-Gaussian behavior in statistical circuit performances through mixtures of Gaussian distributions.
78 FR 29387 - Government-Owned Inventions, Available for Licensing
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-20
....: MSC-24919-1: Systems and Methods for RFID-Enables Information Collection; NASA Case No.: MSC-25632-1... Methods for RFID-Enabled Dispenser; NASA Case No.: MSC-25313-1: Hydrostatic Hyperbaric Apparatus and...; NASA Case No: MSC-25590-1: Systems and Methods for RFID-Enabled Pressure Sensing Apparatus; NASA Case...
Trattner, Sigal; Cheng, Bin; Pieniazek, Radoslaw L.; Hoffmann, Udo; Douglas, Pamela S.; Einstein, Andrew J.
2014-01-01
Purpose: Effective dose (ED) is a widely used metric for comparing ionizing radiation burden between different imaging modalities, scanners, and scan protocols. In computed tomography (CT), ED can be estimated by performing scans on an anthropomorphic phantom in which metal-oxide-semiconductor field-effect transistor (MOSFET) solid-state dosimeters have been placed to enable organ dose measurements. Here a statistical framework is established to determine the sample size (number of scans) needed for estimating ED to a desired precision and confidence, for a particular scanner and scan protocol, subject to practical limitations. Methods: The statistical scheme involves solving equations which minimize the sample size required for estimating ED to desired precision and confidence. It is subject to a constrained variation of the estimated ED and solved using the Lagrange multiplier method. The scheme incorporates measurement variation introduced both by MOSFET calibration, and by variation in MOSFET readings between repeated CT scans. Sample size requirements are illustrated on cardiac, chest, and abdomen–pelvis CT scans performed on a 320-row scanner and chest CT performed on a 16-row scanner. Results: Sample sizes for estimating ED vary considerably between scanners and protocols. Sample size increases as the required precision or confidence is higher and also as the anticipated ED is lower. For example, for a helical chest protocol, for 95% confidence and 5% precision for the ED, 30 measurements are required on the 320-row scanner and 11 on the 16-row scanner when the anticipated ED is 4 mSv; these sample sizes are 5 and 2, respectively, when the anticipated ED is 10 mSv. Conclusions: Applying the suggested scheme, it was found that even at modest sample sizes, it is feasible to estimate ED with high precision and a high degree of confidence. 
As CT technology develops enabling ED to be lowered, more MOSFET measurements are needed to estimate ED with the same precision and confidence. PMID:24694150
A comparison of ensemble post-processing approaches that preserve correlation structures
NASA Astrophysics Data System (ADS)
Schefzik, Roman; Van Schaeybroeck, Bert; Vannitsem, Stéphane
2016-04-01
Despite the fact that ensemble forecasts address the major sources of uncertainty, they exhibit biases and dispersion errors and therefore are known to benefit from calibration or statistical post-processing. For instance, the ensemble model output statistics (EMOS) method, also known as the non-homogeneous regression approach (Gneiting et al., 2005), is known to strongly improve forecast skill. EMOS is based on fitting and adjusting a parametric probability density function (PDF). However, EMOS and other common post-processing approaches apply to a single weather quantity at a single location for a single look-ahead time. They are therefore unable to take into account spatial, inter-variable and temporal dependence structures. Recently, many research efforts have been invested not only in designing post-processing methods that resolve this drawback but also in verification methods that enable the detection of dependence structures. New verification methods are applied to two classes of post-processing methods, both generating physically coherent ensembles. The first class uses ensemble copula coupling (ECC), which starts from EMOS but adjusts the rank structure (Schefzik et al., 2013). The second class is a member-by-member post-processing (MBM) approach that maps each raw ensemble member to a corrected one (Van Schaeybroeck and Vannitsem, 2015). We compare variants of the EMOS-ECC and MBM classes and highlight a specific theoretical connection between them. All post-processing variants are applied in the context of the ensemble system of the European Centre for Medium-Range Weather Forecasts (ECMWF) and compared using multivariate verification tools including the energy score, the variogram score (Scheuerer and Hamill, 2015) and the band depth rank histogram (Thorarinsdottir et al., 2015).
Gneiting, Raftery, Westveld, and Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Wea. Rev., 133, 1098-1118.
Scheuerer and Hamill, 2015: Variogram-based proper scoring rules for probabilistic forecasts of multivariate quantities. Mon. Wea. Rev., 143, 1321-1334.
Schefzik, Thorarinsdottir, and Gneiting, 2013: Uncertainty quantification in complex simulation models using ensemble copula coupling. Statistical Science, 28, 616-640.
Thorarinsdottir, Scheuerer, and Heinz, 2015: Assessing the calibration of high-dimensional ensemble forecasts using rank histograms. arXiv:1310.0236.
Van Schaeybroeck and Vannitsem, 2015: Ensemble post-processing using member-by-member approaches: theoretical aspects. Q.J.R. Meteorol. Soc., 141, 807-818.
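The ECC step described in the abstract above, starting from univariately calibrated values and restoring the rank structure of the raw ensemble, can be sketched in a few lines. The function name and the toy numbers are hypothetical; the calibrated values would in practice be quantiles or draws from the EMOS predictive distribution, which is not modelled here. Ties in the raw ensemble are broken by member index in this sketch.

```python
def ecc_reorder(raw_ensemble, calibrated_samples):
    """Arrange calibrated samples in the rank order of the raw ensemble.

    raw_ensemble       -- raw forecasts of the M members for one variable,
                          location and lead time
    calibrated_samples -- M values from the post-processed distribution
    Returns a list where member i receives the calibrated value whose rank
    matches the rank of member i in the raw ensemble.
    """
    order = sorted(range(len(raw_ensemble)), key=lambda i: raw_ensemble[i])
    sorted_samples = sorted(calibrated_samples)
    result = [0.0] * len(raw_ensemble)
    for position, member in enumerate(order):
        result[member] = sorted_samples[position]
    return result


# Toy example: three members forecasting temperature (hypothetical values).
raw = [12.0, 9.5, 11.0]                # member 0 is warmest, member 1 coldest
calibrated = [11.4, 12.9, 10.2]        # calibrated values in arbitrary order
print(ecc_reorder(raw, calibrated))
```

Applying the same reordering independently for every variable, location and lead time lets each member keep its raw rank everywhere, which is how the dependence structure of the raw ensemble carries over to the calibrated one.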
Just the right age: well-clustered exposure ages from a global glacial 10Be compilation
NASA Astrophysics Data System (ADS)
Heyman, Jakob; Margold, Martin
2017-04-01
Cosmogenic exposure dating has been used extensively for defining glacial chronologies, both in ice sheet and alpine settings, and the global set of published ages today reaches well beyond 10,000 samples. Over the last few years, a number of important developments have improved the measurements (with well-defined AMS standards) and exposure age calculations (with updated data and methods for calculating production rates), in the best case enabling high precision dating of past glacial events. A remaining problem, however, is the fact that a large portion of all dated samples have been affected by prior and/or incomplete exposure, yielding erroneous exposure ages under the standard assumptions. One way to address this issue is to only use exposure ages that can be confidently considered as unaffected by prior/incomplete exposure, such as groups of samples with statistically identical ages. Here we use objective statistical criteria to identify groups of well-clustered exposure ages from the global glacial "expage" 10Be compilation. Out of ~1700 groups with at least 3 individual samples ~30% are well-clustered, increasing to ~45% if allowing outlier rejection of a maximum of 1/3 of the samples (still requiring a minimum of 3 well-clustered ages). The dataset of well-clustered ages is heavily dominated by ages <30 ka, showing that well-defined cosmogenic chronologies primarily exist for the last glaciation. We observe a large-scale global synchronicity in the timing of the last deglaciation from ~20 to 10 ka. There is also a general correlation between the timing of deglaciation and latitude (or size of the individual ice mass), with earlier deglaciation in lower latitudes and later deglaciation towards the poles. Grouping the data into regions and comparing with available paleoclimate data we can start to untangle regional differences in the last deglaciation and the climate events controlling the ice mass loss. 
The extensive dataset and the statistical analysis enable an unprecedented global view on the last deglaciation.
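The abstract does not say which statistical criteria define a well-clustered group. As a rough illustration only, the sketch below assumes a reduced chi-square test about the error-weighted mean age, with the sample that deviates most rejected one at a time. The function names and the threshold of 2.0 are hypothetical choices, not the authors' values; only the limits of at most 1/3 of the samples rejected and at least 3 ages retained come from the abstract.

```python
def weighted_mean(samples):
    """Inverse-variance weighted mean of (age, sigma) pairs."""
    weights = [1.0 / s ** 2 for _, s in samples]
    return sum(w * a for w, (a, _) in zip(weights, samples)) / sum(weights)


def reduced_chi_square(samples):
    """Chi-square about the weighted mean, divided by n - 1 degrees of freedom."""
    mean = weighted_mean(samples)
    chi2 = sum(((a - mean) / s) ** 2 for a, s in samples)
    return chi2 / (len(samples) - 1)


def well_clustered_subset(ages, sigmas, threshold=2.0):
    """Return the retained (age, sigma) pairs, or None if the group fails.

    threshold is an assumed cut-off on the reduced chi-square (hypothetical).
    At most one third of the samples may be rejected, and at least three
    ages must remain, as stated in the abstract.
    """
    samples = list(zip(ages, sigmas))
    max_reject = len(samples) // 3
    rejected = 0
    while len(samples) >= 3:
        if reduced_chi_square(samples) <= threshold:
            return samples
        if rejected == max_reject or len(samples) == 3:
            return None
        # Drop the age with the largest deviation in units of its own sigma.
        mean = weighted_mean(samples)
        worst = max(samples, key=lambda p: abs(p[0] - mean) / p[1])
        samples.remove(worst)
        rejected += 1
    return None
```

For example, four ages of 10.0, 10.2, 9.9 and 25.0 ka with 0.5 ka uncertainties would pass after the 25.0 ka age is rejected, whereas a three-sample group with scattered ages fails because no rejection is allowed.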
Detecting Disease Outbreaks in Mass Gatherings Using Internet Data
Yom-Tov, Elad; Cox, Ingemar J; McKendry, Rachel A
2014-01-01
Background Mass gatherings, such as music festivals and religious events, pose a health care challenge because of the risk of transmission of communicable diseases. This is exacerbated by the fact that participants disperse soon after the gathering, potentially spreading disease within their communities. The dispersion of participants also poses a challenge for traditional surveillance methods. The ubiquitous use of the Internet may enable the detection of disease outbreaks through analysis of data generated by users during events and shortly thereafter. Objective The intent of the study was to develop algorithms that can alert to possible outbreaks of communicable diseases from Internet data, specifically Twitter and search engine queries. Methods We extracted all Twitter postings and queries made to the Bing search engine by users who repeatedly mentioned one of nine major music festivals held in the United Kingdom and one religious event (the Hajj in Mecca) during 2012, for a period of 30 days before and after each festival. We analyzed these data using three methods, two of which compared words associated with disease symptoms before and after the time of the festival, and one that compared the frequency of these words with those of other users in the United Kingdom in the days following the festivals. Results The data comprised, on average, 7.5 million tweets made by 12,163 users, and 32,143 queries made by 1756 users from each festival. Our methods indicated the statistically significant appearance of a disease symptom in two of the nine festivals. For example, cough was detected at higher than expected levels following the Wakestock festival. Statistically significant agreement (chi-square test, P<.01) between methods and across data sources was found where a statistically significant symptom was detected. Anecdotal evidence suggests that symptoms detected are indeed indicative of a disease that some users attributed to being at the festival. 
Conclusions Our work shows the feasibility of creating a public health surveillance system for mass gatherings based on Internet data. The use of multiple data sources and analysis methods was found to be advantageous for rejecting false positives. Further studies are required in order to validate our findings with data from public health authorities. PMID:24943128
Kashihara, Koji
2014-01-01
Unlike assistive technology for verbal communication, the brain-machine or brain-computer interface (BMI/BCI) has not been established as a non-verbal communication tool for amyotrophic lateral sclerosis (ALS) patients. Face-to-face communication enables access to rich emotional information, but individuals suffering from neurological disorders, such as ALS and autism, may not express their emotions or communicate their negative feelings. Although emotions may be inferred by looking at facial expressions, emotional prediction for neutral faces necessitates advanced judgment. The process that underlies brain neuronal responses to neutral faces and causes emotional changes remains unknown. To address this problem, therefore, this study attempted to decode conditioned emotional reactions to neutral face stimuli. This direction was motivated by the assumption that if electroencephalogram (EEG) signals can be used to detect patients' emotional responses to specific inexpressive faces, the results could be incorporated into the design and development of BMI/BCI-based non-verbal communication tools. To these ends, this study investigated how a neutral face associated with a negative emotion modulates rapid central responses in face processing and then identified cortical activities. The conditioned neutral face-triggered event-related potentials that originated from the posterior temporal lobe statistically significantly changed during late face processing (600–700 ms) after stimulus, rather than in early face processing activities, such as P1 and N170 responses. Source localization revealed that the conditioned neutral faces increased activity in the right fusiform gyrus (FG). This study also developed an efficient method for detecting implicit negative emotional responses to specific faces by using EEG signals. A classification method based on a support vector machine enables the easy classification of neutral faces that trigger specific individual emotions. 
In accordance with this classification, a face on a computer morphs into a sad or displeased countenance. The proposed method could be incorporated as a part of non-verbal communication tools to enable emotional expression. PMID:25206321
Multibaseline gravitational wave radiometry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Talukder, Dipongkar; Bose, Sukanta; Mitra, Sanjit
2011-03-15
We present a statistic for the detection of stochastic gravitational wave backgrounds (SGWBs) using radiometry with a network of multiple baselines. We also quantitatively compare the sensitivities of existing baselines and their network to SGWBs. We assess how the measurement accuracy of signal parameters, e.g., the sky position of a localized source, can improve when using a network of baselines, as compared to any of the single participating baselines. The search statistic itself is derived from the likelihood ratio of the cross correlation of the data across all possible baselines in a detector network and is optimal in Gaussian noise. Specifically, it is the likelihood ratio maximized over the strength of the SGWB and is called the maximized-likelihood ratio (MLR). One of the main advantages of using the MLR over past search strategies for inferring the presence or absence of a signal is that the former does not require the deconvolution of the cross correlation statistic. Therefore, it does not suffer from errors inherent to the deconvolution procedure and is especially useful for detecting weak sources. In the limit of a single baseline, it reduces to the detection statistic studied by Ballmer [Classical Quantum Gravity 23, S179 (2006)] and Mitra et al. [Phys. Rev. D 77, 042002 (2008)]. Unlike past studies, here the MLR statistic enables us to compare quantitatively the performances of a variety of baselines searching for a SGWB signal in (simulated) data. Although we use simulated noise and SGWB signals for making these comparisons, our method can be straightforwardly applied on real data.
Ince, Robin A A; Giordano, Bruno L; Kayser, Christoph; Rousselet, Guillaume A; Gross, Joachim; Schyns, Philippe G
2017-03-01
We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open-source Matlab and Python code implementing the new methods accompanies this article. Hum Brain Mapp 38:1541-1573, 2017. © 2016 Wiley Periodicals, Inc. 2016 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
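The estimator described in this abstract combines a copula (rank) transform with the closed-form entropy of Gaussian variables. For two one-dimensional continuous variables this reduces to a short calculation, sketched below with the Python standard library only. This is a simplified illustration under stated assumptions (no ties, no bias correction, univariate inputs), not the open-source code that accompanies the article; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def copula_normalize(x):
    """Replace each sample by the standard-normal quantile of its rank."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    z = [0.0] * n
    std_normal = NormalDist()
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the argument strictly inside (0, 1)
        z[i] = std_normal.inv_cdf(rank / (n + 1))
    return z


def gaussian_copula_mi(x, y):
    """Mutual information estimate in bits for two 1-D continuous samples.

    For bivariate Gaussian variables with correlation r,
    I = -0.5 * log2(1 - r^2); here r is the correlation of the
    rank-normalized data.
    """
    zx, zy = copula_normalize(x), copula_normalize(y)
    n = len(zx)
    mx, my = sum(zx) / n, sum(zy) / n
    sxx = sum((a - mx) ** 2 for a in zx)
    syy = sum((b - my) ** 2 for b in zy)
    sxy = sum((a - mx) * (b - my) for a, b in zip(zx, zy))
    r2 = sxy * sxy / (sxx * syy)
    # Guard (an added assumption): cap r^2 so perfectly dependent data
    # does not hit log2(0).
    r2 = min(r2, 1.0 - 1e-12)
    return -0.5 * math.log2(1.0 - r2)
```

Because only ranks enter the calculation, the estimate is unchanged by any strictly increasing transformation of either variable, which is what puts effect sizes from different recording modalities on a common scale.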
Invited review: A position on the Global Livestock Environmental Assessment Model (GLEAM).
MacLeod, M J; Vellinga, T; Opio, C; Falcucci, A; Tempio, G; Henderson, B; Makkar, H; Mottet, A; Robinson, T; Steinfeld, H; Gerber, P J
2018-02-01
The livestock sector is one of the fastest growing subsectors of the agricultural economy and, while it makes a major contribution to global food supply and economic development, it also consumes significant amounts of natural resources and alters the environment. In order to improve our understanding of the global environmental impact of livestock supply chains, the Food and Agriculture Organization of the United Nations has developed the Global Livestock Environmental Assessment Model (GLEAM). The purpose of this paper is to provide a review of GLEAM. Specifically, it explains the model architecture, methods and functionality, that is the types of analysis that the model can perform. The model focuses primarily on the quantification of greenhouse gases emissions arising from the production of the 11 main livestock commodities. The model inputs and outputs are managed and produced as raster data sets, with spatial resolution of 0.05 decimal degrees. The Global Livestock Environmental Assessment Model v1.0 consists of five distinct modules: (a) the Herd Module; (b) the Manure Module; (c) the Feed Module; (d) the System Module; (e) the Allocation Module. In terms of the modelling approach, GLEAM has several advantages. For example spatial information on livestock distributions and crops yields enables rations to be derived that reflect the local availability of feed resources in developing countries. The Global Livestock Environmental Assessment Model also contains a herd model that enables livestock statistics to be disaggregated and variation in livestock performance and management to be captured. Priorities for future development of GLEAM include: improving data quality and the methods used to perform emissions calculations; extending the scope of the model to include selected additional environmental impacts and to enable predictive modelling; and improving the utility of GLEAM output.
Mansouri, Majdi; Nounou, Mohamed N; Nounou, Hazem N
2017-09-01
In our previous work, we have demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale principal component analysis (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) with different window sizes. However, most real systems are nonlinear, which leaves the linear PCA method unable to handle nonlinearity adequately. Thus, in this paper, first, we apply a nonlinear PCA to obtain an accurate principal component of a set of data and handle a wide range of nonlinearities using the kernel principal component analysis (KPCA) model. The KPCA is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that applies exponential weights to residuals in the moving window (instead of equal weights), as it might further improve fault detection performance by reducing the FAR using an exponentially weighted moving average (EWMA). The developed detection method, which is called EWMA-GLRT, provides improved properties, such as smaller missed detection rates and FARs and a smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data information in a decreasing exponential fashion, giving more weight to the more recent data. This provides a more accurate estimation of the GLRT statistic and provides a stronger memory that will enable better decision making with respect to fault detection. Therefore, in this paper, a KPCA-based EWMA-GLRT method is developed and utilized in practice to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring of the process mean. 
The idea behind a KPCA-based EWMA-GLRT fault detection algorithm is to combine the advantages brought forward by the proposed EWMA-GLRT fault detection chart with the KPCA model. Thus, it is used to enhance fault detection of the Cad System in E. coli model through monitoring some of the key variables involved in this model such as enzymes, transport proteins, regulatory proteins, lysine, and cadaverine. The results demonstrate the effectiveness of the proposed KPCA-based EWMA-GLRT method over Q, GLRT, EWMA, Shewhart, and moving window-GLRT methods. The detection performance is assessed and evaluated in terms of FAR, missed detection rates, and average run length (ARL1) values.
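The exponential weighting of residuals mentioned above can be illustrated with a plain EWMA control chart. The sketch below is not the authors' KPCA-based EWMA-GLRT statistic: it only shows the EWMA recursion and the textbook asymptotic control limit for independent residuals with known standard deviation. The function names and the defaults (smoothing factor 0.2, limit width 3) are assumptions chosen for illustration.

```python
import math


def ewma(residuals, lam=0.2, start=0.0):
    """EWMA recursion: z_t = lam * r_t + (1 - lam) * z_{t-1}."""
    z, out = start, []
    for r in residuals:
        z = lam * r + (1.0 - lam) * z
        out.append(z)
    return out


def ewma_alarms(residuals, sigma, lam=0.2, width=3.0):
    """Flag samples whose EWMA value exceeds the asymptotic control limit.

    For independent residuals with standard deviation sigma, the EWMA
    variance tends to sigma^2 * lam / (2 - lam).
    """
    limit = width * sigma * math.sqrt(lam / (2.0 - lam))
    return [abs(z) > limit for z in ewma(residuals, lam)]
```

With these defaults and sigma = 1, a sustained mean shift of 2 in the residuals raises the first alarm on the fourth shifted sample. A smaller lam gives the chart a longer memory, at the cost of a slower response to large shifts.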
2011-01-01
Background Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven high throughput genomic hypothesis testing requires both the capability to obtain semantically relevant experimental data and the capability to perform relevant statistical testing on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions. Methods An application-specific ontology was developed for managing TMA and DNA microarray databases by semantic web technologies. Data were represented as Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing (Xperanto-RDF) for TMA data were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQLs to reflect the semantic structures of the hypotheses, and (3) performing statistical tests with the result sets returned by the SPARQLs. Results When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all the hypotheses were supported by Xperanto-RDF. Conclusions We demonstrated the utility of high throughput biological hypothesis testing. We believe that preliminary investigation before performing highly controlled experiments can benefit from this approach. PMID:21342584
easyGWAS: A Cloud-Based Platform for Comparing the Results of Genome-Wide Association Studies.
Grimm, Dominik G; Roqueiro, Damian; Salomé, Patrice A; Kleeberger, Stefan; Greshake, Bastian; Zhu, Wangsheng; Liu, Chang; Lippert, Christoph; Stegle, Oliver; Schölkopf, Bernhard; Weigel, Detlef; Borgwardt, Karsten M
2017-01-01
The ever-growing availability of high-quality genotypes for a multitude of species has enabled researchers to explore the underlying genetic architecture of complex phenotypes at an unprecedented level of detail using genome-wide association studies (GWAS). The systematic comparison of results obtained from GWAS of different traits opens up new possibilities, including the analysis of pleiotropic effects. Other advantages that result from the integration of multiple GWAS are the ability to replicate GWAS signals and to increase statistical power to detect such signals through meta-analyses. In order to facilitate the simple comparison of GWAS results, we present easyGWAS, a powerful, species-independent online resource for computing, storing, sharing, annotating, and comparing GWAS. The easyGWAS tool supports multiple species, the uploading of private genotype data and summary statistics of existing GWAS, as well as advanced methods for comparing GWAS results across different experiments and data sets in an interactive and user-friendly interface. easyGWAS is also a public data repository for GWAS data and summary statistics and already includes published data and results from several major GWAS. We demonstrate the potential of easyGWAS with a case study of the model organism Arabidopsis thaliana, using flowering and growth-related traits. © 2016 American Society of Plant Biologists. All rights reserved.
Tomasi, Ivan; Marconi, Ombretta; Sileoni, Valeria; Perretti, Giuseppe
2017-01-01
Beer wort β-glucans are high-molecular-weight non-starch polysaccharides that are of great interest to the brewing industry. Because glucans can increase the viscosity of solutions and form gels, hazes, and precipitates, they are often related to poor lautering performance and beer filtration problems. In this work, a simple and suitable method was developed to determine and characterize β-glucans in beer wort using size exclusion chromatography coupled with a triple-detector array, which is composed of a light scatterer, a viscometer, and a refractive-index detector. The method's performance is comparable to that of the commercial reference method, as shown by the statistical validation, and it enables one to obtain interesting parameters of β-glucan in beer wort, such as the molecular weight averages, fraction description, hydrodynamic radius, intrinsic viscosity, polydispersity and Mark-Houwink parameters. This characterization can be useful in brewing science to understand filtration problems, which are not always explained through conventional analysis. Copyright © 2016 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kreuzer-Martin, Helen W.; Wahl, Jon H.; Metoyer, Candace N.
The toxic protein ricin is of concern as a potential biological threat agent (BTA). Recently, several samples of ricin have been seized in connection with biocriminal activity. Analytical methods are needed that enable federal investigators to determine how the samples were prepared, to match seized samples to potential source materials, and to identify samples that may have been prepared by the same method using the same source materials. One commonly described crude ricin preparation method is acetone extraction of crushed castor beans. Here we describe the use of solid-phase microextraction and headspace analysis of crude ricin preparation samples to determine whether they were processed by acetone extraction. In all cases, acetone-extracted bean mash could be distinguished from un-extracted mash or mash extracted with other organic solvents. Statistical analysis showed that storage in closed containers for up to 109 days had no effect on acetone signal intensity. Signal intensity in acetone-extracted mash decreased during storage in open containers, but extracted mash could still be distinguished from un-extracted mash after 94 days.
NASA Astrophysics Data System (ADS)
Bonelli, Maria Grazia; Ferrini, Mauro; Manni, Andrea
2016-12-01
The assessment of metal and organic micropollutant contamination in agricultural soils is a difficult challenge because a very large number of samples must be collected and analyzed over an extensive area. For the measurement of dioxins and dioxin-like PCBs and the subsequent treatment of data, the European Community advises developing low-cost and fast methods that allow routine analysis of a great number of samples, providing rapid measurement of these compounds in the environment, feeds and food. The aim of the present work has been to find a method suitable to describe the relations occurring between organic and inorganic contaminants and to use the values of the latter to forecast the former. In practice, the use of a portable metal soil analyzer coupled with an efficient statistical procedure enables the required objective to be achieved. Compared to Multiple Linear Regression, the Artificial Neural Networks technique has proved to be an excellent forecasting method, even though there is no linear correlation between the variables to be analyzed.
Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong
2016-05-13
In this paper, a novel nonlinear framework of smoothing method, non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student's t-distribution is adopted in order to compute the probability distribution function (PDF) related to the process and measurement noises, which are assumed to be non-Gaussian distributed. A computation approach based on Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed-delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated based on the real-world data, which is collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS has significant improvement on the vehicle state accuracy and outperforms the existing filtering and smoothing methods.
NASA Astrophysics Data System (ADS)
Saito, Asaki; Yasutomi, Shin-ichi; Tamura, Jun-ichi; Ito, Shunji
2015-06-01
We introduce a true orbit generation method enabling exact simulations of dynamical systems defined by arbitrary-dimensional piecewise linear fractional maps, including piecewise linear maps, with rational coefficients. This method can generate sufficiently long true orbits which reproduce typical behaviors (inherent behaviors) of these systems, by properly selecting algebraic numbers in accordance with the dimension of the target system, and involving only integer arithmetic. By applying our method to three dynamical systems—that is, the baker's transformation, the map associated with a modified Jacobi-Perron algorithm, and an open flow system—we demonstrate that it can reproduce their typical behaviors that have been very difficult to reproduce with conventional simulation methods. In particular, for the first two maps, we show that we can generate true orbits displaying the same statistical properties as typical orbits, by estimating the marginal densities of their invariant measures. For the open flow system, we show that an obtained true orbit correctly converges to the stable period-1 orbit, which is inherently possessed by the system.
Zooming in on vibronic structure by lowest-value projection reconstructed 4D coherent spectroscopy
NASA Astrophysics Data System (ADS)
Harel, Elad
2018-05-01
A fundamental goal of chemical physics is an understanding of microscopic interactions in liquids at and away from equilibrium. In principle, this microscopic information is accessible by high-order and high-dimensionality nonlinear optical measurements. Unfortunately, the time required to execute such experiments increases exponentially with the dimensionality, while the signal decreases exponentially with the order of the nonlinearity. Recently, we demonstrated a non-uniform acquisition method based on radial sampling of the time-domain signal [W. O. Hutson et al., J. Phys. Chem. Lett. 9, 1034 (2018)]. The four-dimensional spectrum was then reconstructed by filtered back-projection using an inverse Radon transform. Here, we demonstrate an alternative reconstruction method based on the statistical analysis of different back-projected spectra which results in a dramatic increase in sensitivity and at least a 100-fold increase in dynamic range compared to conventional uniform sampling and Fourier reconstruction. These results demonstrate that alternative sampling and reconstruction methods enable applications of increasingly high-order and high-dimensionality methods toward deeper insights into the vibronic structure of liquids.
Han, Kyunghwa; Jung, Inkyung
2018-05-01
This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.
Atwal, Anita; McIntyre, Anne
2017-01-01
Introduction High quality guidance in home strategies is needed to enable older people to measure their home environment and become involved in the provision of assistive devices and to promote consistency among professionals. This study aims to investigate the reliability of such guidance and its ability to promote accuracy of results when measurements are taken by both older people and professionals. Method Twenty-five health professionals and 26 older people participated in a within-group design to test the accuracy of measurements taken (that is, person’s popliteal height, baths, toilets, beds, stairs and chairs). Data were analysed with descriptive analysis and the Wilcoxon test. The intra-rater reliability was assessed by correlating measurements taken at two different times with guidance use. Results The intra-rater reliability analysis revealed statistical significance (P < 0.05) for all measurements except for the bath internal width. The guidance enabled participants to take 90% of measurements that they were not able to complete otherwise, 80.55% of which lay within the acceptable suggested margin of variation. Accuracy was supported by the significant reduction in the standard deviation of the actual measurements and accuracy scores. Conclusion This evidence-based guidance can be used in its current format by older people and professionals to facilitate appropriate measurements. Yet, some users might need help from carers or specialists depending on their impairments. PMID:29386701
Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online
Forsberg, Erica M; Huan, Tao; Rinehart, Duane; Benton, H Paul; Warth, Benedikt; Hilmers, Brian; Siuzdak, Gary
2018-01-01
Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes ~30 min. PMID:29494574
FUn: a framework for interactive visualizations of large, high-dimensional datasets on the web.
Probst, Daniel; Reymond, Jean-Louis
2018-04-15
During the past decade, big data have become a major tool in scientific endeavors. Although statistical methods and algorithms are well-suited for analyzing and summarizing enormous amounts of data, the results do not allow for a visual inspection of the entire data. Current scientific software, including R packages and Python libraries such as ggplot2, matplotlib and plot.ly, does not support interactive visualizations of datasets exceeding 100 000 data points on the web. Other solutions enable the web-based visualization of big data only through data reduction or statistical representations. However, recent hardware developments, especially advancements in graphical processing units, allow for the rendering of millions of data points on a wide range of consumer hardware such as laptops, tablets and mobile phones. Like the challenges and opportunities that big data bring to virtually every scientific field, the visualization of and interaction with copious amounts of data are demanding yet hold great promise. Here we present FUn, a framework consisting of a client (Faerun) and server (Underdark) module, facilitating the creation of web-based, interactive 3D visualizations of large datasets, enabling record level visual inspection. We also introduce a reference implementation providing access to SureChEMBL, a database containing patent information on more than 17 million chemical compounds. The source code and the most recent builds of Faerun and Underdark, Lore.js and the data preprocessing toolchain used in the reference implementation, are available on the project website (http://doc.gdb.tools/fun/). Contact: daniel.probst@dcb.unibe.ch or jean-louis.reymond@dcb.unibe.ch.
Thermal effect of Zn quantum dots grown on Si(111): competition between relaxation and reconstraint
NASA Astrophysics Data System (ADS)
Kao, Li-Chi; Huang, Bo-Jia; Zheng, Yu-En; Tu, Kai-Teng; Chiu, Shang-Jui; Ku, Ching-Shun; Lo, Kuang Yao
2018-01-01
Zn dots are potential solutions for metal contacts in future nanodevices. The metastable states that exist at the interface between Zn quantum dots and oxide-free Si(111) surfaces can suppress the development of the complete relaxation and increase the size of Zn dots. In this work, the actual heat consumption of the structural evolution of Zn dots resulting from extrinsic thermal effect was analyzed. Zn dots were coherently grown on oxide-free Si(111) through magnetron RF sputtering. A compensative optical method combined with reflective second harmonic generation and synchrotron x-ray diffraction (XRD) was developed to statistically analyze the thermal effect on the Zn dot system. Pattern matching (3 m) between the Zn and oxide-free Si(111) surface enabled Si(111) to constrain Zn dots from a liquid to solid phase. Annealing under vacuum induced smaller, loose Zn dots to be reconstrained by Si(111). When the size of the Zn dots was in the margin of complete relaxation, the Zn dot was partially constrained by potential barriers (metastable states) between Zn(111) and one of the six in-planes of Si〈110〉. The thermal disturbance exerted by annealing would enable partially constrained ZnO/Zn dots to overcome the potential barrier and be completely relaxed, which is obvious on the transition between Zn(111) and Zn(002) peak in synchrotron XRD. Considering the actual irradiated surface area of dots array in a wide-size distribution, the competition between reconstrained and relaxed Zn dots on Si(111) during annealing was statistically analyzed.
Clarke, M G; Kennedy, K P; MacDonagh, R P
2009-01-01
To develop a clinical prediction model enabling the calculation of an individual patient's life expectancy (LE) and survival probability based on age, sex, and comorbidity for use in the joint decision-making process regarding medical treatment. A computer software program was developed with a team of 3 clinicians, 2 professional actuaries, and 2 professional computer programmers. This incorporated statistical spreadsheet and database access design methods. Data sources included life insurance industry actuarial rating factor tables (public and private domain), Government Actuary Department UK life tables, professional actuarial sources, and evidence-based medical literature. The main outcome measures were numerical and graphical display of comorbidity-adjusted LE; 5-, 10-, and 15-year survival probability; in addition to generic UK population LE. Nineteen medical conditions, which impacted significantly on LE in actuarial terms and were commonly encountered in clinical practice, were incorporated in the final model. Numerical and graphical representations of statistical predictions of LE and survival probability were successfully generated for patients with either no comorbidity or a combination of the 19 medical conditions included. Validation and testing, including actuarial peer review, confirmed consistency with the data sources utilized. The evidence-based actuarial data utilized in this computer program design represent a valuable resource for use in the clinical decision-making process, where an accurate objective assessment of patient LE can so often make the difference between patients being offered or denied medical and surgical treatment. Ongoing development to incorporate additional comorbidities and enable Web-based access will enhance its use further.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Siversson, Carl, E-mail: carl.siversson@med.lu.se; Nordström, Fredrik; Department of Radiation Physics, Skåne University Hospital, Lund 214 28
2015-10-15
Purpose: In order to enable a magnetic resonance imaging (MRI) only workflow in radiotherapy treatment planning, methods are required for generating Hounsfield unit (HU) maps (i.e., synthetic computed tomography, sCT) for dose calculations, directly from MRI. The Statistical Decomposition Algorithm (SDA) is a method for automatically generating sCT images from a single MR image volume, based on automatic tissue classification in combination with a model trained using a multimodal template material. This study compares dose calculations between sCT generated by the SDA and conventional CT in the male pelvic region. Methods: The study comprised ten prostate cancer patients, for whom a 3D T2 weighted MRI and a conventional planning CT were acquired. For each patient, sCT images were generated from the acquired MRI using the SDA. In order to decouple the effect of variations in patient geometry between imaging modalities from the effect of uncertainties in the SDA, the conventional CT was nonrigidly registered to the MRI to assure that their geometries were well aligned. For each patient, a volumetric modulated arc therapy plan was created for the registered CT (rCT) and recalculated for both the sCT and the conventional CT. The results were evaluated using several methods, including mean average error (MAE), a set of dose-volume histogram parameters, and a restrictive gamma criterion (2% local dose/1 mm). Results: The MAE within the body contour was 36.5 ± 4.1 (1 s.d.) HU between sCT and rCT. Average mean absorbed dose difference to target was 0.0% ± 0.2% (1 s.d.) between sCT and rCT, whereas it was −0.3% ± 0.3% (1 s.d.) between CT and rCT. The average gamma pass rate was 99.9% for sCT vs rCT, whereas it was 90.3% for CT vs rCT. Conclusions: The SDA enables a highly accurate MRI only workflow in prostate radiotherapy planning. The dosimetric uncertainties originating from the SDA appear negligible and are notably lower than the uncertainties introduced by variations in patient geometry between imaging sessions.
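The MAE figure reported above is a voxel-wise comparison of HU values restricted to the body contour. A minimal Python sketch of that kind of calculation follows. It assumes MAE means the mean absolute HU difference, that both volumes are already registered and flattened to equal-length lists, and that a boolean body mask is available. The function name and the toy values are hypothetical and are not taken from the study.

```python
def mean_absolute_error_in_mask(sct_hu, rct_hu, body_mask):
    """Mean absolute HU difference over voxels where body_mask is True.

    Assumes the two volumes are registered and flattened to the same length.
    """
    if not (len(sct_hu) == len(rct_hu) == len(body_mask)):
        raise ValueError("volumes and mask must have the same length")
    diffs = [abs(s - r) for s, r, inside in zip(sct_hu, rct_hu, body_mask) if inside]
    if not diffs:
        raise ValueError("body mask selects no voxels")
    return sum(diffs) / len(diffs)


# Toy flattened volumes (hypothetical HU values, not data from the study).
sct = [0.0, 40.0, -20.0, 1000.0]
rct = [10.0, 10.0, -50.0, -1000.0]
mask = [True, True, True, False]  # last voxel lies outside the body contour

mae = mean_absolute_error_in_mask(sct, rct, mask)
```

Restricting the average to the mask keeps air voxels outside the patient from dominating the result, which is presumably why the abstract reports the value "within the body contour".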
Dinov, Ivo D; Heavner, Ben; Tang, Ming; Glusman, Gustavo; Chard, Kyle; Darcy, Mike; Madduri, Ravi; Pa, Judy; Spino, Cathie; Kesselman, Carl; Foster, Ian; Deutsch, Eric W; Price, Nathan D; Van Horn, John D; Ames, Joseph; Clark, Kristi; Hood, Leroy; Hampstead, Benjamin M; Dauer, William; Toga, Arthur W
2016-01-01
A unique archive of Big Data on Parkinson's Disease is collected, managed and disseminated by the Parkinson's Progression Markers Initiative (PPMI). The integration of such complex and heterogeneous Big Data from multiple sources offers unparalleled opportunities to study the early stages of prevalent neurodegenerative processes, track their progression and quickly identify the efficacies of alternative treatments. Many previous human and animal studies have examined the relationship of Parkinson's disease (PD) risk to trauma, genetics, environment, co-morbidities, or life style. The defining characteristics of Big Data (large size, incongruency, incompleteness, complexity, multiplicity of scales, and heterogeneity of information-generating sources) all pose challenges to the classical techniques for data management, processing, visualization and interpretation. We propose, implement, test and validate complementary model-based and model-free approaches for PD classification and prediction. To explore PD risk using Big Data methodology, we jointly processed complex PPMI imaging, genetics, clinical and demographic data. Collective representation of the multi-source data facilitates the aggregation and harmonization of complex data elements. This enables joint modeling of the complete data, leading to the development of Big Data analytics, predictive synthesis, and statistical validation. Using heterogeneous PPMI data, we developed a comprehensive protocol for end-to-end data characterization, manipulation, processing, cleaning, analysis and validation. Specifically, we (i) introduce methods for rebalancing imbalanced cohorts, (ii) utilize a wide spectrum of classification methods to generate consistent and powerful phenotypic predictions, and (iii) generate reproducible machine-learning based classification that enables the reporting of model parameters and diagnostic forecasting based on new data. 
We evaluated several complementary model-based predictive approaches, which failed to generate accurate and reliable diagnostic predictions. However, the results of several machine-learning based classification methods indicated significant power to predict Parkinson's disease in the PPMI subjects (consistent accuracy, sensitivity, and specificity exceeding 96%, confirmed using statistical n-fold cross-validation). Clinical (e.g., Unified Parkinson's Disease Rating Scale (UPDRS) scores), demographic (e.g., age), genetics (e.g., rs34637584, chr12), and derived neuroimaging biomarker (e.g., cerebellum shape index) data all contributed to the predictive analytics and diagnostic forecasting. Model-free Big Data machine learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of predictive precision and reliability (e.g., forecasting patient diagnosis). We observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data. UPDRS scores play a critical role in predicting diagnosis, which is expected based on the clinical definition of Parkinson's disease. Even without longitudinal UPDRS data, however, the accuracy of model-free machine learning based classification is over 80%. The methods, software and protocols developed here are openly shared and can be employed to study other neurodegenerative disorders (e.g., Alzheimer's, Huntington's, amyotrophic lateral sclerosis), as well as for other predictive Big Data analytics applications.
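The abstract above mentions rebalancing imbalanced cohorts but does not say which technique was used. As an illustration only, the Python sketch below shows random oversampling, one common way to equalize class sizes. It is a stand-in and not the authors' method. The function name, labels, and record identifiers are hypothetical.

```python
import random
from collections import Counter


def oversample_minority(records, labels, seed=0):
    """Duplicate randomly chosen records of smaller classes until all classes
    match the size of the largest one (random oversampling)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_records, out_labels = list(records), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(records, labels) if l == label]
        for _ in range(target - count):
            out_records.append(rng.choice(pool))
            out_labels.append(label)
    return out_records, out_labels


# Hypothetical cohort: five patients and two controls.
subjects = ["p1", "p2", "p3", "p4", "p5", "c1", "c2"]
groups = ["PD"] * 5 + ["control"] * 2

balanced_subjects, balanced_groups = oversample_minority(subjects, groups)
balanced_counts = Counter(balanced_groups)
```

In practice, resampling of this kind would be applied to the training folds only, so that duplicated records do not leak into the data used for cross-validation.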
Statistical methods used in articles published by the Journal of Periodontal and Implant Science.
Choi, Eunsil; Lyu, Jiyoung; Park, Jinyoung; Kim, Hae-Young
2014-12-01
The purposes of this study were to assess the trend of use of statistical methods including parametric and nonparametric methods and to evaluate the use of complex statistical methodology in recent periodontal studies. This study analyzed 123 articles published in the Journal of Periodontal & Implant Science (JPIS) between 2010 and 2014. Frequencies and percentages were calculated according to the number of statistical methods used, the type of statistical method applied, and the type of statistical software used. Most of the published articles considered (64.4%) used statistical methods. Since 2011, the percentage of JPIS articles using statistics has increased. On the basis of multiple counting, we found that the percentage of studies in JPIS using parametric methods was 61.1%. Further, complex statistical methods were applied in only 6 of the published studies (5.0%), and nonparametric statistical methods were applied in 77 of the published studies (38.9% of a total of 198 studies considered). We found an increasing trend towards the application of statistical methods and nonparametric methods in recent periodontal studies and thus, concluded that increased use of complex statistical methodology might be preferred by the researchers in the fields of study covered by JPIS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sung, Yixing; Adams, Brian M.; Witkowski, Walter R.
2011-04-01
The CASL Level 2 Milestone VUQ.Y1.03, 'Enable statistical sensitivity and UQ demonstrations for VERA,' was successfully completed in March 2011. The VUQ focus area led this effort, in close partnership with AMA, and with support from VRI. DAKOTA was coupled to VIPRE-W thermal-hydraulics simulations representing reactors of interest to address crud-related challenge problems in order to understand the sensitivity and uncertainty in simulation outputs with respect to uncertain operating and model form parameters. This report summarizes work coupling the software tools, characterizing uncertainties, selecting sensitivity and uncertainty quantification algorithms, and analyzing the results of iterative studies. These demonstration studies focused on sensitivity and uncertainty of mass evaporation rate calculated by VIPRE-W, a key predictor for crud-induced power shift (CIPS).
Deformably registering and annotating whole CLARITY brains to an atlas via masked LDDMM
NASA Astrophysics Data System (ADS)
Kutten, Kwame S.; Vogelstein, Joshua T.; Charon, Nicolas; Ye, Li; Deisseroth, Karl; Miller, Michael I.
2016-04-01
The CLARITY method renders brains optically transparent to enable high-resolution imaging in the structurally intact brain. Anatomically annotating CLARITY brains is necessary for discovering which regions contain signals of interest. Manually annotating whole-brain, terabyte CLARITY images is difficult, time-consuming, subjective, and error-prone. Automatically registering CLARITY images to a pre-annotated brain atlas offers a solution, but is difficult for several reasons. Removal of the brain from the skull and subsequent storage and processing cause variable non-rigid deformations, thus compounding inter-subject anatomical variability. Additionally, the signal in CLARITY images arises from various biochemical contrast agents which only sparsely label brain structures. This sparse labeling challenges the most commonly used registration algorithms that need to match image histogram statistics to the more densely labeled histological brain atlases. The standard method is a multiscale Mutual Information B-spline algorithm that dynamically generates an average template as an intermediate registration target. We determined that this method performs poorly when registering CLARITY brains to the Allen Institute's Mouse Reference Atlas (ARA), because the image histogram statistics are poorly matched. Therefore, we developed a method (Mask-LDDMM) for registering CLARITY images, that automatically finds the brain boundary and learns the optimal deformation between the brain and atlas masks. Using Mask-LDDMM without an average template provided better results than the standard approach when registering CLARITY brains to the ARA. The LDDMM pipelines developed here provide a fast automated way to anatomically annotate CLARITY images; our code is available as open source software at http://NeuroData.io.
NASA Astrophysics Data System (ADS)
Di Mauro, M.; Manconi, S.; Zechlin, H.-S.; Ajello, M.; Charles, E.; Donato, F.
2018-04-01
The Fermi Large Area Telescope (LAT) Collaboration has recently released the Third Catalog of Hard Fermi-LAT Sources (3FHL), which contains 1556 sources detected above 10 GeV with seven years of Pass 8 data. Building upon the 3FHL results, we investigate the flux distribution of sources at high Galactic latitudes (|b| > 20°), which are mostly blazars. We use two complementary techniques: (1) a source-detection efficiency correction method and (2) an analysis of pixel photon count statistics with the one-point probability distribution function (1pPDF). With the first method, using realistic Monte Carlo simulations of the γ-ray sky, we calculate the efficiency of the LAT to detect point sources. This enables us to find the intrinsic source-count distribution at photon fluxes down to 7.5 × 10⁻¹² ph cm⁻² s⁻¹. With this method, we detect a flux break at (3.5 ± 0.4) × 10⁻¹¹ ph cm⁻² s⁻¹ with a significance of at least 5.4σ. The power-law indexes of the source-count distribution above and below the break are 2.09 ± 0.04 and 1.07 ± 0.27, respectively. This result is confirmed with the 1pPDF method, which has a sensitivity reach of ∼10⁻¹¹ ph cm⁻² s⁻¹. Integrating the derived source-count distribution above the sensitivity of our analysis, we find that (42 ± 8)% of the extragalactic γ-ray background originates from blazars.
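The source-count distribution described above has one power-law index above the flux break and another below it. The Python sketch below shows one common way to write such a broken power law so that it is continuous at the break. The parametrization, the function name, and the unit normalization are assumptions made for illustration. Only the break flux and the two indexes come from the abstract.

```python
def broken_power_law(flux, flux_break, index_above, index_below, norm=1.0):
    """Differential source counts dN/dS for a broken power law.

    Both branches equal `norm` at flux == flux_break, so the curve is continuous.
    """
    if flux <= 0 or flux_break <= 0:
        raise ValueError("fluxes must be positive")
    index = index_above if flux >= flux_break else index_below
    return norm * (flux / flux_break) ** (-index)


# Central values quoted in the abstract (fluxes in ph cm^-2 s^-1).
S_BREAK = 3.5e-11
GAMMA_ABOVE = 2.09
GAMMA_BELOW = 1.07

at_break = broken_power_law(S_BREAK, S_BREAK, GAMMA_ABOVE, GAMMA_BELOW)
bright = broken_power_law(10 * S_BREAK, S_BREAK, GAMMA_ABOVE, GAMMA_BELOW)
faint = broken_power_law(0.1 * S_BREAK, S_BREAK, GAMMA_ABOVE, GAMMA_BELOW)
```

Because the index below the break is much smaller than the one above it, the counts rise far more slowly toward faint fluxes than an extrapolation of the bright-end slope would suggest.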
Just-in-time Design and Additive Manufacture of Patient-specific Medical Implants
NASA Astrophysics Data System (ADS)
Shidid, Darpan; Leary, Martin; Choong, Peter; Brandt, Milan
Recent advances in medical imaging and manufacturing science have enabled the design and production of complex, patient-specific orthopaedic implants. Additive Manufacture (AM) generates three-dimensional structures layer by layer, and is not subject to the constraints associated with traditional manufacturing methods. AM provides significant opportunities for the design of novel geometries and complex lattice structures with enhanced functional performance. However, the design and manufacture of patient-specific AM implant structures requires unique expertise in handling various optimization platforms. Furthermore, the design process for complex structures is computationally intensive. The primary aim of this research is to enable the just-in-time customisation of AM prostheses, whereby AM implant design and manufacture can be completed within the time constraints of a single surgical procedure, while minimising prosthesis mass and optimising the lattice structure to match the stiffness of the surrounding bone tissue. In this research, a design approach using raw CT scan data is applied to the AM manufacture of a femoral prosthesis. Using the proposed just-in-time concept, the mass of the prosthesis was rapidly designed and manufactured while satisfying the associated structural requirements. Compressive testing of lattice structures manufactured using the proposed method shows that the load carrying capacity of the resected composite bone can be recovered by up to 85% and the compressive stiffness of the AM prosthesis is statistically indistinguishable from the stiffness of the initial bone.
Mini-implants for orthodontic anchorage.
Reynders, Reint Meursinge; Ladu, Luisa
2017-10-27
Data sources: PubMed, Embase, Cochrane Central Register of Controlled Trials and the Web of Science databases. Hand searches of the journals European Journal of Orthodontics, Journal of Orthodontics, Journal of Clinical Orthodontics, Seminars in Orthodontics, American Journal of Orthodontics & Dentofacial Orthopaedics and Angle Orthodontist. Study selection: Two reviewers independently selected studies. Randomised controlled trials (RCTs) and controlled clinical trials (CCTs) of orthodontic patients requiring extraction of the maxillary first premolars and closure of the spaces without anchorage loss were considered. Data extraction and synthesis: Data extraction and risk of bias assessment were carried out independently by two reviewers. Meta-analysis and sensitivity analysis were conducted. Results: Fourteen studies (seven RCTs and seven CCTs) were included. In total 303 patients received TISADs with 313 control patients. Overall the quality of the studies was considered to be moderate. Overall the TISAD group had significantly less anchorage loss than the control group. On average, TISADs enabled 1.86 mm more anchorage preservation than did conventional methods. Conclusions: The results of the meta-analysis showed that TISADs are more effective than conventional methods of anchorage reinforcement. The average difference of 2 mm seems not only statistically but also clinically significant. However, the results should be interpreted with caution because of the moderate quality of the included studies. More high-quality studies on this issue are necessary to enable drawing more reliable conclusions.
HiQuant: Rapid Postquantification Analysis of Large-Scale MS-Generated Proteomics Data.
Bryan, Kenneth; Jarboui, Mohamed-Ali; Raso, Cinzia; Bernal-Llinares, Manuel; McCann, Brendan; Rauch, Jens; Boldt, Karsten; Lynn, David J
2016-06-03
Recent advances in mass-spectrometry-based proteomics are now facilitating ambitious large-scale investigations of the spatial and temporal dynamics of the proteome; however, the increasing size and complexity of these data sets is overwhelming current downstream computational methods, specifically those that support the postquantification analysis pipeline. Here we present HiQuant, a novel application that enables the design and execution of a postquantification workflow, including common data-processing steps, such as assay normalization and grouping, and experimental replicate quality control and statistical analysis. HiQuant also enables the interpretation of results generated from large-scale data sets by supporting interactive heatmap analysis and also the direct export to Cytoscape and Gephi, two leading network analysis platforms. HiQuant may be run via a user-friendly graphical interface and also supports complete one-touch automation via a command-line mode. We evaluate HiQuant's performance by analyzing a large-scale, complex interactome mapping data set and demonstrate a 200-fold improvement in the execution time over current methods. We also demonstrate HiQuant's general utility by analyzing proteome-wide quantification data generated from both a large-scale public tyrosine kinase siRNA knock-down study and an in-house investigation into the temporal dynamics of the KSR1 and KSR2 interactomes. Download HiQuant, sample data sets, and supporting documentation at http://hiquant.primesdb.eu .
Discovering Conformational Sub-States Relevant to Protein Function
Ramanathan, Arvind; Savol, Andrej J.; Langmead, Christopher J.; Agarwal, Pratul K.; Chennubhotla, Chakra S.
2011-01-01
Background: Internal motions enable proteins to explore a range of conformations, even in the vicinity of native state. The role of conformational fluctuations in the designated function of a protein is widely debated. Emerging evidence suggests that sub-groups within the range of conformations (or sub-states) contain properties that may be functionally relevant. However, low populations in these sub-states and the transient nature of conformational transitions between these sub-states present significant challenges for their identification and characterization. Methods and Findings: To overcome these challenges we have developed a new computational technique, quasi-anharmonic analysis (QAA). QAA utilizes higher-order statistics of protein motions to identify sub-states in the conformational landscape. Further, the focus on anharmonicity allows identification of conformational fluctuations that enable transitions between sub-states. QAA applied to equilibrium simulations of human ubiquitin and T4 lysozyme reveals functionally relevant sub-states and protein motions involved in molecular recognition. In combination with a reaction pathway sampling method, QAA characterizes conformational sub-states associated with cis/trans peptidyl-prolyl isomerization catalyzed by the enzyme cyclophilin A. In these three proteins, QAA allows identification of conformational sub-states, with critical structural and dynamical features relevant to protein function. Conclusions: Overall, QAA provides a novel framework to intuitively understand the biophysical basis of conformational diversity and its relevance to protein function. PMID:21297978
MEG and EEG data analysis with MNE-Python.
Gramfort, Alexandre; Luessi, Martin; Larson, Eric; Engemann, Denis A; Strohmeier, Daniel; Brodbeck, Christian; Goj, Roman; Jas, Mainak; Brooks, Teon; Parkkonen, Lauri; Hämäläinen, Matti
2013-12-26
Magnetoencephalography and electroencephalography (M/EEG) measure the weak electromagnetic signals generated by neuronal activity in the brain. Using these signals to characterize and locate neural activation in the brain is a challenge that requires expertise in physics, signal processing, statistics, and numerical methods. As part of the MNE software suite, MNE-Python is an open-source software package that addresses this challenge by providing state-of-the-art algorithms implemented in Python that cover multiple methods of data preprocessing, source localization, statistical analysis, and estimation of functional connectivity between distributed brain regions. All algorithms and utility functions are implemented in a consistent manner with well-documented interfaces, enabling users to create M/EEG data analysis pipelines by writing Python scripts. Moreover, MNE-Python is tightly integrated with the core Python libraries for scientific computation (NumPy, SciPy) and visualization (matplotlib and Mayavi), as well as the greater neuroimaging ecosystem in Python via the Nibabel package. The code is provided under the new BSD license allowing code reuse, even in commercial products. Although MNE-Python has only been under heavy development for a couple of years, it has rapidly evolved with expanded analysis capabilities and pedagogical tutorials because multiple labs have collaborated during code development to help share best practices. MNE-Python also gives easy access to preprocessed datasets, helping users to get started quickly and facilitating reproducibility of methods by other researchers. Full documentation, including dozens of examples, is available at http://martinos.org/mne.
Collins, Simon N; Dyson, Sue J; Murray, Rachel C; Newton, J Richard; Burden, Faith; Trawford, Andrew F
2012-08-01
To establish and validate an objective method of radiographic diagnosis of anatomic changes in laminitic forefeet of donkeys on the basis of data from a comprehensive series of radiographic measurements. 85 donkeys with and 85 without forelimb laminitis for baseline data determination; a cohort of 44 donkeys with and 18 without forelimb laminitis was used for validation analyses. For each donkey, lateromedial radiographic views of 1 weight-bearing forelimb were obtained; images from 11 laminitic and 2 nonlaminitic donkeys were excluded (motion artifact) from baseline data determination. Data from an a priori selection of 19 measurements of anatomic features of laminitic and nonlaminitic donkey feet were analyzed by use of a novel application of multivariate statistical techniques. The resultant diagnostic models were validated in a blinded manner with data from the separate cohort of laminitic and nonlaminitic donkeys. Data were modeled, and robust statistical rules were established for the diagnosis of anatomic changes within laminitic donkey forefeet. Component 1 scores ≤ -3.5 were indicative of extreme anatomic change, and scores from -2.0 to 0.0 denoted modest change. Nonlaminitic donkeys with a score from 0.5 to 1.0 should be considered as at risk for laminitis. Results indicated that the radiographic procedures evaluated can be used for the identification, assessment, and monitoring of anatomic changes associated with laminitis. Screening assessments by use of this method may enable early detection of mild anatomic change and identification of at-risk donkeys.
MEG and EEG data analysis with MNE-Python
Gramfort, Alexandre; Luessi, Martin; Larson, Eric; Engemann, Denis A.; Strohmeier, Daniel; Brodbeck, Christian; Goj, Roman; Jas, Mainak; Brooks, Teon; Parkkonen, Lauri; Hämäläinen, Matti
2013-01-01
Magnetoencephalography and electroencephalography (M/EEG) measure the weak electromagnetic signals generated by neuronal activity in the brain. Using these signals to characterize and locate neural activation in the brain is a challenge that requires expertise in physics, signal processing, statistics, and numerical methods. As part of the MNE software suite, MNE-Python is an open-source software package that addresses this challenge by providing state-of-the-art algorithms implemented in Python that cover multiple methods of data preprocessing, source localization, statistical analysis, and estimation of functional connectivity between distributed brain regions. All algorithms and utility functions are implemented in a consistent manner with well-documented interfaces, enabling users to create M/EEG data analysis pipelines by writing Python scripts. Moreover, MNE-Python is tightly integrated with the core Python libraries for scientific computation (NumPy, SciPy) and visualization (matplotlib and Mayavi), as well as the greater neuroimaging ecosystem in Python via the Nibabel package. The code is provided under the new BSD license allowing code reuse, even in commercial products. Although MNE-Python has only been under heavy development for a couple of years, it has rapidly evolved with expanded analysis capabilities and pedagogical tutorials because multiple labs have collaborated during code development to help share best practices. MNE-Python also gives easy access to preprocessed datasets, helping users to get started quickly and facilitating reproducibility of methods by other researchers. Full documentation, including dozens of examples, is available at http://martinos.org/mne. PMID:24431986
Zipper plot: visualizing transcriptional activity of genomic regions.
Avila Cobos, Francisco; Anckaert, Jasper; Volders, Pieter-Jan; Everaert, Celine; Rombaut, Dries; Vandesompele, Jo; De Preter, Katleen; Mestdagh, Pieter
2017-05-02
Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative for transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5'-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool.
Model-based branching point detection in single-cell data by K-branches clustering
Chlis, Nikolaos K.; Wolf, F. Alexander; Theis, Fabian J.
2017-01-01
Abstract Motivation The identification of heterogeneities in cell populations by utilizing single-cell technologies such as single-cell RNA-Seq, enables inference of cellular development and lineage trees. Several methods have been proposed for such inference from high-dimensional single-cell data. They typically assign each cell to a branch in a differentiation trajectory. However, they commonly assume specific geometries such as tree-like developmental hierarchies and lack statistically sound methods to decide on the number of branching events. Results We present K-Branches, a solution to the above problem by locally fitting half-lines to single-cell data, introducing a clustering algorithm similar to K-Means. These halflines are proxies for branches in the differentiation trajectory of cells. We propose a modified version of the GAP statistic for model selection, in order to decide on the number of lines that best describe the data locally. In this manner, we identify the location and number of subgroups of cells that are associated with branching events and full differentiation, respectively. We evaluate the performance of our method on single-cell RNA-Seq data describing the differentiation of myeloid progenitors during hematopoiesis, single-cell qPCR data of mouse blastocyst development, single-cell qPCR data of human myeloid monocytic leukemia and artificial data. Availability and implementation An R implementation of K-Branches is freely available at https://github.com/theislab/kbranches. Contact fabian.theis@helmholtz-muenchen.de Supplementary information Supplementary data are available at Bioinformatics online. PMID:28582478
Zborowsky, Terri
2014-01-01
The purpose of this paper is to explore nursing research that is focused on the impact of healthcare environments and that has resonance with the aspects of Florence Nightingale's environmental theory. Nurses have a unique ability to apply their observational skills to understand the role of the designed environment to enable healing in their patients. This affords nurses the opportunity to engage in research studies that have immediate impact on the act of nursing. Descriptive statistics were performed on 67 healthcare design-related research articles from 25 nursing journals to discover the topical areas of interest of nursing research today. Data were also analyzed to reveal the research designs, research methods, and research settings. These data are part of an ongoing study. Descriptive statistics reveal that topics and settings most frequently cited are in keeping with the current healthcare foci of patient care quality and safety in acute and intensive care environments. Research designs and methods most frequently cited are in keeping with the early progression of a knowledge area. A few assertions can be made as a result of this study. First, education is important to continue the knowledge development in this area. Second, multiple method research studies should continue to be considered as important to healthcare research. Finally, bedside nurses are in the best position possible to begin to help us all, through research, understand how the design environment impacts patients during the act of nursing. Evidence-based design, literature review, nursing.
Using statistical text classification to identify health information technology incidents
Chai, Kevin E K; Anthony, Stephen; Coiera, Enrico; Magrabi, Farah
2013-01-01
Objective To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the USA Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. Design We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both ‘balanced’ (50% HIT) and ‘stratified’ (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. Measurements κ statistic, F1 score, precision and recall. Results Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). Conclusions Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation. PMID:23666777
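The measurements named above (precision, recall, F1 score) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code; the function names are hypothetical. It shows that F1 is the harmonic mean of precision and recall, so a pair such as the reported recall of 0.989 with precision of 0.165 corresponds to an F1 of roughly 0.28 for that particular pair. The abstract does not say which evaluation run that pair belongs to.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts; a ratio with an empty denominator is 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0
```

Because the harmonic mean is dominated by the smaller value, a classifier tuned for very high recall on a rare class can still score a low F1 when precision collapses.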
FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption
2015-01-01
Background The increasing availability of genome data motivates massive research studies in personalized treatment and precision medicine. Public cloud services provide a flexible way to mitigate the storage and computation burden in conducting genome-wide association studies (GWAS). However, data privacy is a widespread concern when sharing sensitive information in a cloud environment. Methods We presented a novel framework (FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption) to fully outsource GWAS (i.e., chi-square statistic computation) using homomorphic encryption. The proposed framework enables secure divisions over encrypted data. We introduced two division protocols (i.e., secure errorless division and secure approximation division) with a trade-off between complexity and accuracy in computing chi-square statistics. Results The proposed framework was evaluated for the task of chi-square statistic computation with two case-control datasets from the 2015 iDASH genome privacy protection challenge. Experimental results show that the performance of FORESEE can be significantly improved through algorithmic optimization and parallel computation. Remarkably, the secure approximation division provides a significant performance gain without missing any significant SNPs in the chi-square association test using the aforementioned datasets. Conclusions Unlike many existing HME based studies, in which final results need to be computed by the data owner due to the lack of a secure division operation, the proposed FORESEE framework supports complete outsourcing to the cloud and outputs the final encrypted chi-square statistics. PMID:26733391
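To make clear why division is the critical step, the chi-square statistic for a 2×2 case-control table can be written in plaintext as below. This is an ordinary unencrypted sketch, not the FORESEE protocol; the function name and the table layout (a, b = case counts, c, d = control counts) are assumptions made for illustration. The numerator needs only additions and multiplications, while the final quotient is the operation that a homomorphic scheme has to support through a dedicated division protocol.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Numerator: additions and multiplications only.
    numerator = n * (a * d - b * c) ** 2
    # Denominator: product of the four marginal totals.
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # degenerate table with an empty row or column
    # The single division that makes encrypted evaluation hard.
    return numerator / denominator
```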
A user-targeted synthesis of the VALUE perfect predictor experiment
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutierrez, Jose; Kotlarski, Sven; Hertig, Elke; Wibig, Joanna; Rössler, Ole; Huth, Radan
2016-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. We consider different aspects: (1) marginal aspects such as mean, variance and extremes; (2) temporal aspects such as spell length characteristics; (3) spatial aspects such as the de-correlation length of precipitation extremes; and (4) multi-variate aspects such as the interplay of temperature and precipitation or scale-interactions. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur. Experiment 1 (perfect predictors): what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Experiment 2 (Global climate model predictors): how is the overall representation of regional climate, including errors inherited from global climate models? Experiment 3 (pseudo reality): do methods fail in representing regional climate change? Here, we present a user-targeted synthesis of the results of the first VALUE experiment. In this experiment, downscaling methods are driven with ERA-Interim reanalysis data to eliminate global climate model errors, over the period 1979-2008. As reference data we use, depending on the question addressed, (1) observations from 86 meteorological stations distributed across Europe; (2) gridded observations at the corresponding 86 locations or (3) gridded spatially extended observations for selected European regions. With more than 40 contributing methods, this study is the most comprehensive downscaling inter-comparison project so far. 
The results clearly indicate that for several aspects, the downscaling skill varies considerably between different methods. For specific purposes, some methods can therefore clearly be excluded.
Empowerment of women for health promotion: a meta-analysis.
Kar, S B; Pascual, C A; Chickering, K L
1999-12-01
The objective of this paper is to identify conditions, factors and methods, which empower women and mothers (WAM) for social action and health promotion movements. WAM are the primary caregivers in almost all cultures; they have demonstrated bold leadership under extreme adversity. Consequently, when empowered and involved, WAM can be effective partners in health promotion programs. The methodology includes a meta-analysis of 40 exemplary case studies from across the world, which meet predetermined criteria, to draw implications for social action and health promotion. Cases were selected from industrialized and less-industrialized nations and from four problem domains affecting quality of life and health: (1) human rights, (2) women's equal rights, (3) economic enhancement and (4) health promotion. Content analysis extracted data from all cases on six dimensions: (1) problem, (2) impetus/leadership, (3) macro-environment, (4) methods used, (5) partners/opponents and (6) impact. Analysis identified seven methods frequently used to EMPOWER (acronym): empowerment education and training, media use and advocacy, public education and participation, organizing associations and unions, work training and micro-enterprise, enabling services and support, and rights protection and promotion. Cochran's Q test confirmed significant differences in the frequencies of methods used. The seven EMPOWER methods were used in this order: enabling services, rights protection/promotion, public education, media use/advocacy, organizing associations/unions, empowerment education, and work training and micro-enterprise. Media and public education were more frequently used by industrialized than non-industrialized societies (χ² tests). While frequencies of methods used varied in all other comparisons, these differences were not statistically significant, suggesting the importance of these methods across problem domains and levels of industrialization. 
The paper integrates key findings into an empowerment model consisting of five stages: motivation for action, empowerment support, initial individual action, empowerment program, and institutionalization and replication. Implications for policy and health promotion programs are discussed.
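Cochran's Q test, mentioned above, compares the frequencies of several binary outcomes recorded on the same units. A minimal sketch follows, assuming a hypothetical layout in which each row is one case study and each column is one method, with 1 meaning the method was used; the function name is illustrative and no data from the paper are reproduced. Under the null hypothesis of equal usage frequencies, Q is approximately chi-square distributed with k - 1 degrees of freedom, where k is the number of methods.

```python
def cochrans_q(table: list) -> float:
    """Cochran's Q statistic for a binary matrix.

    table[i][j] is 1 if case i used method j, else 0.
    Q = (k - 1) * (k * sum(C_j^2) - T^2) / (k * T - sum(R_i^2)),
    with C_j the column totals, R_i the row totals, T the grand total.
    """
    k = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Every row is all zeros or all ones: no within-row variation.
        raise ValueError("Cochran's Q is undefined for this table")
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    return numerator / denominator
```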
Kintrup, J; Wünsch, G
2001-11-01
The capability of sewer slime to accumulate heavy metals from municipal wastewater can be exploited to identify the sources of sewage sludge pollution. Former investigations of sewer slime looked for a few elements only and could, therefore, not account for deviations of the enrichment efficiency of the slime or for irregularities from sampling. Results of ICP-MS multi element determinations were analyzed by multivariate statistical methods. A new dimensionless characteristic "sewer slime impact" is proposed, which is zero for unloaded samples. Patterns expressed in this data format specifically extract the information required to identify the type of pollution and polluter quicker and with less effort and cost than hitherto.
Method and apparatus for offloading compute resources to a flash co-processing appliance
Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing -bung
2015-10-13
Solid-State Drive (SSD) burst buffer nodes are interposed into a parallel supercomputing cluster to enable fast burst checkpoint of cluster memory to or from nearby interconnected solid-state storage with asynchronous migration between the burst buffer nodes and slower more distant disk storage. The SSD nodes also perform tasks offloaded from the compute nodes or associated with the checkpoint data. For example, the data for the next job is preloaded in the SSD node and uploaded very quickly to the respective compute node just before the next job starts. During a job, the SSD nodes perform fast visualization and statistical analysis upon the checkpoint data. The SSD nodes can also perform data reduction and encryption of the checkpoint data.
Peto, R.; Pike, M. C.; Armitage, P.; Breslow, N. E.; Cox, D. R.; Howard, S. V.; Mantel, N.; McPherson, K.; Peto, J.; Smith, P. G.
1977-01-01
Part I of this report appeared in the previous issue (Br. J. Cancer (1976) 34,585), and discussed the design of randomized clinical trials. Part II now describes efficient methods of analysis of randomized clinical trials in which we wish to compare the duration of survival (or the time until some other untoward event first occurs) among different groups of patients. It is intended to enable physicians without statistical training either to analyse such data themselves using life tables, the logrank test and retrospective stratification, or, when such analyses are presented, to appreciate them more critically, but the discussion may also be of interest to statisticians who have not yet specialized in clinical trial analyses. PMID:831755
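The logrank test named above can be sketched for two groups as follows. This uses the hypergeometric-variance form commonly given in textbooks, which is not necessarily the exact hand calculation presented in the report; the function name and argument layout are assumptions. At each distinct event time the observed number of events in group 1 is compared with the number expected if both groups shared the same hazard, and the resulting statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
def logrank_statistic(times1, events1, times2, events2) -> float:
    """Two-group logrank chi-square statistic.

    times*: follow-up times; events*: 1 if the event occurred at that
    time, 0 if the observation was censored.
    """
    data = [(t, e, 0) for t, e in zip(times1, events1)]
    data += [(t, e, 1) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e})

    observed1 = expected1 = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n1 = sum(1 for tt, _, g in data if g == 0 and tt >= t)
        n2 = sum(1 for tt, _, g in data if g == 1 and tt >= t)
        # Events occurring exactly at time t.
        d1 = sum(1 for tt, e, g in data if g == 0 and tt == t and e)
        d2 = sum(1 for tt, e, g in data if g == 1 and tt == t and e)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += d * n1 * n2 * (n - d) / (n * n * (n - 1))

    if variance == 0:
        raise ValueError("no informative event times")
    return (observed1 - expected1) ** 2 / variance
```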
High-Reproducibility and High-Accuracy Method for Automated Topic Classification
NASA Astrophysics Data System (ADS)
Lancichinetti, Andrea; Sirer, M. Irmak; Wang, Jane X.; Acuna, Daniel; Körding, Konrad; Amaral, Luís A. Nunes
2015-01-01
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent searching, statistical characterization, and meaningful classification. Latent Dirichlet allocation (LDA) is the state of the art in topic modeling. Here, we perform a systematic theoretical and numerical analysis that demonstrates that current optimization techniques for LDA often yield results that are not accurate in inferring the most suitable model parameters. Adapting approaches from community detection in networks, we propose a new algorithm that displays high reproducibility and high accuracy and also has high computational efficiency. We apply it to a large set of documents in the English Wikipedia and reveal its hierarchical structure.
Molecular modeling of polycarbonate materials: Glass transition and mechanical properties
NASA Astrophysics Data System (ADS)
Palczynski, Karol; Wilke, Andreas; Paeschke, Manfred; Dzubiella, Joachim
2017-09-01
Linking the experimentally accessible macroscopic properties of thermoplastic polymers to their microscopic static and dynamic properties is a key requirement for targeted material design. Classical molecular dynamics simulations enable us to study the structural and dynamic behavior of molecules on microscopic scales, and statistical physics provides a framework for relating these properties to the macroscopic properties. We take a first step toward creating an automated workflow for the theoretical prediction of thermoplastic material properties by developing an expeditious method for parameterizing a simple yet surprisingly powerful coarse-grained bisphenol-A polycarbonate model which goes beyond previous coarse-grained models and successfully reproduces the thermal expansion behavior, the glass transition temperature as a function of the molecular weight, and several elastic properties.
A practical approach to automate randomized design of experiments for ligand-binding assays.
Tsoi, Jennifer; Patel, Vimal; Shih, Judy
2014-03-01
Design of experiments (DOE) is utilized in optimizing ligand-binding assays by modeling factor effects. To reduce the analyst's workload and error inherent with DOE, we propose the integration of automated liquid handlers to perform the randomized designs. A randomized design created from statistical software was imported into a custom macro that converts the design into a liquid-handler worklist to automate reagent delivery. An optimized assay was transferred to a contract research organization resulting in a successful validation. We developed a practical solution for assay optimization by integrating DOE and automation to increase assay robustness and enable successful method transfer. The flexibility of this process allows it to be applied to a variety of assay designs.
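The conversion of a randomized design into a worklist can be illustrated with a short sketch. This is not the authors' macro or any liquid-handler vendor format; the factor names, the `position` field, and the function name are hypothetical. It builds a full-factorial design, shuffles the run order with a seeded generator so the randomization is reproducible, and assigns each run a sequential position.

```python
import itertools
import random


def randomized_worklist(factors: dict, replicates: int = 1, seed=None) -> list:
    """Return a randomized full-factorial run list.

    factors maps a factor name to its list of levels. Each returned
    entry is a dict with a 1-based 'position' plus one key per factor.
    """
    names = list(factors)
    runs = [
        combo
        for combo in itertools.product(*(factors[name] for name in names))
        for _ in range(replicates)
    ]
    rng = random.Random(seed)  # seeded so the order can be regenerated
    rng.shuffle(runs)
    return [
        {"position": i + 1, **dict(zip(names, combo))}
        for i, combo in enumerate(runs)
    ]
```

Fixing the seed lets the same run order be regenerated later, which may help when the design has to be audited or repeated at another site.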
[The future of forensic DNA analysis for criminal justice].
Laurent, François-Xavier; Vibrac, Geoffrey; Rubio, Aurélien; Thévenot, Marie-Thérèse; Pène, Laurent
2017-11-01
In the criminal framework, the analysis of approximately 20 DNA microsatellites enables the establishment of a genetic profile with a high statistical power of discrimination. This technique gives us the possibility to establish or exclude a match between a biological trace detected at a crime scene and a suspect whose DNA was collected via an oral swab. However, conventional techniques tend to complicate the interpretation of complex DNA samples, such as degraded DNA and DNA mixtures. The aim of this review is to highlight the power of new forensic DNA methods (including high-throughput sequencing or single-cell sequencing) to facilitate expert interpretation in full compliance with existing French legislation. © 2017 médecine/sciences – Inserm.
Time studies in A&E departments--a useful tool for management.
Aharonson-Daniel, L; Fung, H; Hedley, A J
1996-01-01
A time and motion study was conducted in an accident and emergency (A&E) department in a Hong Kong Government hospital in order to suggest solutions for severe queuing problems found in A&E. The study provided useful information about the patterns of arrival and service; the throughput; and the factors that influence the length of the queue at the A&E department. Plans for building a computerized simulation model were dropped as new intelligence generated by the study enabled problem solving using simple statistical analysis and common sense. Demonstrates some potential benefits for management in applying operations research methods in busy clinical working environments. The implementation of the recommendations made by this study successfully eliminated queues in A&E.
Data survey on the effect of product features on competitive advantage of selected firms in Nigeria.
Olokundun, Maxwell; Iyiola, Oladele; Ibidunni, Stephen; Falola, Hezekiah; Salau, Odunayo; Amaihian, Augusta; Peter, Fred; Borishade, Taiye
2018-06-01
The main objective of this study was to present a data article that investigates the effect of product features on a firm's competitive advantage. Few studies have examined how the features of a product could help in driving the competitive advantage of a firm. A descriptive research method was used. Statistical Package for Social Sciences (SPSS 22) was used to analyze one hundred and fifty (150) valid questionnaires that were completed by small business owners registered under small and medium scale enterprises development of Nigeria (SMEDAN). Stratified and simple random sampling techniques were employed; reliability and validity procedures were also confirmed. The field data set is made publicly available to enable critical or extended analysis.
Detecting Disease Specific Pathway Substructures through an Integrated Systems Biology Approach
Alaimo, Salvatore; Marceca, Gioacchino Paolo; Ferro, Alfredo; Pulvirenti, Alfredo
2017-01-01
In the era of network medicine, pathway analysis methods play a central role in the prediction of phenotype from high throughput experiments. In this paper, we present a network-based systems biology approach capable of extracting disease-perturbed subpathways within pathway networks in connection with expression data taken from The Cancer Genome Atlas (TCGA). Our system extends pathways with missing regulatory elements, such as microRNAs, and their interactions with genes. The framework enables the extraction, visualization, and analysis of statistically significant disease-specific subpathways through an easy to use web interface. Our analysis shows that the methodology is able to fill the gap in current techniques, allowing a more comprehensive analysis of the phenomena underlying disease states. PMID:29657291
The Gender Differences: Hispanic Females and Males Majoring in Science or Engineering
NASA Astrophysics Data System (ADS)
Brown, Susan Wightman
Documented by national statistics, female Hispanic students are not eagerly rushing to major in science or engineering. Using Seidman's in-depth interviewing method, 22 Hispanic students, 12 female and 10 male, majoring in science or engineering were interviewed. Besides the themes that emerged with all 22 Hispanic students, there were definite differences between the female and male Hispanic students: role and ethnic identity confusion, greater college preparation, mentoring needed, and the increased participation in enriched additional education programs by the female Hispanic students. Listening to these stories from successful female Hispanic students majoring in science and engineering, educators can make changes in our school learning environments that will encourage and enable more female Hispanic students to choose science or engineering careers.
The basis function approach for modeling autocorrelation in ecological data.
Hefley, Trevor J; Broms, Kristin M; Brost, Brian M; Buderman, Frances E; Kay, Shannon L; Scharf, Henry R; Tipton, John R; Williams, Perry J; Hooten, Mevin B
2017-03-01
Analyzing ecological data often requires modeling the autocorrelation created by spatial and temporal processes. Many seemingly disparate statistical methods used to account for autocorrelation can be expressed as regression models that include basis functions. Basis functions also enable ecologists to modify a wide range of existing ecological models in order to account for autocorrelation, which can improve inference and predictive accuracy. Furthermore, understanding the properties of basis functions is essential for evaluating the fit of spatial or time-series models, detecting a hidden form of collinearity, and analyzing large data sets. We present important concepts and properties related to basis functions and illustrate several tools and techniques ecologists can use when modeling autocorrelation in ecological data. © 2016 by the Ecological Society of America.
Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation
Technow, Frank; Messina, Carlos D.; Totir, L. Radu; Cooper, Mark
2015-01-01
Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E), continues to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics. PMID:26121133
Near-Sun and 1 AU magnetic field of coronal mass ejections: a parametric study
NASA Astrophysics Data System (ADS)
Patsourakos, S.; Georgoulis, M. K.
2016-11-01
Aims: The magnetic field of coronal mass ejections (CMEs) determines their structure, evolution, and energetics, as well as their geoeffectiveness. However, we currently lack routine diagnostics of the near-Sun CME magnetic field, which is crucial for determining the subsequent evolution of CMEs. Methods: We recently presented a method to infer the near-Sun magnetic field magnitude of CMEs and then extrapolate it to 1 AU. This method uses relatively easy to deduce observational estimates of the magnetic helicity in CME-source regions along with geometrical CME fits enabled by coronagraph observations. We hereby perform a parametric study of this method aiming to assess its robustness. We use statistics of active region (AR) helicities and CME geometrical parameters to determine a matrix of plausible near-Sun CME magnetic field magnitudes. In addition, we extrapolate this matrix to 1 AU and determine the anticipated range of CME magnetic fields at 1 AU representing the radial falloff of the magnetic field in the CME out to interplanetary (IP) space by a power law with index αB. Results: The resulting distribution of the near-Sun (at 10 R⊙) CME magnetic fields varies in the range [0.004, 0.02] G, comparable to, or higher than, a few existing observational inferences of the magnetic field in the quiescent corona at the same distance. We also find that a theoretically and observationally motivated range exists around αB = -1.6 ± 0.2, thereby leading to a ballpark agreement between our estimates and observationally inferred field magnitudes of magnetic clouds (MCs) at L1. Conclusions: In a statistical sense, our method provides results that are consistent with observations.
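The radial extrapolation described above can be sketched numerically. The abstract states a power-law falloff with index αB from 10 R⊙ outward; the explicit form B(r) = B0 (r / r0)^αB, the conversion 1 AU ≈ 215 R⊙, and the function name below are assumptions added for illustration, not the authors' code.

```python
R_SUN_PER_AU = 215.0       # 1 AU expressed in solar radii (approximate)
NANOTESLA_PER_GAUSS = 1e5  # 1 G = 1e-4 T = 1e5 nT


def extrapolate_field(b0_gauss: float, alpha_b: float,
                      r0_rsun: float = 10.0,
                      r_rsun: float = R_SUN_PER_AU) -> float:
    """Scale a near-Sun field magnitude (in gauss, at r0) to radius r
    with a power law B(r) = B0 * (r / r0) ** alpha_b; returns nanotesla."""
    if r0_rsun <= 0 or r_rsun <= 0:
        raise ValueError("radii must be positive")
    b_gauss = b0_gauss * (r_rsun / r0_rsun) ** alpha_b
    return b_gauss * NANOTESLA_PER_GAUSS
```

With αB = -1.6, the quoted near-Sun range of 0.004 to 0.02 G maps to a few nanotesla up to roughly 15 nT at 1 AU under these assumptions, which is the order of magnitude typically quoted for magnetic clouds.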
Wagner, Rebecca; Wetzel, Stephanie J; Kern, John; Kingston, H M Skip
2012-02-01
The employment of chemical weapons by rogue states and/or terrorist organizations is an ongoing concern in the United States. The quantitative analysis of nerve agents must be rapid and reliable for use in the private and public sectors. Current methods describe a tedious and time-consuming derivatization for gas chromatography-mass spectrometry and liquid chromatography in tandem with mass spectrometry. Two solid-phase extraction (SPE) techniques for the analysis of glyphosate and methylphosphonic acid are described with the utilization of isotopically enriched analytes for quantitation via atmospheric pressure chemical ionization-quadrupole time-of-flight mass spectrometry (APCI-Q-TOF-MS) that does not require derivatization. Solid-phase extraction-isotope dilution mass spectrometry (SPE-IDMS) involves pre-equilibration of a naturally occurring sample with an isotopically enriched standard. The second extraction method, i-Spike, involves loading an isotopically enriched standard onto the SPE column before the naturally occurring sample. The sample and the spike are then co-eluted from the column enabling precise and accurate quantitation via IDMS. The SPE methods in conjunction with IDMS eliminate concerns of incomplete elution, matrix and sorbent effects, and MS drift. For accurate quantitation with IDMS, the isotopic contribution of all atoms in the target molecule must be statistically taken into account. This paper describes two newly developed sample preparation techniques for the analysis of nerve agent surrogates in drinking water as well as statistical probability analysis for proper molecular IDMS. The methods described in this paper demonstrate accurate molecular IDMS using APCI-Q-TOF-MS with limits of quantitation as low as 0.400 mg/kg for glyphosate and 0.031 mg/kg for methylphosphonic acid. Copyright © 2012 John Wiley & Sons, Ltd.
Quantitative analysis of tympanic membrane perforation: a simple and reliable method.
Ibekwe, T S; Adeosun, A A; Nwaorgu, O G
2009-01-01
Accurate assessment of the features of tympanic membrane perforation, especially size, site, duration and aetiology, is important, as it enables optimum management. To describe a simple, cheap and effective method of quantitatively analysing tympanic membrane perforations. The system described comprises a video-otoscope (capable of generating still and video images of the tympanic membrane), adapted via a universal serial bus box to a computer screen, with images analysed using the Image J geometrical analysis software package. The reproducibility of results and their correlation with conventional otoscopic methods of estimation were tested statistically with the paired t-test and correlational tests, using the Statistical Package for the Social Sciences version 11 software. The following equation was generated: P/T × 100 per cent = percentage perforation, where P is the area (in pixels²) of the tympanic membrane perforation and T is the total area (in pixels²) for the entire tympanic membrane (including the perforation). Illustrations are shown. Comparison of blinded data on tympanic membrane perforation area obtained independently from assessments by two trained otologists, of comparable years of experience, using the video-otoscopy system described, showed similar findings, with strong correlations devoid of inter-observer error (p = 0.000, r = 1). Comparison with conventional otoscopic assessment also indicated significant correlation, comparing results for two trained otologists, but some inter-observer variation was present (p = 0.000, r = 0.896). Correlation between the two methods for each of the otologists was also highly significant (p = 0.000). A computer-adapted video-otoscope, with images analysed by Image J software, represents a cheap, reliable, technology-driven, clinical method of quantitative analysis of tympanic membrane perforations and injuries.
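The percentage-perforation equation above is simple enough to express directly. The sketch below assumes the two areas have already been measured in squared pixels (for example with an image analysis package); the function name and the input checks are illustrative additions.

```python
def percentage_perforation(perforation_area_px: float,
                           membrane_area_px: float) -> float:
    """Return P / T * 100, where P is the perforation area and T is the
    total tympanic membrane area (perforation included), both in
    squared pixels."""
    if membrane_area_px <= 0:
        raise ValueError("total membrane area must be positive")
    if not 0 <= perforation_area_px <= membrane_area_px:
        raise ValueError("perforation area must lie between 0 and T")
    return perforation_area_px / membrane_area_px * 100.0
```

Because the result is a ratio of two areas from the same image, it does not depend on the magnification or on the physical size of a pixel.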
Zhao, W; Busto, R; Truettner, J; Ginsberg, M D
2001-07-30
The analysis of pixel-based relationships between local cerebral blood flow (LCBF) and mRNA expression can reveal important insights into brain function. Traditionally, LCBF and in situ hybridization studies for genes of interest have been analyzed in separate series. To overcome this limitation and to increase the power of statistical analysis, this study focused on developing a double-label method to measure local cerebral blood flow (LCBF) and gene expressions simultaneously by means of a dual-autoradiography procedure. A 14C-iodoantipyrine autoradiographic LCBF study was first performed. Serial brain sections (12 in this study) were obtained at multiple coronal levels and were processed in the conventional manner to yield quantitative LCBF images. Two replicate sections at each bregma level were then used for in situ hybridization. To eliminate the 14C-iodoantipyrine from these sections, a chloroform-washout procedure was first performed. The sections were then processed for in situ hybridization autoradiography for the probes of interest. This method was tested in Wistar rats subjected to 12 min of global forebrain ischemia by two-vessel occlusion plus hypotension, followed by 2 or 6 h of reperfusion (n=4-6 per group). LCBF and in situ hybridization images for heat shock protein 70 (HSP70) were generated for each rat, aligned by disparity analysis, and analyzed on a pixel-by-pixel basis. This method yielded detailed inter-modality correlation between LCBF and HSP70 mRNA expressions. The advantages of this method include reducing the number of experimental animals by one-half; and providing accurate pixel-based correlations between different modalities in the same animals, thus enabling paired statistical analyses. This method can be extended to permit correlation of LCBF with the expression of multiple genes of interest.
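A pixel-by-pixel comparison of two aligned images can be sketched as a correlation over paired pixel values. The abstract does not state which correlation measure was used, so the Pearson coefficient below is an assumption, as are the function name and the flat-list image representation.

```python
import math


def pixelwise_pearson(image_a: list, image_b: list) -> float:
    """Pearson correlation between two aligned images given as flat
    lists of pixel values of equal length."""
    if len(image_a) != len(image_b) or len(image_a) < 2:
        raise ValueError("images must have the same size (>= 2 pixels)")
    n = len(image_a)
    mean_a = sum(image_a) / n
    mean_b = sum(image_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(image_a, image_b))
    var_a = sum((a - mean_a) ** 2 for a in image_a)
    var_b = sum((b - mean_b) ** 2 for b in image_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)
```

Since both images in the double-label method come from the same animal, each pixel pair is a matched observation, which is what permits paired analyses of this kind.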