Comparing geological and statistical approaches for element selection in sediment tracing research
NASA Astrophysics Data System (ADS)
Laceby, J. Patrick; McMahon, Joe; Evrard, Olivier; Olley, Jon
2015-04-01
Elevated suspended sediment loads reduce reservoir capacity and significantly increase the cost of operating water treatment infrastructure, making the management of sediment supply to reservoirs increasingly important. Sediment fingerprinting techniques can be used to determine the relative contributions of different sources of sediment accumulating in reservoirs. The objective of this research is to compare geological and statistical approaches to element selection for sediment fingerprinting modelling. Time-integrated samplers (n=45) were used to obtain source samples from four major subcatchments flowing into the Baroon Pocket Dam in South East Queensland, Australia. The geochemistry of potential sources was compared to the geochemistry of sediment cores (n=12) sampled in the reservoir. The geological approach selected elements for modelling that provided expected, observed and statistical discrimination between sediment sources. Two statistical approaches selected elements for modelling with the Kruskal-Wallis H-test and Discriminant Function Analysis (DFA). In particular, two different significance levels (0.05 and 0.35) for the DFA were included to investigate the importance of element selection on modelling results. A distribution model determined the relative contributions of different sources to sediment sampled in the Baroon Pocket Dam. Elemental discrimination was expected between one subcatchment (Obi Obi Creek) and the remaining subcatchments (Lexys, Falls and Bridge Creeks). Six major elements were expected to provide discrimination. Of these six, only Fe2O3 and SiO2 provided expected, observed and statistical discrimination. Modelling results with this geological approach indicated that 36% (+/- 9%) of sediment sampled in the reservoir cores was derived from mafic sources and 64% (+/- 9%) from felsic sources. The geological and the first statistical approach (DFA0.05) differed by only 1% (σ 5%) for 5 out of 6 model groupings, with only the Lexys Creek modelling results differing significantly (35%). The statistical model with expanded elemental selection (DFA0.35) differed from the geological model by an average of 30% across all 6 models. Elemental selection for sediment fingerprinting therefore has the potential to impact modelling results. Accordingly, it is important to incorporate both robust geological and statistical approaches when selecting elements for sediment fingerprinting. For the Baroon Pocket Dam, management should focus on reducing the supply of sediments derived from felsic sources in each of the subcatchments.
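As a rough illustration of the distribution-modelling step, the following sketch (with invented element concentrations, not the study's data) propagates source variability through a two-end-member mixing model and reports a mafic/felsic contribution with an uncertainty:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical element means and standard deviations (% oxide) for the two
# source groups and the reservoir sediment; real values would come from the
# time-integrated source samples and the reservoir cores.
elements = ["Fe2O3", "SiO2"]
mafic  = {"Fe2O3": (12.0, 1.5), "SiO2": (48.0, 3.0)}
felsic = {"Fe2O3": (4.0, 0.8),  "SiO2": (70.0, 3.5)}
mixture = {"Fe2O3": 7.0, "SiO2": 62.0}

f_grid = np.linspace(0.0, 1.0, 201)   # candidate mafic fractions
draws = []
for _ in range(5000):
    # draw one realisation of each end-member composition
    maf = np.array([rng.normal(*mafic[e]) for e in elements])
    fel = np.array([rng.normal(*felsic[e]) for e in elements])
    mix = np.array([mixture[e] for e in elements])
    # relative squared mixing error for every candidate fraction
    pred = np.outer(f_grid, maf) + np.outer(1 - f_grid, fel)
    err = (((mix - pred) / mix) ** 2).sum(axis=1)
    draws.append(f_grid[err.argmin()])

draws = np.array(draws)
print(f"mafic fraction: {draws.mean():.2f} +/- {draws.std():.2f}")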
NASA Astrophysics Data System (ADS)
Ghosh, Dipak; Sarkar, Sharmila; Sen, Sanjib; Roy, Jaya
1995-06-01
In this paper the behavior of factorial moments with rapidity window size, which is usually explained in terms of "intermittency," has been interpreted by simple quantum statistical properties of the emitting system using the concept of the "modified two-source model" recently proposed by Ghosh and Sarkar [Phys. Lett. B 278, 465 (1992)]. The analysis has been performed using our own data on 16O-Ag/Br and 24Mg-Ag/Br interactions in the few tens of GeV energy regime.
NASA Astrophysics Data System (ADS)
Zhu, Jian-Rong; Li, Jian; Zhang, Chun-Mei; Wang, Qin
2017-10-01
The decoy-state method has been widely used in commercial quantum key distribution (QKD) systems. In view of practical decoy-state QKD with both source errors and statistical fluctuations, we propose a universal model of full parameter optimization in biased decoy-state QKD with phase-randomized sources. We then adopt this model to carry out simulations of two widely used sources: the weak coherent source (WCS) and the heralded single-photon source (HSPS). Results show that full parameter optimization can significantly improve not only the secure transmission distance but also the final key generation rate. Moreover, when source errors and statistical fluctuations are taken into account, the performance of decoy-state QKD using an HSPS suffers less than that of decoy-state QKD using a WCS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghosh, D.; Sarkar, S.; Sen, S.
1995-06-01
In this paper the behavior of factorial moments with rapidity window size, which is usually explained in terms of "intermittency," has been interpreted by simple quantum statistical properties of the emitting system using the concept of "modified two-source model" as recently proposed by Ghosh and Sarkar [Phys. Lett. B 278, 465 (1992)]. The analysis has been performed using our own data of ^16O-Ag/Br and ^24Mg-Ag/Br interactions at a few tens of GeV energy regime.
NASA Astrophysics Data System (ADS)
Shirzaei, M.; Walter, T. R.
2009-10-01
Modern geodetic techniques provide valuable and near real-time observations of volcanic activity. Characterizing the source of deformation based on these observations has become of major importance in related monitoring efforts. We investigate two random search approaches, simulated annealing (SA) and genetic algorithm (GA), and utilize them in an iterated manner. The iterated approach helps to prevent GA in general and SA in particular from getting trapped in local minima, and it also increases redundancy for exploring the search space. We apply a statistical competency test for estimating the confidence interval of the inversion source parameters, considering their internal interaction through the model, the effect of the model deficiency, and the observational error. Here, we present and test this new randomly iterated search and statistical competency (RISC) optimization method together with GA and SA for the modeling of data associated with volcanic deformations. Following synthetic and sensitivity tests, we apply the improved inversion techniques to two episodes of activity in the Campi Flegrei volcanic region in Italy, observed by the interferometric synthetic aperture radar technique. Inversion of these data allows derivation of deformation source parameters and their associated quality so that we can compare the two inversion methods. The RISC approach was found to be an efficient method in terms of computation time and search results and may be applied to other optimization problems in volcanic and tectonic environments.
Billon, Alexis; Foy, Cédric; Picaut, Judicaël; Valeau, Vincent; Sakout, Anas
2008-06-01
In this paper, a modification of the diffusion model for room acoustics is proposed to account for sound transmission between two rooms, a source room and an adjacent room, which are coupled through a partition wall. A system of two diffusion equations, one for each room, together with a set of two boundary conditions, one for the partition wall and one for the other walls of a room, is obtained and numerically solved. The modified diffusion model is validated by numerical comparisons with the statistical theory for several coupled-room configurations by varying the coupling area surface, the absorption coefficient of each room, and the volume of the adjacent room. An experimental comparison is also carried out for two coupled classrooms. The modified diffusion model results agree very well with both the statistical theory and the experimental data. The diffusion model can then be used as an alternative to the statistical theory, especially when the statistical theory is not applicable, that is, when the reverberant sound field is not diffuse. Moreover, the diffusion model allows the prediction of the spatial distribution of sound energy within each coupled room, while the statistical theory gives only one sound level for each room.
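For context, the statistical (diffuse-field) theory that the diffusion model is compared against reduces, for two coupled rooms, to a small energy-balance system; a minimal sketch with assumed room parameters (not those of the paper) is:

import numpy as np

c = 343.0          # speed of sound, m/s
W = 1e-3           # source power injected in room 1, W
S1, S2 = 200.0, 150.0          # wall areas (m^2), excluding the coupling aperture
a1, a2 = 0.10, 0.30            # mean absorption coefficients
Sc = 4.0                        # coupling (open) area, m^2

# Steady-state diffuse-field balance: absorbed power plus net power exchanged
# through the aperture equals the injected power in each room.
A = np.array([[S1 * a1 + Sc, -Sc],
              [-Sc, S2 * a2 + Sc]]) * c / 4.0
w = np.linalg.solve(A, np.array([W, 0.0]))   # energy densities (J/m^3)

rho = 1.21                     # air density, kg/m^3
p_ref = 2e-5                   # reference pressure, Pa
Lp = 10 * np.log10(rho * c**2 * w / p_ref**2)   # sound pressure levels (dB)
print(f"Lp room1 = {Lp[0]:.1f} dB, Lp room2 = {Lp[1]:.1f} dB, "
      f"level difference = {Lp[0] - Lp[1]:.1f} dB")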
Distinguishing dark matter from unresolved point sources in the Inner Galaxy with photon statistics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Samuel K.; Lisanti, Mariangela; Safdi, Benjamin R., E-mail: samuelkl@princeton.edu, E-mail: mlisanti@princeton.edu, E-mail: bsafdi@princeton.edu
2015-05-01
Data from the Fermi Large Area Telescope suggests that there is an extended excess of GeV gamma-ray photons in the Inner Galaxy. Identifying potential astrophysical sources that contribute to this excess is an important step in verifying whether the signal originates from annihilating dark matter. In this paper, we focus on the potential contribution of unresolved point sources, such as millisecond pulsars (MSPs). We propose that the statistics of the photons—in particular, the flux probability density function (PDF) of the photon counts below the point-source detection threshold—can potentially distinguish between the dark-matter and point-source interpretations. We calculate the flux PDF via the method of generating functions for these two models of the excess. Working in the framework of Bayesian model comparison, we then demonstrate that the flux PDF can potentially provide evidence for an unresolved MSP-like point-source population.
A novel model incorporating two variability sources for describing motor evoked potentials
Goetz, Stefan M.; Luber, Bruce; Lisanby, Sarah H.; Peterchev, Angel V.
2014-01-01
Objective: Motor evoked potentials (MEPs) play a pivotal role in transcranial magnetic stimulation (TMS), e.g., for determining the motor threshold and probing cortical excitability. Sampled across the range of stimulation strengths, MEPs outline an input–output (IO) curve, which is often used to characterize the corticospinal tract. More detailed understanding of the signal generation and variability of MEPs would provide insight into the underlying physiology and aid correct statistical treatment of MEP data. Methods: A novel regression model is tested using measured IO data of twelve subjects. The model splits MEP variability into two independent contributions, acting on both sides of a strong sigmoidal nonlinearity that represents neural recruitment. Traditional sigmoidal regression with a single variability source after the nonlinearity is used for comparison. Results: The distribution of MEP amplitudes varied across different stimulation strengths, violating statistical assumptions in traditional regression models. In contrast to the conventional regression model, the dual variability source model better described the IO characteristics including phenomena such as changing distribution spread and skewness along the IO curve. Conclusions: MEP variability is best described by two sources that most likely separate variability in the initial excitation process from effects occurring later on. The new model enables more accurate and sensitive estimation of the IO curve characteristics, enhancing its power as a detection tool, and may apply to other brain stimulation modalities. Furthermore, it extracts new information from the IO data concerning the neural variability—information that has previously been treated as noise. PMID:24794287
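A toy simulation (arbitrary parameters, not the authors' fitted values) of the dual-variability idea, with one noise source acting before and one after the sigmoidal recruitment curve, shows how the spread and skewness of simulated MEPs change along the IO curve:

import numpy as np

rng = np.random.default_rng(1)

def io_curve(x):
    """Sigmoidal recruitment: stimulation strength -> median MEP (mV)."""
    return 5.0 / (1.0 + np.exp(-(x - 50.0) / 5.0))

def simulate_meps(x, n=2000, sx=3.0, sy=0.4):
    """Dual-source model: additive noise on the stimulus side (sx) and
    multiplicative log-normal noise on the output side (sy)."""
    x_eff = x + rng.normal(0.0, sx, n)                   # variability before the nonlinearity
    return io_curve(x_eff) * rng.lognormal(0.0, sy, n)   # variability after it

for x in (40, 50, 60):   # stimulation strengths across the IO curve
    mep = simulate_meps(x)
    skew = ((mep - mep.mean()) ** 3).mean() / mep.std() ** 3
    print(f"x={x}: mean={mep.mean():.2f} mV, CV={mep.std() / mep.mean():.2f}, skew={skew:.2f}")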
Rosato, Stefano; D'Errigo, Paola; Badoni, Gabriella; Fusco, Danilo; Perucci, Carlo A; Seccareccia, Fulvia
2008-08-01
The availability of two contemporary sources of information about coronary artery bypass graft (CABG) interventions allowed us 1) to verify the feasibility of performing outcome evaluation studies using administrative data sources, and 2) to compare hospital performance obtained using the CABG Project clinical database with hospital performance derived from current administrative data. Interventions recorded in the CABG Project were linked to the hospital discharge record (HDR) administrative database. Only the linked records were considered for subsequent analyses (46% of the total CABG Project). A new selected population, "clinical card-HDR", was then defined. Two independent risk-adjustment models were applied, each of them using information derived from one of the two different sources. Then, HDR information was supplemented with some patient preoperative conditions from the CABG clinical database. The two models were compared in terms of their adaptability to the data. Hospital performances identified by the two different models as significantly different from the mean were compared. In only 4 of the 13 hospitals considered for analysis, the results obtained using the HDR model did not completely overlap with those obtained by the CABG model. When comparing statistical parameters of the HDR model and the HDR model plus patient preoperative conditions, the latter showed the best adaptability to the data. In this "clinical card-HDR" population, hospital performance assessment obtained using information from the clinical database is similar to that derived from current administrative data. However, when risk-adjustment models built on administrative databases are supplemented with a few clinical variables, their statistical parameters improve and hospital performance assessment becomes more accurate.
Hong, Peilong; Li, Liming; Liu, Jianji; Zhang, Guoquan
2016-03-29
Young's double-slit or two-beam interference is of fundamental importance for understanding various interference effects, in which the stationary phase difference between two beams plays the key role in first-order coherence. Different from the case of first-order coherence, in high-order optical coherence the statistical behavior of the optical phase plays the key role. In this article, by employing a fundamental interfering configuration with two classical point sources, we show that the high-order optical coherence between two classical point sources can be actively designed by controlling the statistical behavior of the relative phase difference between the two point sources. Synchronous-position Nth-order subwavelength interference with an effective wavelength of λ/M was demonstrated, in which λ is the wavelength of the point sources and M is an integer not larger than N. Interestingly, we found that the synchronous-position Nth-order interference fringe fingerprints the statistical trace of the random phase fluctuation of the two classical point sources; it therefore provides an effective way to characterize the statistical properties of phase fluctuations for incoherent light sources.
Comparison of two trajectory based models for locating particle sources for two rural New York sites
NASA Astrophysics Data System (ADS)
Zhou, Liming; Hopke, Philip K.; Liu, Wei
Two back-trajectory-based statistical models, simplified quantitative transport bias analysis (QTBA) and residence-time weighted concentrations (RTWC), have been compared for their capabilities of identifying likely locations of source emissions contributing to observed particle concentrations at Potsdam and Stockton, New York. QTBA attempts to take into account the distribution of concentrations around the directions of the back trajectories. In the full QTBA approach, deposition processes (wet and dry) are also considered; simplified QTBA omits the consideration of deposition. It is best used with multiple-site data. Similarly, the RTWC approach uses concentrations measured at different sites along with the back trajectories to distribute the concentration contributions across the spatial domain of the trajectories. In this study, these models are used in combination with the source contribution values obtained by a previous positive matrix factorization analysis of particle composition data from Potsdam and Stockton. The six sources common to the two sites (sulfate, soil, zinc smelter, nitrate, wood smoke and copper smelter) were analyzed. The results of the two methods are consistent and locate large, clearly defined sources well. The RTWC approach can find more minor sources but may also give unrealistic estimates of source locations.
Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.
Mørk, Søren; Holmes, Ian
2012-03-01
Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length-modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two currently most used model structures is best performing in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts is freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.
Rainfall Threshold Assessment Corresponding to the Maximum Allowable Turbidity for Source Water.
Fan, Shu-Kai S; Kuan, Wen-Hui; Fan, Chihhao; Chen, Chiu-Yang
2016-12-01
This study aims to assess the upstream rainfall thresholds corresponding to the maximum allowable turbidity of source water, using monitoring data and artificial neural network computation. The Taipei Water Source Domain was selected as the study area, and the upstream rainfall records were collected for statistical analysis. Using analysis of variance (ANOVA), the cumulative rainfall records of one-day Ping-lin, two-day Ping-lin, two-day Tong-hou, one-day Guie-shan, and one-day Tai-ping (rainfall in the previous 24 or 48 hours at the named weather stations) were found to be the five most significant parameters for downstream turbidity development. An artificial neural network model was constructed to predict the downstream turbidity in the area investigated. The observed and model-calculated turbidity data were applied to assess the rainfall thresholds in the studied area. By setting preselected turbidity criteria, the upstream rainfall thresholds for these statistically determined rain gauge stations were calculated.
Sun, Gang; Hoff, Steven J; Zelle, Brian C; Nelson, Minda A
2008-12-01
It is vital to forecast gas and particulate matter concentrations and emission rates (GPCER) from livestock production facilities to assess the impact of airborne pollutants on human health, the ecological environment, and global warming. Modeling source air quality is a complex process because of abundant nonlinear interactions between GPCER and other factors. The objective of this study was to introduce statistical methods and the radial basis function (RBF) neural network to predict daily source air quality in Iowa swine deep-pit finishing buildings. The results show that four variables (outdoor and indoor temperature, animal units, and ventilation rates) were identified as relatively important model inputs using statistical methods. It can be further demonstrated that only two factors, the environment factor and the animal factor, were capable of explaining more than 94% of the total variability after performing principal component analysis. The introduction of fewer uncorrelated variables to the neural network would result in the reduction of the model structure complexity, minimize computation cost, and eliminate model overfitting problems. The obtained results of RBF network prediction were in good agreement with the actual measurements, with values of the correlation coefficient between 0.741 and 0.995 and very low values of the systematic performance indices for all the models. The good results indicated that the RBF network could be trained to model these highly nonlinear relationships. Thus, the RBF neural network technology combined with multivariate statistical methods is a promising tool for air pollutant emissions modeling.
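A minimal RBF-network sketch in the same spirit (synthetic inputs standing in for the environment and animal factors; centres from k-means and linear output weights from least squares) might look like:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Synthetic stand-ins for two principal-component inputs and an "emission rate" target
X = rng.uniform(-1, 1, size=(500, 2))
y = np.exp(-X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.05, 500)

X = StandardScaler().fit_transform(X)
centers = KMeans(n_clusters=15, n_init=10, random_state=0).fit(X).cluster_centers_
width = np.median(np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2))

def rbf_features(X):
    # Gaussian basis functions centred on the k-means centres
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(d / width) ** 2)

Phi = np.column_stack([rbf_features(X), np.ones(len(X))])  # hidden layer + bias
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)                # linear output weights

y_hat = Phi @ w
print("correlation coefficient:", np.corrcoef(y, y_hat)[0, 1].round(3))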
An experimental comparison of various methods of nearfield acoustic holography
Chelliah, Kanthasamy; Raman, Ganesh; Muehleisen, Ralph T.
2017-05-19
An experimental comparison of four different methods of nearfield acoustic holography (NAH) is presented in this study for planar acoustic sources. The four NAH methods considered in this study are based on: (1) the spatial Fourier transform, (2) the equivalent sources model, (3) boundary element methods and (4) statistically optimized NAH. Two-dimensional measurements were obtained at different distances in front of a tonal sound source, and the NAH methods were used to reconstruct the sound field at the source surface. The reconstructed particle velocity and acoustic pressure fields presented in this study showed that the equivalent-sources-model-based algorithm with Tikhonov regularization provided the best localization of the sources. Reconstruction errors were found to be smaller for the equivalent-sources-model-based algorithm and the statistically optimized NAH algorithm. The effect of hologram distance on the performance of the various algorithms is discussed in detail. The study also compares the computational time required by each algorithm. Four different regularization parameter choice methods were compared. The L-curve method provided more accurate reconstructions than generalized cross validation and the Morozov discrepancy principle. Finally, the performance of fixed-parameter regularization was comparable to that of the L-curve method.
Statistical methods and neural network approaches for classification of data from multiple sources
NASA Technical Reports Server (NTRS)
Benediktsson, Jon Atli; Swain, Philip H.
1990-01-01
Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A general problem with using conventional multivariate statistical approaches for classification of data of multiple types is that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability, but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution-free, since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem of how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.
Cosmic shear measurements with Dark Energy Survey Science Verification data
Becker, M. R.
2016-07-06
Here, we present measurements of weak gravitational lensing cosmic shear two-point statistics using Dark Energy Survey Science Verification data. We demonstrate that our results are robust to the choice of shear measurement pipeline, either ngmix or im3shape, and robust to the choice of two-point statistic, including both real and Fourier-space statistics. Our results pass a suite of null tests including tests for B-mode contamination and direct tests for any dependence of the two-point functions on a set of 16 observing conditions and galaxy properties, such as seeing, airmass, galaxy color, galaxy magnitude, etc. We use a large suite of simulations to compute the covariance matrix of the cosmic shear measurements and assign statistical significance to our null tests. We find that our covariance matrix is consistent with the halo model prediction, indicating that it has the appropriate level of halo sample variance. We also compare the same jackknife procedure applied to the data and the simulations in order to search for additional sources of noise not captured by the simulations. We find no statistically significant extra sources of noise in the data. The overall detection significance with tomography for our highest source density catalog is 9.7σ. Cosmological constraints from the measurements in this work are presented in a companion paper.
Ionospheric scintillation studies
NASA Technical Reports Server (NTRS)
Rino, C. L.; Freemouw, E. J.
1973-01-01
The diffracted field of a monochromatic plane wave was characterized by two complex correlation functions. For a Gaussian complex field, these quantities suffice to completely define the statistics of the field. Thus, one can in principle calculate the statistics of any measurable quantity in terms of the model parameters. The best data fits were achieved for intensity statistics derived under the Gaussian statistics hypothesis. The signal structure that achieved the best fit was nearly invariant with scintillation level and irregularity source (ionosphere or solar wind). It was characterized by the fact that more than 80% of the scattered signal power is in phase quadrature with the undeviated or coherent signal component. Thus, the Gaussian-statistics hypothesis is both convenient and accurate for channel modeling work.
Nevers, Meredith; Byappanahalli, Muruleedhara; Phanikumar, Mantha S.; Whitman, Richard L.
2016-01-01
Mathematical models have been widely applied to surface waters to estimate rates of settling, resuspension, flow, dispersion, and advection in order to calculate movement of particles that influence water quality. Of particular interest are the movement, survival, and persistence of microbial pathogens or their surrogates, which may contaminate recreational water, drinking water, or shellfish. Most models devoted to microbial water quality have been focused on fecal indicator organisms (FIO), which act as a surrogate for pathogens and viruses. Process-based modeling and statistical modeling have been used to track contamination events to source and to predict future events. The use of these two types of models requires different levels of expertise and input; process-based models rely on theoretical physical constructs to explain present conditions and biological distribution, while data-based, statistical models use extant paired data to do the same. The selection of the appropriate model and interpretation of results is critical to proper use of these tools in microbial source tracking. Integration of the modeling approaches could provide insight for tracking and predicting contamination events in real time. A review of modeling efforts reveals that process-based modeling has great promise for microbial source tracking efforts; further, combining the understanding of physical processes influencing FIO contamination developed with process-based models with molecular characterization of the population by gene-based (i.e., biological) or chemical markers may be an effective approach for locating sources and remediating contamination in order to better protect human health.
NASA Astrophysics Data System (ADS)
Arendt, Carli A.; Aciego, Sarah M.; Hetland, Eric A.
2015-05-01
The implementation of isotopic tracers as constraints on source contributions has become increasingly relevant to understanding Earth surface processes. Interpretation of these isotopic tracers has become more accessible with the development of Bayesian Monte Carlo (BMC) mixing models, which allow uncertainty in mixing end-members and provide methodology for systems with multicomponent mixing. This study presents an open-source multiple-isotope BMC mixing model that is applicable to Earth surface environments with sources exhibiting distinct end-member isotopic signatures. Our model is first applied to new δ18O and δD measurements from the Athabasca Glacier, which showed expected seasonal melt evolution trends, and the statistical relevance of the resulting fraction estimates was rigorously assessed. To highlight the broad applicability of our model to a variety of Earth surface environments and relevant isotopic systems, we expand our model to two additional case studies: deriving melt sources from δ18O, δD, and 222Rn measurements of Greenland Ice Sheet bulk water samples and assessing nutrient sources from ɛNd and 87Sr/86Sr measurements of Hawaiian soil cores. The model produces results for the Greenland Ice Sheet and Hawaiian soil data sets that are consistent with the originally published fractional contribution estimates. The advantage of this method is that it quantifies the error induced by variability in the end-member compositions, unrealized by the models previously applied to the above case studies. Results from all three case studies demonstrate the broad applicability of this statistical BMC isotopic mixing model for estimating source contribution fractions in a variety of Earth surface systems.
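A bare-bones version of the Monte Carlo core of such a mixing model, with invented δ18O end-member values rather than the Athabasca data, can be sketched as:

import numpy as np

rng = np.random.default_rng(3)

# Hypothetical end-member means and standard deviations (per mil)
snow_d18O   = (-20.0, 1.0)   # end-member A
ice_d18O    = (-14.0, 0.8)   # end-member B
sample_d18O = (-16.5, 0.3)   # measured bulk meltwater

n = 100_000
a = rng.normal(*snow_d18O, n)
b = rng.normal(*ice_d18O, n)
s = rng.normal(*sample_d18O, n)

f = (s - b) / (a - b)                 # fraction of end-member A in each draw
f = f[(f >= 0.0) & (f <= 1.0)]        # keep only physically possible mixtures

print(f"snow fraction: median {np.median(f):.2f}, "
      f"95% interval [{np.percentile(f, 2.5):.2f}, {np.percentile(f, 97.5):.2f}]")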
A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.
Dorie, Vincent; Harada, Masataka; Carnegie, Nicole Bohme; Hill, Jennifer
2016-09-10
When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Population activity statistics dissect subthreshold and spiking variability in V1.
Bányai, Mihály; Koman, Zsombor; Orbán, Gergő
2017-07-01
Response variability, as measured by fluctuating responses upon repeated performance of trials, is a major component of neural responses, and its characterization is key to interpret high dimensional population recordings. Response variability and covariability display predictable changes upon changes in stimulus and cognitive or behavioral state, providing an opportunity to test the predictive power of models of neural variability. Still, there is little agreement on which model to use as a building block for population-level analyses, and models of variability are often treated as a subject of choice. We investigate two competing models, the doubly stochastic Poisson (DSP) model assuming stochasticity at spike generation, and the rectified Gaussian (RG) model tracing variability back to membrane potential variance, to analyze stimulus-dependent modulation of both single-neuron and pairwise response statistics. Using a pair of model neurons, we demonstrate that the two models predict similar single-cell statistics. However, DSP and RG models have contradicting predictions on the joint statistics of spiking responses. To test the models against data, we build a population model to simulate stimulus change-related modulations in pairwise response statistics. We use single-unit data from the primary visual cortex (V1) of monkeys to show that while model predictions for variance are qualitatively similar to experimental data, only the RG model's predictions are compatible with joint statistics. These results suggest that models using Poisson-like variability might fail to capture important properties of response statistics. We argue that membrane potential-level modeling of stochasticity provides an efficient strategy to model correlations. NEW & NOTEWORTHY Neural variability and covariability are puzzling aspects of cortical computations. For efficient decoding and prediction, models of information encoding in neural populations hinge on an appropriate model of variability. Our work shows that stimulus-dependent changes in pairwise but not in single-cell statistics can differentiate between two widely used models of neuronal variability. Contrasting model predictions with neuronal data provides hints on the noise sources in spiking and provides constraints on statistical models of population activity. Copyright © 2017 the American Physiological Society.
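An illustrative simulation (arbitrary parameters, not fitted to V1 data) of the two competing noise models for a pair of neurons receiving correlated drive; the single-cell statistics are similar by construction, and it is the joint spike-count correlation across drive levels that separates the models:

import numpy as np

rng = np.random.default_rng(4)
n_trials, rho = 20_000, 0.5
cov = np.array([[1.0, rho], [rho, 1.0]])

def spike_corr(counts):
    return np.corrcoef(counts[:, 0], counts[:, 1])[0, 1]

for drive in (0.0, 1.0, 2.0):   # increasing stimulus strength
    # shared, correlated drive to the two cells
    z = rng.multivariate_normal([drive, drive], cov, size=n_trials)

    # Doubly stochastic Poisson: correlated log-rates, then Poisson spiking
    dsp = rng.poisson(np.exp(z))

    # Rectified Gaussian: correlated membrane potentials, rectified into counts
    rg = np.maximum(z, 0.0) * 5.0

    print(f"drive={drive}: count corr DSP={spike_corr(dsp):.2f}, RG={spike_corr(rg):.2f}")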
NASA Astrophysics Data System (ADS)
Koo, Bryan Bonsuk
Electricity generation from non-hydro renewable sources has increased rapidly in the last decade. For example, Renewable Energy Sources for Electricity (RES-E) generating capacity in the U.S. almost doubled in the three years from 2009 to 2012. Multiple papers point out that RES-E policies implemented by state governments play a crucial role in increasing RES-E generation or capacity. This study examines the effects of state RES-E policies on state RES-E generating capacity, using a fixed effects model. The research employs panel data from the 50 states and the District of Columbia for the period 1990 to 2011, and uses a two-stage approach to control for the endogeneity embedded in the policies adopted by state governments, and a Prais-Winsten estimator to correct for autocorrelation in the panel data. The analysis finds that Renewable Portfolio Standards (RPS) and Net-metering are significantly and positively associated with RES-E generating capacity, but neither Public Benefit Funds nor the Mandatory Green Power Option has a statistically significant relation to RES-E generating capacity. Results of the two-stage model are quite different from models that do not employ predicted policy variables. Analysis using non-predicted variables finds that RPS and Net-metering policy are statistically insignificant and negatively associated with RES-E generating capacity. On the other hand, Green Energy Purchasing policy is insignificant in the two-stage model, but significant in the model without predicted values.
Development of a statistical oil spill model for risk assessment.
Guo, Weijun
2017-11-01
To gain a better understanding of the impacts from potential risk sources, we developed an oil spill model using a probabilistic method, which simulates numerous oil spill trajectories under varying environmental conditions. The statistical results were quantified from hypothetical oil spills under multiple scenarios, including area-affected probability, mean oil slick thickness, and duration of water surface exposure to floating oil. These three sub-indices, together with marine area vulnerability, are merged to compute a composite index characterizing the spatial distribution of risk degree. The integral of the index can be used to identify the overall risk from an emission source. The developed model has been successfully applied to the comparison and selection of an appropriate oil port construction location adjacent to a marine protected area for Phoca largha in China. The results highlight the importance of evaluating candidate sites before project construction, since risk estimates from two adjacent potential sources may turn out to be significantly different owing to hydrodynamic conditions and eco-environmental sensitivity. Copyright © 2017. Published by Elsevier Ltd.
Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)
NASA Astrophysics Data System (ADS)
Kasibhatla, P.
2004-12-01
In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities, and inverse source estimates are derived for fixed values of the pdf parameters. While the advantage of this approach is that closed-form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
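A compact random-walk Metropolis sketch for a toy two-source linear tracer inversion (synthetic transport matrix and observations, not TransCom3 data) illustrates how a non-Gaussian prior is handled as easily as a Gaussian one:

import numpy as np

rng = np.random.default_rng(5)

H = np.array([[0.8, 0.2],        # toy source-receptor (transport) sensitivities
              [0.3, 0.7],
              [0.5, 0.5]])
s_true = np.array([2.0, 1.0])    # "true" source strengths
sigma_obs = 0.1
y = H @ s_true + rng.normal(0, sigma_obs, 3)   # synthetic observations

def log_post(s):
    if np.any(s <= 0):
        return -np.inf                        # non-negativity constraint
    resid = y - H @ s
    log_like = -0.5 * np.sum((resid / sigma_obs) ** 2)
    log_prior = -np.sum(s / 5.0)              # exponential prior, mean 5 (non-Gaussian)
    return log_like + log_prior

s, lp, chain = np.array([1.0, 1.0]), None, []
lp = log_post(s)
for _ in range(50_000):
    prop = s + rng.normal(0, 0.05, 2)         # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
        s, lp = prop, lp_prop
    chain.append(s)

chain = np.array(chain[10_000:])              # drop burn-in
print("posterior means:", chain.mean(axis=0).round(2),
      "+/-", chain.std(axis=0).round(2))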
An Improved Statistical Point-source Foreground Model for the Epoch of Reionization
NASA Astrophysics Data System (ADS)
Murray, S. G.; Trott, C. M.; Jordan, C. H.
2017-08-01
We present a sophisticated statistical point-source foreground model for low-frequency radio Epoch of Reionization (EoR) experiments using the 21 cm neutral hydrogen emission line. Motivated by our understanding of the low-frequency radio sky, we enhance the realism of two model components compared with existing models: the source count distributions as a function of flux density and spatial position (source clustering), extending current formalisms for the foreground covariance of 2D power-spectral modes in 21 cm EoR experiments. The former we generalize to an arbitrarily broken power law, and the latter to an arbitrary isotropically correlated field. This paper presents expressions for the modified covariance under these extensions, and shows that for a more realistic source spatial distribution, extra covariance arises in the EoR window that was previously unaccounted for. Failure to include this contribution can yield bias in the final power spectrum and underestimate uncertainties, potentially leading to a false detection of signal. The extent of this effect is uncertain, owing to ignorance of physical model parameters, but we show that it is dependent on the relative abundance of faint sources, to the effect that our extension will become more important for future deep surveys. Finally, we show that under some parameter choices, ignoring source clustering can lead to false detections on large scales, due to both the induced bias and an artificial reduction in the estimated measurement uncertainty.
Statistical aspects of carbon fiber risk assessment modeling. [fire accidents involving aircraft
NASA Technical Reports Server (NTRS)
Gross, D.; Miller, D. R.; Soland, R. M.
1980-01-01
The probabilistic and statistical aspects of the carbon fiber risk assessment modeling of fire accidents involving commercial aircraft are examined. Three major sources of uncertainty in the modeling effort are identified. These are: (1) imprecise knowledge in establishing the model; (2) parameter estimation; and (3) Monte Carlo sampling error. All three sources of uncertainty are treated and statistical procedures are utilized and/or developed to control them wherever possible.
NASA Astrophysics Data System (ADS)
Nishiura, Takanobu; Nakamura, Satoshi
2002-11-01
It is very important to capture distant-talking speech for a hands-free speech interface with high quality. A microphone array is an ideal candidate for this purpose. However, this approach requires localizing the target talker. Conventional talker localization algorithms in multiple-sound-source environments not only have difficulty localizing the multiple sound sources accurately, but also have difficulty localizing the target talker among known multiple sound source positions. To cope with these problems, we propose a new talker localization algorithm consisting of two algorithms. One is a DOA (direction of arrival) estimation algorithm for multiple sound source localization based on the CSP (cross-power spectrum phase) coefficient addition method. The other is a statistical sound source identification algorithm based on a GMM (Gaussian mixture model) for localizing the target talker position among the localized multiple sound sources. In this paper, we particularly focus on the talker localization performance based on the combination of these two algorithms with a microphone array. We conducted evaluation experiments in real noisy reverberant environments. As a result, we confirmed that multiple sound signals can be identified accurately as "speech" or "non-speech" by the proposed algorithm. [Work supported by ATR and MEXT of Japan.]
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.
Large-scale fluctuations in the cosmic ionizing background: the impact of beamed source emission
NASA Astrophysics Data System (ADS)
Suarez, Teresita; Pontzen, Andrew
2017-12-01
When modelling the ionization of gas in the intergalactic medium after reionization, it is standard practice to assume a uniform radiation background. This assumption is not always appropriate; models with radiative transfer show that large-scale ionization rate fluctuations can have an observable impact on statistics of the Lyman α forest. We extend such calculations to include beaming of sources, which has previously been neglected but which is expected to be important if quasars dominate the ionizing photon budget. Beaming has two effects: first, the physical number density of ionizing sources is enhanced relative to that directly observed; and secondly, the radiative transfer itself is altered. We calculate both effects in a hard-edged beaming model where each source has a random orientation, using an equilibrium Boltzmann hierarchy in terms of spherical harmonics. By studying the statistical properties of the resulting ionization rate and H I density fields at redshift z ∼ 2.3, we find that the two effects partially cancel each other; combined, they constitute a maximum 5 per cent correction to the power spectrum P_HI(k) at k = 0.04 h Mpc^-1. On very large scales (k < 0.01 h Mpc^-1) the source density renormalization dominates; it can reduce, by an order of magnitude, the contribution of ionizing shot noise to the intergalactic H I power spectrum. The effects of beaming should be considered when interpreting future observational data sets.
NASA Astrophysics Data System (ADS)
Garg, Saryu; Sinha, Baerbel
2017-10-01
This study uses two newly developed statistical source apportionment models, MuSAM and MuReSAM, to perform quantitative statistical source apportionment of PM10 at multiple receptor sites in South Hessen. MuSAM uses multi-site back trajectory data to quantify the contribution of long-range transport, while MuReSAM uses wind speed and direction as proxy for regional transport and quantifies the contribution of regional source areas. On average, between 7.8 and 9.1 μg/m3 of PM10 (∼50%) at receptor sites in South Hessen is contributed by long-range transport. The dominant source regions are Eastern, South Eastern, and Southern Europe. 32% of the PM10 at receptor sites in South Hessen is contributed by regional source areas (2.8-9.41 μg/m3). This fraction varies from <20% at remote sites to >40% for urban stations. Sources located within a 2 km radius around the receptor site are responsible for 7%-20% of the total PM10 mass (0.7-4.4 μg/m3). The perturbation study of the traffic flow due to the closing and reopening of the Schiersteiner Brücke revealed that the contribution of the bridge to PM10 mass loadings at two nearby receptor sites increased by approximately 120% after it reopened and became a bottleneck, although in absolute terms, the increase is small.
NASA Astrophysics Data System (ADS)
Sturtz, Timothy M.
Source apportionment models attempt to untangle the relationship between pollution sources and the impacts at downwind receptors. Two frameworks of source apportionment models exist: source-oriented and receptor-oriented. Source-based apportionment models use presumed emissions and atmospheric processes to estimate the downwind source contributions. Conversely, receptor-based models leverage speciated concentration data from downwind receptors and apply statistical methods to predict source contributions. Integration of both source-oriented and receptor-oriented models could lead to a better understanding of the implications sources have on the environment and society. The research presented here investigated three different types of constraints applied to the Positive Matrix Factorization (PMF) receptor model within the framework of the Multilinear Engine (ME-2): element ratio constraints, spatial separation constraints, and chemical transport model (CTM) source attribution constraints. PM10-2.5 mass and trace element concentrations were measured in Winston-Salem, Chicago, and St. Paul at up to 60 sites per city during two different seasons in 2010. PMF was used to explore the underlying sources of variability. Information on previously reported PM10-2.5 tire and brake wear profiles was used to constrain these features in PMF by prior specification of selected species ratios. We also modified PMF to allow for combining the measurements from all three cities into a single model while preserving city-specific soil features. Relatively minor differences were observed between model predictions with and without the prior ratio constraints, increasing confidence in our ability to identify separate brake wear and tire wear features. Using separate data, source contributions to total fine particle carbon predicted by a CTM were incorporated into the PMF receptor model to form a receptor-oriented hybrid model. The level of influence of the CTM versus traditional PMF was varied using a weighting parameter applied to an objective function as implemented in ME-2. The resulting hybrid model was used to quantify the contributions of total carbon from both wildfires and biogenic sources at two Interagency Monitoring of Protected Visual Environments monitoring sites, Monture and Sula Peak, Montana, from 2006 through 2008.
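PMF itself is normally run with uncertainty weighting and constraints in EPA PMF or ME-2; as a rough, unconstrained stand-in, a non-negative factorization of a samples-by-species matrix can be sketched with scikit-learn (synthetic data, no ratio or CTM constraints):

import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(6)

# Synthetic data: 200 samples x 10 species generated from 3 hidden "sources"
true_profiles = rng.dirichlet(np.ones(10), size=3)          # source profiles
true_contrib = rng.gamma(2.0, 1.0, size=(200, 3))           # source contributions
X = true_contrib @ true_profiles + rng.normal(0, 0.01, (200, 10)).clip(0)

model = NMF(n_components=3, init="nndsvda", max_iter=1000, random_state=0)
G = model.fit_transform(X)      # factor contributions (analogous to the PMF G matrix)
F = model.components_           # factor profiles (analogous to the PMF F matrix)

# Normalize profiles so each factor's species fractions sum to one
F_norm = F / F.sum(axis=1, keepdims=True)
print("recovered profiles (rows sum to 1):")
print(F_norm.round(2))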
Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.
2016-01-01
Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.
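A sketch (synthetic numbers only) of the kind of parsimonious turbidity-based statistical model suitable for real-time prediction described above:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)

# Synthetic paired observations: turbidity (NTU) and E. coli (CFU/100 mL)
turbidity = rng.lognormal(mean=1.5, sigma=0.7, size=300)
log_ecoli = 1.2 + 0.9 * np.log10(turbidity) + rng.normal(0, 0.3, 300)

model = LinearRegression().fit(np.log10(turbidity).reshape(-1, 1), log_ecoli)

new_turbidity = np.array([[np.log10(25.0)]])        # a fresh turbidity reading
pred = 10 ** model.predict(new_turbidity)[0]
print(f"predicted E. coli at 25 NTU: {pred:.0f} CFU/100 mL")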
Statistical mechanics of shell models for two-dimensional turbulence
NASA Astrophysics Data System (ADS)
Aurell, E.; Boffetta, G.; Crisanti, A.; Frick, P.; Paladin, G.; Vulpiani, A.
1994-12-01
We study shell models that conserve the analogs of energy and enstrophy and hence are designed to mimic fluid turbulence in two-dimensions (2D). The main result is that the observed state is well described as a formal statistical equilibrium, closely analogous to the approach to two-dimensional ideal hydrodynamics of Onsager [Nuovo Cimento Suppl. 6, 279 (1949)], Hopf [J. Rat. Mech. Anal. 1, 87 (1952)], and Lee [Q. Appl. Math. 10, 69 (1952)]. In the presence of forcing and dissipation we observe a forward flux of enstrophy and a backward flux of energy. These fluxes can be understood as mean diffusive drifts from a source to two sinks in a system which is close to local equilibrium with Lagrange multipliers (``shell temperatures'') changing slowly with scale. This is clear evidence that the simplest shell models are not adequate to reproduce the main features of two-dimensional turbulence. The dimensional predictions on the power spectra from a supposed forward cascade of enstrophy and from one branch of the formal statistical equilibrium coincide in these shell models in contrast to the corresponding predictions for the Navier-Stokes and Euler equations in 2D. This coincidence has previously led to the mistaken conclusion that shell models exhibit a forward cascade of enstrophy. We also study the dynamical properties of the models and the growth of perturbations.
A model for two-dimensional bursty turbulence in magnetized plasmas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Servidio, Sergio; Primavera, Leonardo; Carbone, Vincenzo
2008-01-15
The nonlinear dynamics of two-dimensional electrostatic interchange modes in a magnetized plasma is investigated through a simple model that replaces the instability mechanism due to magnetic field curvature by an external source of vorticity and mass. Simulations in a cylindrical domain, with a spatially localized and randomized source at the center of the domain, reveal the eruption of mushroom-shaped bursts that propagate radially and are absorbed by the boundaries. Burst sizes and the interburst waiting times exhibit power-law statistics, which indicates long-range interburst correlations, similar to what has been found in sandpile models for avalanching systems. It is shown from the simulations that the dynamics can be characterized by a Yaglom relation for the third-order mixed moment involving the particle number density as a passive scalar and the E×B drift velocity, and hence that the burst phenomenology can be described within the framework of turbulence theory. Statistical features are qualitatively in agreement with experiments of intermittent transport at the edge of plasma devices, and suggest that essential features such as transport can be described by this simple model of bursty turbulence.
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software called ABrox is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Last but not least, throughout the paper, we introduce ABrox using the accompanying graphical user interface.
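An illustrative rejection-ABC sketch (not the ABrox implementation) of approximating a Bayes factor between an "equal means" and a "different means" model for two samples:

import numpy as np

rng = np.random.default_rng(8)

n_per_group = 50
obs_diff = 0.5          # observed difference between the two sample means

def acceptance_rate(model, n_sims=1_000_000, eps=0.05):
    """Rejection ABC: draw parameters from the prior, simulate the summary
    statistic, and accept draws whose summary lands within eps of the data."""
    mu1 = rng.normal(0.0, 1.0, n_sims)
    mu2 = mu1 if model == "equal means" else rng.normal(0.0, 1.0, n_sims)
    # difference of two sample means of n_per_group unit-variance observations
    sim_diff = rng.normal(mu1 - mu2, np.sqrt(2.0 / n_per_group))
    return np.mean(np.abs(sim_diff - obs_diff) < eps)

p0 = acceptance_rate("equal means")
p1 = acceptance_rate("different means")
print(f"approximate Bayes factor BF10 ~ {p1 / p0:.2f}")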
Doiron, Dany; Marcon, Yannick; Fortier, Isabel; Burton, Paul; Ferretti, Vincent
2017-01-01
Motivation: Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for epidemiological data management, harmonization and dissemination. Implementation: Opal and Mica are two standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide web services and modern user interfaces to access them. General features: Opal allows users to import, manage, annotate and harmonize study data. Mica is used to build searchable web portals disseminating study and variable metadata. When used conjointly, Mica users can securely query and retrieve summary statistics on geographically dispersed Opal servers in real-time. Integration with the DataSHIELD approach allows conducting more complex federated analyses involving statistical models. Availability: Opal and Mica are open-source and freely available at www.obiba.org under a General Public License (GPL) version 3, and the metadata models and taxonomies that accompany them are available under a Creative Commons licence. PMID:29025122
A Simple Model of Pulsed Ejector Thrust Augmentation
NASA Technical Reports Server (NTRS)
Wilson, Jack; Deloof, Richard L. (Technical Monitor)
2003-01-01
A simple model of thrust augmentation from a pulsed source is described. In the model it is assumed that the flow into the ejector is quasi-steady, and can be calculated using potential flow techniques. The velocity of the flow is related to the speed of the starting vortex ring formed by the jet. The vortex ring properties are obtained from the slug model, knowing the jet diameter, speed and slug length. The model, when combined with experimental results, predicts an optimum ejector radius for thrust augmentation. Data on pulsed ejector performance for comparison with the model was obtained using a shrouded Hartmann-Sprenger tube as the pulsed jet source. A statistical experiment, in which ejector length, diameter, and nose radius were independent parameters, was performed at four different frequencies. These frequencies corresponded to four different slug length to diameter ratios, two below cut-off, and two above. Comparison of the model with the experimental data showed reasonable agreement. Maximum pulsed thrust augmentation is shown to occur for a pulsed source with slug length to diameter ratio equal to the cut-off value.
DeltaSA tool for source apportionment benchmarking, description and sensitivity analysis
NASA Astrophysics Data System (ADS)
Pernigotti, D.; Belis, C. A.
2018-05-01
DeltaSA is an R-package and a Java on-line tool developed at the EC-Joint Research Centre to assist and benchmark source apportionment applications. Its key functionalities support two critical tasks in this kind of study: the assignment of a factor to a source in factor analytical models (source identification) and the model performance evaluation. The source identification is based on the similarity between a given factor and source chemical profiles from public databases. The model performance evaluation is based on statistical indicators used to compare model output with reference values generated in intercomparison exercises. The reference values are calculated as the ensemble average of the results reported by participants that have passed a set of testing criteria based on chemical profiles and time series similarity. In this study, a sensitivity analysis of the model performance criteria is performed using the results of a synthetic dataset where "a priori" references are available. The consensus modulated standard deviation punc gives the best choice for the model performance evaluation when a conservative approach is adopted.
Size Matters: What Are the Characteristic Source Areas for Urban Planning Strategies?
Fan, Chao; Myint, Soe W.; Wang, Chenghao
2016-01-01
Urban environmental measurements and observational statistics should reflect the properties generated over an adjacent area of adequate length where homogeneity is usually assumed. The determination of this characteristic source area that gives sufficient representation of the horizontal coverage of a sensing instrument or the fetch of transported quantities is of critical importance to guide the design and implementation of urban landscape planning strategies. In this study, we aim to unify two different methods for estimating source areas, viz. the statistical correlation method commonly used by geographers for landscape fragmentation and the mechanistic footprint model used by meteorologists for atmospheric measurements. Good agreement was found in the intercomparison of the source-area estimates from the two methods, based on 2-m air temperature measurements collected using a network of weather stations. The results can be extended to shed new light on urban planning strategies, such as the use of urban vegetation for heat mitigation. In general, a sizable patch of landscape is required in order to play an effective role in regulating the local environment, proportional to the height at which stakeholders’ interest is mainly concerned. PMID:27832111
Investigation into the performance of different models for predicting stutter.
Bright, Jo-Anne; Curran, James M; Buckleton, John S
2013-07-01
In this paper we have examined five possible models for the behaviour of the stutter ratio, SR. These were two log-normal models, two gamma models, and a two-component normal mixture model. A two-component normal mixture model was chosen with different behaviours of variance; at each locus SR was described with two distributions, both with the same mean. The distributions have different variances: one for the majority of the observations and a second for the less well-behaved ones. We apply each model to a set of known single-source Identifiler™, NGM SElect™ and PowerPlex® 21 DNA profiles to show the applicability of our findings to different data sets. SR values determined from the single-source profiles were compared to the calculated SR after application of the models. The model performance was tested by calculating the log-likelihoods and comparing the difference in Akaike information criterion (AIC). The two-component normal mixture model systematically outperformed all others, despite the increase in the number of parameters. This model, as well as performing well statistically, has intuitive appeal for forensic biologists and could be implemented in an expert system with a continuous method for DNA interpretation. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
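The AIC comparison described above can be sketched in a few lines of Python; the code fits a log-normal model and a common-mean, two-variance normal mixture to simulated stutter ratios by maximum likelihood, with simulated data and starting values that are assumptions rather than the published analysis.

import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
# Synthetic stutter ratios: mostly well-behaved, a minority with inflated variance.
sr = np.concatenate([rng.normal(0.08, 0.01, 450), rng.normal(0.08, 0.03, 50)])

def nll_lognormal(p):
    mu, sigma = p
    if sigma <= 0:
        return np.inf
    return -np.sum(stats.lognorm.logpdf(sr, s=sigma, scale=np.exp(mu)))

def nll_mixture(p):
    mu, s1, s2, w = p
    if not (0 < w < 1 and s1 > 0 and s2 > 0):
        return np.inf
    pdf = w * stats.norm.pdf(sr, mu, s1) + (1 - w) * stats.norm.pdf(sr, mu, s2)
    return -np.sum(np.log(pdf))

fit_ln = optimize.minimize(nll_lognormal, x0=[np.log(0.08), 0.2], method="Nelder-Mead")
fit_mx = optimize.minimize(nll_mixture, x0=[0.08, 0.01, 0.03, 0.9], method="Nelder-Mead")

aic = lambda nll, k: 2 * k + 2 * nll                 # AIC = 2k + 2*negative log-likelihood
print("log-normal AIC:", aic(fit_ln.fun, 2))
print("two-component mixture AIC:", aic(fit_mx.fun, 4))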
Nowcasting influenza outbreaks using open-source media report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ray, Jaideep; Brownstein, John S.
We construct and verify a statistical method to nowcast influenza activity from a time-series of the frequency of reports concerning influenza-related topics. Such reports are published electronically by both public health organizations and newspapers/media sources, and thus can be harvested easily via web crawlers. Since media reports are timely, whereas reports from public health organizations are delayed by at least two weeks, using timely, open-source data to compensate for the lag in "official" reports can be useful. We use morbidity data from networks of sentinel physicians (both the Centers for Disease Control's ILINet and France's Sentinelles network) as the gold standard of influenza-like illness (ILI) activity. The time-series of media reports is obtained from HealthMap (http://healthmap.org). We find that the time-series of media reports shows some correlation (∼0.5) with ILI activity; further, this can be leveraged into an autoregressive moving average model with exogenous inputs (ARMAX model) to nowcast ILI activity. We find that the ARMAX models have more predictive skill compared to autoregressive (AR) models fitted to ILI data, i.e., it is possible to exploit the information content in the open-source data. We also find that when the open-source data are non-informative, the ARMAX models reproduce the performance of AR models. The statistical models are tested on data from the 2009 swine-flu outbreak as well as the mild 2011-2012 influenza season in the U.S.A.
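A hedged sketch of the ARMAX nowcasting step, using statsmodels' ARIMA with an exogenous regressor; the weekly series below are simulated placeholders rather than HealthMap or ILINet data, and the model order is an assumption.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
weeks = pd.date_range("2011-10-01", periods=104, freq="W")

# Simulated stand-ins: media-report counts (timely, exogenous input) and ILI activity (target).
media = np.clip(rng.normal(50, 15, len(weeks)) + 30 * np.sin(np.arange(len(weeks)) / 8), 0, None)
ili = 0.02 * media + rng.normal(0, 0.3, len(weeks)) + 1.0

ili = pd.Series(ili, index=weeks)
media = pd.Series(media, index=weeks)

# ARMAX: an ARMA(2,1) model for ILI with the media-report series as an exogenous input.
armax = ARIMA(ili[:-4], exog=media[:-4], order=(2, 0, 1)).fit()

# Nowcast the last four weeks from the timely media series (official ILI assumed delayed).
nowcast = armax.forecast(steps=4, exog=media[-4:].to_numpy().reshape(-1, 1))
print(nowcast)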
Modeling of two-particle femtoscopic correlations at top RHIC energy
NASA Astrophysics Data System (ADS)
Ermakov, N.; Nigmatkulov, G.
2017-01-01
The spatial and temporal characteristics of the particle-emitting source produced in particle and/or nuclear collisions can be measured by using two-particle femtoscopic correlations. These correlations arise due to quantum statistics, Coulomb and strong final state interactions. In this paper we report on calculations of like-sign pion femtoscopic correlations in p+p, p+Au, d+Au, and Au+Au collisions at top RHIC energy using the Ultra-relativistic Quantum Molecular Dynamics (UrQMD) model. Three-dimensional correlation functions are constructed using the Bertsch-Pratt parametrization of the two-particle relative momentum. The correlation functions are studied in several transverse mass ranges. The emitting source radii of charged pions, Rout, Rside, Rlong, are obtained from Gaussian fits to the correlation functions and compared to data from the STAR and PHENIX experiments.
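A minimal Python sketch of extracting Bertsch-Pratt radii by fitting the Gaussian form to a correlation function; the synthetic correlation data, grid, and units convention are assumptions for the example, not UrQMD output.

import numpy as np
from scipy.optimize import curve_fit

# Bertsch-Pratt Gaussian form: C(q) = 1 + lambda * exp(-(Ro^2 qo^2 + Rs^2 qs^2 + Rl^2 ql^2)),
# with the hbar*c conversion between fm and GeV/c absorbed into the radii for brevity.
def cf(q, lam, Ro, Rs, Rl):
    qo, qs, ql = q
    return 1.0 + lam * np.exp(-(Ro**2 * qo**2 + Rs**2 * qs**2 + Rl**2 * ql**2))

# Synthetic correlation function on a coarse relative-momentum grid (assumed values).
axis = np.linspace(-0.1, 0.1, 11)
qo, qs, ql = np.meshgrid(axis, axis, axis, indexing="ij")
q = np.vstack([qo.ravel(), qs.ravel(), ql.ravel()])
true = cf(q, 0.5, 25.0, 20.0, 30.0)
data = true + np.random.default_rng(3).normal(0, 0.005, true.size)

# Gaussian fit returning the intercept parameter and the three source radii.
popt, pcov = curve_fit(cf, q, data, p0=[0.4, 20.0, 20.0, 20.0])
print("lambda, Rout, Rside, Rlong =", popt)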
Photon statistics as an interference phenomenon.
Mehringer, Thomas; Mährlein, Simon; von Zanthier, Joachim; Agarwal, Girish S
2018-05-15
Interference of light fields, first postulated by Young, is one of the fundamental pillars of physics. Dirac extended this observation to the quantum world by stating that each photon interferes only with itself. A precondition for interference to occur is that no welcher-weg (which-path) information labels the paths the photon takes; otherwise, the interference vanishes. This remains true, even if two-photon interference is considered, e.g., in the Hong-Ou-Mandel experiment. Here, the two photons interfere only if they are indistinguishable, e.g., in frequency, momentum, polarization, and time. Less known is the fact that two-photon interference and photon indistinguishability also determine the photon statistics in the overlapping light fields of two independent sources. As a consequence, measuring the photon statistics in the far field of two independent sources reveals the degree of indistinguishability of the emitted photons. In this Letter, we prove this statement in theory using a quantum mechanical treatment. We also demonstrate the outcome experimentally with a simple setup consisting of two statistically independent thermal light sources with adjustable polarizations. We find that the photon statistics indeed vary as a function of the polarization settings, the latter determining the degree of welcher-weg information of the photons emanating from the two sources.
Determining the sources of fine-grained sediment using the Sediment Source Assessment Tool (Sed_SAT)
Gorman Sanisaca, Lillian E.; Gellis, Allen C.; Lorenz, David L.
2017-07-27
A sound understanding of sources contributing to instream sediment flux in a watershed is important when developing total maximum daily load (TMDL) management strategies designed to reduce suspended sediment in streams. Sediment fingerprinting and sediment budget approaches are two techniques that, when used jointly, can qualify and quantify the major sources of sediment in a given watershed. The sediment fingerprinting approach uses trace element concentrations from samples in known potential source areas to determine a clear signature of each potential source. A mixing model is then used to determine the relative source contribution to the target suspended sediment samples. The computational steps required to apportion sediment for each target sample are quite involved and time intensive, a problem the Sediment Source Assessment Tool (Sed_SAT) addresses. Sed_SAT is a user-friendly statistical model that guides the user through the necessary steps in order to quantify the relative contributions of sediment sources in a given watershed. The model is written using the statistical software R (R Core Team, 2016b) and utilizes Microsoft Access® as a user interface but requires no prior knowledge of R or Microsoft Access® to run the model successfully. Sed_SAT identifies outliers, corrects for differences in size and organic content in the source samples relative to the target samples, evaluates the conservative behavior of tracers used in fingerprinting by applying a “Bracket Test,” identifies tracers with the highest discriminatory power, and provides robust error analysis through a Monte Carlo simulation following the mixing model. Quantifying sediment source contributions using the sediment fingerprinting approach provides local, State, and Federal land management agencies with important information needed to implement effective strategies to reduce sediment. Sed_SAT is designed to assist these agencies in applying the sediment fingerprinting approach to quantify sediment sources in the sediment TMDL framework.
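The mixing-model and Monte Carlo steps can be illustrated with a small constrained least-squares sketch; this is a generic Python illustration, not Sed_SAT itself, and the source tracer concentrations, uncertainties, and target values are invented.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Assumed tracer means and standard deviations for three sources (rows) and four tracers (cols).
src_mean = np.array([[12.0, 55.0, 0.8, 210.0],
                     [30.0, 40.0, 1.5, 150.0],
                     [22.0, 70.0, 0.5, 300.0]])
src_sd = 0.1 * src_mean
target = np.array([20.0, 55.0, 0.9, 220.0])        # tracer concentrations in the target sediment

def unmix(sources, target):
    """Proportions minimizing the sum of squared relative residuals, with a sum-to-one constraint."""
    obj = lambda p: np.sum(((target - p @ sources) / target) ** 2)
    cons = ({"type": "eq", "fun": lambda p: p.sum() - 1.0},)
    res = minimize(obj, x0=np.full(3, 1 / 3), bounds=[(0, 1)] * 3, constraints=cons)
    return res.x

# Monte Carlo: resample source means to propagate tracer uncertainty into the apportionment.
draws = np.array([unmix(rng.normal(src_mean, src_sd), target) for _ in range(500)])
print("mean contributions:", draws.mean(axis=0))
print("2.5-97.5% intervals:", np.percentile(draws, [2.5, 97.5], axis=0).T)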
Young, Robin L; Weinberg, Janice; Vieira, Verónica; Ozonoff, Al; Webster, Thomas F
2010-07-19
A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM) which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e. is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics. This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing logodds with distance from the point. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three Cases. The GAM permutation testing methods provide a regression-based alternative to the spatial scan statistic. Across all hypotheses examined in this research, the GAM methods had competing or greater power estimates and sensitivities exceeding that of the spatial scan statistic.
2010-01-01
Background A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM) which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e. is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics. Results This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing logodds with distance from the point. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three Cases. Conclusions The GAM permutation testing methods provide a regression-based alternative to the spatial scan statistic. Across all hypotheses examined in this research, the GAM methods had competing or greater power estimates and sensitivities exceeding that of the spatial scan statistic. PMID:20642827
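A simplified Python sketch of the permutation-testing idea; a k-nearest-neighbour smoother stands in for the bivariate LOESS term of the GAM, and the case-control data, point source, and test statistic are assumptions for the example.

import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(5)

# Simulated case-control data: risk increases toward a point source at the origin.
n = 600
xy = rng.uniform(-1, 1, (n, 2))
logodds = -0.5 + 1.5 * np.exp(-4 * (xy ** 2).sum(axis=1))
case = rng.random(n) < 1 / (1 + np.exp(-logodds))

def smoothed_deviation(xy, case, k=50):
    """Max absolute deviation of locally smoothed risk from the overall case fraction.
    A crude stand-in for the GAM smoothing-term statistic."""
    tree = cKDTree(xy)
    _, idx = tree.query(xy, k=k)
    local = case[idx].mean(axis=1)
    return np.max(np.abs(local - case.mean()))

obs = smoothed_deviation(xy, case)

# Permutation test: shuffle case labels over locations to build the null distribution.
null = np.array([smoothed_deviation(xy, rng.permutation(case)) for _ in range(499)])
p_value = (1 + np.sum(null >= obs)) / (1 + null.size)
print(f"observed statistic {obs:.3f}, permutation p-value {p_value:.3f}")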
NASA Astrophysics Data System (ADS)
Aab, A.; Abreu, P.; Aglietta, M.; Albuquerque, I. F. M.; Allekotte, I.; Almela, A.; Alvarez Castillo, J.; Alvarez-Muñiz, J.; Anastasi, G. A.; Anchordoqui, L.; Andrada, B.; Andringa, S.; Aramo, C.; Arsene, N.; Asorey, H.; Assis, P.; Avila, G.; Badescu, A. M.; Balaceanu, A.; Barbato, F.; Barreira Luz, R. J.; Beatty, J. J.; Becker, K. H.; Bellido, J. A.; Berat, C.; Bertaina, M. E.; Bertou, X.; Biermann, P. L.; Biteau, J.; Blaess, S. G.; Blanco, A.; Blazek, J.; Bleve, C.; Boháčová, M.; Bonifazi, C.; Borodai, N.; Botti, A. M.; Brack, J.; Brancus, I.; Bretz, T.; Bridgeman, A.; Briechle, F. L.; Buchholz, P.; Bueno, A.; Buitink, S.; Buscemi, M.; Caballero-Mora, K. S.; Caccianiga, L.; Cancio, A.; Canfora, F.; Caruso, R.; Castellina, A.; Catalani, F.; Cataldi, G.; Cazon, L.; Chavez, A. G.; Chinellato, J. A.; Chudoba, J.; Clay, R. W.; Cobos Cerutti, A. C.; Colalillo, R.; Coleman, A.; Collica, L.; Coluccia, M. R.; Conceição, R.; Consolati, G.; Contreras, F.; Cooper, M. J.; Coutu, S.; Covault, C. E.; Cronin, J.; D’Amico, S.; Daniel, B.; Dasso, S.; Daumiller, K.; Dawson, B. R.; de Almeida, R. M.; de Jong, S. J.; De Mauro, G.; de Mello Neto, J. R. T.; De Mitri, I.; de Oliveira, J.; de Souza, V.; Debatin, J.; Deligny, O.; Díaz Castro, M. L.; Diogo, F.; Dobrigkeit, C.; D’Olivo, J. C.; Dorosti, Q.; dos Anjos, R. C.; Dova, M. T.; Dundovic, A.; Ebr, J.; Engel, R.; Erdmann, M.; Erfani, M.; Escobar, C. O.; Espadanal, J.; Etchegoyen, A.; Falcke, H.; Farmer, J.; Farrar, G.; Fauth, A. C.; Fazzini, N.; Fenu, F.; Fick, B.; Figueira, J. M.; Filipčič, A.; Freire, M. M.; Fujii, T.; Fuster, A.; Gaïor, R.; García, B.; Gaté, F.; Gemmeke, H.; Gherghel-Lascu, A.; Ghia, P. L.; Giaccari, U.; Giammarchi, M.; Giller, M.; Głas, D.; Glaser, C.; Golup, G.; Gómez Berisso, M.; Gómez Vitale, P. F.; González, N.; Gorgi, A.; Grillo, A. F.; Grubb, T. D.; Guarino, F.; Guedes, G. P.; Halliday, R.; Hampel, M. R.; Hansen, P.; Harari, D.; Harrison, T. A.; Haungs, A.; Hebbeker, T.; Heck, D.; Heimann, P.; Herve, A. E.; Hill, G. C.; Hojvat, C.; Holt, E.; Homola, P.; Hörandel, J. R.; Horvath, P.; Hrabovský, M.; Huege, T.; Hulsman, J.; Insolia, A.; Isar, P. G.; Jandt, I.; Johnsen, J. A.; Josebachuili, M.; Jurysek, J.; Kääpä, A.; Kambeitz, O.; Kampert, K. H.; Keilhauer, B.; Kemmerich, N.; Kemp, E.; Kemp, J.; Kieckhafer, R. M.; Klages, H. O.; Kleifges, M.; Kleinfeller, J.; Krause, R.; Krohm, N.; Kuempel, D.; Kukec Mezek, G.; Kunka, N.; Kuotb Awad, A.; Lago, B. L.; LaHurd, D.; Lang, R. G.; Lauscher, M.; Legumina, R.; Leigui de Oliveira, M. A.; Letessier-Selvon, A.; Lhenry-Yvon, I.; Link, K.; Lo Presti, D.; Lopes, L.; López, R.; López Casado, A.; Lorek, R.; Luce, Q.; Lucero, A.; Malacari, M.; Mallamaci, M.; Mandat, D.; Mantsch, P.; Mariazzi, A. G.; Mariş, I. C.; Marsella, G.; Martello, D.; Martinez, H.; Martínez Bravo, O.; Masías Meza, J. J.; Mathes, H. J.; Mathys, S.; Matthews, J.; Matthiae, G.; Mayotte, E.; Mazur, P. O.; Medina, C.; Medina-Tanco, G.; Melo, D.; Menshikov, A.; Merenda, K.-D.; Michal, S.; Micheletti, M. I.; Middendorf, L.; Miramonti, L.; Mitrica, B.; Mockler, D.; Mollerach, S.; Montanet, F.; Morello, C.; Morlino, G.; Mostafá, M.; Müller, A. L.; Müller, G.; Muller, M. A.; Müller, S.; Mussa, R.; Naranjo, I.; Nellen, L.; Nguyen, P. H.; Niculescu-Oglinzanu, M.; Niechciol, M.; Niemietz, L.; Niggemann, T.; Nitz, D.; Nosek, D.; Novotny, V.; Nožka, L.; Núñez, L. 
A.; Oikonomou, F.; Olinto, A.; Palatka, M.; Pallotta, J.; Papenbreer, P.; Parente, G.; Parra, A.; Paul, T.; Pech, M.; Pedreira, F.; Pȩkala, J.; Pelayo, R.; Peña-Rodriguez, J.; Pereira, L. A. S.; Perlin, M.; Perrone, L.; Peters, C.; Petrera, S.; Phuntsok, J.; Pierog, T.; Pimenta, M.; Pirronello, V.; Platino, M.; Plum, M.; Poh, J.; Porowski, C.; Prado, R. R.; Privitera, P.; Prouza, M.; Quel, E. J.; Querchfeld, S.; Quinn, S.; Ramos-Pollan, R.; Rautenberg, J.; Ravignani, D.; Ridky, J.; Riehn, F.; Risse, M.; Ristori, P.; Rizi, V.; Rodrigues de Carvalho, W.; Rodriguez Fernandez, G.; Rodriguez Rojo, J.; Roncoroni, M. J.; Roth, M.; Roulet, E.; Rovero, A. C.; Ruehl, P.; Saffi, S. J.; Saftoiu, A.; Salamida, F.; Salazar, H.; Saleh, A.; Salina, G.; Sánchez, F.; Sanchez-Lucas, P.; Santos, E. M.; Santos, E.; Sarazin, F.; Sarmento, R.; Sarmiento-Cano, C.; Sato, R.; Schauer, M.; Scherini, V.; Schieler, H.; Schimp, M.; Schmidt, D.; Scholten, O.; Schovánek, P.; Schröder, F. G.; Schröder, S.; Schulz, A.; Schumacher, J.; Sciutto, S. J.; Segreto, A.; Shadkam, A.; Shellard, R. C.; Sigl, G.; Silli, G.; Šmída, R.; Snow, G. R.; Sommers, P.; Sonntag, S.; Soriano, J. F.; Squartini, R.; Stanca, D.; Stanič, S.; Stasielak, J.; Stassi, P.; Stolpovskiy, M.; Strafella, F.; Streich, A.; Suarez, F.; Suarez Durán, M.; Sudholz, T.; Suomijärvi, T.; Supanitsky, A. D.; Šupík, J.; Swain, J.; Szadkowski, Z.; Taboada, A.; Taborda, O. A.; Theodoro, V. M.; Timmermans, C.; Todero Peixoto, C. J.; Tomankova, L.; Tomé, B.; Torralba Elipe, G.; Travnicek, P.; Trini, M.; Ulrich, R.; Unger, M.; Urban, M.; Valdés Galicia, J. F.; Valiño, I.; Valore, L.; van Aar, G.; van Bodegom, P.; van den Berg, A. M.; van Vliet, A.; Varela, E.; Vargas Cárdenas, B.; Vázquez, R. A.; Veberič, D.; Ventura, C.; Vergara Quispe, I. D.; Verzi, V.; Vicha, J.; Villaseñor, L.; Vorobiov, S.; Wahlberg, H.; Wainberg, O.; Walz, D.; Watson, A. A.; Weber, M.; Weindl, A.; Wiedeński, M.; Wiencke, L.; Wilczyński, H.; Wirtz, M.; Wittkowski, D.; Wundheiler, B.; Yang, L.; Yushkov, A.; Zas, E.; Zavrtanik, D.; Zavrtanik, M.; Zepeda, A.; Zimmermann, B.; Ziolkowski, M.; Zong, Z.; Zuccarello, F.; The Pierre Auger Collaboration
2018-02-01
A new analysis of the data set from the Pierre Auger Observatory provides evidence for anisotropy in the arrival directions of ultra-high-energy cosmic rays on an intermediate angular scale, which is indicative of excess arrivals from strong, nearby sources. The data consist of 5514 events above 20 EeV with zenith angles up to 80° recorded before 2017 April 30. Sky models have been created for two distinct populations of extragalactic gamma-ray emitters: active galactic nuclei from the second catalog of hard Fermi-LAT sources (2FHL) and starburst galaxies from a sample that was examined with Fermi-LAT. Flux-limited samples, which include all types of galaxies from the Swift-BAT and 2MASS surveys, have been investigated for comparison. The sky model of cosmic-ray density constructed using each catalog has two free parameters, the fraction of events correlating with astrophysical objects, and an angular scale characterizing the clustering of cosmic rays around extragalactic sources. A maximum-likelihood ratio test is used to evaluate the best values of these parameters and to quantify the strength of each model by contrast with isotropy. It is found that the starburst model fits the data better than the hypothesis of isotropy with a statistical significance of 4.0σ, the highest value of the test statistic being for energies above 39 EeV. The three alternative models are favored against isotropy with 2.7σ–3.2σ significance. The origin of the indicated deviation from isotropy is examined and prospects for more sensitive future studies are discussed.
Aab, A.; Abreu, P.; Aglietta, M.; ...
2018-02-02
A new analysis of the dataset from the Pierre Auger Observatory provides evidence for anisotropy in the arrival directions of ultra-high-energy cosmic rays on an intermediate angular scale, which is indicative of excess arrivals from strong, nearby sources. The data consist of 5514 events above 20 EeV with zenith angles up to 80 deg recorded before 2017 April 30. Sky models have been created for two distinct populations of extragalactic gamma-ray emitters: active galactic nuclei from the second catalog of hard Fermi-LAT sources (2FHL) and starburst galaxies from a sample that was examined with Fermi-LAT. Flux-limited samples, which include all types of galaxies from the Swift-BAT and 2MASS surveys, have been investigated for comparison. The sky model of cosmic-ray density constructed using each catalog has two free parameters, the fraction of events correlating with astrophysical objects and an angular scale characterizing the clustering of cosmic rays around extragalactic sources. A maximum-likelihood ratio test is used to evaluate the best values of these parameters and to quantify the strength of each model by contrast with isotropy. It is found that the starburst model fits the data better than the hypothesis of isotropy with a statistical significance of 4.0 sigma, the highest value of the test statistic being for energies above 39 EeV. The three alternative models are favored against isotropy with 2.7-3.2 sigma significance. The origin of the indicated deviation from isotropy is examined and prospects for more sensitive future studies are discussed.
Defense Acquisition Research Journal. Volume 21, Number 1, Issue 68
2014-01-01
Harrison’s game theory model of competition examines the bidding behavior of two equal competitors, but it does not address characteristics that... analysis examines a series of outcomes in both competitive and sole-source acquisition programs, using a statistical model that builds on a game theory... modeling, within a game theory framework developed by Todd Harrison, to show that the DoD may actually incur increased costs from competition
NASA Astrophysics Data System (ADS)
Fu, Hui; Madjarska, M. S.; Li, Bo; Xia, LiDong; Huang, ZhengHua
2018-05-01
Two main models have been developed to explain the mechanisms of release, heating and acceleration of the nascent solar wind, the wave-turbulence-driven (WTD) models and reconnection-loop-opening (RLO) models, in which the plasma release processes are fundamentally different. Given that the statistical observational properties of helium ions produced in magnetically diverse solar regions could provide valuable information for the solar wind modelling, we examine the statistical properties of the helium abundance (AHe) and the speed difference between helium ions and protons (vαp) for coronal holes (CHs), active regions (ARs) and the quiet Sun (QS). We find bimodal distributions in the space of AHe and vαp/vA (where vA is the local Alfvén speed) for the solar wind as a whole. The CH wind measurements are concentrated at higher AHe and vαp/vA values with a smaller AHe distribution range, while the AR and QS wind is associated with lower AHe and vαp/vA, and a larger AHe distribution range. The magnetic diversity of the source regions and the physical processes related to it are possibly responsible for the different properties of AHe and vαp/vA. The statistical results suggest that the two solar wind generation mechanisms, WTD and RLO, work in parallel in all solar wind source regions. In CH regions WTD plays a major role, whereas the RLO mechanism is more important in AR and QS.
Comparisons of thermospheric density data sets and models
NASA Astrophysics Data System (ADS)
Doornbos, Eelco; van Helleputte, Tom; Emmert, John; Drob, Douglas; Bowman, Bruce R.; Pilinski, Marcin
During the past decade, continuous long-term data sets of thermospheric density have become available to researchers. These data sets have been derived from accelerometer measurements made by the CHAMP and GRACE satellites and from Space Surveillance Network (SSN) tracking data and related Two-Line Element (TLE) sets. These data have already resulted in a large number of publications on physical interpretation and improvement of empirical density modelling. This study compares four different density data sets and two empirical density models, for the period 2002-2009. These data sources are the CHAMP (1) and GRACE (2) accelerometer measurements, the long-term database of densities derived from TLE data (3), the High Accuracy Satellite Drag Model (4) run by Air Force Space Command, calibrated using SSN data, and the NRLMSISE-00 (5) and Jacchia-Bowman 2008 (6) empirical models. In describing these data sets and models, specific attention is given to differences in the geometrical and aerodynamic satellite modelling, applied in the conversion from drag to density measurements, which are main sources of density biases. The differences in temporal and spatial resolution of the density data sources are also described and taken into account. With these aspects in mind, statistics of density comparisons have been computed, both as a function of solar and geomagnetic activity levels, and as a function of latitude and local solar time. These statistics give a detailed view of the relative accuracy of the different data sets and of the biases between them. The differences are analysed with the aim of providing rough error bars on the data and models and pinpointing issues which could receive attention in future iterations of data processing algorithms and in future model development.
SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit
Chu, Annie; Cui, Jenny; Dinov, Ivo D.
2011-01-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models. PMID:21546994
Towards a Comprehensive Model of Jet Noise Using an Acoustic Analogy and Steady RANS Solutions
NASA Technical Reports Server (NTRS)
Miller, Steven A. E.
2013-01-01
An acoustic analogy is developed to predict the noise from jet flows. It contains two source models that independently predict the noise from turbulence and shock wave shear layer interactions. The acoustic analogy is based on the Euler equations and separates the sources from propagation. Propagation effects are taken into account by calculating the vector Green's function of the linearized Euler equations. The sources are modeled following the work of Tam and Auriault, Morris and Boluriaan, and Morris and Miller. A statistical model of the two-point cross-correlation of the velocity fluctuations is used to describe the turbulence. The acoustic analogy attempts to take into account the correct scaling of the sources for a wide range of nozzle pressure and temperature ratios. It does not make assumptions regarding fine- or large-scale turbulent noise sources, self- or shear-noise, or convective amplification. The acoustic analogy is partially informed by three-dimensional steady Reynolds-Averaged Navier-Stokes solutions that include the nozzle geometry. The predictions are compared with experiments of jets operating subsonically through supersonically and at unheated and heated temperatures. Predictions generally capture the scaling of both mixing noise and BBSAN for the conditions examined, but some discrepancies remain that are due to the accuracy of the steady RANS turbulence model closure, the equivalent sources, and the use of a simplified vector Green's function solver of the linearized Euler equations.
Inferring Models of Bacterial Dynamics toward Point Sources
Jashnsaz, Hossein; Nguyen, Tyler; Petrache, Horia I.; Pressé, Steve
2015-01-01
Experiments have shown that bacteria can be sensitive to small variations in chemoattractant (CA) concentrations. Motivated by these findings, our focus here is on a regime rarely studied in experiments: bacteria tracking point CA sources (such as food patches or even prey). In tracking point sources, the CA detected by bacteria may show very large spatiotemporal fluctuations which vary with distance from the source. We present a general statistical model to describe how bacteria locate point sources of food on the basis of stochastic event detection, rather than CA gradient information. We show how all model parameters can be directly inferred from single cell tracking data even in the limit of high detection noise. Once parameterized, our model recapitulates bacterial behavior around point sources such as the “volcano effect”. In addition, while the search by bacteria for point sources such as prey may appear random, our model identifies key statistical signatures of a targeted search for a point source given any arbitrary source configuration. PMID:26466373
NASA Astrophysics Data System (ADS)
Song, Seok Goo; Kwak, Sangmin; Lee, Kyungbook; Park, Donghee
2017-04-01
Predicting the intensity and variability of strong ground motions is a critical element of seismic hazard assessment. The characteristics and variability of the earthquake rupture process may be a dominant factor in determining the intensity and variability of near-source strong ground motions. Song et al. (2014) demonstrated that the variability of earthquake rupture scenarios could be effectively quantified in the framework of 1-point and 2-point statistics of earthquake source parameters, constrained by rupture dynamics and past events. The developed pseudo-dynamic source modeling schemes were also validated against the recorded ground motion data of past events and empirical ground motion prediction equations (GMPEs) at the broadband platform (BBP) developed by the Southern California Earthquake Center (SCEC). Recently, we improved the computational efficiency of the developed pseudo-dynamic source-modeling scheme by adopting the nonparametric co-regionalization algorithm, originally introduced and applied in geostatistics. We also investigated the effect of the earthquake rupture process on near-source ground motion characteristics in the framework of 1-point and 2-point statistics, particularly focusing on the forward directivity region. Finally, we will discuss whether the pseudo-dynamic source modeling can reproduce the variability (standard deviation) of empirical GMPEs and the efficiency of 1-point and 2-point statistics to address the variability of ground motions.
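The idea of prescribing 2-point statistics for source parameters can be illustrated with a generic spectral-synthesis sketch in Python; the von Karman-like spectrum, grid, and slip scaling are assumptions for the example and this is not the authors' pseudo-dynamic scheme.

import numpy as np

rng = np.random.default_rng(6)

# Grid over a fault plane (along-strike x down-dip), spacing in km (assumed values).
nx, nz, dx = 128, 64, 0.5
kx = np.fft.fftfreq(nx, d=dx)
kz = np.fft.fftfreq(nz, d=dx)
k = np.sqrt(kx[None, :] ** 2 + kz[:, None] ** 2)

# Target 2-point statistics: von Karman-like power spectrum with correlation length a (km).
a, hurst = 5.0, 0.75
psd = (1.0 + (2 * np.pi * k * a) ** 2) ** -(hurst + 1.0)

# Spectral synthesis: filter white noise by sqrt(PSD), then fix the 1-point statistics
# (target mean slip and coefficient of variation) by rescaling and truncation.
white = np.fft.fft2(rng.standard_normal((nz, nx)))
field = np.real(np.fft.ifft2(white * np.sqrt(psd)))
field = (field - field.mean()) / field.std()
slip = np.clip(1.0 + 0.8 * field, 0.0, None)       # mean ~1 m, CV ~0.8, non-negative

print("mean slip %.2f m, CV %.2f" % (slip.mean(), slip.std() / slip.mean()))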
Targeted versus statistical approaches to selecting parameters for modelling sediment provenance
NASA Astrophysics Data System (ADS)
Laceby, J. Patrick
2017-04-01
One effective field-based approach to modelling sediment provenance is the source fingerprinting technique. Arguably, one of the most important steps for this approach is selecting the appropriate suite of parameters or fingerprints used to model source contributions. Accordingly, approaches to selecting parameters for sediment source fingerprinting will be reviewed. Thereafter, opportunities and limitations of these approaches and some future research directions will be presented. For properties to be effective tracers of sediment, they must discriminate between sources whilst behaving conservatively. Conservative behavior is characterized by constancy in sediment properties, where the properties of sediment sources remain constant, or at the very least, any variation in these properties should occur in a predictable and measurable way. Therefore, properties selected for sediment source fingerprinting should remain constant through sediment detachment, transportation and deposition processes, or vary in a predictable and measurable way. One approach to select conservative properties for sediment source fingerprinting is to identify targeted tracers, such as caesium-137, that provide specific source information (e.g. surface versus subsurface origins). A second approach is to use statistical tests to select an optimal suite of conservative properties capable of modelling sediment provenance. In general, statistical approaches use a combination of discrimination statistics (e.g. the Kruskal-Wallis H-test, Mann-Whitney U-test) and parameter selection statistics (e.g. Discriminant Function Analysis or Principal Component Analysis). The challenge is that modelling sediment provenance is often not straightforward and there is increasing debate in the literature surrounding the most appropriate approach to selecting elements for modelling. Moving forward, it would be beneficial if researchers tested their results with multiple modelling approaches, artificial mixtures, and multiple lines of evidence to provide secondary support to their initial modelling results. Indeed, element selection can greatly impact modelling results and having multiple lines of evidence will help provide confidence when modelling sediment provenance.
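A compact sketch of the two-step statistical selection described above, with scipy's Kruskal-Wallis test for discrimination and a greedy forward selection using scikit-learn's linear discriminant analysis as a stand-in for stepwise DFA; the geochemical data are simulated and the element names are placeholders.

import numpy as np
from scipy.stats import kruskal
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(7)
elements = ["Fe2O3", "SiO2", "Al2O3", "CaO", "MgO", "K2O"]

# Simulated source samples: three sources x 20 samples x 6 elements (assumed values).
means = rng.uniform(1, 10, (3, len(elements)))
X = np.vstack([rng.normal(m, 0.8, (20, len(elements))) for m in means])
y = np.repeat([0, 1, 2], 20)

# Step 1: keep elements that discriminate between sources (Kruskal-Wallis H-test, alpha = 0.05).
keep = [j for j in range(len(elements))
        if kruskal(*[X[y == g, j] for g in range(3)]).pvalue < 0.05]
print("elements passing Kruskal-Wallis:", [elements[j] for j in keep])

# Step 2: greedy forward selection maximizing LDA classification accuracy (a simple DFA proxy).
selected = []
while True:
    scores = {}
    for j in keep:
        if j in selected:
            continue
        cols = selected + [j]
        scores[j] = LinearDiscriminantAnalysis().fit(X[:, cols], y).score(X[:, cols], y)
    if not scores:
        break
    best = max(scores, key=scores.get)
    if selected:
        current = LinearDiscriminantAnalysis().fit(X[:, selected], y).score(X[:, selected], y)
        if scores[best] <= current:
            break
    selected.append(best)
print("selected composite fingerprint:", [elements[j] for j in selected])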
Forecasting runout of rock and debris avalanches
Iverson, Richard M.; Evans, S.G.; Mugnozza, G.S.; Strom, A.; Hermanns, R.L.
2006-01-01
Physically based mathematical models and statistically based empirical equations each may provide useful means of forecasting runout of rock and debris avalanches. This paper compares the foundations, strengths, and limitations of a physically based model and a statistically based forecasting method, both of which were developed to predict runout across three-dimensional topography. The chief advantage of the physically based model results from its ties to physical conservation laws and well-tested axioms of soil and rock mechanics, such as the Coulomb friction rule and effective-stress principle. The output of this model provides detailed information about the dynamics of avalanche runout, at the expense of high demands for accurate input data, numerical computation, and experimental testing. In comparison, the statistical method requires relatively modest computation and no input data except identification of prospective avalanche source areas and a range of postulated avalanche volumes. Like the physically based model, the statistical method yields maps of predicted runout, but it provides no information on runout dynamics. Although the two methods differ significantly in their structure and objectives, insights gained from one method can aid refinement of the other.
Near-Field Source Localization by Using Focusing Technique
NASA Astrophysics Data System (ADS)
He, Hongyang; Wang, Yide; Saillard, Joseph
2008-12-01
We discuss two fast algorithms to localize multiple sources in the near field. The symmetry-based method proposed by Zhi and Chia (2007) is first improved by implementing a search-free procedure for the reduction of computation cost. We then present a focusing-based method which does not require a symmetric array configuration. By using the focusing technique, the near-field signal model is transformed into a model possessing the same structure as in the far-field situation, which allows bearing estimation with well-studied far-field methods. With the estimated bearing, the range estimate of each source is consequently obtained by using the 1D MUSIC method without parameter pairing. The performance of the improved symmetry-based method and the proposed focusing-based method is compared through Monte Carlo simulations and against the Cramér-Rao bound. Unlike other near-field algorithms, these two approaches require neither high computation cost nor high-order statistics.
Jet Noise Physics and Modeling Using First-principles Simulations
NASA Technical Reports Server (NTRS)
Freund, Jonathan B.
2003-01-01
An extensive analysis of our jet DNS database has provided for the first time the complex correlations that are the core of many statistical jet noise models, including MGBK. We have also for the first time explicitly computed the noise from different components of a commonly used noise source as proposed in many modeling approaches. Key findings are: (1) While two-point (space and time) velocity statistics are well-fitted by decaying exponentials, even for our low-Reynolds-number jet, spatially integrated fourth-order space/retarded-time correlations, which constitute the noise "source" in MGBK, are instead well-fitted by Gaussians. The width of these Gaussians depends (by a factor of 2) on which components are considered. This is counter to current modeling practice, (2) A standard decomposition of the Lighthill source is shown by direct evaluation to be somewhat artificial since the noise from these nominally separate components is in fact highly correlated. We anticipate that the same will be the case for the Lilley source, and (3) The far-field sound is computed in a way that explicitly includes all quadrupole cancellations, yet evaluating the Lighthill integral for only a small part of the jet yields a far-field noise far louder than that from the whole jet due to missing nonquadrupole cancellations. Details of this study are discussed in a draft of a paper included as appendix A.
Influence of Elevation Data Source on 2D Hydraulic Modelling
NASA Astrophysics Data System (ADS)
Bakuła, Krzysztof; StĘpnik, Mateusz; Kurczyński, Zdzisław
2016-08-01
The aim of this paper is to analyse the influence of different elevation data sources on hydraulic modelling in open channels. In the research, digital terrain models from different datasets were evaluated and used in two-dimensional hydraulic models. The following aerial and satellite elevation data were used to create the digital terrain models representing the terrain: airborne laser scanning, image matching, elevation data collected in the LPIS, EuroDEM, and ASTER GDEM. From the results of five 2D hydrodynamic models with different input elevation data, the maximum depth and flow velocity of water were derived and compared with the results of the most accurate ALS data. For this analysis, a statistical evaluation of the differences between the hydraulic modelling results was prepared. The presented research demonstrated the importance of elevation data quality in hydraulic modelling and showed that ALS and photogrammetric data are the most reliable elevation data sources for accurate 2D hydraulic modelling.
Doiron, Dany; Marcon, Yannick; Fortier, Isabel; Burton, Paul; Ferretti, Vincent
2017-10-01
Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for epidemiological data management, harmonization and dissemination. Opal and Mica are two standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide web services and modern user interfaces to access them. Opal allows users to import, manage, annotate and harmonize study data. Mica is used to build searchable web portals disseminating study and variable metadata. When used conjointly, Mica users can securely query and retrieve summary statistics on geographically dispersed Opal servers in real-time. Integration with the DataSHIELD approach allows conducting more complex federated analyses involving statistical models. Opal and Mica are open-source and freely available at [www.obiba.org] under a General Public License (GPL) version 3, and the metadata models and taxonomies that accompany them are available under a Creative Commons licence. © The Author 2017; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association
Do gamma-ray burst sources repeat?
NASA Technical Reports Server (NTRS)
Meegan, C. A.; Hartmann, D. H.; Brainerd, J. J.; Briggs, M.; Paciesas, W. S.; Pendleton, G.; Kouveliotou, C.; Fishman, G.; Blumenthal, G.; Brock, M.
1994-01-01
The demonstration of repeated gamma-ray bursts from an individual source would severely constrain burst source models. Recent reports of evidence for repetition in the first BATSE burst catalog have generated renewed interest in this issue. Here, we analyze the angular distribution of 585 bursts of the second BATSE catalog (Meegan et al. 1994). We search for evidence of burst recurrence using the nearest and farthest neighbor statistic and the two-point angular correlation function. We find the data to be consistent with the hypothesis that burst sources do not repeat; however, a repeater fraction of up to about 20% of the bursts cannot be excluded.
Statistics, Computation, and Modeling in Cosmology
NASA Astrophysics Data System (ADS)
Jewell, Jeff; Guiness, Joe; SAMSI 2016 Working Group in Cosmology
2017-01-01
Current and future ground- and space-based missions are designed to not only detect, but map out with increasing precision, details of the universe from its infancy to the present-day. As a result we are faced with the challenge of analyzing and interpreting observations from a wide variety of instruments to form a coherent view of the universe. Finding solutions to a broad range of challenging inference problems in cosmology is one of the goals of the “Statistics, Computation, and Modeling in Cosmology” working groups, formed as part of the year-long program on ‘Statistical, Mathematical, and Computational Methods for Astronomy’, hosted by the Statistical and Applied Mathematical Sciences Institute (SAMSI), a National Science Foundation funded institute. Two application areas have emerged for focused development in the cosmology working group involving advanced algorithmic implementations of exact Bayesian inference for the Cosmic Microwave Background, and statistical modeling of galaxy formation. The former includes study and development of advanced Markov Chain Monte Carlo algorithms designed to confront challenging inference problems including inference for spatial Gaussian random fields in the presence of sources of galactic emission (an example of a source separation problem). Extending these methods to future redshift survey data probing the nonlinear regime of large scale structure formation is also included in the working group activities. In addition, the working group is also focused on the study of ‘Galacticus’, a galaxy formation model applied to dark matter-only cosmological N-body simulations operating on time-dependent halo merger trees. The working group is interested in calibrating the Galacticus model to match statistics of galaxy survey observations; specifically stellar mass functions, luminosity functions, and color-color diagrams. The group will use subsampling approaches and fractional factorial designs to statistically and computationally efficiently explore the Galacticus parameter space. The group will also use the Galacticus simulations to study the relationship between the topological and physical structure of the halo merger trees and the properties of the resulting galaxies.
Sokhey, Taegh; Gaebler-Spira, Deborah; Kording, Konrad P.
2017-01-01
Background It is important to understand the motor deficits of children with Cerebral Palsy (CP). Our understanding of this motor disorder can be enriched by computational models of motor control. One crucial stage in generating movement involves combining uncertain information from different sources, and deficits in this process could contribute to reduced motor function in children with CP. Healthy adults can integrate previously-learned information (prior) with incoming sensory information (likelihood) in a close-to-optimal way when estimating object location, consistent with the use of Bayesian statistics. However, there are few studies investigating how children with CP perform sensorimotor integration. We compare sensorimotor estimation in children with CP and age-matched controls using a model-based analysis to understand the process. Methods and findings We examined Bayesian sensorimotor integration in children with CP, aged between 5 and 12 years old, with Gross Motor Function Classification System (GMFCS) levels 1–3 and compared their estimation behavior with age-matched typically-developing (TD) children. We used a simple sensorimotor estimation task which requires participants to combine probabilistic information from different sources: a likelihood distribution (current sensory information) with a prior distribution (learned target information). In order to examine sensorimotor integration, we quantified how participants weighed statistical information from the two sources (prior and likelihood) and compared this to the statistical optimal weighting. We found that the weighing of statistical information in children with CP was as statistically efficient as that of TD children. Conclusions We conclude that Bayesian sensorimotor integration is not impaired in children with CP and therefore, does not contribute to their motor deficits. Future research has the potential to enrich our understanding of motor disorders by investigating the stages of motor processing set out by computational models. Therapeutic interventions should exploit the ability of children with CP to use statistical information. PMID:29186196
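For Gaussian prior and likelihood, the statistically optimal weighting tested in such tasks reduces to a precision-weighted average; the numbers in the Python sketch below are invented and it is a textbook illustration, not the study's analysis code.

import numpy as np

# Gaussian prior over target location (learned over trials) and a noisy sensory cue.
mu_prior, sd_prior = 0.0, 1.0        # centimetres, assumed values
cue, sd_cue = 2.0, 2.0               # current sensory information (likelihood)

# Statistically optimal (Bayesian) estimate: precision-weighted average of the two sources.
w_cue = sd_prior**2 / (sd_prior**2 + sd_cue**2)
estimate = w_cue * cue + (1 - w_cue) * mu_prior
sd_post = np.sqrt((sd_prior**2 * sd_cue**2) / (sd_prior**2 + sd_cue**2))

print(f"weight on sensory cue: {w_cue:.2f}")
print(f"optimal estimate: {estimate:.2f} cm (posterior sd {sd_post:.2f} cm)")
# Comparing participants' fitted weights with w_cue quantifies how close their
# integration is to the statistical optimum.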
Chang, Pao-Erh Paul; Yang, Jen-Chih Rena; Den, Walter; Wu, Chang-Fu
2014-09-01
Emissions of volatile organic compounds (VOCs) are among the most frequent environmental nuisance complaints in urban areas, especially where industrial districts are nearby. Unfortunately, identifying the responsible emission sources of VOCs is an inherently difficult task. In this study, we proposed a dynamic approach to gradually confine the location of potential VOC emission sources in an industrial complex, by combining multi-path open-path Fourier transform infrared spectrometry (OP-FTIR) measurement and the statistical method of principal component analysis (PCA). Closed-cell FTIR was further used to verify the VOC emission sources by measuring emitted VOCs from selected exhaust stacks at factories in the confined areas. Multiple open-path monitoring lines were deployed during a 3-month monitoring campaign in a complex industrial district. The emission patterns were identified and the locations of emissions were confined by the wind data collected simultaneously. N,N-dimethylformamide (DMF), 2-butanone, toluene, and ethyl acetate with mean concentrations of 80.0 ± 1.8, 34.5 ± 0.8, 103.7 ± 2.8, and 26.6 ± 0.7 ppbv, respectively, were identified as the major VOC mixture at all times of the day around the receptor site. As a toxic air pollutant, DMF was found at concentrations exceeding the ambient standard despite the path-averaging effect of OP-FTIR on concentration levels. The PCA identified three major emission sources, including PU coating, chemical packaging, and lithographic printing industries. Applying instrumental measurement and statistical modeling, this study has established a systematic approach for locating emission sources. Statistical modeling (PCA) plays an important role in reducing the dimensionality of a large measured dataset and identifying underlying emission sources. Instrumental measurement, however, helps verify the outcomes of the statistical modeling. The field study has demonstrated the feasibility of using multi-path OP-FTIR measurement. Wind data, combined with the statistical modeling (PCA), can successfully identify the major emission sources in a complex industrial district.
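A generic PCA sketch on a concentration matrix, using numpy's SVD; the species profiles, sample counts, and "factory" factors are invented to mimic two correlated emission sources and do not reproduce the study's data or processing.

import numpy as np

rng = np.random.default_rng(8)
species = ["DMF", "2-butanone", "toluene", "ethyl acetate"]

# Simulated path-averaged concentrations (ppbv): rows are hourly OP-FTIR retrievals,
# columns are species; two hidden "factories" drive correlated emissions.
factory1 = rng.gamma(2.0, 1.0, 200)                 # e.g. a coating process (assumed)
factory2 = rng.gamma(2.0, 1.0, 200)                 # e.g. a printing process (assumed)
profiles = np.array([[40, 15, 10, 2],               # assumed source profiles
                     [5, 2, 45, 12]], float)
X = factory1[:, None] * profiles[0] + factory2[:, None] * profiles[1]
X += rng.normal(0, 1.0, X.shape)

# PCA via SVD of the standardized concentration matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)

print("variance explained by PC1, PC2:", np.round(explained[:2], 2))
for i, pc in enumerate(Vt[:2]):
    print(f"PC{i+1} loadings:", dict(zip(species, np.round(pc, 2))))
# Species loading strongly on the same component point to a common emission source,
# which wind-sector data can then tie to a location.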
NASA Astrophysics Data System (ADS)
Schliep, E. M.; Gelfand, A. E.; Holland, D. M.
2015-12-01
There is considerable demand for accurate air quality information in human health analyses. The sparsity of ground monitoring stations across the United States motivates the need for advanced statistical models to predict air quality metrics, such as PM2.5, at unobserved sites. Remote sensing technologies have the potential to expand our knowledge of PM2.5 spatial patterns beyond what we can predict from current PM2.5 monitoring networks. Data from satellites have an additional advantage in not requiring extensive emission inventories necessary for most atmospheric models that have been used in earlier data fusion models for air pollution. Statistical models combining monitoring station data with satellite-obtained aerosol optical thickness (AOT), also referred to as aerosol optical depth (AOD), have been proposed in the literature with varying levels of success in predicting PM2.5. The benefit of using AOT is that satellites provide complete gridded spatial coverage. However, the challenges involved with using it in fusion models are (1) the correlation between the two data sources varies both in time and in space, (2) the data sources are temporally and spatially misaligned, and (3) there is extensive missingness in the monitoring data and also in the satellite data due to cloud cover. We propose a hierarchical autoregressive spatially varying coefficients model to jointly model the two data sources, which addresses the foregoing challenges. Additionally, we offer formal model comparison for competing models in terms of model fit and out of sample prediction of PM2.5. The models are applied to daily observations of PM2.5 and AOT in the summer months of 2013 across the conterminous United States. Most notably, during this time period, we find small in-sample improvement incorporating AOT into our autoregressive model but little out-of-sample predictive improvement.
The Earthquake‐Source Inversion Validation (SIV) Project
Mai, P. Martin; Schorlemmer, Danijel; Page, Morgan T.; Ampuero, Jean-Paul; Asano, Kimiyuki; Causse, Mathieu; Custodio, Susana; Fan, Wenyuan; Festa, Gaetano; Galis, Martin; Gallovic, Frantisek; Imperatori, Walter; Käser, Martin; Malytskyy, Dmytro; Okuwaki, Ryo; Pollitz, Fred; Passone, Luca; Razafindrakoto, Hoby N. T.; Sekiguchi, Haruko; Song, Seok Goo; Somala, Surendra N.; Thingbaijam, Kiran K. S.; Twardzik, Cedric; van Driel, Martin; Vyas, Jagdish C.; Wang, Rongjiang; Yagi, Yuji; Zielke, Olaf
2016-01-01
Finite‐fault earthquake source inversions infer the (time‐dependent) displacement on the rupture surface from geophysical data. The resulting earthquake source models document the complexity of the rupture process. However, multiple source models for the same earthquake, obtained by different research teams, often exhibit remarkable dissimilarities. To address the uncertainties in earthquake‐source inversion methods and to understand strengths and weaknesses of the various approaches used, the Source Inversion Validation (SIV) project conducts a set of forward‐modeling exercises and inversion benchmarks. In this article, we describe the SIV strategy, the initial benchmarks, and current SIV results. Furthermore, we apply statistical tools for quantitative waveform comparison and for investigating source‐model (dis)similarities that enable us to rank the solutions, and to identify particularly promising source inversion approaches. All SIV exercises (with related data and descriptions) and statistical comparison tools are available via an online collaboration platform, and we encourage source modelers to use the SIV benchmarks for developing and testing new methods. We envision that the SIV efforts will lead to new developments for tackling the earthquake‐source imaging problem.
Hawe, David; Hernández Fernández, Francisco R; O'Suilleabháin, Liam; Huang, Jian; Wolsztynski, Eric; O'Sullivan, Finbarr
2012-05-01
In dynamic mode, positron emission tomography (PET) can be used to track the evolution of injected radio-labelled molecules in living tissue. This is a powerful diagnostic imaging technique that provides a unique opportunity to probe the status of healthy and pathological tissue by examining how it processes substrates. The spatial aspect of PET is well established in the computational statistics literature. This article focuses on its temporal aspect. The interpretation of PET time-course data is complicated because the measured signal is a combination of vascular delivery and tissue retention effects. If the arterial time-course is known, the tissue time-course can typically be expressed in terms of a linear convolution between the arterial time-course and the tissue residue. In statistical terms, the residue function is essentially a survival function - a familiar life-time data construct. Kinetic analysis of PET data is concerned with estimation of the residue and associated functionals such as flow, flux, volume of distribution and transit time summaries. This review emphasises a nonparametric approach to the estimation of the residue based on a piecewise linear form. Rapid implementation of this by quadratic programming is described. The approach provides a reference for statistical assessment of widely used one- and two-compartmental model forms. We illustrate the method with data from two of the most well-established PET radiotracers, (15)O-H(2)O and (18)F-fluorodeoxyglucose, used for assessment of blood perfusion and glucose metabolism respectively. The presentation illustrates the use of two open-source tools, AMIDE and R, for PET scan manipulation and model inference.
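A hedged illustration of the residue-estimation idea with synthetic data: the tissue time-course is a discrete convolution of the arterial input with the residue, so a nonnegative, nonparametric residue can be recovered by non-negative least squares. The paper's approach uses a piecewise-linear residue fitted by quadratic programming; the NNLS sketch below ignores the monotonicity constraint for simplicity.

```python
# Sketch: recover a discretized tissue residue from C_tissue = C_arterial (*) R
# using non-negative least squares (a simplification of the QP approach).
import numpy as np
from scipy.optimize import nnls

dt, n = 2.0, 60                               # 2 s frames, 2 min of data
t = np.arange(n) * dt
ca = t * np.exp(-t / 15.0)                    # synthetic arterial input
r_true = np.exp(-t / 40.0)                    # true residue (survival-like function)

A = np.zeros((n, n))                          # discrete convolution operator
for i in range(n):
    A[i, :i + 1] = ca[i::-1] * dt
ct = A @ r_true + np.random.default_rng(1).normal(0, 0.05, n)

r_hat, _ = nnls(A, ct)                        # nonparametric, nonnegative residue
print("flow-like term ~ R(0):", r_hat[0])
```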
Performance evaluation of WAVEWATCH III model in the Persian Gulf using different wind resources
NASA Astrophysics Data System (ADS)
Kazeminezhad, Mohammad Hossein; Siadatmousavi, Seyed Mostafa
2017-07-01
The third-generation wave model WAVEWATCH III was employed to simulate bulk wave parameters in the Persian Gulf using three different wind sources: ERA-Interim, CCMP, and GFS-Analysis. Different formulations for the whitecapping term and the energy transfer from wind to waves were used, namely the Tolman and Chalikov (J Phys Oceanogr 26:497-518, 1996), WAM cycle 4 (BJA and WAM4), and Ardhuin et al. (J Phys Oceanogr 40(9):1917-1941, 2010) (TEST405 and TEST451 parameterizations) source term packages. The results of the numerical simulations were compared with altimeter-derived significant wave heights and measured wave parameters at two stations in the northern part of the Persian Gulf using statistical indicators and the Taylor diagram. Comparison of the bulk wave parameters with measured values showed underestimation of wave height for all wind sources; however, the model performed best when GFS-Analysis wind data were used. In general, when the wind veered from southeast to northwest and the wind speed was high during the rotation, the model's underestimation of wave height was severe. Except for the Tolman and Chalikov (J Phys Oceanogr 26:497-518, 1996) source term package, which severely underestimated the bulk wave parameters during stormy conditions, the performances of the other formulations were practically similar. In terms of statistics, however, the Ardhuin et al. (J Phys Oceanogr 40(9):1917-1941, 2010) source terms with the TEST405 parameterization were the most successful formulation in the Persian Gulf when compared with in situ and altimeter-derived observations.
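A short sketch of the kind of statistical indicators and Taylor-diagram quantities used to compare modeled and observed significant wave heights. The arrays below are synthetic placeholders, not the buoy or altimeter data of the study.

```python
# Common validation statistics for modeled vs observed significant wave height:
# bias, RMSE, scatter index, correlation, and the normalized standard deviation
# used as the radial coordinate of a Taylor diagram.
import numpy as np

def wave_stats(model, obs):
    bias = np.mean(model - obs)
    rmse = np.sqrt(np.mean((model - obs) ** 2))
    si = rmse / np.mean(obs)                       # scatter index
    corr = np.corrcoef(model, obs)[0, 1]
    sigma_ratio = np.std(model) / np.std(obs)      # Taylor-diagram radial coordinate
    return dict(bias=bias, rmse=rmse, si=si, corr=corr, sigma_ratio=sigma_ratio)

rng = np.random.default_rng(2)
hs_obs = rng.gamma(shape=2.0, scale=0.5, size=1000)        # synthetic buoy Hs (m)
hs_mod = 0.85 * hs_obs + rng.normal(0, 0.15, hs_obs.size)  # a model that underestimates
print(wave_stats(hs_mod, hs_obs))
```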
Development of a reactive-dispersive plume model
NASA Astrophysics Data System (ADS)
Kim, Hyun S.; Kim, Yong H.; Song, Chul H.
2017-04-01
A reactive-dispersive plume model (RDPM) was developed in this study. The RDPM considers two main components of a large-scale point-source plume: i) turbulent dispersion and ii) photochemical reactions. To evaluate the simulation performance of the newly developed RDPM, comparisons between model-predicted and observed mixing ratios were made using the TexAQS II 2006 (Texas Air Quality Study II 2006) power-plant experiment data. Statistical analyses show good correlation (0.61≤R≤0.92) and good agreement in terms of the Index of Agreement (0.70≤IOA≤0.95). The chemical NOx lifetimes for two power-plant plumes (Monticello and Welsh power plants) were also estimated.
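The abstract mentions estimating chemical NOx lifetimes for the two plumes; one common way such lifetimes are derived is by fitting an exponential decay to plume NOx amounts versus downwind transport time. The sketch below illustrates that generic fit with made-up numbers; it is not the RDPM itself.

```python
# Hedged sketch: estimate a chemical NOx lifetime by fitting an exponential
# decay to synthetic plume NOx amounts versus plume age.
import numpy as np
from scipy.optimize import curve_fit

age_h = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 4.0])   # plume age (hours)
nox = np.array([9.1, 7.6, 6.2, 5.3, 3.6, 2.6])     # relative NOx amount (synthetic)

decay = lambda t, n0, tau: n0 * np.exp(-t / tau)
(n0, tau), _ = curve_fit(decay, age_h, nox, p0=(10.0, 2.0))
print(f"fitted NOx lifetime: {tau:.2f} h")
```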
Out-of-plane permeability of multilayer 0°/90° non-crimp fabrics
NASA Astrophysics Data System (ADS)
Fang, Liangchao; Wu, Wenyu; Xu, Chunting; Zhang, Hui
2018-03-01
Layer shift is the main source of variation in permeability values for multilayer fabrics. This phenomenon can change the flow path and cause inadequate infiltration. In this paper, the out-of-plane permeability of multilayer 0°/90° non-crimp fabrics was analyzed statistically. Based on prediction models for 2-layer fabrics, every two adjacent layers were regarded as porous media with different permeabilities. The out-of-plane permeability of multilayer fabrics was then modeled with the electrical resistance analogy. Analytical results were compared with experimental data, and the effect of the number of layers on permeability was examined from a statistical point of view.
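For through-thickness flow, the electrical-resistance analogy treats the layers as resistors in series, so the effective permeability is a thickness-weighted harmonic mean of the layer permeabilities. The sketch below shows that generic relation; the layer values are illustrative only, and the paper's model additionally accounts for layer shift.

```python
# Electrical-resistance analogy for out-of-plane (series) flow through layers:
# effective permeability = thickness-weighted harmonic mean of layer values.
import numpy as np

def series_permeability(thickness, permeability):
    """Effective out-of-plane permeability of stacked layers (Darcy flow in series)."""
    thickness = np.asarray(thickness, float)
    permeability = np.asarray(permeability, float)
    return thickness.sum() / np.sum(thickness / permeability)

h = [0.4e-3, 0.4e-3, 0.4e-3, 0.4e-3]          # layer thicknesses (m), hypothetical
k = [2.0e-12, 1.2e-12, 2.0e-12, 0.8e-12]      # layer permeabilities (m^2), hypothetical
print(f"effective K_z = {series_permeability(h, k):.2e} m^2")
```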
Accounting for multiple sources of uncertainty in impact assessments: The example of the BRACE study
NASA Astrophysics Data System (ADS)
O'Neill, B. C.
2015-12-01
Assessing climate change impacts often requires the use of multiple scenarios, types of models, and data sources, leading to a large number of potential sources of uncertainty. For example, a single study might require a choice of a forcing scenario, climate model, bias correction and/or downscaling method, societal development scenario, model (typically several) for quantifying elements of societal development such as economic and population growth, biophysical model (such as for crop yields or hydrology), and societal impact model (e.g. economic or health model). Some sources of uncertainty are reduced or eliminated by the framing of the question. For example, it may be useful to ask what an impact outcome would be conditional on a given societal development pathway, forcing scenario, or policy. However, many sources of uncertainty remain, and it is rare for all or even most of these sources to be accounted for. I use the example of a recent integrated project on the Benefits of Reduced Anthropogenic Climate changE (BRACE) to explore useful approaches to uncertainty across multiple components of an impact assessment. BRACE comprises 23 papers that assess the differences in impacts between two alternative climate futures: those associated with Representative Concentration Pathways (RCPs) 4.5 and 8.5. It quantifies differences in impacts in terms of extreme events, health, agriculture, tropical cyclones, and sea level rise. Methodologically, it includes climate modeling, statistical analysis, integrated assessment modeling, and sector-specific impact modeling. It employs alternative scenarios of both radiative forcing and societal development, but generally uses a single climate model (CESM), partially accounting for climate uncertainty by drawing heavily on large initial condition ensembles. Strengths and weaknesses of the approach to uncertainty in BRACE are assessed. Options under consideration for improving the approach include the use of perturbed physics ensembles of CESM, employing results from multiple climate models, and combining the results from single impact models with statistical representations of uncertainty across multiple models. A key consideration is the relationship between the question being addressed and the uncertainty approach.
Do gamma-ray burst sources repeat?
NASA Technical Reports Server (NTRS)
Meegan, Charles A.; Hartmann, Dieter H.; Brainerd, J. J.; Briggs, Michael S.; Paciesas, William S.; Pendleton, Geoffrey; Kouveliotou, Chryssa; Fishman, Gerald; Blumenthal, George; Brock, Martin
1995-01-01
The demonstration of repeated gamma-ray bursts from an individual source would severely constrain burst source models. Recent reports (Quashnock and Lamb, 1993; Wang and Lingenfelter, 1993) of evidence for repetition in the first BATSE burst catalog have generated renewed interest in this issue. Here, we analyze the angular distribution of 585 bursts of the second BATSE catalog (Meegan et al., 1994). We search for evidence of burst recurrence using the nearest and farthest neighbor statistic and the two-point angular correlation function. We find the data to be consistent with the hypothesis that burst sources do not repeat; however, a repeater fraction of up to about 20% of the observed bursts cannot be excluded.
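A hedged sketch of the nearest-neighbor test mentioned in the abstract: compute nearest-neighbor angular separations for a set of burst positions and compare them with an isotropic Monte Carlo sample via a KS test. The positions below are synthetic; an excess of small separations relative to isotropy would hint at repeating sources.

```python
# Nearest-neighbor repetition test on synthetic burst positions vs an
# isotropic reference sample, compared with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def random_sky(n, rng):
    ra = rng.uniform(0, 2 * np.pi, n)
    dec = np.arcsin(rng.uniform(-1, 1, n))
    return np.c_[np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)]

def nn_separation(xyz):
    cosang = np.clip(xyz @ xyz.T, -1, 1)
    np.fill_diagonal(cosang, -1)               # exclude self-pairs
    return np.degrees(np.arccos(cosang.max(axis=1)))

rng = np.random.default_rng(3)
bursts = random_sky(585, rng)                  # stand-in for the catalog positions
isotropic = random_sky(585, rng)
print(ks_2samp(nn_separation(bursts), nn_separation(isotropic)))
```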
NASA Astrophysics Data System (ADS)
Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.
2013-04-01
Most of the measurement strategies suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring airborne particle concentrations in real time (according to different metrics). Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time-resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature, ranging from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and an appropriate and robust method is still being sought. In this context, this exploratory study investigates a statistical method to analyse time-resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, we used particle number concentration data from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films. In this workplace study, the background issue was addressed through near-field and far-field approaches, and several size-integrated and time-resolved devices were used. The analysis presented here focuses only on data obtained with two handheld condensation particle counters, one measuring at the source of the released particles and the other measuring in parallel far-field. The Bayesian probabilistic approach allows probabilistic modelling of the data series, with the observed task modelled in the form of probability distributions. The probability distributions derived from time-resolved data obtained at the source can be compared with those derived from the time-resolved data obtained far-field, leading to a quantitative estimate of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of such data requires specific experience in statistics.
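A hedged sketch of one Bayesian way to compare the two particle-counter series: treat the counts as Poisson with conjugate Gamma priors and compare the posterior rate distributions of the near-field (source) and far-field counters. The count values and prior parameters below are assumptions for illustration.

```python
# Bayesian near-field/far-field comparison: Poisson counts with Gamma priors;
# the posterior of the rate difference gives a probabilistic estimate of the
# particle excess released at the source during the task.
import numpy as np

rng = np.random.default_rng(4)
near = rng.poisson(lam=1200, size=300)         # counts/s at the source (synthetic)
far = rng.poisson(lam=900, size=300)           # counts/s far-field background (synthetic)

a0, b0 = 1.0, 1e-3                             # weak Gamma(shape, rate) prior
post_near = rng.gamma(a0 + near.sum(), 1.0 / (b0 + near.size), 20000)
post_far = rng.gamma(a0 + far.sum(), 1.0 / (b0 + far.size), 20000)

excess = post_near - post_far
print("posterior mean excess rate:", excess.mean())
print("95% credible interval:", np.percentile(excess, [2.5, 97.5]))
```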
Gladysz, Szymon; Yaitskova, Natalia; Christou, Julian C
2010-11-01
This paper is an introduction to the problem of modeling the probability density function of adaptive-optics speckle. We show that with the modified Rician distribution one cannot describe the statistics of light on axis. A dual solution is proposed: the modified Rician distribution for off-axis speckle and gamma-based distribution for the core of the point spread function. From these two distributions we derive optimal statistical discriminators between real sources and quasi-static speckles. In the second part of the paper the morphological difference between the two probability density functions is used to constrain a one-dimensional, "blind," iterative deconvolution at the position of an exoplanet. Separation of the probability density functions of signal and speckle yields accurate differential photometry in our simulations of the SPHERE planet finder instrument.
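The following is a minimal sketch, with assumed parameter values, of the two candidate intensity PDFs named in the abstract (a modified Rician for off-axis speckle and a gamma distribution for the PSF core) and a likelihood-ratio discriminator built from them. It is not the authors' implementation.

```python
# Modified Rician vs gamma intensity PDFs and a log-likelihood-ratio
# discriminator between quasi-static speckle and a real source.
import numpy as np
from scipy.special import ive
from scipy.stats import gamma

def modified_rician(I, Ic, Is):
    # exponentially scaled Bessel ive avoids overflow: I0(x) = ive(0, x) * exp(x)
    x = 2.0 * np.sqrt(I * Is) / Ic
    return (1.0 / Ic) * np.exp(-(I + Is) / Ic + x) * ive(0, x)

def log_likelihood_ratio(I, Ic, Is, k, theta):
    """log p_MR(I) - log p_gamma(I); positive values favour the speckle (MR) model."""
    return np.log(modified_rician(I, Ic, Is)) - gamma.logpdf(I, a=k, scale=theta)

I = np.linspace(0.01, 10, 5)
print(log_likelihood_ratio(I, Ic=1.0, Is=2.0, k=4.0, theta=0.5))   # illustrative parameters
```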
An emission-weighted proximity model for air pollution exposure assessment.
Zou, Bin; Wilson, J Gaines; Zhan, F Benjamin; Zeng, Yongnian
2009-08-15
Among the most common spatial models for estimating personal exposure are Traditional Proximity Models (TPMs). Though TPMs are straightforward to configure and interpret, they are prone to extensive errors in exposure estimates and do not provide prospective estimates. To resolve these inherent problems with TPMs, we introduce a novel Emission Weighted Proximity Model (EWPM) that improves on the TPM by taking into consideration the emissions from all sources potentially influencing the receptors. EWPM performance was evaluated by comparing the normalized exposure risk values of sulfur dioxide (SO(2)) calculated by EWPM with those calculated by TPM and with monitored observations over a one-year period in two large Texas counties. In order to investigate whether the limitations of TPM in predicting potential exposure risk without recorded incidence can be overcome, we also introduce a hybrid framework, a 'Geo-statistical EWPM'. The Geo-statistical EWPM is a synthesis of Ordinary Kriging geo-statistical interpolation and EWPM. The prediction results are presented as two potential exposure risk prediction maps. The performance of these two exposure maps in predicting individual SO(2) exposure risk was validated with 10 virtual cases in prospective exposure scenarios. Risk values from EWPM agreed clearly better with the observed concentrations than those from TPM. Over the entire study area, the mean SO(2) exposure risk from EWPM was higher relative to TPM (1.00 vs. 0.91). The mean bias of the exposure risk values of the 10 virtual cases between EWPM and 'Geo-statistical EWPM' is much smaller than that between TPM and 'Geo-statistical TPM' (5.12 vs. 24.63). EWPM appears to more accurately portray individual exposure relative to TPM. The 'Geo-statistical EWPM' effectively augments the role of the standard proximity model and makes it possible to predict individual risk in future exposure scenarios that could result in adverse health effects from environmental pollution.
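A minimal sketch of the emission-weighted proximity idea: each receptor's exposure indicator sums source emissions weighted by inverse distance, rather than using only the distance to the nearest source as in a TPM. Coordinates, emission rates, and the distance floor below are hypothetical.

```python
# Emission-weighted proximity sketch: risk indicator per receptor is the sum of
# source emissions divided by receptor-source distance, then normalized.
import numpy as np

sources = np.array([[0.0, 0.0], [5.0, 2.0], [1.0, 8.0]])   # source locations (km)
emissions = np.array([120.0, 40.0, 300.0])                  # SO2 emissions (arbitrary units)
receptors = np.array([[1.0, 1.0], [6.0, 6.0]])              # receptor locations (km)

d = np.linalg.norm(receptors[:, None, :] - sources[None, :, :], axis=2)
risk = (emissions / np.maximum(d, 0.1)).sum(axis=1)         # emission-weighted proximity
risk /= risk.max()                                           # normalized risk indicator
print("normalized EWPM-style risk per receptor:", risk)
```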
Directional Statistics for Polarization Observations of Individual Pulses from Radio Pulsars
NASA Astrophysics Data System (ADS)
McKinnon, M. M.
2010-10-01
Radio polarimetry is a three-dimensional statistical problem. The three-dimensional aspect of the problem arises from the Stokes parameters Q, U, and V, which completely describe the polarization of electromagnetic radiation and conceptually define the orientation of a polarization vector in the Poincaré sphere. The statistical aspect of the problem arises from the random fluctuations in the source-intrinsic polarization and the instrumental noise. A simple model for the polarization of pulsar radio emission has been used to derive the three-dimensional statistics of radio polarimetry. The model is based upon the proposition that the observed polarization is due to the incoherent superposition of two, highly polarized, orthogonal modes. The directional statistics derived from the model follow the Bingham-Mardia and Fisher family of distributions. The model assumptions are supported by the qualitative agreement between the statistics derived from it and those measured with polarization observations of the individual pulses from pulsars. The orthogonal modes are thought to be the natural modes of radio wave propagation in the pulsar magnetosphere. The intensities of the modes become statistically independent when generalized Faraday rotation (GFR) in the magnetosphere causes the difference in their phases to be large. A stochastic version of GFR occurs when fluctuations in the phase difference are also large, and may be responsible for the more complicated polarization patterns observed in pulsar radio emission.
Impact of South American heroin on the US heroin market 1993-2004.
Ciccarone, Daniel; Unick, George J; Kraus, Allison
2009-09-01
The past two decades have seen an increase in heroin-related morbidity and mortality in the United States. We report on trends in US heroin retail price and purity, including the effect of the entry of Colombian-sourced heroin on the US heroin market. The average standardized price ($/mg-pure) and purity (% by weight) of heroin from 1993 to 2004 were obtained from US Drug Enforcement Agency retail purchase data for 20 metropolitan statistical areas. Univariate statistics, robust Ordinary Least Squares regression, and mixed fixed and random effect growth curve models were used to predict the price and purity data in each metropolitan statistical area over time. Over the 12 study years, heroin price decreased 62%. The median percentage of all heroin samples of South American origin increased an absolute 7% per year. Multivariate models suggest that percent South American heroin is a significant predictor of lower heroin price and higher purity, adjusting for time and demographics. These analyses reveal trends toward historically low-cost heroin in many US cities. These changes correspond to the entrance into and rapid domination of the US heroin market by Colombian-sourced heroin. The implications of these changes are discussed.
Multiple point statistical simulation using uncertain (soft) conditional data
NASA Astrophysics Data System (ADS)
Hansen, Thomas Mejer; Vu, Le Thanh; Mosegaard, Klaus; Cordua, Knud Skou
2018-05-01
Geostatistical simulation methods have been used to quantify spatial variability of reservoir models since the 1980s. In the last two decades, state-of-the-art simulation methods have changed from being based on covariance-based two-point statistics to multiple-point statistics (MPS), which allow simulation of more realistic Earth structures. In addition, increasing amounts of geo-information (geophysical, geological, etc.) from multiple sources are being collected. This poses the problem of integrating these different sources of information, such that decisions related to reservoir models can be taken on as informed a basis as possible. In principle, though difficult in practice, this can be achieved using computationally expensive Monte Carlo methods. Here we investigate the use of sequential-simulation-based MPS methods conditioned to uncertain (soft) data as a computationally efficient alternative. First, it is demonstrated that current implementations of sequential simulation based on MPS (e.g. SNESIM, ENESIM and Direct Sampling) do not account properly for uncertain conditional information, due to a combination of using only co-located information and a random simulation path. Then, we suggest two approaches that better account for the available uncertain information. The first makes use of a preferential simulation path, where more informed model parameters are visited preferentially to less informed ones. The second approach involves using non-co-located uncertain information. For different types of available data, these approaches are demonstrated to produce simulation results similar to those obtained by the general Monte Carlo based approach. These methods allow MPS simulation to condition properly to uncertain (soft) data, and hence provide a computationally attractive approach for integrating information about a reservoir model.
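A hedged sketch of the preferential-path idea: rank grid cells by how informative their soft conditioning data are, here using the entropy of the local facies probabilities (lower entropy means more informed, so those cells are visited first). The probabilities are synthetic and the ranking rule is an assumption for illustration.

```python
# Preferential simulation path sketch: visit cells with the most informative
# soft (probabilistic) data first, ranked by the entropy of local probabilities.
import numpy as np

rng = np.random.default_rng(5)
n_cells, n_facies = 10, 3
p_soft = rng.dirichlet(alpha=np.ones(n_facies), size=n_cells)   # soft data per cell

entropy = -(p_soft * np.log(p_soft + 1e-12)).sum(axis=1)
path = np.argsort(entropy)                    # preferential path: most informed first
print("visit order:", path)
print("entropies along path:", np.round(entropy[path], 3))
```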
Electromagnetic sinc Schell-model beams and their statistical properties.
Mei, Zhangrong; Mao, Yonghua
2014-09-22
A class of electromagnetic sources with sinc Schell-model correlations is introduced. The conditions on the source parameters guaranteeing that the source generates a physical beam are derived. The evolution of the statistical properties of the electromagnetic stochastic beams generated by this new source on propagation in free space and in atmospheric turbulence is investigated with the help of the weighted superposition method and numerical simulations. It is demonstrated that the intensity distributions of such beams exhibit unique features on propagation in free space and produce a double-layer flat-top profile that is shape-invariant in the far field. This feature makes the new beam particularly suitable for some special laser processing applications. The influence of atmospheric turbulence with a non-Kolmogorov power spectrum on the statistical properties of the new beams is analyzed in detail.
Detecting Answer Copying Using Alternate Test Forms and Seat Locations in Small-Scale Examinations
ERIC Educational Resources Information Center
van der Ark, L. Andries; Emons, Wilco H. M.; Sijtsma, Klaas
2008-01-01
Two types of answer-copying statistics for detecting copiers in small-scale examinations are proposed. One statistic identifies the "copier-source" pair, and the other in addition suggests who is copier and who is source. Both types of statistics can be used when the examination has alternate test forms. A simulation study shows that the…
Syndromic surveillance models using Web data: the case of scarlet fever in the UK.
Samaras, Loukas; García-Barriocanal, Elena; Sicilia, Miguel-Angel
2012-03-01
Recent research has shown the potential of Web queries as a source for syndromic surveillance, and existing studies show that these queries can be used as a basis for estimation and prediction of the development of a syndromic disease, such as influenza, using log linear (logit) statistical models. Two alternative models are applied to the relationship between cases and Web queries in this paper. We examine the applicability of using statistical methods to relate search engine queries with scarlet fever cases in the UK, taking advantage of tools to acquire the appropriate data from Google, and using an alternative statistical method based on gamma distributions. The results show that using logit models, the Pearson correlation factor between Web queries and the data obtained from the official agencies must be over 0.90, otherwise the prediction of the peak and the spread of the distributions gives significant deviations. In this paper, we describe the gamma distribution model and show that we can obtain better results in all cases using gamma transformations, and especially in those with a smaller correlation factor.
NASA Astrophysics Data System (ADS)
Yang, Pan; Ng, Tze Ling
2017-11-01
Accurate rainfall measurement at high spatial and temporal resolutions is critical for the modeling and management of urban storm water. In this study, we conduct computer simulation experiments to test the potential of a crowd-sourcing approach, where smartphones, surveillance cameras, and other devices act as precipitation sensors, as an alternative to the traditional approach of using rain gauges to monitor urban rainfall. The crowd-sourcing approach is promising as it has the potential to provide high-density measurements, albeit with relatively large individual errors. We explore the potential of this approach for urban rainfall monitoring and the subsequent implications for storm water modeling through a series of simulation experiments involving synthetically generated crowd-sourced rainfall data and a storm water model. The results show that even under conservative assumptions, crowd-sourced rainfall data lead to more accurate modeling of storm water flows as compared to rain gauge data. We observe the relative superiority of the crowd-sourcing approach to vary depending on crowd participation rate, measurement accuracy, drainage area, choice of performance statistic, and crowd-sourced observation type. A possible reason for our findings is the differences between the error structures of crowd-sourced and rain gauge rainfall fields resulting from the differences between the errors and densities of the raw measurement data underlying the two field types.
Ellis, Hugh; Schoenberger, Erica
2017-01-01
According to the most recent estimates, 842,000 deaths in low- to middle-income countries were attributable to inadequate water, sanitation and hygiene in 2012. Despite billions of dollars and decades of effort, we still lack a sound understanding of which kinds of WASH interventions are most effective in improving public health outcomes, and, as an important corollary, whether the right things are being measured. The World Health Organization (WHO) has made a concerted effort to compile comprehensive data on drinking water quality and sanitation in the developing world. A recent 2014 report provides information on three phenotypes (responses): Unsafe Water Deaths, Unsafe Sanitation Deaths, Unsafe Hygiene Deaths; two grouped phenotypes: Unsafe Water and Sanitation Deaths and Unsafe Water, Sanitation and Hygiene Deaths; and six explanatory variables (predictors): Improved Sanitation, Unimproved Water Source, Piped Water To Premises, Other Improved Water Source, Filtered and Bottled Water in the Household and Handwashing. Regression analyses were performed to identify statistically significant associations between these mortality responses and predictors. Good fitted-model performance required: (1) the use of population-normalized death fractions as opposed to number of deaths; (2) a transformed response (logit or power); and (3) square-root transformation of the predictors. Given the complexity and heterogeneity of the relationships and countries being studied, these models exhibited remarkable performance and explained, for example, about 85% of the observed variance in population-normalized Unsafe Sanitation Death fraction, with a high F-statistic and highly statistically significant predictor p-values. Similar performance was found for all other responses, which was an unexpected result (the expected associations between responses and predictors, i.e., water-related with water-related, etc., did not occur). The set of statistically significant predictors remains the same across all responses. That is, Unimproved Water Source (UWS), Improved Sanitation (IS) and Filtered and Bottled Water in the Household (FBH) were the only statistically significant predictors whether the response was Unsafe Sanitation Death Fraction, Unsafe Hygiene Death Fraction or Unsafe Water Death Fraction. Moreover, the fraction of variance explained for all fitted models remained relatively high (adjusted R2 ranges from 0.7605 to 0.8533). We find that two of the statistically significant predictors, Improved Sanitation and Unimproved Water Source, are particularly influential. We also find that some predictors (Piped Water to Premises, Other Improved Water Sources) have very little explanatory power for predicting mortality, that one (Other Improved Water Sources) has a counterintuitive effect on the response (Unsafe Sanitation Death Fraction increases with increases in OIWS), and that one predictor (Handwashing) has essentially no explanatory usefulness. Our results suggest that a higher priority may need to be given to improved sanitation than has been the case. Nevertheless, while our focus in this paper is mortality, morbidity is a staggering consequence of inadequate water, sanitation and hygiene, and lower impact on mortality may not mean a similarly low impact on morbidity. More specifically, those predictors that we found uninfluential for predicting mortality-related responses may indeed be important when morbidity is the response.
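A hedged sketch of the reported model form: a logit-transformed, population-normalized death fraction regressed by OLS on square-root-transformed predictors. The data below are synthetic placeholders, not the WHO country data, and the coefficients have no substantive meaning.

```python
# Logit-transformed response with square-root-transformed predictors, fitted by OLS.
import numpy as np

rng = np.random.default_rng(6)
n = 90                                               # hypothetical number of countries
X_raw = rng.uniform(0.05, 0.95, size=(n, 3))         # e.g. IS, UWS, FBH shares (synthetic)
frac = np.clip(0.02 + 0.15 * X_raw[:, 1] - 0.10 * X_raw[:, 0]
               + rng.normal(0, 0.01, n), 1e-4, 0.5)  # death fraction (synthetic)

y = np.log(frac / (1 - frac))                        # logit transform of the response
X = np.column_stack([np.ones(n), np.sqrt(X_raw)])    # sqrt-transformed predictors
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid.var() / y.var()
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - X.shape[1])
print("coefficients:", np.round(beta, 3), "adjusted R2:", round(adj_r2, 3))
```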
Plis, Sergey M; George, J S; Jun, S C; Paré-Blagoev, J; Ranken, D M; Wood, C C; Schmidt, D M
2007-01-01
We propose a new model to approximate spatiotemporal noise covariance for use in neural electromagnetic source analysis, which better captures temporal variability in background activity. As with other existing formalisms, our model employs a Kronecker product of matrices representing temporal and spatial covariance. In our model, spatial components are allowed to have differing temporal covariances. Variability is represented as a series of Kronecker products of spatial component covariances and corresponding temporal covariances. Unlike previous attempts to model covariance through a sum of Kronecker products, our model is designed to have a computationally manageable inverse. Despite increased descriptive power, inversion of the model is fast, making it useful in source analysis. We have explored two versions of the model. One is estimated based on the assumption that spatial components of background noise have uncorrelated time courses. Another version, which gives closer approximation, is based on the assumption that time courses are statistically independent. The accuracy of the structural approximation is compared to an existing model, based on a single Kronecker product, using both Frobenius norm of the difference between spatiotemporal sample covariance and a model, and scatter plots. Performance of ours and previous models is compared in source analysis of a large number of single dipole problems with simulated time courses and with background from authentic magnetoencephalography data.
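A short sketch of why Kronecker-structured spatiotemporal covariance is computationally convenient: for a single Kronecker product, the inverse factorizes as (S_spatial ⊗ S_temporal)^(-1) = S_spatial^(-1) ⊗ S_temporal^(-1), so the full space-time matrix never has to be inverted directly. The proposed model generalizes this to a sum of such products while keeping the inverse manageable; the code below only demonstrates the single-product identity with random matrices.

```python
# Verify the Kronecker inverse identity that motivates this covariance structure.
import numpy as np

rng = np.random.default_rng(7)
ns, nt = 4, 5
A = rng.normal(size=(ns, ns)); S_space = A @ A.T + ns * np.eye(ns)   # SPD spatial cov
B = rng.normal(size=(nt, nt)); S_time = B @ B.T + nt * np.eye(nt)    # SPD temporal cov

full_inverse = np.linalg.inv(np.kron(S_space, S_time))
kron_inverse = np.kron(np.linalg.inv(S_space), np.linalg.inv(S_time))
print("max abs difference:", np.abs(full_inverse - kron_inverse).max())
```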
NASA Astrophysics Data System (ADS)
Basu, Nandita B.; Fure, Adrian D.; Jawitz, James W.
2008-07-01
Simulations of nonpartitioning and partitioning tracer tests were used to parameterize the equilibrium stream tube model (ESM) that predicts the dissolution dynamics of dense nonaqueous phase liquids (DNAPLs) as a function of the Lagrangian properties of DNAPL source zones. Lagrangian, or stream-tube-based, approaches characterize source zones with as few as two trajectory-integrated parameters, in contrast to the potentially thousands of parameters required to describe the point-by-point variability in permeability and DNAPL in traditional Eulerian modeling approaches. The spill and subsequent dissolution of DNAPLs were simulated in two-dimensional domains having different hydrologic characteristics (variance of the log conductivity field = 0.2, 1, and 3) using the multiphase flow and transport simulator UTCHEM. Nonpartitioning and partitioning tracers were used to characterize the Lagrangian properties (travel time and trajectory-integrated DNAPL content statistics) of DNAPL source zones, which were in turn shown to be sufficient for accurate prediction of source dissolution behavior using the ESM throughout the relatively broad range of hydraulic conductivity variances tested here. The results were found to be relatively insensitive to travel time variability, suggesting that dissolution could be accurately predicted even if the travel time variance was only coarsely estimated. Estimation of the ESM parameters was also demonstrated using an approximate technique based on Eulerian data in the absence of tracer data; however, determining the minimum amount of such data required remains for future work. Finally, the stream tube model was shown to be a more unique predictor of dissolution behavior than approaches based on the ganglia-to-pool model for source zone characterization.
Palazón, L; Navas, A
2017-06-01
Information on sediment contributions and transport dynamics from the contributing catchments is needed to develop management plans that tackle environmental problems related to the effects of fine sediment, such as reservoir siltation. In this respect, the fingerprinting technique is an indirect technique known to be valuable and effective for sediment source identification in river catchments. Large variability in sediment delivery was found in previous studies in the Barasona catchment (1509 km², Central Spanish Pyrenees). Simulation results with SWAT and fingerprinting approaches identified badlands and agricultural uses as the main contributors to sediment supply in the reservoir. In this study, the <63 μm fraction of the surface reservoir sediments (2 cm) is investigated following the fingerprinting procedure to assess how the use of different statistical procedures affects the estimated source contributions. Three optimum composite fingerprints were selected from the same dataset to discriminate between source contributions based on land uses/land covers by applying (1) discriminant function analysis, and its combination (as a second step) with (2) the Kruskal-Wallis H-test and (3) principal components analysis. Source contribution results differed between the assessed options, with the greatest differences observed for option #3, the two-step process of principal components analysis and discriminant function analysis. The characteristics of the solutions from the applied mixing model and the conceptual understanding of the catchment showed that the most reliable solution was achieved using option #2, the two-step process of the Kruskal-Wallis H-test and discriminant function analysis. The assessment showed the importance of the statistical procedure used to define the optimum composite fingerprint for sediment fingerprinting applications.
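A hedged sketch of the two-step selection found most reliable here: keep only tracer properties that discriminate between source groups under a Kruskal-Wallis H-test, then pass the survivors to a discriminant analysis (LDA is used below as a stand-in for DFA). The geochemical data are synthetic.

```python
# Two-step composite fingerprint selection: Kruskal-Wallis screening followed
# by linear discriminant analysis on the retained properties.
import numpy as np
from scipy.stats import kruskal
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(8)
groups = np.repeat([0, 1, 2], 20)                       # three land-use source groups
X = rng.normal(size=(60, 6))                            # six candidate tracer properties
X[:, 0] += groups * 1.5                                 # property 0 discriminates strongly
X[:, 3] += groups * 0.8                                 # property 3 discriminates weakly

keep = [j for j in range(X.shape[1])
        if kruskal(*[X[groups == g, j] for g in np.unique(groups)]).pvalue < 0.05]
lda = LinearDiscriminantAnalysis().fit(X[:, keep], groups)
print("properties retained:", keep,
      "| LDA training accuracy:", lda.score(X[:, keep], groups))
```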
A Parametric Study of Fine-scale Turbulence Mixing Noise
NASA Technical Reports Server (NTRS)
Khavaran, Abbas; Bridges, James; Freund, Jonathan B.
2002-01-01
The present paper is a study of aerodynamic noise spectra from model functions that describe the source. The study is motivated by the need to improve the spectral shape of the MGBK jet noise prediction methodology at high frequency. The predicted spectral shape usually appears less broadband than measurements and faster decaying at high frequency. Theoretical representation of the source is based on Lilley's equation. Numerical simulations of high-speed subsonic jets as well as some recent turbulence measurements reveal a number of interesting statistical properties of turbulence correlation functions that may have a bearing on radiated noise. These studies indicate that an exponential spatial function may be a more appropriate representation of a two-point correlation compared to its Gaussian counterpart. The effect of source non-compactness on spectral shape is discussed. It is shown that source non-compactness could well be the differentiating factor between the Gaussian and exponential model functions. In particular, the fall-off of the noise spectra at high frequency is studied and it is shown that a non-compact source with an exponential model function results in a broader spectrum and better agreement with data. An alternate source model that represents the source as a covariance of the convective derivative of fine-scale turbulence kinetic energy is also examined.
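The point about spectral fall-off can be illustrated numerically: an exponential two-point correlation has a much slower, power-law spectral decay than a Gaussian one, so it yields broader high-frequency spectra. The sketch below compares the two via the FFT with an arbitrary length scale; it is a generic illustration, not the MGBK source model.

```python
# Compare the spectra of Gaussian vs exponential two-point correlations:
# the exponential form decays far more slowly at high wavenumber.
import numpy as np

L, dx, n = 1.0, 0.01, 2 ** 14
x = (np.arange(n) - n // 2) * dx
gauss_corr = np.exp(-(x / L) ** 2)
expo_corr = np.exp(-np.abs(x) / L)

def spectrum(corr):
    s = np.abs(np.fft.rfft(np.fft.ifftshift(corr))) * dx
    return s / s[0]                                   # normalize to unity at k = 0

k = np.fft.rfftfreq(n, dx) * 2 * np.pi
i = np.searchsorted(k, 10.0 / L)                      # a representative high wavenumber
print("Gaussian spectrum at kL=10:    %.2e" % spectrum(gauss_corr)[i])
print("Exponential spectrum at kL=10: %.2e" % spectrum(expo_corr)[i])
```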
Dong, Yingying; Luo, Ruisen; Feng, Haikuan; Wang, Jihua; Zhao, Jinling; Zhu, Yining; Yang, Guijun
2014-01-01
Differences exist among analysis results of agricultural monitoring and crop production based on remote sensing observations that are obtained at different spatial scales from multiple remote sensors in the same time period and processed by the same algorithms, models or methods. These differences can be quantitatively described mainly from three aspects, i.e. multiple remote sensing observations, crop parameter estimation models, and spatial scale effects of surface parameters. Our research proposes a new method to analyse and correct the differences between multi-source and multi-scale spatial remote sensing surface reflectance datasets, aiming to provide a reference for further studies in agricultural applications with multiple remotely sensed observations from different sources. The new method is constructed on the basis of the physical and mathematical properties of multi-source and multi-scale reflectance datasets. Statistical theory is used to extract statistical characteristics of the multiple surface reflectance datasets and to quantitatively analyse the spatial variations of these characteristics at multiple spatial scales. Then, taking the surface reflectance at the small spatial scale as the baseline data, Gaussian distribution theory is used to correct the multiple surface reflectance datasets based on the physical characteristics, mathematical distribution properties, and spatial variations obtained above. The proposed method was verified with two sets of multiple satellite images obtained in two experimental fields located in Inner Mongolia and Beijing, China, with different degrees of homogeneity of the underlying surfaces. Experimental results indicate that differences between surface reflectance datasets at multiple spatial scales can be effectively corrected over non-homogeneous underlying surfaces, providing a database for further multi-source and multi-scale crop growth monitoring and yield prediction, and for the corresponding consistency analysis and evaluation.
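A hedged sketch of the correction idea: take the fine-scale reflectance as the baseline and adjust a coarser-scale dataset so that its Gaussian distribution parameters (mean and standard deviation) match the baseline. The reflectance values below are synthetic, and the simple moment-matching rule is an assumption for illustration.

```python
# Gaussian-statistics matching: rescale a biased coarse-scale reflectance
# dataset so its mean and standard deviation match the fine-scale baseline.
import numpy as np

rng = np.random.default_rng(9)
fine = np.clip(rng.normal(0.25, 0.05, 5000), 0, 1)      # baseline reflectance (small scale)
coarse = np.clip(rng.normal(0.29, 0.07, 5000), 0, 1)    # biased coarse-scale reflectance

corrected = (coarse - coarse.mean()) / coarse.std() * fine.std() + fine.mean()
print("before:", round(coarse.mean(), 3), round(coarse.std(), 3))
print("after: ", round(corrected.mean(), 3), round(corrected.std(), 3),
      "target:", round(fine.mean(), 3), round(fine.std(), 3))
```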
Observability of ionospheric space-time structure with ISR: A simulation study
NASA Astrophysics Data System (ADS)
Swoboda, John; Semeter, Joshua; Zettergren, Matthew; Erickson, Philip J.
2017-02-01
The sources of error from electronically steerable array (ESA) incoherent scatter radar (ISR) systems are investigated both theoretically and with the use of an open-source ISR simulator, developed by the authors, called Simulator for ISR (SimISR). The main sources of error incorporated in the simulator include statistical uncertainty, which arises due to the nature of the measurement mechanism, and the inherent space-time ambiguity from the sensor. SimISR can take a field of plasma parameters, parameterized by time and space, and create simulated ISR data at the scattered electric field (i.e., complex receiver voltage) level, subsequently processing these data to show possible reconstructions of the original parameter field. To demonstrate general utility, we show a number of simulation examples, with two cases using data from a self-consistent multifluid transport model. Results highlight the significant influence of the forward model of the ISR process and the resulting statistical uncertainty on plasma parameter measurements and the core experiment design trade-offs that must be made when planning observations. These conclusions further underscore the utility of this class of measurement simulator as a design tool for more optimal experiment design efforts using flexible ESA-class ISR systems.
Application of classification-tree methods to identify nitrate sources in ground water
Spruill, T.B.; Showers, W.J.; Howe, S.S.
2002-01-01
A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model I was constructed by evaluating 32 variables and selecting four primary predictor variables (δ15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A δ15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except δ15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.
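A minimal sketch of the classification-tree approach: fit a decision tree on a few hydrochemical predictors and report the classification success on the learning samples. The features and class labels below are synthetic stand-ins for the study's variables (δ15N, ion ratios, zinc), not the original data.

```python
# Decision-tree classification of nitrate source categories on synthetic data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(10)
n_per_class = 12
classes = ["crop fertilizer", "golf course", "spray", "poultry", "septic"]
X, y = [], []
for c, shift in enumerate(np.linspace(0, 4, len(classes))):
    X.append(rng.normal(loc=shift, scale=1.0, size=(n_per_class, 4)))
    y += [c] * n_per_class
X, y = np.vstack(X), np.array(y)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print("learning-sample success rate:", round(tree.score(X, y), 2))
```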
Educational Statistics and School Improvement. Statistics and the Federal Role in Education.
ERIC Educational Resources Information Center
Hawley, Willis D.
This paper focuses on how educational statistics might better serve the quest for educational improvement in elementary and secondary schools. A model for conceptualizing the sources and processes of school productivity is presented. The Learning Productivity Model suggests that school outcomes are the consequence of the interaction of five…
Latitude Dependence of Low-Altitude O+ Ion Upflow: Statistical Results From FAST Observations
NASA Astrophysics Data System (ADS)
Zhao, K.; Chen, K. W.; Jiang, Y.; Chen, W. J.; Huang, L. F.; Fu, S.
2017-09-01
We introduce a statistical model to explain the latitudinal dependence of the occurrence rate and energy flux of the ionospheric escaping ions, taking advantage of advances in the spatial coverage and accuracy of FAST observations. We use a weighted piecewise Gaussian function to fit the dependence, because two probability peaks are located in the dayside polar cusp source region and the nightside auroral oval zone source region. The statistical results show that (1) the Gaussian Mixture Model suitably describes the dayside polar cusp upflows, and the dayside and the nightside auroral oval zone upflows. (2) The magnetic latitudes of the ionospheric upflow source regions expand toward the magnetic equator as Kp increases, from 81° magnetic latitude (MLAT) (cusp upflows) and 63° MLAT (auroral oval upflows) during quiet times to 76° MLAT and 61° MLAT, respectively. (3) The dayside polar cusp region provides only 3-5% O+ upflows among all the source regions, which include the dayside auroral oval zone, dayside polar cusp, nightside auroral oval zone, and even the polar cap. However, observations show that more than 70% of upflows occur in the auroral oval zone and that the occurrence probability increases at the altitudes of 3500-4200 km, which is considered to be the lower altitude boundary of ion beams. This observed result suggests that soft electron precipitation and transverse wave heating are the most efficient ion energization/acceleration mechanisms at the altitudes of FAST orbit, and that the parallel acceleration caused by field-aligned potential drops becomes effective above that altitude.
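A hedged sketch of fitting a two-component Gaussian mixture to upflow event magnetic latitudes, one component for the auroral-oval source region and one for the cusp. The event latitudes below are synthetic, with means loosely motivated by the quiet-time values quoted in the abstract.

```python
# Two-component Gaussian mixture fit to synthetic upflow event latitudes.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(11)
mlat = np.r_[rng.normal(63.0, 3.0, 700),        # auroral-oval upflows (deg MLAT)
             rng.normal(81.0, 2.0, 300)]        # cusp upflows (deg MLAT)

gmm = GaussianMixture(n_components=2, random_state=0).fit(mlat.reshape(-1, 1))
for w, mu, var in zip(gmm.weights_, gmm.means_.ravel(), gmm.covariances_.ravel()):
    print(f"weight {w:.2f}  mean {mu:.1f} deg MLAT  sigma {np.sqrt(var):.1f}")
```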
How Many Separable Sources? Model Selection In Independent Components Analysis
Woods, Roger P.; Hansen, Lars Kai; Strother, Stephen
2015-01-01
Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Based on simulation studies, the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. Failure of the Akaike Information Criterion in model selection also has relevance in traditional independent components analysis where all sources are assumed non-Gaussian. PMID:25811988
NASA Astrophysics Data System (ADS)
Ross, Z. E.; Ben-Zion, Y.; Zhu, L.
2015-02-01
We analyse source tensor properties of seven Mw > 4.2 earthquakes in the complex trifurcation area of the San Jacinto Fault Zone, CA, with a focus on isotropic radiation that may be produced by rock damage in the source volumes. The earthquake mechanisms are derived with generalized `Cut and Paste' (gCAP) inversions of three-component waveforms typically recorded by >70 stations at regional distances. The gCAP method includes parameters ζ and χ representing, respectively, the relative strength of the isotropic and CLVD source terms. The possible errors in the isotropic and CLVD components due to station variability are quantified with bootstrap resampling for each event. The results indicate statistically significant explosive isotropic components for at least six of the events, corresponding to ˜0.4-8 per cent of the total potency/moment of the sources. In contrast, the CLVD components for most events are not found to be statistically significant. Trade-off and correlation between the isotropic and CLVD components are studied using synthetic tests with realistic station configurations. The associated uncertainties are found to be generally smaller than the observed isotropic components. Two different tests with velocity model perturbation are conducted to quantify the uncertainty due to inaccuracies in the Green's functions. Applications of the Mann-Whitney U test indicate statistically significant explosive isotropic terms for most events, consistent with brittle damage production at the source.
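A simplified, hedged sketch of the bootstrap idea used to assess station variability: resample stations with replacement, recompute the estimate, and read a confidence interval off the resampled distribution. The study resamples stations within the gCAP inversion itself; here the per-station values are hypothetical single-station estimates of the isotropic fraction, used only to show the mechanics.

```python
# Bootstrap confidence interval on the isotropic fraction from per-station estimates.
import numpy as np

rng = np.random.default_rng(12)
iso_per_station = rng.normal(0.04, 0.02, 75)        # hypothetical per-station estimates

boot = np.array([rng.choice(iso_per_station, iso_per_station.size, replace=True).mean()
                 for _ in range(5000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"isotropic fraction: {iso_per_station.mean():.3f}  95% CI [{lo:.3f}, {hi:.3f}]")
print("statistically significant (CI excludes zero):", lo > 0)
```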
An Integrative Account of Constraints on Cross-Situational Learning
Yurovsky, Daniel; Frank, Michael C.
2015-01-01
Word-object co-occurrence statistics are a powerful information source for vocabulary learning, but there is considerable debate about how learners actually use them. While some theories hold that learners accumulate graded, statistical evidence about multiple referents for each word, others suggest that they track only a single candidate referent. In two large-scale experiments, we show that neither account is sufficient: Cross-situational learning involves elements of both. Further, the empirical data are captured by a computational model that formalizes how memory and attention interact with co-occurrence tracking. Together, the data and model unify opposing positions in a complex debate and underscore the value of understanding the interaction between computational and algorithmic levels of explanation. PMID:26302052
Computational and Statistical Models: A Comparison for Policy Modeling of Childhood Obesity
NASA Astrophysics Data System (ADS)
Mabry, Patricia L.; Hammond, Ross; Ip, Edward Hak-Sing; Huang, Terry T.-K.
As systems science methodologies have begun to emerge as a set of innovative approaches to address complex problems in behavioral, social science, and public health research, some apparent conflicts with traditional statistical methodologies for public health have arisen. Computational modeling is an approach set in context that integrates diverse sources of data to test the plausibility of working hypotheses and to elicit novel ones. Statistical models are reductionist approaches geared towards proving the null hypothesis. While these two approaches may seem contrary to each other, we propose that they are in fact complementary and can be used jointly to advance solutions to complex problems. Outputs from statistical models can be fed into computational models, and outputs from computational models can lead to further empirical data collection and statistical models. Together, this presents an iterative process that refines the models and contributes to a greater understanding of the problem and its potential solutions. The purpose of this panel is to foster communication and understanding between statistical and computational modelers. Our goal is to shed light on the differences between the approaches and convey what kinds of research inquiries each one is best for addressing and how they can serve complementary (and synergistic) roles in the research process, to mutual benefit. For each approach the panel will cover the relevant "assumptions" and how the differences in what is assumed can foster misunderstandings. The interpretations of the results from each approach will be compared and contrasted and the limitations for each approach will be delineated. We will use illustrative examples from CompMod, the Comparative Modeling Network for Childhood Obesity Policy. The panel will also incorporate interactive discussions with the audience on the issues raised here.
NASA Astrophysics Data System (ADS)
Xu, Y.; Jones, A. D.; Rhoades, A.
2017-12-01
Precipitation is a key component in hydrologic cycles, and changing precipitation regimes contribute to more intense and frequent drought and flood events around the world. Numerical climate modeling is a powerful tool to study climatology and to predict future changes. Despite the continuous improvement in numerical models, long-term precipitation prediction remains a challenge, especially at regional scales. To improve numerical simulations of precipitation, it is important to find out where the uncertainty in precipitation simulations comes from. There are two types of uncertainty in numerical model predictions. One is related to uncertainty in the input data, such as the model's boundary and initial conditions. These uncertainties would propagate to the final model outcomes even if the numerical model exactly replicated the true world. But a numerical model cannot exactly replicate the true world. Therefore, the other type of model uncertainty is related to the errors in the model physics, such as the parameterization of sub-grid scale processes, i.e., given precise input conditions, how much error could be generated by the imprecise model. Here, we build two statistical models based on a neural network algorithm to predict long-term variation of precipitation over California: one uses "true world" information derived from observations, and the other uses "modeled world" information using model inputs and outputs from the North America Coordinated Regional Downscaling Project (NA CORDEX). We derive multiple climate feature metrics as the predictors for the statistical model to represent the impact of global climate on local hydrology, and include topography as a predictor to represent the local control. We first compare the predictors between the true world and the modeled world to determine the errors contained in the input data. By perturbing the predictors in the statistical model, we estimate how much uncertainty in the model's final outcomes is accounted for by each predictor. By comparing the statistical model derived from true world information and modeled world information, we assess the errors lying in the physics of the numerical models. This work provides unique insight for assessing the performance of numerical climate models, and can be used to guide improvement of precipitation prediction.
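A hedged sketch of the statistical-emulator idea: train a small neural network to predict precipitation from climate feature metrics, then perturb (here, permute) each predictor to gauge how much of the output it accounts for. The predictors and response below are synthetic placeholders, and permutation-based attribution is an assumption for illustration, not the NA-CORDEX analysis itself.

```python
# Neural-network emulator with predictor perturbation to attribute output variance.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(13)
n = 2000
X = rng.normal(size=(n, 4))            # e.g. circulation index, SST index, moisture flux, elevation
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + 0.3 * X[:, 3] + rng.normal(0, 0.3, n)

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)

base = net.predict(X)
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])          # destroy the information in predictor j
    delta = np.var(base - net.predict(Xp))
    print(f"predictor {j}: output variance attributed ~ {delta:.2f}")
```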
On the application of quantum transport theory to electron sources.
Jensen, Kevin L
2003-01-01
Electron sources (e.g., field emitter arrays, wide band-gap (WBG) semiconductor materials and coatings, carbon nanotubes, etc.) seek to exploit ballistic transport within the vacuum after emission from microfabricated structures. Regardless of kind, all sources strive to minimize the barrier to electron emission by engineering material properties (work function/electron affinity) or physical geometry (field enhancement) of the cathode. The unique capabilities of cold cathodes, such as instant ON/OFF performance, high brightness, high current density, large transconductance to capacitance ratio, cold emission, small size and/or low voltage operation characteristics, commend their use in several advanced devices when physical size, weight, power consumption, beam current, and pulse repetition frequency are important, e.g., RF power amplifiers such as traveling wave tubes (TWTs) for radar and communications, electrodynamic tethers for satellite deboost/reboost, and electric propulsion systems such as Hall thrusters for small satellites. The theoretical program described herein is directed towards models to evaluate emission current from electron sources (in particular, emission from WBG and Spindt-type field emitters) in order to assess their utility, capabilities and performance characteristics. Modeling efforts particularly include: band bending, non-linear and resonant (Poole-Frenkel) potentials, the extension of one-dimensional theory to multi-dimensional structures, and emission site statistics due to variations in geometry and the presence of adsorbates. Two particular methodologies, namely, the modified Airy approach and the metal-semiconductor statistical hyperbolic/ellipsoidal model, are described in detail in their present stage of development.
NASA Technical Reports Server (NTRS)
Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.
1990-01-01
Neural network learning procedures and statistical classification methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two approaches have unique advantages and disadvantages in this classification application.
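A hedged sketch of one way reliability-weighted statistical multisource classification can be expressed: per-source class probabilities combined in a weighted log-opinion pool, with weights reflecting each data source's ranked reliability. The probabilities and weights below are illustrative assumptions, not the paper's specific rule.

```python
# Reliability-weighted combination of per-source class probabilities
# (weighted log-opinion pool) for one pixel.
import numpy as np

# class probabilities for one pixel from three sources (rows) over four classes
p = np.array([[0.60, 0.20, 0.10, 0.10],    # Landsat MSS
              [0.30, 0.40, 0.20, 0.10],    # elevation
              [0.25, 0.25, 0.25, 0.25]])   # aspect (least reliable, nearly uninformative)
w = np.array([1.0, 0.7, 0.3])              # reliability-based weights (hypothetical)

log_pool = (w[:, None] * np.log(p)).sum(axis=0)
combined = np.exp(log_pool - log_pool.max())
combined /= combined.sum()
print("combined class probabilities:", np.round(combined, 3))
print("assigned class:", combined.argmax())
```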
ALMA observations of lensed Herschel sources: testing the dark matter halo paradigm
NASA Astrophysics Data System (ADS)
Amvrosiadis, A.; Eales, S. A.; Negrello, M.; Marchetti, L.; Smith, M. W. L.; Bourne, N.; Clements, D. L.; De Zotti, G.; Dunne, L.; Dye, S.; Furlanetto, C.; Ivison, R. J.; Maddox, S. J.; Valiante, E.; Baes, M.; Baker, A. J.; Cooray, A.; Crawford, S. M.; Frayer, D.; Harris, A.; Michałowski, M. J.; Nayyeri, H.; Oliver, S.; Riechers, D. A.; Serjeant, S.; Vaccari, M.
2018-04-01
With the advent of wide-area submillimetre surveys, a large number of high-redshift gravitationally lensed dusty star-forming galaxies have been revealed. Because of the simplicity of the selection criteria for candidate lensed sources in such surveys, identified as those with S500 μm > 100 mJy, uncertainties associated with the modelling of the selection function are expunged. The combination of these attributes makes submillimetre surveys ideal for the study of strong lens statistics. We carried out a pilot study of the lensing statistics of submillimetre-selected sources by making observations with the Atacama Large Millimeter Array (ALMA) of a sample of strongly lensed sources selected from surveys carried out with the Herschel Space Observatory. We attempted to reproduce the distribution of image separations for the lensed sources using a halo mass function taken from a numerical simulation that contains both dark matter and baryons. We used three different density distributions, one based on analytical fits to the haloes formed in the EAGLE simulation and two density distributions [Singular Isothermal Sphere (SIS) and SISSA] that have been used before in lensing studies. We found that we could reproduce the observed distribution with all three density distributions, as long as we imposed an upper mass transition of ~10^13 M⊙ for the SIS and SISSA models, above which we assumed that the density distribution could be represented by a Navarro-Frenk-White profile. We show that we would need a sample of ~500 lensed sources to distinguish between the density distributions, which is practical given the predicted number of lensed sources in the Herschel surveys.
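For orientation, a back-of-the-envelope sketch of the image separation produced by the simplest of the density profiles mentioned above, the singular isothermal sphere (SIS); the velocity dispersion and the ratio of angular diameter distances are assumed values, not quantities taken from the paper.

```python
# SIS lens: Einstein radius theta_E = 4*pi*(sigma/c)^2 * D_ls/D_s, separation = 2*theta_E.
import numpy as np

c = 299792.458                      # speed of light, km/s
sigma_v = 250.0                     # km/s, assumed lens velocity dispersion
d_ls_over_d_s = 0.5                 # assumed ratio of angular diameter distances

theta_e = 4 * np.pi * (sigma_v / c) ** 2 * d_ls_over_d_s     # Einstein radius (rad)
separation_arcsec = 2 * theta_e * 206265.0                    # SIS image separation
print(f"image separation ≈ {separation_arcsec:.2f} arcsec")
```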
Quantum key distribution with entangled photon sources
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma Xiongfeng; Fung, Chi-Hang Fred; Lo, H.-K.
2007-07-15
A parametric down-conversion (PDC) source can be used as either a triggered single-photon source or an entangled-photon source in quantum key distribution (QKD). The triggering PDC QKD has already been studied in the literature. On the other hand, a model and a post-processing protocol for the entanglement PDC QKD are still missing. We fill in this important gap by proposing such a model and a post-processing protocol for the entanglement PDC QKD. Although the PDC model is proposed to study the entanglement-based QKD, we emphasize that our generic model may also be useful for other non-QKD experiments involving a PDC source. Since an entangled PDC source is a basis-independent source, we apply Koashi and Preskill's security analysis to the entanglement PDC QKD. We also investigate the entanglement PDC QKD with two-way classical communications. We find that the recurrence scheme increases the key rate and the Gottesman-Lo protocol helps tolerate higher channel losses. By simulating a recent 144-km open-air PDC experiment, we compare three implementations: entanglement PDC QKD, triggering PDC QKD, and coherent-state QKD. The simulation result suggests that the entanglement PDC QKD can tolerate higher channel losses than the coherent-state QKD. The coherent-state QKD with decoy states is able to achieve the highest key rate in the low- and medium-loss regions. By applying the Gottesman-Lo two-way post-processing protocol, the entanglement PDC QKD can tolerate up to 70 dB combined channel losses (35 dB for each channel) provided that the PDC source is placed in between Alice and Bob. After considering statistical fluctuations, the PDC setup can tolerate up to 53 dB channel losses.
Prediction of Down-Gradient Impacts of DNAPL Source Depletion Using Tracer Techniques
NASA Astrophysics Data System (ADS)
Basu, N. B.; Fure, A. D.; Jawitz, J. W.
2006-12-01
Four simplified DNAPL source depletion models that have been discussed in the literature recently are evaluated for the prediction of long-term effects of source depletion under natural gradient flow. These models are simple in form (a power function equation is an example) but are shown here to serve as mathematical analogs to complex multiphase flow and transport simulators. One of the source depletion models, the equilibrium streamtube model, is shown to be relatively easily parameterized using non-reactive and reactive tracers. Non-reactive tracers are used to characterize the aquifer heterogeneity while reactive tracers are used to describe the mean DNAPL mass and its distribution. This information is then used in a Lagrangian framework to predict source remediation performance. In a Lagrangian approach the source zone is conceptualized as a collection of non-interacting streamtubes with hydrodynamic and DNAPL heterogeneity represented by the variation of the travel time and DNAPL saturation among the streamtubes. The travel time statistics are estimated from the non-reactive tracer data while the DNAPL distribution statistics are estimated from the reactive tracer data. The combined statistics are used to define an analytical solution for contaminant dissolution under natural gradient flow. The tracer prediction technique compared favorably with results from a multiphase flow and transport simulator UTCHEM in domains with different hydrodynamic heterogeneity (variance of the log conductivity field = 0.2, 1 and 3).
Probabilistic projections of 21st century climate change over Northern Eurasia
NASA Astrophysics Data System (ADS)
Monier, E.; Sokolov, A. P.; Schlosser, C. A.; Scott, J. R.; Gao, X.
2013-12-01
We present probabilistic projections of 21st century climate change over Northern Eurasia using the Massachusetts Institute of Technology (MIT) Integrated Global System Model (IGSM), an integrated assessment model that couples an earth system model of intermediate complexity, with a two-dimensional zonal-mean atmosphere, to a human activity model. Regional climate change is obtained by two downscaling methods: a dynamical downscaling, where the IGSM is linked to a three dimensional atmospheric model; and a statistical downscaling, where a pattern scaling algorithm uses climate-change patterns from 17 climate models. This framework allows for key sources of uncertainty in future projections of regional climate change to be accounted for: emissions projections; climate system parameters (climate sensitivity, strength of aerosol forcing and ocean heat uptake rate); natural variability; and structural uncertainty. Results show that the choice of climate policy and the climate parameters are the largest drivers of uncertainty. We also find that different initial conditions lead to differences in patterns of change as large as when using different climate models. Finally, this analysis reveals the wide range of possible climate change over Northern Eurasia, emphasizing the need to consider all sources of uncertainty when modeling climate impacts over Northern Eurasia.
Probabilistic projections of 21st century climate change over Northern Eurasia
NASA Astrophysics Data System (ADS)
Monier, Erwan; Sokolov, Andrei; Schlosser, Adam; Scott, Jeffery; Gao, Xiang
2013-12-01
We present probabilistic projections of 21st century climate change over Northern Eurasia using the Massachusetts Institute of Technology (MIT) Integrated Global System Model (IGSM), an integrated assessment model that couples an Earth system model of intermediate complexity with a two-dimensional zonal-mean atmosphere to a human activity model. Regional climate change is obtained by two downscaling methods: a dynamical downscaling, where the IGSM is linked to a three-dimensional atmospheric model, and a statistical downscaling, where a pattern scaling algorithm uses climate change patterns from 17 climate models. This framework allows for four major sources of uncertainty in future projections of regional climate change to be accounted for: emissions projections, climate system parameters (climate sensitivity, strength of aerosol forcing and ocean heat uptake rate), natural variability, and structural uncertainty. The results show that the choice of climate policy and the climate parameters are the largest drivers of uncertainty. We also find that different initial conditions lead to differences in patterns of change as large as when using different climate models. Finally, this analysis reveals the wide range of possible climate change over Northern Eurasia, emphasizing the need to consider these sources of uncertainty when modeling climate impacts over Northern Eurasia.
Model Performance Evaluation and Scenario Analysis ...
This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors. The performance measures include error analysis, coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components and the reconstruction back to time series provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify if the source of uncertainty in the simulated data is due to the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify if mismatches between observed and simulated data result from magnitude or sequence related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the tool
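A sketch of two of the goodness-of-fit ideas described above: the Nash-Sutcliffe efficiency, and a simple separation of the error into a "magnitude" part (compare sorted values) and a "sequence" part (compare the orderings). This is an illustrative reading of the magnitude/sequence decomposition, not MPESA's implementation, and the data are synthetic.

```python
# Nash-Sutcliffe efficiency plus a crude magnitude/sequence error split (illustrative).
import numpy as np

def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = np.array([1.2, 3.4, 2.2, 5.1, 4.0, 2.8])
sim = np.array([1.0, 3.0, 2.9, 4.5, 4.4, 2.5])

magnitude_error = np.mean(np.abs(np.sort(obs) - np.sort(sim)))   # magnitudes only
rank_obs = obs.argsort().argsort()                               # ranks encode the sequence
rank_sim = sim.argsort().argsort()
sequence_error = np.mean(np.abs(rank_obs - rank_sim))            # ordering only
print(f"NSE = {nse(obs, sim):.3f}, magnitude error = {magnitude_error:.3f}, "
      f"mean rank mismatch = {sequence_error:.2f}")
```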
A two-component rain model for the prediction of attenuation statistics
NASA Technical Reports Server (NTRS)
Crane, R. K.
1982-01-01
A two-component rain model has been developed for calculating attenuation statistics. In contrast to most other attenuation prediction models, the two-component model calculates the occurrence probability for volume cells or debris attenuation events. The model performed significantly better than the International Radio Consultative Committee model when used for predictions on earth-satellite paths. It is expected that the model will have applications in modeling the joint statistics required for space diversity system design, the statistics of interference due to rain scatter at attenuating frequencies, and the duration statistics for attenuation events.
NASA Astrophysics Data System (ADS)
Feyen, Luc; Caers, Jef
2006-06-01
In this work, we address the problem of characterizing the heterogeneity and uncertainty of hydraulic properties for complex geological settings. Hereby, we distinguish between two scales of heterogeneity, namely the hydrofacies structure and the intrafacies variability of the hydraulic properties. We employ multiple-point geostatistics to characterize the hydrofacies architecture. The multiple-point statistics are borrowed from a training image that is designed to reflect the prior geological conceptualization. The intrafacies variability of the hydraulic properties is represented using conventional two-point correlation methods, more precisely, spatial covariance models under a multi-Gaussian spatial law. We address the different levels and sources of uncertainty in characterizing the subsurface heterogeneity, and explore their effect on groundwater flow and transport predictions. Typically, uncertainty is assessed by way of many images, termed realizations, of a fixed statistical model. However, in many cases, sampling from a fixed stochastic model does not adequately represent the space of uncertainty. It neglects the uncertainty related to the selection of the stochastic model and the estimation of its input parameters. We acknowledge the uncertainty inherent in the definition of the prior conceptual model of aquifer architecture and in the estimation of global statistics, anisotropy, and correlation scales. Spatial bootstrap is used to assess the uncertainty of the unknown statistical parameters. As an illustrative example, we employ a synthetic field that represents a fluvial setting consisting of an interconnected network of channel sands embedded within finer-grained floodplain material. For this highly non-stationary setting we quantify the groundwater flow and transport model prediction uncertainty for various levels of hydrogeological uncertainty. Results indicate the importance of accurately describing the facies geometry, especially for transport predictions.
NASA Astrophysics Data System (ADS)
Bakosi, J.; Franzese, P.; Boybeyi, Z.
2007-11-01
Dispersion of a passive scalar from concentrated sources in fully developed turbulent channel flow is studied with the probability density function (PDF) method. The joint PDF of velocity, turbulent frequency and scalar concentration is represented by a large number of Lagrangian particles. A stochastic near-wall PDF model combines the generalized Langevin model of Haworth and Pope [Phys. Fluids 29, 387 (1986)] with Durbin's [J. Fluid Mech. 249, 465 (1993)] method of elliptic relaxation to provide a mathematically exact treatment of convective and viscous transport with a nonlocal representation of the near-wall Reynolds stress anisotropy. The presence of walls is incorporated through the imposition of no-slip and impermeability conditions on particles without the use of damping or wall-functions. Information on the turbulent time scale is supplied by the gamma-distribution model of van Slooten et al. [Phys. Fluids 10, 246 (1998)]. Two different micromixing models are compared that incorporate the effect of small scale mixing on the transported scalar: the widely used interaction by exchange with the mean and the interaction by exchange with the conditional mean model. Single-point velocity and concentration statistics are compared to direct numerical simulation and experimental data at Reτ=1080 based on the friction velocity and the channel half width. The joint model accurately reproduces a wide variety of conditional and unconditional statistics in both physical and composition space.
NASA Astrophysics Data System (ADS)
Muhammad, Ario; Goda, Katsuichiro
2018-03-01
This study investigates the impact of model complexity in source characterization and digital elevation model (DEM) resolution on the accuracy of tsunami hazard assessment and fatality estimation through a case study in Padang, Indonesia. Two types of earthquake source models, i.e. complex and uniform slip models, are adopted by considering three resolutions of DEMs, i.e. 150 m, 50 m, and 10 m. For each of the three grid resolutions, 300 complex source models are generated using new statistical prediction models of earthquake source parameters developed from extensive finite-fault models of past subduction earthquakes, whilst 100 uniform slip models are constructed with variable fault geometry without slip heterogeneity. The results highlight that significant changes to tsunami hazard and fatality estimates are observed with regard to earthquake source complexity and grid resolution. Coarse resolution (i.e. 150 m) leads to inaccurate tsunami hazard prediction and fatality estimation, whilst 50-m and 10-m resolutions produce similar results. However, velocity and momentum flux are sensitive to the grid resolution and hence, at least 10-m grid resolution needs to be implemented when considering flow-based parameters for tsunami hazard and risk assessments. In addition, the results indicate that the tsunami hazard parameters and fatality number are more sensitive to the complexity of earthquake source characterization than the grid resolution. Thus, the uniform models are not recommended for probabilistic tsunami hazard and risk assessments. Finally, the findings confirm that uncertainties of tsunami hazard level and fatality in terms of depth, velocity and momentum flux can be captured and visualized through the complex source modeling approach. From tsunami risk management perspectives, this indeed creates big data, which are useful for making effective and robust decisions.
NASA Astrophysics Data System (ADS)
Ars, Sébastien; Broquet, Grégoire; Yver Kwok, Camille; Roustan, Yelva; Wu, Lin; Arzoumanian, Emmanuel; Bousquet, Philippe
2017-12-01
This study presents a new concept for estimating the pollutant emission rates of a site and its main facilities using a series of atmospheric measurements across the pollutant plumes. This concept combines the tracer release method, local-scale atmospheric transport modelling and a statistical atmospheric inversion approach. The conversion between the controlled emission and the measured atmospheric concentrations of the released tracer across the plume places valuable constraints on the atmospheric transport. This is used to optimise the configuration of the transport model parameters and the model uncertainty statistics in the inversion system. The emission rates of all sources are then inverted to optimise the match between the concentrations simulated with the transport model and the pollutants' measured atmospheric concentrations, accounting for the transport model uncertainty. In principle, by using atmospheric transport modelling, this concept does not strongly rely on the good colocation between the tracer and pollutant sources and can be used to monitor multiple sources within a single site, unlike the classical tracer release technique. The statistical inversion framework and the use of the tracer data for the configuration of the transport and inversion modelling systems should ensure that the transport modelling errors are correctly handled in the source estimation. The potential of this new concept is evaluated with a relatively simple practical implementation based on a Gaussian plume model and a series of inversions of controlled methane point sources using acetylene as a tracer gas. The experimental conditions are chosen so that they are suitable for the use of a Gaussian plume model to simulate the atmospheric transport. In these experiments, different configurations of methane and acetylene point source locations are tested to assess the efficiency of the method in comparison to the classic tracer release technique in coping with the distances between the different methane and acetylene sources. The results from these controlled experiments demonstrate that, when the targeted and tracer gases are not well collocated, this new approach provides a better estimate of the emission rates than the tracer release technique. As an example, the relative error between the estimated and actual emission rates is reduced from 32 % with the tracer release technique to 16 % with the combined approach in the case of a tracer located 60 m upwind of a single methane source. Further studies and more complex implementations with more advanced transport models and more advanced optimisations of their configuration will be required to generalise the applicability of the approach and strengthen its robustness.
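A minimal forward-model sketch in the spirit of the Gaussian plume calculation used above; the dispersion coefficients, source height, and emission rates are assumed values rather than the study's calibrated configuration, and the tracer-ratio comparison at the end is only illustrative.

```python
# Gaussian plume concentration from a point source, with ground reflection.
import numpy as np

def gaussian_plume(q, u, x, y, z, h_src=1.0):
    """Concentration (kg m^-3) at (x, y, z) downwind of a point source emitting
    q (kg/s) at height h_src (m), with wind speed u (m/s) along +x."""
    sigma_y = 0.08 * x ** 0.9          # assumed power-law dispersion growth
    sigma_z = 0.06 * x ** 0.9
    lateral = np.exp(-y ** 2 / (2 * sigma_y ** 2))
    vertical = (np.exp(-(z - h_src) ** 2 / (2 * sigma_z ** 2))
                + np.exp(-(z + h_src) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * np.pi * u * sigma_y * sigma_z) * lateral * vertical

# Tracer-ratio idea: the measured tracer plume constrains the transport, so the
# ratio of modelled concentrations can be used to scale the unknown source.
c_ch4 = gaussian_plume(q=1.0, u=3.0, x=100.0, y=5.0, z=1.5)
c_c2h2 = gaussian_plume(q=0.5, u=3.0, x=100.0, y=5.0, z=1.5)
print(c_ch4, c_c2h2, c_ch4 / c_c2h2)
```

In the combined approach described in the abstract, a forward model of this kind is first checked against the controlled tracer release and then embedded in a statistical inversion, rather than being used through a simple concentration ratio.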
Vahedi, Shahrum; Farrokhi, Farahman; Gahramani, Farahnaz; Issazadegan, Ali
2012-01-01
Approximately 66-80% of graduate students experience statistics anxiety, and some researchers propose that many students identify statistics courses as the most anxiety-inducing courses in their academic curriculums. As such, it is likely that statistics anxiety is, in part, responsible for many students delaying enrollment in these courses for as long as possible. This paper proposes a canonical model by treating academic procrastination (AP) and learning strategies (LS) as predictor variables and statistics anxiety (SA) as the explained variable. A questionnaire survey was used for data collection, and 246 female college students participated in this study. To examine the mutually independent relations between the procrastination, learning strategy and statistics anxiety variables, a canonical correlation analysis was computed. Findings show that two canonical functions were statistically significant. The set of variables (metacognitive self-regulation, source management, preparing homework, preparing for tests and preparing term papers) helped predict changes in statistics anxiety with respect to fearful behavior, attitude towards math and class, and performance, but not anxiety. These findings could be used in educational and psychological interventions in the context of statistics anxiety reduction.
Benefit-cost estimation for alternative drinking water maximum contaminant levels
NASA Astrophysics Data System (ADS)
Gurian, Patrick L.; Small, Mitchell J.; Lockwood, John R.; Schervish, Mark J.
2001-08-01
A simulation model for estimating compliance behavior and resulting costs at U.S. Community Water Suppliers is developed and applied to the evaluation of a more stringent maximum contaminant level (MCL) for arsenic. Probability distributions of source water arsenic concentrations are simulated using a statistical model conditioned on system location (state) and source water type (surface water or groundwater). This model is fit to two recent national surveys of source waters, then applied with the model explanatory variables for the population of U.S. Community Water Suppliers. Existing treatment types and arsenic removal efficiencies are also simulated. Utilities with finished water arsenic concentrations above the proposed MCL are assumed to select the least cost option compatible with their existing treatment from among 21 available compliance strategies and processes for meeting the standard. Estimated costs and arsenic exposure reductions at individual suppliers are aggregated to estimate the national compliance cost, arsenic exposure reduction, and resulting bladder cancer risk reduction. Uncertainties in the estimates are characterized based on uncertainties in the occurrence model parameters, existing treatment types, treatment removal efficiencies, costs, and the bladder cancer dose-response function for arsenic.
NASA Astrophysics Data System (ADS)
Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik
2016-03-01
A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.
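A brief sketch contrasting the two model families compared above, a multiple (linear) regression and a regression tree, fit to synthetic catchment attributes; the attribute names, data, and the nitrate-generating relationship are invented for illustration only.

```python
# Multiple regression vs. regression tree on synthetic nitrate data (illustrative).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(6)
n = 300
X = rng.uniform(size=(n, 3))   # e.g. [residential fraction, slope, surface-water class]
nitrate = 40 * X[:, 0] * (X[:, 1] < 0.5) + 5 * X[:, 2] + rng.normal(scale=2.0, size=n)

linear = LinearRegression().fit(X, nitrate)
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, nitrate)
print("R2 linear:", round(linear.score(X, nitrate), 3))
print("R2 tree:  ", round(tree.score(X, nitrate), 3))   # interactions favour the tree here
```

The synthetic example is constructed so that the nitrate response depends on an interaction between attributes, which is the kind of combined influence that, as in the study, a regression tree captures more readily than a purely additive multiple regression.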
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2015-09-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drives of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2016-02-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.
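To make the object of prediction concrete, here is a short sketch of how an empirical flow duration curve is built from a (synthetic) daily streamflow record; it illustrates the statistic that both methods above aim to predict, not either model itself.

```python
# Empirical flow duration curve from a daily discharge record (synthetic data).
import numpy as np

rng = np.random.default_rng(1)
flows = rng.lognormal(mean=1.0, sigma=0.8, size=365)     # daily discharge (m^3/s), synthetic

sorted_flows = np.sort(flows)[::-1]                      # descending order
exceedance = np.arange(1, len(flows) + 1) / (len(flows) + 1)   # Weibull plotting positions

# e.g. the flow exceeded 95% of the time (a common low-flow statistic)
q95 = np.interp(0.95, exceedance, sorted_flows)
print(f"Q95 = {q95:.2f} m^3/s")
```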
Impact of South American heroin on the US heroin market 1993–2004
Ciccarone, Daniel; Unick, George J; Kraus, Allison
2008-01-01
Background The past two decades have seen an increase in heroin-related morbidity and mortality in the United States. We report on trends in US heroin retail price and purity, including the effect of entry of Colombian-sourced heroin on the US heroin market. Methods The average standardized price ($/mg-pure) and purity (% by weight) of heroin from 1993 to 2004 were obtained from US Drug Enforcement Agency retail purchase data for 20 metropolitan statistical areas. Univariate statistics, robust Ordinary Least Squares regression and mixed fixed and random effect growth curve models were used to predict the price and purity data in each metropolitan statistical area over time. Results Over the 12 study years, heroin price decreased 62%. The median percentage of all heroin samples that are of South American origin increased an absolute 7% per year. Multivariate models suggest percent South American heroin is a significant predictor of lower heroin price and higher purity adjusting for time and demographics. Conclusion These analyses reveal trends to historically low-cost heroin in many US cities. These changes correspond to the entrance into and rapid domination of the US heroin market by Colombian-sourced heroin. The implications of these changes are discussed. PMID:19201184
Assessment of Current Jet Noise Prediction Capabilities
NASA Technical Reports Server (NTRS)
Hunter, Craig A.; Bridges, James E.; Khavaran, Abbas
2008-01-01
An assessment was made of the capability of jet noise prediction codes over a broad range of jet flows, with the objective of quantifying current capabilities and identifying areas requiring future research investment. Three separate codes in NASA's possession, representative of two classes of jet noise prediction codes, were evaluated: one empirical and two statistical. The empirical code is the Stone Jet Noise Module (ST2JET) contained within the ANOPP aircraft noise prediction code. It is well documented, and represents the state of the art in semi-empirical acoustic prediction codes where virtual sources are attributed to various aspects of noise generation in each jet. These sources, in combination, predict the spectral directivity of a jet plume. A total of 258 jet noise cases were examined with the ST2JET code, each run requiring only fractions of a second to complete. Two statistical jet noise prediction codes were also evaluated: JeNo v1 and Jet3D. Fewer cases were run for the statistical prediction methods because they require substantially more resources, typically a Reynolds-Averaged Navier-Stokes solution of the jet, volume integration of the source statistical models over the entire plume, and a numerical solution of the governing propagation equation within the jet. In the evaluation process, substantial justification of experimental datasets used in the evaluations was made. In the end, none of the current codes can predict jet noise within experimental uncertainty. The empirical code came within 2 dB on a 1/3 octave spectral basis for a wide range of flows. The statistical code Jet3D was within experimental uncertainty at broadside angles for hot supersonic jets, but errors in peak frequency and amplitude put it out of experimental uncertainty at cooler, lower speed conditions. Jet3D did not predict changes in directivity in the downstream angles. The statistical code JeNo v1 was within experimental uncertainty predicting noise from cold subsonic jets at all angles, but did not predict changes with heating of the jet and did not account for directivity changes at supersonic conditions. Shortcomings addressed here give direction for future work relevant to the statistical-based prediction methods. A full report will be released as a chapter in a NASA publication assessing the state of the art in aircraft noise prediction.
Gravitational Lenses and the Structure and Evolution of Galaxies
NASA Technical Reports Server (NTRS)
Oliversen, Ronald J. (Technical Monitor); Kochanek, Christopher
2004-01-01
During the first year of the project we completed five papers, each of which represents a new direction in the theory and interpretation of gravitational lenses. In the first paper, The Importance of Einstein Rings, we developed the first theory for the formation and structure of the Einstein rings formed by lensing extended sources like the host galaxies of quasars and radio sources. In the second paper, Cusped Mass Models Of Gravitational Lenses, we introduced a new class of lens models. In the third paper, Global Probes of the Impact of Baryons on Dark Matter Halos, we made the first globally consistent models for the separation distribution of gravitational lenses including both galaxy and cluster lenses. The last two papers explore the properties of two lenses in detail. During the second year we have focused more closely on the relationship of baryons and dark matter. In the third year we have been further examining the relationship between baryons and dark matter. In the present year we extended our statistical analysis of lens mass distributions using a self-similar model for the halo mass distribution as compared to the luminous galaxy.
Here Be Dragons: Effective (X-ray) Timing with the Cospectrum
NASA Astrophysics Data System (ADS)
Huppenkothen, Daniela; Bachetti, Matteo
2018-01-01
In recent years, the cross spectrum has received considerable attention as a means of characterising the variability of astronomical sources as a function of wavelength. While much has been written about the statistics of time and phase lags, the cospectrum—the real part of the cross spectrum—has only recently been understood as a means of mitigating instrumental effects dependent on temporal frequency in astronomical detectors, as well as a method of characterizing the coherent variability in two wavelength ranges on different time scales. In this talk, I will present recent advances made in understanding the statistical properties of cospectra, leading to much improved inferences for periodic and quasi-periodic signals. I will also present a new method to reliably mitigate instrumental effects such as dead time in X-ray detectors, and show how we can use the cospectrum to model highly variable sources such as X-ray binaries or Active Galactic Nuclei.
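A short sketch of the quantity discussed above, the cospectrum (the real part of the cross spectrum) of two simultaneously sampled light curves; the signals below are synthetic and the sampling choices are arbitrary assumptions.

```python
# Cospectrum of two light curves sharing a coherent 5 Hz signal (synthetic).
import numpy as np

rng = np.random.default_rng(2)
n, dt = 4096, 1.0 / 256                         # samples and sampling interval (s)
t = np.arange(n) * dt
common = np.sin(2 * np.pi * 5.0 * t)            # coherent signal present in both bands
lc1 = common + rng.normal(scale=1.0, size=n)    # independent noise in each band
lc2 = common + rng.normal(scale=1.0, size=n)

f1 = np.fft.rfft(lc1 - lc1.mean())
f2 = np.fft.rfft(lc2 - lc2.mean())
freqs = np.fft.rfftfreq(n, d=dt)
cospectrum = (f1 * np.conj(f2)).real            # noise independent between bands has zero expected cospectrum

print("peak frequency:", freqs[np.argmax(cospectrum[1:]) + 1], "Hz")
```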
Two populations and models of gamma ray bursts
NASA Technical Reports Server (NTRS)
Katz, J. I.
1993-01-01
Gamma-ray burst statistics are best explained by a source population at cosmological distances, while spectroscopy and intensity histories of some individual bursts imply an origin on Galactic neutron stars. To resolve this inconsistency I suggest the presence of two populations, one at cosmological distances and the other Galactic. I build on ideas of Shemi and Piran (1990) and of Rees and Meszaros (1992) involving the interaction of fireball debris with surrounding clouds to explain the observed intensity histories in bursts at cosmological distances. The distances to the Galactic population are undetermined because they are too few to affect the statistics of intensity and direction; I explain them as resulting from magnetic reconnection in neutron star magnetospheres. An appendix describes the late evolution of the debris as a relativistic blast wave.
DETECTING UNSPECIFIED STRUCTURE IN LOW-COUNT IMAGES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stein, Nathan M.; Dyk, David A. van; Kashyap, Vinay L.
Unexpected structure in images of astronomical sources often presents itself upon visual inspection of the image, but such apparent structure may either correspond to true features in the source or be due to noise in the data. This paper presents a method for testing whether inferred structure in an image with Poisson noise represents a significant departure from a baseline (null) model of the image. To infer image structure, we conduct a Bayesian analysis of a full model that uses a multiscale component to allow flexible departures from the posited null model. As a test statistic, we use a tail probability of the posterior distribution under the full model. This choice of test statistic allows us to estimate a computationally efficient upper bound on a p-value that enables us to draw strong conclusions even when there are limited computational resources that can be devoted to simulations under the null model. We demonstrate the statistical performance of our method on simulated images. Applying our method to an X-ray image of the quasar 0730+257, we find significant evidence against the null model of a single point source and uniform background, lending support to the claim of an X-ray jet.
NASA Astrophysics Data System (ADS)
Cui, Zhe; Wang, Anting; Ma, Qianli; Ming, Hai
2013-12-01
In this paper, the laser speckle pattern on the human retina for a laser projection display is simulated. By introducing a specific eye model, the `Indiana Eye', the statistical properties of the laser speckle are numerically investigated. The results show that the aberrations of the human eye (mostly spherical and chromatic) decrease the speckle contrast perceived by the viewer. When the wavelength of the laser source is 550 nm (green), the viewer perceives the strongest speckle pattern, and the weakest when the wavelength is 450 nm (blue). Myopia and hyperopia decrease the speckle contrast by introducing large spherical aberrations. Although aberration is good for speckle reduction, it degrades the imaging capability of the eye. The results show that the 650 nm (red) laser source has the best image quality on the retina. Finally, we compare the human eye with an aberration-free imaging system. Both the speckle contrast and the image quality behave differently in these two imaging systems. The results are useful when a standardized measurement procedure for speckle contrast needs to be built.
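A toy sketch of the speckle-contrast statistic underlying the comparison above: fully developed speckle is simulated as a random phasor sum, and the contrast is the ratio of the intensity standard deviation to its mean (close to 1 for ideal, unaveraged speckle). The simulation does not include the eye model or aberrations.

```python
# Speckle contrast of a fully developed speckle field (random phasor sum).
import numpy as np

rng = np.random.default_rng(7)
n_points, n_scatterers = 20000, 200
phases = rng.uniform(0, 2 * np.pi, size=(n_points, n_scatterers))
field = np.exp(1j * phases).sum(axis=1)         # random phasor sum per observation point
intensity = np.abs(field) ** 2

contrast = intensity.std() / intensity.mean()
print(f"speckle contrast ≈ {contrast:.3f}")     # near 1 without aberration or averaging
```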
Goodness-Of-Fit Test for Nonparametric Regression Models: Smoothing Spline ANOVA Models as Example.
Teran Hidalgo, Sebastian J; Wu, Michael C; Engel, Stephanie M; Kosorok, Michael R
2018-06-01
Nonparametric regression models do not require the specification of the functional form between the outcome and the covariates. Despite their popularity, the number of diagnostic statistics available for them, in comparison to their parametric counterparts, is small. We propose a goodness-of-fit test for nonparametric regression models with a linear smoother form. In particular, we apply this testing framework to smoothing spline ANOVA models. The test can consider two sources of lack-of-fit: whether covariates that are not currently in the model need to be included, and whether the current model fits the data well. The proposed method derives estimated residuals from the model. Then, statistical dependence is assessed between the estimated residuals and the covariates using the Hilbert-Schmidt independence criterion (HSIC). If dependence exists, the model does not capture all the variability in the outcome associated with the covariates; otherwise, the model fits the data well. The bootstrap is used to obtain p-values. Application of the method is demonstrated with a neonatal mental development data analysis. We demonstrate correct type I error as well as power performance through simulations.
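A compact sketch of the dependence check described above: an empirical HSIC statistic between model residuals and a covariate, using Gaussian kernels. The bandwidths are arbitrary, and a permutation scheme stands in for the paper's bootstrap procedure for obtaining p-values.

```python
# Empirical HSIC between residuals and a covariate, with a permutation p-value.
import numpy as np

def gaussian_gram(x, bandwidth=1.0):
    x = np.asarray(x, float).reshape(-1, 1)
    d2 = (x - x.T) ** 2
    return np.exp(-d2 / (2 * bandwidth ** 2))

def hsic(x, y):
    n = len(x)
    K, L = gaussian_gram(x), gaussian_gram(y)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(3)
covariate = rng.normal(size=200)
residuals = 0.5 * covariate ** 2 + rng.normal(scale=0.1, size=200)   # nonlinear dependence

stat = hsic(covariate, residuals)
null = [hsic(covariate, rng.permutation(residuals)) for _ in range(200)]
print("HSIC =", stat, " permutation p-value ≈", np.mean(np.array(null) >= stat))
```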
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, S.; Barua, A.; Zhou, M., E-mail: min.zhou@me.gatech.edu
2014-05-07
Accounting for the combined effect of multiple sources of stochasticity in material attributes, we develop an approach that computationally predicts the probability of ignition of polymer-bonded explosives (PBXs) under impact loading. The probabilistic nature of the specific ignition processes is assumed to arise from two sources of stochasticity. The first source involves random variations in material microstructural morphology; the second source involves random fluctuations in grain-binder interfacial bonding strength. The effect of the first source of stochasticity is analyzed with multiple sets of statistically similar microstructures and constant interfacial bonding strength. Subsequently, each of the microstructures in the multiple setsmore » is assigned multiple instantiations of randomly varying grain-binder interfacial strengths to analyze the effect of the second source of stochasticity. Critical hotspot size-temperature states reaching the threshold for ignition are calculated through finite element simulations that explicitly account for microstructure and bulk and interfacial dissipation to quantify the time to criticality (t{sub c}) of individual samples, allowing the probability distribution of the time to criticality that results from each source of stochastic variation for a material to be analyzed. Two probability superposition models are considered to combine the effects of the multiple sources of stochasticity. The first is a parallel and series combination model, and the second is a nested probability function model. Results show that the nested Weibull distribution provides an accurate description of the combined ignition probability. The approach developed here represents a general framework for analyzing the stochasticity in the material behavior that arises out of multiple types of uncertainty associated with the structure, design, synthesis and processing of materials.« less
Regression Models for Identifying Noise Sources in Magnetic Resonance Images
Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.
2009-01-01
Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478
Schick, Robert S; Kraus, Scott D; Rolland, Rosalind M; Knowlton, Amy R; Hamilton, Philip K; Pettis, Heather M; Thomas, Len; Harwood, John; Clark, James S
2016-01-01
Right whales are vulnerable to many sources of anthropogenic disturbance including ship strikes, entanglement with fishing gear, and anthropogenic noise. The effect of these factors on individual health is unclear. A statistical model using photographic evidence of health was recently built to infer the true or hidden health of individual right whales. However, two important prior assumptions about the role of missing data and unexplained variance on the estimates were not previously assessed. Here we tested these factors by varying prior assumptions and model formulation. We found sensitivity to each assumption and used the output to make guidelines on future model formulation.
Skiles, Matthew J; Lai, Alexandra M; Olson, Michael R; Schauer, James J; de Foy, Benjamin
2018-06-01
Two hundred sixty-three fine particulate matter (PM2.5) samples collected on 3-day intervals over a 14-month period at two sites in the San Joaquin Valley (SJV) were analyzed for organic carbon (OC), elemental carbon (EC), water soluble organic carbon (WSOC), and organic molecular markers. A unique source profile library was applied to a chemical mass balance (CMB) source apportionment model to develop monthly and seasonally averaged source apportionment results. Five major OC sources were identified: mobile sources, biomass burning, meat smoke, vegetative detritus, and secondary organic carbon (SOC), as inferred from OC not apportioned by CMB. The SOC factor was the largest source contributor at Fresno and Bakersfield, contributing 44% and 51% of PM mass, respectively. Biomass burning was the only source with a statistically different average mass contribution (95% CI) between the two sites. Wintertime peaks of biomass burning, meat smoke, and total OC were observed at both sites, with SOC peaking during the summer months. Exceptionally strong seasonal variation in apportioned meat smoke mass could potentially be explained by oxidation of cholesterol between source and receptor and trends in wind transport outlined in a Residence Time Analysis (RTA). Fast moving nighttime winds prevalent during warmer months caused local emissions to be replaced by air mass transported from the San Francisco Bay Area, consisting of mostly diluted, oxidized concentrations of molecular markers. Good agreement was observed between SOC derived from the CMB model and from non-biomass burning WSOC mass, suggesting the CMB model is sufficiently accurate to assist in policy development. In general, uncertainty in monthly mass values derived from daily CMB apportionments was lower than that of CMB results produced with monthly marker composites, further validating daily sampling methodologies. Strong seasonal trends were observed for biomass and meat smoke OC apportionment, and monthly mass averages had the lowest uncertainty when derived from daily CMB apportionments.
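A minimal sketch of the chemical mass balance (CMB) idea used above: ambient marker concentrations are modelled as a nonnegative combination of source profiles, solved here by nonnegative least squares. The profiles, marker names, and numbers below are invented for illustration and are not the study's source profile library.

```python
# Toy CMB apportionment: ambient markers ≈ source_profiles @ contributions.
import numpy as np
from scipy.optimize import nnls

# Rows: molecular markers; columns: sources (mobile, biomass burning, meat smoke).
profiles = np.array([
    [0.020, 0.001, 0.000],   # hopanes (mobile-source marker)
    [0.000, 0.150, 0.000],   # levoglucosan (biomass-burning marker)
    [0.000, 0.002, 0.090],   # cholesterol (meat-smoke marker)
    [0.500, 0.450, 0.400],   # total OC fraction
])
ambient = np.array([0.010, 0.060, 0.018, 0.600])   # measured concentrations (arbitrary units)

contrib, residual_norm = nnls(profiles, ambient)
print("source contributions:", contrib, " residual norm:", residual_norm)
```

In the study's framing, OC left unapportioned by the fitted primary sources is interpreted as secondary organic carbon.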
Optical spectroscopic studies of animal skin used in modeling of human cutaneous tissue
NASA Astrophysics Data System (ADS)
Drakaki, E.; Makropoulou, M.; Serafetinides, A. A.; Borisova, E.; Avramov, L.; Sianoudis, J. A.
2007-03-01
Optical spectroscopy, and in particular laser-induced autofluorescence spectroscopy (LIAFS) and diffuse reflectance spectroscopy (DRS), provide excellent possibilities for real-time, noninvasive diagnosis of different skin tissue pathologies. However, the introduction of optical spectroscopy into routine medical practice demands a statistically meaningful data collection, independent of the laser sources and detectors used. Scientists either collect databases from patients in vivo, or they study different animal models to obtain objective information on the optical properties of various types of normal and diseased tissue. In the present work, the optical properties (fluorescence and reflectance) of two animal skin models are investigated. The aim of using animal models in optical spectroscopy investigations is to examine the statistics of the light-induced effects first on animals, before any extrapolation to humans. A nitrogen laser (λ=337.1 nm) was used as an excitation source for the autofluorescence measurements, while a tungsten-halogen lamp was used for the reflectance measurements. Samples of chicken and pig skin were measured in vitro and were compared with results obtained from measurements of normal human skin in vivo. The specific features of the measured reflectance and fluorescence spectra are discussed, while the limits of data extrapolation for each skin type are also depicted.
A study of 2-20 KeV X-rays from the Cygnus region
NASA Technical Reports Server (NTRS)
Bleach, R. D.
1972-01-01
Two rocket-borne proportional counters, each with 650 sq cm net area and 1.8 x 7.1 deg FWHM rectangular mechanical collimation, surveyed the Cygnus region in the 2 to 20 keV energy range on two occasions. X-ray spectral data gathered on 21 September 1970 from discrete sources in Cygnus are presented. The data from Cyg X-1, Cyg X-2, and Cyg X-3 have sufficient statistical significance to indicate mutually exclusive spectral forms for the three. Upper limits are presented for X-ray intensities above 2 keV for Cyg X-4 and Cyg X-5 (Cygnus loop). A search was made on 9 August 1971 for a diffuse component of X-rays above 1.5 keV associated with an interarm region of the galaxy at galactic longitudes in the vicinity of 60 degrees. A statistically significant excess associated with a narrow disk component was detected. Several possible emission models are discussed, with the most likely candidate being a population of unresolvable low luminosity discrete sources.
Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data
NASA Astrophysics Data System (ADS)
Kacprzak, T.; Kirk, D.; Friedrich, O.; Amara, A.; Refregier, A.; Marian, L.; Dietrich, J. P.; Suchyta, E.; Aleksić, J.; Bacon, D.; Becker, M. R.; Bonnett, C.; Bridle, S. L.; Chang, C.; Eifler, T. F.; Hartley, W. G.; Huff, E. M.; Krause, E.; MacCrann, N.; Melchior, P.; Nicola, A.; Samuroff, S.; Sheldon, E.; Troxel, M. A.; Weller, J.; Zuntz, J.; Abbott, T. M. C.; Abdalla, F. B.; Armstrong, R.; Benoit-Lévy, A.; Bernstein, G. M.; Bernstein, R. A.; Bertin, E.; Brooks, D.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Evrard, A. E.; Neto, A. Fausti; Flaugher, B.; Fosalba, P.; Frieman, J.; Gerdes, D. W.; Goldstein, D. A.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jarvis, M.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Miller, C. J.; Miquel, R.; Mohr, J. J.; Nichol, R. C.; Nord, B.; Plazas, A. A.; Romer, A. K.; Roodman, A.; Rykoff, E. S.; Sanchez, E.; Scarpine, V.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Zhang, Y.; DES Collaboration
2016-12-01
Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg² field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range 0 < S/N < 4. Peaks with S/N > 4 would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. We discuss prospects for future peak statistics analysis with upcoming DES data.
Online Statistical Modeling (Regression Analysis) for Independent Responses
NASA Astrophysics Data System (ADS)
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of data and complex relationships. Rich varieties of advanced and recent statistical models are available in open source software (one of which is R). However, these advanced statistical modelling tools are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and Shiny), so that the most recent and advanced statistical modelling is readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes it easier to compare models in order to find the most appropriate model for the data.
Pettey, W B P; Carter, M E; Toth, D J A; Samore, M H; Gundlapalli, A V
2017-07-01
During the recent Ebola crisis in West Africa, individual person-level details of disease onset, transmissions, and outcomes such as survival or death were reported in online news media. We set out to document disease transmission chains for Ebola, with the goal of generating a timely account that could be used for surveillance, mathematical modeling, and public health decision-making. By accessing public web pages only, such as locally produced newspapers and blogs, we created a transmission chain involving two Ebola clusters in West Africa that compared favorably with other published transmission chains, and derived parameters for a mathematical model of Ebola disease transmission that were not statistically different from those derived from published sources. We present a protocol for responsibly gleaning epidemiological facts, transmission model parameters, and useful details from affected communities using mostly indigenously produced sources. After comparing our transmission parameters to published parameters, we discuss additional benefits of our method, such as gaining practical information about the affected community, its infrastructure, politics, and culture. We also briefly compare our method to similar efforts that used mostly non-indigenous online sources to generate epidemiological information.
An integrated system for rainfall induced shallow landslides modeling
NASA Astrophysics Data System (ADS)
Formetta, Giuseppe; Capparelli, Giovanna; Rigon, Riccardo; Versace, Pasquale
2014-05-01
Rainfall induced shallow landslides (RISL) cause significant damage involving loss of life and property. Predicting susceptible locations for RISL is a complex task that involves many disciplines: hydrology, geotechnical science, geomorphology, and statistics. Usually two main approaches are used to accomplish this task: statistical or physically based models. In this work an open source (OS), 3-D, fully distributed hydrological model was integrated in an OS modeling framework (Object Modeling System). The chain is closed by linking the system to a component for safety factor computation with the infinite slope approximation, able to take into account layered soils and the contribution of suction to hillslope stability. The model composition was tested for a case study in Calabria (Italy) in order to simulate the triggering of a landslide that occurred in the Cosenza Province. The integration in OMS allows the use of other components, such as a GIS to manage input-output processes and automatic calibration algorithms to estimate model parameters. Finally, model performance was quantified by comparing modelled and observed trigger times. This research is supported by the Ambito/Settore AMBIENTE E SICUREZZA (PON01_01503) project.
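A hedged sketch of the infinite-slope factor of safety underlying the stability component mentioned above; a single homogeneous layer with a partial water table is assumed, and all parameter values are illustrative rather than taken from the Calabria case study.

```python
# Infinite-slope factor of safety for a translational shallow failure.
import numpy as np

def factor_of_safety(slope_deg, soil_depth, wet_depth,
                     cohesion=5e3, phi_deg=30.0,
                     gamma_soil=18e3, gamma_water=9.81e3):
    """FS with a water table wet_depth (m) above the failure plane.
    Units: Pa for cohesion, N/m^3 for unit weights, metres for depths."""
    beta = np.radians(slope_deg)
    phi = np.radians(phi_deg)
    normal_stress = gamma_soil * soil_depth * np.cos(beta) ** 2
    pore_pressure = gamma_water * wet_depth * np.cos(beta) ** 2
    resisting = cohesion + (normal_stress - pore_pressure) * np.tan(phi)
    driving = gamma_soil * soil_depth * np.sin(beta) * np.cos(beta)
    return resisting / driving

print(factor_of_safety(slope_deg=35.0, soil_depth=2.0, wet_depth=1.5))  # FS < 1: unstable
```

In a chain like the one described, the distributed hydrological model supplies the wet depth (or suction) for each cell and time step, and the stability component evaluates an expression of this kind to flag potential trigger times and locations.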
NASA Astrophysics Data System (ADS)
Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.
2018-01-01
This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of 10^6 objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of time series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.
Rainfall runoff modelling of the Upper Ganga and Brahmaputra basins using PERSiST.
Futter, M N; Whitehead, P G; Sarkar, S; Rodda, H; Crossman, J
2015-06-01
There are ongoing discussions about the appropriate level of complexity and sources of uncertainty in rainfall runoff models. Simulations for operational hydrology, flood forecasting or nutrient transport all warrant different levels of complexity in the modelling approach. More complex model structures are appropriate for simulations of land-cover dependent nutrient transport, while more parsimonious model structures may be adequate for runoff simulation. The appropriate level of complexity is also dependent on data availability. Here, we use PERSiST, a simple, semi-distributed dynamic rainfall-runoff modelling toolkit, to simulate flows in the Upper Ganges and Brahmaputra rivers. We present two sets of simulations driven by single time series of daily precipitation and temperature using simple (A) and complex (B) model structures based on uniform and hydrochemically relevant land covers, respectively. Models were compared based on ensembles of Bayesian Information Criterion (BIC) statistics. Equifinality was observed for parameters but not for model structures. Model performance was better for the more complex (B) structural representations than for parsimonious model structures. The results show that structural uncertainty is more important than parameter uncertainty. The ensembles of BIC statistics suggested that neither structural representation was preferable in a statistical sense. Simulations presented here confirm that relatively simple models with limited data requirements can be used to credibly simulate flows and water balance components needed for nutrient flux modelling in large, data-poor basins.
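A minimal sketch of how a BIC statistic can be computed for competing rainfall-runoff structures under a Gaussian error assumption (BIC = n ln(RSS/n) + k ln n); the flow series and parameter counts below are invented and do not come from the PERSiST ensembles.

```python
import numpy as np

def bic_gaussian(observed, simulated, n_params):
    """BIC for a model with Gaussian residuals: n*ln(RSS/n) + k*ln(n)."""
    resid = np.asarray(observed) - np.asarray(simulated)
    n = resid.size
    rss = float(np.sum(resid ** 2))
    return n * np.log(rss / n) + n_params * np.log(n)

# hypothetical daily flows: a parsimonious structure (k=6) vs a more complex one (k=18)
obs = np.random.default_rng(1).gamma(2.0, 50.0, 365)
sim_a = obs + np.random.default_rng(2).normal(0, 25, 365)
sim_b = obs + np.random.default_rng(3).normal(0, 20, 365)
print(bic_gaussian(obs, sim_a, 6), bic_gaussian(obs, sim_b, 18))  # lower BIC is preferred
```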
NASA Astrophysics Data System (ADS)
Taut, A.; Berger, L.; Drews, C.; Wimmer-Schweingruber, R. F.
2015-04-01
Context. Pickup ions in the inner heliosphere mainly originate from two sources, one interstellar and one in the inner solar system. In contrast to the interstellar source, which is comparatively well understood, the nature of the inner source has not been clearly identified. Former results obtained with the Solar Wind Ion Composition Spectrometer on-board the Ulysses spacecraft revealed that the composition of inner-source pickup ions is similar, but not equal, to the elemental solar-wind composition. These observations suffered from very low counting statistics of roughly one C+ count per day. Aims: Because the composition of inner-source pickup ions could lead to identifying their origin, we used data from the Charge-Time-Of-Flight sensor on-board the Solar and Heliospheric Observatory. It offers a large geometry factor that results in about 100 C+ counts per day combined with an excellent mass-per-charge resolution. These features enable a precise determination of the inner-source heavy pickup ion composition at 1 AU. To address the production mechanisms of inner-source pickup ions, we set up a toy model based on the production scenario involving the passage of solar-wind ions through thin dust grains to explain the observed deviations between the inner-source pickup ion composition and the elemental solar-wind composition. Methods: An in-flight calibration of the sensor allows identification of heavy pickup ions from pulse height analysis data by their mass-per-charge. A statistical analysis was performed to derive the inner-source heavy pickup ion relative abundances of N+, O+, Ne+, Mg+, Mg2+, and Si+ compared to C+. Results: Our results for the inner-source pickup ion composition are in good agreement with previous studies and confirm the deviations from the solar-wind composition. The large geometry factor of the Charge-Time-Of-Flight sensor even allowed the abundance ratios of the two most prominent pickup ions, C+ and O+, to be investigated at varying solar-wind speeds. We found that the O+/C+ ratio increases systematically with higher solar-wind speeds. This observation is an unprecedented feature characterising the production of inner-source pickup ions. Comparing our observations to the toy model results, we find that both the deviation from the solar-wind composition and the solar-wind-speed dependent O+/C+ ratio can be explained.
Samuel, Jonathan C; Sankhulani, Edward; Qureshi, Javeria S; Baloyi, Paul; Thupi, Charles; Lee, Clara N; Miller, William C; Cairns, Bruce A; Charles, Anthony G
2012-01-01
Road traffic injuries are a major cause of preventable death in sub-Saharan Africa. Accurate epidemiologic data are scarce and under-reporting from primary data sources is common. Our objectives were to estimate the incidence of road traffic deaths in Malawi using capture-recapture statistical analysis and to determine what future efforts will best improve upon this estimate. Our capture-recapture model combined primary data from both police and hospital-based registries over a one-year period (July 2008 to June 2009). The mortality incidences from the primary data sources were 0.075 and 0.051 deaths/1000 person-years, respectively. Using capture-recapture analysis, the combined incidence of road traffic deaths ranged from 0.192 to 0.209 deaths/1000 person-years. Additionally, police data were more likely to include victims who were male, drivers or pedestrians, and victims from incidents with greater than one vehicle involved. We concluded that capture-recapture analysis is a good tool to estimate the incidence of road traffic deaths, and that capture-recapture analysis overcomes limitations of incomplete data sources. The World Health Organization estimated the incidence of road traffic deaths for Malawi utilizing a binomial regression model and survey data and found a similar estimate despite strikingly different methods, suggesting both approaches are valid. Further research should seek to improve capture-recapture data through utilization of more than two data sources and improving accuracy of matches by minimizing missing data, application of geographic information systems, and use of names and civil registration numbers if available.
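For readers unfamiliar with two-source capture-recapture, a minimal sketch of the Chapman estimator is shown below; the police, hospital and matched counts are invented placeholders, not the Malawi data.

```python
# Two-source capture-recapture with Chapman's nearly unbiased estimator.
n1, n2, m = 220, 150, 40        # deaths listed by police, by hospital, and matched in both
N_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
var = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / ((m + 1) ** 2 * (m + 2))
se = var ** 0.5
print(f"estimated total deaths: {N_hat:.0f} "
      f"(approx. 95% CI {N_hat - 1.96 * se:.0f}-{N_hat + 1.96 * se:.0f})")
```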
Estimating chronic disease rates in Canada: which population-wide denominator to use?
Ellison, J; Nagamuthu, C; Vanderloo, S; McRae, B; Waters, C
2016-10-01
Chronic disease rates are produced from the Public Health Agency of Canada's Canadian Chronic Disease Surveillance System (CCDSS) using administrative health data from provincial/territorial health ministries. Denominators for these rates are based on estimates of populations derived from health insurance files. However, these data may not be accessible to all researchers. Another source for population size estimates is the Statistics Canada census. The purpose of our study was to calculate the major differences between the CCDSS and Statistics Canada's population denominators and to identify the sources or reasons for the potential differences between these data sources. We compared the 2009 denominators from the CCDSS and Statistics Canada. The CCDSS denominator was adjusted for the growth components (births, deaths, emigration and immigration) from Statistics Canada's census data. The unadjusted CCDSS denominator was 34 429 804, 3.2% higher than Statistics Canada's estimate of population in 2009. After the CCDSS denominator was adjusted for the growth components, the difference between the two estimates was reduced to 431 323 people, a difference of 1.3%. The CCDSS overestimates the population relative to Statistics Canada overall. The largest difference between the two estimates was from the migrant growth component, while the smallest was from the emigrant component. By using data descriptions by data source, researchers can make decisions about which population to use in their calculations of disease frequency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hiatt, JR; Rivard, MJ
2014-06-01
Purpose: The model S700 Axxent electronic brachytherapy source by Xoft was characterized in 2006 by Rivard et al. The source design was modified in 2006 to include a plastic centering insert at the source tip to more accurately position the anode. The objectives of the current study were to establish an accurate Monte Carlo source model for simulation purposes, to dosimetrically characterize the new source and obtain its TG-43 brachytherapy dosimetry parameters, and to determine dose differences between the source with and without the centering insert. Methods: Design information from dissected sources and vendor-supplied CAD drawings were used to devise the source model for radiation transport simulations of dose distributions in a water phantom. Collision kerma was estimated as a function of radial distance, r, and polar angle, θ, for determination of reference TG-43 dosimetry parameters. Simulations were run for 10^10 histories, resulting in statistical uncertainties on the transverse plane of 0.03% at r=1 cm and 0.08% at r=10 cm. Results: The dose rate distribution on the transverse plane did not change by more than 2% between the 2006 model and the current study. While differences exceeding 15% were observed near the source distal tip, these diminished to within 2% for r>1.5 cm. Differences exceeding a factor of two were observed near θ=150° and in contact with the source, but diminished to within 20% at r=10 cm. Conclusions: Changes in source design influenced the overall dose rate and distribution by more than 2% over a third of the available solid angle external to the source. For clinical applications using balloons or applicators with tissue located within 5 cm from the source, dose differences exceeding 2% were observed only for θ>110°. This study carefully examined the current source geometry and presents a modern reference TG-43 dosimetry dataset for the model S700 source.
Gupta, Rishi R; Gifford, Eric M; Liston, Ted; Waller, Chris L; Hohman, Moses; Bunin, Barry A; Ekins, Sean
2010-11-01
Ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors [e.g., chemistry development kit (CDK)] and modeling algorithms, because this would negate the requirement for proprietary commercial software. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model demonstrated that CDK descriptors together with a set of Smiles Arbitrary Target Specification (SMARTS) keys had good statistics [κ = 0.43, sensitivity = 0.57, specificity = 0.91, and positive predicted value (PPV) = 0.64], equivalent to those of models built with commercial Molecular Operating Environment 2D (MOE2D) and the same set of SMARTS keys (κ = 0.43, sensitivity = 0.58, specificity = 0.91, and PPV = 0.63). Extending the dataset to ∼193,000 molecules and generating a continuous model using Cubist with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. When the continuous predictions and actual values were binned to get a categorical score we observed a similar κ statistic (0.42). The same combination of descriptor set and modeling method was applied to passive permeability and P-glycoprotein efflux data with similar model testing statistics. In summary, open source tools demonstrated predictive results comparable to those of commercial software with attendant cost savings. We discuss the advantages and disadvantages of open source descriptors and the opportunity for their use as a tool for organizations to share data precompetitively, avoiding repetition and assisting drug discovery.
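A minimal sketch of the categorical performance statistics quoted above (kappa, sensitivity, specificity, PPV) computed from a 2x2 confusion matrix; the counts are made up for illustration and do not reproduce the metabolic stability models.

```python
def binary_stats(tp, fp, fn, tn):
    """Cohen's kappa, sensitivity, specificity and PPV from confusion-matrix counts."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n                                              # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2   # chance agreement
    return {"kappa": (po - pe) / (1 - pe),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp)}

print(binary_stats(tp=4200, fp=2400, fn=3200, tn=24000))  # hypothetical counts
```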
NASA Technical Reports Server (NTRS)
Hough, D. H.; Readhead, A. C. S.
1989-01-01
A complete, flux-density-limited sample of double-lobed radio quasars is defined, with nuclei bright enough to be mapped with the Mark III VLBI system. It is shown that the statistics of linear size, nuclear strength, and curvature are consistent with the assumption of random source orientations and simple relativistic beaming in the nuclei. However, these statistics are also consistent with the effects of interaction between the beams and the surrounding medium. The distribution of jet velocities in the nuclei, as measured with VLBI, will provide a powerful test of physical theories of extragalactic radio sources.
NASA Astrophysics Data System (ADS)
Davoine, X.; Bocquet, M.
2007-03-01
The reconstruction of the Chernobyl accident source term has been previously carried out using core inventories, but also back and forth confrontations between model simulations and activity concentration or deposited activity measurements. The approach presented in this paper is based on inverse modelling techniques. It relies both on the activity concentration measurements and on the adjoint of a chemistry-transport model. The location of the release is assumed to be known, and one is looking for a source term available for long-range transport that depends both on time and altitude. The method relies on the maximum entropy on the mean principle and exploits source positivity. The inversion results are mainly sensitive to two tuning parameters, a mass scale and the scale of the prior errors in the inversion. To overcome this hardship, we resort to the statistical L-curve method to estimate balanced values for these two parameters. Once this is done, many of the retrieved features of the source are robust within a reasonable range of parameter values. Our results favour the acknowledged three-step scenario, with a strong initial release (26 to 27 April), followed by a weak emission period of four days (28 April-1 May) and again a release, longer but less intense than the initial one (2 May-6 May). The retrieved quantities of iodine-131, caesium-134 and caesium-137 that have been released are in good agreement with the latest reported estimations. Yet, a stronger apportionment of the total released activity is ascribed to the first period and less to the third one. Finer chronological details are obtained, such as a sequence of eruptive episodes in the first two days, likely related to the modulation of the boundary layer diurnal cycle. In addition, the first two-day release surges are found to have effectively reached an altitude up to the top of the domain (5000 m).
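A minimal sketch of the L-curve idea used above to balance the two tuning parameters: a Tikhonov-regularised inversion is solved over a range of regularisation strengths and the corner of the (residual norm, solution norm) curve is selected; the operator and data below are synthetic stand-ins, not the actual transport-model adjoint.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(80, 40))                  # linearised source-receptor operator (synthetic)
x_true = np.clip(rng.normal(1.0, 1.0, 40), 0, None)
y = G @ x_true + rng.normal(0, 0.5, 80)

lams = np.logspace(-3, 2, 30)
res_norm, sol_norm = [], []
for lam in lams:
    # minimise ||Gx - y||^2 + lam * ||x||^2 via an augmented least-squares system
    A = np.vstack([G, np.sqrt(lam) * np.eye(40)])
    b = np.concatenate([y, np.zeros(40)])
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    res_norm.append(np.log(np.linalg.norm(G @ x - y)))
    sol_norm.append(np.log(np.linalg.norm(x)))

# crude corner detection: maximum curvature of the log-log L-curve
r, s = np.array(res_norm), np.array(sol_norm)
dr, ds = np.gradient(r), np.gradient(s)
d2r, d2s = np.gradient(dr), np.gradient(ds)
curv = np.abs(dr * d2s - ds * d2r) / (dr ** 2 + ds ** 2) ** 1.5
print("selected lambda:", lams[np.argmax(curv)])
```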
Statistical signatures of a targeted search by bacteria
NASA Astrophysics Data System (ADS)
Jashnsaz, Hossein; Anderson, Gregory G.; Pressé, Steve
2017-12-01
Chemoattractant gradients are rarely well-controlled in nature and recent attention has turned to bacterial chemotaxis toward typical bacterial food sources such as food patches or even bacterial prey. In environments with localized food sources reminiscent of a bacterium’s natural habitat, striking phenomena—such as the volcano effect or banding—have been predicted or expected to emerge from chemotactic models. However, in practice, from limited bacterial trajectory data it is difficult to distinguish targeted searches from an untargeted search strategy for food sources. Here we use a theoretical model to identify statistical signatures of a targeted search toward point food sources, such as prey. Our model is constructed on the basis that bacteria use temporal comparisons to bias their random walk, exhibit finite memory and are subject to random (Brownian) motion as well as signaling noise. The advantage with using a stochastic model-based approach is that a stochastic model may be parametrized from individual stochastic bacterial trajectories but may then be used to generate a very large number of simulated trajectories to explore average behaviors obtained from stochastic search strategies. For example, our model predicts that a bacterium’s diffusion coefficient increases as it approaches the point source and that, in the presence of multiple sources, bacteria may take substantially longer to locate their first source giving the impression of an untargeted search strategy.
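A minimal sketch of the kind of biased run-and-tumble walker described above, using temporal comparisons of a point-source concentration, Brownian jitter and signalling noise; parameter values are illustrative and are not fitted values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
source = np.array([50.0, 0.0])
conc = lambda p: 1.0 / (1.0 + np.linalg.norm(p - source))   # static point-source field

pos = np.zeros(2)
heading = rng.uniform(0, 2 * np.pi)
speed, dt, noise = 2.0, 0.1, 0.3
prev_c = conc(pos)
for _ in range(5000):
    step = speed * dt * np.array([np.cos(heading), np.sin(heading)])
    pos = pos + step + noise * np.sqrt(dt) * rng.standard_normal(2)   # Brownian jitter
    c = conc(pos) + 0.001 * rng.standard_normal()                     # sensed value with signalling noise
    p_tumble = 0.1 if c > prev_c else 0.5                             # run longer when conditions improve
    if rng.random() < p_tumble:
        heading = rng.uniform(0, 2 * np.pi)
    prev_c = c
print("final distance to source:", np.linalg.norm(pos - source))
```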
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murray, S. G.; Trott, C. M.; Jordan, C. H.
We present a sophisticated statistical point-source foreground model for low-frequency radio Epoch of Reionization (EoR) experiments using the 21 cm neutral hydrogen emission line. Motivated by our understanding of the low-frequency radio sky, we enhance the realism of two model components compared with existing models: the source count distributions as a function of flux density and spatial position (source clustering), extending current formalisms for the foreground covariance of 2D power-spectral modes in 21 cm EoR experiments. The former we generalize to an arbitrarily broken power law, and the latter to an arbitrary isotropically correlated field. This paper presents expressions formore » the modified covariance under these extensions, and shows that for a more realistic source spatial distribution, extra covariance arises in the EoR window that was previously unaccounted for. Failure to include this contribution can yield bias in the final power-spectrum and under-estimate uncertainties, potentially leading to a false detection of signal. The extent of this effect is uncertain, owing to ignorance of physical model parameters, but we show that it is dependent on the relative abundance of faint sources, to the effect that our extension will become more important for future deep surveys. Finally, we show that under some parameter choices, ignoring source clustering can lead to false detections on large scales, due to both the induced bias and an artificial reduction in the estimated measurement uncertainty.« less
NASA Astrophysics Data System (ADS)
Ahmadalipour, Ali; Moradkhani, Hamid; Rana, Arun
2017-04-01
Uncertainty is an inevitable feature of climate change impact assessments. Understanding and quantifying different sources of uncertainty is of high importance and can help modeling agencies improve current models and scenarios. In this study, we have assessed the future changes in three climate variables (i.e. precipitation, maximum temperature, and minimum temperature) over 10 sub-basins across the Pacific Northwest US. To conduct the study, 10 statistically downscaled CMIP5 GCMs from two downscaling methods (i.e. BCSD and MACA) were utilized at 1/16 degree spatial resolution for the historical period of 1970-2000 and the future period of 2010-2099. For the future projections, the two scenarios RCP4.5 and RCP8.5 were used. Furthermore, Bayesian Model Averaging (BMA) was employed to develop a probabilistic future projection for each climate variable. Results indicate superiority of BMA simulations compared to individual models. Increasing temperature and precipitation are projected at the annual timescale. However, the changes are not uniform among different seasons. Model uncertainty is shown to be the major source of uncertainty, while downscaling uncertainty contributes significantly to the total uncertainty, especially in summer.
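A minimal sketch of how BMA weights might be formed from each model's historical skill and then applied to the ensemble of projections; the scores and projected changes below are placeholders, not the CMIP5/BCSD/MACA results.

```python
import numpy as np

# hypothetical historical-fit scores per GCM (lower is better, e.g. a BIC-type criterion)
scores = {"gcm_a": 812.3, "gcm_b": 805.1, "gcm_c": 809.7}
bic = np.array(list(scores.values()))
w = np.exp(-0.5 * (bic - bic.min()))
w /= w.sum()                                           # approximate posterior model weights

projections = np.array([2.1, 2.8, 2.4])                # hypothetical warming (deg C) per model
print(dict(zip(scores, np.round(w, 3))), "BMA projection:", float(w @ projections))
```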
Vidyashankar, Anand N; Jimenez Castro, Pablo D; Kaplan, Ray M
2017-11-09
Initial studies of heartworm preventive drugs all yielded an observed efficacy of 100% with a single dose, and based on these data the US Food and Drug Administration (FDA) required all products to meet this standard for approval. Those initial studies, however, were based on just a few strains of parasites, and therefore were not representative of the full assortment of circulating biotypes. This issue has come to light in recent years, as it has become common for studies to yield less than 100% efficacy. This has changed the landscape for the testing of new products because heartworm efficacy studies lack the statistical power to conclude that finding zero worms is different from finding a few worms. To address this issue, we developed a novel statistical model, based on a hierarchical modeling and parametric bootstrap approach, that provides new insights for assessing multiple sources of variability encountered in heartworm drug efficacy studies. Using the newly established metrics, we performed data simulations and analyzed actual experimental data. Our results suggest that an important source of modeling variability arises from variability in the parasite establishment rate between dogs; not accounting for this can overestimate the efficacy in more than 40% of cases. We provide strong evidence that ZoeMo-2012 and JYD-34, both of which were established from the same source dog, have differing levels of susceptibility to moxidectin. In addition, we provide strong evidence that the differences in efficacy seen in two published studies using the MP3 strain were not due to randomness, and thus must be biological in nature. Our results demonstrate how statistical modeling can improve the interpretation of data from heartworm efficacy studies by providing a means to identify the true efficacy range based on the observed data. Importantly, these new insights should help to inform regulators on how to move forward in establishing new statistically and scientifically valid requirements for efficacy in the registration of new heartworm preventive products. Furthermore, our results provide strong evidence that heartworm 'strains' can change their susceptibility phenotype over short periods of time, providing further evidence that a wide diversity of susceptibility phenotypes exists among naturally circulating biotypes of D. immitis.
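A minimal sketch of a parametric bootstrap in which the per-dog establishment rate is drawn from a beta distribution before treatment survival is applied; all distributions and parameter values are hypothetical and do not reproduce the paper's hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(7)
n_dogs, larvae_per_dog = 8, 50

def simulate_arm(mean_est, overdisp, survival):
    """Worm counts per dog: beta-distributed establishment rate, then binomial survival."""
    a, b = mean_est * overdisp, (1 - mean_est) * overdisp
    p = rng.beta(a, b, n_dogs)                         # dog-to-dog variability in establishment
    established = rng.binomial(larvae_per_dog, p)
    return rng.binomial(established, survival)         # worms surviving treatment

boot_eff = []
for _ in range(2000):
    control = simulate_arm(0.6, 10.0, survival=1.0)
    treated = simulate_arm(0.6, 10.0, survival=0.05)
    boot_eff.append(1 - treated.mean() / control.mean())
lo, hi = np.percentile(boot_eff, [2.5, 97.5])
print(f"bootstrap efficacy interval: {lo:.3f}-{hi:.3f}")
```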
NASA Astrophysics Data System (ADS)
Winiarek, Victor; Vira, Julius; Bocquet, Marc; Sofiev, Mikhail; Saunier, Olivier
2011-06-01
In the event of an accidental atmospheric release of radionuclides from a nuclear power plant, accurate real-time forecasting of the activity concentrations of radionuclides is required by decision makers for the preparation of adequate countermeasures. The accuracy of the forecast plume is highly dependent on the source term estimation. On several academic test cases, including real data, inverse modelling and data assimilation techniques were proven to help in the assessment of the source term. In this paper, a semi-automatic method is proposed for the sequential reconstruction of the plume, by implementing a sequential data assimilation algorithm based on inverse modelling, with care taken to develop realistic methods for operational risk agencies. The performance of the assimilation scheme has been assessed through an intercomparison between French and Finnish frameworks. Two dispersion models have been used: Polair3D and Silam, developed in two different research centres. Different release locations, as well as different meteorological situations, are tested. The existing and newly planned surveillance networks are used and realistically large multiplicative observational errors are assumed. The inverse modelling scheme accounts for the strong error bias encountered with such errors. The efficiency of the data assimilation system is tested via statistical indicators. For France and Finland, the average performance of the data assimilation system is strong. However, there are outlying situations where the inversion fails because of too poor observability. In addition, in the case where the power plant responsible for the accidental release is not known, robust statistical tools are developed and tested to discriminate candidate release sites.
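A minimal sketch of a positivity-constrained source-term inversion: given a synthetic source-receptor matrix and noisy observations, hourly release rates are recovered with non-negative least squares; this is a toy illustration, not the Polair3D/Silam assimilation scheme.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
H = np.abs(rng.normal(size=(120, 24)))          # sensitivities of 120 observations to 24 hourly releases
q_true = np.zeros(24)
q_true[6:10] = [2.0, 5.0, 3.0, 1.0]             # hypothetical release pulse
y = H @ q_true * (1 + 0.2 * rng.standard_normal(120))   # multiplicative observation error

q_hat, resid = nnls(H, np.clip(y, 0, None))     # enforce source positivity
print("recovered release hours:", np.nonzero(q_hat > 0.5)[0])
```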
Pietz, Kenneth; Petersen, Laura A
2007-01-01
Objectives: To compare the ability of two diagnosis-based risk adjustment systems and health self-report to predict short- and long-term mortality. Data Sources/Study Setting: Data were obtained from the Department of Veterans Affairs (VA) administrative databases. The study population was 78,164 VA beneficiaries at eight medical centers during fiscal year (FY) 1998, 35,337 of whom completed a 36-Item Short Form Health Survey for veterans (SF-36V) survey. Study Design: We tested the ability of Diagnostic Cost Groups (DCGs), Adjusted Clinical Groups (ACGs), the SF-36V Physical Component Score (PCS) and Mental Component Score (MCS), and eight SF-36V scales to predict 1- and 2–5-year all-cause mortality. The additional predictive value of adding PCS and MCS to ACGs and DCGs was also evaluated. Logistic regression models were compared using Akaike's information criterion, the c-statistic, and the Hosmer–Lemeshow test. Principal Findings: The c-statistics for the eight scales combined with age and gender were 0.766 for 1-year mortality and 0.771 for 2–5-year mortality. For DCGs with age and gender the c-statistics for 1- and 2–5-year mortality were 0.778 and 0.771, respectively. Adding PCS and MCS to the DCG model increased the c-statistics to 0.798 for 1-year and 0.784 for 2–5-year mortality. Conclusions: The DCG model showed slightly better performance than the eight-scale model in predicting 1-year mortality, but the two models showed similar performance for 2–5-year mortality. Health self-report may add health risk information in addition to age, gender, and diagnosis for predicting longer-term mortality. PMID:17362210
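For readers unfamiliar with the c-statistic, a minimal sketch of computing it as the area under the ROC curve is shown below; the risk scores and outcomes are simulated, not the VA cohort.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
died = rng.binomial(1, 0.1, 5000)                              # simulated 1-year mortality outcomes
# simulated model risk scores: decedents score higher on average
risk_score = died * rng.normal(0.6, 1.0, 5000) + (1 - died) * rng.normal(0.0, 1.0, 5000)
print("c-statistic:", roc_auc_score(died, risk_score))
```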
Statistical Modeling for Radiation Hardness Assurance
NASA Technical Reports Server (NTRS)
Ladbury, Raymond L.
2014-01-01
We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: what models are used, what errors exist in real test data, and what the models allow us to say about the device under test (DUT). In addition, we cover how to use other sources of data, such as historical, heritage, and similar-part data, and how to apply experience, physics, and expert opinion to the analysis. Also included are concepts of Bayesian statistics, data fitting, and bounding rates.
NASA Astrophysics Data System (ADS)
Ghotbi, Saba; Sotoudeheian, Saeed; Arhami, Mohammad
2016-09-01
Satellite remote sensing products of AOD from MODIS, along with appropriate meteorological parameters, were used to develop statistical models and estimate ground-level PM10. Most previous studies obtained meteorological data from synoptic weather stations, with rather sparse spatial distribution, and used them along with the 10 km AOD product to develop statistical models applicable to PM variations at regional scale (resolution of ≥10 km). In the current study, meteorological parameters were simulated at 3 km resolution using the WRF model and used along with the rather new 3 km AOD product (launched in 2014). The resulting PM statistical models were assessed for a polluted and highly variable urban area, Tehran, Iran. Despite the critical particulate pollution problem, very few PM studies have been conducted in this area. Direct PM-AOD associations were rather poor, due to factors such as variations in particle optical properties, in addition to the bright-background issue for satellite data, as the studied area is located in the semi-arid Middle East. The statistical approach of linear mixed effects (LME) was used, and three types of statistical models, including a single-variable LME model (using AOD as the independent variable) and multiple-variable LME models using meteorological data from two sources, the WRF model and synoptic stations, were examined. Meteorological simulations were performed using a multiscale approach and an appropriate physics configuration for the studied region, and the results showed rather good agreement with recordings of the synoptic stations. The single-variable LME model was able to explain about 61%-73% of daily PM10 variations, reflecting a rather acceptable performance. Statistical model performance improved through using multivariable LME and incorporating meteorological data as auxiliary variables, particularly by using fine resolution outputs from WRF (R2 = 0.73-0.81). In addition, PM estimates were mapped at rather fine resolution for the studied city, and the resulting concentration maps were consistent with PM recordings at the existing stations.
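A minimal sketch of a single-variable LME calibration of PM10 on AOD with a day-level random intercept, in the spirit of the models above; the data frame is simulated with statsmodels, not the Tehran MODIS/WRF dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
days = np.repeat(np.arange(60), 20)                       # 60 days x 20 grid cells
day_effect = rng.normal(0, 15, 60)[days]                  # day-specific random intercept
aod = rng.gamma(2.0, 0.2, days.size)
pm10 = 40 + 120 * aod + day_effect + rng.normal(0, 10, days.size)
df = pd.DataFrame({"pm10": pm10, "aod": aod, "day": days})

model = smf.mixedlm("pm10 ~ aod", df, groups=df["day"]).fit()
print(model.params["aod"])                                # fixed AOD slope
print(model.cov_re)                                       # day-level random-effect variance
```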
A two-step super-Gaussian independent component analysis approach for fMRI data.
Ge, Ruiyang; Yao, Li; Zhang, Hang; Long, Zhiying
2015-09-01
Independent component analysis (ICA) has been widely applied to functional magnetic resonance imaging (fMRI) data analysis. Although ICA assumes that the sources underlying the data are statistically independent, it usually ignores sources' additional properties, such as sparsity. In this study, we propose a two-step super-Gaussian ICA (2SGICA) method that incorporates the sparse prior of the sources into the ICA model. 2SGICA uses the super-Gaussian ICA (SGICA) algorithm, which is based on a simplified Lewicki-Sejnowski model, to obtain the initial source estimate in the first step. Using a kernel estimator technique, the source density is acquired and fitted to the Laplacian function based on the initial source estimates. The fitted Laplacian prior is used for each source at the second SGICA step. Moreover, the automatic target generation process for initial value generation is used in 2SGICA to guarantee the stability of the algorithm. An adaptive step size selection criterion is also implemented in the proposed algorithm. We performed experimental tests on both simulated data and real fMRI data to investigate the feasibility and robustness of 2SGICA and made a performance comparison between Infomax ICA, FastICA, mean field ICA (MFICA) with Laplacian prior, sparse online dictionary learning (ODL), SGICA and 2SGICA. Both simulated and real fMRI experiments showed that 2SGICA was the most robust to noise, and had the best spatial detection power and time course estimation among the six methods. Copyright © 2015. Published by Elsevier Inc.
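As a baseline illustration of spatial ICA on an fMRI-like matrix (time x voxels), the sketch below uses scikit-learn's generic FastICA, one of the comparison methods above; it is not the 2SGICA algorithm, and the sparse maps and time courses are simulated.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(6)
n_time, n_vox = 120, 2000
s1 = np.zeros(n_vox); s1[:200] = 1.0                       # sparse "activation" map 1
s2 = np.zeros(n_vox); s2[500:650] = 1.0                    # sparse "activation" map 2
tc = rng.standard_normal((n_time, 2))                      # corresponding time courses
X = tc @ np.vstack([s1, s2]) + 0.2 * rng.standard_normal((n_time, n_vox))

ica = FastICA(n_components=2, random_state=0)
time_courses = ica.fit_transform(X)                        # estimated mixing over time
maps = ica.components_                                     # estimated spatial sources
print(maps.shape, time_courses.shape)
```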
Multiple sparse volumetric priors for distributed EEG source reconstruction.
Strobbe, Gregor; van Mierlo, Pieter; De Vos, Maarten; Mijović, Bogdan; Hallez, Hans; Van Huffel, Sabine; López, José David; Vandenberghe, Stefaan
2014-10-15
We revisit the multiple sparse priors (MSP) algorithm implemented in the statistical parametric mapping software (SPM) for distributed EEG source reconstruction (Friston et al., 2008). In the present implementation, multiple cortical patches are introduced as source priors based on a dipole source space restricted to a cortical surface mesh. In this note, we present a technique to construct volumetric cortical regions to introduce as source priors by restricting the dipole source space to a segmented gray matter layer and using a region growing approach. This extension allows reconstruction of brain structures besides the cortical surface and facilitates the use of more realistic volumetric head models including more layers, such as cerebrospinal fluid (CSF), compared to the standard 3-layered scalp-skull-brain head models. We illustrated the technique with ERP data and anatomical MR images in 12 subjects. Based on the segmented gray matter for each of the subjects, cortical regions were created and introduced as source priors for MSP inversion assuming two types of head models: the standard 3-layered scalp-skull-brain head models and extended 4-layered head models including CSF. We compared these models with the current implementation by assessing the free energy corresponding to each of the reconstructions, using Bayesian model selection for group studies. Strong evidence was found in favor of the volumetric MSP approach compared to the MSP approach based on cortical patches for both types of head models. Overall, the strongest evidence was found in favor of the volumetric MSP reconstructions based on the extended head models including CSF. These results were verified by comparing the reconstructed activity. The use of volumetric cortical regions as source priors is a useful complement to the present implementation as it allows the introduction of more complex head models and volumetric source priors in future studies. Copyright © 2014 Elsevier Inc. All rights reserved.
Dėdelė, Audrius; Miškinytė, Auksė
2015-09-01
In many countries, road traffic is one of the main sources of air pollution associated with adverse effects on human health and the environment. Nitrogen dioxide (NO2) is considered to be a measure of traffic-related air pollution, with concentrations tending to be higher near highways, along busy roads, and in city centers, and exceedances are mainly observed at measurement stations located close to traffic. In order to assess the air quality in a city and the impact of air pollution on public health, air quality models are used. However, before a model can be used for these purposes, it is important to evaluate the accuracy of dispersion modelling, one of the most widely used methods. Monitoring and dispersion modelling are two components of an air quality monitoring system (AQMS), and a statistical comparison between them was made in this research. The evaluation of the Atmospheric Dispersion Modelling System (ADMS-Urban) was made by comparing monthly modelled NO2 concentrations with the data of continuous air quality monitoring stations in Kaunas city. The statistical measures of model performance were calculated for annual and monthly concentrations of NO2 for each monitoring station site. The spatial analysis was made using geographic information systems (GIS). The calculated statistical parameters indicated a good ADMS-Urban model performance for the prediction of NO2. The results of this study showed that the agreement between modelled values and observations was better for traffic monitoring stations compared to the background and residential stations.
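A minimal sketch of model-performance measures commonly used when evaluating dispersion models against monitoring data (fractional bias, NMSE, FAC2 and the correlation coefficient); the paired monthly NO2 values below are invented, not the Kaunas observations.

```python
import numpy as np

obs = np.array([28.0, 31.5, 25.2, 22.8, 19.4, 17.1, 18.0, 20.3, 24.7, 27.9, 30.2, 29.1])
mod = np.array([25.1, 29.0, 27.4, 20.5, 18.2, 15.9, 19.4, 22.1, 23.0, 26.4, 33.1, 27.5])

fb = 2 * (obs.mean() - mod.mean()) / (obs.mean() + mod.mean())   # fractional bias
nmse = np.mean((obs - mod) ** 2) / (obs.mean() * mod.mean())     # normalised mean square error
fac2 = np.mean((mod / obs > 0.5) & (mod / obs < 2.0))            # fraction within a factor of 2
r = np.corrcoef(obs, mod)[0, 1]                                  # correlation coefficient
print(f"FB={fb:.2f} NMSE={nmse:.2f} FAC2={fac2:.2f} r={r:.2f}")
```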
NASA Astrophysics Data System (ADS)
Ren, Y.
2017-12-01
Context: The spatio-temporal distribution pattern of land surface temperatures (LSTs) in urban forests is influenced by many ecological factors; identification of the interactions between these factors can improve simulations and predictions of spatial patterns of urban cold islands. This quantitative research requires an integrated method that combines multi-source data with spatial statistical analysis. Objectives: The purpose of this study was to clarify the interacting influence of anthropogenic activities and multiple ecological factors on urban forest LST, using cluster analysis of hot and cold spots and the GeoDetector model. We introduced the hypothesis that anthropogenic activity interacts with certain ecological factors, and their combination influences urban forest LST. We also assumed that spatio-temporal distributions of urban forest LST should be similar to those of ecological factors and can be represented quantitatively. Methods: We used Jinjiang, a representative city in China, as a case study. Population density was employed to represent anthropogenic activity. We built up a multi-source dataset (forest inventory, digital elevation models (DEM), population, and remote sensing imagery) on a unified urban scale to support research on the interacting influences on urban forest LST. Through a combination of spatial statistical analysis results, multi-source spatial data, and the GeoDetector model, the interaction mechanisms of urban forest LST were revealed. Results: Although different ecological factors have different influences on forest LST, in two periods with different hot spots and cold spots, patch area and dominant tree species were the main factors contributing to LST clustering in urban forests. The interaction between anthropogenic activity and multiple ecological factors increased LST in urban forest stands, both linearly and nonlinearly. Strong interactions between elevation and dominant species were generally observed and were prevalent in either hot or cold spot areas in different years. Conclusions: In conclusion, a combination of spatial statistics and GeoDetector models should be effective for quantitatively evaluating interactive relationships among ecological factors, anthropogenic activity and LST.
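A minimal sketch of the GeoDetector q-statistic, which measures the share of LST variance explained by a categorical factor (q = 1 - SSW/SST); the strata and temperatures below are synthetic, not the Jinjiang inventory data.

```python
import numpy as np

def q_statistic(values, strata):
    """GeoDetector factor detector: q = 1 - (sum of within-stratum variances) / total variance."""
    values, strata = np.asarray(values, float), np.asarray(strata)
    sst = values.size * values.var()
    ssw = sum(values[strata == h].size * values[strata == h].var() for h in np.unique(strata))
    return 1.0 - ssw / sst

rng = np.random.default_rng(8)
species = rng.integers(0, 3, 500)                        # dominant-species class per forest stand
lst = 28 + np.array([0.0, 1.5, 3.0])[species] + rng.normal(0, 1.0, 500)
print("q for dominant species:", q_statistic(lst, species))
```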
NASA Astrophysics Data System (ADS)
Kokkinaki, A.; Sleep, B. E.; Chambers, J. E.; Cirpka, O. A.; Nowak, W.
2010-12-01
Electrical Resistance Tomography (ERT) is a popular method for investigating subsurface heterogeneity. The method relies on measuring electrical potential differences and obtaining, through inverse modeling, the underlying electrical conductivity field, which can be related to hydraulic conductivities. The quality of site characterization strongly depends on the utilized inversion technique. Standard ERT inversion methods, though highly computationally efficient, do not consider spatial correlation of soil properties; as a result, they often underestimate the spatial variability observed in earth materials, thereby producing unrealistic subsurface models. Also, these methods do not quantify the uncertainty of the estimated properties, thus limiting their use in subsequent investigations. Geostatistical inverse methods can be used to overcome both these limitations; however, they are computationally expensive, which has hindered their wide use in practice. In this work, we compare a standard Gauss-Newton smoothness constrained least squares inversion method against the quasi-linear geostatistical approach using the three-dimensional ERT dataset of the SABRe (Source Area Bioremediation) project. The two methods are evaluated for their ability to: a) produce physically realistic electrical conductivity fields that agree with the wide range of data available for the SABRe site while being computationally efficient, and b) provide information on the spatial statistics of other parameters of interest, such as hydraulic conductivity. To explore the trade-off between inversion quality and computational efficiency, we also employ a 2.5-D forward model with corrections for boundary conditions and source singularities. The 2.5-D model accelerates the 3-D geostatistical inversion method. New adjoint equations are developed for the 2.5-D forward model for the efficient calculation of sensitivities. Our work shows that spatial statistics can be incorporated in large-scale ERT inversions to improve the inversion results without making them computationally prohibitive.
NASA Astrophysics Data System (ADS)
Bae, Minja; Park, Jihyun; Kim, Jongju; Xue, Dandan; Park, Kyu-Chil; Yoon, Jong Rak
2016-07-01
The bit error rate of an underwater acoustic communication system is related to multipath fading statistics, which determine the signal-to-noise ratio. The amplitude and delay of each path depend on sea surface roughness, propagation medium properties, and source-to-receiver range as a function of frequency. Therefore, received signals will show frequency-dependent fading. A shallow-water acoustic communication channel generally shows a few strong multipaths that interfere with each other and the resulting interference affects the fading statistics model. In this study, frequency-selective fading statistics are modeled on the basis of the phasor representation of the complex path amplitude. The fading statistics distribution is parameterized by the frequency-dependent constructive or destructive interference of multipaths. At a 16 m depth with a muddy bottom, a wave height of 0.2 m, and source-to-receiver ranges of 100 and 400 m, fading statistics tend to show a Rayleigh distribution at a destructive interference frequency, but a Rice distribution at a constructive interference frequency. The theoretical fading statistics well matched the experimental ones.
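A minimal sketch of the phasor picture described above: the envelope of a coherent path plus diffuse multipath scatter follows a Rice distribution, which collapses to Rayleigh when the coherent component vanishes; the amplitudes below are simulated, not the sea-trial data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
sigma, n = 1.0, 20000
diffuse = rng.normal(0, sigma, n) + 1j * rng.normal(0, sigma, n)   # diffuse multipath phasor

for A in (0.0, 3.0):   # destructive (no coherent part) vs constructive (strong coherent part)
    env = np.abs(A + diffuse)                                       # received envelope
    b, loc, scale = stats.rice.fit(env, floc=0)                     # fitted Rice shape parameter
    print(f"coherent amplitude {A}: fitted Rice b = {b:.2f} (b near 0 means Rayleigh-like)")
```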
Vahedi, Shahrum; Farrokhi, Farahman; Gahramani, Farahnaz; Issazadegan, Ali
2012-01-01
Objective: Approximately 66-80% of graduate students experience statistics anxiety and some researchers propose that many students identify statistics courses as the most anxiety-inducing courses in their academic curriculums. As such, it is likely that statistics anxiety is, in part, responsible for many students delaying enrollment in these courses for as long as possible. This paper proposes a canonical model by treating academic procrastination (AP) and learning strategies (LS) as predictor variables and statistics anxiety (SA) as the explained variable. Methods: A questionnaire survey was used for data collection and 246 female college students participated in this study. To examine the mutually independent relations between procrastination, learning strategies and statistics anxiety variables, a canonical correlation analysis was computed. Results: Findings show that two canonical functions were statistically significant. The set of variables (metacognitive self-regulation, source management, preparing homework, preparing for tests and preparing term papers) helped predict changes in statistics anxiety with respect to fearful behavior, attitude towards math and class, and performance, but not anxiety. Conclusion: These findings could be used in educational and psychological interventions in the context of statistics anxiety reduction. PMID:24644468
NASA Astrophysics Data System (ADS)
Fadakar Alghalandis, Younes
2017-05-01
A rapidly growing topic, discrete fracture network engineering (DFNE) has already attracted many talented researchers from diverse disciplines in academia and industry around the world to challenge difficult problems related to mining, geothermal, civil, oil and gas, water and many other projects. Although a few commercial software packages are capable of providing some useful functionalities fundamental to DFNE, their cost, closed-code (black box) distribution and hence limited programmability and tractability encouraged us to respond to this rising demand with a new solution. This paper introduces an open source comprehensive software package for stochastic modeling of fracture networks in two and three dimensions in a discrete formulation. Functionalities included are geometric modeling (e.g., complex polygonal fracture faces, and utilizing directional statistics), simulations, characterizations (e.g., intersection, clustering and connectivity analyses) and applications (e.g., fluid flow). The package is written entirely in the Matlab scripting language. Significant effort has been made to bring maximum flexibility to the functions in order to solve problems in both two and three dimensions in an easy and unified way that is suitable for beginners, advanced and experienced users.
The Advanced Statistical Trajectory Regional Air Pollution (ASTRAP) model simulates long-term transport and deposition of oxides of sulfur and nitrogen. It is a potential screening tool for assessing long-term effects on regional visibility from sulfur emission sources. However, a rigorou...
Chapter two: Phenomenology of tsunamis II: scaling, event statistics, and inter-event triggering
Geist, Eric L.
2012-01-01
Observations related to tsunami catalogs are reviewed and described in a phenomenological framework. An examination of scaling relationships between earthquake size (as expressed by scalar seismic moment and mean slip) and tsunami size (as expressed by mean and maximum local run-up and maximum far-field amplitude) indicates that scaling is significant at the 95% confidence level, although there is uncertainty in how well earthquake size can predict tsunami size (R2 ~ 0.4-0.6). In examining tsunami event statistics, current methods used to estimate the size distribution of earthquakes and landslides and the inter-event time distribution of earthquakes are first reviewed. These methods are adapted to estimate the size and inter-event distribution of tsunamis at a particular recording station. Using a modified Pareto size distribution, the best-fit power-law exponents of tsunamis recorded at nine Pacific tide-gauge stations exhibit marked variation, in contrast to the approximately constant power-law exponent for inter-plate thrust earthquakes. With regard to the inter-event time distribution, significant temporal clustering of tsunami sources is demonstrated. For tsunami sources occurring in close proximity to other sources in both space and time, a physical triggering mechanism, such as static stress transfer, is a likely cause for the anomalous clustering. Mechanisms of earthquake-to-earthquake and earthquake-to-landslide triggering are reviewed. Finally, a modification of statistical branching models developed for earthquake triggering is introduced to describe triggering among tsunami sources.
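A minimal sketch of a maximum-likelihood fit of a power-law (Pareto) tail exponent to event sizes above a threshold, of the kind used to characterise the tide-gauge size distributions above; the sample is drawn from a synthetic Pareto distribution, not a real tsunami catalogue.

```python
import numpy as np

rng = np.random.default_rng(10)
xmin, alpha_true = 0.05, 1.8                                   # threshold (m) and true tail exponent
x = xmin * rng.uniform(size=400) ** (-1 / alpha_true)          # Pareto(alpha) samples by inverse CDF

alpha_hat = x.size / np.sum(np.log(x / xmin))                  # continuous power-law MLE (Hill estimator)
se = alpha_hat / np.sqrt(x.size)                               # asymptotic standard error
print(f"alpha = {alpha_hat:.2f} +/- {se:.2f}")
```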
Biosurveillance applying scan statistics with multiple, disparate data sources.
Burkom, Howard S
2003-06-01
Researchers working on the Department of Defense Global Emerging Infections System (DoD-GEIS) pilot system, the Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE), have applied scan statistics for early outbreak detection using both traditional and nontraditional data sources. These sources include medical data indexed by International Classification of Disease, 9th Revision (ICD-9) diagnosis codes, as well as less-specific, but potentially timelier, indicators such as records of over-the-counter remedy sales and of school absenteeism. Early efforts employed the Kulldorff scan statistic as implemented in the SaTScan software of the National Cancer Institute. A key obstacle to this application is that the input data streams are typically based on time-varying factors, such as consumer behavior, rather than simply on the populations of the component subregions. We have used both modeling and recent historical data distributions to obtain background spatial distributions. Data analyses have provided guidance on how to condition and model input data to avoid excessive clustering. We have used this methodology in combining data sources for both retrospective studies of known outbreaks and surveillance of high-profile events of concern to local public health authorities. We have integrated the scan statistic capability into a Microsoft Access-based system in which we may include or exclude data sources, vary time windows separately for different data sources, censor data from subsets of individual providers or subregions, adjust the background computation method, and run retrospective or simulated studies.
Sources of Instabilities in Two-Way Satellite Time Transfer
2005-08-01
Two-Way Satellite Time and Frequency Transfer (TWSTFT) has become an important ... To improve the stability of TWSTFT, a more complete understanding of the sources of instabilities is required. This paper analyzes several sources of instabilities ... TWSTFT regularly delivers subnanosecond time transfer stability at 1 day as measured by the time deviation (TDEV) statistic.
Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data
Kacprzak, T.; Kirk, D.; Friedrich, O.; ...
2016-08-19
Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg$^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $0<\mathcal S / \mathcal N<4$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $\Omega_{\rm m}$ and $\sigma_8$, fixing $w = -1$, $\Omega_{\rm b} = 0.04$, $h = 0.7$ and $n_s=1$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $\sigma_{8}(\Omega_{\rm m}/0.3)^{0.6}=0.77 \pm 0.07$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $\mathcal S / \mathcal N>4$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. Finally, we discuss prospects for future peak statistics analysis with upcoming DES data.
Quantifying Variation in Gait Features from Wearable Inertial Sensors Using Mixed Effects Models
Cresswell, Kellen Garrison; Shin, Yongyun; Chen, Shanshan
2017-01-01
The emerging technology of wearable inertial sensors has shown its advantages in collecting continuous longitudinal gait data outside laboratories. This freedom also presents challenges in collecting high-fidelity gait data. In the free-living environment, without constant supervision from researchers, sensor-based gait features are susceptible to variation from confounding factors such as gait speed and mounting uncertainty, which are challenging to control or estimate. This paper is one of the first attempts in the field to tackle such challenges using statistical modeling. By accepting the uncertainties and variation associated with wearable sensor-based gait data, we shift our efforts from detecting and correcting those variations to modeling them statistically. From gait data collected on one healthy, non-elderly subject during 48 full-factorial trials, we identified four major sources of variation, and quantified their impact on one gait outcome—range per cycle—using a random effects model and a fixed effects model. The methodology developed in this paper lays the groundwork for a statistical framework to account for sources of variation in wearable gait data, thus facilitating informative statistical inference for free-living gait analysis. PMID:28245602
A two-factor error model for quantitative steganalysis
NASA Astrophysics Data System (ADS)
Böhme, Rainer; Ker, Andrew D.
2006-02-01
Quantitative steganalysis refers to the exercise not only of detecting the presence of hidden stego messages in carrier objects, but also of estimating the secret message length. This problem is well studied, with many detectors proposed but only a sparse analysis of errors in the estimators. A deep understanding of the error model, however, is a fundamental requirement for the assessment and comparison of different detection methods. This paper presents a rationale for a two-factor model for sources of error in quantitative steganalysis, and shows evidence from a dedicated large-scale nested experimental set-up with a total of more than 200 million attacks. Apart from general findings about the distribution functions found in both classes of errors, their respective weight is determined, and implications for statistical hypothesis tests in benchmarking scenarios or regression analyses are demonstrated. The results are based on a rigorous comparison of five different detection methods under many different external conditions, such as size of the carrier, previous JPEG compression, and colour channel selection. We include analyses demonstrating the effects of local variance and cover saturation on the different sources of error, as well as presenting the case for a relative bias model for between-image error.
NASA Astrophysics Data System (ADS)
Collier, J. D.; Tingay, S. J.; Callingham, J. R.; Norris, R. P.; Filipović, M. D.; Galvin, T. J.; Huynh, M. T.; Intema, H. T.; Marvil, J.; O'Brien, A. N.; Roper, Q.; Sirothia, S.; Tothill, N. F. H.; Bell, M. E.; For, B.-Q.; Gaensler, B. M.; Hancock, P. J.; Hindson, L.; Hurley-Walker, N.; Johnston-Hollitt, M.; Kapińska, A. D.; Lenc, E.; Morgan, J.; Procopio, P.; Staveley-Smith, L.; Wayth, R. B.; Wu, C.; Zheng, Q.; Heywood, I.; Popping, A.
2018-06-01
We present very long baseline interferometry observations of a faint and low-luminosity (L_{1.4 GHz} < 10^{27} W Hz^{-1}) gigahertz-peaked spectrum (GPS) and compact steep-spectrum (CSS) sample. We select eight sources from deep radio observations that have radio spectra characteristic of a GPS or CSS source and an angular size of θ ≲ 2 arcsec, and detect six of them with the Australian Long Baseline Array. We determine their linear sizes, and model their radio spectra using synchrotron self-absorption (SSA) and free-free absorption (FFA) models. We derive statistical model ages, based on a fitted scaling relation, and spectral ages, based on the radio spectrum, which are generally consistent with the hypothesis that GPS and CSS sources are young and evolving. We resolve the morphology of one CSS source with a radio luminosity of 10^{25} W Hz^{-1}, and find what appear to be two hotspots spanning 1.7 kpc. We find that our sources follow the turnover-linear size relation, and that both homogeneous SSA and an inhomogeneous FFA model can account for the spectra with observable turnovers. All but one of the FFA models do not require a spectral break to account for the radio spectrum, while all but one of the alternative SSA and power-law models do require a spectral break to account for the radio spectrum. We conclude that our low-luminosity sample is similar to brighter samples in terms of their spectral shape, turnover frequencies, linear sizes, and ages, but cannot test for a difference in morphology.
Localization of extended brain sources from EEG/MEG: the ExSo-MUSIC approach.
Birot, Gwénaël; Albera, Laurent; Wendling, Fabrice; Merlet, Isabelle
2011-05-01
We propose a new MUSIC-like method, called 2q-ExSo-MUSIC (q ≥ 1). This method is an extension of the 2q-MUSIC (q ≥ 1) approach for solving the EEG/MEG inverse problem, when spatially-extended neocortical sources ("ExSo") are considered. It introduces a novel ExSo-MUSIC principle. The novelty is two-fold: i) the parameterization of the spatial source distribution that leads to an appropriate metric in the context of distributed brain sources and ii) the introduction of an original, efficient and low-cost way of optimizing this metric. In 2q-ExSo-MUSIC, the possible use of higher order statistics (q ≥ 2) offers better robustness with respect to Gaussian noise of unknown spatial coherence and to modeling errors. As a result we reduced the penalizing effects of both the background cerebral activity, which can be seen as Gaussian, spatially correlated noise, and the modeling errors induced by the non-exact resolution of the forward problem. Computer results on simulated EEG signals obtained with physiologically-relevant models of both the sources and the volume conductor show a highly increased performance of our 2q-ExSo-MUSIC method as compared to the classical 2q-MUSIC algorithms. Copyright © 2011 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Chen, Y.; Xu, X.
2017-12-01
The broad band Lg 1/Q tomographic models in eastern Eurasia are inverted from source- and site-corrected path 1/Q data. Path 1/Q values are measured between stations (or events) by the two-station (TS), reverse two-station (RTS) and reverse two-event (RTE) methods, respectively. Because path 1/Q values are computed using the logarithm of the product of observed spectral ratios and a simplified 1D geometrical spreading correction, they are subject to "modeling errors" dominated by uncompensated 3D structural effects. We found in Chen and Xie [2017] that these errors closely follow a normal distribution after the long-tailed outliers are screened out (similar to teleseismic travel time residuals). We thus rigorously analyze the statistics of these errors, collected from repeated samplings of station (and event) pairs from 1.0 to 10.0 Hz, and reject about 15% of outliers at each frequency band. The resultant variance of Δ(1/Q) decreases with frequency as 1/f^2. The 1/Q tomography using screened data is now a stochastic inverse problem whose solutions approximate the means of Gaussian random variables, and the model covariance matrix is that of Gaussian variables with well-known statistical behavior. We adopt a new SVD-based tomographic method to solve for the 2D Q image together with its resolution and covariance matrices. The RTS and RTE methods yield the most reliable 1/Q data, free of source and site effects, but the path coverage is rather sparse due to the very strict recording geometry. The TS method absorbs the effects of non-unit site response ratios into the 1/Q data. The RTS method also yields site responses, which can then be corrected from the path 1/Q of the TS method to make those data also free of site effects. The site-corrected TS data substantially improve path coverage, allowing the 1/Q tomography to be solved up to 6.0 Hz. The model resolution and uncertainty are quantitatively assessed using spread functions (from the resolution matrix) and the covariance matrix. The reliably retrieved Q models correlate well with the distinct tectonic blocks featured by the most recent major deformations and vary with frequency. With the 1/Q tomographic model and its covariance matrix, we can formally estimate the uncertainty of any path-specific Lg 1/Q prediction. This new capability significantly benefits source estimation, for which a reliable uncertainty estimate is especially important.
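A minimal sketch of a truncated-SVD inversion with its model resolution matrix R = V_p V_p^T and posterior covariance, of the kind described above; the kernel matrix is random and merely stands in for the Lg path-sampling matrix.

```python
import numpy as np

rng = np.random.default_rng(11)
G = rng.normal(size=(200, 60))                 # paths x model cells (synthetic kernel)
m_true = rng.normal(size=60)
d = G @ m_true + 0.1 * rng.standard_normal(200)

U, s, Vt = np.linalg.svd(G, full_matrices=False)
p = int(np.sum(s > 0.05 * s[0]))               # keep well-constrained singular values only
Gp_inv = Vt[:p].T @ np.diag(1 / s[:p]) @ U[:, :p].T   # truncated generalised inverse
m_hat = Gp_inv @ d
R = Vt[:p].T @ Vt[:p]                          # model resolution matrix
C = (0.1 ** 2) * Gp_inv @ Gp_inv.T             # posterior covariance for data variance 0.1^2
print("trace(R) =", np.trace(R), " max model s.d. =", np.sqrt(np.diag(C)).max())
```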
PKS 2155-304 relativistically beamed synchrotron radiation from BL LAC object
NASA Technical Reports Server (NTRS)
Urry, C. M.; Mushotzky, R. F.
1981-01-01
The newly discovered BL Lacertae object, PKS 2155-304, was observed with the medium- and high-energy detectors of the HEAO-1 A2 experiment. The variability by a factor of two in less than a day reported by Snyder et al. (1979) is confirmed. Two spectra, obtained a year apart while the satellite was in scanning mode, are well fit by simple power laws with energy spectral index alpha sub 1 equal to approximately 1.4. A third spectrum, of higher statistical quality, obtained while the satellite was pointed at the source, has two components. An acceptable fit was obtained using a two power law model, with indices alpha sub 1 equals 2.0 (+1.2, -0.6) and alpha sub 2 equals -1.5 (+1.5, -2.3). An interpretation of the overall spectrum from radio through X-rays in terms of a synchrotron self-Compton model gives a good description of the data if allowance is made for relativistic beaming. Thus, from a consideration of the spectrum combined with an estimate of the size of the source, the presence of jets is inferred without their direct observation.
Mathematical Anxiety among Business Statistics Students.
ERIC Educational Resources Information Center
High, Robert V.
A survey instrument was developed to identify sources of mathematics anxiety among undergraduate business students in a statistics class. A number of statistics classes were selected at two colleges in Long Island, New York. A final sample of n=102 respondents indicated that there was a relationship between the mathematics grade in prior…
Upscaling pore pressure-dependent gas permeability in shales
NASA Astrophysics Data System (ADS)
Ghanbarian, Behzad; Javadpour, Farzam
2017-04-01
Upscaling the pore pressure dependence of shale gas permeability is of great importance and interest in the investigation of gas production in unconventional reservoirs. In this study, we apply the Effective Medium Approximation, an upscaling technique from statistical physics, and modify the Doyen model for unconventional rocks. We develop an upscaling model that estimates the pore pressure-dependent gas permeability from the pore throat size distribution, pore connectivity, tortuosity, porosity, and gas characteristics. We compare our adapted model with six data sets: three experiments, one pore-network model, and two lattice-Boltzmann simulations. The proposed model estimates the gas permeability within a factor of 3 of the measurements/simulations in all data sets except the Eagle Ford experiment, for which we discuss plausible sources of discrepancy.
Fiori, Simone
2007-01-01
Bivariate statistical modeling from incomplete data is a useful statistical tool that allows one to discover the model underlying two data sets when the data in the two sets correspond neither in size nor in ordering. Such a situation may occur when the sizes of the two data sets do not match (i.e., there are “holes” in the data) or when the data sets have been acquired independently. Statistical modeling is also useful when the amount of available data is sufficient to show the relevant statistical features of the phenomenon underlying the data. We propose to tackle the problem of statistical modeling via a neural (nonlinear) system that is able to match its input-output statistic to the statistic of the available data sets. A key point of the implementation proposed here is that it is based on look-up-table (LUT) neural systems, which provide a computationally advantageous way of implementing neural systems. A number of numerical experiments, performed on both synthetic and real-world data sets, illustrate the features of the proposed modeling procedure. PMID:18566641
A Comparison of Two Methods for Initiating Air Mass Back Trajectories
NASA Astrophysics Data System (ADS)
Putman, A.; Posmentier, E. S.; Faiia, A. M.; Sonder, L. J.; Feng, X.
2014-12-01
Lagrangian air mass tracking programs run in back-cast mode are a powerful tool for estimating the water vapor source of precipitation events. The altitudes above the precipitation site at which particles' back trajectories begin influence the source estimation. We assume that precipitation comes from water vapor in condensing regions of the air column, so particles are placed in proportion to an estimated condensation profile. We compare two methods for estimating where condensation occurs, and the resulting evaporation sites, for 63 events at Barrow, AK. The first method (M1) uses measurements from a 35 GHz vertically resolved cloud radar (MMCR) and algorithms developed by Zhao and Garrett (2008) to calculate precipitation rate. The second method (M2) uses the Global Data Assimilation System reanalysis data in a lofting model. We assess how accurately M2, developed for global coverage, performs in the absence of direct cloud observations. Results from the two methods are statistically similar. The mean particle height estimated by M2 is, on average, 695 m (s.d. = 1800 m) higher than M1. The corresponding average vapor source estimated by M2 is 1.5° (s.d. = 5.4°) south of M1. In addition, vapor sources for M2 relative to M1 have ocean surface temperatures averaging 1.1°C (s.d. = 3.5°C) warmer and reported ocean surface relative humidities 0.31% (s.d. = 6.1%) drier. All biases except the latter are statistically significant (p = 0.02 for each). Results were skewed by events where M2 estimated very high altitudes of condensation. When M2 produced an average particle height less than 5000 m (89% of events), M2 estimated mean particle heights 76 m (s.d. = 741 m) higher than M1, corresponding to a vapor source 0.54° (s.d. = 4.2°) south of M1. The ocean surface at the vapor source was an average of 0.35°C (s.d. = 2.35°C) warmer, and ocean surface relative humidities were 0.02% (s.d. = 5.5%) wetter. None of the biases was statistically significant. If the vapor source meteorology estimated by M2 is used to determine vapor isotopic properties, it would produce results similar to M1 in all cases except the occasional very high cloud. Both methods strive to balance a sufficient number of tracked air masses for meaningful vapor source estimation with minimal computational time. Zhao, C. and Garrett, T.J. 2008, J. Geophys. Res.
Olson, Andrew; Halloran, Elizabeth; Romani, Cristina
2015-12-01
We present three jargonaphasic patients who made phonological errors in naming, repetition and reading. We analyse target/response overlap using statistical models to answer three questions: 1) Is there a single phonological source for errors or two sources, one for target-related errors and a separate source for abstruse errors? 2) Can correct responses be predicted by the same distribution used to predict errors or do they show a completion boost (CB)? 3) Is non-lexical and lexical information summed during reading and repetition? The answers were clear. 1) Abstruse errors did not require a separate distribution created by failure to access word forms. Abstruse and target-related errors were the endpoints of a single overlap distribution. 2) Correct responses required a special factor, e.g., a CB or lexical/phonological feedback, to preserve their integrity. 3) Reading and repetition required separate lexical and non-lexical contributions that were combined at output. Copyright © 2015 Elsevier Ltd. All rights reserved.
Data assimilation and bathymetric inversion in a two-dimensional horizontal surf zone model
NASA Astrophysics Data System (ADS)
Wilson, G. W.; Özkan-Haller, H. T.; Holman, R. A.
2010-12-01
A methodology is described for assimilating observations in a steady state two-dimensional horizontal (2-DH) model of nearshore hydrodynamics (waves and currents), using an ensemble-based statistical estimator. In this application, we treat bathymetry as a model parameter, which is subject to a specified prior uncertainty. The statistical estimator uses state augmentation to produce posterior (inverse, updated) estimates of bathymetry, wave height, and currents, as well as their posterior uncertainties. A case study is presented, using data from a 2-D array of in situ sensors on a natural beach (Duck, NC). The prior bathymetry is obtained by interpolation from recent bathymetric surveys; however, the resulting prior circulation is not in agreement with measurements. After assimilating data (significant wave height and alongshore current), the accuracy of modeled fields is improved, and this is quantified by comparing with observations (both assimilated and unassimilated). Hence, for the present data, 2-DH bathymetric uncertainty is an important source of error in the model and can be quantified and corrected using data assimilation. Here the bathymetric uncertainty is ascribed to inadequate temporal sampling; bathymetric surveys were conducted on a daily basis, but bathymetric change occurred on hourly timescales during storms, such that hydrodynamic model skill was significantly degraded. Further tests are performed to analyze the model sensitivities used in the assimilation and to determine the influence of different observation types and sampling schemes.
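The following minimal Python sketch (not the authors' implementation) illustrates an ensemble-based analysis step with state augmentation, treating bathymetry as the uncertain parameter and updating it jointly with the predicted hydrodynamic observations; the toy forward model, ensemble size and observation errors are purely illustrative assumptions.

```python
import numpy as np

def enkf_augmented_update(h_ens, forward_model, obs, obs_err_std):
    """One ensemble analysis step with state augmentation.

    h_ens        : (n_ens, n_h) ensemble of bathymetry (the uncertain parameter)
    forward_model: maps one bathymetry vector to predicted observations (n_obs,)
    obs          : (n_obs,) assimilated data (e.g. wave height, alongshore current)
    obs_err_std  : observation error standard deviation (assumed uncorrelated)
    """
    n_ens, n_h = h_ens.shape
    pred = np.array([forward_model(h) for h in h_ens])         # (n_ens, n_obs)
    aug = np.hstack([h_ens, pred])                             # augmented state [h, H(h)]

    A = aug - aug.mean(axis=0)                                 # ensemble anomalies
    P = A.T @ A / (n_ens - 1)                                  # augmented covariance
    Pxy = P[:, n_h:]                                           # cov(augmented state, predictions)
    Pyy = P[n_h:, n_h:] + (obs_err_std ** 2) * np.eye(len(obs))
    K = Pxy @ np.linalg.inv(Pyy)                               # Kalman gain

    updated = np.empty_like(aug)
    rng = np.random.default_rng(0)
    for i in range(n_ens):                                     # perturbed-observation update
        d_pert = obs + rng.normal(0.0, obs_err_std, size=len(obs))
        updated[i] = aug[i] + K @ (d_pert - pred[i])
    return updated[:, :n_h], updated[:, n_h:]                  # posterior bathymetry, hydrodynamics

# Illustrative use with a toy "surf-zone" operator: wave height decays with depth.
toy_model = lambda h: 0.5 * np.exp(-0.2 * h)                   # hypothetical forward operator
prior = np.random.default_rng(1).normal(3.0, 0.5, size=(50, 4))  # 50 members, 4 depth nodes
obs = np.array([0.30, 0.25, 0.20, 0.35])
post_h, post_hyd = enkf_augmented_update(prior, toy_model, obs, obs_err_std=0.05)
```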
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kacprzak, T.; Kirk, D.; Friedrich, O.
Shear peak statistics has gained a lot of attention recently as a practical alternative to two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg$^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $0 < \mathcal{S}/\mathcal{N} < 4$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $\Omega_{\rm m}$ and $\sigma_8$, fixing $w = -1$, $\Omega_{\rm b} = 0.04$, $h = 0.7$ and $n_s = 1$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $\sigma_{8}(\Omega_{\rm m}/0.3)^{0.6} = 0.77 \pm 0.07$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $\mathcal{S}/\mathcal{N} > 4$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. As a result, we discuss prospects for future peak statistics analysis with upcoming DES data.
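A minimal Python sketch of the peak-counting step, assuming a synthetic signal-to-noise map and a 3x3 local-maximum definition of a peak; this is not the DES pipeline, only an illustration of binning peaks by their S/N.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def count_peaks(snr_map, bins):
    """Count local maxima of an S/N map in the given S/N bins."""
    local_max = (snr_map == maximum_filter(snr_map, size=3))   # 3x3 neighbourhood maxima
    peak_values = snr_map[local_max]
    counts, _ = np.histogram(peak_values, bins=bins)
    return counts

rng = np.random.default_rng(1)
snr_map = rng.normal(0.0, 1.0, size=(256, 256))   # pure-noise stand-in for an aperture-mass map
bins = np.linspace(0.0, 4.0, 9)                   # S/N range used in the fiducial analysis
print(count_peaks(snr_map, bins))
```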
Bryan, Rebecca; Nair, Prasanth B; Taylor, Mark
2009-09-18
Interpatient variability is often overlooked in orthopaedic computational studies due to the substantial challenges involved in sourcing and generating large numbers of bone models. A statistical model of the whole femur incorporating both geometric and material property variation was developed as a potential solution to this problem. The statistical model was constructed using principal component analysis applied to 21 individual computed tomography scans. To test the ability of the statistical model to generate realistic, unique, finite element (FE) femur models, it was used as a source of 1000 femurs to drive a study on femoral neck fracture risk. The study simulated the impact of an oblique fall to the side, a scenario known to account for a large proportion of hip fractures in the elderly and to have a lower fracture load than alternative loading configurations. FE model generation, application of subject-specific loading and boundary conditions, FE processing and post-processing of the solutions were completed automatically. The generated models were within the bounds of the training data used to create the statistical model and had a high mesh quality, allowing them to be used directly by the FE solver without remeshing. The results indicated that 28 of the 1000 femurs were at highest risk of fracture. Closer analysis revealed the percentage of cortical bone in the proximal femur to be a crucial differentiator between the failed and non-failed groups. The likely fracture location was indicated to be intertrochanteric. Comparison with previous computational, clinical and experimental work revealed support for these findings.
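A minimal Python sketch of the underlying idea: build a PCA-based statistical model from training shapes and sample new instances that stay within the bounds of the training data. The data dimensions, clipping rule and random seeds are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

def build_statistical_model(training):
    """PCA of training shapes (rows = subjects, columns = stacked geometry/material values)."""
    mean = training.mean(axis=0)
    U, s, Vt = np.linalg.svd(training - mean, full_matrices=False)
    var = s ** 2 / (training.shape[0] - 1)          # variance captured by each mode
    return mean, Vt, var

def sample_new_instances(mean, modes, var, n_samples, n_modes=None, clip=3.0):
    """Draw new instances; mode weights are clipped to +/- clip std devs to stay near the training data."""
    if n_modes is None:
        n_modes = len(var)
    rng = np.random.default_rng(42)
    sd = np.sqrt(var[:n_modes])
    b = np.clip(rng.normal(0.0, sd, size=(n_samples, n_modes)), -clip * sd, clip * sd)
    return mean + b @ modes[:n_modes]

# 21 "scans" of 300 stacked coordinate/density values (synthetic stand-in).
training = np.random.default_rng(7).normal(size=(21, 300))
mean, modes, var = build_statistical_model(training)
new_femurs = sample_new_instances(mean, modes, var, n_samples=1000)
print(new_femurs.shape)
```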
Hospital adoption of medical technology: an empirical test of alternative models.
Teplensky, J. D.; Pauly, M. V.; Kimberly, J. R.; Hillman, A. L.; Schwartz, J. S.
1995-01-01
OBJECTIVE. This study examines hospital motivations to acquire new medical technology, an issue of considerable policy relevance: in this case, whether, when, and why hospitals acquire a new capital-intensive medical technology, magnetic resonance imaging equipment (MRI). STUDY DESIGN. We review three common explanations for medical technology adoption: profit maximization, technological preeminence, and clinical excellence, and incorporate them into a composite model, controlling for regulatory differences, market structures, and organizational characteristics. All four models are then tested using Cox regressions. DATA SOURCES. The study is based on an initial sample of 637 hospitals in the continental United States that owned or leased an MRI unit as of 31 December 1988, plus nonadopters. Due to missing data the final sample consisted of 507 hospitals. The data, drawn from two telephone surveys, are supplemented by the AHA Survey, census data, and industry and academic sources. PRINCIPAL FINDING. Statistically, the three individual models account for roughly comparable amounts of variance in past adoption behavior. On the basis of explanatory power and parsimony, however, the technology model is "best." Although the composite model is statistically better than any of the individual models, it does not add much more explanatory power adjusting for the number of variables added. CONCLUSIONS. The composite model identified the importance a hospital attached to being a technological leader, its clinical requirements, and the change in revenues it associated with the adoption of MRI as the major determinants of adoption behavior. We conclude that a hospital's adoption behavior is strongly linked to its strategic orientation. PMID:7649751
Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and...
OpenMx: An Open Source Extended Structural Equation Modeling Framework
ERIC Educational Resources Information Center
Boker, Steven; Neale, Michael; Maes, Hermine; Wilde, Michael; Spiegel, Michael; Brick, Timothy; Spies, Jeffrey; Estabrook, Ryne; Kenny, Sarah; Bates, Timothy; Mehta, Paras; Fox, John
2011-01-01
OpenMx is free, full-featured, open source, structural equation modeling (SEM) software. OpenMx runs within the "R" statistical programming environment on Windows, Mac OS-X, and Linux computers. The rationale for developing OpenMx is discussed along with the philosophy behind the user interface. The OpenMx data structures are…
Modular Open-Source Software for Item Factor Analysis
ERIC Educational Resources Information Center
Pritikin, Joshua N.; Hunter, Micheal D.; Boker, Steven M.
2015-01-01
This article introduces an item factor analysis (IFA) module for "OpenMx," a free, open-source, and modular statistical modeling package that runs within the R programming environment on GNU/Linux, Mac OS X, and Microsoft Windows. The IFA module offers a novel model specification language that is well suited to programmatic generation…
Pekey, Hakan; Karakaş, Duran; Bakoğlu, Mithat
2004-11-01
Surface water samples were collected from ten previously selected sites on the polluted Dil Deresi stream during two field surveys, in December 2001 and April 2002. All samples were analyzed using ICP-AES, and the concentrations of trace metals (Al, As, Ba, Cd, Co, Cr, Cu, Fe, Pb, Sn and Zn) were determined. The results were compared with national and international water quality guidelines, as well as literature values reported for similar rivers. Factor analysis (FA) and a factor analysis-multiple regression (FA-MR) model were used for source apportionment and for estimating the contributions of the identified sources to the concentration of each parameter. Varimax-rotated factor analysis identified four source types for the trace metals: the paint industry, sewage, crustal sources, and road traffic runoff, together explaining about 83% of the total variance. FA-MR results showed that predicted concentrations were calculated with uncertainties lower than 15%.
Evaluating a linearized Euler equations model for strong turbulence effects on sound propagation.
Ehrhardt, Loïc; Cheinet, Sylvain; Juvé, Daniel; Blanc-Benon, Philippe
2013-04-01
Sound propagation outdoors is strongly affected by atmospheric turbulence. Under strongly perturbed conditions or over long propagation paths, the sound fluctuations reach their asymptotic behavior, e.g., the intensity variance progressively saturates. The present study evaluates the ability of a numerical propagation model, based on finite-difference time-domain solving of the linearized Euler equations, to quantitatively reproduce the wave statistics under strong and saturated intensity fluctuations. It is the continuation of a previous study in which weak intensity fluctuations were considered. The numerical propagation model is presented and tested with two-dimensional harmonic sound propagation over long paths and strong atmospheric perturbations. The results are compared to quantitative theoretical or numerical predictions available for the wave statistics, including the log-amplitude variance and the probability density functions of the complex acoustic pressure. The match is excellent for the evaluated source frequencies and all sound fluctuation strengths. Hence, this model captures many aspects of strong atmospheric turbulence effects on sound propagation. Finally, the model results for the intensity probability density function are compared with a standard fit by a generalized gamma function.
Estimating and Testing the Sources of Evoked Potentials in the Brain.
ERIC Educational Resources Information Center
Huizenga, Hilde M.; Molenaar, Peter C. M.
1994-01-01
The source of an event-related brain potential (ERP) is estimated from multivariate measures of ERP on the head under several mathematical and physical constraints on the parameters of the source model. Statistical aspects of estimation are discussed, and new tests are proposed. (SLD)
Approximate Bayesian estimation of extinction rate in the Finnish Daphnia magna metapopulation.
Robinson, John D; Hall, David W; Wares, John P
2013-05-01
Approximate Bayesian computation (ABC) is useful for parameterizing complex models in population genetics. In this study, ABC was applied to simultaneously estimate parameter values for a model of metapopulation coalescence and test two alternatives to a strict metapopulation model in the well-studied network of Daphnia magna populations in Finland. The models shared four free parameters: the subpopulation genetic diversity (θS), the rate of gene flow among patches (4Nm), the founding population size (N0) and the metapopulation extinction rate (e) but differed in the distribution of extinction rates across habitat patches in the system. The three models had either a constant extinction rate in all populations (strict metapopulation), one population that was protected from local extinction (i.e. a persistent source), or habitat-specific extinction rates drawn from a distribution with specified mean and variance. Our model selection analysis favoured the model including a persistent source population over the two alternative models. Of the closest 750,000 data sets in Euclidean space, 78% were simulated under the persistent source model (estimated posterior probability = 0.769). This fraction increased to more than 85% when only the closest 150,000 data sets were considered (estimated posterior probability = 0.774). Approximate Bayesian computation was then used to estimate parameter values that might produce the observed set of summary statistics. Our analysis provided posterior distributions for e that included the point estimate obtained from previous data from the Finnish D. magna metapopulation. Our results support the use of ABC and population genetic data for testing the strict metapopulation model and parameterizing complex models of demography. © 2013 Blackwell Publishing Ltd.
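A minimal Python sketch of ABC rejection with model selection by the fraction of accepted simulations per model, analogous to the closest-data-sets criterion described above; the toy simulators, summary statistics and acceptance fraction are purely illustrative.

```python
import numpy as np

def abc_model_selection(observed_stats, simulators, n_sims=20_000, keep_frac=0.0075):
    """ABC rejection: simulate under each model, keep the closest runs in Euclidean
    distance, and report the fraction of accepted runs per model as a posterior estimate."""
    rng = np.random.default_rng(0)
    labels, distances = [], []
    for label, simulate in simulators.items():
        for _ in range(n_sims):
            stats = simulate(rng)
            distances.append(np.linalg.norm(stats - observed_stats))
            labels.append(label)
    labels, distances = np.array(labels), np.array(distances)
    keep = np.argsort(distances)[: int(keep_frac * len(distances))]
    kept = labels[keep]
    return {m: float(np.mean(kept == m)) for m in simulators}

# Toy simulators standing in for the three metapopulation models (parameters drawn from priors).
sims = {
    "strict":            lambda rng: rng.normal([1.0, 2.0], 0.5),
    "persistent_source": lambda rng: rng.normal([1.3, 2.2], 0.5),
    "habitat_specific":  lambda rng: rng.normal([0.8, 2.4], 0.5),
}
print(abc_model_selection(np.array([1.25, 2.15]), sims))
```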
Evaluation of probabilistic forecasts with the scoringRules package
NASA Astrophysics Data System (ADS)
Jordan, Alexander; Krüger, Fabian; Lerch, Sebastian
2017-04-01
Over the last decades, probabilistic forecasts in the form of predictive distributions have become popular in many scientific disciplines. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way, in order to better understand sources of prediction errors and to improve the models. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F, given that an outcome y was observed. In line with decision-theoretic principles, they allow alternative models to be compared, a crucial ability given the variety of theories, data sources and statistical specifications available in many situations. This contribution presents the software package scoringRules for the statistical programming language R, which provides functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. For univariate variables, the two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically but are indirectly described through a sample of simulation draws; ensemble weather forecasts, for example, take this form. The scoringRules package aims to be a convenient, dictionary-like reference for computing scoring rules. We offer state-of-the-art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices. Recent developments include the addition of scoring rules to evaluate multivariate forecast distributions. The use of the scoringRules package is illustrated in an example on post-processing ensemble forecasts of temperature.
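For readers outside R, the sketch below (in Python, not the scoringRules package itself) shows the standard closed-form CRPS of a normal forecast and a simple sample-based estimator for simulation-type forecasts such as ensembles; the observation and forecast values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm.cdf(z) - 1.0) + 2.0 * norm.pdf(z) - 1.0 / np.sqrt(np.pi))

def crps_sample(draws, y):
    """Sample-based CRPS estimator: E|X - y| - 0.5 * E|X - X'|."""
    draws = np.asarray(draws, dtype=float)
    term1 = np.mean(np.abs(draws - y))
    term2 = 0.5 * np.mean(np.abs(draws[:, None] - draws[None, :]))
    return term1 - term2

y_obs = 1.7
print(crps_normal(mu=2.0, sigma=1.0, y=y_obs))
print(crps_sample(np.random.default_rng(0).normal(2.0, 1.0, 2000), y_obs))
```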
NASA Astrophysics Data System (ADS)
Bergant, Klemen; Kajfež-Bogataj, Lučka; Črepinšek, Zalika
2002-02-01
Phenological observations are a valuable source of information for investigating the relationship between climate variation and plant development. Potential climate change in the future will shift the occurrence of phenological phases, and information about future climate conditions is needed in order to estimate this shift. General circulation models (GCMs) provide the best information about future climate change. They are able to simulate reliably the most important mean features on a large scale, but they fail on the regional scale because of their low spatial resolution. A common approach to bridging the scale gap is statistical downscaling, which was used here to relate the beginning of flowering of Taraxacum officinale in Slovenia to the monthly mean near-surface air temperature for January, February and March in Central Europe. Statistical models were developed and tested with NCAR/NCEP Reanalysis predictor data and EARS predictand data for the period 1960-1999. Prior to developing the statistical models, empirical orthogonal function (EOF) analysis was applied to the predictor data. Multiple linear regression was used to relate the beginning of flowering to the expansion coefficients of the first three EOFs of the January, February and March air temperatures, and a strong correlation was found between them. The developed statistical models were applied to the results of two GCMs (HadCM3 and ECHAM4/OPYC3) to estimate the potential shifts in the beginning of flowering for the periods 1990-2019 and 2020-2049 relative to the period 1960-1989. The HadCM3 model predicts, on average, a 4-day earlier occurrence of flowering in the period 1990-2019 and ECHAM4/OPYC3 a 5-day earlier occurrence. The analogous results for the period 2020-2049 are 10- and 11-day earlier occurrences.
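A minimal Python sketch of the EOF-plus-regression downscaling idea: compute expansion coefficients of the leading EOFs of the predictor temperature field, regress the phenological date on them, and apply the fitted model to new fields. The synthetic data and dimensions are illustrative assumptions, not the study's data.

```python
import numpy as np

def eof_regression(temps, flowering_day, n_eof=3):
    """Fit flowering date against the leading EOF expansion coefficients of a temperature field.

    temps         : (n_years, n_gridpoints) Jan-Mar mean temperatures
    flowering_day : (n_years,) day of year of first flowering
    """
    mean_field = temps.mean(axis=0)
    U, s, Vt = np.linalg.svd(temps - mean_field, full_matrices=False)
    pcs = U[:, :n_eof] * s[:n_eof]                 # expansion coefficients (principal components)
    X = np.column_stack([np.ones(len(pcs)), pcs])  # design matrix with intercept
    coef, *_ = np.linalg.lstsq(X, flowering_day, rcond=None)
    return coef, Vt[:n_eof], mean_field

def predict(coef, eofs, mean_field, temps_new):
    pcs = (temps_new - mean_field) @ eofs.T        # project new fields on the same EOFs
    return np.column_stack([np.ones(len(pcs)), pcs]) @ coef

rng = np.random.default_rng(3)
temps = rng.normal(0.0, 2.0, size=(40, 100))                        # 40 years x 100 grid points
flowering = 110 - 2.0 * temps.mean(axis=1) + rng.normal(0, 3, 40)   # warmer -> earlier (toy rule)
coef, eofs, mean_field = eof_regression(temps, flowering)
print(predict(coef, eofs, mean_field, temps[-5:]))
```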
Trends of atmospheric circulation during singular hot days in Europe
NASA Astrophysics Data System (ADS)
Jézéquel, Aglaé; Cattiaux, Julien; Naveau, Philippe; Radanovics, Sabine; Ribes, Aurélien; Vautard, Robert; Vrac, Mathieu; Yiou, Pascal
2018-05-01
The influence of climate change on mid-latitude atmospheric circulation is still very uncertain. The large internal variability makes it difficult to extract any statistically significant signal regarding the evolution of the circulation. Here we propose a methodology to calculate dynamical trends tailored to the circulation of specific days, by computing the evolution of the distances between the circulation of the day of interest and that of the other days in the time series. We compute these dynamical trends for two case studies of the hottest days recorded in two different European regions (corresponding to the heat waves of summer 2003 and 2010). We use the NCEP reanalysis dataset, an ensemble of CMIP5 models, and a large ensemble of a single model (CESM), in order to account for different sources of uncertainty. While we find a positive trend for most models for 2003, we cannot conclude for 2010 since the models disagree on the trend estimates.
Najmeddin, Ali; Keshavarzi, Behnam; Moore, Farid; Lahijanzadeh, Ahmadreza
2017-10-28
This study investigates the occurrence and spatial distribution of potentially toxic elements (PTEs) (Hg, Cd, Cu, Mo, Pb, Zn, Ni, Co, Cr, Al, Fe, Mn, V and Sb) in 67 road dust samples collected from urban industrial areas in Ahvaz megacity, southwest Iran. Geochemical methods, multivariate statistics, geostatistics and a health risk assessment model were adopted to study the spatial pollution pattern and to identify the priority pollutants, regions of concern and sources of the studied PTEs. A receptor positive matrix factorization (PMF) model was also employed to assess pollution sources. Compared to the local background, the median enrichment factor values revealed the following order: Sb > Pb > Hg > Zn > Cu > V > Fe > Mo > Cd > Mn > Cr ≈ Co ≈ Al ≈ Ni. Statistical results show that a significant difference exists between concentrations of Mo, Cu, Pb, Zn, Fe, Sb, V and Hg in different regions (univariate analysis, Kruskal-Wallis test, p < 0.05), indicating the existence of highly contaminated spots. Integrated source identification coupled with the PMF model revealed that traffic-related emissions (43.5%) and steel industries (26.4%) were the two leading sources of PTEs in road dust, followed by natural sources (22.6%) and pipe and oil processing companies (7.5%). The arithmetic mean of pollution load index (PLI) values for the high-traffic sector (1.92) is greater than for the industrial (1.80) and residential (1.25) areas. The results also show that ecological risk values for Hg and Pb in 41.8 and 9% of the dust samples are higher than 80, indicating considerable or higher potential ecological risk. The health risk assessment model showed that ingestion of dust particles contributed more than 83% of the overall non-carcinogenic risk. For both residential and industrial scenarios, Hg and Pb had the highest risk values, whereas Mo had the lowest value.
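A minimal Python sketch of two of the indices used above, the enrichment factor (relative to an assumed Al crustal reference) and the pollution load index; the concentrations shown are placeholders, not values from the study.

```python
import numpy as np

def enrichment_factor(sample, background, element, ref_element="Al"):
    """EF = (C_x / C_ref)_sample / (C_x / C_ref)_background, with Al as the assumed crustal reference."""
    return (sample[element] / sample[ref_element]) / (background[element] / background[ref_element])

def pollution_load_index(sample, background, elements):
    """PLI = (CF_1 * CF_2 * ... * CF_n)^(1/n), where CF_i = C_i(sample) / C_i(background)."""
    cf = np.array([sample[e] / background[e] for e in elements])
    return cf.prod() ** (1.0 / len(cf))

# Illustrative concentrations (mg/kg); placeholder numbers only.
dust = {"Pb": 180.0, "Zn": 450.0, "Cu": 90.0, "Hg": 0.4, "Al": 45000.0}
local_background = {"Pb": 20.0, "Zn": 70.0, "Cu": 25.0, "Hg": 0.05, "Al": 60000.0}
print(enrichment_factor(dust, local_background, "Pb"))
print(pollution_load_index(dust, local_background, ["Pb", "Zn", "Cu", "Hg"]))
```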
Hierarchical statistical modeling of xylem vulnerability to cavitation.
Ogle, Kiona; Barber, Jarrett J; Willson, Cynthia; Thompson, Brenda
2009-01-01
Cavitation of xylem elements diminishes the water transport capacity of plants, and quantifying xylem vulnerability to cavitation is important to understanding plant function. Current approaches to analyzing hydraulic conductivity (K) data to infer vulnerability to cavitation suffer from problems such as the use of potentially unrealistic vulnerability curves, difficulty interpreting parameters in these curves, a statistical framework that ignores sampling design, and an overly simplistic view of uncertainty. This study illustrates how two common curves (exponential-sigmoid and Weibull) can be reparameterized in terms of meaningful parameters: maximum conductivity (k(sat)), water potential (-P) at which percentage loss of conductivity (PLC) =X% (P(X)), and the slope of the PLC curve at P(X) (S(X)), a 'sensitivity' index. We provide a hierarchical Bayesian method for fitting the reparameterized curves to K(H) data. We illustrate the method using data for roots and stems of two populations of Juniperus scopulorum and test for differences in k(sat), P(X), and S(X) between different groups. Two important results emerge from this study. First, the Weibull model is preferred because it produces biologically realistic estimates of PLC near P = 0 MPa. Second, stochastic embolisms contribute an important source of uncertainty that should be included in such analyses.
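A minimal Python sketch of the Weibull vulnerability curve and the derived quantities P_X and S_X discussed above; the parameter values are illustrative, and the hierarchical Bayesian fitting itself is not reproduced here.

```python
import numpy as np

def weibull_plc(P, b, c):
    """Percentage loss of conductivity at water potential magnitude P (positive MPa),
    using the two-parameter Weibull form PLC = 100 * (1 - exp(-(P/b)^c))."""
    return 100.0 * (1.0 - np.exp(-(P / b) ** c))

def p_x(b, c, X=50.0):
    """Water potential magnitude at which PLC = X% (e.g. P50)."""
    return b * (-np.log(1.0 - X / 100.0)) ** (1.0 / c)

def s_x(b, c, X=50.0):
    """Slope of the PLC curve (% per MPa) evaluated at P_X, a 'sensitivity' index."""
    P = p_x(b, c, X)
    return 100.0 * (c / b) * (P / b) ** (c - 1.0) * np.exp(-(P / b) ** c)

b, c = 3.2, 2.5                       # illustrative Weibull parameters, not values from the study
print(p_x(b, c, 50.0), s_x(b, c, 50.0))
print(weibull_plc(np.array([1.0, 2.0, 3.0, 4.0]), b, c))
```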
Foster, Guy M.; Graham, Jennifer L.
2016-04-06
The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advance notification of changing water-quality conditions, including cyanobacteria and associated toxin and taste-and-odor compounds, gives drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete water-quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. The models documented in this report are useful for characterizing changes in water-quality conditions through time, characterizing potentially harmful cyanobacterial events, and indicating changes in water-quality conditions that may affect drinking-water treatment processes.
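A minimal Python sketch of the logistic-model idea: estimate the probability that a constituent exceeds a threshold from continuous surrogate readings. The surrogate variables, coefficients and threshold are hypothetical stand-ins, not the regressions documented in the report.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for continuous monitor readings (names and values are illustrative only).
rng = np.random.default_rng(0)
n = 500
turbidity = rng.lognormal(2.0, 0.8, n)          # FNU
chlorophyll = rng.lognormal(1.0, 0.7, n)        # ug/L
temperature = rng.normal(20.0, 6.0, n)          # deg C
logit = -6.0 + 0.04 * turbidity + 0.5 * chlorophyll + 0.05 * temperature
geosmin_exceeds = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))   # exceeds a taste-and-odor threshold

X = np.column_stack([turbidity, chlorophyll, temperature])
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, geosmin_exceeds)

# Real-time-style estimate: probability that geosmin exceeds the threshold for a new reading.
print(model.predict_proba([[35.0, 4.0, 24.0]])[0, 1])
```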
Optimized design and analysis of preclinical intervention studies in vivo
Laajala, Teemu D.; Jumppanen, Mikael; Huhtaniemi, Riikka; Fey, Vidal; Kaur, Amanpreet; Knuuttila, Matias; Aho, Eija; Oksala, Riikka; Westermarck, Jukka; Mäkelä, Sari; Poutanen, Matti; Aittokallio, Tero
2016-01-01
Recent reports have called into question the reproducibility, validity and translatability of the preclinical animal studies due to limitations in their experimental design and statistical analysis. To this end, we implemented a matching-based modelling approach for optimal intervention group allocation, randomization and power calculations, which takes full account of the complex animal characteristics at baseline prior to interventions. In prostate cancer xenograft studies, the method effectively normalized the confounding baseline variability, and resulted in animal allocations which were supported by RNA-seq profiling of the individual tumours. The matching information increased the statistical power to detect true treatment effects at smaller sample sizes in two castration-resistant prostate cancer models, thereby leading to saving of both animal lives and research costs. The novel modelling approach and its open-source and web-based software implementations enable the researchers to conduct adequately-powered and fully-blinded preclinical intervention studies, with the aim to accelerate the discovery of new therapeutic interventions. PMID:27480578
NASA Astrophysics Data System (ADS)
Lee, Silvia Wen-Yu; Liang, Jyh-Chong; Tsai, Chin-Chung
2016-10-01
This study investigated the relationships among college students' epistemic beliefs in biology (EBB), conceptions of learning biology (COLB), and strategies of learning biology (SLB). EBB includes four dimensions, namely 'multiple-source,' 'uncertainty,' 'development,' and 'justification.' COLB is further divided into 'constructivist' and 'reproductive' conceptions, while SLB represents deep strategies and surface learning strategies. Questionnaire responses were gathered from 303 college students. The results of the confirmatory factor analysis and structural equation modelling showed acceptable model fits. Mediation testing further revealed two paths with complete mediation. In sum, students' epistemic beliefs of 'uncertainty' and 'justification' in biology were statistically significant in explaining the constructivist and reproductive COLB, respectively; and 'uncertainty' was statistically significant in explaining the deep SLB as well. The results of mediation testing further revealed that 'uncertainty' predicted surface strategies through the mediation of 'reproductive' conceptions; and the relationship between 'justification' and deep strategies was mediated by 'constructivist' COLB. This study provides evidence for the essential roles some epistemic beliefs play in predicting students' learning.
Dose-escalation designs in oncology: ADEPT and the CRM.
Shu, Jianfen; O'Quigley, John
2008-11-20
The ADEPT software package is not a statistical method in its own right as implied by Gerke and Siedentop (Statist. Med. 2008; DOI: 10.1002/sim.3037). ADEPT implements two-parameter CRM models as described in O'Quigley et al. (Biometrics 1990; 46(1):33-48). All of the basic ideas (use of a two-parameter logistic model, use of a two-dimensional prior for the unknown slope and intercept parameters, sequential estimation and subsequent patient allocation based on minimization of some loss function, flexibility to use cohorts instead of one by one inclusion) are strictly identical. The only, and quite trivial, difference arises in the setting of the prior. O'Quigley et al. (Biometrics 1990; 46(1):33-48) used priors having an analytic expression whereas Whitehead and Brunier (Statist. Med. 1995; 14:33-48) use pseudo-data to play the role of the prior. The question of interest is whether two-parameter CRM works as well, or better, than the one-parameter CRM recommended in O'Quigley et al. (Biometrics 1990; 46(1):33-48). Gerke and Siedentop argue that it does. The published literature suggests otherwise. The conclusions of Gerke and Siedentop stem from three highly particular, and somewhat contrived, situations. Unlike one-parameter CRM (Biometrika 1996; 83:395-405; J. Statist. Plann. Inference 2006; 136:1765-1780; Biometrika 2005; 92:863-873), no statistical properties appear to have been studied for two-parameter CRM. In particular, for two-parameter CRM, the parameter estimates are inconsistent. This ought to be a source of major concern to those proposing its use. Worse still, for finite samples the behavior of estimates can be quite wild despite having incorporated the kind of dampening priors discussed by Gerke and Siedentop. An example in which we illustrate this behavior describes a single patient included at level 1 of 6 levels and experiencing a dose limiting toxicity. The subsequent recommendation is to experiment at level 6! Such problematic behavior is not common. Even so, we show that the allocation behavior of two-parameter CRM is very much less stable than that of one-parameter CRM.
Spectral Modeling of the EGRET 3EG Gamma Ray Sources Near the Galactic Plane
NASA Technical Reports Server (NTRS)
Bertsch, D. L.; Hartman, R. C.; Hunter, S. D.; Thompson, D. J.; Lin, Y. C.; Kniffen, D. A.; Kanbach, G.; Mayer-Hasselwander, H. A.; Reimer, O.; Sreekumar, P.
1999-01-01
The third EGRET catalog lists 84 sources within 10 deg of the Galactic Plane. Five of these are well-known spin-powered pulsars, 2 and possibly 3 others are blazars, and the remaining 74 are classified as unidentified, although 6 of these are likely to be artifacts of nearby strong sources. Several of the remaining 68 unidentified sources have been noted as having positional agreement with supernova remnants and OB associations. Others may be radio-quiet pulsars like Geminga, and still others may belong to a totally new class of sources. The energy spectral distribution of these sources is an important clue to their identification. In this paper, the spectra of the sources within 10 deg of the Galactic Plane are fit with three different functional forms: a single power law, two power laws, and a power law with an exponential cutoff. Where possible, the best fit is selected with statistical tests. Twelve sources, and possibly an additional 5, are found to have spectra that are fit by a broken power law or by a power law with an exponential cutoff.
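A minimal Python sketch of the three spectral forms and a chi-squared comparison, using synthetic fluxes; the functional forms follow the description above, while the energies, normalizations and break/cutoff values are illustrative assumptions.

```python
import numpy as np

def power_law(E, k, alpha):
    """Single power law: dN/dE = k * E**(-alpha)."""
    return k * E ** (-alpha)

def broken_power_law(E, k, alpha1, alpha2, E_break):
    """Two power laws joined continuously at E_break."""
    lo = k * E ** (-alpha1)
    hi = k * E_break ** (alpha2 - alpha1) * E ** (-alpha2)
    return np.where(E < E_break, lo, hi)

def cutoff_power_law(E, k, alpha, E_cut):
    """Power law with an exponential cutoff."""
    return k * E ** (-alpha) * np.exp(-E / E_cut)

def chi2(model, E, flux, flux_err, *params):
    """Chi-squared statistic used to compare the candidate spectral forms."""
    return np.sum(((flux - model(E, *params)) / flux_err) ** 2)

E = np.logspace(2, 4, 10)                                   # MeV, illustrative energy bins
flux = cutoff_power_law(E, 1e-7, 1.8, 3000.0)
flux *= 1.0 + 0.05 * np.random.default_rng(0).normal(size=E.size)
err = 0.1 * flux
print(chi2(power_law, E, flux, err, 1e-7, 1.8),
      chi2(cutoff_power_law, E, flux, err, 1e-7, 1.8, 3000.0))
```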
Interactions and triggering in a 3D rate and state asperity model
NASA Astrophysics Data System (ADS)
Dublanchet, P.; Bernard, P.
2012-12-01
Precise relocation of micro-seismicity and careful analysis of seismic source parameters have progressively established the concept of seismic asperities embedded in a creeping fault segment as one of the most important aspects of a realistic representation of micro-seismic sources. Another important issue concerning micro-seismic activity is the existence of robust empirical laws describing the temporal and magnitude distributions of earthquakes, such as the Omori law, the distribution of inter-event times, and the Gutenberg-Richter law. In this framework, this study aims at understanding the statistical properties of earthquakes by generating synthetic catalogs with a 3D, quasi-dynamic, continuous rate-and-state asperity model that takes into account a realistic geometry of asperities. Our approach contrasts with ETAS models (Kagan and Knopoff, 1981), usually implemented to produce earthquake catalogs, in the sense that the non-linearity observed in rock friction experiments (Dieterich, 1979) is fully taken into account through the rate-and-state friction law. Furthermore, our model differs from discrete fault models (Ziv and Cochard, 2006) because its continuity allows us to define realistic geometries and distributions of asperities by assembling sub-critical computational cells that always fail in a single event. Moreover, this model allows us to address the question of the influence of barriers and of the distribution of asperities on the event statistics. After recalling the main observations of asperities in the specific case of the Parkfield segment of the San Andreas Fault, we analyse the earthquake statistical properties computed for this area. We then present synthetic statistics obtained with our model that allow us to discuss the role of barriers in clustering and triggering phenomena among a population of sources. It appears that an effective barrier size, which depends on its frictional strength, controls the presence or absence in the synthetic catalog of statistical laws similar to those observed for real earthquakes. As an application, we attempt to draw a comparison between the synthetic statistics and the observed statistics at Parkfield in order to characterize what could be a realistic frictional model of the Parkfield area. More generally, we obtain synthetic statistical properties that are in agreement with power-law decays characterized by exponents that match the observations at a global scale, showing that our mechanical model is able to provide new insights into the understanding of earthquake interaction processes in general.
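A minimal Python sketch of the rate-and-state ingredients referred to above (Dieterich-type friction with an aging law for the state variable), integrated crudely at constant slip rate; the parameter values are illustrative and the full quasi-dynamic asperity model is not reproduced.

```python
import numpy as np

def rate_state_friction(V, theta, mu0=0.6, a=0.010, b=0.015, V0=1e-6, Dc=1e-4):
    """Dieterich-type rate-and-state friction coefficient.
    a < b gives the velocity-weakening behaviour expected on asperities."""
    return mu0 + a * np.log(V / V0) + b * np.log(V0 * theta / Dc)

def aging_law(V, theta, Dc=1e-4):
    """State evolution (aging law): d(theta)/dt = 1 - V * theta / Dc."""
    return 1.0 - V * theta / Dc

# Crude forward-Euler evolution of the state variable at constant slip rate.
dt, V = 1.0, 1e-6                      # s, m/s (illustrative values)
theta = 1e2                            # initial state (s)
for _ in range(1000):
    theta += dt * aging_law(V, theta)
print(rate_state_friction(V, theta))   # approaches the steady state mu0 + (a - b) * ln(V / V0)
```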
Lee, Kibaek; Yoo, Jaeheung; Choi, Munkee; Zo, Hangjung; Ciganek, Andrew P.
2016-01-01
Firms continuously search for external knowledge that can contribute to product innovation, which may ultimately increase market performance. The relationship between external knowledge sourcing and market performance is not well documented. The extant literature primarily examines the causal relationship between external knowledge sources and product innovation performance, or identifies factors that moderate the relationship between external knowledge sourcing and product innovation. Non-technological innovations, such as organizational and marketing innovations, intervene in the process from external knowledge sourcing to product innovation to market performance but have not been extensively examined. This study addresses two research questions: does external knowledge sourcing lead to market performance, and how does external knowledge sourcing interact with a firm's different innovation activities to enhance market performance? This study proposes a comprehensive model to capture the causal mechanism from external knowledge sourcing to market performance. The research model was tested using survey data from manufacturing firms in South Korea, and the results demonstrate a strong statistical relationship along the path from external knowledge sourcing (EKS) to product innovation performance (PIP) to market performance (MP). Organizational innovation is an antecedent of EKS, while marketing innovation is a consequence of EKS that significantly influences PIP and MP. The results imply that any EKS effort should also consider organizational innovations, which may ultimately enhance market performance. Theoretical and practical implications are discussed, followed by concluding remarks. PMID:28006022
DOE Office of Scientific and Technical Information (OSTI.GOV)
Apte, A; Veeraraghavan, H; Oh, J
Purpose: To present an open-source, free platform to facilitate radiomics research: the “Radiomics toolbox” in CERR. Method: There is a scarcity of open-source tools that support end-to-end modeling of image features to predict patient outcomes. The “Radiomics toolbox” strives to fill the need for such a software platform. The platform supports (1) import of various image modalities, such as CT, PET, MR, SPECT and US; (2) contouring tools to delineate structures of interest; (3) extraction and storage of image-based features, such as first-order statistics, gray-scale co-occurrence and zone-size matrix based texture features, and shape features; and (4) statistical analysis. Statistical analysis of the extracted features is supported with basic functionality, including univariate correlations and Kaplan-Meier curves, and advanced functionality, including feature reduction and multivariate modeling. The graphical user interface and data management are implemented in Matlab for ease of development and readability of the code for a wide audience. Open-source software developed in other programming languages is integrated to enhance various components of the toolbox, for example Java-based DCM4CHE for DICOM import and R for statistical analysis. Results: The Radiomics toolbox will be distributed as open-source software under a GNU license. The toolbox was prototyped for modeling an oropharyngeal PET dataset at MSKCC; that analysis will be presented in a separate paper. Conclusion: The Radiomics Toolbox provides an extensible platform for extracting and modeling image features. To emphasize new uses of CERR for radiomics and image-based research, we have changed the name from the “Computational Environment for Radiotherapy Research” to the “Computational Environment for Radiological Research”.
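A minimal Python sketch of the first-order (histogram) feature extraction mentioned in item (3), shown in Python rather than Matlab; the ROI intensities are synthetic and the feature list is a small illustrative subset.

```python
import numpy as np
from scipy import stats

def first_order_features(roi_intensities, n_bins=64):
    """First-order (histogram) statistics of the voxel intensities inside a delineated structure."""
    x = np.asarray(roi_intensities, dtype=float).ravel()
    hist, _ = np.histogram(x, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return {
        "mean": x.mean(),
        "variance": x.var(),
        "skewness": stats.skew(x),
        "kurtosis": stats.kurtosis(x),
        "entropy": -np.sum(p * np.log2(p)),
        "energy": np.sum(x ** 2),
    }

roi = np.random.default_rng(0).gamma(shape=3.0, scale=2.0, size=5000)  # stand-in for SUV values
print(first_order_features(roi))
```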
Zhang, Wanfeng; Zhu, Shukui; He, Sheng; Wang, Yanxin
2015-02-06
Using comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC×GC/TOFMS), volatile and semi-volatile organic compounds in crude oil samples from different reservoirs and regions were analyzed for the development of a molecular fingerprint database. Based on the GC×GC/TOFMS fingerprints of the crude oils, principal component analysis (PCA) and cluster analysis were used to distinguish the oil sources and to find biomarkers. In a supervised step, the geological characteristics of the crude oils, including thermal maturity and sedimentary environment, are assigned to the principal components. The results show that the tri-aromatic steroid (TAS) series are suitable marker compounds for oil screening, and the relative abundances of individual TAS compounds correlate strongly with oil sources. To correct for the effects of external factors other than oil source, the variables were defined as content ratios of target compounds, and 13 parameters were proposed for the screening of oil sources. With the developed model, the crude oils were readily discriminated, and the result is in good agreement with the actual geological setting. Copyright © 2014 Elsevier B.V. All rights reserved.
The discounting model selector: Statistical software for delay discounting applications.
Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A
2017-05-01
Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executes approximate Bayesian model selection on user-supplied temporal discounting data and computes the effective delay 50 (ED50) from the best-performing model. The software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). Independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to those of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that the true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature in addition to the data from the source paper. Comparisons of the model-selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-source software are discussed, and a review of possible expansions of this software is provided. © 2017 Society for the Experimental Analysis of Behavior.
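A minimal Python sketch of one model in the candidate set, the hyperbolic discounting function, fitted by least squares with ED50 = 1/k; the approximate Bayesian model selection step itself is not reproduced, and the indifference-point data are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def hyperbolic(delay, k):
    """Mazur hyperbolic discounting: subjective value as a fraction of the delayed amount."""
    return 1.0 / (1.0 + k * delay)

# Illustrative indifference-point data (delays in days, values as proportions of the full amount).
delays = np.array([1, 7, 30, 90, 180, 365], dtype=float)
values = np.array([0.95, 0.85, 0.60, 0.40, 0.30, 0.20])

(k_hat,), _ = curve_fit(hyperbolic, delays, values, p0=[0.01], bounds=(1e-6, 10.0))
ed50 = 1.0 / k_hat     # for the hyperbolic model, the effective delay 50 is simply 1/k
print(k_hat, ed50)
```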
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (IBM VERSION)
NASA Technical Reports Server (NTRS)
Manteufel, R.
1994-01-01
The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (DEC VAX VERSION)
NASA Technical Reports Server (NTRS)
Merwarth, P. D.
1994-01-01
The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
Fitting and Modeling in the ASC Data Analysis Environment
NASA Astrophysics Data System (ADS)
Doe, S.; Siemiginowska, A.; Joye, W.; McDowell, J.
As part of the AXAF Science Center (ASC) Data Analysis Environment, we will provide a Fitting Application to the astronomical community. We present a design of the application in this paper. Our design goal is to give the user the flexibility to use a variety of optimization techniques (Levenberg-Marquardt, maximum entropy, Monte Carlo, Powell, downhill simplex, CERN-Minuit, and simulated annealing) and fit statistics (chi-squared, Cash, variance, and maximum likelihood); our modular design allows the user to easily add their own optimization techniques and/or fit statistics. We also present a comparison of the optimization techniques to be provided by the Application. The high spatial and spectral resolutions that will be obtained with AXAF instruments require a sophisticated data modeling capability. We will provide not only a suite of astronomical spatial and spectral source models, but also the capability of combining these models into source models of up to four data dimensions (i.e., into source functions f(E,x,y,t)). We will also provide tools to create instrument response models appropriate for each observation.
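A minimal Python sketch of one optimizer/statistic pairing from the list above, a Levenberg-Marquardt chi-squared fit of a simple spectral model, using SciPy rather than the ASC application; the model and data are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_line(x, amplitude, center, width, continuum):
    """Simple spectral source model: Gaussian emission line on a flat continuum."""
    return continuum + amplitude * np.exp(-0.5 * ((x - center) / width) ** 2)

# Synthetic spectrum with Poisson-like errors (placeholder for an instrument-folded model).
rng = np.random.default_rng(2)
energy = np.linspace(0.5, 8.0, 200)                       # keV
truth = gaussian_line(energy, 40.0, 6.4, 0.2, 10.0)
counts = rng.poisson(truth).astype(float)
errors = np.sqrt(np.maximum(counts, 1.0))

# Levenberg-Marquardt chi-squared fit (SciPy's default method when no bounds are given).
popt, pcov = curve_fit(gaussian_line, energy, counts, sigma=errors,
                       absolute_sigma=True, p0=[30.0, 6.0, 0.3, 8.0])
chi2 = np.sum(((counts - gaussian_line(energy, *popt)) / errors) ** 2)
print(popt, chi2 / (len(energy) - len(popt)))
```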
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimates of seismic sources and structures in Northeast Asia. We use Bayesian inversion methods, which enable statistical estimation of models and their uncertainties based on the information in the data. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be implemented naturally within the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data from the North Korean nuclear explosion tests. The combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties for each of the processes, enables more quantitative monitoring and discrimination of seismic events.
Statistical Modeling for Radiation Hardness Assurance: Toward Bigger Data
NASA Technical Reports Server (NTRS)
Ladbury, R.; Campola, M. J.
2015-01-01
New approaches to statistical modeling in radiation hardness assurance are discussed. These approaches yield quantitative bounds on flight-part radiation performance even in the absence of conventional data sources. This allows the analyst to bound radiation risk at all stages and for all decisions in the RHA process. It also allows optimization of RHA procedures for the project's risk tolerance.
Evaluation of The Operational Benefits Versus Costs of An Automated Cargo Mover
2016-12-01
Logistics footprint and life-cycle cost are presented as part of this report. Analysis of modeling and simulation results identified statistically significant differences...
Deep Learning to Classify Radiology Free-Text Reports.
Chen, Matthew C; Ball, Robyn L; Yang, Lingyao; Moradzadeh, Nathaniel; Chapman, Brian E; Larson, David B; Langlotz, Curtis P; Amrhein, Timothy J; Lungren, Matthew P
2018-03-01
Purpose To evaluate the performance of a deep learning convolutional neural network (CNN) model compared with a traditional natural language processing (NLP) model in extracting pulmonary embolism (PE) findings from thoracic computed tomography (CT) reports from two institutions. Materials and Methods Contrast material-enhanced CT examinations of the chest performed between January 1, 1998, and January 1, 2016, were selected. Annotations by two human radiologists were made for three categories: the presence, chronicity, and location of PE. The classification performance of a CNN model, trained with an unsupervised learning algorithm for obtaining vector representations of words, was compared with that of the open-source application PeFinder. Sensitivity, specificity, accuracy, and F1 scores for both the CNN model and PeFinder in the internal and external validation sets were determined. Results The CNN model demonstrated an accuracy of 99% and an area under the curve value of 0.97. For internal validation report data, the CNN model had a statistically significantly larger F1 score (0.938) than did PeFinder (0.867) when classifying findings as either PE positive or PE negative, but no significant difference in sensitivity, specificity, or accuracy was found. For external validation report data, no statistical difference between the performance of the CNN model and PeFinder was found. Conclusion A deep learning CNN model can classify radiology free-text reports with accuracy equivalent to or beyond that of an existing traditional NLP model. © RSNA, 2017. Online supplemental material is available for this article.
de Freitas, Patricia Moreira; Menezes, Andressa Nery; da Mota, Ana Carolina Costa; Simões, Alyne; Mendes, Fausto Medeiros; Lago, Andrea Dias Neves; Ferreira, Leila Soares; Ramos-Oliveira, Thayanne Monteiro
2016-01-01
The present study investigated how a hybrid light source (LED/laser) influences temperature variation on enamel surfaces during 35% hydrogen peroxide (HP) bleaching. Effects on the whitening effectiveness and tooth sensitivity were analyzed. Twenty-two volunteers were randomly assigned to two different treatments in a split-mouth experimental model: group 1 (control), 35% HP; group 2 (experimental), 35% HP + LED/laser. Color evaluation was performed before treatment, and 7 and 14 days after completion of bleaching, using a color shade scale. Tooth sensitivity was assessed using a visual analog scale (VAS; before, immediately, and 24 hours after bleaching). During the bleaching treatment, thermocouple channels positioned on the tooth surfaces recorded the temperature. Data on color and temperature changes were subjected to statistical analysis (α = 5%). Tooth sensitivity data were evaluated descriptively. Groups 1 and 2 showed mean temperatures (± standard deviation) of 30.7 ± 1.2 °C and 34.1 ± 1.3 °C, respectively. It was found that there were statistically significant differences between the groups, with group 2 showing higher mean variation (P < .0001). The highest temperature variation occurred for group 2, with an increase of 5.3 °C at the enamel surface. The color change results showed no differences in bleaching between the two treatment groups (P = .177). The variation of the average temperature during the treatments was not statistically associated with color variation (P = .079). Immediately after bleaching, it was found that 36.4% of the subjects in group 2 had mild to moderate sensitivity. In group 1, 45.5% showed moderate sensitivity. In both groups, the sensitivity ceased within 24 hours. The hybrid light source (LED/laser) influences temperature variation on the enamel surface during 35% HP bleaching and is not related to greater tooth sensitivity.
NASA Astrophysics Data System (ADS)
Ghannadpour, Seyyed Saeed; Hezarkhani, Ardeshir
2016-03-01
The U-statistic method is one of the most important structural methods for separating an anomaly from the background. It considers the locations of the samples, carries out the statistical analysis of the data without judging from a geochemical point of view, and tries to separate subpopulations and determine anomalous areas. In the present study, to use the U-statistic method in a three-dimensional (3D) setting, the U-statistic is applied to the grades of two ideal test examples while taking the sample Z values (elevation) into account. To our knowledge, this is the first time that this method has been applied in 3D. To evaluate the performance of the 3D U-statistic method and to compare it with a non-structural method, the method of threshold assessment based on median and standard deviation (MSD method) is applied to the two test examples. Results show that the samples identified as anomalous by the U-statistic method are more regular and less dispersed than those identified by the MSD method, so that, based on the locations of the anomalous samples, denser clusters of them can be delineated as promising zones. Moreover, results show that at a threshold of U = 0, the total error of misclassification for the U-statistic method is much smaller than the total error of the x̄ + n × s criterion. Finally, a 3D model of the two test examples for separating anomaly from background using the 3D U-statistic method is provided. The source code for a software program, developed in the MATLAB programming language to perform the calculations of the 3D U-spatial statistic method, is additionally provided. This software is compatible with all geochemical varieties and can be used in similar exploration projects.
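As a point of comparison for the non-structural criterion mentioned above, the following is a minimal sketch of the x̄ + n × s threshold applied to 3D sample grades; the synthetic coordinates, grades, and the multiplier n are assumptions, not the paper's test examples.

```python
# Minimal sketch of the x-bar + n * s threshold on synthetic 3D sample grades.
import numpy as np

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(500, 3))            # x, y, z (elevation)
grades = rng.lognormal(mean=1.0, sigma=0.5, size=500)   # background population
grades[:25] += rng.uniform(5, 10, size=25)              # superimposed anomaly

n = 2                                                   # assumed multiplier
threshold = grades.mean() + n * grades.std()
anomalous = grades > threshold

print(f"threshold = {threshold:.2f}, anomalous samples = {anomalous.sum()}")
print("first anomalous sample locations:\n", coords[anomalous][:5])
```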
A source-specific model for lossless compression of global Earth data
NASA Astrophysics Data System (ADS)
Kess, Barbara Lynne
A Source Specific Model for Global Earth Data (SSM-GED) is a lossless compression method for large images that captures global redundancy in the data and achieves a significant improvement over CALIC and DCXT-BT/CARP, two leading lossless compression schemes. The Global Land 1-Km Advanced Very High Resolution Radiometer (AVHRR) data, which contains 662 Megabytes (MB) per band, is an example of a large data set that requires decompression of regions of the data. For this reason, SSM-GED compresses the AVHRR data as a collection of subwindows. This approach defines the statistical parameters for the model prior to compression. Unlike universal models that assume no a priori knowledge of the data, SSM-GED captures global redundancy that exists among all of the subwindows of data. The overlap in parameters among subwindows of data enables SSM-GED to improve the compression rate by increasing the number of parameters and maintaining a small model cost for each subwindow of data. This lossless compression method is applicable to other large volumes of image data such as video.
NASA Technical Reports Server (NTRS)
Mihalas, D.; Kunasz, P. B.
1978-01-01
The coupled radiative transfer and statistical equilibrium equations for multilevel ionic structures in the atmospheres of early-type stars are solved. Both lines and continua are treated consistently; the treatment is applicable throughout a transonic wind, and allows for the presence of background continuum sources and sinks in the transfer. An equivalent-two-level-atom approach is used to solve the equations. Calculations for simplified He(+)-like model atoms in parameterized isothermal wind models indicate that subordinate line profiles are sensitive to the assumed mass-loss rate and to the assumed structure of the velocity law in the atmospheres.
Threshold Values for Identification of Contamination Predicted by Reduced-Order Models
Last, George V.; Murray, Christopher J.; Bott, Yi-Ju; ...
2014-12-31
The U.S. Department of Energy’s (DOE’s) National Risk Assessment Partnership (NRAP) Project is developing reduced-order models to evaluate potential impacts on underground sources of drinking water (USDWs) if CO2 or brine leaks from deep CO2 storage reservoirs. Threshold values, below which there would be no predicted impacts, were determined for portions of two aquifer systems. These threshold values were calculated using an interwell approach for determining background groundwater concentrations that is an adaptation of methods described in the U.S. Environmental Protection Agency’s Unified Guidance for Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities.
Statistical fluctuations as the origin of nontopological solitons
NASA Technical Reports Server (NTRS)
Griest, Kim; Kolb, Edward W.; Masarotti, Alessandro
1989-01-01
Nontopological solitons can be formed during a phase transition in the early universe as long as some net charge can be trapped in regions of false vacuum. It has been previously suggested that a particle-antiparticle asymmetry would provide a source for such trapped charge. It is pointed out that, for the model and parameters considered, statistical fluctuations provide a much larger concentration of charge and are therefore the dominant source of charge fluctuations in solitogenesis.
IMFIT: A FAST, FLEXIBLE NEW PROGRAM FOR ASTRONOMICAL IMAGE FITTING
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erwin, Peter; Universitäts-Sternwarte München, Scheinerstrasse 1, D-81679 München
2015-02-01
I describe a new, open-source astronomical image-fitting program called IMFIT, specialized for galaxies but potentially useful for other sources, which is fast, flexible, and highly extensible. A key characteristic of the program is an object-oriented design that allows new types of image components (two-dimensional surface-brightness functions) to be easily written and added to the program. Image functions provided with IMFIT include the usual suspects for galaxy decompositions (Sérsic, exponential, Gaussian), along with Core-Sérsic and broken-exponential profiles, elliptical rings, and three components that perform line-of-sight integration through three-dimensional luminosity-density models of disks and rings seen at arbitrary inclinations. Available minimization algorithms include Levenberg-Marquardt, Nelder-Mead simplex, and Differential Evolution, allowing trade-offs between speed and decreased sensitivity to local minima in the fit landscape. Minimization can be done using the standard χ² statistic (using either data or model values to estimate per-pixel Gaussian errors, or else user-supplied error images) or Poisson-based maximum-likelihood statistics; the latter approach is particularly appropriate for cases of Poisson data in the low-count regime. I show that fitting low-signal-to-noise ratio galaxy images using χ² minimization and individual-pixel Gaussian uncertainties can lead to significant biases in fitted parameter values, which are avoided if a Poisson-based statistic is used; this is true even when Gaussian read noise is present.
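The bias the abstract describes can be illustrated with a hedged toy experiment (not IMFIT itself): estimating the mean of low-count Poisson "pixels" by χ² with data-based per-pixel Gaussian errors versus a Poisson maximum-likelihood (Cash-like) statistic. The flat-image model and count level are assumptions for the demo.

```python
# Toy demonstration: chi-squared with data-based Gaussian errors biases the
# estimate low in the low-count Poisson regime; a Poisson ML statistic does not.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)
true_mean = 3.0
n_trials, n_pix = 200, 1000

chi2_fits, poisson_fits = [], []
for _ in range(n_trials):
    data = rng.poisson(true_mean, n_pix)
    err = np.sqrt(np.clip(data, 1, None))            # per-pixel Gaussian errors
    chi2 = lambda m: np.sum(((data - m) / err) ** 2)
    cash = lambda m: 2.0 * np.sum(m - data * np.log(max(m, 1e-12)))
    chi2_fits.append(minimize_scalar(chi2, bounds=(0.1, 20), method="bounded").x)
    poisson_fits.append(minimize_scalar(cash, bounds=(0.1, 20), method="bounded").x)

print(f"true mean       : {true_mean}")
print(f"chi2 estimate   : {np.mean(chi2_fits):.3f}  (biased low)")
print(f"Poisson ML est. : {np.mean(poisson_fits):.3f}")
```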
Estimate of main local sources to ambient ultrafine particle number concentrations in an urban area
NASA Astrophysics Data System (ADS)
Rahman, Md Mahmudur; Mazaheri, Mandana; Clifford, Sam; Morawska, Lidia
2017-09-01
Quantifying and apportioning the contribution of a range of sources to ultrafine particles (UFPs, D < 100 nm) is a challenge due to the complex nature of urban environments. Although vehicular emissions have long been considered one of the major sources of ultrafine particles in urban areas, the contribution of other major urban sources is not yet fully understood. This paper aims to determine and quantify the contribution of local ground traffic, nucleated particle (NP) formation and distant non-traffic (e.g. airport, oil refineries, and seaport) sources to the total ambient particle number concentration (PNC) in a busy, inner-city area in Brisbane, Australia using Bayesian statistical modelling and other exploratory tools. The Bayesian model was trained on PNC data from days on which NP formation was known not to have occurred, hourly traffic counts, solar radiation data, and a smooth daily trend. The model was applied to apportion and quantify the contribution of NP formations and local traffic and non-traffic sources to UFPs. The data analysis incorporated long-term measured time-series of total PNC (D ≥ 6 nm), particle number size distributions (PSD, D = 8 to 400 nm), PM2.5, PM10, NOx, CO, meteorological parameters and traffic counts at a stationary monitoring site. The developed Bayesian model showed reliable predictive performance in quantifying the contribution of NP formation events to UFPs (up to 4 × 10⁴ particles cm⁻³), with significant day-to-day variability. The model identified potential NP formation and non-formation days based on PNC data and quantified the source contributions to UFPs. Exploratory statistical analyses showed that the total mean PNC during the middle of the day, a period associated with NP formation events, was up to 32% higher than during the peak morning and evening traffic periods. The majority of UFPs measured during the peak traffic and NP formation periods were between 30 and 100 nm and smaller than 30 nm, respectively. To date, this is the first application of a Bayesian model to apportion the contributions of different sources to UFPs, and therefore the importance of this study is not only in its modelling outcomes but also in demonstrating the applicability and advantages of this statistical approach to air pollution studies.
The equal load-sharing model of cascade failures in power grids
NASA Astrophysics Data System (ADS)
Scala, Antonio; De Sanctis Lucentini, Pier Giorgio
2016-11-01
Electric power systems are one of the most important critical infrastructures. In recent years, they have been exposed to extreme stress due to the increasing power demand, the introduction of distributed renewable energy sources, and the development of extensive interconnections. We investigate the phenomenon of abrupt breakdown of an electric power system under two scenarios: load growth (mimicking the ever-increasing customer demand) and power fluctuations (mimicking the effects of renewable sources). Our results indicate that increasing the system size causes breakdowns to become more abrupt; in fact, mapping the system to a solvable statistical-physics model indicates the occurrence of a first-order transition in the large size limit. Such an enhancement of systemic-risk failures (blackouts) with increasing network size is an effect that should be considered in the current projects aiming to integrate national power grids into "super-grids".
Teo, Guoshou; Kim, Sinae; Tsou, Chih-Chiang; Collins, Ben; Gingras, Anne-Claude; Nesvizhskii, Alexey I; Choi, Hyungwon
2015-11-03
Data-independent acquisition (DIA) mass spectrometry is an emerging technique that offers more complete detection and quantification of peptides and proteins across multiple samples. DIA allows fragment-level quantification, which can be considered as repeated measurements of the abundance of the corresponding peptides and proteins in the downstream statistical analysis. However, few statistical approaches are available for aggregating these complex fragment-level data into peptide- or protein-level statistical summaries. In this work, we describe a software package, mapDIA, for statistical analysis of differential protein expression using DIA fragment-level intensities. The workflow consists of three major steps: intensity normalization, peptide/fragment selection, and statistical analysis. First, mapDIA offers normalization of fragment-level intensities by total intensity sums as well as a novel alternative normalization by local intensity sums in retention time space. Second, mapDIA removes outlier observations and selects peptides/fragments that preserve the major quantitative patterns across all samples for each protein. Last, using the selected fragments and peptides, mapDIA performs model-based statistical significance analysis of protein-level differential expression between specified groups of samples. Using a comprehensive set of simulation datasets, we show that mapDIA detects differentially expressed proteins with accurate control of the false discovery rate. We also describe the analysis procedure in detail using two recently published DIA datasets generated for the 14-3-3β dynamic interaction network and the prostate cancer glycoproteome. The software was written in the C++ language, and the source code is available for free through the SourceForge website http://sourceforge.net/projects/mapdia/. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015 Elsevier B.V. All rights reserved.
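The first normalization step (scaling by total intensity sums) can be sketched as follows; this is not mapDIA's code, and the matrix shape and values are illustrative assumptions.

```python
# Sketch of fragment-level normalization by total intensity sums: each sample's
# fragment intensities are rescaled so that all samples share a common total.
import numpy as np

# rows = fragments, columns = samples (illustrative values)
intensities = np.array([[1.0e5, 1.4e5, 0.9e5],
                        [2.0e4, 2.6e4, 1.7e4],
                        [5.0e3, 7.5e3, 4.1e3]])

totals = intensities.sum(axis=0)              # per-sample total intensity
target = totals.mean()                        # common scale
normalized = intensities * (target / totals)  # rescale each sample (column)

print("per-sample totals before:", totals)
print("per-sample totals after :", normalized.sum(axis=0))
```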
Bayesian Modeling of Exposure and Airflow Using Two-Zone Models
Zhang, Yufen; Banerjee, Sudipto; Yang, Rui; Lungu, Claudiu; Ramachandran, Gurumurthy
2009-01-01
Mathematical modeling is being increasingly used as a means for assessing occupational exposures. However, predicting exposure in real settings is constrained by a lack of quantitative knowledge of exposure determinants. Validation of models in occupational settings is, therefore, a challenge. Not only do the model parameters need to be known, but the models also need to predict the output with some degree of accuracy. In this paper, a Bayesian statistical framework is used for estimating model parameters and exposure concentrations for a two-zone model. The model predicts concentrations in a zone near the source and far away from the source as functions of the toluene generation rate, air ventilation rate through the chamber, and the airflow between near and far fields. The framework combines prior or expert information on the physical model along with the observed data. The framework is applied to simulated data as well as data obtained from experiments conducted in a chamber. Toluene vapors are generated from a source under different conditions of airflow direction, the presence of a mannequin, and simulated body heat of the mannequin. The Bayesian framework accounts for uncertainty in measurement as well as in the unknown rate of airflow between the near and far fields. The results show that estimates of the interzonal airflow are always close to the estimated equilibrium solutions, which implies that the method works efficiently. The predictions of near-field concentration for both the simulated and real data show close concordance with the true values, indicating that the two-zone model assumptions agree with reality to a large extent and the model is suitable for predicting the contaminant concentration. Comparison of the estimated model and its margin of error with the experimental data thus enables validation of the physical model assumptions. The approach illustrates how exposure models and information on model parameters together with the knowledge of uncertainty and variability in these quantities can be used to provide better estimates not only of model outputs but also of model parameters. PMID:19403840
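For reference, the steady-state relations commonly used with this kind of near-field/far-field model take the form C_far = G/Q and C_near = G/Q + G/β, with G the generation rate, Q the ventilation rate, and β the interzonal airflow; the sketch below uses assumed values, not the chamber data.

```python
# Sketch of the standard two-zone steady-state relations (assumed parameter values).
def two_zone_steady_state(G, Q, beta):
    """Return (near-field, far-field) steady-state concentrations."""
    c_far = G / Q            # far-field: generation rate over ventilation rate
    c_near = c_far + G / beta  # near-field adds the interzonal airflow term
    return c_near, c_far

G = 50.0     # toluene generation rate, mg/min (assumed)
Q = 2.0      # ventilation rate through the chamber, m^3/min (assumed)
beta = 1.5   # interzonal airflow between near and far fields, m^3/min (assumed)

c_near, c_far = two_zone_steady_state(G, Q, beta)
print(f"near field: {c_near:.1f} mg/m^3, far field: {c_far:.1f} mg/m^3")
```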
Estimating the Diets of Animals Using Stable Isotopes and a Comprehensive Bayesian Mixing Model
Hopkins, John B.; Ferguson, Jake M.
2012-01-01
Using stable isotope mixing models (SIMMs) as a tool to investigate the foraging ecology of animals is gaining popularity among researchers. As a result, statistical methods are rapidly evolving and numerous models have been produced to estimate the diets of animals—each with their benefits and their limitations. Deciding which SIMM to use is contingent on factors such as the consumer of interest, its food sources, sample size, the familiarity a user has with a particular framework for statistical analysis, or the level of inference the researcher desires to make (e.g., population- or individual-level). In this paper, we provide a review of commonly used SIMM models and describe a comprehensive SIMM that includes all features commonly used in SIMM analysis and two new features. We used data collected in Yosemite National Park to demonstrate IsotopeR's ability to estimate dietary parameters. We then examined the importance of each feature in the model and compared our results to inferences from commonly used SIMMs. IsotopeR's user interface (in R) will provide researchers a user-friendly tool for SIMM analysis. The model is also applicable for use in paleontology, archaeology, and forensic studies as well as estimating pollution inputs. PMID:22235246
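A far simpler special case than IsotopeR's full Bayesian model, shown only to make the mixing idea concrete: a two-source, single-isotope linear mixing equation with assumed δ13C values and no trophic discrimination.

```python
# Minimal two-source, single-isotope mixing sketch (illustrative values only).
def two_source_mixing(d_consumer, d_source1, d_source2):
    """Proportion of source 1 in the consumer's diet from one isotope ratio."""
    return (d_consumer - d_source2) / (d_source1 - d_source2)

d13C_consumer = -22.0      # assumed consumer tissue value
d13C_plants = -27.0        # assumed plant source value
d13C_animal_prey = -18.0   # assumed animal-prey source value

p_plants = two_source_mixing(d13C_consumer, d13C_plants, d13C_animal_prey)
print(f"estimated diet: {p_plants:.0%} plants, {1 - p_plants:.0%} animal prey")
```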
Valid Statistical Analysis for Logistic Regression with Multiple Sources
NASA Astrophysics Data System (ADS)
Fienberg, Stephen E.; Nardi, Yuval; Slavković, Aleksandra B.
Considerable effort has gone into understanding issues of privacy protection of individual information in single databases, and various solutions have been proposed depending on the nature of the data, the ways in which the database will be used, and the precise nature of the privacy protection being offered. Once data are merged across sources, however, the nature of the problem becomes far more complex, and a number of privacy issues arise for the linked individual files that go well beyond those that are considered with regard to the data within individual sources. In this paper, we propose an approach that provides a full statistical analysis of the combined database without actually combining it. We focus mainly on logistic regression, but the method and tools described may essentially be applied to other statistical models as well.
NASA Astrophysics Data System (ADS)
Pandolfi, Marco; Alastuey, Andrés; Pérez, Noemi; Reche, Cristina; Castro, Iria; Shatalov, Victor; Querol, Xavier
2016-09-01
In this work, for the first time, data from two twin stations (Barcelona, urban background, and Montseny, regional background), located in the northeast (NE) of Spain, were used to study the trends of the concentrations of different chemical species in PM10 and PM2.5 along with the trends of the PM10 source contributions from the positive matrix factorization (PMF) model. Eleven years of chemical data (2004-2014) were used for this study. Trends of both species concentrations and source contributions were studied using the Mann-Kendall test for linear trends and a new approach based on a multi-exponential fit of the data. Although the different PM fractions (PM2.5, PM10) showed linearly decreasing trends at both stations, the contributions of specific pollutant sources and of their chemical tracers showed exponentially decreasing trends. The different types of trends observed reflected the different effectiveness and/or time of implementation of the measures taken to reduce the concentrations of atmospheric pollutants. Moreover, the trends of the contributions of specific sources such as those related to industrial activities and to primary energy consumption mirrored the effect of the financial crisis in Spain from 2008. The sources that showed statistically significant downward trends at both Barcelona (BCN) and Montseny (MSY) during 2004-2014 were secondary sulfate, secondary nitrate, and a V-Ni-bearing source. The contributions from these sources decreased exponentially during the considered period, indicating that the observed reductions were not gradual and consistent over time. Rather, the trends were less steep at the end of the period compared to the beginning, thus likely indicating the attainment of a lower limit. Moreover, statistically significant decreasing trends were observed for the contributions to PM from the industrial/traffic source at MSY (mixed metallurgy and road traffic) and from the industrial (metallurgy mainly) source at BCN. These sources were clearly linked with anthropogenic activities, and the observed decreasing trends confirmed the effectiveness of pollution control measures implemented at European or regional/local levels. Conversely, at the regional level, the contributions from sources mostly linked with natural processes, such as aged marine and aged organics, did not show statistically significant trends. The trends observed for the PM10 source contributions reflected the trends observed for the chemical tracers of these pollutant sources well.
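A minimal, self-contained Mann-Kendall trend test of the kind used for the linear-trend screening above (no tie correction); the annual series below is synthetic, not the Barcelona or Montseny data.

```python
# Minimal Mann-Kendall trend test (no tie correction) on a synthetic series.
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S statistic: sum of signs over all ordered pairs
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    p = 2.0 * (1.0 - norm.cdf(abs(z)))
    return s, z, p

sulfate = [2.1, 2.0, 1.7, 1.6, 1.4, 1.2, 1.1, 1.0, 0.9, 0.9, 0.8]  # synthetic annual means
s, z, p = mann_kendall(sulfate)
trend = "significant downward trend" if (p < 0.05 and s < 0) else "no significant trend"
print(f"S = {s:.0f}, Z = {z:.2f}, p = {p:.4f} -> {trend}")
```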
Walter, Donald A.; Starn, J. Jeffrey
2013-01-01
Statistical models of nitrate occurrence in the glacial aquifer system of the northern United States, developed by the U.S. Geological Survey, use observed relations between nitrate concentrations and sets of explanatory variables—representing well-construction, environmental, and source characteristics— to predict the probability that nitrate, as nitrogen, will exceed a threshold concentration. However, the models do not explicitly account for the processes that control the transport of nitrogen from surface sources to a pumped well and use area-weighted mean spatial variables computed from within a circular buffer around the well as a simplified source-area conceptualization. The use of models that explicitly represent physical-transport processes can inform and, potentially, improve these statistical models. Specifically, groundwater-flow models simulate advective transport—predominant in many surficial aquifers— and can contribute to the refinement of the statistical models by (1) providing for improved, physically based representations of a source area to a well, and (2) allowing for more detailed estimates of environmental variables. A source area to a well, known as a contributing recharge area, represents the area at the water table that contributes recharge to a pumped well; a well pumped at a volumetric rate equal to the amount of recharge through a circular buffer will result in a contributing recharge area that is the same size as the buffer but has a shape that is a function of the hydrologic setting. These volume-equivalent contributing recharge areas will approximate circular buffers in areas of relatively flat hydraulic gradients, such as near groundwater divides, but in areas with steep hydraulic gradients will be elongated in the upgradient direction and agree less with the corresponding circular buffers. The degree to which process-model-estimated contributing recharge areas, which simulate advective transport and therefore account for local hydrologic settings, would inform and improve the development of statistical models can be implicitly estimated by evaluating the differences between explanatory variables estimated from the contributing recharge areas and the circular buffers used to develop existing statistical models. The larger the difference in estimated variables, the more likely that statistical models would be changed, and presumably improved, if explanatory variables estimated from contributing recharge areas were used in model development. Comparing model predictions from the two sets of estimated variables would further quantify—albeit implicitly—how an improved, physically based estimate of explanatory variables would be reflected in model predictions. Differences between the two sets of estimated explanatory variables and resultant model predictions vary spatially; greater differences are associated with areas of steep hydraulic gradients. A direct comparison, however, would require the development of a separate set of statistical models using explanatory variables from contributing recharge areas. Area-weighted means of three environmental variables—silt content, alfisol content, and depth to water from the U.S. 
Department of Agriculture State Soil Geographic (STATSGO) data—and one nitrogen-source variable (fertilizer-application rate from county data mapped to Enhanced National Land Cover Data 1992 (NLCDe 92) agricultural land use) can vary substantially between circular buffers and volume-equivalent contributing recharge areas and among contributing recharge areas for different sets of well variables. The differences in estimated explanatory variables are a function of the same factors affecting the contributing recharge areas as well as the spatial resolution and local distribution of the underlying spatial data. As a result, differences in estimated variables between circular buffers and contributing recharge areas are complex and site specific, as evidenced by differences in estimated variables for circular buffers and contributing recharge areas of existing public-supply and network wells in the Great Miami River Basin. Large differences in area-weighted mean environmental variables are observed at the basin scale, determined by using the network of uniformly spaced hypothetical wells; the differences have a spatial pattern that generally is similar to spatial patterns in the underlying STATSGO data. Generally, the largest differences were observed for area-weighted nitrogen-application rate from county and national land-use data; the basin-scale differences ranged from -1,600 (indicating a larger value from within the volume-equivalent contributing recharge area) to 1,900 kilograms per year (kg/yr); the range in the underlying spatial data was from 0 to 2,200 kg/yr. Silt content, alfisol content, and nitrogen-application rate are defined by the underlying spatial data and are external to the groundwater system; however, depth to water is an environmental variable that can be estimated in more detail and, presumably, in a more physically based manner using a groundwater-flow model than using the spatial data. Model-calculated depths to water within circular buffers in the Great Miami River Basin differed substantially from values derived from the spatial data and had a much larger range. Differences in estimates of area-weighted spatial variables result in corresponding differences in predictions of nitrate occurrence in the aquifer. In addition to the factors affecting contributing recharge areas and estimated explanatory variables, differences in predictions also are a function of the specific set of explanatory variables used and the fitted slope coefficients in a given model. For models that predicted the probability of exceeding 1 and 4 milligrams per liter as nitrogen (mg/L as N), predicted probabilities using variables estimated from circular buffers and contributing recharge areas generally were correlated but differed significantly at the local and basin scale. The scale and distribution of prediction differences can be explained by the underlying differences in the estimated variables and the relative weight of the variables in the statistical models. Differences in predictions of exceeding 1 mg/L as N (a model that includes only environmental variables) generally correlated with the underlying differences in STATSGO data, whereas differences in exceeding 4 mg/L as N were more spatially extensive because that model included environmental and nitrogen-source variables. 
Using depths to water from within circular buffers derived from the spatial data and depths to water within the circular buffers calculated from the groundwater-flow model, restricted to the same range, resulted in large differences in predicted probabilities. The differences in estimated explanatory variables between contributing recharge areas and circular buffers indicate that incorporation of physically based contributing recharge areas likely would result in a different set of explanatory variables and an improved set of statistical models. The use of a groundwater-flow model to improve representations of source areas or to provide more-detailed estimates of specific explanatory variables includes a number of limitations and technical considerations. The assumptions in these analyses are that (1) there is a state of mass balance between recharge and pumping, and (2) transport to a pumped well is under a steady-state flow field. Comparison of volume-equivalent contributing recharge areas under steady-state and transient transport conditions at a location in the southeastern part of the basin shows the steady-state contributing recharge area is a reasonable approximation of the transient contributing recharge area after between 10 and 20 years of pumping. The first assumption is a more important consideration for this analysis. A gradient effect refers to a condition where simulated pumping from a well is less than recharge through the corresponding contributing recharge area. This generally takes place in areas with steep hydraulic gradients, such as near discharge locations, and can be mitigated using a finer model discretization. A boundary effect refers to a condition where recharge through the contributing recharge area is less than pumping. This indicates other sources of water to the simulated well and could reflect a real hydrologic process. In the Great Miami River Basin, large gradient and boundary effects—defined as the balance between pumping and recharge being less than half—occurred in 5 and 14 percent of the basin, respectively. The agreement between circular buffers and volume-equivalent contributing recharge areas, differences in estimated variables, and the effect on statistical-model predictions between the population of wells with a balance between pumping and recharge within 10 percent and the population of all wells were similar. This indicated that process-model limitations did not affect the overall findings in the Great Miami River Basin; however, this would be model specific, and prudent use of a process model needs to entail a limitations analysis and, if necessary, alterations to the model.
Long, Zhiying; Chen, Kewei; Wu, Xia; Reiman, Eric; Peng, Danling; Yao, Li
2009-02-01
Spatial independent component analysis (sICA) has been widely used to analyze functional magnetic resonance imaging (fMRI) data. The well-accepted implicit assumption is the spatial statistical independence of the intrinsic sources identified by sICA, which makes sICA difficult to apply to data containing interdependent sources and confounding factors. This interdependency can arise, for instance, in fMRI studies investigating two tasks in a single session. In this study, we introduced a linear projection approach and considered its use as a tool to separate task-related components from two-task fMRI data. The robustness and feasibility of the method are substantiated through simulations on computer-generated data and real resting-state fMRI data. Both simulated and real two-task fMRI experiments demonstrated that sICA in combination with the projection method succeeded in separating spatially dependent components and had better detection power than a purely model-based method when estimating activation induced by each task as well as by both tasks.
Model Parameter Variability for Enhanced Anaerobic Bioremediation of DNAPL Source Zones
NASA Astrophysics Data System (ADS)
Mao, X.; Gerhard, J. I.; Barry, D. A.
2005-12-01
The objective of the Source Area Bioremediation (SABRE) project, an international collaboration of twelve companies, two government agencies and three research institutions, is to evaluate the performance of enhanced anaerobic bioremediation for the treatment of chlorinated ethene source areas containing dense, non-aqueous phase liquids (DNAPL). This 4-year, 5.7-million-dollar research effort focuses on a pilot-scale demonstration of enhanced bioremediation at a trichloroethene (TCE) DNAPL field site in the United Kingdom, and includes a significant program of laboratory and modelling studies. Prior to field implementation, a large-scale, multi-laboratory microcosm study was performed to determine the optimal system properties to support dehalogenation of TCE in site soil and groundwater. This statistically-based suite of experiments measured the influence of key variables (electron donor, nutrient addition, bioaugmentation, TCE concentration and sulphate concentration) in promoting the reductive dechlorination of TCE to ethene. In addition, a comprehensive biogeochemical numerical model was developed for simulating the anaerobic dehalogenation of chlorinated ethenes. An appropriate (reduced) version of this model was combined with a parameter estimation method based on fitting of the experimental results. Each of over 150 individual microcosm calibrations involved matching predicted and observed time-varying concentrations of all chlorinated compounds. This study focuses on an analysis of this suite of fitted model parameter values. This includes determining the statistical correlation between parameters typically employed in standard Michaelis-Menten-type rate descriptions (e.g., maximum dechlorination rates, half-saturation constants) and the key experimental variables. The analysis provides insight into the degree to which aqueous-phase TCE and cis-DCE inhibit dechlorination of less-chlorinated compounds. Overall, this work provides a database of the numerical modelling parameters typically employed for simulating TCE dechlorination relevant for a range of system conditions (e.g., bioaugmented, high TCE concentrations, etc.). The significance of the obtained variability of parameters is illustrated with one-dimensional simulations of enhanced anaerobic bioremediation of residual TCE DNAPL.
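The Michaelis-Menten (Monod) rate form with competitive inhibition referred to above can be sketched as follows; all parameter values are assumptions for illustration, not SABRE calibrations.

```python
# Illustrative Michaelis-Menten (Monod) dechlorination rate with competitive
# inhibition: cis-DCE dechlorination is slowed while TCE is still present.
def dechlorination_rate(c, v_max, k_s, c_inhibitor=0.0, k_i=1.0):
    """Rate of dechlorination of a compound at concentration c (mg/L)."""
    return v_max * c / (k_s * (1.0 + c_inhibitor / k_i) + c)

v_max, k_s, k_i = 2.0, 0.5, 0.3   # mg/L/day, mg/L, mg/L (hypothetical values)
c_dce = 5.0                        # cis-DCE concentration, mg/L (hypothetical)

print("no TCE present    :", dechlorination_rate(c_dce, v_max, k_s))
print("10 mg/L TCE present:", dechlorination_rate(c_dce, v_max, k_s, 10.0, k_i))
```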
Spatiotemporal Bayesian analysis of Lyme disease in New York state, 1990-2000.
Chen, Haiyan; Stratton, Howard H; Caraco, Thomas B; White, Dennis J
2006-07-01
Mapping ordinarily increases our understanding of nontrivial spatial and temporal heterogeneities in disease rates. However, the large number of parameters required by the corresponding statistical models often complicates detailed analysis. This study investigates the feasibility of a fully Bayesian hierarchical regression approach to the problem and identifies how it outperforms two more popular methods: crude rate estimates (CRE) and empirical Bayes standardization (EBS). In particular, we apply a fully Bayesian approach to the spatiotemporal analysis of Lyme disease incidence in New York state for the period 1990-2000. These results are compared with those obtained by CRE and EBS in Chen et al. (2005). We show that the fully Bayesian regression model not only gives more reliable estimates of disease rates than the other two approaches but also allows for tractable models that can accommodate more numerous sources of variation and unknown parameters.
NASA Astrophysics Data System (ADS)
Tumanov, Sergiu
A test of goodness of fit based on rank statistics was applied to prove the applicability of the Eggenberger-Polya discrete probability law to hourly SO2 concentrations measured in the vicinity of single sources. To this end, the pollutant concentration was treated as an integer-valued (integral) quantity, which is acceptable if one properly chooses the unit of measurement (in this case μg m⁻³) and if account is taken of the limited accuracy of measurements. The results of the test being satisfactory, even in the range of upper quantiles, the Eggenberger-Polya law was used in association with numerical modelling to estimate statistical parameters, e.g. quantiles, cumulative probabilities of threshold concentrations to be exceeded, and so on, at the grid points of a network covering the area of interest. This requires only accurate estimates of the means and variances of the concentration series, which can readily be obtained through routine air pollution dispersion modelling.
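A hedged sketch of the moment-based use described above, under the common identification of the Eggenberger-Pólya law with the negative binomial (Pólya) distribution parameterized by its mean and variance; the numbers are illustrative assumptions.

```python
# Sketch: fit a Polya (negative binomial) law from an assumed mean and variance
# of integer-valued hourly SO2 concentrations, then compute quantiles and
# threshold exceedance probabilities.
from scipy.stats import nbinom

mean, var = 40.0, 160.0            # ug/m3 and (ug/m3)^2, assumed from modelling
p = mean / var                     # negative binomial success probability
r = mean**2 / (var - mean)         # negative binomial shape parameter

dist = nbinom(r, p)
print("95th percentile          :", dist.ppf(0.95), "ug/m3")
print("P(concentration > 80)    :", dist.sf(80))
```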
Park, Eun Sug; Symanski, Elaine; Han, Daikwon; Spiegelman, Clifford
2015-06-01
A major difficulty with assessing source-specific health effects is that source-specific exposures cannot be measured directly; rather, they need to be estimated by a source-apportionment method such as multivariate receptor modeling. The uncertainty in source apportionment (uncertainty in source-specific exposure estimates and model uncertainty due to the unknown number of sources and identifiability conditions) has been largely ignored in previous studies. Also, spatial dependence of multipollutant data collected from multiple monitoring sites has not yet been incorporated into multivariate receptor modeling. The objectives of this project are (1) to develop a multipollutant approach that incorporates both sources of uncertainty in source-apportionment into the assessment of source-specific health effects and (2) to develop enhanced multivariate receptor models that can account for spatial correlations in the multipollutant data collected from multiple sites. We employed a Bayesian hierarchical modeling framework consisting of multivariate receptor models, health-effects models, and a hierarchical model on latent source contributions. For the health model, we focused on the time-series design in this project. Each combination of number of sources and identifiability conditions (additional constraints on model parameters) defines a different model. We built a set of plausible models with extensive exploratory data analyses and with information from previous studies, and then computed posterior model probabilities to estimate model uncertainty. Parameter estimation and model uncertainty estimation were implemented simultaneously by Markov chain Monte Carlo (MCMC) methods. We validated the methods using simulated data. We illustrated the methods using PM2.5 (particulate matter ≤ 2.5 μm in aerodynamic diameter) speciation data and mortality data from Phoenix, Arizona, and Houston, Texas. The Phoenix data included counts of cardiovascular deaths and daily PM2.5 speciation data from 1995-1997. The Houston data included respiratory mortality data and 24-hour PM2.5 speciation data sampled every six days from a region near the Houston Ship Channel in years 2002-2005. We also developed a Bayesian spatial multivariate receptor modeling approach that, while simultaneously dealing with the unknown number of sources and identifiability conditions, incorporated spatial correlations in the multipollutant data collected from multiple sites into the estimation of source profiles and contributions based on the discrete process convolution model for multivariate spatial processes. This new modeling approach was applied to 24-hour ambient air concentrations of 17 volatile organic compounds (VOCs) measured at nine monitoring sites in Harris County, Texas, during years 2000 to 2005. Simulation results indicated that our methods were accurate in identifying the true model, and estimated parameters were close to the true values. The results from our methods agreed in general with previous studies on the source apportionment of the Phoenix data in terms of estimated source profiles and contributions. However, we had a greater number of statistically insignificant findings, which was likely a natural consequence of incorporating uncertainty in the estimated source contributions into the health-effects parameter estimation. 
For the Houston data, a model with five sources (that seemed to be Sulfate-Rich Secondary Aerosol, Motor Vehicles, Industrial Combustion, Soil/Crustal Matter, and Sea Salt) showed the highest posterior model probability among the candidate models considered when fitted simultaneously to the PM2.5 and mortality data. There was a statistically significant positive association between respiratory mortality and same-day PM2.5 concentrations attributed to one of the sources (probably industrial combustion). The Bayesian spatial multivariate receptor modeling approach applied to the VOC data led to a highest posterior model probability for a model with five sources (that seemed to be refinery, petrochemical production, gasoline evaporation, natural gas, and vehicular exhaust) among several candidate models, with the number of sources varying between three and seven and with different identifiability conditions. Our multipollutant approach assessing source-specific health effects is more advantageous than a single-pollutant approach in that it can estimate total health effects from multiple pollutants and can also identify emission sources that are responsible for adverse health effects. Our Bayesian approach can incorporate not only uncertainty in the estimated source contributions, but also model uncertainty that has not been addressed in previous studies on assessing source-specific health effects. The new Bayesian spatial multivariate receptor modeling approach enables predictions of source contributions at unmonitored sites, minimizing exposure misclassification and providing improved exposure estimates along with their uncertainty estimates, as well as accounting for uncertainty in the number of sources and identifiability conditions.
Gokhale, Sharad; Raokhande, Namita
2008-05-01
There are several models that can be used to evaluate roadside air quality. Comparison of the operational performance of different models under local conditions is desirable so that the model that performs best can be identified. Three air quality models, namely the 'modified General Finite Line Source Model' (M-GFLSM) of particulates, the 'California Line Source' (CALINE3) model, and the 'California Line Source for Queuing & Hot Spot Calculations' (CAL3QHC) model, have been identified for evaluating the air quality at one of the busiest traffic intersections in the city of Guwahati. These models have been evaluated statistically with the vehicle-derived airborne particulate mass emissions in two sizes, i.e. PM10 and PM2.5, the prevailing meteorology, and the temporal distribution of the measured daily average PM10 and PM2.5 concentrations in wintertime. The study has shown that the CAL3QHC model makes better predictions than the other models for varied meteorology and traffic conditions. The detailed study reveals that the agreement between the measured and the modeled PM10 and PM2.5 concentrations has been reasonably good for the CALINE3 and CAL3QHC models. Further detailed analysis shows that the CAL3QHC model performed better than CALINE3. The monthly performance measures have also led to similar results. These two models also performed better across the wind speed classes, except for low winds (<1 m s⁻¹), for which the M-GFLSM model tended to perform better for PM10. Nevertheless, the CAL3QHC model outperformed the others for both particulate sizes and for all wind classes, and can therefore be a suitable option for air quality assessment at urban traffic intersections.
Cancer Related-Knowledge - Small Area Estimates
These model-based estimates are produced using statistical models that combine data from the Health Information National Trends Survey, and auxiliary variables obtained from relevant sources and borrow strength from other areas with similar characteristics.
Towards a Unified Source-Propagation Model of Cosmic Rays
NASA Astrophysics Data System (ADS)
Taylor, M.; Molla, M.
2010-07-01
It is well known that the cosmic ray energy spectrum is multifractal, with the analysis of cosmic ray fluxes as a function of energy revealing a first “knee” slightly below 10¹⁶ eV, a second knee slightly below 10¹⁸ eV and an “ankle” close to 10¹⁹ eV. The behaviour of the highest energy cosmic rays around and above the ankle is still a mystery and precludes the development of a unified source-propagation model of cosmic rays from their source origin to Earth. A variety of acceleration and propagation mechanisms have been proposed to explain different parts of the spectrum, the most famous of course being Fermi acceleration in magnetised turbulent plasmas (Fermi 1949). Many others have been proposed for energies at and below the first knee (Peters & Cimento (1961); Lagage & Cesarsky (1983); Drury et al. (1984); Wdowczyk & Wolfendale (1984); Ptuskin et al. (1993); Dova et al. (0000); Horandel et al. (2002); Axford (1991)) as well as at higher energies between the first knee and the ankle (Nagano & Watson (2000); Bhattacharjee & Sigl (2000); Malkov & Drury (2001)). The recent fit of most of the cosmic ray spectrum up to the ankle using non-extensive statistical mechanics (NESM) (Tsallis et al. (2003)) provides what may be the strongest evidence for a source-propagation system deviating significantly from Boltzmann statistics. As Tsallis has shown (Tsallis et al. (2003)), the knees appear as crossovers between two fractal-like thermal regimes. In this work, we have developed a generalisation of the second-order NESM model (Tsallis et al. (2003)) to higher orders and we have fit the complete spectrum including the ankle with third-order NESM. We find that, towards the GZK limit, a new mechanism comes into play. Surprisingly, it also presents as a modulation akin to that seen in our own local neighbourhood for cosmic rays emitted by the Sun. We propose that this is due to modulation at the source and is possibly due to processes in the shell of the originating supernova. We report that the entire spectrum, spanning cosmic rays of local solar origin and those emanating from galactic and extra-galactic sources, can be explained using a new diagnostic — the gradient of the log-log plot. This diagnostic reveals the known Boltzmann statistics in the solar-terrestrial neighbourhood and also at the highest energies — presumably at the cosmic ray source — with clearly separated fractal scales in between. We interpret this as modulation at the source followed by Fermi acceleration facilitated by galactic and extra-galactic magnetic fields with a final modulation in the solar-terrestrial neighbourhood. We conclude that the gradient of multifractal curves appears to be an excellent detector of fractality.
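For orientation, the basic non-extensive (Tsallis) ingredient that such fits build on is a q-exponential flux law; the sketch below shows that form only, with purely illustrative parameters rather than fitted values from the paper.

```python
# Sketch of the q-exponential flux form used in non-extensive statistical
# mechanics: flux(E) ~ A * [1 - (1 - q) * E / T]**(1 / (1 - q)), q > 1.
import numpy as np

def q_exponential_flux(E, A, q, T):
    return A * np.power(1.0 - (1.0 - q) * E / T, 1.0 / (1.0 - q))

E = np.logspace(11, 15, 5)                       # eV (illustrative range)
flux = q_exponential_flux(E, A=1.0e4, q=1.22, T=1.0e11)   # assumed parameters
for e, f in zip(E, flux):
    print(f"E = {e:.1e} eV -> relative flux = {f:.3e}")
```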
Quantum Theory of Superresolution for Incoherent Optical Imaging
NASA Astrophysics Data System (ADS)
Tsang, Mankei
Rayleigh's criterion for resolving two incoherent point sources has been the most influential measure of optical imaging resolution for over a century. In the context of statistical image processing, violation of the criterion is especially detrimental to the estimation of the separation between the sources, and modern far-field superresolution techniques rely on suppressing the emission of close sources to enhance the localization precision. Using quantum optics, quantum metrology, and statistical analysis, here we show that, even if two close incoherent sources emit simultaneously, measurements with linear optics and photon counting can estimate their separation from the far field almost as precisely as conventional methods do for isolated sources, rendering Rayleigh's criterion irrelevant to the problem. Our results demonstrate that superresolution can be achieved not only for fluorophores but also for stars. Recent progress in generalizing our theory for multiple sources and spectroscopy will also be discussed. This work is supported by the Singapore National Research Foundation under NRF Grant No. NRF-NRFF2011-07 and the Singapore Ministry of Education Academic Research Fund Tier 1 Project R-263-000-C06-112.
Martian methane plume models for defining Mars rover methane source search strategies
NASA Astrophysics Data System (ADS)
Nicol, Christopher; Ellery, Alex; Lynch, Brian; Cloutis, Ed
2018-07-01
The detection of atmospheric methane on Mars implies an active methane source. This introduces the possibility of a biotic source with the implied need to determine whether the methane is indeed biotic in nature or geologically generated. There is a clear need for robotic algorithms which are capable of manoeuvring a rover through a methane plume on Mars to locate its source. We explore aspects of Mars methane plume modelling to reveal complex dynamics characterized by advection and diffusion. A statistical analysis of the plume model has been performed and compared to analyses of terrestrial plume models. Finally, we consider a robotic search strategy to find a methane plume source. We find that gradient-based techniques are ineffective, but that more sophisticated model-based search strategies are unlikely to be available in near-term rover missions.
An analytic technique for statistically modeling random atomic clock errors in estimation
NASA Technical Reports Server (NTRS)
Fell, P. J.
1981-01-01
Minimum variance estimation requires that the statistics of random observation errors be modeled properly. If measurements are derived through the use of atomic frequency standards, then one source of error affecting the observable is random fluctuation in frequency. This is the case, for example, with range and integrated Doppler measurements from satellites of the Global Positioning System and with baseline determination for geodynamic applications. An analytic method is presented which approximates the statistics of this random process. The procedure starts with a model of the Allan variance for a particular oscillator and develops the statistics of range and integrated Doppler measurements. A series of five first-order Markov processes is used to approximate the power spectral density obtained from the Allan variance.
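A minimal sketch of the (non-overlapping) Allan variance of a fractional-frequency series, the quantity the procedure starts from; the white-noise series below is an assumption, not a model of any particular oscillator.

```python
# Non-overlapping Allan variance of a fractional frequency series:
# sigma_y^2(tau) = 0.5 * < (ybar_{k+1} - ybar_k)^2 >, with ybar averaged over tau.
import numpy as np

def allan_variance(y, m):
    """Allan variance at averaging factor m (tau = m * tau0)."""
    n = len(y) // m
    averages = y[: n * m].reshape(n, m).mean(axis=1)
    return 0.5 * np.mean(np.diff(averages) ** 2)

rng = np.random.default_rng(7)
y = rng.normal(0.0, 1e-12, 100_000)   # assumed white frequency noise
for m in (1, 10, 100, 1000):
    print(f"tau = {m:5d} * tau0 : sigma_y = {np.sqrt(allan_variance(y, m)):.3e}")
```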
Meteorological models for estimating phenology of corn
NASA Technical Reports Server (NTRS)
Daughtry, C. S. T.; Cochran, J. C.; Hollinger, S. E.
1984-01-01
Knowledge of when critical crop stages occur and how the environment affects them should provide useful information for crop management decisions and crop production models. Two sources of data were evaluated for predicting dates of silking and physiological maturity of corn (Zea mays L.). Initial evaluations were conducted using data of an adapted corn hybrid grown on a Typic Agriaquoll at the Purdue University Agronomy Farm. The second phase extended the analyses to large areas using data acquired by the Statistical Reporting Service of USDA for crop reporting districts (CRD) in Indiana and Iowa. Several thermal models were compared to calendar days for predicting dates of silking and physiological maturity. Mixed models which used a combination of thermal units to predict silking and days after silking to predict physiological maturity were also evaluated. At the Agronomy Farm the models were calibrated and tested on the same data. The thermal models were significantly less biased and more accurate than calendar days for predicting dates of silking. Differences among the thermal models were small. Significant improvements in both bias and accuracy were observed when the mixed models were used to predict dates of physiological maturity. The results indicate that statistical data for CRD can be used to evaluate models developed at agricultural experiment stations.
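One common thermal-unit (growing degree day) formulation for corn, shown as a sketch of what such a thermal model accumulates; the base and ceiling temperatures and the silking requirement are illustrative assumptions, not the calibrated values from the study.

```python
# Sketch of growing degree day (thermal unit) accumulation toward a crop stage.
def daily_gdd(t_max, t_min, t_base=10.0, t_ceiling=30.0):
    t_max = min(max(t_max, t_base), t_ceiling)
    t_min = min(max(t_min, t_base), t_ceiling)
    return (t_max + t_min) / 2.0 - t_base

daily_temps = [(28, 16), (31, 18), (25, 14), (33, 20)]   # (t_max, t_min), deg C
gdd_to_silking = 790.0                                    # hypothetical requirement

accumulated = 0.0
for day, (t_max, t_min) in enumerate(daily_temps, start=1):
    accumulated += daily_gdd(t_max, t_min)
    note = "  -> silking predicted" if accumulated >= gdd_to_silking else ""
    print(f"day {day}: accumulated GDD = {accumulated:.1f}{note}")
```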
A search for AGN activity in Infrared-Faint Radio Sources (IFRS)
NASA Astrophysics Data System (ADS)
Lenc, Emil; Middelberg, Enno; Norris, Ray; Mao, Minnie
2009-04-01
We propose to observe a large sample of radio sources from the ATLAS (Australia Telescope Large Area Survey) source catalogue with the LBA, to determine their compactness. The sample consists of 36 sources with no counterpart in the co-located SWIRE survey (3.6 um to 160 um), carried out with the Spitzer Space Telescope. This rare class of sources, dubbed Infrared-Faint Radio Sources (IFRS), is inconsistent with current galaxy evolution models. VLBI observations are an essential way to obtain further clues on what these objects are and why they are hidden from infrared observations. We will measure the flux densities on long baselines to determine their compactness. Only five IFRS have been previously targeted with VLBI observations (resulting in two detections). We propose using single-baseline (Parkes-ATCA) eVLBI observations with the LBA at 1 Gbps to maximise sensitivity. With the observations proposed here we will increase the number of VLBI-observed IFRS from 5 to 36, allowing us to draw statistical conclusions about this intriguing new class of objects.
A search for AGN activity in Infrared-Faint Radio Sources (IFRS)
NASA Astrophysics Data System (ADS)
Lenc, Emil; Middelberg, Enno; Norris, Ray; Mao, Minnie
2010-04-01
We propose to observe a large sample of radio sources from the ATLAS (Australia Telescope Large Area Survey) source catalogue with the LBA, to determine their compactness. The sample consists of 36 sources with no counterpart in the co-located SWIRE survey (3.6 um to 160 um), carried out with the Spitzer Space Telescope. This rare class of sources, dubbed Infrared-Faint Radio Sources (IFRS), is inconsistent with current galaxy evolution models. VLBI observations are an essential way to obtain further clues on what these objects are and why they are hidden from infrared observations. We will measure the flux densities on long baselines to determine their compactness. Only five IFRS have been previously targeted with VLBI observations (resulting in two detections). We propose using single-baseline (Parkes-ATCA) eVLBI observations with the LBA at 1 Gbps to maximise sensitivity. With the observations proposed here we will increase the number of VLBI-observed IFRS from 5 to 36, allowing us to draw statistical conclusions about this intriguing new class of objects.
Efficient Moment-Based Inference of Admixture Parameters and Sources of Gene Flow
Levin, Alex; Reich, David; Patterson, Nick; Berger, Bonnie
2013-01-01
The recent explosion in available genetic data has led to significant advances in understanding the demographic histories of and relationships among human populations. It is still a challenge, however, to infer reliable parameter values for complicated models involving many populations. Here, we present MixMapper, an efficient, interactive method for constructing phylogenetic trees including admixture events using single nucleotide polymorphism (SNP) genotype data. MixMapper implements a novel two-phase approach to admixture inference using moment statistics, first building an unadmixed scaffold tree and then adding admixed populations by solving systems of equations that express allele frequency divergences in terms of mixture parameters. Importantly, all features of the model, including topology, sources of gene flow, branch lengths, and mixture proportions, are optimized automatically from the data and include estimates of statistical uncertainty. MixMapper also uses a new method to express branch lengths in easily interpretable drift units. We apply MixMapper to recently published data for Human Genome Diversity Cell Line Panel individuals genotyped on a SNP array designed especially for use in population genetics studies, obtaining confident results for 30 populations, 20 of them admixed. Notably, we confirm a signal of ancient admixture in European populations—including previously undetected admixture in Sardinians and Basques—involving a proportion of 20–40% ancient northern Eurasian ancestry. PMID:23709261
Statistical Analysis of the Impacts of Regional Transportation on the Air Quality in Beijing
NASA Astrophysics Data System (ADS)
Huang, Zhongwen; Zhang, Huiling; Tong, Lei; Xiao, Hang
2016-04-01
From October to December 2015, the Beijing-Tianjin-Hebei (BTH) region experienced several severe haze events. In order to assess the effects of regional transportation on air quality in Beijing, air monitoring data (PM2.5, SO2, NO2 and CO) from that period published by the Chinese National Environmental Monitoring Center (CNEMC) were collected and analyzed with various statistical models. The cities within the BTH area were clustered into three groups according to geographical conditions, with the air pollutant concentrations of cities within a group sharing similar variation trends. The Granger causality test results indicate that significant causal relationships exist between the air pollutant data of Beijing and its surrounding cities (Baoding, Chengde, Tianjin and Zhangjiakou) for the reference period. Linear regression models were then constructed to capture the interdependency among the multiple time series. The observed air pollutant concentrations in Beijing were consistent with the model-fitted results. More importantly, further analysis suggests that the air pollutants in Beijing were strongly affected by regional transportation, as local sources contributed only 17.88%, 27.12%, 14.63% and 31.36% of PM2.5, SO2, NO2 and CO concentrations, respectively. The major external source for Beijing was from the southwest (Baoding) direction, accounting for more than 42% of all these air pollutants. Thus, by combining various statistical models, it may be possible not only to quickly predict the air quality of any city on a regional scale, but also to evaluate the local and regional source contributions for a particular city. Key words: regional transportation, air pollution, Granger causality test, statistical models
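A minimal sketch of the statistical steps described above follows, using synthetic series in place of the CNEMC monitoring data; the city names, lag order and coefficients are illustrative assumptions.

```python
# Sketch: Granger causality test plus a lagged linear regression on synthetic
# "Beijing" and "Baoding" series standing in for the monitoring data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(1)
n = 500
baoding = pd.Series(rng.normal(size=n)).rolling(3, min_periods=1).mean()
# Make "Beijing" depend partly on lagged "Baoding" plus local noise
beijing = 0.6 * baoding.shift(1).fillna(0) + 0.4 * rng.normal(size=n)

# Granger causality: does the Baoding series help predict the Beijing series?
data = pd.DataFrame({"beijing": beijing, "baoding": baoding})
results = grangercausalitytests(data[["beijing", "baoding"]].values, maxlag=2)

# Simple regression capturing the interdependency (regional lagged term only)
X = sm.add_constant(pd.DataFrame({"baoding_lag1": baoding.shift(1)}).dropna())
y = beijing.iloc[1:]
print(sm.OLS(y, X).fit().summary())
```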
Monitoring alert and drowsy states by modeling EEG source nonstationarity
NASA Astrophysics Data System (ADS)
Hsu, Sheng-Hsiou; Jung, Tzyy-Ping
2017-10-01
Objective. As a human brain performs various cognitive functions within ever-changing environments, states of the brain characterized by recorded brain activities such as electroencephalogram (EEG) are inevitably nonstationary. The challenges of analyzing the nonstationary EEG signals include finding neurocognitive sources that underlie different brain states and using EEG data to quantitatively assess the state changes. Approach. This study hypothesizes that brain activities under different states, e.g. levels of alertness, can be modeled as distinct compositions of statistically independent sources using independent component analysis (ICA). This study presents a framework to quantitatively assess the EEG source nonstationarity and estimate levels of alertness. The framework was tested against EEG data collected from 10 subjects performing a sustained-attention task in a driving simulator. Main results. Empirical results illustrate that EEG signals under alert versus drowsy states, indexed by reaction speeds to driving challenges, can be characterized by distinct ICA models. By quantifying the goodness-of-fit of each ICA model to the EEG data using the model deviation index (MDI), we found that MDIs were significantly correlated with the reaction speeds (r = -0.390 with alertness models and r = 0.449 with drowsiness models) and the opposite correlations indicated that the two models accounted for sources in the alert and drowsy states, respectively. Based on the observed source nonstationarity, this study also proposes an online framework using a subject-specific ICA model trained with an initial (alert) state to track the level of alertness. For classification of alert against drowsy states, the proposed online framework achieved an averaged area-under-curve of 0.745 and compared favorably with a classic power-based approach. Significance. This ICA-based framework provides a new way to study changes of brain states and can be applied to monitoring cognitive or mental states of human operators in attention-critical settings or in passive brain-computer interfaces.
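For illustration only, the sketch below fits separate ICA models to "alert" and "drowsy" EEG segments and scores a new epoch against each. The scoring (mean absolute excess kurtosis of the unmixed sources) is a crude stand-in for the paper's model deviation index, and the data are synthetic; none of this reproduces the authors' framework.

```python
# Toy sketch: two state-specific ICA models and a proxy goodness-of-fit score.
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

def fit_ica(epochs, n_components=8, seed=0):
    """epochs: (n_samples, n_channels) array from one brain state."""
    return FastICA(n_components=n_components, random_state=seed, max_iter=1000).fit(epochs)

def source_nongaussianity(ica, epoch):
    """Unmix an epoch with a trained ICA model and summarise source non-Gaussianity."""
    sources = ica.transform(epoch)              # (n_samples, n_components)
    return np.mean(np.abs(kurtosis(sources, axis=0)))

# Synthetic stand-ins for band-passed EEG (n_samples x n_channels)
rng = np.random.default_rng(2)
alert_eeg = rng.laplace(size=(2000, 16))
drowsy_eeg = rng.laplace(size=(2000, 16)) @ rng.normal(size=(16, 16))

ica_alert, ica_drowsy = fit_ica(alert_eeg), fit_ica(drowsy_eeg)
test_epoch = drowsy_eeg[:500]
print(source_nongaussianity(ica_alert, test_epoch),
      source_nongaussianity(ica_drowsy, test_epoch))
```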
MOSFiT: Modular Open Source Fitter for Transients
NASA Astrophysics Data System (ADS)
Guillochon, James; Nicholl, Matt; Villar, V. Ashley; Mockler, Brenna; Narayan, Gautham; Mandel, Kaisey S.; Berger, Edo; Williams, Peter K. G.
2018-05-01
Much of the progress made in time-domain astronomy is accomplished by relating observational multiwavelength time-series data to models derived from our understanding of physical laws. This goal is typically accomplished by dividing the task in two: collecting data (observing), and constructing models to represent that data (theorizing). Owing to the natural tendency for specialization, a disconnect can develop between the best available theories and the best available data, potentially delaying advances in our understanding of new classes of transients. We introduce MOSFiT: the Modular Open Source Fitter for Transients, a Python-based package that downloads transient data sets from open online catalogs (e.g., the Open Supernova Catalog), generates Monte Carlo ensembles of semi-analytical light-curve fits to those data sets and their associated Bayesian parameter posteriors, and optionally delivers the fitting results back to those same catalogs to make them available to the rest of the community. MOSFiT is designed to help bridge the gap between observations and theory in time-domain astronomy; in addition to making the application of existing models and creation of new models as simple as possible, MOSFiT yields statistically robust predictions for transient characteristics, with a standard output format that includes all the setup information necessary to reproduce a given result. As large-scale surveys, such as that conducted with the Large Synoptic Survey Telescope (LSST), discover entirely new classes of transients, tools such as MOSFiT will be critical for enabling rapid comparison of models against data in statistically consistent, reproducible, and scientifically beneficial ways.
A hybrid model for predicting carbon monoxide from vehicular exhausts in urban environments
NASA Astrophysics Data System (ADS)
Gokhale, Sharad; Khare, Mukesh
Several deterministic air quality models evaluate and predict the frequently occurring pollutant concentrations well but, in general, are incapable of predicting the 'extreme' concentrations. In contrast, statistical distribution models overcome this limitation of the deterministic models and predict the 'extreme' concentrations. However, environmental damage is caused both by extremes and by the sustained average concentration of pollutants. Hence, a model should predict not only the 'extreme' ranges but also the 'middle' ranges of pollutant concentrations, i.e. the entire range. Hybrid modelling is one of the techniques that estimates/predicts the entire range of the distribution of pollutant concentrations by combining deterministic models with suitable statistical distribution models (Jakeman et al., 1988). In the present paper, a hybrid model has been developed to predict carbon monoxide (CO) concentration distributions at one of the traffic intersections, Income Tax Office (ITO), in Delhi, where the traffic is heterogeneous in nature and the meteorology is 'tropical'. The traffic consists of light vehicles, heavy vehicles, three-wheelers (auto rickshaws) and two-wheelers (scooters, motorcycles, etc.). The model combines the general finite line source model (GFLSM) as its deterministic component and the log-logistic distribution (LLD) model as its statistical component. The hybrid (GFLSM-LLD) model is then applied at the ITO intersection. The results show that the hybrid model predictions match the observed CO concentration data within the 5-99 percentile range. The model is further validated at a different street location, the Sirifort roadway. The validation results show that the model predicts CO concentrations fairly well (d = 0.91) in the 10-95 percentile range. A regulatory compliance analysis is also developed to estimate the probability of hourly CO concentrations exceeding the National Ambient Air Quality Standards (NAAQS) of India.
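The sketch below illustrates only the statistical component of such a hybrid scheme: fitting a log-logistic distribution to hourly CO concentrations and reading off an exceedance probability. SciPy calls the log-logistic the "fisk" distribution; the data and the 4 mg/m3 threshold are illustrative, not the paper's values.

```python
# Hedged sketch: log-logistic fit and exceedance probability for synthetic CO data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
co_obs = stats.fisk.rvs(c=2.5, scale=2.0, size=2000, random_state=rng)  # synthetic hourly CO

# Fit the log-logistic (Fisk) distribution; fixing loc=0 keeps the support positive
c_hat, loc_hat, scale_hat = stats.fisk.fit(co_obs, floc=0)

threshold = 4.0   # illustrative threshold, not the NAAQS figure
p_exceed = stats.fisk.sf(threshold, c_hat, loc=loc_hat, scale=scale_hat)
print(f"P(CO > {threshold} mg/m3) = {p_exceed:.3f}")
```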
NASA Astrophysics Data System (ADS)
Iguchi, Kazumoto
We discuss the statistical mechanical foundation for the two-state transition in the protein folding of small globular proteins. In the standard arguments of protein folding, the statistical search for the ground state is carried out over astronomically many conformations in the configuration space. This leads to the famous Levinthal paradox. To resolve the paradox, Gō first postulated that the two-state transition - an all-or-none type transition - is crucial for the protein folding of small globular proteins and used his lattice model to demonstrate its two-state nature. Recently, many experimental results have accumulated that support the two-state transition for small globular proteins. Stimulated by such experiments, Zwanzig introduced a minimal statistical mechanical model that exhibits the two-state transition. Finkelstein and coworkers have also discussed the solution of the paradox by considering the sequential folding of a small globular protein. More recently, Iguchi has introduced a toy model of protein folding using the Rubik's magic snake, in which all folded structures are exactly known and mathematically represented in terms of four types of conformations: cis-, trans-, left- and right-gauche configurations between the unit polyhedrons. In this paper, we study the relationship between Gō's two-state transition, Zwanzig's statistical mechanics model and Finkelstein's sequential folding model by applying them to the Rubik's magic snake models. We show that the foundation of Gō's two-state transition model relies on the search within an equienergy surface labeled by the contact order of the hydrophobic condensation. This idea reproduces Zwanzig's statistical model as a special case, realizes Finkelstein's sequential folding model, and brings these together to explain the nature of the two-state transition of a small globular protein through calculation of physical quantities such as the free energy, the contact order and the specific heat. We point out the similarity between the liquid-gas transition in statistical mechanics and the two-state transition of protein folding. We also study the morphology of the Rubik's magic snake models to give a prototype model for understanding the differences between α-helix proteins and β-sheet proteins.
*K-means and cluster models for cancer signatures.
Kakushadze, Zura; Yu, Willie
2017-09-01
We present the *K-means clustering algorithm and source code, obtained by expanding the statistical clustering methods applied to quantitative finance in https://ssrn.com/abstract=2802753. *K-means is statistically deterministic without the need to specify initial centers, etc. We apply *K-means to extracting cancer signatures from genome data without using nonnegative matrix factorization (NMF); *K-means' computational cost is a fraction of NMF's. Using 1389 published samples for 14 cancer types, we find that 3 cancers (liver cancer, lung cancer and renal cell carcinoma) stand out and do not have cluster-like structures. Two clusters have especially high within-cluster correlations with 11 other cancers, indicating common underlying structures. Our approach opens a novel avenue for studying such structures. *K-means is universal and can be applied in other fields. We discuss some potential applications in quantitative finance.
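To illustrate the general idea, the sketch below runs ordinary k-means from scikit-learn on a synthetic mutation-fraction matrix; it does not reproduce the authors' *K-means variant (which removes the dependence on initial centers), and the data are toy values standing in for the published cancer samples.

```python
# Sketch: standard k-means on a toy samples-by-mutation-category matrix.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
# Rows: samples; columns: 96 trinucleotide mutation categories (toy counts)
exposures = rng.poisson(lam=5, size=(200, 96)).astype(float)
exposures /= exposures.sum(axis=1, keepdims=True)      # per-sample fractions

km = KMeans(n_clusters=6, n_init=20, random_state=0).fit(exposures)
print(np.bincount(km.labels_))                          # cluster sizes
print(km.cluster_centers_.shape)                        # candidate "signatures": (6, 96)
```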
Annotating novel genes by integrating synthetic lethals and genomic information
Schöner, Daniel; Kalisch, Markus; Leisner, Christian; Meier, Lukas; Sohrmann, Marc; Faty, Mahamadou; Barral, Yves; Peter, Matthias; Gruissem, Wilhelm; Bühlmann, Peter
2008-01-01
Background Large-scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amounts of data resulting from a single large-scale screen far exceed the capacities of experimental characterization of every identified target. Thus, there is a need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Unlike existing methods that require large amounts of synthetic lethality data, our method relies only on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates, and we present she1 (YBL031W) as a novel gene involved in spindle migration. We also applied the statistical methodology to TOR2 signaling as a second example. Conclusion We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process. PMID:18194531
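A minimal sketch of the two-group assignment step with a multivariate Gaussian mixture follows; the feature matrix (standing in for expression, phenotype, degradation and similarity scores per gene) is synthetic, and the two-component split and posterior cutoff are assumptions mirroring the description above.

```python
# Sketch: assigning genes to two groups with a Gaussian mixture model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
n_features = 4
features = np.vstack([
    rng.normal(0.0, 1.0, size=(270, n_features)),   # background genes
    rng.normal(1.5, 0.7, size=(30, n_features)),    # putative spindle-migration candidates
])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(features)
posterior = gmm.predict_proba(features)
candidate_group = gmm.means_.sum(axis=1).argmax()    # component with the larger mean profile
candidates = np.nonzero(posterior[:, candidate_group] > 0.9)[0]
print(len(candidates), "candidate genes")
```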
European Wintertime Windstorms and Their Links to Large-Scale Variability Modes
NASA Astrophysics Data System (ADS)
Befort, D. J.; Wild, S.; Walz, M. A.; Knight, J. R.; Lockwood, J. F.; Thornton, H. E.; Hermanson, L.; Bett, P.; Weisheimer, A.; Leckebusch, G. C.
2017-12-01
Winter storms associated with extreme wind speeds and heavy precipitation are the most costly natural hazard in several European countries. Improved understanding and seasonal forecast skill of winter storms will thus help society, policy-makers and the (re)insurance industry to be better prepared for such events. We first assess the ability of three seasonal forecast ensemble suites (ECMWF System3, ECMWF System4 and GloSea5) to represent extra-tropical windstorms over the Northern Hemisphere. Our results show significant skill for inter-annual variability of windstorm frequency over parts of Europe in two of these forecast suites (ECMWF-S4 and GloSea5), indicating the potential use of current seasonal forecast systems. In a regression model we further derive windstorm variability using the forecasted NAO from the seasonal model suites, thus estimating the suitability of the NAO as the only predictor. We find that the NAO, as the main large-scale mode over Europe, can explain some of the achieved skill and is therefore an important source of variability in the seasonal models. However, our results show that the regression model fails to reproduce the skill level of the directly forecast windstorm frequency over large areas of central Europe. This suggests that the seasonal models also capture sources of windstorm variability and predictability other than the NAO. In order to investigate which other large-scale variability modes steer the interannual variability of windstorms, we develop a statistical model using a Poisson generalized linear model (GLM). We find that the Scandinavian Pattern (SCA) in fact explains a larger amount of variability for Central Europe during the 20th century than the NAO. This statistical model skilfully reproduces the interannual variability of windstorm frequency, especially for the British Isles and Central Europe, with correlations up to 0.8.
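As a hedged sketch of the statistical model described above, the code below fits a Poisson GLM relating seasonal windstorm counts to large-scale indices (NAO and SCA). The predictor and count series are synthetic and the coefficients are illustrative assumptions only.

```python
# Sketch: Poisson GLM for seasonal windstorm counts vs. NAO and SCA indices.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n_winters = 100
nao = rng.normal(size=n_winters)
sca = rng.normal(size=n_winters)
true_rate = np.exp(0.8 + 0.2 * nao + 0.5 * sca)   # SCA given the larger weight, as in the text
storm_counts = rng.poisson(true_rate)

X = sm.add_constant(np.column_stack([nao, sca]))
glm = sm.GLM(storm_counts, X, family=sm.families.Poisson()).fit()
print(glm.summary())
print("Correlation with observed counts:", np.corrcoef(storm_counts, glm.fittedvalues)[0, 1])
```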
Statistics of the geomagnetic secular variation for the past 5Ma
NASA Technical Reports Server (NTRS)
Constable, C. G.; Parker, R. L.
1986-01-01
A new statistical model is proposed for the geomagnetic secular variation over the past 5Ma. Unlike previous models, the model makes use of statistical characteristics of the present day geomagnetic field. The spatial power spectrum of the non-dipole field is consistent with a white source near the core-mantle boundary with Gaussian distribution. After a suitable scaling, the spherical harmonic coefficients may be regarded as statistical samples from a single giant Gaussian process; this is the model of the non-dipole field. The model can be combined with an arbitrary statistical description of the dipole and probability density functions and cumulative distribution functions can be computed for declination and inclination that would be observed at any site on Earth's surface. Global paleomagnetic data spanning the past 5Ma are used to constrain the statistics of the dipole part of the field. A simple model is found to be consistent with the available data. An advantage of specifying the model in terms of the spherical harmonic coefficients is that it is a complete statistical description of the geomagnetic field, enabling us to test specific properties for a general description. Both intensity and directional data distributions may be tested to see if they satisfy the expected model distributions.
Statistics of the geomagnetic secular variation for the past 5 m.y
NASA Technical Reports Server (NTRS)
Constable, C. G.; Parker, R. L.
1988-01-01
A new statistical model is proposed for the geomagnetic secular variation over the past 5Ma. Unlike previous models, the model makes use of statistical characteristics of the present day geomagnetic field. The spatial power spectrum of the non-dipole field is consistent with a white source near the core-mantle boundary with Gaussian distribution. After a suitable scaling, the spherical harmonic coefficients may be regarded as statistical samples from a single giant Gaussian process; this is the model of the non-dipole field. The model can be combined with an arbitrary statistical description of the dipole and probability density functions and cumulative distribution functions can be computed for declination and inclination that would be observed at any site on Earth's surface. Global paleomagnetic data spanning the past 5Ma are used to constrain the statistics of the dipole part of the field. A simple model is found to be consistent with the available data. An advantage of specifying the model in terms of the spherical harmonic coefficients is that it is a complete statistical description of the geomagnetic field, enabling us to test specific properties for a general description. Both intensity and directional data distributions may be tested to see if they satisfy the expected model distributions.
NASA Astrophysics Data System (ADS)
Kumar, Jagadish; Ananthakrishna, G.
2018-01-01
Scale-invariant power-law distributions for acoustic emission signals are ubiquitous in several plastically deforming materials. However, power-law distributions for acoustic emission energies are reported in distinctly different plastically deforming situations such as hcp and fcc single and polycrystalline samples exhibiting smooth stress-strain curves and in dilute metallic alloys exhibiting discontinuous flow. This is surprising since the underlying dislocation mechanisms in these two types of deformations are very different. So far, there have been no models that predict the power-law statistics for discontinuous flow. Furthermore, the statistics of the acoustic emission signals in jerky flow is even more complex, requiring multifractal measures for a proper characterization. There has been no model that explains the complex statistics either. Here we address the problem of statistical characterization of the acoustic emission signals associated with the three types of Portevin-Le Chatelier bands. Following our recently proposed general framework for calculating acoustic emission, we set up a wave equation for the elastic degrees of freedom with a plastic strain rate as a source term. The energy dissipated during acoustic emission is represented by the Rayleigh-dissipation function. Using the plastic strain rate obtained from the Ananthakrishna model for the Portevin-Le Chatelier effect, we compute the acoustic emission signals associated with the three Portevin-Le Chatelier bands and the Lüders-like band. The calculated acoustic emission signals are used for further statistical characterization. Our results show that the model predicts power-law statistics for all the acoustic emission signals associated with the three types of Portevin-Le Chatelier bands, with the exponent values increasing with increasing strain rate. The calculated multifractal spectra corresponding to the acoustic emission signals associated with the three band types have a maximum spread for the type C bands, decreasing for types B and A. We further show that the acoustic emission signals associated with the Lüders-like band also exhibit a power-law distribution and multifractality.
Probabilistic forecasts of debris-flow hazard at the regional scale with a combination of models.
NASA Astrophysics Data System (ADS)
Malet, Jean-Philippe; Remaître, Alexandre
2015-04-01
Debris flows are one of the many active slope-forming processes in the French Alps, where rugged and steep slopes mantled by various slope deposits offer a great potential for triggering hazardous events. A quantitative assessment of debris-flow hazard requires the estimation, in a probabilistic framework, of the spatial probability of occurrence of source areas, the spatial probability of runout areas, the temporal frequency of events, and their intensity. The main objective of this research is to propose a pipeline for the estimation of these quantities at the regional scale using a chain of debris-flow models. The methodology is developed and validated at the experimental site of the Barcelonnette Basin (South French Alps), where 26 active torrents have produced more than 150 debris-flow events since 1850. First, a susceptibility assessment is performed to identify the debris-flow prone source areas. The most frequently used approach is the combination of environmental factors with GIS procedures and statistical techniques, which may or may not integrate detailed event inventories. Based on a 5 m DEM and its derivatives, and information on slope lithology, engineering soils and land cover, the possible source areas are identified with a statistical logistic regression model. The performance of the statistical model is evaluated with the observed distribution of debris-flow events recorded after 1850 in the study area. The source areas in the three most active torrents (Riou-Bourdoux, Faucon, Sanières) are well identified by the model. Results are less convincing for three other active torrents (Bourget, La Valette and Riou-Chanal); this could be related to the type of debris-flow triggering mechanism, as the model seems better able to spot open-slope debris-flow source areas (e.g. scree slopes) but appears less efficient at identifying landslide-induced debris flows. Second, a susceptibility assessment is performed to estimate the possible runout distance with a process-based model. The MassMov-2D code is a two-dimensional model of mud- and debris-flow dynamics over complex topography, based on a numerical integration of the depth-averaged motion equations using the shallow-water approximation. The runout simulations are performed for the most active torrents. The performance of the model has been evaluated by comparing modelling results with the observed spreading areas of several recent debris flows. Existing data on debris-flow volume, input discharge and deposits were used to back-analyze those events and estimate the values of the model parameters. Third, hazard is estimated on the basis of scenarios computed in a probabilistic way, for volumes in the range 20'000 to 350'000 m3, and for several combinations of rheological parameters. In most cases, the simulations indicate that the debris flows cause significant overflowing on the alluvial fans for volumes exceeding 100'000 m3 (height of deposits > 2 m, velocities > 5 m/s). Probabilities of debris-flow runout and debris-flow intensities are then computed for each terrain unit.
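The sketch below illustrates the source-area susceptibility step with a logistic regression on gridded terrain predictors. The predictor names follow the text (slope, loose deposits, land cover), but the data and coefficients are synthetic stand-ins for the 5 m DEM derivatives and the event inventory.

```python
# Sketch: logistic-regression susceptibility for debris-flow source areas.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n_cells = 5000
slope = rng.uniform(0, 45, n_cells)                    # slope angle, degrees
scree = rng.integers(0, 2, n_cells)                    # 1 if scree / loose deposits
forest = rng.integers(0, 2, n_cells)                   # 1 if forested
logit = -6 + 0.15 * slope + 1.2 * scree - 0.8 * forest
observed_source = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # synthetic inventory

X = np.column_stack([slope, scree, forest])
model = LogisticRegression(max_iter=1000).fit(X, observed_source)
susceptibility = model.predict_proba(X)[:, 1]
print("AUC against the (synthetic) inventory:",
      round(roc_auc_score(observed_source, susceptibility), 3))
```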
DOA-informed source extraction in the presence of competing talkers and background noise
NASA Astrophysics Data System (ADS)
Taseska, Maja; Habets, Emanuël A. P.
2017-12-01
A desired speech signal in hands-free communication systems is often degraded by noise and interfering speech. Even though the number and locations of the interferers are often unknown in practice, it is justified to assume in certain applications that the direction-of-arrival (DOA) of the desired source is approximately known. Using the known DOA, fixed spatial filters such as the delay-and-sum beamformer can be steered to extract the desired source. However, it is well-known that fixed data-independent spatial filters do not provide sufficient reduction of directional interferers. Instead, the DOA information can be used to estimate the statistics of the desired and the undesired signals and to compute optimal data-dependent spatial filters. One way the DOA is exploited for optimal spatial filtering in the literature is by designing DOA-based narrowband detectors to determine whether a desired or an undesired signal is dominant at each time-frequency (TF) bin. Subsequently, the statistics of the desired and the undesired signals can be estimated during the TF bins where the respective signal is dominant. In a similar manner, a Gaussian signal model-based detector which does not incorporate DOA information has been used in scenarios where the undesired signal consists of stationary background noise. However, when the undesired signal is non-stationary, resulting for example from interfering speakers, such a Gaussian signal model-based detector is unable to robustly distinguish desired from undesired speech. To this end, we propose a DOA model-based detector to determine the dominant source at each TF bin and estimate the desired and undesired signal statistics. We demonstrate that data-dependent spatial filters that use the statistics estimated by the proposed framework achieve very good undesired signal reduction, even when using only three microphones.
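For reference, the sketch below builds the fixed, DOA-steered filter mentioned above: a narrowband delay-and-sum beamformer for a uniform linear array. The array geometry (3 microphones, 5 cm spacing) and the 1 kHz analysis frequency are illustrative assumptions, not parameters from the paper.

```python
# Sketch: narrowband delay-and-sum beamformer steered to a known DOA.
import numpy as np

def steering_vector(doa_deg, n_mics=3, spacing=0.05, freq=1000.0, c=343.0):
    """Far-field steering vector for a uniform linear array."""
    m = np.arange(n_mics)
    delays = m * spacing * np.cos(np.deg2rad(doa_deg)) / c
    return np.exp(-2j * np.pi * freq * delays)

def delay_and_sum_weights(doa_deg, **kwargs):
    d = steering_vector(doa_deg, **kwargs)
    return d / d.conj().dot(d)          # unit gain towards the desired DOA

w = delay_and_sum_weights(60.0)
# Response towards the desired source vs. an interferer at 120 degrees
print(abs(w.conj().dot(steering_vector(60.0))))    # ~1.0 (distortionless)
print(abs(w.conj().dot(steering_vector(120.0))))   # < 1, but not strongly suppressed
```

The weak suppression of the interferer in this three-microphone example is exactly the limitation of fixed filters that motivates the data-dependent approach described in the abstract.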
Modeling the space debris environment with MASTER-2009 and ORDEM2010
NASA Astrophysics Data System (ADS)
Flegel, Sven Kevin; Krisko, Paula; Gelhaus, Johannes; Wiedemann, Carsten; Moeckel, Marek; Krag, Holger; Klinkrad, Heiner; Xu, Yu-Lin; Horstman, Matthew; Matney, Mark; Vörsmann, Peter
The two software tools MASTER-2009 and ORDEM2010 are the ESA and NASA reference software tools, respectively, that describe the Earth's debris environment. The primary goal of both programs is to allow users to estimate the object flux onto a target object for mission planning. The current paper describes the basic distinctions in the model philosophies. At the core of each model lies the method by which the object environment is established. Central to this process is the role played by the results from radar/telescope observations or impact fluxes on surfaces returned from Earth orbit. The ESA Meteoroid and Space Debris Terrestrial Environment Reference Model (MASTER) is engineered to give a realistic description of the natural and the man-made particulate environment of the Earth. Debris sources are simulated based on detailed lists of known historical events such as fragmentations or solid rocket motor firings or through simulation of secondary debris such as impact ejecta or the release of paint flakes from degrading spacecraft surfaces. The resulting population is then validated against historical telescope/radar campaigns using the ESA Program for Radar and Optical Observation Forecasting (PROOF) and against object impact fluxes on surfaces returned from space. The NASA Orbital Debris Engineering Model (ORDEM) series is designed to provide reliable estimates of orbital debris flux on spacecraft and through telescope or radar fields-of-view. Central to the model series is the empirical nature of the input populations. These are derived from NASA orbital debris modeling but verified, where possible, with measurement data from various sources. The latest version of the series, ORDEM2010, compiles over two decades of data from NASA radar systems, telescopes, in-situ sources, and ground tests that are analyzed by statistical methods. For increased understanding of the application ranges of the two programs, the current paper provides an overview of the two models' main program features and the methods by which simulation results are presented. This paper is written as a combined effort by ESA and NASA.
A Review of Meta-Analysis Packages in R
ERIC Educational Resources Information Center
Polanin, Joshua R.; Hennessy, Emily A.; Tanner-Smith, Emily E.
2017-01-01
Meta-analysis is a statistical technique that allows an analyst to synthesize effect sizes from multiple primary studies. To estimate meta-analysis models, the open-source statistical environment R is quickly becoming a popular choice. The meta-analytic community has contributed to this growth by developing numerous packages specific to…
Modeling Statistical Insensitivity: Sources of Suboptimal Behavior
ERIC Educational Resources Information Center
Gagliardi, Annie; Feldman, Naomi H.; Lidz, Jeffrey
2017-01-01
Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the…
NASA Astrophysics Data System (ADS)
Ehsan, Muhammad Azhar; Tippett, Michael K.; Almazroui, Mansour; Ismail, Muhammad; Yousef, Ahmed; Kucharski, Fred; Omar, Mohamed; Hussein, Mahmoud; Alkhalaf, Abdulrahman A.
2017-05-01
Northern Hemisphere winter precipitation reforecasts from the European Centre for Medium-Range Weather Forecasts System-4 and six of the models in the North American Multi-Model Ensemble are evaluated, focusing on two regions (Region-A: 20°N-45°N, 10°E-65°E and Region-B: 20°N-55°N, 205°E-255°E) where winter precipitation is a dominant fraction of the annual total and where precipitation from mid-latitude storms is important. Predictability and skill (deterministic and probabilistic) are assessed for 1983-2013 by the multimodel composite (MME) of seven prediction models. The MME climatological mean and variability over the two regions are comparable to observations, with some regional differences. The statistically significant decreasing trend observed in Region-B precipitation is captured well by the MME and most of the individual models. El Niño Southern Oscillation is a source of forecast skill, and the correlation coefficient between the Niño3.4 index and precipitation over Regions A and B is 0.46 and 0.35, respectively, both statistically significant at the 95% level. The MME reforecasts weakly reproduce the observed teleconnection. Signal, noise and signal-to-noise ratio analyses show that the signal variance over the two regions is very small compared to the noise variance, which tends to reduce the prediction skill. The MME ranked probability skill score is higher than that of individual models, showing the advantage of a multimodel ensemble. Observed Region-A rainfall anomalies are strongly associated with the North Atlantic Oscillation, but none of the models reproduce this relation, which may explain the low skill over Region-A. The superior quality of the multimodel ensemble compared with individual models is mainly due to its larger ensemble size.
Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.
Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W
2018-05-18
Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.
NASA Astrophysics Data System (ADS)
El Naqa, I.; Suneja, G.; Lindsay, P. E.; Hope, A. J.; Alaly, J. R.; Vicic, M.; Bradley, J. D.; Apte, A.; Deasy, J. O.
2006-11-01
Radiotherapy treatment outcome models are a complicated function of treatment, clinical and biological factors. Our objective is to provide clinicians and scientists with an accurate, flexible and user-friendly software tool to explore radiotherapy outcomes data and build statistical tumour control or normal tissue complication models. The software tool, called the dose response explorer system (DREES), is based on Matlab, and uses a named-field structure array data type. DREES/Matlab in combination with another open-source tool (CERR) provides an environment for analysing treatment outcomes. DREES provides many radiotherapy outcome modelling features, including (1) fitting of analytical normal tissue complication probability (NTCP) and tumour control probability (TCP) models, (2) combined modelling of multiple dose-volume variables (e.g., mean dose, max dose, etc.) and clinical factors (age, gender, stage, etc.) using multi-term regression modelling, (3) manual or automated selection of logistic or actuarial model variables using bootstrap statistical resampling, (4) estimation of uncertainty in model parameters, (5) performance assessment of univariate and multivariate analyses using Spearman's rank correlation and chi-square statistics, boxplots, nomograms, Kaplan-Meier survival plots, and receiver operating characteristic curves, and (6) graphical capabilities to visualize NTCP or TCP prediction versus selected variable models using various plots. DREES provides clinical researchers with a tool customized for radiotherapy outcome modelling. DREES is freely distributed. We expect to continue developing DREES based on user feedback.
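As a rough illustration of one modelling feature listed above, the sketch below fits a logistic NTCP curve, NTCP(D) = 1 / (1 + exp(4*gamma50*(1 - D/D50))), to binary complication data by least squares. The parameterisation, the least-squares fit (rather than a maximum-likelihood logistic or actuarial fit) and the synthetic cohort are illustrative assumptions, not DREES internals.

```python
# Sketch: least-squares fit of a logistic NTCP model to synthetic outcome data.
import numpy as np
from scipy.optimize import curve_fit

def ntcp_logistic(dose, d50, gamma50):
    return 1.0 / (1.0 + np.exp(4.0 * gamma50 * (1.0 - dose / d50)))

rng = np.random.default_rng(8)
mean_lung_dose = rng.uniform(5, 35, 250)                  # Gy, synthetic cohort
p_true = ntcp_logistic(mean_lung_dose, d50=22.0, gamma50=1.0)
complication = rng.binomial(1, p_true)                    # observed binary outcomes

popt, pcov = curve_fit(ntcp_logistic, mean_lung_dose, complication, p0=[20.0, 1.0])
d50_hat, gamma50_hat = popt
print(f"D50 = {d50_hat:.1f} Gy, gamma50 = {gamma50_hat:.2f}")
print("Parameter std errors:", np.sqrt(np.diag(pcov)))
```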
Open Source Tools for Seismicity Analysis
NASA Astrophysics Data System (ADS)
Powers, P.
2010-12-01
The spatio-temporal analysis of seismicity plays an important role in earthquake forecasting and is integral to research on earthquake interactions and triggering. For instance, the third version of the Uniform California Earthquake Rupture Forecast (UCERF), currently under development, will use Epidemic Type Aftershock Sequences (ETAS) as a model for earthquake triggering. UCERF will be a "living" model and therefore requires robust, tested, and well-documented ETAS algorithms to ensure transparency and reproducibility. Likewise, as earthquake aftershock sequences unfold, real-time access to high-quality hypocenter data makes it possible to monitor the temporal variability of statistical properties such as the parameters of the Omori Law and the Gutenberg-Richter b-value. Such statistical properties are valuable as they provide a measure of how much a particular sequence deviates from expected behavior and can be used when assigning probabilities of aftershock occurrence. To address these demands and provide public access to standard methods employed in statistical seismology, we present well-documented, open-source JavaScript and Java software libraries for the on- and off-line analysis of seismicity. The JavaScript classes facilitate web-based asynchronous access to earthquake catalog data and provide a framework for in-browser display, analysis, and manipulation of catalog statistics; implementations of this framework will be made available on the USGS Earthquake Hazards website. The Java classes, in addition to providing tools for seismicity analysis, provide tools for modeling seismicity and generating synthetic catalogs. These tools are extensible and will be released as part of the open-source OpenSHA Commons library.
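The libraries themselves are JavaScript and Java; purely for illustration, the Python sketch below computes one of the statistics mentioned above, the Gutenberg-Richter b-value, with Aki's maximum-likelihood formula on a synthetic catalogue. The completeness magnitude Mc is assumed known; real workflows estimate Mc first and correct for magnitude binning.

```python
# Sketch: maximum-likelihood b-value (Aki, 1965) from synthetic magnitudes.
import numpy as np

def b_value_aki(magnitudes, mc):
    """ML estimator for continuous magnitudes above completeness mc; binned
    catalogues usually subtract half the binning width from mc as well."""
    m = np.asarray(magnitudes)
    m = m[m >= mc]
    return np.log10(np.e) / (m.mean() - mc)

rng = np.random.default_rng(9)
mc, b_true = 2.0, 1.0
# Exponential excess magnitudes above Mc correspond to a G-R law with slope b_true
mags = mc + rng.exponential(scale=np.log10(np.e) / b_true, size=5000)
print(round(b_value_aki(mags, mc), 3))   # should recover ~1.0
```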
NASA Astrophysics Data System (ADS)
Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Argüeso, F.; Arnaud, M.; Ashdown, M.; Atrio-Barandela, F.; Aumont, J.; Baccigalupi, C.; Balbi, A.; Banday, A. J.; Barreiro, R. B.; Battaner, E.; Benabed, K.; Benoît, A.; Bernard, J.-P.; Bersanelli, M.; Bethermin, M.; Bhatia, R.; Bonaldi, A.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Burigana, C.; Cabella, P.; Cardoso, J.-F.; Catalano, A.; Cayón, L.; Chamballu, A.; Chary, R.-R.; Chen, X.; Chiang, L.-Y.; Christensen, P. R.; Clements, D. L.; Colafrancesco, S.; Colombi, S.; Colombo, L. P. L.; Coulais, A.; Crill, B. P.; Cuttaia, F.; Danese, L.; Davis, R. J.; de Bernardis, P.; de Gasperis, G.; de Zotti, G.; Delabrouille, J.; Dickinson, C.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Dörl, U.; Douspis, M.; Dupac, X.; Efstathiou, G.; Enßlin, T. A.; Eriksen, H. K.; Finelli, F.; Forni, O.; Fosalba, P.; Frailis, M.; Franceschi, E.; Galeotta, S.; Ganga, K.; Giard, M.; Giardino, G.; Giraud-Héraud, Y.; González-Nuevo, J.; Górski, K. M.; Gregorio, A.; Gruppuso, A.; Hansen, F. K.; Harrison, D.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Jaffe, T. R.; Jaffe, A. H.; Jagemann, T.; Jones, W. C.; Juvela, M.; Keihänen, E.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Knox, L.; Kunz, M.; Kurinsky, N.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lawrence, C. R.; Leonardi, R.; Lilje, P. B.; López-Caniego, M.; Macías-Pérez, J. F.; Maino, D.; Mandolesi, N.; Maris, M.; Marshall, D. J.; Martínez-González, E.; Masi, S.; Massardi, M.; Matarrese, S.; Mazzotta, P.; Melchiorri, A.; Mendes, L.; Mennella, A.; Mitra, S.; Miville-Deschènes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Osborne, S.; Pajot, F.; Paladini, R.; Paoletti, D.; Partridge, B.; Pasian, F.; Patanchon, G.; Perdereau, O.; Perotto, L.; Perrotta, F.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Ponthieu, N.; Popa, L.; Poutanen, T.; Pratt, G. W.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Reach, W. T.; Rebolo, R.; Reinecke, M.; Renault, C.; Ricciardi, S.; Riller, T.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rowan-Robinson, M.; Rubiño-Martín, J. A.; Rusholme, B.; Sajina, A.; Sandri, M.; Savini, G.; Scott, D.; Smoot, G. F.; Starck, J.-L.; Sudiwala, R.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Türler, M.; Valenziano, L.; Van Tent, B.; Vielva, P.; Villa, F.; Vittorio, N.; Wade, L. A.; Wandelt, B. D.; White, M.; Yvon, D.; Zacchei, A.; Zonca, A.
2013-02-01
We make use of the Planck all-sky survey to derive number counts and spectral indices of extragalactic sources - infrared and radio sources - from the Planck Early Release Compact Source Catalogue (ERCSC) at 100 to 857 GHz (3 mm to 350 μm). Three zones (deep, medium and shallow) of approximately homogeneous coverage are used to permit a clean and controlled correction for incompleteness, which was explicitly not done for the ERCSC, as it was aimed at providing lists of sources to be followed up. Our sample, prior to the 80% completeness cut, contains between 217 sources at 100 GHz and 1058 sources at 857 GHz over about 12 800 to 16 550 deg2 (31 to 40% of the sky). After the 80% completeness cut, between 122 and 452 sources remain, with flux densities above 0.3 and 1.9 Jy at 100 and 857 GHz, respectively. The sample so defined can be used for statistical analysis. Using the multi-frequency coverage of the Planck High Frequency Instrument, all the sources have been classified as either dust-dominated (infrared galaxies) or synchrotron-dominated (radio galaxies) on the basis of their spectral energy distributions (SED). Our sample is thus complete, flux-limited and color-selected to differentiate between the two populations. We find an approximately equal number of synchrotron and dusty sources between 217 and 353 GHz; at 353 GHz or higher (or 217 GHz and lower) frequencies, the number is dominated by dusty (synchrotron) sources, as expected. For most of the sources, the spectral indices are also derived. We provide for the first time counts of bright sources from 353 to 857 GHz and the contributions from dusty and synchrotron sources at all HFI frequencies in the key spectral range where these spectra are crossing. The observed counts are in the Euclidean regime. The number counts are compared to previously published data (from earlier Planck results, Herschel, BLAST, SCUBA, LABOCA, SPT, and ACT) and models taking into account both radio and infrared galaxies, and covering a large range of flux densities. We derive the multi-frequency Euclidean level - the plateau in the normalised differential counts at high flux-density - and compare it to WMAP, Spitzer and IRAS results. The submillimetre number counts are not well reproduced by current evolution models of dusty galaxies, whereas the millimetre part appears reasonably well fitted by the most recent model for synchrotron-dominated sources. Finally we provide estimates of the local luminosity density of dusty galaxies, providing the first such measurements at 545 and 857 GHz. Appendices are available in electronic form at http://www.aanda.org. Corresponding author: herve.dole@ias.u-psud.fr
NASA Astrophysics Data System (ADS)
Song, S. G.
2016-12-01
Simulation-based ground motion prediction approaches have several benefits over empirical ground motion prediction equations (GMPEs). For instance, full 3-component waveforms can be produced and site-specific hazard analysis is also possible. However, it is important to validate them against observed ground motion data to confirm their efficiency and validity before practical use. There have been community efforts for these purposes, which are supported by the Broadband Platform (BBP) project at the Southern California Earthquake Center (SCEC). In simulation-based ground motion prediction approaches, preparing a plausible range of scenario rupture models is a critical element. I developed a pseudo-dynamic source model for Mw 6.5-7.0 by analyzing a number of dynamic rupture models, based on 1-point and 2-point statistics of earthquake source parameters (Song et al. 2014; Song 2016). In this study, the developed pseudo-dynamic source models were tested against observed ground motion data at the SCEC BBP, Ver 16.5. The validation was performed in two stages. In the first stage, simulated ground motions were validated against observed ground motion data for past events such as the 1992 Landers and 1994 Northridge, California, earthquakes. In the second stage, they were validated against the latest version of empirical GMPEs, i.e., NGA-West2. The validation results show that the simulated ground motions produce ground motion intensities compatible with observed ground motion data at both stages. The compatibility of the pseudo-dynamic source models with the omega-square spectral decay and the standard deviation of the simulated ground motion intensities are also discussed in the study.
NASA Astrophysics Data System (ADS)
Pires, Carlos; Ribeiro, Andreia
2016-04-01
An efficient nonlinear method of statistical source separation of spatially distributed, non-Gaussian data is proposed. The method relies on the so-called Independent Subspace Analysis (ISA) and is tested on a long time series of the stream-function field of an atmospheric quasi-geostrophic 3-level model (QG3) simulating the wintertime monthly variability of the Northern Hemisphere. ISA generalizes Independent Component Analysis (ICA) by looking for multidimensional, minimally dependent, uncorrelated and non-Gaussian statistical sources among the rotated projections or subspaces of the multivariate probability distribution of the leading principal components of the working field, whereas ICA is restricted to scalar sources. The rationale of the technique relies upon projection pursuit, looking for data projections of enhanced interest. In order to accomplish the decomposition, we maximize measures of the sources' non-Gaussianity through contrast functions given by squares of nonlinear, cross-cumulant-based correlations involving the variables spanning the sources. Sources are therefore sought that match certain nonlinear data structures. The maximized contrast function is built in such a way that it minimizes the mean square of the residuals of certain nonlinear regressions. The resulting residuals, after spherization, provide a new set of nonlinear variable changes that are at once uncorrelated, quasi-independent and quasi-Gaussian, which is an advantage with respect to the Independent Components (scalar sources) obtained by ICA, where the non-Gaussianity is concentrated in the non-Gaussian scalar sources. The new scalar sources obtained by this process encompass the attractor's curvature, thus providing improved nonlinear model indices of the low-frequency atmospheric variability, which is useful since large-scale circulation indices are nonlinearly correlated. The tested non-Gaussian sources (dyads and triads, of two and three dimensions respectively) lead to a dense data concentration along certain curves or surfaces, near which the cluster centroids of the joint probability density function tend to be located. That favors a better splitting of the QG3 atmospheric model's weather regimes: the positive and negative phases of the Arctic Oscillation and the positive and negative phases of the North Atlantic Oscillation. The model's leading non-Gaussian dyad is associated with a positive correlation between 1) the squared anomaly of the extratropical jet stream and 2) the meridional jet-stream meandering. Triadic sources coming from maximized third-order cross-cumulants between pairwise uncorrelated components reveal situations of triadic wave resonance and nonlinear triadic teleconnections, only possible thanks to joint non-Gaussianity. Such triadic synergies are accounted for by an information-theoretic measure: the interaction information. The model's dominant triad occurs between anomalies of 1) the pressure at the North Pole, 2) the jet-stream intensity at the eastern North American boundary and 3) the jet-stream intensity at the eastern Asian boundary. Publication supported by project FCT UID/GEO/50019/2013 - Instituto Dom Luiz.
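A toy illustration of the ingredient behind the triadic contrast described above is given below: the third-order cross-cumulant E[xyz] of three zero-mean, pairwise-uncorrelated variables. The synthetic triad is constructed so that no pair is correlated, yet the triple carries a purely joint (non-Gaussian) dependence; it is not the QG3 decomposition itself.

```python
# Sketch: a purely triadic dependence detected by a third-order cross-cumulant.
import numpy as np

rng = np.random.default_rng(10)
n = 200_000
a = rng.choice([-1.0, 1.0], size=n)
b = rng.choice([-1.0, 1.0], size=n)
c = a * b                      # pairwise uncorrelated with a and b, but jointly dependent

def cross_cumulant3(x, y, z):
    """Third-order cross-cumulant of (approximately) zero-mean variables: E[xyz]."""
    return np.mean((x - x.mean()) * (y - y.mean()) * (z - z.mean()))

print(np.corrcoef([a, b, c]))          # off-diagonal correlations ~ 0
print(cross_cumulant3(a, b, c))        # ~ 1: dependence visible only at third order
```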
Ghirardi-Rimini-Weber model with massive flashes
NASA Astrophysics Data System (ADS)
Tilloy, Antoine
2018-01-01
I introduce a modification of the Ghirardi-Rimini-Weber (GRW) model in which the flashes (or space-time collapse events) source a classical gravitational field. The resulting semiclassical theory of Newtonian gravity preserves the statistical interpretation of quantum states of matter in contrast with mean field approaches. It can be seen as a discrete version of recent proposals of consistent hybrid quantum classical theories. The model is in agreement with known experimental data and introduces new falsifiable predictions: (1) single particles do not self-interact, (2) the 1/r gravitational potential of Newtonian gravity is cut off at short (≲ 10^-7 m) distances, and (3) gravity makes spatial superpositions decohere at a rate inversely proportional to that coming from the vanilla GRW model. Together, the last two predictions make the model experimentally falsifiable for all values of its parameters.
Angular Baryon Acoustic Oscillation measure at z=2.225 from the SDSS quasar survey
NASA Astrophysics Data System (ADS)
de Carvalho, E.; Bernui, A.; Carvalho, G. C.; Novaes, C. P.; Xavier, H. S.
2018-04-01
Following a quasi model-independent approach we measure the transversal BAO mode at high redshift using the two-point angular correlation function (2PACF). The analyses done here are only possible now with the quasar catalogue from the twelfth data release (DR12Q) from the Sloan Digital Sky Survey, because it is spatially dense enough to allow the measurement of the angular BAO signature with moderate statistical significance and acceptable precision. Our analyses with quasars in the redshift interval z in [2.20,2.25] produce the angular BAO scale θBAO = 1.77° ± 0.31° with a statistical significance of 2.12 σ (i.e., 97% confidence level), calculated through a likelihood analysis performed using the theoretical covariance matrix sourced by the analytical power spectra expected in the ΛCDM concordance model. Additionally, we show that the BAO signal is robust—although with less statistical significance—under diverse bin-size choices and under small displacements of the quasars' angular coordinates. Finally, we also performed cosmological parameter analyses comparing the θBAO predictions for wCDM and w(a)CDM models with angular BAO data available in the literature, including the measurement obtained here, jointly with CMB data. The constraints on the parameters ΩM, w0 and wa are in excellent agreement with the ΛCDM concordance model.
Kenney, Terry A.; Gerner, Steven J.; Buto, Susan G.; Spangler, Lawrence E.
2009-01-01
The Upper Colorado River Basin (UCRB) discharges more than 6 million tons of dissolved solids annually, about 40 to 45 percent of which are attributed to agricultural activities. The U.S. Department of the Interior estimates economic damages related to salinity in excess of $330 million annually in the Colorado River Basin. Salinity in the UCRB, as measured by dissolved-solids load and concentration, has been studied extensively during the past century. Over this period, a solid conceptual understanding of the sources and transport mechanisms of dissolved solids in the basin has been developed. This conceptual understanding was incorporated into the U.S. Geological Survey Spatially Referenced Regressions on Watershed Attributes (SPARROW) surface-water quality model to examine statistically the dissolved-solids supply and transport within the UCRB. Geologic and agricultural sources of dissolved solids in the UCRB were defined and represented in the model. On the basis of climatic and hydrologic conditions along with data availability, water year 1991 was selected for examination with SPARROW. Dissolved-solids loads for 218 monitoring sites were used to calibrate a dissolved-solids SPARROW model for the UCRB. The calibrated model generally captures the transport mechanisms that deliver dissolved solids to streams of the UCRB as evidenced by R2 and yield R2 values of 0.98 and 0.71, respectively. Model prediction error is approximated at 51 percent. Model results indicate that of the seven geologic source groups, the high-yield sedimentary Mesozoic rocks have the largest yield of dissolved solids, about 41.9 tons per square mile (tons/mi2). Irrigated sedimentary-clastic Mesozoic lands have an estimated yield of 1,180 tons/mi2, and irrigated sedimentary-clastic Tertiary lands have an estimated yield of 662 tons/mi2. Coefficients estimated for the seven landscape transport characteristics seem to agree well with the conceptual understanding of the role they play in the delivery of dissolved solids to streams in the UCRB. Predictions of dissolved-solids loads were generated for more than 10,000 stream reaches of the stream network defined in the UCRB. From these estimates, the downstream accumulation of dissolved solids, including natural and agricultural components, were examined in selected rivers. Contributions from each of the 11 dissolved-solids sources were also examined at select locations in the Grand, Green, and San Juan Divisions of the UCRB. At the downstream boundary of the UCRB, the Colorado River at Lees Ferry, Arizona, monitoring site, the dissolved-solids contribution of irrigated agricultural lands and natural sources were about 45 and 57 percent, respectively. Finally, model predictions, including the contributions of natural and agricultural sources for selected locations in the UCRB, were compared with results from two previous studies.
Statistics of initial density perturbations in heavy ion collisions and their fluid dynamic response
NASA Astrophysics Data System (ADS)
Floerchinger, Stefan; Wiedemann, Urs Achim
2014-08-01
An interesting opportunity to determine thermodynamic and transport properties in more detail is to identify generic statistical properties of initial density perturbations. Here we study event-by-event fluctuations in terms of correlation functions for two models that can be solved analytically. The first assumes Gaussian fluctuations around a distribution that is fixed by the collision geometry but leads to non-Gaussian features after averaging over the reaction plane orientation at non-zero impact parameter. In this context, we derive a three-parameter extension of the commonly used Bessel-Gaussian event-by-event distribution of harmonic flow coefficients. Secondly, we study a model of N independent point sources for which connected n-point correlation functions of initial perturbations scale like 1/N^(n-1). This scaling is violated for non-central collisions in a way that can be characterized by its impact parameter dependence. We discuss to what extent these are generic properties that can be expected to hold for any model of initial conditions, and how this can improve the fluid dynamical analysis of heavy ion collisions.
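The 1/N^(n-1) scaling for N independent point sources can be checked numerically. The sketch below is a hypothetical toy model, not taken from the paper: each event places N sources uniformly on a one-dimensional transverse coordinate, bin counts follow a multinomial distribution, and the normalized connected two-point correlation between two disjoint bins should fall off roughly as 1/N.

```python
import numpy as np

rng = np.random.default_rng(0)

def connected_two_point(n_sources, n_events=50000, n_bins=10):
    """Normalized connected two-point correlation <n1 n2>/(<n1><n2>) - 1
    between two disjoint bins, for n_sources independent uniform point sources."""
    positions = rng.random((n_events, n_sources))                    # source positions in [0, 1)
    counts1 = (positions < 1.0 / n_bins).sum(axis=1)                 # counts in bin 1, per event
    counts2 = ((positions >= 1.0 / n_bins) &
               (positions < 2.0 / n_bins)).sum(axis=1)               # counts in bin 2, per event
    return np.mean(counts1 * counts2) / (counts1.mean() * counts2.mean()) - 1.0

for n in (10, 40, 160):
    c2 = connected_two_point(n)
    print(f"N = {n:4d}: connected C2 = {c2:+.4f}, expected ~ -1/N = {-1.0 / n:+.4f}")
```

With enough events the estimate approaches the multinomial result -1/N, illustrating the generic 1/N^(n-1) suppression of connected correlations for independent sources (here for n = 2).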
A Clustered Extragalactic Foreground Model for the EoR
NASA Astrophysics Data System (ADS)
Murray, S. G.; Trott, C. M.; Jordan, C. H.
2018-05-01
We review an improved statistical model of extra-galactic point-source foregrounds first introduced in Murray et al. (2017), in the context of the Epoch of Reionization. This model extends the instrumentally-convolved foreground covariance used in inverse-covariance foreground mitigation schemes, by considering the cosmological clustering of the sources. In this short work, we show that over scales of k ~ 0.6-40 h Mpc^-1, ignoring source clustering is a valid approximation. This is in contrast to Murray et al. (2017), who found a possibility of false detection if the clustering was ignored. The dominant cause for this change is the introduction of a Galactic synchrotron component which shadows the clustering of sources.
NASA Astrophysics Data System (ADS)
Preston, L. A.
2017-12-01
Marine hydrokinetic (MHK) devices offer a clean, renewable alternative energy source for the future. Responsible utilization of MHK devices, however, requires that the effects of acoustic noise produced by these devices on marine life and marine-related human activities be well understood. Paracousti is a 3-D full waveform acoustic modeling suite that can accurately propagate MHK noise signals in the complex bathymetry found in the near-shore to open ocean environment and considers real properties of the seabed, water column, and air-surface interface. However, this is a deterministic simulation that assumes the environment and source are exactly known. In reality, environmental and source characteristics are often only known in a statistical sense. Thus, to fully characterize the expected noise levels within the marine environment, this uncertainty in environmental and source factors should be incorporated into the acoustic simulations. One method is to use Monte Carlo (MC) techniques where simulation results from a large number of deterministic solutions are aggregated to provide statistical properties of the output signal. However, MC methods can be computationally prohibitive since they can require tens of thousands or more simulations to build up an accurate representation of those statistical properties. An alternative method, using the technique of stochastic partial differential equations (SPDE), allows computation of the statistical properties of output signals at a small fraction of the computational cost of MC. We are developing a SPDE solver for the 3-D acoustic wave propagation problem called Paracousti-UQ to help regulators and operators assess the statistical properties of environmental noise produced by MHK devices. In this presentation, we present the SPDE method and compare statistical distributions of simulated acoustic signals in simple models to MC simulations to show the accuracy and efficiency of the SPDE method. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
NASA Astrophysics Data System (ADS)
Kouhpeima, A.; Feiznia, S.; Ahmadi, H.; Hashemi, S. A.; Zareiee, A. R.
2010-09-01
The targeting of sediment management strategies is a key requirement in developing countries including Iran because of the limited resources available. This targeting is, however, hampered by the lack of reliable information on catchment sediment sources. This paper reports the results of using a quantitative composite fingerprinting technique to estimate the relative importance of the primary potential sources within the Amrovan and Royan catchments in Semnan Province, Iran. Fifteen tracers were first selected for tracing and samples were analyzed in the laboratory for these parameters. Statistical methods were applied to the data, including the nonparametric Kruskal-Wallis test and Discriminant Function Analysis (DFA). For the Amrovan catchment, three parameters (N, Cr and Co) were found to be not significant in making the discrimination. The optimum fingerprint, comprising OC, pH, kaolinite and K, was able to distinguish correctly 100% of the source material samples. For the Royan catchment, all of the 15 properties were able to distinguish between the six source types, and the optimum fingerprint provided by stepwise DFA (chlorite, XFD, N and C) correctly classified 92.9% of the source material samples. The mean contributions from each sediment source obtained by the multivariate mixing model varied between the two catchments. For the Amrovan catchment, the Upper Red formation is the main sediment source, supplying approximately 36% of the reservoir sediment, whereas the dominant sediment source for the Royan catchment is the Karaj formation, which supplies 33% of the reservoir sediments. Results indicate that the source fingerprinting approach appears to work well in the study catchments and to generate reliable results.
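The first screening step described above can be sketched in a few lines: apply the Kruskal-Wallis H-test to each candidate tracer across the source groups and keep only the tracers that discriminate at the chosen significance level. The property names and concentrations below are invented for illustration; they are not the Amrovan or Royan data.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)

# Hypothetical tracer concentrations for three source groups (rows = samples, columns = tracers).
sources = {
    "upper_red":  rng.normal(loc=[12.0, 3.0, 0.8], scale=0.5, size=(10, 3)),
    "karaj":      rng.normal(loc=[12.1, 5.0, 0.8], scale=0.5, size=(10, 3)),
    "quaternary": rng.normal(loc=[11.9, 4.0, 1.6], scale=0.5, size=(10, 3)),
}
tracers = ["OC", "K", "Cr"]   # invented property names

selected = []
for j, tracer in enumerate(tracers):
    groups = [conc[:, j] for conc in sources.values()]
    h_stat, p_value = kruskal(*groups)        # Kruskal-Wallis H-test across the source groups
    if p_value < 0.05:                        # keep tracers that discriminate between sources
        selected.append(tracer)
    print(f"{tracer}: H = {h_stat:.2f}, p = {p_value:.4f}")

print("Tracers passing the Kruskal-Wallis screen:", selected)
```

The retained tracers would then enter the stepwise DFA and, finally, the mixing model.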
Hunting statistics: what data for what use? An account of an international workshop
Nichols, J.D.; Lancia, R.A.; Lebreton, J.D.
2001-01-01
Hunting interacts with the underlying dynamics of game species in several different ways and is, at the same time, a source of valuable information not easily obtained from populations that are not subjected to hunting. Specific questions, including the sustainability of hunting activities, can be addressed using hunting statistics. Such investigations will frequently require that hunting statistics be combined with data from other sources of population-level information. Such reflections served as a basis for the meeting, "Hunting Statistics: What Data for What Use?", held on January 15-18, 2001 in Saint-Benoist, France. We review here the 20 talks held during the workshop and the contribution of hunting statistics to our knowledge of the population dynamics of game species. Three specific topics (adaptive management, catch-effort models, and dynamics of exploited populations) were highlighted as important themes and are more extensively presented as boxes.
NASA Astrophysics Data System (ADS)
Zhang, Yi; Zhao, Yanxia; Wang, Chunyi; Chen, Sining
2017-11-01
Assessing the impact of climate change on crop production while accounting for uncertainties is essential for properly identifying sustainable agricultural practices and supporting decision-making. In this study, we employed 24 climate projections consisting of the combinations of eight GCMs and three emission scenarios, representing climate projection uncertainty, and two crop statistical models with 100 sets of parameters in each model, representing parameter uncertainty within the crop models. The goal of this study was to evaluate the impact of climate change on maize (Zea mays L.) yield at three locations (Benxi, Changling, and Hailun) across Northeast China (NEC) in the periods 2010-2039 and 2040-2069, taking 1976-2005 as the baseline period. The multi-model ensemble method is an effective way to deal with these uncertainties. The results of ensemble simulations showed that maize yield reductions were less than 5% in both future periods relative to the baseline. To further understand the contributions of individual sources of uncertainty, such as climate projections and crop model parameters, in ensemble yield simulations, variance decomposition was performed. The results indicated that the uncertainty from climate projections was much larger than that contributed by crop model parameters. Increased ensemble yield variance revealed the increasing uncertainty in the yield simulation in the future periods.
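A minimal sketch of the kind of variance decomposition described here partitions the variance of simulated yields into a climate-projection component, a crop-parameter component, and a residual using group means. All numbers below are invented for illustration; only the ensemble dimensions (24 projections, 100 parameter sets) follow the study design.

```python
import numpy as np

rng = np.random.default_rng(2)
n_proj, n_par = 24, 100                                  # climate projections x parameter sets

# Hypothetical simulated yield anomalies (%) for every projection/parameter combination.
proj_effect = rng.normal(0.0, 4.0, size=(n_proj, 1))     # spread due to climate projections
par_effect = rng.normal(0.0, 1.5, size=(1, n_par))       # spread due to crop-model parameters
yields = -3.0 + proj_effect + par_effect + rng.normal(0.0, 0.5, size=(n_proj, n_par))

var_total = yields.var()
var_proj = yields.mean(axis=1).var()    # variance of projection means (climate uncertainty)
var_par = yields.mean(axis=0).var()     # variance of parameter-set means (crop-model uncertainty)
var_resid = var_total - var_proj - var_par

for name, v in [("climate projections", var_proj),
                ("crop parameters", var_par),
                ("interaction/residual", var_resid)]:
    print(f"{name:>22s}: {100 * v / var_total:5.1f}% of ensemble yield variance")
```

With a dominant projection effect, the decomposition reproduces the qualitative finding that climate projection uncertainty outweighs crop-parameter uncertainty.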
2002-06-01
fits our actual data. To determine the goodness of fit, statisticians typically use the following four measures: R2 Statistic. The R2 statistic... A mathematical model is developed to better estimate cleanup costs using historical cost data that could be used by the Defense Department prior to placing
Comparing multiple statistical methods for inverse prediction in nuclear forensics applications
Lewis, John R.; Zhang, Adah; Anderson-Cook, Christine Michaela
2017-10-29
Forensic science seeks to predict source characteristics using measured observables. Statistically, this objective can be thought of as an inverse problem where interest is in the unknown source characteristics or factors (X) of some underlying causal model producing the observables or responses (Y = g(X) + error). This paper reviews several statistical methods for use in inverse problems and demonstrates that comparing results from multiple methods can be used to assess predictive capability. Motivation for assessing inverse predictions comes from the desired application to historical and future experiments involving nuclear material production for forensics research, in which inverse predictions, along with an assessment of predictive capability, are desired.
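One way to see why comparing several inverse-prediction methods is informative is to apply two classical approaches to the same calibration data: "classical" calibration, which fits the forward model Y = g(X) + error and inverts it, and "inverse" (reverse) regression, which regresses X directly on Y. The sketch below is a hypothetical linear example with simulated data, not the paper's nuclear-forensics application.

```python
import numpy as np

rng = np.random.default_rng(3)

# Calibration experiment: known source factor x, measured observable y = g(x) + error.
x = np.linspace(0.0, 10.0, 40)
y = 2.0 + 0.8 * x + rng.normal(0.0, 0.4, size=x.size)

# Method 1: classical calibration -- fit the forward model y = b0 + b1*x, then invert it.
b1, b0 = np.polyfit(x, y, 1)

# Method 2: inverse regression -- regress x on y directly.
c1, c0 = np.polyfit(y, x, 1)

y_new = 6.5                                          # new measurement with unknown source factor
classical_estimate = (y_new - b0) / b1
inverse_estimate = c0 + c1 * y_new
print(f"classical calibration estimate: {classical_estimate:.3f}")
print(f"inverse regression estimate:    {inverse_estimate:.3f}")
# Agreement between the methods lends confidence; divergence flags limited predictive capability.
```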
NASA Astrophysics Data System (ADS)
Zhao, Runchen; Ientilucci, Emmett J.
2017-05-01
Hyperspectral remote sensing systems provide spectral data composed of hundreds of narrow spectral bands. Spectral remote sensing systems can be used to identify targets, for example, without physical interaction. Often it is of interest to characterize the spectral variability of targets or objects. The purpose of this paper is to identify and characterize the LWIR spectral variability of targets based on an improved earth observing statistical performance model, known as the Forecasting and Analysis of Spectroradiometric System Performance (FASSP) model. FASSP contains three basic modules: a scene model, a sensor model and a processing model. Instead of using mean surface reflectance only as input to the model, FASSP transfers user-defined statistical characteristics of a scene through the image chain (i.e., from source to sensor). The radiative transfer model, MODTRAN, is used to simulate the radiative transfer based on user-defined atmospheric parameters. To retrieve class emissivity and temperature statistics, or temperature/emissivity separation (TES), a LWIR atmospheric compensation method is necessary. The FASSP model has a method to transform statistics in the visible (i.e., ELM) but currently does not have a LWIR TES algorithm in place. This paper addresses the implementation of such a TES algorithm and its associated transformation of statistics.
Rayleigh scattering in an emitter-nanofiber-coupling system
NASA Astrophysics Data System (ADS)
Tang, Shui-Jing; Gao, Fei; Xu, Da; Li, Yan; Gong, Qihuang; Xiao, Yun-Feng
2017-04-01
Scattering is a general process in both fundamental and applied physics. In this paper, we investigate Rayleigh scattering of a solid-state emitter coupled to a nanofiber, using an S-matrix-like theory in a k-space description. Under this model, both Rayleigh scattering and dipole interaction are studied between a two-level artificial atom embedded in a nanocrystal and fiber modes (guided and radiation modes). It is found that Rayleigh scattering plays a critical role in the transport properties and quantum statistics of photons. On the one hand, Rayleigh scattering produces transparency in the transmitted optical field of the nanofiber, accompanied by changes of atomic phase, population, and frequency shift. On the other hand, the interference between two kinds of scattering fields by Rayleigh scattering and dipole transition modifies the photon statistics (second-order autocorrelation function) of output fields, showing a strong wavelength dependence. This study provides guidance for the solid-state emitter acting as a single-photon source and can be extended to explore the scattering effect in many-body physics.
Texas Academic Library Statistics, 1986.
ERIC Educational Resources Information Center
Texas State Library, Austin. Dept. of Library Development.
This publication is the latest in a series of annual publications which are intended to provide a comprehensive source of statistics on academic libraries in Texas. The report is divided into four sections containing data on four-year public institutions, four-year private institutions, two-year colleges (both public and private), and law schools…
SABRE MULTI-LAB, STATISTICALLY-BASED MICROCOSM STUDY FOR TCE SOURCE ZONE REMEDIATION (ABSTRACT ONLY)
SABRE (source area bioremediation) is a public/private consortium of twelve companies, two government agencies, and three research institutions whose charter is to determine if enhanced anaerobic bioremediation can result in effective and quantifiable treatment of chlorinated sol...
Improved Bayesian Infrasonic Source Localization for regional infrasound
Blom, Philip S.; Marcillo, Omar; Arrowsmith, Stephen J.
2015-10-20
The Bayesian Infrasonic Source Localization (BISL) methodology and the mathematical framework used therein are examined and simplified, providing a generalized method of estimating the source location and time of an infrasonic event. The likelihood function describing an infrasonic detection used in BISL has been redefined to include the von Mises distribution developed in directional statistics and propagation-based, physically derived celerity-range and azimuth deviation models. Frameworks for constructing propagation-based celerity-range and azimuth deviation statistics are presented to demonstrate how stochastic propagation modelling methods can be used to improve the precision and accuracy of the posterior probability density function describing the source localization. Infrasonic signals recorded at a number of arrays in the western United States produced by rocket motor detonations at the Utah Test and Training Range are used to demonstrate the application of the new mathematical framework and to quantify the improvement obtained by using the stochastic propagation modelling methods. Moreover, using propagation-based priors, the spatial and temporal confidence bounds of the source decreased by more than 40 per cent in all cases and by as much as 80 per cent in one case. Further, the accuracy of the estimates remained high, keeping the ground truth within the 99 per cent confidence bounds for all cases.
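A toy version of the likelihood construction described here multiplies, for each detecting array, a von Mises density on the back-azimuth residual with a Gaussian density on the celerity implied by a candidate source location, and evaluates the product on a spatial grid. All station positions, detections, and distribution parameters below are invented, and the origin time is assumed known to keep the sketch two-dimensional.

```python
import numpy as np
from scipy.stats import vonmises, norm

rng = np.random.default_rng(4)

stations = np.array([[0.0, 0.0], [120.0, 10.0], [60.0, 140.0]])   # invented array locations (km)
true_src = np.array([70.0, 60.0])                                  # used only to synthesize data
t_origin = 0.0                                                     # assumed known origin time
celerity_mean, celerity_sd = 0.30, 0.02                            # km/s, stand-in propagation model
kappa = 50.0                                                       # von Mises concentration

def azimuth(from_pt, to_pt):
    d = to_pt - from_pt
    return np.arctan2(d[..., 1], d[..., 0])

ranges = np.linalg.norm(true_src - stations, axis=1)
obs_az = azimuth(stations, true_src) + rng.vonmises(0.0, kappa, size=3)       # noisy back-azimuths
obs_arrival = t_origin + ranges / rng.normal(celerity_mean, celerity_sd, 3)   # noisy arrival times

# Joint detection log-likelihood on a grid of candidate source locations.
xs, ys = np.meshgrid(np.linspace(0, 150, 151), np.linspace(0, 150, 151))
grid = np.stack([xs, ys], axis=-1)
log_like = np.zeros(xs.shape)
for k, st in enumerate(stations):
    r = np.linalg.norm(grid - st, axis=-1) + 1e-6
    implied_celerity = r / (obs_arrival[k] - t_origin)
    log_like += vonmises.logpdf(obs_az[k] - azimuth(st, grid), kappa)
    log_like += norm.logpdf(implied_celerity, celerity_mean, celerity_sd)

best = np.unravel_index(np.argmax(log_like), log_like.shape)
print("maximum-likelihood source estimate (km):", xs[best], ys[best])
print("true source (km):                       ", *true_src)
```

Replacing the fixed celerity and azimuth parameters with statistics derived from propagation modelling is what tightens the posterior in the study above.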
Voth-Gaeddert, Lee E; Stoker, Matthew; Cornell, Devin; Oerther, Daniel B
2018-04-01
Guatemala has the sixth worst stunting rate with 48% of children under five years of age classified as stunted according to World Health Organization standards. This study utilizes two different yet complementary system-analysis approaches to analyze correlations among environmental and demographic variables, environmental enteric dysfunction (EED), and child height-for-age (stunting metric) in Guatemala. Two descriptive models constructed around applicable environmental and demographic factors on child height-for-age and EED were analyzed using Network Analysis (NA) and Structural Equation Modeling (SEM). Data from two populations of children between the age of three months and five years were used. The first population (n = 2103) was drawn from the Food for Peace Baseline Survey conducted by the US Agency for International Development (USAID) in 2012, and the second population (n = 372) was drawn from an independent survey conducted by the San Vicente Health Center in 2016. The results from the NA of the height-for-age model confirmed pathogen exposure, nutrition, and prenatal health as important, and the results from the NA of the EED model confirmed water source, water treatment, and type of sanitation as important. The results from the SEM of the height-for-age model confirmed a statistically significant correlation between child height-for-age and child-mother interaction (-0.092, p = 0.076), while the SEM of the EED model confirmed a statistically significant correlation between EED and type of water treatment (-0.115, p = 0.013). Our approach supports important efforts to understand the complex set of factors associated with child stunting among communities sharing similarities with San Vicente. Copyright © 2018 Elsevier GmbH. All rights reserved.
Adali, Tülay; Levin-Schwartz, Yuri; Calhoun, Vince D.
2015-01-01
Fusion of information from multiple sets of data in order to extract a set of features that are most useful and relevant for the given task is inherent to many problems we deal with today. Since, usually, very little is known about the actual interaction among the datasets, it is highly desirable to minimize the underlying assumptions. This has been the main reason for the growing importance of data-driven methods, and in particular of independent component analysis (ICA) as it provides useful decompositions with a simple generative model and using only the assumption of statistical independence. A recent extension of ICA, independent vector analysis (IVA) generalizes ICA to multiple datasets by exploiting the statistical dependence across the datasets, and hence, as we discuss in this paper, provides an attractive solution to fusion of data from multiple datasets along with ICA. In this paper, we focus on two multivariate solutions for multi-modal data fusion that let multiple modalities fully interact for the estimation of underlying features that jointly report on all modalities. One solution is the Joint ICA model that has found wide application in medical imaging, and the second one is the Transposed IVA model introduced here as a generalization of an approach based on multi-set canonical correlation analysis. In the discussion, we emphasize the role of diversity in the decompositions achieved by these two models, and present their properties and implementation details to enable the user to make informed decisions on the selection of a model along with its associated parameters. Discussions are supported by simulation results to help highlight the main issues in the implementation of these methods. PMID:26525830
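A minimal sketch of the joint ICA idea on synthetic data (not a medical-imaging pipeline, and simplified relative to the neuroimaging convention): features from two modalities measured on the same subjects are concatenated side by side and decomposed with a single ICA, so each estimated component carries a joint loading pattern across both modalities.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(5)
n_subjects, n_feat1, n_feat2 = 200, 50, 30

# Two shared, independent (non-Gaussian) subject-level sources drive both modalities.
sources = rng.laplace(size=(n_subjects, 2))
mix1 = rng.normal(size=(2, n_feat1))            # loading of each source on modality-1 features
mix2 = rng.normal(size=(2, n_feat2))            # loading of each source on modality-2 features
modality1 = sources @ mix1 + 0.1 * rng.normal(size=(n_subjects, n_feat1))
modality2 = sources @ mix2 + 0.1 * rng.normal(size=(n_subjects, n_feat2))

# Joint ICA sketch: concatenate the modalities feature-wise and run a single decomposition.
joint = np.hstack([modality1, modality2])
ica = FastICA(n_components=2, random_state=0)
subject_expressions = ica.fit_transform(joint)   # per-subject expression of each joint component
joint_loadings = ica.mixing_                     # loading of each component on features of both modalities

print("recovered component vs true source correlations (up to sign/permutation):")
print(np.round(np.corrcoef(subject_expressions.T, sources.T)[:2, 2:], 2))
```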
Martin, Jordan S; Suarez, Scott A
2017-08-01
Interest in quantifying consistent among-individual variation in primate behavior, also known as personality, has grown rapidly in recent decades. Although behavioral coding is the most frequently utilized method for assessing primate personality, limitations in current statistical practice prevent researchers from utilizing the full potential of their coding datasets. These limitations include the use of extensive data aggregation, not modeling biologically relevant sources of individual variance during repeatability estimation, not partitioning between-individual (co)variance prior to modeling personality structure, the misuse of principal component analysis, and an over-reliance upon exploratory statistical techniques to compare personality models across populations, species, and data collection methods. In this paper, we propose a statistical framework for primate personality research designed to address these limitations. Our framework synthesizes recently developed mixed-effects modeling approaches for quantifying behavioral variation with an information-theoretic model selection paradigm for confirmatory personality research. After detailing a multi-step analytic procedure for personality assessment and model comparison, we employ this framework to evaluate seven models of personality structure in zoo-housed bonobos (Pan paniscus). We find that differences between sexes, ages, zoos, time of observation, and social group composition contributed to significant behavioral variance. Independently of these factors, however, personality nonetheless accounted for a moderate to high proportion of variance in average behavior across observational periods. A personality structure derived from past rating research receives the strongest support relative to our model set. This model suggests that personality variation across the measured behavioral traits is best described by two correlated but distinct dimensions reflecting individual differences in affiliation and sociability (Agreeableness) as well as activity level, social play, and neophilia toward non-threatening stimuli (Openness). These results underscore the utility of our framework for quantifying personality in primates and facilitating greater integration between the behavioral ecological and comparative psychological approaches to personality research. © 2017 Wiley Periodicals, Inc.
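The repeatability estimation step advocated here can be sketched with a mixed-effects model in which individual identity is a random effect and biologically relevant covariates are fixed effects; adjusted repeatability is then the between-individual variance divided by the sum of between-individual and residual variance. The example below uses invented data and the statsmodels MixedLM interface, not the authors' bonobo dataset or model set.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_ind, n_obs = 40, 8                                  # individuals x repeated observation periods

ind_effect = rng.normal(0.0, 1.0, n_ind)              # consistent among-individual differences
rows = []
for i in range(n_ind):
    sex = i % 2
    age = rng.uniform(5, 30)
    for _ in range(n_obs):
        behav = 0.5 * sex + 0.03 * age + ind_effect[i] + rng.normal(0.0, 0.8)
        rows.append({"id": i, "sex": sex, "age": age, "grooming_rate": behav})
data = pd.DataFrame(rows)

# Mixed model: fixed effects for sex and age, random intercept for individual identity.
model = smf.mixedlm("grooming_rate ~ sex + age", data, groups=data["id"]).fit()
var_between = model.cov_re.iloc[0, 0]                 # between-individual variance
var_resid = model.scale                               # residual (within-individual) variance
repeatability = var_between / (var_between + var_resid)
print(f"adjusted repeatability: {repeatability:.2f}")
```

Extending the random-effects structure to several behaviors gives the between-individual covariance matrix from which personality structure is then modelled.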
Valdes, Claudia P.; Varma, Hari M.; Kristoffersen, Anna K.; Dragojevic, Tanja; Culver, Joseph P.; Durduran, Turgut
2014-01-01
We introduce a new, non-invasive, diffuse optical technique, speckle contrast optical spectroscopy (SCOS), for probing deep tissue blood flow using the statistical properties of laser speckle contrast and the photon diffusion model for a point source. The feasibility of the method is tested using liquid phantoms which demonstrate that SCOS is capable of measuring the dynamic properties of turbid media non-invasively. We further present an in vivo measurement in a human forearm muscle using SCOS in two modalities: one with the dependence of the speckle contrast on the source-detector separation and another on the exposure time. In doing so, we also introduce crucial corrections to the speckle contrast that account for the variance of the shot and sensor dark noises. PMID:25136500
A comparison of PCA and PMF models for source identification of fugitive methane emissions
NASA Astrophysics Data System (ADS)
Assan, Sabina; Baudic, Alexia; Bsaibes, Sandy; Gros, Valerie; Ciais, Philippe; Staufer, Johannes; Robinson, Rod; Vogel, Felix
2017-04-01
Methane (CH4) is a greenhouse gas with a global warming potential 28-32 times that of carbon dioxide (CO2) over a 100-year period, and even greater on shorter timescales [Etminan et al., 2016; Allen, 2014]. Thus, despite its relatively short lifetime and smaller emission quantities compared to CO2, CH4 emissions contribute approximately 20% of today's anthropogenic greenhouse gas warming [Kirschke et al., 2013]. Major anthropogenic sources include livestock (enteric fermentation), oil and gas production and distribution, landfills, and wastewater emissions [EPA, 2011]. Especially in densely populated areas, multiple CH4 sources can be found in close vicinity. Thus, when measuring CH4 emissions at local scales it is necessary to distinguish between different CH4 source categories to effectively quantify the contribution of each sector and aid the implementation of greenhouse gas reduction strategies. To this end, source apportionment models can be used to aid the interpretation of spatial and temporal patterns in order to identify and characterise emission sources. The focus of this study is to evaluate two common linear receptor models, namely Principal Component Analysis (PCA) and Positive Matrix Factorisation (PMF), for CH4 source apportionment. The statistical models I will present combine continuous in-situ CH4, C2H6 and δ13CH4 measured using a Cavity Ring Down Spectroscopy (CRDS) instrument [Assan et al., 2016] with volatile organic compound (VOC) observations performed using Gas Chromatography (GC) in order to explain the underlying variance of the data. The strengths and weaknesses of both models are identified for data collected in multi-source environments in the vicinity of four different types of sites: an agricultural farm with cattle, a natural gas compressor station, a wastewater treatment plant, and a peri-urban location in the Ile de France region impacted by various sources. To conclude, receptor model results used to separate statistically the different sources from the variability of atmospheric observations are compared with an independent source identification method using stable methane isotopic analysis and simple CH4/VOC ratios. Allen, D. T. (2014). Methane emissions from natural gas production and use: reconciling bottom-up and top-down measurements. Current Opinion in Chemical Engineering, 5, 78-83. Assan, S., Baudic, A., Guemri, A., Ciais, P., Gros, V., and Vogel, F. R.: Characterisation of interferences to in-situ observations of δ13CH4 and C2H6 when using a Cavity Ring Down Spectrometer at industrial sites, Atmos. Meas. Tech. Discuss., doi:10.5194/amt-2016-261, in review, 2016. Etminan, M., G. Myhre, E. J. Highwood and K. P. Shine (2016), Radiative forcing of carbon dioxide, methane, and nitrous oxide: A significant revision of the methane radiative forcing, Geophys. Res. Lett., 43. Kirschke, S., Bousquet, P., Ciais, P., Saunois, M., Canadell, J. G., Dlugokencky, E. et al. (2013). Three decades of global methane sources and sinks. Nature Geoscience, 6(10), 813-823. U.S. Environmental Protection Agency (U.S. EPA). (2011) Global Anthropogenic Emissions of Non-CO2 Greenhouse Gases: 1990-2030. EPA 430-D-11-003.
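For orientation, the two receptor models compared above can be sketched on synthetic data: PCA yields orthogonal components whose loadings may be negative, whereas PMF constrains factor profiles and contributions to be non-negative. In the hypothetical example below, non-negative matrix factorisation (NMF) is used as a simplified stand-in for PMF (real PMF additionally weights each observation by its uncertainty), and the species, profiles and concentrations are invented.

```python
import numpy as np
from sklearn.decomposition import PCA, NMF

rng = np.random.default_rng(7)

# Hypothetical observations of [CH4, C2H6, benzene, toluene] enhancements above background,
# generated as a mixture of a natural-gas-like profile and a cattle-like profile.
profiles = np.array([[1.0, 0.08, 0.001, 0.001],     # gas leak: CH4 with co-emitted ethane
                     [1.0, 0.00, 0.000, 0.004]])    # enteric fermentation: CH4 without ethane
activity = rng.gamma(shape=2.0, scale=1.0, size=(500, 2))     # non-negative source strengths
X = activity @ profiles + 0.01 * rng.random((500, 4))

# PCA: orthogonal components, possibly negative loadings (harder to read as emission profiles).
pca = PCA(n_components=2).fit(X)
print("PCA loadings:\n", np.round(pca.components_, 3))

# PMF proxy: non-negativity is the key constraint distinguishing PMF from PCA here.
nmf = NMF(n_components=2, init="nndsvda", max_iter=2000, random_state=0)
contributions = nmf.fit_transform(X)                 # per-sample source contributions
print("NMF factor profiles:\n", np.round(nmf.components_, 3))
```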
On the adequacy of identified Cole Cole models
NASA Astrophysics Data System (ADS)
Xiang, Jianping; Cheng, Daizhan; Schlindwein, F. S.; Jones, N. B.
2003-06-01
The Cole-Cole model has been widely used to interpret electrical geophysical data. Normally an iterative computer program is used to invert the frequency domain complex impedance data and simple error estimation is obtained from the squared difference of the measured (field) and calculated values over the full frequency range. Recently a new direct inversion algorithm was proposed for the 'optimal' estimation of the Cole-Cole parameters, which differs from existing inversion algorithms in that the estimated parameters are direct solutions of a set of equations without the need for an initial guess for initialisation. This paper first briefly investigates the advantages and disadvantages of the new algorithm compared to the standard Levenberg-Marquardt "ridge regression" algorithm. Then, and more importantly, we address the adequacy of the models resulting from both the "ridge regression" and the new algorithm, using two different statistical tests and we give objective statistical criteria for acceptance or rejection of the estimated models. The first is the standard χ2 technique. The second is a parameter-accuracy based test that uses a joint multi-normal distribution. Numerical results that illustrate the performance of both testing methods are given. The main goals of this paper are (i) to provide the source code for the new "direct inversion" algorithm in Matlab and (ii) to introduce and demonstrate two methods to determine the reliability of a set of data before data processing, i.e., to consider the adequacy of the resulting Cole-Cole model.
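For reference, the Cole-Cole complex impedance model in the form Z(ω) = R0[1 − m(1 − 1/(1 + (iωτ)^c))] can be fitted to frequency-domain data by iterative least squares on the stacked real and imaginary residuals. The sketch below is a generic scipy implementation on synthetic data; it is neither the direct-inversion algorithm nor the Matlab code that the paper provides, and the parameter values are invented.

```python
import numpy as np
from scipy.optimize import least_squares

def cole_cole(params, omega):
    """Cole-Cole impedance: Z = R0 * (1 - m * (1 - 1/(1 + (i*omega*tau)**c)))."""
    r0, m, tau, c = params
    return r0 * (1.0 - m * (1.0 - 1.0 / (1.0 + (1j * omega * tau) ** c)))

def residuals(params, omega, z_obs):
    dz = cole_cole(params, omega) - z_obs
    return np.concatenate([dz.real, dz.imag])        # stack real and imaginary misfits

rng = np.random.default_rng(8)
omega = 2 * np.pi * np.logspace(-2, 3, 40)           # angular frequencies
true = (100.0, 0.4, 0.05, 0.6)                       # R0, chargeability m, tau (s), exponent c
z_obs = cole_cole(true, omega) + rng.normal(0, 0.3, omega.size) \
        + 1j * rng.normal(0, 0.3, omega.size)

fit = least_squares(residuals, x0=(80.0, 0.2, 0.01, 0.5),
                    bounds=([1e-3, 0.0, 1e-6, 0.0], [1e4, 1.0, 1e3, 1.0]),
                    args=(omega, z_obs))
print("estimated (R0, m, tau, c):", np.round(fit.x, 3))
# A chi-squared test on the weighted residuals, as discussed above, can then be used to
# judge whether the fitted Cole-Cole model is statistically adequate.
```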
Limited-information goodness-of-fit testing of diagnostic classification item response models.
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen
2016-11-01
Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics such as Pearson's X2 and the likelihood ratio statistic G2 suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited-information fit statistics such as Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M2 have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M2 statistic to diagnostic classification models. Through a series of simulation studies, we found that M2 is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M2 was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M2, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic X2LD for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The X2LD statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M2 and X2LD statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144). © 2016 The British Psychological Society.
Mr-Moose: An advanced SED-fitting tool for heterogeneous multi-wavelength datasets
NASA Astrophysics Data System (ADS)
Drouart, G.; Falkendal, T.
2018-04-01
We present the public release of Mr-Moose, a fitting procedure that is able to perform multi-wavelength and multi-object spectral energy distribution (SED) fitting in a Bayesian framework. This procedure is able to handle a large variety of cases, from an isolated source to blended multi-component sources from a heterogeneous dataset (i.e. a range of observation sensitivities and spectral/spatial resolutions). Furthermore, Mr-Moose handles upper limits during the fitting process in a continuous way, allowing models to be gradually less probable as upper limits are approached. The aim is to propose a simple-to-use, yet highly-versatile fitting tool for handling increasing source complexity when combining multi-wavelength datasets with fully customisable filter/model databases. The complete control of the user is one advantage, which avoids the traditional problems related to the "black box" effect, where parameter or model tunings are impossible and can lead to overfitting and/or over-interpretation of the results. Also, while a basic knowledge of Python and statistics is required, the code aims to be sufficiently user-friendly for non-experts. We demonstrate the procedure on three cases: two artificially-generated datasets and a previous result from the literature. In particular, the most complex case (inspired by a real source, combining Herschel, ALMA and VLA data) in the context of extragalactic SED fitting makes Mr-Moose a particularly-attractive SED fitting tool when dealing with partially blended sources, without the need for data deconvolution.
MR-MOOSE: an advanced SED-fitting tool for heterogeneous multi-wavelength data sets
NASA Astrophysics Data System (ADS)
Drouart, G.; Falkendal, T.
2018-07-01
We present the public release of MR-MOOSE, a fitting procedure that is able to perform multi-wavelength and multi-object spectral energy distribution (SED) fitting in a Bayesian framework. This procedure is able to handle a large variety of cases, from an isolated source to blended multi-component sources from a heterogeneous data set (i.e. a range of observation sensitivities and spectral/spatial resolutions). Furthermore, MR-MOOSE handles upper limits during the fitting process in a continuous way allowing models to be gradually less probable as upper limits are approached. The aim is to propose a simple-to-use, yet highly versatile fitting tool for handling increasing source complexity when combining multi-wavelength data sets with fully customisable filter/model data bases. The complete control of the user is one advantage, which avoids the traditional problems related to the `black box' effect, where parameter or model tunings are impossible and can lead to overfitting and/or over-interpretation of the results. Also, while a basic knowledge of PYTHON and statistics is required, the code aims to be sufficiently user-friendly for non-experts. We demonstrate the procedure on three cases: two artificially generated data sets and a previous result from the literature. In particular, the most complex case (inspired by a real source, combining Herschel, ALMA, and VLA data) in the context of extragalactic SED fitting makes MR-MOOSE a particularly attractive SED fitting tool when dealing with partially blended sources, without the need for data deconvolution.
Simeonov, V; Massart, D L; Andreev, G; Tsakovski, S
2000-11-01
The paper deals with the application of different statistical methods, such as cluster analysis, principal components analysis (PCA) and partial least squares (PLS) modeling. These approaches are an efficient tool for achieving a better understanding of the contamination of two gulf regions in the Black Sea. As objects of the study, a collection of marine sediment samples from the Varna and Bourgas "hot spots" gulf areas is used. In the present case the use of cluster analysis and PCA makes it possible to separate three zones of the marine environment with different levels of pollution by interpretation of the sediment analysis (Bourgas gulf, Varna gulf and lake buffer zone). Further, the extraction of four latent factors offers a specific interpretation of the possible pollution sources and separates natural from anthropogenic factors, the latter originating from contamination by chemical, oil refinery and steel-work enterprises. Finally, PLS modeling offers a better opportunity for predicting contaminant concentrations from a tracer element (or elements) as compared to the one-dimensional approach of the baseline models. The results of the study are important not only locally, as they allow a quick response in finding solutions and in decision making, but also in a broader sense as a useful environmetric methodology.
Walsh, Daniel P.; Norton, Andrew S.; Storm, Daniel J.; Van Deelen, Timothy R.; Heisy, Dennis M.
2018-01-01
Implicit and explicit use of expert knowledge to inform ecological analyses is becoming increasingly common because it often represents the sole source of information in many circumstances. Thus, there is a need to develop statistical methods that explicitly incorporate expert knowledge, and can successfully leverage this information while properly accounting for associated uncertainty during analysis. Studies of cause-specific mortality provide an example of implicit use of expert knowledge when causes-of-death are uncertain and assigned based on the observer's knowledge of the most likely cause. To explicitly incorporate this use of expert knowledge and the associated uncertainty, we developed a statistical model for estimating cause-specific mortality using a data augmentation approach within a Bayesian hierarchical framework. Specifically, for each mortality event, we elicited the observer's belief of cause-of-death by having them specify the probability that the death was due to each potential cause. These probabilities were then used as prior predictive values within our framework. This hierarchical framework permitted a simple and rigorous estimation method that was easily modified to include covariate effects and regularizing terms. Although applied to survival analysis, this method can be extended to any event-time analysis with multiple event types, for which there is uncertainty regarding the true outcome. We conducted simulations to determine how our framework compared to traditional approaches that use expert knowledge implicitly and assume that cause-of-death is specified accurately. Simulation results supported the inclusion of observer uncertainty in cause-of-death assignment in modeling of cause-specific mortality to improve model performance and inference. Finally, we applied the statistical model we developed and a traditional method to cause-specific survival data for white-tailed deer, and compared results. We demonstrate that model selection results changed between the two approaches, and incorporating observer knowledge in cause-of-death increased the variability associated with parameter estimates when compared to the traditional approach. These differences between the two approaches can impact reported results, and therefore, it is critical to explicitly incorporate expert knowledge in statistical methods to ensure rigorous inference.
NASA Astrophysics Data System (ADS)
Giocoli, Carlo; Moscardini, Lauro; Baldi, Marco; Meneghetti, Massimo; Metcalf, Robert B.
2018-05-01
In this paper, we study the statistical properties of weak lensing peaks in light-cones generated from cosmological simulations. In order to assess the prospects of such observable as a cosmological probe, we consider simulations that include interacting Dark Energy (hereafter DE) models with coupling term between DE and Dark Matter. Cosmological models that produce a larger population of massive clusters have more numerous high signal-to-noise peaks; among models with comparable numbers of clusters those with more concentrated haloes produce more peaks. The most extreme model under investigation shows a difference in peak counts of about 20% with respect to the reference ΛCDM model. We find that peak statistics can be used to distinguish a coupling DE model from a reference one with the same power spectrum normalisation. The differences in the expansion history and the growth rate of structure formation are reflected in their halo counts, non-linear scale features and, through them, in the properties of the lensing peaks. For a source redshift distribution consistent with the expectations of future space-based wide field surveys, we find that typically seventy percent of the cluster population contributes to weak-lensing peaks with signal-to-noise ratios larger than two, and that the fraction of clusters in peaks approaches one-hundred percent for haloes with redshift z ≤ 0.5. Our analysis demonstrates that peak statistics are an important tool for disentangling DE models by accurately tracing the structure formation processes as a function of the cosmic time.
Turbulent mass inhomogeneities induced by a point-source
NASA Astrophysics Data System (ADS)
Thalabard, Simon
2018-03-01
We describe how turbulence distributes tracers away from a localized source of injection, and analyze how the spatial inhomogeneities of the concentration field depend on the amount of randomness in the injection mechanism. For that purpose, we contrast the mass correlations induced by purely random injections with those induced by continuous injections in the environment. Using the Kraichnan model of turbulent advection, whereby the underlying velocity field is assumed to be short-correlated in time, we explicitly identify scaling regions for the statistics of the mass contained within a shell of radius r and located at a distance ρ away from the source. The two key parameters are found to be (i) the ratio s^2 between the absolute and the relative timescales of dispersion and (ii) the ratio Λ between the size of the cloud and its distance away from the source. When the injection is random, only the former is relevant, as previously shown by Celani et al (2007 J. Fluid Mech. 583 189–98) in the case of an incompressible fluid. It is argued that the space partition in terms of s^2 and Λ is a robust feature of the injection mechanism itself, which should remain relevant beyond the Kraichnan model. This is for instance the case in a generalized version of the model, where the absolute dispersion is prescribed to be ballistic rather than diffusive.
NASA Technical Reports Server (NTRS)
Cerniglia, M. C.; Douglass, A. R.; Rood, R. B.; Sparling, L. C..; Nielsen, J. E.
1999-01-01
We present a study of the distribution of ozone in the lowermost stratosphere with the goal of understanding the relative contribution to the observations of air of either distinctly tropospheric or stratospheric origin. The air in the lowermost stratosphere is divided into two population groups based on Ertel's potential vorticity at 300 hPa. High [low] potential vorticity at 300 hPa suggests that the tropopause is low [high], and the identification of the two groups helps to account for dynamic variability. Conditional probability distribution functions are used to define the statistics of the mix from both observations and model simulations. Two data sources are chosen. First, several years of ozonesonde observations are used to exploit the high vertical resolution. Second, observations made by the Halogen Occultation Experiment [HALOE] on the Upper Atmosphere Research Satellite [UARS] are used to understand the impact on the results of the spatial limitations of the ozonesonde network. The conditional probability distribution functions are calculated at a series of potential temperature surfaces spanning the domain from the midlatitude tropopause to surfaces higher than the mean tropical tropopause [about 380K]. Despite the differences in spatial and temporal sampling, the probability distribution functions are similar for the two data sources. Comparisons with the model demonstrate that the model maintains a mix of air in the lowermost stratosphere similar to the observations. The model also simulates a realistic annual cycle. By using the model, possible mechanisms for the maintenance of mix of air in the lowermost stratosphere are revealed. The relevance of the results to the assessment of the environmental impact of aircraft effluence is discussed.
A MODEL TO EVALUATE PAST EXPOSURE TO 2,3,7,8 ...
Data from several studies suggest that concentrations of dioxins rose in the environment from the 1930s to about the 1960s/70s and have been declining over the last decade or two. The most direct evidence of this trend comes from lake core sediments, which can be used to estimate past atmospheric depositions of dioxins. The primary source of human exposure to dioxins is through the food supply. The pathway relating atmospheric depositions to concentrations in food is quite complex, and accordingly, it is not known to what extent the trend in human exposure mirrors the trend in atmospheric depositions. This paper describes an attempt to statistically reconstruct the pattern of past human exposure to the most toxic dioxin congener, 2,3,7,8-TCDD (abbreviated TCDD), through use of a simple pharmacokinetic (PK) model which included a time-varying TCDD exposure dose. This PK model was fit to TCDD body burden data (i.e., TCDD concentrations in lipid) from five U.S. studies dating from 1972 to 1987 and covering a wide age range. A Bayesian statistical approach was used to fit TCDD exposure; model parameters other than exposure were all previously known or estimated from other data sources. The primary results of the analysis are as follows: 1.) use of a time-varying exposure dose provided a far better fit to the TCDD body burden data than did using a dose that was constant over time; this is strong evidence that exposure to TCDD has, in fact, varied during the
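The pharmacokinetic idea described here can be sketched with a one-compartment model in which the TCDD body burden obeys dB/dt = I(t) − kB, with a time-varying intake I(t) and first-order elimination; integrating it forward for a given exposure history predicts the lipid concentrations that were fitted to the survey data. All parameter values and the intake curve below are invented for illustration; they are not the paper's fitted exposure reconstruction.

```python
import numpy as np

# Hypothetical time-varying intake (pg TCDD/day), rising then declining, loosely mimicking
# the deposition trend described above.
years = np.arange(1930, 1990)
intake = np.interp(years, [1930, 1965, 1989], [10.0, 120.0, 40.0])

half_life_years = 7.5                           # assumed TCDD elimination half-life
k = np.log(2.0) / half_life_years               # first-order elimination rate (1/yr)
lipid_mass_kg = 15.0                            # assumed body lipid mass
days_per_year = 365.0

# Forward Euler integration of dC/dt = intake/lipid_mass - k*C (C in pg/kg lipid).
conc = np.zeros(years.size)
dt = 1.0
for i in range(1, years.size):
    dcdt = intake[i - 1] * days_per_year / lipid_mass_kg - k * conc[i - 1]
    conc[i] = conc[i - 1] + dt * dcdt

print("predicted lipid TCDD concentration (pg/g lipid):")
for yr in (1950, 1970, 1989):
    print(yr, round(conc[years == yr][0] / 1000.0, 1))   # convert pg/kg to pg/g
```

In the Bayesian fit described above, the intake history is the unknown being estimated, with the other PK parameters fixed from independent sources.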
NASA Astrophysics Data System (ADS)
Lin, Hui; Liu, Tianyu; Su, Lin; Bednarz, Bryan; Caracappa, Peter; Xu, X. George
2017-09-01
Monte Carlo (MC) simulation is well recognized as the most accurate method for radiation dose calculations. For radiotherapy applications, accurate modelling of the source term, i.e. the clinical linear accelerator, is critical to the simulation. The purpose of this paper is to perform source modelling and examine the accuracy and performance of the models on Intel Many Integrated Core coprocessors (aka Xeon Phi) and Nvidia GPUs using ARCHER, and to explore potential optimization methods. Phase space-based source modelling has been implemented. Good agreement was found in a tomotherapy prostate patient case and a TrueBeam breast case. In terms of performance, the whole simulation for the prostate plan and the breast plan cost about 173 s and 73 s, respectively, with 1% statistical error.
Chern-Simons Term: Theory and Applications.
NASA Astrophysics Data System (ADS)
Gupta, Kumar Sankar
1992-01-01
We investigate the quantization and applications of Chern-Simons theories to several systems of interest. Elementary canonical methods are employed for the quantization of abelian and nonabelian Chern-Simons actions using ideas from gauge theories and quantum gravity. When the spatial slice is a disc, it yields quantum states at the edge of the disc carrying a representation of the Kac-Moody algebra. We next include sources in this model and their quantum states are shown to be those of a conformal family. Vertex operators for both abelian and nonabelian sources are constructed. The regularized abelian Wilson line is proved to be a vertex operator. The spin-statistics theorem is established for Chern-Simons dynamics using purely geometrical techniques. Chern-Simons action is associated with exotic spin and statistics in 2 + 1 dimensions. We study several systems in which the Chern-Simons action affects the spin and statistics. The first class of systems we study consist of G/H models. The solitons of these models are shown to obey anyonic statistics in the presence of a Chern-Simons term. The second system deals with the effect of the Chern -Simons term in a model for high temperature superconductivity. The coefficient of the Chern-Simons term is shown to be quantized, one of its possible values giving fermionic statistics to the solitons of this model. Finally, we study a system of spinning particles interacting with 2 + 1 gravity, the latter being described by an ISO(2,1) Chern-Simons term. An effective action for the particles is obtained by integrating out the gauge fields. Next we construct operators which exchange the particles. They are shown to satisfy the braid relations. There are ambiguities in the quantization of this system which can be exploited to give anyonic statistics to the particles. We also point out that at the level of the first quantized theory, the usual spin-statistics relation need not apply to these particles.
Coxen, Christopher L.; Frey, Jennifer K.; Carleton, Scott A.; Collins, Daniel P.
2017-01-01
Species distribution models can provide critical baseline distribution information for the conservation of poorly understood species. Here, we compared the performance of band-tailed pigeon (Patagioenas fasciata) species distribution models created using Maxent and derived from two separate presence-only occurrence data sources in New Mexico: 1) satellite-tracked birds and 2) observations reported in the eBird basic data set. Both models had good accuracy (test AUC > 0.8 and True Skill Statistic > 0.4), and high overlap between suitability scores (I statistic 0.786) and suitable habitat patches (relative rank 0.639). Our results suggest that, at the state-wide level, eBird occurrence data can effectively model species distributions similar to those derived from satellite tracking data. Climate change models for the band-tailed pigeon predict a 35% loss in area of suitable climate by 2070 if CO2 emissions drop to 1990 levels by 2100, and a 45% loss by 2070 if we continue current CO2 emission levels through the end of the century. These numbers may be conservative given the predicted increase in drought, wildfire, and forest pest impacts to the coniferous forests the species inhabits in New Mexico. The northern portion of the species' range in New Mexico is predicted to be the most viable through time.
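The accuracy and overlap statistics quoted above can be computed directly from model outputs. The sketch below uses invented suitability predictions to show how AUC, the True Skill Statistic, and Warren's I niche-overlap statistic (one minus half the squared Hellinger distance between normalised suitability surfaces) are typically obtained; it is not the study's evaluation pipeline.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(9)

# Invented test data: 1 = presence, 0 = background, with model suitability scores.
labels = np.r_[np.ones(60), np.zeros(200)]
scores = np.r_[rng.beta(4, 2, 60), rng.beta(2, 4, 200)]

auc = roc_auc_score(labels, scores)

threshold = 0.5                                       # e.g. chosen to maximise sens + spec
pred = scores >= threshold
sens = pred[labels == 1].mean()                       # sensitivity (true positive rate)
spec = (~pred[labels == 0]).mean()                    # specificity (true negative rate)
tss = sens + spec - 1.0                               # True Skill Statistic

# Warren's I overlap between two suitability maps (e.g. satellite- vs eBird-based models).
map_a = rng.random(10000)                             # invented raster suitabilities, flattened
map_b = 0.7 * map_a + 0.3 * rng.random(10000)
pa, pb = map_a / map_a.sum(), map_b / map_b.sum()     # normalise each map to sum to one
i_stat = 1.0 - 0.5 * np.sum((np.sqrt(pa) - np.sqrt(pb)) ** 2)

print(f"AUC = {auc:.2f}, TSS = {tss:.2f}, Warren's I = {i_stat:.3f}")
```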
Morrissey, Karyn; Kinderman, Peter; Pontin, Eleanor; Tai, Sara; Schwannauer, Mathias
2016-08-01
In June 2011 the BBC Lab UK carried out a web-based survey on the causes of mental distress. The 'Stress Test' was launched on 'All in the Mind', a BBC Radio 4 programme, and the test's URL was publicised on radio and TV broadcasts and made available via BBC web pages and social media. Given the large amount of data created (over 32,800 participants, with corresponding diagnosis, demographic and socioeconomic characteristics), the dataset is potentially an important source of data for population-based research on depression and anxiety. However, as respondents self-selected to participate in the online survey, the survey may comprise a non-random sample. It may be that only individuals who listen to BBC Radio 4 and/or use its website participated in the survey. In this instance, using the Stress Test data for wider population-based research may create sample selection bias. Focusing on the depression component of the Stress Test, this paper presents an easy-to-use method, the Two Step Probit Selection Model, to detect and statistically correct selection bias in the Stress Test. Using a Two Step Probit Selection Model, this paper did not find statistically significant selection on unobserved factors for participants of the Stress Test. That is, survey participants who accessed and completed an online survey are not systematically different from non-participants on the variables of substantive interest. Copyright © 2016 Elsevier Ltd. All rights reserved.
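The Two Step Probit Selection Model is commonly implemented as a Heckman-style correction: a probit model of survey participation yields an inverse Mills ratio, which is added as a regressor in the outcome equation, and a non-significant coefficient on that term indicates no selection on unobservables. The sketch below uses simulated data and the statsmodels/scipy APIs; it is not the BBC Stress Test dataset or the authors' exact specification.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(10)
n = 5000

# Simulated population: z affects participation only; x affects the depression score.
x = rng.normal(size=n)
z = rng.normal(size=n)
participate = (0.4 + 0.8 * z + rng.normal(size=n)) > 0          # selection equation
depression = 2.0 + 0.5 * x + rng.normal(size=n)                 # outcome equation

# Step 1: probit model of participation on the full sample.
probit = sm.Probit(participate.astype(int),
                   sm.add_constant(np.column_stack([x, z]))).fit(disp=0)
xb = probit.fittedvalues                                         # linear predictor
mills = norm.pdf(xb) / norm.cdf(xb)                              # inverse Mills ratio

# Step 2: outcome regression on participants only, with the Mills ratio as an extra regressor.
sel = participate
X_out = sm.add_constant(np.column_stack([x[sel], mills[sel]]))
ols = sm.OLS(depression[sel], X_out).fit()
print(ols.summary().tables[1])
# A significant coefficient on the Mills-ratio column would indicate selection on unobservables;
# here, by construction, it should be indistinguishable from zero.
```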
Probabilistic models in human sensorimotor control
Wolpert, Daniel M.
2009-01-01
Sensory and motor uncertainty form a fundamental constraint on human sensorimotor control. Bayesian decision theory (BDT) has emerged as a unifying framework to understand how the central nervous system performs optimal estimation and control in the face of such uncertainty. BDT has two components: Bayesian statistics and decision theory. Here we review Bayesian statistics and show how it applies to estimating the state of the world and our own body. Recent results suggest that when learning novel tasks we are able to learn the statistical properties of both the world and our own sensory apparatus so as to perform estimation using Bayesian statistics. We review studies which suggest that humans can combine multiple sources of information to form maximum likelihood estimates, can incorporate prior beliefs about possible states of the world so as to generate maximum a posteriori estimates and can use Kalman filter-based processes to estimate time-varying states. Finally, we review Bayesian decision theory in motor control and how the central nervous system processes errors to determine loss functions and optimal actions. We review results that suggest we plan movements based on statistics of our actions that result from signal-dependent noise on our motor outputs. Taken together these studies provide a statistical framework for how the motor system performs in the presence of uncertainty. PMID:17628731
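The maximum-likelihood cue combination mentioned here has a simple closed form for Gaussian cues: each source of information is weighted by its inverse variance, and the combined variance is never larger than that of either cue alone. The numbers below are purely illustrative.

```python
import numpy as np

# Two noisy estimates of hand position (e.g. visual and proprioceptive), in cm.
estimates = np.array([10.0, 12.0])
variances = np.array([1.0, 4.0])          # the visual cue is more reliable in this example

weights = (1.0 / variances) / np.sum(1.0 / variances)     # inverse-variance weighting
combined = np.sum(weights * estimates)
combined_var = 1.0 / np.sum(1.0 / variances)

print(f"weights = {weights}, combined estimate = {combined:.2f} cm, variance = {combined_var:.2f}")
```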
The USGS's SPARROW Model is a statistical model with mechanistic features that has been used to calculate annual nutrient fluxes in nontidal streams nationally on the basis of nitrogen sources, landscape characteristics, and stream properties. This model has been useful for asses...
De Moor, G J E; Claerhout, B; De Meyer, F
2003-01-01
To introduce some of the privacy protection problems related to genomics based medicine and to highlight the relevance of Trusted Third Parties (TTPs) and of Privacy Enhancing Techniques (PETs) in the restricted context of clinical research and statistics. Practical approaches based on two different pseudonymisation models, both for batch and interactive data collection and exchange, are described and analysed. The growing need of managing both clinical and genetic data raises important legal and ethical challenges. Protecting human rights in the realm of privacy, while optimising research potential and other statistical activities is a challenge that can easily be overcome with the assistance of a trust service provider offering advanced privacy enabling/enhancing solutions. As such, the use of pseudonymisation and other innovative Privacy Enhancing Techniques can unlock valuable data sources.
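One widely used building block for the pseudonymisation models mentioned here is a keyed one-way function applied to the patient identifier, with the key held only by the Trusted Third Party so that individual data sources cannot reproduce, reverse, or link pseudonyms on their own. The sketch below shows the idea with Python's standard hmac module; it is a schematic of the mechanism, not the paper's protocol, and the key and record are invented.

```python
import hmac
import hashlib

# Secret pseudonymisation key known only to the Trusted Third Party.
TTP_KEY = b"example-secret-key-held-by-the-ttp"

def pseudonymise(patient_id: str) -> str:
    """Deterministic keyed pseudonym: the same patient always maps to the same token,
    but without the TTP key the mapping cannot be reproduced or reversed."""
    return hmac.new(TTP_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

record = {"patient_id": "BE-1234567", "diagnosis": "F32.1", "variant": "BRCA1 c.68_69delAG"}
research_record = {**record, "patient_id": pseudonymise(record["patient_id"])}
print(research_record)
```

Because the mapping is deterministic, the TTP can still link records for the same patient across batch or interactive collections while researchers only ever see the pseudonyms.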
Kotasidis, F A; Matthews, J C; Angelis, G I; Noonan, P J; Jackson, A; Price, P; Lionheart, W R; Reader, A J
2011-05-21
Incorporation of a resolution model during statistical image reconstruction often produces images of improved resolution and signal-to-noise ratio. A novel and practical methodology to rapidly and accurately determine the overall emission and detection blurring component of the system matrix using a printed point source array within a custom-made Perspex phantom is presented. The array was scanned at different positions and orientations within the field of view (FOV) to examine the feasibility of extrapolating the measured point source blurring to other locations in the FOV and the robustness of measurements from a single point source array scan. We measured the spatially-variant image-based blurring on two PET/CT scanners, the B-Hi-Rez and the TruePoint TrueV. These measured spatially-variant kernels and the spatially-invariant kernel at the FOV centre were then incorporated within an ordinary Poisson ordered subset expectation maximization (OP-OSEM) algorithm and compared to the manufacturer's implementation using projection space resolution modelling (RM). Comparisons were based on a point source array, the NEMA IEC image quality phantom, the Cologne resolution phantom and two clinical studies (carbon-11 labelled anti-sense oligonucleotide [(11)C]-ASO and fluorine-18 labelled fluoro-l-thymidine [(18)F]-FLT). Robust and accurate measurements of spatially-variant image blurring were successfully obtained from a single scan. Spatially-variant resolution modelling resulted in notable resolution improvements away from the centre of the FOV. Comparison between spatially-variant image-space methods and the projection-space approach (the first such report, using a range of studies) demonstrated very similar performance with our image-based implementation producing slightly better contrast recovery (CR) for the same level of image roughness (IR). These results demonstrate that image-based resolution modelling within reconstruction is a valid alternative to projection-based modelling, and that, when using the proposed practical methodology, the necessary resolution measurements can be obtained from a single scan. This approach avoids the relatively time-consuming and involved procedures previously proposed in the literature.
Modeling Group Interactions via Open Data Sources
2011-08-30
data. The state-of-the-art search engines are designed to support general query-specific search and are not suitable for finding disconnected online groups. The...groups, (2) developing innovative mathematical and statistical models and efficient algorithms that leverage existing search engines and employ
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
A Two-Tiered Model for Analyzing Library Web Site Usage Statistics, Part 1: Web Server Logs.
ERIC Educational Resources Information Center
Cohen, Laura B.
2003-01-01
Proposes a two-tiered model for analyzing web site usage statistics for academic libraries: one tier for library administrators that analyzes measures indicating library use, and a second tier for web site managers that analyzes measures aiding in server maintenance and site design. Discusses the technology of web site usage statistics, and…
CENTAURUS A AS A POINT SOURCE OF ULTRAHIGH ENERGY COSMIC RAYS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Hang Bae, E-mail: hbkim@hanyang.ac.kr
We probe the possibility that Centaurus A (Cen A) is a point source of ultrahigh energy cosmic rays (UHECRs) observed by the Pierre Auger Observatory (PAO), through statistical analysis of the arrival direction distribution. For this purpose, we set up the Cen A dominance model for the UHECR sources, in which Cen A contributes the fraction f_C of all UHECRs with energy above 5.5 × 10^19 eV and the isotropic background contributes the remaining 1 - f_C fraction. The effect of the intergalactic magnetic fields on the bending of the trajectories of Cen A-originated UHECRs is parameterized by the Gaussian smearing angle θ_s. For the statistical analysis, we adopted the correlational angular distance distribution (CADD) for the reduction of the arrival direction distribution and the Kuiper test to compare the observed and the expected CADDs. We identify an excess of UHECRs in the Cen A direction and fit the CADD of the observed PAO data by varying the two parameters f_C and θ_s of the Cen A dominance model. The best-fit parameter values are f_C ≈ 0.1 (the corresponding Cen A fraction observed at PAO is f_C,PAO ≈ 0.15, that is, about 10 out of 69 UHECRs) and θ_s = 5° with the maximum likelihood L_max = 0.29. This result supports the existence of a point source smeared by the intergalactic magnetic fields in the direction of Cen A. If Cen A is actually the source responsible for the observed excess of UHECRs, the rms deflection angle of the excess UHECRs implies an intergalactic magnetic field of the order of 10 nG in the vicinity of Cen A.
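The Kuiper test used above can be sketched as follows; this is a simplified illustration comparing a set of angular distances with an isotropic expectation, ignoring the detector exposure and the actual CADD construction, and all names and values are assumptions.

```python
import numpy as np

def kuiper_statistic(observed, expected_cdf):
    """Kuiper statistic V = D+ + D- between an empirical sample and a model CDF.

    observed     : 1-D array of angular distances (e.g. event directions vs. the Cen A direction)
    expected_cdf : callable returning the model CDF at the given angles
    """
    x = np.sort(observed)
    n = x.size
    cdf = expected_cdf(x)
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)   # empirical CDF above the model
    d_minus = np.max(cdf - np.arange(0, n) / n)      # empirical CDF below the model
    return d_plus + d_minus

# Illustrative use: isotropic expectation for the angular distance theta (radians) from a fixed point
iso_cdf = lambda theta: 0.5 * (1.0 - np.cos(theta))
rng = np.random.default_rng(0)
events = np.arccos(1.0 - 2.0 * rng.random(69))       # 69 mock isotropic arrival distances
print(kuiper_statistic(events, iso_cdf))
```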
STATISTICS OF GAMMA-RAY POINT SOURCES BELOW THE FERMI DETECTION LIMIT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Malyshev, Dmitry; Hogg, David W., E-mail: dm137@nyu.edu
2011-09-10
An analytic relation between the statistics of photons in pixels and the number counts of multi-photon point sources is used to constrain the distribution of gamma-ray point sources below the Fermi detection limit at energies above 1 GeV and at latitudes below and above 30 deg. The derived source-count distribution is consistent with the distribution found by the Fermi Collaboration based on the first Fermi point-source catalog. In particular, we find that the contribution of resolved and unresolved active galactic nuclei (AGNs) to the total gamma-ray flux is below 20%-25%. In the best-fit model, the AGN-like point-source fraction is 17% ± 2%. Using the fact that the Galactic emission varies across the sky while the extragalactic diffuse emission is isotropic, we put a lower limit of 51% on Galactic diffuse emission and an upper limit of 32% on the contribution from extragalactic weak sources, such as star-forming galaxies. Possible systematic uncertainties are discussed.
Andersen, Lau M
2018-01-01
An important aim of an analysis pipeline for magnetoencephalographic (MEG) data is that it allows the researcher to spend maximal effort on making the statistical comparisons that will answer his or her questions. The example question being answered here is whether the so-called beta rebound differs between novel and repeated stimulations. Two analyses are presented: going from individual sensor space representations to, respectively, an across-group sensor space representation and an across-group source space representation. The data analyzed are neural responses to tactile stimulations of the right index finger in a group of 20 healthy participants acquired from an Elekta Neuromag System. The processing steps covered for the first analysis are MaxFiltering the raw data, defining, preprocessing and epoching the data, cleaning the data, finding and removing independent components related to eye blinks, eye movements and heart beats, calculating participants' individual evoked responses by averaging over epoched data and subsequently removing the average response from single epochs, calculating a time-frequency representation and baselining it with non-stimulation trials and finally calculating a grand average, an across-group sensor space representation. The second analysis starts from the grand average sensor space representation, and after identification of the beta rebound, the neural origin is imaged using beamformer source reconstruction. This analysis covers reading in co-registered magnetic resonance images, segmenting the data, creating a volume conductor, creating a forward model, cutting out MEG data of interest in the time and frequency domains, getting Fourier transforms and estimating source activity with a beamformer model where power is expressed relative to MEG data measured during periods of non-stimulation. Finally, morphing the source estimates onto a common template and performing group-level statistics on the data are covered. Functions for saving relevant figures in an automated and structured manner are also included. The protocol presented here can be applied to any research protocol where the emphasis is on source reconstruction of induced responses where the underlying sources are not coherent.
Hart, Carl R; Reznicek, Nathan J; Wilson, D Keith; Pettit, Chris L; Nykaza, Edward T
2016-05-01
Many outdoor sound propagation models exist, ranging from highly complex physics-based simulations to simplified engineering calculations, and more recently, highly flexible statistical learning methods. Several engineering and statistical learning models are evaluated by using a particular physics-based model, namely, a Crank-Nicholson parabolic equation (CNPE), as a benchmark. Narrowband transmission loss values predicted with the CNPE, based upon a simulated data set of meteorological, boundary, and source conditions, act as simulated observations. In the simulated data set sound propagation conditions span from downward refracting to upward refracting, for acoustically hard and soft boundaries, and low frequencies. Engineering models used in the comparisons include the ISO 9613-2 method, Harmonoise, and Nord2000 propagation models. Statistical learning methods used in the comparisons include bagged decision tree regression, random forest regression, boosting regression, and artificial neural network models. Computed skill scores are relative to sound propagation in a homogeneous atmosphere over a rigid ground. Overall skill scores for the engineering noise models are 0.6%, -7.1%, and 83.8% for the ISO 9613-2, Harmonoise, and Nord2000 models, respectively. Overall skill scores for the statistical learning models are 99.5%, 99.5%, 99.6%, and 99.6% for bagged decision tree, random forest, boosting, and artificial neural network regression models, respectively.
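A minimal sketch of one of the statistical-learning emulators evaluated above (random forest regression), together with a skill score computed against a trivial reference predictor; the feature set, the synthetic transmission-loss data, and the skill-score definition (one minus the MSE ratio) are assumptions, not the study's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
# Hypothetical features: frequency, source height, ground impedance, sound-speed gradient
X = rng.random((2000, 4))
tl = 20.0 * X[:, 0] + 10.0 * np.sin(6 * X[:, 3]) + 5.0 * X[:, 2] + rng.normal(0, 1, 2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, tl, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)

mse_model = mean_squared_error(y_te, model.predict(X_te))
mse_ref = mean_squared_error(y_te, np.full_like(y_te, y_tr.mean()))  # reference: constant predictor
skill = 1.0 - mse_model / mse_ref   # 1 = perfect, 0 = no better than the reference
print(f"skill score: {skill:.3f}")
```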
Filtered Mass Density Function for Design Simulation of High Speed Airbreathing Propulsion Systems
NASA Technical Reports Server (NTRS)
Drozda, T. G.; Sheikhi, R. M.; Givi, Peyman
2001-01-01
The objective of this research is to develop and implement new methodology for large eddy simulation (LES) of high-speed reacting turbulent flows. We have just completed two (2) years of Phase I of this research. This annual report provides a brief and up-to-date summary of our activities during the period: September 1, 2000 through August 31, 2001. In the work within the past year, a methodology termed "velocity-scalar filtered density function" (VSFDF) is developed and implemented for large eddy simulation (LES) of turbulent flows. In this methodology the effects of the unresolved subgrid scales (SGS) are taken into account by considering the joint probability density function (PDF) of all of the components of the velocity and scalar vectors. An exact transport equation is derived for the VSFDF in which the effects of the unresolved SGS convection, SGS velocity-scalar source, and SGS scalar-scalar source terms appear in closed form. The remaining unclosed terms in this equation are modeled. A system of stochastic differential equations (SDEs) which yields statistically equivalent results to the modeled VSFDF transport equation is constructed. These SDEs are solved numerically by a Lagrangian Monte Carlo procedure. The consistency of the proposed SDEs and the convergence of the Monte Carlo solution are assessed by comparison with results obtained by an Eulerian LES procedure in which the corresponding transport equations for the first two SGS moments are solved. The unclosed SGS convection, SGS velocity-scalar source, and SGS scalar-scalar source in the Eulerian LES are replaced by corresponding terms from the VSFDF equation. The consistency of the results is then analyzed for the case of a two-dimensional mixing layer.
Possibility of reconstruction of dental plaster cast from 3D digital study models
2013-01-01
Objectives To compare traditional plaster casts, digital models and 3D printed copies of dental plaster casts based on various criteria. To determine whether 3D printed copies obtained using the open source system RepRap can replace traditional plaster casts in dental practice. To compare and contrast the qualities of two possible 3D printing options – the open source system RepRap and commercially available 3D printing. Design and settings A method comparison study on 10 dental plaster casts from the Orthodontic department, Department of Stomatology, 2nd Medical Faculty, Charles University Prague, Czech Republic. Material and methods Each of the 10 plaster casts was scanned with an inEos Blue scanner and then printed on the RepRap 3D printer [10 models] and the ProJet HD3000 3D printer [1 model]. Linear measurements between selected points on the dental arches of the upper and lower jaws on the plaster casts and their 3D copies were recorded and statistically analyzed. Results 3D printed copies have many advantages over traditional plaster casts. The precision and accuracy of the RepRap 3D printed copies of plaster casts were confirmed based on the statistical analysis. Although commercially available 3D printing enables printing of more details than the RepRap system, it is expensive and, for the purpose of clinical use, can be replaced by the cheaper prints obtained from the RepRap system. Conclusions Scanning of the traditional plaster casts to obtain a digital model offers a pragmatic approach. The scans can subsequently be used as a template to print the plaster casts as required. Using 3D printers can replace traditional plaster casts primarily due to their accuracy and price. PMID:23721330
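A minimal sketch of the kind of paired comparison described above, testing whether linear measurements on the printed copies differ systematically from those on the plaster casts; the measurement values are mock data, not the study's.

```python
import numpy as np
from scipy import stats

# Mock paired measurements (mm) of the same inter-landmark distances
plaster = np.array([35.2, 28.7, 41.5, 33.0, 26.8, 39.9, 30.4, 44.1])
printed = np.array([35.4, 28.5, 41.8, 32.8, 26.9, 40.2, 30.1, 44.3])

diff = printed - plaster
t_stat, p_value = stats.ttest_rel(printed, plaster)     # paired t-test on the same landmarks
print(f"mean abs. difference: {np.mean(np.abs(diff)):.2f} mm")
print(f"paired t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
```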
An Analysis of Fundamental Mode Surface Wave Amplitude Measurements
NASA Astrophysics Data System (ADS)
Schardong, L.; Ferreira, A. M.; van Heijst, H. J.; Ritsema, J.
2014-12-01
Seismic tomography is a powerful tool to decipher the Earth's interior structure at various scales. Traveltimes of seismic waves are widely used to build velocity models, whereas amplitudes are still only seldom accounted for. This mainly results from our limited ability to separate the various physical effects responsible for observed amplitude variations, such as focussing/defocussing, scattering and source effects. We present new measurements of fundamental-mode Rayleigh and Love wave amplitude anomalies from 50 global earthquakes, measured in the period range 35-275 seconds using two different schemes: (i) a standard time-domain amplitude power ratio technique; and (ii) a mode-branch stripping scheme. For minor-arc data, we observe amplitude anomalies with respect to PREM in the range of 0-4, for which the two measurement techniques show a very good overall agreement. We present here a statistical analysis and comparison of these datasets, as well as comparisons with theoretical calculations for a variety of 3-D Earth models. We assess the geographical coherency of the measurements, and investigate the impact of source, path and receiver effects on surface wave amplitudes, as well as their variations with frequency in a wider range than previously studied.
Malacarne, Mario; Nardin, Tiziana; Bertoldi, Daniela; Nicolini, Giorgio; Larcher, Roberto
2016-09-01
Commercial tannins from several botanical sources and with different chemical and technological characteristics are used in the food and winemaking industries. Different ways to check their botanical authenticity have been studied in the last few years, through investigation of different analytical parameters. This work proposes a new, effective approach based on the quantification of 6 carbohydrates, 7 polyalcohols, and 55 phenols. 87 tannins from 12 different botanical sources were analysed following a very simple sample preparation procedure. Using Forward Stepwise Discriminant Analysis, 3 statistical models were created based on sugars content, phenols concentration and combination of the two classes of compounds for the 8 most abundant categories (i.e. oak, grape seed, grape skin, gall, chestnut, quebracho, tea and acacia). The last approach provided good results in attributing tannins to the correct botanical origin. Validation, repeated 3 times on subsets of 10% of samples, confirmed the reliability of this model. Copyright © 2016 Elsevier Ltd. All rights reserved.
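A rough sketch of the selection-plus-discrimination idea described above, approximating Forward Stepwise Discriminant Analysis with scikit-learn's sequential forward selection wrapped around linear discriminant analysis; the data matrix, class labels, and number of selected variables are placeholders, not the study's measurements.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.random((87, 68))                        # 87 tannins x (6 sugars + 7 polyalcohols + 55 phenols)
y = np.repeat(np.arange(8), 11)[:87]            # 8 botanical categories (mock labels)

lda = LinearDiscriminantAnalysis()
selector = SequentialFeatureSelector(lda, n_features_to_select=10, direction="forward", cv=3)
selector.fit(X, y)

X_sel = selector.transform(X)
acc = cross_val_score(lda, X_sel, y, cv=3).mean()
print("selected columns:", np.flatnonzero(selector.get_support()))
print(f"cross-validated accuracy on mock data: {acc:.2f}")
```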
Navy Nurse Corps manpower management model.
Kinstler, Daniel P; Johnson, Raymond W; Richter, Anke; Kocher, Kathryn
2008-01-01
The Navy Nurse Corps is part of a team of professionals that provides high quality, economical health care to approximately 700,000 active duty Navy and Marine Corps members, as well as 2.6 million retired and family members. Navy Nurse Corps manpower management efficiency is critical to providing this care. This paper aims to focus on manpower planning in the Navy Nurse Corps. The Nurse Corps manages personnel primarily through the recruitment process, drawing on multiple hiring sources. Promotion rates at the lowest two ranks are mandated, but not at the higher ranks. Retention rates vary across pay grades. Using these promotion and attrition rates, a Markov model was constructed to model the personnel flow of junior nurse corps officers. Hiring sources were shown to have a statistically significant effect on promotion and retention rates. However, these effects were not found to be practically significant in the Markov model. Only small improvements in rank imbalances are possible given current recruiting guidelines. Allowing greater flexibility in recruiting practices, fewer recruits would generate a 25 percent reduction in rank imbalances, but result in understaffing. Recruiting different ranks at entry would generate a 65 percent reduction in rank imbalances without understaffing issues. Policies adjusting promotion and retention rates are more powerful in controlling personnel flows than adjusting hiring sources. These policies are the only means for addressing the fundamental sources of rank imbalances in the Navy Nurse Corps arising from current manpower guidelines. The paper shows that modeling to improve manpower management may enable the Navy Nurse Corps to more efficiently fulfill its mandate for high-quality healthcare.
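A minimal sketch of the Markov personnel-flow idea described above: officer counts by rank are pushed through a transition matrix of stay/promote probabilities and topped up with annual accessions. All rates and counts below are invented for illustration, not Nurse Corps figures.

```python
import numpy as np

ranks = ["O-1", "O-2", "O-3", "O-4"]
# Illustrative annual transition probabilities (stay, promote); the remainder is attrition
P = np.array([
    [0.20, 0.70, 0.00, 0.00],   # O-1: 70% promote, 10% leave
    [0.00, 0.25, 0.65, 0.00],   # O-2
    [0.00, 0.00, 0.70, 0.15],   # O-3
    [0.00, 0.00, 0.00, 0.85],   # O-4
])
recruits = np.array([300, 50, 10, 0])            # hires per year by entry rank (illustrative)

stock = np.array([600, 500, 900, 400], dtype=float)
for year in range(10):
    stock = stock @ P + recruits                 # survivors and promotees plus new accessions
print(dict(zip(ranks, np.round(stock))))
```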
A Climate Statistics Tool and Data Repository
NASA Astrophysics Data System (ADS)
Wang, J.; Kotamarthi, V. R.; Kuiper, J. A.; Orr, A.
2017-12-01
Researchers at Argonne National Laboratory and collaborating organizations have generated regional scale, dynamically downscaled climate model output using Weather Research and Forecasting (WRF) version 3.3.1 at a 12 km horizontal spatial resolution over much of North America. The WRF model is driven by boundary conditions obtained from three independent global scale climate models and two different future greenhouse gas emission scenarios, named representative concentration pathways (RCPs). The repository of results has a temporal resolution of three hours for all the simulations, includes more than 50 variables, is stored in Network Common Data Form (NetCDF) files, and the data volume is nearly 600Tb. A condensed 800Gb set of NetCDF files was made for selected variables most useful for climate-related planning, including daily precipitation, relative humidity, solar radiation, maximum temperature, minimum temperature, and wind. The WRF model simulations are conducted for three 10-year time periods (1995-2004, 2045-2054, and 2085-2094), and two future scenarios (RCP4.5 and RCP8.5). An open-source tool was coded using Python 2.7.8 and ESRI ArcGIS 10.3.1 programming libraries to parse the NetCDF files, compute summary statistics, and output results as GIS layers. Eight sets of summary statistics were generated as examples for the contiguous U.S. states and much of Alaska, including number of days over 90°F, number of days with a heat index over 90°F, heat waves, monthly and annual precipitation, drought, extreme precipitation, multi-model averages, and model bias. This paper will provide an overview of the project to generate the main and condensed data repositories, describe the Python tool and how to use it, present the GIS results of the computed examples, and discuss some of the ways they can be used for planning. The condensed climate data, Python tool, computed GIS results, and documentation of the work are shared on the Internet.
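One of the summary statistics described above (days per year with maximum temperature over 90°F) can be sketched with xarray rather than the project's ArcGIS-based tool; the file name and the variable name "tmax" are assumptions about the condensed archive.

```python
import xarray as xr

THRESHOLD_K = (90.0 - 32.0) * 5.0 / 9.0 + 273.15   # 90 degrees F expressed in kelvin

# Hypothetical condensed-archive file holding a daily maximum temperature variable "tmax"
ds = xr.open_dataset("wrf_12km_rcp85_2045-2054_tmax.nc")

# Count hot days per year on each grid cell, then average over the decade
hot_days = (ds["tmax"] > THRESHOLD_K).groupby("time.year").sum(dim="time")
hot_days.mean(dim="year").to_netcdf("days_over_90F_2045-2054.nc")
```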
NASA Astrophysics Data System (ADS)
Hu, Jin; Tian, Jie; Pan, Xiaohong; Liu, Jiangang
2007-03-01
The purpose of this paper is to compare EEG source localization and fMRI during emotional processing. 108 pictures for EEG (categorized as positive, negative and neutral) and 72 pictures for fMRI were presented to 24 healthy, right-handed subjects. The fMRI data were analyzed using statistical parametric mapping with SPM2. LORETA was applied to grand averaged ERP data to localize intracranial sources. Statistical analysis was implemented to compare spatiotemporal activation of fMRI and EEG. The fMRI results are in accordance with EEG source localization to some extent, while a partial mismatch in localization between the two methods was also observed. In the future we should apply the method of simultaneous EEG and fMRI recording to our study.
Adaptive distributed source coding.
Varodayan, David; Lin, Yao-Chung; Girod, Bernd
2012-05-01
We consider distributed source coding in the presence of hidden variables that parameterize the statistical dependence among sources. We derive the Slepian-Wolf bound and devise coding algorithms for a block-candidate model of this problem. The encoder sends, in addition to syndrome bits, a portion of the source to the decoder uncoded as doping bits. The decoder uses the sum-product algorithm to simultaneously recover the source symbols and the hidden statistical dependence variables. We also develop novel techniques based on density evolution (DE) to analyze the coding algorithms. We experimentally confirm that our DE analysis closely approximates practical performance. This result allows us to efficiently optimize parameters of the algorithms. In particular, we show that the system performs close to the Slepian-Wolf bound when an appropriate doping rate is selected. We then apply our coding and analysis techniques to a reduced-reference video quality monitoring system and show a bit rate saving of about 75% compared with fixed-length coding.
Buultjens, Andrew H.; Chua, Kyra Y. L.; Baines, Sarah L.; Kwong, Jason; Gao, Wei; Cutcher, Zoe; Adcock, Stuart; Ballard, Susan; Schultz, Mark B.; Tomita, Takehiro; Subasinghe, Nela; Carter, Glen P.; Pidot, Sacha J.; Franklin, Lucinda; Seemann, Torsten; Gonçalves Da Silva, Anders
2017-01-01
ABSTRACT Public health agencies are increasingly relying on genomics during Legionnaires' disease investigations. However, the causative bacterium (Legionella pneumophila) has an unusual population structure, with extreme temporal and spatial genome sequence conservation. Furthermore, Legionnaires' disease outbreaks can be caused by multiple L. pneumophila genotypes in a single source. These factors can confound cluster identification using standard phylogenomic methods. Here, we show that a statistical learning approach based on L. pneumophila core genome single nucleotide polymorphism (SNP) comparisons eliminates ambiguity for defining outbreak clusters and accurately predicts exposure sources for clinical cases. We illustrate the performance of our method by genome comparisons of 234 L. pneumophila isolates obtained from patients and cooling towers in Melbourne, Australia, between 1994 and 2014. This collection included one of the largest reported Legionnaires' disease outbreaks, which involved 125 cases at an aquarium. Using only sequence data from L. pneumophila cooling tower isolates and including all core genome variation, we built a multivariate model using discriminant analysis of principal components (DAPC) to find cooling tower-specific genomic signatures and then used it to predict the origin of clinical isolates. Model assignments were 93% congruent with epidemiological data, including the aquarium Legionnaires' disease outbreak and three other unrelated outbreak investigations. We applied the same approach to a recently described investigation of Legionnaires' disease within a UK hospital and observed a model predictive ability of 86%. We have developed a promising means to breach L. pneumophila genetic diversity extremes and provide objective source attribution data for outbreak investigations. IMPORTANCE Microbial outbreak investigations are moving to a paradigm where whole-genome sequencing and phylogenetic trees are used to support epidemiological investigations. It is critical that outbreak source predictions are accurate, particularly for pathogens, like Legionella pneumophila, which can spread widely and rapidly via cooling system aerosols, causing Legionnaires' disease. Here, by studying hundreds of Legionella pneumophila genomes collected over 21 years around a major Australian city, we uncovered limitations with the phylogenetic approach that could lead to a misidentification of outbreak sources. We implement instead a statistical learning technique that eliminates the ambiguity of inferring disease transmission from phylogenies. Our approach takes geolocation information and core genome variation from environmental L. pneumophila isolates to build statistical models that predict with high confidence the environmental source of clinical L. pneumophila during disease outbreaks. We show the versatility of the technique by applying it to unrelated Legionnaires' disease outbreaks in Australia and the UK. PMID:28821546
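A rough sketch of the source-attribution idea described above, approximating DAPC as a PCA step followed by linear discriminant analysis and then predicting the cooling tower of origin for clinical isolates; the SNP matrices and labels are mock data, and the pipeline is not the authors' code.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
n_towers, n_snps = 6, 500
tower_profiles = rng.integers(0, 2, (120, n_snps))   # mock core-genome SNP presence/absence matrix
tower_labels = rng.integers(0, n_towers, 120)         # cooling tower each environmental isolate came from

# DAPC-like pipeline: reduce dimensionality, then discriminate between towers
dapc = make_pipeline(PCA(n_components=20), LinearDiscriminantAnalysis())
dapc.fit(tower_profiles, tower_labels)

clinical = rng.integers(0, 2, (10, n_snps))           # mock clinical isolates of unknown origin
posterior = dapc.predict_proba(clinical)              # membership probability for each tower
print(dapc.predict(clinical), posterior.max(axis=1))
```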
Song, Yong-Ze; Yang, Hong-Lei; Peng, Jun-Huan; Song, Yi-Rong; Sun, Qian; Li, Yuan
2015-01-01
Particulate matter with an aerodynamic diameter <2.5 μm (PM2.5) represents a severe environmental problem and has a negative impact on human health. Xi'an City, with a population of 6.5 million, has among the highest PM2.5 concentrations in China. In 2013, in total, there were 191 days in Xi'an City on which PM2.5 concentrations were greater than 100 μg/m3. Recently, a few studies have explored the potential causes of high PM2.5 concentration using remote sensing data such as the MODIS aerosol optical thickness (AOT) product. Linear regression is a commonly used method to find statistical relationships among PM2.5 concentrations and other pollutants, including CO, NO2, SO2, and O3, which can be indicative of emission sources. The relationships of these variables, however, are usually complicated and non-linear. Therefore, a generalized additive model (GAM) is used to estimate the statistical relationships between potential variables and PM2.5 concentrations. This model contains linear functions of SO2 and CO, univariate smoothing non-linear functions of NO2, O3, AOT and temperature, and bivariate smoothing non-linear functions of location and wind variables. The model can explain 69.50% of the variation in PM2.5 concentrations, with R2 = 0.691, which improves the result of a stepwise linear regression (R2 = 0.582) by 18.73%. The two most significant variables, CO concentration and AOT, represent 20.65% and 19.54% of the deviance, respectively, while the three other gas-phase concentrations, SO2, NO2, and O3, account for 10.88% of the total deviance. These results show that in Xi'an City, traffic and other industrial emissions are the primary source of PM2.5. Temperature, location, and wind variables are also non-linearly related to PM2.5. PMID:26540446
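A minimal sketch of a GAM with the term structure described above (linear SO2 and CO, univariate smooths, and bivariate smooths for location and wind), written with the pygam package as an assumption about tooling; the column layout and data are synthetic.

```python
import numpy as np
from pygam import LinearGAM, l, s, te

rng = np.random.default_rng(4)
n = 1000
# Columns: SO2, CO, NO2, O3, AOT, temperature, lon, lat, u-wind, v-wind (mock values)
X = rng.random((n, 10))
pm25 = 60 + 30 * X[:, 1] + 25 * X[:, 4] + 10 * np.sin(3 * X[:, 5]) + rng.normal(0, 5, n)

gam = LinearGAM(
    l(0) + l(1)                     # linear terms: SO2, CO
    + s(2) + s(3) + s(4) + s(5)     # univariate smooths: NO2, O3, AOT, temperature
    + te(6, 7) + te(8, 9)           # bivariate smooths: location, wind
).fit(X, pm25)

gam.summary()                        # reports deviance explained for the fitted terms
```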
Evaluation of eight short-term long-range transport models using field data
NASA Astrophysics Data System (ADS)
Carhart, R. A.; Policastro, A. J.; Wastag, M.; Coke, L.
Eight short-term long-range transport models (MESOPUFF, MESOPLUME, MSPUFF, MESOPUFF II, MTDDIS, ARRPA, RADM and RTM-II) have been tested with field data from two data bases involving tracer releases. The Oklahoma data base involved two separate experiments with measurements taken at 100 and 600 km arcs downwind of a 3-h perfluorocarbon release. The Savannah River Plant data base encompassed 15 experiments with measurements taken over 2-5 days at distances of 28-144 km downwind from a 62 m stack release of Kr-85 gas. Application of the American Meteorological Society statistics to the model/data comparisons showed that six of the eight models predicted within a factor of two of the observed concentrations for all of the following: points paired in space and time, points paired in space only, and for points unpaired in space and time. However, the ratio of the standard deviation of residuals to the average observed value showed improvement as more unpairing was done in the comparison of the models with the data. The statistical comparisons reveal a definite tendency of the models to overpredict plume concentrations. Supplemental graphical comparisons showed that plume concentration overprediction is accompanied with an underprediction of plume spreading, and that a definite time lag is often observed between the time of arrival of the observed plume and the time of arrival of the predicted plume. The causes of model/data discrepancies can be largely traced to inadequate wind field modeling that leads to an incorrect temporal and spatial positioning of the plume, and the use of the Turner [Workbook of atmospheric dispersion estimates. U.S. Dept of H.E.W. Publication 999-AP-26 (1970)] curves to downwind distances beyond which they can accurately represent the scales of atmospheric turbulence. The use of multilayer wind field models and the use of the Heffter [ J. appl. Met.4, 153-156 (1965)] formula for lateral plume dispersion close to the source appear to improve model accuracies.
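The two evaluation statistics mentioned above, the fraction of predictions within a factor of two of the observations and the ratio of the residual standard deviation to the mean observation, can be sketched as follows with mock paired values.

```python
import numpy as np

obs = np.array([12.0, 3.5, 8.1, 0.9, 15.2, 4.4, 2.2, 6.8])     # observed concentrations (mock)
pred = np.array([20.0, 2.1, 9.5, 2.5, 30.1, 3.9, 1.1, 13.0])    # model predictions paired in space and time

ratio = pred / obs
fac2 = np.mean((ratio >= 0.5) & (ratio <= 2.0))                  # fraction within a factor of two
residual_ratio = np.std(obs - pred, ddof=1) / obs.mean()          # sd of residuals / average observation

print(f"FAC2 = {fac2:.2f}, sd(residuals)/mean(obs) = {residual_ratio:.2f}")
```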
The Co-Emergence of Aggregate and Modelling Reasoning
ERIC Educational Resources Information Center
Aridor, Keren; Ben-Zvi, Dani
2017-01-01
This article examines how two processes--reasoning with statistical modelling of a real phenomenon and aggregate reasoning--can co-emerge. We focus in this case study on the emergent reasoning of two fifth graders (aged 10) involved in statistical data analysis, informal inference, and modelling activities using TinkerPlots™. We describe nine…
NASA Astrophysics Data System (ADS)
Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.
2015-12-01
Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.
A two-way interface between limited Systems Biology Markup Language and R.
Radivoyevitch, Tomas
2004-12-07
Systems Biology Markup Language (SBML) is gaining broad usage as a standard for representing dynamical systems as data structures. The open source statistical programming environment R is widely used by biostatisticians involved in microarray analyses. An interface between SBML and R does not exist, though one might be useful to R users interested in SBML, and SBML users interested in R. A model structure that parallels SBML to a limited degree is defined in R. An interface between this structure and SBML is provided through two function definitions: write.SBML() which maps this R model structure to SBML level 2, and read.SBML() which maps a limited range of SBML level 2 files back to R. A published model of purine metabolism is provided in this SBML-like format and used to test the interface. The model reproduces published time course responses before and after its mapping through SBML. List infrastructure preexisting in R makes it well-suited for manipulating SBML models. Further developments of this SBML-R interface seem to be warranted.
Implication of correlations among some common stability statistics - a Monte Carlo simulations.
Piepho, H P
1995-03-01
Stability analysis of multilocation trials is often based on a mixed two-way model. Two stability measures in frequent use are the environmental variance (S_i^2) and the ecovalence (W_i). Under the two-way model the rank orders of the expected values of these two statistics are identical for a given set of genotypes. By contrast, empirical rank correlations among these measures are consistently low. This suggests that the two-way mixed model may not be appropriate for describing real data. To check this hypothesis, a Monte Carlo simulation was conducted. It revealed that the low empirical rank correlation among S_i^2 and W_i is most likely due to sampling errors. It is concluded that the observed low rank correlation does not invalidate the two-way model. The paper also discusses tests for homogeneity of S_i^2 as well as implications of the two-way model for the classification of stability statistics.
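A minimal sketch of the two stability statistics discussed above, the environmental variance S_i^2 and Wricke's ecovalence W_i, computed from a simulated genotype-by-environment table, followed by their Spearman rank correlation; the table is random, so the numbers only illustrate the computation.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(5)
yield_table = rng.normal(5.0, 1.0, size=(15, 8))    # 15 genotypes x 8 environments (mock yields)

env_var = yield_table.var(axis=1, ddof=1)            # S_i^2: variance of genotype i over environments

g_mean = yield_table.mean(axis=1, keepdims=True)     # genotype means
e_mean = yield_table.mean(axis=0, keepdims=True)     # environment means
grand = yield_table.mean()
interaction = yield_table - g_mean - e_mean + grand
ecovalence = (interaction ** 2).sum(axis=1)           # W_i: genotype's share of the G x E interaction SS

rho, p = spearmanr(env_var, ecovalence)
print(f"Spearman rank correlation between S_i^2 and W_i: {rho:.2f} (p = {p:.2f})")
```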
NASA Astrophysics Data System (ADS)
Lyon, David Richard
Methane emissions from the oil and gas (O&G) supply chain reduce potential climate benefits of natural gas as a replacement for other fossil fuels that emit more carbon dioxide per unit of energy produced. O&G facilities have skewed emission rate distributions with a small fraction of sites contributing the majority of emissions. Knowledge of the identity and cause of these high emission facilities, referred to as super-emitters or fat-tail sources, is critical for reducing supply chain emissions. This dissertation addresses the quantification of super-emitter emissions, assessment of their prevalence and relationship to site characteristics, and mitigation with continuous leak detection systems. Chapter 1 summarizes the state of the knowledge of O&G methane emissions. Chapter 2 constructs a spatially-resolved emission inventory to estimate total and O&G methane emissions in the Barnett Shale as part of a coordinated research campaign using multiple top-down and bottom-up methods to quantify emissions. The emission inventory accounts for super-emitters with two-phase Monte Carlo simulations that combine site measurements collected with two approaches: unbiased sampling and targeted sampling of super-emitters. More comprehensive activity data and the inclusion of super-emitters, which account for 19% of O&G emissions, produce an emission inventory that is not statistically different from top-down regional emission estimates. Chapter 3 describes a helicopter-based survey of over 8,000 well pads in seven basins with infrared optical gas imaging to assess high emission sources. Four percent of sites are observed to have high emissions, with over 90% of observed sources from tanks. The occurrence of high emissions is weakly correlated to site parameters and the best statistical model explains only 14% of variance, which demonstrates that the occurrence of super-emitters is primarily stochastic. Chapter 4 presents a Gaussian dispersion model for optimizing the placement of continuous leak detection systems at three example well pads. The model demonstrates that large leaks can be detected quickly with first generation systems. Continuous leak detection can be used in the near future to cost-effectively mitigate methane emissions from O&G super-emitters.
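A hedged sketch of the two-phase Monte Carlo idea described in Chapter 2, combining a skewed distribution for typical sites with a separately drawn super-emitter tail whose prevalence is itself uncertain; every distribution and rate below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n_sites, n_draws = 5000, 1000
totals = np.empty(n_draws)

for i in range(n_draws):
    # Outer phase: uncertainty in the super-emitter fraction itself (loosely centred near 4%)
    frac = rng.beta(4, 96)
    is_super = rng.random(n_sites) < frac
    # Inner phase: skewed (lognormal) per-site emission rates in kg/h; super-emitters sit in the fat tail
    rates = rng.lognormal(mean=-1.0, sigma=1.0, size=n_sites)
    rates[is_super] = rng.lognormal(mean=2.5, sigma=0.8, size=int(is_super.sum()))
    totals[i] = rates.sum()

print(f"basin total: {np.median(totals):.0f} kg/h "
      f"(95% interval {np.percentile(totals, 2.5):.0f}-{np.percentile(totals, 97.5):.0f})")
```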
Spectral Analysis of the Wake behind a Helicopter Rotor Hub
NASA Astrophysics Data System (ADS)
Petrin, Christopher; Reich, David; Schmitz, Sven; Elbing, Brian
2016-11-01
A scaled model of a notional helicopter rotor hub was tested in the 48" Garfield Thomas Water Tunnel at the Applied Research Laboratory Penn State. LDV and PIV measurements in the far-wake consistently showed a six-per-revolution flow structure, in addition to stronger two- and four-per-revolution structures. These six-per-revolution structures persisted into the far-field, and have no direct geometric counterpart on the hub model. The current study will examine the Reynolds number dependence of these structures and present higher-order statistics of the turbulence within the wake. In addition, current activity using the EFPL Large Water Tunnel at Oklahoma State University will be presented. This effort uses a more canonical configuration to identify the source for these six-per-revolution structures, which are assumed to be a non-linear interaction between the two- and four-per-revolution structures.
SiGN-SSM: open source parallel software for estimating gene networks with state space models.
Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru
2011-04-15
SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint effective to stabilize the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM is available on our web site. tamada@ims.u-tokyo.ac.jp.
Acoustic Analogy and Alternative Theories for Jet Noise Prediction
NASA Technical Reports Server (NTRS)
Morris, Philip J.; Farassat, F.
2002-01-01
Several methods for the prediction of jet noise are described. All but one of the noise prediction schemes are based on Lighthill's or Lilley's acoustic analogy, whereas the other is the jet noise generation model recently proposed by Tam and Auriault. In all of the approaches, some assumptions must be made concerning the statistical properties of the turbulent sources. In each case the characteristic scales of the turbulence are obtained from a solution of the Reynolds-averaged Navier-Stokes equation using a k-epsilon turbulence model. It is shown that, for the same level of empiricism, Tam and Auriault's model yields better agreement with experimental noise measurements than the acoustic analogy. It is then shown that this result is not because of some fundamental flaw in the acoustic analogy approach, but instead is associated with the assumptions made in the approximation of the turbulent source statistics. If consistent assumptions are made, both the acoustic analogy and Tam and Auriault's model yield identical noise predictions. In conclusion, a proposal is presented for an acoustic analogy that provides a clearer identification of the equivalent source mechanisms, along with a discussion of noise prediction issues that remain to be resolved.
The Acoustic Analogy and Alternative Theories for Jet Noise Prediction
NASA Technical Reports Server (NTRS)
Morris, Philip J.; Farassat, F.
2002-01-01
This paper describes several methods for the prediction of jet noise. All but one of the noise prediction schemes are based on Lighthill's or Lilley's acoustic analogy while the other is the jet noise generation model recently proposed by Tam and Auriault. In all the approaches some assumptions must be made concerning the statistical properties of the turbulent sources. In each case the characteristic scales of the turbulence are obtained from a solution of the Reynolds-averaged Navier Stokes equation using a k-epsilon turbulence model. It is shown that, for the same level of empiricism, Tam and Auriault's model yields better agreement with experimental noise measurements than the acoustic analogy. It is then shown that this result is not because of some fundamental flaw in the acoustic analogy approach: but, is associated with the assumptions made in the approximation of the turbulent source statistics. If consistent assumptions are made, both the acoustic analogy and Tam and Auriault's model yield identical noise predictions. The paper concludes with a proposal for an acoustic analogy that provides a clearer identification of the equivalent source mechanisms and a discussion of noise prediction issues that remain to be resolved.
Bauer, Timothy J
2013-06-15
The Jack Rabbit Test Program was sponsored in April and May 2010 by the Department of Homeland Security Transportation Security Administration to generate source data for large releases of chlorine and ammonia from transport tanks. In addition to a variety of data types measured at the release location, concentration versus time data was measured using sensors at distances up to 500 m from the tank. Release data were used to create accurate representations of the vapor flux versus time for the ten releases. This study was conducted to determine the importance of source terms and meteorological conditions in predicting downwind concentrations and the accuracy that can be obtained in those predictions. Each source representation was entered into an atmospheric transport and dispersion model using simplifying assumptions regarding the source characterization and meteorological conditions, and statistics for cloud duration and concentration at the sensor locations were calculated. A detailed characterization for one of the chlorine releases predicted 37% of concentration values within a factor of two, but cannot be considered representative of all the trials. Predictions of toxic effects at 200 m are relevant to incidents involving 1-ton chlorine tanks commonly used in parts of the United States and internationally. Published by Elsevier B.V.
Searches for correlation between UHECR events and high-energy gamma-ray Fermi-LAT data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Álvarez, Ezequiel; Cuoco, Alessandro; Mirabal, Nestor
The astrophysical sources responsible for ultra high-energy cosmic rays (UHECRs) continue to be one of the most intriguing mysteries in astrophysics. We present a comprehensive search for correlations between high-energy (≳ 1 GeV) gamma-ray events from the Fermi Large Area Telescope (LAT) and UHECRs (≳ 60 EeV) detected by the Telescope Array and the Pierre Auger Observatory. We perform two separate searches. First, we conduct a standard cross-correlation analysis between the arrival directions of 148 UHECRs and 360 gamma-ray sources in the Second Catalog of Hard Fermi-LAT sources (2FHL). Second, we search for a possible correlation between UHECR directions and unresolved Fermi-LAT gamma-ray emission. For the latter, we use three different methods: a stacking technique with both a model-dependent and model-independent background estimate, and a cross-correlation function analysis. We also test for statistically significant excesses in gamma rays from signal regions centered on Cen A and the Telescope Array hotspot. No significant correlation is found in any of the analyses performed, except a weak (≲ 2σ) hint of signal with the correlation function method on scales ~1°. Upper limits on the flux of possible power-law gamma-ray sources of UHECRs are derived.
A Comparison Between Spectral Properties of ULXs and Luminous X-ray Binaries
NASA Astrophysics Data System (ADS)
Berghea, C. T.; Colbert, E. J. M.; Roberts, T. P.
2004-05-01
What is special about the 10^39 erg s^-1 limit that is used to define the ULX class? We investigate this question by analyzing Chandra X-ray spectra of 71 X-ray bright point sources from nearby galaxies. Fifty-one of these sources are ULXs (L_X(0.3-8.0 keV) ≥ 10^39 erg s^-1), and 20 sources (our comparison sample) are less-luminous X-ray binaries with L_X(0.3-8.0 keV) = 10^38-10^39 erg s^-1. Our sample objects were selected from the Chandra archive to have ≥1000 counts and thus represent the highest quality spectra in the Chandra archives for extragalactic X-ray binaries and ULXs. We fit the spectra with one-component models (e.g., cold absorption with power-law, or cold absorption with multi-colored disk blackbody) and two-component models (e.g., absorption with both a power-law and a multi-colored disk blackbody). A crude measure of the spectral states of the sources is determined observationally by calibrating the strength of the disk (blackbody) and coronal (power-law) components. These results are then used to determine if spectral properties of the ULXs are statistically distinct from those of the comparison objects, which are assumed to be ``normal'' black-hole X-ray binaries.
TinkerPlots™ Model Construction Approaches for Comparing Two Groups: Student Perspectives
ERIC Educational Resources Information Center
Noll, Jennifer; Kirin, Dana
2017-01-01
Teaching introductory statistics using curricula focused on modeling and simulation is becoming increasingly common in introductory statistics courses and touted as a more beneficial approach for fostering students' statistical thinking. Yet, surprisingly little research has been conducted to study the impact of modeling and simulation curricula…
England, Lucinda; Kotelchuck, Milton; Wilson, Hoyt G; Diop, Hafsatou; Oppedisano, Paul; Kim, Shin Y; Cui, Xiaohui; Shapiro-Mendoza, Carrie K
2015-10-01
Women with gestational diabetes mellitus (GDM) may be able to reduce their risk of recurrent GDM and progression to type 2 diabetes mellitus through lifestyle change; however, there is limited population-based information on GDM recurrence rates. We used data from a population of women delivering two sequential live singleton infants in Massachusetts (1998-2007) to estimate the prevalence of chronic diabetes mellitus (CDM) and GDM in parity one pregnancies and recurrence of GDM and progression from GDM to CDM in parity two pregnancies. We examined four diabetes classification approaches: birth certificate (BC) data alone, hospital discharge (HD) data alone, both sources hierarchically combined with a diagnosis of CDM from either source taking priority over a diagnosis of GDM, and both sources combined including only pregnancies with full agreement in diagnosis. Descriptive statistics were used to describe population characteristics, prevalence of CDM and GDM, and recurrence of diabetes in successive pregnancies. Diabetes classification agreement was assessed using the Kappa statistic. Associated maternal characteristics were examined through adjusted model-based t tests and Chi square tests. A total of 134,670 women with two sequential deliveries of parities one and two were identified. While there was only slight agreement on GDM classification across HD and BC records, estimates of GDM recurrence were fairly consistent; nearly half of women with GDM in their parity one pregnancy developed GDM in their subsequent pregnancy. While estimates of progression from GDM to CDM across sequential pregnancies were more variable, all approaches yielded estimates of ≤5%. The development of either GDM or CDM following a parity one pregnancy with no diagnosis of diabetes was <3% across approaches. Women with recurrent GDM were disproportionately older and foreign born. Recurrent GDM is a serious life course public health issue; the inter-pregnancy interval provides an important window for diabetes prevention.
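The agreement assessment mentioned above, the Kappa statistic between birth certificate and hospital discharge classifications, can be sketched as follows; the per-pregnancy labels are mock values.

```python
from sklearn.metrics import cohen_kappa_score

# Mock per-pregnancy classifications: 0 = no diabetes, 1 = GDM, 2 = CDM
birth_certificate = [0, 0, 1, 1, 0, 2, 1, 0, 0, 1, 0, 2]
hospital_discharge = [0, 1, 1, 0, 0, 2, 1, 0, 1, 1, 0, 0]

kappa = cohen_kappa_score(birth_certificate, hospital_discharge)
print(f"Cohen's kappa: {kappa:.2f}")   # near 0 = slight agreement, near 1 = almost perfect agreement
```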
Blangiardo, Marta; Finazzi, Francesco; Cameletti, Michela
2016-08-01
Exposure to high levels of air pollutant concentration is known to be associated with respiratory problems, which can translate into higher morbidity and mortality rates. The link between air pollution and population health has mainly been assessed considering air quality and hospitalisation or mortality data. However, this approach limits the analysis to individuals characterised by severe conditions. In this paper we evaluate the link between air pollution and respiratory diseases using general practice drug prescriptions for chronic respiratory diseases, which allow conclusions to be drawn for the general population. We propose a two-stage statistical approach: in the first stage we specify a space-time model to estimate the monthly NO2 concentration integrating several data sources characterised by different spatio-temporal resolution; in the second stage we link the concentration to the β2-agonists prescribed monthly by general practices in England and we model the prescription rates through a small area approach. Copyright © 2016 Elsevier Ltd. All rights reserved.
Magari, Robert T
2002-03-01
The effect of different lot-to-lot variability levels on the prediction of stability is studied based on two statistical models for estimating degradation in real-time and accelerated stability tests. Lot-to-lot variability is considered as random in both models, and is attributed to two sources: variability at time zero and variability of the degradation rate. Real-time stability tests are modeled as a function of time, while accelerated stability tests are modeled as a function of time and temperature. Several data sets were simulated, and a maximum likelihood approach was used for estimation. The 95% confidence intervals for the degradation rate depend on the amount of lot-to-lot variability. When lot-to-lot degradation rate variability is relatively large (CV ≥ 8%) the estimated confidence intervals do not represent the trend for individual lots. In such cases it is recommended to analyze each lot individually. Copyright 2002 Wiley-Liss, Inc. and the American Pharmaceutical Association J Pharm Sci 91: 893-899, 2002
Investigation of Pre-Earthquake Ionospheric Disturbances by 3D Tomographic Analysis
NASA Astrophysics Data System (ADS)
Yagmur, M.
2016-12-01
Ionospheric variations before earthquakes are widely discussed phenomena in ionospheric studies. Clarifying the source and mechanism of these phenomena is highly important for earthquake forecasting. To better understand the mechanical and physical processes of pre-seismic ionospheric anomalies, which might even be related to Lithosphere-Atmosphere-Ionosphere-Magnetosphere coupling, both statistical and 3D modeling analyses are needed. For this purpose, we first investigated the relation between ionospheric TEC anomalies and potential source mechanisms such as space weather activity and lithospheric phenomena like positive surface electric charges. To distinguish their effects on ionospheric TEC, we focused on pre-seismically active days. We then analyzed statistical data for 54 earthquakes with M ≥ 6 between 2000 and 2013, as well as the 2011 Tohoku and the 2016 Kumamoto earthquakes in Japan. By comparing TEC anomalies with solar activity through the Dst index, we found 28 events that might be related to earthquake activity. Following the statistical analysis, we also investigated the lithospheric effect on TEC change on selected days. Among those days, we chose two case studies, the 2011 Tohoku and the 2016 Kumamoto earthquakes, and produced 3D reconstructed images using a 3D tomography technique with neural networks. The results will be presented. Keywords: Earthquake, 3D Ionospheric Tomography, Positive and Negative Anomaly, Geomagnetic Storm, Lithosphere
Validating Remotely Sensed Land Surface Evapotranspiration Based on Multi-scale Field Measurements
NASA Astrophysics Data System (ADS)
Jia, Z.; Liu, S.; Ziwei, X.; Liang, S.
2012-12-01
The land surface evapotranspiration plays an important role in the surface energy balance and the water cycle. There have been significant technical and theoretical advances in our knowledge of evapotranspiration over the past two decades. Acquisition of the temporally and spatially continuous distribution of evapotranspiration using remote sensing technology has attracted the widespread attention of researchers and managers. However, remote sensing technology still has many uncertainties coming from model mechanism, model inputs, parameterization schemes, and scaling issue in the regional estimation. Achieving remotely sensed evapotranspiration (RS_ET) with confident certainty is required but difficult. As a result, it is indispensable to develop the validation methods to quantitatively assess the accuracy and error sources of the regional RS_ET estimations. This study proposes an innovative validation method based on multi-scale evapotranspiration acquired from field measurements, with the validation results including the accuracy assessment, error source analysis, and uncertainty analysis of the validation process. It is a potentially useful approach to evaluate the accuracy and analyze the spatio-temporal properties of RS_ET at both the basin and local scales, and is appropriate to validate RS_ET in diverse resolutions at different time-scales. An independent RS_ET validation using this method was presented over the Hai River Basin, China in 2002-2009 as a case study. Validation at the basin scale showed good agreements between the 1 km annual RS_ET and the validation data such as the water balanced evapotranspiration, MODIS evapotranspiration products, precipitation, and landuse types. Validation at the local scale also had good results for monthly, daily RS_ET at 30 m and 1 km resolutions, comparing to the multi-scale evapotranspiration measurements from the EC and LAS, respectively, with the footprint model over three typical landscapes. Although some validation experiments demonstrated that the models yield accurate estimates at flux measurement sites, the question remains whether they are performing well over the broader landscape. Moreover, a large number of RS_ET products have been released in recent years. Thus, we also pay attention to the cross-validation method of RS_ET derived from multi-source models. "The Multi-scale Observation Experiment on Evapotranspiration over Heterogeneous Land Surfaces: Flux Observation Matrix" campaign is carried out at the middle reaches of the Heihe River Basin, China in 2012. Flux measurements from an observation matrix composed of 22 EC and 4 LAS are acquired to investigate the cross-validation of multi-source models over different landscapes. In this case, six remote sensing models, including the empirical statistical model, the one-source and two-source models, the Penman-Monteith equation based model, the Priestley-Taylor equation based model, and the complementary relationship based model, are used to perform an intercomparison. All the results from the two cases of RS_ET validation showed that the proposed validation methods are reasonable and feasible.
NASA Astrophysics Data System (ADS)
Liuzzo, E.; Giovannini, G.; Giroletti, M.; Taylor, G. B.
2009-10-01
Aims: To study statistical properties of different classes of sources, it is necessary to observe a sample that is free of selection effects. To do this, we initiated a project to observe a complete sample of radio galaxies selected from the B2 Catalogue of Radio Sources and the Third Cambridge Revised Catalogue (3CR), with no selection constraint on the nuclear properties. We named this sample “the Bologna Complete Sample” (BCS). Methods: We present new VLBI observations at 5 and 1.6 GHz for 33 sources drawn from a sample not biased toward orientation. By combining these data with those in the literature, information on the parsec-scale morphology is available for a total of 76 of 94 radio sources with a range in radio power and kiloparsec-scale morphologies. Results: The fraction of two-sided sources at milliarcsecond resolution is high (30%), compared to the fraction found in VLBI surveys selected at centimeter wavelengths, as expected from the predictions of unified models. The parsec-scale jets are generally found to be straight and to line up with the kiloparsec-scale jets. A few peculiar sources are discussed in detail. Tables 1-4 are only available in electronic form at http://www.aanda.org
Jansson, Daniel; Lindström, Susanne Wiklund; Norlin, Rikard; Hok, Saphon; Valdez, Carlos A; Williams, Audrey M; Alcaraz, Armando; Nilsson, Calle; Åstot, Crister
2018-08-15
This work is part two of a three-part series in this issue describing a Sweden-United States collaborative effort towards understanding the chemical attribution signatures of Russian VX (VR) in synthesized samples and complex food matrices. In this study, we describe the sourcing of VR present in food based on chemical analysis of attribution signatures by liquid chromatography-tandem mass spectrometry (LC-MS/MS) combined with multivariate data analysis. Analytical data were acquired from seven different foods spiked with VR batches that were synthesized via six different routes in two separate laboratories. The synthesis products were spiked at a lethal dose into seven food matrices: water, orange juice, apple purée, baby food, pea purée, liquid eggs and hot dog. After acetonitrile sample extraction, the samples were analyzed by LC-MS/MS operated in MRM mode. A multivariate statistical calibration model was built on the chemical attribution profiles from 118 VR-spiked food samples. Using the model, an external test set covering the six synthesis routes employed for VR production was correctly identified, with no major observable impact of the food matrices on the classification. The overall performance of the statistical models was high, with 94% of the test-set samples retrospectively classified to their correct synthesis routes. Copyright © 2018 Elsevier B.V. All rights reserved.
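The abstract does not name the multivariate algorithm; purely as a hedged illustration, the sketch below classifies synthesis routes from impurity-profile features with linear discriminant analysis in scikit-learn, using a synthetic placeholder feature matrix and labels:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 118 samples x 20 impurity-peak areas, labels = synthesis route 0..5
X = rng.normal(size=(118, 20))
y = rng.integers(0, 6, size=118)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = LinearDiscriminantAnalysis().fit(X_train, y_train)
print("external test-set accuracy:", clf.score(X_test, y_test))
```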
Moore, Richard Bridge; Johnston, Craig M.; Robinson, Keith W.; Deacon, Jeffrey R.
2004-01-01
The U.S. Geological Survey (USGS), in cooperation with the U.S. Environmental Protection Agency (USEPA) and the New England Interstate Water Pollution Control Commission (NEIWPCC), has developed a water-quality model, called SPARROW (Spatially Referenced Regressions on Watershed Attributes), to assist in regional total maximum daily load (TMDL) and nutrient-criteria activities in New England. SPARROW is a spatially detailed, statistical model that uses regression equations to relate total nitrogen and phosphorus (nutrient) stream loads to nutrient sources and watershed characteristics. The statistical relations in these equations are then used to predict nutrient loads in unmonitored streams. The New England SPARROW models are built using a hydrologic network of 42,000 stream reaches and associated watersheds. Watershed boundaries are defined for each stream reach in the network through the use of a digital elevation model and existing digitized watershed divides. Nutrient source data is from permitted wastewater discharge data from USEPA's Permit Compliance System (PCS), various land-use sources, and atmospheric deposition. Physical watershed characteristics include drainage area, land use, streamflow, time-of-travel, stream density, percent wetlands, slope of the land surface, and soil permeability. The New England SPARROW models for total nitrogen and total phosphorus have R-squared values of 0.95 and 0.94, with mean square errors of 0.16 and 0.23, respectively. Variables that were statistically significant in the total nitrogen model include permitted municipal-wastewater discharges, atmospheric deposition, agricultural area, and developed land area. Total nitrogen stream-loss rates were significant only in streams with average annual flows less than or equal to 2.83 cubic meters per second. In streams larger than this, there is nondetectable in-stream loss of annual total nitrogen in New England. Variables that were statistically significant in the total phosphorus model include discharges for municipal wastewater-treatment facilities and pulp and paper facilities, developed land area, agricultural area, and forested area. For total phosphorus, loss rates were significant for reservoirs with surface areas of 10 square kilometers or less, and in streams with flows less than or equal to 2.83 cubic meters per second. Applications of SPARROW for evaluating nutrient loading in New England waters include estimates of the spatial distributions of total nitrogen and phosphorus yields, sources of the nutrients, and the potential for delivery of those yields to receiving waters. This information can be used to (1) predict ranges in nutrient levels in surface waters, (2) identify the environmental variables that are statistically significant predictors of nutrient levels in streams, (3) evaluate monitoring efforts for better determination of nutrient loads, and (4) evaluate management options for reducing nutrient loads to achieve water-quality goals.
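As a rough, hedged illustration of the regression idea only (the operational SPARROW model is nonlinear, with source, land-to-water delivery, and in-stream decay terms), the sketch below regresses log-transformed loads on illustrative source variables; all data are synthetic placeholders:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200  # hypothetical monitored reaches

# Illustrative predictors: wastewater discharge, atmospheric deposition, agricultural area
X = np.column_stack([
    rng.lognormal(1.0, 0.5, n),
    rng.lognormal(0.5, 0.3, n),
    rng.lognormal(2.0, 0.7, n),
])
log_load = (0.8 * np.log(X[:, 0]) + 0.3 * np.log(X[:, 1])
            + 0.5 * np.log(X[:, 2]) + rng.normal(0, 0.4, n))

model = sm.OLS(log_load, sm.add_constant(np.log(X))).fit()
print(model.rsquared)  # analogous in spirit to the R-squared values quoted above
```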
Current and future pluvial flood hazard analysis for the city of Antwerp
NASA Astrophysics Data System (ADS)
Willems, Patrick; Tabari, Hossein; De Niel, Jan; Van Uytven, Els; Lambrechts, Griet; Wellens, Geert
2016-04-01
For the city of Antwerp in Belgium, higher rainfall extremes were observed in comparison with surrounding areas. The differences were found to be statistically significant for some areas and may be the result of the urban heat island effect in combination with higher concentrations of aerosols. A network of 19 rain gauges with varying record lengths (the longest since the 1960s) and 10 years of continuous radar data were combined to map the spatial variability of rainfall extremes over the city, together with its uncertainty, for durations from 15 minutes to 1 day. The improved spatial rainfall information was used as input to the sewer system model of the city to analyze the frequency of urban pluvial floods. Comparison with historical flood observations from various sources (fire brigade and media) confirmed that the improved spatial rainfall information also improved the simulated magnitude and frequency of sewer floods. Next to these improved urban flood impact results for recent and current climatological conditions, the new insights into the local rainfall microclimate were also helpful for enhancing future projections of rainfall extremes and pluvial floods in the city. This was done by improved statistical downscaling of all available CMIP5 global climate model runs (160 runs) for the 4 RCP scenarios, as well as the available EURO-CORDEX regional climate model runs. Two types of statistical downscaling methods were applied for that purpose (a weather-typing based method and a quantile perturbation approach), making use of the microclimate results and their dependency on specific weather types. Changes in extreme rainfall intensities were analyzed and mapped as a function of the RCP scenario, together with the uncertainty, decomposed into the contributions of the climate models, the climate model initialization or limited length of the 30-year time series (natural climate variability), and the statistical downscaling (albeit limited to two types of methods). These were finally transferred into future pluvial flash flood hazard maps for the city, together with their uncertainties, and are considered a basis for spatial planning and adaptation.
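A minimal sketch of the quantile perturbation idea mentioned above, assuming hypothetical observed, control-run, and scenario-run rainfall series; change factors are computed per quantile from the climate model pair and applied to the observed series:

```python
import numpy as np

def quantile_perturbation(obs, ctrl, scen, n_q=100):
    """Scale observed values by scenario/control change factors at matching quantiles."""
    probs = (np.arange(n_q) + 0.5) / n_q
    factors = np.quantile(scen, probs) / np.quantile(ctrl, probs)   # change factor per quantile
    obs_probs = np.interp(obs, np.sort(obs), np.linspace(0, 1, obs.size))
    return obs * np.interp(obs_probs, probs, factors)

rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 5.0, 1000)    # observed rainfall intensities (mm), placeholder
ctrl = rng.gamma(2.0, 5.0, 1000)   # climate model, control period
scen = rng.gamma(2.0, 6.0, 1000)   # climate model, future scenario (wetter extremes)
future_obs = quantile_perturbation(obs, ctrl, scen)
```

Because the change factors are quantile-specific, changes in the upper tail of the model distribution are transferred to the observed extremes rather than being averaged out.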
ERIC Educational Resources Information Center
Gálvez, Jaime; Conejo, Ricardo; Guzmán, Eduardo
2013-01-01
One of the most popular student modeling approaches is Constraint-Based Modeling (CBM). It is an efficient approach that can be easily applied inside an Intelligent Tutoring System (ITS). Even with these characteristics, building new ITSs requires carefully designing the domain model to be taught because different sources of errors could affect…
Numerical and Qualitative Contrasts of Two Statistical Models ...
Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and products. This study provided an empirical and qualitative comparison of both models using 29 years of data for two discrete time series of chlorophyll-a (chl-a) in the Patuxent River estuary. Empirical descriptions of each model were based on predictive performance against the observed data, ability to reproduce flow-normalized trends with simulated data, and comparisons of performance with validation datasets. Between-model differences were apparent but minor and both models had comparable abilities to remove flow effects from simulated time series. Both models similarly predicted observations for missing data with different characteristics. Trends from each model revealed distinct mainstem influences of the Chesapeake Bay with both models predicting a roughly 65% increase in chl-a over time in the lower estuary, whereas flow-normalized predictions for the upper estuary showed a more dynamic pattern, with a nearly 100% increase in chl-a in the last 10 years. Qualitative comparisons highlighted important differences in the statistical structure, available products, and characteristics of the data and desired analysis. This manuscript describes a quantitative comparison of two recently-
Bareño, Jorge O.; Parra Vargas, Carlos A.; Gutierrez Velásquez, Elkin I.
2017-01-01
Force Sensing Resistors (FSRs) are manufactured by sandwiching a Conductive Polymer Composite (CPC) between metal electrodes. The piezoresistive property of FSRs has been exploited to perform stress and strain measurements, but the rheological properties of polymers have undermined the repeatability of such measurements, causing creep in the electrical resistance of FSRs. With the aim of understanding the creep phenomenon, the drift response of thirty-two FSR specimens was studied using a statistical approach. Similarly, a theoretical model for the creep response was developed by combining Burger's rheological model with the equations for quantum tunneling conduction through thin insulating films. The proposed model and the experimental observations showed that the sourcing voltage has a strong influence on the creep response; this observation, and the corresponding model, is an important contribution that has not previously been accounted for. The phenomenon of sensitivity degradation was also studied. It was found that sensitivity degradation is a voltage-related phenomenon that can be avoided by choosing an appropriate sourcing voltage in the driving circuit. The models and experimental observations from this study are key aspects for enhancing the repeatability of measurements and the accuracy of FSRs. PMID:29160834
2012-06-06
[Report fragment; table-of-contents entries: Statistical Data (p. 45); Parametric Model for Rotor Wing Debris Area (p. 46); Skid Distance Statistical Data.] The curve that related the BC value to the probability of skull fracture resulted in a tight confidence interval and a two-tailed statistical p
NASA Astrophysics Data System (ADS)
Guimarães Nobre, Gabriela; Arnbjerg-Nielsen, Karsten; Rosbjerg, Dan; Madsen, Henrik
2016-04-01
Traditionally, flood risk assessment studies have been carried out from a univariate frequency analysis perspective. However, statistical dependence between hydrological variables, such as extreme rainfall and extreme sea surge, is plausible, since both variables are to some extent driven by common meteorological conditions. Aiming to overcome this limitation, multivariate statistical techniques have the potential to combine different sources of flooding in the investigation. The aim of this study was to apply a range of statistical methodologies for analyzing combined extreme hydrological variables that can lead to coastal and urban flooding. The study area is the Elwood Catchment, a highly urbanized catchment located in the city of Port Phillip, Melbourne, Australia. The first part of the investigation dealt with the marginal extreme value distributions. Two approaches to extracting extreme value series were applied (Annual Maximum and Partial Duration Series), and different probability distribution functions were fitted to the observed samples. Results obtained using the Generalized Pareto distribution demonstrate the ability of the Pareto family to model the extreme events. Advancing into multivariate extreme value analysis, an investigation of the asymptotic properties of extremal dependence was first carried out. As a weak positive asymptotic dependence between the bivariate extreme pairs was found, the conditional method proposed by Heffernan and Tawn (2004) was chosen. This approach is suitable for modelling bivariate extreme values that are relatively unlikely to occur together. The results show that the probability of an extreme sea surge occurring during a one-hour extreme precipitation event (or vice versa) can be twice as great as would be estimated assuming independent events. Therefore, presuming independence between these two variables would result in severe underestimation of the flooding risk in the study area.
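A small sketch of fitting the Generalized Pareto distribution to a partial duration series with SciPy, under an assumed threshold; all numbers are illustrative and declustering is omitted:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
surge = rng.gumbel(0.5, 0.2, 5000)      # hypothetical hourly sea-surge levels (m)

threshold = np.quantile(surge, 0.98)    # threshold choice is itself a modelling decision
exceedances = surge[surge > threshold]

# Fit the GPD to exceedances, with the location parameter fixed at the threshold
shape, loc, scale = stats.genpareto.fit(exceedances, floc=threshold)

# Return level corresponding to a 1-in-100 exceedance of the threshold
level = stats.genpareto.ppf(1 - 1.0 / 100, shape, loc=loc, scale=scale)
print(shape, scale, level)
```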
Visualization of the variability of 3D statistical shape models by animation.
Lamecker, Hans; Seebass, Martin; Lange, Thomas; Hege, Hans-Christian; Deuflhard, Peter
2004-01-01
Models of the 3D shape of anatomical objects and knowledge about their statistical variability are of great benefit in many computer-assisted medical applications such as image analysis, therapy, or surgery planning. Statistical shape models have been successfully applied to automate the task of image segmentation. The generation of 3D statistical shape models requires the identification of corresponding points on two shapes. This remains a difficult problem, especially for shapes of complicated topology. In order to interpret and validate the variations encoded in a statistical shape model, visual inspection is of great importance. This work describes the generation and interpretation of statistical shape models of the liver and the pelvic bone.
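The core of a point-based statistical shape model is a principal component analysis of corresponding landmark coordinates; the hedged sketch below, on synthetic landmark data, extracts the variability modes and generates a new shape instance:

```python
import numpy as np

rng = np.random.default_rng(4)
n_shapes, n_landmarks = 30, 500
# Training shapes: each row holds the flattened (x, y, z) coordinates of corresponding landmarks
shapes = rng.normal(size=(n_shapes, n_landmarks * 3))

mean_shape = shapes.mean(axis=0)
centered = shapes - mean_shape

# PCA via SVD: the rows of Vt are the shape variation modes
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
eigenvalues = s**2 / (n_shapes - 1)

# Generate a new shape instance: mean plus weighted modes (weights in units of std. dev.)
b = np.array([2.0, -1.0, 0.5])                       # weights of the first three modes
new_shape = mean_shape + (b * np.sqrt(eigenvalues[:3])) @ Vt[:3]
```

Animating shapes along each mode (sweeping the weights b through, e.g., -3 to +3 standard deviations) is precisely the kind of visual inspection of encoded variability that the abstract refers to.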
NASA Astrophysics Data System (ADS)
Engström, Emma; Mörtberg, Ulla; Karlström, Anders; Mangold, Mikael
2017-06-01
This study developed methodology for statistically assessing groundwater contamination mechanisms. It focused on microbial water pollution in low-income regions. Risk factors for faecal contamination of groundwater-fed drinking-water sources were evaluated in a case study in Juba, South Sudan. The study was based on counts of thermotolerant coliforms in water samples from 129 sources, collected by the humanitarian aid organisation Médecins Sans Frontières in 2010. The factors included hydrogeological settings, land use and socio-economic characteristics. The results showed that the residuals of a conventional probit regression model had a significant positive spatial autocorrelation (Moran's I = 3.05, I-stat = 9.28); therefore, a spatial model was developed that had better goodness-of-fit to the observations. The most significant factor in this model ( p-value 0.005) was the distance from a water source to the nearest Tukul area, an area with informal settlements that lack sanitation services. It is thus recommended that future remediation and monitoring efforts in the city be concentrated in such low-income regions. The spatial model differed from the conventional approach: in contrast with the latter case, lowland topography was not significant at the 5% level, as the p-value was 0.074 in the spatial model and 0.040 in the traditional model. This study showed that statistical risk-factor assessments of groundwater contamination need to consider spatial interactions when the water sources are located close to each other. Future studies might further investigate the cut-off distance that reflects spatial autocorrelation. Particularly, these results advise research on urban groundwater quality.
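A hedged numpy sketch of the spatial diagnostic described above, computing Moran's I for regression residuals with inverse-distance weights; the coordinates and residuals are synthetic placeholders, not the Juba data:

```python
import numpy as np

def morans_i(values, coords):
    """Moran's I with inverse-distance spatial weights (zero diagonal)."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = np.where(d > 0, 1.0 / np.where(d > 0, d, 1.0), 0.0)
    z = values - values.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

rng = np.random.default_rng(5)
coords = rng.uniform(0, 10, size=(129, 2))   # 129 water-source locations (arbitrary units)
residuals = rng.normal(size=129)             # probit-model residuals (placeholder)
print(morans_i(residuals, coords))
```

A clearly positive value on real residuals would indicate, as in the study, that nearby sources share unexplained contamination levels and that a spatial model is warranted.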
Data-optimized source modeling with the Backwards Liouville Test–Kinetic method
Woodroffe, J. R.; Brito, T. V.; Jordanova, V. K.; ...
2017-09-14
In the standard practice of neutron multiplicity counting, the first three sampled factorial moments of the event-triggered neutron count distribution are used to quantify the three main neutron source terms: the effective mass of spontaneously fissile material, the relative (α,n) production, and the induced fission source responsible for multiplication. Our study compares three methods for quantifying the statistical uncertainty of the estimated mass: the bootstrap method, propagation of variance through the moments, and statistical analysis of cycle data. Each of the three methods was implemented on a set of four different NMC measurements, held at the JRC laboratory in Ispra, Italy, sampling four different Pu samples in a standard Plutonium Scrap Multiplicity Counter (PSMC) well counter.
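As a hedged sketch of the bootstrap option named above, measurement cycles are resampled with replacement and the mass estimate is recomputed for each replicate; mass_from_moments is a hypothetical placeholder for the point-model equations, which in practice involve detector efficiency, gate fractions, and related calibration constants:

```python
import numpy as np

def mass_from_moments(singles, doubles, triples):
    # Hypothetical placeholder for the point-model mass calibration
    return doubles / max(singles, 1e-9)

def bootstrap_mass(cycle_counts, n_boot=2000, seed=6):
    """cycle_counts: array of shape (n_cycles, 3) with singles/doubles/triples per cycle."""
    rng = np.random.default_rng(seed)
    n = len(cycle_counts)
    masses = []
    for _ in range(n_boot):
        sample = cycle_counts[rng.integers(0, n, n)]   # resample cycles with replacement
        s, d, t = sample.mean(axis=0)
        masses.append(mass_from_moments(s, d, t))
    masses = np.array(masses)
    return masses.mean(), masses.std(ddof=1)           # estimate and its statistical uncertainty

rng = np.random.default_rng(7)
cycles = np.column_stack([rng.poisson(500, 300), rng.poisson(60, 300), rng.poisson(5, 300)])
print(bootstrap_mass(cycles))
```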
Stotts, Steven A; Koch, Robert A
2017-08-01
In this paper an approach is presented to estimate the constraint required to apply maximum entropy (ME) for statistical inference with underwater acoustic data from a single track segment. Previous algorithms for estimating the ME constraint require multiple source track segments to determine the constraint. The approach is relevant for addressing model mismatch effects, i.e., inaccuracies in parameter values determined from inversions because the propagation model does not account for all acoustic processes that contribute to the measured data. One effect of model mismatch is that the lowest cost inversion solution may be well outside a relatively well-known parameter value's uncertainty interval (prior), e.g., source speed from track reconstruction or towed source levels. The approach requires, for some particular parameter value, the ME constraint to produce an inferred uncertainty interval that encompasses the prior. Motivating this approach is the hypothesis that the proposed constraint determination procedure would produce a posterior probability density that accounts for the effect of model mismatch on inferred values of other inversion parameters for which the priors might be quite broad. Applications to both measured and simulated data are presented for model mismatch that produces minimum cost solutions either inside or outside some priors.
Application of crowd-sourced data to multi-scale evolutionary exposure and vulnerability models
NASA Astrophysics Data System (ADS)
Pittore, Massimiliano
2016-04-01
Seismic exposure, defined as the assets (population, buildings, infrastructure) exposed to earthquake hazard and susceptible to damage, is a critical but often neglected component of seismic risk assessment. This partly stems from the burden associated with compiling a useful and reliable model over wide spatial areas. While detailed engineering data still have to be collected in order to constrain exposure and vulnerability models, the availability of increasingly large crowd-sourced datasets (e.g., OpenStreetMap) opens up the exciting possibility of generating incrementally evolving models. Integrating crowd-sourced and authoritative data using statistical learning methodologies can reduce model uncertainties and also provide additional drive and motivation for volunteered geoinformation collection. A case study in Central Asia will be presented and discussed.
Investigation of Statistical Inference Methodologies Through Scale Model Propagation Experiments
2015-09-30
statistical inference methodologies for ocean-acoustic problems by investigating and applying statistical methods to data collected from scale-model...to begin planning experiments for statistical inference applications. APPROACH In the ocean acoustics community over the past two decades...solutions for waveguide parameters. With the introduction of statistical inference to the field of ocean acoustics came the desire to interpret marginal
Strongly magnetized classical plasma models
NASA Technical Reports Server (NTRS)
Montgomery, D. C.
1972-01-01
The class of plasma processes for which the so-called Vlasov approximation is inadequate is investigated. Results from the equilibrium statistical mechanics of two-dimensional plasmas are derived. These results are independent of the presence of an external dc magnetic field. The nonequilibrium statistical mechanics of the electrostatic guiding-center plasma, a two-dimensional plasma model, is discussed. This model is then generalized to three dimensions. The guiding-center model is relaxed to include finite Larmor radius effects for a two-dimensional plasma.
Geodetic positioning using a global positioning system of satellites
NASA Technical Reports Server (NTRS)
Fell, P. J.
1980-01-01
Geodetic positioning using range, integrated Doppler, and interferometric observations from a constellation of twenty-four Global Positioning System satellites is analyzed. A summary of the proposals for geodetic positioning and baseline determination is given, including a description of measurement techniques and comments on rank deficiency and error sources. An analysis-of-variance comparison of range, Doppler, and interferometric time delay is included to determine their relative geometric strength for baseline determination. An analytic examination of the effect of a priori constraints on positioning using simultaneous observations from two stations is presented. Dynamic point positioning and baseline determination using range and Doppler are examined in detail. Models for the error sources influencing dynamic positioning are developed. Included is a discussion of atomic clock stability, and range and Doppler observation error statistics based on random correlated atomic clock error are derived.
Toward the ICRF3: Astrometric Comparison of the USNO 2016A VLBI Solution with ICRF2 and Gaia DR1
NASA Astrophysics Data System (ADS)
Frouard, Julien; Johnson, Megan C.; Fey, Alan; Makarov, Valeri V.; Dorland, Bryan N.
2018-06-01
The VLBI USNO 2016A (U16A) solution is part of a work-in-progress effort by USNO toward the preparation of the ICRF3. Most of the astrometric improvement with respect to the ICRF2 is due to the re-observation of the VCS sources. Our objective in this paper is to assess the astrometry of U16A. A comparison with ICRF2 shows statistically significant offsets of size 0.1 mas between the two solutions. While Gaia DR1 positions are not precise enough to resolve these offsets, they are found to be significantly closer to U16A than to ICRF2. In particular, the trend toward typically larger errors for southern sources in VLBI solutions is reduced in U16A. Overall, the VLBI-Gaia offsets are reduced by 21%. The U16A list includes 718 sources not previously included in ICRF2. Twenty of those new sources have statistically significant radio-optical offsets. In two-thirds of these cases, the offsets can be explained from PanSTARRS images.
NASA Astrophysics Data System (ADS)
Kumar, J.; Jain, A.; Srivastava, R.
2005-12-01
The identification of pollution sources in aquifers is an important area of research not only for hydrologists but also for local and Federal agencies and defense organizations. Once data in the form of pollutant concentration measurements at observation wells become available, it is important to identify the polluting industry in order to implement punitive or remedial measures. Traditionally, hydrologists have relied on conceptual methods for the identification of groundwater pollution sources. The problem of identifying groundwater pollution sources using conceptual methods requires a thorough understanding of groundwater flow and contaminant transport processes, and inverse modeling procedures that are highly complex and difficult to implement. Recently, soft computing techniques such as artificial neural networks (ANNs) and genetic algorithms have provided an attractive and easy-to-implement alternative for solving complex problems efficiently. Some researchers have used ANNs for the identification of pollution sources in aquifers. A major problem with most previous studies using ANNs has been the large size of the neural networks needed to model the inverse problem. The breakthrough curves at an observation well may consist of hundreds of concentration measurements, and presenting all of them to the input layer of an ANN not only results in very large networks but also requires large amounts of training and testing data to develop the ANN models. This paper presents the results of a study aimed at using certain characteristics of the breakthrough curves and ANNs for determining the distance of the pollution source from a given observation well. Two different neural network models are developed that differ in the manner of characterizing the breakthrough curves. The first ANN model uses five parameters, similar to synthetic unit hydrograph parameters, to characterize the breakthrough curves. The five parameters employed are the peak concentration, the time to peak concentration, the widths of the breakthrough curve at 50% and 75% of the peak concentration, and the time base of the breakthrough curve. The second ANN model employs only the first four parameters, leaving out the time base. The measurement of a breakthrough curve at an observation well involves very high costs in sample collection at suitable time intervals and analysis for various contaminants. The receding portions of the breakthrough curves are normally very long, and excluding the time base from modeling would result in considerable cost savings. Feed-forward multi-layer perceptron (MLP) neural networks trained using the back-propagation algorithm are employed in this study. The ANN models for the two approaches were developed using simulated data generated for conservative pollutant transport through a homogeneous aquifer. A new approach for ANN training using back-propagation is employed that considers two different error statistics to prevent over-training and under-training of the ANNs. The preliminary results indicate that the ANNs are able to identify the location of the pollution source very efficiently with both methods of breakthrough curve characterization.
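A hedged sketch of the second (four-parameter) model, substituting a scikit-learn multi-layer perceptron for the custom back-propagation training described above; the breakthrough-curve features and source distances are synthetic placeholders:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)
n = 500
# Features: peak concentration, time to peak, widths at 50% and 75% of the peak
X = rng.uniform([1.0, 10.0, 5.0, 2.0], [10.0, 200.0, 80.0, 40.0], size=(n, 4))
distance = 0.5 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(0, 5, n)   # synthetic source distance (m)

X_tr, X_te, y_tr, y_te = train_test_split(X, distance, test_size=0.3, random_state=0)
ann = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0).fit(X_tr, y_tr)
print("test R^2:", ann.score(X_te, y_te))
```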
Development of LACIE CCEA-1 weather/wheat yield models. [regression analysis
NASA Technical Reports Server (NTRS)
Strommen, N. D.; Sakamoto, C. M.; Leduc, S. K.; Umberger, D. E. (Principal Investigator)
1979-01-01
The advantages and disadvantages of the causal (phenological, dynamic, physiological), statistical regression, and analog approaches to modeling grain yield are examined. Given LACIE's primary goal of estimating wheat production for the large areas of eight major wheat-growing regions, the statistical regression approach of correlating historical yield and climate data offered the Center for Climatic and Environmental Assessment the greatest potential return within the constraints of time and data sources. The basic equation for the first-generation wheat-yield model is given. Topics discussed include truncation, the trend variable, selection of weather variables, episodic events, strata selection, operational data flow, weighting, and model results.
Shen, Shihao; Park, Juw Won; Lu, Zhi-xiang; Lin, Lan; Henry, Michael D; Wu, Ying Nian; Zhou, Qing; Xing, Yi
2014-12-23
Ultra-deep RNA sequencing (RNA-Seq) has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We previously developed multivariate analysis of transcript splicing (MATS), a statistical method for detecting differential alternative splicing between two RNA-Seq samples. Here we describe a new statistical model and computer program, replicate MATS (rMATS), designed for detection of differential alternative splicing from replicate RNA-Seq data. rMATS uses a hierarchical model to simultaneously account for sampling uncertainty in individual replicates and variability among replicates. In addition to the analysis of unpaired replicates, rMATS also includes a model specifically designed for paired replicates between sample groups. The hypothesis-testing framework of rMATS is flexible and can assess the statistical significance over any user-defined magnitude of splicing change. The performance of rMATS is evaluated by the analysis of simulated and real RNA-Seq data. rMATS outperformed two existing methods for replicate RNA-Seq data in all simulation settings, and RT-PCR yielded a high validation rate (94%) in an RNA-Seq dataset of prostate cancer cell lines. Our data also provide guiding principles for designing RNA-Seq studies of alternative splicing. We demonstrate that it is essential to incorporate biological replicates in the study design. Of note, pooling RNAs or merging RNA-Seq data from multiple replicates is not an effective approach to account for variability, and the result is particularly sensitive to outliers. The rMATS source code is freely available at rnaseq-mats.sourceforge.net/. As the popularity of RNA-Seq continues to grow, we expect rMATS will be useful for studies of alternative splicing in diverse RNA-Seq projects.
Jones, Hayley E; Hickman, Matthew; Kasprzyk-Hordern, Barbara; Welton, Nicky J; Baker, David R; Ades, A E
2014-07-15
Concentrations of metabolites of illicit drugs in sewage water can be measured with great accuracy and precision, thanks to the development of sensitive and robust analytical methods. Based on assumptions about factors including the excretion profile of the parent drug, routes of administration and the number of individuals using the wastewater system, the level of consumption of a drug can be estimated from such measured concentrations. When presenting results from these 'back-calculations', the multiple sources of uncertainty are often discussed, but are not usually explicitly taken into account in the estimation process. In this paper we demonstrate how these calculations can be placed in a more formal statistical framework by assuming a distribution for each parameter involved, based on a review of the evidence underpinning it. Using a Monte Carlo simulation approach, it is then straightforward to propagate uncertainty in each parameter through the back-calculations, producing a distribution, rather than a single estimate, of daily or average consumption. This can be summarised, for example, by a median and credible interval. To demonstrate this approach, we estimate cocaine consumption in a large urban UK population, using measured concentrations of two of its metabolites, benzoylecgonine and norbenzoylecgonine. We also demonstrate a more sophisticated analysis, implemented within a Bayesian statistical framework using Markov chain Monte Carlo simulation. Our model allows the two metabolites to simultaneously inform estimates of daily cocaine consumption and explicitly allows for variability between days. After accounting for this variability, the resulting credible interval for average daily consumption is appropriately wider, representing additional uncertainty. We discuss possibilities for extensions to the model, and whether analysis of wastewater samples has the potential to contribute to a prevalence model for illicit drug use. Copyright © 2014. Published by Elsevier B.V.
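A hedged Monte Carlo sketch of the propagation step described above; the parameter distributions (metabolite concentration, flow, excretion fraction, population) are illustrative assumptions, not the values used in the study:

```python
import numpy as np

rng = np.random.default_rng(9)
n_sim = 100_000

conc_be = rng.normal(1000, 50, n_sim)          # benzoylecgonine concentration (ng/L), assumed
flow = rng.normal(300e6, 20e6, n_sim)          # daily wastewater flow (L/day), assumed
excretion = rng.beta(30, 70, n_sim)            # fraction of cocaine excreted as BE, assumed
population = rng.normal(1.0e6, 0.05e6, n_sim)  # population served, assumed
mw_ratio = 303.4 / 289.3                       # molar mass ratio, cocaine / benzoylecgonine

load_mg = conc_be * flow * 1e-6                # ng/L * L/day -> mg/day of metabolite
consumption = load_mg * mw_ratio / excretion / population * 1000   # mg/day per 1000 inhabitants

print(np.median(consumption), np.percentile(consumption, [2.5, 97.5]))
```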
NASA Astrophysics Data System (ADS)
Manzanas, R., Sr.; Brands, S.; San Martin, D., Sr.; Gutiérrez, J. M., Sr.
2014-12-01
This work shows that local-scale climate projections obtained by means of statistical downscaling are sensitive to the choice of reanalysis used for calibration. To this aim, a Generalized Linear Model (GLM) approach is applied to downscale daily precipitation in the Philippines. First, the GLMs are trained and tested, under a cross-validation scheme, separately for two distinct reanalyses (ERA-Interim and JRA-25) for the period 1981-2000. When the observed and downscaled time series are compared, the attained performance is found to be sensitive to the reanalysis considered if climate-change-signal-bearing variables (temperature and/or specific humidity) are included in the predictor field. Moreover, the performance differences are shown to correspond to the disagreement found between the raw predictors from the two reanalyses. Second, the regression coefficients calibrated with either ERA-Interim or JRA-25 are subsequently applied to the output of a Global Climate Model (MPI-ECHAM5) in order to assess the sensitivity of local-scale climate change projections (up to 2100) to the choice of reanalysis. In this case, the differences detected under present climate conditions are considerably amplified, leading to "delta-change" estimates differing by up to 35% (on average for the entire country) depending on the reanalysis used for calibration. Therefore, the choice of reanalysis is shown to contribute importantly to the uncertainty of local-scale climate change projections and, consequently, should be treated with the same care as other well-known sources of uncertainty, such as the choice of the GCM and/or downscaling method. Implications of the results for the entire tropics, as well as for the Model Output Statistics downscaling approach, are also briefly discussed.
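One possible GLM setup for wet-day precipitation amounts is a Gamma regression with a log link on reanalysis predictors; the hedged sketch below shows only the calibration step, with synthetic predictors, and the study's actual GLM specification may differ:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 1000
# Synthetic standardized reanalysis predictors (e.g., humidity, temperature, geopotential)
X = rng.normal(size=(n, 3))
mu = np.exp(0.5 + 0.4 * X[:, 0] + 0.2 * X[:, 1])
precip = rng.gamma(shape=2.0, scale=mu / 2.0)   # wet-day amounts (mm), mean = mu

# Gamma GLM with log link (older statsmodels versions name the link class links.log)
glm = sm.GLM(precip, sm.add_constant(X),
             family=sm.families.Gamma(link=sm.families.links.Log())).fit()
print(glm.params)
```

Fitting the same specification twice, once per reanalysis, and comparing the resulting coefficients and cross-validated skill is the essence of the sensitivity test described above.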
The importance of hydrological uncertainty assessment methods in climate change impact studies
NASA Astrophysics Data System (ADS)
Honti, M.; Scheidegger, A.; Stamm, C.
2014-08-01
Climate change impact assessments have become more and more popular in hydrology since the mid-1980s, with a recent boost after the publication of the IPCC AR4 report. From hundreds of impact studies a quasi-standard methodology has emerged, shaped to a large extent by the growing public demand for predicting how water resources management or flood protection should change in the coming decades. The "standard" workflow relies on a model cascade from global circulation model (GCM) predictions for selected IPCC scenarios to future catchment hydrology. Uncertainty is present at each level and propagates through the model cascade. There is an emerging consensus among many studies on the relative importance of the different uncertainty sources. The prevailing perception is that GCM uncertainty dominates hydrological impact studies. Our hypothesis was that the relative importance of climatic and hydrologic uncertainty is (among other factors) heavily influenced by the uncertainty assessment method. To test this, we carried out a climate change impact assessment and estimated the relative importance of the uncertainty sources. The study was performed on two small catchments in the Swiss Plateau with a lumped conceptual rainfall-runoff model. In the climatic part we applied the standard ensemble approach to quantify uncertainty, while in hydrology we used formal Bayesian uncertainty assessment with two different likelihood functions. One was a time series error model that was able to deal with the complicated statistical properties of hydrological model residuals. The second was an approximate likelihood function for the flow quantiles. The results showed that the expected climatic impact on flow quantiles was small compared to prediction uncertainty. The choice of uncertainty assessment method actually determined which sources of uncertainty could be identified at all. This demonstrated that one can arrive at rather different conclusions about the causes of predictive uncertainty for the same hydrological model and calibration data when considering different objective functions for calibration.
Statistics of Dark Matter Halos from Gravitational Lensing.
Jain; Van Waerbeke L
2000-02-10
We present a new approach to measure the mass function of dark matter halos and to discriminate models with differing values of Omega through weak gravitational lensing. We measure the distribution of peaks from simulated lensing surveys and show that the lensing signal due to dark matter halos can be detected for a wide range of peak heights. Even when the signal-to-noise ratio is well below the limit for detection of individual halos, projected halo statistics can be constrained for halo masses spanning galactic to cluster halos. The use of peak statistics relies on an analytical model of the noise due to the intrinsic ellipticities of source galaxies. The noise model has been shown to accurately describe simulated data for a variety of input ellipticity distributions. We show that the measured peak distribution has distinct signatures of gravitational lensing, and its non-Gaussian shape can be used to distinguish models with different values of Omega. The use of peak statistics is complementary to the measurement of field statistics, such as the ellipticity correlation function, and is possibly not susceptible to the same systematic errors.
Organic-rich source beds and hydrocarbon production in Gulf Coast region
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, D.F.; Lerche, I.
1988-09-01
Two models (I and II) are presented that relate the production of hydrocarbons in the Gulf Coast region to organic-rich source beds of ancient intraslope basins. Model I is empirical, based on present-day depositional environments like the anoxic Orca basin of the northern Gulf of Mexico and the Bannock basin of the eastern Mediterranean Sea. Model I proposes that low oxygen levels in intraslope basins of the northwestern Gulf of Mexico (GOM) have been a common mechanism for the accumulation of sediments with significantly increased amounts of marine organic carbon. In Model I, progradation of the shelf-slope and regional salt tectonics control the occurrence and stratigraphic distribution of source beds throughout the Tertiary of the GOM. In turn, the maturation history of these organic-rich sediments is influenced by the high thermal conductivity of the underlying salt structures. Model II is statistical; it uses random number theory to suggest that the occurrence of organic-rich black muds in intraslope basins of the northwestern GOM had sufficient capacity to account for a dynamic range estimate of 30 to 500 billion bbl of oil in total and 30 to 300 bcf/million years of gas per ephemeral basin. These estimates, while approximate, clearly indicate the enormous hydrocarbon potential for generating oil and gas reserves in the Gulf Coast geosyncline. Such estimates underscore the need for a better understanding of intraslope basins of the northwestern GOM.
Distribution patterns of mercury in Lakes and Rivers of northeastern North America
Dennis, Ian F.; Clair, Thomas A.; Driscoll, Charles T.; Kamman, Neil; Chalmers, Ann T.; Shanley, Jamie; Norton, Stephen A.; Kahl, Steve
2005-01-01
We assembled 831 data points for total mercury (Hgt) and 277 overlapping points for methyl mercury (CH3Hg+) in surface waters from Massachusetts, USA, to the Island of Newfoundland, Canada, from State, Provincial, and Federal government databases. These geographically indexed values were used to determine (a) whether large-scale spatial distribution patterns existed and (b) whether there were significant relationships between the two main forms of aquatic Hg, as well as with total organic carbon (TOC), a well-known complexing agent for metals. We analyzed the catchments where samples were collected using a Geographical Information System (GIS) approach, calculating catchment sizes, mean slope, and mean wetness index. Our results show two main spatial distribution patterns. We detected loci of high Hgt values near the urbanized regions of Boston MA and Portland ME. However, with one unexplained exception, the highest Hgt and CH3Hg+ concentrations were located in regions far from obvious point sources. These correlated with topographically flat (and thus wet) areas that we relate to wetland abundance. We show that aquatic Hgt and CH3Hg+ concentrations are generally well correlated with TOC and with each other. Over the region, CH3Hg+ concentrations are typically approximately 15% of Hgt. There is an exception in the Boston region, where CH3Hg+ is low compared to the high Hgt values. This is probably due to the proximity of point sources of inorganic Hg and a lack of wetlands. We also attempted to predict Hg concentrations in water with statistical models using catchment features as variables. We were only able to produce statistically significant predictive models in parts of some regions, due to the lack of suitable digital information and because data ranges in some regions were too narrow for meaningful regression analyses.
scoringRules - A software package for probabilistic model evaluation
NASA Astrophysics Data System (ADS)
Lerch, Sebastian; Jordan, Alexander; Krüger, Fabian
2016-04-01
Models in the geosciences are generally surrounded by uncertainty, and being able to quantify this uncertainty is key to good decision making. Accordingly, probabilistic forecasts in the form of predictive distributions have become popular over the last decades. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way. Various scoring rules have been developed over the past decades to address this demand. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F, given that an outcome y was observed. As such, they allow the comparison of alternative models, a crucial ability given the variety of theories, data sources and statistical specifications that are available in many situations. This poster presents the software package scoringRules for the statistical programming language R, which contains functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. Two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, Bayesian forecasts produced via Markov chain Monte Carlo take this form. Thereby, the scoringRules package provides a framework for generalized model evaluation that includes both Bayesian and classical parametric models. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state-of-the-art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices.
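scoringRules itself is an R package; as a language-neutral illustration of what a proper scoring rule computes, the sketch below evaluates the standard closed-form CRPS of a Gaussian predictive distribution in Python:

```python
import numpy as np
from scipy.stats import norm

def crps_normal(mu, sigma, y):
    """Closed-form CRPS for a normal predictive distribution N(mu, sigma) and observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

# Compare two competing forecasts for the same observation: lower CRPS is better
print(crps_normal(mu=2.0, sigma=1.0, y=2.5))
print(crps_normal(mu=3.0, sigma=2.0, y=2.5))
```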
NASA Technical Reports Server (NTRS)
Alexandrov, Mikhail Dmitrievic; Geogdzhayev, Igor V.; Tsigaridis, Konstantinos; Marshak, Alexander; Levy, Robert; Cairns, Brian
2016-01-01
A novel model for the variability in aerosol optical thickness (AOT) is presented. This model is based on the consideration of AOT fields as realizations of a stochastic process, that is the exponent of an underlying Gaussian process with a specific autocorrelation function. In this approach AOT fields have lognormal PDFs and structure functions having the correct asymptotic behavior at large scales. The latter is an advantage compared with fractal (scale-invariant) approaches. The simple analytical form of the structure function in the proposed model facilitates its use for the parameterization of AOT statistics derived from remote sensing data. The new approach is illustrated using a month-long global MODIS AOT dataset (over ocean) with 10 km resolution. It was used to compute AOT statistics for sample cells forming a grid with 5deg spacing. The observed shapes of the structure functions indicated that in a large number of cases the AOT variability is split into two regimes that exhibit different patterns of behavior: small-scale stationary processes and trends reflecting variations at larger scales. The small-scale patterns are suggested to be generated by local aerosols within the marine boundary layer, while the large-scale trends are indicative of elevated aerosols transported from remote continental sources. This assumption is evaluated by comparison of the geographical distributions of these patterns derived from MODIS data with those obtained from the GISS GCM. This study shows considerable potential to enhance comparisons between remote sensing datasets and climate models beyond regional mean AOTs.
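A hedged one-dimensional sketch of the construction described above: a Gaussian process with an exponential autocorrelation function is sampled and exponentiated, giving a lognormal 'AOT' series whose second-order structure function can then be examined (grid spacing, correlation length and variance are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(11)
n, dx, corr_len, sigma2 = 512, 10.0, 200.0, 0.25   # grid points, spacing (km), length scale, variance

x = np.arange(n) * dx
cov = sigma2 * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)    # exponential autocorrelation
g = np.linalg.cholesky(cov + 1e-10 * np.eye(n)) @ rng.normal(size=n)  # Gaussian process sample
aot = np.exp(g - sigma2 / 2)                                          # lognormal field, mean close to 1

# Second-order structure function D(r) = <(tau(x + r) - tau(x))^2>
lags = np.arange(1, 100)
D = np.array([np.mean((aot[lag:] - aot[:-lag]) ** 2) for lag in lags])
```

By construction, D(r) saturates at separations much larger than the correlation length, which is the non-fractal large-scale behaviour highlighted in the abstract.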
Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan
2010-09-01
Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genome. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessment and other environmental models. This model was validated by both statistical simulations and real world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It could perform reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). Further improvement on the precision of sample processing and qPCR reaction would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator where a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
The First Planetary Microlensing Event with Two Microlensed Source Stars
NASA Astrophysics Data System (ADS)
Bennett, D. P.; Udalski, A.; Han, C.; Bond, I. A.; Beaulieu, J.-P.; Skowron, J.; Gaudi, B. S.; Koshimoto, N.; Abe, F.; Asakura, Y.; Barry, R. K.; Bhattacharya, A.; Donachie, M.; Evans, P.; Fukui, A.; Hirao, Y.; Itow, Y.; Li, M. C. A.; Ling, C. H.; Masuda, K.; Matsubara, Y.; Muraki, Y.; Nagakane, M.; Ohnishi, K.; Oyokawa, H.; Ranc, C.; Rattenbury, N. J.; Rosenthal, M. M.; Saito, To.; Sharan, A.; Sullivan, D. J.; Sumi, T.; Suzuki, D.; Tristram, P. J.; Yonehara, A.; The MOA Collaboration; Szymański, M. K.; Poleski, R.; Soszyński, I.; Ulaczyk, K.; Wyrzykowski, Ł.; The OGLE Collaboration; DePoy, D.; Gould, A.; Pogge, R. W.; Yee, J. C.; The μFUN Collaboration; Albrow, M. D.; Bachelet, E.; Batista, V.; Bowens-Rubin, R.; Brillant, S.; Caldwell, J. A. R.; Cole, A.; Coutures, C.; Dieters, S.; Dominis Prester, D.; Donatowicz, J.; Fouqué, P.; Horne, K.; Hundertmark, M.; Kains, N.; Kane, S. R.; Marquette, J.-B.; Menzies, J.; Pollard, K. R.; Ranc, C.; Sahu, K. C.; Wambsganss, J.; Williams, A.; Zub, M.; The PLANET Collaboration
2018-03-01
We present the analysis of the microlensing event MOA-2010-BLG-117, and show that the light curve can only be explained by the gravitational lensing of a binary source star system by a star with a Jupiter mass-ratio planet. It was necessary to modify standard microlensing modeling methods to find the correct light curve solution for this binary-source, binary-lens event. We are able to measure a strong microlensing parallax signal, which yields the masses of the host star, M* = 0.58 ± 0.11 M⊙, and planet, mp = 0.54 ± 0.10 MJup, at a projected star-planet separation of a⊥ = 2.42 ± 0.26 au, corresponding to a semimajor axis of a = 2.9 (+1.6/-0.6) au. Thus, the system resembles a half-scale model of the Sun-Jupiter system, with a half-Jupiter-mass planet orbiting a half-solar-mass star at very roughly half of Jupiter's orbital distance from the Sun. The source stars are slightly evolved, and by requiring them to lie on the same isochrone, we can constrain the source to lie on the near side of the bulge at a distance of DS = 6.9 ± 0.7 kpc, which implies a distance to the planetary lens system of DL = 3.5 ± 0.4 kpc. The ability to model unusual planetary microlensing events, like this one, will be necessary to extract precise statistical information from the planned large exoplanet microlensing surveys, such as the WFIRST microlensing survey.
NASA Astrophysics Data System (ADS)
Brunner, Dominik; Henne, Stephan; Keller, Christoph A.; Reimann, Stefan; Vollmer, Martin K.; O'Doherty, Simon
2010-05-01
Halogenated hydrocarbons in the atmosphere are mostly synthetic products of the chemical industry designed for a wide range of applications. The first generation of compounds, the bromine- and chlorine-containing halons and chlorofluorocarbons (CFCs), were shown to be harmful to the stratospheric ozone layer. This motivated the international community to initiate the Montreal Protocol in 1987 to phase out their production globally. In the industrialized countries CFCs were consequently replaced by the shorter-lived hydrochlorofluorocarbons (HCFCs) during the 1990s and thereafter by the completely chlorine-free HFCs. Although not harmful to the ozone layer anymore, some of the HFCs are potent greenhouse gases and are therefore regulated under the Kyoto Protocol. The high-alpine station Jungfraujoch and the coastal station Mace Head are two of only four sites of the European SOGE network (System for Observation of Halogenated Greenhouse Gases in Europe) with high-frequency measurements of halogenated compounds. Based on observations at these two sites, we here present a combined measurement - model analysis of the distribution of European emissions for a selection of compounds, and trace their evolution with time since measurements started in 2000. For the spatial allocation of sources, the measurements were combined with detailed transport simulations. For a qualitative allocation of sources in Europe we employed the trajectory statistics method of Seibert et al. (1994) and Stohl (1996). For Mace Head trajectories were computed with the FLEXPART model driven by ECMWF analyzed winds at 1°x1° resolution. For the station Jungfraujoch, however, we used the model COSMO-TRAJ driven by high-resolution wind fields (7 km x 7 km) of the weather forecast model COSMO of MeteoSwiss in order to better represent the transport in complex topography over the Alps. The method allows identifying the major source regions of the different compounds in Western and Central Europe. The pesticide methyl bromide (CH3Br), for example, was applied primarily in southern Europe to protect vegetable and strawberry plantations. Its production was banned by the Montreal Protocol which is reflected by a strong reduction in emissions between 2003 and 2008 as seen from Jungfraujoch. A contrasting example is the cooling agent HFC-125 belonging to the second generation of replacement compounds not regulated under the Montreal Protocol. During the same period, HFC-125 exhibited a marked increase with sources more homogeneously spread over Europe than those of CH3Br. For a more quantitative analysis for the years 2007-2009, we applied the Lagrangian Particle Dispersion Model FLEXPART using meteorological input data of the IFS model of ECMWF at 0.2° x 0.2° resolution, together with a new source inversion method based on sequential Kalman filtering. Different from other approaches the method is essentially independent of an a-priori and adjusts both the emission field and the trace gas background levels in an iterative fashion. In this study, we will contrast results of the trajectory statistics method with the more advanced source inversion, address uncertainties in the methods, and show the evolution of European emissions of a selection of compounds in comparison to official numbers reported by the individual countries to the Montreal and Kyoto protocols, respectively.
Online and offline tools for head movement compensation in MEG.
Stolk, Arjen; Todorovic, Ana; Schoffelen, Jan-Mathijs; Oostenveld, Robert
2013-03-01
Magnetoencephalography (MEG) is measured above the head, which makes it sensitive to variations of the head position with respect to the sensors. Head movements blur the topography of the neuronal sources of the MEG signal, increase localization errors, and reduce statistical sensitivity. Here we describe two novel and readily applicable methods that compensate for the detrimental effects of head motion on the statistical sensitivity of MEG experiments. First, we introduce an online procedure that continuously monitors head position. Second, we describe an offline analysis method that takes into account the head position time-series. We quantify the performance of these methods in the context of three different experimental settings, involving somatosensory, visual and auditory stimuli, assessing both individual and group-level statistics. The online head localization procedure allowed for optimal repositioning of the subjects over multiple sessions, resulting in a 28% reduction of the variance in dipole position and an improvement of up to 15% in statistical sensitivity. Offline incorporation of the head position time-series into the general linear model resulted in improvements of group-level statistical sensitivity between 15% and 29%. These tools can substantially reduce the influence of head movement within and between sessions, increasing the sensitivity of many cognitive neuroscience experiments. Copyright © 2012 Elsevier Inc. All rights reserved.
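A hedged sketch of the offline idea: the head-position time-series enters a general linear model as confound regressors whose fitted contribution is removed from each sensor channel; the data shapes and signals are synthetic placeholders, not the actual MEG pipeline:

```python
import numpy as np

rng = np.random.default_rng(12)
n_samples, n_channels = 10_000, 50
headpos = rng.normal(size=(n_samples, 6))   # x, y, z translations and rotations over time
meg = rng.normal(size=(n_samples, n_channels)) + headpos[:, :1] * 0.5   # movement leaking into channels

# GLM with an intercept plus head-position regressors; remove their fitted contribution
design = np.column_stack([np.ones(n_samples), headpos])
beta, *_ = np.linalg.lstsq(design, meg, rcond=None)
meg_clean = meg - design[:, 1:] @ beta[1:]   # keep the intercept, regress out movement
```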
Galaxy mergers and gravitational lens statistics
NASA Technical Reports Server (NTRS)
Rix, Hans-Walter; Maoz, Dan; Turner, Edwin L.; Fukugita, Masataka
1994-01-01
We investigate the impact of hierarchical galaxy merging on the statistics of gravitational lensing of distant sources. Since no definite theoretical predictions for the merging history of luminous galaxies exist, we adopt a parameterized prescription, which allows us to adjust the expected number of pieces, at z approximately 0.65, that comprise a typical present-day galaxy. The existence of global parameter relations for elliptical galaxies and constraints on the evolution of the phase space density in dissipationless mergers allow us to limit the possible evolution of galaxy lens properties under merging. We draw two lessons from implementing this lens evolution into statistical lens calculations: (1) The total optical depth to multiple imaging (e.g., of quasars) is quite insensitive to merging. (2) Merging leads to a smaller mean separation of observed multiple images. Because merging does not drastically reduce the expected lensing frequency, it cannot make lambda-dominated cosmologies compatible with the existing lensing observations. A comparison with the data from the Hubble Space Telescope (HST) Snapshot Survey shows that models with little or no evolution of the lens population are statistically favored over strong merging scenarios. A specific merging scenario proposed by Toomre can be rejected (95% level) by such a comparison. Some versions of the scenario proposed by Broadhurst, Ellis, & Glazebrook are statistically acceptable.
Vortex dynamics and Lagrangian statistics in a model for active turbulence.
James, Martin; Wilczek, Michael
2018-02-14
Cellular suspensions such as dense bacterial flows exhibit a turbulence-like phase under certain conditions. We study this phenomenon of "active turbulence" statistically by using numerical tools. Following Wensink et al. (Proc. Natl. Acad. Sci. U.S.A. 109, 14308 (2012)), we model active turbulence by means of a generalized Navier-Stokes equation. Two-point velocity statistics of active turbulence, both in the Eulerian and the Lagrangian frames, are explored. We characterize the scale-dependent features of two-point statistics in this system. Furthermore, we extend this statistical study with measurements of vortex dynamics. Our observations suggest that the large-scale statistics of active turbulence are close to Gaussian with sub-Gaussian tails.
One-dimensional statistical parametric mapping in Python.
Pataky, Todd C
2012-01-01
Statistical parametric mapping (SPM) is a topological methodology for detecting field changes in smooth n-dimensional continua. Many classes of biomechanical data are smooth and contained within discrete bounds and as such are well suited to SPM analyses. The current paper accompanies release of 'SPM1D', a free and open-source Python package for conducting SPM analyses on a set of registered 1D curves. Three example applications are presented: (i) kinematics, (ii) ground reaction forces and (iii) contact pressure distribution in probabilistic finite element modelling. In addition to offering a high-level interface to a variety of common statistical tests like t tests, regression and ANOVA, SPM1D also emphasises fundamental concepts of SPM theory through stand-alone example scripts. Source code and documentation are available at: www.tpataky.net/spm1d/.
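The core idea behind SPM on registered 1D curves can be sketched without the package itself: a test statistic is computed at every node of the continuum, and SPM1D then supplies the random-field-theory threshold for inference. The following numpy sketch computes only the pointwise two-sample t continuum and is not a substitute for the package's corrected inference.

```python
import numpy as np

def two_sample_t_curve(YA, YB):
    """Pointwise two-sample t statistic for registered 1D curves.

    YA, YB : (n_subjects, n_nodes) arrays of registered curves (e.g. gait cycles).
    Returns a t statistic at every node; SPM-style inference would then threshold
    this continuum using random field theory (as implemented in spm1d).
    """
    nA, nB = len(YA), len(YB)
    mA, mB = YA.mean(axis=0), YB.mean(axis=0)
    sA, sB = YA.var(axis=0, ddof=1), YB.var(axis=0, ddof=1)
    pooled = ((nA - 1) * sA + (nB - 1) * sB) / (nA + nB - 2)
    return (mA - mB) / np.sqrt(pooled * (1.0 / nA + 1.0 / nB))
```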
NASA Astrophysics Data System (ADS)
Zaichik, Leonid I.; Alipchenkov, Vladimir M.
2009-10-01
The purpose of this paper is twofold: (i) to advance and extend the statistical two-point models of pair dispersion and particle clustering in isotropic turbulence that were previously proposed by Zaichik and Alipchenkov (2003 Phys. Fluids 15 1776-87; 2007 Phys. Fluids 19 113308) and (ii) to present some applications of these models. The models developed are based on a kinetic equation for the two-point probability density function of the relative velocity distribution of two particles. These models predict the pair relative velocity statistics and the preferential accumulation of heavy particles in stationary and decaying homogeneous isotropic turbulent flows. Moreover, the models are applied to predict the effect of particle clustering on turbulent collisions, sedimentation and intensity of microwave radiation as well as to calculate the mean filtered subgrid stress of the particulate phase. Model predictions are compared with direct numerical simulations and experimental measurements.
Wu, Hao; Zhang, Yan; Yu, Qi; Ma, Weichun
2018-04-01
In this study, the authors endeavored to develop an effective framework for improving local urban air quality on meso-micro scales in cities in China that are experiencing rapid urbanization. Within this framework, the integrated Weather Research and Forecasting (WRF)/CALPUFF modeling system was applied to simulate the concentration distributions of typical pollutants (particulate matter with an aerodynamic diameter <10 μm [PM10], sulfur dioxide [SO2], and nitrogen oxides [NOx]) in the urban area of Benxi. Statistical analyses were performed to verify the credibility of this simulation, including the meteorological fields and concentration fields. The sources were then categorized using two different classification methods (the district-based and type-based methods), and the contributions to the pollutant concentrations from each source category were computed to provide a basis for appropriate control measures. The statistical indexes showed that CALMET had sufficient ability to predict the meteorological conditions, such as the wind fields and temperatures, which provided meteorological data for the subsequent CALPUFF run. The simulated concentrations from CALPUFF showed considerable agreement with the observed values but were generally underestimated. The spatial-temporal concentration pattern revealed that the maximum concentrations tended to appear in the urban centers and during the winter. In terms of their contributions to pollutant concentrations, the districts of Xihu, Pingshan, and Mingshan all affected the urban air quality to different degrees. According to the type-based classification, which categorized the pollution sources as belonging to the Bengang Group, large point sources, small point sources, and area sources, the source apportionment showed that the Bengang Group, the large point sources, and the area sources had considerable impacts on urban air quality. Finally, combined with the industrial characteristics, detailed control measures were proposed with which local policy makers could improve the urban air quality in Benxi. In summary, the results of this study showed that this framework has credibility for effectively improving urban air quality, based on the source apportionment of atmospheric pollutants. The authors endeavored to build an effective framework based on the integrated WRF/CALPUFF system to improve air quality on meso-micro scales in Chinese cities. Via this framework, the integrated modeling tool can be used to characterize the meteorological fields, concentration fields, and source apportionment of pollutants in a target area. The impacts of the classified sources on air quality, together with the industrial characteristics, can inform more effective control measures for improving air quality. Through the case study, the technical framework developed in this study, particularly the source apportionment, could provide important data and technical support for policy makers assessing air pollution at the city scale in China and elsewhere.
Detector-Response Correction of Two-Dimensional γ -Ray Spectra from Neutron Capture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rusev, G.; Jandel, M.; Arnold, C. W.
2015-05-28
The neutron-capture reaction produces a large variety of γ-ray cascades with different γ-ray multiplicities. A measured spectral distribution of these cascades for each γ-ray multiplicity is of importance to applications and studies of γ-ray statistical properties. The DANCE array, a 4π ball of 160 BaF2 detectors, is an ideal tool for measurement of neutron-capture γ-rays. The high granularity of DANCE enables measurements of high-multiplicity γ-ray cascades. The measured two-dimensional spectra (γ-ray energy, γ-ray multiplicity) have to be corrected for the DANCE detector response in order to compare them with predictions of the statistical model or use them in applications. The detector-response correction problem becomes more difficult for a 4π detection system than for a single detector. A trial-and-error approach and an iterative decomposition of γ-ray multiplets have been successfully applied to the detector-response correction. As a result, applications of the decomposition methods are discussed for two-dimensional γ-ray spectra measured at DANCE from γ-ray sources and from the 10B(n, γ) and 113Cd(n, γ) reactions.
The Transfer Function Model as a Tool to Study and Describe Space Weather Phenomena
NASA Technical Reports Server (NTRS)
Porter, Hayden S.; Mayr, Hans G.; Bhartia, P. K. (Technical Monitor)
2001-01-01
The Transfer Function Model (TFM) is a semi-analytical, linear model that is designed especially to describe thermospheric perturbations associated with magnetic storms and substorm activity. It is a multi-constituent model (N2, O, He, H, Ar) that accounts for wind-induced diffusion, which significantly affects not only the composition and mass density but also the temperature and wind fields. Because the TFM adopts a semianalytic approach in which the geometry and temporal dependencies of the driving sources are removed through the use of height-integrated Green's functions, it provides physical insight into the essential properties of the processes being considered, uncluttered by the accidental complexities that arise from particular source geometries and time dependences. Extending from the ground to 700 km, the TFM eliminates spurious effects due to arbitrarily chosen boundary conditions. A database of transfer functions, computed only once, can be used to synthesize a wide range of spatial and temporal source dependencies. The response synthesis can be performed quickly in real time using only limited computing capabilities. These features make the TFM unique among global dynamical models. Given these desirable properties, a version of the TFM has been developed for personal computers (PC) using advanced platform-independent 3D visualization capabilities. We demonstrate the model capabilities with simulations for different auroral sources, including the response of ducted gravity wave modes that propagate around the globe. The thermospheric response is found to depend strongly on the spatial and temporal frequency spectra of the storm. Such varied behavior is difficult to describe in statistical empirical models. To improve the capability of space weather prediction, the TFM thus could be grafted naturally onto existing statistical models using data assimilation.
Air entrainment and bubble statistics in three-dimensional breaking waves
NASA Astrophysics Data System (ADS)
Deike, L.; Popinet, S.; Melville, W. K.
2016-02-01
Wave breaking in the ocean is of fundamental importance for quantifying wave dissipation and air-sea interaction, including gas and momentum exchange, and for improving air-sea flux parametrizations for weather and climate models. Here we investigate air entrainment and bubble statistics in three-dimensional breaking waves through direct numerical simulations of the two-phase air-water flow using the open-source solver Gerris. As in previous 2D simulations, the dissipation due to breaking is found to be in good agreement with previous experimental observations and inertial-scaling arguments. For radii larger than the Hinze scale, the bubble size distribution is found to follow a power law of the radius, r^(-10/3), and to scale linearly with the time-dependent turbulent dissipation rate during the active breaking stage. The time-averaged bubble size distribution is found to follow the same power law of the radius and to scale linearly with the wave dissipation rate per unit length of breaking crest. We propose a phenomenological turbulent bubble break-up model that describes the numerical results and existing experimental results.
Statistical representation of multiphase flow
NASA Astrophysics Data System (ADS)
Subramaniam
2000-11-01
The relationship between two common statistical representations of multiphase flow, namely, the single-point Eulerian statistical representation of two-phase flow (D. A. Drew, Ann. Rev. Fluid Mech. (15), 1983), and the Lagrangian statistical representation of a spray using the droplet distribution function (F. A. Williams, Phys. Fluids 1 (6), 1958) is established for spherical dispersed-phase elements. This relationship is based on recent work which relates the droplet distribution function to single-droplet pdfs starting from a Liouville description of a spray (Subramaniam, Phys. Fluids 10 (12), 2000). The Eulerian representation, which is based on a random-field model of the flow, is shown to contain different statistical information from the Lagrangian representation, which is based on a point-process model. The two descriptions are shown to be simply related for spherical, monodisperse elements in statistically homogeneous two-phase flow, whereas such a simple relationship is precluded by the inclusion of polydispersity and statistical inhomogeneity. The common origin of these two representations is traced to a more fundamental statistical representation of a multiphase flow, whose concepts derive from a theory for dense sprays recently proposed by Edwards (Atomization and Sprays 10 (3-5), 2000). The issue of what constitutes a minimally complete statistical representation of a multiphase flow is resolved.
Dorazio, Robert M; Hunter, Margaret E
2015-11-03
Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
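As a rough illustration of the model class described here, the sketch below fits a binomial GLM with a complementary log-log link and a log partition-volume offset by direct maximum likelihood, so that the linear predictor returns the log nucleic acid concentration. This is a minimal sketch with hypothetical argument names; the paper itself fits such models with conventional statistical software.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def fit_dpcr_cloglog(positives, totals, volume, X=None):
    """Binomial GLM with complementary log-log link and log-volume offset for dPCR.

    positives : positive partition counts per reaction
    totals    : total partition counts per reaction
    volume    : partition volume (scalar or per-reaction)
    X         : optional matrix of covariates (an intercept is added automatically)

    Model: p_i = 1 - exp(-lambda_i * v), with log(lambda_i) = X_i @ beta,
    i.e. cloglog(p_i) = log(v) + X_i @ beta. Returns the fitted beta;
    exp(beta[0]) is the baseline concentration in copies per volume unit.
    """
    positives = np.asarray(positives, float)
    totals = np.asarray(totals, float)
    offset = np.log(np.broadcast_to(volume, positives.shape).astype(float))
    X = np.ones((len(positives), 1)) if X is None else np.column_stack([np.ones(len(positives)), X])

    def negloglik(beta):
        eta = X @ beta + offset                 # linear predictor including the offset
        p = 1.0 - np.exp(-np.exp(eta))          # inverse complementary log-log link
        p = np.clip(p, 1e-12, 1 - 1e-12)
        ll = (positives * np.log(p) + (totals - positives) * np.log1p(-p)
              + gammaln(totals + 1) - gammaln(positives + 1) - gammaln(totals - positives + 1))
        return -ll.sum()

    res = minimize(negloglik, x0=np.zeros(X.shape[1]), method="BFGS")
    return res.x
```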
A new statistical model to find bedrock, a prequel to geochemical mass balance
NASA Astrophysics Data System (ADS)
Fisher, B.; Rendahl, A. K.; Aufdenkampe, A. K.; Yoo, K.
2016-12-01
We present a new statistical model to assess weathering trends in deep weathering profiles. The Weathering Trends (WT) model is presented as an extension of the geochemical mass balance model (Brimhall & Dietrich, 1987), and is available as an open-source R library on GitHub (https://github.com/AaronRendahl/WeatheringTrends). WT uses element concentration data to determine the depth to fresh bedrock by assessing the maximum extent of weathering for all elements and the model applies confidence intervals on the depth to bedrock. WT models near-surface features and the shape of the weathering profile using a log transformation of data to capture the magnitude of changes that are relevant to geochemical kinetics and thermodynamics. The WT model offers a new, enhanced opportunity to characterize and understand biogeochemical weathering in heterogeneous rock types. We apply the model to two 21-meter drill cores in the Laurels Schist bedrock in the Christina River Basin Critical Zone Observatory in the Pennsylvania Piedmont. The Laurels Schist had inconclusive weathering indicators prior to development and application of WT model. The model differentiated between rock variability and weathering to delineate the maximum extent of weathering at 12.3 (CI 95% [9.2, 21.3]) meters in Ridge Well 1 and 7.2 (CI 95% [4.3, 13.0]) meters in Interfluve Well 2. The modeled extent to weathering is decoupled from the water table at the ridge, but coincides with the water table at the interfluve. These depths were applied as the parent material for the geochemical mass balance for the Laurels Schist. We test statistical approaches to assess the variability and correlation of immobile elements to facilitate the selection of the best immobile element for use in both models. We apply the model to other published data where the geochemical mass balance was applied, to demonstrate how the WT model provides additional information about weathering depth and weathering trends.
Assessing groundwater vulnerability to agrichemical contamination in the Midwest US
Burkart, M.R.; Kolpin, D.W.; James, D.E.
1999-01-01
Agrichemicals (herbicides and nitrate) are significant sources of diffuse pollution to groundwater. Indirect methods are needed to assess the potential for groundwater contamination by diffuse sources because groundwater monitoring is too costly to adequately define the geographic extent of contamination at a regional or national scale. This paper presents examples of the application of statistical, overlay and index, and process-based modeling methods for groundwater vulnerability assessments to a variety of data from the Midwest U.S. The principles for vulnerability assessment include both intrinsic (pedologic, climatologic, and hydrogeologic factors) and specific (contaminant and other anthropogenic factors) vulnerability of a location. Statistical methods use the frequency of contaminant occurrence, contaminant concentration, or contamination probability as a response variable. Statistical assessments are useful for defining the relations among explanatory and response variables whether they define intrinsic or specific vulnerability. Multivariate statistical analyses are useful for ranking variables critical to estimating water quality responses of interest. Overlay and index methods involve intersecting maps of intrinsic and specific vulnerability properties and indexing the variables by applying appropriate weights. Deterministic models use process-based equations to simulate contaminant transport and are distinguished from the other methods in their potential to predict contaminant transport in both space and time. An example of a one-dimensional leaching model linked to a geographic information system (GIS) to define a regional metamodel for contamination in the Midwest is included.
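As a minimal illustration of the statistical category of assessment methods, the sketch below fits a logistic regression in which the probability of detecting a contaminant in a well is the response and a few intrinsic and specific attributes are the explanatory variables. All values and variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical well data: columns = [soil permeability, depth to water table (m),
# nitrogen application rate (kg/ha)]; y = 1 where nitrate exceeded a threshold.
X = np.array([[1.2, 3.0, 150.], [0.4, 12.0, 60.], [2.1, 2.5, 200.],
              [0.8, 8.0, 90.], [1.9, 4.0, 170.], [0.3, 15.0, 40.]])
y = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X, y)
# Predicted probability of contamination at a new location acts as a simple vulnerability index
print(model.predict_proba([[1.0, 5.0, 120.]])[:, 1])
```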
Sauvé, Jean-François; Beaudry, Charles; Bégin, Denis; Dion, Chantal; Gérin, Michel; Lavoué, Jérôme
2013-05-01
Many construction activities can put workers at risk of breathing silica containing dusts, and there is an important body of literature documenting exposure levels using a task-based strategy. In this study, statistical modeling was used to analyze a data set containing 1466 task-based, personal respirable crystalline silica (RCS) measurements gathered from 46 sources to estimate exposure levels during construction tasks and the effects of determinants of exposure. Monte-Carlo simulation was used to recreate individual exposures from summary parameters, and the statistical modeling involved multimodel inference with Tobit models containing combinations of the following exposure variables: sampling year, sampling duration, construction sector, project type, workspace, ventilation, and controls. Exposure levels by task were predicted based on the median reported duration by activity, the year 1998, absence of source control methods, and an equal distribution of the other determinants of exposure. The model containing all the variables explained 60% of the variability and was identified as the best approximating model. Of the 27 tasks contained in the data set, abrasive blasting, masonry chipping, scabbling concrete, tuck pointing, and tunnel boring had estimated geometric means above 0.1mg m(-3) based on the exposure scenario developed. Water-fed tools and local exhaust ventilation were associated with a reduction of 71 and 69% in exposure levels compared with no controls, respectively. The predictive model developed can be used to estimate RCS concentrations for many construction activities in a wide range of circumstances.
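A Tobit-type model of the kind used here can be sketched as a left-censored regression on log-transformed concentrations, fit by maximum likelihood: detects contribute normal densities and non-detects contribute normal cumulative probabilities at the limit of detection. The sketch below uses hypothetical argument names and omits the Monte Carlo reconstruction and multimodel inference steps of the actual study.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_tobit(y, X, censored, lod):
    """Left-censored (Tobit-type) regression for log-exposure data.

    y        : observed log-concentrations (values where censored are ignored)
    X        : design matrix of exposure determinants (intercept included by caller)
    censored : boolean array, True where the measurement is below the LOD
    lod      : log limit of detection (scalar or per-sample)

    Maximizes the likelihood combining normal densities for detects and normal CDFs
    for non-detects; returns (beta, sigma).
    """
    lod = np.broadcast_to(lod, y.shape).astype(float)

    def negloglik(theta):
        beta, log_sigma = theta[:-1], theta[-1]
        sigma = np.exp(log_sigma)
        mu = X @ beta
        ll_obs = norm.logpdf(y[~censored], mu[~censored], sigma)   # detects
        ll_cens = norm.logcdf(lod[censored], mu[censored], sigma)  # non-detects
        return -(ll_obs.sum() + ll_cens.sum())

    res = minimize(negloglik, np.zeros(X.shape[1] + 1), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])
```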
Important Literature in Endocrinology: Citation Analysis and Historical Methodology.
ERIC Educational Resources Information Center
Hurt, C. D.
1982-01-01
Results of a study comparing two approaches to the identification of important literature in endocrinology reveal that the association between rankings of cited items using the two methods is not statistically significant and that use of citation or historical analysis alone will not result in the same set of literature. Forty-two sources are appended. (EJS)
Quantification of uncertainties in the tsunami hazard for Cascadia using statistical emulation
NASA Astrophysics Data System (ADS)
Guillas, S.; Day, S. J.; Joakim, B.
2016-12-01
We present new high resolution tsunami wave propagation and coastal inundation for the Cascadia region in the Pacific Northwest. The coseismic representation in this analysis is novel, and more realistic than in previous studies, as we jointly parametrize multiple aspects of the seabed deformation. Due to the large computational cost of such simulators, statistical emulation is required in order to carry out uncertainty quantification tasks, as emulators efficiently approximate simulators. The emulator replaces the tsunami model VOLNA by a fast surrogate, so we are able to efficiently propagate uncertainties from the source characteristics to wave heights, in order to probabilistically assess tsunami hazard for Cascadia. We employ a new method for the design of the computer experiments in order to reduce the number of runs while maintaining good approximation properties of the emulator. Out of the initial nine parameters, mostly describing the geometry and time variation of the seabed deformation, we drop two parameters since these turn out not to have an influence on the resulting tsunami waves at the coast. We model the impact of another parameter linearly as its influence on the wave heights is identified as linear. We combine this screening approach with the sequential design algorithm MICE (Mutual Information for Computer Experiments), which adaptively selects the input values at which to run the computer simulator, in order to maximize the expected information gain (mutual information) over the input space. As a result, the emulation is made possible and accurate. Starting from distributions of the source parameters that encapsulate geophysical knowledge of the possible source characteristics, we derive distributions of the tsunami wave heights along the coastline.
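A statistical emulator of this kind can be sketched with Gaussian process regression: a modest number of simulator runs at designed input points trains a fast surrogate that predicts the output, with uncertainty, at new source parameters. The sketch below uses scikit-learn with random inputs and a stand-in response purely for illustration; it is not the emulator, design algorithm, or simulator used in the study.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical training data: each row is a source-parameter vector (slip, rupture
# length, rise time, ...) and y is the simulated maximum wave height at one coastal
# location; here a simple analytic function stands in for the expensive solver.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(40, 7))               # 40 simulator runs, 7 inputs
y_train = np.sin(3 * X_train[:, 0]) + 0.5 * X_train[:, 1]   # stand-in for simulator output

kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(7))
emulator = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

# The emulator now predicts wave heights (with uncertainty) at new source parameters
# almost instantly, enabling Monte Carlo propagation of source uncertainty.
X_new = rng.uniform(0.0, 1.0, size=(5, 7))
mean, std = emulator.predict(X_new, return_std=True)
print(mean, std)
```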
NASA Astrophysics Data System (ADS)
Wilson, D.; Hopkins, C.
2015-04-01
For bending wave transmission across periodic box-like arrangements of plates, the effects of spatial filtering can be significant and this needs to be considered in the choice of prediction model. This paper investigates the errors that can occur with Statistical Energy Analysis (SEA) and the potential of using Advanced SEA (ASEA) to improve predictions. The focus is on the low- and mid-frequency range where plates only support local modes with low mode counts and the in situ modal overlap is relatively high. To increase the computational efficiency when using ASEA on large systems, a beam tracing method is introduced which groups together all rays with the same heading into a single beam. Based on a diffuse field on the source plate, numerical experiments are used to determine the angular distribution of incident power on receiver plate edges on linear and cuboid box-like structures. These show that on receiver plates which do not share a boundary with the source plate, the angular distribution on the receiver plate boundaries differs significantly from a diffuse field. SEA and ASEA predictions are assessed through comparison with finite element models. With rain-on-the-roof excitation on the source plate, the results show that compared to SEA, ASEA provides significantly better estimates of the receiver plate energy, but only where there are at least one or two bending modes in each one-third octave band. Whilst ASEA provides better accuracy than SEA, discrepancies still exist which become more apparent when the direct propagation path crosses more than three nominally identical structural junctions.
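The SEA baseline that both methods share can be sketched as a linear power balance: at each band centre frequency, the injected powers equal the powers dissipated by damping plus the net powers exchanged through the coupling loss factors, giving a linear system for the subsystem energies. The loss factors and powers below are illustrative placeholders, and ASEA's beam tracing correction is not included.

```python
import numpy as np

def sea_energies(omega, eta_damp, eta_coup, power_in):
    """Solve the steady-state SEA power balance for subsystem energies.

    omega    : angular centre frequency of the band (rad/s)
    eta_damp : (n,) damping loss factors of the n plates
    eta_coup : (n, n) coupling loss factors, eta_coup[i, j] = loss factor i -> j
    power_in : (n,) power injected into each subsystem (e.g. rain-on-the-roof on plate 0)

    Returns the subsystem energies E satisfying power_in = omega * L @ E.
    """
    n = len(eta_damp)
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = eta_damp[i] + sum(eta_coup[i, j] for j in range(n) if j != i)
        for j in range(n):
            if j != i:
                L[i, j] = -eta_coup[j, i]
    return np.linalg.solve(omega * L, power_in)

# Three-plate chain, excitation on the first plate (illustrative numbers only)
eta_c = np.array([[0.0, 0.01, 0.0], [0.01, 0.0, 0.01], [0.0, 0.01, 0.0]])
print(sea_energies(2 * np.pi * 1000, np.array([0.02, 0.02, 0.02]),
                   eta_c, np.array([1.0, 0.0, 0.0])))
```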
Payne, C L R; Scarborough, P; Rayner, M; Nonaka, K
2016-03-01
Insects have been the subject of recent attention as a potentially environmentally sustainable and nutritious alternative to traditional protein sources. The purpose of this paper is to test the hypothesis that insects are nutritionally preferable to meat, using two evaluative tools that are designed to combat over- and under-nutrition. We selected 183 datalines of publicly available data on the nutrient composition of raw cuts and offal of three commonly consumed meats (beef, pork and chicken), and six commercially available insect species, for energy and 12 relevant nutrients. We applied two nutrient profiling tools to this data: The Ofcom model, which is used in the United Kingdom, and the Nutrient Value Score (NVS), which has been used in East Africa. We compared the median nutrient profile scores of different insect species and meat types using non-parametric tests and applied Bonferroni adjustments to assess for statistical significance in differences. Insect nutritional composition showed high diversity between species. According to the Ofcom model, no insects were significantly 'healthier' than meat products. The NVS assigned crickets, palm weevil larvae and mealworm a significantly healthier score than beef (P<0.001) and chicken (P<0.001). No insects were statistically less healthy than meat. Insect nutritional composition is highly diverse in comparison with commonly consumed meats. The food category 'insects' contains some foods that could potentially exacerbate diet-related public health problems related to over-nutrition, but may be effective in combating under-nutrition.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramos-Mendez, J; Faddegon, B; Paganetti, H
2015-06-15
Purpose: We used TOPAS (TOPAS wraps and extends Geant4 for medical physicists) to compare Geant4 physics models with published data for neutron shielding calculations. Subsequently, we calculated the source terms and attenuation lengths (shielding data) of the total ambient dose equivalent (TADE) in concrete for neutrons produced by protons in brass. Methods: Stage1: The Bertini and Binary nuclear models available in Geant4 were compared with published attenuation at depth of the TADE in concrete and iron. Stage2: Shielding data of the TADE in concrete was calculated for 50–200 MeV proton beams on brass. Stage3: Shielding data from Stage2 was extrapolated for 235 MeV proton beams. This data was used in a point-line-source analytical model to calculate the ambient dose per unit therapeutic dose at two locations inside one treatment room at the Francis H Burr Proton Therapy Center. Finally, we compared these results with experimental data and full TOPAS simulations. Results: At larger angles (∼130°) the TADE in concrete calculated with the Bertini model was about 9 times larger than that calculated with the Binary model. The attenuation length in concrete calculated with the Binary model agreed with published data within 7%±0.4% (statistical uncertainty) for the deepest regions and 5%±0.1% for shallower regions. For iron the agreement was within 3%±0.1%. The ambient dose per therapeutic dose calculated with the Binary model, relative to the experimental data, was a ratio of 0.93±0.16 and 1.23±0.24 for two locations. The analytical model overestimated the dose by four orders of magnitude. These differences are attributed to the complexity of the geometry. Conclusion: The Binary and Bertini models gave comparable results, with the Binary model giving the best agreement with published data at large angle. Shielding data we calculated using the Binary model is useful for fast shielding calculations with other analytical models. This work was supported by National Cancer Institute Grant R01CA140735.
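The role of the tabulated source terms and attenuation lengths in an analytical estimate can be illustrated, in a strongly simplified point-source form, by combining inverse-square spreading with exponential attenuation in the shield. The function and numbers below are hypothetical and do not reproduce the point-line-source model referenced in the abstract.

```python
import math

def shielded_dose(source_term, distance_m, shield_thickness_m, attenuation_length_m):
    """Point-source analytical shielding estimate.

    source_term          : ambient dose equivalent per proton at 1 m, unshielded
                           (e.g. taken from Monte Carlo source terms)
    distance_m           : source-to-scoring-point distance
    shield_thickness_m   : concrete thickness along that line
    attenuation_length_m : attenuation length of the dose in concrete

    H = H0 * exp(-d / lambda) / r^2 (inverse-square spreading plus exponential attenuation).
    """
    return (source_term * math.exp(-shield_thickness_m / attenuation_length_m)
            / distance_m ** 2)

# Illustrative numbers only: 2 m of concrete at 5 m with a 0.45 m attenuation length
print(shielded_dose(source_term=1.0e-15, distance_m=5.0,
                    shield_thickness_m=2.0, attenuation_length_m=0.45))
```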
NASA Astrophysics Data System (ADS)
Kis, A.; Lemperger, I.; Wesztergom, V.; Menvielle, M.; Szalai, S.; Novák, A.; Hada, T.; Matsukiyo, S.; Lethy, A. M.
2016-12-01
The magnetotelluric method is widely applied for investigation of subsurface structures by imaging the spatial distribution of electric conductivity. The method is based on the experimental determination of the surface electromagnetic impedance tensor (Z) from surface geomagnetic and telluric registrations in two perpendicular orientations. In practical explorations the accurate estimation of Z necessitates the application of robust statistical methods for two reasons: (1) the geomagnetic and telluric time series are contaminated by man-made noise components, and (2) the non-homogeneous behavior of ionospheric current systems in the period range of interest (ELF-ULF and longer periods) results in systematic deviation of the impedance of individual time windows. Robust statistics mitigate both effects when Z is estimated for the purpose of subsurface investigations. However, accurate analysis of the long-term temporal variation of the first and second statistical moments of Z may provide valuable information about the characteristics of the ionospheric source current systems. Temporal variation of the extent, spatial variability and orientation of the ionospheric source currents has specific effects on the surface impedance tensor. Twenty-year-long geomagnetic and telluric recordings at the Nagycenk Geophysical Observatory provide a unique opportunity to reconstruct the so-called magnetotelluric source effect and obtain information about the spatial and temporal behavior of ionospheric source currents at mid-latitudes. A detailed investigation of the time series of the surface electromagnetic impedance tensor has been carried out in different frequency classes of the ULF range. The presentation aims to provide a brief review of our results related to long-term periodic modulations, up to solar cycle scale, and to eventual deviations of the electromagnetic impedance and thus of the reconstructed equivalent ionospheric source effects.
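The basic estimation step can be sketched as a per-frequency least-squares fit of E = Z H over many time windows; robust processing then iterates this fit while down-weighting outlying windows. The sketch below is a minimal, non-robust version with hypothetical array shapes, not the observatory's processing chain.

```python
import numpy as np

def estimate_impedance(E, H):
    """Least-squares estimate of the magnetotelluric impedance tensor at one frequency.

    E, H : (n_windows, 2) complex arrays of horizontal electric and magnetic field
           Fourier coefficients (x, y components), one row per time window.

    Solves E = H @ Z.T in the least-squares sense and returns the 2x2 tensor Z.
    A robust variant would iterate this fit, down-weighting windows with large
    residuals (e.g. Huber weights) to suppress man-made noise and source effects.
    """
    ZT, *_ = np.linalg.lstsq(H, E, rcond=None)
    return ZT.T
```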
Broad-band, radio spectro-polarimetric study of 100 radiative-mode and jet-mode AGN
NASA Astrophysics Data System (ADS)
O'Sullivan, S. P.; Purcell, C. R.; Anderson, C. S.; Farnes, J. S.; Sun, X. H.; Gaensler, B. M.
2017-08-01
We present the results from a broad-band (1 to 3 GHz), spectro-polarimetry study of the integrated emission from 100 extragalactic radio sources with the Australia Telescope Compact Array, selected to be highly linearly polarized at 1.4 GHz. We use a general-purpose, polarization model-fitting procedure that describes the Faraday rotation measure (RM) and intrinsic polarization structure of up to three distinct polarized emission regions or `RM components' of a source. Overall, 37 per cent/52 per cent/11 per cent of sources are best fitted by one/two/three RM components. However, these fractions are dependent on the signal-to-noise ratio (S/N) in polarization (more RM components more likely at higher S/N). In general, our analysis shows that sources with high integrated degrees of polarization at 1.4 GHz have low Faraday depolarization, are typically dominated by a single RM component, have a steep spectral index and have a high intrinsic degree of polarization. After classifying our sample into radiative-mode and jet-mode AGN, we find no significant difference between the Faraday rotation or Faraday depolarization properties of jet-mode and radiative-mode AGN. However, there is a statistically significant difference in the intrinsic degree of polarization between the two types, with the jet-mode sources having more intrinsically ordered magnetic field structures than the radiative-mode sources. We also find a preferred perpendicular orientation of the intrinsic magnetic field structure of jet-mode AGN with respect to the jet direction, while no clear preference is found for the radiative-mode sources.
NASA Astrophysics Data System (ADS)
Liu, L.; Du, L.; Liao, Y.
2017-12-01
Based on the ensemble hindcast dataset of CSM1.1m from NCC, CMA, Bayesian merging models and a two-step statistical model are developed and employed to predict monthly grid/station precipitation in the Huaihe River basin, China, during summer at lead times of 1 to 3 months. The hindcast datasets span the period 1991 to 2014. The skill of the two models is evaluated using the area under the ROC curve (AUC) in a leave-one-out cross-validation framework, and is compared to the skill of CSM1.1m. CSM1.1m has the highest skill for summer precipitation when initialized in April and the lowest when initialized in May, and has the highest skill for precipitation in June but the lowest for precipitation in July. Compared with the raw outputs of the climate model, some schemes of the two approaches have higher skill for predictions initialized in March and May, but almost all schemes have lower skill for predictions initialized in April. Compared to the two-step approach, one sampling scheme of the Bayesian merging approach has higher skill for predictions initialized in March, but lower skill for those initialized in May. The results suggest that there is potential to apply the two statistical models for monthly summer precipitation forecasts initialized in March and in May over the Huaihe River basin, while the raw CSM1.1m forecast remains the better choice when initialized in April. Finally, the summer runoff during 1991 to 2014 is simulated with a hydrological model using the climate hindcasts of CSM1.1m and the two statistical models.
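A leave-one-out cross-validated AUC of the kind used for verification can be sketched by pooling the out-of-sample predicted probabilities across folds and computing a single AUC against the observed binary event (for example, above-median monthly precipitation). Everything in the sketch below, including the predictors and event definition, is illustrative rather than taken from the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut
from sklearn.metrics import roc_auc_score

# Hypothetical data: X = predictors derived from the climate-model hindcast for 24 years,
# y = 1 when observed summer-month precipitation exceeded its median.
rng = np.random.default_rng(1)
X = rng.normal(size=(24, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=24) > 0).astype(int)

probs = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    probs[test_idx] = model.predict_proba(X[test_idx])[:, 1]

# Pool the out-of-sample probabilities, then compute a single AUC
print("LOO cross-validated AUC:", roc_auc_score(y, probs))
```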
Armitage, James M; McLachlan, Michael S; Wiberg, Karin; Jonsson, Per
2009-06-01
The contamination of the Baltic Sea with polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs) has resulted in restrictions on the marketing and consumption of Baltic Sea fish, making this a priority environmental issue in the European Union. To date there is no consensus on the relative importance of different sources of PCDD/Fs to the Baltic Sea, and hence no consensus on how to address this issue. In this work we synthesized the available information to create a PCDD/F budget for the Baltic Sea, focusing on the two largest basins, the Bothnian Sea and the Baltic Proper. The non-steady state multimedia fate and transport model POPCYCLING-Baltic was employed, using recent data for PCDD/F concentrations in air and sediment as boundary conditions. The PCDD/F concentrations in water predicted by the model were in good agreement with recent measurements. The budget demonstrated that atmospheric deposition was the dominant source of PCDD/Fs to the basins as a whole. This conclusion was supported by a statistical comparison of the PCDD/F congener patterns in surface sediments from accumulation bottoms with the patterns in ambient air, bulk atmospheric deposition, and a range of potential industrial sources. Prospective model simulations indicated that the PCDD/F concentrations in the water column will continue to decrease in the coming years due to the slow response of the Baltic Sea system to falling PCDD/F inputs in the last decades, but that the decrease would be more pronounced if ambient air concentrations were to drop further in the future, for instance as a result of reduced emissions. The study illustrates the usefulness of using monitoring data and multimedia models in an integrated fashion to address complex organic contaminant issues.
Statistical field theory of futures commodity prices
NASA Astrophysics Data System (ADS)
Baaquie, Belal E.; Yu, Miao
2018-02-01
The statistical theory of commodity prices has been formulated by Baaquie (2013). Further empirical studies of single (Baaquie et al., 2015) and multiple commodity prices (Baaquie et al., 2016) have provided strong evidence in support of the primary assumptions of the statistical formulation. In this paper, the model for spot prices (Baaquie, 2013) is extended to model futures commodity prices using a statistical field theory of futures commodity prices. The futures prices are modeled as a two-dimensional statistical field and a nonlinear Lagrangian is postulated. Empirical studies provide clear evidence in support of the model, with many nontrivial features of the model finding unexpected support from market data.
Virtual Model Validation of Complex Multiscale Systems: Applications to Nonlinear Elastostatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oden, John Tinsley; Prudencio, Ernest E.; Bauman, Paul T.
We propose a virtual statistical validation process as an aid to the design of experiments for the validation of phenomenological models of the behavior of material bodies, with focus on those cases in which knowledge of the fabrication process used to manufacture the body can provide information on the micro-molecular-scale properties underlying macroscale behavior. One example is given by models of elastomeric solids fabricated using polymerization processes. We describe a framework for model validation that involves Bayesian updates of parameters in statistical calibration and validation phases. The process enables the quantification of uncertainty in quantities of interest (QoIs) and the determination of model consistency using tools of statistical information theory. We assert that microscale information drawn from molecular models of the fabrication of the body provides a valuable source of prior information on parameters as well as a means for estimating model bias and designing virtual validation experiments to provide information gain over calibration posteriors.
Why environmental scientists are becoming Bayesians
James S. Clark
2005-01-01
Advances in computational statistics provide a general framework for the high dimensional models typically needed for ecological inference and prediction. Hierarchical Bayes (HB) represents a modelling structure with capacity to exploit diverse sources of information, to accommodate influences that are unknown (or unknowable), and to draw inference on large numbers of...
An efficient soil water balance model based on hybrid numerical and statistical methods
NASA Astrophysics Data System (ADS)
Mao, Wei; Yang, Jinzhong; Zhu, Yan; Ye, Ming; Liu, Zhao; Wu, Jingwei
2018-04-01
Most soil water balance models only consider downward soil water movement driven by gravitational potential, and thus cannot simulate upward soil water movement driven by evapotranspiration especially in agricultural areas. In addition, the models cannot be used for simulating soil water movement in heterogeneous soils, and usually require many empirical parameters. To resolve these problems, this study derives a new one-dimensional water balance model for simulating both downward and upward soil water movement in heterogeneous unsaturated zones. The new model is based on a hybrid of numerical and statistical methods, and only requires four physical parameters. The model uses three governing equations to consider three terms that impact soil water movement, including the advective term driven by gravitational potential, the source/sink term driven by external forces (e.g., evapotranspiration), and the diffusive term driven by matric potential. The three governing equations are solved separately by using the hybrid numerical and statistical methods (e.g., linear regression method) that consider soil heterogeneity. The four soil hydraulic parameters required by the new models are as follows: saturated hydraulic conductivity, saturated water content, field capacity, and residual water content. The strength and weakness of the new model are evaluated by using two published studies, three hypothetical examples and a real-world application. The evaluation is performed by comparing the simulation results of the new model with corresponding results presented in the published studies, obtained using HYDRUS-1D and observation data. The evaluation indicates that the new model is accurate and efficient for simulating upward soil water flow in heterogeneous soils with complex boundary conditions. The new model is used for evaluating different drainage functions, and the square drainage function and the power drainage function are recommended. Computational efficiency of the new model makes it particularly suitable for large-scale simulation of soil water movement, because the new model can be used with coarse discretization in space and time.
Acharyya, Muktish
2017-07-01
Spin wave interference is studied by Monte Carlo simulation in a two-dimensional Ising ferromagnet driven by two coherent spherical magnetic field waves. The spin waves are found to propagate and interfere according to the classical rules of the interference pattern generated by two point sources. The interference pattern of the spin waves is observed at one boundary of the lattice. The interference pattern is detected and studied via spin-flip statistics at high and low temperatures. Destructive interference is manifested as a large number of spin flips, and vice versa.
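A minimal version of this kind of simulation can be sketched with Metropolis dynamics on a square lattice, where the local field at each site is the superposition of two circular waves emanating from two coherent point sources. All parameter values below are illustrative and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
L, J, T = 64, 1.0, 1.5                     # lattice size, coupling, temperature
h0, wavelength, period = 0.5, 8.0, 32      # field amplitude, wavelength, period (in sweeps)
s1, s2 = (0, L // 4), (0, 3 * L // 4)      # two coherent point sources on one edge
spins = rng.choice([-1, 1], size=(L, L))

yy, xx = np.meshgrid(np.arange(L), np.arange(L), indexing="ij")
r1 = np.hypot(xx - s1[1], yy - s1[0])
r2 = np.hypot(xx - s2[1], yy - s2[0])

def local_field(t):
    """Superposition of two circular waves -> interference pattern in the driving field."""
    k, omega = 2 * np.pi / wavelength, 2 * np.pi / period
    return h0 * (np.cos(omega * t - k * r1) + np.cos(omega * t - k * r2))

for t in range(200):                        # Metropolis sweeps
    h = local_field(t)
    for _ in range(L * L):                  # one sweep = L*L attempted single-spin flips
        i, j = rng.integers(L), rng.integers(L)
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2 * spins[i, j] * (J * nb + h[i, j])
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i, j] *= -1
# Spin-flip statistics along the far boundary would then reveal the interference fringes.
```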
Brakebill, J.W.; Preston, S.D.
2003-01-01
The U.S. Geological Survey has developed a methodology for statistically relating nutrient sources and land-surface characteristics to nutrient loads of streams. The methodology is referred to as SPAtially Referenced Regressions On Watershed attributes (SPARROW), and relates measured stream nutrient loads to nutrient sources using nonlinear statistical regression models. A spatially detailed digital hydrologic network of stream reaches, stream-reach characteristics such as mean streamflow, water velocity, reach length, and travel time, and their associated watersheds supports the regression models. This network serves as the primary framework for spatially referencing potential nutrient source information such as atmospheric deposition, septic systems, point sources, land use, land cover, and agricultural sources and land-surface characteristics such as land use, land cover, average-annual precipitation and temperature, slope, and soil permeability. In the Chesapeake Bay watershed that covers parts of Delaware, Maryland, Pennsylvania, New York, Virginia, West Virginia, and Washington D.C., SPARROW was used to generate models estimating loads of total nitrogen and total phosphorus representing 1987 and 1992 land-surface conditions. The 1987 models used a hydrologic network derived from an enhanced version of the U.S. Environmental Protection Agency's digital River Reach File, and coarse-resolution Digital Elevation Models (DEMs). A new hydrologic network was created to support the 1992 models by generating stream reaches representing surface-water pathways defined by flow direction and flow accumulation algorithms from higher resolution DEMs. On a reach-by-reach basis, stream reach characteristics essential to the modeling were transferred to the newly generated pathways or reaches from the enhanced River Reach File used to support the 1987 models. To complete the new network, watersheds for each reach were generated using the direction of surface-water flow derived from the DEMs. This network improves upon existing digital stream data by increasing the level of spatial detail and providing consistency between the reach locations and topography. The hydrologic network also aids in illustrating the spatial patterns of predicted nutrient loads and sources contributed locally to each stream, and the percentages of nutrient load that reach Chesapeake Bay.
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Saumyadip; Abraham, John
2012-07-01
The unsteady flamelet progress variable (UFPV) model has been proposed by Pitsch and Ihme ["An unsteady/flamelet progress variable method for LES of nonpremixed turbulent combustion," AIAA Paper No. 2005-557, 2005] for modeling the averaged/filtered chemistry source terms in Reynolds averaged simulations and large eddy simulations of reacting non-premixed combustion. In the UFPV model, a look-up table of source terms is generated as a function of mixture fraction Z, scalar dissipation rate χ, and progress variable C by solving the unsteady flamelet equations. The assumption is that the unsteady flamelet represents the evolution of the reacting mixing layer in the non-premixed flame. We assess the accuracy of the model in predicting autoignition and flame development in compositionally stratified n-heptane/air mixtures using direct numerical simulations (DNS). The focus in this work is primarily on the assessment of accuracy of the probability density functions (PDFs) employed for obtaining averaged source terms. The performance of commonly employed presumed functions, such as the dirac-delta distribution function, the β distribution function, and statistically most likely distribution (SMLD) approach in approximating the shapes of the PDFs of the reactive and the conserved scalars is evaluated. For unimodal distributions, it is observed that functions that need two-moment information, e.g., the β distribution function and the SMLD approach with two-moment closure, are able to reasonably approximate the actual PDF. As the distribution becomes multimodal, higher moment information is required. Differences are observed between the ignition trends obtained from DNS and those predicted by the look-up table, especially for smaller gradients where the flamelet assumption becomes less applicable. The formulation assumes that the shape of the χ(Z) profile can be modeled by an error function which remains unchanged in the presence of heat release. We show that this assumption is not accurate.
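The presumed-PDF averaging step that underlies the look-up table can be sketched as follows: the mean source term at a given mean and variance of mixture fraction is the integral of the flamelet source term against a beta PDF whose parameters are fixed by those two moments. The source term used below is a placeholder function, not an n-heptane chemistry table.

```python
import numpy as np
from scipy.stats import beta

def beta_pdf_average(source_of_Z, z_mean, z_var, n=2001):
    """Average a flamelet source term over a presumed beta PDF of mixture fraction.

    source_of_Z : callable S(Z) giving the source term from the flamelet table
    z_mean      : mean mixture fraction, 0 < z_mean < 1
    z_var       : mixture-fraction variance, 0 < z_var < z_mean * (1 - z_mean)

    The beta parameters follow from matching the first two moments:
    a = z_mean * g, b = (1 - z_mean) * g, with g = z_mean * (1 - z_mean) / z_var - 1.
    """
    g = z_mean * (1.0 - z_mean) / z_var - 1.0
    a, b = z_mean * g, (1.0 - z_mean) * g
    z = np.linspace(1e-6, 1.0 - 1e-6, n)       # uniform grid on the mixture-fraction axis
    pdf = beta.pdf(z, a, b)
    return np.sum(source_of_Z(z) * pdf) / np.sum(pdf)

# Example with a placeholder source term peaking near a stoichiometric mixture fraction
print(beta_pdf_average(lambda z: np.exp(-((z - 0.06) / 0.02) ** 2), z_mean=0.1, z_var=0.005))
```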
A Two-Step Approach to Uncertainty Quantification of Core Simulators
Yankov, Artem; Collins, Benjamin; Klein, Markus; ...
2012-01-01
For the multiple sources of error introduced into the standard computational regime for simulating reactor cores, rigorous uncertainty analysis methods are available primarily to quantify the effects of cross section uncertainties. Two methods for propagating cross section uncertainties through core simulators are the XSUSA statistical approach and the “two-step” method. The XSUSA approach, which is based on the SUSA code package, is fundamentally a stochastic sampling method. Alternatively, the two-step method utilizes generalized perturbation theory in the first step and stochastic sampling in the second step. The consistency of these two methods in quantifying uncertainties in the multiplication factor and in the core power distribution was examined in the framework of phase I-3 of the OECD Uncertainty Analysis in Modeling benchmark. With the Three Mile Island Unit 1 core as a base model for analysis, the XSUSA and two-step methods were applied with certain limitations, and the results were compared to those produced by other stochastic sampling-based codes. Based on the uncertainty analysis results, conclusions were drawn as to the method that is currently more viable for computing uncertainties in burnup and transient calculations.
2013-01-01
Background Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. According to our knowledge, most of the available studies, that addressed the issue of malnutrition among under-five children, considered the categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative of other statistical methods and (ii) to find some predictors of this outcome variable. Methods The data is extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children is analysed using various statistical techniques namely Chi-square test and GPR model. Results The GPR model (as compared to the standard Poisson regression and negative Binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance < mean) property. Our study also identify several significant predictors of the outcome variable namely mother’s education, father’s education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Conclusions Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative of other statistical models to analyse the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh. PMID:23297699
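A bare-bones version of the generalized Poisson regression described here can be sketched by maximizing the GP-1 log-likelihood directly, with a dispersion parameter that can go negative to capture under-dispersion (variance < mean); at zero dispersion the model reduces to standard Poisson regression. Function and argument names are hypothetical and this is not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def fit_generalized_poisson(y, X):
    """Generalized Poisson (GP-1) regression by maximum likelihood.

    y : counts (e.g. number of malnourished under-five children per family)
    X : design matrix of predictors (intercept included by caller)

    Model: log(mu_i) = X_i @ beta, theta_i = mu_i * (1 - delta),
    P(y) = theta * (theta + delta*y)**(y-1) * exp(-theta - delta*y) / y!.
    delta < 0 captures under-dispersion, delta = 0 recovers the Poisson model.
    Returns (beta, delta).
    """
    y = np.asarray(y, float)

    def negloglik(params):
        beta, delta = params[:-1], params[-1]
        mu = np.exp(X @ beta)
        theta = mu * (1.0 - delta)
        lam = theta + delta * y
        if np.any(theta <= 0) or np.any(lam <= 0):
            return 1e10                       # outside the admissible parameter region
        ll = np.log(theta) + (y - 1) * np.log(lam) - theta - delta * y - gammaln(y + 1)
        return -ll.sum()

    start = np.r_[np.zeros(X.shape[1]), 0.0]  # Poisson-like starting point
    res = minimize(negloglik, start, method="Nelder-Mead")
    return res.x[:-1], res.x[-1]
```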
Innovative Approach for Developing Spacecraft Interior Acoustic Requirement Allocation
NASA Technical Reports Server (NTRS)
Chu, S. Reynold; Dandaroy, Indranil; Allen, Christopher S.
2016-01-01
The Orion Multi-Purpose Crew Vehicle (MPCV) is an American spacecraft for carrying four astronauts during deep space missions. This paper describes an innovative application of Power Injection Method (PIM) for allocating Orion cabin continuous noise Sound Pressure Level (SPL) limits to the sound power level (PWL) limits of major noise sources in the Environmental Control and Life Support System (ECLSS) during all mission phases. PIM is simulated using both Statistical Energy Analysis (SEA) and Hybrid Statistical Energy Analysis-Finite Element (SEA-FE) models of the Orion MPCV to obtain the transfer matrix from the PWL of the noise sources to the acoustic energies of the receivers, i.e., the cavities associated with the cabin habitable volume. The goal of the allocation strategy is to control the total energy of cabin habitable volume for maintaining the required SPL limits. Simulations are used to demonstrate that applying the allocated PWLs to the noise sources in the models indeed reproduces the SPL limits in the habitable volume. The effects of Noise Control Treatment (NCT) on allocated noise source PWLs are investigated. The measurement of source PWLs of involved fan and pump development units are also discussed as it is related to some case-specific details of the allocation strategy discussed here.
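One simple way to read the allocation step is as an inverse use of the transfer matrix: with receiver acoustic energy approximated as a linear function of the source sound powers, an equal power allowance per source can be chosen so that no habitable-volume cavity exceeds its energy limit. The matrix and limits below are purely illustrative placeholders, not values from the Orion models.

```python
import numpy as np

# Hypothetical transfer matrix (as obtained from SEA / hybrid SEA-FE models): receiver
# acoustic energy = T @ W, where W holds the sound power outputs (watts) of three
# ECLSS-type sources and each row corresponds to one habitable-volume cavity.
T = np.array([[2.0e-3, 5.0e-4, 1.0e-4],
              [8.0e-4, 1.5e-3, 6.0e-4]])
E_limit = np.array([1.0e-5, 1.0e-5])          # allowable acoustic energy per cavity (J)

# Simple equal-allocation strategy: give every source the same power allowance,
# chosen so that no cavity exceeds its energy limit.
w_equal = np.min(E_limit / T.sum(axis=1))     # watts per source
pwl_limit_db = 10 * np.log10(w_equal / 1e-12) # sound power level limit re 1 pW

print(f"Allocated PWL limit per source: {pwl_limit_db:.1f} dB")
# Check: predicted cavity energies with the allocated powers stay within the limits
print(T @ np.full(3, w_equal) <= E_limit)
```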
Impact of numerical choices on water conservation in the E3SM Atmosphere Model Version 1 (EAM V1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Kai; Rasch, Philip J.; Taylor, Mark A.
The conservation of total water is an important numerical feature for global Earth system models. Even small conservation problems in the water budget can lead to systematic errors in century-long simulations for sea level rise projection. This study quantifies and reduces various sources of water conservation error in the atmosphere component of the Energy Exascale Earth System Model. Several sources of water conservation error have been identified during the development of the version 1 (V1) model. The largest errors result from the numerical coupling between the resolved dynamics and the parameterized sub-grid physics. A hybrid coupling using different methods for fluid dynamics and tracer transport provides a reduction of water conservation error by a factor of 50 at 1° horizontal resolution as well as consistent improvements at other resolutions. The second largest error source is the use of an overly simplified relationship between the surface moisture flux and latent heat flux at the interface between the host model and the turbulence parameterization. This error can be prevented by applying the same (correct) relationship throughout the entire model. Two additional types of conservation error that result from correcting the surface moisture flux and clipping negative water concentrations can be avoided by using mass-conserving fixers. With all four error sources addressed, the water conservation error in the V1 model is negligible and insensitive to the horizontal resolution. The associated changes in the long-term statistics of the main atmospheric features are small. A sensitivity analysis is carried out to show that the magnitudes of the conservation errors decrease strongly with temporal resolution but increase with horizontal resolution. The increased vertical resolution in the new model results in a very thin model layer at the Earth’s surface, which amplifies the conservation error associated with the surface moisture flux correction. We note that for some of the identified error sources, the proposed fixers are remedies rather than solutions to the problems at their roots. Future improvements in time integration would be beneficial for this model.
Statistical Signal Models and Algorithms for Image Analysis
1984-10-25
In this report, two-dimensional stochastic linear models are used in developing algorithms for image analysis such as classification, segmentation, and object detection in images characterized by textured backgrounds. These models generate two-dimensional random processes as outputs to which statistical inference procedures can naturally be applied. A common thread throughout our algorithms is the interpretation of the inference procedures in terms of linear prediction.
NASA Astrophysics Data System (ADS)
Mergili, Martin; Fischer, Jan-Thomas; Krenn, Julia; Pudasaini, Shiva P.
2017-02-01
r.avaflow represents an innovative open-source computational tool for routing rapid mass flows, avalanches, or process chains from a defined release area down an arbitrary topography to a deposition area. In contrast to most existing computational tools, r.avaflow (i) employs a two-phase, interacting solid and fluid mixture model (Pudasaini, 2012); (ii) is suitable for modelling more or less complex process chains and interactions; (iii) explicitly considers both entrainment and stopping with deposition, i.e. the change of the basal topography; (iv) allows for the definition of multiple release masses, and/or hydrographs; and (v) provides built-in functionalities for validation, parameter optimization, and sensitivity analysis. r.avaflow is freely available as a raster module of the GRASS GIS software, employing the programming languages Python and C along with the statistical software R. We exemplify the functionalities of r.avaflow by means of two sets of computational experiments: (1) generic process chains consisting of bulk mass and hydrograph release into a reservoir with entrainment of the dam and impact downstream; (2) the prehistoric Acheron rock avalanche, New Zealand. The simulation results are generally plausible for (1) and, after the optimization of two key parameters, reasonably in line with the corresponding observations for (2). However, we identify some potential to enhance the analytic and numerical concepts. Further, thorough parameter studies will be necessary in order to make r.avaflow fit for reliable forward simulations of possible future mass flow events.
Comparative study of x ray and microwave emissions during solar flares
NASA Technical Reports Server (NTRS)
Winglee, Robert M.
1993-01-01
The work supported by the grant consisted of two projects. The first project involved making detailed case studies of two flares using SMM data in conjunction with ground based observations. The first flare occurred at 1454 UT on June 20, 1989 and involved the eruption of a prominence near the limb. In the study we used data from many wavelength regimes including the radio, H-alpha, hard X-rays, and soft X-rays. We used a full gyrosynchrotron code to model the apparent presence of a 1.4 GHz source early in the flare that was in the form of a large coronal loop. The model results lead us to conclude that the initial acceleration occurs in small, dense loops which also produced the flare's hard X-ray emission. We also found evidence that a source at 1.4 GHz later in the event was due to second harmonic plasma emission. This source was adjacent to a leg of the prominence and comes from a dense column of material in the magnetic structure supporting the prominence. Finally, we investigated a source of microwaves and soft X-rays, occurring approximately 10 min after the hard X-ray peak, and calculate a lower limit for the density of the source. The second flare that was studied occurred at 2156 UT on June 20, 1989 and was observed with the VLA and the Owens Valley Radio Observatory (OVRO) Frequency Agile Array. We have developed a gyrosynchrotron model of the sources at flare peak using a new gyrosynchrotron approximation which is valid at very low harmonics of the gyrofrequency. We found that the accelerated particle densities of the sources decreased much more with radius from the source center than had been supposed in previous work, while the magnetic field varied less. We also used the available data to analyze a highly polarized source which appeared late in the flare. The second project involved compiling a statistical base for the relative timing of the hard X-ray peak, the turbulent and blue-shift velocities inferred from soft X-ray line emissions observed by SMM and the microwave peak as determined from ground-based observations. This timing was then used to aid the testing of newly developed global models for flares that incorporate the global magnetic topology as well as the electron dynamics that are responsible for the hard X-rays and microwaves.
PDF approach for turbulent scalar field: Some recent developments
NASA Technical Reports Server (NTRS)
Gao, Feng
1993-01-01
The probability density function (PDF) method has proven to be a very useful approach in turbulence research. It has been particularly effective in simulating turbulent reacting flows and in studying some detailed statistical properties generated by a turbulent field. There are, however, some important questions that have yet to be answered in PDF studies. Our efforts in the past year have been focused on two areas. First, a simple mixing model suitable for Monte Carlo simulations has been developed based on the mapping closure. Secondly, the mechanism of turbulent transport has been analyzed in order to understand the recently observed abnormal PDFs of turbulent temperature fields generated by linear heat sources.
Statistical Methods Applied to Gamma-ray Spectroscopy Algorithms in Nuclear Security Missions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fagan, Deborah K.; Robinson, Sean M.; Runkle, Robert C.
2012-10-01
In a wide range of nuclear security missions, gamma-ray spectroscopy is a critical research and development priority. One particularly relevant challenge is the interdiction of special nuclear material, for which gamma-ray spectroscopy supports the goals of detecting and identifying gamma-ray sources. This manuscript examines the existing set of spectroscopy methods, attempts to categorize them by the statistical methods on which they rely, and identifies methods that have yet to be considered. Our examination shows that current methods effectively estimate the effect of counting uncertainty but in many cases do not address larger sources of decision uncertainty, ones that are significantly more complex. We thus explore the premise that significantly improving algorithm performance requires greater coupling between the problem physics that drives data acquisition and statistical methods that analyze such data. Untapped statistical methods, such as Bayesian model averaging and hierarchical and empirical Bayes methods, have the potential to reduce decision uncertainty by more rigorously and comprehensively incorporating all sources of uncertainty. We expect that application of such methods will demonstrate progress in meeting the needs of nuclear security missions by improving on the existing numerical infrastructure for which these analyses have not been conducted.
Astrostatistical Analysis in Solar and Stellar Physics
NASA Astrophysics Data System (ADS)
Stenning, David Craig
This dissertation focuses on developing statistical models and methods to address data-analytic challenges in astrostatistics, a growing interdisciplinary field fostering collaborations between statisticians and astrophysicists. The astrostatistics projects we tackle can be divided into two main categories: modeling solar activity and Bayesian analysis of stellar evolution. These categories form Parts I and II of this dissertation, respectively. The first line of research we pursue involves classification and modeling of evolving solar features. Advances in space-based observatories are increasing both the quality and quantity of solar data, primarily in the form of high-resolution images. To analyze massive streams of solar image data, we develop a science-driven dimension reduction methodology to extract scientifically meaningful features from images. This methodology utilizes mathematical morphology to produce a concise numerical summary of the magnetic flux distribution in solar "active regions" that (i) is far easier to work with than the source images, (ii) encapsulates scientifically relevant information in a more informative manner than existing schemes (i.e., manual classification schemes), and (iii) is amenable to sophisticated statistical analyses. In a related line of research, we perform a Bayesian analysis of the solar cycle using multiple proxy variables, such as sunspot numbers. We take advantage of patterns and correlations among the proxy variables to model solar activity using data from proxies that have become available more recently, while also taking advantage of the long history of observations of sunspot numbers. This model is an extension of the Yu et al. (2012) Bayesian hierarchical model for the solar cycle that used the sunspot numbers alone. Since proxies have different temporal coverage, we devise a multiple imputation scheme to account for missing data. We find that incorporating multiple proxies reveals important features of the solar cycle that are missed when the model is fit using only the sunspot numbers. In Part II of this dissertation we focus on two related lines of research involving Bayesian analysis of stellar evolution. We first focus on modeling multiple stellar populations in star clusters. It has long been assumed that all star clusters are comprised of single stellar populations, stars that formed at roughly the same time from a common molecular cloud. However, recent studies have produced evidence that some clusters host multiple populations, which has far-reaching scientific implications. We develop a Bayesian hierarchical model for multiple-population star clusters, extending earlier statistical models of stellar evolution (e.g., van Dyk et al. 2009, Stein et al. 2013). We also devise an adaptive Markov chain Monte Carlo algorithm to explore the complex posterior distribution. We use numerical studies to demonstrate that our method can recover parameters of multiple-population clusters, and also show how model misspecification can be diagnosed. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We also explore statistical properties of the estimators and determine that the influence of the prior distribution does not diminish with larger sample sizes, leading to non-standard asymptotics. In a final line of research, we present the first-ever attempt to estimate the carbon fraction of white dwarfs.
This quantity has important implications for both astrophysics and fundamental nuclear physics, but is currently unknown. We use a numerical study to demonstrate that assuming an incorrect value for the carbon fraction leads to incorrect white-dwarf ages of star clusters. Finally, we present our attempt to estimate the carbon fraction of the white dwarfs in the well-studied star cluster 47 Tucanae.
Dependence of Microlensing on Source Size and Lens Mass
NASA Astrophysics Data System (ADS)
Congdon, A. B.; Keeton, C. R.
2007-11-01
In gravitationally lensed quasars, the magnification of an image depends on the configuration of stars in the lensing galaxy. We study the statistics of the magnification distribution for random star fields. The width of the distribution characterizes the amount by which the observed magnification is likely to differ from models in which the mass is smoothly distributed. We use numerical simulations to explore how the width of the magnification distribution depends on the mass function of stars, and on the size of the source quasar. We then propose a semi-analytic model to describe the distribution width for different source sizes and stellar mass functions.
Integrated spatial multiplexing of heralded single-photon sources
Collins, M.J.; Xiong, C.; Rey, I.H.; Vo, T.D.; He, J.; Shahnia, S.; Reardon, C.; Krauss, T.F.; Steel, M.J.; Clark, A.S.; Eggleton, B.J.
2013-01-01
The non-deterministic nature of photon sources is a key limitation for single-photon quantum processors. Spatial multiplexing overcomes this by enhancing the heralded single-photon yield without enhancing the output noise. Here the intrinsic statistical limit of an individual source is surpassed by spatially multiplexing two monolithic silicon-based correlated photon pair sources in the telecommunications band, demonstrating a 62.4% increase in the heralded single-photon output without an increase in unwanted multipair generation. We further demonstrate the scalability of this scheme by multiplexing photons generated in two waveguides pumped via an integrated coupler with a 63.1% increase in the heralded photon rate. This demonstration paves the way for a scalable architecture for multiplexing many photon sources in a compact integrated platform and achieving efficient two-photon interference, required at the core of optical quantum computing and quantum communication protocols.
A consistent framework for Horton regression statistics that leads to a modified Hack's law
Furey, P.R.; Troutman, B.M.
2008-01-01
A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for the Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.
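The modified Hack's law described above lends itself to a short illustration. The sketch below (Python; the basin values, variable names, and the simple log-linear form are assumptions for illustration, not the authors' exact generalized regression model) fits ln L against ln A alone and then against ln A plus Strahler order, so the coefficient on the order term indicates whether order adds explanatory power beyond drainage area.

```python
import numpy as np

# Hypothetical basin data: drainage area A (km^2), mainstream length L (km),
# and Strahler order for each basin. Real values would come from a basin dataset.
A = np.array([12.0, 55.0, 140.0, 310.0, 760.0, 1900.0])
L = np.array([6.1, 14.0, 24.0, 38.0, 61.0, 105.0])
order = np.array([2, 3, 3, 4, 4, 5], dtype=float)

# Classical Hack's law: ln L = ln c + h * ln A.
X_hack = np.column_stack([np.ones_like(A), np.log(A)])
coef_hack, *_ = np.linalg.lstsq(X_hack, np.log(L), rcond=None)

# Modified form sketched from the abstract: ln L = b0 + b1 * ln A + b2 * order.
X_mod = np.column_stack([np.ones_like(A), np.log(A), order])
coef_mod, *_ = np.linalg.lstsq(X_mod, np.log(L), rcond=None)

print("Hack exponent h:", coef_hack[1])
print("Modified model coefficients (intercept, ln A, order):", coef_mod)
```

A formal test of whether the order coefficient is statistically significant would follow the regression framework of the paper rather than this bare least-squares fit.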
NASA Astrophysics Data System (ADS)
Alden, Caroline B.; Ghosh, Subhomoy; Coburn, Sean; Sweeney, Colm; Karion, Anna; Wright, Robert; Coddington, Ian; Rieker, Gregory B.; Prasad, Kuldeep
2018-03-01
Advances in natural gas extraction technology have led to increased activity in the production and transport sectors in the United States and, as a consequence, an increased need for reliable monitoring of methane leaks to the atmosphere. We present a statistical methodology in combination with an observing system for the detection and attribution of fugitive emissions of methane from distributed potential source location landscapes such as natural gas production sites. We measure long (> 500 m), integrated open-path concentrations of atmospheric methane using a dual frequency comb spectrometer and combine measurements with an atmospheric transport model to infer leak locations and strengths using a novel statistical method, the non-zero minimum bootstrap (NZMB). The new statistical method allows us to determine whether the empirical distribution of possible source strengths for a given location excludes zero. Using this information, we identify leaking source locations (i.e., natural gas wells) through rejection of the null hypothesis that the source is not leaking. The method is tested with a series of synthetic data inversions with varying measurement density and varying levels of model-data mismatch. It is also tested with field observations of (1) a non-leaking source location and (2) a source location where a controlled emission of 3.1 × 10^-5 kg s^-1 of methane gas is released over a period of several hours. This series of synthetic data tests and outdoor field observations using a controlled methane release demonstrates the viability of the approach for the detection and sizing of very small leaks of methane across large distances (4+ km^2 in synthetic tests). The field tests demonstrate the ability to attribute small atmospheric enhancements of 17 ppb to the emitting source location against a background of combined atmospheric (e.g., background methane variability) and measurement uncertainty of 5 ppb (1σ), when measurements are averaged over 2 min. The results of the synthetic and field data testing show that the new observing system and statistical approach greatly decreases the incidence of false alarms (that is, wrongly identifying a well site to be leaking) compared with the same tests that do not use the NZMB approach and therefore offers increased leak detection and sizing capabilities.
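The abstract does not give the full recipe for the non-zero minimum bootstrap, but its core decision rule, checking whether the bootstrap distribution of an estimated source strength excludes zero, can be sketched as follows (Python; the resampling scheme, the toy estimator, and the min-greater-than-zero threshold are assumptions for illustration, not the published algorithm).

```python
import numpy as np

rng = np.random.default_rng(0)

def nzmb_flags_leak(estimate_strength, concentrations, n_boot=1000):
    """Hedged sketch of a non-zero-minimum-style bootstrap test.

    estimate_strength: callable mapping a vector of concentration
        measurements to a non-negative source-strength estimate
        (e.g., an atmospheric-transport inversion for one well site).
    concentrations: observed open-path concentration enhancements (ppb).
    Returns True if the bootstrap distribution of the estimate excludes zero.
    """
    n = len(concentrations)
    strengths = np.empty(n_boot)
    for b in range(n_boot):
        resampled = rng.choice(concentrations, size=n, replace=True)
        strengths[b] = estimate_strength(resampled)
    # Reject "not leaking" only if even the smallest bootstrap estimate is > 0.
    return strengths.min() > 0.0

# Toy estimator: scaled mean enhancement, floored at zero (kg/s, invented scale).
toy_estimator = lambda c: max(np.mean(c), 0.0) * 1e-6
obs = rng.normal(loc=17.0, scale=5.0, size=120)   # ppb enhancements
print("leak detected:", nzmb_flags_leak(toy_estimator, obs))
```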
Statistical models for the analysis and design of digital polymerase chain reaction (dPCR) experiments
Dorazio, Robert; Hunter, Margaret
2015-01-01
Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
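The equivalence stated above between dPCR partition counts and a binomial model with a complementary log-log link and a volume offset can be made concrete with a small calculation. The sketch below (Python; the partition volume and counts are made-up numbers) recovers the concentration from the positive-partition fraction and shows why the log of the partition volume enters as an offset.

```python
import numpy as np

# Hypothetical single-sample dPCR result.
n_partitions = 20000        # total partitions
n_positive = 6500           # partitions showing amplification
v = 0.85e-3                 # partition volume in microliters (assumed)

p_hat = n_positive / n_partitions

# Poisson occupancy: P(positive) = 1 - exp(-lambda * v), so
# lambda = -ln(1 - p) / v  (copies per microliter).
lam = -np.log(1.0 - p_hat) / v
print(f"estimated concentration: {lam:.1f} copies/uL")

# The complementary log-log link makes this linear in the parameters:
#   cloglog(p) = log(-log(1 - p)) = log(lambda) + log(v),
# so log(v) is an offset and covariates on log(lambda) give a GLM.
print("cloglog(p) =", np.log(-np.log(1.0 - p_hat)),
      "= log(lam) + log(v) =", np.log(lam) + np.log(v))
```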
NASA Technical Reports Server (NTRS)
Ackermann, M.; Ajello, M.; Allafort, A.; Antolini, E.; Baldini, L.; Ballet, J.; Barbiellini, G.; Bastieri, D.; Bellazzini, R.; Berenji, B.;
2012-01-01
The Fermi Large Area Telescope (LAT) First Source Catalog (1FGL) provided spatial, spectral, and temporal properties for a large number of gamma-ray sources using a uniform analysis method. After correlating with the most-complete catalogs of source types known to emit gamma rays, 630 of these sources are "unassociated" (i.e., have no obvious counterparts at other wavelengths). Here, we employ two statistical analyses of the primary gamma-ray characteristics for these unassociated sources in an effort to correlate their gamma-ray properties with the active galactic nucleus (AGN) and pulsar populations in 1FGL. Based on the correlation results, we classify 221 AGN-like and 134 pulsar-like sources in the 1FGL unassociated sources. The results of these source "classifications" appear to match the expected source distributions, especially at high Galactic latitudes. While useful for planning future multiwavelength follow-up observations, these analyses use limited inputs, and their predictions should not be considered equivalent to "probable source classes" for these sources. We discuss multiwavelength results and catalog cross-correlations to date, and provide new source associations for 229 Fermi-LAT sources that had no association listed in the 1FGL catalog. By validating the source classifications against these new associations, we find that the new association matches the predicted source class in approximately 80% of the sources.
Frailty Models for Familial Risk with Application to Breast Cancer.
Gorfine, Malka; Hsu, Li; Parmigiani, Giovanni
2013-12-01
In evaluating familial risk for disease we have two main statistical tasks: assessing the probability of carrying an inherited genetic mutation conferring higher risk; and predicting the absolute risk of developing diseases over time, for those individuals whose mutation status is known. Despite substantial progress, much remains unknown about the role of genetic and environmental risk factors, about the sources of variation in risk among families that carry high-risk mutations, and about the sources of familial aggregation beyond major Mendelian effects. These sources of heterogeneity contribute substantial variation in risk across families. In this paper we present simple and efficient methods for accounting for this variation in familial risk assessment. Our methods are based on frailty models. We implemented them in the context of generalizing Mendelian models of cancer risk, and compared our approaches to others that do not consider heterogeneity across families. Our extensive simulation study demonstrates that when predicting the risk of developing a disease over time conditional on carrier status, accounting for heterogeneity results in a substantial improvement in the area under the curve of the receiver operating characteristic. On the other hand, the improvement for carriership probability estimation is more limited. We illustrate the utility of the proposed approach through the analysis of BRCA1 and BRCA2 mutation carriers in the Washington Ashkenazi Kin-Cohort Study of Breast Cancer.
Statistical physics of vaccination
NASA Astrophysics Data System (ADS)
Wang, Zhen; Bauch, Chris T.; Bhattacharyya, Samit; d'Onofrio, Alberto; Manfredi, Piero; Perc, Matjaž; Perra, Nicola; Salathé, Marcel; Zhao, Dawei
2016-12-01
Historically, infectious diseases caused considerable damage to human societies, and they continue to do so today. To help reduce their impact, mathematical models of disease transmission have been studied to help understand disease dynamics and inform prevention strategies. Vaccination, one of the most important preventive measures of modern times, is of great interest both theoretically and empirically. And in contrast to traditional approaches, recent research increasingly explores the pivotal implications of individual behavior and heterogeneous contact patterns in populations. Our report reviews the developmental arc of theoretical epidemiology with emphasis on vaccination, as it led from classical models assuming homogeneously mixing (mean-field) populations and ignoring human behavior, to recent models that account for behavioral feedback and/or population spatial/social structure. Many of the methods used originated in statistical physics, such as lattice and network models, and their associated analytical frameworks. Similarly, the feedback loop between vaccinating behavior and disease propagation forms a coupled nonlinear system with analogs in physics. We also review the new paradigm of digital epidemiology, wherein sources of digital data such as online social media are mined for high-resolution information on epidemiologically relevant individual behavior. Armed with the tools and concepts of statistical physics, and further assisted by new sources of digital data, models that capture nonlinear interactions between behavior and disease dynamics offer a novel way of modeling real-world phenomena, and can help improve health outcomes. We conclude the review by discussing open problems in the field and promising directions for future research.
Leidner, Andrew J.
2014-01-01
This paper provides a demonstration of propensity-score matching estimation methods to evaluate the effectiveness of health-risk communication efforts. This study develops a two-stage regression model to investigate household and respondent characteristics as they contribute to aversion behavior to reduce exposure to arsenic-contaminated groundwater. The aversion activity under study is a household-level point-of-use filtration device. Since the acquisition of arsenic contamination information and the engagement in an aversion activity may be codetermined, a two-stage propensity-score model is developed. In the first stage, the propensity for households to acquire arsenic contamination information is estimated. Then, the propensity scores are used to weight observations in a probit regression on the decision to avert the arsenic-related health risk. Of four potential sources of information, utility, media, friend, or others, information received from a friend appears to be the source of information most associated with aversion behavior. Other statistically significant covariates in the household's decision to avert contamination include reported household income, the presence of children in household, and region-level indicator variables. These findings are primarily illustrative and demonstrate the usefulness of propensity-score methods to estimate health-risk communication effectiveness. They may also be suggestive of areas for future research.
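A minimal sketch of the two-stage idea described above is given below (Python with statsmodels; the variable names, the logistic first stage, and the inverse-propensity weighting scheme are assumptions, since the abstract does not spell out the exact weights used).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500

# Hypothetical household data (all values invented).
df = pd.DataFrame({
    "income": rng.normal(50, 15, n),          # reported household income (k$)
    "children": rng.integers(0, 2, n),        # children present in household
    "info_friend": rng.integers(0, 2, n),     # heard about arsenic from a friend
})
df["averts"] = rng.integers(0, 2, n)          # uses a point-of-use filter

# Stage 1: propensity to have acquired contamination information.
X1 = sm.add_constant(df[["income", "children"]])
ps_model = sm.Logit(df["info_friend"], X1).fit(disp=False)
ps = np.clip(ps_model.predict(X1), 0.01, 0.99)

# Stage 2: probit for the aversion decision, weighted by inverse propensity.
# (Older statsmodels versions spell the link class as links.probit.)
w = df["info_friend"] / ps + (1 - df["info_friend"]) / (1 - ps)
X2 = sm.add_constant(df[["info_friend", "income", "children"]])
probit = sm.GLM(df["averts"], X2,
                family=sm.families.Binomial(link=sm.families.links.Probit()),
                var_weights=w).fit()
print(probit.summary())
```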
Model Error Estimation for the CPTEC Eta Model
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; daSilva, Arlindo
1999-01-01
Statistical data assimilation systems require the specification of forecast and observation error statistics. Forecast error is due to model imperfections and differences between the initial condition and the actual state of the atmosphere. Practical four-dimensional variational (4D-Var) methods try to fit the forecast state to the observations and assume that the model error is negligible. Here, with a number of simplifying assumptions, a framework is developed for isolating the model error given the forecast error at two lead-times. Two definitions are proposed for the Talagrand ratio tau, the fraction of the forecast error due to model error rather than initial condition error. Data from the CPTEC Eta Model running operationally over South America are used to calculate forecast error statistics and lower bounds for tau.
Varmaghani, Mehdi; Rashidian, Arash; Kebriaeezadeh, Abbas; Moradi-Lakeh, Maziar; Moin, Mostafa; Ghasemian, Anoosheh; Rezaei-Darzi, Ehsan; Sepanlou, Sadaf Ghajarieh; Peykari, Niloofar; Rezaei, Nazila; Parsaeian, Mahboubeh; Farzadfar, Farshad
2014-12-01
Asthma is a chronic inflammatory airway disease caused or worsened by environmental factors in genetically vulnerable people. The study of the national and sub-national burden of asthma aims to provide a quantitative method and valid estimates for the prevalence, incidence, and economic burden of asthma in Iran from 1990 to 2013, and this paper explains the measures, data sources, methods, and challenges that we will use in the study. In order to conduct this study, we will use all available unpublished data sources, including claim databases and data collected by the food and drug organization (FDO). Moreover, we will devise and run a systematic review of all studies and literature published about asthma epidemiology in Iran, which includes all cross-sectional, cohort and case-control studies with asthma epidemiology focus that are population based. In this study, we will use two statistical models, including spatio-temporal and multilevel autoregressive models, to estimate means and uncertainty intervals for the parameters under study by gender, age, year, and province. All programs will be written in the R statistical package (version 3.0.1). This study helps to obtain information concerning the variation among regions and provinces, and in general among sub-national divisions. Our study can contribute to better allocation of resources, since it helps policymakers to recognize inequalities between regions and provinces and consequently helps them to allocate resources more efficiently.
VizieR Online Data Catalog: The CLASS BL Lac sample (Marcha+, 2013)
NASA Astrophysics Data System (ADS)
Marcha, M. J. M.; Caccianiga, A.
2014-04-01
This paper presents a new sample of BL Lac objects selected from a deep (30mJy) radio survey of flat spectrum radio sources (the CLASS blazar survey). The sample is one of the largest well-defined samples in the low-power regime with a total of 130 sources of which 55 satisfy the 'classical' optical BL Lac selection criteria, and the rest have indistinguishable radio properties. The primary goal of this study is to establish the radio luminosity function (RLF) on firm statistical ground at low radio luminosities where previous samples have not been able to investigate. The gain of taking a peek at lower powers is the possibility to search for the flattening of the luminosity function which is a feature predicted by the beaming model but which has remained elusive to observational confirmation. In this study, we extend for the first time the BL Lac RLF down to very low radio powers ~10^22 W/Hz, i.e. two orders of magnitude below the RLF currently available in the literature. In the process, we confirm the importance of adopting a broader, and more physically meaningful set of classification criteria to avoid the systematic missing of low-luminosity BL Lacs. Thanks to the good statistics we confirm the existence of weak but significant positive cosmological evolution for the BL Lac population, and we detect, for the first time, the flattening of the RLF at L ~ 10^25 W/Hz in agreement with the predictions of the beaming model. (1 data file).
The effect of the dynamic wet troposphere on VLBI measurements
NASA Technical Reports Server (NTRS)
Treuhaft, R. N.; Lanyi, G. E.
1986-01-01
Calculations using a statistical model of water vapor fluctuations yield the effect of the dynamic wet troposphere on Very Long Baseline Interferometry (VLBI) measurements. The statistical model arises from two primary assumptions: (1) the spatial structure of refractivity fluctuations can be closely approximated by elementary (Kolmogorov) turbulence theory, and (2) temporal fluctuations are caused by spatial patterns which are moved over a site by the wind. The consequences of these assumptions are outlined for the VLBI delay and delay rate observables. For example, wet troposphere induced rms delays for Deep Space Network (DSN) VLBI at 20-deg elevation are about 3 cm of delay per observation, which is smaller, on the average, than other known error sources in the current DSN VLBI data set. At 20-deg elevation for 200-s time intervals, water vapor induces approximately 1.5 × 10^-13 s/s in the Allan standard deviation of interferometric delay, which is a measure of the delay rate observable error. In contrast to the delay error, the delay rate measurement error is dominated by water vapor fluctuations. Water vapor induced VLBI parameter errors and correlations are calculated. For the DSN, baseline length parameter errors due to water vapor fluctuations are in the range of 3 to 5 cm. The above physical assumptions also lead to a method for including the water vapor fluctuations in the parameter estimation procedure, which is used to extract baseline and source information from the VLBI observables.
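The Allan standard deviation quoted above is a standard way to characterize delay-rate noise; a generic estimator for an evenly sampled delay series is sketched below (Python; this is the textbook overlapping estimator for time-error data, with invented sample values, not the paper's water-vapor model).

```python
import numpy as np

def allan_deviation(delay, tau0, m):
    """Overlapping Allan deviation of a delay (time-error) series.

    delay : evenly sampled delay residuals in seconds
    tau0  : sampling interval in seconds
    m     : averaging factor, so the averaging time is tau = m * tau0
    """
    x = np.asarray(delay, dtype=float)
    d2 = x[2 * m:] - 2.0 * x[m:-m] + x[:-2 * m]      # second differences
    avar = np.mean(d2 ** 2) / (2.0 * (m * tau0) ** 2)
    return np.sqrt(avar)

# Example: white-noise delays of ~3 cm (expressed in seconds of light travel
# time), sampled every 10 s, evaluated at a 200 s averaging time.
rng = np.random.default_rng(2)
delay = rng.normal(0.0, 0.03 / 3.0e8, size=5000)
print(allan_deviation(delay, tau0=10.0, m=20))
```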
NASA Astrophysics Data System (ADS)
Venable, N. B. H.; Fassnacht, S. R.; Adyabadam, G.
2014-12-01
Precipitation data in semi-arid and mountainous regions are often spatially and temporally sparse, yet they are a key variable needed to drive hydrological models. Gridded precipitation datasets provide a spatially and temporally coherent alternative to the use of point-based station data, but in the case of Mongolia, may not be constructed from all data available from government data sources, or may only be available at coarse resolutions. To examine the uncertainty associated with the use of gridded and/or point precipitation data, monthly water balance models of three river basins across forest steppe (the Khoid Tamir River at Ikhtamir), steppe (the Baidrag River at Bayanburd), and desert steppe (the Tuin River at Bogd) ecozones in the Khangai Mountain Region of Mongolia were compared. The models were forced over a 10-year period from 2001-2010, with gridded temperature and precipitation data at a 0.5 x 0.5 degree resolution. These results were compared to modeling using an interpolated hybrid of the gridded data and additional point data recently gathered from government sources; and with point data from the nearest meteorological station to the streamflow gage of choice. Goodness-of-fit measures including the Nash-Sutcliffe efficiency statistic, the percent bias, and the RMSE-observations standard deviation ratio were used to assess model performance. The results were mixed with smaller differences between the two gridded products as compared to the differences between gridded products and station data. The largest differences in precipitation inputs and modeled runoff amounts occurred between the two gridded datasets and station data in the desert steppe (Tuin), and the smallest differences occurred in the forest steppe (Khoid Tamir) and steppe (Baidrag). Mean differences between water balance model results are generally smaller than mean differences in the initial input data over the period of record. Seasonally, larger differences in gridded versus station-based precipitation products and modeled outputs occur in summer in the desert-steppe, and in spring in the forest steppe. Choice of precipitation data source in terms of gridded or point-based data directly affects model outcomes with greater uncertainty noted on a seasonal basis across ecozones of the Khangai.
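The three goodness-of-fit measures named above have widely used definitions; a small sketch follows (Python; the sample series are invented, and the sign convention for percent bias varies between sources, so the one used here is an assumption).

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, <= 0 is no better than the mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pbias(obs, sim):
    """Percent bias; with this convention, positive values mean underestimation."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)

def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    return rmse / obs.std()

# Invented monthly runoff (mm) for a quick check of the three measures.
obs = np.array([5.0, 12.0, 30.0, 55.0, 40.0, 18.0, 7.0])
sim = np.array([6.0, 10.0, 26.0, 60.0, 35.0, 20.0, 9.0])
print(nse(obs, sim), pbias(obs, sim), rsr(obs, sim))
```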
Estimating uncertainties in complex joint inverse problems
NASA Astrophysics Data System (ADS)
Afonso, Juan Carlos
2016-04-01
Sources of uncertainty affecting geophysical inversions can be classified either as reflective (i.e. the practitioner is aware of her/his ignorance) or non-reflective (i.e. the practitioner does not know that she/he does not know!). Although we should always be conscious of the latter, the former are the ones that, in principle, can be estimated either empirically (by making measurements or collecting data) or subjectively (based on the experience of the researchers). For complex parameter estimation problems in geophysics, subjective estimation of uncertainty is the most common type. In this context, probabilistic (aka Bayesian) methods are commonly claimed to offer a natural and realistic platform from which to estimate model uncertainties. This is because in the Bayesian approach, errors (whatever their nature) can be naturally included as part of the global statistical model, the solution of which represents the actual solution to the inverse problem. However, although we agree that probabilistic inversion methods are the most powerful tool for uncertainty estimation, the common claim that they produce "realistic" or "representative" uncertainties is not always justified. Typically, ALL UNCERTAINTY ESTIMATES ARE MODEL DEPENDENT, and therefore, besides a thorough characterization of experimental uncertainties, particular care must be paid to the uncertainty arising from model errors and input uncertainties. We recall here two quotes by G. Box and M. Gunzburger, respectively, of special significance for inversion practitioners and for this session: "…all models are wrong, but some are useful" and "computational results are believed by no one, except the person who wrote the code". In this presentation I will discuss and present examples of some problems associated with the estimation and quantification of uncertainties in complex multi-observable probabilistic inversions, and how to address them. Although the emphasis will be on sources of uncertainty related to the forward and statistical models, I will also address other uncertainties associated with data and uncertainty propagation.
A global goodness-of-fit statistic for Cox regression models.
Parzen, M; Lipsitz, S R
1999-06-01
In this paper, a global goodness-of-fit test statistic for a Cox regression model, which has an approximate chi-squared distribution when the model has been correctly specified, is proposed. Our goodness-of-fit statistic is global and has power to detect if interactions or higher order powers of covariates in the model are needed. The proposed statistic is similar to the Hosmer and Lemeshow (1980, Communications in Statistics A10, 1043-1069) goodness-of-fit statistic for binary data as well as Schoenfeld's (1980, Biometrika 67, 145-153) statistic for the Cox model. The methods are illustrated using data from a Mayo Clinic trial in primary biliary cirrhosis of the liver (Fleming and Harrington, 1991, Counting Processes and Survival Analysis), in which the outcome is the time until liver transplantation or death. There are 17 possible covariates. Two Cox proportional hazards models are fit to the data, and the proposed goodness-of-fit statistic is applied to the fitted models.
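As a rough illustration of the Hosmer-Lemeshow-style grouping mentioned above, the sketch below (Python with lifelines; a generic observed-versus-expected comparison over risk-score deciles, not the exact statistic proposed in the paper) compares observed event counts with model-expected counts in each group.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()                       # example survival data shipped with lifelines
cph = CoxPHFitter().fit(df, duration_col="week", event_col="arrest")

# Estimated cumulative hazard for each subject at its own observed time:
# H_i = H0(T_i) * exp(linear predictor), using lifelines' centered baseline.
risk = cph.predict_partial_hazard(df).to_numpy().ravel()
base = cph.baseline_cumulative_hazard_
H0_at_T = np.interp(df["week"].to_numpy(), base.index.values, base.iloc[:, 0].values)
expected_i = H0_at_T * risk

# Group subjects by deciles of the risk score and compare O_g with E_g.
groups = pd.qcut(risk, 10, labels=False, duplicates="drop")
summary = pd.DataFrame({"observed": df["arrest"].to_numpy(),
                        "expected": expected_i,
                        "group": groups}).groupby("group").sum()
chi2 = np.sum((summary["observed"] - summary["expected"]) ** 2 / summary["expected"])
print(summary, "\napproximate chi-square:", chi2)
```

A formal version would reference this quantity against a chi-squared distribution with degrees of freedom chosen as in the paper.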
Importance of vesicle release stochasticity in neuro-spike communication.
Ramezani, Hamideh; Akan, Ozgur B
2017-07-01
The aim of this paper is to propose a stochastic model for the vesicle release process, a part of neuro-spike communication. Hence, we study the biological events occurring in this process and use microphysiological simulations to observe the functionality of these events. Since the most important source of variability in vesicle release probability is the opening of voltage-dependent calcium channels (VDCCs) followed by the influx of calcium ions through these channels, we propose a stochastic model for this event, while using a deterministic model for other variability sources. To capture the stochasticity of calcium influx to the pre-synaptic neuron in our model, we study its statistics and find that it can be modeled by a distribution defined based on Normal and Logistic distributions.
Abruptness of Cascade Failures in Power Grids
NASA Astrophysics Data System (ADS)
Pahwa, Sakshi; Scoglio, Caterina; Scala, Antonio
2014-01-01
Electric power-systems are one of the most important critical infrastructures. In recent years, they have been exposed to extreme stress due to the increasing demand, the introduction of distributed renewable energy sources, and the development of extensive interconnections. We investigate the phenomenon of abrupt breakdown of an electric power-system under two scenarios: load growth (mimicking the ever-increasing customer demand) and power fluctuations (mimicking the effects of renewable sources). Our results on real, realistic and synthetic networks indicate that increasing the system size causes breakdowns to become more abrupt; in fact, mapping the system to a solvable statistical-physics model indicates the occurrence of a first order transition in the large size limit. Such an enhancement for the systemic risk failures (black-outs) with increasing network size is an effect that should be considered in the current projects aiming to integrate national power-grids into "super-grids".
Distribution of tsunami interevent times
NASA Astrophysics Data System (ADS)
Geist, Eric L.; Parsons, Tom
2008-01-01
The distribution of tsunami interevent times is analyzed using global and site-specific (Hilo, Hawaii) tsunami catalogs. An empirical probability density distribution is determined by binning the observed interevent times during a period in which the observation rate is approximately constant. The empirical distributions for both catalogs exhibit non-Poissonian behavior in which there is an abundance of short interevent times compared to an exponential distribution. Two types of statistical distributions are used to model this clustering behavior: (1) long-term clustering described by a universal scaling law, and (2) Omori law decay of aftershocks and triggered sources. The empirical and theoretical distributions all imply an increased hazard rate after a tsunami, followed by a gradual decrease with time approaching a constant hazard rate. Examination of tsunami sources suggests that many of the short interevent times are caused by triggered earthquakes, though the triggered events are not necessarily on the same fault.
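The comparison between the empirical interevent-time distribution and an exponential (Poisson) reference described above can be reproduced in a few lines (Python; the catalog times are invented and the binning choice is an assumption).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Invented tsunami origin times (years); a real analysis would use a catalog
# restricted to the period of roughly constant observation rate.
event_times = np.sort(rng.uniform(1900.0, 2008.0, size=150))
dt = np.diff(event_times)                       # interevent times

# Empirical density via binning.
counts, edges = np.histogram(dt, bins=20, density=True)

# Exponential reference with the same mean rate.
rate = 1.0 / dt.mean()
centers = 0.5 * (edges[:-1] + edges[1:])
expon_pdf = rate * np.exp(-rate * centers)

# Simple checks for departure from Poisson behavior (excess of short intervals).
ks_stat, p_value = stats.kstest(dt, "expon", args=(0.0, dt.mean()))
print("excess of short intervals:", counts[0] > expon_pdf[0])
print("KS test p-value vs exponential:", p_value)
```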
Blind separation of incoherent and spatially disjoint sound sources
NASA Astrophysics Data System (ADS)
Dong, Bin; Antoni, Jérôme; Pereira, Antonio; Kellermann, Walter
2016-11-01
Blind separation of sound sources aims at reconstructing the individual sources which contribute to the overall radiation of an acoustical field. The challenge is to reach this goal using distant measurements when all sources are operating concurrently. The working assumption is usually that the sources of interest are incoherent - i.e. statistically orthogonal - so that their separation can be approached by decorrelating a set of simultaneous measurements, which amounts to diagonalizing the cross-spectral matrix. Principal Component Analysis (PCA) is traditionally used to this end. This paper reports two new findings in this context. First, a sufficient condition is established under which "virtual" sources returned by PCA coincide with true sources; it stipulates that the sources of interest should be not only incoherent but also spatially orthogonal. A particular case of this instance is met by spatially disjoint sources - i.e. with non-overlapping support sets. Second, based on this finding, a criterion that enforces both statistical and spatial orthogonality is proposed to blindly separate incoherent sound sources which radiate from disjoint domains. This criterion can be easily incorporated into acoustic imaging algorithms such as beamforming or acoustical holography to identify sound sources of different origins. The proposed methodology is validated on laboratory experiments. In particular, the separation of aeroacoustic sources is demonstrated in a wind tunnel.
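The decorrelation step described above, diagonalizing the cross-spectral matrix with PCA, can be sketched for a single frequency bin as follows (Python; the array geometry and source signals are invented, and this illustrates the decomposition only, not the spatial-orthogonality criterion proposed in the paper).

```python
import numpy as np

rng = np.random.default_rng(4)
n_mics, n_snapshots = 8, 400

# Invented snapshot matrix at one frequency: two incoherent sources seen
# through random steering vectors, plus sensor noise.
steering = rng.normal(size=(n_mics, 2)) + 1j * rng.normal(size=(n_mics, 2))
signals = rng.normal(size=(2, n_snapshots)) + 1j * rng.normal(size=(2, n_snapshots))
noise = 0.1 * (rng.normal(size=(n_mics, n_snapshots))
               + 1j * rng.normal(size=(n_mics, n_snapshots)))
X = steering @ signals + noise

# Cross-spectral matrix averaged over snapshots.
S = X @ X.conj().T / n_snapshots

# PCA: eigendecomposition of the Hermitian CSM. Eigenvalues give the powers
# of the "virtual" sources; eigenvectors give their spatial signatures.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
print("virtual source powers:", eigvals[order][:3])
```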
Jet Noise Diagnostics Supporting Statistical Noise Prediction Methods
NASA Technical Reports Server (NTRS)
Bridges, James E.
2006-01-01
The primary focus of my presentation is the development of the jet noise prediction code JeNo with most examples coming from the experimental work that drove the theoretical development and validation. JeNo is a statistical jet noise prediction code, based upon the Lilley acoustic analogy. Our approach uses time-average 2-D or 3-D mean and turbulent statistics of the flow as input. The output is source distributions and spectral directivity. NASA has been investing in development of statistical jet noise prediction tools because these seem to fit the middle ground that allows enough flexibility and fidelity for jet noise source diagnostics while having reasonable computational requirements. These tools rely on Reynolds-averaged Navier-Stokes (RANS) computational fluid dynamics (CFD) solutions as input for computing far-field spectral directivity using an acoustic analogy. There are many ways acoustic analogies can be created, each with a series of assumptions and models, many often taken unknowingly. And the resulting prediction can be easily reverse-engineered by altering the models contained within. However, only an approach which is mathematically sound, with assumptions validated and modeled quantities checked against direct measurement will give consistently correct answers. Many quantities are modeled in acoustic analogies precisely because they have been impossible to measure or calculate, making this requirement a difficult task. The NASA team has spent considerable effort identifying all the assumptions and models used to take the Navier-Stokes equations to the point of a statistical calculation via an acoustic analogy very similar to that proposed by Lilley. Assumptions have been identified and experiments have been developed to test these assumptions. In some cases this has resulted in assumptions being changed. Beginning with the CFD used as input to the acoustic analogy, models for turbulence closure used in RANS CFD codes have been explored and compared against measurements of mean and rms velocity statistics over a range of jet speeds and temperatures. Models for flow parameters used in the acoustic analogy, most notably the space-time correlations of velocity, have been compared against direct measurements, and modified to better fit the observed data. These measurements have been extremely challenging for hot, high speed jets, and represent a sizeable investment in instrumentation development. As an intermediate check that the analysis is predicting the physics intended, phased arrays have been employed to measure source distributions for a wide range of jet cases. And finally, careful far-field spectral directivity measurements have been taken for final validation of the prediction code. Examples of each of these experimental efforts will be presented. The main result of these efforts is a noise prediction code, named JeNo, which is in mid-development. JeNo is able to consistently predict spectral directivity, including aft angle directivity, for subsonic cold jets of most geometries. Current development on JeNo is focused on extending its capability to hot jets, requiring inclusion of a previously neglected second source associated with thermal fluctuations. A secondary result of the intensive experimentation is the archiving of various flow statistics applicable to other acoustic analogies and to development of time-resolved prediction methods. These will be of lasting value as we look ahead at future challenges to the aeroacoustic experimentalist.
Abbott, M.L.; Susong, D.D.; Krabbenhoft, D.P.; Rood, A.S.
2002-01-01
Mercury (total and methyl) was evaluated in snow samples collected near a major mercury emission source on the Idaho National Engineering and Environmental Laboratory (INEEL) in southeastern Idaho and 160 km downwind in the Teton Range in western Wyoming. The sampling was done to assess near-field (<12 km) deposition rates around the source, compare them to those measured in a relatively remote, pristine downwind location, and to use the measurements to develop improved, site-specific model input parameters for precipitation scavenging coefficient and the fraction of Hg emissions deposited locally. Measured snow water concentrations (ng L-1) were converted to deposition (µg m-2) using the sample location snow water equivalent. The deposition was then compared to that predicted using the ISC3 air dispersion/deposition model which was run with a range of particle and vapor scavenging coefficient input values. Accepted model statistical performance measures (fractional bias and normalized mean square error) were calculated for the different modeling runs, and the best model performance was selected. Measured concentrations close to the source (average = 5.3 ng L-1) were about twice those measured in the Teton Range (average = 2.7 ng L-1) which were within the expected range of values for remote background areas. For most of the sampling locations, the ISC3 model predicted within a factor of two of the observed deposition. The best modeling performance was obtained using a scavenging coefficient value for 0.25 µm diameter particulate and the assumption that all of the mercury is reactive Hg(II) and subject to local deposition. A 0.1 µm particle assumption provided conservative overprediction of the data, while a vapor assumption resulted in highly variable predictions. Partitioning a fraction of the Hg emissions to elemental Hg(0) (a U.S. EPA default assumption for combustion facility risk assessments) would have underpredicted the observed fallout.
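The two performance measures used above, fractional bias and normalized mean square error, follow common air-dispersion evaluation definitions; a brief sketch is given below (Python; the deposition values are invented, and the sign convention for fractional bias is an assumption since it varies between authors).

```python
import numpy as np

def fractional_bias(obs, pred):
    """FB = 2 (mean_obs - mean_pred) / (mean_obs + mean_pred); 0 is unbiased."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 2.0 * (obs.mean() - pred.mean()) / (obs.mean() + pred.mean())

def nmse(obs, pred):
    """Normalized mean square error; 0 is a perfect match."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.mean((obs - pred) ** 2) / (obs.mean() * pred.mean())

# Invented snow-deposition values (ug m-2) at a handful of sampling sites.
observed = np.array([2.1, 3.4, 1.8, 5.0, 2.7])
modeled = np.array([1.6, 4.1, 2.2, 3.9, 3.1])
print("FB =", fractional_bias(observed, modeled), " NMSE =", nmse(observed, modeled))
```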
Influences of system uncertainties on the numerical transfer path analysis of engine systems
NASA Astrophysics Data System (ADS)
Acri, A.; Nijman, E.; Acri, A.; Offner, G.
2017-10-01
Practical mechanical systems operate with some degree of uncertainty. In numerical models uncertainties can result from poorly known or variable parameters, from geometrical approximation, from discretization or numerical errors, from uncertain inputs or from rapidly changing forcing that can be best described in a stochastic framework. Recently, random matrix theory was introduced to take parameter uncertainties into account in numerical modeling problems. In particular in this paper, Wishart random matrix theory is applied on a multi-body dynamic system to generate random variations of the properties of system components. Multi-body dynamics is a powerful numerical tool largely implemented during the design of new engines. In this paper the influence of model parameter variability on the results obtained from the multi-body simulation of engine dynamics is investigated. The aim is to define a methodology to properly assess and rank system sources when dealing with uncertainties. Particular attention is paid to the influence of these uncertainties on the analysis and the assessment of the different engine vibration sources. Examples of the effects of different levels of uncertainties are illustrated by means of examples using a representative numerical powertrain model. A numerical transfer path analysis, based on system dynamic substructuring, is used to derive and assess the internal engine vibration sources. The results obtained from this analysis are used to derive correlations between parameter uncertainties and statistical distribution of results. The derived statistical information can be used to advance the knowledge of the multi-body analysis and the assessment of system sources when uncertainties in model parameters are considered.
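A minimal sketch of using Wishart draws to randomize a system matrix, in the spirit described above, is given below (Python; the nominal stiffness matrix and the dispersion parameter are invented, and the simple scaling used here differs from the full nonparametric random-matrix construction applied in the paper).

```python
import numpy as np
from scipy.stats import wishart

# Invented nominal (symmetric positive-definite) stiffness matrix of a
# small lumped-parameter model.
K_nom = np.array([[ 4.0, -1.0,  0.0],
                  [-1.0,  3.0, -1.0],
                  [ 0.0, -1.0,  2.0]])

# Wishart(df, scale) has mean df * scale, so scale = K_nom / df keeps the
# nominal matrix as the ensemble mean; larger df means smaller scatter.
df = 50
K_samples = wishart(df=df, scale=K_nom / df).rvs(size=200, random_state=5)

# Spread of the lowest stiffness eigenvalue across the random ensemble.
eig_first = np.array([np.linalg.eigvalsh(K)[0] for K in K_samples])
print("nominal:", np.linalg.eigvalsh(K_nom)[0],
      "ensemble mean:", eig_first.mean(), "std:", eig_first.std())
```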
A global reconstruction of climate-driven subdecadal water storage variability
NASA Astrophysics Data System (ADS)
Humphrey, V.; Gudmundsson, L.; Seneviratne, S. I.
2017-03-01
Since 2002, the Gravity Recovery and Climate Experiment (GRACE) mission has provided unprecedented observations of global mass redistribution caused by hydrological processes. However, there are still few sources on pre-2002 global terrestrial water storage (TWS). Classical approaches to retrieve past TWS rely on either land surface models (LSMs) or basin-scale water balance calculations. Here we propose a new approach which statistically relates anomalies in atmospheric drivers to monthly GRACE anomalies. Gridded subdecadal TWS changes and time-dependent uncertainty intervals are reconstructed for the period 1985-2015. Comparisons with model results demonstrate the performance and robustness of the derived data set, which represents a new and valuable source for studying subdecadal TWS variability, closing the ocean/land water budgets and assessing GRACE uncertainties. At midpoint between GRACE observations and LSM simulations, the statistical approach provides TWS estimates.
On Theoretical Broadband Shock-Associated Noise Near-Field Cross-Spectra
NASA Technical Reports Server (NTRS)
Miller, Steven A. E.
2015-01-01
The cross-spectral acoustic analogy is used to predict auto-spectra and cross-spectra of broadband shock-associated noise in the near-field and far-field from a range of heated and unheated supersonic off-design jets. A single equivalent source model, which contains flow-field statistics of the shock wave shear layer interactions, is proposed for the near-field, mid-field, and far-field terms. Flow-field statistics are modeled based upon experimental observation and computational fluid dynamics solutions. An axisymmetric assumption is used to reduce the model to a closed-form equation involving a double summation over the equivalent source at each shock wave shear layer interaction. Predictions are compared with a wide variety of measurements at numerous jet Mach numbers and temperature ratios from multiple facilities. Auto-spectral predictions of broadband shock-associated noise in the near-field and far-field capture trends observed in measurement and other prediction theories. Predictions of spatial coherence of broadband shock-associated noise accurately capture the peak coherent intensity, frequency, and spectral width.
The use of open source bioinformatics tools to dissect transcriptomic data.
Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera
2012-01-01
Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analyses can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks. We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.
Argyropoulos, G; Samara, C; Diapouli, E; Eleftheriadis, K; Papaoikonomou, K; Kungolos, A
2017-12-01
A hybrid source-receptor modeling process was assembled to apportion and infer source locations of PM10 and PM2.5 in three heavily-impacted urban areas of Greece, during the warm period of 2011 and the cold period of 2012. The assembled process involved application of an advanced computational procedure, the so-called Robotic Chemical Mass Balance (RCMB) model. Source locations were inferred using two well-established probability functions: (a) the Conditional Probability Function (CPF), to correlate the output of RCMB with local wind directional data, and (b) the Potential Source Contribution Function (PSCF), to correlate the output of RCMB with 72-h air-mass back-trajectories arriving at the receptor sites during sampling. Regarding CPF, a higher-level conditional probability function was defined as well, from the common locus of CPF sectors derived for neighboring receptor sites. With respect to PSCF, a non-parametric bootstrapping method was applied to discriminate the statistically significant values. RCMB modeling showed that resuspended dust is actually one of the main barriers for attaining the European Union (EU) limit values in Mediterranean urban agglomerations, where the drier climate favors build-up. The shift in the energy mix of Greece (caused by the economic recession) was also evidenced, since biomass burning was found to contribute more significantly to the sampling sites belonging to the coldest climatic zone, particularly during the cold period. The CPF analysis showed that short-range transport of anthropogenic emissions from urban traffic to urban background sites was very likely to have occurred within all the examined urban agglomerations. The PSCF analysis confirmed that long-range transport of primary and/or secondary aerosols may indeed be possible, even from distances over 1000 km away from study areas. Copyright © 2017 Elsevier B.V. All rights reserved.
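The conditional probability function used above has a simple standard form: for each wind sector it is the fraction of sampling periods in that sector during which the source contribution exceeded a chosen threshold. A sketch follows (Python; the wind and contribution series are invented, and the 75th-percentile threshold is a common but assumed choice).

```python
import numpy as np

rng = np.random.default_rng(6)

# Invented hourly wind directions (degrees) and RCMB-style source contributions.
wind_dir = rng.uniform(0.0, 360.0, size=2000)
contribution = rng.gamma(shape=2.0, scale=3.0, size=2000)

def cpf(wind_dir, contribution, sector_width=30.0, quantile=0.75):
    """CPF per sector: (# samples above threshold) / (# samples) in that sector."""
    threshold = np.quantile(contribution, quantile)
    edges = np.arange(0.0, 360.0 + sector_width, sector_width)
    sector = np.digitize(wind_dir, edges) - 1
    values = np.zeros(len(edges) - 1)
    for s in range(len(values)):
        in_sector = sector == s
        n = in_sector.sum()
        m = (in_sector & (contribution > threshold)).sum()
        values[s] = m / n if n > 0 else np.nan
    return edges[:-1], values

sector_start, probs = cpf(wind_dir, contribution)
print(dict(zip(sector_start.astype(int), np.round(probs, 2))))
```

The PSCF follows the same counting idea but bins back-trajectory endpoints on a geographic grid instead of wind sectors.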
Inverse and Forward Modeling of the 2014 Iquique Earthquake with Run-up Data
NASA Astrophysics Data System (ADS)
Fuentes, M.
2015-12-01
The April 1, 2014 Mw 8.2 Iquique earthquake generated a moderate tsunami that triggered the national tsunami alert. This earthquake was located in the well-known seismic gap in northern Chile, which had retained a high seismic potential (~Mw 9.0) since the two main large historic events of 1868 and 1877. Nonetheless, studies of the seismic source performed with seismic data inversions suggest that the event exhibited a main patch located around 19.8° S at 40 km depth, with a seismic moment equivalent to Mw = 8.2. Thus, a large seismic deficit remains in the gap, capable of releasing an event of Mw = 8.8-8.9. To understand the importance of the tsunami threat in this zone, seismic source modeling of the Iquique earthquake is performed. A new approach based on stochastic k² seismic sources is presented. A set of these sources is generated and, for each one, a full numerical tsunami simulation is performed in order to obtain the run-up heights along the coastline. The results are compared with the available field run-up measurements and with the tide gauges that registered the signal. The comparison is not uniform: discrepancies close to the peak run-up location are penalized more heavily. This criterion identifies, from the set of scenarios, the seismic source that best explains the observations in a statistical sense. On the other hand, an L2-norm minimization is used to invert the seismic source by comparing the peak nearshore tsunami amplitude (PNTA) with the run-up observations. This method searches a space of solutions for the best seismic configuration by retrieving the Green's function coefficients that explain the field measurements. The results obtained confirm that a concentrated down-dip slip patch adequately models the run-up data.
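A minimal sketch of the L2-norm inversion idea: run-up (or PNTA) observations are written as a linear combination of precomputed unit-slip Green's functions, and the slip coefficients are recovered by least squares with a non-negativity constraint. The Green's functions, noise level, and subfault layout below are hypothetical stand-ins.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)

# Hypothetical Green's functions: peak nearshore tsunami amplitude at each of
# 40 coastal sites produced by unit slip on each of 8 subfaults.
n_sites, n_subfaults = 40, 8
G = rng.uniform(0.05, 0.5, size=(n_sites, n_subfaults))

true_slip = np.array([0.0, 0.2, 1.5, 2.0, 0.8, 0.1, 0.0, 0.0])   # concentrated patch
runup_obs = G @ true_slip + rng.normal(0.0, 0.05, n_sites)        # noisy "observations"

# L2-norm minimization: min ||G m - d||_2 subject to m >= 0 (slip is non-negative)
slip_est, residual = nnls(G, runup_obs)
print("estimated slip coefficients:", np.round(slip_est, 2))
print("L2 misfit:", round(residual, 3))
```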
Source-Modeling Auditory Processes of EEG Data Using EEGLAB and Brainstorm.
Stropahl, Maren; Bauer, Anna-Katharina R; Debener, Stefan; Bleichner, Martin G
2018-01-01
Electroencephalography (EEG) source localization approaches are often used to disentangle the spatial patterns mixed up in scalp EEG recordings. However, approaches differ substantially between experiments, may be strongly parameter-dependent, and their results are not necessarily meaningful. In this paper we provide a pipeline for EEG source estimation, from raw EEG data pre-processing using EEGLAB functions up to source-level analysis as implemented in Brainstorm. The pipeline is tested using a data set of 10 individuals performing an auditory attention task. The analysis approach estimates sources of 64-channel EEG data without the prerequisite of individual anatomies or individually digitized sensor positions. First, we show advanced EEG pre-processing using EEGLAB, which includes artifact attenuation using independent component analysis (ICA). ICA is a linear decomposition technique that aims to reveal the underlying statistical sources of mixed signals and is also a powerful tool for attenuating stereotypical artifacts (e.g., eye movements or heartbeat). Data submitted to ICA are pre-processed to facilitate good-quality decompositions. Aiming toward an objective approach to component identification, the semi-automatic CORRMAP algorithm is applied for the identification of components representing prominent and stereotypic artifacts. Second, we present a step-wise approach to estimate active sources of auditory cortex event-related processing, at the single-subject level. The presented approach assumes that no individual anatomy is available and therefore the default anatomy ICBM152, as implemented in Brainstorm, is used for all individuals. Individual noise modeling in this dataset is based on the pre-stimulus baseline period. For EEG source modeling we use the OpenMEEG algorithm as the underlying forward model based on the symmetric Boundary Element Method (BEM). We then apply the method of dynamical statistical parametric mapping (dSPM) to obtain physiologically plausible EEG source estimates. Finally, we show how to perform group level analysis in the time domain on anatomically defined regions of interest (auditory scout). The proposed pipeline needs to be tailored to the specific datasets and paradigms. However, the straightforward combination of EEGLAB and Brainstorm analysis tools may be of interest to others performing EEG source localization.
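As a hedged illustration of the ICA-based artifact attenuation step (using scikit-learn on synthetic signals rather than the EEGLAB/CORRMAP pipeline described above), the sketch below decomposes mixed channels, zeroes the component most correlated with a blink-like template, and projects back to channel space:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(3)
t = np.linspace(0, 10, 2000)

# Synthetic "neural" sources plus a blink-like artifact source
neural1 = np.sin(2 * np.pi * 10 * t)
neural2 = np.sign(np.sin(2 * np.pi * 3 * t))
blink = (rng.random(t.size) < 0.005).astype(float)
blink = np.convolve(blink, np.hanning(100), mode="same") * 20.0

S_true = np.c_[neural1, neural2, blink]
A_mix = rng.normal(size=(8, 3))            # mixing into 8 "channels"
X = S_true @ A_mix.T + 0.1 * rng.normal(size=(t.size, 8))

# Linear decomposition, as in ICA-based artifact attenuation
ica = FastICA(n_components=3, random_state=0)
S_est = ica.fit_transform(X)               # estimated sources (n_samples, n_components)

# Identify the artifact component by its correlation with an EOG-like template
corr = [abs(np.corrcoef(S_est[:, k], blink)[0, 1]) for k in range(S_est.shape[1])]
artifact = int(np.argmax(corr))

# Zero the artifact component and project back to channel space
S_clean = S_est.copy()
S_clean[:, artifact] = 0.0
X_clean = S_clean @ ica.mixing_.T + ica.mean_
print(f"component {artifact} removed; cleaned data shape: {X_clean.shape}")
```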
Cortical Hierarchies Perform Bayesian Causal Inference in Multisensory Perception
Rohe, Tim; Noppeney, Uta
2015-01-01
To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the “causal inference problem.” Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world. PMID:25710328
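A minimal numerical sketch of Bayesian Causal Inference with Gaussian likelihoods and model averaging, following a standard formulation (e.g., Körding et al., 2007) rather than the exact model fitted in this study; all sensory noise and prior parameters are hypothetical.

```python
import numpy as np

def bci_estimate(x_a, x_v, sig_a=6.0, sig_v=2.0, sig_p=15.0, mu_p=0.0, p_common=0.5):
    """Model-averaged auditory location estimate under Bayesian Causal Inference
    (Gaussian likelihoods and spatial prior; parameter values are hypothetical)."""
    va, vv, vp = sig_a**2, sig_v**2, sig_p**2

    # Forced-fusion (common source) and segregation (independent sources) estimates
    s_fused = (x_a / va + x_v / vv + mu_p / vp) / (1 / va + 1 / vv + 1 / vp)
    s_aud_seg = (x_a / va + mu_p / vp) / (1 / va + 1 / vp)

    # Likelihood of the two signals under each causal structure (closed form)
    denom1 = va * vv + va * vp + vv * vp
    like_c1 = np.exp(-0.5 * ((x_a - x_v) ** 2 * vp
                             + (x_a - mu_p) ** 2 * vv
                             + (x_v - mu_p) ** 2 * va) / denom1) / (2 * np.pi * np.sqrt(denom1))
    like_c2 = np.exp(-0.5 * ((x_a - mu_p) ** 2 / (va + vp)
                             + (x_v - mu_p) ** 2 / (vv + vp))) \
        / (2 * np.pi * np.sqrt((va + vp) * (vv + vp)))

    # Posterior probability of a common cause, then model averaging
    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))
    return post_c1 * s_fused + (1 - post_c1) * s_aud_seg, post_c1

for xa, xv in [(10.0, 8.0), (10.0, -20.0)]:
    est, p1 = bci_estimate(xa, xv)
    print(f"xA={xa:6.1f}, xV={xv:6.1f} -> p(common)={p1:.2f}, auditory estimate={est:6.1f}")
```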
Evaluation of Oceanic Transport Statistics By Use of Transient Tracers and Bayesian Methods
NASA Astrophysics Data System (ADS)
Trossman, D. S.; Thompson, L.; Mecking, S.; Bryan, F.; Peacock, S.
2013-12-01
Key variables that quantify the time scales over which atmospheric signals penetrate into the oceanic interior and their uncertainties are computed using Bayesian methods and transient tracers from both models and observations. First, the mean residence times, subduction rates, and formation rates of Subtropical Mode Water (STMW) and Subpolar Mode Water (SPMW) in the North Atlantic and Subantarctic Mode Water (SAMW) in the Southern Ocean are estimated by combining a model and observations of chlorofluorocarbon-11 (CFC-11) via Bayesian Model Averaging (BMA), a statistical technique that weights model estimates according to how closely they agree with observations. Second, a Bayesian method is presented to find two oceanic transport parameters associated with the age distribution of ocean waters, the transit-time distribution (TTD), by combining an eddying global ocean model's estimate of the TTD with hydrographic observations of CFC-11, temperature, and salinity. Uncertainties associated with objectively mapping irregularly spaced bottle data are quantified by making use of a thin-plate spline and then propagated via the two Bayesian techniques. It is found that the subduction of STMW, SPMW, and SAMW is mostly an advective process, but up to about one-third of STMW subduction is likely due to non-advective processes. Also, while the formation of STMW is mostly due to subduction, the formation of SPMW is mostly due to other processes. About half of the formation of SAMW is due to subduction and half is due to other processes. A combination of air-sea flux, acting on relatively short time scales, and turbulent mixing, acting on a wide range of time scales, is likely the dominant SPMW erosion mechanism. Air-sea flux is likely responsible for most STMW erosion, and turbulent mixing is likely responsible for most SAMW erosion. Two oceanic transport parameters, the mean age of a water parcel and the half-variance associated with the TTD, estimated using the model's tracers as data (BayesPOP) and those estimated using tracer observations as data (BayesObs) provide information about the sources of model biases, and give a more nuanced picture than can be found by comparing the simulated CFC-11 concentrations with observed CFC-11 concentrations. Using the differences between the two oceanic transport parameters from BayesObs and those from BayesPOP with and without a constant Peclet number assumption along each of the hydrographic cross-sections considered here, it is found that the model's diffusivity tensor biases lead to larger model errors than the model's mean advection time biases. However, it is also found that mean advection time biases in the model are statistically significant at the 95% level where mode water is found.
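A minimal sketch of the Bayesian Model Averaging idea described above: each model estimate is weighted by a Gaussian likelihood measuring how closely it reproduces the observations. The observations, model fields, and error scale are hypothetical.

```python
import numpy as np

# Hypothetical CFC-11 observations and three model estimates at the same locations
obs = np.array([1.9, 2.1, 2.4, 2.0, 1.7])
models = {
    "model_A": np.array([1.8, 2.0, 2.3, 2.1, 1.8]),
    "model_B": np.array([2.4, 2.6, 2.9, 2.5, 2.2]),
    "model_C": np.array([1.9, 2.2, 2.5, 1.9, 1.6]),
}
sigma_obs = 0.15   # assumed observational error

# Weight each model by its Gaussian likelihood given the observations
log_like = {name: -0.5 * np.sum(((pred - obs) / sigma_obs) ** 2)
            for name, pred in models.items()}
max_ll = max(log_like.values())
weights = {name: np.exp(ll - max_ll) for name, ll in log_like.items()}
total = sum(weights.values())
weights = {name: w / total for name, w in weights.items()}

# BMA estimate: weighted average of the model fields
bma = sum(w * models[name] for name, w in weights.items())
print("weights:", {k: round(v, 3) for k, v in weights.items()})
print("BMA estimate:", np.round(bma, 2))
```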
NASA Astrophysics Data System (ADS)
Lutz, Stefanie; Van Breukelen, Boris
2014-05-01
Natural attenuation can represent a complementary or alternative approach to engineered remediation of polluted sites. In this context, compound specific stable isotope analysis (CSIA) has proven a useful tool, as it can provide evidence of natural attenuation and assess the extent of in-situ degradation based on changes in isotope ratios of pollutants. Moreover, CSIA can allow for source identification and apportionment, which might help to identify major emission sources in complex contamination scenarios. However, degradation and mixing processes in aquifers can lead to changes in isotopic compositions, such that their simultaneous occurrence might complicate combined source apportionment (SA) and assessment of the extent of degradation (ED). We developed a mathematical model (stable isotope sources and sinks model; SISS model) based on the linear stable isotope mixing model and the Rayleigh equation that allows for simultaneous SA and quantification of the ED in a scenario of two emission sources and degradation via one reaction pathway. It was shown that the SISS model with CSIA of at least two elements contained in the pollutant (e.g., C and H in benzene) allows for unequivocal SA even in the presence of degradation-induced isotope fractionation. In addition, the model enables precise quantification of the ED provided degradation follows instantaneous mixing of two sources. If mixing occurs after two sources have degraded separately, the model can still yield a conservative estimate of the overall extent of degradation. The SISS model was validated against virtual data from a two-dimensional reactive transport model. The model results for SA and ED were in good agreement with the simulation results. The application of the SISS model to field data of benzene contamination was, however, challenged by large uncertainties in measured isotope data. Nonetheless, the use of the SISS model provided a better insight into the interplay of mixing and degradation processes at the field site, as it revealed the prevailing contribution of one emission source and a low overall ED. The model can be extended to a larger number of sources and sinks. It may aid in forensics and natural attenuation assessment of soil, groundwater, surface water, or atmospheric pollution.
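Two building blocks of this kind of analysis, sketched with hypothetical numbers rather than the SISS equations themselves: a two-source linear isotope mixing equation solved for the source fractions, and the Rayleigh equation solved for the extent of degradation given an assumed enrichment factor.

```python
import numpy as np

def two_source_fractions(delta_mix, delta_1, delta_2):
    """Linear mixing: delta_mix = f1*delta_1 + (1 - f1)*delta_2, solved for f1."""
    f1 = (delta_mix - delta_2) / (delta_1 - delta_2)
    return f1, 1.0 - f1

def extent_of_degradation(delta_t, delta_0, epsilon):
    """Rayleigh equation: delta_t = delta_0 + epsilon * ln(f_remaining),
    so the extent of degradation is ED = 1 - f_remaining."""
    f_remaining = np.exp((delta_t - delta_0) / epsilon)
    return 1.0 - f_remaining

# Hypothetical carbon isotope signatures of benzene (per mil)
delta_source_1, delta_source_2 = -28.0, -24.0     # two emission sources
delta_mixture_undegraded = -26.5                  # signature expected from mixing alone
f1, f2 = two_source_fractions(delta_mixture_undegraded, delta_source_1, delta_source_2)
print(f"source 1: {f1:.2f}, source 2: {f2:.2f}")

# Degradation shifts the measured signature away from the pure mixing value
epsilon_c = -2.0                                  # hypothetical enrichment factor (per mil)
delta_measured = -25.2
ed = extent_of_degradation(delta_measured, delta_mixture_undegraded, epsilon_c)
print(f"extent of degradation: {ed:.0%}")
```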
Marković, Snežana; Kerč, Janez; Horvat, Matej
2017-03-01
We present a new approach for identifying sources of variability within a manufacturing process, based on NIR measurements of samples of intermediate material after each consecutive unit operation (interprocess NIR sampling technique). In addition, we summarize the development of a multivariate statistical process control (MSPC) model for the production of an enteric-coated pellet product of the proton-pump inhibitor class. Developing provisional NIR calibration models identifies critical process points with results comparable to the established MSPC modeling procedure. Both approaches are shown to lead to the same conclusion, identifying parameters of extrusion/spheronization and characteristics of lactose that have the greatest influence on the end-product's enteric coating performance. The proposed approach enables quicker and easier identification of variability sources during the manufacturing process, especially when historical process data are not readily available. In the presented case, changes in lactose characteristics influenced the performance of the extrusion/spheronization process step. The pellet cores produced using one lactose source (considered less suitable) were on average larger and more fragile, leading to breakage of the cores during subsequent fluid-bed operations. These results were confirmed by additional experimental analyses illuminating the underlying mechanism of fracture of oblong pellets during the pellet coating process, which leads to compromised film coating.
Retrograde spins of near-Earth asteroids from the Yarkovsky effect.
La Spina, A; Paolicchi, P; Kryszczyńska, A; Pravec, P
2004-03-25
Dynamical resonances in the asteroid belt are the gateway for the production of near-Earth asteroids (NEAs). To generate the observed number of NEAs, however, requires the injection of many asteroids into those resonant regions. Collisional processes have long been claimed as a possible source, but difficulties with that idea have led to the suggestion that orbital drift arising from the Yarkovsky effect dominates the injection process. (The Yarkovsky effect is a force arising from differential heating: the 'afternoon' side of an asteroid is warmer than the 'morning' side.) The two models predict different rotational properties of NEAs: the usual collisional theories are consistent with a nearly isotropic distribution of rotation vectors, whereas the 'Yarkovsky model' predicts an excess of retrograde rotations. Here we report that the spin vectors of NEAs show a strong and statistically significant excess of retrograde rotations, quantitatively consistent with the theoretical expectations of the Yarkovsky model.
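A minimal sketch of the significance test implied by this result: under the collisional (isotropic) hypothesis, prograde and retrograde spins are equally likely, so an observed retrograde excess can be assessed with a one-sided binomial test; the counts below are hypothetical, not the survey values.

```python
from scipy.stats import binomtest

# Hypothetical spin-vector survey: number of NEAs with retrograde rotation
n_neas = 21
n_retrograde = 15

# Null hypothesis (collisional, isotropic origin): retrograde probability = 0.5
result = binomtest(n_retrograde, n_neas, p=0.5, alternative="greater")
print(f"retrograde fraction = {n_retrograde / n_neas:.2f}, "
      f"one-sided p-value = {result.pvalue:.3f}")
```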
NASA Astrophysics Data System (ADS)
Geil, Paul M.; Mutch, Simon J.; Poole, Gregory B.; Angel, Paul W.; Duffy, Alan R.; Mesinger, Andrei; Wyithe, J. Stuart B.
2016-10-01
We use the Dark-ages, Reionization And Galaxy formation Observables from Numerical Simulations (DRAGONS) framework to investigate the effect of galaxy formation physics on the morphology and statistics of ionized hydrogen (H II) regions during the Epoch of Reionization (EoR). DRAGONS self-consistently couples a semi-analytic galaxy formation model with the inhomogeneous ionizing UV background, and can therefore be used to study the dependence of morphology and statistics of reionization on feedback phenomena of the ionizing source galaxy population. Changes in galaxy formation physics modify the sizes of H II regions and the amplitude and shape of 21-cm power spectra. Of the galaxy physics investigated, we find that supernova feedback plays the most important role in reionization, with H II regions up to ≈20 per cent smaller and a fractional difference in the amplitude of power spectra of up to ≈17 per cent at fixed ionized fraction in the absence of this feedback. We compare our galaxy formation-based reionization models with past calculations that assume constant stellar-to-halo mass ratios and find that with the correct choice of minimum halo mass, such models can mimic the predicted reionization morphology. Reionization morphology at fixed neutral fraction is therefore not uniquely determined by the details of galaxy formation, but is sensitive to the mass of the haloes hosting the bulk of the ionizing sources. Simple EoR parametrizations are therefore accurate predictors of reionization statistics. However, a complete understanding of reionization using future 21-cm observations will require interpretation with realistic galaxy formation models, in combination with other observations.
Some Statistics for Assessing Person-Fit Based on Continuous-Response Models
ERIC Educational Resources Information Center
Ferrando, Pere Joan
2010-01-01
This article proposes several statistics for assessing individual fit based on two unidimensional models for continuous responses: linear factor analysis and Samejima's continuous response model. Both models are approached using a common framework based on underlying response variables and are formulated at the individual level as fixed regression…
Tests and consequences of disk plus halo models of gamma-ray burst sources
NASA Technical Reports Server (NTRS)
Smith, I. A.
1995-01-01
The gamma-ray burst observations made by the Burst and Transient Source Experiment (BATSE) and by previous experiments are still consistent with a combined Galactic disk (or Galactic spiral arm) plus extended Galactic halo model. Testable predictions and consequences of the disk plus halo model are discussed here; tests performed on the expanded BATSE database in the future will constrain the allowed model parameters and may eventually rule out the disk plus halo model. Using examples, it is shown that if the halo has an appropriate edge, BATSE will never detect an anisotropic signal from the halo of the Andromeda galaxy. A prediction of the disk plus halo model is that the fraction of the bursts observed to be in the 'disk' population rises as the detector sensitivity improves. A careful reexamination of the numbers of bursts in the two populations for the pre-BATSE databases could rule out this class of models. Similarly, it is predicted that different satellites will observe different relative numbers of bursts in the two classes for any model in which there are two different spatial distributions of the sources, or for models in which there is one spatial distribution of the sources that is sampled to different depths for the two classes. An important consequence of the disk plus halo model is that for the birthrate of the halo sources to be small compared to the birthrate of the disk sources, it is necessary for the halo sources to release many orders of magnitude more energy over their bursting lifetime than the disk sources. The halo bursts must also be much more luminous than the disk bursts; if this disk-halo model is correct, it is necessary to explain why the disk sources do not produce halo-type bursts.
NASA Astrophysics Data System (ADS)
Wellen, Christopher; Arhonditsis, George B.; Long, Tanya; Boyd, Duncan
2014-11-01
Spatially distributed nonpoint source watershed models are essential tools to estimate the magnitude and sources of diffuse pollution. However, little work has been undertaken to understand the sources and ramifications of the uncertainty involved in their use. In this study we conduct the first Bayesian uncertainty analysis of the water quality components of the SWAT model, one of the most commonly used distributed nonpoint source models. Working in Southern Ontario, we apply three Bayesian configurations for calibrating SWAT to Redhill Creek, an urban catchment, and Grindstone Creek, an agricultural one. We answer four interrelated questions: can SWAT determine suspended sediment sources with confidence when end-of-basin data are used for calibration? How does uncertainty propagate from the discharge submodel to the suspended sediment submodels? Do the estimated sediment sources vary when different calibration approaches are used? Can we combine the knowledge gained from different calibration approaches? We show that: (i) despite reasonable fit at the basin outlet, the simulated sediment sources are subject to uncertainty sufficient to undermine the typical approach of reliance on a single, best fit simulation; (ii) more than a third of the uncertainty of sediment load predictions may stem from the discharge submodel; (iii) estimated sediment sources do vary significantly across the three statistical configurations of model calibration despite end-of-basin predictions being virtually identical; and (iv) Bayesian model averaging is an approach that can synthesize predictions when a number of adequate distributed models make divergent source apportionments. We conclude with recommendations for future research to reduce the uncertainty encountered when using distributed nonpoint source models for source apportionment.
Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong
2013-01-01
For genome-wide association data analysis, two genes in any pathway, or two SNPs in two linked gene regions or in two linked exons within one gene, are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to effects arising not only from the traditional interaction under nearly independent conditions but also from the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than the single SNP-based logistic model, the PCA-based logistic model, and other gene-based methods. PMID:23620809
Teacher's Corner: Structural Equation Modeling with the Sem Package in R
ERIC Educational Resources Information Center
Fox, John
2006-01-01
R is free, open-source, cooperatively developed software that implements the S statistical programming language and computing environment. The current capabilities of R are extensive, and it is in wide use, especially among statisticians. The sem package provides basic structural equation modeling facilities in R, including the ability to fit…
Shear band formation in plastic bonded explosive (PBX)
NASA Astrophysics Data System (ADS)
Dey, T. N.; Johnson, J. N.
1998-07-01
Adiabatic shear bands can be a source of ignition and lead to detonation. At low to moderate deformation rates, 10-1000 s⁻¹, two other mechanisms can also give rise to shear bands. These mechanisms are: 1) softening caused by micro-cracking and 2) a constitutive response with a non-associated flow rule, as is observed in granular material such as soil. Brittle behavior at small strains and the granular nature of HMX suggest that PBX-9501 constitutive behavior may be similar to sand. A constitutive model for the first of these mechanisms is studied in a series of calculations. This viscoelastic constitutive model for PBX-9501 softens via a statistical crack model. A sand model is used to provide a non-associated flow rule; detailed results will be reported elsewhere. Both models generate shear band formation at 1-2% strain at nominal strain rates at and below 1000 s⁻¹. Shear band formation is suppressed at higher strain rates. Both mechanisms may accelerate the formation of adiabatic shear bands.
NASA Astrophysics Data System (ADS)
Mabit, Lionel; Gibbs, Max; Meusburger, Katrin; Toloza, Arsenio; Resch, Christian; Klik, Andreas; Swales, Andrew; Alewell, Christine
2016-04-01
Several recently published studies have highlighted that compound-specific stable isotope (CSSI) signatures of fatty acids (FAs), based on the measurement of carbon-13 natural abundance, show great promise for identifying sediment origin. The authors have used this innovative isotopic approach to investigate the sources of sediment in a three-hectare Austrian sub-watershed (Mistelbach). In a previous study using the Cs-137 technique, Mabit et al. (Geoderma, 2009) reported a local maximum sedimentation rate reaching 20 to 50 t/ha/yr in the lowest part of this watershed; however, that study did not identify the sources. Subsequently, the sediment deposited at the outlet (i.e. the sediment mixture) and representative soil samples from the four main agricultural fields of the site, expected to be the source soils, were investigated. The bulk delta carbon-13 of the samples and two long-chain FAs (i.e. C22:0 and C24:0) allowed the best statistical discrimination. Using two different mixing models (IsoSource and CSSIAR v1.00) and the organic carbon content of the soil sources and sediment mixture, the contribution of each source was established. Results suggested that the grassed waterway contributed at least 50% of the sediment deposited at the watershed outlet. This study, which will require further validation, highlights that the CSSI and Cs-137 techniques are complementary fingerprints and tracers for establishing land sediment redistribution and could provide meaningful information for optimized decision-making by land managers.
BOOK REVIEW: Statistical Mechanics of Turbulent Flows
NASA Astrophysics Data System (ADS)
Cambon, C.
2004-10-01
This is a handbook for a computational approach to reacting flows, including background material on statistical mechanics. In this sense, the title is somewhat misleading with respect to other books dedicated to the statistical theory of turbulence (e.g. Monin and Yaglom). In the present book, emphasis is placed on modelling (engineering closures) for computational fluid dynamics. The probabilistic (pdf) approach is applied to the local scalar field, motivated first by the nonlinearity of chemical source terms which appear in the transport equations of reacting species. The probabilistic and stochastic approaches are also used for the velocity field and particle position; nevertheless they are essentially limited to Lagrangian models for a local vector, with only single-point statistics, as for the scalar. Accordingly, conventional techniques, such as single-point closures for RANS (Reynolds-averaged Navier-Stokes) and subgrid-scale models for LES (large-eddy simulations), are described and in some cases reformulated using underlying Langevin models and filtered pdfs. Even if the theoretical approach to turbulence is not discussed in general, the essentials of probabilistic and stochastic-processes methods are described, with a useful reminder concerning statistics at the molecular level. The book comprises 7 chapters. Chapter 1 briefly states the goals and contents, with a very clear synoptic scheme on page 2. Chapter 2 presents definitions and examples of pdfs and related statistical moments. Chapter 3 deals with stochastic processes, pdf transport equations, from Kramers-Moyal to Fokker-Planck (for Markov processes), and moment equations. Stochastic differential equations are introduced and their relationship to pdfs described. This chapter ends with a discussion of stochastic modelling. The equations of fluid mechanics and thermodynamics are addressed in chapter 4. Classical conservation equations (mass, velocity, internal energy) are derived from their counterparts at the molecular level. In addition, equations are given for multicomponent reacting systems. The chapter ends with miscellaneous topics, including DNS, (idea of) the energy cascade, and RANS. Chapter 5 is devoted to stochastic models for the large scales of turbulence. Langevin-type models for velocity (and particle position) are presented, and their various consequences for second-order single-point correlations (Reynolds stress components, Kolmogorov constant) are discussed. These models are then presented for the scalar. The chapter ends with compressible high-speed flows and various models, ranging from k-epsilon to hybrid RANS-pdf. Stochastic models for small-scale turbulence are addressed in chapter 6. These models are based on the concept of a filter density function (FDF) for the scalar, and a more conventional SGS (sub-grid-scale model) for the velocity in LES. The final chapter, chapter 7, is entitled `The unification of turbulence models' and aims at reconciling large-scale and small-scale modelling. This book offers a timely survey of techniques in modern computational fluid mechanics for turbulent flows with reacting scalars. It should be of interest to engineers, while the discussion of the underlying tools, namely pdfs and stochastic and statistical equations, should also be attractive to applied mathematicians and physicists.
The book's emphasis on local pdfs and stochastic Langevin models gives a consistent structure to the book and allows the author to cover almost the whole spectrum of practical modelling in turbulent CFD. On the other hand, one might regret that non-local issues are not mentioned explicitly, or even briefly. These problems range from the presence of pressure-strain correlations in the Reynolds stress transport equations to the presence of two-point pdfs in the single-point pdf equation derived from the Navier-Stokes equations. (One may recall that, even without scalar transport, a general closure problem for turbulence statistics results from both non-linearity and non-locality of Navier-Stokes equations, the latter coming from, e.g., the nonlocal relationship of velocity and pressure in the quasi-incompressible case. These two aspects are often intricately linked. It is well known that non-linearity alone is not responsible for the `problem', as evidenced by 1D turbulence without pressure (`Burgulence' from the Burgers equation) and probably 3D (cosmological gas). A local description in terms of pdf for the velocity can resolve the `non-linear' problem, which instead yields an infinite hierarchy of equations in terms of moments. On the other hand, non-locality yields a hierarchy of unclosed equations, with the single-point pdf equation for velocity derived from NS incompressible equations involving a two-point pdf, and so on. The general relationship was given by Lundgren (1967, Phys. Fluids 10 (5), 969-975), with the equation for pdf at n points involving the pdf at n+1 points. The nonlocal problem appears in various statistical models which are not discussed in the book. The simplest example is full RST or ASM models, in which the closure of pressure-strain correlations is pivotal (their counterpart ought to be identified and discussed in equations (5-21) and the following ones). The book does not address more sophisticated non-local approaches, such as two-point (or spectral) non-linear closure theories and models, `rapid distortion theory' for linear regimes, not to mention scaling and intermittency based on two-point structure functions, etc. The book sometimes mixes theoretical modelling and pure empirical relationships, the empirical character coming from the lack of a nonlocal (two-point) approach.) In short, the book is orientated more towards applications than towards turbulence theory; it is written clearly and concisely and should be useful to a large community, interested either in the underlying stochastic formalism or in CFD applications.
Estimating Animal Abundance in Ground Beef Batches Assayed with Molecular Markers
Hu, Xin-Sheng; Simila, Janika; Platz, Sindey Schueler; Moore, Stephen S.; Plastow, Graham; Meghen, Ciaran N.
2012-01-01
Estimating animal abundance in industrial-scale batches of ground meat is important for mapping meat products through the manufacturing process and for effectively tracing the finished product during a food safety recall. The processing of ground beef involves a potentially large number of animals from diverse sources in a single product batch, which produces a high heterogeneity in capture probability. In order to estimate animal abundance through DNA profiling of ground beef constituents, two parameter-based statistical models were developed for incidence data. Simulations were applied to evaluate the maximum likelihood estimate (MLE) of a joint likelihood function from multiple surveys, showing superiority in the presence of high capture heterogeneity with small sample sizes, or comparable estimation in the presence of low capture heterogeneity with a large sample size, when compared to other existing models. Our model employs the full information on the pattern of the capture-recapture frequencies from multiple samples. We applied the proposed models to estimate animal abundance in six manufacturing beef batches, genotyped using 30 single nucleotide polymorphism (SNP) markers, from a large-scale beef grinding facility. Results show that between 411 and 1367 animals were present in the six manufacturing beef batches. These estimates are informative as a reference for improving recall processes and tracing finished meat products back to source. PMID:22479559
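A minimal sketch of the capture-recapture logic behind such abundance estimates, using the classical two-sample Chapman (bias-corrected Lincoln-Petersen) estimator on hypothetical genotype "captures"; the paper's likelihood models are more general (multiple surveys, capture heterogeneity), but the idea is the same.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical batch containing N animals; two independent subsamples of the
# batch are genotyped, and each detected SNP profile is a "capture".
N_true = 800
sample1 = set(rng.choice(N_true, size=120, replace=False))
sample2 = set(rng.choice(N_true, size=150, replace=False))

n1, n2 = len(sample1), len(sample2)
m2 = len(sample1 & sample2)                       # animals recaptured in both samples

# Chapman's bias-corrected Lincoln-Petersen estimator and its variance
N_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
var_hat = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)) / ((m2 + 1) ** 2 * (m2 + 2))
print(f"recaptures: {m2}, estimated abundance: {N_hat:.0f} "
      f"(SE ~ {np.sqrt(var_hat):.0f}; true value {N_true})")
```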
Goodisman, M. A. D.; Asmussen, M. A.
1997-01-01
We develop models that describe the cytonuclear structure for either a cytoplasmic and nuclear marker in a haplodiploid species or a cytoplasmic and X-linked marker in a diploid species. Sex-specific disequilibrium statistics that summarize nonrandom cytonuclear associations in such systems are defined, and their basic Hardy-Weinberg dynamics and admixture formulae are delimited. We focus on the context of hybrid zones and develop continent-island models whereby individuals from two genetically differentiated source populations migrate into and mate within a single zone of admixture. We examine the effects of differential migration of the sexes, assortative mating by pure type females, and census time (relative to mating and migration), as well as special cases of random mating and migration subsumed under the general models. We show that pure type individuals and nonzero cytonuclear disequilibria can be maintained within a hybrid zone if there is continued migration from both source populations, and that females generally have a greater influence over these cytonuclear variables than males. The resulting theoretical framework can be used to estimate the rates of assortative mating and sex-specific gene flow in hybrid zones and other zones of admixture involving haplodiploid or sex-linked cytonuclear data. PMID:9286692
The impact on midlevel vision of statistically optimal divisive normalization in V1.
Coen-Cagli, Ruben; Schwartz, Odelia
2013-07-15
The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality.
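A minimal sketch of canonical divisive normalization applied to a population of model V1 units; the exponent, semi-saturation constant, and surround weights are hypothetical, and the statistically optimal variant discussed above additionally gates the surround pool based on image homogeneity.

```python
import numpy as np

def divisive_normalization(drive, surround_weights, sigma=0.1, n=2.0, gain=1.0):
    """Canonical normalization: r_i = gain * d_i^n / (sigma^n + sum_j w_ij * d_j^n)."""
    drive_n = drive ** n
    pool = surround_weights @ drive_n                 # each unit's normalization pool
    return gain * drive_n / (sigma ** n + pool)

rng = np.random.default_rng(5)
n_units = 12
drive = rng.uniform(0.0, 1.0, n_units)               # linear filter outputs (hypothetical)

# Hypothetical surround pool: each unit is normalized by all units, itself included
weights = np.full((n_units, n_units), 1.0 / n_units)

responses = divisive_normalization(drive, weights)
print("linear drive: ", np.round(drive, 2))
print("normalized r: ", np.round(responses, 2))
```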
Moss, Travis J; Clark, Matthew T; Calland, James Forrest; Enfield, Kyle B; Voss, John D; Lake, Douglas E; Moorman, J Randall
2017-01-01
Charted vital signs and laboratory results represent intermittent samples of a patient's dynamic physiologic state and have been used to calculate early warning scores to identify patients at risk of clinical deterioration. We hypothesized that the addition of cardiorespiratory dynamics measured from continuous electrocardiography (ECG) monitoring to intermittently sampled data improves the predictive validity of models trained to detect clinical deterioration prior to intensive care unit (ICU) transfer or unanticipated death. We analyzed 63 patient-years of ECG data from 8,105 acute care patient admissions at a tertiary care academic medical center. We developed models to predict deterioration resulting in ICU transfer or unanticipated death within the next 24 hours using either vital signs, laboratory results, or cardiorespiratory dynamics from continuous ECG monitoring and also evaluated models using all available data sources. We calculated the predictive validity (C-statistic), the net reclassification improvement, and the probability of achieving the difference in likelihood ratio χ2 for the additional degrees of freedom. The primary outcome occurred 755 times in 586 admissions (7%). We analyzed 395 clinical deteriorations with continuous ECG data in the 24 hours prior to an event. Using only continuous ECG measures resulted in a C-statistic of 0.65, similar to models using only laboratory results and vital signs (0.63 and 0.69 respectively). Addition of continuous ECG measures to models using conventional measurements improved the C-statistic by 0.01 and 0.07; a model integrating all data sources had a C-statistic of 0.73 with categorical net reclassification improvement of 0.09 for a change of 1 decile in risk. The difference in likelihood ratio χ2 between integrated models with and without cardiorespiratory dynamics was 2158 (p value: <0.001). Cardiorespiratory dynamics from continuous ECG monitoring detect clinical deterioration in acute care patients and improve performance of conventional models that use only laboratory results and vital signs.
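A minimal sketch of the C-statistic used to compare these models: it equals the area under the ROC curve, i.e. the probability that a randomly chosen event receives a higher risk score than a randomly chosen non-event. The risk scores and outcome rates below are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(6)

# Hypothetical 24-hour risk scores from two models for 1000 patient-days,
# roughly 7% of which end in ICU transfer or unanticipated death.
outcome = rng.random(1000) < 0.07
score_vitals_labs = outcome * 0.8 + rng.normal(0, 1, 1000)       # conventional model
score_integrated = outcome * 1.1 + rng.normal(0, 1, 1000)        # plus ECG dynamics

for name, score in [("vitals + labs", score_vitals_labs),
                    ("integrated", score_integrated)]:
    print(f"{name:14s} C-statistic = {roc_auc_score(outcome, score):.2f}")
```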
NASA Astrophysics Data System (ADS)
Lilly, P.; Yanai, R. D.; Buckley, H. L.; Case, B. S.; Woollons, R. C.; Holdaway, R. J.; Johnson, J.
2016-12-01
Calculations of forest biomass and elemental content require many measurements and models, each contributing uncertainty to the final estimates. While sampling error is commonly reported, based on replicate plots, error due to uncertainty in the regression used to estimate biomass from tree diameter is usually not quantified. Some published estimates of uncertainty due to the regression models have used the uncertainty in the prediction of individuals, ignoring uncertainty in the mean, while others have propagated uncertainty in the mean while ignoring individual variation. Using the simple case of the calcium concentration of sugar maple leaves, we compare the variation among individuals (the standard deviation) to the uncertainty in the mean (the standard error) and illustrate the declining importance of individual variation in the prediction as the number of individuals increases. For allometric models, the analogous statistics are the prediction interval (or the residual variation in the model fit) and the confidence interval (describing the uncertainty in the best fit model). The effect of propagating these two sources of error is illustrated using the mass of sugar maple foliage. The uncertainty in individual tree predictions was large for plots with few trees; for plots with 30 trees or more, the uncertainty in individuals was less important than the uncertainty in the mean. Authors of previously published analyses have reanalyzed their data to show the magnitude of these two sources of uncertainty at scales ranging from experimental plots to entire countries. The most correct analysis will take both sources of uncertainty into account, but for practical purposes, country-level reports of uncertainty in carbon stocks, as required by the IPCC, can ignore the uncertainty in individuals. Ignoring the uncertainty in the mean will lead to exaggerated estimates of confidence in estimates of forest biomass and carbon and nutrient contents.
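A minimal Monte Carlo sketch of the two error sources contrasted above: uncertainty in the fitted allometric mean (fully correlated across trees) versus residual variation among individual trees (independent across trees), propagated to a plot total; all coefficients and error terms are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical allometry on the log scale: ln(mass) = b0 + b1*ln(diameter) + eps
b0_hat, b1_hat = -2.0, 2.4          # fitted coefficients
se_b0, se_b1 = 0.10, 0.05           # uncertainty in the mean (standard errors)
sigma_resid = 0.30                  # residual SD among individual trees

def plot_total(diameters, n_draws=5000, include_individual=True):
    """Monte Carlo plot-level foliage mass, propagating one or both error sources."""
    totals = np.empty(n_draws)
    for k in range(n_draws):
        b0 = rng.normal(b0_hat, se_b0)           # draw uncertainty in the mean
        b1 = rng.normal(b1_hat, se_b1)
        eps = rng.normal(0.0, sigma_resid, diameters.size) if include_individual else 0.0
        totals[k] = np.sum(np.exp(b0 + b1 * np.log(diameters) + eps))
    return totals

for n_trees in (5, 30, 200):
    d = rng.uniform(10, 50, n_trees)             # tree diameters (cm), hypothetical
    both = plot_total(d, include_individual=True)
    mean_only = plot_total(d, include_individual=False)
    print(f"{n_trees:4d} trees: CV with both sources {both.std()/both.mean():.2%}, "
          f"mean-only {mean_only.std()/mean_only.mean():.2%}")
```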
Imfit: A Fast, Flexible Program for Astronomical Image Fitting
NASA Astrophysics Data System (ADS)
Erwin, Peter
2014-08-01
Imfit is an open-source astronomical image-fitting program specialized for galaxies but potentially useful for other sources, which is fast, flexible, and highly extensible. Its object-oriented design allows new types of image components (2D surface-brightness functions) to be easily written and added to the program. Image functions provided with Imfit include Sersic, exponential, and Gaussian galaxy decompositions along with Core-Sersic and broken-exponential profiles, elliptical rings, and three components that perform line-of-sight integration through 3D luminosity-density models of disks and rings seen at arbitrary inclinations. Available minimization algorithms include Levenberg-Marquardt, Nelder-Mead simplex, and Differential Evolution, allowing trade-offs between speed and decreased sensitivity to local minima in the fit landscape. Minimization can be done using the standard chi^2 statistic (using either data or model values to estimate per-pixel Gaussian errors, or else user-supplied error images) or the Cash statistic; the latter is particularly appropriate for cases of Poisson data in the low-count regime. The C++ source code for Imfit is available under the GNU Public License.
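A minimal sketch contrasting the chi^2 and Cash statistics on hypothetical low-count Poisson data (not Imfit's implementation); the Cash statistic uses the Poisson log-likelihood directly instead of assuming Gaussian per-pixel errors, which is why it is preferred in the low-count regime.

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical low-count image pixels and two candidate model predictions
model_good = rng.uniform(0.5, 3.0, 400)          # expected counts per pixel
data = rng.poisson(model_good)                    # observed Poisson counts
model_bad = model_good * 1.4                      # an over-bright model

def chi2_stat(data, model):
    """Standard chi^2 with model-based Gaussian error estimate (sigma^2 = model)."""
    return np.sum((data - model) ** 2 / model)

def cash_stat(data, model):
    """Cash (1979) statistic: C = 2 * sum(model - data * ln(model)),
    i.e. -2 ln L for Poisson data up to a data-only constant."""
    return 2.0 * np.sum(model - data * np.log(model))

for name, m in [("good model", model_good), ("bad model", model_bad)]:
    print(f"{name}: chi^2 = {chi2_stat(data, m):8.1f}, Cash = {cash_stat(data, m):8.1f}")
```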
NASA Astrophysics Data System (ADS)
Määttä, A.; Laine, M.; Tamminen, J.; Veefkind, J. P.
2014-05-01
Satellite instruments are nowadays successfully utilised for measuring atmospheric aerosol in many applications as well as in research. Therefore, there is a growing need for rigorous error characterisation of the measurements. Here, we introduce a methodology for quantifying the uncertainty in the retrieval of aerosol optical thickness (AOT). In particular, we concentrate on two aspects: uncertainty due to aerosol microphysical model selection and uncertainty due to imperfect forward modelling. We apply the introduced methodology to aerosol optical thickness retrieval of the Ozone Monitoring Instrument (OMI) on board NASA's Earth Observing System (EOS) Aura satellite, launched in 2004. We apply statistical methodologies that improve the uncertainty estimates of the aerosol optical thickness retrieval by propagating aerosol microphysical model selection uncertainty and forward model error more realistically. For the microphysical model selection problem, we utilise Bayesian model selection and model averaging methods. Gaussian processes are utilised to characterise the smooth systematic discrepancies between the measured and modelled reflectances (i.e. residuals). The spectral correlation structure is constructed empirically by exploring a set of residuals. The operational OMI multi-wavelength aerosol retrieval algorithm OMAERO is used for cloud-free, over-land pixels of the OMI instrument with the additional Bayesian model selection and model discrepancy techniques introduced here. The method and improved uncertainty characterisation are demonstrated by several examples with different aerosol properties: weakly absorbing aerosols, forest fires over Greece and Russia, and Sahara desert dust. The statistical methodology presented is general; it is not restricted to this particular satellite retrieval application.
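A minimal sketch of modelling a smooth spectral discrepancy with a Gaussian process, in the spirit of the residual treatment described above; the wavelengths, residuals, and kernel settings are hypothetical, and scikit-learn is used purely for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(9)

# Hypothetical reflectance residuals (measured - modelled) at OMI-like wavelengths
wavelengths = np.linspace(340.0, 500.0, 25)[:, None]          # nm
smooth_discrepancy = 0.004 * np.sin((wavelengths[:, 0] - 340.0) / 40.0)
residuals = smooth_discrepancy + rng.normal(0.0, 0.001, wavelengths.shape[0])

# Smooth systematic part modelled with an RBF kernel, noise with a white kernel
kernel = 1.0 * RBF(length_scale=50.0) + WhiteKernel(noise_level=1e-6)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(wavelengths, residuals)

mean, std = gp.predict(wavelengths, return_std=True)
print("fitted kernel:", gp.kernel_)
print("max systematic discrepancy:", float(np.abs(mean).max()))
```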
Evaluating Item Fit for Multidimensional Item Response Models
ERIC Educational Resources Information Center
Zhang, Bo; Stone, Clement A.
2008-01-01
This research examines the utility of the S-X² statistic proposed by Orlando and Thissen (2000) in evaluating item fit for multidimensional item response models. Monte Carlo simulation was conducted to investigate both the Type I error and statistical power of this fit statistic in analyzing two kinds of multidimensional test…
NASA Astrophysics Data System (ADS)
Mukhopadhyay, S.; Tsang, Y. W.
2001-12-01
Heating unsaturated fractured tuff sets off a series of complicated thermal-hydrological (TH) processes, which result in large-scale redistribution of moisture in the host rock. Moisture redistribution arises from boiling of water near heat sources, transport of vapor away from those heat sources, condensation of that vapor in cooler rock, and subsequent gravity drainage of condensate through fractures. Vapor transport through high-permeability paths, which include both the fractures in the rock and other conduits, contributes to the evolution of these TH processes in two ways. First, the highly permeable natural fractures provide easy passage for vapor away from the heat sources. Second, these fractures and other highly permeable conduits allow vapor (and the associated energy) to escape the rock through open boundaries of the test domain. The overall impact of vapor transport on the evolution of the TH processes can be more easily understood in the context of the Drift Scale Test (DST), the largest ever in situ heater test in unsaturated fractured tuff. The DST, in which a large volume of rock has been heated for four years now, is located in the middle nonlithophysal (Tptpmn) stratigraphic unit of Yucca Mountain, Nevada. The fractured tuff in Tptpmn contains many well-connected fractures. In the DST, heating is provided by nine canister heaters placed in a five-meter-diameter Heated Drift (HD) and fifty wing heaters installed orthogonal to the axis of the HD. The test has many instrumentation boreholes, some of which are not sealed by packers or grout and may provide passage for vapor and energy. Of these conduits, the boreholes housing the wing heaters are most important for vapor transport because of their proximity to heat sources. While part of the vapor generated by heating moves away from the heat sources through the fractures and condenses elsewhere in the rock, the rest of the vapor, under gas-pressure difference, enters the HD by way of the high-permeability wing heater boreholes and escapes the test block through an open bulkhead that connects the HD to the outside world. We show that this vapor transport makes a significant difference in the validation of numerical models against TH processes in the DST. A huge volume of data, including changes in temperature and saturation of the rock, has been collected from the DST. Sophisticated conceptual and numerical models, based on the TOUGH2 simulator, have been developed to analyze these data and to help develop a better understanding of various aspects of coupled TH processes in unsaturated fractured tuff. In general, these models have predicted a close match between measured and simulated results, indicating a good representation of the underlying physical processes. However, there are subtle differences in the predictions from these models. Of particular interest here are two models: one in which vapor transport was considered through the natural fractures only, and the other in which vapor transport through the boreholes housing the wing heaters was included in addition to that through natural fractures. Direct statistical comparison of simulated and measured temperatures from more than 1,700 sensors yielded a mean error of 3-4 °C for the first model, indicating that less heat was retained in the test block than that predicted by the model. On the other hand, a similar statistical comparison yielded a mean error of 1-2 °C for the second model, suggesting that inclusion of vapor loss through the boreholes produces results closer to the measured data.
A comparison of data-driven groundwater vulnerability assessment methods
Sorichetta, Alessandro; Ballabio, Cristiano; Masetti, Marco; Robinson, Gilpin R.; Sterlacchini, Simone
2013-01-01
Increasing availability of geo-environmental data has promoted the use of statistical methods to assess groundwater vulnerability. Nitrate is a widespread anthropogenic contaminant in groundwater and its occurrence can be used to identify aquifer settings vulnerable to contamination. In this study, multivariate Weights of Evidence (WofE) and Logistic Regression (LR) methods, where the response variable is binary, were used to evaluate the role and importance of a number of explanatory variables associated with nitrate sources and occurrence in groundwater in the Milan District (central part of the Po Plain, Italy). The results of these models have been used to map the spatial variation of groundwater vulnerability to nitrate in the region, and we compare the similarities and differences of their spatial patterns and associated explanatory variables. We modify the standard WofE method used in previous groundwater vulnerability studies to a form analogous to that used in LR; this provides a framework to compare the results of both models and reduces the effect of sampling bias on the results of the standard WofE model. In addition, a nonlinear Generalized Additive Model has been used to extend the LR analysis. Both approaches improved discrimination of the standard WofE and LR models, as measured by the c-statistic. Groundwater vulnerability probability outputs, based on rank-order classification of the respective model results, were similar in spatial patterns and identified similar strong explanatory variables associated with nitrate source (population density as a proxy for sewage systems and septic sources) and nitrate occurrence (groundwater depth).
Identifying the Source of Misfit in Item Response Theory Models.
Liu, Yang; Maydeu-Olivares, Alberto
2014-01-01
When an item response theory model fails to fit adequately, the items for which the model provides a good fit and those for which it does not must be determined. To this end, we compare the performance of several fit statistics for item pairs with known asymptotic distributions under maximum likelihood estimation of the item parameters: (a) a mean and variance adjustment to bivariate Pearson's X², (b) a bivariate subtable analog to Reiser's (1996) overall goodness-of-fit test, (c) a z statistic for the bivariate residual cross product, and (d) Maydeu-Olivares and Joe's (2006) M2 statistic applied to bivariate subtables. The unadjusted Pearson's X² with heuristically determined degrees of freedom is also included in the comparison. For binary and ordinal data, our simulation results suggest that the z statistic has the best Type I error and power behavior among all the statistics under investigation when the observed information matrix is used in its computation. However, if one has to use the cross-product information, the mean and variance adjusted X² is recommended. We illustrate the use of pairwise fit statistics in 2 real-data examples and discuss possible extensions of the current research in various directions.
Impact of numerical choices on water conservation in the E3SM Atmosphere Model version 1 (EAMv1)
NASA Astrophysics Data System (ADS)
Zhang, Kai; Rasch, Philip J.; Taylor, Mark A.; Wan, Hui; Leung, Ruby; Ma, Po-Lun; Golaz, Jean-Christophe; Wolfe, Jon; Lin, Wuyin; Singh, Balwinder; Burrows, Susannah; Yoon, Jin-Ho; Wang, Hailong; Qian, Yun; Tang, Qi; Caldwell, Peter; Xie, Shaocheng
2018-06-01
The conservation of total water is an important numerical feature for global Earth system models. Even small conservation problems in the water budget can lead to systematic errors in century-long simulations. This study quantifies and reduces various sources of water conservation error in the atmosphere component of the Energy Exascale Earth System Model. Several sources of water conservation error have been identified during the development of the version 1 (V1) model. The largest errors result from the numerical coupling between the resolved dynamics and the parameterized sub-grid physics. A hybrid coupling using different methods for fluid dynamics and tracer transport provides a reduction of water conservation error by a factor of 50 at 1° horizontal resolution as well as consistent improvements at other resolutions. The second largest error source is the use of an overly simplified relationship between the surface moisture flux and latent heat flux at the interface between the host model and the turbulence parameterization. This error can be prevented by applying the same (correct) relationship throughout the entire model. Two additional types of conservation error that result from correcting the surface moisture flux and clipping negative water concentrations can be avoided by using mass-conserving fixers. With all four error sources addressed, the water conservation error in the V1 model becomes negligible and insensitive to the horizontal resolution. The associated changes in the long-term statistics of the main atmospheric features are small. A sensitivity analysis is carried out to show that the magnitudes of the conservation errors in early V1 versions decrease strongly with temporal resolution but increase with horizontal resolution. The increased vertical resolution in V1 results in a very thin model layer at the Earth's surface, which amplifies the conservation error associated with the surface moisture flux correction. We note that for some of the identified error sources, the proposed fixers are remedies rather than solutions to the problems at their roots. Future improvements in time integration would be beneficial for V1.
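A minimal sketch of the "clip and rescale" idea behind a mass-conserving fixer for negative water concentrations: negative values in a column are set to zero and the remaining values are rescaled so the mass-weighted column integral is unchanged. The column values are hypothetical, and the actual EAMv1 fixers are more involved.

```python
import numpy as np

def conserving_clip(q, dp):
    """Clip negative tracer mixing ratios in a column and rescale the positive
    values so that the mass-weighted column integral (sum of q*dp) is preserved."""
    column_mass = np.sum(q * dp)              # original column-integrated tracer mass
    q_clipped = np.maximum(q, 0.0)
    positive_mass = np.sum(q_clipped * dp)
    if positive_mass <= 0.0 or column_mass <= 0.0:
        return np.zeros_like(q)               # nothing meaningful to conserve
    return q_clipped * (column_mass / positive_mass)

# Hypothetical column: specific humidity (kg/kg) and layer pressure thickness (Pa)
q = np.array([3.0e-3, 1.5e-3, -2.0e-5, 4.0e-4, -1.0e-5])
dp = np.array([2000.0, 3000.0, 4000.0, 5000.0, 6000.0])

q_fixed = conserving_clip(q, dp)
print("column mass before:", np.sum(q * dp), " after:", np.sum(q_fixed * dp))
print("fixed column:", q_fixed)
```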
NASA Astrophysics Data System (ADS)
Díaz-Mojica, J. J.; Cruz-Atienza, V. M.; Madariaga, R.; Singh, S. K.; Iglesias, A.
2013-05-01
We introduce a novel approach for imaging earthquake dynamics from ground motion records based on a parallel genetic algorithm (GA). The method follows the elliptical dynamic-rupture-patch approach introduced by Di Carli et al. (2010) and has been carefully verified through different numerical tests (Díaz-Mojica et al., 2012). Apart from the five model parameters defining the patch geometry, our dynamic source description has four more parameters: the stress drop inside the nucleation and elliptical patches, and two friction parameters, the slip-weakening distance and the change of the friction coefficient. These parameters are constant within the rupture surface. The forward dynamic source problem, embedded in the GA inverse method, uses a highly accurate computational solver, namely the staggered-grid split-node method. The synthetic inversion presented here shows that the source model parameterization is suitable for the GA, and that short-scale source dynamic features are well resolved in spite of low-pass filtering of the data for periods comparable to the source duration. Since there is always uncertainty in the propagation medium as well as in the source location and focal mechanism, we have introduced a statistical approach to generate a set of solution models so that the envelope of the corresponding synthetic waveforms explains the observed data as much as possible. We applied the method to the 2012 Mw 6.5 intraslab Zumpango, Mexico earthquake and determined several fundamental source parameters that are in accordance with different and completely independent estimates for Mexican and worldwide earthquakes. Our weighted-average final model satisfactorily explains the eastward rupture directivity observed in the recorded data. Some parameters found for the Zumpango earthquake are Δτ = 30.2 ± 6.2 MPa, Er = 0.68 ± 0.36 × 10^15 J, G = 1.74 ± 0.44 × 10^15 J, η = 0.27 ± 0.11, Vr/Vs = 0.52 ± 0.09 and Mw = 6.64 ± 0.07, for the stress drop, radiated energy, fracture energy, radiation efficiency, rupture velocity and moment magnitude, respectively. (Figure: location of the Mw 6.5 intraslab Zumpango earthquake, recording stations and tectonic setting in central Mexico.)
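A minimal sketch of the kind of genetic-algorithm loop used to search such a low-dimensional source parameter space is given below. The misfit function is a placeholder (the real inversion evaluates each candidate with the staggered-grid split-node dynamic rupture solver and compares synthetic and observed waveforms), the loop is serial rather than parallel, and the population size, mutation scale, and bounds are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def misfit(params):
    # Placeholder: in the real inversion this would run a dynamic rupture
    # simulation and measure the synthetic-versus-observed waveform misfit.
    target = np.array([0.3, 0.7, 0.5, 0.2])
    return np.sum((params - target) ** 2)

def genetic_search(n_params=4, pop_size=40, n_gen=100, mut_scale=0.05):
    pop = rng.random((pop_size, n_params))            # initial population in [0, 1]
    for _ in range(n_gen):
        fitness = np.array([misfit(p) for p in pop])
        order = np.argsort(fitness)
        parents = pop[order[: pop_size // 2]]                                       # selection
        idx = rng.integers(0, len(parents), (pop_size, 2))
        alpha = rng.random((pop_size, n_params))
        children = alpha * parents[idx[:, 0]] + (1 - alpha) * parents[idx[:, 1]]    # crossover
        children += rng.normal(0.0, mut_scale, children.shape)                      # mutation
        pop = np.clip(children, 0.0, 1.0)
    return min(pop, key=misfit)

print(genetic_search())
```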
Probability theory for 3-layer remote sensing radiative transfer model: univariate case.
Ben-David, Avishai; Davidson, Charles E
2012-04-23
A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of the line of sight) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive infrared remote sensing. The probability model is described by the Johnson family of distributions, which are used as a fit for theoretically computed moments of the radiative transfer model. From the Johnson family we use the SU distribution, which can address a wide range of skewness and kurtosis values (in addition to addressing the first two moments, mean and variance). In the limit, SU can also describe lognormal and normal distributions. With the probability model one can evaluate the potential for detecting a target (vapor cloud layer), the probability of observing thermal contrast, and the performance (receiver operating characteristic curves) in clutter-noise-limited scenarios. This is (to our knowledge) the first probability model for the 3-layer remote sensing geometry that treats all parameters as random variables and includes higher-order statistics. © 2012 Optical Society of America
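SciPy happens to expose the Johnson S_U family, so a rough sketch of working with it looks as follows. The sample, detection threshold, and maximum-likelihood fit are illustrative stand-ins; the paper instead fits S_U parameters by matching the first four theoretically computed moments of the 3-layer radiative transfer model.

```python
import numpy as np
from scipy import stats

# Skewed, heavy-tailed sample standing in for computed radiances
rng = np.random.default_rng(2)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=5000)

# Fit Johnson S_U parameters (shape a, shape b, loc, scale) by maximum likelihood;
# the paper matches theoretical moments rather than fitting a sample.
a, b, loc, scale = stats.johnsonsu.fit(sample)
print(a, b, loc, scale)

# Probability that the modeled radiance exceeds a hypothetical detection threshold
threshold = 2.0
print(stats.johnsonsu.sf(threshold, a, b, loc=loc, scale=scale))
```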
NASA Astrophysics Data System (ADS)
Tenkès, Lucille-Marie; Hollerbach, Rainer; Kim, Eun-jin
2017-12-01
A probabilistic description is essential for understanding growth processes in non-stationary states. In this paper, we compute time-dependent probability density functions (PDFs) in order to investigate stochastic logistic and Gompertz models, which are two of the most popular growth models. We consider different types of short-correlated multiplicative and additive noise sources and compare the time-dependent PDFs in the two models, elucidating the effects of the additive and multiplicative noises on the form of PDFs. We demonstrate an interesting transition from a unimodal to a bimodal PDF as the multiplicative noise increases for a fixed value of the additive noise. A much weaker (leaky) attractor in the Gompertz model leads to a significant (singular) growth of the population of a very small size. We point out the limitation of using stationary PDFs, mean value and variance in understanding statistical properties of the growth in non-stationary states, highlighting the importance of time-dependent PDFs. We further compare these two models from the perspective of information change that occurs during the growth process. Specifically, we define an infinitesimal distance at any time by comparing two PDFs at times infinitesimally apart and sum these distances in time. The total distance along the trajectory quantifies the total number of different states that the system undergoes in time, and is called the information length. We show that the time-evolution of the two models becomes more similar when measured in units of the information length and point out the merit of using the information length in unifying and understanding the dynamic evolution of different growth processes.
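A hedged sketch of how such time-dependent PDFs can be built numerically: an Euler-Maruyama ensemble of the stochastic logistic model with white multiplicative and additive noise, histogrammed at a chosen time. The drift and noise amplitudes are illustrative, and the paper's own PDF computation may differ in method and noise model.

```python
import numpy as np

rng = np.random.default_rng(3)

def logistic_ensemble(n_paths=5000, n_steps=2000, dt=0.01,
                      gamma=1.0, eps=0.5, d_mult=0.1, d_add=0.01, x0=0.1):
    """Euler-Maruyama for dx = (gamma*x - eps*x**2) dt plus multiplicative
    and additive white-noise terms (illustrative amplitudes)."""
    x = np.full(n_paths, x0)
    for _ in range(n_steps):
        dw1 = rng.normal(0.0, np.sqrt(dt), n_paths)
        dw2 = rng.normal(0.0, np.sqrt(dt), n_paths)
        x = x + (gamma * x - eps * x**2) * dt + d_mult * x * dw1 + d_add * dw2
        x = np.clip(x, 0.0, None)        # keep the population non-negative
    return x

# Histogram of the ensemble approximates the time-dependent PDF at t = n_steps * dt
final = logistic_ensemble()
pdf, edges = np.histogram(final, bins=60, density=True)
print(edges[np.argmax(pdf)])             # location of the PDF peak
```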
Open-source Software for Exoplanet Atmospheric Modeling
NASA Astrophysics Data System (ADS)
Cubillos, Patricio; Blecic, Jasmina; Harrington, Joseph
2018-01-01
I will present a suite of self-standing open-source tools to model and retrieve exoplanet spectra implemented for Python. These include: (1) a Bayesian-statistical package to run Levenberg-Marquardt optimization and Markov-chain Monte Carlo posterior sampling, (2) a package to compress line-transition data from HITRAN or Exomol without loss of information, (3) a package to compute partition functions for HITRAN molecules, (4) a package to compute collision-induced absorption, and (5) a package to produce radiative-transfer spectra of transit and eclipse exoplanet observations and atmospheric retrievals.
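As one concrete, hypothetical example of the optimization step such a toolkit performs, the sketch below runs a Levenberg-Marquardt fit of a toy spectral feature with SciPy. The model, data, and parameter names are invented for illustration and are not the package's actual API.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical transit-depth model: a Gaussian absorption feature on a flat baseline
def model(params, wavelength):
    depth, center, width, baseline = params
    return baseline + depth * np.exp(-0.5 * ((wavelength - center) / width) ** 2)

rng = np.random.default_rng(4)
wl = np.linspace(1.0, 2.0, 200)                       # wavelength grid (microns)
truth = np.array([0.002, 1.4, 0.05, 0.01])
data = model(truth, wl) + rng.normal(0.0, 1e-4, wl.size)

# Levenberg-Marquardt optimization of the residuals (method="lm" requires an
# unconstrained problem, as here)
fit = least_squares(lambda p: model(p, wl) - data, x0=[0.001, 1.5, 0.1, 0.0], method="lm")
print(fit.x)
```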
Monroe, Scott; Cai, Li
2015-01-01
This research is concerned with two topics in assessing model fit for categorical data analysis. The first topic involves the application of a limited-information overall test, introduced in the item response theory literature, to structural equation modeling (SEM) of categorical outcome variables. Most popular SEM test statistics assess how well the model reproduces estimated polychoric correlations. In contrast, limited-information test statistics assess how well the underlying categorical data are reproduced. Here, the recently introduced C2 statistic of Cai and Monroe (2014) is applied. The second topic concerns how the root mean square error of approximation (RMSEA) fit index can be affected by the number of categories in the outcome variable. This relationship creates challenges for interpreting RMSEA. While the two topics initially appear unrelated, they may conveniently be studied in tandem since RMSEA is based on an overall test statistic, such as C2. The results are illustrated with an empirical application to data from a large-scale educational survey.
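The dependence of RMSEA on sample size, and on the overall statistic feeding it, can be seen from the usual sample formula, sketched below. The numbers are hypothetical, and the statistic could be C2 or any other overall test with a chi-square reference distribution.

```python
import numpy as np

def rmsea(stat, df, n):
    """Sample RMSEA from an overall fit statistic (e.g., C2) with df degrees
    of freedom and sample size n."""
    return np.sqrt(max((stat - df) / (df * (n - 1)), 0.0))

# Same statistic and degrees of freedom, two sample sizes: RMSEA shrinks as n grows
print(rmsea(stat=85.0, df=40, n=500))
print(rmsea(stat=85.0, df=40, n=5000))
```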
NASA Astrophysics Data System (ADS)
Guadagnini, A.; Riva, M.; Dell'Oca, A.
2017-12-01
We propose to ground the sensitivity of uncertain parameters of environmental models in a set of indices based on the main (statistical) moments, i.e., mean, variance, skewness and kurtosis, of the probability density function (pdf) of a target model output. This enables us to perform Global Sensitivity Analysis (GSA) of a model in terms of multiple statistical moments and yields a quantification of the impact of model parameters on the features driving the shape of the pdf of the model output. Our GSA approach can be coupled with the construction of a reduced-complexity model that approximates the full model response at a reduced computational cost. We demonstrate our approach through a variety of test cases. These include a commonly used analytical benchmark, a simplified model representing pumping in a coastal aquifer, a laboratory-scale tracer experiment, and the migration of fracturing fluid through a naturally fractured reservoir (source) to reach an overlying formation (target). Our strategy allows discriminating the relative importance of model parameters to the four statistical moments considered. We also provide an appraisal of the error associated with evaluating our sensitivity metrics when the original system model is replaced by the selected surrogate model. Our results suggest that one might need to construct a surrogate model of increasing accuracy depending on the statistical moment considered in the GSA. The methodological framework we propose can assist the development of analysis techniques targeted to model calibration, design of experiments, uncertainty quantification and risk assessment.
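A rough sketch of moment-based sensitivity indices in this spirit: for each parameter, conditional moments of the output (with the parameter binned at fixed values) are compared with the unconditional moments. The toy model, binning scheme, and normalization are assumptions for illustration and need not coincide with the exact metrics used by the authors.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def toy_model(x):
    # Placeholder model output; stands in for the full or surrogate model
    return x[:, 0] ** 2 + 0.5 * x[:, 1] + 0.1 * x[:, 0] * x[:, 2]

def moment_based_indices(model, n_params=3, n_samples=20000, n_bins=20):
    """Average shift of conditional output moments when one parameter is fixed,
    relative to the unconditional moment (illustrative definition)."""
    x = rng.random((n_samples, n_params))
    y = model(x)
    moments = {"mean": np.mean, "var": np.var,
               "skew": stats.skew, "kurt": stats.kurtosis}
    indices = {name: np.zeros(n_params) for name in moments}
    for i in range(n_params):
        bins = np.digitize(x[:, i], np.linspace(0, 1, n_bins + 1)[1:-1])
        for name, m in moments.items():
            uncond = m(y)
            cond = np.array([m(y[bins == b]) for b in range(n_bins)])
            indices[name][i] = np.mean(np.abs(cond - uncond)) / max(abs(uncond), 1e-12)
    return indices

print(moment_based_indices(toy_model))
```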
NASA Astrophysics Data System (ADS)
Zielke, Olaf; McDougall, Damon; Mai, Martin; Babuska, Ivo
2014-05-01
Seismic data, often augmented with geodetic data, are frequently used to invert for the spatio-temporal evolution of slip along a rupture plane. The resulting images of the slip evolution for a single event, inferred by different research teams, often vary distinctly, depending on the adopted inversion approach and rupture model parameterization. This observation raises the question of which of the provided kinematic source inversion solutions is most reliable and most robust, and, more generally, how accurate fault parameterization and solution predictions are. These issues are not addressed by "standard" source inversion approaches. Here, we present a statistical inversion approach to constrain kinematic rupture parameters from teleseismic body waves. The approach is based (a) on a forward-modeling scheme that computes synthetic (body-)waves for a given kinematic rupture model, and (b) on the QUESO (Quantification of Uncertainty for Estimation, Simulation, and Optimization) library, which uses MCMC algorithms and Bayes' theorem for sample selection. We present Bayesian inversions for rupture parameters in synthetic earthquakes (i.e., for which the exact rupture history is known) in an attempt to identify the cross-over at which further model discretization (spatial and temporal resolution of the parameter space) no longer leads to a decreasing misfit. Identification of this cross-over is important because it reveals the resolution power of the studied data set (i.e., teleseismic body waves), enabling one to constrain kinematic earthquake rupture histories of real earthquakes at a resolution that is supported by the data. In addition, the Bayesian approach allows for mapping complete posterior probability density functions of the desired kinematic source parameters, thus enabling us to rigorously assess the uncertainties in earthquake source inversions.
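A minimal random-walk Metropolis sketch of the Bayesian sampling step is given below. The two-parameter Gaussian log-likelihood is a stand-in for the waveform misfit between synthetic and observed teleseismic body waves, a flat prior is assumed, and QUESO's actual algorithms and parameterization are considerably richer.

```python
import numpy as np

rng = np.random.default_rng(6)

def log_likelihood(params):
    # Stand-in for -0.5 * waveform misfit between synthetic and observed
    # teleseismic body waves for a candidate kinematic rupture model.
    target = np.array([2.5, 0.8])        # hypothetical "true" rupture parameters
    sigma = np.array([0.5, 0.2])
    return -0.5 * np.sum(((params - target) / sigma) ** 2)

def metropolis(n_samples=20000, step=0.1, start=(1.0, 1.0)):
    chain = np.empty((n_samples, 2))
    current = np.array(start, dtype=float)
    current_ll = log_likelihood(current)
    for i in range(n_samples):
        proposal = current + rng.normal(0.0, step, 2)
        prop_ll = log_likelihood(proposal)
        if np.log(rng.random()) < prop_ll - current_ll:   # flat prior assumed
            current, current_ll = proposal, prop_ll
        chain[i] = current
    return chain

samples = metropolis()
print(samples[5000:].mean(axis=0), samples[5000:].std(axis=0))   # posterior mean and spread
```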
Entanglement dynamics in a non-Markovian environment: An exactly solvable model
NASA Astrophysics Data System (ADS)
Wilson, Justin H.; Fregoso, Benjamin M.; Galitski, Victor M.
2012-05-01
We study the non-Markovian effects on the dynamics of entanglement in an exactly solvable model that involves two independent oscillators, each coupled to its own stochastic noise source. First, we develop Lie algebraic and functional integral methods to find an exact solution to the single-oscillator problem which includes an analytic expression for the density matrix and the complete statistics, i.e., the probability distribution functions for observables. For long bath time correlations, we see nonmonotonic evolution of the uncertainties in observables. Further, we extend this exact solution to the two-particle problem and find the dynamics of entanglement in a subspace. We find the phenomena of “sudden death” and “rebirth” of entanglement. Interestingly, all memory effects enter via the functional form of the energy and hence the time of death and rebirth is controlled by the amount of noisy energy added into each oscillator. If this energy increases above (decreases below) a threshold, we obtain sudden death (rebirth) of entanglement.
Turbulent Statistics From Time-Resolved PIV Measurements of a Jet Using Empirical Mode Decomposition
NASA Technical Reports Server (NTRS)
Dahl, Milo D.
2013-01-01
Empirical mode decomposition is an adaptive signal processing method that when applied to a broadband signal, such as that generated by turbulence, acts as a set of band-pass filters. This process was applied to data from time-resolved, particle image velocimetry measurements of subsonic jets prior to computing the second-order, two-point, space-time correlations from which turbulent phase velocities and length and time scales could be determined. The application of this method to large sets of simultaneous time histories is new. In this initial study, the results are relevant to acoustic analogy source models for jet noise prediction. The high frequency portion of the results could provide the turbulent values for subgrid scale models for noise that is missed in large-eddy simulations. The results are also used to infer that the cross-correlations between different components of the decomposed signals at two points in space, neglected in this initial study, are important.
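A hedged sketch of the processing chain on synthetic data: each time history is decomposed into intrinsic mode functions (which act as band-pass filters), and the two-point cross-correlation of a mode pair yields a time delay from which a phase velocity could be estimated. It assumes the third-party PyEMD package (distributed as EMD-signal) is installed; the signals and parameters are invented, and the study's own analysis uses the full second-order, two-point, space-time correlations.

```python
import numpy as np
from PyEMD import EMD          # assumes the PyEMD package (EMD-signal on PyPI)

rng = np.random.default_rng(7)
t = np.linspace(0.0, 1.0, 2000)

# Two synthetic "velocity" time histories at nearby points, sharing a delayed signal
common = np.sin(2 * np.pi * 40 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)
u1 = common + 0.3 * rng.standard_normal(t.size)
u2 = np.roll(common, 10) + 0.3 * rng.standard_normal(t.size)

# Decompose each signal into intrinsic mode functions (a bank of band-pass filters)
imfs1 = EMD().emd(u1)
imfs2 = EMD().emd(u2)

# Two-point cross-correlation of the highest-frequency mode; the peak lag gives a
# time delay from which a phase velocity could be estimated
c = np.correlate(imfs1[0] - imfs1[0].mean(), imfs2[0] - imfs2[0].mean(), mode="full")
lag = np.argmax(c) - (t.size - 1)
print("peak-correlation lag (samples):", lag)
```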
Satellite freeze forecast system: Executive summary
NASA Technical Reports Server (NTRS)
Martsolf, J. D. (Principal Investigator)
1983-01-01
A satellite-based temperature monitoring and prediction system consisting of a computer-controlled acquisition, processing, and display system and the ten automated weather stations called by that computer was developed and transferred to the National Weather Service. This satellite freeze forecast system (SFFS) acquires satellite data from either of two sources and surface data from 10 sites, displays the observed data in the form of color-coded thermal maps and in tables of automated weather station temperatures, computes predicted thermal maps when requested and displays such maps either automatically or manually, archives the data acquired, and makes comparisons with historical data. Except for the last function, SFFS handles these tasks in a highly automated fashion if the user so directs. The predicted thermal maps are the result of two models: one a physical energy budget of the soil-atmosphere interface, and the other a statistical relationship between the sites at which the physical model predicts temperatures and each of the pixels of the satellite thermal map.
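The second, statistical component can be illustrated by a least-squares sketch that maps the physical model's site temperature predictions to a single satellite pixel. The training data, number of stations, and regression form are assumptions for illustration, not the documented SFFS procedure.

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical training data: past nights' site temperatures predicted by the
# physical model (predictors) and the satellite-observed temperature of one pixel (target)
site_temps = rng.normal(2.0, 3.0, size=(60, 10))            # 60 nights, 10 stations
pixel_temp = site_temps @ rng.random(10) * 0.1 + rng.normal(0.0, 0.5, 60)

# Least-squares fit of the statistical site-to-pixel relationship (with an intercept)
A = np.column_stack([site_temps, np.ones(60)])
coef, *_ = np.linalg.lstsq(A, pixel_temp, rcond=None)

# Predicted pixel temperature for a new night's physical-model site forecasts
new_sites = rng.normal(2.0, 3.0, 10)
print(np.concatenate([new_sites, [1.0]]) @ coef)
```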
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-11-01
The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in north-east Sicily, was hit on 1 October 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit, with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructure. Landslides, mainly earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials covering the underlying metamorphic bedrock. The work was carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high-resolution aerial colour orthophotos; ii) identification of landslide source areas; iii) preparation of data on landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview of the relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on the results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curves, AUC and contingency tables; viii) comparison of model results and the obtained susceptibility maps; and ix) analysis of the temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests demonstrated excellent predictive capabilities. Land use and wildfire variables were found to have a strong control on the occurrence of very rapid shallow landslides.
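A minimal scikit-learn sketch of steps v) and vii) follows: Logistic Regression and Random Forests susceptibility scores for synthetic mapping units, evaluated with AUC. The predictor variables and labels are randomly generated stand-ins for the real controlling factors and landslide inventory.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(9)

# Synthetic mapping units: columns could stand for slope, curvature, land use class,
# distance to streams, etc.; 1 = landslide source area present
X = rng.random((2000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.3, 2000) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                  ("Random Forest", RandomForestClassifier(n_estimators=200, random_state=0))]:
    clf.fit(X_train, y_train)
    scores = clf.predict_proba(X_test)[:, 1]        # susceptibility score per unit
    print(name, "AUC =", round(roc_auc_score(y_test, scores), 3))
```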
Skillful prediction of hot temperature extremes over the source region of ancient Silk Road.
Zhang, Jingyong; Yang, Zhanmei; Wu, Lingyun
2018-04-27
The source region of the ancient Silk Road (SRASR) in China, a region of around 150 million people, faces a rapidly increasing risk of extreme heat in summer. In this study, we develop statistical models to predict summer hot temperature extremes over the SRASR based on a timescale decomposition approach. Results show that, after removing the linear trends, the inter-annual components of summer hot days and heatwaves over the SRASR are significantly related to those of spring soil temperature over Central Asia and sea surface temperature over the Northwest Atlantic, while their inter-decadal components are closely linked to those of the spring East Pacific/North Pacific pattern and the Atlantic Multidecadal Oscillation for 1979-2016. The physical processes involved are also discussed. Leave-one-out cross-validation for the detrended 1979-2016 time series indicates that the statistical models based on the identified spring predictors can predict 47% and 57% of the total variances of summer hot days and heatwaves averaged over the SRASR, respectively. When the linear trends are put back, the prediction skills increase substantially to 64% and 70%. Hindcast experiments for 2012-2016 show high skill in predicting the spatial patterns of hot temperature extremes over the SRASR. The statistical models proposed herein can be easily applied to operational seasonal forecasting.
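A hedged sketch of the leave-one-out evaluation of a detrended statistical prediction model is shown below with scikit-learn and SciPy. The 38-year predictor and predictand series are synthetic stand-ins, and the skill it prints has no relation to the 47%/57% figures reported above.

```python
import numpy as np
from scipy import signal, stats
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(10)

# Synthetic stand-in: 38 years (1979-2016) of two spring predictors and summer hot days
n_years = 38
predictors = rng.standard_normal((n_years, 2))
hot_days = 0.8 * predictors[:, 0] + 0.4 * predictors[:, 1] + rng.normal(0.0, 0.6, n_years)

# Remove linear trends before assessing inter-annual skill
y = signal.detrend(hot_days)
X = signal.detrend(predictors, axis=0)

loo = LeaveOneOut()
pred = np.empty(n_years)
for train_idx, test_idx in loo.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred[test_idx] = model.predict(X[test_idx])

r = stats.pearsonr(y, pred)[0]
print("explained variance (r^2):", round(r ** 2, 2))
```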
94Mo(γ,n) and 90Zr(γ,n) cross-section measurements towards understanding the origin of p-nuclei
NASA Astrophysics Data System (ADS)
Meekins, E.; Banu, A.; Karwowski, H.; Silano, J.; Zimmerman, W.; Muller, J.; Rich, G.; Bhike, M.; Tornow, W.; McClesky, M.; Travaglio, C.
2014-09-01
The nucleosynthesis beyond iron of the rarest stable isotopes in the cosmos, the so-called p-nuclei, is one of the forefront topics in nuclear astrophysics. Recently, a stellar source was found that, for the first time, was able to produce both light and heavy p-nuclei almost at the same level as 56Fe, including the most debated 92,94Mo and 96,98Ru; it was also found that there is an important contribution from the p-process nucleosynthesis to the neutron magic nucleus 90Zr. We focus here on constraining the origin of p-nuclei through nuclear physics by studying two key astrophysical photoneutron reaction cross sections for 94Mo(γ,n) and 90Zr(γ,n). Their energy dependencies were measured using quasi-monochromatic photon beams from Duke University's High Intensity Gamma-ray Source facility at the respective neutron threshold energies up to 18 MeV. Preliminary results of these experimental cross sections will be presented along with their comparison to predictions by a statistical model based on the Hauser-Feshbach formalism implemented in codes like TALYS and SMARAGD. This research was supported by the Research Corporation for Science Advancement.
A Simple Double-Source Model for Interference of Capillaries
ERIC Educational Resources Information Center
Hou, Zhibo; Zhao, Xiaohong; Xiao, Jinghua
2012-01-01
A simple but physically intuitive double-source model is proposed to explain the interferogram of a laser-capillary system, where two effective virtual sources are used to describe the rays reflected by and transmitted through the capillary. The locations of the two virtual sources are functions of the observing positions on the target screen. An…
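Although the abstract is truncated here, the underlying two-source picture can be sketched with the standard interference formula for two equal-amplitude point sources. The wavelength, source separation, and screen geometry below are illustrative, and the paper's virtual-source positions additionally vary with the observation point on the screen.

```python
import numpy as np

# Generic two-point-source interference on a screen; the capillary paper replaces the
# real reflected/transmitted rays with two effective virtual sources, but the fringe
# formula below is the standard equal-amplitude result I = 4 cos^2(pi * delta / lambda).
wavelength = 632.8e-9                 # He-Ne laser wavelength (m)
d = 200e-6                            # source separation (illustrative)
L = 1.0                               # distance to the screen (m)
x = np.linspace(-0.02, 0.02, 2001)    # positions on the screen (m)

r1 = np.sqrt(L**2 + (x - d / 2) ** 2)
r2 = np.sqrt(L**2 + (x + d / 2) ** 2)
intensity = 4.0 * np.cos(np.pi * (r2 - r1) / wavelength) ** 2

# Fringe spacing near the centre should be close to wavelength * L / d
peaks = x[1:-1][(intensity[1:-1] > intensity[:-2]) & (intensity[1:-1] > intensity[2:])]
print("numerical fringe spacing:", np.mean(np.diff(peaks)))
print("lambda * L / d          :", wavelength * L / d)
```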