Statistical Physics of High Dimensional Inference
NASA Astrophysics Data System (ADS)
Advani, Madhu; Ganguli, Surya
To model modern large-scale datasets, we need efficient algorithms to infer a set of P unknown model parameters from N noisy measurements. What are fundamental limits on the accuracy of parameter inference, given limited measurements, signal-to-noise ratios, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as the measurement density α =N/P --> ∞ . However, modern high-dimensional inference problems, in fields ranging from bio-informatics to economics, occur at finite α. We formulate and analyze high-dimensional inference analytically by applying the replica and cavity methods of statistical physics where data serves as quenched disorder and inferred parameters play the role of thermal degrees of freedom. Our analysis reveals that widely cherished Bayesian inference algorithms such as maximum likelihood and maximum a posteriori are suboptimal in the modern setting, and yields new tractable, optimal algorithms to replace them as well as novel bounds on the achievable accuracy of a large class of high-dimensional inference algorithms. Thanks to Stanford Graduate Fellowship and Mind Brain Computation IGERT grant for support.
High-dimensional statistical inference: From vector to matrix
NASA Astrophysics Data System (ADS)
Zhang, Anru
Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications. It has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. In this thesis, we consider several problems in including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion). The first part of the thesis discusses compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary while leads to sharp results. It is shown that, in compressed sensing, delta kA < 1/3, deltak A+ thetak,kA < 1, or deltatkA < √( t - 1)/t for any given constant t ≥ 4/3 guarantee the exact recovery of all k sparse signals in the noiseless case through the constrained ℓ1 minimization, and similarly in affine rank minimization delta rM < 1/3, deltar M + thetar, rM < 1, or deltatrM< √( t - 1)/t ensure the exact reconstruction of all matrices with rank at most r in the noiseless case via the constrained nuclear norm minimization. Moreover, for any epsilon > 0, delta kA < 1/3 + epsilon, deltak A + thetak,kA < 1 + epsilon, or deltatkA< √(t - 1) / t + epsilon are not sufficient to guarantee the exact recovery of all k-sparse signals for large k. Similar result also holds for matrix recovery. In addition, the conditions delta kA<1/3, deltak A+ thetak,kA<1, delta tkA < √(t - 1)/t and deltarM<1/3, delta rM+ thetar,rM<1, delta trM< √(t - 1)/ t are also shown to be sufficient respectively for stable recovery of approximately sparse signals and low-rank matrices in the noisy case
Statistical inference and string theory
NASA Astrophysics Data System (ADS)
Heckman, Jonathan J.
2015-09-01
In this paper, we expose some surprising connections between string theory and statistical inference. We consider a large collective of agents sweeping out a family of nearby statistical models for an M-dimensional manifold of statistical fitting parameters. When the agents making nearby inferences align along a d-dimensional grid, we find that the pooled probability that the collective reaches a correct inference is the partition function of a nonlinear sigma model in d dimensions. Stability under perturbations to the original inference scheme requires the agents of the collective to distribute along two dimensions. Conformal invariance of the sigma model corresponds to the condition of a stable inference scheme, directly leading to the Einstein field equations for classical gravity. By summing over all possible arrangements of the agents in the collective, we reach a string theory. We also use this perspective to quantify how much an observer can hope to learn about the internal geometry of a superstring compactification. Finally, we present some brief speculative remarks on applications to the AdS/CFT correspondence and Lorentzian signature space-times.
NASA Astrophysics Data System (ADS)
Shehadeh, Mahmoud M.
The Greek word "nano," meaning dwarf, refers to a reduction of size or time by 10-9, which is one thousand times smaller than a micron. (The width across the head of a pin, for instance, is 1,000,000 nanometers). First introduced in the late 1970s, the concept of nanotechnology entails the manufacture and manipulation of objects, atoms, and molecules on a nanometer scale. Currently, nanoscience and its research constitute a complete spectrum of activities towards the promised next industrial revolution, and they span the whole spectrum of physical, chemical, biological, and mathematical sciences needed to develop new tools, models and techniques that help in expanding this technology. In this thesis, we first discuss the robust parameter design for nanostructures and data collection. We then model the nano data using multinomial logit and probit models and implement statistical inference using both the frequentist and Bayesian approaches. Moreover, the mean of probabilities of obtaining different types of nanostructures are obtained using Monte Carlo simulations, and these probabilities are maximized in order to find the conditions in which the desired nanostructures would be produced in large quantity.
NASA Astrophysics Data System (ADS)
Sjöstrand, Karl; Cardenas, Valerie A.; Larsen, Rasmus; Studholme, Colin
2008-03-01
Whole-brain morphometry denotes a group of methods with the aim of relating clinical and cognitive measurements to regions of the brain. Typically, such methods require the statistical analysis of a data set with many variables (voxels and exogenous variables) paired with few observations (subjects). A common approach to this ill-posed problem is to analyze each spatial variable separately, dividing the analysis into manageable subproblems. A disadvantage of this method is that the correlation structure of the spatial variables is not taken into account. This paper investigates the use of ridge regression to address this issue, allowing for a gradual introduction of correlation information into the model. We make the connections between ridge regression and voxel-wise procedures explicit and discuss relations to other statistical methods. Results are given on an in-vivo data set of deformation based morphometry from a study of cognitive decline in an elderly population.
Thermodynamics of cellular statistical inference
NASA Astrophysics Data System (ADS)
Lang, Alex; Fisher, Charles; Mehta, Pankaj
2014-03-01
Successful organisms must be capable of accurately sensing the surrounding environment in order to locate nutrients and evade toxins or predators. However, single cell organisms face a multitude of limitations on their accuracy of sensing. Berg and Purcell first examined the canonical example of statistical limitations to cellular learning of a diffusing chemical and established a fundamental limit to statistical accuracy. Recent work has shown that the Berg and Purcell learning limit can be exceeded using Maximum Likelihood Estimation. Here, we recast the cellular sensing problem as a statistical inference problem and discuss the relationship between the efficiency of an estimator and its thermodynamic properties. We explicitly model a single non-equilibrium receptor and examine the constraints on statistical inference imposed by noisy biochemical networks. Our work shows that cells must balance sample number, specificity, and energy consumption when performing statistical inference. These tradeoffs place significant constraints on the practical implementation of statistical estimators in a cell.
Statistical learning and selective inference
Taylor, Jonathan; Tibshirani, Robert J.
2015-01-01
We describe the problem of “selective inference.” This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have “cherry-picked”—searched for the strongest associations—means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis. PMID:26100887
Statistical Inference at Work: Statistical Process Control as an Example
ERIC Educational Resources Information Center
Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia
2008-01-01
To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…
The Reasoning behind Informal Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie; Bakker, Arthur; Ben-Zvi, Dani
2011-01-01
Informal statistical inference (ISI) has been a frequent focus of recent research in statistics education. Considering the role that context plays in developing ISI calls into question the need to be more explicit about the reasoning that underpins ISI. This paper uses educational literature on informal statistical inference and philosophical…
Predict! Teaching Statistics Using Informational Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie
2013-01-01
Statistics is one of the most widely used topics for everyday life in the school mathematics curriculum. Unfortunately, the statistics taught in schools focuses on calculations and procedures before students have a chance to see it as a useful and powerful tool. Researchers have found that a dominant view of statistics is as an assortment of tools…
Local and Global Thinking in Statistical Inference
ERIC Educational Resources Information Center
Pratt, Dave; Johnston-Wilder, Peter; Ainley, Janet; Mason, John
2008-01-01
In this reflective paper, we explore students' local and global thinking about informal statistical inference through our observations of 10- to 11-year-olds, challenged to infer the unknown configuration of a virtual die, but able to use the die to generate as much data as they felt necessary. We report how they tended to focus on local changes…
Ranald Macdonald and statistical inference.
Smith, Philip T
2009-05-01
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing. PMID:19351454
Making statistical inferences about software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1988-01-01
Failure times of software undergoing random debugging can be modelled as order statistics of independent but nonidentically distributed exponential random variables. Using this model inferences can be made about current reliability and, if debugging continues, future reliability. This model also shows the difficulty inherent in statistical verification of very highly reliable software such as that used by digital avionics in commercial aircraft.
Making statistical inferences about software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1986-01-01
Failure times of software undergoing random debugging can be modeled as order statistics of independent but nonidentically distributed exponential random variables. Using this model inferences can be made about current reliability and, if debugging continues, future reliability. This model also shows the difficulty inherent in statistical verification of very highly reliable software such as that used by digital avionics in commercial aircraft.
Investigating Mathematics Teachers' Thoughts of Statistical Inference
ERIC Educational Resources Information Center
Yang, Kai-Lin
2012-01-01
Research on statistical cognition and application suggests that statistical inference concepts are commonly misunderstood by students and even misinterpreted by researchers. Although some research has been done on students' misunderstanding or misconceptions of confidence intervals (CIs), few studies explore either students' or mathematics…
Inference and the introductory statistics course
NASA Astrophysics Data System (ADS)
Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross
2011-10-01
This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its hypothetical probabilistic reasoning process is examined in some depth. We argue that the revolution in the teaching of inference must begin. We also discuss some perplexing issues, problematic areas and some new insights into language conundrums associated with introducing the logic of inference through randomization methods.
Statistical Inference in Retrieval Effectiveness Evaluation.
ERIC Educational Resources Information Center
Savoy, Jacques
1997-01-01
Discussion of evaluation methodology in information retrieval focuses on the average precision over a set of fixed recall values in an effort to evaluate the retrieval effectiveness of a search algorithm. Highlights include a review of traditional evaluation methodology with examples; and a statistical inference methodology called bootstrap.…
Inference and the Introductory Statistics Course
ERIC Educational Resources Information Center
Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross
2011-01-01
This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its…
Pointwise probability reinforcements for robust statistical inference.
Frénay, Benoît; Verleysen, Michel
2014-02-01
Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample that they should be, with respect to their theoretical probability, and include e.g. outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation. PMID:24300550
Asymptotic theory of quantum statistical inference
NASA Astrophysics Data System (ADS)
Hayashi, Masahito
Part I: Hypothesis Testing: Introduction to Part I -- Strong Converse and Stein's lemma in quantum hypothesis testing/Tomohiro Ogawa and Hiroshi Nagaoka -- The proper formula for relative entropy and its asymptotics in quantum probability/Fumio Hiai and Dénes Petz -- Strong Converse theorems in Quantum Information Theory/Hiroshi Nagaoka -- Asymptotics of quantum relative entropy from a representation theoretical viewpoint/Masahito Hayashi -- Quantum birthday problems: geometrical aspects of Quantum Random Coding/Akio Fujiwara -- Part II: Quantum Cramèr-Rao Bound in Mixed States Model: Introduction to Part II -- A new approach to Cramèr-Rao Bounds for quantum state estimation/Hiroshi Nagaoka -- On Fisher information of Quantum Statistical Models/Hiroshi Nagaoka -- On the parameter estimation problem for Quantum Statistical Models/Hiroshi Nagaoka -- A generalization of the simultaneous diagonalization of Hermitian matrices and its relation to Quantum Estimation Theory/Hiroshi Nagaoka -- A linear programming approach to Attainable Cramèr-Rao Type Bounds/Masahito Hayashi -- Statistical model with measurement degree of freedom and quantum physics/Masahito Hayashi and Keiji Matsumoto -- Asymptotic Quantum Theory for the Thermal States Family/Masahito Hayashi -- State estimation for large ensembles/Richard D. Gill and Serge Massar -- Part III: Quantum Cramèr-Rao Bound in Pure States Model: Introduction to Part III-- Quantum Fisher Metric and estimation for Pure State Models/Akio Fujiwara and Hiroshi Nagaoka -- Geometry of Quantum Estimation Theory/Akio Fujiwara -- An estimation theoretical characterization of coherent states/Akio Fujiwara and Hiroshi Nagaoka -- A geometrical approach to Quantum Estimation Theory/Keiji Matsumoto -- Part IV: Group symmetric approach to Pure States Model: Introduction to Part IV -- Optimal extraction of information from finite quantum ensembles/Serge Massar and Sandu Popescu -- Asymptotic Estimation Theory for a Finite-Dimensional Pure
Likelihood-Free Inference in High-Dimensional Models.
Kousathanas, Athanasios; Leuenberger, Christoph; Helfer, Jonas; Quinodoz, Mathieu; Foll, Matthieu; Wegmann, Daniel
2016-06-01
Methods that bypass analytical evaluations of the likelihood function have become an indispensable tool for statistical inference in many fields of science. These so-called likelihood-free methods rely on accepting and rejecting simulations based on summary statistics, which limits them to low-dimensional models for which the value of the likelihood is large enough to result in manageable acceptance rates. To get around these issues, we introduce a novel, likelihood-free Markov chain Monte Carlo (MCMC) method combining two key innovations: updating only one parameter per iteration and accepting or rejecting this update based on subsets of statistics approximately sufficient for this parameter. This increases acceptance rates dramatically, rendering this approach suitable even for models of very high dimensionality. We further derive that for linear models, a one-dimensional combination of statistics per parameter is sufficient and can be found empirically with simulations. Finally, we demonstrate that our method readily scales to models of very high dimensionality, using toy models as well as by jointly inferring the effective population size, the distribution of fitness effects (DFE) of segregating mutations, and selection coefficients for each locus from data of a recent experiment on the evolution of drug resistance in influenza. PMID:27052569
The renormalization group via statistical inference
NASA Astrophysics Data System (ADS)
Bény, Cédric; Osborne, Tobias J.
2015-08-01
In physics, one attempts to infer the rules governing a system given only the results of imperfect measurements. Hence, microscopic theories may be effectively indistinguishable experimentally. We develop an operationally motivated procedure to identify the corresponding equivalence classes of states, and argue that the renormalization group (RG) arises from the inherent ambiguities associated with the classes: one encounters flow parameters as, e.g., a regulator, a scale, or a measure of precision, which specify representatives in a given equivalence class. This provides a unifying framework and reveals the role played by information in renormalization. We validate this idea by showing that it justifies the use of low-momenta n-point functions as statistically relevant observables around a Gaussian hypothesis. These results enable the calculation of distinguishability in quantum field theory. Our methods also provide a way to extend renormalization techniques to effective models which are not based on the usual quantum-field formalism, and elucidates the relationships between various type of RG.
Verbal framing of statistical evidence drives children's preference inferences.
Garvin, Laura E; Woodward, Amanda L
2015-05-01
Although research has shown that statistical information can support children's inferences about specific psychological causes of others' behavior, previous work leaves open the question of how children interpret statistical information in more ambiguous situations. The current studies investigated the effect of specific verbal framing information on children's ability to infer mental states from statistical regularities in behavior. We found that preschool children inferred others' preferences from their statistically non-random choices only when they were provided with verbal information placing the person's behavior in a specifically preference-related context, not when the behavior was presented in a non-mentalistic action context or an intentional choice context. Furthermore, verbal framing information showed some evidence of supporting children's mental state inferences even from more ambiguous statistical data. These results highlight the role that specific, relevant framing information can play in supporting children's ability to derive novel insights from statistical information. PMID:25704581
An argument for mechanism-based statistical inference in cancer
Ochs, Michael; Price, Nathan D.; Tomasetti, Cristian; Younes, Laurent
2015-01-01
Cancer is perhaps the prototypical systems disease, and as such has been the focus of extensive study in quantitative systems biology. However, translating these programs into personalized clinical care remains elusive and incomplete. In this perspective, we argue that realizing this agenda—in particular, predicting disease phenotypes, progression and treatment response for individuals—requires going well beyond standard computational and bioinformatics tools and algorithms. It entails designing global mathematical models over network-scale configurations of genomic states and molecular concentrations, and learning the model parameters from limited available samples of high-dimensional and integrative omics data. As such, any plausible design should accommodate: biological mechanism, necessary for both feasible learning and interpretable decision making; stochasticity, to deal with uncertainty and observed variation at many scales; and a capacity for statistical inference at the patient level. This program, which requires a close, sustained collaboration between mathematicians and biologists, is illustrated in several contexts, including learning bio-markers, metabolism, cell signaling, network inference and tumorigenesis. PMID:25381197
Nuclear Forensic Inferences Using Iterative Multidimensional Statistics
Robel, M; Kristo, M J; Heller, M A
2009-06-09
Nuclear forensics involves the analysis of interdicted nuclear material for specific material characteristics (referred to as 'signatures') that imply specific geographical locations, production processes, culprit intentions, etc. Predictive signatures rely on expert knowledge of physics, chemistry, and engineering to develop inferences from these material characteristics. Comparative signatures, on the other hand, rely on comparison of the material characteristics of the interdicted sample (the 'questioned sample' in FBI parlance) with those of a set of known samples. In the ideal case, the set of known samples would be a comprehensive nuclear forensics database, a database which does not currently exist. In fact, our ability to analyze interdicted samples and produce an extensive list of precise materials characteristics far exceeds our ability to interpret the results. Therefore, as we seek to develop the extensive databases necessary for nuclear forensics, we must also develop the methods necessary to produce the necessary inferences from comparison of our analytical results with these large, multidimensional sets of data. In the work reported here, we used a large, multidimensional dataset of results from quality control analyses of uranium ore concentrate (UOC, sometimes called 'yellowcake'). We have found that traditional multidimensional techniques, such as principal components analysis (PCA), are especially useful for understanding such datasets and drawing relevant conclusions. In particular, we have developed an iterative partial least squares-discriminant analysis (PLS-DA) procedure that has proven especially adept at identifying the production location of unknown UOC samples. By removing classes which fell far outside the initial decision boundary, and then rebuilding the PLS-DA model, we have consistently produced better and more definitive attributions than with a single pass classification approach. Performance of the iterative PLS-DA method
Network topology inference from infection statistics
NASA Astrophysics Data System (ADS)
Tomovski, Igor; Kocarev, Ljupčo
2015-10-01
We introduce a mathematical framework for identification of network topology, based on data collected from infectious SIS process occurring on a network. An exact expression for the weight of each network link (existing or not) as a function of infectious statistics, is obtained. An algorithm for proper implementation of the analyzed concept is suggested and the validity of the obtained result is confirmed by numerical simulations performed on a number of synthetic (computer generated) networks.
Simultaneous Statistical Inference for Epigenetic Data
Schildknecht, Konstantin; Olek, Sven; Dickhaus, Thorsten
2015-01-01
Epigenetic research leads to complex data structures. Since parametric model assumptions for the distribution of epigenetic data are hard to verify we introduce in the present work a nonparametric statistical framework for two-group comparisons. Furthermore, epigenetic analyses are often performed at various genetic loci simultaneously. Hence, in order to be able to draw valid conclusions for specific loci, an appropriate multiple testing correction is necessary. Finally, with technologies available for the simultaneous assessment of many interrelated biological parameters (such as gene arrays), statistical approaches also need to deal with a possibly unknown dependency structure in the data. Our statistical approach to the nonparametric comparison of two samples with independent multivariate observables is based on recently developed multivariate multiple permutation tests. We adapt their theory in order to cope with families of hypotheses regarding relative effects. Our results indicate that the multivariate multiple permutation test keeps the pre-assigned type I error level for the global null hypothesis. In combination with the closure principle, the family-wise error rate for the simultaneous test of the corresponding locus/parameter-specific null hypotheses can be controlled. In applications we demonstrate that group differences in epigenetic data can be detected reliably with our methodology. PMID:25965389
Unequal Division of Type I Risk in Statistical Inferences
ERIC Educational Resources Information Center
Meek, Gary E.; Ozgur, Ceyhun O.
2004-01-01
Introductory statistics texts give extensive coverage to two-sided inferences in hypothesis testing, interval estimation, and one-sided hypothesis tests. Very few discuss the possibility of one-sided interval estimation at all. Even fewer do so in any detail. Two of the business statistics texts we reviewed mentioned the possibility of dividing…
Introducing Statistical Inference to Biology Students through Bootstrapping and Randomization
ERIC Educational Resources Information Center
Lock, Robin H.; Lock, Patti Frazer
2008-01-01
Bootstrap methods and randomization tests are increasingly being used as alternatives to standard statistical procedures in biology. They also serve as an effective introduction to the key ideas of statistical inference in introductory courses for biology students. We discuss the use of such simulation based procedures in an integrated curriculum…
LOWER LEVEL INFERENCE CONTROL IN STATISTICAL DATABASE SYSTEMS
Lipton, D.L.; Wong, H.K.T.
1984-02-01
An inference is the process of transforming unclassified data values into confidential data values. Most previous research in inference control has studied the use of statistical aggregates to deduce individual records. However, several other types of inference are also possible. Unknown functional dependencies may be apparent to users who have 'expert' knowledge about the characteristics of a population. Some correlations between attributes may be concluded from 'commonly-known' facts about the world. To counter these threats, security managers should use random sampling of databases of similar populations, as well as expert systems. 'Expert' users of the DATABASE SYSTEM may form inferences from the variable performance of the user interface. Users may observe on-line turn-around time, accounting statistics. the error message received, and the point at which an interactive protocol sequence fails. One may obtain information about the frequency distributions of attribute values, and the validity of data object names from this information. At the back-end of a database system, improved software engineering practices will reduce opportunities to bypass functional units of the database system. The term 'DATA OBJECT' should be expanded to incorporate these data object types which generate new classes of threats. The security of DATABASES and DATABASE SySTEMS must be recognized as separate but related problems. Thus, by increased awareness of lower level inferences, system security managers may effectively nullify the threat posed by lower level inferences.
A Framework for Thinking about Informal Statistical Inference
ERIC Educational Resources Information Center
Makar, Katie; Rubin, Andee
2009-01-01
Informal inferential reasoning has shown some promise in developing students' deeper understanding of statistical processes. This paper presents a framework to think about three key principles of informal inference--generalizations "beyond the data," probabilistic language, and data as evidence. The authors use primary school classroom episodes…
Targeted estimation of nuisance parameters to obtain valid statistical inference.
van der Laan, Mark J
2014-01-01
In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. As a particular special
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries. PMID:25822617
Statistical Detection of EEG Synchrony Using Empirical Bayesian Inference
Singh, Archana K.; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries. PMID:25822617
Inference in high-dimensional parameter space.
O'Hare, Anthony
2015-11-01
Model parameter inference has become increasingly popular in recent years in the field of computational epidemiology, especially for models with a large number of parameters. Techniques such as Approximate Bayesian Computation (ABC) or maximum/partial likelihoods are commonly used to infer parameters in phenomenological models that best describe some set of data. These techniques rely on efficient exploration of the underlying parameter space, which is difficult in high dimensions, especially if there are correlations between the parameters in the model that may not be known a priori. The aim of this article is to demonstrate the use of the recently invented Adaptive Metropolis algorithm for exploring parameter space in a practical way through the use of a simple epidemiological model. PMID:26176624
Breakdown of statistical inference from some random experiments
NASA Astrophysics Data System (ADS)
Kupczynski, Marian; De Raedt, Hans
2016-03-01
Many experiments can be interpreted in terms of random processes operating according to some internal protocols. When experiments are costly or cannot be repeated only one or a few finite samples are available. In this paper we study data generated by pseudo-random computer experiments operating according to particular internal protocols. We show that the standard statistical analysis performed on a sample, containing 105 data points or more, may sometimes be highly misleading and statistical errors largely underestimated. Our results confirm in a dramatic way the dangers of standard asymptotic statistical inference if a sample is not homogeneous. We demonstrate that analyzing various subdivisions of samples by multiple chi-square tests and chi-square frequency graphs is very effective in detecting sample inhomogeneity. Therefore to assure correctness of the statistical inference the above mentioned chi-square tests and other non-parametric sample homogeneity tests should be incorporated in any statistical analysis of experimental data. If such tests are not performed the reported conclusions and estimates of the errors cannot be trusted.
Indirect Fourier transform in the context of statistical inference.
Muthig, Michael; Prévost, Sylvain; Orglmeister, Reinhold; Gradzielski, Michael
2016-09-01
Inferring structural information from the intensity of a small-angle scattering (SAS) experiment is an ill-posed inverse problem. Thus, the determination of a solution is in general non-trivial. In this work, the indirect Fourier transform (IFT), which determines the pair distance distribution function from the intensity and hence yields structural information, is discussed within two different statistical inference approaches, namely a frequentist one and a Bayesian one, in order to determine a solution objectively From the frequentist approach the cross-validation method is obtained as a good practical objective function for selecting an IFT solution. Moreover, modern machine learning methods are employed to suppress oscillatory behaviour of the solution, hence extracting only meaningful features of the solution. By comparing the results yielded by the different methods presented here, the reliability of the outcome can be improved and thus the approach should enable more reliable information to be deduced from SAS experiments. PMID:27580204
Statistical Inference for Big Data Problems in Molecular Biophysics
Ramanathan, Arvind; Savol, Andrej; Burger, Virginia; Quinn, Shannon; Agarwal, Pratul K; Chennubhotla, Chakra
2012-01-01
We highlight the role of statistical inference techniques in providing biological insights from analyzing long time-scale molecular simulation data. Technologi- cal and algorithmic improvements in computation have brought molecular simu- lations to the forefront of techniques applied to investigating the basis of living systems. While these longer simulations, increasingly complex reaching petabyte scales presently, promise a detailed view into microscopic behavior, teasing out the important information has now become a true challenge on its own. Mining this data for important patterns is critical to automating therapeutic intervention discovery, improving protein design, and fundamentally understanding the mech- anistic basis of cellular homeostasis.
Two dimensional unstable scar statistics.
Warne, Larry Kevin; Jorgenson, Roy Eberhardt; Kotulski, Joseph Daniel; Lee, Kelvin S. H. (ITT Industries/AES Los Angeles, CA)
2006-12-01
This report examines the localization of time harmonic high frequency modal fields in two dimensional cavities along periodic paths between opposing sides of the cavity. The cases where these orbits lead to unstable localized modes are known as scars. This paper examines the enhancements for these unstable orbits when the opposing mirrors are both convex and concave. In the latter case the construction includes the treatment of interior foci.
Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference
Beggs, W.J.
1981-02-01
This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; the analysis of variance; quality control procedures; and linear regression analysis.
NASA Astrophysics Data System (ADS)
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and always never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to most sophisticated methods
NASA Astrophysics Data System (ADS)
Vali Ahmadi, Mohammad; Doostparast, Mahdi; Ahmadi, Jafar
2015-04-01
In manufacturing industries, the lifetime of an item is usually characterised by a random variable X and considered to be satisfactory if X exceeds a given lower lifetime limit L. The probability of a satisfactory item is then ηL := P(X ≥ L), called conforming rate. In industrial companies, however, the lifetime performance index, proposed by Montgomery and denoted by CL, is widely used as a process capability index instead of the conforming rate. Assuming a parametric model for the random variable X, we show that there is a connection between the conforming rate and the lifetime performance index. Consequently, the statistical inferences about ηL and CL are equivalent. Hence, we restrict ourselves to statistical inference for CL based on generalised order statistics, which contains several ordered data models such as usual order statistics, progressively Type-II censored data and records. Various point and interval estimators for the parameter CL are obtained and optimal critical regions for the hypothesis testing problems concerning CL are proposed. Finally, two real data-sets on the lifetimes of insulating fluid and ball bearings, due to Nelson (1982) and Caroni (2002), respectively, and a simulated sample are analysed.
Statistical Inference for Point Process Models of Rainfall
NASA Astrophysics Data System (ADS)
Smith, James A.; Karr, Alan F.
1985-01-01
In this paper we develop maximum likelihood procedures for parameter estimation and model selection that apply to a large class of point process models that have been used to model rainfall occurrences, including Cox processes, Neyman-Scott processes, and renewal processes. The statistical inference procedures are based on the stochastic intensity λ(t) = lims→0,s>0 (1/s)E[N(t + s) - N(t)|N(u), u < t]. The likelihood function of a point process is shown to have a simple expression in terms of the stochastic intensity. The main result of this paper is a recursive procedure for computing stochastic intensities; the procedure is applicable to a broad class of point process models, including renewal Cox process with Markovian intensity processes and an important class of Neyman-Scott processes. The model selection procedure we propose, which is based on likelihood ratios, allows direct comparison of two classes of point processes to determine which provides a better model for a given data set. The estimation and model selection procedures are applied to two data sets of simulated Cox process arrivals and a data set of daily rainfall occurrences in the Potomac River basin.
Simple statistical inference algorithms for task-dependent wellness assessment.
Kailas, A; Chong, C-C; Watanabe, F
2012-07-01
Stress is a key indicator of wellness in human beings and a prime contributor to performance degradation and errors during various human tasks. The overriding purpose of this paper is to propose two algorithms (probabilistic and non-probabilistic) that iteratively track stress states to compute a wellness index in terms of the stress levels. This paper adopts the physiological view-point that high stress is accompanied with large deviations in biometrics such as body temperature, heart rate, etc., and the proposed algorithms iteratively track these fluctuations to compute a personalized wellness index that is correlated to the engagement levels of the tasks performed by the user. In essence, this paper presents a quantitative relationship between temperature, occupational stress, and wellness during different tasks. The simplicity of the statistical inference algorithms make them favorable candidates for implementation on mobile platforms such as smart phones in the future, thereby providing users an inexpensive application for self-wellness monitoring for a healthier lifestyle. PMID:22676998
Multivariate Statistical Inference of Lightning Occurrence, and Using Lightning Observations
NASA Technical Reports Server (NTRS)
Boccippio, Dennis
2004-01-01
Two classes of multivariate statistical inference using TRMM Lightning Imaging Sensor, Precipitation Radar, and Microwave Imager observation are studied, using nonlinear classification neural networks as inferential tools. The very large and globally representative data sample provided by TRMM allows both training and validation (without overfitting) of neural networks with many degrees of freedom. In the first study, the flashing / or flashing condition of storm complexes is diagnosed using radar, passive microwave and/or environmental observations as neural network inputs. The diagnostic skill of these simple lightning/no-lightning classifiers can be quite high, over land (above 80% Probability of Detection; below 20% False Alarm Rate). In the second, passive microwave and lightning observations are used to diagnose radar reflectivity vertical structure. A priori diagnosis of hydrometeor vertical structure is highly important for improved rainfall retrieval from either orbital radars (e.g., the future Global Precipitation Mission "mothership") or radiometers (e.g., operational SSM/I and future Global Precipitation Mission passive microwave constellation platforms), we explore the incremental benefit to such diagnosis provided by lightning observations.
Physics of epigenetic landscapes and statistical inference by cells
NASA Astrophysics Data System (ADS)
Lang, Alex H.
Biology is currently in the midst of a revolution. Great technological advances have led to unprecedented quantitative data at the whole genome level. However, new techniques are needed to deal with this deluge of high-dimensional data. Therefore, statistical physics has the potential to help develop systems biology level models that can incorporate complex data. Additionally, physicists have made great strides in understanding non-equilibrium thermodynamics. However, the consequences of these advances have yet to be fully incorporated into biology. There are three specific problems that I address in my dissertation. First, a common metaphor for describing development is a rugged "epigenetic landscape'' where cell fates are represented as attracting valleys resulting from a complex regulatory network. I introduce a framework for explicitly constructing epigenetic landscapes that combines genomic data with techniques from spin-glass physics. The model reproduces known reprogramming protocols and identifies candidate transcription factors for reprogramming to novel cell fates, suggesting epigenetic landscapes are a powerful paradigm for understanding cellular identity. Second, I examine the dynamics of cellular reprogramming. By reanalyzing all available time-series data, I show that gene expression dynamics during reprogramming follow a simple one-dimensional reaction coordinate that is independent of both the time and details of experimental protocol used. I show that such a reaction coordinate emerges naturally from epigenetic landscape models of cell identity where cellular reprogramming is viewed as a "barrier-crossing'' between the starting and ending cell fates. Overall, the analysis and model suggest that gene expression dynamics during reprogramming follow a canonical trajectory consistent with the idea of an ``optimal path'' in gene expression space for reprogramming. Third, an important task of cells is to perform complex computations in response to
Statistical challenges of high-dimensional data
Johnstone, Iain M.; Titterington, D. Michael
2009-01-01
Modern applications of statistical theory and methods can involve extremely large datasets, often with huge numbers of measurements on each of a comparatively small number of experimental units. New methodology and accompanying theory have emerged in response: the goal of this Theme Issue is to illustrate a number of these recent developments. This overview article introduces the difficulties that arise with high-dimensional data in the context of the very familiar linear statistical model: we give a taste of what can nevertheless be achieved when the parameter vector of interest is sparse, that is, contains many zero elements. We describe other ways of identifying low-dimensional subspaces of the data space that contain all useful information. The topic of classification is then reviewed along with the problem of identifying, from within a very large set, the variables that help to classify observations. Brief mention is made of the visualization of high-dimensional data and ways to handle computational problems in Bayesian analysis are described. At appropriate points, reference is made to the other papers in the issue. PMID:19805443
Statistical Inference in the Learning of Novel Phonetic Categories
ERIC Educational Resources Information Center
Zhao, Yuan
2010-01-01
Learning a phonetic category (or any linguistic category) requires integrating different sources of information. A crucial unsolved problem for phonetic learning is how this integration occurs: how can we update our previous knowledge about a phonetic category as we hear new exemplars of the category? One model of learning is Bayesian Inference,…
Building Intuitions about Statistical Inference Based on Resampling
ERIC Educational Resources Information Center
Watson, Jane; Chance, Beth
2012-01-01
Formal inference, which makes theoretical assumptions about distributions and applies hypothesis testing procedures with null and alternative hypotheses, is notoriously difficult for tertiary students to master. The debate about whether this content should appear in Years 11 and 12 of the "Australian Curriculum: Mathematics" has gone on for…
ERIC Educational Resources Information Center
Larwin, Karen H.; Larwin, David A.
2011-01-01
Bootstrapping methods and random distribution methods are increasingly recommended as better approaches for teaching students about statistical inference in introductory-level statistics courses. The authors examined the effect of teaching undergraduate business statistics students using random distribution and bootstrapping simulations. It is the…
Statistical Inferences from Formaldehyde Dna-Protein Cross-Link Data
Physiologically-based pharmacokinetic (PBPK) modeling has reached considerable sophistication in its application in the pharmacological and environmental health areas. Yet, mature methodologies for making statistical inferences have not been routinely incorporated in these applic...
Statistical mechanics of complex neural systems and high dimensional data
NASA Astrophysics Data System (ADS)
Advani, Madhu; Lahiri, Subhaneil; Ganguli, Surya
2013-03-01
Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? Second, how can we extract meaningful models of neuronal systems from high dimensional datasets? To aid in these challenges, we give a pedagogical review of a collection of ideas and theoretical methods arising at the intersection of statistical physics, computer science and neurobiology. We introduce the interrelated replica and cavity methods, which originated in statistical physics as powerful ways to quantitatively analyze large highly heterogeneous systems of many interacting degrees of freedom. We also introduce the closely related notion of message passing in graphical models, which originated in computer science as a distributed algorithm capable of solving large inference and optimization problems involving many coupled variables. We then show how both the statistical physics and computer science perspectives can be applied in a wide diversity of contexts to problems arising in theoretical neuroscience and data analysis. Along the way we discuss spin glasses, learning theory, illusions of structure in noise, random matrices, dimensionality reduction and compressed sensing, all within the unified formalism of the replica method. Moreover, we review recent conceptual connections between message passing in graphical models, and neural computation and learning. Overall, these ideas illustrate how statistical physics and computer science might provide a lens through which we can uncover emergent computational functions buried deep within the dynamical complexities of neuronal networks.
Statistical inference for exploratory data analysis and model diagnostics.
Buja, Andreas; Cook, Dianne; Hofmann, Heike; Lawrence, Michael; Lee, Eun-Kyung; Swayne, Deborah F; Wickham, Hadley
2009-11-13
We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of 'discoveries' is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the 'lineup' popular from criminal legal procedures. Another protocol modelled after the 'Rorschach' inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students' statistical thinking. PMID:19805449
Technology Focus: Using Technology to Explore Statistical Inference
ERIC Educational Resources Information Center
Garofalo, Joe; Juersivich, Nicole
2007-01-01
There is much research that documents what many teachers know, that students struggle with many concepts in probability and statistics. This article presents two sample activities the authors use to help preservice teachers develop ideas about how they can use technology to promote their students' ability to understand mathematics and connect…
Trans-dimensional Bayesian inference for large sequential data sets
NASA Astrophysics Data System (ADS)
Mandolesi, E.; Dettmer, J.; Dosso, S. E.; Holland, C. W.
2015-12-01
This work develops a sequential Monte Carlo method to infer seismic parameters of layered seabeds from large sequential reflection-coefficient data sets. The approach provides parameter estimates and uncertainties along survey tracks with the goal to aid in the detection of unexploded ordnance in shallow water. The sequential data are acquired by a moving platform with source and receiver array towed close to the seabed. This geometry requires consideration of spherical reflection coefficients, computed efficiently by massively parallel implementation of the Sommerfeld integral via Levin integration on a graphics processing unit. The seabed is parametrized with a trans-dimensional model to account for changes in the environment (i.e. changes in layering) along the track. The method combines advanced Markov chain Monte Carlo methods (annealing) with particle filtering (resampling). Since data from closely-spaced source transmissions (pings) often sample similar environments, the solution from one ping can be utilized to efficiently estimate the posterior for data from subsequent pings. Since reflection-coefficient data are highly informative, the likelihood function can be extremely peaked, resulting in little overlap between posteriors of adjacent pings. This is addressed by adding bridging distributions (via annealed importance sampling) between pings for more efficient transitions. The approach assumes the environment to be changing slowly enough to justify the local 1D parametrization. However, bridging allows rapid changes between pings to be addressed and we demonstrate the method to be stable in such situations. Results are in terms of trans-D parameter estimates and uncertainties along the track. The algorithm is examined for realistic simulated data along a track and applied to a dataset collected by an autonomous underwater vehicle on the Malta Plateau, Mediterranean Sea. [Work supported by the SERDP, DoD.
Statistical Inference and Sensitivity to Sampling in 11-Month-Old Infants
ERIC Educational Resources Information Center
Xu, Fei; Denison, Stephanie
2009-01-01
Research on initial conceptual knowledge and research on early statistical learning mechanisms have been, for the most part, two separate enterprises. We report a study with 11-month-old infants investigating whether they are sensitive to sampling conditions and whether they can integrate intentional information in a statistical inference task.…
PyClone: Statistical inference of clonal population structure in cancer
Roth, Andrew; Khattra, Jaswinder; Yap, Damian; Wan, Adrian; Laks, Emma; Biele, Justina; Ha, Gavin; Aparicio, Samuel; Bouchard-Côté, Alexandre; Shah, Sohrab P.
2016-01-01
We introduce a novel statistical method, PyClone, for inference of clonal population structures in cancers. PyClone is a Bayesian clustering method for grouping sets of deeply sequenced somatic mutations into putative clonal clusters while estimating their cellular prevalences and accounting for allelic imbalances introduced by segmental copy number changes and normal cell contamination. Single cell sequencing validation demonstrates that PyClone infers accurate clustering of mutations that co-occur in individual cells. PMID:24633410
Circumpulsar Asteroids: Inferences from Nulling Statistics and High Energy Correlations
NASA Astrophysics Data System (ADS)
Shannon, Ryan; Cordes, J. M.
2006-12-01
We have proposed that some classes of radio pulsar variability are associated with the entry of neutral asteroidal material into the pulsar magnetosphere. The region surrounding neutron stars is polluted with supernova fall-back material, which collapses and condenses into an asteroid-bearing disk that is stable for millions of years. Over time, collisional and radiative processes cause the asteroids to migrate inward until they are heated to the point of ionization. For older and cooler pulsars, asteroids ionize within the large magnetospheres and inject a sufficient amount of charged particles to alter the electrodynamics of the gap regions and modulate emission processes. This extrinsic model unifies many observed phenomena of variability that occur on time scales that are disparate with the much shorter time scales associated with pulsars and their magnetospheres. One such type of variability is nulling, in which certain pulsars exhibit episodes of quiescence that for some objects may be as short as a few pulse periods, but, for others, is longer than days. Here, in the context of this model, we examine the nulling phenomenon. We analyze the relationship between in-falling material and the statistics of nulling. In addition, as motivation for further high energy observations, we consider the relationship between the nulling and other magnetospheric processes.
Statistical inference from capture data on closed animal populations
Otis, David L.; Burnham, Kenneth P.; White, Gary C.; Anderson, David R.
1978-01-01
The estimation of animal abundance is an important problem in both the theoretical and applied biological sciences. Serious work to develop estimation methods began during the 1950s, with a few attempts before that time. The literature on estimation methods has increased tremendously during the past 25 years (Cormack 1968, Seber 1973). However, in large part, the problem remains unsolved. Past efforts toward comprehensive and systematic estimation of density (D) or population size (N) have been inadequate, in general. While more than 200 papers have been published on the subject, one is generally left without a unified approach to the estimation of abundance of an animal population This situation is unfortunate because a number of pressing research problems require such information. In addition, a wide array of environmental assessment studies and biological inventory programs require the estimation of animal abundance. These needs have been further emphasized by the requirement for the preparation of Environmental Impact Statements imposed by the National Environmental Protection Act in 1970. This publication treats inference procedures for certain types of capture data on closed animal populations. This includes multiple capture-recapture studies (variously called capture-mark-recapture, mark-recapture, or tag-recapture studies) involving livetrapping techniques and removal studies involving kill traps or at least temporary removal of captured individuals during the study. Animals do not necessarily need to be physically trapped; visual sightings of marked animals and electrofishing studies also produce data suitable for the methods described in this monograph. To provide a frame of reference for what follows, we give an exampled of a capture-recapture experiment to estimate population size of small animals using live traps. The general field experiment is similar for all capture-recapture studies (a removal study is, of course, slightly different). A typical
Social Inferences from Faces: Ambient Images Generate a Three-Dimensional Model
ERIC Educational Resources Information Center
Sutherland, Clare A. M.; Oldmeadow, Julian A.; Santos, Isabel M.; Towler, John; Burt, D. Michael; Young, Andrew W.
2013-01-01
Three experiments are presented that investigate the two-dimensional valence/trustworthiness by dominance model of social inferences from faces (Oosterhof & Todorov, 2008). Experiment 1 used image averaging and morphing techniques to demonstrate that consistent facial cues subserve a range of social inferences, even in a highly variable sample of…
NASA Astrophysics Data System (ADS)
Lawrence, C.; Lin, L.; Lisiecki, L. E.; Khider, D.
2014-12-01
The broad goal of this presentation is to demonstrate the utility of probabilistic generative models to capture investigators' knowledge of geological processes and proxy data to draw statistical inferences about unobserved paleoclimatological events. We illustrate how this approach forces investigators to be explicit about their assumptions, and about how probability theory yields results that are a mathematical consequence of these assumptions and the data. We illustrate these ideas with the HMM-Match model that infers common times of sediment deposition in two records and the uncertainty in these inferences in the form of confidence bands. HMM-Match models the sedimentation processes that led to proxy data measured in marine sediment cores. This Bayesian model has three components: 1) a generative probabilistic model that proceeds from the underlying geophysical and geochemical events, specifically the sedimentation events to the generation the proxy data Sedimentation ---> Proxy Data ; 2) a recursive algorithm that reverses the logic of the model to yield inference about the unobserved sedimentation events and the associated alignment of the records based on proxy data Proxy Data ---> Sedimentation (Alignment) ; 3) an expectation maximization algorithm for estimating two unknown parameters. We applied HMM-Match to align 35 Late Pleistocene records to a global benthic d18Ostack and found that the mean width of 95% confidence intervals varies between 3-23 kyr depending on the resolution and noisiness of the core's d18O signal. Confidence bands within individual cores also vary greatly, ranging from ~0 to >40 kyr. Results from this algorithm will allow researchers to examine the robustness of their conclusions with respect to alignment uncertainty. Figure 1 shows the confidence bands for one low resolution record.
Bayesian Inference of High-Dimensional Dynamical Ocean Models
NASA Astrophysics Data System (ADS)
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Young children's use of statistical sampling evidence to infer the subjectivity of preferences.
Ma, Lili; Xu, Fei
2011-09-01
A crucial task in social interaction involves understanding subjective mental states. Here we report two experiments with toddlers exploring whether they can use statistical evidence to infer the subjective nature of preferences. We found that 2-year-olds were likely to interpret another person's nonrandom sampling behavior as a cue for a preference different from their own. When there was no alternative in the population or if the sampling was random, 2-year-olds did not ascribe a preference and persisted in their initial beliefs that the person would share their own preference. We found similar but weaker patterns of responses in 16-month-olds. These results suggest that the ability to infer the subjectivity of preferences based on sampling information begins to emerge between 16 months and 2 years. Our findings provide some of the first evidence that from early in development, young children can use statistical evidence to make rational inferences about the social world. PMID:21353215
Young Children's Use of Statistical Sampling Evidence to Infer the Subjectivity of Preferences
ERIC Educational Resources Information Center
Ma, Lili; Xu, Fei
2011-01-01
A crucial task in social interaction involves understanding subjective mental states. Here we report two experiments with toddlers exploring whether they can use statistical evidence to infer the subjective nature of preferences. We found that 2-year-olds were likely to interpret another person's nonrandom sampling behavior as a cue for a…
Inferring the connectivity of coupled oscillators from time-series statistical similarity analysis
Tirabassi, Giulio; Sevilla-Escoboza, Ricardo; Buldú, Javier M.; Masoller, Cristina
2015-01-01
A system composed by interacting dynamical elements can be represented by a network, where the nodes represent the elements that constitute the system, and the links account for their interactions, which arise due to a variety of mechanisms, and which are often unknown. A popular method for inferring the system connectivity (i.e., the set of links among pairs of nodes) is by performing a statistical similarity analysis of the time-series collected from the dynamics of the nodes. Here, by considering two systems of coupled oscillators (Kuramoto phase oscillators and Rössler chaotic electronic oscillators) with known and controllable coupling conditions, we aim at testing the performance of this inference method, by using linear and non linear statistical similarity measures. We find that, under adequate conditions, the network links can be perfectly inferred, i.e., no mistakes are made regarding the presence or absence of links. These conditions for perfect inference require: i) an appropriated choice of the observed variable to be analysed, ii) an appropriated interaction strength, and iii) an adequate thresholding of the similarity matrix. For the dynamical units considered here we find that the linear statistical similarity measure performs, in general, better than the non-linear ones. PMID:26042395
ERIC Educational Resources Information Center
Sotos, Ana Elisa Castro; Vanhoof, Stijn; Van den Noortgate, Wim; Onghena, Patrick
2007-01-01
A solid understanding of "inferential statistics" is of major importance for designing and interpreting empirical results in any scientific discipline. However, students are prone to many misconceptions regarding this topic. This article structurally summarizes and describes these misconceptions by presenting a systematic review of publications…
Young children use statistical sampling to infer the preferences of other people.
Kushnir, Tamar; Xu, Fei; Wellman, Henry M
2010-08-01
Psychological scientists use statistical information to determine the workings of human behavior. We argue that young children do so as well. Over the course of a few years, children progress from viewing human actions as intentional and goal directed to reasoning about the psychological causes underlying such actions. Here, we show that preschoolers and 20-month-old infants can use statistical information-namely, a violation of random sampling-to infer that an agent is expressing a preference for one type of toy instead of another type of toy. Children saw a person remove five toys of one type from a container of toys. Preschoolers and infants inferred that the person had a preference for that type of toy when there was a mismatch between the sampled toys and the population of toys in the box. Mere outcome consistency, time spent with the toys, and positive attention toward the toys did not lead children to infer a preference. These findings provide an important demonstration of how statistical learning could underpin the rapid acquisition of early psychological knowledge. PMID:20622142
Hupé, Jean-Michel
2015-01-01
Published studies using functional and structural MRI include many errors in the way data are analyzed and conclusions reported. This was observed when working on a comprehensive review of the neural bases of synesthesia, but these errors are probably endemic to neuroimaging studies. All studies reviewed had based their conclusions using Null Hypothesis Significance Tests (NHST). NHST have yet been criticized since their inception because they are more appropriate for taking decisions related to a Null hypothesis (like in manufacturing) than for making inferences about behavioral and neuronal processes. Here I focus on a few key problems of NHST related to brain imaging techniques, and explain why or when we should not rely on "significance" tests. I also observed that, often, the ill-posed logic of NHST was even not correctly applied, and describe what I identified as common mistakes or at least problematic practices in published papers, in light of what could be considered as the very basics of statistical inference. MRI statistics also involve much more complex issues than standard statistical inference. Analysis pipelines vary a lot between studies, even for those using the same software, and there is no consensus which pipeline is the best. I propose a synthetic view of the logic behind the possible methodological choices, and warn against the usage and interpretation of two statistical methods popular in brain imaging studies, the false discovery rate (FDR) procedure and permutation tests. I suggest that current models for the analysis of brain imaging data suffer from serious limitations and call for a revision taking into account the "new statistics" (confidence intervals) logic. PMID:25745383
High-Dimensional Statistical Learning: Roots, Justifications, and Potential Machineries
Zollanvari, Amin
2015-01-01
High-dimensional data generally refer to data in which the number of variables is larger than the sample size. Analyzing such datasets poses great challenges for classical statistical learning because the finite-sample performance of methods developed within classical statistical learning does not live up to classical asymptotic premises in which the sample size unboundedly grows for a fixed dimensionality of observations. Much work has been done in developing mathematical–statistical techniques for analyzing high-dimensional data. Despite remarkable progress in this field, many practitioners still utilize classical methods for analyzing such datasets. This state of affairs can be attributed, in part, to a lack of knowledge and, in part, to the ready-to-use computational and statistical software packages that are well developed for classical techniques. Moreover, many scientists working in a specific field of high-dimensional statistical learning are either not aware of other existing machineries in the field or are not willing to try them out. The primary goal in this work is to bring together various machineries of high-dimensional analysis, give an overview of the important results, and present the operating conditions upon which they are grounded. When appropriate, readers are referred to relevant review articles for more information on a specific subject. PMID:27081307
High-Dimensional Statistical Learning: Roots, Justifications, and Potential Machineries.
Zollanvari, Amin
2015-01-01
High-dimensional data generally refer to data in which the number of variables is larger than the sample size. Analyzing such datasets poses great challenges for classical statistical learning because the finite-sample performance of methods developed within classical statistical learning does not live up to classical asymptotic premises in which the sample size unboundedly grows for a fixed dimensionality of observations. Much work has been done in developing mathematical-statistical techniques for analyzing high-dimensional data. Despite remarkable progress in this field, many practitioners still utilize classical methods for analyzing such datasets. This state of affairs can be attributed, in part, to a lack of knowledge and, in part, to the ready-to-use computational and statistical software packages that are well developed for classical techniques. Moreover, many scientists working in a specific field of high-dimensional statistical learning are either not aware of other existing machineries in the field or are not willing to try them out. The primary goal in this work is to bring together various machineries of high-dimensional analysis, give an overview of the important results, and present the operating conditions upon which they are grounded. When appropriate, readers are referred to relevant review articles for more information on a specific subject. PMID:27081307
Inference in infinite-dimensional inverse problems - Discretization and duality
NASA Technical Reports Server (NTRS)
Stark, Philip B.
1992-01-01
Many techniques for solving inverse problems involve approximating the unknown model, a function, by a finite-dimensional 'discretization' or parametric representation. The uncertainty in the computed solution is sometimes taken to be the uncertainty within the parametrization; this can result in unwarranted confidence. The theory of conjugate duality can overcome the limitations of discretization within the 'strict bounds' formalism, a technique for constructing confidence intervals for functionals of the unknown model incorporating certain types of prior information. The usual computational approach to strict bounds approximates the 'primal' problem in a way that the resulting confidence intervals are at most long enough to have the nominal coverage probability. There is another approach based on 'dual' optimization problems that gives confidence intervals with at least the nominal coverage probability. The pair of intervals derived by the two approaches bracket a correct confidence interval. The theory is illustrated with gravimetric, seismic, geomagnetic, and helioseismic problems and a numerical example in seismology.
Hupé, Jean-Michel
2015-01-01
Published studies using functional and structural MRI include many errors in the way data are analyzed and conclusions reported. This was observed when working on a comprehensive review of the neural bases of synesthesia, but these errors are probably endemic to neuroimaging studies. All studies reviewed had based their conclusions using Null Hypothesis Significance Tests (NHST). NHST have yet been criticized since their inception because they are more appropriate for taking decisions related to a Null hypothesis (like in manufacturing) than for making inferences about behavioral and neuronal processes. Here I focus on a few key problems of NHST related to brain imaging techniques, and explain why or when we should not rely on “significance” tests. I also observed that, often, the ill-posed logic of NHST was even not correctly applied, and describe what I identified as common mistakes or at least problematic practices in published papers, in light of what could be considered as the very basics of statistical inference. MRI statistics also involve much more complex issues than standard statistical inference. Analysis pipelines vary a lot between studies, even for those using the same software, and there is no consensus which pipeline is the best. I propose a synthetic view of the logic behind the possible methodological choices, and warn against the usage and interpretation of two statistical methods popular in brain imaging studies, the false discovery rate (FDR) procedure and permutation tests. I suggest that current models for the analysis of brain imaging data suffer from serious limitations and call for a revision taking into account the “new statistics” (confidence intervals) logic. PMID:25745383
Local dependence in random graph models: characterization, properties and statistical inference
Schweinberger, Michael; Handcock, Mark S.
2015-01-01
Summary Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with ‘ground truth’. PMID:26560142
Inferring biological tasks using Pareto analysis of high-dimensional data.
Hart, Yuval; Sheftel, Hila; Hausser, Jean; Szekely, Pablo; Ben-Moshe, Noa Bossel; Korem, Yael; Tendler, Avichai; Mayo, Avraham E; Alon, Uri
2015-03-01
We present the Pareto task inference method (ParTI; http://www.weizmann.ac.il/mcb/UriAlon/download/ParTI) for inferring biological tasks from high-dimensional biological data. Data are described as a polytope, and features maximally enriched closest to the vertices (or archetypes) allow identification of the tasks the vertices represent. We demonstrate that human breast tumors and mouse tissues are well described by tetrahedrons in gene expression space, with specific tumor types and biological functions enriched at each of the vertices, suggesting four key tasks. PMID:25622107
A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.
Zheng, Chaojie; Wang, Xiuying; Feng, Dagan
2015-01-01
PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating tumor boundary from PET are yet to develop largely due to relatively low quality of PET images, uncertain tumor boundary definition, and variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We firstly define the uncertainty segmentation band on the basis of segmentation probability map constructed from Random Walks (RW) algorithm; and then based on the extracted features of the user inference, we use Principle Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies where tumors were manually delineated by two experienced radiologists. The methods were validated using Dice similarity coefficient (DSC) to measure the spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on phantom studies and 0.835 ± 0.039 on clinical studies. PMID:26736741
Statistical entropy of charged two-dimensional black holes
NASA Astrophysics Data System (ADS)
Teo, Edward
1998-06-01
The statistical entropy of a five-dimensional black hole in Type II string theory was recently derived by showing that it is U-dual to the three-dimensional Bañados-Teitelboim-Zanelli black hole, and using Carlip's method to count the microstates of the latter. This is valid even for the non-extremal case, unlike the derivation which relies on D-brane techniques. In this letter, I shall exploit the U-duality that exists between the five-dimensional black hole and the two-dimensional charged black hole of McGuigan, Nappi and Yost, to microscopically compute the entropy of the latter. It is shown that this result agrees with previous calculations using thermodynamic arguments.
NASA Technical Reports Server (NTRS)
Lerner, Jeffrey A.; Jedlovec, Gary J.; Atkinson, Robert J.
1998-01-01
Ever since the first satellite image loops from the 6.3 micron water vapor channel on the METEOSAT-1 in 1978, there have been numerous efforts (many to a great degree of success) to relate the water vapor radiance patterns to familiar atmospheric dynamic quantities. The realization of these efforts is becoming evident with the merging of satellite derived winds into predictive models (Velden et al., 1997; Swadley and Goerss, 1989). Another parameter that has been quantified from satellite water vapor channel measurements is upper tropospheric relative humidity (UTH) (e.g., Soden and Bretherton, 1996; Schmetz and Turpeinen, 1988). These humidity measurements, in turn, can be used to quantify upper tropospheric water vapor and its transport to more accurately diagnose climate changes (Lerner et al., 1998; Schmetz et al. 1995a) and quantify radiative processes in the upper troposphere. Also apparent in water vapor imagery animations are regions of subsiding and ascending air flow. Indeed, a component of the translated motions we observe are due to vertical velocities. The few attempts at exploiting this information have been met with a fair degree of success. Picon and Desbois (1990) statistically related Meteosat monthly mean water vapor radiances to six standard pressure levels of the European Centre for Medium Range Weather Forecast (ECMWF) model vertical velocities and found correlation coefficients of about 0.50 or less. This paper presents some preliminary results of viewing climatological satellite water vapor data in a different fashion. Specifically, we attempt to infer the three dimensional flow characteristics of the mid- to upper troposphere as portrayed by GOES VAS during the warm ENSO event (1987) and a subsequent cold period in 1998.
Inferences on weather extremes and weather-related disasters: a review of statistical methods
NASA Astrophysics Data System (ADS)
Visser, H.; Petersen, A. C.
2012-02-01
The study of weather extremes and their impacts, such as weather-related disasters, plays an important role in research of climate change. Due to the great societal consequences of extremes - historically, now and in the future - the peer-reviewed literature on this theme has been growing enormously since the 1980s. Data sources have a wide origin, from century-long climate reconstructions from tree rings to relatively short (30 to 60 yr) databases with disaster statistics and human impacts. When scanning peer-reviewed literature on weather extremes and its impacts, it is noticeable that many different methods are used to make inferences. However, discussions on these methods are rare. Such discussions are important since a particular methodological choice might substantially influence the inferences made. A calculation of a return period of once in 500 yr, based on a normal distribution will deviate from that based on a Gumbel distribution. And the particular choice between a linear or a flexible trend model might influence inferences as well. In this article, a concise overview of statistical methods applied in the field of weather extremes and weather-related disasters is given. Methods have been evaluated as to stationarity assumptions, the choice for specific probability density functions (PDFs) and the availability of uncertainty information. As for stationarity assumptions, the outcome was that good testing is essential. Inferences on extremes may be wrong if data are assumed stationary while they are not. The same holds for the block-stationarity assumption. As for PDF choices it was found that often more than one PDF shape fits to the same data. From a simulation study the conclusion can be drawn that both the generalized extreme value (GEV) distribution and the log-normal PDF fit very well to a variety of indicators. The application of the normal and Gumbel distributions is more limited. As for uncertainty, it is advisable to test conclusions on extremes
On statistical inference in time series analysis of the evolution of road safety.
Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora
2013-11-01
Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include annual (or monthly) number of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations to the applicability of standard methods of statistical inference, which leads to an under or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating the accident-occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. PMID:23260716
Statistical mechanics of two-dimensional and geophysical flows
NASA Astrophysics Data System (ADS)
Bouchet, Freddy; Venaille, Antoine
2012-06-01
The theoretical study of the self-organization of two-dimensional and geophysical turbulent flows is addressed based on statistical mechanics methods. This review is a self-contained presentation of classical and recent works on this subject; from the statistical mechanics basis of the theory up to applications to Jupiter’s troposphere and ocean vortices and jets. Emphasize has been placed on examples with available analytical treatment in order to favor better understanding of the physics and dynamics. After a brief presentation of the 2D Euler and quasi-geostrophic equations, the specificity of two-dimensional and geophysical turbulence is emphasized. The equilibrium microcanonical measure is built from the Liouville theorem. Important statistical mechanics concepts (large deviations and mean field approach) and thermodynamic concepts (ensemble inequivalence and negative heat capacity) are briefly explained and described. On this theoretical basis, we predict the output of the long time evolution of complex turbulent flows as statistical equilibria. This is applied to make quantitative models of two-dimensional turbulence, the Great Red Spot and other Jovian vortices, ocean jets like the Gulf-Stream, and ocean vortices. A detailed comparison between these statistical equilibria and real flow observations is provided. We also present recent results for non-equilibrium situations, for the studies of either the relaxation towards equilibrium or non-equilibrium steady states. In this last case, forces and dissipation are in a statistical balance; fluxes of conserved quantity characterize the system and microcanonical or other equilibrium measures no longer describe the system.
Social inferences from faces: ambient images generate a three-dimensional model.
Sutherland, Clare A M; Oldmeadow, Julian A; Santos, Isabel M; Towler, John; Michael Burt, D; Young, Andrew W
2013-04-01
Three experiments are presented that investigate the two-dimensional valence/trustworthiness by dominance model of social inferences from faces (Oosterhof & Todorov, 2008). Experiment 1 used image averaging and morphing techniques to demonstrate that consistent facial cues subserve a range of social inferences, even in a highly variable sample of 1000 ambient images (images that are intended to be representative of those encountered in everyday life, see Jenkins, White, Van Montfort, & Burton, 2011). Experiment 2 then tested Oosterhof and Todorov's two-dimensional model on this extensive sample of face images. The original two dimensions were replicated and a novel 'youthful-attractiveness' factor also emerged. Experiment 3 successfully cross-validated the three-dimensional model using face averages directly constructed from the factor scores. These findings highlight the utility of the original trustworthiness and dominance dimensions, but also underscore the need to utilise varied face stimuli: with a more realistically diverse set of face images, social inferences from faces show a more elaborate underlying structure than hitherto suggested. PMID:23376296
Statistical Properties of Decaying Two-Dimensional Turbulence
NASA Astrophysics Data System (ADS)
Nakamura, Kenshi; Takahashi, Takehiro; Nakano, Tohru
1993-04-01
We investigate the temporal development of the statistical properties of two-dimensional incompressible turbulence simulated for a long time. First, we obtain information on the evolving microscopic vortical structure by inspecting the time variation of qth order fractal dimensions of the enstrophy dissipation rate. The conclusion drawn from such an inspection is consistent with a picture given by Kida (J. Phys. Soc. Jpn. 54 (1985) 2840); in the first stage the \
n-dimensional Statistical Inverse Graphical Hydraulic Test Simulator
2012-09-12
nSIGHTS (n-dimensional Statistical Inverse Graphical Hydraulic Test Simulator) is a comprehensive well test analysis software package. It provides a user-interface, a well test analysis model and many tools to analyze both field and simulated data. The well test analysis model simulates a single-phase, one-dimensional, radial/non-radial flow regime, with a borehole at the center of the modeled flow system. nSIGHTS solves the radially symmetric n-dimensional forward flow problem using a solver based on a graph-theoretic approach. The results of the forward simulation are pressure, and flow rate, given all the input parameters. The parameter estimation portion of nSIGHTS uses a perturbation-based approach to interpret the best-fit well and reservoir parameters, given an observed dataset of pressure and flow rate.
n-dimensional Statistical Inverse Graphical Hydraulic Test Simulator
Energy Science and Technology Software Center (ESTSC)
2012-09-12
nSIGHTS (n-dimensional Statistical Inverse Graphical Hydraulic Test Simulator) is a comprehensive well test analysis software package. It provides a user-interface, a well test analysis model and many tools to analyze both field and simulated data. The well test analysis model simulates a single-phase, one-dimensional, radial/non-radial flow regime, with a borehole at the center of the modeled flow system. nSIGHTS solves the radially symmetric n-dimensional forward flow problem using a solver based on a graph-theoretic approach.more » The results of the forward simulation are pressure, and flow rate, given all the input parameters. The parameter estimation portion of nSIGHTS uses a perturbation-based approach to interpret the best-fit well and reservoir parameters, given an observed dataset of pressure and flow rate.« less
Staude, Benjamin; Grün, Sonja; Rotter, Stefan
2009-01-01
The extent to which groups of neurons exhibit higher-order correlations in their spiking activity is a controversial issue in current brain research. A major difficulty is that currently available tools for the analysis of massively parallel spike trains (N >10) for higher-order correlations typically require vast sample sizes. While multiple single-cell recordings become increasingly available, experimental approaches to investigate the role of higher-order correlations suffer from the limitations of available analysis techniques. We have recently presented a novel method for cumulant-based inference of higher-order correlations (CuBIC) that detects correlations of higher order even from relatively short data stretches of length T = 10–100 s. CuBIC employs the compound Poisson process (CPP) as a statistical model for the population spike counts, and assumes spike trains to be stationary in the analyzed data stretch. In the present study, we describe a non-stationary version of the CPP by decoupling the correlation structure from the spiking intensity of the population. This allows us to adapt CuBIC to time-varying firing rates. Numerical simulations reveal that the adaptation corrects for false positive inference of correlations in data with pure rate co-variation, while allowing for temporal variations of the firing rates has a surprisingly small effect on CuBICs sensitivity for correlations. PMID:20725510
Feldman, Naomi H.; Griffiths, Thomas L.; Morgan, James L.
2009-01-01
A variety of studies have demonstrated that organizing stimuli into categories can affect the way the stimuli are perceived. We explore the influence of categories on perception through one such phenomenon, the perceptual magnet effect, in which discriminability between vowels is reduced near prototypical vowel sounds. We present a Bayesian model to explain why this reduced discriminability might occur: it arises as a consequence of optimally solving the statistical problem of perception in noise. In the optimal solution to this problem, listeners’ perception is biased toward phonetic category means because they use knowledge of these categories to guide their inferences about speakers’ target productions. Simulations show that model predictions closely correspond to previously published human data, and novel experimental results provide evidence for the predicted link between perceptual warping and noise. The model unifies several previous accounts of the perceptual magnet effect and provides a framework for exploring categorical effects in other domains. PMID:19839683
McDonald, L.L.; Erickson, W.P.; Strickland, M.D.
1995-12-31
The objective of the Coastal Habitat Injury Assessment study was to document and quantify injury to biota of the shallow subtidal, intertidal, and supratidal zones throughout the shoreline affected by oil or cleanup activity associated with the Exxon Valdez oil spill. The results of these studies were to be used to support the Trustee`s Type B Natural Resource Damage Assessment under the Comprehensive Environmental Response, Compensation, and Liability Act of 1980 (CERCLA). A probability based stratified random sample of shoreline segments was selected with probability proportional to size from each of 15 strata (5 habitat types crossed with 3 levels of potential oil impact) based on those data available in July, 1989. Three study regions were used: Prince William Sound, Cook Inlet/Kenai Peninsula, and Kodiak/Alaska Peninsula. A Geographic Information System was utilized to combine oiling and habitat data and to select the probability sample of study sites. Quasi-experiments were conducted where randomly selected oiled sites were compared to matched reference sites. Two levels of statistical inferences, philosophical bases, and limitations are discussed and illustrated with example data from the resulting studies. 25 refs., 4 figs., 1 tab.
Gaggiotti, Oscar E
2010-11-01
Ever since the introduction of allozymes in the 1960s, evolutionary biologists and ecologists have continued to search for more powerful molecular markers to estimate important parameters such as effective population size and migration rates and to make inferences about the demographic history of populations, the relationships between individuals and the genetic architecture of phenotypic variation (Bensch & Akesson 2005; Bonin et al. 2007). Choosing a marker requires a thorough consideration of the trade-offs associated with the different techniques and the type of data obtained from them. Some markers can be very informative but require substantial amounts of start-up time (e.g. microsatellites), while others require very little time but are much less polymorphic. Amplified fragment length polymorphism (AFLP) is a firmly established molecular marker technique that falls in this latter category. AFLPs are widely distributed throughout the genome and can be used on organisms for which there is no a priori sequence information (Meudt & Clarke 2007). These properties together with their moderate cost and short start-up time have made them the method of choice for many molecular ecology studies of wild species (Bensch & Akesson 2005). However, they have a major disadvantage, they are dominant. This represents a very important limitation because many statistical genetics methods appropriate for molecular ecology studies require the use of codominant markers. In this issue, Foll et al. (2010) present an innovative hierarchical Bayesian method that overcomes this limitation. The proposed approach represents a comprehensive statistical treatment of the fluorescence of AFLP bands and leads to accurate inferences about the genetic structure of natural populations. Besides allowing a quasi-codominant treatment of AFLPs, this new method also solves the difficult problems posed by subjectivity in the scoring of AFLP bands. PMID:20958811
Quantum Statistical Entropy of Five-Dimensional Black Hole
NASA Astrophysics Data System (ADS)
Zhao, Ren; Wu, Yue-Qin; Zhang, Sheng-Li
2006-05-01
The generalized uncertainty relation is introduced to calculate quantum statistic entropy of a black hole. By using the new equation of state density motivated by the generalized uncertainty relation, we discuss entropies of Bose field and Fermi field on the background of the five-dimensional spacetime. In our calculation, we need not introduce cutoff. There is not the divergent logarithmic term as in the original brick-wall method. And it is obtained that the quantum statistic entropy corresponding to black hole horizon is proportional to the area of the horizon. Further it is shown that the entropy of black hole is the entropy of quantum state on the surface of horizon. The black hole's entropy is the intrinsic property of the black hole. The entropy is a quantum effect. It makes people further understand the quantum statistic entropy.
Statistical mechanics of shell models for two-dimensional turbulence
NASA Astrophysics Data System (ADS)
Aurell, E.; Boffetta, G.; Crisanti, A.; Frick, P.; Paladin, G.; Vulpiani, A.
1994-12-01
We study shell models that conserve the analogs of energy and enstrophy and hence are designed to mimic fluid turbulence in two-dimensions (2D). The main result is that the observed state is well described as a formal statistical equilibrium, closely analogous to the approach to two-dimensional ideal hydrodynamics of Onsager [Nuovo Cimento Suppl. 6, 279 (1949)], Hopf [J. Rat. Mech. Anal. 1, 87 (1952)], and Lee [Q. Appl. Math. 10, 69 (1952)]. In the presence of forcing and dissipation we observe a forward flux of enstrophy and a backward flux of energy. These fluxes can be understood as mean diffusive drifts from a source to two sinks in a system which is close to local equilibrium with Lagrange multipliers (``shell temperatures'') changing slowly with scale. This is clear evidence that the simplest shell models are not adequate to reproduce the main features of two-dimensional turbulence. The dimensional predictions on the power spectra from a supposed forward cascade of enstrophy and from one branch of the formal statistical equilibrium coincide in these shell models in contrast to the corresponding predictions for the Navier-Stokes and Euler equations in 2D. This coincidence has previously led to the mistaken conclusion that shell models exhibit a forward cascade of enstrophy. We also study the dynamical properties of the models and the growth of perturbations.
Sex, lies, and statistics: inferences from the child sexual abuse accommodation syndrome.
Weiss, Kenneth J; Curcio Alexander, Julia
2013-01-01
Victims of child sexual abuse often recant their complaints or do not report incidents, making prosecution of offenders difficult. The child with sexual abuse accommodation syndrome (CSAAS) has been used to explain this phenomenon by identifying common behavioral responses. Unlike PTSD but like rape trauma syndrome, CSAAS is not an official diagnostic term and should not be used as evidence of a defendant's guilt or to imply probative value in prosecutions. Courts have grappled with the ideal use of CSAAS in the evaluation of child witness testimony. Expert testimony should be helpful to the jurors without prejudicing them. The New Jersey Supreme Court ruled recently that statistical evidence about CSAAS implying the probability that a child is truthful runs the risk of confusing jury members and biasing them against the defendant. We review the parameters of expert testimony and its admissibility in this area, concluding that statistics about CSAAS should not be used to draw inferences about the victim's credibility or the defendant's guilt. PMID:24051595
Univariate description and bivariate statistical inference: the first step delving into data
2016-01-01
In observational studies, the first step is usually to explore data distribution and the baseline differences between groups. Data description includes their central tendency (e.g., mean, median, and mode) and dispersion (e.g., standard deviation, range, interquartile range). There are varieties of bivariate statistical inference methods such as Student’s t-test, Mann-Whitney U test and Chi-square test, for normal, skews and categorical data, respectively. The article shows how to perform these analyses with R codes. Furthermore, I believe that the automation of the whole workflow is of paramount importance in that (I) it allows for others to repeat your results; (II) you can easily find out how you performed analysis during revision; (III) it spares data input by hand and is less error-prone; and (IV) when you correct your original dataset, the final result can be automatically corrected by executing the codes. Therefore, the process of making a publication quality table incorporating all abovementioned statistics and P values is provided, allowing readers to customize these codes to their own needs. PMID:27047950
Validi, AbdoulAhad
2014-03-01
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.
NASA Astrophysics Data System (ADS)
Morozov, Alexandre
2009-03-01
Formation of nucleosome core particles is a first step towards packaging genomic DNA into chromosomes in living cells. Nucleosomes are formed by wrapping 147 base pairs of DNA around a spool of eight histone proteins. It is reasonable to assume that formation of single nucleosomes in vitro is determined by DNA sequence alone: it costs less elastic energy to wrap a flexible DNA polymer around the histone octamer, and more if the polymer is rigid. However, it is unclear to which extent this effect is important in living cells. Cells have evolved chromatin remodeling enzymes that expend ATP to actively reposition nucleosomes. In addition, nucleosome positioning on long DNA sequences is affected by steric exclusion - many nucleosomes have to form simultaneously without overlap. Currently available bioinformatics methods for predicting nucleosome positions are trained on in vivo data sets and are thus unable to distinguish between extrinsic and intrinsic nucleosome positioning signals. In order to see the relative importance of such signals for nucleosome positioning in vivo, we have developed a model based on a large collection of DNA sequences from nucleosomes reconstituted in vitro by salt dialysis. We have used these data to infer the free energy of nucleosome formation at each position along the genome. The method uses an exact result from the statistical mechanics of classical 1D fluids to infer the free energy landscape from nucleosome occupancy. We will discuss the degree to which in vitro nucleosome occupancy profiles are predictive of in vivo nucleosome positions, and will estimate how many nucleosomes are sequence-specific and how many are positioned purely by steric exclusion. Our approach to nucleosome energetics should be applicable across multiple organisms and genomic regions.
Statistical Downscaling in Multi-dimensional Wave Climate Forecast
NASA Astrophysics Data System (ADS)
Camus, P.; Méndez, F. J.; Medina, R.; Losada, I. J.; Cofiño, A. S.; Gutiérrez, J. M.
2009-04-01
Wave climate at a particular site is defined by the statistical distribution of sea state parameters, such as significant wave height, mean wave period, mean wave direction, wind velocity, wind direction and storm surge. Nowadays, long-term time series of these parameters are available from reanalysis databases obtained by numerical models. The Self-Organizing Map (SOM) technique is applied to characterize multi-dimensional wave climate, obtaining the relevant "wave types" spanning the historical variability. This technique summarizes multi-dimension of wave climate in terms of a set of clusters projected in low-dimensional lattice with a spatial organization, providing Probability Density Functions (PDFs) on the lattice. On the other hand, wind and storm surge depend on instantaneous local large-scale sea level pressure (SLP) fields while waves depend on the recent history of these fields (say, 1 to 5 days). Thus, these variables are associated with large-scale atmospheric circulation patterns. In this work, a nearest-neighbors analog method is used to predict monthly multi-dimensional wave climate. This method establishes relationships between the large-scale atmospheric circulation patterns from numerical models (SLP fields as predictors) with local wave databases of observations (monthly wave climate SOM PDFs as predictand) to set up statistical models. A wave reanalysis database, developed by Puertos del Estado (Ministerio de Fomento), is considered as historical time series of local variables. The simultaneous SLP fields calculated by NCEP atmospheric reanalysis are used as predictors. Several applications with different size of sea level pressure grid and with different temporal domain resolution are compared to obtain the optimal statistical model that better represents the monthly wave climate at a particular site. In this work we examine the potential skill of this downscaling approach considering perfect-model conditions, but we will also analyze the
Emmert-Streib, Frank; Glazko, Galina V.; Altay, Gökmen; de Matos Simoes, Ricardo
2012-01-01
In this paper, we present a systematic and conceptual overview of methods for inferring gene regulatory networks from observational gene expression data. Further, we discuss two classic approaches to infer causal structures and compare them with contemporary methods by providing a conceptual categorization thereof. We complement the above by surveying global and local evaluation measures for assessing the performance of inference algorithms. PMID:22408642
Lagrangian statistics in forced two-dimensional turbulence
NASA Astrophysics Data System (ADS)
Kamps, Oliver; Friedrich, Rudolf
2007-11-01
In recent years the Lagrangian description of turbulent flows has attracted much interest from the experimental point of view and as well is in the focus of numerical and analytical investigations. We present detailed numerical investigations of Lagrangian tracer particles in the inverse energy cascade of two-dimensional turbulence. In the first part we focus on the shape and scaling properties of the probability distribution functions for the velocity increments and compare them to the Eulerian case and the increment statistics in three dimensions. Motivated by our observations we address the important question of translating increment statistics from one frame of reference to the other [1]. To reveal the underlying physical mechanism we determine numerically the involved transition probabilities. In this way we shed light on the source of Lagrangian intermittency.[1ex] [1] R. Friedrich, R. Grauer, H. Hohmann, O. Kamps, A Corrsin type approximation for Lagrangian fluid Turbulence , arXiv:0705.3132
Inference and Decoding of Motor Cortex Low-Dimensional Dynamics via Latent State-Space Models.
Aghagolzadeh, Mehdi; Truccolo, Wilson
2016-02-01
Motor cortex neuronal ensemble spiking activity exhibits strong low-dimensional collective dynamics (i.e., coordinated modes of activity) during behavior. Here, we demonstrate that these low-dimensional dynamics, revealed by unsupervised latent state-space models, can provide as accurate or better reconstruction of movement kinematics as direct decoding from the entire recorded ensemble. Ensembles of single neurons were recorded with triple microelectrode arrays (MEAs) implanted in ventral and dorsal premotor (PMv, PMd) and primary motor (M1) cortices while nonhuman primates performed 3-D reach-to-grasp actions. Low-dimensional dynamics were estimated via various types of latent state-space models including, for example, Poisson linear dynamic system (PLDS) models. Decoding from low-dimensional dynamics was implemented via point process and Kalman filters coupled in series. We also examined decoding based on a predictive subsampling of the recorded population. In this case, a supervised greedy procedure selected neuronal subsets that optimized decoding performance. When comparing decoding based on predictive subsampling and latent state-space models, the size of the neuronal subset was set to the same number of latent state dimensions. Overall, our findings suggest that information about naturalistic reach kinematics present in the recorded population is preserved in the inferred low-dimensional motor cortex dynamics. Furthermore, decoding based on unsupervised PLDS models may also outperform previous approaches based on direct decoding from the recorded population or on predictive subsampling. PMID:26336135
Brannigan, V.M.; Bier, V.M.; Berg, C.
1992-09-01
Toxic torts are product liability cases dealing with alleged injuries due to chemical or biological hazards such as radiation, thalidomide, or Agent Orange. Toxic tort cases typically rely more heavily that other product liability cases on indirect or statistical proof of injury in toxic cases. However, there have been only a handful of actual legal decisions regarding the use of such statistical evidence, and most of those decisions have been inconclusive. Recently, a major case from the Fifth Circuit, involving allegations that Benedectin (a morning sickness drug) caused birth defects, was decided entirely on the basis of statistical inference. This paper examines both the conceptual basis of that decision, and also the relationships among statistical inference, scientific evidence, and the rules of product liability in general. 23 refs.
Conn, Paul B.; Johnson, Devin S.; Ver Hoef, Jay M.; Hooten, Mevin B.; London, Joshua M.; Boveng, Peter L.
2015-01-01
Ecologists often fit models to survey data to estimate and explain variation in animal abundance. Such models typically require that animal density remains constant across the landscape where sampling is being conducted, a potentially problematic assumption for animals inhabiting dynamic landscapes or otherwise exhibiting considerable spatiotemporal variation in density. We review several concepts from the burgeoning literature on spatiotemporal statistical models, including the nature of the temporal structure (i.e., descriptive or dynamical) and strategies for dimension reduction to promote computational tractability. We also review several features as they specifically relate to abundance estimation, including boundary conditions, population closure, choice of link function, and extrapolation of predicted relationships to unsampled areas. We then compare a suite of novel and existing spatiotemporal hierarchical models for animal count data that permit animal density to vary over space and time, including formulations motivated by resource selection and allowing for closed populations. We gauge the relative performance (bias, precision, computational demands) of alternative spatiotemporal models when confronted with simulated and real data sets from dynamic animal populations. For the latter, we analyze spotted seal (Phoca largha) counts from an aerial survey of the Bering Sea where the quantity and quality of suitable habitat (sea ice) changed dramatically while surveys were being conducted. Simulation analyses suggested that multiple types of spatiotemporal models provide reasonable inference (low positive bias, high precision) about animal abundance, but have potential for overestimating precision. Analysis of spotted seal data indicated that several model formulations, including those based on a log-Gaussian Cox process, had a tendency to overestimate abundance. By contrast, a model that included a population closure assumption and a scale prior on total
Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels
Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J.
2014-01-01
This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively “hiding” its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research. PMID:25505378
Statistical inference of the time-varying structure of gene-regulation networks
2010-01-01
Background Biological networks are highly dynamic in response to environmental and physiological cues. This variability is in contrast to conventional analyses of biological networks, which have overwhelmingly employed static graph models which stay constant over time to describe biological systems and their underlying molecular interactions. Methods To overcome these limitations, we propose here a new statistical modelling framework, the ARTIVA formalism (Auto Regressive TIme VArying models), and an associated inferential procedure that allows us to learn temporally varying gene-regulation networks from biological time-course expression data. ARTIVA simultaneously infers the topology of a regulatory network and how it changes over time. It allows us to recover the chronology of regulatory associations for individual genes involved in a specific biological process (development, stress response, etc.). Results We demonstrate that the ARTIVA approach generates detailed insights into the function and dynamics of complex biological systems and exploits efficiently time-course data in systems biology. In particular, two biological scenarios are analyzed: the developmental stages of Drosophila melanogaster and the response of Saccharomyces cerevisiae to benomyl poisoning. Conclusions ARTIVA does recover essential temporal dependencies in biological systems from transcriptional data, and provide a natural starting point to learn and investigate their dynamics in greater detail. PMID:20860793
Hung, H M James; Wang, Sue-Jane; O'Neill, Robert
2005-02-01
Without a placebo arm, any non-inferiority inference involving assessment of the placebo effect under the active control trial setting is difficult. The statistical risk for falsely concluding non-inferiority cannot be evaluated unless the constancy assumption approximately holds that the effect of the active control under the historical trial setting where the control effect can be assessed carries to the noninferiority trial setting. The constancy assumption cannot be checked because of missing the placebo arm in the non-inferiority trial. Depending on how serious the violation of the assumption is thought to be, one may need to seek an alternative design strategy that includes a cushion for a very conservative non-inferiority analysis or shows superiority of the experimental treatment over the control. Determination of the non-inferiority margin depends on what objective the non-inferiority analysis is intended to achieve. The margin can be a fixed margin or a margin functionally defined. Between-trial differences always exist and need to be properly considered. PMID:16395994
You, Jinhong; Zhou, Haibo
2009-01-01
We consider statistical inference on a regression model in which some covariables are measured with errors together with an auxiliary variable. The proposed estimation for the regression coefficients is based on some estimating equations. This new method alleates some drawbacks of previously proposed estimations. This includes the requirment of undersmoothing the regressor functions over the auxiliary variable, the restriction on other covariables which can be observed exactly, among others. The large sample properties of the proposed estimator are established. We further propose a jackknife estimation, which consists of deleting one estimating equation (instead of one obervation) at a time. We show that the jackknife estimator of the regression coefficients and the estimating equations based estimator are asymptotically equivalent. Simulations show that the jackknife estimator has smaller biases when sample size is small or moderate. In addition, the jackknife estimation can also provide a consistent estimator of the asymptotic covariance matrix, which is robust to the heteroscedasticity. We illustrate these methods by applying them to a real data set from marketing science. PMID:22199460
Lagrangian statistics in weakly forced two-dimensional turbulence.
Rivera, Michael K; Ecke, Robert E
2016-01-01
Measurements of Lagrangian single-point and multiple-point statistics in a quasi-two-dimensional stratified layer system are reported. The system consists of a layer of salt water over an immiscible layer of Fluorinert and is forced electromagnetically so that mean-squared vorticity is injected at a well-defined spatial scale ri. Simultaneous cascades develop in which enstrophy flows predominately to small scales whereas energy cascades, on average, to larger scales. Lagrangian correlations and one- and two-point displacements are measured for random initial conditions and for initial positions within topological centers and saddles. Some of the behavior of these quantities can be understood in terms of the trapping characteristics of long-lived centers, the slow motion near strong saddles, and the rapid fluctuations outside of either centers or saddles. We also present statistics of Lagrangian velocity fluctuations using energy spectra in frequency space and structure functions in real space. We compare with complementary Eulerian velocity statistics. We find that simultaneous inverse energy and enstrophy ranges present in spectra are not directly echoed in real-space moments of velocity difference. Nevertheless, the spectral ranges line up well with features of moment ratios, indicating that although the moments are not exhibiting unambiguous scaling, the behavior of the probability distribution functions is changing over short ranges of length scales. Implications for understanding weakly forced 2D turbulence with simultaneous inverse and direct cascades are discussed. PMID:26826855
NASA Astrophysics Data System (ADS)
Riezler, Stefan
2000-08-01
In this thesis, we present two approaches to a rigorous mathematical and algorithmic foundation of quantitative and statistical inference in constraint-based natural language processing. The first approach, called quantitative constraint logic programming, is conceptualized in a clear logical framework, and presents a sound and complete system of quantitative inference for definite clauses annotated with subjective weights. This approach combines a rigorous formal semantics for quantitative inference based on subjective weights with efficient weight-based pruning for constraint-based systems. The second approach, called probabilistic constraint logic programming, introduces a log-linear probability distribution on the proof trees of a constraint logic program and an algorithm for statistical inference of the parameters and properties of such probability models from incomplete, i.e., unparsed data. The possibility of defining arbitrary properties of proof trees as properties of the log-linear probability model and efficiently estimating appropriate parameter values for them permits the probabilistic modeling of arbitrary context-dependencies in constraint logic programs. The usefulness of these ideas is evaluated empirically in a small-scale experiment on finding the correct parses of a constraint-based grammar. In addition, we address the problem of computational intractability of the calculation of expectations in the inference task and present various techniques to approximately solve this task. Moreover, we present an approximate heuristic technique for searching for the most probable analysis in probabilistic constraint logic programs.
Convertino, Matteo; Mangoubi, Rami S.; Linkov, Igor; Lowry, Nathan C.; Desai, Mukund
2012-01-01
Shannon entropy of pixel intensity.To test our approach, we specifically use the green band of Landsat images for a water conservation area in the Florida Everglades. We validate our predictions against data of species occurrences for a twenty-eight years long period for both wet and dry seasons. Our method correctly predicts 73% of species richness. For species turnover, the newly proposed KL divergence prediction performance is near 100% accurate. This represents a significant improvement over the more conventional Shannon entropy difference, which provides 85% accuracy. Furthermore, we find that changes in soil and water patterns, as measured by fluctuations of the Shannon entropy for the red and blue bands respectively, are positively correlated with changes in vegetation. The fluctuations are smaller in the wet season when compared to the dry season. Conclusions/Significance Texture-based statistical multiresolution image analysis is a promising method for quantifying interseasonal differences and, consequently, the degree to which vegetation, soil, and water patterns vary. The proposed automated method for quantifying species richness and turnover can also provide analysis at higher spatial and temporal resolution than is currently obtainable from expensive monitoring campaigns, thus enabling more prompt, more cost effective inference and decision making support regarding anomalous variations in biodiversity. Additionally, a matrix-based visualization of the statistical multiresolution analysis is presented to facilitate both insight and quick recognition of anomalous data. PMID:23115629
Wallace, D L; Perlman, M D
1980-06-01
This report describes the research activities of the Department of Statistics, University of Chicago, during the period June 15, 1975 to July 30, 1979. Nine research projects are briefly described on the following subjects: statistical computing and approximation techniques in statistics; numerical computation of first passage distributions; probabilities of large deviations; combining independent tests of significance; small-sample efficiencies of tests and estimates; improved procedures for simultaneous estimation and testing of many correlations; statistical computing and improved regression methods; comparison of several populations; and unbiasedness in multivariate statistics. A description of the statistical consultation activities of the Department that are of interest to DOE, in particular, the scientific interactions between the Department and the scientists at Argonne National Laboratories, is given. A list of publications issued during the term of the contract is included.
Happ, Martin; Harrar, Solomon W; Bathke, Arne C
2016-07-01
We propose tests for main and simple treatment effects, time effects, as well as treatment by time interactions in possibly high-dimensional multigroup repeated measures designs. The proposed inference procedures extend the work by Brunner et al. (2012) from two to several treatment groups and remain valid for unbalanced data and under unequal covariance matrices. In addition to showing consistency when sample size and dimension tend to infinity at the same rate, we provide finite sample approximations and evaluate their performance in a simulation study, demonstrating better maintenance of the nominal α-level than the popular Box-Greenhouse-Geisser and Huynh-Feldt methods, and a gain in power for informatively increasing dimension. Application is illustrated using electroencephalography (EEG) data from a neurological study involving patients with Alzheimer's disease and other cognitive impairments. PMID:26700536
Simpson, Helen Blair; Petkova, Eva; Cheng, Jianfeng; Huppert, Jonathan; Foa, Edna; Liebowitz, Michael R
2008-07-01
Longitudinal clinical trials in psychiatry have used various statistical methods to examine treatment effects. The validity of the inferences depends upon the different method's assumptions and whether a given study violates those assumptions. The objective of this paper was to elucidate these complex issues by comparing various methods for handling missing data (e.g., last observation carried forward [LOCF], completer analysis, propensity-adjusted multiple imputation) and for analyzing outcome (e.g., end-point analysis, repeated-measures analysis of variance [RM-ANOVA], mixed-effects models [MEMs]) using data from a multi-site randomized controlled trial in obsessive-compulsive disorder (OCD). The trial compared the effects of 12 weeks of exposure and ritual prevention (EX/RP), clomipramine (CMI), their combination (EX/RP&CMI) or pill placebo in 122 adults with OCD. The primary outcome measure was the Yale-Brown Obsessive Compulsive Scale. For most comparisons, inferences about the relative efficacy of the different treatments were impervious to different methods for handling missing data and analyzing outcome. However, when EX/RP was compared to CMI and when CMI was compared to placebo, traditional methods (e.g., LOCF, RM-ANOVA) led to different inferences than currently recommended alternatives (e.g., multiple imputation based on estimation-maximization algorithm, MEMs). Thus, inferences about treatment efficacy can be affected by statistical choices. This is most likely when there are small but potentially clinically meaningful treatment differences and when sample sizes are modest. The use of appropriate statistical methods in psychiatric trials can advance public health by ensuring that valid inferences are made about treatment efficacy. PMID:17892885
Vorticity statistics of bounded two-dimensional turbulence
NASA Astrophysics Data System (ADS)
Clercx, H. J. H.; Molenaar, D.; van Heijst, Gertjan
2004-11-01
Vorticity statistics play an important role in the determination of small-scale dynamics in forced two-dimensional turbulence. On the basis of the Hölder-continuity of the vorticity field ω(t,x), the scaling behavior of vorticity structure functions S_p(ω(ℓ)), of order p, provides clues on small-scale intermittency. Confirming earlier ideas of Sulem and Frisch (JFM 72, 1975), Eyink (Phys. D 91, 1996) proved the following scaling of the second-order structure function S_2(ω(ℓ))≡<|ω(t,x+r)-ω(t,x)|^2> ˜ℓ^ζ_2, with ζ_2≤ 2/3 and ℓ≤ℓ_f. Here, ℓ=|r|, ℓf is the typical energy-injection scale, associated to an external forcing and the brackets <\\cdot> denote combined space- and time-averaging. The only assumption used to derive this scaling was a constant enstrophy flux to small scales, in the so-called enstrophy cascade range. On the contrary, using the classical Batchelor argument for the advection of a passive scalar, Falkovich and Lebedev (PRE 50, 1994) argued that one must have ζ_p=0 for all p. With new direct numerical simulations we address these issues for a bounded square domain, using the no-slip boundary condition for the velocity. Our results are compared with the earlier experimental results of Paret and Tabeling (PRL 83, 1999).
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
Using Action Research to Develop a Course in Statistical Inference for Workplace-Based Adults
ERIC Educational Resources Information Center
Forbes, Sharleen
2014-01-01
Many adults who need an understanding of statistical concepts have limited mathematical skills. They need a teaching approach that includes as little mathematical context as possible. Iterative participatory qualitative research (action research) was used to develop a statistical literacy course for adult learners informed by teaching in…
ERIC Educational Resources Information Center
Thompson, Bruce
Web-based statistical instruction, like all statistical instruction, ought to focus on teaching the essence of the research endeavor: the exercise of reflective judgment. Using the framework of the recent report of the American Psychological Association (APA) Task Force on Statistical Inference (Wilkinson and the APA Task Force on Statistical…
Inferring the hosts of coronavirus using dual statistical models based on nucleotide composition
Tang, Qin; Shi, Mijuan; Cheng, Yingyin; Zhang, Wanting; Xia, Xiao-Qin
2015-01-01
Many coronaviruses are capable of interspecies transmission. Some of them have caused worldwide panic as emerging human pathogens in recent years, e.g., severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). In order to assess their threat to humans, we explored to infer the potential hosts of coronaviruses using a dual-model approach based on nineteen parameters computed from spike genes of coronaviruses. Both the support vector machine (SVM) model and the Mahalanobis distance (MD) discriminant model achieved high accuracies in leave-one-out cross-validation of training data consisting of 730 representative coronaviruses (99.86% and 98.08% respectively). Predictions on 47 additional coronaviruses precisely conformed to conclusions or speculations by other researchers. Our approach is implemented as a web server that can be accessed at http://bioinfo.ihb.ac.cn/seq2hosts. PMID:26607834
Inferring the hosts of coronavirus using dual statistical models based on nucleotide composition.
Tang, Qin; Song, Yulong; Shi, Mijuan; Cheng, Yingyin; Zhang, Wanting; Xia, Xiao-Qin
2015-01-01
Many coronaviruses are capable of interspecies transmission. Some of them have caused worldwide panic as emerging human pathogens in recent years, e.g., severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). In order to assess their threat to humans, we explored to infer the potential hosts of coronaviruses using a dual-model approach based on nineteen parameters computed from spike genes of coronaviruses. Both the support vector machine (SVM) model and the Mahalanobis distance (MD) discriminant model achieved high accuracies in leave-one-out cross-validation of training data consisting of 730 representative coronaviruses (99.86% and 98.08% respectively). Predictions on 47 additional coronaviruses precisely conformed to conclusions or speculations by other researchers. Our approach is implemented as a web server that can be accessed at http://bioinfo.ihb.ac.cn/seq2hosts. PMID:26607834
NASA Astrophysics Data System (ADS)
Fu, Ji-Meng; Winchester, John W.
1994-03-01
Nitrogen in fresh waters of three rivers in northern Florida - the Apalachicola-Chattahoochee-Flint (ACF) River system, Ochlockonee (Och), and Sopchoppy (Sop) - is inferred to be derived mostly from atmospheric deposition. Because the N:P mole ratios in the rivers are nearly three times higher than the Redfield ratio for aquatic photosynthesis, N is saturated in the ecosystems, not a limiting nutrient, although it may be chemically transformed. Absolute principal component analysis (APCA), a receptor model, was applied to many years of monitoring data for Apalachicola River water and rainfall over its basin in order to better understand aquatic chemistry of nitrogen in the watershed. The APCA model describes the river water as mainly a mixture of components with compositions resembling fresh rain, aged rain, and groundwater. In the fresh rain component, the ratio of atmospheric nitrate to sulfate is close to that in rainwater, as if some samples had been collected following very recent rainfall. The aged rain component of the river water is distinguished by a low NO 3-/SO 42- ratio, signifying an atmospheric source but with most of its nitrate having been lost or transformed. The groundwater component, inferred from its concentration to contribute on average about one fourth of the river water, contains abundant Ca 2+ but no detectable nitrogen. Results similar to ACF were obtained for Sop and Och, though Och exhibits some association of NO 3- with the Ca 2+-rich component. Similar APCA of wet precipitation resolves mainly components that represent acid rain, with NO 3-, SO 42- and NH 4+ and sea salt, with Na +, Cl - and Mg 2+. Inland, the acid rain component is relatively more prominent and Cl - is depleted, while at atmospheric monitoring sites nearer the coastal region sea salt tends to be more prominent.
Statistical Inference for Valued-Edge Networks: The Generalized Exponential Random Graph Model
Desmarais, Bruce A.; Cranmer, Skyler J.
2012-01-01
Across the sciences, the statistical analysis of networks is central to the production of knowledge on relational phenomena. Because of their ability to model the structural generation of networks based on both endogenous and exogenous factors, exponential random graph models are a ubiquitous means of analysis. However, they are limited by an inability to model networks with valued edges. We address this problem by introducing a class of generalized exponential random graph models capable of modeling networks whose edges have continuous values (bounded or unbounded), thus greatly expanding the scope of networks applied researchers can subject to statistical analysis. PMID:22276151
Independence and statistical inference in clinical trial designs: a tutorial review.
Bolton, S
1998-05-01
The requirements for statistical approaches to the design, analysis, and interpretation of experimental data are now accepted by the scientific community. This is of particular importance in medical studies where public health consequences are of concern. Investigators in the clinical sciences should be cognizant of statistical principles in general, but should always be wary of the pursuing their own analyses and engage statisticians for data analysis whenever possible. Examples of circumstances that require statistical evaluation not found in textbooks and not always obvious to the lay person are pervasive. Incorrect statistical evaluation and analyses in such situations will result in erroneous and potentially serious misleading interpretation of clinical data. Although a statistician may not be responsible for any misinterpretations in such unfortunate circumstances, the quote often cited about statisticians and "damned liars" may appear to be more truth than fable. This article is a tutorial review and describes a common misuse of clinical data resulting in an apparently large sample size derived from a small number of patients. This mistake is a consequence of ignoring the dependency of results, treating multiple observations from a single patient as independent observations. PMID:9602951
Statistical inference of selection and divergence of rice blast resistance gene Pi-ta
Technology Transfer Automated Retrieval System (TEKTRAN)
The resistance gene Pi-ta has been effectively used to control rice blast disease worldwide. A few recent studies have described the possible evolution of Pi-ta in cultivated and weedy rice. However, evolutionary statistics used for the studies are too limited to precisely understand selection and d...
Using Stimulus Equivalence Technology to Teach Statistical Inference in a Group Setting
ERIC Educational Resources Information Center
Critchfield, Thomas S.; Fienup, Daniel M.
2010-01-01
Computerized lessons employing stimulus equivalence technology, used previously under laboratory conditions to teach inferential statistics concepts to college students, were employed in a group setting for the first time. Students showed the same directly taught and emergent learning gains as in laboratory studies. A brief paper-and-pencil…
ERIC Educational Resources Information Center
Finch, Sue; Cumming, Geoff; Thomason, Neil
2001-01-01
Analyzed 150 articles from the "Journal of Applied Psychology" (JAP) from 1940 to 1999 to determine statistical reporting practices related to null hypothesis significance testing, American Psychological Association guidelines, and reform recommendations. Findings show little evidence that decades of cogent criticisms by reformers have resulted in…
NASA Technical Reports Server (NTRS)
Abbey, Craig K.; Eckstein, Miguel P.
2002-01-01
We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.
Statistical inference for classification of RRIM clone series using near IR reflectance properties
NASA Astrophysics Data System (ADS)
Ismail, Faridatul Aima; Madzhi, Nina Korlina; Hashim, Hadzli; Abdullah, Noor Ezan; Khairuzzaman, Noor Aishah; Azmi, Azrie Faris Mohd; Sampian, Ahmad Faiz Mohd; Harun, Muhammad Hafiz
2015-08-01
RRIM clone is a rubber breeding series produced by RRIM (Rubber Research Institute of Malaysia) through "rubber breeding program" to improve latex yield and producing clones attractive to farmers. The objective of this work is to analyse measurement of optical sensing device on latex of selected clone series. The device using transmitting NIR properties and its reflectance is converted in terms of voltage. The obtained reflectance index value via voltage was analyzed using statistical technique in order to find out the discrimination among the clones. From the statistical results using error plots and one-way ANOVA test, there is an overwhelming evidence showing discrimination of RRIM 2002, RRIM 2007 and RRIM 3001 clone series with p value = 0.000. RRIM 2008 cannot be discriminated with RRIM 2014; however both of these groups are distinct from the other clones.
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
NASA Technical Reports Server (NTRS)
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerab!e progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
Lekone, Phenyo E; Finkenstädt, Bärbel F
2006-12-01
A stochastic discrete-time susceptible-exposed-infectious-recovered (SEIR) model for infectious diseases is developed with the aim of estimating parameters from daily incidence and mortality time series for an outbreak of Ebola in the Democratic Republic of Congo in 1995. The incidence time series exhibit many low integers as well as zero counts requiring an intrinsically stochastic modeling approach. In order to capture the stochastic nature of the transitions between the compartmental populations in such a model we specify appropriate conditional binomial distributions. In addition, a relatively simple temporally varying transmission rate function is introduced that allows for the effect of control interventions. We develop Markov chain Monte Carlo methods for inference that are used to explore the posterior distribution of the parameters. The algorithm is further extended to integrate numerically over state variables of the model, which are unobserved. This provides a realistic stochastic model that can be used by epidemiologists to study the dynamics of the disease and the effect of control interventions. PMID:17156292
Statistical inference for the additive hazards model under outcome-dependent sampling
Yu, Jichang; Liu, Yanyan; Sandler, Dale P.; Zhou, Haibo
2015-01-01
Cost-effective study design and proper inference procedures for data from such designs are always of particular interests to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters for the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating the relative efficiency of the proposed method against simple random sampling design and derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and the proposed estimator is more efficient than other estimators. We apply our method to analyze a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to study the risk of radon exposure to cancer. PMID:26379363
Fu, Ji-Meng; Winchester, J.W. )
1994-03-01
Nitrogen in fresh waters of three rivers in northern Florida-the Apalachicola-Chattahoochee-Flint (ACF) River system, Ochlockonee (Och), and Sopchoppy (Sop)- is inferred to be derived mostly from atmospheric deposition. Because the N:P mole ratios in the rivers are nearly three times higher than the Redfield ratio for aquatic photosynthesis, N is saturate in the ecosystems, not a limiting nutrient, although it may be chemically transformed. Absolute principal component analysis (APCA), a receptor model, was applied to many years of monitoring data for Apalachicola River water and rainfall over its basin in order to better understand aquatic chemistry of nitrogen in the watershed. The APCA model aged rain and groundwater. In the fresh rain component, the ratio of atmospheric nitrate to sulfate is close to that in rainwater, as if some samples had been collected following very recent rainfall. The aged rain component of the river water is distinguished by a low NO[sup [minus][sub 3
NASA Astrophysics Data System (ADS)
Jha, Sanjeev Kumar; Comunian, Alessandro; Mariethoz, Gregoire; Kelly, Bryce F. J.
2014-10-01
We develop a stochastic approach to construct channelized 3-D geological models constrained to borehole measurements as well as geological interpretation. The methodology is based on simple 2-D geologist-provided sketches of fluvial depositional elements, which are extruded in the 3rd dimension. Multiple-point geostatistics (MPS) is used to impair horizontal variability to the structures by introducing geometrical transformation parameters. The sketches provided by the geologist are used as elementary training images, whose statistical information is expanded through randomized transformations. We demonstrate the applicability of the approach by applying it to modeling a fluvial valley filling sequence in the Maules Creek catchment, Australia. The facies models are constrained to borehole logs, spatial information borrowed from an analogue and local orientations derived from the present-day stream networks. The connectivity in the 3-D facies models is evaluated using statistical measures and transport simulations. Comparison with a statistically equivalent variogram-based model shows that our approach is more suited for building 3-D facies models that contain structures specific to the channelized environment and which have a significant influence on the transport processes.
Nonequilibrium statistical mechanics in one-dimensional bose gases
NASA Astrophysics Data System (ADS)
Baldovin, F.; Cappellaro, A.; Orlandini, E.; Salasnich, L.
2016-06-01
We study cold dilute gases made of bosonic atoms, showing that in the mean-field one-dimensional regime they support stable out-of-equilibrium states. Starting from the 3D Boltzmann–Vlasov equation with contact interaction, we derive an effective 1D Landau–Vlasov equation under the condition of a strong transverse harmonic confinement. We investigate the existence of out-of-equilibrium states, obtaining stability criteria similar to those of classical plasmas.
ERIC Educational Resources Information Center
Cui, Ying; Roberts, Mary Roduta
2013-01-01
The goal of this study was to investigate the usefulness of person-fit analysis in validating student score inferences in a cognitive diagnostic assessment. In this study, a two-stage procedure was used to evaluate person fit for a diagnostic test in the domain of statistical hypothesis testing. In the first stage, the person-fit statistic, the…
Anderson, Eric C
2012-01-01
Advances in genotyping that allow tens of thousands of individuals to be genotyped at a moderate number of single nucleotide polymorphisms (SNPs) permit parentage inference to be pursued on a very large scale. The intergenerational tagging this capacity allows is revolutionizing the management of cultured organisms (cows, salmon, etc.) and is poised to do the same for scientific studies of natural populations. Currently, however, there are no likelihood-based methods of parentage inference which are implemented in a manner that allows them to quickly handle a very large number of potential parents or parent pairs. Here we introduce an efficient likelihood-based method applicable to the specialized case of cultured organisms in which both parents can be reliably sampled. We develop a Markov chain representation for the cumulative number of Mendelian incompatibilities between an offspring and its putative parents and we exploit it to develop a fast algorithm for simulation-based estimates of statistical confidence in SNP-based assignments of offspring to pairs of parents. The method is implemented in the freely available software SNPPIT. We describe the method in detail, then assess its performance in a large simulation study using known allele frequencies at 96 SNPs from ten hatchery salmon populations. The simulations verify that the method is fast and accurate and that 96 well-chosen SNPs can provide sufficient power to identify the correct pair of parents from amongst millions of candidate pairs. PMID:23152426
A statistical formulation of one-dimensional electron fluid turbulence
NASA Technical Reports Server (NTRS)
Fyfe, D.; Montgomery, D.
1977-01-01
A one-dimensional electron fluid model is investigated using the mathematical methods of modern fluid turbulence theory. Non-dissipative equilibrium canonical distributions are determined in a phase space whose co-ordinates are the real and imaginary parts of the Fourier coefficients for the field variables. Spectral densities are calculated, yielding a wavenumber electric field energy spectrum proportional to k to the negative second power for large wavenumbers. The equations of motion are numerically integrated and the resulting spectra are found to compare well with the theoretical predictions.
Statistical Inference in Hidden Markov Models Using k-Segment Constraints
Titsias, Michalis K.; Holmes, Christopher C.; Yau, Christopher
2016-01-01
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward–backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online. PMID:27226674
NASA Astrophysics Data System (ADS)
Laloy, Eric; Linde, Niklas; Jacques, Diederik; Vrugt, Jasper A.
2015-06-01
We present a Bayesian inversion method for the joint inference of high-dimensional multi-Gaussian hydraulic conductivity fields and associated geostatistical parameters from indirect hydrological data. We combine Gaussian process generation via circulant embedding to decouple the variogram from grid cell specific values, with dimensionality reduction by interpolation to enable Markov chain Monte Carlo (MCMC) simulation. Using the Matérn variogram model, this formulation allows inferring the conductivity values simultaneously with the field smoothness (also called Matérn shape parameter) and other geostatistical parameters such as the mean, sill, integral scales and anisotropy direction(s) and ratio(s). The proposed dimensionality reduction method systematically honors the underlying variogram and is demonstrated to achieve better performance than the Karhunen-Loève expansion. We illustrate our inversion approach using synthetic (error corrupted) data from a tracer experiment in a fairly heterogeneous 10,000-dimensional 2-D conductivity field. A 40-times reduction of the size of the parameter space did not prevent the posterior simulations to appropriately fit the measurement data and the posterior parameter distributions to include the true geostatistical parameter values. Overall, the posterior field realizations covered a wide range of geostatistical models, questioning the common practice of assuming a fixed variogram prior to inference of the hydraulic conductivity values. Our method is shown to be more efficient than sequential Gibbs sampling (SGS) for the considered case study, particularly when implemented on a distributed computing cluster. It is also found to outperform the method of anchored distributions (MAD) for the same computational budget.
Statistical inference of co-movements of stocks during a financial crisis
NASA Astrophysics Data System (ADS)
Ibuki, Takero; Higano, Shunsuke; Suzuki, Sei; Inoue, Jun-ichi; Chakraborti, Anirban
2013-12-01
In order to figure out and to forecast the emergence phenomena of social systems, we propose several probabilistic models for the analysis of financial markets, especially around a crisis. We first attempt to visualize the collective behaviour of markets during a financial crisis through cross-correlations between typical Japanese daily stocks by making use of multidimensional scaling. We find that all the two-dimensional points (stocks) shrink into a single small region when a economic crisis takes place. By using the properties of cross-correlations in financial markets especially during a crisis, we next propose a theoretical framework to predict several time-series simultaneously. Our model system is basically described by a variant of the multi-layered Ising model with random fields as non-stationary time series. Hyper-parameters appearing in the probabilistic model are estimated by means of minimizing the 'cumulative error' in the past market history. The justification and validity of our approaches are numerically examined for several empirical data sets.
Statistical inference methods for recurrent event processes with shape and size parameters
WANG, MEI-CHENG; HUANG, CHIUNG-YU
2015-01-01
Summary This paper proposes a unified framework to characterize the rate function of a recurrent event process through shape and size parameters. In contrast to the intensity function, which is the event occurrence rate conditional on the event history, the rate function is the occurrence rate unconditional on the event history, and thus it can be interpreted as a population-averaged count of events in unit time. In this paper, shape and size parameters are introduced and used to characterize the association between the rate function λ(·) and a random variable X. Measures of association between X and λ(·) are defined via shape- and size-based coefficients. Rate-independence of X and λ(·) is studied through tests of shape-independence and size-independence, where the shape-and size-based test statistics can be used separately or in combination. These tests can be applied when X is a covariable possibly correlated with the recurrent event process through λ(·) or, in the one-sample setting, when X is the censoring time at which the observation of N(·) is terminated. The proposed tests are shape- and size-based, so when a null hypothesis is rejected, the test results can serve to distinguish the source of violation. PMID:26412863
Statistical validation of high-dimensional models of growing networks
NASA Astrophysics Data System (ADS)
Medo, Matúš
2014-03-01
The abundance of models of complex networks and the current insufficient validation standards make it difficult to judge which models are strongly supported by data and which are not. We focus here on likelihood maximization methods for models of growing networks with many parameters and compare their performance on artificial and real datasets. While high dimensionality of the parameter space harms the performance of direct likelihood maximization on artificial data, this can be improved by introducing a suitable penalization term. Likelihood maximization on real data shows that the presented approach is able to discriminate among available network models. To make large-scale datasets accessible to this kind of analysis, we propose a subset sampling technique and show that it yields substantial model evidence in a fraction of time necessary for the analysis of the complete data.
Kimura, S; Araki, D; Matsumura, K; Okada-Hatakeyama, M
2012-02-01
Voit and Almeida have proposed the decoupling approach as a method for inferring the S-system models of genetic networks. The decoupling approach defines the inference of a genetic network as a problem requiring the solutions of sets of algebraic equations. The computation can be accomplished in a very short time, as the approach estimates S-system parameters without solving any of the differential equations. Yet the defined algebraic equations are non-linear, which sometimes prevents us from finding reasonable S-system parameters. In this study, we propose a new technique to overcome this drawback of the decoupling approach. This technique transforms the problem of solving each set of algebraic equations into a one-dimensional function optimization problem. The computation can still be accomplished in a relatively short time, as the problem is transformed by solving a linear programming problem. We confirm the effectiveness of the proposed approach through numerical experiments. PMID:22155075
NASA Astrophysics Data System (ADS)
Hashim, Mohammad Firdaus Abu; Ramakrishnan, Sivakumar; Mohamad, Ahmad Azmin
2014-06-01
Due to low environmental impact and rechargeable capability, the Nickel Metal Hydride battery has been considered to be one of the most promising candidate battery for electrical vehicle nowadays. The energy delivered by the Nickel Metal Hydride battery depends heavily on its discharge profile and generally it is intangible to tract the trend of the energydissipation that is stored in the battery for informative analysis. The thermal models were developed in 1-dimensional and 2-dimensional using Matlab and these models are capable of predicting the temperature distributions inside a cell. The simulated results were validated and verified with referred exact sources of experimental data using Minitab software. The result for 1-Dimensional showed that the correlations between experimental and predicted results for the time intervals 60 minutes, 90 minutes, and 114 minutes frompositive to negative electrode thermal dissipationdirection are34%, 83%, and 94% accordingly while for the 2-Dimensional the correlational results for the same above time intervals are44%, 93% and 95%. These correlationalresults between experimental and predicted clearly indicating the thermal behavior under natural convention can be well fitted after around 90 minutes durational time and 2-Dimensional model can predict the results more accurately compared to 1-Dimensional model. Based on the results obtained from simulations, it can be concluded that both 1-Dimensional and 2-Dimensional models can predict nearly similar thermal behavior under natural convention while 2-Dimensional model was used to predict thermal behavior under forced convention for better accuracy.
Constrained statistical inference: sample-size tables for ANOVA and regression
Vanbrabant, Leonard; Van De Schoot, Rens; Rosseel, Yves
2015-01-01
Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient β1 is larger than β2 and β3. The corresponding hypothesis is H: β1 > {β2, β3} and this is known as an (order) constrained hypothesis. A major advantage of testing such a hypothesis is that power can be gained and inherently a smaller sample size is needed. This article discusses this gain in sample size reduction, when an increasing number of constraints is included into the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample-size at a pre-specified power (say, 0.80) for an increasing number of constraints. To obtain sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample-size decreases with 30–50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, in the case of fewer constraints, ordering the parameters (e.g., β1 > β2) results in a higher power than assigning a positive or a negative sign to the parameters (e.g., β1 > 0). PMID:25628587
Can we infer the effect of river works on streamflow statistics?
NASA Astrophysics Data System (ADS)
Ganora, Daniele
2016-04-01
Most of our river network system is affected by anthropic pressure of different types. While climate and land use change are widely recognized as important factors, the effects of "in-line" water infrastructures on the global behavior of the river system is often overlooked. This is due to the difficulty in including local "physical" knowledge (e.g., the hydraulic behavior of a river reach with levees during a flood) into large-scale models that provide a statistical description of the streamflow, and which are the basis for the implementation of resources/risk management plans (e.g., regional models for prediction of the flood frequency curve). This work presents some preliminary applications regarding two widely used hydrological signatures, the flow duration curve and the flood frequency curve. We adopt a pragmatic (i.e., reliable and implementable at large scales) and parsimonious (i.e., that requires a few data) framework of analysis, considering that we operate in a complex system (many river work are already existing, and many others could be built in the future). In the first case, a method is proposed to correct observations of streamflow affected by the presence of upstream run-of-the-river power plants in order to provide the "natural" flow duration curve, using only simple information about the plant (i.e., the maximum intake flow). The second case regards the effects of flood-protection works on the downstream sections, to support the application of along-stream cost-benefit analysis in the flood risk management context. Current applications and possible future developments are discussed.
NASA Astrophysics Data System (ADS)
Calderon, Christopher P.; Weiss, Lucien E.; Moerner, W. E.
2014-05-01
Experimental advances have improved the two- (2D) and three-dimensional (3D) spatial resolution that can be extracted from in vivo single-molecule measurements. This enables researchers to quantitatively infer the magnitude and directionality of forces experienced by biomolecules in their native environment. Situations where such force information is relevant range from mitosis to directed transport of protein cargo along cytoskeletal structures. Models commonly applied to quantify single-molecule dynamics assume that effective forces and velocity in the x ,y (or x ,y,z) directions are statistically independent, but this assumption is physically unrealistic in many situations. We present a hypothesis testing approach capable of determining if there is evidence of statistical dependence between positional coordinates in experimentally measured trajectories; if the hypothesis of independence between spatial coordinates is rejected, then a new model accounting for 2D (3D) interactions can and should be considered. Our hypothesis testing technique is robust, meaning it can detect interactions, even if the noise statistics are not well captured by the model. The approach is demonstrated on control simulations and on experimental data (directed transport of intraflagellar transport protein 88 homolog in the primary cilium).
Soap film flows: Statistics of two-dimensional turbulence
Vorobieff, P.; Rivera, M.; Ecke, R.E.
1999-08-01
Soap film flows provide a very convenient laboratory model for studies of two-dimensional (2-D) hydrodynamics including turbulence. For a gravity-driven soap film channel with a grid of equally spaced cylinders inserted in the flow, we have measured the simultaneous velocity and thickness fields in the irregular flow downstream from the cylinders. The velocity field is determined by a modified digital particle image velocimetry method and the thickness from the light scattered by the particles in the film. From these measurements, we compute the decay of mean energy, enstrophy, and thickness fluctuations with downstream distance, and the structure functions of velocity, vorticity, thickness fluctuation, and vorticity flux. From these quantities we determine the microscale Reynolds number of the flow R{sub {lambda}}{approx}100 and the integral and dissipation scales of 2D turbulence. We also obtain quantitative measures of the degree to which our flow can be considered incompressible and isotropic as a function of downstream distance. We find coarsening of characteristic spatial scales, qualitative correspondence of the decay of energy and enstrophy with the Batchelor model, scaling of energy in {ital k} space consistent with the k{sup {minus}3} spectrum of the Kraichnan{endash}Batchelor enstrophy-scaling picture, and power-law scalings of the structure functions of velocity, vorticity, vorticity flux, and thickness. These results are compared with models of 2-D turbulence and with numerical simulations. {copyright} {ital 1999 American Institute of Physics.}
NASA Astrophysics Data System (ADS)
Ortega-Minakata, R. A.; Torres-Papaqui, J. P.; Andernach, H.; Islas-Islas, J. M.
2014-05-01
We quantify the statistical evidence of the relation between the inferred morphology and the emission-line activity type of galaxies for a large sample of galaxies. We compare the distribution of the inferred morphologies of galaxies of different dominant activity types, showing that the difference in the median morphological type between the samples of different activity types is significant. We also test the significance of the difference in the mean morphological type between all the activity-type samples using an ANOVA model with a modified Tukey test that takes into account heteroscedasticity and the unequal sample sizes. We show this test in the form of simultaneous confidence intervals for all pairwise comparisons of the mean morphological types of the samples. Using this test, scarcely applied in astronomy, we conclude that there are statistically significant differences in the inferred morphologies of galaxies of different dominant activity types.
NASA Astrophysics Data System (ADS)
von Nessi, G. T.; Hole, M. J.; The MAST Team
2014-11-01
We present recent results and technical breakthroughs for the Bayesian inference of tokamak equilibria using force-balance as a prior constraint. Issues surrounding model parameter representation and posterior analysis are discussed and addressed. These points motivate the recent advancements embodied in the Bayesian Equilibrium Analysis and Simulation Tool (BEAST) software being presently utilized to study equilibria on the Mega-Ampere Spherical Tokamak (MAST) experiment in the UK (von Nessi et al 2012 J. Phys. A 46 185501). State-of-the-art results of using BEAST to study MAST equilibria are reviewed, with recent code advancements being systematically presented though out the manuscript.
NASA Astrophysics Data System (ADS)
Speegle, Darrin; Steward, Robert
2015-08-01
We propose a semiparametric approach to infer the existence of and estimate the location of a statistical change-point to a nonlinear high dimensional time series contaminated with an additive noise component. In particular, we consider a p―dimensional stochastic process of independent multivariate normal observations where the mean function varies smoothly except at a single change-point. Our approach first involves a dimension reduction of the original time series through a random matrix multiplication. Next, we conduct a Bayesian analysis on the empirical detail coefficients of this dimensionally reduced time series after a wavelet transform. We also present a means to associate confidence bounds to the conclusions of our results. Aside from being computationally efficient and straight forward to implement, the primary advantage of our methods is seen in how these methods apply to a much larger class of time series whose mean functions are subject to only general smoothness conditions.
NASA Astrophysics Data System (ADS)
Pandarinath, Kailasa
2014-12-01
Several new multi-dimensional tectonomagmatic discrimination diagrams employing log-ratio variables of chemical elements and probability based procedure have been developed during the last 10 years for basic-ultrabasic, intermediate and acid igneous rocks. There are numerous studies on extensive evaluations of these newly developed diagrams which have indicated their successful application to know the original tectonic setting of younger and older as well as sea-water and hydrothermally altered volcanic rocks. In the present study, these diagrams were applied to Precambrian rocks of Mexico (southern and north-eastern) and Argentina. The study indicated the original tectonic setting of Precambrian rocks from the Oaxaca Complex of southern Mexico as follows: (1) dominant rift (within-plate) setting for rocks of 1117-988 Ma age; (2) dominant rift and less-dominant arc setting for rocks of 1157-1130 Ma age; and (3) a combined tectonic setting of collision and rift for Etla Granitoid Pluton (917 Ma age). The diagrams have indicated the original tectonic setting of the Precambrian rocks from the north-eastern Mexico as: (1) a dominant arc tectonic setting for the rocks of 988 Ma age; and (2) an arc and collision setting for the rocks of 1200-1157 Ma age. Similarly, the diagrams have indicated the dominant original tectonic setting for the Precambrian rocks from Argentina as: (1) with-in plate (continental rift-ocean island) and continental rift (CR) setting for the rocks of 800 Ma and 845 Ma age, respectively; and (2) an arc setting for the rocks of 1174-1169 Ma and of 1212-1188 Ma age. The inferred tectonic setting for these Precambrian rocks are, in general, in accordance to the tectonic setting reported in the literature, though there are some inconsistence inference of tectonic settings by some of the diagrams. The present study confirms the importance of these newly developed discriminant-function based diagrams in inferring the original tectonic setting of
A three-dimensional statistical mechanical model of folding double-stranded chain molecules
NASA Astrophysics Data System (ADS)
Zhang, Wenbing; Chen, Shi-Jie
2001-05-01
Based on a graphical representation of intrachain contacts, we have developed a new three-dimensional model for the statistical mechanics of double-stranded chain molecules. The theory has been tested and validated for the cubic lattice chain conformations. The statistical mechanical model can be applied to the equilibrium folding thermodynamics of a large class of chain molecules, including protein β-hairpin conformations and RNA secondary structures. The application of a previously developed two-dimensional model to RNA secondary structure folding thermodynamics generally overestimates the breadth of the melting curves [S-J. Chen and K. A. Dill, Proc. Natl. Acad. Sci. U.S.A. 97, 646 (2000)], suggesting an underestimation for the sharpness of the conformational transitions. In this work, we show that the new three-dimensional model gives much sharper melting curves than the two-dimensional model. We believe that the new three-dimensional model may give much improved predictions for the thermodynamic properties of RNA conformational changes than the previous two-dimensional model.
Measurement of two-dimensional optical system MTF by computation of second order speckle statistics
NASA Astrophysics Data System (ADS)
Lund, G.; Azouit, M.
1980-04-01
An interferometric approach to the calculation of the two-dimensional MTF of an optical system is proposed. The technique, in some ways analogous to that of speckle interferometry used in astronomical situations, is based on the computation of the second-order spatio-temporal statistics of a fluctuating speckle pattern. The theorum of Van Cittert-Zernike is invoked to relate the speckle, due to the illumination of a perfect diffuser by the point spread function of an optical system, to the two-dimensional MTF of the system. The computed MTF is displayed in the form of a contour map and can also be represented in the conventional form of a one-dimensional vertical cut. Preliminary measurements have yielded qualitatively useful results and clearly illustrate the suitability of two-dimensional maps for the detection of transfer function anisotropies.
Stadler, R.; Hellmann, J.; Schirle, M.; Beckmann, J.
1993-12-31
Based on on previous work where it was shown that 4-urazoyl benzoic acid groups (U4A), which were statistically attached to polybutadiene, form ordered supramolecular arrays in the polymer matrix. The present work describes the synthesis of a new molecular building block capable for self assembling in the unpolar matrix. 5-urazoylisophthalic acid groups (U35A) attached to 1,4-polybutadiene chains show an endothermic transition, characteristic for supramolecular self assembling. The melting temperature increases for low levels of modification from 130{degrees}C up to 190{degrees}C. The IR-data indicate than the 5-urazoylisophthalic acid groups are 4-functional with respect to supramolecular self-addressing. Based on the detailed knowledge of the structure of the self-assembled domains in 4-urazoyl benzoic acid groups, a model is developed which describes qualitatively the observed material properties.
Schwermann, Achim H; Dos Santos Rolo, Tomy; Caterino, Michael S; Bechly, Günter; Schmied, Heiko; Baumbach, Tilo; van de Kamp, Thomas
2016-01-01
External and internal morphological characters of extant and fossil organisms are crucial to establishing their systematic position, ecological role and evolutionary trends. The lack of internal characters and soft-tissue preservation in many arthropod fossils, however, impedes comprehensive phylogenetic analyses and species descriptions according to taxonomic standards for Recent organisms. We found well-preserved three-dimensional anatomy in mineralized arthropods from Paleogene fissure fillings and demonstrate the value of these fossils by utilizing digitally reconstructed anatomical structure of a hister beetle. The new anatomical data facilitate a refinement of the species diagnosis and allowed us to reject a previous hypothesis of close phylogenetic relationship to an extant congeneric species. Our findings suggest that mineralized fossils, even those of macroscopically poor preservation, constitute a rich but yet largely unexploited source of anatomical data for fossil arthropods. PMID:26854367
Schwermann, Achim H; dos Santos Rolo, Tomy; Caterino, Michael S; Bechly, Günter; Schmied, Heiko; Baumbach, Tilo; van de Kamp, Thomas
2016-01-01
External and internal morphological characters of extant and fossil organisms are crucial to establishing their systematic position, ecological role and evolutionary trends. The lack of internal characters and soft-tissue preservation in many arthropod fossils, however, impedes comprehensive phylogenetic analyses and species descriptions according to taxonomic standards for Recent organisms. We found well-preserved three-dimensional anatomy in mineralized arthropods from Paleogene fissure fillings and demonstrate the value of these fossils by utilizing digitally reconstructed anatomical structure of a hister beetle. The new anatomical data facilitate a refinement of the species diagnosis and allowed us to reject a previous hypothesis of close phylogenetic relationship to an extant congeneric species. Our findings suggest that mineralized fossils, even those of macroscopically poor preservation, constitute a rich but yet largely unexploited source of anatomical data for fossil arthropods. DOI: http://dx.doi.org/10.7554/eLife.12129.001 PMID:26854367
Maneuvering target tracking algorithm based on current statistical model in three dimensional space
NASA Astrophysics Data System (ADS)
Huang, Ligang; Yan, Kang; Wang, Xiangdong
2015-07-01
This paper is mainly to solve the problems associated with maneuvering target tracking based current statistical model in three dimensional space. Firstly, a three-dimensional model of the nine state variables is presented. Then adaptive Kalman filtering algorithm is designed with the motor acceleration data mean and variance. Finally, A simulation about the adaptive Kalman filtering put forward by this thesis and the direct calculation method is given, which aim at the maneuvering target in three-dimension. The results show the good performances such as better target position, velocity and acceleration estimates brought by the proposed approach by presenting and discussing the simulation results.
Full counting statistics of laser excited Rydberg aggregates in a one-dimensional geometry.
Schempp, H; Günter, G; Robert-de-Saint-Vincent, M; Hofmann, C S; Breyel, D; Komnik, A; Schönleber, D W; Gärttner, M; Evers, J; Whitlock, S; Weidemüller, M
2014-01-10
We experimentally study the full counting statistics of few-body Rydberg aggregates excited from a quasi-one-dimensional atomic gas. We measure asymmetric excitation spectra and increased second and third order statistical moments of the Rydberg number distribution, from which we determine the average aggregate size. Estimating rates for different excitation processes we conclude that the aggregates grow sequentially around an initial grain. Direct comparison with numerical simulations confirms this conclusion and reveals the presence of liquidlike spatial correlations. Our findings demonstrate the importance of dephasing in strongly correlated Rydberg gases and introduce a way to study spatial correlations in interacting many-body quantum systems without imaging. PMID:24483893
NASA Astrophysics Data System (ADS)
Yoshimitsu, Nana; Furumura, Takashi; Maeda, Takuto
2016-09-01
The coda part of a waveform transmitted through a laboratory sample should be examined for the high-resolution monitoring of the sample characteristics in detail. However, the origin and propagation process of the later phases in a finite-sized small sample are very complicated with the overlap of multiple unknown reflections and conversions. In this study, we investigated the three-dimensional (3D) geometric effect of a finite-sized cylindrical sample to understand the development of these later phases. This study used 3D finite difference method simulation employing a free-surface boundary condition over a curved model surface and a realistic circular shape of the source model. The simulated waveforms and the visualized 3D wavefield in a stainless steel sample clearly demonstrated the process of multiple reflections and the conversions of the P and S waves at the side surface as well as at the top and bottom of the sample. Rayleigh wave propagation along the curved side boundary was also confirmed, and these waves dominate in the later portion of the simulated waveform with much larger amplitudes than the P and S wave reflections. The feature of the simulated waveforms showed good agreement with laboratory observed waveforms. For the simulation, an introduction of an absorbing boundary condition at the top and bottom of the sample made it possible to efficiently separate the contribution of the vertical and horizontal boundary effects in the simulated wavefield. This procedure helped to confirm the additional finding of vertically propagating multiple surface waves and their conversion at the corner of the sample. This new laboratory-scale 3D simulation enabled the appearance of a variety of geometric effects that constitute the later phases of the transmitted waves.
Halpin, Peter F; Stam, Henderikus J
2006-01-01
The application of statistical testing in psychological research over the period of 1940-1960 is examined in order to address psychologists' reconciliation of the extant controversy between the Fisher and Neyman-Pearson approaches. Textbooks of psychological statistics and the psychological journal literature are reviewed to examine the presence of what Gigerenzer (1993) called a hybrid model of statistical testing. Such a model is present in the textbooks, although the mathematically incomplete character of this model precludes the appearance of a similarly hybridized approach to statistical testing in the research literature. The implications of this hybrid model for psychological research and the statistical testing controversy are discussed. PMID:17286092
Carlsen, Michelle; Fu, Guifang; Bushman, Shaun; Corcoran, Christopher
2016-02-01
Genome-wide data with millions of single-nucleotide polymorphisms (SNPs) can be highly correlated due to linkage disequilibrium (LD). The ultrahigh dimensionality of big data brings unprecedented challenges to statistical modeling such as noise accumulation, the curse of dimensionality, computational burden, spurious correlations, and a processing and storing bottleneck. The traditional statistical approaches lose their power due to [Formula: see text] (n is the number of observations and p is the number of SNPs) and the complex correlation structure among SNPs. In this article, we propose an integrated distance correlation ridge regression (DCRR) approach to accommodate the ultrahigh dimensionality, joint polygenic effects of multiple loci, and the complex LD structures. Initially, a distance correlation (DC) screening approach is used to extensively remove noise, after which LD structure is addressed using a ridge penalized multiple logistic regression (LRR) model. The false discovery rate, true positive discovery rate, and computational cost were simultaneously assessed through a large number of simulations. A binary trait of Arabidopsis thaliana, the hypersensitive response to the bacterial elicitor AvrRpm1, was analyzed in 84 inbred lines (28 susceptibilities and 56 resistances) with 216,130 SNPs. Compared to previous SNP discovery methods implemented on the same data set, the DCRR approach successfully detected the causative SNP while dramatically reducing spurious associations and computational time. PMID:26661113
NASA Astrophysics Data System (ADS)
Rotter, Stefan; Aigner, Florian; Burgdörfer, Joachim
2007-03-01
We investigate the statistical distribution of transmission eigenvalues in phase-coherent transport through quantum dots. In two-dimensional ab initio simulations for both clean and disordered two-dimensional cavities, we find markedly different quantum-to-classical crossover scenarios for these two cases. In particular, we observe the emergence of “noiseless scattering states” in clean cavities, irrespective of sharp-edged entrance and exit lead mouths. We find the onset of these “classical” states to be largely independent of the cavity’s classical chaoticity, but very sensitive with respect to bulk disorder. Our results suggest that for weakly disordered cavities, the transmission eigenvalue distribution is determined both by scattering at the disorder potential and the cavity walls. To properly account for this intermediate parameter regime, we introduce a hybrid crossover scheme, which combines previous models that are valid in the ballistic and the stochastic limit, respectively.
Application of Edwards' statistical mechanics to high-dimensional jammed sphere packings.
Jin, Yuliang; Charbonneau, Patrick; Meyer, Sam; Song, Chaoming; Zamponi, Francesco
2010-11-01
The isostatic jamming limit of frictionless spherical particles from Edwards' statistical mechanics [Song et al., Nature (London) 453, 629 (2008)] is generalized to arbitrary dimension d using a liquid-state description. The asymptotic high-dimensional behavior of the self-consistent relation is obtained by saddle-point evaluation and checked numerically. The resulting random close packing density scaling ϕ∼d2(-d) is consistent with that of other approaches, such as replica theory and density-functional theory. The validity of various structural approximations is assessed by comparing with three- to six-dimensional isostatic packings obtained from simulations. These numerical results support a growing accuracy of the theoretical approach with dimension. The approach could thus serve as a starting point to obtain a geometrical understanding of the higher-order correlations present in jammed packings. PMID:21230456
NASA Astrophysics Data System (ADS)
Yeom, Seokwon; Lee, Dongsu; Son, Jung-Young; Kim, Shin-Hwan
2009-09-01
In this paper, we discuss computational reconstruction and statistical pattern classification using integral imaging. Three-dimensional object information is numerically reconstructed at arbitrary depth-levels by averaging the corresponding pixels. The longitudinal distance and object boundary are estimated where the standard deviation of the intensity is minimized. Fisher linear discriminant analysis combined with principal component analysis is adopted for the classification of out-of-plane rotated objects. The Fisher linear discriminant analysis maximizes the class-discrimination while the principal component analysis minimizes the error between the original and the restored images. The presented method provides promising results for the distortion-tolerant pattern classification.
Three-Dimensional Statistical Gas Distribution Mapping in an Uncontrolled Indoor Environment
Reggente, Matteo; Lilienthal, Achim J.
2009-05-23
In this paper we present a statistical method to build three-dimensional gas distribution maps (3D-DM). The proposed mapping technique uses kernel extrapolation with a tri-variate Gaussian kernel that models the likelihood that a reading represents the concentration distribution at a distant location in the three dimensions. The method is evaluated using a mobile robot equipped with three 'e-noses' mounted at different heights. Initial experiments in an uncontrolled indoor environment are presented and evaluated with respect to the ability of the 3D map, computed from the lower and upper nose, to predict the map from the middle nose.
Nonextensive statistics, entropic gravity and gravitational force in a non-integer dimensional space
NASA Astrophysics Data System (ADS)
Abreu, Everton M. C.; Neto, Jorge Ananias; Godinho, Cresus F. L.
2014-10-01
Based on the connection between Tsallis nonextensive statistics and fractional dimensional space, in this work we have introduced, with the aid of Verlinde's formalism, the Newton constant in a fractal space as a function of the nonextensive constant. With this result we have constructed a curve that shows the direct relation between Tsallis nonextensive parameter and the dimension of this fractal space. We have demonstrated precisely that there are ambiguities between the results due to Verlinde's approach and the ones due to fractional calculus formalism. We have shown precisely that these ambiguities appear only for spaces with dimensions different from three. A possible solution for this ambiguity was proposed here.
Aggelopoulos, Nikolaos C
2015-08-01
Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. PMID:25976632
Blanc, Guillermo A.; Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A.
2015-01-10
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances.
Loop braiding statistics in exactly soluble three-dimensional lattice models
NASA Astrophysics Data System (ADS)
Lin, Chien-Hung; Levin, Michael
2015-07-01
We construct two exactly soluble lattice spin models that demonstrate the importance of three-loop braiding statistics for the classification of three-dimensional gapped quantum phases. The two models are superficially similar: both are gapped and both support particlelike and looplike excitations similar to those of charges and vortex lines in a Z2×Z2 gauge theory. Furthermore, in both models the particle excitations are bosons, and in both models the particle and loop excitations have the same mutual braiding statistics. The difference between the two models is only apparent when one considers the recently proposed three-loop braiding process in which one loop is braided around another while both are linked to a third loop. We find that the statistical phase associated with this process is different in the two models, thus proving that they belong to two distinct phases. An important feature of this work is that we derive our results using a concrete approach: we construct string and membrane operators that create and move the particle and loop excitations and then we extract the braiding statistics from the commutation algebra of these operators.
Statistical Projections for Multi-resolution, Multi-dimensional Visual Data Exploration and Analysis
Hoa T. Nguyen; Stone, Daithi; E. Wes Bethel
2016-01-01
An ongoing challenge in visual exploration and analysis of large, multi-dimensional datasets is how to present useful, concise information to a user for some specific visualization tasks. Typical approaches to this problem have proposed either reduced-resolution versions of data, or projections of data, or both. These approaches still have some limitations such as consuming high computation or suffering from errors. In this work, we explore the use of a statistical metric as the basis for both projections and reduced-resolution versions of data, with a particular focus on preserving one key trait in data, namely variation. We use two different case studies to explore this idea, one that uses a synthetic dataset, and another that uses a large ensemble collection produced by an atmospheric modeling code to study long-term changes in global precipitation. The primary findings of our work are that in terms of preserving the variation signal inherent in data, that using a statistical measure more faithfully preserves this key characteristic across both multi-dimensional projections and multi-resolution representations than a methodology based upon averaging.
A Study of the Statistical Inference Criteria: Can We Agree on When to Use Z versus "t"?
ERIC Educational Resources Information Center
Ozgur, Ceyhun; Strasser, Sandra E.
2004-01-01
Authors who write introductory business statistics texts do not agree on when to use a t distribution and when to use a Z distribution in both the construction of confidence intervals and the use of hypothesis testing. In a survey of textbooks written in the last 15 years, we found the decision rules to be contradictory and, at times, the…
ERIC Educational Resources Information Center
Schochet, Peter Z.
2015-01-01
This report presents the statistical theory underlying the "RCT-YES" software that estimates and reports impacts for RCTs for a wide range of designs used in social policy research. The report discusses a unified, non-parametric design-based approach for impact estimation using the building blocks of the Neyman-Rubin-Holland causal…
ERIC Educational Resources Information Center
Davis, Philip M.; Solla, Leah R.
2003-01-01
Reports an analysis of American Chemical Society electronic journal downloads at Cornell University (Ithaca, New York) by individual IP (Internet Protocol) addresses. Highlights include usage statistics to evaluate library journal subscriptions; understanding scientists' reading behavior; individual use of articles and of journals; and the…
Kravtsov, V.E.; Yudson, V.I.
2011-07-15
Highlights: > Statistics of normalized eigenfunctions in one-dimensional Anderson localization at E = 0 is studied. > Moments of inverse participation ratio are calculated. > Equation for generating function is derived at E = 0. > An exact solution for generating function at E = 0 is obtained. > Relation of the generating function to the phase distribution function is established. - Abstract: The one-dimensional (1d) Anderson model (AM), i.e. a tight-binding chain with random uncorrelated on-site energies, has statistical anomalies at any rational point f=(2a)/({lambda}{sub E}) , where a is the lattice constant and {lambda}{sub E} is the de Broglie wavelength. We develop a regular approach to anomalous statistics of normalized eigenfunctions {psi}(r) at such commensurability points. The approach is based on an exact integral transfer-matrix equation for a generating function {Phi}{sub r}(u, {phi}) (u and {phi} have a meaning of the squared amplitude and phase of eigenfunctions, r is the position of the observation point). This generating function can be used to compute local statistics of eigenfunctions of 1d AM at any disorder and to address the problem of higher-order anomalies at f=p/q with q > 2. The descender of the generating function P{sub r}({phi}){identical_to}{Phi}{sub r}(u=0,{phi}) is shown to be the distribution function of phase which determines the Lyapunov exponent and the local density of states. In the leading order in the small disorder we derived a second-order partial differential equation for the r-independent ('zero-mode') component {Phi}(u, {phi}) at the E = 0 (f=1/2 ) anomaly. This equation is nonseparable in variables u and {phi}. Yet, we show that due to a hidden symmetry, it is integrable and we construct an exact solution for {Phi}(u, {phi}) explicitly in quadratures. Using this solution we computed moments I{sub m} = N< vertical bar {psi} vertical bar {sup 2m}> (m {>=} 1) for a chain of the length N {yields} {infinity} and found an
Statistical mechanics of two-dimensional foams: Physical foundations of the model.
Durand, Marc
2015-12-01
In a recent series of papers, a statistical model that accounts for correlations between topological and geometrical properties of a two-dimensional shuffled foam has been proposed and compared with experimental and numerical data. Here, the various assumptions on which the model is based are exposed and justified: the equiprobability hypothesis of the foam configurations is argued. The range of correlations between bubbles is discussed, and the mean-field approximation that is used in the model is detailed. The two self-consistency equations associated with this mean-field description can be interpreted as the conservation laws of number of sides and bubble curvature, respectively. Finally, the use of a "Grand-Canonical" description, in which the foam constitutes a reservoir of sides and curvature, is justified. PMID:26701712
Statistics of particle transport in a two-dimensional dusty plasma cluster
Ratynskaia, S.; Knapek, C.; Rypdal, K.; Khrapak, S.; Morfill, G.
2005-02-01
Statistical analysis is performed on long time series of dust particle trajectories in a two-dimensional dusty plasma cluster. Particle transport is found to be superdiffusive on all time scales until the range of particle displacements approaches the size of the cluster. Analysis of probability distribution functions and rescaled range analysis of the position increments show that the signal is non-Gaussian self-similar with Hurst exponent H=0.6, indicating that the superdiffusion is caused by long-range dependencies in the system. Investigation of temporal and spatial characteristics of persistent particle slips demonstrates that they are associated with collective events present on all time scales and responsible for the non-Gaussianity and long-memory effects.
NASA Astrophysics Data System (ADS)
Villeta, M.; Sanz-Lobera, A.; González, C.; Sebastián, M. A.
2009-11-01
The implantation of Statistical Process Control, SPC designated in short, requires the use of measurement systems. The inherent variability of these systems influences on the reliability of measurement results obtained, and as a consequence of it, influences on the SPC results. This paper investigates about the influence of the uncertainty of measurement on the analysis of process capability. It looks for reducing the effect of measurement uncertainty, to approach the capability that the productive process really has. In this work processes centered at a nominal value as well as off-center processes are raised, and a criterion is proposed that allows validate the adequacy of the dimensional measurement systems used in a SPC implantation.
Collisional statistics and dynamics of two-dimensional hard-disk systems: From fluid to solid.
Taloni, Alessandro; Meroz, Yasmine; Huerta, Adrián
2015-08-01
We perform extensive MD simulations of two-dimensional systems of hard disks, focusing on the collisional statistical properties. We analyze the distribution functions of velocity, free flight time, and free path length for packing fractions ranging from the fluid to the solid phase. The behaviors of the mean free flight time and path length between subsequent collisions are found to drastically change in the coexistence phase. We show that single-particle dynamical properties behave analogously in collisional and continuous-time representations, exhibiting apparent crossovers between the fluid and the solid phases. We find that, both in collisional and continuous-time representation, the mean-squared displacement, velocity autocorrelation functions, intermediate scattering functions, and self-part of the van Hove function (propagator) closely reproduce the same behavior exhibited by the corresponding quantities in granular media, colloids, and supercooled liquids close to the glass or jamming transition. PMID:26382368
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
NASA Astrophysics Data System (ADS)
Das Sarma, S.; Nag, Amit; Sau, Jay D.
2016-07-01
We consider a simple conceptual question with respect to Majorana zero modes in semiconductor nanowires: can the measured nonideal values of the zero-bias-conductance-peak in the tunneling experiments be used as a characteristic to predict the underlying topological nature of the proximity induced nanowire superconductivity? In particular, we define and calculate the topological visibility, which is a variation of the topological invariant associated with the scattering matrix of the system as well as the zero-bias-conductance-peak heights in the tunneling measurements, in the presence of dissipative broadening, using precisely the same realistic nanowire parameters to connect the topological invariants with the zero-bias tunneling conductance values. This dissipative broadening is present in both (the existing) tunneling measurements and also (any future) braiding experiments as an inevitable consequence of a finite braiding time. The connection between the topological visibility and the conductance allows us to obtain the visibility of realistic braiding experiments in nanowires, and to conclude that the current experimentally accessible systems with nonideal zero-bias conductance peaks may indeed manifest (with rather low visibility) non-Abelian statistics for the Majorana zero modes. In general, we find that a large (small) superconducting gap (Majorana peak splitting) is essential for the manifestation of the non-Abelian braiding statistics, and in particular, a zero-bias conductance value of around half the ideal quantized Majorana value should be sufficient for the manifestation of non-Abelian statistics in experimental nanowires. Our work also establishes that as a matter of principle the topological transition associated with the emergence of Majorana zero modes in finite nanowires is always a crossover (akin to a quantum phase transition at finite temperature) requiring the presence of dissipative broadening (which must be larger than the Majorana energy
Statistical conservation law in two- and three-dimensional turbulent flows
NASA Astrophysics Data System (ADS)
Frishman, Anna; Boffetta, Guido; De Lillo, Filippo; Liberzon, Alex
2015-03-01
Particles in turbulence live complicated lives. It is nonetheless sometimes possible to find order in this complexity. It was proposed in Falkovich et al. [Phys. Rev. Lett. 110, 214502 (2013), 10.1103/PhysRevLett.110.214502] that pairs of Lagrangian tracers at small scales, in an incompressible isotropic turbulent flow, have a statistical conservation law. More specifically, in a d -dimensional flow the distance R (t ) between two neutrally buoyant particles, raised to the power -d and averaged over velocity realizations, remains at all times equal to the initial, fixed, separation raised to the same power. In this work we present evidence from direct numerical simulations of two- and three-dimensional turbulence for this conservation. In both cases the conservation is lost when particles exit the linear flow regime. In two dimensions we show that, as an extension of the conservation law, an Evans-Cohen-Morriss or Gallavotti-Cohen type fluctuation relation exists. We also analyze data from a 3D laboratory experiment [Liberzon et al., Physica D 241, 208 (2012), 10.1016/j.physd.2011.07.008], finding that although it probes small scales they are not in the smooth regime. Thus instead of
Current Sheet Statistics in Three-Dimensional Simulations of Coronal Heating
NASA Astrophysics Data System (ADS)
Lin, L.; Ng, C. S.; Bhattacharjee, A.
2013-04-01
In a recent numerical study [Ng et al., Astrophys. J. 747, 109, 2012], with a three-dimensional model of coronal heating using reduced magnetohydrodynamics (RMHD), we have obtained scaling results of heating rate versus Lundquist number based on a series of runs in which random photospheric motions are imposed for hundreds to thousands of Alfvén time in order to obtain converged statistical values. The heating rate found in these simulations saturate to a level that is independent of the Lundquist number. This scaling result was also supported by an analysis with the assumption of the Sweet-Parker scaling of the current sheets, as well as how the width, length and number of current sheets scale with Lundquist number. In order to test these assumptions, we have implemented an automated routine to analyze thousands of current sheets in these simulations and return statistical scalings for these quantities. It is found that the Sweet-Parker scaling is justified. However, some discrepancies are also found and require further study.
NASA Astrophysics Data System (ADS)
Shen, Samuel S. P.; Wied, Olaf; Weithmann, Alexander; Regele, Tobias; Bailey, Barbara A.; Lawrimore, Jay H.
2015-05-01
This paper describes six different temporal climate regimes of the contiguous United States (CONUS) according to interdecadal variations of surface air temperature (SAT) and precipitation using the United States Historical Climatology Network (USHCN) monthly data (Tmax, Tmin, Tmean, and precipitation) from 1895 to 2010. Our analysis is based on the probability distribution, mean, standard deviation, skewness, kurtosis, Kolmogorov-Smirnov (KS) test, and Welch's t test. The relevant statistical parameters are computed from gridded monthly SAT and precipitation data. SAT variations lead to classification of four regimes: 1895-1930 (cool), 1931-1960 (warm), 1961-1985 (cool), and 1986-2010 (warm), while precipitation variations lead to a classification of two regimes: 1895-1975 (dry) and 1976-2010 (wet). The KS test shows that any two regimes of the above six are statistically significantly different from each other due to clear shifts of the probability density functions. Extremes of SAT and precipitation identify the ten hottest, coldest, driest, and wettest years. Welch's t test is used to discern significant differences among these extremes. The spatial patterns of the six climate regimes and some years of extreme climate are analyzed. Although the recent two decades are the warmest among the other decades since 1895 and many hottest years measured by CONUS Tmin and Tmean are in these two decades, the hottest year according to the CONUS Tmax anomalies is 1934 (1.37 °C), which is very close to the second Tmax hottest year 2006 (1.35 °C).
NASA Astrophysics Data System (ADS)
Shen, Samuel S. P.; Wied, Olaf; Weithmann, Alexander; Regele, Tobias; Bailey, Barbara A.; Lawrimore, Jay H.
2016-07-01
This paper describes six different temporal climate regimes of the contiguous United States (CONUS) according to interdecadal variations of surface air temperature (SAT) and precipitation using the United States Historical Climatology Network (USHCN) monthly data (Tmax, Tmin, Tmean, and precipitation) from 1895 to 2010. Our analysis is based on the probability distribution, mean, standard deviation, skewness, kurtosis, Kolmogorov-Smirnov (KS) test, and Welch's t test. The relevant statistical parameters are computed from gridded monthly SAT and precipitation data. SAT variations lead to classification of four regimes: 1895-1930 (cool), 1931-1960 (warm), 1961-1985 (cool), and 1986-2010 (warm), while precipitation variations lead to a classification of two regimes: 1895-1975 (dry) and 1976-2010 (wet). The KS test shows that any two regimes of the above six are statistically significantly different from each other due to clear shifts of the probability density functions. Extremes of SAT and precipitation identify the ten hottest, coldest, driest, and wettest years. Welch's t test is used to discern significant differences among these extremes. The spatial patterns of the six climate regimes and some years of extreme climate are analyzed. Although the recent two decades are the warmest among the other decades since 1895 and many hottest years measured by CONUS Tmin and Tmean are in these two decades, the hottest year according to the CONUS Tmax anomalies is 1934 (1.37 °C), which is very close to the second Tmax hottest year 2006 (1.35 °C).
NASA Astrophysics Data System (ADS)
Chavanis, Pierre-Henri
2014-04-01
We complement the literature on the statistical mechanics of point vortices in two-dimensional hydrodynamics. Using a maximum entropy principle, we determine the multi-species Boltzmann-Poisson equation and establish a form of Virial theorem. Using a maximum entropy production principle (MEPP), we derive a set of relaxation equations towards statistical equilibrium. These relaxation equations can be used as a numerical algorithm to compute the maximum entropy state. We mention the analogies with the Fokker-Planck equations derived by Debye and Hückel for electrolytes. We then consider the limit of strong mixing (or low energy). To leading order, the relationship between the vorticity and the stream function at equilibrium is linear and the maximization of the entropy becomes equivalent to the minimization of the enstrophy. This expansion is similar to the Debye-Hückel approximation for electrolytes, except that the temperature is negative instead of positive so that the effective interaction between like-sign vortices is attractive instead of repulsive. This leads to an organization at large scales presenting geometry-induced phase transitions, instead of Debye shielding. We compare the results obtained with point vortices to those obtained in the context of the statistical mechanics of continuous vorticity fields described by the Miller-Robert-Sommeria (MRS) theory. At linear order, we get the same results but differences appear at the next order. In particular, the MRS theory predicts a transition between sinh and tanh-like ω - ψ relationships depending on the sign of Ku - 3 (where Ku is the Kurtosis) while there is no such transition for point vortices which always show a sinh-like ω - ψ relationship. We derive the form of the relaxation equations in the strong mixing limit and show that the enstrophy plays the role of a Lyapunov functional.
NASA Astrophysics Data System (ADS)
Yoshimatsu, Katsunori; Schneider, Kai; Okamoto, Naoya; Kawahara, Yasuhiro; Farge, Marie
2011-09-01
Scale-dependent and geometrical statistics of three-dimensional incompressible homogeneous magnetohydrodynamic turbulence without mean magnetic field are examined by means of the orthogonal wavelet decomposition. The flow is computed by direct numerical simulation with a Fourier spectral method at resolution 5123 and a unit magnetic Prandtl number. Scale-dependent second and higher order statistics of the velocity and magnetic fields allow to quantify their intermittency in terms of spatial fluctuations of the energy spectra, the flatness, and the probability distribution functions at different scales. Different scale-dependent relative helicities, e.g., kinetic, cross, and magnetic relative helicities, yield geometrical information on alignment between the different scale-dependent fields. At each scale, the alignment between the velocity and magnetic field is found to be more pronounced than the other alignments considered here, i.e., the scale-dependent alignment between the velocity and vorticity, the scale-dependent alignment between the magnetic field and its vector potential, and the scale-dependent alignment between the magnetic field and the current density. Finally, statistical scale-dependent analyses of both Eulerian and Lagrangian accelerations and the corresponding time-derivatives of the magnetic field are performed. It is found that the Lagrangian acceleration does not exhibit substantially stronger intermittency compared to the Eulerian acceleration, in contrast to hydrodynamic turbulence where the Lagrangian acceleration shows much stronger intermittency than the Eulerian acceleration. The Eulerian time-derivative of the magnetic field is more intermittent than the Lagrangian time-derivative of the magnetic field.
Lagrangian statistics and flow topology in forced two-dimensional turbulence.
Kadoch, B; Del-Castillo-Negrete, D; Bos, W J T; Schneider, K
2011-03-01
A study of the relationship between Lagrangian statistics and flow topology in fluid turbulence is presented. The topology is characterized using the Weiss criterion, which provides a conceptually simple tool to partition the flow into topologically different regions: elliptic (vortex dominated), hyperbolic (deformation dominated), and intermediate (turbulent background). The flow corresponds to forced two-dimensional Navier-Stokes turbulence in doubly periodic and circular bounded domains, the latter with no-slip boundary conditions. In the double periodic domain, the probability density function (pdf) of the Weiss field exhibits a negative skewness consistent with the fact that in periodic domains the flow is dominated by coherent vortex structures. On the other hand, in the circular domain, the elliptic and hyperbolic regions seem to be statistically similar. We follow a Lagrangian approach and obtain the statistics by tracking large ensembles of passively advected tracers. The pdfs of residence time in the topologically different regions are computed introducing the Lagrangian Weiss field, i.e., the Weiss field computed along the particles' trajectories. In elliptic and hyperbolic regions, the pdfs of the residence time have self-similar algebraic decaying tails. In contrast, in the intermediate regions the pdf has exponential decaying tails. The conditional pdfs (with respect to the flow topology) of the Lagrangian velocity exhibit Gaussian-like behavior in the periodic and in the bounded domains. In contrast to the freely decaying turbulence case, the conditional pdfs of the Lagrangian acceleration in forced turbulence show a comparable level of intermittency in both the periodic and the bounded domains. The conditional pdfs of the Lagrangian curvature are characterized, in all cases, by self-similar power-law behavior with a decay exponent of order -2. PMID:21517594
NASA Astrophysics Data System (ADS)
Abramson, Louis Evan; Imacs Cluster Building Survey
2015-01-01
The growth of galaxies is a central theme of the cosmological narrative, but we do not yet understand how these objects build their stellar populations over time. Largely, this is because star formation histories must be inferred from statistical metrics (at z > 0), e.g., the cosmic star formation rate density, the stellar mass function, and the SFR/stellar mass relation. The relationship between these observations and the behavior of individual systems is unclear, but it deeply affects views on galaxy evolution. Here, I discuss key issues complicating this relationship, and explore attempts to deal with them from both 'population-down' and 'galaxy-up' perspectives. I suggest that these interpretations ultimately differ in their emphasis on astrophysical processes that 'quench' versus those that diversify galaxies, and the extent to which individual star formation histories encode these processes. I close by highlighting observations which might soon reveal the accuracy of either vision.
Statistics of extreme waves in the framework of one-dimensional Nonlinear Schrodinger Equation
NASA Astrophysics Data System (ADS)
Agafontsev, Dmitry; Zakharov, Vladimir
2013-04-01
We examine the statistics of extreme waves for one-dimensional classical focusing Nonlinear Schrodinger (NLS) equation, iÎ¨t + Î¨xx + |Î¨ |2Î¨ = 0, (1) as well as the influence of the first nonlinear term beyond Eq. (1) - the six-wave interactions - on the statistics of waves in the framework of generalized NLS equation accounting for six-wave interactions, dumping (linear dissipation, two- and three-photon absorption) and pumping terms, We solve these equations numerically in the box with periodically boundary conditions starting from the initial data Î¨t=0 = F(x) + ?(x), where F(x) is an exact modulationally unstable solution of Eq. (1) seeded by stochastic noise ?(x) with fixed statistical properties. We examine two types of initial conditions F(x): (a) condensate state F(x) = 1 for Eq. (1)-(2) and (b) cnoidal wave for Eq. (1). The development of modulation instability in Eq. (1)-(2) leads to formation of one-dimensional wave turbulence. In the integrable case the turbulence is called integrable and relaxes to one of infinite possible stationary states. Addition of six-wave interactions term leads to appearance of collapses that eventually are regularized by the dumping terms. The energy lost during regularization of collapses in (2) is restored by the pumping term. In the latter case the system does not demonstrate relaxation-like behavior. We measure evolution of spectra Ik =< |Î¨k|2 >, spatial correlation functions and the PDFs for waves amplitudes |Î¨|, concentrating special attention on formation of "fat tails" on the PDFs. For the classical integrable NLS equation (1) with condensate initial condition we observe Rayleigh tails for extremely large waves and a "breathing region" for middle waves with oscillations of the frequency of waves appearance with time, while nonintegrable NLS equation with dumping and pumping terms (2) with the absence of six-wave interactions α = 0 demonstrates perfectly Rayleigh PDFs without any oscillations with
NASA Astrophysics Data System (ADS)
Germa, Aurelie; Connor, Laura; Connor, Chuck; Malservisi, Rocco
2015-04-01
One challenge of volcanic hazard assessment in distributed volcanic fields (large number of small-volume basaltic volcanoes along with one or more silicic central volcanoes) is to constrain the location of future activity. Although the extent of the source of melts at depth can be known using geophysical methods or the location of past eruptive vents, the location of preferential pathways and zones of higher magma flux are still unobserved. How does the spatial distribution of eruptive vents at the surface reveal the location of magma sources or focusing? When this distribution is investigated, the location of central polygenetic edifices as well as clusters of monogenetic volcanoes denote zones of high magma flux and recurrence rate, whereas areas of dispersed monogenetic vents represent zones of lower flux. Additionally, central polygenetic edifices, acting as magma filters, prevent dense mafic magmas from reaching the surface close to their central silicic system. Subsequently, the spatial distribution of mafic monogenetic vents may provide clues to the subsurface structure of a volcanic field, such as the location of magma sources, preferential magma pathways, and flux distribution across the field. Gathering such data is of highly importance in improving the assessment of volcanic hazards. We are developing a modeling framework that compares output of statistical models of vent distribution with outputs form numerical models of subsurface magma transport. Geologic data observed at the Earth's surface are used to develop statistical models of spatial intensity (vents per unit area), volume intensity (erupted volume per unit area) and volume-flux intensity (erupted volume per unit time and area). Outputs are in the form of probability density functions assumed to represent volcanic flow output at the surface. These are then compared to outputs from conceptual models of the subsurface processes of magma storage and transport. These models are using Darcy's law
NASA Astrophysics Data System (ADS)
Onishi, Ryo; Vassilicos, J. C.
2014-11-01
This study investigates the collision statistics of inertial particles in inverse-cascading 2D homogeneous isotropic turbulence by means of a direct numerical simulation (DNS). A collision kernel model for particles with small Stokes number (St) in 2D flows is proposed based on the model of Saffman & Turner (1956) (ST56 model). The DNS results agree with this 2D version of the ST56 model for St < 0.1. It is then confirmed that our DNS results satisfy the 2D version of the spherical formulation of the collision kernel. The fact that the flatness factor stays around 3 in our 2D flow confirms that the present 2D turbulent flow is nearly intermittency-free. Collision statistics for St = 0.1, 0.4 and 0.6, i.e. for St <1, are obtained from the present 2D DNS and compared with those obtained from the three-dimensional (3D) DNS of Onishi et al. (2013). We have observed that the 3D radial distribution function at contact (g(R), the so-called clustering effect) decreases for St = 0.4 and 0.6 with increasing Reynolds number, while the 2D g(R) does not show a significant dependence on Reynolds number. This observation supports the view that the Reynolds-number dependence of g(R) observed in three dimensions is due to internal intermittency of the 3D turbulence. We have further investigated the local St, which is a function of the local flow strain rates, and proposed a plausible mechanism that can explain the Reynolds-number dependence of g(R).
A one-dimensional statistical mechanics model for nucleosome positioning on genomic DNA.
Tesoro, S; Ali, I; Morozov, A N; Sulaiman, N; Marenduzzo, D
2016-02-01
The first level of folding of DNA in eukaryotes is provided by the so-called '10 nm chromatin fibre', where DNA wraps around histone proteins (∼10 nm in size) to form nucleosomes, which go on to create a zig-zagging bead-on-a-string structure. In this work we present a one-dimensional statistical mechanics model to study nucleosome positioning within one such 10 nm fibre. We focus on the case of genomic sheep DNA, and we start from effective potentials valid at infinite dilution and determined from high-resolution in vitro salt dialysis experiments. We study positioning within a polynucleosome chain, and compare the results for genomic DNA to that obtained in the simplest case of homogeneous DNA, where the problem can be mapped to a Tonks gas. First, we consider the simple, analytically solvable, case where nucleosomes are assumed to be point-like. Then, we perform numerical simulations to gauge the effect of their finite size on the nucleosomal distribution probabilities. Finally we compare nucleosome distributions and simulated nuclease digestion patterns for the two cases (homogeneous and sheep DNA), thereby providing testable predictions of the effect of sequence on experimentally observable quantities in experiments on polynucleosome chromatin fibres reconstituted in vitro. PMID:26871546
A statistical mechanical theory for a two-dimensional model of water
NASA Astrophysics Data System (ADS)
Urbic, Tomaz; Dill, Ken A.
2010-06-01
We develop a statistical mechanical model for the thermal and volumetric properties of waterlike fluids. Each water molecule is a two-dimensional disk with three hydrogen-bonding arms. Each water interacts with neighboring waters through a van der Waals interaction and an orientation-dependent hydrogen-bonding interaction. This model, which is largely analytical, is a variant of the Truskett and Dill (TD) treatment of the "Mercedes-Benz" (MB) model. The present model gives better predictions than TD for hydrogen-bond populations in liquid water by distinguishing strong cooperative hydrogen bonds from weaker ones. We explore properties versus temperature T and pressure p. We find that the volumetric and thermal properties follow the same trends with T as real water and are in good general agreement with Monte Carlo simulations of MB water, including the density anomaly, the minimum in the isothermal compressibility, and the decreased number of hydrogen bonds for increasing temperature. The model reproduces that pressure squeezes out water's heat capacity and leads to a negative thermal expansion coefficient at low temperatures. In terms of water structuring, the variance in hydrogen-bonding angles increases with both T and p, while the variance in water density increases with T but decreases with p. Hydrogen bonding is an energy storage mechanism that leads to water's large heat capacity (for its size) and to the fragility in its cagelike structures, which are easily melted by temperature and pressure to a more van der Waals-like liquid state.
A statistical mechanical theory for a two-dimensional model of water.
Urbic, Tomaz; Dill, Ken A
2010-06-14
We develop a statistical mechanical model for the thermal and volumetric properties of waterlike fluids. Each water molecule is a two-dimensional disk with three hydrogen-bonding arms. Each water interacts with neighboring waters through a van der Waals interaction and an orientation-dependent hydrogen-bonding interaction. This model, which is largely analytical, is a variant of the Truskett and Dill (TD) treatment of the "Mercedes-Benz" (MB) model. The present model gives better predictions than TD for hydrogen-bond populations in liquid water by distinguishing strong cooperative hydrogen bonds from weaker ones. We explore properties versus temperature T and pressure p. We find that the volumetric and thermal properties follow the same trends with T as real water and are in good general agreement with Monte Carlo simulations of MB water, including the density anomaly, the minimum in the isothermal compressibility, and the decreased number of hydrogen bonds for increasing temperature. The model reproduces that pressure squeezes out water's heat capacity and leads to a negative thermal expansion coefficient at low temperatures. In terms of water structuring, the variance in hydrogen-bonding angles increases with both T and p, while the variance in water density increases with T but decreases with p. Hydrogen bonding is an energy storage mechanism that leads to water's large heat capacity (for its size) and to the fragility in its cagelike structures, which are easily melted by temperature and pressure to a more van der Waals-like liquid state. PMID:20550408
A three-dimensional statistical approach to improved image quality for multislice helical CT.
Thibault, Jean-Baptiste; Sauer, Ken D; Bouman, Charles A; Hsieh, Jiang
2007-11-01
Multislice helical computed tomography scanning offers the advantages of faster acquisition and wide organ coverage for routine clinical diagnostic purposes. However, image reconstruction is faced with the challenges of three-dimensional cone-beam geometry, data completeness issues, and low dosage. Of all available reconstruction methods, statistical iterative reconstruction (IR) techniques appear particularly promising since they provide the flexibility of accurate physical noise modeling and geometric system description. In this paper, we present the application of Bayesian iterative algorithms to real 3D multislice helical data to demonstrate significant image quality improvement over conventional techniques. We also introduce a novel prior distribution designed to provide flexibility in its parameters to fine-tune image quality. Specifically, enhanced image resolution and lower noise have been achieved, concurrently with the reduction of helical cone-beam artifacts, as demonstrated by phantom studies. Clinical results also illustrate the capabilities of the algorithm on real patient data. Although computational load remains a significant challenge for practical development, superior image quality combined with advancements in computing technology make IR techniques a legitimate candidate for future clinical applications. PMID:18072519
Large Deviations of Radial Statistics in the Two-Dimensional One-Component Plasma
NASA Astrophysics Data System (ADS)
Cunden, Fabio Deelan; Mezzadri, Francesco; Vivo, Pierpaolo
2016-07-01
The two-dimensional one-component plasma is an ubiquitous model for several vortex systems. For special values of the coupling constant β q^2 (where q is the particles charge and β the inverse temperature), the model also corresponds to the eigenvalues distribution of normal matrix models. Several features of the system are discussed in the limit of large number N of particles for generic values of the coupling constant. We show that the statistics of a class of radial observables produces a rich phase diagram, and their asymptotic behaviour in terms of large deviation functions is calculated explicitly, including next-to-leading terms up to order 1 / N. We demonstrate a split-off phenomenon associated to atypical fluctuations of the edge density profile. We also show explicitly that a failure of the fluid phase assumption of the plasma can break a genuine 1 / N-expansion of the free energy. Our findings are corroborated by numerical comparisons with exact finite-N formulae valid for β q^2=2.
Local Packing Fraction Statistics in a Two-Dimensional Granular Media
NASA Astrophysics Data System (ADS)
Puckett, James; Lechenault, Frederic; Daniels, Karen
2010-03-01
We experimentally investigate local packing fraction statistics of a two-dimensional bidisperse granular material supported by a horizontal air table and rearranged under impulses from the boundary. Our apparatus permits investigation of dense liquids close to the jamming transition under either constant pressure (CP) or constant volume (CV) boundary conditions and three different coefficients of friction. We calculate the probability distribution of the local packing fraction φ using both radical Voronoi tessellations (φV) and the Central Limit Theorem (φCLT). The two distributions have the same mean: <φV>=<φCLT>. For both methods, we observe that the variance strictly decreases as the mean increases; the functional dependence reveals information about the system. The variance of φV is larger under CP than CV, as expected since the cell volumes adjust to fluctuations in global volume. Interestingly, this feature is missing from φCLT. Instead, the variance of φCLT is sensitive to the internal friction of the system.
NASA Astrophysics Data System (ADS)
Lupu-Sax, Adam; Smolyarenko, Igor; Kaplan, Lev; Heller, Eric
1998-03-01
Recent theoretical work on statistics of local eigenstate intensities in weakly disordered two-dimensional metals(Falko & Efetov, PRB, 1995, 52(24), 17413-29; Smolyarenko & Altshuler, PRB, 1997, 55(16), 10451-66) predicts a logarithmically normal form of the distribution of local eigenstate intensities |ψ|^2 in the asymptotic region L^2|ψ|^2>>l/ln L, where the mean free path l and the system size L are measured in the units of the wavelength λ. We use a new scattering theory method(Lupu-Sax & Heller, talk in session 38c, paper in preparation) to find and compute eigenstates numerically at high speed which allows us to investigate previously inaccessible tails of the distribution function. We observe the log-normal form of the far asymptotic region of the distribution function of |ψ|^2 in the model of a single particle moving in the potential formed by randomly placed pointlike scatterers in a 2D integrable or chaotic billiard. We study the parameters of the log-normal distribution as functions of l and L and analyze the spatial structure of ``anomalous'' wavefunctions (those with a value of |ψ|^2 satisfying the above inequality somewhere in the sample), as well as the scatterer arrangements which produce them. The results are compared to theoretical predictions^1,(Mirlin, J. Math. Phys., 1997, 38(4), 1888-917).
A one-dimensional statistical mechanics model for nucleosome positioning on genomic DNA
NASA Astrophysics Data System (ADS)
Tesoro, S.; Ali, I.; Morozov, A. N.; Sulaiman, N.; Marenduzzo, D.
2016-02-01
The first level of folding of DNA in eukaryotes is provided by the so-called ‘10 nm chromatin fibre’, where DNA wraps around histone proteins (∼10 nm in size) to form nucleosomes, which go on to create a zig-zagging bead-on-a-string structure. In this work we present a one-dimensional statistical mechanics model to study nucleosome positioning within one such 10 nm fibre. We focus on the case of genomic sheep DNA, and we start from effective potentials valid at infinite dilution and determined from high-resolution in vitro salt dialysis experiments. We study positioning within a polynucleosome chain, and compare the results for genomic DNA to that obtained in the simplest case of homogeneous DNA, where the problem can be mapped to a Tonks gas [1]. First, we consider the simple, analytically solvable, case where nucleosomes are assumed to be point-like. Then, we perform numerical simulations to gauge the effect of their finite size on the nucleosomal distribution probabilities. Finally we compare nucleosome distributions and simulated nuclease digestion patterns for the two cases (homogeneous and sheep DNA), thereby providing testable predictions of the effect of sequence on experimentally observable quantities in experiments on polynucleosome chromatin fibres reconstituted in vitro.
Large Deviations of Radial Statistics in the Two-Dimensional One-Component Plasma
NASA Astrophysics Data System (ADS)
Cunden, Fabio Deelan; Mezzadri, Francesco; Vivo, Pierpaolo
2016-09-01
The two-dimensional one-component plasma is an ubiquitous model for several vortex systems. For special values of the coupling constant β q^2 (where q is the particles charge and β the inverse temperature), the model also corresponds to the eigenvalues distribution of normal matrix models. Several features of the system are discussed in the limit of large number N of particles for generic values of the coupling constant. We show that the statistics of a class of radial observables produces a rich phase diagram, and their asymptotic behaviour in terms of large deviation functions is calculated explicitly, including next-to-leading terms up to order 1 / N. We demonstrate a split-off phenomenon associated to atypical fluctuations of the edge density profile. We also show explicitly that a failure of the fluid phase assumption of the plasma can break a genuine 1 / N-expansion of the free energy. Our findings are corroborated by numerical comparisons with exact finite- N formulae valid for β q^2=2.
NASA Astrophysics Data System (ADS)
Iizumi, T.; Nishimori, M.; Yokozawa, M.; Kotera, A.; Khang, N. D.
2008-12-01
Long-term daily global solar radiation (GSR) data of the same quality in the 20th century has been needed as a baseline to assess the climate change impact on paddy rice production in Vietnamese Mekong Delta area (MKD: 104.5-107.5oE/8.2-11.2oN). However, though sunshine duration data is available, the accessibility of GSR data is quite poor in MKD. This study estimated the daily GSR in MKD for 30-yr (1978- 2007) by applying the statistical downscaling method (SDM). The estimates of GSR was obtained from four different sources: (1) the combined equations with the corrected reanalysis data of daily maximum/minimum temperatures, relative humidity, sea level pressure, and precipitable water; (2) the correction equation with the reanalysis data of downward shortwave radiation; (3) the empirical equation with the observed sunshine duration; and (4) the observation at one site for short term. Three reanalysis data, i.e., NCEP-R1, ERA-40, and JRA-25, were used. Also the observed meteorological data, which includes many missing data, were obtained from 11 stations of the Vietnamese Meteorological Agency for 28-yr and five stations of the Global Summary of the Day for 30-yr. The observed GSR data for 1-yr was obtained from our station. Considering the use of data with many missing data for analysis, the Bayesian inference was used for this study, which has the powerful capability to optimize multiple parameters in a non-linear and hierarchical model. The Bayesian inference provided the posterior distributions of 306 parameter values relating to the combined equations, the empirical equation, and the correction equation. The preliminary result shows that the amplitude of daily fluctuation of modeled GSR was underestimated by the empirical equation and the correction equation. The combination of SDM and Bayesian inference has a potential to estimate the long- term daily GSR of the same quality even though in the area where the observed data is quite limited.
NASA Astrophysics Data System (ADS)
Kumar, Ranjeet; Chandra, Navin; Tomar, Surekha
2016-02-01
This paper deals with the role of triple encounters with low initial velocities and equal masses in the framework of statistical escape theory in two-dimensional space. This system is described by allowing for both energy and angular momentum conservation in the phase space. The complete statistical solutions (i.e. the semi-major axis `a', the distributions of eccentricity `e', and energy Eb of the final binary, escape energy Es of escaper and its escape velocity vs) of the system are calculated. These are in good agreement with the numerical results of Chandra and Bhatnagar (1999) in the range of perturbing velocities vi (10^{-1} ≤ vi ≤ 10^{-10}) in two-dimensional space. The double limit process has been applied to the system. It is observed that when vi to 0^{ +}, a vs2 to 2 / 3 for all directions in two-dimensional space.
Statistics of active and passive scalars in one-dimensional compressible turbulence.
Ni, Qionglin; Chen, Shiyi
2012-12-01
Statistics of the active temperature and passive concentration advected by the one-dimensional stationary compressible turbulence at Re_{λ}=2.56×10^{6} and M_{t}=1.0 is investigated by using direct numerical simulation with all-scale forcing. It is observed that the signal of velocity, as well as the two scalars, is full of small-scale sawtooth structures. The temperature spectrum corresponds to G(k)∝k^{-5/3}, whereas the concentration spectrum acts as a double power law of H(k)∝k^{-5/3} and H(k)∝k^{-7/3}. The probability distribution functions (PDFs) for the two scalar increments show that both δT and δC are strongly intermittent at small separation distance r and gradually approach the Gaussian distribution as r increases. Simultaneously, the exponent values of the PDF tails for the large negative scalar gradients are q_{θ}=-4.0 and q_{ζ}=-3.0, respectively. A single power-law region of finite width is identified in the structure function (SF) of δT; however, in the SF of δC, there are two regions with the exponents taken as a local minimum and a local maximum. As for the scalings of the two SFs, they are close to the Burgers and Obukhov-Corrsin scalings, respectively. Moreover, the negative filtered flux at large scales and the time-increasing total variance give evidences to the existence of an inverse cascade of the passive concentration, which is induced by the implosive collapse in the Lagrangian trajectories. PMID:23368038
A statistical mechanical theory for a two-dimensional model of water
Urbic, Tomaz; Dill, Ken A.
2010-01-01
We develop a statistical mechanical model for the thermal and volumetric properties of waterlike fluids. Each water molecule is a two-dimensional disk with three hydrogen-bonding arms. Each water interacts with neighboring waters through a van der Waals interaction and an orientation-dependent hydrogen-bonding interaction. This model, which is largely analytical, is a variant of the Truskett and Dill (TD) treatment of the “Mercedes-Benz” (MB) model. The present model gives better predictions than TD for hydrogen-bond populations in liquid water by distinguishing strong cooperative hydrogen bonds from weaker ones. We explore properties versus temperature T and pressure p. We find that the volumetric and thermal properties follow the same trends with T as real water and are in good general agreement with Monte Carlo simulations of MB water, including the density anomaly, the minimum in the isothermal compressibility, and the decreased number of hydrogen bonds for increasing temperature. The model reproduces that pressure squeezes out water’s heat capacity and leads to a negative thermal expansion coefficient at low temperatures. In terms of water structuring, the variance in hydrogen-bonding angles increases with both T and p, while the variance in water density increases with T but decreases with p. Hydrogen bonding is an energy storage mechanism that leads to water’s large heat capacity (for its size) and to the fragility in its cagelike structures, which are easily melted by temperature and pressure to a more van der Waals-like liquid state. PMID:20550408
Computationally efficient Bayesian inference for inverse problems.
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
NASA Astrophysics Data System (ADS)
Verma, Sanjeet K.; Oliveira, Elson P.
2013-08-01
In present work, we applied two sets of new multi-dimensional geochemical diagrams (Verma et al., 2013) obtained from linear discriminant analysis (LDA) of natural logarithm-transformed ratios of major elements and immobile major and trace elements in acid magmas to decipher plate tectonic settings and corresponding probability estimates for Paleoproterozoic rocks from Amazonian craton, São Francisco craton, São Luís craton, and Borborema province of Brazil. The robustness of LDA minimizes the effects of petrogenetic processes and maximizes the separation among the different tectonic groups. The probability based boundaries further provide a better objective statistical method in comparison to the commonly used subjective method of determining the boundaries by eye judgment. The use of readjusted major element data to 100% on an anhydrous basis from SINCLAS computer program, also helps to minimize the effects of post-emplacement compositional changes and analytical errors on these tectonic discrimination diagrams. Fifteen case studies of acid suites highlighted the application of these diagrams and probability calculations. The first case study on Jamon and Musa granites, Carajás area (Central Amazonian Province, Amazonian craton) shows a collision setting (previously thought anorogenic). A collision setting was clearly inferred for Bom Jardim granite, Xingú area (Central Amazonian Province, Amazonian craton) The third case study on Older São Jorge, Younger São Jorge and Maloquinha granites Tapajós area (Ventuari-Tapajós Province, Amazonian craton) indicated a within-plate setting (previously transitional between volcanic arc and within-plate). We also recognized a within-plate setting for the next three case studies on Aripuanã and Teles Pires granites (SW Amazonian craton), and Pitinga area granites (Mapuera Suite, NW Amazonian craton), which were all previously suggested to have been emplaced in post-collision to within-plate settings. The seventh case
Saberi, A A; Rouhani, S
2009-03-01
We investigate the statistics of isoheight lines of (2+1) -dimensional Kardar-Parisi-Zhang model at different level sets around the mean height in the saturation regime. We find that the exponent describing the distribution of the height-cluster size behaves differently for level cuts above and below the mean height, while the fractal dimensions of the height-clusters and their perimeters remain unchanged. The statistics of the winding angle confirms the previous observation that these contour lines are in the same universality class as self-avoiding random walks. PMID:19392013
Statistical Entropy of Four-Dimensional Rotating Black Holes from Near-Horizon Geometry
Cvetic, M.; Larsen, F.; Cvetic, M.
1999-01-01
We show that a class of four-dimensional rotating black holes allow five-dimensional embeddings as black rotating strings. Their near-horizon geometry factorizes locally as a product of the three-dimensional anti{endash}de Sitter space-time and a two-dimensional sphere (AdS{sub 3}{times}S{sup 2} ), with angular momentum encoded in the global space-time structure. Following the observation that the isometries on the AdS{sub 3} space induce a two-dimensional (super)conformal field theory on the boundary, we reproduce the microscopic entropy with the correct dependence on the black hole angular momentum. {copyright} {ital 1999} {ital The American Physical Society }
On the criticality of inferred models
NASA Astrophysics Data System (ADS)
Mastromatteo, Iacopo; Marsili, Matteo
2011-10-01
Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.
NASA Technical Reports Server (NTRS)
Bonavito, N. L.; Gordon, C. L.; Inguva, R.; Serafino, G. N.; Barnes, R. A.
1994-01-01
NASA's Mission to Planet Earth (MTPE) will address important interdisciplinary and environmental issues such as global warming, ozone depletion, deforestation, acid rain, and the like with its long term satellite observations of the Earth and with its comprehensive Data and Information System. Extensive sets of satellite observations supporting MTPE will be provided by the Earth Observing System (EOS), while more specific process related observations will be provided by smaller Earth Probes. MTPE will use data from ground and airborne scientific investigations to supplement and validate the global observations obtained from satellite imagery, while the EOS satellites will support interdisciplinary research and model development. This is important for understanding the processes that control the global environment and for improving the prediction of events. In this paper we illustrate the potential for powerful artificial intelligence (AI) techniques when used in the analysis of the formidable problems that exist in the NASA Earth Science programs and of those to be encountered in the future MTPE and EOS programs. These techniques, based on the logical and probabilistic reasoning aspects of plausible inference, strongly emphasize the synergetic relation between data and information. As such, they are ideally suited for the analysis of the massive data streams to be provided by both MTPE and EOS. To demonstrate this, we address both the satellite imagery and model enhancement issues for the problem of ozone profile retrieval through a method based on plausible scientific inferencing. Since in the retrieval problem, the atmospheric ozone profile that is consistent with a given set of measured radiances may not be unique, an optimum statistical method is used to estimate a 'best' profile solution from the radiances and from additional a priori information.
NASA Astrophysics Data System (ADS)
Hunziker, Jürg; Laloy, Eric; Linde, Niklas
2016-04-01
Deterministic inversion procedures can often explain field data, but they only deliver one final subsurface model that depends on the initial model and regularization constraints. This leads to poor insights about the uncertainties associated with the inferred model properties. In contrast, probabilistic inversions can provide an ensemble of model realizations that accurately span the range of possible models that honor the available calibration data and prior information allowing a quantitative description of model uncertainties. We reconsider the problem of inferring the dielectric permittivity (directly related to radar velocity) structure of the subsurface by inversion of first-arrival travel times from crosshole ground penetrating radar (GPR) measurements. We rely on the DREAM_(ZS) algorithm that is a state-of-the-art Markov chain Monte Carlo (MCMC) algorithm. Such algorithms need several orders of magnitude more forward simulations than deterministic algorithms and often become infeasible in high parameter dimensions. To enable high-resolution imaging with MCMC, we use a recently proposed dimensionality reduction approach that allows reproducing 2D multi-Gaussian fields with far fewer parameters than a classical grid discretization. We consider herein a dimensionality reduction from 5000 to 257 unknowns. The first 250 parameters correspond to a spectral representation of random and uncorrelated spatial fluctuations while the remaining seven geostatistical parameters are (1) the standard deviation of the data error, (2) the mean and (3) the variance of the relative electric permittivity, (4) the integral scale along the major axis of anisotropy, (5) the anisotropy angle, (6) the ratio of the integral scale along the minor axis of anisotropy to the integral scale along the major axis of anisotropy and (7) the shape parameter of the Matérn function. The latter essentially defines the type of covariance function (e.g., exponential, Whittle, Gaussian). We present
Zhao, Xi; Dellandréa, Emmanuel; Chen, Liming; Kakadiaris, Ioannis A
2011-10-01
Three-dimensional face landmarking aims at automatically localizing facial landmarks and has a wide range of applications (e.g., face recognition, face tracking, and facial expression analysis). Existing methods assume neutral facial expressions and unoccluded faces. In this paper, we propose a general learning-based framework for reliable landmark localization on 3-D facial data under challenging conditions (i.e., facial expressions and occlusions). Our approach relies on a statistical model, called 3-D statistical facial feature model, which learns both the global variations in configurational relationships between landmarks and the local variations of texture and geometry around each landmark. Based on this model, we further propose an occlusion classifier and a fitting algorithm. Results from experiments on three publicly available 3-D face databases (FRGC, BU-3-DFE, and Bosphorus) demonstrate the effectiveness of our approach, in terms of landmarking accuracy and robustness, in the presence of expressions and occlusions. PMID:21622076
Lange, Kenneth; Papp, Jeanette C.; Sinsheimer, Janet S.; Sobel, Eric M.
2014-01-01
Statistical genetics is undergoing the same transition to big data that all branches of applied statistics are experiencing. With the advent of inexpensive DNA sequencing, the transition is only accelerating. This brief review highlights some modern techniques with recent successes in statistical genetics. These include: (a) lasso penalized regression and association mapping, (b) ethnic admixture estimation, (c) matrix completion for genotype and sequence data, (d) the fused lasso and copy number variation, (e) haplotyping, (f) estimation of relatedness, (g) variance components models, and (h) rare variant testing. For more than a century, genetics has been both a driver and beneficiary of statistical theory and practice. This symbiotic relationship will persist for the foreseeable future. PMID:24955378
Hu, Jun; Li, Zhi-Wei; Ding, Xiao-Li; Zhu, Jian-Jun
2008-01-01
The Mw=7.6 Chi-Chi earthquake in Taiwan occurred in 1999 over the Chelungpu fault and caused a great surface rupture and severe damage. Differential Synthetic Aperture Radar Interferometry (DInSAR) has been applied previously to study the co-seismic ground displacements. There have however been significant limitations in the studies. First, only one-dimensional displacements along the Line-of-Sight (LOS) direction have been measured. The large horizontal displacements along the Chelungpu fault are largely missing from the measurements as the fault is nearly perpendicular to the LOS direction. Second, due to severe signal decorrelation on the hangling wall of the fault, the displacements in that area are un-measurable by differential InSAR method. We estimate the co-seismic displacements in both the azimuth and range directions with the method of SAR amplitude image matching. GPS observations at the 10 GPS stations are used to correct for the orbital ramp in the amplitude matching and to create the two-dimensional (2D) co-seismic surface displacements field using the descending ERS-2 SAR image pair. The results show that the co-seismic displacements range from about -2.0 m to 0.7 m in the azimuth direction (with the positive direction pointing to the flight direction), with the footwall side of the fault moving mainly southwards and the hanging wall side northwards. The displacements in the LOS direction range from about -0.5 m to 1.0 m, with the largest displacement occuring in the northeastern part of the hanging wall (the positive direction points to the satellite from ground). Comparing the results from amplitude matching with those from DInSAR, we can see that while only a very small fraction of the LOS displacement has been recovered by the DInSAR mehtod, the azimuth displacements cannot be well detected with the DInSAR measurements as they are almost perpendicular to the LOS. Therefore, the amplitude matching method is obviously more advantageous than the DIn
NASA Astrophysics Data System (ADS)
King, Gary; Rosen, Ori; Tanner, Martin A.
2004-09-01
This collection of essays brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half-decade has witnessed an explosion of research in ecological inference--the process of trying to infer individual behavior from aggregate data. Although uncertainties and information lost in aggregation make ecological inference one of the most problematic types of research to rely on, these inferences are required in many academic fields, as well as by legislatures and the Courts in redistricting, by business in marketing research, and by governments in policy analysis.
Melville, C A; Johnson, P C D; Smiley, E; Simpson, N; McConnachie, A; Purves, D; Osugo, M; Cooper, S-A
2016-01-01
Diagnosing mental ill-health using categorical classification systems has limited validity for clinical practice and research. Dimensions of psychopathology have greater validity than categorical diagnoses in the general population, but dimensional models have not had a significant impact on our understanding of mental ill-health and problem behaviours experienced by adults with intellectual disabilities. This paper systematically reviews the methods and findings from intellectual disabilities studies that use statistical methods to identify dimensions of psychopathology from data collected using structured assessments of psychopathology. The PRISMA framework for systematic review was used to identify studies for inclusion. Study methods were compared to best-practice guidelines on the use of exploratory factor analysis. Data from the 20 studies included suggest that it is possible to use statistical methods to model dimensions of psychopathology experienced by adults with intellectual disabilities. However, none of the studies used methods recommended for the analysis of non-continuous psychopathology data and all 20 studies used statistical methods that produce unstable results that lack reliability. Statistical modelling is a promising methodology to improve our understanding of mental ill-health experienced by adults with intellectual disabilities but future studies should use robust statistical methods to build on the existing evidence base. PMID:26852278
Statistics of Critical Points of Gaussian Fields on Large-Dimensional Spaces
Bray, Alan J.; Dean, David S.
2007-04-13
We calculate the average number of critical points of a Gaussian field on a high-dimensional space as a function of their energy and their index. Our results give a complete picture of the organization of critical points and are of relevance to glassy and disordered systems and landscape scenarios coming from the anthropic approach to string theory.
Spatial mapping and statistical reproducibility of an array of 256 one-dimensional quantum wires
Al-Taie, H. Kelly, M. J.; Smith, L. W.; Lesage, A. A. J.; Griffiths, J. P.; Beere, H. E.; Jones, G. A. C.; Ritchie, D. A.; Smith, C. G.; See, P.
2015-08-21
We utilize a multiplexing architecture to measure the conductance properties of an array of 256 split gates. We investigate the reproducibility of the pinch off and one-dimensional definition voltage as a function of spatial location on two different cooldowns, and after illuminating the device. The reproducibility of both these properties on the two cooldowns is high, the result of the density of the two-dimensional electron gas returning to a similar state after thermal cycling. The spatial variation of the pinch-off voltage reduces after illumination; however, the variation of the one-dimensional definition voltage increases due to an anomalous feature in the center of the array. A technique which quantifies the homogeneity of split-gate properties across the array is developed which captures the experimentally observed trends. In addition, the one-dimensional definition voltage is used to probe the density of the wafer at each split gate in the array on a micron scale using a capacitive model.
Applying Clustering to Statistical Analysis of Student Reasoning about Two-Dimensional Kinematics
ERIC Educational Resources Information Center
Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R.
2007-01-01
We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…
Statistical Analysis of Current Sheets in Three-dimensional Magnetohydrodynamic Turbulence
NASA Astrophysics Data System (ADS)
Zhdankin, Vladimir; Uzdensky, Dmitri A.; Perez, Jean C.; Boldyrev, Stanislav
2013-07-01
We develop a framework for studying the statistical properties of current sheets in numerical simulations of magnetohydrodynamic (MHD) turbulence with a strong guide field, as modeled by reduced MHD. We describe an algorithm that identifies current sheets in a simulation snapshot and then determines their geometrical properties (including length, width, and thickness) and intensities (peak current density and total energy dissipation rate). We then apply this procedure to simulations of reduced MHD and perform a statistical analysis on the obtained population of current sheets. We evaluate the role of reconnection by separately studying the populations of current sheets which contain magnetic X-points and those which do not. We find that the statistical properties of the two populations are different in general. We compare the scaling of these properties to phenomenological predictions obtained for the inertial range of MHD turbulence. Finally, we test whether the reconnecting current sheets are consistent with the Sweet-Parker model.
NASA Astrophysics Data System (ADS)
He, J.; Xiao, J.; Pan, Z.
2014-12-01
Associated with northward convergence of the India continent, the surface motion of the Tibetan plateau, documented mainly by dense geodetic GPS measurements, changes greatly both on magnitude and on direction in different tectonic units. The most remarkable discordance of surface motion is around the eastern Himalayan syntaxis, where GPS velocity field is rotated gradually to oppositional direction near the southeastern Tibetan plateau with respect to the northward convergence of the India continent. Such a velocity field could be result from lateral boundary conditions, since the strength of lithosphere is probably weaker in the Tibetan plateau than in the surrounding regions. However, whether the surface motion of the Tibetan plateau is affected by basal shear at base of the elastic crust, that could exist if the coupling condition between the elastic and the viscous crust were changed, is unclear. Here, we developed a large-scale three-dimensional finite element model to explore the possible existence of basal shear below the Tibetan plateau and the surrounding regions. In the model, the lateral boundaries are specified with far-field boundary condition; the blocks surrounding the Tibetan plateaulike the Tarim, the Ordos, and the South China are treat as rigid blocks; and the mean thickness of elastic crust is assumed about 25km. Then, the magnitude and distribution of basal shear stress is automatically searchedin numerical calculation to fit surface (GPS) motion of the Tibetan plateau. We find that to better fit surface motion of the Tibetan plateau, negligible basal shear stress on the base of elastic crust is needed below majority of the western and the central Tibetan plateau; Whereas, around the eastern and the southeastern Tibetan plateau, especially between the Xianshuhestrike-slip fault and the eastern Himalayan syntaxis, at least ~1.5-3.0 Mpaof basal shear stress is needed to cause rotational surface motion as GPS measurements documented. This
NASA Technical Reports Server (NTRS)
Balkanski, Yves J.; Jacob, Daniel J.; Gardner, Geraldine M.; Graustein, William C.; Turekian, Karl K.
1993-01-01
A global three-dimensional model is used to investigate the transport and tropospheric residence time of Pb-210, an aerosol tracer produced in the atmosphere by radioactive decay of Rn-222 emitted from soils. The model uses meteorological input with 4 deg x 5 deg horizontal resolution and 4-hour temporal resolution from the Goddard Institute for Space Studies general circulation model (GCM). It computes aerosol scavenging by convective precipitation as part of the wet convective mass transport operator in order to capture the coupling between vertical transport and rainout. Scavenging in convective precipitation accounts for 74% of the global Pb-210 sink in the model; scavenging in large-scale precipitation accounts for 12%, and scavenging in dry deposition accounts for 14%. The model captures 63% of the variance of yearly mean Pb-210 concentrations measured at 85 sites around the world with negligible mean bias, lending support to the computation of aerosol scavenging. There are, however, a number of regional and seasonal discrepancies that reflect in part anomalies in GCM precipitation. Computed residence times with respect to deposition for Pb-210 aerosol in the tropospheric column are about 5 days at southern midlatitudes and 10-15 days in the tropics; values at northern midlatitudes vary from about 5 days in winter to 10 days in summer. The residence time of Pb-210 produced in the lowest 0.5 km of atmosphere is on average four times shorter than that of Pb-210 produced in the upper atmosphere. Both model and observations indicate a weaker decrease of Pb-210 concentrations between the continental mixed layer and the free troposphere than is observed for total aerosol concentrations; an explanation is that Rn-222 is transported to high altitudes in wet convective updrafts, while aerosols and soluble precursors of aerosols are scavenged by precipitation in the updrafts. Thus Pb-210 is not simply a tracer of aerosols produced in the continental boundary layer, but
NASA Astrophysics Data System (ADS)
Jameson, A. R.; Larsen, M. L.
2016-06-01
Microphysical understanding of the variability in rain requires a statistical characterization of different drop sizes both in time and in all dimensions of space. Temporally, there have been several statistical characterizations of raindrop counts. However, temporal and spatial structures are neither equivalent nor readily translatable. While there are recent reports of the one-dimensional spatial correlation functions in rain, they can only be assumed to represent the two-dimensional (2D) correlation function under the assumption of spatial isotropy. To date, however, there are no actual observations of the (2D) spatial correlation function in rain over areas. Two reasons for this deficiency are the fiscal and the physical impossibilities of assembling a dense network of instruments over even hundreds of meters much less over kilometers. Consequently, all measurements over areas will necessarily be sparsely sampled. A dense network of data must then be estimated using interpolations from the available observations. In this work, a network of 19 optical disdrometers over a 100 m by 71 m area yield observations of drop spectra every minute. These are then interpolated to a 1 m resolution grid. Fourier techniques then yield estimates of the 2D spatial correlation functions. Preliminary examples using this technique found that steadier, light rain decorrelates spatially faster than does the convective rain, but in both cases the 2D spatial correlation functions are anisotropic, reflecting an asymmetry in the physical processes influencing the rain reaching the ground not accounted for in numerical microphysical models.
High-dimensional statistical measure for region-of-interest tracking.
Boltz, Sylvain; Debreuve, Eric; Barlaud, Michel
2009-06-01
This paper deals with region-of-interest (ROI) tracking in video sequences. The goal is to determine in successive frames the region which best matches, in terms of a similarity measure, a ROI defined in a reference frame. Some tracking methods define similarity measures which efficiently combine several visual features into a probability density function (PDF) representation, thus building a discriminative model of the ROI. This approach implies dealing with PDFs with domains of definition of high dimension. To overcome this obstacle, a standard solution is to assume independence between the different features in order to bring out low-dimension marginal laws and/or to make some parametric assumptions on the PDFs at the cost of generality. We discard these assumptions by proposing to compute the Kullback-Leibler divergence between high-dimensional PDFs using the k th nearest neighbor framework. In consequence, the divergence is expressed directly from the samples, i.e., without explicit estimation of the underlying PDFs. As an application, we defined 5, 7, and 13-dimensional feature vectors containing color information (including pixel-based, gradient-based and patch-based) and spatial layout. The proposed procedure performs tracking allowing for translation and scaling of the ROI. Experiments show its efficiency on a movie excerpt and standard test sequences selected for the specific conditions they exhibit: partial occlusions, variations of luminance, noise, and complex motion. PMID:19369157
Allen, J; Velsko, S
2009-11-16
This report explores the question of whether meaningful conclusions can be drawn regarding the transmission relationship between two microbial samples on the basis of differences observed between the two sample's respective genomes. Unlike similar forensic applications using human DNA, the rapid rate of microbial genome evolution combined with the dynamics of infectious disease require a shift in thinking on what it means for two samples to 'match' in support of a forensic hypothesis. Previous outbreaks for SARS-CoV, FMDV and HIV were examined to investigate the question of how microbial sequence data can be used to draw inferences that link two infected individuals by direct transmission. The results are counter intuitive with respect to human DNA forensic applications in that some genetic change rather than exact matching improve confidence in inferring direct transmission links, however, too much genetic change poses challenges, which can weaken confidence in inferred links. High rates of infection coupled with relatively weak selective pressure observed in the SARS-CoV and FMDV data lead to fairly low confidence for direct transmission links. Confidence values for forensic hypotheses increased when testing for the possibility that samples are separated by at most a few intermediate hosts. Moreover, the observed outbreak conditions support the potential to provide high confidence values for hypothesis that exclude direct transmission links. Transmission inferences are based on the total number of observed or inferred genetic changes separating two sequences rather than uniquely weighing the importance of any one genetic mismatch. Thus, inferences are surprisingly robust in the presence of sequencing errors provided the error rates are randomly distributed across all samples in the reference outbreak database and the novel sequence samples in question. When the number of observed nucleotide mutations are limited due to characteristics of the outbreak or the
Three-dimensional segmentation of the heart muscle using image statistics
NASA Astrophysics Data System (ADS)
Nillesen, Maartje M.; Lopata, Richard G. P.; Gerrits, Inge H.; Kapusta, Livia; Huisman, Henkjan H.; Thijssen, Johan M.; de Korte, Chris L.
2006-03-01
Segmentation of the heart muscle in 3D echocardiographic images provides a tool for visualization of cardiac anatomy and assessment of heart function, and serves as an important pre-processing step for cardiac strain imaging. By incorporating spatial and temporal information of 3D ultrasound image sequences (4D), a fully automated method using image statistics was developed to perform 3D segmentation of the heart muscle. 3D rf-data were acquired with a Philips SONOS 7500 live 3D ultrasound system, and an X4 matrix array transducer (2-4 MHz). Left ventricular images of five healthy children were taken in transthoracial short/long axis view. As a first step, image statistics of blood and heart muscle were investigated. Next, based on these statistics, an adaptive mean squares filter was selected and applied to the images. Window size was related to speckle size (5x2 speckles). The degree of adaptive filtering was automatically steered by the local homogeneity of tissue. As a result, discrimination of heart muscle and blood was optimized, while sharpness of edges was preserved. After this pre-processing stage, homomorphic filtering and automatic thresholding were performed to obtain the inner borders of the heart muscle. Finally, a deformable contour algorithm was used to yield a closed contour of the left ventricular cavity in each elevational plane. Each contour was optimized using contours of the surrounding planes (spatial and temporal) as limiting condition to ensure spatial and temporal continuity. Better segmentation of the ventricle was obtained using 4D information than using information of each plane separately.
Emergent exclusion statistics of quasiparticles in two-dimensional topological phases
NASA Astrophysics Data System (ADS)
Hu, Yuting; Stirling, Spencer D.; Wu, Yong-Shi
2014-03-01
We demonstrate how the generalized Pauli exclusion principle emerges for quasiparticle excitations in 2D topological phases. As an example, we examine the Levin-Wen model with the Fibonacci data (specified in the text), and construct the number operator for fluxons living on plaquettes. By numerically counting the many-body states with fluxon number fixed, the matrix of exclusion statistics parameters is identified and is shown to depend on the spatial topology (sphere or torus) of the system. Our work reveals the structure of the (many-body) Hilbert space and some general features of thermodynamics for quasiparticle excitations in topological matter.
Statistical properties of three-dimensional two-fluid plasma model
Qaisrani, M. Hasnain; Xia, ZhenWei; Zou, Dandan
2015-09-15
The nonlinear dynamics of incompressible non-dissipative two-fluid plasma model is investigated through classical Gibbs ensemble methods. Liouville's theorem of phase space for each wave number is proved, and the absolute equilibrium spectra for Galerkin truncated two-fluid model are calculated. In two-fluid theory, the equilibrium is built on the conservation of three quadratic invariants: the total energy and the self-helicities for ions and electrons fluid, respectively. The implications of statistic equilibrium spectra with arbitrary ratios of conserved invariants are discussed.
NASA Astrophysics Data System (ADS)
Nguyen van yen, Romain; Farge, Marie; Schneider, Kai
2012-02-01
Classical statistical theories of turbulence have shown their limitations, in that they cannot predict much more than the energy spectrum in an idealized setting of statistical homogeneity and stationarity. We explore the applicability of a conditional statistical modeling approach: can we sort out what part of the information should be kept, and what part should be modeled statistically, or, in other words, “dissipated”? Our mathematical framework is the initial value problem for the two-dimensional (2D) Euler equations, which we approximate numerically by solving the 2D Navier-Stokes equations in the vanishing viscosity limit. In order to obtain a good approximation of the inviscid dynamics, we use a spectral method and a resolution going up to 8192 2. We introduce a macroscopic concept of dissipation, relying on a split of the flow between coherent and incoherent contributions: the coherent flow is constructed from the large wavelet coefficients of the vorticity field, and the incoherent flow from the small ones. In previous work, a unique threshold was applied to all wavelet coefficients, while here we also consider the effect of a scale by scale thresholding algorithm, called scale-wise coherent vorticity extraction. We study the statistical properties of the coherent and incoherent vorticity fields, and the transfers of enstrophy between them, and then use these results to propose, within a maximum entropy framework, a simple model for the incoherent vorticity. In the framework of this model, we show that the flow velocity can be predicted accurately in the L2 norm for about 10 eddy turnover times.
Vorticity statistics in the direct cascade of two-dimensional turbulence.
Falkovich, Gregory; Lebedev, Vladimir
2011-04-01
For the direct cascade of steady two-dimensional (2D) Navier-Stokes turbulence, we derive analytically the probability of strong vorticity fluctuations. When ϖ is the vorticity coarse-grained over a scale R, the probability density function (PDF), P(ϖ), has a universal asymptotic behavior lnP~-ϖ/ϖ(rms) at ϖ≫ϖ(rms)=[Hln(L/R)](1/3), where H is the enstrophy flux and L is the pumping length. Therefore, the PDF has exponential tails and is self-similar, that is, it can be presented as a function of a single argument, ϖ/ϖ(rms), in distinction from other known direct cascades. PMID:21599229
NASA Astrophysics Data System (ADS)
Derrida, Bernard; Meerson, Baruch; Sasorov, Pavel V.
2016-04-01
Consider a one-dimensional branching Brownian motion and rescale the coordinate and time so that the rates of branching and diffusion are both equal to 1. If X1(t ) is the position of the rightmost particle of the branching Brownian motion at time t , the empirical velocity c of this rightmost particle is defined as c =X1(t ) /t . Using the Fisher-Kolmogorov-Petrovsky-Piscounov equation, we evaluate the probability distribution P (c ,t ) of this empirical velocity c in the long-time t limit for c >2 . It is already known that, for a single seed particle, P (c ,t ) ˜exp[-(c2/4 -1 ) t ] up to a prefactor that can depend on c and t . Here we show how to determine this prefactor. The result can be easily generalized to the case of multiple seed particles and to branching random walks associated with other traveling-wave equations.
Derrida, Bernard; Meerson, Baruch; Sasorov, Pavel V
2016-04-01
Consider a one-dimensional branching Brownian motion and rescale the coordinate and time so that the rates of branching and diffusion are both equal to 1. If X_{1}(t) is the position of the rightmost particle of the branching Brownian motion at time t, the empirical velocity c of this rightmost particle is defined as c=X_{1}(t)/t. Using the Fisher-Kolmogorov-Petrovsky-Piscounov equation, we evaluate the probability distribution P(c,t) of this empirical velocity c in the long-time t limit for c>2. It is already known that, for a single seed particle, P(c,t)∼exp[-(c^{2}/4-1)t] up to a prefactor that can depend on c and t. Here we show how to determine this prefactor. The result can be easily generalized to the case of multiple seed particles and to branching random walks associated with other traveling-wave equations. PMID:27176286
Universal asymptotic statistics of maximal relative height in one-dimensional solid-on-solid models
NASA Astrophysics Data System (ADS)
Schehr, Grégory; Majumdar, Satya N.
2006-05-01
We study the probability density function P(hm,L) of the maximum relative height hm in a wide class of one-dimensional solid-on-solid models of finite size L . For all these lattice models, in the large- L limit, a central limit argument shows that, for periodic boundary conditions, P(hm,L) takes a universal scaling form P(hm,L)˜(12wL)-1f(hm/(12wL)) , with wL the width of the fluctuating interface and f(x) the Airy distribution function. For one instance of these models, corresponding to the extremely anisotropic Ising model in two dimensions, this result is obtained by an exact computation using the transfer matrix technique, valid for any L>0 . These arguments and exact analytical calculations are supported by numerical simulations, which show in addition that the subleading scaling function is also universal, up to a nonuniversal amplitude, and simply given by the derivative of the Airy distribution function f'(x) .
NASA Astrophysics Data System (ADS)
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.
Malinowski, Kathleen T.; Pantarotto, Jason R.; Senan, Suresh
2010-08-01
Purpose: To investigate the feasibility of modeling Stage III lung cancer tumor and node positions from anatomical surrogates. Methods and Materials: To localize their centroids, the primary tumor and lymph nodes from 16 Stage III lung cancer patients were contoured in 10 equal-phase planning four-dimensional (4D) computed tomography (CT) image sets. The centroids of anatomical respiratory surrogates (carina, xyphoid, nipples, mid-sternum) in each image set were also localized. The correlations between target and surrogate positions were determined, and ordinary least-squares (OLS) and partial least-squares (PLS) regression models based on a subset of respiratory phases (three to eight randomly selected) were created to predict the target positions in the remaining images. The three-phase image sets that provided the best predictive information were used to create models based on either the carina alone or all surrogates. Results: The surrogate most correlated with target motion varied widely. Depending on the number of phases used to build the models, mean OLS and PLS errors were 1.0 to 1.4 mm and 0.8 to 1.0 mm, respectively. Models trained on the 0%, 40%, and 80% respiration phases had mean ({+-} standard deviation) PLS errors of 0.8 {+-} 0.5 mm and 1.1 {+-} 1.1 mm for models based on all surrogates and carina alone, respectively. For target coordinates with motion >5 mm, the mean three-phase PLS error based on all surrogates was 1.1 mm. Conclusions: Our results establish the feasibility of inferring primary tumor and nodal motion from anatomical surrogates in 4D CT scans of Stage III lung cancer. Using inferential modeling to decrease the processing time of 4D CT scans may facilitate incorporation of patient-specific treatment margins.
Inverse Ising inference with correlated samples
NASA Astrophysics Data System (ADS)
Obermayer, Benedikt; Levine, Erel
2014-12-01
Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem.
NASA Astrophysics Data System (ADS)
Nair, Anish Kumar M.; Rajeev, Kunjukrishnapillai
2012-07-01
Long-term (2006-2011) monthly and seasonal mean vertical distributions of clouds and their spatial variations over the Indian subcontinent and surrounding oceanic regions have been derived using data obtained from the space-borne radar, CloudSat. Together with the data from space-borne imagers (Kalpana-1-VHRR and NOAA-AVHRR), this provide insight into the 3-dimensional distribution of clouds and its linkage with dominant tropical dynamical features, which are largely unexplored over the Indian region. Meridonal cross sections of ITCZ, inferred from the vertical distribution of clouds, clearly reveal the relatively narrow structure of ITCZ flanked by thick cirrus outflows in the upper troposphere on either side. The base of cirrus clouds in the outflow region significantly increases away from the ITCZ core, while the corresponding variations in cirrus top is negligible, resulting in considerable thinning of cirrus away from the ITCZ. This provides direct observational evidence for the infrared radiative heating at cloud base and its role in regulating the cirrus lifetime through sublimation. On average, the frequency of occurrence of clouds rapidly decreases with altitude in the altitude band of 12-14 km, which corresponds to the convective tropopause altitude. North-south inclination and east-west asymmetry of ITCZ during the winter season are distinctly clear in the vertical distribution of clouds, which provide information on the pathways for inter-hemispheric transport over the Indian Ocean during this season. During the Asian summer monsoon season (June-September), substantial amount of deep convective clouds are found to occur over the North Bay of Bengal, extending up to an altitude of >14 km, which is ~1-2 km higher than that over other deep convective regions. This has potential implications in the pumping of tropospheric airmass across the tropical tropopause over the region. This study characterizes a pool of inhibited cloudiness over the southwest Bay of
NASA Astrophysics Data System (ADS)
Mackay, R. M.; Khalil, M. A. K.
1995-10-01
The zonally averaged response of the Global Change Research Center two-dimensional (2-D) statistical dynamical climate model (GCRC 2-D SDCM) to a doubling of atmospheric carbon dioxide (350 parts per million by volume (ppmv) to 700 ppmv) is reported. The model solves the two-dimensional primitive equations in finite difference form (mass continuity, Newton's second law, and the first law of thermodynamics) for the prognostic variables: zonal mean density, zonal mean zonal velocity, zonal mean meridional velocity, and zonal mean temperature on a grid that has 18 nodes in latitude and 9 vertical nodes (plus the surface). The equation of state, p=ρRT, and an assumed hydrostatic atmosphere, Delta;p=-ρgΔz, are used to diagnostically calculate the zonal mean pressure and vertical velocity for each grid node, and the moisture balance equation is used to estimate the precipitation rate. The model includes seasonal variations in solar intensity, including the effects of eccentricity, and has observed land and ocean fractions set for each zone. Seasonally varying values of cloud amounts, relative humidity profiles, ozone, and sea ice are all prescribed in the model. Equator to pole ocean heat transport is simulated in the model by turbulent diffusion. The change in global mean annual surface air temperature due to a doubling of atmospheric CO2 in the 2-D model is 1.61 K, which is close to that simulated by the one-dimensional (1-D) radiative convective model (RCM) which is at the heart of the 2-D model radiation code (1.67 K for the moist adiabatic lapse rate assumption in 1-D RCM). We find that the change in temperature structure of the model atmosphere has many of the characteristics common to General Circulation Models, including amplified warming at the poles and the upper tropical troposphere, and stratospheric cooling. Because of the potential importance of atmospheric circulation feedbacks on climate change, we have also investigated the response of the zonal wind
NASA Astrophysics Data System (ADS)
Zhang, Honghai; Walker, Nicholas; Mitchell, Steven C.; Thomas, Matthew; Wahle, Andreas; Scholz, Thomas; Sonka, Milan
2006-03-01
Conventional analysis of cardiac ventricular magnetic resonance images is performed using short axis images and does not guarantee completeness and consistency of the ventricle coverage. In this paper, a four-dimensional (4D, 3D+time) left and right ventricle statistical shape model was generated from the combination of the long axis and short axis images. Iterative mutual intensity registration and interpolation were used to merge the long axis and short axis images into isotropic 4D images and simultaneously correct existing breathing artifact. Distance-based shape interpolation and approximation were used to generate complete ventricle shapes from the long axis and short axis manual segmentations. Landmarks were automatically generated and propagated to 4D data samples using rigid alignment, distance-based merging, and B-spline transform. Principal component analysis (PCA) was used in model creation and analysis. The two strongest modes of the shape model captured the most important shape feature of Tetralogy of Fallot (TOF) patients, right ventricle enlargement. Classification of cardiac images into classes of normal and TOF subjects performed on 3D and 4D models showed 100% classification correctness rates for both normal and TOF subjects using k-Nearest Neighbor (k=1 or 3) classifier and the two strongest shape modes.
NASA Astrophysics Data System (ADS)
Bartolucci, Daniele; De Marchis, Francesca
2015-08-01
We are motivated by the study of the Microcanonical Variational Principle within Onsager's description of two-dimensional turbulence in the range of energies where the equivalence of statistical ensembles fails. We obtain sufficient conditions for the existence and multiplicity of solutions for the corresponding Mean Field Equation on convex and "thin" enough domains in the supercritical (with respect to the Moser-Trudinger inequality) regime. This is a brand new achievement since existence results in the supercritical region were previously known only on multiply connected domains. We then study the structure of these solutions by the analysis of their linearized problems and we also obtain a new uniqueness result for solutions of the Mean Field Equation on thin domains whose energy is uniformly bounded from above. Finally we evaluate the asymptotic expansion of those solutions with respect to the thinning parameter and, combining it with all the results obtained so far, we solve the Microcanonical Variational Principle in a small range of supercritical energies where the entropy is shown to be concave.
Porch, Clay E; Lauretta, Matthew V
2016-01-01
Forecasts of the future abundance of western Atlantic bluefin tuna (Thunnus thynnus) have, for nearly two decades, been based on two competing views of future recruitment potential: (1) a "low" recruitment scenario based on hockey-stick (two-line) curve where the expected level of recruitment is set equal to the geometric mean of the recruitment estimates for the years after a supposed regime-shift in 1975, and (2) a "high" recruitment scenario based on a Beverton-Holt curve fit to the time series of spawner-recruit pairs beginning in 1970. Several investigators inferred the relative plausibility of these two scenarios based on measures of their ability to fit estimates of spawning biomass and recruitment derived from stock assessment outputs. Typically, these comparisons have assumed the assessment estimates of spawning biomass are known without error. It is shown here that ignoring error in the spawning biomass estimates can predispose model-choice approaches to favor the regime-shift hypothesis over the Beverton-Holt curve with higher recruitment potential. When the variance of the observation error approaches that which is typically estimated for assessment outputs, the same model-choice approaches tend to favor the single Beverton-Holt curve. For this and other reasons, it is argued that standard model-choice approaches are insufficient to make the case for a regime shift in the recruitment dynamics of western Atlantic bluefin tuna. A more fruitful course of action may be to move away from the current high/low recruitment dichotomy and focus instead on adopting biological reference points and management procedures that are robust to these and other sources of uncertainty. PMID:27272215
Porch, Clay E.; Lauretta, Matthew V.
2016-01-01
Forecasts of the future abundance of western Atlantic bluefin tuna (Thunnus thynnus) have, for nearly two decades, been based on two competing views of future recruitment potential: (1) a “low” recruitment scenario based on hockey-stick (two-line) curve where the expected level of recruitment is set equal to the geometric mean of the recruitment estimates for the years after a supposed regime-shift in 1975, and (2) a “high” recruitment scenario based on a Beverton-Holt curve fit to the time series of spawner-recruit pairs beginning in 1970. Several investigators inferred the relative plausibility of these two scenarios based on measures of their ability to fit estimates of spawning biomass and recruitment derived from stock assessment outputs. Typically, these comparisons have assumed the assessment estimates of spawning biomass are known without error. It is shown here that ignoring error in the spawning biomass estimates can predispose model-choice approaches to favor the regime-shift hypothesis over the Beverton-Holt curve with higher recruitment potential. When the variance of the observation error approaches that which is typically estimated for assessment outputs, the same model-choice approaches tend to favor the single Beverton-Holt curve. For this and other reasons, it is argued that standard model-choice approaches are insufficient to make the case for a regime shift in the recruitment dynamics of western Atlantic bluefin tuna. A more fruitful course of action may be to move away from the current high/low recruitment dichotomy and focus instead on adopting biological reference points and management procedures that are robust to these and other sources of uncertainty. PMID:27272215
Shirodkar, P V; Xiao, Y K; Sarkar, A; Dalal, S G; Chivas, A R
2006-02-01
The behaviors of chlorine isotopes in relation to air-sea flux variables have been investigated through multivariate statistical analyses (MSA). The MSA technique provides an approach to reduce the data set and was applied to a set of 7 air-sea flux variables to supplement and describe the variation in chlorine isotopic compositions (delta37Cl) of ocean water. The variation in delta37Cl values of surface ocean water from 51 stations in 4 major world oceans--the Pacific, Atlantic, Indian and the Southern Ocean has been observed from -0.76 to +0.74 per thousand (av. 0.039+/-0.04 per thousand). The observed delta37Cl values show basic homogeneity and indicate that the air-sea fluxes act differently in different oceanic regions and help to maintain the balance between delta37Cl values of the world oceans. The study showed that it is possible to model the behavior of chlorine isotopes to the extent of 38-73% for different geographical regions. The models offered here are purely statistical in nature; however, the relationships uncovered by these models extend our understanding of the constancy in delta37Cl of ocean water in relation to air-sea flux variables. PMID:16214214
van IJsseldijk, E. A.; Valstar, E. R.; Stoel, B. C.; Nelissen, R. G. H. H.; Baka, N.; van’t Klooster, R.
2016-01-01
t Klooster, B. L. Kaptein. Three dimensional measurement of minimum joint space width in the knee from stereo radiographs using statistical shape models. Bone Joint Res 2016;320–327. DOI: 10.1302/2046-3758.58.2000626. PMID:27491660
NASA Technical Reports Server (NTRS)
Boardman, J. W.; Pieters, C. M.; Green, R. O.; Clark, R. N.; Sunshine, J.; Combe, J.-P.; Isaacson, P.; Lundeen, S. R.; Malaret, E.; McCord, T.; Nettles, J.; Petro, N. E.; Varanasi, P.; Taylor, L.
2010-01-01
The Moon Mineralogy Mapper (M3), a NASA Discovery Mission of Opportunity, was launched October 22, 2008 from Shriharikota in India on board the Indian ISRO Chandrayaan- 1 spacecraft for a nominal two-year mission in a 100-km polar lunar orbit. M3 is a high-fidelity imaging spectrometer with 260 spectral bands in Target Mode and 85 spectral bands in a reduced-resolution Global Mode. Target Mode pixel sizes are nominally 70 meters and Global pixels (binned 2 by 2) are 140 meters, from the planned 100-km orbit. The mission was cut short, just before halfway, in August, 2009 when the spacecraft ceased operations. Despite the abbreviated mission and numerous technical and scientific challenges during the flight, M3 was able to cover more than 95% of the Moon in Global Mode. These data, presented and analyzed here as a global whole, are revolutionizing our understanding of the Moon. Already, numerous discoveries relating to volatiles and unexpected mineralogy have been published [1], [2], [3]. The rich spectral and spatial information content of the M3 data indicates that many more discoveries and an improved understanding of the mineralogy, geology, photometry, thermal regime and volatile status of our nearest neighbor are forthcoming from these data. Sadly, only minimal high-resolution Target Mode images were acquired, as these were to be the focus of the second half of the mission. This abstract gives the reader a global overview of all the M3 data that were collected and an introduction to their rich spectral character and complexity. We employ a Principal Components statistical method to assess the underlying dimensionality of the Moon as a whole, as seen by M3, and to identify numerous areas that are low-probability targets and thus of potential interest to selenologists.
Bayesian Inference: with ecological applications
Link, William A.; Barker, Richard J.
2010-01-01
This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies as well as students in advanced undergraduate statistics.. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.
Chang Liyun; Ho, S.-Y.; Chui, C.-S.; Lee, J.-H.; Du Yichun; Chen Tainsong
2008-06-15
We propose a new method based on statistical analysis technique to determine the minimum setup distance of a well chamber used in the calibration of {sup 192}Ir high dose rate (HDR). The chamber should be placed at least this distance away from any wall or from the floor in order to mitigate the effect of scatter. Three different chambers were included in this study, namely, Sun Nuclear Corporation, Nucletron, and Standard Imaging. The results from this study indicated that the minimum setup distance varies depending on the particular chamber and the room architecture in which the chamber was used. Our result differs from that of a previous study by Podgorsak et al. [Med. Phys. 19, 1311-1314 (1992)], in which 25 cm was suggested, and also differs from that of the International Atomic Energy Agency (IAEA)-TECDOC-1079 report, which suggested 30 cm. The new method proposed in this study may be considered as an alternative approach to determine the minimum setup distance of a well-type chamber used in the calibration of {sup 192}Ir HDR.
Osnes, J.D. ); Winberg, A.; Andersson, J.E.; Larsson, N.A. )
1991-09-27
Statistical and probabilistic methods for estimating the probability that a fracture is nonconductive (or equivalently, the conductive-fracture frequency) and the distribution of the transmissivities of conductive fractures from transmissivity measurements made in single-hole injection (well) tests were developed. These methods were applied to a database consisting of over 1,000 measurements made in nearly 25 km of borehole at five sites in Sweden. The depths of the measurements ranged from near the surface to over 600-m deep, and packer spacings of 20- and 25-m were used. A probabilistic model that describes the distribution of a series of transmissivity measurements was derived. When the parameters of this model were estimated using maximum likelihood estimators, the resulting estimated distributions generally fit the cumulative histograms of the transmissivity measurements very well. Further, estimates of the mean transmissivity of conductive fractures based on the maximum likelihood estimates of the model's parameters were reasonable, both in magnitude and in trend, with respect to depth. The estimates of the conductive fracture probability were generated in the range of 0.5--5.0 percent, with the higher values at shallow depths and with increasingly smaller values as depth increased. An estimation procedure based on the probabilistic model and the maximum likelihood estimators of its parameters was recommended. Some guidelines regarding the design of injection test programs were drawn from the recommended estimation procedure and the parameter estimates based on the Swedish data. 24 refs., 12 figs., 14 tabs.
NASA Technical Reports Server (NTRS)
da Silva, Arlindo M.; Norris, Peter M.
2013-01-01
Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Ramman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.
NASA Astrophysics Data System (ADS)
Johnson, Ria; Ponman, Trevor J.; Finoguenov, Alexis
2009-05-01
We have performed a statistical analysis of a sample of 28 nearby galaxy groups derived primarily from the Two-Dimensional XMM-Newton Group Survey, in order to ascertain what factors drive the observed differences in group properties. We specifically focus on entropy and the role of feedback, and divide the sample into cool core (CC) and non-cool core (NCC) systems. This is the first time the latter have been studied in detail in the group regime. We find the coolest groups to have steeper entropy profiles than the warmest systems, and find NCC groups to have higher central entropy and to exhibit more scatter than their CC counterparts. We investigate the entropy distribution of the gas in each system, and compare this to the expected theoretical distribution under the condition that non-gravitational processes are ignored. In all cases, the observed maximum entropy far exceeds that expected theoretically, and simple models for modifications of the theoretical entropy distribution perform poorly. A model which applies initial pre-heating through an entropy shift to match the high entropy behaviour of the observed profile, followed by radiative cooling, generally fails to match the low entropy behaviour, and only performs well when the difference between the maximum entropy of the observed and theoretical distributions is small. Successful feedback models need to work differentially to increase the entropy range in the gas, and we suggest two basic possibilities. We analyse the effects of feedback on the entropy distribution, finding systems with a high measure of `feedback impact' to typically reach higher entropy than their low feedback counterparts. The abundance profiles of high and low feedback systems are comparable over the majority of the radial range, but the high feedback systems show significantly lower central metallicities compared to the low feedback systems. If low entropy, metal-rich gas has been boosted to large entropy in the high feedback systems
Using Alien Coins to Test Whether Simple Inference Is Bayesian
ERIC Educational Resources Information Center
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
NASA Astrophysics Data System (ADS)
von Toussaint, Udo
2011-07-01
Bayesian inference provides a consistent method for the extraction of information from physics experiments even in ill-conditioned circumstances. The approach provides a unified rationale for data analysis, which both justifies many of the commonly used analysis procedures and reveals some of the implicit underlying assumptions. This review summarizes the general ideas of the Bayesian probability theory with emphasis on the application to the evaluation of experimental data. As case studies for Bayesian parameter estimation techniques examples ranging from extra-solar planet detection to the deconvolution of the apparatus functions for improving the energy resolution and change point estimation in time series are discussed. Special attention is paid to the numerical techniques suited for Bayesian analysis, with a focus on recent developments of Markov chain Monte Carlo algorithms for high-dimensional integration problems. Bayesian model comparison, the quantitative ranking of models for the explanation of a given data set, is illustrated with examples collected from cosmology, mass spectroscopy, and surface physics, covering problems such as background subtraction and automated outlier detection. Additionally the Bayesian inference techniques for the design and optimization of future experiments are introduced. Experiments, instead of being merely passive recording devices, can now be designed to adapt to measured data and to change the measurement strategy on the fly to maximize the information of an experiment. The applied key concepts and necessary numerical tools which provide the means of designing such inference chains and the crucial aspects of data fusion are summarized and some of the expected implications are highlighted.
Nonparametric inference of network structure and dynamics
NASA Astrophysics Data System (ADS)
Peixoto, Tiago P.
The network structure of complex systems determine their function and serve as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among