Generalized statistical complexity measures: Geometrical and analytical properties
NASA Astrophysics Data System (ADS)
Martin, M. T.; Plastino, A.; Rosso, O. A.
2006-09-01
We discuss bounds on the values adopted by the generalized statistical complexity measures [M.T. Martin et al., Phys. Lett. A 311 (2003) 126; P.W. Lamberti et al., Physica A 334 (2004) 119] introduced by López Ruiz et al. [Phys. Lett. A 209 (1995) 321] and Shiner et al. [Phys. Rev. E 59 (1999) 1459]. Several new theorems are proved and illustrated with reference to the celebrated logistic map.
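As a concrete illustration of the quantities discussed above, the sketch below estimates a normalized Shannon entropy H and an LMC-type statistical complexity C = H·D (with D a disequilibrium relative to the uniform distribution) from histogrammed orbits of the logistic map. The binning, parameter values and normalization are illustrative choices, not the ones used in the cited papers.

```python
import numpy as np

def logistic_orbit(r, x0=0.1, n=100_000, burn=1_000):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the orbit after a burn-in."""
    x = x0
    for _ in range(burn):
        x = r * x * (1 - x)
    orbit = np.empty(n)
    for t in range(n):
        x = r * x * (1 - x)
        orbit[t] = x
    return orbit

def lmc_complexity(samples, bins=64):
    """Normalized Shannon entropy H and an LMC-type complexity C = H * D (illustrative)."""
    p, _ = np.histogram(samples, bins=bins, range=(0.0, 1.0))
    p = p / p.sum()
    nz = p > 0
    H = -np.sum(p[nz] * np.log(p[nz])) / np.log(bins)   # normalized entropy
    D = np.sum((p - 1.0 / bins) ** 2)                    # disequilibrium vs uniform
    return H, H * D

for r in (3.5, 3.59, 4.0):   # periodic, chaotic (banded), fully chaotic
    H, C = lmc_complexity(logistic_orbit(r))
    print(f"r = {r:4.2f}   H = {H:.3f}   C = {C:.4f}")
```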
Hanel, Rudolf; Thurner, Stefan; Gell-Mann, Murray
2014-05-13
The maximum entropy principle (MEP) is a method for obtaining the most likely distribution functions of observables from statistical systems by maximizing entropy under constraints. The MEP has found hundreds of applications in ergodic and Markovian systems in statistical mechanics, information theory, and statistics. For several decades there has been an ongoing controversy over whether the notion of the maximum entropy principle can be extended in a meaningful way to nonextensive, nonergodic, and complex statistical systems and processes. In this paper we start by reviewing how Boltzmann-Gibbs-Shannon entropy is related to multiplicities of independent random processes. We then show how the relaxation of independence naturally leads to the most general entropies that are compatible with the first three Shannon-Khinchin axioms, the (c,d)-entropies. We demonstrate that the MEP is a perfectly consistent concept for nonergodic and complex statistical systems if their relative entropy can be factored into a generalized multiplicity and a constraint term. The problem of finding such a factorization reduces to finding an appropriate representation of relative entropy in a linear basis. In a particular example we show that path-dependent random processes with memory naturally require specific generalized entropies. The example is to our knowledge the first exact derivation of a generalized entropy from the microscopic properties of a path-dependent random process.
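For readers unfamiliar with the MEP in its classical (ergodic, independent-trial) form, the following hedged sketch maximizes Shannon entropy over a finite state space subject to a mean-value constraint; the maximizer is exponential in the constrained observable, with the Lagrange multiplier found numerically. The state space and target mean are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.optimize import brentq

# States and the constraint <x> = target_mean; maximizing Shannon entropy under
# this constraint yields p_i proportional to exp(-lam * x_i), with lam fixed by the constraint.
x = np.arange(0, 21, dtype=float)
target_mean = 4.0

def mean_given_lam(lam):
    w = np.exp(-lam * x)
    p = w / w.sum()
    return p @ x - target_mean

lam = brentq(mean_given_lam, -10.0, 10.0)   # root of the constraint equation
p = np.exp(-lam * x); p /= p.sum()
entropy = -np.sum(p * np.log(p))
print(f"lambda = {lam:.4f}, achieved mean = {p @ x:.4f}, entropy = {entropy:.4f}")
```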
Characterization of time series via Rényi complexity-entropy curves
NASA Astrophysics Data System (ADS)
Jauregui, M.; Zunino, L.; Lenzi, E. K.; Mendes, R. S.; Ribeiro, H. V.
2018-05-01
One of the most useful tools for distinguishing between chaotic and stochastic time series is the so-called complexity-entropy causality plane. This diagram involves two complexity measures: the Shannon entropy and the statistical complexity. Recently, this idea has been generalized by considering the Tsallis monoparametric generalization of the Shannon entropy, yielding complexity-entropy curves. These curves have proven to enhance the discrimination among different time series related to stochastic and chaotic processes of numerical and experimental nature. Here we further explore these complexity-entropy curves in the context of the Rényi entropy, which is another monoparametric generalization of the Shannon entropy. By combining the Rényi entropy with the proper generalization of the statistical complexity, we associate a parametric curve (the Rényi complexity-entropy curve) with a given time series. We explore this approach in a series of numerical and experimental applications, demonstrating the usefulness of this new technique for time series analysis. We show that the Rényi complexity-entropy curves enable the differentiation among time series of chaotic, stochastic, and periodic nature. In particular, time series of stochastic nature are associated with curves displaying positive curvature in a neighborhood of their initial points, whereas curves related to chaotic phenomena have a negative curvature; finally, periodic time series are represented by vertical straight lines.
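A minimal sketch of the ingredients described above follows: Bandt-Pompe ordinal-pattern probabilities are estimated from a time series, and a parametric (entropy, complexity) curve is traced as the Rényi index α varies. The "complexity" here uses a simplified Jensen-type disequilibrium as a stand-in for the properly normalized generalized statistical complexity of the paper, so it illustrates the construction rather than reproducing it.

```python
import itertools
import numpy as np

def ordinal_probs(ts, m=4):
    """Bandt-Pompe probabilities of ordinal patterns of embedding dimension m."""
    patterns = {p: 0 for p in itertools.permutations(range(m))}
    for i in range(len(ts) - m + 1):
        patterns[tuple(np.argsort(ts[i:i + m]))] += 1
    p = np.array(list(patterns.values()), dtype=float)
    return p / p.sum()

def renyi_entropy(p, alpha):
    p = p[p > 0]
    if np.isclose(alpha, 1.0):
        return -np.sum(p * np.log(p))
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def complexity_entropy_curve(ts, m=4, alphas=np.linspace(0.2, 3.0, 15)):
    """Parametric (H_alpha, C_alpha) curve; C uses a simplified disequilibrium."""
    p = ordinal_probs(ts, m)
    n = len(p)
    u = np.full(n, 1.0 / n)
    curve = []
    for a in alphas:
        H = renyi_entropy(p, a) / np.log(n)          # normalized Rényi entropy
        D = renyi_entropy(0.5 * (p + u), a) - 0.5 * (renyi_entropy(p, a) + renyi_entropy(u, a))
        curve.append((H, H * D))                     # illustrative complexity
    return np.array(curve)

rng = np.random.default_rng(0)
noise = rng.normal(size=20_000)
x, chaos = 0.4, []
for _ in range(20_000):
    x = 4.0 * x * (1 - x)
    chaos.append(x)
print("noise curve head:\n", complexity_entropy_curve(noise)[:3])
print("chaos curve head:\n", complexity_entropy_curve(np.array(chaos))[:3])
```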
A Not-So-Fundamental Limitation on Studying Complex Systems with Statistics: Comment on Rabin (2011)
NASA Astrophysics Data System (ADS)
Thomas, Drew M.
2012-12-01
Although living organisms are affected by many interrelated and unidentified variables, this complexity does not automatically impose a fundamental limitation on statistical inference. Nor need one invoke such complexity as an explanation of the "Truth Wears Off" or "decline" effect; similar "decline" effects occur with far simpler systems studied in physics. Selective reporting and publication bias, and scientists' biases in favor of reporting eye-catching results (in general) or conforming to others' results (in physics) better explain this feature of the "Truth Wears Off" effect than Rabin's suggested limitation on statistical inference.
Unifying Complexity and Information
NASA Astrophysics Data System (ADS)
Ke, Da-Guan
2013-04-01
Complex systems, arising in many contexts in the computer, life, social, and physical sciences, have not shared a generally-accepted complexity measure playing a fundamental role as the Shannon entropy H in statistical mechanics. Superficially-conflicting criteria of complexity measurement, i.e. complexity-randomness (C-R) relations, have given rise to a special measure intrinsically adaptable to more than one criterion. However, the deep causes of the conflict and of this adaptability are not yet clear. Here I trace the root of each representative or adaptable measure to its particular universal data-generating or -regenerating model (UDGM or UDRM). A representative measure for deterministic dynamical systems is found as a counterpart of the H for random processes, clearly redefining the boundary of different criteria. And a specific UDRM achieving the intrinsic adaptability enables a general information measure that ultimately solves all major disputes. This work encourages a single framework covering deterministic systems, statistical mechanics and real-world living organisms.
A survey of statistics in three UK general practice journals
Rigby, Alan S; Armstrong, Gillian K; Campbell, Michael J; Summerton, Nick
2004-01-01
Background Many medical specialities have reviewed the statistical content of their journals. To our knowledge this has not been done in general practice. Given the main role of a general practitioner as a diagnostician we thought it would be of interest to see whether the statistical methods reported reflect the diagnostic process. Methods Hand search of three UK journals of general practice namely the British Medical Journal (general practice section), British Journal of General Practice and Family Practice over a one-year period (1 January to 31 December 2000). Results A wide variety of statistical techniques were used. The most common methods included t-tests and Chi-squared tests. There were few articles reporting likelihood ratios and other useful diagnostic methods. There was evidence that the journals with the more thorough statistical review process reported a more complex and wider variety of statistical techniques. Conclusions The BMJ had a wider range and greater diversity of statistical methods than the other two journals. However, in all three journals there was a dearth of papers reflecting the diagnostic process. Across all three journals there were relatively few papers describing randomised controlled trials thus recognising the difficulty of implementing this design in general practice. PMID:15596014
Complex Quantum Network Manifolds in Dimension d > 2 are Scale-Free
NASA Astrophysics Data System (ADS)
Bianconi, Ginestra; Rahmede, Christoph
2015-09-01
In quantum gravity, several approaches have been proposed until now for the quantum description of discrete geometries. These theoretical frameworks include loop quantum gravity, causal dynamical triangulations, causal sets, quantum graphity, and energetic spin networks. Most of these approaches describe discrete spaces as homogeneous network manifolds. Here we define Complex Quantum Network Manifolds (CQNM) describing the evolution of quantum network states, and constructed from growing simplicial complexes of dimension d. We show that in d = 2 CQNM are homogeneous networks while for d > 2 they are scale-free i.e. they are characterized by large inhomogeneities of degrees like most complex networks. From the self-organized evolution of CQNM quantum statistics emerge spontaneously. Here we define the generalized degrees associated with the δ-faces of the d-dimensional CQNMs, and we show that the statistics of these generalized degrees can either follow Fermi-Dirac, Boltzmann or Bose-Einstein distributions depending on the dimension of the δ-faces.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests
Kosinski, Andrzej S.
2013-01-01
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343
Rivas, Elena; Lang, Raymond; Eddy, Sean R
2012-02-01
The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.
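TORNADO itself parses general RNA grammars; as a far simpler, hedged illustration of single-sequence structure prediction by dynamic programming, the sketch below implements the classic Nussinov base-pair maximization algorithm. It is not an SCFG, a nearest-neighbor model, or any part of TORNADO, but it shows the recursive decomposition over subsequences on which such methods are built.

```python
def nussinov(seq, min_loop=3):
    """Maximize the number of nested Watson-Crick/GU pairs (Nussinov DP)."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    M = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # enforce a minimal hairpin loop
        for i in range(n - span):
            j = i + span
            best = max(M[i + 1][j], M[i][j - 1])
            if (seq[i], seq[j]) in pairs:
                best = max(best, M[i + 1][j - 1] + 1)
            for k in range(i + 1, j):            # bifurcation into two subproblems
                best = max(best, M[i][k] + M[k + 1][j])
            M[i][j] = best
    return M[0][n - 1]

print(nussinov("GGGAAAUCC"))   # small hairpin-like example: 3 nested pairs
```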
Rivas, Elena; Lang, Raymond; Eddy, Sean R.
2012-01-01
The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases. PMID:22194308
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting.
An Overview of Particle Sampling Bias
NASA Technical Reports Server (NTRS)
Meyers, James F.; Edwards, Robert V.
1984-01-01
The complex relation between particle arrival statistics and the interarrival statistics is explored. It is known that the mean interarrival time given an initial velocity is generally not the inverse of the mean rate corresponding to that velocity. Necessary conditions for the measurement of the conditional rate are given.
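The following sketch (with invented flow and seeding parameters) illustrates the effect described above: particle arrivals are generated as an inhomogeneous Poisson process whose rate is proportional to an oscillating velocity, and the conditional mean interarrival time for a velocity bin is compared with the inverse of the mean rate in that bin. The two generally differ because the rate keeps evolving during the interarrival interval.

```python
import numpy as np

rng = np.random.default_rng(1)

# Velocity signal on a fine time grid (arbitrary illustrative flow: mean + oscillation).
dt = 1e-4
t = np.arange(0.0, 200.0, dt)
u = 10.0 + 4.0 * np.sin(0.5 * t)                 # velocity, always positive
rate = 5.0 * u                                   # particle arrival rate proportional to velocity

# Inhomogeneous Poisson arrivals by thinning.
lam_max = rate.max()
cand = np.cumsum(rng.exponential(1.0 / lam_max, size=int(1.5 * lam_max * t[-1])))
cand = cand[cand < t[-1]]
keep = rng.random(cand.size) < np.interp(cand, t, rate) / lam_max
arrivals = cand[keep]

u_at_arrival = np.interp(arrivals, t, u)[:-1]
interarrival = np.diff(arrivals)

# Compare E[interarrival | u in bin] with 1 / mean rate in that bin.
for lo, hi in [(6.0, 8.0), (9.0, 11.0), (12.0, 14.0)]:
    sel = (u_at_arrival >= lo) & (u_at_arrival < hi)
    mean_dt = interarrival[sel].mean()
    inv_rate = 1.0 / (5.0 * u_at_arrival[sel]).mean()
    print(f"u in [{lo:4.1f},{hi:4.1f}):  mean interarrival = {mean_dt:.5f}, "
          f"1/mean rate = {inv_rate:.5f}")
```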
Complex Quantum Network Manifolds in Dimension d > 2 are Scale-Free
Bianconi, Ginestra; Rahmede, Christoph
2015-01-01
In quantum gravity, several approaches have been proposed until now for the quantum description of discrete geometries. These theoretical frameworks include loop quantum gravity, causal dynamical triangulations, causal sets, quantum graphity, and energetic spin networks. Most of these approaches describe discrete spaces as homogeneous network manifolds. Here we define Complex Quantum Network Manifolds (CQNM) describing the evolution of quantum network states, and constructed from growing simplicial complexes of dimension d. We show that in d = 2 CQNM are homogeneous networks while for d > 2 they are scale-free i.e. they are characterized by large inhomogeneities of degrees like most complex networks. From the self-organized evolution of CQNM quantum statistics emerge spontaneously. Here we define the generalized degrees associated with the δ-faces of the d-dimensional CQNMs, and we show that the statistics of these generalized degrees can either follow Fermi-Dirac, Boltzmann or Bose-Einstein distributions depending on the dimension of the δ-faces. PMID:26356079
Complex Quantum Network Manifolds in Dimension d > 2 are Scale-Free.
Bianconi, Ginestra; Rahmede, Christoph
2015-09-10
In quantum gravity, several approaches have been proposed until now for the quantum description of discrete geometries. These theoretical frameworks include loop quantum gravity, causal dynamical triangulations, causal sets, quantum graphity, and energetic spin networks. Most of these approaches describe discrete spaces as homogeneous network manifolds. Here we define Complex Quantum Network Manifolds (CQNM) describing the evolution of quantum network states, and constructed from growing simplicial complexes of dimension d. We show that in d = 2 CQNM are homogeneous networks while for d > 2 they are scale-free i.e. they are characterized by large inhomogeneities of degrees like most complex networks. From the self-organized evolution of CQNM quantum statistics emerge spontaneously. Here we define the generalized degrees associated with the δ-faces of the d-dimensional CQNMs, and we show that the statistics of these generalized degrees can either follow Fermi-Dirac, Boltzmann or Bose-Einstein distributions depending on the dimension of the δ-faces.
Central Limit Theorem for Exponentially Quasi-local Statistics of Spin Models on Cayley Graphs
NASA Astrophysics Data System (ADS)
Reddy, Tulasi Ram; Vadlamani, Sreekar; Yogeshwaran, D.
2018-04-01
Central limit theorems for linear statistics of lattice random fields (including spin models) are usually proven under suitable mixing conditions or quasi-associativity. Many interesting examples of spin models do not satisfy mixing conditions, and on the other hand, it does not seem easy to show a central limit theorem for local statistics via quasi-associativity. In this work, we prove general central limit theorems for local statistics and exponentially quasi-local statistics of spin models on discrete Cayley graphs with polynomial growth. Further, we supplement these results by proving similar central limit theorems for random fields on discrete Cayley graphs taking values in a countable space, but under the stronger assumptions of α-mixing (for local statistics) and exponential α-mixing (for exponentially quasi-local statistics). All our central limit theorems assume a suitable variance lower bound like many others in the literature. We illustrate our general central limit theorem with specific examples of lattice spin models and statistics arising in computational topology, statistical physics and random networks. Examples of clustering spin models include quasi-associated spin models with fast decaying covariances like the off-critical Ising model, level sets of Gaussian random fields with fast decaying covariances like the massive Gaussian free field and determinantal point processes with fast decaying kernels. Examples of local statistics include intrinsic volumes, face counts, component counts of random cubical complexes while exponentially quasi-local statistics include nearest neighbour distances in spin models and Betti numbers of sub-critical random cubical complexes.
Capturing rogue waves by multi-point statistics
NASA Astrophysics Data System (ADS)
Hadjihosseini, A.; Wächter, Matthias; Hoffmann, N. P.; Peinke, J.
2016-01-01
As an example of a complex system with extreme events, we investigate ocean wave states exhibiting rogue waves. We present a statistical method of data analysis based on multi-point statistics which for the first time allows the grasping of extreme rogue wave events in a highly satisfactory statistical manner. The key to the success of the approach is mapping the complexity of multi-point data onto the statistics of hierarchically ordered height increments for different time scales, for which we can show that a stochastic cascade process with Markov properties is governed by a Fokker-Planck equation. Conditional probabilities as well as the Fokker-Planck equation itself can be estimated directly from the available observational data. With this stochastic description surrogate data sets can in turn be generated, which makes it possible to work out arbitrary statistical features of the complex sea state in general, and extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics.
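As a hedged, highly simplified illustration of the basic ingredient of the multi-point approach, the sketch below builds height increments at two nested time scales from a synthetic correlated signal (a stand-in for measured surface-elevation data) and estimates an empirical conditional distribution of the small-scale increment given the large-scale one. It does not reproduce the Markov tests or the Fokker-Planck estimation of the paper; all scales and parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "sea surface elevation": correlated Gaussian noise (stand-in for wave data).
n = 200_000
white = rng.normal(size=n)
kernel = np.exp(-np.arange(0, 200) / 30.0)
eta = np.convolve(white, kernel, mode="same")

def increments(x, tau):
    return x[tau:] - x[:-tau]

tau_small, tau_large = 10, 40
h_small = increments(eta, tau_small)[: n - tau_large]
h_large = increments(eta, tau_large)[: n - tau_large]

# Empirical conditional distribution p(h_small | h_large in a chosen bin).
edges = np.linspace(-4 * eta.std(), 4 * eta.std(), 41)
for lo, hi in [(-2.0, -1.0), (1.0, 2.0)]:
    sel = (h_large >= lo * eta.std()) & (h_large < hi * eta.std())
    hist, _ = np.histogram(h_small[sel], bins=edges, density=True)
    peak = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
    print(f"h_large in [{lo},{hi}) std: conditional mode of h_small = {peak:+.3f}")
```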
Variability in Rheumatology day care hospitals in Spain: VALORA study.
Hernández Miguel, María Victoria; Martín Martínez, María Auxiliadora; Corominas, Héctor; Sanchez-Piedra, Carlos; Sanmartí, Raimon; Fernandez Martinez, Carmen; García-Vicuña, Rosario
To describe the variability of the day care hospital units (DCHUs) of Rheumatology in Spain, in terms of structural resources and operating processes. Multicenter descriptive study with data from a self-completed DCHU self-assessment questionnaire based on the DCHU quality standards of the Spanish Society of Rheumatology. Structural resources and operating processes were analyzed and stratified by hospital complexity (regional, general, major and complex). Variability was determined using the coefficient of variation (CV) of the clinically relevant variable that presented statistically significant differences when compared across centers. A total of 89 hospitals (16 autonomous regions and Melilla) were included in the analysis. 11.2% of hospitals are regional, 22.5% general, 27% major and 39.3% complex. A total of 92% of DCHUs were polyvalent. The number of treatments applied, the coordination between DCHUs and hospital pharmacy, and the postgraduate training process were the variables that showed statistically significant differences depending on hospital complexity. The highest rate of rheumatologic treatments was found in complex hospitals (2.97 per 1,000 population), and the lowest in general hospitals (2.01 per 1,000 population). The CV was 0.88 in major hospitals, 0.86 in regional, 0.76 in general, and 0.72 in complex hospitals. There was variability in the number of treatments delivered in DCHUs, being greater in major hospitals and then in regional centers. Nonetheless, the variability in terms of structure and function does not seem due to differences in center complexity.
Online Statistical Modeling (Regression Analysis) for Independent Responses
NASA Astrophysics Data System (ADS)
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of complex relationships in data. A rich variety of advanced and recent statistical modelling methods is available, mostly in open source software (one of them being R). However, these advanced statistical modelling methods are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling methods are readily available, accessible and applicable on the web. We have previously made an interface in the form of an e-tutorial for several modern and advanced statistical modelling methods in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and easier to compare across models in order to find the most appropriate model for the data.
Reconciling statistical and systems science approaches to public health.
Ip, Edward H; Rahmandad, Hazhir; Shoham, David A; Hammond, Ross; Huang, Terry T-K; Wang, Youfa; Mabry, Patricia L
2013-10-01
Although systems science has emerged as a set of innovative approaches to study complex phenomena, many topically focused researchers including clinicians and scientists working in public health are somewhat befuddled by this methodology that at times appears to be radically different from analytic methods, such as statistical modeling, to which the researchers are accustomed. There also appear to be conflicts between complex systems approaches and traditional statistical methodologies, both in terms of their underlying strategies and the languages they use. We argue that the conflicts are resolvable, and the sooner the better for the field. In this article, we show how statistical and systems science approaches can be reconciled, and how together they can advance solutions to complex problems. We do this by comparing the methods within a theoretical framework based on the work of population biologist Richard Levins. We present different types of models as representing different tradeoffs among the four desiderata of generality, realism, fit, and precision.
Reconciling Statistical and Systems Science Approaches to Public Health
Ip, Edward H.; Rahmandad, Hazhir; Shoham, David A.; Hammond, Ross; Huang, Terry T.-K.; Wang, Youfa; Mabry, Patricia L.
2016-01-01
Although systems science has emerged as a set of innovative approaches to study complex phenomena, many topically focused researchers including clinicians and scientists working in public health are somewhat befuddled by this methodology that at times appears to be radically different from analytic methods, such as statistical modeling, to which the researchers are accustomed. There also appear to be conflicts between complex systems approaches and traditional statistical methodologies, both in terms of their underlying strategies and the languages they use. We argue that the conflicts are resolvable, and the sooner the better for the field. In this article, we show how statistical and systems science approaches can be reconciled, and how together they can advance solutions to complex problems. We do this by comparing the methods within a theoretical framework based on the work of population biologist Richard Levins. We present different types of models as representing different tradeoffs among the four desiderata of generality, realism, fit, and precision. PMID:24084395
Markov Logic Networks in the Analysis of Genetic Data
Sakhanenko, Nikita A.
2010-01-01
Complex, non-additive genetic interactions are common and can be critical in determining phenotypes. Genome-wide association studies (GWAS) and similar statistical studies of linkage data, however, assume additive models of gene interactions in looking for genotype-phenotype associations. These statistical methods view the compound effects of multiple genes on a phenotype as a sum of influences of each gene and often miss a substantial part of the heritable effect. Such methods do not use any biological knowledge about underlying mechanisms. Modeling approaches from the artificial intelligence (AI) field that incorporate deterministic knowledge into models to perform statistical analysis can be applied to include prior knowledge in genetic analysis. We chose to use the most general such approach, Markov Logic Networks (MLNs), for combining deterministic knowledge with statistical analysis. Using simple, logistic regression-type MLNs we can replicate the results of traditional statistical methods, but we also show that we are able to go beyond finding independent markers linked to a phenotype by using joint inference without an independence assumption. The method is applied to genetic data on yeast sporulation, a complex phenotype with gene interactions. In addition to detecting all of the previously identified loci associated with sporulation, our method identifies four loci with smaller effects. Since their effect on sporulation is small, these four loci were not detected with methods that do not account for dependence between markers due to gene interactions. We show how gene interactions can be detected using more complex models, which can be used as a general framework for incorporating systems biology with genetics. PMID:20958249
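MLN inference itself requires dedicated software; as a hedged baseline, the sketch below carries out the traditional single-marker analysis that the authors note a logistic-regression-type MLN can replicate: a per-marker logistic regression scan on simulated genotype and phenotype data. The data, effect sizes and marker indices are invented, not the yeast sporulation data, and no MLN machinery is involved.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Simulated data: 500 individuals, 50 biallelic markers coded 0/1/2;
# markers 3 and 17 truly influence a binary phenotype (purely illustrative).
n, m = 500, 50
G = rng.integers(0, 3, size=(n, m)).astype(float)
logit = -0.5 + 0.8 * G[:, 3] + 0.6 * G[:, 17]
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

def single_marker_logistic(g, y, iters=50):
    """Fit P(y=1) = sigmoid(b0 + b1*g) by Newton-Raphson; return the Wald p-value for b1."""
    X = np.column_stack([np.ones_like(g), g])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)
        H = X.T @ (X * W[:, None])
        beta += np.linalg.solve(H, X.T @ (y - p))
    cov = np.linalg.inv(H)
    z = beta[1] / np.sqrt(cov[1, 1])
    return 2 * stats.norm.sf(abs(z))

pvals = np.array([single_marker_logistic(G[:, j], y) for j in range(m)])
print("markers with p < 1e-3:", np.where(pvals < 1e-3)[0])
```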
Constructing and Modifying Sequence Statistics for relevent Using informR in R
Marcum, Christopher Steven; Butts, Carter T.
2015-01-01
The informR package greatly simplifies the analysis of complex event histories in R by providing user friendly tools to build sufficient statistics for the relevent package. Historically, building sufficient statistics to model event sequences (of the form a→b) using the egocentric generalization of Butts’ (2008) relational event framework for modeling social action has been cumbersome. The informR package simplifies the construction of the complex list of arrays needed by the rem() model-fitting function for a variety of cases involving egocentric event data, multiple event types, and/or support constraints. This paper introduces these tools using examples from real data extracted from the American Time Use Survey. PMID:26185488
A generalized association test based on U statistics.
Wei, Changshuai; Lu, Qing
2017-07-01
Second generation sequencing technologies are being increasingly used for genetic association studies, where the main research interest is to identify sets of genetic variants that contribute to various phenotypes. The phenotype can be univariate disease status, multivariate responses and even high-dimensional outcomes. Considering the genotype and phenotype as two complex objects, this also poses a general statistical problem of testing association between complex objects. We here proposed a similarity-based test, generalized similarity U (GSU), that can test the association between complex objects. We first studied the theoretical properties of the test in a general setting and then focused on the application of the test to sequencing association studies. Based on theoretical analysis, we proposed to use Laplacian Kernel-based similarity for GSU to boost power and enhance robustness. Through simulation, we found that GSU did have advantages over existing methods in terms of power and robustness. We further performed a whole genome sequencing (WGS) scan for Alzheimer's disease neuroimaging initiative data, identifying three genes, APOE, APOC1 and TOMM40, associated with imaging phenotype. We developed a C++ package for analysis of WGS data using GSU. The source codes can be downloaded at https://github.com/changshuaiwei/gsu. weichangshuai@gmail.com; qlu@epi.msu.edu. Supplementary data are available at Bioinformatics online.
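The exact GSU construction is given in the paper; as a loose, hedged stand-in, the sketch below computes a generic similarity-based association statistic, the normalized cross-product of centered Laplacian-kernel similarity matrices for genotypes and phenotype, with a permutation p-value. All data and kernel scales are simulated or invented, and this is not the GSU statistic or the authors' C++ package.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated genotypes (n x m, coded 0/1/2) and a phenotype driven by a few variants.
n, m = 200, 30
G = rng.integers(0, 3, size=(n, m)).astype(float)
y = 0.7 * G[:, 4] - 0.5 * G[:, 9] + rng.normal(size=n)

def laplacian_kernel(X, scale):
    d = np.abs(X[:, None, ...] - X[None, :, ...])
    if d.ndim == 3:
        d = d.sum(axis=2)
    return np.exp(-d / scale)

def center(K):
    J = np.eye(len(K)) - np.ones_like(K) / len(K)
    return J @ K @ J

def similarity_statistic(Kg_centered, Ky):
    return np.sum(Kg_centered * center(Ky)) / len(Ky) ** 2

Kgc = center(laplacian_kernel(G, scale=m))        # genotype similarity, centered
Ky = laplacian_kernel(y, scale=np.std(y))         # phenotype similarity
obs = similarity_statistic(Kgc, Ky)

perm = np.array([similarity_statistic(Kgc, Ky[np.ix_(p, p)])
                 for p in (rng.permutation(n) for _ in range(500))])
print(f"observed = {obs:.4f}, permutation p-value = {(perm >= obs).mean():.3f}")
```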
Interference in the classical probabilistic model and its representation in complex Hilbert space
NASA Astrophysics Data System (ADS)
Khrennikov, Andrei Yu.
2005-10-01
The notion of a context (complex of physical conditions, that is to say: specification of the measurement setup) is basic in this paper. We show that the main structures of quantum theory (interference of probabilities, Born's rule, complex probabilistic amplitudes, Hilbert state space, representation of observables by operators) are present already in a latent form in the classical Kolmogorov probability model. However, this model should be considered as a calculus of contextual probabilities. In our approach it is forbidden to consider abstract context independent probabilities: “first context and only then probability”. We construct the representation of the general contextual probabilistic dynamics in the complex Hilbert space. Thus dynamics of the wave function (in particular, Schrödinger's dynamics) can be considered as Hilbert space projections of a realistic dynamics in a “prespace”. The basic condition for representing the prespace dynamics is the law of statistical conservation of energy, i.e., conservation of probabilities. In general the Hilbert space projection of the “prespace” dynamics can be nonlinear and even irreversible (but it is always unitary). Methods developed in this paper can be applied not only to quantum mechanics, but also to classical statistical mechanics. The main quantum-like structures (e.g., interference of probabilities) might be found in some models of classical statistical mechanics. Quantum-like probabilistic behavior can be demonstrated by biological systems. In particular, it was recently found in some psychological experiments.
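A small numerical illustration of the contextual interference formula underlying this construction is sketched below (all numbers invented): given context probabilities of a dichotomous observable a, transition probabilities p(b|a), and a directly measured p(b) that departs from the classical formula of total probability, one extracts an interference coefficient λ and, when |λ| ≤ 1, a phase θ = arccos λ and a complex amplitude reproducing p(b) via Born's rule.

```python
import numpy as np

# Contextual data (invented numbers): probabilities of a dichotomous observable a
# under a context C, transition probabilities p(b=+|a), and the directly measured
# p(b=+) under C, which need not obey the classical formula of total probability.
p_a1, p_a2 = 0.5, 0.5
p_b_given_a1, p_b_given_a2 = 0.4, 0.7
p_b_measured = 0.70                       # deviates from the classical prediction

classical = p_a1 * p_b_given_a1 + p_a2 * p_b_given_a2
cross = 2.0 * np.sqrt(p_a1 * p_b_given_a1 * p_a2 * p_b_given_a2)
lam = (p_b_measured - classical) / cross  # interference coefficient
print(f"classical total probability = {classical:.3f}, lambda = {lam:.3f}")

if abs(lam) <= 1.0:
    theta = np.arccos(lam)                # trigonometric interference -> phase
    psi = np.sqrt(p_a1 * p_b_given_a1) + np.exp(1j * theta) * np.sqrt(p_a2 * p_b_given_a2)
    print(f"theta = {theta:.3f} rad, |psi|^2 = {abs(psi) ** 2:.3f} "
          f"(reproduces the measured p(b) = {p_b_measured})")
```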
Massive parallelization of serial inference algorithms for a complex generalized linear model
Suchard, Marc A.; Simpson, Shawn E.; Zorych, Ivan; Ryan, Patrick; Madigan, David
2014-01-01
Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety. PMID:25328363
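As a hedged illustration of the serial algorithm that the paper parallelizes, the sketch below implements cyclic coordinate descent for an L1-penalized linear model (lasso) in plain NumPy. It is not the authors' conditioned generalized linear model, prior, or GPU implementation; it only shows the coordinate-wise update loop that such approaches repeat over very large data sets.

```python
import numpy as np

def lasso_cd(X, y, lam, sweeps=200):
    """Serial cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]            # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]  # soft-threshold
            resid -= X[:, j] * beta[j]            # add the updated coordinate back
    return beta

rng = np.random.default_rng(0)
n, p = 1_000, 20
X = rng.normal(size=(n, p))
true_beta = np.zeros(p); true_beta[[2, 5, 11]] = [1.5, -2.0, 0.75]
y = X @ true_beta + rng.normal(scale=0.5, size=n)
print(np.round(lasso_cd(X, y, lam=0.1), 2))
```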
Quantum communication complexity advantage implies violation of a Bell inequality
Buhrman, Harry; Czekaj, Łukasz; Grudka, Andrzej; Horodecki, Michał; Horodecki, Paweł; Markiewicz, Marcin; Speelman, Florian; Strelchuk, Sergii
2016-01-01
We obtain a general connection between a large quantum advantage in communication complexity and Bell nonlocality. We show that given any protocol offering a sufficiently large quantum advantage in communication complexity, there exists a way of obtaining measurement statistics that violate some Bell inequality. Our main tool is port-based teleportation. If the gap between quantum and classical communication complexity can grow arbitrarily large, the ratio of the quantum value to the classical value of the Bell quantity becomes unbounded with the increase in the number of inputs and outputs. PMID:26957600
Enforcing Job Safety: A Managerial View
ERIC Educational Resources Information Center
Barnako, Frank R.
1975-01-01
The views of management or of employees regarding enforcement of the job safety law range from general satisfaction to calls for repeal of the act. The complexity of standards, statistics and recordkeeping, and enforcement procedures are major areas of concern. (MW)
Transmit Designs for the MIMO Broadcast Channel With Statistical CSI
NASA Astrophysics Data System (ADS)
Wu, Yongpeng; Jin, Shi; Gao, Xiqi; McKay, Matthew R.; Xiao, Chengshan
2014-09-01
We investigate the multiple-input multiple-output broadcast channel with statistical channel state information available at the transmitter. The so-called linear assignment operation is employed, and necessary conditions are derived for the optimal transmit design under general fading conditions. Based on this, we introduce an iterative algorithm to maximize the linear assignment weighted sum-rate by applying a gradient descent method. To reduce complexity, we derive an upper bound of the linear assignment achievable rate of each receiver, from which a simplified closed-form expression for a near-optimal linear assignment matrix is derived. This reveals an interesting construction analogous to that of dirty-paper coding. In light of this, a low complexity transmission scheme is provided. Numerical examples illustrate the significant performance of the proposed low complexity scheme.
Higgs, W
1996-12-01
This paper examines the relationship between intellectual debate, technologies for analysing information, and the production of statistics in the General Register Office (GRO) in London in the early twentieth century. It argues that controversy between eugenicists and public health officials respecting the cause and effect of class-specific variations in fertility led to the introduction of questions in the 1911 census on marital fertility. The increasing complexity of the census necessitated a shift from manual to mechanised forms of data processing within the GRO. The subsequent increase in processing power allowed the GRO to make important changes to the medical and demographic statistics it published in the annual Reports of the Registrar General. These included substituting administrative sanitary districts for registration districts as units of analysis, consistently transferring deaths in institutions back to place of residence, and abstracting deaths according to the International List of Causes of Death.
Right-Sizing Statistical Models for Longitudinal Data
Wood, Phillip K.; Steinley, Douglas; Jackson, Kristina M.
2015-01-01
Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to “right-size” the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting overly parsimonious models to more complex better fitting alternatives, and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically under-identified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A three-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation/covariation patterns. The orthogonal, free-curve slope-intercept (FCSI) growth model is considered as a general model which includes, as special cases, many models including the Factor Mean model (FM, McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, Hierarchical Linear Models (HLM), Repeated Measures MANOVA, and the Linear Slope Intercept (LinearSI) Growth Model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparison of several candidate parametric growth and chronometric models in a Monte Carlo study. PMID:26237507
Right-sizing statistical models for longitudinal data.
Wood, Phillip K; Steinley, Douglas; Jackson, Kristina M
2015-12-01
Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to "right-size" the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting, overly parsimonious models to more complex, better-fitting alternatives and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically underidentified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A 3-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation-covariation patterns. The orthogonal free curve slope intercept (FCSI) growth model is considered a general model that includes, as special cases, many models, including the factor mean (FM) model (McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, hierarchical linear models (HLMs), repeated-measures multivariate analysis of variance (MANOVA), and the linear slope intercept (linearSI) growth model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparing several candidate parametric growth and chronometric models in a Monte Carlo study.
NASA Astrophysics Data System (ADS)
Qi, Di
Turbulent dynamical systems are ubiquitous in science and engineering. Uncertainty quantification (UQ) in turbulent dynamical systems is a grand challenge where the goal is to obtain statistical estimates for key physical quantities. In the development of a proper UQ scheme for systems characterized by both a high-dimensional phase space and a large number of instabilities, significant model errors compared with the true natural signal are always unavoidable due to both the imperfect understanding of the underlying physical processes and the limited computational resources available. One central issue in contemporary research is the development of a systematic methodology for reduced order models that can recover the crucial features both with model fidelity in statistical equilibrium and with model sensitivity in response to perturbations. In the first part, we discuss a general mathematical framework to construct statistically accurate reduced-order models that have skill in capturing the statistical variability in the principal directions of a general class of complex systems with quadratic nonlinearity. A systematic hierarchy of simple statistical closure schemes, which are built through new global statistical energy conservation principles combined with statistical equilibrium fidelity, are designed and tested for UQ of these problems. Second, the capacity of imperfect low-order stochastic approximations to model extreme events in a passive scalar field advected by turbulent flows is investigated. The effects in complicated flow systems are considered including strong nonlinear and non-Gaussian interactions, and much simpler and cheaper imperfect models with model error are constructed to capture the crucial statistical features in the stationary tracer field. Several mathematical ideas are introduced to improve the prediction skill of the imperfect reduced-order models. Most importantly, empirical information theory and statistical linear response theory are applied in the training phase for calibrating model errors to achieve optimal imperfect model parameters; and total statistical energy dynamics are introduced to improve the model sensitivity in the prediction phase especially when strong external perturbations are exerted. The validity of reduced-order models for predicting statistical responses and intermittency is demonstrated on a series of instructive models with increasing complexity, including the stochastic triad model, the Lorenz '96 model, and models for barotropic and baroclinic turbulence. The skillful low-order modeling methods developed here should also be useful for other applications such as efficient algorithms for data assimilation.
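One of the test models named above is the Lorenz '96 system; as a hedged illustration of the kind of equilibrium statistics a reduced-order model must reproduce, the sketch below integrates Lorenz '96 with a standard fourth-order Runge-Kutta scheme and reports the mean, variance, and a total "statistical energy" of the trajectory. Forcing, dimension, and step size are conventional illustrative values, not those used in the work described above.

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz '96: dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F (cyclic indices)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def integrate_l96(n=40, dt=0.01, steps=50_000, burn=5_000, F=8.0, seed=0):
    rng = np.random.default_rng(seed)
    x = F + 0.01 * rng.normal(size=n)
    traj = []
    for k in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = l96_tendency(x, F)
        k2 = l96_tendency(x + 0.5 * dt * k1, F)
        k3 = l96_tendency(x + 0.5 * dt * k2, F)
        k4 = l96_tendency(x + dt * k3, F)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        if k >= burn:
            traj.append(x.copy())
    return np.array(traj)

traj = integrate_l96()
print("mean         :", traj.mean().round(3))
print("variance     :", traj.var().round(3))
print("stat. energy :", (0.5 * (traj ** 2).sum(axis=1)).mean().round(2))
```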
A Guide to the Literature on Learning Graphical Models
NASA Technical Reports Server (NTRS)
Buntine, Wray L.; Friedland, Peter (Technical Monitor)
1994-01-01
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and more generally, learning probabilistic graphical models. Because many problems in artificial intelligence, statistics and neural networks can be represented as a probabilistic graphical model, this area provides a unifying perspective on learning. This paper organizes the research in this area along methodological lines of increasing complexity.
A generalized complexity measure based on Rényi entropy
NASA Astrophysics Data System (ADS)
Sánchez-Moreno, Pablo; Angulo, Juan Carlos; Dehesa, Jesus S.
2014-08-01
The intrinsic statistical complexities of finite many-particle systems (i.e., those defined in terms of the single-particle density) quantify the degree of structure or patterns, far beyond the entropy measures. They are intuitively constructed to be minima at the opposite extremes of perfect order and maximal randomness. Starting from the pioneering LMC measure, which satisfies these requirements, some extensions of LMC-Rényi type have been published in the literature. The latter measures were shown to describe a variety of physical aspects of the internal disorder in atomic and molecular systems (e.g., quantum phase transitions, atomic shell filling) which are not grasped by their mother LMC quantity. However, they are not minimal for maximal randomness in general. In this communication, we propose a generalized LMC-Rényi complexity which overcomes this problem. Some applications which illustrate this fact are given.
Exploiting Complexity Information for Brain Activation Detection
Zhang, Yan; Liang, Jiali; Lin, Qiang; Hu, Zhenghui
2016-01-01
We present a complexity-based approach for the analysis of fMRI time series, in which sample entropy (SampEn) is introduced as a quantification of the voxel complexity. Under this hypothesis the voxel complexity could be modulated in pertinent cognitive tasks, and it changes through experimental paradigms. We calculate the complexity of sequential fMRI data for each voxel in two distinct experimental paradigms and use a nonparametric statistical strategy, the Wilcoxon signed rank test, to evaluate the difference in complexity between them. The results are compared with the well-known general linear model based Statistical Parametric Mapping package (SPM12), and a clear difference is observed. This is because the SampEn method detects brain complexity changes in two experiments of different conditions and the data-driven method SampEn evaluates just the complexity of specific sequential fMRI data. Also, the larger and smaller SampEn values correspond to different meanings, and the neutral-blank design produces higher predictability than threat-neutral. Complexity information can be considered as a complementary method to the existing fMRI analysis strategies, and it may help improve the understanding of human brain functions from a different perspective. PMID:27045838
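A minimal, hedged implementation of sample entropy in its standard form (negative log ratio of template matches at lengths m+1 and m within tolerance r, self-matches excluded) is sketched below; the parameters m = 2 and r = 0.2 times the standard deviation are conventional defaults and not necessarily those used in the paper.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r) = -log(A / B), where B counts template matches of length m and
    A of length m + 1 within tolerance r = r_factor * std(x), self-matches excluded."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()
    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count
    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

rng = np.random.default_rng(0)
regular = np.sin(np.linspace(0, 60 * np.pi, 3_000))    # highly predictable signal
noisy = rng.normal(size=3_000)                          # unpredictable signal
print("SampEn (sine) :", round(sample_entropy(regular), 3))
print("SampEn (noise):", round(sample_entropy(noisy), 3))
```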
Is Statistical Learning Constrained by Lower Level Perceptual Organization?
Emberson, Lauren L.; Liu, Ran; Zevin, Jason D.
2013-01-01
In order for statistical information to aid in complex developmental processes such as language acquisition, learning from higher-order statistics (e.g. across successive syllables in a speech stream to support segmentation) must be possible while perceptual abilities (e.g. speech categorization) are still developing. The current study examines how perceptual organization interacts with statistical learning. Adult participants were presented with multiple exemplars from novel, complex sound categories designed to reflect some of the spectral complexity and variability of speech. These categories were organized into sequential pairs and presented such that higher-order statistics, defined based on sound categories, could support stream segmentation. Perceptual similarity judgments and multi-dimensional scaling revealed that participants only perceived three perceptual clusters of sounds and thus did not distinguish the four experimenter-defined categories, creating a tension between lower level perceptual organization and higher-order statistical information. We examined whether the resulting pattern of learning is more consistent with statistical learning being “bottom-up,” constrained by the lower levels of organization, or “top-down,” such that higher-order statistical information of the stimulus stream takes priority over the perceptual organization, and perhaps influences perceptual organization. We consistently find evidence that learning is constrained by perceptual organization. Moreover, participants generalize their learning to novel sounds that occupy a similar perceptual space, suggesting that statistical learning occurs based on regions of or clusters in perceptual space. Overall, these results reveal a constraint on learning of sound sequences, such that statistical information is determined based on lower level organization. These findings have important implications for the role of statistical learning in language acquisition. PMID:23618755
The best motivator priorities parents choose via analytical hierarchy process
NASA Astrophysics Data System (ADS)
Farah, R. N.; Latha, P.
2015-05-01
Motivation is probably the most important factor that educators can target in order to improve learning. Numerous cross-disciplinary theories have been postulated to explain motivation. While each of these theories has some truth, no single theory seems to adequately explain all human motivation. The fact is that human beings in general and pupils in particular are complex creatures with complex needs and desires. In this paper, the Analytic Hierarchy Process (AHP) is proposed as an emerging solution to large, dynamic and complex real-world multi-criteria decision making problems, here the selection of the most suitable motivator when parents choose a school for their children. Data were analyzed using SPSS 17.0 ("Statistical Package for Social Science") software. The statistical testing comprised descriptive and inferential statistics. Descriptive statistics were used to characterize the demographic factors of the respondent pupils and parents. Inferential testing was used to determine the pupils' and parents' highest motivator priorities, and AHP was used to rank the criteria considered by parents, namely school principals, teachers, pupils and parents. The moderating factor is the set of schools in Ampang selected on the basis of "Standard Kualiti Pendidikan Malaysia" (SKPM). Inferential statistics such as one-way ANOVA were used to assess significance, and the data were used to calculate the AHP weightings. School principals were found to be the best motivator for parents in choosing a school for their children, followed by teachers, parents and pupils.
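The core AHP computation is sketched below with invented pairwise comparison values: priority weights are taken from the principal eigenvector of the pairwise comparison matrix and checked with Saaty's consistency ratio. The four criteria mirror those named in the abstract, but the judgments and resulting weights are illustrative only.

```python
import numpy as np

# Invented pairwise comparisons (Saaty 1-9 scale) among four criteria:
# school principals, teachers, pupils, parents.  A[i, j] = importance of i over j.
criteria = ["principals", "teachers", "pupils", "parents"]
A = np.array([[1.0, 3.0, 5.0, 4.0],
              [1/3, 1.0, 3.0, 2.0],
              [1/5, 1/3, 1.0, 1/2],
              [1/4, 1/2, 2.0, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                                  # priority weights (principal eigenvector)

n = len(A)
CI = (eigvals.real[k] - n) / (n - 1)          # consistency index
RI = 0.90                                     # Saaty's random index for n = 4
print("priorities:", dict(zip(criteria, np.round(w, 3))))
print(f"consistency ratio CR = {CI / RI:.3f} (CR < 0.10 is conventionally acceptable)")
```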
The Entropy of Non-Ergodic Complex Systems — a Derivation from First Principles
NASA Astrophysics Data System (ADS)
Thurner, Stefan; Hanel, Rudolf
In information theory the four Shannon-Khinchin (SK) axioms determine the Boltzmann-Gibbs entropy, S = -∑_i p_i log p_i, as the unique entropy. Physics is different from information in the sense that physical systems can be non-ergodic or non-Markovian. To characterize such strongly interacting, statistical systems - complex systems in particular - within a thermodynamical framework it might be necessary to introduce generalized entropies. A series of such entropies have been proposed in the past decades. Until now the understanding of their fundamental origin and their deeper relations to complex systems remains unclear. To clarify the situation we note that non-ergodicity explicitly violates the fourth SK axiom. We show that by relaxing this axiom the entropy generalizes to S = ∑_i Γ(d + 1, 1 - c log p_i), where Γ is the incomplete Gamma function, and c and d are scaling exponents. All recently proposed entropies compatible with the first three SK axioms appear to be special cases. We prove that each statistical system is uniquely characterized by the pair of the two scaling exponents (c, d), which defines equivalence classes for all systems. The corresponding distribution functions are special forms of Lambert-W exponentials containing, as special cases, Boltzmann, stretched exponential and Tsallis distributions (power-laws) - all widely abundant in nature. This derivation is the first ab initio justification for generalized entropies. We next show how the phasespace volume of a system is related to its generalized entropy, and provide a concise criterion when it is not of Boltzmann-Gibbs type but assumes a generalized form. We show that generalized entropies only become relevant when the dynamically (statistically) relevant fraction of degrees of freedom in a system vanishes in the thermodynamic limit. These are systems where the bulk of the degrees of freedom is frozen. Systems governed by generalized entropies are therefore systems whose phasespace volume effectively collapses to a lower-dimensional 'surface'. We explicitly illustrate the situation for accelerating random walks, and a spin system on a constant-connectancy network. We argue that generalized entropies should be relevant for self-organized critical systems such as sand piles, for spin systems which form meta-structures such as vortices, domains, instantons, etc., and for problems associated with anomalous diffusion.
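The quoted entropy can be evaluated numerically with the upper incomplete gamma function; the hedged sketch below does so via SciPy (whose gammaincc is the regularized form, so it is multiplied back by Γ(a)) and checks that the (c, d) = (1, 1) case reduces to the Shannon entropy up to the constants discussed in the literature. The probability vector and the second exponent pair are arbitrary illustrative choices.

```python
import numpy as np
from scipy.special import gamma, gammaincc

def upper_incomplete_gamma(a, x):
    """Gamma(a, x) = integral from x to infinity of t^(a-1) e^(-t) dt."""
    return gammaincc(a, x) * gamma(a)

def generalized_entropy(p, c, d):
    """S_{c,d} = sum_i Gamma(d + 1, 1 - c * log p_i) (bare form quoted in the text)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return np.sum(upper_incomplete_gamma(d + 1.0, 1.0 - c * np.log(p)))

p = np.array([0.5, 0.25, 0.125, 0.125])
shannon = -np.sum(p * np.log(p))

# For (c, d) = (1, 1) the bare sum equals (Shannon + 2) / e, so the BGS case
# is recovered up to the constants used in the literature.
s11 = generalized_entropy(p, c=1.0, d=1.0)
print(f"Shannon = {shannon:.4f}, e * S_(1,1) - 2 = {np.e * s11 - 2:.4f}")
print(f"S_(0.7,1.5) = {generalized_entropy(p, c=0.7, d=1.5):.4f}  (illustrative exponents)")
```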
Physics-based statistical learning approach to mesoscopic model selection.
Taverniers, Søren; Haut, Terry S; Barros, Kipton; Alexander, Francis J; Lookman, Turab
2015-11-01
In materials science and many other research areas, models are frequently inferred without considering their generalization to unseen data. We apply statistical learning using cross-validation to obtain an optimally predictive coarse-grained description of a two-dimensional kinetic nearest-neighbor Ising model with Glauber dynamics (GD) based on the stochastic Ginzburg-Landau equation (sGLE). The latter is learned from GD "training" data using a log-likelihood analysis, and its predictive ability for various complexities of the model is tested on GD "test" data independent of the data used to train the model on. Using two different error metrics, we perform a detailed analysis of the error between magnetization time trajectories simulated using the learned sGLE coarse-grained description and those obtained using the GD model. We show that both for equilibrium and out-of-equilibrium GD training trajectories, the standard phenomenological description using a quartic free energy does not always yield the most predictive coarse-grained model. Moreover, increasing the amount of training data can shift the optimal model complexity to higher values. Our results are promising in that they pave the way for the use of statistical learning as a general tool for materials modeling and discovery.
A Statistical Dynamic Approach to Structural Evolution of Complex Capital Market Systems
NASA Astrophysics Data System (ADS)
Shao, Xiao; Chai, Li H.
As an important part of modern financial systems, the capital market has played a crucial role in diverse social resource allocations and economic exchanges. Going beyond traditional models and/or theories based on neoclassical economics, and considering capital markets as typical complex open systems, this paper attempts to develop a new approach to overcome some shortcomings of the available research. By defining the generalized entropy of capital market systems, a theoretical model and a nonlinear dynamic equation for the operation of capital markets are proposed from a statistical dynamic perspective. The US security market from 1995 to 2001 is then simulated and analyzed as a typical case. Some instructive results are discussed and summarized.
A Statistical Test of Walrasian Equilibrium by Means of Complex Networks Theory
NASA Astrophysics Data System (ADS)
Bargigli, Leonardo; Viaggiu, Stefano; Lionetto, Andrea
2016-10-01
We represent an exchange economy in terms of statistical ensembles for complex networks by introducing the concept of market configuration. This is defined as a sequence of nonnegative discrete random variables {w_{ij}} describing the flow of a given commodity from agent i to agent j. This sequence can be arranged in a nonnegative matrix W which we can regard as the representation of a weighted and directed network or digraph G. Our main result consists in showing that general equilibrium theory imposes highly restrictive conditions upon market configurations, which are in most cases not fulfilled by real markets. An explicit example with reference to the e-MID interbank credit market is provided.
Statistical learning and the challenge of syntax: Beyond finite state automata
NASA Astrophysics Data System (ADS)
Elman, Jeff
2003-10-01
Over the past decade, it has been clear that even very young infants are sensitive to the statistical structure of language input presented to them, and use the distributional regularities to induce simple grammars. But can such statistically-driven learning also explain the acquisition of more complex grammar, particularly when the grammar includes recursion? Recent claims (e.g., Hauser, Chomsky, and Fitch, 2002) have suggested that the answer is no, and that at least recursion must be an innate capacity of the human language acquisition device. In this talk evidence will be presented that indicates that, in fact, statistically-driven learning (embodied in recurrent neural networks) can indeed enable the learning of complex grammatical patterns, including those that involve recursion. When the results are generalized to idealized machines, it is found that the networks are at least equivalent to Push Down Automata. Perhaps more interestingly, with limited and finite resources (such as are presumed to exist in the human brain) these systems demonstrate patterns of performance that resemble those in humans.
NASA Astrophysics Data System (ADS)
Siddiqui, Maheen; Wedemann, Roseli S.; Jensen, Henrik Jeldtoft
2018-01-01
We explore statistical characteristics of avalanches associated with the dynamics of a complex-network model, where two modules corresponding to sensorial and symbolic memories interact, representing unconscious and conscious mental processes. The model illustrates Freud's ideas regarding the neuroses and the notion that consciousness is related to symbolic and linguistic memory activity in the brain. It incorporates the Stariolo-Tsallis generalization of the Boltzmann Machine in order to model memory retrieval and associativity. In the present work, we define and measure avalanche size distributions during memory retrieval, in order to gain insight regarding basic aspects of the functioning of these complex networks. The avalanche sizes defined for our model should be related to the time consumed and also to the size of the neuronal region which is activated during memory retrieval. This allows the qualitative comparison of the behaviour of the distribution of cluster sizes, obtained during fMRI measurements of the propagation of signals in the brain, with the distribution of avalanche sizes obtained in our simulation experiments. This comparison corroborates the indication that the Nonextensive Statistical Mechanics formalism may indeed be better suited to model the complex networks which constitute brain and mental structure.
Statistical dielectronic recombination rates for multielectron ions in plasma
NASA Astrophysics Data System (ADS)
Demura, A. V.; Leont'iev, D. S.; Lisitsa, V. S.; Shurygin, V. A.
2017-10-01
We describe the general analytic derivation of the dielectronic recombination (DR) rate coefficient for multielectron ions in a plasma based on the statistical theory of an atom in terms of the spatial distribution of the atomic electron density. The dielectronic recombination rates for complex multielectron tungsten ions are calculated numerically in a wide range of variation of the plasma temperature, which is important for modern nuclear fusion studies. The results of statistical theory are compared with the data obtained using level-by-level codes ADPAK, FAC, HULLAC, and experimental results. We consider different statistical DR models based on the Thomas-Fermi distribution, viz., integral and differential with respect to the orbital angular momenta of the ion core and the trapped electron, as well as the Rost model, which is an analog of the Frank-Condon model as applied to atomic structures. In view of its universality and relative simplicity, the statistical approach can be used for obtaining express estimates of the dielectronic recombination rate coefficients in complex calculations of the parameters of the thermonuclear plasmas. The application of statistical methods also provides information for the dielectronic recombination rates with much smaller computer time expenditures as compared to available level-by-level codes.
Eigenvalue statistics for the sum of two complex Wishart matrices
NASA Astrophysics Data System (ADS)
Kumar, Santosh
2014-09-01
The sum of independent Wishart matrices, taken from distributions with unequal covariance matrices, plays a crucial role in multivariate statistics, and has applications in the fields of quantitative finance and telecommunication. However, analytical results concerning the corresponding eigenvalue statistics have remained unavailable, even for the sum of two Wishart matrices. This can be attributed to the complicated and rotationally noninvariant nature of the matrix distribution, which makes extracting the information about eigenvalues a nontrivial task. Using a generalization of the Harish-Chandra-Itzykson-Zuber integral, we find an exact solution to this problem for the complex Wishart case when one of the covariance matrices is proportional to the identity matrix, while the other is arbitrary. We derive exact and compact expressions for the joint probability density and marginal density of eigenvalues. The analytical results are compared with numerical simulations and we find perfect agreement.
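The setting can be explored numerically by Monte Carlo sampling, as in the sketch below: two complex Wishart matrices are drawn, one with covariance proportional to the identity and one with an arbitrary (here diagonal) covariance, and the eigenvalues of their sum are collected. This reproduces the ensemble studied in the abstract only empirically, not the analytical HCIZ-based result; the dimensions and covariances are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_wishart(n, m, sigma, rng):
    """Draw one n x n complex Wishart matrix X X^H with m degrees of freedom,
    where the columns of X are complex Gaussian with covariance `sigma`."""
    L = np.linalg.cholesky(sigma)
    X = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    X = L @ X
    return X @ X.conj().T

n, m, samples = 4, 10, 2000
sigma1 = np.eye(n)                        # proportional to the identity
sigma2 = np.diag([0.5, 1.0, 1.5, 2.0])    # arbitrary (here diagonal) covariance

eigs = []
for _ in range(samples):
    W = complex_wishart(n, m, sigma1, rng) + complex_wishart(n, m, sigma2, rng)
    eigs.extend(np.linalg.eigvalsh(W))     # Hermitian, so eigenvalues are real

print("mean eigenvalue:", np.mean(eigs))
```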
Statistical physics of the symmetric group.
Williams, Mobolaji
2017-04-01
Ordered chains (such as chains of amino acids) are ubiquitous in biological cells, and these chains perform specific functions contingent on the sequence of their components. Using the existence and general properties of such sequences as a theoretical motivation, we study the statistical physics of systems whose state space is defined by the possible permutations of an ordered list, i.e., the symmetric group, and whose energy is a function of how certain permutations deviate from some chosen correct ordering. Such a nonfactorizable state space is quite different from the state spaces typically considered in statistical physics systems and consequently has novel behavior in systems with interacting and even noninteracting Hamiltonians. Various parameter choices of a mean-field model reveal the system to contain five different physical regimes defined by two transition temperatures, a triple point, and a quadruple point. Finally, we conclude by discussing how the general analysis can be extended to state spaces with more complex combinatorial properties and to other standard questions of statistical mechanics models.
Rogue waves in terms of multi-point statistics and nonequilibrium thermodynamics
NASA Astrophysics Data System (ADS)
Hadjihosseini, Ali; Lind, Pedro; Mori, Nobuhito; Hoffmann, Norbert P.; Peinke, Joachim
2017-04-01
Ocean waves, which lead to rogue waves, are investigated against the background of complex systems. In contrast to deterministic approaches based on the nonlinear Schroedinger equation or focusing effects, we analyze this system in terms of a noisy stochastic system. In particular we present a statistical method that maps the complexity of multi-point data into the statistics of hierarchically ordered height increments for different time scales. We show that the stochastic cascade process with Markov properties is governed by a Fokker-Planck equation. Conditional probabilities as well as the Fokker-Planck equation itself can be estimated directly from the available observational data. This stochastic description enables us to show several new aspects of wave states. Surrogate data sets can in turn be generated, allowing us to work out different statistical features of the complex sea state in general and of extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics. As a new outlook, the ocean wave states will be considered in terms of nonequilibrium thermodynamics, for which the entropy production of different wave heights will be considered. We show evidence that rogue waves are characterized by negative entropy production. The statistics of the entropy production can be used to distinguish different wave states.
Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao
2009-01-01
Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
Experimental econophysics: Complexity, self-organization, and emergent properties
NASA Astrophysics Data System (ADS)
Huang, J. P.
2015-03-01
Experimental econophysics is concerned with statistical physics of humans in the laboratory, and it is based on controlled human experiments developed by physicists to study some problems related to economics or finance. It relies on controlled human experiments in the laboratory together with agent-based modeling (for computer simulations and/or analytical theory), with an attempt to reveal the general cause-effect relationship between specific conditions and emergent properties of real economic/financial markets (a kind of complex adaptive systems). Here I review the latest progress in the field, namely, stylized facts, herd behavior, contrarian behavior, spontaneous cooperation, partial information, and risk management. Also, I highlight the connections between such progress and other topics of traditional statistical physics. The main theme of the review is to show diverse emergent properties of the laboratory markets, originating from self-organization due to the nonlinear interactions among heterogeneous humans or agents (complexity).
GLIMMPSE Lite: Calculating Power and Sample Size on Smartphone Devices
Munjal, Aarti; Sakhadeo, Uttara R.; Muller, Keith E.; Glueck, Deborah H.; Kreidler, Sarah M.
2014-01-01
Researchers seeking to develop complex statistical applications for mobile devices face a common set of difficult implementation issues. In this work, we discuss general solutions to the design challenges. We demonstrate the utility of the solutions for a free mobile application designed to provide power and sample size calculations for univariate, one-way analysis of variance (ANOVA), GLIMMPSE Lite. Our design decisions provide a guide for other scientists seeking to produce statistical software for mobile platforms. PMID:25541688
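The kind of computation GLIMMPSE Lite provides for one-way ANOVA, power and sample size from an assumed effect size, can be sketched in Python with statsmodels; the effect size, group count, and alpha below are illustrative assumptions and are unrelated to the app's own implementation.

```python
from statsmodels.stats.power import FTestAnovaPower

# One-way ANOVA power analysis with Cohen's f effect size (illustrative values).
analysis = FTestAnovaPower()

# Total sample size needed to detect a medium effect (f = 0.25)
# across 3 groups at alpha = 0.05 with 80% power.
n_total = analysis.solve_power(effect_size=0.25, alpha=0.05, power=0.80, k_groups=3)
print("required total N:", round(n_total))

# Conversely, the achieved power for a fixed total sample size of 60.
power = analysis.solve_power(effect_size=0.25, nobs=60, alpha=0.05, k_groups=3)
print("power at N=60:", round(power, 3))
```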
Detection of crossover time scales in multifractal detrended fluctuation analysis
NASA Astrophysics Data System (ADS)
Ge, Erjia; Leung, Yee
2013-04-01
Fractal analysis is employed in this paper as a scale-based method for the identification of the scaling behavior of time series. Many spatial and temporal processes exhibiting complex multi(mono)-scaling behaviors are fractals. One of the important concepts in fractals is the crossover time scale(s) that separates distinct regimes having different fractal scaling behaviors. A common method for such analysis is multifractal detrended fluctuation analysis (MF-DFA). The detection of crossover time scale(s) is, however, relatively subjective since it has been made without rigorous statistical procedures and has generally been determined by eye-balling or subjective observation. Crossover time scales so determined may be spurious and problematic. They may not reflect the genuine underlying scaling behavior of a time series. The purpose of this paper is to propose a statistical procedure to model complex fractal scaling behaviors and reliably identify the crossover time scales under MF-DFA. The scaling-identification regression model, grounded on a solid statistical foundation, is first proposed to describe the multi-scaling behaviors of fractals. Through the regression analysis and statistical inference, we can (1) identify the crossover time scales that cannot be detected by eye-balling observation, (2) determine the number and locations of the genuine crossover time scales, (3) give confidence intervals for the crossover time scales, and (4) establish the statistically significant regression model depicting the underlying scaling behavior of a time series. To substantiate our argument, the regression model is applied to analyze the multi-scaling behaviors of avian-influenza outbreaks, water consumption, daily mean temperature, and rainfall of Hong Kong. Through the proposed model, we can have a deeper understanding of fractals in general and a statistical approach to identify multi-scaling behavior under MF-DFA in particular.
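A much simpler stand-in for the idea, not the paper's scaling-identification regression model, is to fit two log-log linear segments to a fluctuation function and pick the breakpoint that minimizes the residual error. The sketch below does this on a synthetic F(s) with a known crossover at s = 100; the data and the three-point minimum segment length are illustrative assumptions.

```python
import numpy as np

def detect_crossover(scales, fluct):
    """Fit two log-log linear segments to a fluctuation function F(s) and
    return the breakpoint (crossover scale) minimizing total squared error.
    A simplified sketch of crossover identification, not the full
    scaling-identification regression model of the paper."""
    x, y = np.log(scales), np.log(fluct)
    best = (np.inf, None)
    for k in range(3, len(x) - 3):            # require >= 3 points per segment
        sse = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            coef = np.polyfit(xs, ys, 1)
            sse += np.sum((ys - np.polyval(coef, xs)) ** 2)
        if sse < best[0]:
            best = (sse, k)
    return scales[best[1]]

# Synthetic example: scaling exponent 0.8 below s = 100 and 0.3 above it.
scales = np.logspace(1, 3, 30)
fluct = np.where(scales < 100, scales ** 0.8, 100 ** 0.5 * scales ** 0.3)
print("estimated crossover scale:", round(detect_crossover(scales, fluct), 1))
```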
The World of Juvenile Justice According to the Numbers
ERIC Educational Resources Information Center
Rozalski, Michael; Deignan, Marilyn; Engel, Suzanne
2008-01-01
Intended to be an instructive, yet sobering, introduction to the complex and disturbing nature of the juvenile justice system, this article details the "numbers," including selected percentages, ratios, and dollar amounts, that are relevant to developing a better understanding of the juvenile justice system. General statistics about juvenile and…
Evidence of the non-extensive character of Earth's ambient noise.
NASA Astrophysics Data System (ADS)
Koutalonis, Ioannis; Vallianatos, Filippos
2017-04-01
Investigation of the dynamical features of ambient seismic noise is one of the important scientific and practical research challenges. At the same time there is growing interest in an approach to studying Earth physics based on the science of complex systems and non-extensive statistical mechanics, which is a generalization of Boltzmann-Gibbs statistical physics (Vallianatos et al., 2016). This seems to be a promising framework for studying complex systems exhibiting phenomena such as long-range interactions and memory effects. In this work we use non-extensive statistical mechanics and signal analysis methods to explore the nature of ambient noise as measured at the stations of the HSNC in the South Aegean (Chatzopoulos et al., 2016). We analyzed the de-trended increment time series of ambient seismic noise X(t), in time windows of 20 minutes to 10 seconds, within "calm time zones" where the human-induced noise presents a minimum. Following the non-extensive statistical physics approach, the probability distribution function of the increments of ambient noise is investigated. Analysis of the probability density function (PDF) p(X), normalized to zero mean and unit variance, shows that the fluctuations of Earth's ambient noise follow a q-Gaussian distribution as defined in the framework of non-extensive statistical mechanics, indicating the possible existence of memory effects in Earth's ambient noise. References: F. Vallianatos, G. Papadakis, G. Michas, Generalized statistical mechanics approaches to earthquakes and tectonics, Proc. R. Soc. A, 472, 20160497, 2016. G. Chatzopoulos, I. Papadopoulos, F. Vallianatos, The Hellenic Seismological Network of Crete (HSNC): Validation and results of the 2013 aftershock, Advances in Geosciences, 41, 65-72, 2016.
NASA Astrophysics Data System (ADS)
Vallianatos, F.; Tzanis, A.; Michas, G.; Papadakis, G.
2012-04-01
Since the middle of summer 2011, an increase in the seismicity rates of the volcanic complex system of Santorini Island, Greece, has been observed. In the present work, the temporal distribution of seismicity, as well as the magnitude distribution of earthquakes, have been studied using the concept of Non-Extensive Statistical Physics (NESP; Tsallis, 2009) along with the evolution of the Shannon entropy H (also called information entropy). The analysis is based on the earthquake catalogue of the Geodynamic Institute of the National Observatory of Athens for the period July 2011-January 2012 (http://www.gein.noa.gr/). Non-Extensive Statistical Physics, which is a generalization of Boltzmann-Gibbs statistical physics, seems a suitable framework for studying complex systems. The observed distributions of seismicity rates at Santorini can be described (fitted) exceptionally well with NESP models. This implies the inherent complexity of the Santorini volcanic seismicity, the applicability of NESP concepts to volcanic earthquake activity, and the usefulness of NESP in investigating phenomena exhibiting multifractality and long-range coupling effects. Acknowledgments. This work was supported in part by the THALES Program of the Ministry of Education of Greece and the European Union in the framework of the project entitled "Integrated understanding of Seismicity, using innovative Methodologies of Fracture mechanics along with Earthquake and non extensive statistical physics - Application to the geodynamic system of the Hellenic Arc. SEISMO FEAR HELLARC". GM and GP wish to acknowledge the partial support of the Greek State Scholarships Foundation (ΙΚΥ).
Role of microstructure on twin nucleation and growth in HCP titanium: A statistical study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arul Kumar, M.; Wroński, M.; McCabe, Rodney James
In this study, a detailed statistical analysis is performed using Electron Back Scatter Diffraction (EBSD) to establish the effect of microstructure on twin nucleation and growth in deformed commercial purity hexagonal close packed (HCP) titanium. Rolled titanium samples are compressed along rolling, transverse and normal directions to establish statistical correlations for {10–12}, {11–21}, and {11–22} twins. A recently developed automated EBSD-twinning analysis software is employed for the statistical analysis. Finally, the analysis provides the following key findings: (I) grain size and strain dependence is different for twin nucleation and growth; (II) twinning statistics can be generalized for the HCP metals magnesium, zirconium and titanium; and (III) complex microstructure, where grain shape and size distribution is heterogeneous, requires multi-point statistical correlations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willse, Alan R.; Belcher, Ann; Preti, George
2005-04-15
Gas chromatography (GC), combined with mass spectrometry (MS) detection, is a powerful analytical technique that can be used to separate, quantify, and identify volatile compounds in complex mixtures. This paper examines the application of GC-MS in a comparative experiment to identify volatiles that differ in concentration between two groups. A complex mixture might comprise several hundred or even thousands of volatile compounds. Because their number and location in a chromatogram generally are unknown, and because components overlap in populous chromatograms, the statistical problems offer significant challenges beyond traditional two-group screening procedures. We describe a statistical procedure to compare two-dimensional GC-MS profiles between groups, which entails (1) signal processing: baseline correction and peak detection in single ion chromatograms; (2) aligning chromatograms in time; (3) normalizing differences in overall signal intensities; and (4) detecting chromatographic regions that differ between groups. Compared to existing approaches, the proposed method is robust to errors made at earlier stages of analysis, such as missed peaks or slightly misaligned chromatograms. To illustrate the method, we identify differences in GC-MS chromatograms of ether-extracted urine collected from two nearly identical inbred groups of mice, to investigate the relationship between odor and genetics of the major histocompatibility complex.
EEG analysis using wavelet-based information tools.
Rosso, O A; Martin, M T; Figliola, A; Keller, K; Plastino, A
2006-06-15
Wavelet-based informational tools for quantitative electroencephalogram (EEG) record analysis are reviewed. Relative wavelet energies, wavelet entropies and wavelet statistical complexities are used in the characterization of scalp EEG records corresponding to secondary generalized tonic-clonic epileptic seizures. In particular, we show that the epileptic recruitment rhythm observed during seizure development is well described in terms of the relative wavelet energies. In addition, during the concomitant time-period the entropy diminishes while complexity grows. This is construed as evidence supporting the conjecture that an epileptic focus, for this kind of seizures, triggers a self-organized brain state characterized by both order and maximal complexity.
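The relative wavelet energies and the wavelet entropy mentioned above can be computed from a discrete wavelet decomposition. The sketch below uses PyWavelets; the wavelet family, the decomposition level, and the white-noise test signal are illustrative choices rather than the settings used in the EEG study.

```python
import numpy as np
import pywt

def wavelet_entropy(signal, wavelet="db4", level=5):
    """Relative wavelet energies and normalized wavelet entropy of a signal.
    A minimal sketch of the informational tools mentioned in the abstract."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    p = energies / energies.sum()                # relative wavelet energy per band
    entropy = -np.sum(p * np.log(p)) / np.log(len(p))   # normalized to [0, 1]
    return p, entropy

# Example: white noise spreads energy over bands, so the entropy is close to 1.
rng = np.random.default_rng(1)
p, h = wavelet_entropy(rng.standard_normal(4096))
print("relative energies:", np.round(p, 3), "wavelet entropy:", round(h, 3))
```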
Probing the Fluctuations of Optical Properties in Time-Resolved Spectroscopy
NASA Astrophysics Data System (ADS)
Randi, Francesco; Esposito, Martina; Giusti, Francesca; Misochko, Oleg; Parmigiani, Fulvio; Fausti, Daniele; Eckstein, Martin
2017-11-01
We show that, in optical pump-probe experiments on bulk samples, the statistical distribution of the intensity of ultrashort light pulses after interaction with a nonequilibrium complex material can be used to measure the time-dependent noise of the current in the system. We illustrate the general arguments for a photoexcited Peierls material. The transient noise spectroscopy allows us to measure to what extent electronic degrees of freedom dynamically obey the fluctuation-dissipation theorem, and how well they thermalize during the coherent lattice vibrations. The proposed statistical measurement developed here provides a new general framework to retrieve dynamical information on the excited distributions in nonequilibrium experiments, which could be extended to other degrees of freedom of magnetic or vibrational origin.
PREFACE: Mathematical Aspects of Generalized Entropies and their Applications
NASA Astrophysics Data System (ADS)
Suyari, Hiroki; Ohara, Atsumi; Wada, Tatsuaki
2010-01-01
In the recent increasing interests in power-law behaviors beyond the usual exponential ones, there have been some concrete attempts in statistical physics to generalize the standard Boltzmann-Gibbs statistics. Among such generalizations, nonextensive statistical mechanics has been well studied for about the last two decades with many modifications and refinements. The generalization has provided not only a theoretical framework but also many applications such as chaos, multi-fractal, complex systems, nonequilibrium statistical mechanics, biophysics, econophysics, information theory and so on. At the same time as the developments in the generalization of statistical mechanics, the corresponding mathematical structures have also been required and uncovered. In particular, some deep connections to mathematical sciences such as q-analysis, information geometry, information theory and quantum probability theory have been revealed recently. These results obviously indicate an existence of the generalized mathematical structure including the mathematical framework for the exponential family as a special case, but the whole structure is still unclear. In order to make an opportunity to discuss the mathematical structure induced from generalized entropies by scientists in many fields, the international workshop 'Mathematical Aspects of Generalized Entropies and their Applications' was held on 7-9 July 2009 at Kyoto TERRSA, Kyoto, Japan. This volume is the proceedings of the workshop which consisted of 6 invited speakers, 14 oral presenters, 7 poster presenters and 63 other participants. The topics of the workshop cover the nonextensive statistical mechanics, chaos, cosmology, information geometry, divergence theory, econophysics, materials engineering, molecular dynamics and entropy theory, information theory and so on. The workshop was organized as the first attempt to discuss these mathematical aspects with leading experts in each area. We would like to express special thanks to all the invited speakers, the contributors and the participants at the workshop. We are also grateful to RIMS (Research Institute for Mathematical Science) in Kyoto University and the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (B), 18300003, 2009 for their support. Organizing Committee Editors of the Proceedings Hiroki Suyari (Chiba University, Japan) Atsumi Ohara (Osaka University, Japan) Tatsuaki Wada (Ibaraki University, Japan) Conference photograph
Jonsen, Ian D; Myers, Ransom A; James, Michael C
2006-09-01
1. Biological and statistical complexity are features common to most ecological data that hinder our ability to extract meaningful patterns using conventional tools. Recent work on implementing modern statistical methods for analysis of such ecological data has focused primarily on population dynamics but other types of data, such as animal movement pathways obtained from satellite telemetry, can also benefit from the application of modern statistical tools. 2. We develop a robust hierarchical state-space approach for analysis of multiple satellite telemetry pathways obtained via the Argos system. State-space models are time-series methods that allow unobserved states and biological parameters to be estimated from data observed with error. We show that the approach can reveal important patterns in complex, noisy data where conventional methods cannot. 3. Using the largest Atlantic satellite telemetry data set for critically endangered leatherback turtles, we show that the diel pattern in travel rates of these turtles changes over different phases of their migratory cycle. While foraging in northern waters the turtles show similar travel rates during day and night, but on their southward migration to tropical waters travel rates are markedly faster during the day. These patterns are generally consistent with diving data, and may be related to changes in foraging behaviour. Interestingly, individuals that migrate southward to breed generally show higher daytime travel rates than individuals that migrate southward in a non-breeding year. 4. Our approach is extremely flexible and can be applied to many ecological analyses that use complex, sequential data.
ERIC Educational Resources Information Center
Kozliak, Evguenii I.
2004-01-01
A molecular approach for introducing entropy in undergraduate physical chemistry course and incorporating the features of Davies' treatment that meets the needs of the students but ignores the complexities of statistics and upgrades the qualitative, intuitive approach of Lambert for general chemistry to a semiquantitative treatment using Boltzmann…
Power Analysis for Complex Mediational Designs Using Monte Carlo Methods
ERIC Educational Resources Information Center
Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.
2010-01-01
Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex…
Transitions between superstatistical regimes: Validity, breakdown and applications
NASA Astrophysics Data System (ADS)
Jizba, Petr; Korbel, Jan; Lavička, Hynek; Prokš, Martin; Svoboda, Václav; Beck, Christian
2018-03-01
Superstatistics is a widely employed tool of non-equilibrium statistical physics which plays an important role in the analysis of hierarchical complex dynamical systems. Yet its "canonical" formulation in terms of a single nuisance parameter is often too restrictive when applied to complex empirical data. Here we show that a multi-scale generalization of the superstatistics paradigm is more versatile, allowing us to address such pertinent issues as the transmutation of statistics or inter-scale stochastic behavior. To put some flesh on the bare bones, we provide numerical evidence for a transition between two superstatistics regimes by analyzing high-frequency (minute-tick) data for the share-price returns of seven selected companies. Salient issues, such as the breakdown of superstatistics in fractional diffusion processes or the connection with Brownian subordination, are also briefly discussed.
Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2013-01-01
Distributed video coding (DVC) is rapidly increasing in popularity because it shifts complexity from the encoder to the decoder while, at least in theory, sacrificing no compression performance. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at the decoder based on the received syndromes of the Wyner-Ziv (WZ) frame and a side information (SI) frame generated from other frames available only at the decoder. However, the ultimate decoding performance of DVC rests on the assumption that perfect knowledge of the correlation statistic between the WZ and SI frames is available at the decoder. Therefore, the ability to obtain a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation, where estimation starts before decoding, and on-the-fly (OTF) estimation, where the estimate can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperform pre-estimation techniques at the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low-complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers a better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly lower complexity compared with the sampling method. PMID:23750314
Statistical and sampling issues when using multiple particle tracking
NASA Astrophysics Data System (ADS)
Savin, Thierry; Doyle, Patrick S.
2007-08-01
Video microscopy can be used to simultaneously track several microparticles embedded in a complex material. The trajectories are used to extract a sample of displacements at random locations in the material. From this sample, averaged quantities characterizing the dynamics of the probes are calculated to evaluate structural and/or mechanical properties of the assessed material. However, the sampling of measured displacements in heterogeneous systems is singular because the volume of observation with video microscopy is finite. By carefully characterizing the sampling design in the experimental output of the multiple particle tracking technique, we derive estimators for the mean and variance of the probes’ dynamics that are independent of the peculiar statistical characteristics. We expose stringent tests of these estimators using simulated and experimental complex systems with a known heterogeneous structure. Up to a certain fundamental limitation, which we characterize through a material degree of sampling by the embedded probe tracking, these estimators can be applied to quantify the heterogeneity of a material, providing an original and intelligible kind of information on complex fluid properties. More generally, we show that the precise assessment of the statistics in the multiple particle tracking output sample of observations is essential in order to provide accurate unbiased measurements.
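As a baseline for what the abstract refines, the sketch below computes the ensemble mean and variance of squared displacements at a fixed lag from a set of trajectories. It is plain ensemble averaging over simulated Brownian probes with heterogeneous diffusivities, not the bias-corrected estimators derived in the paper; all names and parameter values are illustrative.

```python
import numpy as np

def ensemble_msd(trajectories, lag):
    """Ensemble mean and variance of squared displacements at a given lag.
    `trajectories` is a list of (T_i, 2) arrays of x, y positions."""
    sq_disp = []
    for traj in trajectories:
        d = traj[lag:] - traj[:-lag]                 # displacements at this lag
        sq_disp.append(np.sum(d ** 2, axis=1))       # squared 2D displacements
    sq_disp = np.concatenate(sq_disp)
    return sq_disp.mean(), sq_disp.var(ddof=1)

# Example: 50 simulated Brownian probes with heterogeneous diffusivities.
rng = np.random.default_rng(2)
trajs = [np.cumsum(rng.normal(scale=np.sqrt(2 * D), size=(200, 2)), axis=0)
         for D in rng.uniform(0.5, 2.0, size=50)]
mean_msd, var_msd = ensemble_msd(trajs, lag=10)
print("MSD at lag 10:", round(mean_msd, 2), "variance:", round(var_msd, 2))
```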
Reciprocity in directed networks
NASA Astrophysics Data System (ADS)
Yin, Mei; Zhu, Lingjiong
2016-04-01
Reciprocity is an important characteristic of directed networks and has been widely used in the modeling of World Wide Web, email, social, and other complex networks. In this paper, we take a statistical physics point of view and study the limiting entropy and free energy densities from the microcanonical ensemble, the canonical ensemble, and the grand canonical ensemble whose sufficient statistics are given by edge and reciprocal densities. The sparse case is also studied for the grand canonical ensemble. Extensions to more general reciprocal models including reciprocal triangle and star densities will likewise be discussed.
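For a quick sense of the two sufficient statistics named above, the sketch below computes the edge density and the reciprocity of a random directed graph with networkx; note that the paper's reciprocal density may be normalized differently from networkx's overall reciprocity, and the Erdős-Rényi graph here is an arbitrary illustrative example.

```python
import networkx as nx

# Random directed graph as a stand-in for an observed network.
G = nx.gnp_random_graph(200, 0.05, directed=True, seed=42)

n = G.number_of_nodes()
edge_density = G.number_of_edges() / (n * (n - 1))
recip = nx.overall_reciprocity(G)   # fraction of edges with a reciprocal partner

print(f"edge density = {edge_density:.3f}, reciprocity = {recip:.3f}")
```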
Statistics of Shared Components in Complex Component Systems
NASA Astrophysics Data System (ADS)
Mazzolini, Andrea; Gherardi, Marco; Caselle, Michele; Cosentino Lagomarsino, Marco; Osella, Matteo
2018-04-01
Many complex systems are modular. Such systems can be represented as "component systems," i.e., sets of elementary components, such as LEGO bricks in LEGO sets. The bricks found in a LEGO set reflect a target architecture, which can be built following a set-specific list of instructions. In other component systems, instead, the underlying functional design and constraints are not obvious a priori, and their detection is often a challenge of both scientific and practical importance, requiring a clear understanding of component statistics. Importantly, some quantitative invariants appear to be common to many component systems, most notably a common broad distribution of component abundances, which often resembles the well-known Zipf's law. Such "laws" affect in a general and nontrivial way the component statistics, potentially hindering the identification of system-specific functional constraints or generative processes. Here, we specifically focus on the statistics of shared components, i.e., the distribution of the number of components shared by different system realizations, such as the common bricks found in different LEGO sets. To account for the effects of component heterogeneity, we consider a simple null model, which builds system realizations by random draws from a universe of possible components. Under general assumptions on abundance heterogeneity, we provide analytical estimates of component occurrence, which quantify exhaustively the statistics of shared components. Surprisingly, this simple null model can positively explain important features of empirical component-occurrence distributions obtained from large-scale data on bacterial genomes, LEGO sets, and book chapters. Specific architectural features and functional constraints can be detected from occurrence patterns as deviations from these null predictions, as we show for the illustrative case of the "core" genome in bacteria.
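The null model described above can be simulated directly: draw each "realization" as a random subset of components from a heterogeneous (Zipf-like) universe and then read off occurrence and sharing statistics. The sketch below does exactly that; the universe size, number of realizations, set size, and abundance law are illustrative assumptions, not the parameters of the genomes, LEGO sets, or book chapters analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Null model: each of R realizations draws S distinct components from a
# universe of N, with probabilities following a Zipf-like abundance law.
N, R, S = 2000, 100, 150
abundance = 1.0 / np.arange(1, N + 1)            # Zipf-like heterogeneity
prob = abundance / abundance.sum()

occurrence = np.zeros(N, dtype=int)
realizations = []
for _ in range(R):
    comps = rng.choice(N, size=S, replace=False, p=prob)
    occurrence[comps] += 1
    realizations.append(set(comps))

# Occurrence distribution: in how many realizations each component appears.
print("components present in all realizations:", np.sum(occurrence == R))

# Number of components shared by two particular realizations.
print("shared between set 0 and set 1:", len(realizations[0] & realizations[1]))
```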
The applications of Complexity Theory and Tsallis Non-extensive Statistics at Solar Plasma Dynamics
NASA Astrophysics Data System (ADS)
Pavlos, George
2015-04-01
As the solar plasma lives far from equilibrium it is an excellent laboratory for testing complexity theory and non-equilibrium statistical mechanics. In this study, we present the highlights of complexity theory and Tsallis non extensive statistical mechanics as concerns their applications at solar plasma dynamics, especially at sunspot, solar flare and solar wind phenomena. Generally, when a physical system is driven far from equilibrium states some novel characteristics can be observed related to the nonlinear character of dynamics. Generally, the nonlinearity in space plasma dynamics can generate intermittent turbulence with the typical characteristics of the anomalous diffusion process and strange topologies of stochastic space plasma fields (velocity and magnetic fields) caused by the strange dynamics and strange kinetics (Zaslavsky, 2002). In addition, according to Zelenyi and Milovanov (2004) the complex character of the space plasma system includes the existence of non-equilibrium (quasi)-stationary states (NESS) having the topology of a percolating fractal set. The stabilization of a system near the NESS is perceived as a transition into a turbulent state determined by self-organization processes. The long-range correlation effects manifest themselves as a strange non-Gaussian behavior of kinetic processes near the NESS plasma state. The complex character of space plasma can also be described by the non-extensive statistical thermodynamics pioneered by Tsallis, which offers a consistent and effective theoretical framework, based on a generalization of Boltzmann - Gibbs (BG) entropy, to describe far from equilibrium nonlinear complex dynamics (Tsallis, 2009). In a series of recent papers, the hypothesis of Tsallis non-extensive statistics in magnetosphere, sunspot dynamics, solar flares, solar wind and space plasma in general, was tested and verified (Karakatsanis et al., 2013; Pavlos et al., 2014; 2015). Our study includes the analysis of solar plasma time series at three cases: sunspot index, solar flare and solar wind data. The non-linear analysis of the sunspot index is embedded in the non-extensive statistical theory of Tsallis (1988; 2004; 2009). The q-triplet of Tsallis, as well as the correlation dimension and the Lyapunov exponent spectrum were estimated for the SVD components of the sunspot index timeseries. Also the multifractal scaling exponent spectrum f(a), the generalized Renyi dimension spectrum D(q) and the spectrum J(p) of the structure function exponents were estimated experimentally and theoretically by using the q-entropy principle included in Tsallis non-extensive statistical theory, following Arimitsu and Arimitsu (2000, 2001). 
Our analysis showed clearly the following: (a) a phase transition process in the solar dynamics from high dimensional non-Gaussian SOC state to a low dimensional non-Gaussian chaotic state, (b) strong intermittent solar turbulence and anomalous (multifractal) diffusion solar process, which is strengthened as the solar dynamics makes a phase transition to low dimensional chaos in accordance to Ruzmaikin, Zelenyi and Milovanov's studies (Zelenyi and Milovanov, 1991; Milovanov and Zelenyi, 1993; Ruzmakin et al., 1996), (c) faithful agreement of Tsallis non-equilibrium statistical theory with the experimental estimations of: (i) non-Gaussian probability distribution function P(x), (ii) multifractal scaling exponent spectrum f(a) and generalized Renyi dimension spectrum Dq, (iii) exponent spectrum J(p) of the structure functions estimated for the sunspot index and its underlying non equilibrium solar dynamics. Also, the q-triplet of Tsallis as well as the correlation dimension and the Lyapunov exponent spectrum were estimated for the singular value decomposition (SVD) components of the solar flares timeseries. Also the multifractal scaling exponent spectrum f(a), the generalized Renyi dimension spectrum D(q) and the spectrum J(p) of the structure function exponents were estimated experimentally and theoretically by using the q-entropy principle included in Tsallis non-extensive statistical theory, following Arimitsu and Arimitsu (2000). Our analysis showed clearly the following: (a) a phase transition process in the solar flare dynamics from a high dimensional non-Gaussian self-organized critical (SOC) state to a low dimensional also non-Gaussian chaotic state, (b) strong intermittent solar corona turbulence and an anomalous (multifractal) diffusion solar corona process, which is strengthened as the solar corona dynamics makes a phase transition to low dimensional chaos, (c) faithful agreement of Tsallis non-equilibrium statistical theory with the experimental estimations of the functions: (i) non-Gaussian probability distribution function P(x), (ii) f(a) and D(q), and (iii) J(p) for the solar flares timeseries and its underlying non-equilibrium solar dynamics, and (d) the solar flare dynamical profile is revealed similar to the dynamical profile of the solar corona zone as far as the phase transition process from self-organized criticality (SOC) to chaos state. However the solar low corona (solar flare) dynamical characteristics can be clearly discriminated from the dynamical characteristics of the solar convection zone. At last we present novel results revealing non-equilibrium phase transition processes in the solar wind plasma during a strong shock event, which can take place in Solar wind plasma system. The solar wind plasma as well as the entire solar plasma system is a typical case of stochastic spatiotemporal distribution of physical state variables such as force fields ( ) and matter fields (particle and current densities or bulk plasma distributions). This study shows clearly the non-extensive and non-Gaussian character of the solar wind plasma and the existence of multi-scale strong correlations from the microscopic to the macroscopic level. 
It also underlines the inefficiency of classical magneto-hydro-dynamic (MHD) or plasma statistical theories, based on the classical central limit theorem (CLT), to explain the complexity of the solar wind dynamics, since these theories include smooth and differentiable spatial-temporal functions (MHD theory) or Gaussian statistics (Boltzmann-Maxwell statistical mechanics). On the contrary, the results of this study indicate the presence of non-Gaussian non-extensive statistics with heavy tails probability distribution functions, which are related to the q-extension of CLT. Finally, the results of this study can be understood in the framework of modern theoretical concepts such as non-extensive statistical mechanics (Tsallis, 2009), fractal topology (Zelenyi and Milovanov, 2004), turbulence theory (Frisch, 1996), strange dynamics (Zaslavsky, 2002), percolation theory (Milovanov, 1997), anomalous diffusion theory and anomalous transport theory (Milovanov, 2001), fractional dynamics (Tarasov, 2013) and non-equilibrium phase transition theory (Chang, 1992). References 1. T. Arimitsu, N. Arimitsu, Tsallis statistics and fully developed turbulence, J. Phys. A: Math. Gen. 33 (2000) L235. 2. T. Arimitsu, N. Arimitsu, Analysis of turbulence by statistics based on generalized entropies, Physica A 295 (2001) 177-194. 3. T. Chang, Low-dimensional behavior and symmetry braking of stochastic systems near criticality can these effects be observed in space and in the laboratory, IEEE 20 (6) (1992) 691-694. 4. U. Frisch, Turbulence, Cambridge University Press, Cambridge, UK, 1996, p. 310. 5. L.P. Karakatsanis, G.P. Pavlos, M.N. Xenakis, Tsallis non-extensive statistics, intermittent turbulence, SOC and chaos in the solar plasma. Part two: Solar flares dynamics, Physica A 392 (2013) 3920-3944. 6. A.V. Milovanov, Topological proof for the Alexander-Orbach conjecture, Phys. Rev. E 56 (3) (1997) 2437-2446. 7. A.V. Milovanov, L.M. Zelenyi, Fracton excitations as a driving mechanism for the self-organized dynamical structuring in the solar wind, Astrophys. Space Sci. 264 (1-4) (1999) 317-345. 8. A.V. Milovanov, Stochastic dynamics from the fractional Fokker-Planck-Kolmogorov equation: large-scale behavior of the turbulent transport coefficient, Phys. Rev. E 63 (2001) 047301. 9. G.P. Pavlos, et al., Universality of non-extensive Tsallis statistics and time series analysis: Theory and applications, Physica A 395 (2014) 58-95. 10. G.P. Pavlos, et al., Tsallis non-extensive statistics and solar wind plasma complexity, Physica A 422 (2015) 113-135. 11. A.A. Ruzmaikin, et al., Spectral properties of solar convection and diffusion, ApJ 471 (1996) 1022. 12. V.E. Tarasov, Review of some promising fractional physical models, Internat. J. Modern Phys. B 27 (9) (2013) 1330005. 13. C. Tsallis, Possible generalization of BG statistics, J. Stat. Phys. J 52 (1-2) (1988) 479-487. 14. C. Tsallis, Nonextensive statistical mechanics: construction and physical interpretation, in: G.M. Murray, C. Tsallis (Eds.), Nonextensive Entropy-Interdisciplinary Applications, Oxford Univ. Press, 2004, pp. 1-53. 15. C. Tsallis, Introduction to Non-Extensive Statistical Mechanics, Springer, 2009. 16. G.M. Zaslavsky, Chaos, fractional kinetics, and anomalous transport, Physics Reports 371 (2002) 461-580. 17. L.M. Zelenyi, A.V. Milovanov, Fractal properties of sunspots, Sov. Astron. Lett. 17 (6) (1991) 425. 18. L.M. Zelenyi, A.V. 
Milovanov, Fractal topology and strange kinetics: from percolation theory to problems in cosmic electrodynamics, Phys.-Usp. 47 (8), (2004) 749-788.
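Two quantities that recur throughout the analysis above are the Tsallis q-entropy and the q-Gaussian distribution. The sketch below gives minimal implementations of both, assuming the standard textbook forms S_q = (1 - Σ_i p_i^q)/(q - 1) and a q-Gaussian shape built from the q-exponential; the parameter values are illustrative and unrelated to the q-triplets estimated for the solar data.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis q-entropy S_q = (1 - sum p_i^q) / (q - 1); reduces to the
    Boltzmann-Gibbs-Shannon entropy in the limit q -> 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p))
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def q_gaussian(x, q, beta):
    """Unnormalized q-Gaussian shape: for q > 1 it has heavy power-law tails,
    and it approaches a Gaussian shape as q -> 1."""
    arg = np.maximum(1.0 - (1.0 - q) * beta * x ** 2, 0.0)
    return arg ** (1.0 / (1.0 - q))

# Illustration: q-entropy of a uniform distribution and a q-Gaussian tail value.
p_uniform = np.full(10, 0.1)
print("S_q (q=1.5):", round(tsallis_entropy(p_uniform, 1.5), 3))
print("q-Gaussian at x=3 (q=1.5, beta=1):", q_gaussian(np.array([3.0]), 1.5, 1.0))
```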
Statistical label fusion with hierarchical performance models
Asman, Andrew J.; Dagley, Alexander S.; Landman, Bennett A.
2014-01-01
Label fusion is a critical step in many image segmentation frameworks (e.g., multi-atlas segmentation) as it provides a mechanism for generalizing a collection of labeled examples into a single estimate of the underlying segmentation. In the multi-label case, typical label fusion algorithms treat all labels equally – fully neglecting the known, yet complex, anatomical relationships exhibited in the data. To address this problem, we propose a generalized statistical fusion framework using hierarchical models of rater performance. Building on the seminal work in statistical fusion, we reformulate the traditional rater performance model from a multi-tiered hierarchical perspective. This new approach provides a natural framework for leveraging known anatomical relationships and accurately modeling the types of errors that raters (or atlases) make within a hierarchically consistent formulation. Herein, we describe several contributions. First, we derive a theoretical advancement to the statistical fusion framework that enables the simultaneous estimation of multiple (hierarchical) performance models within the statistical fusion context. Second, we demonstrate that the proposed hierarchical formulation is highly amenable to the state-of-the-art advancements that have been made to the statistical fusion framework. Lastly, in an empirical whole-brain segmentation task we demonstrate substantial qualitative and significant quantitative improvement in overall segmentation accuracy. PMID:24817809
Spectroscopic studies on Solvatochromism of mixed-chelate copper(II) complexes using MLR technique
NASA Astrophysics Data System (ADS)
Golchoubian, Hamid; Moayyedi, Golasa; Fazilati, Hakimeh
2012-01-01
Mixed-chelate copper(II) complexes with the general formula [Cu(acac)(diamine)]X, where acac = acetylacetonate ion, diamine = N,N-dimethyl-N'-benzyl-1,2-diaminoethane and X = BPh4-, PF6-, ClO4- or BF4-, have been prepared. The complexes were characterized on the basis of elemental analysis, molar conductance, and UV-vis and IR spectroscopies. The complexes are solvatochromic and their solvatochromism was investigated by visible spectroscopy. All complexes demonstrated positive solvatochromism, and among them [Cu(acac)(diamine)]BPh4·H2O showed the highest Δνmax value. To explore the mechanism of interaction between solvent molecules and the complexes, different solvent parameters such as DN, AN, α and β were examined using the multiple linear regression (MLR) method. The statistical results suggest that the DN parameter of the solvent makes the dominant contribution to the shift of the d-d absorption band of the complexes.
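The MLR step can be illustrated with a small least-squares fit of the band position against the four solvent parameters. All numbers below are invented for demonstration, loosely patterned on typical donor/acceptor numbers; they are not the measured values from this study.

```python
import numpy as np

# Hypothetical solvent parameters: columns are DN, AN, alpha, beta.
solvents = np.array([
    [14.1, 20.5, 0.00, 0.40],   # acetone-like (illustrative)
    [26.6, 16.0, 0.00, 0.69],   # DMF-like
    [29.8, 19.3, 0.00, 0.76],   # DMSO-like
    [20.0, 37.1, 0.93, 0.66],   # methanol-like
    [33.1, 14.2, 0.00, 0.71],   # pyridine-like
    [14.1, 18.9, 0.19, 0.40],   # acetonitrile-like
])
nu_max = np.array([18.2, 17.5, 17.2, 17.9, 17.0, 18.3])   # 10^3 cm^-1, invented

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(len(solvents)), solvents])
coef, *_ = np.linalg.lstsq(X, nu_max, rcond=None)
for name, c in zip(["intercept", "DN", "AN", "alpha", "beta"], coef):
    print(f"{name:>9}: {c:+.3f}")
```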
On the Way to Appropriate Model Complexity
NASA Astrophysics Data System (ADS)
Höge, M.
2016-12-01
When statistical models are used to represent natural phenomena they are often too simple or too complex - this much is known. But what exactly is model complexity? Among many other definitions, the complexity of a model can be conceptualized as a measure of statistical dependence between observations and parameters (Van der Linde, 2014). However, several issues remain when working with model complexity: A unique definition for model complexity is missing. Assuming a definition is accepted, how can model complexity be quantified? How can we use quantified complexity to improve modeling? Generally defined, "complexity is a measure of the information needed to specify the relationships between the elements of organized systems" (Bawden & Robinson, 2015). The complexity of a system changes as the knowledge about the system changes. For models this means that complexity is not a static concept: With more data or higher spatio-temporal resolution of parameters, the complexity of a model changes. There are essentially three categories into which all commonly used complexity measures can be classified: (1) An explicit representation of model complexity as "degrees of freedom" of a model, e.g. the effective number of parameters. (2) Model complexity as code length, a.k.a. "Kolmogorov complexity": The longer the shortest model code, the higher its complexity (e.g. in bits). (3) Complexity defined via the information entropy of parametric or predictive uncertainty. Preliminary results show that Bayes' theorem allows for incorporating all parts of the non-static concept of model complexity, such as data quality and quantity or parametric uncertainty. Therefore, we test how different approaches for measuring model complexity perform in comparison to a fully Bayesian model selection procedure. Ultimately, we want to find a measure that helps to assess the most appropriate model.
Generalized permutation entropy analysis based on the two-index entropic form Sq,δ
NASA Astrophysics Data System (ADS)
Xu, Mengjia; Shang, Pengjian
2015-05-01
Permutation entropy (PE) is a novel measure to quantify the complexity of nonlinear time series. In this paper, we propose a generalized permutation entropy (PEq,δ) based on the recently postulated entropic form Sq,δ, which was proposed as a unification of the well-known Sq of nonextensive statistical mechanics and Sδ, a possibly appropriate candidate for the black-hole entropy. We find that PEq,δ with appropriate parameters can amplify minor changes and trends of complexity in comparison to PE. Experiments with this generalized permutation entropy method are performed on both synthetic and stock data, showing its power. Results show that PEq,δ is an exponential function of q and the power k(δ) is a constant once δ is determined. Some discussion of k(δ) is provided. We also find some interesting power-law behavior.
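As a point of reference for readers unfamiliar with ordinal methods, the sketch below (Python with numpy assumed) computes the ordinal-pattern distribution, the standard permutation entropy, and a Tsallis-style q-generalization of that distribution. It illustrates the general construction only; it does not reproduce the exact two-index PEq,δ of the paper, and the parameter names and test signals are illustrative.

```python
import numpy as np
from itertools import permutations
from math import factorial, log

def ordinal_distribution(x, d=3, tau=1):
    """Relative frequencies of the d! ordinal patterns of a time series."""
    x = np.asarray(x, dtype=float)
    counts = {p: 0 for p in permutations(range(d))}
    for i in range(len(x) - (d - 1) * tau):
        window = x[i:i + (d - 1) * tau + 1:tau]
        counts[tuple(int(k) for k in np.argsort(window))] += 1
    p = np.array(list(counts.values()), dtype=float)
    return p / p.sum()

def permutation_entropy(x, d=3, tau=1):
    """Standard (Shannon) permutation entropy, normalized to [0, 1]."""
    p = ordinal_distribution(x, d, tau)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)) / log(factorial(d)))

def tsallis_permutation_entropy(x, q=1.5, d=3, tau=1):
    """Illustrative one-parameter (Tsallis q) generalization of the ordinal entropy."""
    p = ordinal_distribution(x, d, tau)
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

rng = np.random.default_rng(0)
print(permutation_entropy(rng.normal(size=5000)))            # close to 1 for white noise
print(permutation_entropy(np.sin(0.05 * np.arange(5000))))   # much lower for a regular signal
```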
ERIC Educational Resources Information Center
Conway, Christopher M.; Karpicke, Jennifer; Pisoni, David B.
2007-01-01
Spoken language consists of a complex, sequentially arrayed signal that contains patterns that can be described in terms of statistical relations among language units. Previous research has suggested that a domain-general ability to learn structured sequential patterns may underlie language acquisition. To test this prediction, we examined the…
Characterizing time series via complexity-entropy curves
NASA Astrophysics Data System (ADS)
Ribeiro, Haroldo V.; Jauregui, Max; Zunino, Luciano; Lenzi, Ervin K.
2017-06-01
The search for patterns in time series is a very common task when dealing with complex systems. This is usually accomplished by employing a complexity measure such as entropies and fractal dimensions. However, such measures usually only capture a single aspect of the system dynamics. Here, we propose a family of complexity measures for time series based on a generalization of the complexity-entropy causality plane. By replacing the Shannon entropy by a monoparametric entropy (Tsallis q entropy) and after considering the proper generalization of the statistical complexity (q complexity), we build up a parametric curve (the q -complexity-entropy curve) that is used for characterizing and classifying time series. Based on simple exact results and numerical simulations of stochastic processes, we show that these curves can distinguish among different long-range, short-range, and oscillating correlated behaviors. Also, we verify that simulated chaotic and stochastic time series can be distinguished based on whether these curves are open or closed. We further test this technique in experimental scenarios related to chaotic laser intensity, stock price, sunspot, and geomagnetic dynamics, confirming its usefulness. Finally, we prove that these curves enhance the automatic classification of time series with long-range correlations and interbeat intervals of healthy subjects and patients with heart disease.
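To make the curve construction concrete, here is a hedged Python/numpy sketch of one common recipe: a normalized Tsallis q-entropy paired with a Jensen-Tsallis disequilibrium normalized by its value for a fully peaked distribution, traced over a range of q. Details such as the exact normalization constants may differ from the authors' q-complexity, so this should be read as an illustration of the construction rather than the paper's definition; the distribution used below is a stand-in for an ordinal distribution estimated from data.

```python
import numpy as np

def q_entropy(p, q):
    """Unnormalized Tsallis q-entropy of a probability vector."""
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def q_complexity_entropy(p, q):
    """One (H_q, C_q) point for distribution p; C_q = H_q times a normalized
    Jensen-Tsallis disequilibrium between p and the uniform distribution."""
    p = np.asarray(p, dtype=float)
    N = p.size
    u = np.full(N, 1.0 / N)
    H = q_entropy(p, q) / q_entropy(u, q)
    js = q_entropy(0.5 * (p + u), q) - 0.5 * q_entropy(p, q) - 0.5 * q_entropy(u, q)
    delta = np.zeros(N); delta[0] = 1.0      # fully peaked reference state (S_q = 0)
    js_max = q_entropy(0.5 * (delta + u), q) - 0.5 * q_entropy(u, q)
    return H, H * js / js_max

qs = np.linspace(0.05, 5.0, 100)
p_peaked = np.array([0.70, 0.10, 0.08, 0.07, 0.05])   # stand-in ordinal distribution
curve = np.array([q_complexity_entropy(p_peaked, q) for q in qs])
print(curve[:3])   # successive (H_q, C_q) pairs trace out the q-complexity-entropy curve
```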
NASA Astrophysics Data System (ADS)
Eduardo Virgilio Silva, Luiz; Otavio Murta, Luiz
2012-12-01
Complexity in time series is an intriguing feature of living dynamical systems, with potential use for identification of system state. Although various methods have been proposed for measuring physiologic complexity, uncorrelated time series are often assigned high values of complexity, erroneously classifying them as complex physiological signals. Here, we propose and discuss a method for complex system analysis based on generalized statistical formalism and surrogate time series. Sample entropy (SampEn) was rewritten, inspired by the Tsallis generalized entropy, as a function of the parameter q (qSampEn). qSDiff curves were calculated, which consist of the differences between the qSampEn of the original and surrogate series. We evaluated qSDiff for 125 real heart rate variability (HRV) dynamics, divided into groups of 70 healthy, 44 congestive heart failure (CHF), and 11 atrial fibrillation (AF) subjects, and for simulated series of stochastic and chaotic processes. The evaluations showed that, for nonperiodic signals, qSDiff curves have a maximum point (qSDiffmax) for q ≠ 1. Values of q at which the maximum point occurs and at which qSDiff is zero were also evaluated. Only qSDiffmax values were capable of distinguishing the HRV groups (p-values 5.10×10^-3, 1.11×10^-7, and 5.50×10^-7 for healthy vs. CHF, healthy vs. AF, and CHF vs. AF, respectively), consistent with the concept of physiologic complexity, which suggests a potential use for chaotic system analysis.
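Since the abstract does not spell out the qSampEn formula, the following Python/numpy sketch shows standard sample entropy together with an illustrative Tsallis-style variant in which the logarithm is replaced by the q-logarithm, plus a crude shuffled-surrogate difference in the spirit of qSDiff. The names, the particular q-generalization, and the test signals are assumptions for illustration, not the authors' exact definitions.

```python
import numpy as np

def _match_counts(x, m, r):
    """Number of template pairs of length m whose Chebyshev distance is <= r."""
    t = np.array([x[i:i + m] for i in range(len(x) - m)])
    c = 0
    for i in range(len(t) - 1):
        c += np.sum(np.max(np.abs(t[i + 1:] - t[i]), axis=1) <= r)
    return c

def samp_en(x, m=2, r=None, q=1.0):
    """SampEn = ln(B/A); for q != 1 the log is replaced by the q-logarithm
    ln_q(y) = (y**(1-q) - 1) / (1 - q)  (an illustrative Tsallis-style variant)."""
    x = np.asarray(x, dtype=float)
    r = 0.2 * x.std() if r is None else r
    B = _match_counts(x, m, r)
    A = _match_counts(x, m + 1, r)
    if A == 0 or B == 0:
        return np.inf
    ratio = B / A
    if np.isclose(q, 1.0):
        return float(np.log(ratio))
    return float((ratio ** (1.0 - q) - 1.0) / (1.0 - q))

rng = np.random.default_rng(2)
white = rng.normal(size=2000)
print(samp_en(white))                                    # high for uncorrelated noise
print(samp_en(np.sin(0.1 * np.arange(2000))))            # low for a periodic signal
surrogate = rng.permutation(white)                       # shuffling destroys temporal structure
print(samp_en(white, q=1.5) - samp_en(surrogate, q=1.5)) # a crude qSDiff-style difference
```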
Inferring general relations between network characteristics from specific network ensembles.
Cardanobile, Stefano; Pernice, Volker; Deger, Moritz; Rotter, Stefan
2012-01-01
Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to which degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their ability to generate networks with large structural variability. In particular, we consider the statistical constraints which the respective construction scheme imposes on the generated networks. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This makes it possible to infer global features from local ones using regression models trained on networks with high generalization power. Our results confirm and extend previous findings regarding the synchronization properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks in good approximation. Finally, we demonstrate on three different data sets (C. elegans neuronal network, R. prowazekii metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models.
Hospital graduate social work field work programs: a study in New York City.
Showers, N
1990-02-01
Twenty-seven hospital field work programs in New York City were studied. Questionnaires were administered to program coordinators and 238 graduate social work students participating in study programs. High degrees of program structural complexity and variation were found, indicating a state of the art well beyond that described in the general field work literature. The high rates of student satisfaction with learning, field instructors, programs, and the overall field work experience that were found suggest that the complexity of study programs may be more effective than traditional field work models. Statistically nonsignificant study findings indicate areas in which hospital social work departments may develop field work programs consistent with shifting organizational needs, without undue risk to educational effectiveness. Statistically significant findings suggest areas in which inflexibility in program design may be more beneficial in the diagnostic related groups era.
Air-flow distortion and turbulence statistics near an animal facility
NASA Astrophysics Data System (ADS)
Prueger, J. H.; Eichinger, W. E.; Hipps, L. E.; Hatfield, J. L.; Cooper, D. I.
The emission and dispersion of particulates and gases from concentrated animal feeding operations (CAFO) at local to regional scales is a current issue in science and society. The transport of particulates, odors and toxic chemical species from the source into the local and eventually regional atmosphere is largely determined by turbulence. Any models that attempt to simulate the dispersion of particles must either specify or assume various statistical properties of the turbulence field. Statistical properties of turbulence are well documented for idealized boundary layers above uniform surfaces. However, an animal production facility is a complex surface with structures that act as bluff bodies that distort the turbulence intensity near the buildings. As a result, the initial release and subsequent dispersion of effluents in the region near a facility will be affected by the complex nature of the surface. Previous Lidar studies of plume dispersion over the facility used in this study indicated that plumes move in complex yet organized patterns that could not be explained by the properties of turbulence generally assumed in models. The objective of this study was to characterize the near-surface turbulence statistics in the flow field around an array of animal confinement buildings. Eddy covariance towers were erected in the upwind, within-array and downwind regions of the flow field. Substantial changes in turbulence intensity statistics and turbulence kinetic energy (TKE) were observed as the mean wind flow encountered the building structures. Spectral analysis demonstrated a unique distribution of the spectral energy in the vertical profile above the buildings.
Building out a Measurement Model to Incorporate Complexities of Testing in the Language Domain
ERIC Educational Resources Information Center
Wilson, Mark; Moore, Stephen
2011-01-01
This paper provides a summary of a novel and integrated way to think about the item response models (most often used in measurement applications in social science areas such as psychology, education, and especially testing of various kinds) from the viewpoint of the statistical theory of generalized linear and nonlinear mixed models. In addition,…
Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, Willie R.; Storaasli, Olaf O.
2004-01-01
Of several iterative and direct equation solvers evaluated previously for computations in aeroacoustics, the most promising was the NASA-developed General-Purpose Solver (winner of NASA's 1999 software of the year award). This paper presents detailed, single-processor statistics of the performance of this solver, which has been tailored and optimized for large-scale aeroacoustic computations. The statistics, compiled using an SGI ORIGIN 2000 computer with 12 Gb available memory (RAM) and eight available processors, are the central processing unit time, RAM requirements, and solution error. The equation solver is capable of solving 10 thousand complex unknowns in as little as 0.01 sec using 0.02 Gb RAM, and 8.4 million complex unknowns in slightly less than 3 hours using all 12 Gb. This latter solution is the largest aeroacoustics problem solved to date with this technique. The study was unable to detect any noticeable error in the solution, since noise levels predicted from these solution vectors are in excellent agreement with the noise levels computed from the exact solution. The equation solver provides a means for obtaining numerical solutions to aeroacoustics problems in three dimensions.
Aoyagi, Miki; Nagata, Kenji
2012-06-01
The term algebraic statistics arises from the study of probabilistic models and techniques for statistical inference using methods from algebra and geometry (Sturmfels, 2009). The purpose of our study is to consider the generalization error and stochastic complexity in learning theory by using the log-canonical threshold in algebraic geometry. Such thresholds correspond to the main term of the generalization error in Bayesian estimation, which is called a learning coefficient (Watanabe, 2001a, 2001b). The learning coefficient serves to measure the learning efficiencies in hierarchical learning models. In this letter, we consider learning coefficients for Vandermonde matrix-type singularities, using a new approach: focusing on the generators of the ideal that defines the singularities. We give tight new bounds on the learning coefficients for the Vandermonde matrix-type singularities, and their explicit values under certain conditions. By applying our results, we can obtain the learning coefficients of three-layered neural networks and normal mixture models.
Transfer Entropy as a Log-Likelihood Ratio
NASA Astrophysics Data System (ADS)
Barnett, Lionel; Bossomaier, Terry
2012-09-01
Transfer entropy, an information-theoretic measure of time-directed information transfer between joint processes, has steadily gained popularity in the analysis of complex stochastic dynamics in diverse fields, including the neurosciences, ecology, climatology, and econometrics. We show that for a broad class of predictive models, the log-likelihood ratio test statistic for the null hypothesis of zero transfer entropy is a consistent estimator for the transfer entropy itself. For finite Markov chains, furthermore, no explicit model is required. In the general case, an asymptotic χ2 distribution is established for the transfer entropy estimator. The result generalizes the equivalence in the Gaussian case of transfer entropy and Granger causality, a statistical notion of causal influence based on prediction via vector autoregression, and establishes a fundamental connection between directed information transfer and causality in the Wiener-Granger sense.
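A minimal numerical illustration of the Gaussian/VAR special case mentioned in the abstract, assuming Python with numpy and scipy: the transfer entropy from Y to X is estimated as half the log-ratio of restricted to full residual variances, and 2n times that estimate is referred to a chi-squared distribution. The model order, variable names and toy data are illustrative, not taken from the paper.

```python
import numpy as np
from scipy import stats

def _lagmat(z, p):
    """Columns z_{t-1}, ..., z_{t-p}, aligned with targets z_t for t = p .. n-1."""
    return np.column_stack([z[p - k: len(z) - k] for k in range(1, p + 1)])

def transfer_entropy_gaussian(x, y, p=1):
    """Transfer entropy Y -> X for (approximately) Gaussian series, estimated as half
    the Granger log-ratio of residual variances; 2*n*TE ~ chi^2(p) under the null."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    target = x[p:]
    Z_full = np.column_stack([_lagmat(x, p), _lagmat(y, p), np.ones(len(target))])
    Z_restr = np.column_stack([_lagmat(x, p), np.ones(len(target))])
    def rss(Z):
        beta, *_ = np.linalg.lstsq(Z, target, rcond=None)
        res = target - Z @ beta
        return np.mean(res ** 2)
    te = 0.5 * np.log(rss(Z_restr) / rss(Z_full))
    lr_stat = 2.0 * len(target) * te
    return te, stats.chi2.sf(lr_stat, df=p)

rng = np.random.default_rng(3)
n = 5000
y = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):                            # x is driven by past y: nonzero TE expected
    x[t] = 0.5 * x[t - 1] + 0.6 * y[t - 1] + rng.normal()
print(transfer_entropy_gaussian(x, y, p=1))      # TE > 0, tiny p-value
print(transfer_entropy_gaussian(y, x, p=1))      # TE ~ 0 in the reverse direction
```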
NASA Technical Reports Server (NTRS)
Djorgovski, George
1993-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resource.
NASA Technical Reports Server (NTRS)
Djorgovski, Stanislav
1992-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
Rescore protein-protein docked ensembles with interface contact statistics.
Mezei, Mihaly
2017-02-01
The recently developed statistical measure for the type of residue-residue contact at protein complex interfaces, based on a parameter-free definition of contact, has been used to define a contact score that is correlated with the likelihood of correctness of a proposed complex structure. Comparing the proposed contact scores on the native structure and on a set of model structures, the proposed measure was shown to generally favor the native structure, but in itself it was not able to reliably score the native structure as the best. Adjusting the scores of redocking experiments with the contact score showed that the adjusted score was able to move up the ranking of the native-like structure among the proposed complexes when the native-like structure was not ranked best by the respective program. Tests on docking of unbound proteins compared the contact scores of the complexes with the contact score of the crystal structure, again showing the tendency of the contact score to favor native-like conformations. The possibility of using the contact score to improve the determination of biological dimers in a crystal structure was also explored. Proteins 2017; 85:235-241. © 2016 Wiley Periodicals, Inc.
Best Statistical Distribution of flood variables for Johor River in Malaysia
NASA Astrophysics Data System (ADS)
Salarpour Goodarzi, M.; Yusop, Z.; Yusof, F.
2012-12-01
A complex flood event is always characterized by a few characteristics such as flood peak, flood volume, and flood duration, which might be mutually correlated. This study explored the statistical distribution of peak flow, flood duration and flood volume at the Rantau Panjang gauging station on the Johor River in Malaysia. Hourly data were recorded for 45 years. The data were analysed based on the water year (July-June). Five distributions, namely Log Normal, Generalized Pareto, Log Pearson, Normal and Generalized Extreme Value (GEV), were used to model the distribution of all three variables. Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests were used to evaluate the best fit. Goodness-of-fit tests at the 5% level of significance indicate that all the models can be used to model the distribution of peak flow, flood duration and flood volume. However, the Generalized Pareto distribution is found to be the most suitable model when tested with the Anderson-Darling test, while the Kolmogorov-Smirnov test suggests that GEV is the best for peak flow. The results of this research can be used to improve flood frequency analysis. A comparison between the Generalized Extreme Value, Generalized Pareto and Log Pearson distributions in the cumulative distribution function of peak flow is also presented.
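For readers who want to reproduce this kind of comparison, the hedged sketch below (Python with scipy assumed) fits several candidate distributions to a synthetic 45-value annual peak-flow sample and ranks them with a Kolmogorov-Smirnov statistic. The data are simulated stand-ins rather than the Johor River record, Log Pearson III is omitted for brevity, and KS p-values computed with estimated parameters are only indicative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# synthetic annual peak flows (m^3/s), a stand-in for a 45-year record
peaks = stats.genextreme.rvs(c=-0.1, loc=300.0, scale=80.0, size=45, random_state=rng)

candidates = {
    "Generalized Extreme Value": stats.genextreme,
    "Generalized Pareto": stats.genpareto,
    "Log-normal": stats.lognorm,
    "Normal": stats.norm,
}

for name, dist in candidates.items():
    params = dist.fit(peaks)                               # maximum-likelihood fit
    ks_stat, p_value = stats.kstest(peaks, dist.name, args=params)
    print(f"{name:26s}  KS = {ks_stat:.3f}  p = {p_value:.3f}")
# the distribution with the smallest KS statistic (largest p-value) fits best
```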
NASA Astrophysics Data System (ADS)
Havens, Timothy C.; Cummings, Ian; Botts, Jonathan; Summers, Jason E.
2017-05-01
The linear ordered statistic (LOS) is a parameterized ordered statistic (OS) that is a weighted average of a rank-ordered sample. LOS operators are useful generalizations of aggregation as they can represent any linear aggregation, from minimum to maximum, including conventional aggregations, such as mean and median. In the fuzzy logic field, these aggregations are called ordered weighted averages (OWAs). Here, we present a method for learning LOS operators from training data, viz., data for which you know the output of the desired LOS. We then extend the learning process with regularization, such that a lower complexity or sparse LOS can be learned. Hence, we discuss what 'lower complexity' means in this context and how to represent that in the optimization procedure. Finally, we apply our learning methods to the well-known constant-false-alarm-rate (CFAR) detection problem, specifically for the case of background levels modeled by long-tailed distributions, such as the K-distribution. These backgrounds arise in several pertinent imaging problems, including the modeling of clutter in synthetic aperture radar and sonar (SAR and SAS) and in wireless communications.
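The following Python sketch illustrates the basic objects involved: an LOS/OWA aggregation as a dot product with the sorted sample, the familiar special cases, and a simple constrained least-squares fit of the weights from input-output pairs. The nonnegative-least-squares-plus-renormalization learner is only a stand-in for the regularized learning procedure of the paper, and all names and data are illustrative.

```python
import numpy as np
from scipy.optimize import nnls

def los(sample, w):
    """Linear ordered statistic / OWA: weighted average of the sorted (descending) sample."""
    return float(np.dot(np.sort(sample)[::-1], w))

n = 5
w_max  = np.eye(n)[0]          # [1,0,0,0,0] -> maximum
w_min  = np.eye(n)[-1]         # [0,...,0,1] -> minimum
w_mean = np.full(n, 1.0 / n)   # arithmetic mean
x = np.array([3.0, 9.0, 1.0, 4.0, 7.0])
print(los(x, w_max), los(x, w_min), los(x, w_mean))   # 9.0 1.0 4.8

# learning weights from (sample, desired output) pairs:
# minimize ||S w - y||^2 subject to w >= 0, sum(w) = 1 (solved here by
# nonnegative least squares followed by renormalization, as a simple stand-in)
rng = np.random.default_rng(5)
true_w = np.array([0.6, 0.2, 0.1, 0.1, 0.0])
S = np.sort(rng.normal(size=(200, n)), axis=1)[:, ::-1]   # rows are sorted samples
y = S @ true_w + 0.01 * rng.normal(size=200)
w_hat, _ = nnls(S, y)
w_hat /= w_hat.sum()
print(np.round(w_hat, 2))      # recovers approximately the generating weights
```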
The effects of an energy efficiency retrofit on indoor air quality.
Frey, S E; Destaillats, H; Cohn, S; Ahrentzen, S; Fraser, M P
2015-04-01
To investigate the impacts of an energy efficiency retrofit, indoor air quality and resident health were evaluated at a low-income senior housing apartment complex in Phoenix, Arizona, before and after a green energy building renovation. Indoor and outdoor air quality sampling was carried out simultaneously with a questionnaire to characterize personal habits and general health of residents. Measured indoor formaldehyde levels before the building retrofit routinely exceeded reference exposure limits, but in the long-term follow-up sampling, indoor formaldehyde decreased for the entire study population by a statistically significant margin. Indoor PM levels were dominated by fine particles and showed a statistically significant decrease in the long-term follow-up sampling within certain resident subpopulations (i.e. residents who report smoking and residents who had lived longer at the apartment complex). © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Efstathiou, Angeliki; Tzanis, Andreas; Vallianatos, Filippos
2017-09-01
We examine the nature of the seismogenetic system in northern California, USA, by searching for evidence of complexity and non-extensivity in the earthquake record. We attempt to determine whether earthquakes are generated by a self-excited Poisson process, in which case they obey Boltzmann-Gibbs thermodynamics, or by a Critical process, in which long-range interactions in non-equilibrium states are expected (correlation) and the thermodynamics deviate from the Boltzmann-Gibbs formalism. Emphasis is given to background seismicity since it is generally agreed that aftershock sequences comprise correlated sets. We use the complete and homogeneous earthquake catalogue published by the Northern California Earthquake Data Center, in which aftershocks are either included or have been removed by a stochastic declustering procedure. We examine multivariate cumulative frequency distributions of earthquake magnitudes, interevent time and interevent distance in the context of Non-Extensive Statistical Physics, which is a generalization of extensive Boltzmann-Gibbs thermodynamics to non-equilibrating (non-extensive) systems. Our results indicate that the seismogenetic systems of northern California are generally sub-extensive, complex and non-Poissonian. The background seismicity exhibits long-range interaction, as evidenced by the overall increase of correlation observed on declustering the earthquake catalogues, as well as by the high correlation observed for earthquakes separated by long interevent distances. It is also important to emphasize that two subsystems with rather different properties appear to exist. The correlation observed along the Sierra Nevada Range - Walker Lane is quasi-stationary and indicates a Self-Organized Critical fault system. Conversely, the north segment of the San Andreas Fault exhibits changes in the level of correlation with reference to the large Loma Prieta event of 1989 and thus has attributes of Critical Point behaviour, albeit without acceleration of seismic release rates. SOC appears to be a likely explanation of the complexity mechanisms, but since there are other ways by which complexity may emerge, additional work is required before assertive conclusions can be drawn.
skelesim: an extensible, general framework for population genetic simulation in R.
Parobek, Christian M; Archer, Frederick I; DePrenger-Levin, Michelle E; Hoban, Sean M; Liggins, Libby; Strand, Allan E
2017-01-01
Simulations are a key tool in molecular ecology for inference and forecasting, as well as for evaluating new methods. Due to growing computational power and a diversity of software with different capabilities, simulations are becoming increasingly powerful and useful. However, the widespread use of simulations by geneticists and ecologists is hindered by difficulties in understanding these softwares' complex capabilities, composing code and input files, a daunting bioinformatics barrier and a steep conceptual learning curve. skelesim (an R package) guides users in choosing appropriate simulations, setting parameters, calculating genetic summary statistics and organizing data output, in a reproducible pipeline within the R environment. skelesim is designed to be an extensible framework that can 'wrap' around any simulation software (inside or outside the R environment) and be extended to calculate and graph any genetic summary statistics. Currently, skelesim implements coalescent and forward-time models available in the fastsimcoal2 and rmetasim simulation engines to produce null distributions for multiple population genetic statistics and marker types, under a variety of demographic conditions. skelesim is intended to make simulations easier while still allowing full model complexity to ensure that simulations play a fundamental role in molecular ecology investigations. skelesim can also serve as a teaching tool: demonstrating the outcomes of stochastic population genetic processes; teaching general concepts of simulations; and providing an introduction to the R environment with a user-friendly graphical user interface (using shiny). © 2016 John Wiley & Sons Ltd.
skeleSim: an extensible, general framework for population genetic simulation in R
Parobek, Christian M.; Archer, Frederick I.; DePrenger-Levin, Michelle E.; Hoban, Sean M.; Liggins, Libby; Strand, Allan E.
2016-01-01
Simulations are a key tool in molecular ecology for inference and forecasting, as well as for evaluating new methods. Due to growing computational power and a diversity of software with different capabilities, simulations are becoming increasingly powerful and useful. However, the widespread use of simulations by geneticists and ecologists is hindered by difficulties in understanding these softwares’ complex capabilities, composing code and input files, a daunting bioinformatics barrier, and a steep conceptual learning curve. skeleSim (an R package) guides users in choosing appropriate simulations, setting parameters, calculating genetic summary statistics, and organizing data output, in a reproducible pipeline within the R environment. skeleSim is designed to be an extensible framework that can ‘wrap’ around any simulation software (inside or outside the R environment) and be extended to calculate and graph any genetic summary statistics. Currently, skeleSim implements coalescent and forward-time models available in the fastsimcoal2 and rmetasim simulation engines to produce null distributions for multiple population genetic statistics and marker types, under a variety of demographic conditions. skeleSim is intended to make simulations easier while still allowing full model complexity to ensure that simulations play a fundamental role in molecular ecology investigations. skeleSim can also serve as a teaching tool: demonstrating the outcomes of stochastic population genetic processes; teaching general concepts of simulations; and providing an introduction to the R environment with a user-friendly graphical user interface (using shiny). PMID:27736016
Low-complexity stochastic modeling of wall-bounded shear flows
NASA Astrophysics Data System (ADS)
Zare, Armin
Turbulent flows are ubiquitous in nature and they appear in many engineering applications. Transition to turbulence, in general, increases skin-friction drag in air/water vehicles compromising their fuel-efficiency and reduces the efficiency and longevity of wind turbines. While traditional flow control techniques combine physical intuition with costly experiments, their effectiveness can be significantly enhanced by control design based on low-complexity models and optimization. In this dissertation, we develop a theoretical and computational framework for the low-complexity stochastic modeling of wall-bounded shear flows. Part I of the dissertation is devoted to the development of a modeling framework which incorporates data-driven techniques to refine physics-based models. We consider the problem of completing partially known sample statistics in a way that is consistent with underlying stochastically driven linear dynamics. Neither the statistics nor the dynamics are precisely known. Thus, our objective is to reconcile the two in a parsimonious manner. To this end, we formulate optimization problems to identify the dynamics and directionality of input excitation in order to explain and complete available covariance data. For problem sizes that general-purpose solvers cannot handle, we develop customized optimization algorithms based on alternating direction methods. The solution to the optimization problem provides information about critical directions that have maximal effect in bringing model and statistics in agreement. In Part II, we employ our modeling framework to account for statistical signatures of turbulent channel flow using low-complexity stochastic dynamical models. We demonstrate that white-in-time stochastic forcing is not sufficient to explain turbulent flow statistics and develop models for colored-in-time forcing of the linearized Navier-Stokes equations. We also examine the efficacy of stochastically forced linearized NS equations and their parabolized equivalents in the receptivity analysis of velocity fluctuations to external sources of excitation as well as capturing the effect of the slowly-varying base flow on streamwise streaks and Tollmien-Schlichting waves. In Part III, we develop a model-based approach to design surface actuation of turbulent channel flow in the form of streamwise traveling waves. This approach is capable of identifying the drag reducing trends of traveling waves in a simulation-free manner. We also use the stochastically forced linearized NS equations to examine the Reynolds number independent effects of spanwise wall oscillations on drag reduction in turbulent channel flows. This allows us to extend the predictive capability of our simulation-free approach to high Reynolds numbers.
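The core consistency condition in this modeling framework can be illustrated with a small linear example. The sketch below (Python with scipy assumed) computes the steady-state covariance of a toy stochastically forced stable linear system from the algebraic Lyapunov equation; the matrices are purely illustrative, and the inverse (covariance-completion) problem actually solved in the dissertation is only described in the comments.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Stable linear dynamics  dx = A x dt + B dW  (toy 3-state system, illustrative only).
A = np.array([[-1.0, 0.5, 0.0],
              [0.0, -2.0, 1.0],
              [0.0, 0.0, -3.0]])
B = np.diag([1.0, 0.5, 0.2])                 # white-in-time forcing intensity

# The steady-state covariance X of the state solves the Lyapunov equation
#   A X + X A^T + B B^T = 0,
# which is the consistency condition between dynamics and statistics.
X = solve_continuous_lyapunov(A, -B @ B.T)
print(np.round(X, 3))

# If only some entries of X were measured (e.g. the diagonal, one-point statistics),
# the covariance-completion problem is the inverse one: choose the forcing (possibly
# colored in time) so that the solution of the Lyapunov-like equation matches those
# partial statistics while completing the unknown entries.
print(np.diag(X))
```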
Medical Image Compression Using a New Subband Coding Method
NASA Technical Reports Server (NTRS)
Kossentini, Faouzi; Smith, Mark J. T.; Scales, Allen; Tucker, Doug
1995-01-01
A recently introduced iterative complexity- and entropy-constrained subband quantization design algorithm is generalized and applied to medical image compression. In particular, the corresponding subband coder is used to encode Computed Tomography (CT) axial slice head images, where statistical dependencies between neighboring image subbands are exploited. Inter-slice conditioning is also employed for further improvements in compression performance. The subband coder features many advantages such as relatively low complexity and operation over a very wide range of bit rates. Experimental results demonstrate that the performance of the new subband coder is relatively good, both objectively and subjectively.
Spatio-temporal analysis of aftershock sequences in terms of Non Extensive Statistical Physics.
NASA Astrophysics Data System (ADS)
Chochlaki, Kalliopi; Vallianatos, Filippos
2017-04-01
Earth's seismicity is considered an extremely complicated process in which long-range interactions and fracturing exist (Vallianatos et al., 2016). For this reason, in order to analyze it, we use an innovative methodological approach introduced by Tsallis (Tsallis, 1988; 2009), named Non-Extensive Statistical Physics. This approach introduces a generalization of Boltzmann-Gibbs statistical mechanics and is based on the definition of the Tsallis entropy Sq, whose maximization leads to the so-called q-exponential function, the probability distribution function that maximizes Sq. In the present work, we utilize the concept of Non-Extensive Statistical Physics in order to analyze the spatiotemporal properties of several aftershock series. Marekova (2014) suggested that the probability densities of the inter-event distances between successive aftershocks follow a beta distribution. Using the same data set, we analyze the inter-event distance distribution of several aftershock sequences in different geographic regions by calculating the non-extensive parameters that determine the behavior of the system and by fitting the q-exponential function, which expresses the degree of non-extensivity of the investigated system. Furthermore, the inter-event time distribution of the aftershocks as well as the frequency-magnitude distribution has been analyzed. The results support the applicability of Non-Extensive Statistical Physics ideas to aftershock sequences, where strong correlations exist along with memory effects. References: C. Tsallis, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys. 52 (1988) 479-487, doi:10.1007/BF01016429. C. Tsallis, Introduction to Nonextensive Statistical Mechanics: Approaching a Complex World, 2009, doi:10.1007/978-0-387-85359-8. E. Marekova, Analysis of the spatial distribution between successive earthquakes in aftershock series, Annals of Geophysics, 57, 5, doi:10.4401/ag-6556, 2014. F. Vallianatos, G. Papadakis, G. Michas, Generalized statistical mechanics approaches to earthquakes and tectonics, Proc. R. Soc. A, 472, 20160497, 2016.
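To make the fitting step concrete, here is a hedged Python/scipy sketch that defines the Tsallis q-exponential and fits it to synthetic decay data with curve_fit. The functional form follows the standard definition, but the data, parameter names and starting values are illustrative and are not taken from the aftershock catalogues analyzed in the work.

```python
import numpy as np
from scipy.optimize import curve_fit

def q_exponential(x, a, q, lam):
    """Tsallis q-exponential: a * [1 - (1-q)*lam*x]_+^(1/(1-q)); -> a*exp(-lam*x) as q -> 1."""
    base = np.clip(1.0 - (1.0 - q) * lam * x, 1e-12, None)
    return a * np.power(base, 1.0 / (1.0 - q))

# synthetic decay data (e.g. a normalized inter-event distance distribution)
rng = np.random.default_rng(6)
x = np.linspace(0.0, 50.0, 200)
true = q_exponential(x, 1.0, 1.3, 0.15)
noisy = np.clip(true + 0.01 * rng.normal(size=x.size), 1e-6, None)

popt, pcov = curve_fit(q_exponential, x, noisy, p0=(1.0, 1.2, 0.1))
a_hat, q_hat, lam_hat = popt
print(f"fitted q = {q_hat:.3f}  (q > 1 signals sub-extensive, long-range-correlated behavior)")
```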
A multi-factor Rasch scale for artistic judgment.
Bezruczko, Nikolaus
2002-01-01
Measurement properties are reported for a combined scale of abstract and figurative artistic judgment aptitude items. Abstract items are synthetic, rule-based images from Visual Designs Test which implements a statistical algorithm to control design complexity and redundancy, and figurative items are canvas paintings in five styles, Fauvism, Post-Impressionism, Surrealism, Renaissance, and Baroque especially created for this research. The paintings integrate syntactic structure from VDT Abstract designs with thematic content for each style at four levels of complexity while controlling redundancy. Trained test administrators collected preference for synthetic abstract designs and authentic figurative art from 462 examinees in Johnson O'Connor Research Foundation testing offices in Boston, New York, Chicago, and Dallas. The Rasch model replicated measurement properties for VDT Abstract items and identified an item hierarchy that was statistically invariant between genders and generally stable across age for new, authentic figurative items. Further examination of the figurative item hierarchy revealed that complexity interacts with style and meaning. Sound measurement properties for a combined VDT Abstract and Figurative scale shows promise for a comprehensive artistic judgment construct.
Path statistics, memory, and coarse-graining of continuous-time random walks on networks
Kion-Crosby, Willow; Morozov, Alexandre V.
2015-01-01
Continuous-time random walks (CTRWs) on discrete state spaces, ranging from regular lattices to complex networks, are ubiquitous across physics, chemistry, and biology. Models with coarse-grained states (for example, those employed in studies of molecular kinetics) or spatial disorder can give rise to memory and non-exponential distributions of waiting times and first-passage statistics. However, existing methods for analyzing CTRWs on complex energy landscapes do not address these effects. Here we use statistical mechanics of the nonequilibrium path ensemble to characterize first-passage CTRWs on networks with arbitrary connectivity, energy landscape, and waiting time distributions. Our approach can be applied to calculating higher moments (beyond the mean) of path length, time, and action, as well as statistics of any conservative or non-conservative force along a path. For homogeneous networks, we derive exact relations between length and time moments, quantifying the validity of approximating a continuous-time process with its discrete-time projection. For more general models, we obtain recursion relations, reminiscent of transfer matrix and exact enumeration techniques, to efficiently calculate path statistics numerically. We have implemented our algorithm in PathMAN (Path Matrix Algorithm for Networks), a Python script that users can apply to their model of choice. We demonstrate the algorithm on a few representative examples which underscore the importance of non-exponential distributions, memory, and coarse-graining in CTRWs. PMID:26646868
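As a small illustration of the kind of first-passage statistics discussed here, the sketch below (plain Python/numpy, not the PathMAN script itself) simulates a continuous-time random walk with heavy-tailed waiting times on a four-node network and accumulates path-length and first-passage-time statistics. The network, waiting-time law and sample sizes are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(7)

# toy network as an adjacency list; heavy-tailed (Pareto) waits supply the
# non-exponential behavior highlighted in the abstract
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
source, target = 0, 3

def first_passage(alpha=1.5):
    """One CTRW realization: returns (number of jumps, first-passage time)."""
    node, t, steps = source, 0.0, 0
    while node != target:
        t += rng.pareto(alpha) + 1.0             # waiting time before the next jump
        node = int(rng.choice(neighbors[node]))  # unbiased jump to a random neighbor
        steps += 1
    return steps, t

samples = np.array([first_passage() for _ in range(20000)])
lengths, times = samples[:, 0], samples[:, 1]
print("mean path length:", lengths.mean())
print("mean / 95th-percentile first-passage time:", times.mean(), np.percentile(times, 95))
# comparing length and time statistics quantifies how far the CTRW departs from its
# discrete-time projection, in the spirit of the relations derived in the paper
```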
ERIC Educational Resources Information Center
Klein, Ariel; Badia, Toni
2015-01-01
In this study we show how complex creative relations can arise from fairly frequent semantic relations observed in everyday language. By doing this, we reflect on some key cognitive aspects of linguistic and general creativity. In our experimentation, we automated the process of solving a battery of Remote Associates Test tasks. By applying…
UAV Swarm Mission Planning Development Using Evolutionary Algorithms - Part I
2008-05-01
desired behaviors in autonomous vehicles is a difficult problem at best and in general probably impossible to completely resolve in complex dynamic...associated behaviors. Various techniques inspired by biological self-organized systems, as found in foraging insects and flocking birds, revolve around...swarms of heterogeneous vehicles in a distributed simulation system with animated graphics. Statistical measurements and observations indicate that bio
Yur'yev, Andriy; Yur'yeva, Lyudmyla; Värnik, Peeter; Lumiste, Kaur; Värnik, Airi
2015-01-01
This study assesses the complex impact of risk and protective factors on suicide mortality in the Ukrainian general population. Data on suicide rates and socioeconomic and medical factors were obtained from the Ukrainian State Statistical Office, WHO, and the European Social Survey. Structural equation modeling was used for data analysis. Religion and education were negatively associated with suicide. The relationship between drug addiction/alcoholism and suicide was positive. The association between urbanization and suicide mortality was negative. The relationship between gross regional product (GRP) and female suicide was slightly negative. Religiosity was the protective factor most strongly linked with suicide mortality followed by urbanization. The harmful role of drug addiction and alcoholism was confirmed. The role of education and GRP is controversial. No striking gender differences were found.
NASA Astrophysics Data System (ADS)
Bae, Minja; Park, Jihyun; Kim, Jongju; Xue, Dandan; Park, Kyu-Chil; Yoon, Jong Rak
2016-07-01
The bit error rate of an underwater acoustic communication system is related to multipath fading statistics, which determine the signal-to-noise ratio. The amplitude and delay of each path depend on sea surface roughness, propagation medium properties, and source-to-receiver range as a function of frequency. Therefore, received signals will show frequency-dependent fading. A shallow-water acoustic communication channel generally shows a few strong multipaths that interfere with each other and the resulting interference affects the fading statistics model. In this study, frequency-selective fading statistics are modeled on the basis of the phasor representation of the complex path amplitude. The fading statistics distribution is parameterized by the frequency-dependent constructive or destructive interference of multipaths. At a 16 m depth with a muddy bottom, a wave height of 0.2 m, and source-to-receiver ranges of 100 and 400 m, fading statistics tend to show a Rayleigh distribution at a destructive interference frequency, but a Rice distribution at a constructive interference frequency. The theoretical fading statistics well matched the experimental ones.
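A hedged numerical illustration of the phasor picture (Python with numpy and scipy assumed): the received envelope is modeled as a fixed coherent component plus complex Gaussian scatter, and a Rice distribution is fitted to the simulated envelopes; a vanishing coherent component recovers the Rayleigh case. The amplitudes and spread are arbitrary and do not correspond to the sea-trial geometry of the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n = 20000

def envelope(direct_amp, scatter_sigma):
    """Envelope of a fixed (coherent) path plus a complex Gaussian scattered component."""
    scatter = scatter_sigma * (rng.normal(size=n) + 1j * rng.normal(size=n))
    return np.abs(direct_amp + scatter)

# constructive-interference frequency: a strong coherent component survives -> Rice-like
rice_like = envelope(direct_amp=2.0, scatter_sigma=0.5)
# destructive-interference frequency: coherent paths cancel -> Rayleigh-like
rayleigh_like = envelope(direct_amp=0.0, scatter_sigma=0.5)

for name, data in [("constructive", rice_like), ("destructive", rayleigh_like)]:
    b, loc, scale = stats.rice.fit(data, floc=0)
    k_factor = 0.5 * b ** 2          # Rice K-factor: coherent-to-diffuse power ratio
    print(f"{name:12s}  fitted Rice shape b = {b:.2f}  (K ~ {k_factor:.2f}; b -> 0 is Rayleigh)")
```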
Different Manhattan project: automatic statistical model generation
NASA Astrophysics Data System (ADS)
Yap, Chee Keng; Biermann, Henning; Hertzmann, Aaron; Li, Chen; Meyer, Jon; Pao, Hsing-Kuo; Paxia, Salvatore
2002-03-01
We address the automatic generation of large geometric models. This is important in visualization for several reasons. First, many applications need access to large but interesting data models. Second, we often need such data sets with particular characteristics (e.g., urban models, park and recreation landscape). Thus we need the ability to generate models with different parameters. We propose a new approach for generating such models. It is based on a top-down propagation of statistical parameters. We illustrate the method in the generation of a statistical model of Manhattan. But the method is generally applicable in the generation of models of large geographical regions. Our work is related to the literature on generating complex natural scenes (smoke, forests, etc) based on procedural descriptions. The difference in our approach stems from three characteristics: modeling with statistical parameters, integration of ground truth (actual map data), and a library-based approach for texture mapping.
Bayesian demography 250 years after Bayes
Bijak, Jakub; Bryant, John
2016-01-01
Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889
Quantifying spatial and temporal trends in beach-dune volumetric changes using spatial statistics
NASA Astrophysics Data System (ADS)
Eamer, Jordan B. R.; Walker, Ian J.
2013-06-01
Spatial statistics are generally underutilized in coastal geomorphology, despite offering great potential for identifying and quantifying spatial-temporal trends in landscape morphodynamics. In particular, local Moran's Ii provides a statistical framework for detecting clusters of significant change in an attribute (e.g., surface erosion or deposition) and quantifying how this changes over space and time. This study analyzes and interprets spatial-temporal patterns in sediment volume changes in a beach-foredune-transgressive dune complex following removal of invasive marram grass (Ammophila spp.). Results are derived by detecting significant changes in post-removal repeat DEMs derived from topographic surveys and airborne LiDAR. The study site was separated into discrete, linked geomorphic units (beach, foredune, transgressive dune complex) to facilitate sub-landscape scale analysis of volumetric change and sediment budget responses. Difference surfaces derived from a pixel-subtraction algorithm between interval DEMs and the LiDAR baseline DEM were filtered using the local Moran's Ii method and two different spatial weights (1.5 and 5 m) to detect statistically significant change. Moran's Ii results were compared with those derived from a more spatially uniform statistical method that uses a simpler student's t distribution threshold for change detection. Morphodynamic patterns and volumetric estimates were similar between the uniform geostatistical method and Moran's Ii at a spatial weight of 5 m while the smaller spatial weight (1.5 m) consistently indicated volumetric changes of less magnitude. The larger 5 m spatial weight was most representative of broader site morphodynamics and spatial patterns while the smaller spatial weight provided volumetric changes consistent with field observations. All methods showed foredune deflation immediately following removal with increased sediment volumes into the spring via deposition at the crest and on lobes in the lee, despite erosion on the stoss slope and dune toe. Generally, the foredune became wider by landward extension and the seaward slope recovered from erosion to a similar height and form to that of pre-restoration despite remaining essentially free of vegetation.
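For readers unfamiliar with the statistic, the sketch below gives a bare-bones Python/numpy implementation of local Moran's Ii on a one-dimensional transect of elevation changes with a row-standardized contiguity weights matrix. Real applications, including this one, would use gridded DEM differences, carefully chosen spatial weights (here 1.5 m and 5 m), and permutation-based significance testing rather than the crude cutoff shown; all values below are illustrative.

```python
import numpy as np

def local_morans_i(z, w):
    """Local Moran's I_i for attribute z (e.g. elevation change) and spatial weights w.
    z: 1-D array of n observations; w: n x n row-standardized weights, zero diagonal."""
    z = np.asarray(z, dtype=float)
    zc = z - z.mean()
    m2 = np.mean(zc ** 2)
    return (zc / m2) * (w @ zc)       # I_i > 0: value sits in a cluster of like values

# toy 1-D transect of elevation changes with a contiguous erosion patch in the middle
rng = np.random.default_rng(9)
dz = rng.normal(0.0, 0.02, size=50)
dz[20:26] -= 0.5

# binary contiguity within a 2-cell band (a stand-in spatial weight), row-standardized
n = dz.size
idx = np.arange(n)
w = (np.abs(idx[:, None] - idx[None, :]) <= 2).astype(float)
np.fill_diagonal(w, 0.0)
w /= w.sum(axis=1, keepdims=True)

I = local_morans_i(dz, w)
print(np.where(I > 2.0)[0])   # cells flagged as part of a change cluster (illustrative cutoff)
```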
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
2017-01-01
Recent advances in understanding protein folding have benefitted from coarse-grained representations of protein structures. Empirical energy functions derived from these techniques occasionally succeed in distinguishing native structures from their corresponding ensembles of nonnative folds or decoys which display varying degrees of structural dissimilarity to the native proteins. Here we utilized atomic coordinates of single protein chains, comprising a large diverse training set, to develop and evaluate twelve all-atom four-body statistical potentials obtained by exploring alternative values for a pair of inherent parameters. Delaunay tessellation was performed on the atomic coordinates of each protein to objectively identify all quadruplets of interacting atoms, and atomic potentials were generated via statistical analysis of the data and implementation of the inverted Boltzmann principle. Our potentials were evaluated using benchmarking datasets from Decoys-‘R'-Us, and comparisons were made with twelve other physics- and knowledge-based potentials. Ranking 3rd, our best potential tied CHARMM19 and surpassed AMBER force field potentials. We illustrate how a generalized version of our potential can be used to empirically calculate binding energies for target-ligand complexes, using HIV-1 protease-inhibitor complexes for a practical application. The combined results suggest an accurate and efficient atomic four-body statistical potential for protein structure prediction and assessment. PMID:29119109
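The pipeline described here (tessellate, count quadruplet compositions, invert the Boltzmann relation) can be sketched in a few lines of Python with scipy. The "protein" below is just a random point cloud with four atom classes, and the random-mixing reference used for the expected frequencies is one simple choice among several, so the resulting numbers are purely illustrative of the procedure rather than a reproduction of the authors' potentials.

```python
import numpy as np
from math import factorial
from scipy.spatial import Delaunay

rng = np.random.default_rng(10)

# toy "protein": 300 atoms of 4 types scattered in a box (coordinates are illustrative)
coords = rng.uniform(0.0, 30.0, size=(300, 3))
types = rng.integers(0, 4, size=300)            # e.g. C, N, O, S classes

# Delaunay tessellation objectively defines quadruplets of interacting atoms:
# every 3-D simplex (tetrahedron) is one quadruplet.
tess = Delaunay(coords)
quadruplet_types = np.sort(types[tess.simplices], axis=1)   # order-independent labels

# observed frequency of each type composition
keys, counts = np.unique(quadruplet_types, axis=0, return_counts=True)
f_obs = counts / counts.sum()

# expected frequency under random mixing (multinomial on the overall type fractions)
p_type = np.bincount(types, minlength=4) / types.size
def multinomial_prob(combo):
    reps = np.bincount(combo, minlength=4)
    coef = factorial(4) / np.prod([factorial(int(r)) for r in reps])
    return coef * np.prod(p_type ** reps)
f_exp = np.array([multinomial_prob(k) for k in keys])

# inverted Boltzmann principle: statistical potential per quadruplet composition
potential = -np.log(f_obs / f_exp)
for k, u in list(zip(keys, potential))[:5]:
    print(k, round(float(u), 3))
```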
Statistical assessment on a combined analysis of GRYN-ROMN-UCBN upland vegetation vital signs
Irvine, Kathryn M.; Rodhouse, Thomas J.
2014-01-01
As of 2013, Rocky Mountain and Upper Columbia Basin Inventory and Monitoring Networks have multiple years of vegetation data, Greater Yellowstone Network has three years of vegetation data, and monitoring is ongoing in all three networks. Our primary objective is to assess whether a combined analysis of these data aimed at exploring correlations with climate and weather data is feasible. We summarize the core survey design elements across protocols and point out the major statistical challenges for a combined analysis at present. The dissimilarity in response designs between ROMN and UCBN-GRYN network protocols presents a statistical challenge that has not been resolved yet. However, the UCBN and GRYN data are compatible as they implement a similar response design; therefore, a combined analysis is feasible and will be pursued in the future. When data collected by different networks are combined, the survey design describing the merged dataset is (likely) a complex survey design. A complex survey design is the result of combining datasets from different sampling designs. A complex survey design is characterized by unequal probability sampling, varying stratification, and clustering (see Lohr 2010, Chapter 7, for a general overview). Statistical analysis of complex survey data requires modifications to standard methods, one of which is to include survey design weights within a statistical model. We focus on this issue for a combined analysis of upland vegetation from these networks, leaving other topics for future research. We conduct a simulation study on the possible effects of equal versus unequal probability selection of points on parameter estimates of temporal trend using available packages within the R statistical computing environment. We find that, as written, using lmer or lm for trend detection in a continuous response and clm and clmm for visually estimated cover classes with “raw” GRTS design weights specified for the weight argument leads to substantially different results and/or computational instability. However, when only fixed effects are of interest, the survey package (svyglm and svyolr) may be suitable for a model-assisted analysis for trend. We provide possible directions for future research into combined analysis for ordinal and continuous vital sign indicators.
An operational GLS model for hydrologic regression
Tasker, Gary D.; Stedinger, J.R.
1989-01-01
Recent Monte Carlo studies have documented the value of generalized least squares (GLS) procedures to estimate empirical relationships between streamflow statistics and physiographic basin characteristics. This paper presents a number of extensions of the GLS method that deal with realities and complexities of regional hydrologic data sets that were not addressed in the simulation studies. These extensions include: (1) a more realistic model of the underlying model errors; (2) smoothed estimates of cross correlation of flows; (3) procedures for including historical flow data; (4) diagnostic statistics describing leverage and influence for GLS regression; and (5) the formulation of a mathematical program for evaluating future gaging activities. © 1989.
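A compact illustration of why GLS matters in this setting, assuming Python with numpy and statsmodels: streamflow statistics at nearby gauges have cross-correlated sampling errors, so an error covariance matrix is passed to a GLS fit and the resulting coefficients and standard errors are compared with OLS. The covariance structure, basin characteristic and coefficients are invented for the example and do not reflect the paper's operational model.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(13)
n_sites = 30
basin_area = rng.uniform(10, 500, size=n_sites)            # physiographic characteristic
X = sm.add_constant(np.log(basin_area))
true_beta = np.array([1.0, 0.8])

# correlated errors: nearby gauges with overlapping records share sampling error
Sigma = 0.04 * (0.5 ** np.abs(np.subtract.outer(np.arange(n_sites), np.arange(n_sites))))
L = np.linalg.cholesky(Sigma)
log_q = X @ true_beta + L @ rng.normal(size=n_sites)        # log of a flow statistic

gls = sm.GLS(log_q, X, sigma=Sigma).fit()
ols = sm.OLS(log_q, X).fit()
print("GLS coefficients:", np.round(gls.params, 3), " OLS:", np.round(ols.params, 3))
print("GLS standard errors:", np.round(gls.bse, 3), " OLS:", np.round(ols.bse, 3))
```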
A simple and fast representation space for classifying complex time series
NASA Astrophysics Data System (ADS)
Zunino, Luciano; Olivares, Felipe; Bariviera, Aurelio F.; Rosso, Osvaldo A.
2017-03-01
In the context of time series analysis considerable effort has been directed towards the implementation of efficient discriminating statistical quantifiers. Very recently, a simple and fast representation space has been introduced, namely the number of turning points versus the Abbe value. It is able to separate time series from stationary and non-stationary processes with long-range dependences. In this work we show that this bidimensional approach is useful for distinguishing complex time series: different sets of financial and physiological data are efficiently discriminated. Additionally, a multiscale generalization that takes into account the multiple time scales often involved in complex systems has been also proposed. This multiscale analysis is essential to reach a higher discriminative power between physiological time series in health and disease.
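The representation space itself takes only a few lines to compute; below is a Python/numpy sketch of the two quantifiers (fraction of turning points and the Abbe value) evaluated on three synthetic series. Classification thresholds for real data, and the multiscale extension mentioned at the end of the abstract, are not included, and the test signals are arbitrary.

```python
import numpy as np

def turning_points_fraction(x):
    """Fraction of interior points that are local maxima or minima (~2/3 for white noise)."""
    d = np.diff(np.asarray(x, dtype=float))
    return float(np.mean(d[:-1] * d[1:] < 0))

def abbe_value(x):
    """Abbe value: half the mean squared successive difference over the variance (~1 for white noise)."""
    x = np.asarray(x, dtype=float)
    return float(0.5 * np.mean(np.diff(x) ** 2) / np.var(x))

rng = np.random.default_rng(11)
for name, series in [("white noise", rng.normal(size=4000)),
                     ("random walk", np.cumsum(rng.normal(size=4000))),
                     ("sine", np.sin(0.05 * np.arange(4000)))]:
    print(f"{name:12s}  turning points = {turning_points_fraction(series):.3f}"
          f"  Abbe = {abbe_value(series):.3f}")
```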
Mathematical Methods for Physics and Engineering Third Edition Paperback Set
NASA Astrophysics Data System (ADS)
Riley, Ken F.; Hobson, Mike P.; Bence, Stephen J.
2006-06-01
Prefaces; 1. Preliminary algebra; 2. Preliminary calculus; 3. Complex numbers and hyperbolic functions; 4. Series and limits; 5. Partial differentiation; 6. Multiple integrals; 7. Vector algebra; 8. Matrices and vector spaces; 9. Normal modes; 10. Vector calculus; 11. Line, surface and volume integrals; 12. Fourier series; 13. Integral transforms; 14. First-order ordinary differential equations; 15. Higher-order ordinary differential equations; 16. Series solutions of ordinary differential equations; 17. Eigenfunction methods for differential equations; 18. Special functions; 19. Quantum operators; 20. Partial differential equations: general and particular; 21. Partial differential equations: separation of variables; 22. Calculus of variations; 23. Integral equations; 24. Complex variables; 25. Application of complex variables; 26. Tensors; 27. Numerical methods; 28. Group theory; 29. Representation theory; 30. Probability; 31. Statistics; Index.
Phenomenological model to fit complex permittivity data of water from radio to optical frequencies.
Shubitidze, Fridon; Osterberg, Ulf
2007-04-01
A general factorized form of the dielectric function together with a fractional model-based parameter estimation method is used to provide an accurate analytical formula for the complex refractive index in water for the frequency range 10^8-10^16 Hz. The analytical formula is derived using a combination of a microscopic frequency-dependent rational function for adjusting zeros and poles of the dielectric dispersion together with the macroscopic statistical Fermi-Dirac distribution to provide a description of both the real and imaginary parts of the complex permittivity for water.
NASA Astrophysics Data System (ADS)
Niemann, Brand Lee
A major field program to study beta-mesoscale transport and dispersion over complex mountainous terrain was conducted during 1969 with the cooperation of three government agencies at the White Sands Missile Range in central Utah. The purpose of the program was to measure simultaneously on a large number of days the synoptic and mesoscale wind fields, the relative dispersion between pairs of particle trajectories and the rate of small scale turbulence dissipation. The field program included measurements during more than 60 days in the months of March, June, and November. The large quantity of data generated from this program has been processed and analyzed to provide case studies and statistics to evaluate and refine Lagrangian variable trajectory models. The case studies selected to illustrate the complexities of mesoscale transport and dispersion over complex terrain include those with terrain blocking, lee waves, and stagnation, as well as those with large vertical wind shears and horizontal wind field deformation. The statistics of relative particle dispersion were computed and compared to the classical theories of Richardson and Batchelor and the more recent theories of Lin and Kao among others. The relative particle dispersion was generally found to increase with travel time in the alongwind and crosswind directions, but in a more oscillatory than sustained or even accelerated manner as predicted by most theories, unless substantial wind shears or finite vertical separations between particles were present. The relative particle dispersion in the vertical was generally found to be small and bounded even when substantial vertical motions due to lee waves were present because of the limiting effect of stable temperature stratification. The data show that velocity shears have a more significant effect than turbulence on relative particle dispersion and that sufficient turbulence may not always be present above the planetary boundary layer for "wind direction shear induced dispersion" to become effective horizontal dispersion by vertical mixing over the shear layer. The statistics of relative particle dispersion in the three component directions have been summarized and stratified by flow parameters for use in practical prediction problems.
Multi-level emulation of complex climate model responses to boundary forcing data
NASA Astrophysics Data System (ADS)
Tran, Giang T.; Oliver, Kevin I. C.; Holden, Philip B.; Edwards, Neil R.; Sóbester, András; Challenor, Peter
2018-04-01
Climate model components involve both high-dimensional input and output fields. It is desirable to efficiently generate spatio-temporal outputs of these models for applications in integrated assessment modelling or to assess the statistical relationship between such sets of inputs and outputs, for example, uncertainty analysis. However, the need for efficiency often compromises the fidelity of output through the use of low complexity models. Here, we develop a technique which combines statistical emulation with a dimensionality reduction technique to emulate a wide range of outputs from an atmospheric general circulation model, PLASIM, as functions of the boundary forcing prescribed by the ocean component of a lower complexity climate model, GENIE-1. Although accurate and detailed spatial information on atmospheric variables such as precipitation and wind speed is well beyond the capability of GENIE-1's energy-moisture balance model of the atmosphere, this study demonstrates that the output of this model is useful in predicting PLASIM's spatio-temporal fields through multi-level emulation. Meaningful information from the fast model, GENIE-1 was extracted by utilising the correlation between variables of the same type in the two models and between variables of different types in PLASIM. We present here the construction and validation of several PLASIM variable emulators and discuss their potential use in developing a hybrid model with statistical components.
NASA Astrophysics Data System (ADS)
Skitka, J.; Marston, B.; Fox-Kemper, B.
2016-02-01
Sub-grid turbulence models for planetary boundary layers are typically constructed additively, starting with local flow properties and including non-local (KPP) or higher order (Mellor-Yamada) parameters until a desired level of predictive capacity is achieved or a manageable threshold of complexity is surpassed. Such approaches are necessarily limited in general circumstances, like global circulation models, by their being optimized for particular flow phenomena. By building a model reductively, starting with the infinite hierarchy of turbulence statistics, truncating at a given order, and stripping degrees of freedom from the flow, we offer the prospect of a turbulence model and investigative tool that is equally applicable to all flow types and able to take full advantage of the wealth of nonlocal information in any flow. Direct statistical simulation (DSS) that is based upon expansion in equal-time cumulants can be used to compute flow statistics of arbitrary order. We investigate the feasibility of a second-order closure (CE2) by performing simulations of the ocean boundary layer in a quasi-linear approximation for which CE2 is exact. As oceanographic examples, wind-driven Langmuir turbulence and thermal convection are studied by comparison of the quasi-linear and fully nonlinear statistics. We also characterize the computational advantages and physical uncertainties of CE2 defined on a reduced basis determined via proper orthogonal decomposition (POD) of the flow fields.
NASA Astrophysics Data System (ADS)
Xu, Kaixuan; Wang, Jun
2017-02-01
In this paper, the recently introduced permutation entropy and sample entropy are extended to fractional cases: weighted fractional permutation entropy (WFPE) and fractional sample entropy (FSE). The fractional-order generalization of information entropy is utilized in these two complexity approaches to detect the statistical characteristics of fractional-order information in complex systems. The effectiveness analysis of the proposed methods on synthetic and real-world data reveals that tuning the fractional order allows a high sensitivity and a more accurate characterization of the signal evolution, which is useful in describing the dynamics of complex systems. Moreover, the nonlinear complexity behaviors of the return series of the Potts financial model are compared with those of actual stock markets, and the empirical results confirm the feasibility of the proposed model.
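The weighted and fractional variants above build on the ordinary Bandt-Pompe permutation entropy. The sketch below is a baseline assuming nothing beyond that standard definition; the weighting and fractional-order steps of the proposed WFPE are not reproduced here.

```python
import numpy as np
from math import factorial
from itertools import permutations

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Ordinary (Bandt-Pompe) permutation entropy of a 1-D series."""
    x = np.asarray(x, dtype=float)
    counts = {p: 0 for p in permutations(range(order))}
    for i in range(len(x) - (order - 1) * delay):
        window = x[i:i + order * delay:delay]
        counts[tuple(int(k) for k in np.argsort(window))] += 1
    p = np.array([c for c in counts.values() if c > 0], dtype=float)
    p /= p.sum()
    h = -np.sum(p * np.log(p))
    return h / np.log(factorial(order)) if normalize else h

rng = np.random.default_rng(0)
print(permutation_entropy(rng.standard_normal(5000)))      # close to 1 for white noise
print(permutation_entropy(np.sin(0.1 * np.arange(5000))))  # much lower for a regular signal
```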
NASA Astrophysics Data System (ADS)
Kumar, Jagadish; Ananthakrishna, G.
2018-01-01
Scale-invariant power-law distributions for acoustic emission signals are ubiquitous in several plastically deforming materials. However, power-law distributions for acoustic emission energies are reported in distinctly different plastically deforming situations such as hcp and fcc single and polycrystalline samples exhibiting smooth stress-strain curves and in dilute metallic alloys exhibiting discontinuous flow. This is surprising since the underlying dislocation mechanisms in these two types of deformations are very different. So far, there have been no models that predict the power-law statistics for discontinuous flow. Furthermore, the statistics of the acoustic emission signals in jerky flow is even more complex, requiring multifractal measures for a proper characterization. There has been no model that explains the complex statistics either. Here we address the problem of statistical characterization of the acoustic emission signals associated with the three types of the Portevin-Le Chatelier bands. Following our recently proposed general framework for calculating acoustic emission, we set up a wave equation for the elastic degrees of freedom with a plastic strain rate as a source term. The energy dissipated during acoustic emission is represented by the Rayleigh-dissipation function. Using the plastic strain rate obtained from the Ananthakrishna model for the Portevin-Le Chatelier effect, we compute the acoustic emission signals associated with the three Portevin-Le Chatelier bands and the Lüders-like band. The so-calculated acoustic emission signals are used for further statistical characterization. Our results show that the model predicts power-law statistics for all the acoustic emission signals associated with the three types of Portevin-Le Chatelier bands with the exponent values increasing with increasing strain rate. The calculated multifractal spectra corresponding to the acoustic emission signals associated with the three band types have a maximum spread for the type C bands and decreasing with types B and A. We further show that the acoustic emission signals associated with Lüders-like band also exhibit a power-law distribution and multifractality.
General Blending Models for Data From Mixture Experiments
Brown, L.; Donev, A. N.; Bissett, A. C.
2015-01-01
We propose a new class of models providing a powerful unification and extension of existing statistical methodology for analysis of data obtained in mixture experiments. These models, which integrate models proposed by Scheffé and Becker, extend considerably the range of mixture component effects that may be described. They become complex when the studied phenomenon requires it, but remain simple whenever possible. This article has supplementary material online. PMID:26681812
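For concreteness, the sketch below fits the classical Scheffé quadratic mixture model, the kind of model the proposed class unifies and extends, by ordinary least squares on a synthetic three-component experiment; the component responses and coefficients are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic 3-component mixture experiment: proportions sum to 1.
X = rng.dirichlet(alpha=[1, 1, 1], size=60)
x1, x2, x3 = X.T
# "True" blending behaviour used only to generate toy responses.
y = 2*x1 + 3*x2 + 1*x3 + 4*x1*x2 - 2*x2*x3 + rng.normal(0, 0.1, 60)

# Scheffe quadratic model: linear blending terms plus pairwise synergy terms
# (no intercept, because the proportions already sum to one).
design = np.column_stack([x1, x2, x3, x1*x2, x1*x3, x2*x3])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
for name, b in zip(["b1", "b2", "b3", "b12", "b13", "b23"], coef):
    print(f"{name} = {b:.3f}")
```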
Spectral Entropies as Information-Theoretic Tools for Complex Network Comparison
NASA Astrophysics Data System (ADS)
De Domenico, Manlio; Biamonte, Jacob
2016-10-01
Any physical system can be viewed from the perspective that information is implicitly represented in its state. However, the quantification of this information when it comes to complex networks has remained largely elusive. In this work, we use techniques inspired by quantum statistical mechanics to define an entropy measure for complex networks and to develop a set of information-theoretic tools, based on network spectral properties, such as Rényi q entropy, generalized Kullback-Leibler and Jensen-Shannon divergences, the latter allowing us to define a natural distance measure between complex networks. First, we show that by minimizing the Kullback-Leibler divergence between an observed network and a parametric network model, inference of model parameter(s) by means of maximum-likelihood estimation can be achieved and model selection can be performed with appropriate information criteria. Second, we show that the information-theoretic metric quantifies the distance between pairs of networks and we can use it, for instance, to cluster the layers of a multilayer system. By applying this framework to networks corresponding to sites of the human microbiome, we perform hierarchical cluster analysis and recover with high accuracy existing community-based associations. Our results imply that spectral-based statistical inference in complex networks results in demonstrably superior performance as well as a conceptual backbone, filling a gap towards a network information theory.
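A minimal sketch of the spectral-entropy ingredient, assuming the common construction of a network density matrix from the graph Laplacian, rho = exp(-beta*L)/Tr[exp(-beta*L)], followed by a Rényi q-entropy of its eigenvalues; the toy graphs and beta value are placeholders, and the divergence-based inference steps are not shown.

```python
import numpy as np

def laplacian(adj):
    return np.diag(adj.sum(axis=1)) - adj

def network_density_matrix(adj, beta=1.0):
    """rho = exp(-beta L) / Tr exp(-beta L), built from Laplacian eigenvalues."""
    evals, evecs = np.linalg.eigh(laplacian(adj))
    w = np.exp(-beta * evals)
    return (evecs * (w / w.sum())) @ evecs.T

def renyi_entropy(rho, q=2.0):
    lam = np.clip(np.linalg.eigvalsh(rho), 0.0, None)
    if np.isclose(q, 1.0):                      # von Neumann limit
        lam = lam[lam > 0]
        return -np.sum(lam * np.log(lam))
    return np.log(np.sum(lam ** q)) / (1.0 - q)

# Toy comparison: a ring graph vs. a star graph on 6 nodes.
ring = np.zeros((6, 6))
for i in range(6):
    ring[i, (i + 1) % 6] = ring[(i + 1) % 6, i] = 1
star = np.zeros((6, 6))
star[0, 1:] = star[1:, 0] = 1
print("ring:", renyi_entropy(network_density_matrix(ring), q=2))
print("star:", renyi_entropy(network_density_matrix(star), q=2))
```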
DOE Office of Scientific and Technical Information (OSTI.GOV)
McManamay, Ryan A
2014-01-01
Despite the ubiquitous existence of dams within riverscapes, much of our knowledge about dams and their environmental effects remains context-specific. Hydrology, more than any other environmental variable, has been studied in great detail with regard to dam regulation. While much progress has been made in generalizing the hydrologic effects of regulation by large dams, many aspects of hydrology show site-specific fidelity to dam operations, small dams (including diversions), and regional hydrologic regimes. A statistical modeling framework is presented to quantify and generalize hydrologic responses to varying degrees of dam regulation. Specifically, the objectives were to 1) compare the effects of local versus cumulative dam regulation, 2) determine the importance of different regional hydrologic regimes in influencing hydrologic responses to dams, and 3) evaluate how different regulation contexts lead to error in predicting hydrologic responses to dams. Overall, model performance was poor in quantifying the magnitude of hydrologic responses, but performance was sufficient in classifying hydrologic responses as negative or positive. Responses of some hydrologic indices to dam regulation were highly dependent upon hydrologic class membership and the purpose of the dam. The opposing coefficients between local and cumulative-dam predictors suggested that hydrologic responses to cumulative dam regulation are complex, and predicting the hydrology downstream of individual dams, as opposed to multiple dams, may be more easily accomplished using statistical approaches. Results also suggested that particular contexts, including multipurpose dams, high cumulative regulation by multiple dams, diversions, close proximity to dams, and certain hydrologic classes are all sources of increased error when predicting hydrologic responses to dams. Statistical models, such as the ones presented herein, show promise in their ability to model the effects of dam regulation at large spatial scales and to generalize the directionality of hydrologic responses.
Cardone, A.; Bornstein, A.; Pant, H. C.; Brady, M.; Sriram, R.; Hassan, S. A.
2015-01-01
A method is proposed to study protein-ligand binding in a system governed by specific and non-specific interactions. Strong associations lead to narrow distributions in the protein's configuration space; weak and ultra-weak associations lead instead to broader distributions, a manifestation of non-specific, sparsely-populated binding modes with multiple interfaces. The method is based on the notion that a discrete set of preferential first-encounter modes are metastable states from which stable (pre-relaxation) complexes at equilibrium evolve. The method can be used to explore alternative pathways of complexation with statistical significance and can be integrated into a general algorithm to study protein interaction networks. The method is applied to a peptide-protein complex. The peptide adopts several low-population conformers and binds in a variety of modes with a broad range of affinities. The system is thus well suited to analyze general features of binding, including conformational selection, multiplicity of binding modes, and nonspecific interactions, and to illustrate how the method can be applied to study these problems systematically. The equilibrium distributions can be used to generate biasing functions for simulations of multiprotein systems from which bulk thermodynamic quantities can be calculated. PMID:25782918
Walking through the statistical black boxes of plant breeding.
Xavier, Alencar; Muir, William M; Craig, Bruce; Rainey, Katy Martin
2016-10-01
The main statistical procedures in plant breeding are based on Gaussian process and can be computed through mixed linear models. Intelligent decision making relies on our ability to extract useful information from data to help us achieve our goals more efficiently. Many plant breeders and geneticists perform statistical analyses without understanding the underlying assumptions of the methods or their strengths and pitfalls. In other words, they treat these statistical methods (software and programs) like black boxes. Black boxes represent complex pieces of machinery with contents that are not fully understood by the user. The user sees the inputs and outputs without knowing how the outputs are generated. By providing a general background on statistical methodologies, this review aims (1) to introduce basic concepts of machine learning and its applications to plant breeding; (2) to link classical selection theory to current statistical approaches; (3) to show how to solve mixed models and extend their application to pedigree-based and genomic-based prediction; and (4) to clarify how the algorithms of genome-wide association studies work, including their assumptions and limitations.
Parsons, Nick R; Price, Charlotte L; Hiskens, Richard; Achten, Juul; Costa, Matthew L
2012-04-25
The application of statistics in reported research in trauma and orthopaedic surgery has become ever more important and complex. Despite the extensive use of statistical analysis, it is still a subject which is often not conceptually well understood, resulting in clear methodological flaws and inadequate reporting in many papers. A detailed statistical survey sampled 100 representative orthopaedic papers using a validated questionnaire that assessed the quality of the trial design and statistical analysis methods. The survey found evidence of failings in study design, statistical methodology and presentation of the results. Overall, in 17% (95% confidence interval; 10-26%) of the studies investigated the conclusions were not clearly justified by the results, in 39% (30-49%) of studies a different analysis should have been undertaken and in 17% (10-26%) a different analysis could have made a difference to the overall conclusions. It is only by an improved dialogue between statistician, clinician, reviewer and journal editor that the failings in design methodology and analysis highlighted by this survey can be addressed.
Subjective randomness as statistical inference.
Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B
2018-06-01
Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
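A minimal sketch of the underlying idea: score a sequence by the log evidence ratio between a fair-coin generating process and one simple "regular" alternative. The repetition-biased alternative and its p_repeat parameter are illustrative stand-ins, not the restricted model classes developed in the paper.

```python
import numpy as np

def log_evidence_random(seq):
    """Log-probability of the sequence under a fair coin."""
    return len(seq) * np.log(0.5)

def log_evidence_regular(seq, p_repeat=0.8):
    """Log-probability under a simple 'regular' process that tends to
    repeat the previous symbol with probability p_repeat."""
    logp = np.log(0.5)                        # first symbol
    for prev, cur in zip(seq[:-1], seq[1:]):
        logp += np.log(p_repeat if cur == prev else 1.0 - p_repeat)
    return logp

def subjective_randomness(seq):
    """Evidence the sequence provides for having been produced randomly:
    positive -> looks random, negative -> looks regular."""
    return log_evidence_random(seq) - log_evidence_regular(seq)

print(subjective_randomness("HTHHTHTT"))   # mixed sequence: positive, looks fairly random
print(subjective_randomness("HHHHHHHH"))   # eight heads in a row: negative, looks regular
```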
Multi-region statistical shape model for cochlear implantation
NASA Astrophysics Data System (ADS)
Romera, Jordi; Kjer, H. Martin; Piella, Gemma; Ceresa, Mario; González Ballester, Miguel A.
2016-03-01
Statistical shape models are commonly used to analyze the variability between similar anatomical structures and their use is established as a tool for analysis and segmentation of medical images. However, using a global model to capture the variability of complex structures is not enough to achieve the best results. The complexity of a proper global model increases even more when the amount of data available is limited to a small number of datasets. Typically, the anatomical variability between structures is associated to the variability of their physiological regions. In this paper, a complete pipeline is proposed for building a multi-region statistical shape model to study the entire variability from locally identified physiological regions of the inner ear. The proposed model, which is based on an extension of the Point Distribution Model (PDM), is built for a training set of 17 high-resolution images (24.5 μm voxels) of the inner ear. The model is evaluated according to its generalization ability and specificity. The results are compared with the ones of a global model built directly using the standard PDM approach. The evaluation results suggest that better accuracy can be achieved using a regional modeling of the inner ear.
Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf
NASA Astrophysics Data System (ADS)
Pietronero, L.; Tosatti, E.; Tosatti, V.; Vespignani, A.
2001-04-01
The distribution of first digits in number series obtained from very different origins shows a marked asymmetry in favor of small digits that goes under the name of Benford's law. We analyze this property in detail for different data sets and give a general explanation for the origin of Benford's law in terms of multiplicative processes. We show that this law can also be generalized to series of numbers generated from more complex systems, like the catalogs of seismic activity. Finally, we derive a relation between the generalized Benford's law and the popular Zipf's law, which characterizes rank-order statistics and has been extensively applied to many problems ranging from city populations to linguistics.
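A small sketch of the first-digit analysis: draw a synthetic multiplicative (log-normal) data set, tabulate leading digits and compare them with the Benford prediction P(d) = log10(1 + 1/d) via a chi-square test. The data set is a placeholder for the empirical catalogues discussed above.

```python
import numpy as np
from scipy.stats import chisquare

def first_digits(x):
    x = np.abs(np.asarray(x, dtype=float))
    x = x[x > 0]
    return (x / 10 ** np.floor(np.log10(x))).astype(int)

benford = np.log10(1 + 1 / np.arange(1, 10))      # P(d) = log10(1 + 1/d)

rng = np.random.default_rng(0)
data = np.exp(rng.normal(0, 3, 100_000))          # toy multiplicative process
counts = np.bincount(first_digits(data), minlength=10)[1:]

chi2, p = chisquare(counts, f_exp=benford * counts.sum())
for d in range(1, 10):
    print(d, f"observed {counts[d-1] / counts.sum():.3f}", f"Benford {benford[d-1]:.3f}")
print("chi-square p-value:", p)
```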
Improvement of short-term numerical wind predictions
NASA Astrophysics Data System (ADS)
Bedard, Joel
Geophysical Model Output Statistics (GMOS) are developed to optimize the use of NWP for complex sites. GMOS differs from other MOS widely used by meteorological centers in the following aspects: it takes into account surrounding geophysical parameters such as surface roughness and terrain height, along with wind direction, and it can be applied directly without any training, although training will further improve the results. GMOS was applied to improve the Environment Canada GEM-LAM 2.5 km forecasts at North Cape (PEI, Canada): it improves the prediction RMSE by 25-30% for all time horizons and almost all meteorological conditions; the topographic signature of the forecast error due to insufficient grid refinement is eliminated; and the NWP combined with GMOS outperforms persistence from a 2 h horizon onward, instead of 4 h without GMOS. Finally, GMOS was applied at another site (Bouctouche, NB, Canada): similar improvements were observed, showing its general applicability. Keywords: wind energy, wind power forecast, numerical weather prediction, complex sites, model output statistics
Complex Network Analysis for Characterizing Global Value Chains in Equipment Manufacturing.
Xiao, Hao; Sun, Tianyang; Meng, Bo; Cheng, Lihong
2017-01-01
The rise of global value chains (GVCs) characterized by the so-called "outsourcing", "fragmentation production", and "trade in tasks" has been considered one of the most important phenomena for the 21st century trade. GVCs also can play a decisive role in trade policy making. However, due to the increasing complexity and sophistication of international production networks, especially in the equipment manufacturing industry, conventional trade statistics and the corresponding trade indicators may give us a distorted picture of trade. This paper applies various network analysis tools to the new GVC accounting system proposed by Koopman et al. (2014) and Wang et al. (2013) in which gross exports can be decomposed into value-added terms through various routes along GVCs. This helps to divide the equipment manufacturing-related GVCs into some sub-networks with clear visualization. The empirical results of this paper significantly improve our understanding of the topology of equipment manufacturing-related GVCs as well as the interdependency of countries in these GVCs that is generally invisible from the traditional trade statistics.
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-07-05
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.
Student Solution Manual for Mathematical Methods for Physics and Engineering Third Edition
NASA Astrophysics Data System (ADS)
Riley, K. F.; Hobson, M. P.
2006-03-01
Preface; 1. Preliminary algebra; 2. Preliminary calculus; 3. Complex numbers and hyperbolic functions; 4. Series and limits; 5. Partial differentiation; 6. Multiple integrals; 7. Vector algebra; 8. Matrices and vector spaces; 9. Normal modes; 10. Vector calculus; 11. Line, surface and volume integrals; 12. Fourier series; 13. Integral transforms; 14. First-order ordinary differential equations; 15. Higher-order ordinary differential equations; 16. Series solutions of ordinary differential equations; 17. Eigenfunction methods for differential equations; 18. Special functions; 19. Quantum operators; 20. Partial differential equations: general and particular; 21. Partial differential equations: separation of variables; 22. Calculus of variations; 23. Integral equations; 24. Complex variables; 25. Application of complex variables; 26. Tensors; 27. Numerical methods; 28. Group theory; 29. Representation theory; 30. Probability; 31. Statistics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jing; Ackerman, David M.; Lin, Victor S.-Y.
2013-04-02
Statistical mechanical modeling is performed of a catalytic conversion reaction within a functionalized nanoporous material to assess the effect of varying the reaction product-pore interior interaction from attractive to repulsive. A strong enhancement in reactivity is observed not just due to the shift in reaction equilibrium towards completion but also due to enhanced transport within the pore resulting from reduced loading. The latter effect is strongest for highly restricted transport (single-file diffusion), and applies even for irreversible reactions. The analysis is performed utilizing a generalized hydrodynamic formulation of the reaction-diffusion equations which can reliably capture the complex interplay between reaction and restricted transport.
JIGSAW: Preference-directed, co-operative scheduling
NASA Technical Reports Server (NTRS)
Linden, Theodore A.; Gaw, David
1992-01-01
Techniques that enable humans and machines to cooperate in the solution of complex scheduling problems have evolved out of work on the daily allocation and scheduling of Tactical Air Force resources. A generalized, formal model of these applied techniques is being developed. It is called JIGSAW by analogy with the multi-agent, constructive process used when solving jigsaw puzzles. JIGSAW begins from this analogy and extends it by propagating local preferences into global statistics that dynamically influence the value and variable ordering decisions. The statistical projections also apply to abstract resources and time periods--allowing more opportunities to find a successful variable ordering by reserving abstract resources and deferring the choice of a specific resource or time period.
Landsman, V; Lou, W Y W; Graubard, B I
2015-05-20
We present a two-step approach for estimating hazard rates and, consequently, survival probabilities, by levels of general categorical exposure. The resulting estimator utilizes three sources of data: vital statistics data and census data are used at the first step to estimate the overall hazard rate for a given combination of gender and age group, and cohort data constructed from a nationally representative complex survey with linked mortality records, are used at the second step to divide the overall hazard rate by exposure levels. We present an explicit expression for the resulting estimator and consider two methods for variance estimation that account for complex multistage sample design: (1) the leaving-one-out jackknife method, and (2) the Taylor linearization method, which provides an analytic formula for the variance estimator. The methods are illustrated with smoking and all-cause mortality data from the US National Health Interview Survey Linked Mortality Files, and the proposed estimator is compared with a previously studied crude hazard rate estimator that uses survey data only. The advantages of a two-step approach and possible extensions of the proposed estimator are discussed. Copyright © 2015 John Wiley & Sons, Ltd.
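A toy numerical sketch of the two-step logic, under the simplifying assumptions of a constant hazard within the cell and known exposure prevalences: step one takes the overall hazard from vital statistics (deaths over person-years), step two splits it by exposure level using survey-based relative hazards. All numbers are hypothetical, and the survey weighting and variance estimation are omitted.

```python
import numpy as np

# Step 1: overall hazard for one gender/age-group cell from vital statistics.
deaths, person_years = 1200.0, 850_000.0
h_overall = deaths / person_years

# Step 2: split by exposure level (e.g. never/former/current smoker) using
# survey-based exposure prevalences and relative hazards (reference = level 0).
prevalence = np.array([0.55, 0.25, 0.20])   # hypothetical, must sum to 1
rel_hazard = np.array([1.0, 1.4, 2.3])      # hypothetical survey estimates

# h_overall = sum_k prevalence_k * rel_hazard_k * h_ref  =>  solve for h_ref.
h_ref = h_overall / np.sum(prevalence * rel_hazard)
h_by_exposure = rel_hazard * h_ref

for level, h in zip(["never", "former", "current"], h_by_exposure):
    surv_5yr = np.exp(-5 * h)               # constant-hazard survival over 5 years
    print(f"{level}: hazard {h:.5f}/yr, 5-year survival {surv_5yr:.3f}")
```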
Statistical Inferences from the Topology of Complex Networks
2016-10-04
stable, does not lose any information, has continuous and discrete versions, and obeys a strong law of large numbers and a central limit theorem. The...paper (with J.A. Scott) “Categorification of persistent homology” [7] in the journal Discrete and Computational Geometry and the paper “Metrics for...Generalized Persistence Modules” (with J.A. Scott and V. de Silva) in the journal Foundations of Computational Mathematics [5]. These papers develop
New phases of D ≥ 2 current and diffeomorphism algebras in particle physics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tze, Chia-Hsiung.
We survey some global results and open issues of current algebras and their canonical field theoretical realization in D ≥ 2 dimensional spacetime. We assess the status of the representation theory of their generalized Kac-Moody and diffeomorphism algebras. Particular emphasis is put on higher dimensional analogs of Fermi-Bose correspondence, complex analyticity and the phase entanglements of anyonic solitons with exotic spin and statistics. 101 refs.
Nonlinear Relaxation in Population Dynamics
NASA Astrophysics Data System (ADS)
Cirone, Markus A.; de Pasquale, Ferdinando; Spagnolo, Bernardo
We analyze the nonlinear relaxation of a complex ecosystem composed of many interacting species. The ecological system is described by generalized Lotka-Volterra equations with a multiplicative noise. The transient dynamics is studied in the framework of the mean field theory and with random interaction between the species. We focus on the statistical properties of the asymptotic behaviour of the time integral of the ith population and on the distribution of the population and of the local field.
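A minimal Euler-Maruyama sketch of generalized Lotka-Volterra dynamics with random couplings and multiplicative noise, in the spirit of the model described above; the number of species, coupling scale, noise intensity and initial condition are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n_species, n_steps, dt = 50, 2000, 0.01
J = rng.normal(0.0, 0.1 / np.sqrt(n_species), (n_species, n_species))  # random couplings
np.fill_diagonal(J, 0.0)
growth, noise_amp = 1.0, 0.2

x = np.full(n_species, 0.5)
time_integral = np.zeros(n_species)          # time integral of each population
for _ in range(n_steps):
    drift = x * (growth - x + J @ x)         # generalized Lotka-Volterra drift
    dW = rng.normal(0.0, np.sqrt(dt), n_species)
    x = np.clip(x + drift * dt + noise_amp * x * dW, 0.0, None)  # multiplicative noise
    time_integral += x * dt

print("mean population:", x.mean())
print("mean time integral:", time_integral.mean())
```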
How Good Are Statistical Models at Approximating Complex Fitness Landscapes?
du Plessis, Louis; Leventhal, Gabriel E.; Bonhoeffer, Sebastian
2016-01-01
Fitness landscapes determine the course of adaptation by constraining and shaping evolutionary trajectories. Knowledge of the structure of a fitness landscape can thus predict evolutionary outcomes. Empirical fitness landscapes, however, have so far only offered limited insight into real-world questions, as the high dimensionality of sequence spaces makes it impossible to exhaustively measure the fitness of all variants of biologically meaningful sequences. We must therefore revert to statistical descriptions of fitness landscapes that are based on a sparse sample of fitness measurements. It remains unclear, however, how much data are required for such statistical descriptions to be useful. Here, we assess the ability of regression models accounting for single and pairwise mutations to correctly approximate a complex quasi-empirical fitness landscape. We compare approximations based on various sampling regimes of an RNA landscape and find that the sampling regime strongly influences the quality of the regression. On the one hand it is generally impossible to generate sufficient samples to achieve a good approximation of the complete fitness landscape, and on the other hand systematic sampling schemes can only provide a good description of the immediate neighborhood of a sequence of interest. Nevertheless, we obtain a remarkably good and unbiased fit to the local landscape when using sequences from a population that has evolved under strong selection. Thus, current statistical methods can provide a good approximation to the landscape of naturally evolving populations. PMID:27189564
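A minimal sketch of the type of statistical description assessed above: regress fitness measurements from a sparse sample of binary genotypes on single-site terms and pairwise interaction terms. The synthetic landscape and sampling scheme are placeholders for the quasi-empirical RNA landscape used in the study.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
L, n_samples = 8, 300                          # sequence length, sampled genotypes

# Synthetic "true" landscape: additive effects plus sparse pairwise epistasis.
main = rng.normal(0, 1, L)
epi = {(i, j): rng.normal(0, 0.5) for i, j in combinations(range(L), 2) if rng.random() < 0.2}

def fitness(g):
    f = g @ main
    for (i, j), w in epi.items():
        f += w * g[i] * g[j]
    return f

genotypes = rng.integers(0, 2, (n_samples, L))
y = np.array([fitness(g) for g in genotypes]) + rng.normal(0, 0.1, n_samples)

# Design matrix: intercept, single-site terms, all pairwise products.
pairs = list(combinations(range(L), 2))
X = np.column_stack([np.ones(n_samples), genotypes,
                     *[genotypes[:, i] * genotypes[:, j] for i, j in pairs]])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

pred = X @ coef
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
print("in-sample R^2:", 1 - ss_res / ss_tot)
```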
Generalized memory associativity in a network model for the neuroses
NASA Astrophysics Data System (ADS)
Wedemann, Roseli S.; Donangelo, Raul; de Carvalho, Luís A. V.
2009-03-01
We review concepts introduced in earlier work, where a neural network mechanism describes some mental processes in neurotic pathology and psychoanalytic working-through, as associative memory functioning, according to the findings of Freud. We developed a complex network model, where modules corresponding to sensorial and symbolic memories interact, representing unconscious and conscious mental processes. The model illustrates Freud's idea that consciousness is related to symbolic and linguistic memory activity in the brain. We have introduced a generalization of the Boltzmann machine to model memory associativity. Model behavior is illustrated with simulations and some of its properties are analyzed with methods from statistical mechanics.
The epistemological status of general circulation models
NASA Astrophysics Data System (ADS)
Loehle, Craig
2018-03-01
Forecasts of both likely anthropogenic effects on climate and consequent effects on nature and society are based on large, complex software tools called general circulation models (GCMs). Forecasts generated by GCMs have been used extensively in policy decisions related to climate change. However, the relation between underlying physical theories and results produced by GCMs is unclear. In the case of GCMs, many discretizations and approximations are made, and simulating Earth system processes is far from simple and currently leads to some results with unknown energy balance implications. Statistical testing of GCM forecasts for degree of agreement with data would facilitate assessment of fitness for use. If model results need to be put on an anomaly basis due to model bias, then both visual and quantitative measures of model fit depend strongly on the reference period used for normalization, making testing problematic. Epistemology is here applied to problems of statistical inference during testing, the relationship between the underlying physics and the models, the epistemic meaning of ensemble statistics, problems of spatial and temporal scale, the existence or not of an unforced null for climate fluctuations, the meaning of existing uncertainty estimates, and other issues. Rigorous reasoning entails carefully quantifying levels of uncertainty.
Statistical downscaling of precipitation using long short-term memory recurrent neural networks
NASA Astrophysics Data System (ADS)
Misra, Saptarshi; Sarkar, Sudeshna; Mitra, Pabitra
2017-11-01
Hydrological impacts of global climate change on regional scale are generally assessed by downscaling large-scale climatic variables, simulated by General Circulation Models (GCMs), to regional, small-scale hydrometeorological variables like precipitation, temperature, etc. In this study, we propose a new statistical downscaling model based on Recurrent Neural Network with Long Short-Term Memory which captures the spatio-temporal dependencies in local rainfall. The previous studies have used several other methods such as linear regression, quantile regression, kernel regression, beta regression, and artificial neural networks. Deep neural networks and recurrent neural networks have been shown to be highly promising in modeling complex and highly non-linear relationships between input and output variables in different domains and hence we investigated their performance in the task of statistical downscaling. We have tested this model on two datasets—one on precipitation in Mahanadi basin in India and the second on precipitation in Campbell River basin in Canada. Our autoencoder coupled long short-term memory recurrent neural network model performs the best compared to other existing methods on both the datasets with respect to temporal cross-correlation, mean squared error, and capturing the extremes.
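A minimal PyTorch sketch of the model family described, mapping sequences of large-scale predictors to a local precipitation value with an LSTM; the tensor shapes, layer sizes and training settings are placeholders, and the autoencoder coupling used in the study is omitted.

```python
import torch
import torch.nn as nn

class DownscalingLSTM(nn.Module):
    def __init__(self, n_predictors, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(n_predictors, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                      # x: (batch, time, n_predictors)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])        # predict rainfall at the last time step

# Placeholder data: 256 sequences, 30 time steps, 12 large-scale predictors.
X = torch.randn(256, 30, 12)
y = torch.rand(256, 1)

model = DownscalingLSTM(n_predictors=12)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
print("final training MSE:", loss.item())
```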
New class of generalized photon-added coherent states and some of their non-classical properties
NASA Astrophysics Data System (ADS)
Mojaveri, B.; Dehghani, A.; Mahmoodi, S.
2014-08-01
In this paper, we construct a new class of generalized photon-added coherent states (GPACSs), |z,m⟩_r, by excitations on a newly introduced family of generalized coherent states (GCSs) |z⟩_r (A. Dehghani and B. Mojaveri 2012 J. Phys. A: Math. Theor. 45 095304), obtained via generalized hypergeometric-type displacement operators acting on the vacuum state of the simple harmonic oscillator. We show that these states realize the resolution-of-the-identity property through positive definite measures on the complex plane. Meanwhile, we demonstrate that the introduced states can also be interpreted as nonlinear coherent states (NLCSs) with a special nonlinearity function. Finally, some of their non-classical features as well as their quantum statistical properties are compared with those of Agarwal's photon-added coherent states (PACSs), |z,m⟩.
NASA Astrophysics Data System (ADS)
Donges, J. F.; Schleussner, C.-F.; Siegmund, J. F.; Donner, R. V.
2016-05-01
Studying event time series is a powerful approach for analyzing the dynamics of complex dynamical systems in many fields of science. In this paper, we describe the method of event coincidence analysis to provide a framework for quantifying the strength, directionality and time lag of statistical interrelationships between event series. Event coincidence analysis allows one to formulate and test null hypotheses on the origin of the observed interrelationships, including tests based on Poisson processes or, more generally, stochastic point processes with a prescribed inter-event time distribution and other higher-order properties. Applying the framework to country-level observational data yields evidence that flood events have acted as triggers of epidemic outbreaks globally since the 1950s. Facing projected future changes in the statistics of climatic extreme events, statistical techniques such as event coincidence analysis will be relevant for investigating the impacts of anthropogenic climate change on human societies and ecosystems worldwide.
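A minimal sketch of event coincidence analysis with the simplest null model, independent Poisson processes: count "trigger" coincidences in which a B event follows an A event within a window delta_t, and compare the count with its binomial expectation under independence. The window length, rates and the synthetic flood/outbreak series are illustrative.

```python
import numpy as np
from scipy.stats import binom

def coincidence_count(a_times, b_times, delta_t):
    """Number of A events followed by at least one B event within delta_t."""
    b_times = np.sort(b_times)
    count = 0
    for t in a_times:
        idx = np.searchsorted(b_times, t)
        if idx < len(b_times) and b_times[idx] - t <= delta_t:
            count += 1
    return count

def ecd_p_value(a_times, b_times, delta_t, period):
    """P-value against the Poisson null: under independence each A event is
    followed by a B event within delta_t with prob 1 - exp(-rate_b * delta_t)."""
    k = coincidence_count(a_times, b_times, delta_t)
    p_single = 1.0 - np.exp(-len(b_times) / period * delta_t)
    return k, binom.sf(k - 1, len(a_times), p_single)

rng = np.random.default_rng(3)
period = 1000.0
floods = np.sort(rng.uniform(0, period, 40))
outbreaks = np.sort(np.concatenate([rng.uniform(0, period, 10),
                                    floods[:15] + rng.uniform(0, 2, 15)]))  # 15 triggered
k, p = ecd_p_value(floods, outbreaks, delta_t=2.0, period=period)
print(f"{k} coincidences, p = {p:.4g}")
```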
Admixture, Population Structure, and F-Statistics.
Peter, Benjamin M
2016-04-01
Many questions about human genetic history can be addressed by examining the patterns of shared genetic variation between sets of populations. A useful methodological framework for this purpose is F-statistics, which measure shared genetic drift between sets of two, three, and four populations and can be used to test simple and complex hypotheses about admixture between populations. This article provides context from phylogenetic and population genetic theory. I review how F-statistics can be interpreted as branch lengths or paths and derive new interpretations, using coalescent theory. I further show that the admixture tests can be interpreted as testing general properties of phylogenies, allowing extension of some of these ideas to arbitrary phylogenetic trees. The new results are used to investigate the behavior of the statistics under different models of population structure and show how population substructure complicates inference. The results lead to simplified estimators in many cases, and I recommend replacing F3 with the average number of pairwise differences for estimating population divergence. Copyright © 2016 by the Genetics Society of America.
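A minimal sketch of the f3-statistic in its usual allele-frequency form, f3(C; A, B) = mean over SNPs of (c - a)(c - b), where a markedly negative value signals that C is admixed between populations related to A and B. The drift simulation below is a crude stand-in, and the small-sample bias correction used in practice is omitted.

```python
import numpy as np

def f3(freq_c, freq_a, freq_b):
    """f3(C; A, B) = mean over SNPs of (c - a)(c - b)."""
    return np.mean((freq_c - freq_a) * (freq_c - freq_b))

rng = np.random.default_rng(7)
n_snps = 20_000
anc = rng.uniform(0.05, 0.95, n_snps)            # ancestral allele frequencies

def drift(p, n_eff):
    """Crude one-generation drift: binomial resampling with n_eff lineages."""
    return rng.binomial(n_eff, p) / n_eff

a = drift(anc, 20)                               # A and B drift strongly apart
b = drift(anc, 20)
c_admixed = drift(0.5 * a + 0.5 * b, 200)        # C: 50/50 mixture, little drift after
c_unadmixed = drift(drift(anc, 20), 200)         # C drifted independently of A and B

print("admixed   f3:", f3(c_admixed, a, b))      # tends to be negative
print("unadmixed f3:", f3(c_unadmixed, a, b))    # positive
```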
Network geometry with flavor: From complexity to quantum geometry.
Bianconi, Ginestra; Rahmede, Christoph
2016-03-01
Network geometry is attracting increasing attention because it has a wide range of applications, ranging from data mining to routing protocols in the Internet. At the same time advances in the understanding of the geometrical properties of networks are essential for further progress in quantum gravity. In network geometry, simplicial complexes describing the interaction between two or more nodes play a special role. In fact these structures can be used to discretize a geometrical d-dimensional space, and for this reason they have already been widely used in quantum gravity. Here we introduce the network geometry with flavor s=-1,0,1 (NGF) describing simplicial complexes defined in arbitrary dimension d and evolving by a nonequilibrium dynamics. The NGF can generate discrete geometries of different natures, ranging from chains and higher-dimensional manifolds to scale-free networks with small-world properties, scale-free degree distribution, and nontrivial community structure. The NGF admits as limiting cases both the Bianconi-Barabási models for complex networks, the stochastic Apollonian network, and the recently introduced model for complex quantum network manifolds. The thermodynamic properties of NGF reveal that NGF obeys a generalized area law opening a new scenario for formulating its coarse-grained limit. The structure of NGF is strongly dependent on the dimensionality d. In d=1 NGFs grow complex networks for which the preferential attachment mechanism is necessary in order to obtain a scale-free degree distribution. Instead, for NGF with dimension d>1 it is not necessary to have an explicit preferential attachment rule to generate scale-free topologies. We also show that NGF admits a quantum mechanical description in terms of associated quantum network states. Quantum network states evolve by a Markovian dynamics and a quantum network state at time t encodes all possible NGF evolutions up to time t. Interestingly the NGF remains fully classical but its statistical properties reveal the relation to its quantum mechanical description. In fact the δ-dimensional faces of the NGF have generalized degrees that follow either the Fermi-Dirac, Boltzmann, or Bose-Einstein statistics depending on the flavor s and the dimensions d and δ.
duVerle, David A; Yotsukura, Sohiya; Nomura, Seitaro; Aburatani, Hiroyuki; Tsuda, Koji
2016-09-13
Single-cell RNA sequencing is fast becoming one of the standard methods for gene expression measurement, providing unique insights into cellular processes. A number of methods, based on general dimensionality reduction techniques, have been suggested to help infer and visualise the underlying structure of cell populations from single-cell expression levels, yet their models generally lack proper biological grounding and struggle to identify complex differentiation paths. Here we introduce cellTree: an R/Bioconductor package that uses a novel statistical approach, based on document analysis techniques, to produce tree structures outlining the hierarchical relationship between single-cell samples, while identifying latent groups of genes that can provide biological insights. With cellTree, we provide experimentalists with an easy-to-use tool, based on statistically and biologically sound algorithms, to efficiently explore and visualise single-cell RNA data. The cellTree package is publicly available in the online Bioconductor repository at http://bioconductor.org/packages/cellTree/.
Frequency-dependent scaling from mesoscale to macroscale in viscoelastic random composites
Zhang, Jun
2016-01-01
This paper investigates the scaling from a statistical volume element (SVE; i.e. mesoscale level) to representative volume element (RVE; i.e. macroscale level) of spatially random linear viscoelastic materials, focusing on the quasi-static properties in the frequency domain. Requiring the material statistics to be spatially homogeneous and ergodic, the mesoscale bounds on the RVE response are developed from the Hill–Mandel homogenization condition adapted to viscoelastic materials. The bounds are obtained from two stochastic initial-boundary value problems set up, respectively, under uniform kinematic and traction boundary conditions. The frequency and scale dependencies of mesoscale bounds are obtained through computational mechanics for composites with planar random chessboard microstructures. In general, the frequency-dependent scaling to RVE can be described through a complex-valued scaling function, which generalizes the concept originally developed for linear elastic random composites. This scaling function is shown to apply for all different phase combinations on random chessboards and, essentially, is only a function of the microstructure and mesoscale. PMID:27274689
The precise time-dependent solution of the Fokker–Planck equation with anomalous diffusion
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guo, Ran; Du, Jiulin, E-mail: jiulindu@aliyun.com
2015-08-15
We study the time behavior of the Fokker–Planck equation in Zwanzig's rule (the backward-Ito rule) based on the Langevin equation of Brownian motion with an anomalous diffusion in a complex medium. The diffusion coefficient is a function in momentum space and follows a generalized fluctuation–dissipation relation. We obtain the precise time-dependent analytical solution of the Fokker–Planck equation, and at long times the solution approaches a stationary power-law distribution in nonextensive statistics. As a test, we have numerically demonstrated the accuracy and validity of the time-dependent solution. - Highlights: • The precise time-dependent solution of the Fokker–Planck equation with anomalous diffusion is found. • The anomalous diffusion satisfies a generalized fluctuation–dissipation relation. • At long times the time-dependent solution approaches a power-law distribution in nonextensive statistics. • Numerically we have demonstrated the accuracy and validity of the time-dependent solution.
Large-N -approximated field theory for multipartite entanglement
NASA Astrophysics Data System (ADS)
Facchi, P.; Florio, G.; Parisi, G.; Pascazio, S.; Scardicchio, A.
2015-12-01
We try to characterize the statistics of multipartite entanglement of the random states of an n-qubit system. Unable to solve the problem exactly, we generalize it, replacing complex numbers with real vectors with N_c components (the original problem is recovered for N_c = 2). Studying the leading diagrams in the large-N_c approximation, we unearth the presence of a phase transition and, in an explicit example, show that the so-called entanglement frustration disappears in the large-N_c limit.
Path integral molecular dynamics for exact quantum statistics of multi-electronic-state systems.
Liu, Xinzijian; Liu, Jian
2018-03-14
An exact approach to compute physical properties for general multi-electronic-state (MES) systems in thermal equilibrium is presented. The approach is extended from our recent progress on path integral molecular dynamics (PIMD), Liu et al. [J. Chem. Phys. 145, 024103 (2016)] and Zhang et al. [J. Chem. Phys. 147, 034109 (2017)], for quantum statistical mechanics when a single potential energy surface is involved. We first define an effective potential function that is numerically favorable for MES-PIMD and then derive corresponding estimators in MES-PIMD for evaluating various physical properties. Its application to several representative one-dimensional and multi-dimensional models demonstrates that MES-PIMD in principle offers a practical tool in either of the diabatic and adiabatic representations for studying exact quantum statistics of complex/large MES systems when the Born-Oppenheimer approximation, Condon approximation, and harmonic bath approximation are broken.
Econophysical visualization of Adam Smith’s invisible hand
NASA Astrophysics Data System (ADS)
Cohen, Morrel H.; Eliazar, Iddo I.
2013-02-01
Consider a complex system whose macrostate is statistically observable, but yet whose operating mechanism is an unknown black-box. In this paper we address the problem of inferring, from the system’s macrostate statistics, the system’s intrinsic force yielding the observed statistics. The inference is established via two diametrically opposite approaches which result in the very same intrinsic force: a top-down approach based on the notion of entropy, and a bottom-up approach based on the notion of Langevin dynamics. The general results established are applied to the problem of visualizing the intrinsic socioeconomic force-Adam Smith’s invisible hand-shaping the distribution of wealth in human societies. Our analysis yields quantitative econophysical representations of figurative socioeconomic forces, quantitative definitions of “poor” and “rich”, and a quantitative characterization of the “poor-get-poorer” and the “rich-get-richer” phenomena.
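A minimal sketch of the bottom-up (Langevin) direction of the argument under the simplest assumptions (overdamped dynamics, constant diffusion coefficient D): the intrinsic force follows from the observed stationary density as F(x) proportional to d/dx ln p(x). The exponential toy data and histogram settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
wealth = rng.exponential(scale=1.0, size=200_000)   # toy observed macrostate statistics

# Histogram estimate of the stationary density p(x).
counts, edges = np.histogram(wealth, bins=80, range=(0, 8), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
valid = counts > 0

# Intrinsic force (up to the diffusion coefficient D): F(x) = D * d/dx ln p(x).
log_p = np.log(counts[valid])
force = np.gradient(log_p, centers[valid])

# For an exponential density p(x) ~ exp(-x), the inferred force is roughly -1,
# i.e. a constant "pull" toward small wealth.
print("mean inferred force:", force.mean())
```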
Compositional Solution Space Quantification for Probabilistic Software Analysis
NASA Technical Reports Server (NTRS)
Borges, Mateus; Pasareanu, Corina S.; Filieri, Antonio; d'Amorim, Marcelo; Visser, Willem
2014-01-01
Probabilistic software analysis aims at quantifying how likely a target event is to occur during program execution. Current approaches rely on symbolic execution to identify the conditions to reach the target event and try to quantify the fraction of the input domain satisfying these conditions. Precise quantification is usually limited to linear constraints, while only approximate solutions can be provided in general through statistical approaches. However, statistical approaches may fail to converge to an acceptable accuracy within a reasonable time. We present a compositional statistical approach for the efficient quantification of solution spaces for arbitrarily complex constraints over bounded floating-point domains. The approach leverages interval constraint propagation to improve the accuracy of the estimation by focusing the sampling on the regions of the input domain containing the sought solutions. Preliminary experiments show significant improvement on previous approaches both in results accuracy and analysis time.
Modeling the Development of Audiovisual Cue Integration in Speech Perception
Getz, Laura M.; Nordeen, Elke R.; Vrabic, Sarah C.; Toscano, Joseph C.
2017-01-01
Adult speech perception is generally enhanced when information is provided from multiple modalities. In contrast, infants do not appear to benefit from combining auditory and visual speech information early in development. This is true despite the fact that both modalities are important to speech comprehension even at early stages of language acquisition. How then do listeners learn how to process auditory and visual information as part of a unified signal? In the auditory domain, statistical learning processes provide an excellent mechanism for acquiring phonological categories. Is this also true for the more complex problem of acquiring audiovisual correspondences, which require the learner to integrate information from multiple modalities? In this paper, we present simulations using Gaussian mixture models (GMMs) that learn cue weights and combine cues on the basis of their distributional statistics. First, we simulate the developmental process of acquiring phonological categories from auditory and visual cues, asking whether simple statistical learning approaches are sufficient for learning multi-modal representations. Second, we use this time course information to explain audiovisual speech perception in adult perceivers, including cases where auditory and visual input are mismatched. Overall, we find that domain-general statistical learning techniques allow us to model the developmental trajectory of audiovisual cue integration in speech, and in turn, allow us to better understand the mechanisms that give rise to unified percepts based on multiple cues. PMID:28335558
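A minimal sketch of the modelling ingredient described above: fit a Gaussian mixture model to two-dimensional audiovisual cue data and read category posteriors off the fitted components, including for a mismatched token. The cue dimensions, means and covariances are synthetic placeholders, not the simulation settings of the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Two phonological categories, each emitting an auditory and a visual cue.
# Means and covariances are illustrative placeholders.
n = 500
cat0 = rng.multivariate_normal([20.0, 0.3], [[16.0, 0.1], [0.1, 0.01]], n)
cat1 = rng.multivariate_normal([60.0, 0.7], [[25.0, 0.1], [0.1, 0.01]], n)
cues = np.vstack([cat0, cat1])                  # columns: auditory cue, visual cue

# Unsupervised statistical learning of the categories from cue distributions.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(cues)

# Posterior category probabilities for a new audiovisual token,
# including a "mismatched" token (category-0-like audio, category-1-like visual).
matched = np.array([[22.0, 0.32]])
mismatched = np.array([[22.0, 0.70]])
print("matched  posterior:", gmm.predict_proba(matched).round(3))
print("mismatch posterior:", gmm.predict_proba(mismatched).round(3))
```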
Refined generalized multiscale entropy analysis for physiological signals
NASA Astrophysics Data System (ADS)
Liu, Yunxiao; Lin, Youfang; Wang, Jing; Shang, Pengjian
2018-01-01
Multiscale entropy analysis has become a prevalent complexity measure and has been successfully applied in various fields. However, it only takes into account the information of mean values (first moment) in the coarse-graining procedure. Generalized multiscale entropy (MSEn), which uses higher moments to coarse-grain a time series, was therefore proposed, and MSEσ2 has been implemented. However, MSEσ2 may sometimes yield an imprecise or undefined entropy estimate, and the statistical reliability of the sample entropy estimate decreases as the scale factor increases. To address this, we developed a refined model, RMSEσ2, to improve MSEσ2. Simulations on both white noise and 1/f noise show that RMSEσ2 provides higher entropy reliability and reduces the occurrence of undefined entropy, making it especially suitable for short time series. We also discuss how RMSEσ2 analysis is affected by outliers, data loss, and other signal-processing issues. We apply the proposed model to evaluate the complexity of heartbeat interval time series derived from healthy young and elderly subjects, patients with congestive heart failure, and patients with atrial fibrillation, and compare it with several popular complexity metrics. The results demonstrate that the complexity measured by RMSEσ2 (a) decreases with aging and disease, and (b) gives significant discrimination between different physiological/pathological states, which may facilitate clinical application.
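For orientation, here is a minimal sketch of the generic second-moment (variance) coarse-graining idea on which MSEσ2-type measures are built, paired with a plain sample entropy estimator; it is not the authors' refined RMSEσ2 algorithm, and the tolerance and embedding settings are illustrative.

```python
# Sketch: variance-based coarse-graining followed by ordinary sample entropy.
import numpy as np

def sample_entropy(x, m=2, r_factor=0.15):
    """Plain sample entropy with tolerance r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()
    n = len(x)
    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(n - mm)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count
    b, a = count_matches(m), count_matches(m + 1)
    return np.inf if a == 0 or b == 0 else -np.log(a / b)

def coarse_grain_variance(x, scale):
    """Second-moment coarse-graining: variance in non-overlapping windows."""
    n = (len(x) // scale) * scale
    return np.var(np.reshape(x[:n], (-1, scale)), axis=1, ddof=1)

rng = np.random.default_rng(1)
noise = rng.standard_normal(5000)
for s in (2, 5, 10):
    print(s, sample_entropy(coarse_grain_variance(noise, s)))
```

At large scale factors the coarse-grained series becomes short, which is exactly where undefined or unreliable entropy estimates appear and where a refined estimator is most needed.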
What do we gain from simplicity versus complexity in species distribution models?
Merow, Cory; Smith, Matthew J.; Edwards, Thomas C.; Guisan, Antoine; McMahon, Sean M.; Normand, Signe; Thuiller, Wilfried; Wuest, Rafael O.; Zimmermann, Niklaus E.; Elith, Jane
2014-01-01
Species distribution models (SDMs) are widely used to explain and predict species ranges and environmental niches. They are most commonly constructed by inferring species' occurrence–environment relationships using statistical and machine-learning methods. The variety of methods that can be used to construct SDMs (e.g. generalized linear/additive models, tree-based models, maximum entropy, etc.), and the variety of ways that such models can be implemented, permits substantial flexibility in SDM complexity. Building models with an appropriate amount of complexity for the study objectives is critical for robust inference. We characterize complexity as the shape of the inferred occurrence–environment relationships and the number of parameters used to describe them, and search for insights into whether additional complexity is informative or superfluous. By building ‘under fit’ models, having insufficient flexibility to describe observed occurrence–environment relationships, we risk misunderstanding the factors shaping species distributions. By building ‘over fit’ models, with excessive flexibility, we risk inadvertently ascribing pattern to noise or building opaque models. However, model selection can be challenging, especially when comparing models constructed under different modeling approaches. Here we argue for a more pragmatic approach: researchers should constrain the complexity of their models based on study objective, attributes of the data, and an understanding of how these interact with the underlying biological processes. We discuss guidelines for balancing under fitting with over fitting and consequently how complexity affects decisions made during model building. Although some generalities are possible, our discussion reflects differences in opinions that favor simpler versus more complex models. We conclude that combining insights from both simple and complex SDM building approaches best advances our knowledge of current and future species ranges.
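As a toy illustration of the under-/over-fitting trade-off discussed above, the sketch below fits a deliberately simple linear-term logistic model and a much more flexible boosted-tree model to simulated occurrence data with a unimodal temperature response; the data, covariates, and effect sizes are hypothetical.

```python
# Sketch: contrasting a simple and a flexible occurrence-environment model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 2000
temp = rng.normal(15, 5, n)           # hypothetical environmental covariates
precip = rng.normal(800, 200, n)
# True occurrence probability: unimodal response to temperature, weak precip effect.
logit = -0.5 * (temp - 18) ** 2 / 10 + 0.002 * (precip - 800)
occurrence = rng.random(n) < 1 / (1 + np.exp(-logit))
X = np.column_stack([temp, precip])

simple = LogisticRegression(max_iter=1000)                # linear terms only
flexible = GradientBoostingClassifier(random_state=0)     # many implicit parameters
for name, model in [("simple GLM", simple), ("boosted trees", flexible)]:
    auc = cross_val_score(model, X, occurrence, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUC = {auc:.3f}")
```

On data like these the linear model under-fits the unimodal response while the flexible model recovers it; with sparser or noisier data the ranking can reverse, which is the crux of the argument for matching model complexity to the study objective and the data.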
Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models
2015-09-12
AFRL-AFOSR-VA-TR-2015-0278: Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models. Katya Scheinberg. Grant FA9550-11-1-0239. Subject terms: optimization, derivative-free optimization, statistical machine learning.
General Aviation Avionics Statistics : 1975
DOT National Transportation Integrated Search
1978-06-01
This report presents avionics statistics for the 1975 general aviation (GA) aircraft fleet and updates a previous publication, General Aviation Avionics Statistics: 1974. The statistics are presented in a capability group framework which enables one ...
Empirical support for the definition of a complex trauma event in children and adolescents.
Wamser-Nanney, Rachel; Vandenberg, Brian R
2013-12-01
Complex trauma events have been defined as chronic, interpersonal traumas that begin early in life (Cook, Blaustein, Spinazzola, & van der Kolk, 2003). The complex trauma definition has been examined in adults, as indicated by the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV) field trial; however, this research was lacking in child populations. The symptom presentations of complexly traumatized children were contrasted with those exposed to other, less severe trauma ecologies that met 1 or 2 features of the complex trauma definition. Included in this study were 346 treatment-seeking children and adolescents (ages 3–18 years) who had experienced a traumatic event. Results indicated that child survivors of complex trauma presented with higher levels of generalized behavior problems and trauma-related symptoms than those who experienced (a) acute noninterpersonal trauma, (b) chronic interpersonal trauma that begins later in life, and (c) acute interpersonal trauma. Greater levels of behavioral problems were observed in children exposed to complex trauma as compared to those who experienced a traumatic event that begins early in life. These results provide support for the complex trauma event definition and suggest the need for a complex trauma diagnostic construct for children and adolescents.
NASA Astrophysics Data System (ADS)
Challet, Damien; Marsili, M.; Ottino, Gabriele
2004-02-01
We mathematize the El Farol bar problem and transform it into a workable model. We find general conditions on the predictor space under which the convergence of the average attendance to the resource level does not require any intelligence on the side of the agents. Secondly, specializing to a particular ensemble of continuous strategies yields a model similar to the Minority Game. The statistical physics of disordered systems allows us to derive a complete understanding of the complex behavior of this model on the basis of its phase diagram.
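The sketch below is a minimal, illustrative El Farol-style simulation in the spirit of the abstract (not the authors' specific ensemble of continuous strategies): each agent keeps a few fixed linear predictors of the next attendance, trusts the one with the best track record so far, and attends only if that prediction is below the resource level. All parameters are illustrative.

```python
# Illustrative El Farol-style simulation with best-predictor agents.
import numpy as np

rng = np.random.default_rng(3)
N, L, T, K, M = 100, 60, 500, 4, 3     # agents, resource level, steps, predictors, memory
coeffs = rng.uniform(-1, 1, size=(N, K, M))   # predictor weights on last M attendances
bias = rng.uniform(0, N, size=(N, K))
scores = np.zeros((N, K))                     # cumulative squared prediction errors
history = list(rng.integers(0, N, size=M))

attendance = []
for t in range(T):
    recent = np.array(history[-M:])
    preds = np.einsum("akm,m->ak", coeffs, recent) + bias   # (N, K) predictions
    best = scores.argmin(axis=1)                            # each agent's best predictor
    my_pred = preds[np.arange(N), best]
    a_t = int(np.sum(my_pred < L))                          # attend if predicted room
    scores += (preds - a_t) ** 2                            # update all predictors
    history.append(a_t)
    attendance.append(a_t)

print("mean attendance over last 100 steps:", np.mean(attendance[-100:]))
```

In simulations of this kind the average attendance typically settles near the resource level without any sophisticated reasoning by individual agents, which is the regime the paper characterizes analytically.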
Fatigue criterion to system design, life and reliability
NASA Technical Reports Server (NTRS)
Zaretsky, E. V.
1985-01-01
A generalized methodology for structural life prediction, design, and reliability based upon a fatigue criterion is advanced. The life prediction methodology is based in part on the work of W. Weibull and of G. Lundberg and A. Palmgren. The approach incorporates the computed life of elemental stress volumes of a complex machine element to predict system life. The results of coupon fatigue testing can be incorporated into the analysis, allowing life and component or structural renewal rates to be predicted with reasonable statistical certainty.
Properties of a memory network in psychology
NASA Astrophysics Data System (ADS)
Wedemann, Roseli S.; Donangelo, Raul; de Carvalho, Luís A. V.
2007-12-01
We have previously described neurotic psychopathology and psychoanalytic working-through by an associative memory mechanism, based on a neural network model, where memory was modelled by a Boltzmann machine (BM). Since brain neural topology is selectively structured, we simulated known microscopic mechanisms that control synaptic properties, showing that the network self-organizes to a hierarchical, clustered structure. Here, we show some statistical mechanical properties of the complex networks which result from this self-organization. They indicate that a generalization of the BM may be necessary to model memory.
NASA Astrophysics Data System (ADS)
Ross, Z. E.; Ben-Zion, Y.; Zhu, L.
2015-02-01
We analyse source tensor properties of seven Mw > 4.2 earthquakes in the complex trifurcation area of the San Jacinto Fault Zone, CA, with a focus on isotropic radiation that may be produced by rock damage in the source volumes. The earthquake mechanisms are derived with generalized `Cut and Paste' (gCAP) inversions of three-component waveforms typically recorded by >70 stations at regional distances. The gCAP method includes parameters ζ and χ representing, respectively, the relative strength of the isotropic and CLVD source terms. The possible errors in the isotropic and CLVD components due to station variability are quantified with bootstrap resampling for each event. The results indicate statistically significant explosive isotropic components for at least six of the events, corresponding to ˜0.4-8 per cent of the total potency/moment of the sources. In contrast, the CLVD components for most events are not found to be statistically significant. Trade-off and correlation between the isotropic and CLVD components are studied using synthetic tests with realistic station configurations. The associated uncertainties are found to be generally smaller than the observed isotropic components. Two different tests with velocity model perturbation are conducted to quantify the uncertainty due to inaccuracies in the Green's functions. Applications of the Mann-Whitney U test indicate statistically significant explosive isotropic terms for most events consistent with brittle damage production at the source.
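The two statistical ingredients mentioned above, bootstrap resampling over stations and the Mann-Whitney U test, can be sketched as follows; all numbers are hypothetical placeholders rather than values from the study.

```python
# Sketch: bootstrap uncertainty for a per-event isotropic fraction, and a
# Mann-Whitney U test of observed isotropic fractions against synthetic (null) tests.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(4)
station_iso = rng.normal(0.04, 0.02, size=70)   # hypothetical per-station estimates
boot = [rng.choice(station_iso, size=station_iso.size, replace=True).mean()
        for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"bootstrap 95% interval for isotropic fraction: [{lo:.3f}, {hi:.3f}]")

observed = rng.normal(0.04, 0.02, size=7)        # hypothetical: seven events
null_tests = rng.normal(0.00, 0.01, size=50)     # hypothetical synthetic-test values
stat, p = mannwhitneyu(observed, null_tests, alternative="greater")
print(f"Mann-Whitney U = {stat:.1f}, one-sided p = {p:.4f}")
```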
General Aviation Avionics Statistics : 1976
DOT National Transportation Integrated Search
1979-11-01
This report presents avionics statistics for the 1976 general aviation (GA) aircraft fleet and is the third in a series titled "General Aviation Avionics Statistics." The statistics are presented in a capability group framework which enables one to r...
General Aviation Avionics Statistics : 1978 Data
DOT National Transportation Integrated Search
1980-12-01
The report presents avionics statistics for the 1978 general aviation (GA) aircraft fleet and is the fifth in a series titled "General Aviation Statistics." The statistics are presented in a capability group framework which enables one to relate airb...
General Aviation Avionics Statistics : 1979 Data
DOT National Transportation Integrated Search
1981-04-01
This report presents avionics statistics for the 1979 general aviation (GA) aircraft fleet and is the sixth in a series titled General Aviation Avionics Statistics. The statistics are presented in a capability group framework which enables one to relate...
Gu, Ming-liang; Chu, Jia-you
2007-12-01
The human genome has haplotype and haplotype block structures, which provide valuable information on human evolutionary history and may lead to the development of more efficient strategies to identify genetic variants that increase susceptibility to complex diseases. The genome can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of "tag SNPs" can be used to distinguish a large fraction of the haplotypes. These tag SNPs can be potentially useful for the construction of haplotypes and haplotype blocks, and for association studies in complex diseases. There are two general classes of methods for constructing haplotypes and haplotype blocks: those based on genotyping large pedigrees and those based on statistical algorithms. The authors evaluate several construction methods to assess the power of different association tests under a variety of disease models and block-partitioning criteria. The advantages, limitations, and applications of each method in association studies are discussed. With the completion of the HapMap and the development of statistical algorithms for haplotype reconstruction, approaches to haplotype construction that combine mathematics, physics, and computer science will have profound impacts on population genetics, on the localization and cloning of susceptibility genes for complex diseases, and on related areas of the life sciences.
Complex Network Analysis for Characterizing Global Value Chains in Equipment Manufacturing
Meng, Bo; Cheng, Lihong
2017-01-01
The rise of global value chains (GVCs), characterized by so-called “outsourcing”, “fragmentation production”, and “trade in tasks”, has been considered one of the most important phenomena of 21st-century trade. GVCs can also play a decisive role in trade policy making. However, due to the increasing complexity and sophistication of international production networks, especially in the equipment manufacturing industry, conventional trade statistics and the corresponding trade indicators may give us a distorted picture of trade. This paper applies various network analysis tools to the new GVC accounting system proposed by Koopman et al. (2014) and Wang et al. (2013), in which gross exports can be decomposed into value-added terms through various routes along GVCs. This helps divide the equipment manufacturing-related GVCs into sub-networks that can be clearly visualized. The empirical results of this paper significantly improve our understanding of the topology of equipment manufacturing-related GVCs, as well as of the interdependency of countries in these GVCs, which is generally invisible from traditional trade statistics. PMID:28081201
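A minimal sketch of the kind of network representation used here: value-added flows between countries become weighted directed edges, and standard centrality measures summarize each country's position in the chain. The countries and flow values below are hypothetical, not the paper's data.

```python
# Sketch: a toy value-added trade network and two simple network statistics.
import networkx as nx

flows = [  # (source, destination, value added, arbitrary units) -- hypothetical
    ("CHN", "USA", 120.0), ("JPN", "CHN", 80.0), ("KOR", "CHN", 60.0),
    ("DEU", "USA", 70.0), ("CHN", "DEU", 40.0), ("USA", "MEX", 30.0),
]
G = nx.DiGraph()
G.add_weighted_edges_from(flows)

print("weighted in-degree:", dict(G.in_degree(weight="weight")))
print("betweenness:", nx.betweenness_centrality(G, weight="weight"))
```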
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-01-01
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell. DOI: http://dx.doi.org/10.7554/eLife.26580.001 PMID:28678007
Young, Katelyn A; Lane, Samantha M; Widger, John E; Neuhaus, Nina M; Dove, James T; Fluck, Marcus; Hunsinger, Marie A; Blansfield, Joseph A; Shabahang, Mohsen M
Characterize the concordance among faculty and resident perceptions of surgical case complexity, resident technical performance, and autonomy in a diverse sample of general surgery procedures using case-specific evaluations. A prospective study was conducted in which a faculty surgeon and surgical resident independently completed a postoperative assessment examining case complexity, resident operative performance (Milestone assessment) and autonomy (Zwisch model). Pearson correlation coefficients (r) reaching statistical significance (p < 0.05) were further classified as moderate (r ≥ 0.40), strong (r ≥ 0.60), or very strong (r ≥ 0.80). This study was conducted in the General Surgery Residency Program at an academic tertiary care facility (Geisinger Medical Center, Danville, PA). Participants included 6 faculty surgeons, in addition to 5 postgraduate year (PGY) 1, 6 midlevel (PGY 2-3), and 4 chief (PGY 4-5) residents. In total, 75 surgical cases were analyzed. Midlevel residents accounted for the highest number of cases (35, 46.6%). Overall, faculty and resident perceptions of case complexity demonstrated a strong correlation (r = 0.76, p < 0.0001). Technical performance scores were also strongly correlated (r = 0.66, p < 0.0001), whereas perceptions of autonomy demonstrated a moderate correlation (r = 0.56, p < 0.0001). Subgroup analysis revealed very strong correlations among faculty perceptions of case complexity and the perceptions of PGY 1 (r = 0.80, p < 0.0001) and chief residents (r = 0.82, p < 0.0001). All other intergroup correlations were strong with 2 notable exceptions as follows: midlevel and chief residents failed to correlate with faculty perceptions of autonomy and operative performance, respectively. General surgery residents generally demonstrated high correlations with faculty perceptions of case complexity, technical performance, and operative autonomy. This generalized accord supports the use of the Milestone and Zwisch assessments in residency programs. However, discordance among perceptions of midlevel resident autonomy and chief resident operative performance suggests that these trainees may need more direct communication from the faculty. Copyright © 2017 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Statistical Features of Complex Systems ---Toward Establishing Sociological Physics---
NASA Astrophysics Data System (ADS)
Kobayashi, Naoki; Kuninaka, Hiroto; Wakita, Jun-ichi; Matsushita, Mitsugu
2011-07-01
Complex systems have recently attracted much attention, both in natural sciences and in sociological sciences. Members constituting a complex system evolve through nonlinear interactions among each other. This means that in a complex system the multiplicative experience or, so to speak, the history of each member produces its present characteristics. If attention is paid to any statistical property in any complex system, the lognormal distribution is the most natural and appropriate among the standard or ``normal'' statistics to overview the whole system. In fact, the lognormality emerges rather conspicuously when we examine, as familiar and typical examples of statistical aspects in complex systems, the nursing-care period for the aged, populations of prefectures and municipalities, and our body height and weight. Many other examples are found in nature and society. On the basis of these observations, we discuss the possibility of sociological physics.
Statistical mechanics of complex economies
NASA Astrophysics Data System (ADS)
Bardoscia, Marco; Livan, Giacomo; Marsili, Matteo
2017-04-01
In the pursuit of ever increasing efficiency and growth, our economies have evolved to remarkable degrees of complexity, with nested production processes feeding each other in order to create products of greater sophistication from less sophisticated ones, down to raw materials. The engine of such an expansion has been competitive markets that, according to general equilibrium theory (GET), achieve efficient allocations under specific conditions. We study large random economies within the GET framework, as templates of complex economies, and we find that a non-trivial phase transition occurs: the economy freezes in a state where all production processes collapse when either the number of primary goods or the number of available technologies falls below a critical threshold. As in other examples of phase transitions in large random systems, this is an unintended consequence of the growth in complexity. Our findings suggest that the Industrial Revolution can be regarded as a sharp transition between different phases, but also imply that well developed economies can collapse if too many intermediate goods are introduced.
Soave, David; Sun, Lei
2017-09-01
We generalize Levene's test for variance (scale) heterogeneity between k groups to more complex data, where there is sample correlation and group membership uncertainty. Following a two-stage regression framework, we show that least absolute deviation regression must be used in the stage 1 analysis to ensure a correct asymptotic χ²_{k-1}/(k-1) distribution of the generalized scale (gS) test statistic. We then show that the proposed gS test is independent of the generalized location test, under the joint null hypothesis of no mean and no variance heterogeneity. Consequently, we generalize the recently proposed joint location-scale (gJLS) test, valuable in settings where there is an interaction effect but one interacting variable is not available. We evaluate the proposed method via an extensive simulation study and two genetic association application studies. © 2017 The Authors Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.
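The two-stage logic can be sketched generically as follows (this is the textbook Levene/Brown-Forsythe flavour of the idea applied to simulated data, not the authors' gS/gJLS implementation for correlated samples and genotype uncertainty): stage 1 fits a least absolute deviation (median) regression for the location model, and stage 2 tests whether the absolute residuals differ across groups.

```python
# Sketch: two-stage scale-heterogeneity test with an LAD stage-1 regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import f_oneway

rng = np.random.default_rng(5)
n = 900
g = rng.integers(0, 3, n)                       # 3 groups (e.g., genotypes 0/1/2)
age = rng.normal(50, 10, n)                     # a covariate
y = 0.05 * age + rng.normal(0, 1.0 + 0.3 * g)   # residual variance increases with group
df = pd.DataFrame({"y": y, "g": g, "age": age})

# Stage 1: LAD (median) regression of the trait on the mean-model covariates.
lad = smf.quantreg("y ~ age + C(g)", df).fit(q=0.5)
df["d"] = np.abs(df["y"] - lad.predict(df))

# Stage 2: test scale heterogeneity via a one-way comparison of absolute residuals.
stat, p = f_oneway(*[df.loc[df.g == k, "d"] for k in range(3)])
print(f"scale-test F = {stat:.2f}, p = {p:.3g}")
```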
Ecogeographic Genetic Epidemiology
Sloan, Chantel D.; Duell, Eric J.; Shi, Xun; Irwin, Rebecca; Andrew, Angeline S.; Williams, Scott M.; Moore, Jason H.
2009-01-01
Complex diseases such as cancer and heart disease result from interactions between an individual's genetics and environment, i.e. their human ecology. Rates of complex diseases have consistently demonstrated geographic patterns of incidence, or spatial “clusters” of increased incidence relative to the general population. Likewise, genetic subpopulations and environmental influences are not evenly distributed across space. Merging appropriate methods from genetic epidemiology, ecology and geography will provide a more complete understanding of the spatial interactions between genetics and environment that result in spatial patterning of disease rates. Geographic Information Systems (GIS), which are tools designed specifically for dealing with geographic data and performing spatial analyses to determine their relationship, are key to this kind of data integration. Here the authors introduce a new interdisciplinary paradigm, ecogeographic genetic epidemiology, which uses GIS and spatial statistical analyses to layer genetic subpopulation and environmental data with disease rates and thereby discern the complex gene-environment interactions which result in spatial patterns of incidence. PMID:19025788
Weck, P J; Schaffner, D A; Brown, M R; Wicks, R T
2015-02-01
The Bandt-Pompe permutation entropy and the Jensen-Shannon statistical complexity are used to analyze fluctuating time series of three different turbulent plasmas: the magnetohydrodynamic (MHD) turbulence in the plasma wind tunnel of the Swarthmore Spheromak Experiment (SSX), drift-wave turbulence of ion saturation current fluctuations in the edge of the Large Plasma Device (LAPD), and fully developed turbulent magnetic fluctuations of the solar wind taken from the Wind spacecraft. The entropy and complexity values are presented as coordinates on the CH plane for comparison among the different plasma environments and other fluctuation models. The solar wind is found to have the highest permutation entropy and lowest statistical complexity of the three data sets analyzed. Both laboratory data sets have larger values of statistical complexity, suggesting that these systems have fewer degrees of freedom in their fluctuations, with SSX magnetic fluctuations having slightly less complexity than the LAPD edge I(sat). The CH plane coordinates are compared to the shape and distribution of a spectral decomposition of the wave forms. These results suggest that fully developed turbulence (solar wind) occupies the lower-right region of the CH plane, and that other plasma systems considered to be turbulent have less permutation entropy and more statistical complexity. This paper presents use of this statistical analysis tool on solar wind plasma, as well as on an MHD turbulent experimental plasma.
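For readers who want to reproduce CH-plane coordinates on their own data, the sketch below computes the Bandt-Pompe ordinal distribution, a normalized permutation entropy H, and a Jensen-Shannon statistical complexity C under one common normalization convention; the embedding dimension, delay, and exact normalization should be checked against the definitions used in the paper before any quantitative comparison.

```python
# Sketch: permutation entropy H and Jensen-Shannon statistical complexity C.
import numpy as np
from itertools import permutations
from math import log

def ordinal_distribution(x, d=5, tau=1):
    patterns = {p: 0 for p in permutations(range(d))}
    for i in range(len(x) - (d - 1) * tau):
        window = x[i:i + d * tau:tau]
        patterns[tuple(int(k) for k in np.argsort(window))] += 1
    p = np.array(list(patterns.values()), dtype=float)
    return p / p.sum()

def entropy_complexity(p):
    N = len(p)
    S = lambda q: -np.sum(q[q > 0] * np.log(q[q > 0]))
    H = S(p) / log(N)                              # normalized permutation entropy
    pe = np.full(N, 1.0 / N)                       # uniform reference distribution
    jsd = S((p + pe) / 2) - S(p) / 2 - S(pe) / 2   # Jensen-Shannon divergence
    q0 = -2.0 / (((N + 1) / N) * log(N + 1) - 2 * log(2 * N) + log(N))
    return H, q0 * jsd * H                         # (H, C)

rng = np.random.default_rng(6)
for name, series in [("white noise", rng.standard_normal(20000)),
                     ("sine", np.sin(0.05 * np.arange(20000)))]:
    H, C = entropy_complexity(ordinal_distribution(series, d=5))
    print(f"{name}: H = {H:.3f}, C = {C:.3f}")
```

White noise lands near the high-entropy, low-complexity corner of the CH plane, while regular signals sit at low entropy, mirroring the ordering of solar wind versus laboratory plasmas reported above.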
A statistical shape model of the human second cervical vertebra.
Clogenson, Marine; Duff, John M; Luethi, Marcel; Levivier, Marc; Meuli, Reto; Baur, Charles; Henein, Simon
2015-07-01
Statistical shape and appearance models play an important role in reducing the segmentation processing time of a vertebra and in improving results for 3D model development. Here, we describe the different steps in generating a statistical shape model (SSM) of the second cervical vertebra (C2) and provide the shape model for general use by the scientific community. The main difficulties in its construction are the morphological complexity of the C2 and its variability in the population. The input dataset is composed of manually segmented anonymized patient computerized tomography (CT) scans. The alignment of the different datasets is done with the procrustes alignment on surface models, and then, the registration is cast as a model-fitting problem using a Gaussian process. A principal component analysis (PCA)-based model is generated which includes the variability of the C2. The SSM was generated using 92 CT scans. The resulting SSM was evaluated for specificity, compactness and generalization ability. The SSM of the C2 is freely available to the scientific community in Slicer (an open source software for image analysis and scientific visualization) with a module created to visualize the SSM using Statismo, a framework for statistical shape modeling. The SSM of the vertebra allows the shape variability of the C2 to be represented. Moreover, the SSM will enable semi-automatic segmentation and 3D model generation of the vertebra, which would greatly benefit surgery planning.
The distribution of density in supersonic turbulence
NASA Astrophysics Data System (ADS)
Squire, Jonathan; Hopkins, Philip F.
2017-11-01
We propose a model for the statistics of the mass density in supersonic turbulence, which plays a crucial role in star formation and the physics of the interstellar medium (ISM). The model is derived by considering the density to be arranged as a collection of strong shocks of width ˜ M^{-2}, where M is the turbulent Mach number. With two physically motivated parameters, the model predicts all density statistics for M>1 turbulence: the density probability distribution and its intermittency (deviation from lognormality), the density variance-Mach number relation, power spectra and structure functions. For the proposed model parameters, reasonable agreement is seen between model predictions and numerical simulations, albeit within the large uncertainties associated with current simulation results. More generally, the model could provide a useful framework for more detailed analysis of future simulations and observational data. Due to the simple physical motivations for the model in terms of shocks, it is straightforward to generalize to more complex physical processes, which will be helpful in future more detailed applications to the ISM. We see good qualitative agreement between such extensions and recent simulations of non-isothermal turbulence.
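As a reference point, the widely used lognormal model that the shock-based model generalizes can be written down in a few lines: the log density s = ln(ρ/ρ0) is taken to be Gaussian with variance σ_s² = ln(1 + b²M²) and mean −σ_s²/2, so that the mean density equals ρ0. The forcing parameter b below is an illustrative choice, not a value from the paper.

```python
# Sketch of the standard lognormal density PDF and variance-Mach relation.
import numpy as np

def lognormal_density_pdf(s, mach, b=0.5):
    var = np.log(1.0 + (b * mach) ** 2)     # sigma_s^2 = ln(1 + b^2 M^2)
    mean = -0.5 * var                       # enforces <rho> = rho0
    return np.exp(-(s - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

s = np.linspace(-6, 4, 11)
print(lognormal_density_pdf(s, mach=5.0))
```

The proposed shock-based model departs from this Gaussian form in its tails (intermittency), which is where the comparison with simulations is most informative.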
Analysis of swarm behaviors based on an inversion of the fluctuation theorem.
Hamann, Heiko; Schmickl, Thomas; Crailsheim, Karl
2014-01-01
A grand challenge in the field of artificial life is to find a general theory of emergent self-organizing systems. In swarm systems most of the observed complexity is based on motion of simple entities. Similarly, statistical mechanics focuses on collective properties induced by the motion of many interacting particles. In this article we apply methods from statistical mechanics to swarm systems. We try to explain the emergent behavior of a simulated swarm by applying methods based on the fluctuation theorem. Empirical results indicate that swarms are able to produce negative entropy within an isolated subsystem due to frozen accidents. Individuals of a swarm are able to locally detect fluctuations of the global entropy measure and store them, if they are negative entropy productions. By accumulating these stored fluctuations over time the swarm as a whole is producing negative entropy and the system ends up in an ordered state. We claim that this indicates the existence of an inverted fluctuation theorem for emergent self-organizing dissipative systems. This approach bears the potential of general applicability.
Thermodynamic limit of random partitions and dispersionless Toda hierarchy
NASA Astrophysics Data System (ADS)
Takasaki, Kanehisa; Nakatsu, Toshio
2012-01-01
We study the thermodynamic limit of random partition models for the instanton sum of 4D and 5D supersymmetric U(1) gauge theories deformed by some physical observables. The physical observables correspond to external potentials in the statistical model. The partition function is reformulated in terms of the density function of Maya diagrams. The thermodynamic limit is governed by a limit shape of Young diagrams associated with dominant terms in the partition function. The limit shape is characterized by a variational problem, which is further converted to a scalar-valued Riemann-Hilbert problem. This Riemann-Hilbert problem is solved with the aid of a complex curve, which may be thought of as the Seiberg-Witten curve of the deformed U(1) gauge theory. This solution of the Riemann-Hilbert problem is identified with a special solution of the dispersionless Toda hierarchy that satisfies a pair of generalized string equations. The generalized string equations for the 5D gauge theory are shown to be related to hidden symmetries of the statistical model. The prepotential and the Seiberg-Witten differential are also considered.
NASA Astrophysics Data System (ADS)
Suess, Daniel; Rudnicki, Łukasz; Maciel, Thiago O.; Gross, David
2017-09-01
The outcomes of quantum mechanical measurements are inherently random. It is therefore necessary to develop stringent methods for quantifying the degree of statistical uncertainty about the results of quantum experiments. For the particularly relevant task of quantum state tomography, it has been shown that a significant reduction in uncertainty can be achieved by taking the positivity of quantum states into account. However—the large number of partial results and heuristics notwithstanding—no efficient general algorithm is known that produces an optimal uncertainty region from experimental data, while making use of the prior constraint of positivity. Here, we provide a precise formulation of this problem and show that the general case is NP-hard. Our result leaves room for the existence of efficient approximate solutions, and therefore does not in itself imply that the practical task of quantum uncertainty quantification is intractable. However, it does show that there exists a non-trivial trade-off between optimality and computational efficiency for error regions. We prove two versions of the result: one for frequentist and one for Bayesian statistics.
Self-learning Monte Carlo method and cumulative update in fermion systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Junwei; Shen, Huitao; Qi, Yang
2017-06-07
In this study, we develop the self-learning Monte Carlo (SLMC) method, a general-purpose numerical method recently introduced to simulate many-body systems, for studying interacting fermion systems. Our method uses a highly efficient update algorithm, which we design and dub “cumulative update”, to generate new candidate configurations in the Markov chain based on a self-learned bosonic effective model. From a general analysis and a numerical study of the double exchange model as an example, we find that the SLMC with cumulative update drastically reduces the computational cost of the simulation, while remaining statistically exact. Remarkably, its computational complexity is far less than the conventional algorithm with local updates.
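The abstract's two ingredients, a self-learned effective model and a cumulative update accepted against the original model, can be illustrated on a classical toy problem. The sketch below is not the paper's fermionic implementation: it replaces the expensive model with a long-range Ising chain, learns a nearest-neighbour effective coupling by least squares, proposes several cheap sweeps under the effective model, and corrects with a single Metropolis accept/reject.

```python
# Toy sketch of the self-learning Monte Carlo idea with a cumulative update.
import numpy as np

rng = np.random.default_rng(7)
L_sites, beta = 32, 0.5

def true_energy(s):          # "expensive" model: decaying long-range couplings
    return -sum((1.0 / r**2) * np.sum(s * np.roll(s, r)) for r in range(1, L_sites // 2))

def eff_energy(s, j1):       # cheap effective model: nearest neighbours only
    return -j1 * np.sum(s * np.roll(s, 1))

def metropolis_sweep(s, energy):
    for i in rng.permutation(L_sites):
        s_new = s.copy(); s_new[i] *= -1
        d = energy(s_new) - energy(s)
        if d <= 0 or rng.random() < np.exp(-beta * d):
            s = s_new
    return s

# Step 1: learn j1 from samples of the expensive model (simple least squares fit).
s = rng.choice([-1, 1], L_sites)
samples = []
for _ in range(200):
    s = metropolis_sweep(s, true_energy)
    samples.append(s.copy())
feat = np.array([np.sum(c * np.roll(c, 1)) for c in samples])
targ = np.array([true_energy(c) for c in samples])
j1 = -np.linalg.lstsq(feat[:, None], targ, rcond=None)[0][0]

# Step 2: cumulative update -- several cheap sweeps under the effective model,
# accepted once with a ratio that corrects for the model mismatch.
accepted = 0
for _ in range(100):
    prop = s.copy()
    for _ in range(5):
        prop = metropolis_sweep(prop, lambda c: eff_energy(c, j1))
    d = (true_energy(prop) - true_energy(s)) - (eff_energy(prop, j1) - eff_energy(s, j1))
    if d <= 0 or rng.random() < np.exp(-beta * d):
        s, accepted = prop, accepted + 1
print("cumulative-update acceptance rate:", accepted / 100)
```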
The Performance Analysis Based on SAR Sample Covariance Matrix
Erten, Esra
2012-01-01
Multi-channel systems appear in several fields of application in science. In the Synthetic Aperture Radar (SAR) context, multi-channel systems may refer to different domains, such as multi-polarization, multi-interferometric or multi-temporal data, or even a combination of them. Due to the inherent speckle phenomenon present in SAR images, a statistical description of the data is almost mandatory for its utilization. The complex images acquired over natural media present in general zero-mean circular Gaussian characteristics. In this case, second-order statistics such as the multi-channel covariance matrix fully describe the data. In practical situations, however, the covariance matrix has to be estimated using a limited number of samples, and this sample covariance matrix follows the complex Wishart distribution. In this context, the eigendecomposition of the multi-channel covariance matrix has been shown to be of high relevance in different areas regarding the physical properties of the imaged scene. Specifically, the maximum eigenvalue of the covariance matrix has been frequently used in different applications such as target or change detection, estimation of the dominant scattering mechanism in polarimetric data, moving target indication, etc. In this paper, the statistical behavior of the maximum eigenvalue derived from the eigendecomposition of the sample multi-channel covariance matrix of multi-channel SAR images is presented in a simplified form for the SAR community. Validation is performed against simulated data, and examples of estimation and detection problems using the analytical expressions are also given. PMID:22736976
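A small Monte Carlo sketch of the quantity under study: the largest eigenvalue of an L-look sample covariance matrix formed from zero-mean circular complex Gaussian vectors. The number of channels, looks, and the true covariance below are illustrative.

```python
# Sketch: sampling distribution of the largest eigenvalue of a complex sample
# covariance matrix (L-look estimate from circular complex Gaussian data).
import numpy as np

rng = np.random.default_rng(8)
p, looks, trials = 3, 16, 2000                        # channels, looks, repeats
true_cov = np.diag([1.0, 0.5, 0.2]).astype(complex)   # hypothetical channel covariance
chol = np.linalg.cholesky(true_cov)

max_eigs = np.empty(trials)
for t in range(trials):
    z = (rng.standard_normal((p, looks)) + 1j * rng.standard_normal((p, looks))) / np.sqrt(2)
    x = chol @ z                                      # correlated complex samples
    sample_cov = x @ x.conj().T / looks               # sample covariance estimate
    max_eigs[t] = np.linalg.eigvalsh(sample_cov)[-1]  # largest eigenvalue
print("mean / std of largest eigenvalue:", max_eigs.mean(), max_eigs.std())
```

The spread of the simulated eigenvalues illustrates the finite-look bias and variability that the analytical expressions in the paper are meant to characterize.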
Rinaldi, Antonio
2011-04-01
Traditional fiber bundle models (FBMs) have been an effective tool to understand brittle heterogeneous systems. However, fiber bundles in modern nano- and bioapplications demand a new generation of FBMs capturing more complex deformation processes in addition to damage. In the context of loose bundle systems, and with reference to time-independent plasticity and soft biomaterials, we formulate a generalized statistical model for ductile fracture and nonlinear elastic problems capable of handling more simultaneous deformation mechanisms by means of two order parameters (as opposed to one). As the first rational FBM for coupled damage problems, it may be the cornerstone for advanced statistical models of heterogeneous systems in nanoscience and materials design, especially for exploring hierarchical and bio-inspired concepts in the arena of nanobiotechnology. Finally, applicative examples are provided for illustrative purposes, discussing issues in inverse analysis (i.e., a nonlinear elastic polymer fiber and arrays of ductile Cu submicron bars) and direct design (i.e., strength prediction).
A Flexible Approach for the Statistical Visualization of Ensemble Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potter, K.; Wilson, A.; Bremer, P.
2009-09-29
Scientists are increasingly moving towards ensemble data sets to explore relationships present in dynamic systems. Ensemble data sets combine spatio-temporal simulation results generated using multiple numerical models, sampled input conditions and perturbed parameters. While ensemble data sets are a powerful tool for mitigating uncertainty, they pose significant visualization and analysis challenges due to their complexity. We present a collection of overview and statistical displays linked through a high level of interactivity to provide a framework for gaining key scientific insight into the distribution of the simulation results as well as the uncertainty associated with the data. In contrast to methods that present large amounts of diverse information in a single display, we argue that combining multiple linked statistical displays yields a clearer presentation of the data and facilitates a greater level of visual data analysis. We demonstrate this approach using driving problems from climate modeling and meteorology and discuss generalizations to other fields.
Quantum-Like Bayesian Networks for Modeling Decision Making
Moreira, Catarina; Wichert, Andreas
2016-01-01
In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
Key statistical and analytical issues for evaluating treatment effects in periodontal research.
Tu, Yu-Kang; Gilthorpe, Mark S
2012-06-01
Statistics is an indispensable tool for evaluating treatment effects in clinical research. Due to the complexities of periodontal disease progression and data collection, statistical analyses for periodontal research have been a great challenge for both clinicians and statisticians. The aim of this article is to provide an overview of several basic, but important, statistical issues related to the evaluation of treatment effects and to clarify some common statistical misconceptions. Some of these issues are general, concerning many disciplines, and some are unique to periodontal research. We first discuss several statistical concepts that have sometimes been overlooked or misunderstood by periodontal researchers. For instance, decisions about whether to use the t-test or analysis of covariance, or whether to use parametric tests such as the t-test or its non-parametric counterpart, the Mann-Whitney U-test, have perplexed many periodontal researchers. We also describe more advanced methodological issues that have sometimes been overlooked by researchers. For instance, the phenomenon of regression to the mean is a fundamental issue to be considered when evaluating treatment effects, and collinearity amongst covariates is a conundrum that must be resolved when explaining and predicting treatment effects. Quick and easy solutions to these methodological and analytical issues are not always available in the literature, and careful statistical thinking is paramount when conducting useful and meaningful research. © 2012 John Wiley & Sons A/S.
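To make the t-test versus analysis-of-covariance decision concrete, the sketch below analyses the same simulated two-arm trial both ways: a t-test on change scores and an ANCOVA of the follow-up value adjusted for baseline. The outcome, effect size, and sample size are invented for illustration.

```python
# Sketch: t-test on change scores versus ANCOVA on simulated trial data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import ttest_ind

rng = np.random.default_rng(9)
n = 120
treat = np.repeat([0, 1], n // 2)
baseline = rng.normal(5.0, 1.0, n)                           # e.g., probing depth (mm)
followup = 0.7 * baseline + 1.5 - 0.5 * treat + rng.normal(0, 0.8, n)
df = pd.DataFrame({"treat": treat, "baseline": baseline,
                   "change": followup - baseline, "followup": followup})

t, p = ttest_ind(df.change[df.treat == 1], df.change[df.treat == 0])
print(f"t-test on change scores: t = {t:.2f}, p = {p:.4f}")

ancova = smf.ols("followup ~ baseline + treat", df).fit()
print(f"ANCOVA treatment effect = {ancova.params['treat']:.2f} mm, "
      f"p = {ancova.pvalues['treat']:.4f}")
```

Because the follow-up value regresses toward the mean, the ANCOVA estimate is typically more precise than the change-score comparison, which is one of the points the article emphasizes.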
Robust Strategy for Rocket Engine Health Monitoring
NASA Technical Reports Server (NTRS)
Santi, L. Michael
2001-01-01
Monitoring the health of rocket engine systems is essentially a two-phase process. The acquisition phase involves sensing physical conditions at selected locations, converting physical inputs to electrical signals, conditioning the signals as appropriate to establish scale or filter interference, and recording results in a form that is easy to interpret. The inference phase involves analysis of results from the acquisition phase, comparison of analysis results to established health measures, and assessment of health indications. A variety of analytical tools may be employed in the inference phase of health monitoring. These tools can be separated into three broad categories: statistical, rule based, and model based. Statistical methods can provide excellent comparative measures of engine operating health. They require well-characterized data from an ensemble of "typical" engines, or "golden" data from a specific test assumed to define the operating norm in order to establish reliable comparative measures. Statistical methods are generally suitable for real-time health monitoring because they do not deal with the physical complexities of engine operation. The utility of statistical methods in rocket engine health monitoring is hindered by practical limits on the quantity and quality of available data. This is due to the difficulty and high cost of data acquisition, the limited number of available test engines, and the problem of simulating flight conditions in ground test facilities. In addition, statistical methods incur a penalty for disregarding flow complexity and are therefore limited in their ability to define performance shift causality. Rule based methods infer the health state of the engine system based on comparison of individual measurements or combinations of measurements with defined health norms or rules. This does not mean that rule based methods are necessarily simple. Although binary yes-no health assessment can sometimes be established by relatively simple rules, the causality assignment needed for refined health monitoring often requires an exceptionally complex rule base involving complicated logical maps. Structuring the rule system to be clear and unambiguous can be difficult, and the expert input required to maintain a large logic network and associated rule base can be prohibitive.
Exponentiated power Lindley distribution.
Ashour, Samir K; Eltehiwy, Mahmoud A
2015-11-01
A new generalization of the Lindley distribution was recently proposed by Ghitany et al. [1], called the power Lindley distribution. Another generalization of the Lindley distribution was introduced by Nadarajah et al. [2], named the generalized Lindley distribution. This paper proposes a further generalization of the Lindley distribution, one that encompasses both. We refer to this new generalization as the exponentiated power Lindley distribution. The new distribution is important since it contains as special sub-models some widely known distributions in addition to the above two, such as the Lindley distribution among many others. It also provides more flexibility for analyzing complex real data sets. We study some statistical properties of the new distribution. We discuss maximum likelihood estimation of the distribution parameters. Least squares estimation is used to evaluate the parameters. Three algorithms are proposed for generating random data from the proposed distribution. An application of the model to a real data set is analyzed using the new distribution, which shows that the exponentiated power Lindley distribution can be used quite effectively in analyzing real lifetime data.
Statistical methods used in articles published by the Journal of Periodontal and Implant Science.
Choi, Eunsil; Lyu, Jiyoung; Park, Jinyoung; Kim, Hae-Young
2014-12-01
The purposes of this study were to assess the trend of use of statistical methods including parametric and nonparametric methods and to evaluate the use of complex statistical methodology in recent periodontal studies. This study analyzed 123 articles published in the Journal of Periodontal & Implant Science (JPIS) between 2010 and 2014. Frequencies and percentages were calculated according to the number of statistical methods used, the type of statistical method applied, and the type of statistical software used. Most of the published articles considered (64.4%) used statistical methods. Since 2011, the percentage of JPIS articles using statistics has increased. On the basis of multiple counting, we found that the percentage of studies in JPIS using parametric methods was 61.1%. Further, complex statistical methods were applied in only 6 of the published studies (5.0%), and nonparametric statistical methods were applied in 77 of the published studies (38.9% of a total of 198 studies considered). We found an increasing trend towards the application of statistical methods and nonparametric methods in recent periodontal studies and thus, concluded that increased use of complex statistical methodology might be preferred by the researchers in the fields of study covered by JPIS.
Power estimation using simulations for air pollution time-series studies.
Winquist, Andrea; Klein, Mitchel; Tolbert, Paige; Sarnat, Stefanie Ebelt
2012-09-20
Estimation of power to assess associations of interest can be challenging for time-series studies of the acute health effects of air pollution because there are two dimensions of sample size (time-series length and daily outcome counts), and because these studies often use generalized linear models to control for complex patterns of covariation between pollutants and time trends, meteorology and possibly other pollutants. In general, statistical software packages for power estimation rely on simplifying assumptions that may not adequately capture this complexity. Here we examine the impact of various factors affecting power using simulations, with comparison of power estimates obtained from simulations with those obtained using statistical software. Power was estimated for various analyses within a time-series study of air pollution and emergency department visits using simulations for specified scenarios. Mean daily emergency department visit counts, model parameter value estimates and daily values for air pollution and meteorological variables from actual data (8/1/98 to 7/31/99 in Atlanta) were used to generate simulated daily outcome counts with specified temporal associations with air pollutants and randomly generated error based on a Poisson distribution. Power was estimated by conducting analyses of the association between simulated daily outcome counts and air pollution in 2000 data sets for each scenario. Power estimates from simulations and statistical software (G*Power and PASS) were compared. In the simulation results, increasing time-series length and average daily outcome counts both increased power to a similar extent. Our results also illustrate the low power that can result from using outcomes with low daily counts or short time series, and the reduction in power that can accompany use of multipollutant models. Power estimates obtained using standard statistical software were very similar to those from the simulations when properly implemented; implementation, however, was not straightforward. These analyses demonstrate the similar impact on power of increasing time-series length versus increasing daily outcome counts, which has not previously been reported. Implementation of power software for these studies is discussed and guidance is provided.
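A stripped-down version of the simulation-based power calculation described above can be written in a few lines: simulate Poisson daily counts with a specified log-linear pollutant association and a crude seasonal term, refit the generalized linear model many times, and count how often the pollutant coefficient is significant. The effect size, outcome rate, and trend structure here are illustrative and far simpler than the covariate control described in the abstract.

```python
# Sketch: simulation-based power for a Poisson time-series association.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)

def estimate_power(n_days=365, mean_count=20, beta=0.01, n_sims=500, alpha=0.05):
    day = np.arange(n_days)
    season = 0.2 * np.sin(2 * np.pi * day / 365)              # crude time trend
    pollutant = 10 + 3 * np.sin(2 * np.pi * day / 365 + 1) + rng.normal(0, 2, n_days)
    X = sm.add_constant(np.column_stack([pollutant, season]))
    hits = 0
    for _ in range(n_sims):
        mu = mean_count * np.exp(beta * (pollutant - pollutant.mean()) + season)
        y = rng.poisson(mu)
        fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
        hits += fit.pvalues[1] < alpha                         # pollutant coefficient
    return hits / n_sims

print("power (1 year):", estimate_power())
print("power (2 years):", estimate_power(n_days=730))
```

Doubling the time-series length or the mean daily count in this sketch raises power in a similar way, which is the qualitative finding reported above.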
Meng, Xiang-He; Shen, Hui; Chen, Xiang-Ding; Xiao, Hong-Mei; Deng, Hong-Wen
2018-03-01
Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with diverse complex phenotypes and diseases, and provided tremendous opportunities for further analyses using summary association statistics. Recently, Pickrell et al. developed a robust method for causal inference using independent putative causal SNPs. However, this method may fail to infer the causal relationship between two phenotypes when only a limited number of independent putative causal SNPs identified. Here, we extended Pickrell's method to make it more applicable for the general situations. We extended the causal inference method by replacing the putative causal SNPs with the lead SNPs (the set of the most significant SNPs in each independent locus) and tested the performance of our extended method using both simulation and empirical data. Simulations suggested that when the same number of genetic variants is used, our extended method had similar distribution of test statistic under the null model as well as comparable power under the causal model compared with the original method by Pickrell et al. But in practice, our extended method would generally be more powerful because the number of independent lead SNPs was often larger than the number of independent putative causal SNPs. And including more SNPs, on the other hand, would not cause more false positives. By applying our extended method to summary statistics from GWAS for blood metabolites and femoral neck bone mineral density (FN-BMD), we successfully identified ten blood metabolites that may causally influence FN-BMD. We extended a causal inference method for inferring putative causal relationship between two phenotypes using summary statistics from GWAS, and identified a number of potential causal metabolites for FN-BMD, which may provide novel insights into the pathophysiological mechanisms underlying osteoporosis.
Statistical complexity without explicit reference to underlying probabilities
NASA Astrophysics Data System (ADS)
Pennini, F.; Plastino, A.
2018-06-01
We show that extremely simple systems of a not too large number of particles can be simultaneously thermally stable and complex. To this end, we extend the notion of statistical complexity to simple configurations of non-interacting particles, without appeal to probabilities, and discuss configurational properties.
On the Origins of Suboptimality in Human Probabilistic Inference
Acerbi, Luigi; Vijayakumar, Sethu; Wolpert, Daniel M.
2014-01-01
Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior. PMID:24945142
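A single trial of the kind of computation assumed by the Bayesian observer models can be sketched directly: multiply the explicitly displayed prior by a Gaussian likelihood centred on the noisy cue and take the posterior mean. The bimodal prior and noise level below are hypothetical stand-ins for the experimental stimuli.

```python
# Sketch: one trial of Bayesian cue-prior integration on a discretized space.
import numpy as np

x = np.linspace(-10, 10, 2001)
prior = 0.6 * np.exp(-(x + 3) ** 2 / 2) + 0.4 * np.exp(-(x - 4) ** 2 / (2 * 0.5))
prior /= np.trapz(prior, x)                        # normalize the displayed prior

cue, sigma_cue = 1.0, 2.0                          # noisy cue and its assumed noise level
likelihood = np.exp(-(x - cue) ** 2 / (2 * sigma_cue ** 2))
posterior = prior * likelihood
posterior /= np.trapz(posterior, x)

print("posterior mean estimate:", np.trapz(x * posterior, x))
print("cue alone would give:", cue)
```

Deviations of observed responses from this ideal-observer estimate, and how those deviations scale with prior shape and cue noise, are what the factorial model comparison in the paper is designed to attribute to specific noise sources.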
Decoding the spatial signatures of multi-scale climate variability - a climate network perspective
NASA Astrophysics Data System (ADS)
Donner, R. V.; Jajcay, N.; Wiedermann, M.; Ekhtiari, N.; Palus, M.
2017-12-01
During the last years, the application of complex networks as a versatile tool for analyzing complex spatio-temporal data has gained increasing interest. Establishing this approach as a new paradigm in climatology has already provided valuable insights into key spatio-temporal climate variability patterns across scales, including novel perspectives on the dynamics of the El Nino Southern Oscillation or the emergence of extreme precipitation patterns in monsoonal regions. In this work, we report first attempts to employ network analysis for disentangling multi-scale climate variability. Specifically, we introduce the concept of scale-specific climate networks, which comprises a sequence of networks representing the statistical association structure between variations at distinct time scales. For this purpose, we consider global surface air temperature reanalysis data and subject the corresponding time series at each grid point to a complex-valued continuous wavelet transform. From this time-scale decomposition, we obtain three types of signals per grid point and scale - amplitude, phase and reconstructed signal, the statistical similarity of which is then represented by three complex networks associated with each scale. We provide a detailed analysis of the resulting connectivity patterns reflecting the spatial organization of climate variability at each chosen time-scale. Global network characteristics like transitivity or network entropy are shown to provide a new view on the (global average) relevance of different time scales in climate dynamics. Beyond expected trends originating from the increasing smoothness of fluctuations at longer scales, network-based statistics reveal different degrees of fragmentation of spatial co-variability patterns at different scales and zonal shifts among the key players of climate variability from tropically to extra-tropically dominated patterns when moving from inter-annual to decadal scales and beyond. The obtained results demonstrate the potential usefulness of systematically exploiting scale-specific climate networks, whose general patterns are in line with existing climatological knowledge, but provide vast opportunities for further quantifications at local, regional and global scales that are yet to be explored.
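A stripped-down version of the scale-specific network construction might look as follows on synthetic data: band-pass every grid-point series around a chosen period band, correlate the filtered series, threshold the correlation matrix into an adjacency matrix, and compute a global network measure such as transitivity. The filter design, threshold, and synthetic shared mode are illustrative choices, not those of the study, which uses a complex-valued wavelet decomposition.

```python
# Sketch: build a scale-specific correlation network from band-passed series.
import numpy as np
import networkx as nx
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(11)
n_points, n_months = 50, 600
common = np.sin(2 * np.pi * np.arange(n_months) / 48)          # shared ~4-year mode
data = 0.5 * common + rng.standard_normal((n_points, n_months))

def scale_network(data, low, high, fs=12.0, threshold=0.4):
    b, a = butter(3, [low, high], btype="band", fs=fs)          # band in cycles/year
    filt = filtfilt(b, a, data, axis=1)
    corr = np.corrcoef(filt)
    adj = (np.abs(corr) > threshold) & ~np.eye(len(corr), dtype=bool)
    return nx.from_numpy_array(adj.astype(int))

G = scale_network(data, low=0.2, high=0.35)                     # ~3-5 year band
print("transitivity at the interannual scale:", nx.transitivity(G))
```

Repeating the construction over a sequence of bands yields one network per scale, whose global statistics can then be compared in the spirit of the analysis described above.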
Effective control of complex turbulent dynamical systems through statistical functionals.
Majda, Andrew J; Qi, Di
2017-05-30
Turbulent dynamical systems characterized by both a high-dimensional phase space and a large number of instabilities are ubiquitous among complex systems in science and engineering, including climate, material, and neural science. Control of these complex systems is a grand challenge, for example, in mitigating the effects of climate change or safe design of technology with fully developed shear turbulence. Control of flows in the transition to turbulence, where there is a small dimension of instabilities about a basic mean state, is an important and successful discipline. In complex turbulent dynamical systems, it is impossible to track and control the large dimension of instabilities, which strongly interact and exchange energy, and new control strategies are needed. The goal of this paper is to propose an effective statistical control strategy for complex turbulent dynamical systems based on a recent statistical energy principle and statistical linear response theory. We illustrate the potential practical efficiency and verify this effective statistical control strategy on the 40D Lorenz 1996 model in forcing regimes with various types of fully turbulent dynamics with nearly one-half of the phase space unstable.
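For context, a minimal sketch of the 40-dimensional Lorenz 1996 test bed is given below; it only integrates the forward model and tracks a crude total-energy diagnostic, and does not reproduce the paper's statistical control strategy. The forcing F = 8 and the step size are illustrative choices.

```python
# A minimal sketch of the 40D Lorenz 1996 model with a fourth-order Runge-Kutta integrator.
import numpy as np

def l96_rhs(x, F):
    # dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with cyclic indices
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F):
    k1 = l96_rhs(x, F)
    k2 = l96_rhs(x + 0.5 * dt * k1, F)
    k3 = l96_rhs(x + 0.5 * dt * k2, F)
    k4 = l96_rhs(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

N, F, dt = 40, 8.0, 0.01                          # F = 8 gives fully turbulent dynamics
x = F + 1e-3 * np.random.default_rng(1).standard_normal(N)
energies = []
for _ in range(20000):
    x = rk4_step(x, dt, F)
    energies.append(0.5 * np.sum(x**2))           # total "statistical energy" of the state
print("mean energy after spin-up:", np.mean(energies[5000:]))
```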
Probing the space of toric quiver theories
NASA Astrophysics Data System (ADS)
Hewlett, Joseph; He, Yang-Hui
2010-03-01
We demonstrate a practical and efficient method for generating toric Calabi-Yau quiver theories, applicable to both D3 and M2 brane world-volume physics. A new analytic method is presented at low order parameters and an algorithm for the general case is developed which has polynomial complexity in the number of edges in the quiver. Using this algorithm, carefully implemented, we classify the quiver diagrams and assign possible superpotentials for various small values of the numbers of edges and nodes. We examine some preliminary statistics on this space of toric quiver theories.
ACIRF User’s Guide for the General Model (Version 3.5)
1992-06-01
Example ACIRF formatted output for the frozen-in model (summary of measured realization statistics for antenna 2). The channel response must be delta correlated in angle, delay, and Doppler frequency: ⟨ z(K⊥, τ, ωD) z*(K⊥′, τ′, ωD′) ⟩ = S(K⊥, τ, ωD) δ(K⊥ − K⊥′) δ(τ − τ′) δ(ωD − ωD′). Under this condition the central limit theorem can be invoked to argue that h(ρ, τ, t) and hA(ρ, τ, t) are zero-mean, normally distributed complex quantities.
Soto-Gordoa, Myriam; Arrospide, Arantzazu; Merino Hernández, Marisa; Mora Amengual, Joana; Fullaondo Zabala, Ane; Larrañaga, Igor; de Manuel, Esteban; Mar, Javier
2017-01-01
To develop a framework for the management of complex health care interventions within the Deming continuous improvement cycle and to test the framework in the case of an integrated intervention for multimorbid patients in the Basque Country within the CareWell project. Statistical analysis alone, although necessary, may not always represent the practical significance of the intervention. Thus, to ascertain the true economic impact of the intervention, the statistical results can be integrated into the budget impact analysis. The intervention of the case study consisted of a comprehensive approach that integrated new provider roles and new technological infrastructure for multimorbid patients, with the aim of reducing patient decompensations by 10% over 5 years. The study period was 2012 to 2020. Given the aging of the general population, the conventional scenario predicts an increase of 21% in the health care budget for care of multimorbid patients during the study period. With a successful intervention, this figure should drop to 18%. The statistical analysis, however, showed no significant differences in costs either in primary care or in hospital care between 2012 and 2014. The real costs in 2014 were by far closer to those in the conventional scenario than to the reductions expected in the objective scenario. The present implementation should be reappraised, because the present expenditure did not move closer to the objective budget. This work demonstrates the capacity of budget impact analysis to enhance the implementation of complex interventions. Its integration in the context of the continuous improvement cycle is transferable to other contexts in which implementation depth and time are important. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Campbell, D A; Chkrebtii, O
2013-12-01
Statistical inference for biochemical models often faces a variety of characteristic challenges. In this paper we examine state and parameter estimation for the JAK-STAT intracellular signalling mechanism, which exemplifies the implementation intricacies common in many biochemical inference problems. We introduce an extension to the Generalized Smoothing approach for estimating delay differential equation models, addressing selection of complexity parameters, choice of the basis system, and appropriate optimization strategies. Motivated by the JAK-STAT system, we further extend the generalized smoothing approach to consider a nonlinear observation process with additional unknown parameters, and highlight how the approach handles unobserved states and unevenly spaced observations. The methodology developed is generally applicable to problems of estimation for differential equation models with delays, unobserved states, nonlinear observation processes, and partially observed histories. Crown Copyright © 2013. Published by Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Fymat, A. L.
1978-01-01
A unifying approach, based on a generalization of Pearson's differential equation of statistical theory, is proposed for both the representation of particulate size distribution and the interpretation of radiometric measurements in terms of this parameter. A single-parameter gamma-type distribution is introduced, and it is shown that inversion can only provide the dimensionless parameter, r/ab (where r = particle radius, a = effective radius, b = effective variance), at least when the distribution vanishes at both ends. The basic inversion problem in reconstructing the particle size distribution is analyzed, and the existing methods are reviewed (with emphasis on their capabilities) and classified. A two-step strategy is proposed for simultaneously determining the complex refractive index and reconstructing the size distribution of atmospheric particulates.
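A minimal sketch of a gamma-type particle size distribution parameterized by an effective radius a and effective variance b follows; the standard-gamma form used here is an assumption (the paper's single-parameter form may differ), but it makes the dimensionless combination r/(ab) explicit.

```python
# A minimal sketch of a gamma-type size distribution n(r), normalized to unit integral.
import numpy as np
from scipy.special import gamma as gamma_fn

def gamma_size_distribution(r, a, b):
    """Assumed form: n(r) = C * r**((1 - 3b)/b) * exp(-r / (a*b))."""
    alpha = (1.0 - 3.0 * b) / b          # power-law exponent
    theta = a * b                        # scale parameter
    C = 1.0 / (gamma_fn(alpha + 1.0) * theta ** (alpha + 1.0))
    return C * r ** alpha * np.exp(-r / theta)

r = np.linspace(0.01, 10.0, 5)           # particle radius (arbitrary length units)
a, b = 1.0, 0.2                          # hypothetical effective radius and effective variance
print("n(r):", gamma_size_distribution(r, a, b))
print("dimensionless abscissa r/(a*b):", r / (a * b))
```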
NASA Astrophysics Data System (ADS)
Mahoney, John R.; Aghamohammadi, Cina; Crutchfield, James P.
2016-02-01
A stochastic process’ statistical complexity stands out as a fundamental property: the minimum information required to synchronize one process generator to another. How much information is required, though, when synchronizing over a quantum channel? Recent work demonstrated that representing causal similarity as quantum state-indistinguishability provides a quantum advantage. We generalize this to synchronization and offer a sequence of constructions that exploit extended causal structures, finding a substantial increase of the quantum advantage. We demonstrate that maximum compression is determined by the process’ cryptic order, a classical, topological property closely allied to Markov order, itself a measure of historical dependence. We introduce an efficient algorithm that computes the quantum advantage and close by noting that the advantage comes at a cost: one trades off prediction for generation complexity.
Mahoney, John R; Aghamohammadi, Cina; Crutchfield, James P
2016-02-15
A stochastic process' statistical complexity stands out as a fundamental property: the minimum information required to synchronize one process generator to another. How much information is required, though, when synchronizing over a quantum channel? Recent work demonstrated that representing causal similarity as quantum state-indistinguishability provides a quantum advantage. We generalize this to synchronization and offer a sequence of constructions that exploit extended causal structures, finding a substantial increase of the quantum advantage. We demonstrate that maximum compression is determined by the process' cryptic order, a classical, topological property closely allied to Markov order, itself a measure of historical dependence. We introduce an efficient algorithm that computes the quantum advantage and close by noting that the advantage comes at a cost: one trades off prediction for generation complexity.
Like Beauty, Complexity is Hard to Define
NASA Astrophysics Data System (ADS)
Tsallis, Constantino
Like beauty, complexity is hard to define and rather easy to identify: nonlinear dynamics, strongly interconnected simple elements, some sort of divisoria aquarum between order and disorder. Before focusing on complexity, let us remember that the theoretical pillars of contemporary physics are mechanics (Newtonian, relativistic, quantum), Maxwell electromagnetism, and (Boltzmann-Gibbs, BG) statistical mechanics - obligatory basic disciplines in any advanced course in physics. The first-principle statistical-mechanical approach starts from (microscopic) electro-mechanics and theory of probabilities, and, through a variety of possible mesoscopic descriptions, arrives at (macroscopic) thermodynamics. In the middle of this trip, we come across energy and entropy. Energy is related to the possible microscopic configurations of the system, whereas entropy is related to the corresponding probabilities. Therefore, in some sense, entropy represents a concept which, epistemologically speaking, is one step further with regard to energy. The fact that energy is not parameter-independent is very familiar: the kinetic energy of a truck is very different from that of a fly, and the relativistic energy of a fast electron is very different from its classical value, and so on. What about entropy? One hundred and forty years of tradition, and hundreds - we may even say thousands - of impressive theoretical successes of the parameter-free BG entropy have sedimented, in the minds of many scientists, the conviction that it is unique. However, it can be straightforwardly argued that, in general, this is not the case...
NASA Astrophysics Data System (ADS)
Menzel, Andreas M.
2015-11-01
Diffusion of colloidal particles in a complex environment such as polymer networks or biological cells is a topic of high complexity with significant biological and medical relevance. In such situations, the interaction between the surroundings and the particle motion has to be taken into account. We analyze a simplified diffusion model that includes some aspects of a complex environment in the framework of a nonlinear friction process: at low particle speeds, friction grows linearly with the particle velocity as for regular viscous friction; it grows more than linearly at higher particle speeds; finally, at a maximum of the possible particle speed, the friction diverges. In addition to bare diffusion, we study the influence of a constant drift force acting on the diffusing particle. While the corresponding stationary velocity distributions can be derived analytically, the displacement statistics generally must be determined numerically. However, as a benefit of our model, analytical progress can be made in one case of a special maximum particle speed. The effect of a drift force in this case is analytically determined by perturbation theory. It will be interesting in the future to compare our results to real experimental systems. One realization could be magnetic colloidal particles diffusing through a shear-thickening environment such as starch suspensions, possibly exposed to an external magnetic field gradient.
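The sketch below simulates Langevin dynamics for the particle velocity under an assumed friction law, γ(v) = γ0/(1 − (v/v_max)²), chosen only because it is linear at small speeds, grows faster than linearly at higher speeds, and diverges at v_max; it is not necessarily the paper's specific model, and all parameter values are illustrative.

```python
# A minimal sketch: Euler-Maruyama integration of a velocity Langevin equation with an
# assumed nonlinear friction force that diverges at a maximum speed v_max.
import numpy as np

gamma0, v_max, D, f_drift, dt, n_steps = 1.0, 1.0, 0.1, 0.2, 1e-3, 200000
rng = np.random.default_rng(42)
v, x = 0.0, 0.0
for _ in range(n_steps):
    friction = gamma0 * v / (1.0 - (v / v_max) ** 2)       # assumed nonlinear friction force
    v += (-friction + f_drift) * dt + np.sqrt(2 * D * dt) * rng.standard_normal()
    v = np.clip(v, -0.999 * v_max, 0.999 * v_max)          # keep the speed below its bound
    x += v * dt
print("mean drift velocity:", x / (n_steps * dt))
```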
An exploratory statistical approach to depression pattern identification
NASA Astrophysics Data System (ADS)
Feng, Qing Yi; Griffiths, Frances; Parsons, Nick; Gunn, Jane
2013-02-01
Depression is a complex phenomenon thought to be due to the interaction of biological, psychological and social factors. Currently depression assessment uses self-reported depressive symptoms but this is limited in the degree to which it can characterise the different expressions of depression emerging from the complex causal pathways that are thought to underlie depression. In this study, we aimed to represent the different patterns of depression with pattern values unique to each individual, where each value combines all the available information about an individual’s depression. We considered the depressed individual as a subsystem of an open complex system, proposed Generalized Information Entropy (GIE) to represent the general characteristics of information entropy of the system, and then implemented Maximum Entropy Estimates to derive equations for depression patterns. We also introduced a numerical simulation method to process the depression related data obtained by the Diamond Cohort Study which has been underway in Australia since 2005 involving 789 people. Unlike traditional assessment, we obtained a unique value for each depressed individual which gives an overall assessment of the depression pattern. Our work provides a novel way to visualise and quantitatively measure the depression pattern of the depressed individual which could be used for pattern categorisation. This may have potential for tailoring health interventions to depressed individuals to maximize health benefit.
Analysis and meta-analysis of single-case designs: an introduction.
Shadish, William R
2014-04-01
The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Hoover, Brian G; Gamiz, Victor L
2006-02-01
The scalar bidirectional reflectance distribution function (BRDF) due to a perfectly conducting surface with roughness and autocorrelation width comparable with the illumination wavelength is derived from coherence theory on the assumption of a random reflective phase screen and an expansion valid for large effective roughness. A general quadratic expansion of the two-dimensional isotropic surface autocorrelation function near the origin yields representative Cauchy and Gaussian BRDF solutions and an intermediate general solution as the sum of an incoherent component and a nonspecular coherent component proportional to an integral of the plasma dispersion function in the complex plane. Plots illustrate agreement of the derived general solution with original bistatic BRDF data due to a machined aluminum surface, and comparisons are drawn with previously published data in the examination of variations with incident angle, roughness, illumination wavelength, and autocorrelation coefficients in the bistatic and monostatic geometries. The general quadratic autocorrelation expansion provides a BRDF solution that smoothly interpolates between the well-known results of the linear and parabolic approximations.
ERIC Educational Resources Information Center
Smith, Rachel A.; Levine, Timothy R.; Lachlan, Kenneth A.; Fediuk, Thomas A.
2002-01-01
Notes that the availability of statistical software packages has led to a sharp increase in use of complex research designs and complex statistical analyses in communication research. Reports a series of Monte Carlo simulations which demonstrate that this complexity may come at a heavier cost than many communication researchers realize. Warns…
Focused Belief Measures for Uncertainty Quantification in High Performance Semantic Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joslyn, Cliff A.; Weaver, Jesse R.
In web-scale semantic data analytics there is a great need for methods which aggregate uncertainty claims, on the one hand respecting the information provided as accurately as possible, while on the other still being tractable. Traditional statistical methods are more robust, but only represent distributional, additive uncertainty. Generalized information theory methods, including fuzzy systems and Dempster-Shafer (DS) evidence theory, represent multiple forms of uncertainty, but are computationally and methodologically difficult. We require methods which provide an effective balance between the complete representation of the full complexity of uncertainty claims in their interaction, while satisfying the needs of both computational complexity and human cognition. Here we build on Jøsang's subjective logic to posit methods in focused belief measures (FBMs), where a full DS structure is focused to a single event. The resulting ternary logical structure is posited to be able to capture the minimal amount of generalized complexity needed at a maximum of computational efficiency. We demonstrate the efficacy of this approach in a web ingest experiment over the 2012 Billion Triple dataset from the Semantic Web Challenge.
Testing for independence in J×K contingency tables with complex sample survey data.
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Sinha, Debajyoti; Hevelone, Nathanael; Giovannucci, Edward; Hu, Jim C
2015-09-01
The test of independence of row and column variables in a (J×K) contingency table is a widely used statistical test in many areas of application. For complex survey samples, use of the standard Pearson chi-squared test is inappropriate due to correlation among units within the same cluster. Rao and Scott (1981, Journal of the American Statistical Association 76, 221-230) proposed an approach in which the standard Pearson chi-squared statistic is multiplied by a design effect to adjust for the complex survey design. Unfortunately, this test fails to exist when one of the observed cell counts equals zero. Even with the large samples typical of many complex surveys, zero cell counts can occur for rare events, small domains, or contingency tables with a large number of cells. Here, we propose Wald and score test statistics for independence based on weighted least squares estimating equations. In contrast to the Rao-Scott test statistic, the proposed Wald and score test statistics always exist. In simulations, the score test is found to perform best with respect to type I error. The proposed method is motivated by, and applied to, post surgical complications data from the United States' Nationwide Inpatient Sample (NIS) complex survey of hospitals in 2008. © 2015, The International Biometric Society.
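For orientation, here is a minimal sketch of a first-order design-effect adjustment in the spirit of Rao and Scott, not the Wald and score tests proposed in the paper; the contingency table and the average design effect are hypothetical.

```python
# A minimal sketch: deflate the ordinary Pearson chi-squared statistic by an assumed
# average design effect before comparing it to the chi-squared reference distribution.
import numpy as np
from scipy.stats import chi2, chi2_contingency

observed = np.array([[30, 20, 10],
                     [25, 35, 15]])        # hypothetical J x K weighted cell counts
design_effect = 1.8                        # assumed average design effect from the survey

pearson_stat, _, dof, _ = chi2_contingency(observed)
adjusted_stat = pearson_stat / design_effect
p_value = chi2.sf(adjusted_stat, dof)
print(f"adjusted X2 = {adjusted_stat:.2f}, df = {dof}, p = {p_value:.4f}")
```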
Ge, Long; Tian, Jin-hui; Li, Xiu-xia; Song, Fujian; Li, Lun; Zhang, Jun; Li, Ge; Pei, Gai-qin; Qiu, Xia; Yang, Ke-hu
2016-01-01
Because of the methodological complexity of network meta-analyses (NMAs), NMAs may be more vulnerable to methodological risks than conventional pair-wise meta-analyses. Our study aims to investigate the epidemiological characteristics, conduct of the literature search, methodological quality and reporting of the statistical analysis process of NMAs in the field of cancer, based on the PRISMA extension statement and a modified AMSTAR checklist. We identified and included 102 NMAs in the field of cancer. 61 NMAs were conducted using a Bayesian framework. Of them, more than half did not report an assessment of convergence (60.66%). Inconsistency was assessed in 27.87% of NMAs. Assessment of heterogeneity in traditional meta-analyses was more common (42.62%) than in NMAs (6.56%). Most NMAs did not report an assessment of similarity (86.89%) and did not use the GRADE tool to assess the quality of evidence (95.08%). 43 NMAs were adjusted indirect comparisons; the methods used were described in 53.49% of these NMAs. Only 4.65% of NMAs described the details of handling multi-group trials and 6.98% described the methods of similarity assessment. The median total AMSTAR score was 8.00 (IQR: 6.00–8.25). Methodological quality and reporting of the statistical analysis did not substantially differ by selected general characteristics. Overall, the quality of NMAs in the field of cancer was generally acceptable. PMID:27848997
Two statistical mechanics aspects of complex networks
NASA Astrophysics Data System (ADS)
Thurner, Stefan; Biely, Christoly
2006-12-01
By adopting an ensemble interpretation of non-growing rewiring networks, network theory can be reduced to a counting problem of possible network states and an identification of their associated probabilities. We present two scenarios of how different rewirement schemes can be used to control the state probabilities of the system. In particular, we review how by generalizing the linking rules of random graphs, in combination with superstatistics and quantum mechanical concepts, one can establish an exact relation between the degree distribution of any given network and the nodes’ linking probability distributions. In a second approach, we control state probabilities by a network Hamiltonian, whose characteristics are motivated by biological and socio-economical statistical systems. We demonstrate that a thermodynamics of networks becomes a fully consistent concept, allowing one to study, e.g., ‘phase transitions’ and to compute entropies through thermodynamic relations.
Waller, Niels
2018-01-01
Kristof's Theorem (Kristof, 1970) describes a matrix trace inequality that can be used to solve a wide class of least-squares optimization problems without calculus. Considering its generality, it is surprising that Kristof's Theorem is rarely used in statistics and psychometric applications. The underutilization of this method likely stems, in part, from the mathematical complexity of Kristof's (1964, 1970) writings. In this article, I describe the underlying logic of Kristof's Theorem in simple terms by reviewing four key mathematical ideas that are used in the theorem's proof. I then show how Kristof's Theorem can be used to provide novel derivations of two cognate models from statistics and psychometrics. This tutorial includes a glossary of technical terms and an online supplement with R (R Core Team, 2017) code to perform the calculations described in the text.
Quantum statistics in complex networks
NASA Astrophysics Data System (ADS)
Bianconi, Ginestra
The Barabási-Albert (BA) model for a complex network shows a characteristic power-law connectivity distribution typical of scale-free systems. The Ising model on the BA network shows that the ferromagnetic phase transition temperature depends logarithmically on its size. We have introduced a fitness parameter for the BA network which describes the different abilities of nodes to compete for links. This model predicts the formation of a scale-free network where each node increases its connectivity in time as a power law with an exponent depending on its fitness. This model includes the fact that the node connectivity and growth rate do not depend on the node age alone, and it reproduces non-trivial correlation properties of the Internet. We have proposed a model of bosonic networks as a generalization of the BA model to which the properties of quantum statistics can be applied. We have introduced a fitness η_i = e^(-βε_i), where the temperature T = 1/β is determined by the noise in the system and the energy ε_i accounts for qualitative differences among nodes in acquiring links. The results of this work show that a power-law network with exponent γ = 2 can give a Bose condensation where a single node grabs a finite fraction of all the links. In order to address the connection with self-organized processes we have introduced a model for a growing Cayley tree that generalizes the dynamics of invasion percolation. With each node we associate a parameter ε_i (called energy) such that the probability for a node to grow is given by π_i ∝ e^(-βε_i), where T = 1/β is a statistical parameter of the system, determined by the noise, called the temperature. This model has been solved analytically with a mathematical technique similar to that used for the bosonic scale-free networks, and it shows the self-organization of the low-energy nodes at the interface. In the thermodynamic limit the Fermi distribution describes the probability distribution of the energy at the interface.
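A minimal sketch of a fitness-driven preferential-attachment network in the spirit of the model described above is given below; the uniform energy distribution, the value of β, and the network size are our illustrative assumptions.

```python
# A minimal sketch: each node has fitness eta_i = exp(-beta * e_i), and new nodes attach
# to existing nodes with probability proportional to eta_i * k_i (fitness times degree).
import numpy as np

rng = np.random.default_rng(7)
beta, n_nodes, m = 1.0, 2000, 2            # m = links added per new node
energies = rng.uniform(0, 1, n_nodes)      # node "energies" e_i (assumed uniform)
fitness = np.exp(-beta * energies)         # eta_i = exp(-beta * e_i)
degree = np.zeros(n_nodes, dtype=int)
degree[:m + 1] = m                         # small fully connected seed of m + 1 nodes

for new in range(m + 1, n_nodes):
    weights = fitness[:new] * degree[:new]
    probs = weights / weights.sum()
    targets = rng.choice(new, size=m, replace=False, p=probs)
    degree[targets] += 1
    degree[new] = m

# low-energy (high-fitness) nodes should accumulate a disproportionate share of links
top = np.argsort(degree)[-5:]
print("energies of the five most connected nodes:", np.round(energies[top], 3))
```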
Generalization of Entropy Based Divergence Measures for Symbolic Sequence Analysis
Ré, Miguel A.; Azad, Rajeev K.
2014-01-01
Entropy based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because of its sharing properties with families of other divergence measures and its interpretability in different domains including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise because of a number of attributes including generalization to any number of probability distributions and association of weights to the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as, non-extensive Tsallis statistics and higher order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes including that of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring in more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. In contrast, the proposed Tsallis-Markovian generalization yielded more pronounced improvements relative to the Tsallis and Markovian generalizations, specifically when the sequences being compared arose from phylogenetically proximal organisms. PMID:24728338
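A minimal sketch of the (weighted) Jensen-Shannon divergence between discrete distributions follows; the nucleotide frequencies are invented for illustration and the symbol names are ours, not the paper's notation.

```python
# A minimal sketch: JSD(p1, ..., pn) = H(sum_i w_i p_i) - sum_i w_i H(p_i).
import numpy as np

def shannon_entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def jensen_shannon(distributions, weights=None):
    distributions = [np.asarray(p, dtype=float) / np.sum(p) for p in distributions]
    if weights is None:
        weights = np.ones(len(distributions)) / len(distributions)
    mixture = sum(w * p for w, p in zip(weights, distributions))
    return shannon_entropy(mixture) - sum(
        w * shannon_entropy(p) for w, p in zip(weights, distributions))

# example: nucleotide frequencies (A, C, G, T) of two hypothetical sequence windows
p = [0.30, 0.20, 0.20, 0.30]
q = [0.10, 0.40, 0.40, 0.10]
print("JSD(p, q) =", round(jensen_shannon([p, q]), 4))
```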
Generalization of entropy based divergence measures for symbolic sequence analysis.
Ré, Miguel A; Azad, Rajeev K
2014-01-01
Entropy based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because of its sharing properties with families of other divergence measures and its interpretability in different domains including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise because of a number of attributes including generalization to any number of probability distributions and association of weights to the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as, non-extensive Tsallis statistics and higher order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes including that of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring in more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. In contrast, the proposed Tsallis-Markovian generalization yielded more pronounced improvements relative to the Tsallis and Markovian generalizations, specifically when the sequences being compared arose from phylogenetically proximal organisms.
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like χ² minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools and with matching these tools to specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression; density estimation and smoothing; general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997. It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks at meetings, including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Esteve-Altava, Borja; Boughner, Julia C.; Diogo, Rui; Villmoare, Brian A.; Rasskin-Gutman, Diego
2015-01-01
Modularity and complexity go hand in hand in the evolution of the skull of primates. Because analyses of these two parameters often use different approaches, we do not know yet how modularity evolves within, or as a consequence of, an also-evolving complex organization. Here we use a novel network theory-based approach (Anatomical Network Analysis) to assess how the organization of skull bones constrains the co-evolution of modularity and complexity among primates. We used the pattern of bone contacts modeled as networks to identify connectivity modules and quantify morphological complexity. We analyzed whether modularity and complexity evolved coordinately in the skull of primates. Specifically, we tested Herbert Simon’s general theory of near-decomposability, which states that modularity promotes the evolution of complexity. We found that the skulls of extant primates divide into one conserved cranial module and up to three labile facial modules, whose composition varies among primates. Despite changes in modularity, statistical analyses reject a positive feedback between modularity and complexity. Our results suggest a decoupling of complexity and modularity that translates to varying levels of constraint on the morphological evolvability of the primate skull. This study has methodological and conceptual implications for grasping the constraints that underlie the developmental and functional integration of the skull of humans and other primates. PMID:25992690
2012-01-01
High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable for revealing systems-related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data, provide links to software implementations and tools, and also address the general problem of multiple hypothesis testing. Further, we provide recommendations for the selection of such analysis methods. Reviewers: This article was reviewed by Arcady Mushegian, Byung-Soo Kim and Joel Bader. PMID:23227854
Universality of clone dynamics during tissue development
NASA Astrophysics Data System (ADS)
Rulands, Steffen; Lescroart, Fabienne; Chabab, Samira; Hindley, Christopher J.; Prior, Nicole; Sznurkowska, Magdalena K.; Huch, Meritxell; Philpott, Anna; Blanpain, Cedric; Simons, Benjamin D.
2018-05-01
The emergence of complex organs is driven by the coordinated proliferation, migration and differentiation of precursor cells. The fate behaviour of these cells is reflected in the time evolution of their progeny, termed clones, which serve as a key experimental observable. In adult tissues, where cell dynamics is constrained by the condition of homeostasis, clonal tracing studies based on transgenic animal models have advanced our understanding of cell fate behaviour and its dysregulation in disease [1,2]. But what can be learnt from clonal dynamics in development, where the spatial cohesiveness of clones is impaired by tissue deformations during tissue growth? Drawing on the results of clonal tracing studies, we show that, despite the complexity of organ development, clonal dynamics may converge to a critical state characterized by universal scaling behaviour of clone sizes. By mapping clonal dynamics onto a generalization of the classical theory of aerosols, we elucidate the origin and range of scaling behaviours and show how the identification of universal scaling dependences may allow lineage-specific information to be distilled from experiments. Our study shows the emergence of core concepts of statistical physics in an unexpected context, identifying cellular systems as a laboratory to study non-equilibrium statistical physics.
A genome-wide methylation study on obesity: differential variability and differential methylation.
Xu, Xiaojing; Su, Shaoyong; Barnes, Vernon A; De Miguel, Carmen; Pollock, Jennifer; Ownby, Dennis; Shi, Hidong; Zhu, Haidong; Snieder, Harold; Wang, Xiaoling
2013-05-01
Besides differential methylation, DNA methylation variation has recently been proposed and demonstrated to be a potential contributing factor to cancer risk. Here we aim to examine whether differential variability in methylation is also an important feature of obesity, a typical non-malignant common complex disease. We analyzed genome-wide methylation profiles of over 470,000 CpGs in peripheral blood samples from 48 obese and 48 lean African-American youth aged 14-20 y old. A substantial number of differentially variable CpG sites (DVCs), using statistics based on variances, as well as a substantial number of differentially methylated CpG sites (DMCs), using statistics based on means, were identified. Similar to the findings in cancers, DVCs generally exhibited an outlier structure and were more variable in cases than in controls. By randomly splitting the current sample into a discovery and validation set, we observed that both the DVCs and DMCs identified from the first set could independently predict obesity status in the second set. Furthermore, both the genes harboring DMCs and the genes harboring DVCs showed significant enrichment of genes identified by genome-wide association studies on obesity and related diseases, such as hypertension, dyslipidemia, type 2 diabetes and certain types of cancers, supporting their roles in the etiology and pathogenesis of obesity. We generalized the recent finding on methylation variability in cancer research to obesity and demonstrated that differential variability is also an important feature of obesity-related methylation changes. Future studies on the epigenetics of obesity will benefit from both statistics based on means and statistics based on variances.
NASA Astrophysics Data System (ADS)
Kolokythas, Kostantinos; Vasileios, Salamalikis; Athanassios, Argiriou; Kazantzidis, Andreas
2015-04-01
The wind is the result of complex interactions of numerous mechanisms taking place at small or large scales, so better knowledge of its behavior is essential in a variety of applications, especially in the field of power production from wind turbines. In the literature there is a considerable number of models, either physical or statistical, dealing with the simulation and prediction of wind speed. Among others, Artificial Neural Networks (ANNs) are widely used for wind forecasting and, in the great majority of cases, outperform other conventional statistical models. In this study, a number of ANNs with different architectures, created and applied to a dataset of wind time series, are compared to Auto Regressive Integrated Moving Average (ARIMA) statistical models. The data consist of mean hourly wind speeds from a wind farm in a hilly Greek region and cover a period of one year (2013). The main goal is to evaluate the models' ability to successfully simulate the wind speed at a given point of interest (target). Goodness-of-fit statistics are computed for the comparison of the different methods. In general, the ANNs showed the best performance in the estimation of wind speed, prevailing over the ARIMA models.
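As a rough illustration of the ARIMA side of such a comparison, the sketch below fits an ARIMA model to a synthetic hourly wind-speed series and compares it with a persistence baseline; the series, the (p, d, q) order, and the forecast horizon are assumptions, not the study's configuration.

```python
# A minimal sketch: ARIMA fit versus a naive persistence forecast on synthetic wind speeds.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
n = 24 * 60
speeds = np.clip(6 + 0.05 * np.cumsum(rng.normal(0, 0.3, n))
                 + rng.normal(0, 0.5, n), 0, None)     # synthetic hourly wind speed (m/s)

train, test = speeds[:-24], speeds[-24:]
model = ARIMA(train, order=(2, 0, 1)).fit()            # assumed (p, d, q) order
forecast = model.forecast(steps=24)

persistence = np.full(24, train[-1])                   # baseline: repeat the last observation
rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
print(f"ARIMA RMSE: {rmse(test, forecast):.3f}  persistence RMSE: {rmse(test, persistence):.3f}")
```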
Generalized statistical mechanics approaches to earthquakes and tectonics.
Vallianatos, Filippos; Papadakis, Giorgos; Michas, Georgios
2016-12-01
Despite the extreme complexity that characterizes the mechanism of the earthquake generation process, simple empirical scaling relations apply to the collective properties of earthquakes and faults in a variety of tectonic environments and scales. The physical characterization of those properties and the scaling relations that describe them attract a wide scientific interest and are incorporated in the probabilistic forecasting of seismicity in local, regional and planetary scales. Considerable progress has been made in the analysis of the statistical mechanics of earthquakes, which, based on the principle of entropy, can provide a physical rationale to the macroscopic properties frequently observed. The scale-invariant properties, the (multi) fractal structures and the long-range interactions that have been found to characterize fault and earthquake populations have recently led to the consideration of non-extensive statistical mechanics (NESM) as a consistent statistical mechanics framework for the description of seismicity. The consistency between NESM and observations has been demonstrated in a series of publications on seismicity, faulting, rock physics and other fields of geosciences. The aim of this review is to present in a concise manner the fundamental macroscopic properties of earthquakes and faulting and how these can be derived by using the notions of statistical mechanics and NESM, providing further insights into earthquake physics and fault growth processes.
Generalized statistical mechanics approaches to earthquakes and tectonics
Papadakis, Giorgos; Michas, Georgios
2016-01-01
Despite the extreme complexity that characterizes the mechanism of the earthquake generation process, simple empirical scaling relations apply to the collective properties of earthquakes and faults in a variety of tectonic environments and scales. The physical characterization of those properties and the scaling relations that describe them attract a wide scientific interest and are incorporated in the probabilistic forecasting of seismicity in local, regional and planetary scales. Considerable progress has been made in the analysis of the statistical mechanics of earthquakes, which, based on the principle of entropy, can provide a physical rationale to the macroscopic properties frequently observed. The scale-invariant properties, the (multi) fractal structures and the long-range interactions that have been found to characterize fault and earthquake populations have recently led to the consideration of non-extensive statistical mechanics (NESM) as a consistent statistical mechanics framework for the description of seismicity. The consistency between NESM and observations has been demonstrated in a series of publications on seismicity, faulting, rock physics and other fields of geosciences. The aim of this review is to present in a concise manner the fundamental macroscopic properties of earthquakes and faulting and how these can be derived by using the notions of statistical mechanics and NESM, providing further insights into earthquake physics and fault growth processes. PMID:28119548
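For reference, the Tsallis q-exponential that underlies many NESM descriptions of seismicity can be written down in a few lines; the q value and scale below are illustrative, not fitted to any catalogue.

```python
# A minimal sketch of the Tsallis q-exponential, exp_q(x) = [1 + (1 - q) x]^(1 / (1 - q)),
# valid where 1 + (1 - q) x > 0; it reduces to exp(x) as q -> 1.
import numpy as np

def q_exponential(x, q):
    x = np.asarray(x, dtype=float)
    if np.isclose(q, 1.0):
        return np.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))

# a q-exponential decay of a normalized cumulative distribution, P(>X) = exp_q(-X / X0),
# for a hypothetical quantity X (e.g., inter-event time) with scale X0 = 10 and q = 1.5
X = np.linspace(0, 50, 6)
print(np.round(q_exponential(-X / 10.0, q=1.5), 4))
```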
The Manipulative Complexity of Lower Paleolithic Stone Toolmaking
Faisal, Aldo; Stout, Dietrich; Apel, Jan; Bradley, Bruce
2010-01-01
Background: Early stone tools provide direct evidence of human cognitive and behavioral evolution that is otherwise unavailable. Proper interpretation of these data requires a robust interpretive framework linking archaeological evidence to specific behavioral and cognitive actions. Methodology/Principal Findings: Here we employ a data glove to record manual joint angles in a modern experimental toolmaker (the 4th author) replicating ancient tool forms in order to characterize and compare the manipulative complexity of two major Lower Paleolithic technologies (Oldowan and Acheulean). To this end we used a principled and general measure of behavioral complexity based on the statistics of joint movements. Conclusions/Significance: This allowed us to confirm that previously observed differences in brain activation associated with Oldowan versus Acheulean technologies reflect higher-level behavior organization rather than lower-level differences in manipulative complexity. This conclusion is consistent with a scenario in which the earliest stages of human technological evolution depended on novel perceptual-motor capacities (such as the control of joint stiffness) whereas later developments increasingly relied on enhanced mechanisms for cognitive control. This further suggests possible links between toolmaking and language evolution. PMID:21072164
Generalized Optical Theorem Detection in Random and Complex Media
NASA Astrophysics Data System (ADS)
Tu, Jing
The problem of detecting changes of a medium or environment based on active, transmit-plus-receive wave sensor data is at the heart of many important applications including radar, surveillance, remote sensing, nondestructive testing, and cancer detection. This is a challenging problem because both the change or target and the surrounding background medium are in general unknown and can be quite complex. This Ph.D. dissertation presents a new wave physics-based approach for the detection of targets or changes in rather arbitrary backgrounds. The proposed methodology is rooted on a fundamental result of wave theory called the optical theorem, which gives real physical energy meaning to the statistics used for detection. This dissertation is composed of two main parts. The first part significantly expands the theory and understanding of the optical theorem for arbitrary probing fields and arbitrary media including nonreciprocal media, active media, as well as time-varying and nonlinear scatterers. The proposed formalism addresses both scalar and full vector electromagnetic fields. The second contribution of this dissertation is the application of the optical theorem to change detection with particular emphasis on random, complex, and active media, including single frequency probing fields and broadband probing fields. The first part of this work focuses on the generalization of the existing theoretical repertoire and interpretation of the scalar and electromagnetic optical theorem. Several fundamental generalizations of the optical theorem are developed. A new theory is developed for the optical theorem for scalar fields in nonhomogeneous media which can be bounded or unbounded. The bounded media context is essential for applications such as intrusion detection and surveillance in enclosed environments such as indoor facilities, caves, tunnels, as well as for nondestructive testing and communication systems based on wave-guiding structures. The developed scalar optical theorem theory applies to arbitrary lossless backgrounds and quite general probing fields including near fields which play a key role in super-resolution imaging. The derived formulation holds for arbitrary passive scatterers, which can be dissipative, as well as for the more general class of active scatterers which are composed of a (passive) scatterer component and an active, radiating (antenna) component. Furthermore, the generalization of the optical theorem to active scatterers is relevant to many applications such as surveillance of active targets including certain cloaks, invisible scatterers, and wireless communications. The latter developments have important military applications. The derived theoretical framework includes the familiar real power optical theorem describing power extinction due to both dissipation and scattering as well as a reactive optical theorem related to the reactive power changes. Meanwhile, the developed approach naturally leads to three optical theorem indicators or statistics, which can be used to detect changes or targets in unknown complex media. In addition, the optical theorem theory is generalized in the time domain so that it applies to arbitrary full vector fields, and arbitrary media including anisotropic media, nonreciprocal media, active media, as well as time-varying and nonlinear scatterers. The second component of this Ph.D. research program focuses on the application of the optical theorem to change detection. 
Three different forms of indicators or statistics are developed for change detection in unknown background media: a real power optical theorem detector, a reactive power optical theorem detector, and a total apparent power optical theorem detector. No prior knowledge is required of the background or the change or target. The performance of the three proposed optical theorem detectors is compared with the classical energy detector approach for change detection. The latter uses a mathematical or functional energy while the optical theorem detectors are based on real physical energy. For reference, the optical theorem detectors are also compared with the matched filter approach which (unlike the optical theorem detectors) assumes perfect target and medium information. The practical implementation of the optical theorem detectors is based, for certain random and complex media, on the exploitation of time-reversal focusing ideas developed in the past 20 years in electromagnetics and acoustics. In the final part of the dissertation, we also discuss the implementation of the optical theorem sensors for one-dimensional propagation systems such as transmission lines. We also present a new generalized likelihood ratio test for detection that exploits a prior data constraint based on the optical theorem. Finally, we also address the practical implementation of the optical theorem sensors for optical imaging systems, by means of holography. The latter is the first holographic implementation of the optical theorem for arbitrary scenes and targets.
Joint probability of statistical success of multiple phase III trials.
Zhang, Jianliang; Zhang, Jenny J
2013-01-01
In drug development, after completion of phase II proof-of-concept trials, the sponsor needs to make a go/no-go decision to start expensive phase III trials. The probability of statistical success (PoSS) of the phase III trials based on data from earlier studies is an important factor in that decision-making process. Instead of statistical power, the predictive power of a phase III trial, which takes into account the uncertainty in the estimation of treatment effect from earlier studies, has been proposed to evaluate the PoSS of a single trial. However, regulatory authorities generally require statistical significance in two (or more) trials for marketing licensure. We show that the predictive statistics of two future trials are statistically correlated through use of the common observed data from earlier studies. Thus, the joint predictive power should not be evaluated as a simplistic product of the predictive powers of the individual trials. We develop the relevant formulae for the appropriate evaluation of the joint predictive power and provide numerical examples. Our methodology is further extended to the more complex phase III development scenario comprising more than two (K > 2) trials, that is, the evaluation of the PoSS of at least k₀ (k₀≤ K) trials from a program of K total trials. Copyright © 2013 John Wiley & Sons, Ltd.
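The following sketch illustrates the central point by simulation rather than by the paper's formulae: because both phase III trials share the same uncertain treatment effect estimated in phase II, their predictive powers are correlated and the joint PoSS exceeds the naive product. The effect size, standard errors, and sample sizes are hypothetical and assume a unit-variance continuous endpoint.

```python
# A minimal sketch: joint probability that two phase III trials both reach p < 0.05,
# integrating over posterior uncertainty in the treatment effect from phase II.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
delta_hat, se_phase2 = 0.30, 0.12             # phase II effect estimate and its standard error
n_per_arm = 250                               # assumed phase III sample size per arm
se_phase3 = np.sqrt(2.0 / n_per_arm)          # standard error of the effect in each phase III trial
z_crit = norm.ppf(0.975)

deltas = rng.normal(delta_hat, se_phase2, 100000)     # draws from the phase II "posterior"
p_sig = norm.sf(z_crit - deltas / se_phase3)          # P(trial significant | true effect delta)

marginal = p_sig.mean()                               # predictive power of a single trial
joint = (p_sig ** 2).mean()                           # both trials significant (shared delta)
print(f"single-trial PoSS: {marginal:.3f}, joint PoSS: {joint:.3f}, naive product: {marginal**2:.3f}")
```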
Dalla Bella, Simone; Sowiński, Jakub
2015-03-16
A set of behavioral tasks for assessing perceptual and sensorimotor timing abilities in the general population (i.e., non-musicians) is presented here with the goal of uncovering rhythm disorders, such as beat deafness. Beat deafness is characterized by poor performance in perceiving durations in auditory rhythmic patterns or poor synchronization of movement with auditory rhythms (e.g., with musical beats). These tasks include the synchronization of finger tapping to the beat of simple and complex auditory stimuli and the detection of rhythmic irregularities (anisochrony detection task) embedded in the same stimuli. These tests, which are easy to administer, include an assessment of both perceptual and sensorimotor timing abilities under different conditions (e.g., beat rates and types of auditory material) and are based on the same auditory stimuli, ranging from a simple metronome to a complex musical excerpt. The analysis of synchronized tapping data is performed with circular statistics, which provide reliable measures of synchronization accuracy (e.g., the difference between the timing of the taps and the timing of the pacing stimuli) and consistency. Circular statistics on tapping data are particularly well-suited for detecting individual differences in the general population. Synchronized tapping and anisochrony detection are sensitive measures for identifying profiles of rhythm disorders and have been used with success to uncover cases of poor synchronization with spared perceptual timing. This systematic assessment of perceptual and sensorimotor timing can be extended to populations of patients with brain damage, neurodegenerative diseases (e.g., Parkinson's disease), and developmental disorders (e.g., Attention Deficit Hyperactivity Disorder).
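A minimal sketch of the circular statistics applied to synchronized tapping follows; the inter-beat interval and the asynchronies are invented for illustration.

```python
# A minimal sketch: map tap asynchronies to phase angles and summarize synchronization
# consistency by the length R of the mean resultant vector (1 = perfectly consistent).
import numpy as np

inter_beat_interval = 600.0                                 # ms, hypothetical beat period
asynchronies = np.array([-20, 15, 5, -10, 30, 0, -5, 12])   # tap time minus beat time (ms)

angles = 2.0 * np.pi * asynchronies / inter_beat_interval   # asynchronies on the unit circle
mean_vector = np.mean(np.exp(1j * angles))

R = np.abs(mean_vector)                      # synchronization consistency
mean_phase = np.angle(mean_vector)           # circular mean relative phase (accuracy)
print(f"consistency R = {R:.3f}, "
      f"mean asynchrony = {mean_phase * inter_beat_interval / (2 * np.pi):.1f} ms")
```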
NASA Astrophysics Data System (ADS)
Asadollahi, Parisa; Li, Jian
2016-04-01
Understanding the dynamic behavior of complex structures such as long-span bridges requires dense deployment of sensors. Traditional wired sensor systems are generally expensive and time-consuming to install due to cabling. With wireless communication and on-board computation capabilities, wireless smart sensor networks have the advantages of being low cost, easy to deploy and maintain and therefore facilitate dense instrumentation for structural health monitoring. A long-term monitoring project was recently carried out for a cable-stayed bridge in South Korea with a dense array of 113 smart sensors, which feature the world's largest wireless smart sensor network for civil structural monitoring. This paper presents a comprehensive statistical analysis of the modal properties including natural frequencies, damping ratios and mode shapes of the monitored cable-stayed bridge. Data analyzed in this paper is composed of structural vibration signals monitored during a 12-month period under ambient excitations. The correlation between environmental temperature and the modal frequencies is also investigated. The results showed the long-term statistical structural behavior of the bridge, which serves as the basis for Bayesian statistical updating for the numerical model.
Network model of bilateral power markets based on complex networks
NASA Astrophysics Data System (ADS)
Wu, Yang; Liu, Junyong; Li, Furong; Yan, Zhanxin; Zhang, Li
2014-06-01
The bilateral power transaction (BPT) mode has become a typical market organization with the restructuring of the electric power industry, and a proper model that captures its characteristics is urgently needed. However, such a model has been lacking because of this market organization's complexity. As a promising approach to modeling complex systems, complex networks can provide a sound theoretical framework for developing a proper simulation model. In this paper, a complex network model of the BPT market is proposed. In this model, a price-advantage mechanism is a precondition. Unlike general commodity transactions, both the financial layer and the physical layer are considered in the model. Through simulation analysis, the feasibility and validity of the model are verified. At the same time, some typical statistical features of the BPT network are identified: the degree distribution follows a power law, the clustering coefficient is low, and the average path length is relatively long. Moreover, the topological stability of the BPT network is tested. The results show that the network displays topological robustness to random failures of market members while it is fragile against deliberate attacks, and the network can resist cascading failure to some extent. These features are helpful for decision making and risk management in BPT markets.
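The network statistics mentioned above (degree distribution, clustering coefficient, average path length) can be computed as in the sketch below; a Barabási-Albert graph is used as a stand-in for the simulated BPT network, which is an assumption for illustration only.

```python
# A minimal sketch: typical complex-network statistics on a stand-in scale-free graph.
import networkx as nx
import numpy as np

G = nx.barabasi_albert_graph(n=1000, m=2, seed=5)     # stand-in for the BPT network

degrees = np.array([d for _, d in G.degree()])
clustering = nx.average_clustering(G)
path_length = nx.average_shortest_path_length(G)      # defined because the BA graph is connected

print(f"mean degree: {degrees.mean():.2f}, max degree: {degrees.max()}")
print(f"average clustering: {clustering:.3f}, average path length: {path_length:.2f}")
```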
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating a large covariance matrix for understanding correlation structure, an inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differentially expressed genes and proteins and genetic markers for complex diseases, and high-dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
Multiagent model and mean field theory of complex auction dynamics
NASA Astrophysics Data System (ADS)
Chen, Qinghua; Huang, Zi-Gang; Wang, Yougui; Lai, Ying-Cheng
2015-09-01
Recent years have witnessed a growing interest in analyzing a variety of socio-economic phenomena using methods from statistical and nonlinear physics. We study a class of complex systems arising from economics, lowest unique bid auction (LUBA) systems, which are a recently emerged class of online auction games. Through analyzing large empirical data sets of LUBA, we identify a general feature of the bid price distribution: an inverted J-shaped function with exponential decay in the large bid price region. To account for the distribution, we propose a multi-agent model in which each agent bids stochastically in the field of the winner's attractiveness, and develop a theoretical framework to obtain analytic solutions of the model based on mean-field analysis. The theory produces bid-price distributions that are in excellent agreement with those from the real data. Our model and theory capture the essential features of human behaviors in the competitive environment as exemplified by LUBA, and may provide significant quantitative insights into complex socio-economic phenomena.
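A simple way to check the exponential decay of the bid-price distribution in the large-price region is a log-linear fit to the empirical tail; the synthetic bid data below are an assumption for illustration only.

```python
import numpy as np

# Hypothetical LUBA-like bid prices (in price units); real data are not used here.
rng = np.random.default_rng(0)
bids = rng.geometric(p=0.02, size=50_000)

# Empirical distribution of bid prices.
values, counts = np.unique(bids, return_counts=True)
prob = counts / counts.sum()

# Exponential decay P(b) ~ exp(-lambda * b) appears linear in log(P) vs b,
# so fit a straight line to the large-bid tail only.
tail = values > np.quantile(bids, 0.5)
lam = -np.polyfit(values[tail], np.log(prob[tail]), deg=1)[0]
print(f"estimated decay rate lambda = {lam:.4f} per price unit")
```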
Characterization of branch complexity by fractal analyses
Alados, C.L.; Escos, J.; Emlen, J.M.; Freeman, D.C.
1999-01-01
The comparison between complexity in the sense of space occupancy (box-counting fractal dimension D(c) and information dimension D1) and heterogeneity in the sense of space distribution (average evenness index f and evenness variation coefficient J(cv)) were investigated in mathematical fractal objects and natural branch structures. In general, increased fractal dimension was paired with low heterogeneity. Comparisons between branch architecture in Anthyllis cytisoides under different slope exposure and grazing impact revealed that branches were more complex and more homogeneously distributed for plants on northern exposures than southern, while grazing had no impact during a wet year. Developmental instability was also investigated by the statistical noise of the allometric relation between internode length and node order. In conclusion, our study demonstrated that fractal dimension of branch structure can be used to analyze the structural organization of plants, especially if we consider not only fractal dimension but also shoot distribution within the canopy (lacunarity). These indexes together with developmental instability analyses are good indicators of growth responses to the environment.
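The box-counting dimension D(c) used above can be estimated for any binary image of a branch silhouette by counting occupied boxes at several scales; the sketch below uses a random placeholder image rather than real branch data.

```python
import numpy as np

def box_counting_dimension(image, sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting fractal dimension of a 2-D binary array."""
    counts = []
    for s in sizes:
        n = 0
        for i in range(0, image.shape[0], s):
            for j in range(0, image.shape[1], s):
                if image[i:i + s, j:j + s].any():
                    n += 1
        counts.append(n)
    # D(c) is the negative slope of log(count) versus log(box size).
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), deg=1)
    return -slope

# Placeholder "branch" image: a sparse random binary pattern.
rng = np.random.default_rng(0)
image = rng.random((256, 256)) > 0.9
print(f"estimated D(c) = {box_counting_dimension(image):.2f}")
```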
Characterization of branch complexity by fractal analyses to detect plant functional adaptations
Alados, C.L.; Escos, J.; Emlen, J.M.; Freeman, D.C.
1999-01-01
The comparison between complexity in the sense of space occupancy (box-counting fractal dimension Dc and information dimension DI) and heterogeneity in the sense of space distribution (average evenness index J̄ and evenness variation coefficient JCV) were investigated in mathematical fractal objects and natural branch structures. In general, increased fractal dimension was paired with low heterogeneity. Comparisons between branch architecture in Anthyllis cytisoides under different slope exposure and grazing impact revealed that branches were more complex and more homogeneously distributed for plants on northern exposures than southern, while grazing had no impact during a wet year. Developmental instability was also investigated by the statistical noise of the allometric relation between internode length and node order. In conclusion, our study demonstrated that fractal dimension of branch structure can be used to analyze the structural organization of plants, especially if we consider not only fractal dimension but also shoot distribution within the canopy (lacunarity). These indexes together with developmental instability analyses are good indicators of growth responses to the environment.
NASA Astrophysics Data System (ADS)
Rosso, Osvaldo A.; Craig, Hugh; Moscato, Pablo
2009-03-01
We introduce novel Information Theory quantifiers in a computational linguistic study that involves a large corpus of English Renaissance literature. The 185 texts studied (136 plays and 49 poems in total), with first editions that range from 1580 to 1640, form a representative set of the period. Our data set includes 30 texts unquestionably attributed to Shakespeare; in addition we included A Lover's Complaint, a poem which generally appears in collected editions of Shakespeare but whose authorship is currently in dispute. Our statistical complexity quantifiers combine the power of the Jensen-Shannon divergence with the entropy variations computed from a probability distribution function of the observed word use frequencies. Our results show, among other things, that for a given entropy poems display higher complexity than plays, that Shakespeare's work falls into two distinct clusters in entropy, and that his work is remarkable for its homogeneity and for its closeness to overall means.
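A minimal sketch of entropy and Jensen-Shannon-based complexity quantifiers of the kind described above, computed from raw word counts; the toy text and the normalization conventions (division by the maximum attainable values) are assumptions for illustration, not the exact quantifiers of the study.

```python
import numpy as np
from collections import Counter

def shannon(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def complexity_quantifiers(text):
    """Normalized entropy H and a Jensen-Shannon-based complexity C = Q_J * H."""
    counts = Counter(text.lower().split())
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    n = len(p)
    uniform = np.full(n, 1.0 / n)

    h = shannon(p) / np.log(n)                       # normalized entropy
    m = 0.5 * (p + uniform)
    js = shannon(m) - 0.5 * shannon(p) - 0.5 * shannon(uniform)
    # Normalize the divergence by its maximum over distributions of size n.
    js_max = -0.5 * ((n + 1) / n * np.log(n + 1) + np.log(n) - 2 * np.log(2 * n))
    return h, (js / js_max) * h

h, c = complexity_quantifiers("to be or not to be that is the question")
print(f"H = {h:.3f}, C = {c:.3f}")
```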
Zaitlen, Noah A.; Ye, Chun Jimmie; Witte, John S.
2016-01-01
The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature. PMID:27197206
14 CFR 291.41 - Financial and statistical reporting-general.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 14 Aeronautics and Space 4 2011-01-01 2011-01-01 false Financial and statistical reporting-general. 291.41 Section 291.41 Aeronautics and Space OFFICE OF THE SECRETARY, DEPARTMENT OF TRANSPORTATION... Rules § 291.41 Financial and statistical reporting—general. (a) Carriers providing cargo operations in...
14 CFR 291.41 - Financial and statistical reporting-general.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 14 Aeronautics and Space 4 2010-01-01 2010-01-01 false Financial and statistical reporting-general. 291.41 Section 291.41 Aeronautics and Space OFFICE OF THE SECRETARY, DEPARTMENT OF TRANSPORTATION... Rules § 291.41 Financial and statistical reporting—general. (a) Carriers providing cargo operations in...
Teaching Statistics--Despite Its Applications
ERIC Educational Resources Information Center
Ridgway, Jim; Nicholson, James; McCusker, Sean
2007-01-01
Evidence-based policy requires sophisticated modelling and reasoning about complex social data. The current UK statistics curricula do not equip tomorrow's citizens to understand such reasoning. We advocate radical curriculum reform, designed to require students to reason from complex data.
Effect of minimally invasive surgery fellowship on residents' operative experience.
Altieri, Maria S; Frenkel, Catherine; Scriven, Richard; Thornton, Deborah; Halbert, Caitlin; Talamini, Mark; Telem, Dana A; Pryor, Aurora D
2017-01-01
There is an increased need for surgical trainees to acquire advanced laparoscopic skills as laparoscopy becomes the standard of care in many areas of general surgery. Since the introduction of minimally invasive surgery (MIS) fellowships, there has been a continuing debate as to whether these fellowships adversely affect general surgery resident exposure to laparoscopic cases. The aim of our study was to examine whether the introduction of an MIS fellowship negatively impacts general surgery residents' experience at a single academic center. We describe the changes following establishment of an MIS fellowship at an academic center. The resident case log system from the Accreditation Council for Graduate Medical Education was queried to obtain all PGY 1-5 resident operative case logs. The two-year periods preceding and following the institution of an MIS fellowship at our institution in 2012 were compared. P values less than 0.05 were considered statistically significant. Following initiation of the MIS fellowship, an MIS service was established. The service comprised a fellow, a midlevel resident, and an intern. Operative experience was examined. From 2010-2012 to 2012-2014, residents logged a total of 272 and 585 complex laparoscopic cases, respectively. There were 43 residents from 2010 to 2013 and 44 residents from 2013 to 2014. When the two time periods were compared, a trend of increased numbers for all procedures was noted, except laparoscopic GYN/genito-urinary procedures. The average percent increase in complex general surgery procedures was 249 ± 179.8 %. Following establishment of an MIS fellowship, cases reported by residents were higher than or similar to those reported nationally for laparoscopic procedures. Institution of an MIS fellowship had a favorable effect on general surgery resident operative education at a single academic training center. Residents may benefit from the presence of a fellowship at an academic center because they are able to participate in an increased number of complex laparoscopic cases.
Car driving in schizophrenia: can visual memory and organization make a difference?
Lipskaya-Velikovsky, Lena; Kotler, Moshe; Weiss, Penina; Kaspi, Maya; Gamzo, Shimrit; Ratzon, Navah
2013-09-01
Driving is a meaningful occupation that contributes to functional independence in schizophrenia. Although it is estimated that individuals with schizophrenia have twice as many traffic accidents, little research has been done in this field. The present research explores differences in mental status, visual working memory and visual organization between drivers and non-drivers with schizophrenia in comparison to healthy drivers. There were three groups in the study: 20 drivers with schizophrenia, 20 non-driving individuals with schizophrenia and 20 drivers without schizophrenia (DWS). Visual perception was measured with the Rey-Osterrieth Complex Figure test and general cognitive status with the Mini-Mental State Examination. General cognitive status predicted actual driving status in people with schizophrenia. No statistically significant differences were found between driving and non-driving persons with schizophrenia on any of the visual parameters tested, although these abilities were significantly lower than those of DWS. The research demonstrates that impairment of visual abilities does not prevent people with schizophrenia from driving and emphasizes the importance of general cognitive status for complex and multidimensional everyday tasks. The findings support the need for further investigation in the field of car driving for this population - a move that will contribute considerably to their participation and well-being. Implications for Rehabilitation: A unique approach for driving evaluation in schizophrenia should be designed, since direct applications of knowledge and practice acquired from other populations are not reliable. This research demonstrates that visual perception deficits in schizophrenia do not prevent clients from driving, and that general cognitive status appears to be a valid determinant of actual driving. We recommend the use of a general test of cognition such as the Mini-Mental State Examination, or a combination of cognitive factors such as executive functions (e.g., Trail Making Test) and attention (e.g., Continuous Performance Test), in addition to visual-spatial ability tests (e.g., Rey-Osterrieth Complex Figure test), when considering driving status in schizophrenia.
Functional constraints on tooth morphology in carnivorous mammals
2012-01-01
Background The range of potential morphologies resulting from evolution is limited by complex interacting processes, ranging from development to function. Quantifying these interactions is important for understanding adaptation and convergent evolution. Using three-dimensional reconstructions of carnivoran and dasyuromorph tooth rows, we compared statistical models of the relationship between tooth row shape and the opposing tooth row, a static feature, as well as measures of mandibular motion during chewing (occlusion), which are kinetic features. This is a new approach to quantifying functional integration because we use measures of movement and displacement, such as the amount the mandible translates laterally during occlusion, as opposed to conventional morphological measures, such as mandible length and geometric landmarks. By sampling two distantly related groups of ecologically similar mammals, we study carnivorous mammals in general rather than a specific group of mammals. Results Statistical model comparisons demonstrate that the best performing models always include some measure of mandibular motion, indicating that functional and statistical models of tooth shape as purely a function of the opposing tooth row are too simple and that increased model complexity provides a better understanding of tooth form. The predictors of the best performing models always included the opposing tooth row shape and a relative linear measure of mandibular motion. Conclusions Our results provide quantitative support of long-standing hypotheses of tooth row shape as being influenced by mandibular motion in addition to the opposing tooth row. Additionally, this study illustrates the utility and necessity of including kinetic features in analyses of morphological integration. PMID:22899809
Learning Predictive Statistics: Strategies and Brain Mechanisms.
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe
2017-08-30
When immersed in a new environment, we are challenged to decipher initially incomprehensible streams of sensory information. However, quite rapidly, the brain finds structure and meaning in these incoming signals, helping us to predict and prepare ourselves for future actions. This skill relies on extracting the statistics of event streams in the environment that contain regularities of variable complexity from simple repetitive patterns to complex probabilistic combinations. Here, we test the brain mechanisms that mediate our ability to adapt to the environment's statistics and predict upcoming events. By combining behavioral training and multisession fMRI in human participants (male and female), we track the corticostriatal mechanisms that mediate learning of temporal sequences as they change in structure complexity. We show that learning of predictive structures relates to individual decision strategy; that is, selecting the most probable outcome in a given context (maximizing) versus matching the exact sequence statistics. These strategies engage distinct human brain regions: maximizing engages dorsolateral prefrontal, cingulate, sensory-motor regions, and basal ganglia (dorsal caudate, putamen), whereas matching engages occipitotemporal regions (including the hippocampus) and basal ganglia (ventral caudate). Our findings provide evidence for distinct corticostriatal mechanisms that facilitate our ability to extract behaviorally relevant statistics to make predictions. SIGNIFICANCE STATEMENT Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. Past work has studied how humans identify repetitive patterns and associative pairings. However, the natural environment contains regularities that vary in complexity from simple repetition to complex probabilistic combinations. Here, we combine behavior and multisession fMRI to track the brain mechanisms that mediate our ability to adapt to changes in the environment's statistics. We provide evidence for an alternate route for learning complex temporal statistics: extracting the most probable outcome in a given context is implemented by interactions between executive and motor corticostriatal mechanisms compared with visual corticostriatal circuits (including hippocampal cortex) that support learning of the exact temporal statistics. Copyright © 2017 Wang et al.
Trends in Mediation Analysis in Nursing Research: Improving Current Practice.
Hertzog, Melody
2018-06-01
The purpose of this study was to describe common approaches used by nursing researchers to test mediation models and evaluate them within the context of current methodological advances. MEDLINE was used to locate studies testing a mediation model and published from 2004 to 2015 in nursing journals. Design (experimental/correlation, cross-sectional/longitudinal, model complexity) and analysis (method, inclusion of test of mediated effect, violations/discussion of assumptions, sample size/power) characteristics were coded for 456 studies. General trends were identified using descriptive statistics. Consistent with findings of reviews in other disciplines, evidence was found that nursing researchers may not be aware of the strong assumptions and serious limitations of their analyses. Suggestions for strengthening the rigor of such studies and an overview of current methods for testing more complex models, including longitudinal mediation processes, are presented.
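One widely used test of the mediated effect discussed above is a bootstrap confidence interval for the indirect effect a*b; the sketch below illustrates it with ordinary least squares on simulated data and is not a reconstruction of any reviewed study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)                       # predictor
m = 0.5 * x + rng.normal(size=n)             # mediator
y = 0.4 * m + 0.1 * x + rng.normal(size=n)   # outcome

def indirect_effect(x, m, y):
    a = sm.OLS(m, sm.add_constant(x)).fit().params[1]                         # x -> m
    b = sm.OLS(y, sm.add_constant(np.column_stack([m, x]))).fit().params[1]   # m -> y given x
    return a * b

# Percentile bootstrap confidence interval for the indirect effect.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, size=n)
    boot.append(indirect_effect(x[idx], m[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect = {indirect_effect(x, m, y):.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```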
Doiron, Dany; Marcon, Yannick; Fortier, Isabel; Burton, Paul; Ferretti, Vincent
2017-01-01
Abstract Motivation Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for epidemiological data management, harmonization and dissemination. Implementation Opal and Mica are two standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide web services and modern user interfaces to access them. General features Opal allows users to import, manage, annotate and harmonize study data. Mica is used to build searchable web portals disseminating study and variable metadata. When used conjointly, Mica users can securely query and retrieve summary statistics on geographically dispersed Opal servers in real-time. Integration with the DataSHIELD approach allows conducting more complex federated analyses involving statistical models. Availability Opal and Mica are open-source and freely available at [www.obiba.org] under a General Public License (GPL) version 3, and the metadata models and taxonomies that accompany them are available under a Creative Commons licence. PMID:29025122
Statistical genetics and evolution of quantitative traits
NASA Astrophysics Data System (ADS)
Neher, Richard A.; Shraiman, Boris I.
2011-10-01
The distribution and heritability of many traits depend on numerous loci in the genome. In general, the astronomical number of possible genotypes makes the system with large numbers of loci difficult to describe. Multilocus evolution, however, greatly simplifies in the limit of weak selection and frequent recombination. In this limit, populations rapidly reach quasilinkage equilibrium (QLE) in which the dynamics of the full genotype distribution, including correlations between alleles at different loci, can be parametrized by the allele frequencies. This review provides a simplified exposition of the concept and mathematics of QLE, which is central to the statistical description of genotypes in sexual populations. Key results of quantitative genetics, such as the generalized Fisher's "fundamental theorem" and Wright's adaptive landscape, are shown to emerge within QLE from the dynamics of the genotype distribution. This is followed by a discussion of the circumstances under which QLE is applicable and of what the breakdown of QLE implies for population structure and the dynamics of selection. Understanding the fundamental aspects of multilocus evolution obtained through simplified models may be helpful in providing conceptual and computational tools to address the challenges arising in the studies of complex quantitative phenotypes of practical interest.
Johnson, Douglas H.; Cook, R.D.
2013-01-01
In her AAAS News & Notes piece "Can the Southwest manage its thirst?" (26 July, p. 362), K. Wren quotes Ajay Kalra, who advocates a particular method for predicting Colorado River streamflow "because it eschews complex physical climate models for a statistical data-driven modeling approach." A preference for data-driven models may be appropriate in this individual situation, but it is not so in general. Data-driven models often come with a warning against extrapolating beyond the range of the data used to develop the models. When the future is like the past, data-driven models can work well for prediction, but it is easy to over-model local or transient phenomena, often leading to predictive inaccuracy (1). Mechanistic models are built on established knowledge of the process that connects the response variables with the predictors, using information obtained outside of an extant data set. One may shy away from a mechanistic approach when the underlying process is judged to be too complicated, but good predictive models can be constructed with statistical components that account for ingredients missing in the mechanistic analysis. Models with sound mechanistic components are more generally applicable and robust than data-driven models.
Usami, Satoshi
2017-03-01
Behavioral and psychological researchers have shown strong interest in investigating contextual effects (i.e., the influences of combinations of individual- and group-level predictors on individual-level outcomes). The present research provides generalized formulas for determining the sample size needed to investigate contextual effects according to the desired level of statistical power as well as the width of the confidence interval. These formulas are derived within a three-level random intercept model that includes one predictor/contextual variable at each level, so as to simultaneously cover the various kinds of contextual effects in which researchers may be interested. The relative influences of the indices included in the formulas on the standard errors of contextual effect estimates are investigated with the aim of further simplifying sample size determination procedures. In addition, simulation studies are performed to investigate the finite sample behavior of the calculated statistical power, showing that estimated sample sizes based on the derived formulas can be both positively and negatively biased due to the complex effects of unreliability of contextual variables, multicollinearity, and violation of the assumption of known variances. Thus, it is advisable to compare estimated sample sizes under various specifications of the indices and to evaluate their potential bias, as illustrated in the example.
Modeling exposure–lag–response associations with distributed lag non-linear models
Gasparrini, Antonio
2014-01-01
In biomedical research, a health effect is frequently associated with protracted exposures of varying intensity sustained in the past. The main complexity of modeling and interpreting such phenomena lies in the additional temporal dimension needed to express the association, as the risk depends on both the intensity and the timing of past exposures. This type of dependency is defined here as an exposure-lag-response association. In this contribution, I illustrate a general statistical framework for such associations, established through the extension of distributed lag non-linear models, originally developed in time series analysis. This modeling class is based on the definition of a cross-basis, obtained by the combination of two functions to flexibly model linear or nonlinear exposure-responses and the lag structure of the relationship, respectively. The methodology is illustrated with an example application to cohort data and validated through a simulation study. This modeling framework generalizes to various study designs and regression models, and can be applied to study the health effects of protracted exposures to environmental factors, drugs or carcinogenic agents, among others. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:24027094
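A rough sketch of the cross-basis idea described above: a basis over the exposure-response dimension and a basis over the lag dimension are combined as a tensor product and entered into a Poisson regression. The bases, lag window, and data below are illustrative assumptions, not the implementation used in the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_days, max_lag = 1000, 7
exposure = rng.gamma(shape=2.0, scale=10.0, size=n_days)

# Matrix of lagged exposures: column l holds the exposure l days earlier.
lagged = np.column_stack([np.roll(exposure, l) for l in range(max_lag + 1)])
lagged = lagged[max_lag:]                          # drop rows with wrapped values

# Simple polynomial bases for the exposure-response and the lag dimensions.
exp_basis = lambda z: np.stack([z, z ** 2], axis=-1)                # 2 functions of exposure
lag_basis = np.vander(np.arange(max_lag + 1), 2, increasing=True)   # 2 functions of lag

# Cross-basis: apply the exposure basis to every lagged value, then project
# the lag dimension onto the lag basis (tensor-product construction).
cb = np.einsum('tlk,lj->tkj', exp_basis(lagged), lag_basis).reshape(lagged.shape[0], -1)

# Hypothetical daily counts depending weakly on recent exposure, then a Poisson GLM.
counts = rng.poisson(lam=np.exp(1.5 + 0.002 * lagged[:, :3].sum(axis=1)))
model = sm.GLM(counts, sm.add_constant(cb), family=sm.families.Poisson()).fit()
print(model.params)
```

The reshaped matrix plays the role of a cross-basis; its fitted coefficients can, in principle, be mapped back to an exposure-lag-response surface.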
Azogil-López, Luis Miguel; Pérez-Lázaro, Juan José; Ávila-Pecci, Patricia; Medrano-Sánchez, Esther María; Coronado-Vázquez, María Valle
2018-04-23
The purpose of this study is to determine whether telephone referral from Primary Health Care to the Internal Medicine Consult reduces waiting days compared with traditional referral. The study also aims to assess how acceptable telephone referral is to general practitioners and their patients. Non-blinded randomized controlled clinical trial. Northern Huelva Health District. 154 patients. Patient referrals from intervention clinicians were sent via telephone consultation, whereas patient referrals from control clinicians were sent via the traditional route. Number of days from referral request to Internal Medicine Consult. Number of telephone and traditional referrals. Number of doctors and patients who declined, and their reasons for declining. A statistically significant difference was found between groups, with an average of 27 (21-34) days. Among General Practitioners, 8 of the first 58 doctors after randomization and, subsequently, 6 of the 20 doctors in the intervention group refused to take part in the trial because they considered it excessively time- and effort-consuming. 50% of patients referred by the 14 General Practitioners finally randomized to the intervention group were denied telephone referral because of the patient's complexity. Telephone referral significantly reduces waiting days for the Internal Medicine consult. This type of referral did not prove excessively time- and effort-consuming for General Practitioners, but it was of limited benefit to complex patients. Copyright © 2018 The Authors. Published by Elsevier España, S.L.U. All rights reserved.
Pichetti, Sylvain; Sermet, Catherine; Godman, Brian; Campbell, Stephen M; Gustafsson, Lars L
2013-06-01
The French National Health Insurance and the Ministry of Health have introduced multiple reforms in recent years to increase prescribing efficiency. These include guidelines, academic detailing, financial incentives for the prescribing and dispensing of generic drugs, as well as a voluntary pay-for-performance programme. However, the quality and efficiency of prescribing could potentially be enhanced if there were a better understanding of the dynamics of prescribing behaviour in France. To analyse the patient and general practitioner characteristics that influence patented versus multiple-sourced statin prescribing in France. Statistical analysis was performed on the statin prescribing habits of 341 general practitioners (GPs) who were included in the IMS-Health Permanent Survey on Medical Prescription in France, which was conducted between 2009 and 2010 and involved 14,360 patients. Patient characteristics included their age and gender as well as five medical profiles that were constructed from the diagnoses obtained during consultations. These were (1) disorders of lipoprotein metabolism, (2) heart disease, (3) diabetes, (4) complex profiles and (5) profiles based on other diagnoses. Physician characteristics included their age, gender, solo or group practice, weekly workload and payment scheme. Patient age had a statistically significant impact on statin prescribing for patients in profile 1 (disorders of lipoprotein metabolism) and profile 3 (complex profiles), with a greater number of patented statins being prescribed for the youngest patients. For instance, patients older than 76 years with a complex profile were prescribed fewer patented statins than patients aged 68-76 years with the same medical profile (coefficient: -0.225; p = 0.0008). By contrast, regardless of the patient's age, the medical profile did not affect the probability of prescribing a patented statin except in young patients with heart diseases, who were prescribed a greater number of patented statins (coefficient: 0.3992; p = 0.0007). Prescribing was also statistically influenced by physician features, e.g., older male physicians were more likely to prescribe patented statins (coefficient: 0.245; p = 0.0417) and GPs practicing in groups were more likely to prescribe multiple-sourced statins (coefficient: -0.178; p = 0.0338), which is an important finding of the study. GPs with a lower workload prescribed a greater number of patented statins. There is significant variability in the prescribing of different statins among patient and physician profiles as well as between solo and group practices. Consequently, there are opportunities to target demand-side measures to enhance the prescribing of multiple-sourced statins. Further studies are warranted, in particular in other therapeutic classes, to provide a counter-balance to the considerable marketing activities of pharmaceutical companies.
Stochastic cycle selection in active flow networks.
Woodhouse, Francis G; Forrow, Aden; Fawcett, Joanna B; Dunkel, Jörn
2016-07-19
Active biological flow networks pervade nature and span a wide range of scales, from arterial blood vessels and bronchial mucus transport in humans to bacterial flow through porous media or plasmodial shuttle streaming in slime molds. Despite their ubiquity, little is known about the self-organization principles that govern flow statistics in such nonequilibrium networks. Here we connect concepts from lattice field theory, graph theory, and transition rate theory to understand how topology controls dynamics in a generic model for actively driven flow on a network. Our combined theoretical and numerical analysis identifies symmetry-based rules that make it possible to classify and predict the selection statistics of complex flow cycles from the network topology. The conceptual framework developed here is applicable to a broad class of biological and nonbiological far-from-equilibrium networks, including actively controlled information flows, and establishes a correspondence between active flow networks and generalized ice-type models.
Stochastic cycle selection in active flow networks
NASA Astrophysics Data System (ADS)
Woodhouse, Francis; Forrow, Aden; Fawcett, Joanna; Dunkel, Jorn
2016-11-01
Active biological flow networks pervade nature and span a wide range of scales, from arterial blood vessels and bronchial mucus transport in humans to bacterial flow through porous media or plasmodial shuttle streaming in slime molds. Despite their ubiquity, little is known about the self-organization principles that govern flow statistics in such non-equilibrium networks. By connecting concepts from lattice field theory, graph theory and transition rate theory, we show how topology controls dynamics in a generic model for actively driven flow on a network. Through theoretical and numerical analysis we identify symmetry-based rules to classify and predict the selection statistics of complex flow cycles from the network topology. Our conceptual framework is applicable to a broad class of biological and non-biological far-from-equilibrium networks, including actively controlled information flows, and establishes a new correspondence between active flow networks and generalized ice-type models.
NASA Astrophysics Data System (ADS)
Trifoniuk, L. I.; Ushenko, Yu. A.; Sidor, M. I.; Minzer, O. P.; Gritsyuk, M. V.; Novakovskaya, O. Y.
2014-08-01
This work presents the results of an investigation of the diagnostic efficiency of a new azimuthally stable Mueller-matrix method for analyzing the coordinate distributions of laser autofluorescence of histological sections of biological tissues. A new model of the generalized optical anisotropy of the protein networks of biological tissues is proposed in order to describe the processes of laser autofluorescence. The influence of the complex mechanisms of both phase anisotropy (linear birefringence and optical activity) and linear (circular) dichroism is taken into account. The interconnections between the azimuthally stable Mueller-matrix elements characterizing laser autofluorescence and the different mechanisms of optical anisotropy are determined. A statistical analysis of the coordinate distributions of such Mueller-matrix rotation invariants is proposed. On this basis, quantitative criteria (statistical moments of the 1st to the 4th order) for differentiating histological sections of uterus wall tumors - group 1 (dysplasia) and group 2 (adenocarcinoma) - are estimated.
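The statistical moments of the 1st to the 4th order used as quantitative criteria above are the mean, variance, skewness, and kurtosis of a coordinate distribution; the generic sketch below runs on a random placeholder map rather than actual Mueller-matrix data.

```python
import numpy as np
from scipy import stats

# Placeholder for a coordinate distribution of a Mueller-matrix invariant.
rng = np.random.default_rng(0)
invariant_map = rng.gamma(shape=2.0, scale=1.0, size=(256, 256))

values = invariant_map.ravel()
moments = {
    "M1 (mean)":     values.mean(),
    "M2 (variance)": values.var(),
    "M3 (skewness)": stats.skew(values),
    "M4 (kurtosis)": stats.kurtosis(values),   # excess kurtosis by default
}
for name, value in moments.items():
    print(f"{name}: {value:.3f}")
```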
Inter-occurrence times and universal laws in finance, earthquakes and genomes
NASA Astrophysics Data System (ADS)
Tsallis, Constantino
2016-07-01
A plethora of natural, artificial and social systems exist which do not belong to the Boltzmann-Gibbs (BG) statistical-mechanical world, based on the standard additive entropy $S_{BG}$ and its associated exponential BG factor. Frequent behaviors in such complex systems have been shown to be closely related to $q$-statistics instead, based on the nonadditive entropy $S_q$ (with $S_1=S_{BG}$), and its associated $q$-exponential factor which generalizes the usual BG one. In fact, a wide range of phenomena of quite different nature exist which can be described and, in the simplest cases, understood through analytic (and explicit) functions and probability distributions which exhibit some universal features. Universality classes are concomitantly observed which can be characterized through indices such as $q$. We will exhibit here some such cases, namely concerning the distribution of inter-occurrence (or inter-event) times in the areas of finance, earthquakes and genomes.
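For reference, the nonadditive entropy $S_q$ and the $q$-exponential factor referred to above take the standard forms

$$S_q = k\,\frac{1-\sum_i p_i^{\,q}}{q-1}, \qquad \lim_{q\to 1} S_q = -k\sum_i p_i \ln p_i = S_{BG},$$

$$e_q^{x} \equiv \bigl[1+(1-q)\,x\bigr]_{+}^{1/(1-q)}, \qquad \lim_{q\to 1} e_q^{x} = e^{x},$$

where $[z]_{+}=\max\{z,0\}$; both expressions recover the usual Boltzmann-Gibbs quantities in the $q\to 1$ limit.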
Fluid leakage near the percolation threshold
NASA Astrophysics Data System (ADS)
Dapp, Wolf B.; Müser, Martin H.
2016-02-01
Percolation is a concept widely used in many fields of research and refers to the propagation of substances through porous media (e.g., coffee filtering), or the behaviour of complex networks (e.g., spreading of diseases). Percolation theory asserts that most percolative processes are universal, that is, the emergent power laws only depend on the general, statistical features of the macroscopic system, but not on specific details of the random realisation. In contrast, our computer simulations of the leakage through a seal, applying common assumptions of elasticity, contact mechanics, and fluid dynamics, show that the critical behaviour (how the flow ceases near the sealing point) depends solely on the microscopic details of the last constriction. It appears fundamentally impossible to accurately predict from statistical properties of the surfaces alone how strongly we have to tighten a water tap to make it stop dripping and also how it starts dripping once we loosen it again.
NASA Astrophysics Data System (ADS)
Ushenko, Yu. O.; Pashkovskaya, N. V.; Marchuk, Y. F.; Dubolazov, O. V.; Savich, V. O.
2015-08-01
This work presents the results of an investigation of the diagnostic efficiency of a new azimuthally stable Mueller-matrix method for analyzing the coordinate distributions of laser autofluorescence of layers of biological liquids. A new model of the generalized optical anisotropy of the protein networks of biological tissues is proposed in order to describe the processes of laser autofluorescence. The influence of the complex mechanisms of both phase anisotropy (linear birefringence and optical activity) and linear (circular) dichroism is taken into account. The interconnections between the azimuthally stable Mueller-matrix elements characterizing laser autofluorescence and the different mechanisms of optical anisotropy are determined. A statistical analysis of the coordinate distributions of such Mueller-matrix rotation invariants is proposed. On this basis, quantitative criteria (statistical moments of the 1st to the 4th order) for differentiating polycrystalline layers of human urine are estimated, for the purpose of diagnosing and differentiating cholelithiasis with underlying chronic cholecystitis (group 1) and diabetes mellitus of degree II (group 2).
Quantum-statistical theory of microwave detection using superconducting tunnel junctions
NASA Astrophysics Data System (ADS)
Deviatov, I. A.; Kuzmin, L. S.; Likharev, K. K.; Migulin, V. V.; Zorin, A. B.
1986-09-01
A quantum-statistical theory of microwave and millimeter-wave detection using superconducting tunnel junctions is developed, with a rigorous account of quantum, thermal, and shot noise arising from fluctuation sources associated with the junctions, signal source, and matching circuits. The problem of the noise characterization in the quantum sensitivity range is considered and a general noise parameter Theta(N) is introduced. This parameter is shown to be an adequate figure of merit for most receivers of interest while some devices can require a more complex characterization. Analytical expressions and/or numerically calculated plots for Theta(N) are presented for the most promising detection modes including the parametric amplification, heterodyne mixing, and quadratic videodetection, using both the quasiparticle-current and the Cooper-pair-current nonlinearities. Ultimate minimum values of Theta(N) for each detection mode are compared and found to be in agreement with limitations imposed by the quantum-mechanical uncertainty principle.
NASA Astrophysics Data System (ADS)
Ushenko, Yuriy A.; Koval, Galina D.; Ushenko, Alexander G.; Dubolazov, Olexander V.; Ushenko, Vladimir A.; Novakovskaia, Olga Yu.
2016-07-01
This research presents the results of an investigation of the diagnostic efficiency of an azimuthally stable Mueller-matrix method of analysis of laser autofluorescence of polycrystalline films of dried uterine cavity peritoneal fluid. A model of the generalized optical anisotropy of films of dried peritoneal fluid is proposed in order to describe the processes of laser autofluorescence. The influence of complex mechanisms of both phase (linear and circular birefringence) and amplitude (linear and circular dichroism) anisotropies is taken into consideration. The interconnections between the azimuthally stable Mueller-matrix elements characterizing laser autofluorescence and different mechanisms of optical anisotropy are determined. A statistical analysis of the coordinate distributions of such Mueller-matrix rotation invariants is proposed. On this basis, quantitative criteria (statistical moments of the first to the fourth order) for differentiating polycrystalline films of dried peritoneal fluid, group 1 (healthy donors) and group 2 (uterus endometriosis patients), are determined.
NASA Astrophysics Data System (ADS)
Ushenko, A. G.; Dubolazov, O. V.; Ushenko, Vladimir A.; Ushenko, Yu. A.; Sakhnovskiy, M. Yu.; Prydiy, O. G.; Lakusta, I. I.; Novakovskaya, O. Yu.; Melenko, S. R.
2016-12-01
This research presents the results of an investigation of the diagnostic efficiency of a new azimuthally stable Mueller-matrix method for analyzing the coordinate distributions of laser autofluorescence of dried polycrystalline films of uterine cavity peritoneal fluid. A new model of the generalized optical anisotropy of the protein networks of biological tissues is proposed in order to describe the processes of laser autofluorescence. The influence of complex mechanisms of both phase anisotropy (linear birefringence and optical activity) and linear (circular) dichroism is taken into account. The interconnections between the azimuthally stable Mueller-matrix elements characterizing laser autofluorescence and different mechanisms of optical anisotropy are determined. A statistical analysis of the coordinate distributions of such Mueller-matrix rotation invariants is proposed. On this basis, quantitative criteria (statistical moments of the 1st to the 4th order) for differentiating dried polycrystalline films of peritoneal fluid - group 1 (healthy donors) and group 2 (uterus endometriosis patients) - are estimated.
Stochastic cycle selection in active flow networks
Woodhouse, Francis G.; Forrow, Aden; Fawcett, Joanna B.; Dunkel, Jörn
2016-01-01
Active biological flow networks pervade nature and span a wide range of scales, from arterial blood vessels and bronchial mucus transport in humans to bacterial flow through porous media or plasmodial shuttle streaming in slime molds. Despite their ubiquity, little is known about the self-organization principles that govern flow statistics in such nonequilibrium networks. Here we connect concepts from lattice field theory, graph theory, and transition rate theory to understand how topology controls dynamics in a generic model for actively driven flow on a network. Our combined theoretical and numerical analysis identifies symmetry-based rules that make it possible to classify and predict the selection statistics of complex flow cycles from the network topology. The conceptual framework developed here is applicable to a broad class of biological and nonbiological far-from-equilibrium networks, including actively controlled information flows, and establishes a correspondence between active flow networks and generalized ice-type models. PMID:27382186
Gardenier, John S
2012-12-01
This paper recommends how authors of statistical studies can communicate to general audiences fully, clearly, and comfortably. The studies may use statistical methods to explore issues in science, engineering, and society or they may address issues in statistics specifically. In either case, readers without explicit statistical training should have no problem understanding the issues, the methods, or the results at a non-technical level. The arguments for those results should be clear, logical, and persuasive. This paper also provides advice for editors of general journals on selecting high quality statistical articles without the need for exceptional work or expense. Finally, readers are also advised to watch out for some common errors or misuses of statistics that can be detected without a technical statistical background.
Time, frequency, and time-varying Granger-causality measures in neuroscience.
Cekic, Sezen; Grandjean, Didier; Renaud, Olivier
2018-05-20
This article proposes a systematic methodological review and an objective criticism of existing methods enabling the derivation of time, frequency, and time-varying Granger-causality statistics in neuroscience. The capacity to describe the causal links between signals recorded at different brain locations during a neuroscience experiment is indeed of primary interest for neuroscientists, who often have very precise prior hypotheses about the relationships between recorded brain signals. The increasing interest and the huge number of publications related to this topic call for this systematic review, which describes the very complex methodological aspects underlying the derivation of these statistics. In this article, we first present a general framework that allows us to review and compare Granger-causality statistics in the time domain, and the link with transfer entropy. Then, the spectral and the time-varying extensions are exposed and discussed together with their estimation and distributional properties. Although not the focus of this article, partial and conditional Granger causality, dynamical causal modelling, directed transfer function, directed coherence, partial directed coherence, and their variants are also mentioned. Copyright © 2018 John Wiley & Sons, Ltd.
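A minimal time-domain illustration of the Granger-causality statistics reviewed above, using the standard bivariate test from statsmodels on simulated signals; the lag order and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

# Two simulated "brain" signals where x drives y with a one-sample delay.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * x[t - 1] + 0.2 * y[t - 1] + rng.normal()

# Test whether x Granger-causes y (column order: [effect, cause]).
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=3)

# Extract the F-test for one lag of interest (lag 1 here).
f_stat, p_value = results[1][0]["ssr_ftest"][:2]
print(f"lag 1: F = {f_stat:.1f}, p = {p_value:.3g}")
```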
Hydration sites of unpaired RNA bases: a statistical analysis of the PDB structures.
Kirillova, Svetlana; Carugo, Oliviero
2011-10-19
Hydration is crucial for RNA structure and function. X-ray crystallography is the most commonly used method to determine RNA structures and hydration and, therefore, statistical surveys are based on crystallographic results, the number of which is quickly increasing. A statistical analysis of the water molecule distribution in high-resolution X-ray structures of unpaired RNA nucleotides showed that: different bases have the same penchant to be surrounded by water molecules; clusters of water molecules indicate possible hydration sites, which, in some cases, match those of the major and minor grooves of RNA and DNA double helices; complex hydrogen bond networks characterize the solvation of the nucleotides, resulting in a significant rigidity of the base and its surrounding water molecules. Interestingly, the hydration sites around unpaired RNA bases do not match, in general, the positions that are occupied by the second nucleotide when the base-pair is formed. The hydration sites around unpaired RNA bases were found. They do not replicate the atom positions of complementary bases in the Watson-Crick pairs.
Hydration sites of unpaired RNA bases: a statistical analysis of the PDB structures
2011-01-01
Background Hydration is crucial for RNA structure and function. X-ray crystallography is the most commonly used method to determine RNA structures and hydration and, therefore, statistical surveys are based on crystallographic results, the number of which is quickly increasing. Results A statistical analysis of the water molecule distribution in high-resolution X-ray structures of unpaired RNA nucleotides showed that: different bases have the same penchant to be surrounded by water molecules; clusters of water molecules indicate possible hydration sites, which, in some cases, match those of the major and minor grooves of RNA and DNA double helices; complex hydrogen bond networks characterize the solvation of the nucleotides, resulting in a significant rigidity of the base and its surrounding water molecules. Interestingly, the hydration sites around unpaired RNA bases do not match, in general, the positions that are occupied by the second nucleotide when the base-pair is formed. Conclusions The hydration sites around unpaired RNA bases were found. They do not replicate the atom positions of complementary bases in the Watson-Crick pairs. PMID:22011380
Dissecting the genetics of complex traits using summary association statistics.
Pasaniuc, Bogdan; Price, Alkes L
2017-02-01
During the past decade, genome-wide association studies (GWAS) have been used to successfully identify tens of thousands of genetic variants associated with complex traits and diseases. These studies have produced extensive repositories of genetic variation and trait measurements across large numbers of individuals, providing tremendous opportunities for further analyses. However, privacy concerns and other logistical considerations often limit access to individual-level genetic data, motivating the development of methods that analyse summary association statistics. Here, we review recent progress on statistical methods that leverage summary association data to gain insights into the genetic basis of complex traits and diseases.
Dissecting the genetics of complex traits using summary association statistics
Pasaniuc, Bogdan; Price, Alkes L.
2017-01-01
During the past decade, genome-wide association studies (GWAS) have successfully identified tens of thousands of genetic variants associated with complex traits and diseases. These studies have produced extensive repositories of genetic variation and trait measurements across large numbers of individuals, providing tremendous opportunities for further analyses. However, privacy concerns and other logistical considerations often limit access to individual-level genetic data, motivating the development of methods that analyze summary association statistics. Here we review recent progress on statistical methods that leverage summary association data to gain insights into the genetic basis of complex traits and diseases. PMID:27840428
Janhunen, Pekka; Kaartokallio, Hermanni; Oksanen, Ilona; Lehto, Kirsi; Lehto, Harry
2007-02-14
Several severe glaciations occurred during the Neoproterozoic eon, and especially near its end in the Cryogenian period (630-850 Ma). While the glacial periods themselves were probably related to the continental positions being appropriate for glaciation, the general coldness of the Neoproterozoic and Cryogenian as a whole lacks specific explanation. The Cryogenian was immediately followed by the Ediacaran biota and Cambrian Metazoan, thus understanding the climate-biosphere interactions around the Cryogenian period is central to understanding the development of complex multicellular life in general. Here we present a feedback mechanism between growth of eukaryotic algal phytoplankton and climate which explains how the Earth system gradually entered the Cryogenian icehouse from the warm Mesoproterozoic greenhouse. The more abrupt termination of the Cryogenian is explained by the increase in gaseous carbon release caused by the more complex planktonic and benthic foodwebs and enhanced by a diversification of metazoan zooplankton and benthic animals. The increased ecosystem complexity caused a decrease in organic carbon burial rate, breaking the algal-climatic feedback loop of the earlier Neoproterozoic eon. Prior to the Neoproterozoic eon, eukaryotic evolution took place in a slow timescale regulated by interior cooling of the Earth and solar brightening. Evolution could have proceeded faster had these geophysical processes been faster. Thus, complex life could theoretically also be found around stars that are more massive than the Sun and have main sequence life shorter than 10 Ga. We also suggest that snow and glaciers are, in a statistical sense, important markers for conditions that may possibly promote the development of complex life on extrasolar planets.
Janhunen, Pekka; Kaartokallio, Hermanni; Oksanen, Ilona; Lehto, Kirsi; Lehto, Harry
2007-01-01
Several severe glaciations occurred during the Neoproterozoic eon, and especially near its end in the Cryogenian period (630–850 Ma). While the glacial periods themselves were probably related to the continental positions being appropriate for glaciation, the general coldness of the Neoproterozoic and Cryogenian as a whole lacks specific explanation. The Cryogenian was immediately followed by the Ediacaran biota and Cambrian Metazoan, thus understanding the climate-biosphere interactions around the Cryogenian period is central to understanding the development of complex multicellular life in general. Here we present a feedback mechanism between growth of eukaryotic algal phytoplankton and climate which explains how the Earth system gradually entered the Cryogenian icehouse from the warm Mesoproterozoic greenhouse. The more abrupt termination of the Cryogenian is explained by the increase in gaseous carbon release caused by the more complex planktonic and benthic foodwebs and enhanced by a diversification of metazoan zooplankton and benthic animals. The increased ecosystem complexity caused a decrease in organic carbon burial rate, breaking the algal-climatic feedback loop of the earlier Neoproterozoic eon. Prior to the Neoproterozoic eon, eukaryotic evolution took place in a slow timescale regulated by interior cooling of the Earth and solar brightening. Evolution could have proceeded faster had these geophysical processes been faster. Thus, complex life could theoretically also be found around stars that are more massive than the Sun and have main sequence life shorter than 10 Ga. We also suggest that snow and glaciers are, in a statistical sense, important markers for conditions that may possibly promote the development of complex life on extrasolar planets. PMID:17299594
Modified Likelihood-Based Item Fit Statistics for the Generalized Graded Unfolding Model
ERIC Educational Resources Information Center
Roberts, James S.
2008-01-01
Orlando and Thissen (2000) developed an item fit statistic for binary item response theory (IRT) models known as S-X[superscript 2]. This article generalizes their statistic to polytomous unfolding models. Four alternative formulations of S-X[superscript 2] are developed for the generalized graded unfolding model (GGUM). The GGUM is a…
Code of Federal Regulations, 2011 CFR
2011-01-01
... Regulations of the Department of Agriculture (Continued) NATIONAL AGRICULTURAL STATISTICS SERVICE, DEPARTMENT OF AGRICULTURE ORGANIZATION AND FUNCTIONS § 3600.1 General. The National Agricultural Statistics... Statistical Reporting Service concurrent with an internal restructuring. Primary NASS responsibilities are...
Code of Federal Regulations, 2010 CFR
2010-01-01
... Regulations of the Department of Agriculture (Continued) NATIONAL AGRICULTURAL STATISTICS SERVICE, DEPARTMENT OF AGRICULTURE ORGANIZATION AND FUNCTIONS § 3600.1 General. The National Agricultural Statistics... Statistical Reporting Service concurrent with an internal restructuring. Primary NASS responsibilities are...
A note on generalized Genome Scan Meta-Analysis statistics
Koziol, James A; Feng, Anne C
2005-01-01
Background Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies. Results We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, examine the limiting distribution of the GSMA statistics under the order statistic formulation, and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins on this limiting distribution. We also remark on aggregate criteria and multiple testing for determining significance of GSMA results. Conclusion Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic. PMID:15717930
On statistical inference in time series analysis of the evolution of road safety.
Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora
2013-11-01
Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include annual (or monthly) numbers of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations on the applicability of standard methods of statistical inference, which leads to under- or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating accident occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches, are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.
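As a minimal example of the ARMA-type techniques mentioned above, the sketch below fits a regression with ARMA(1,1) errors and an exposure covariate to synthetic monthly accident counts; the series length, model orders, and covariate are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
n_months = 120
exposure = 100 + 0.3 * np.arange(n_months) + rng.normal(0, 2, n_months)  # vehicle-km proxy

# Synthetic monthly accident counts with serial dependence and an exposure effect.
noise = np.zeros(n_months)
for t in range(1, n_months):
    noise[t] = 0.5 * noise[t - 1] + rng.normal(0, 5)
accidents = 200 + 0.8 * exposure + noise

# Regression on exposure with ARMA(1,1) errors (a simple ARMA-type model).
model = ARIMA(accidents, exog=exposure, order=(1, 0, 1)).fit()
print(model.summary())
```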
Watanabe, Hiroshi
2012-01-01
Procedures of statistical analysis are reviewed to provide an overview of applications of statistics for general use. Topics that are dealt with are inference on a population, comparison of two populations with respect to means and probabilities, and multiple comparisons. This study is the second part of a series in which we survey medical statistics. Arguments related to statistical associations and regressions will be made in subsequent papers.
Optimal Prediction in the Retina and Natural Motion Statistics
NASA Astrophysics Data System (ADS)
Salisbury, Jared M.; Palmer, Stephanie E.
2016-03-01
Almost all behaviors involve making predictions. Whether an organism is trying to catch prey, avoid predators, or simply move through a complex environment, the organism uses the data it collects through its senses to guide its actions by extracting from these data information about the future state of the world. A key aspect of the prediction problem is that not all features of the past sensory input have predictive power, and representing all features of the external sensory world is prohibitively costly both due to space and metabolic constraints. This leads to the hypothesis that neural systems are optimized for prediction. Here we describe theoretical and computational efforts to define and quantify the efficient representation of the predictive information by the brain. Another important feature of the prediction problem is that the physics of the world is diverse enough to contain a wide range of possible statistical ensembles, yet not all inputs are probable. Thus, the brain might not be a generalized predictive machine; it might have evolved to specifically solve the prediction problems most common in the natural environment. This paper summarizes recent results on predictive coding and optimal predictive information in the retina and suggests approaches for quantifying prediction in response to natural motion. Basic statistics of natural movies reveal that general patterns of spatiotemporal correlation are present across a wide range of scenes, though individual differences in motion type may be important for optimal processing of motion in a given ecological niche.
Computational statistics using the Bayesian Inference Engine
NASA Astrophysics Data System (ADS)
Weinberg, Martin D.
2013-09-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented and easily extended framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.
Hero, Alfred O; Rajaratnam, Bala
2016-01-01
When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimensions. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
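As a quick, hedged illustration of the sample-starved regime discussed above (this is not code from the paper; the sample size, dimension and screening threshold are arbitrary choices), the following Python sketch shows how purely spurious large sample correlations proliferate when the number of variables p grows while the number of samples n stays fixed:

# Illustrative sketch: spurious correlations when p >> n and n is held fixed.
import numpy as np

rng = np.random.default_rng(0)
n = 30                # fixed number of samples (statistical replicates)
rho_threshold = 0.5   # screening threshold on |sample correlation|

for p in (50, 500, 2000):                     # growing number of variables
    X = rng.standard_normal((n, p))           # truly independent variables
    R = np.corrcoef(X, rowvar=False)          # p x p sample correlation matrix
    iu = np.triu_indices(p, k=1)
    n_hits = int(np.sum(np.abs(R[iu]) > rho_threshold))
    print(f"p = {p:5d}: {n_hits} variable pairs exceed |r| > {rho_threshold}")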
General aviation avionics statistics : 1977.
DOT National Transportation Integrated Search
1980-06-01
This report presents avionics statistics for the 1977 general aviation (GA) aircraft fleet and is the fourth in a series. The statistics are presented in a capability group framework which enables one to relate airborne avionics equipment to the capa...
Detection of Parent-of-Origin Effects Using General Pedigree Data
Zhou, Ji-Yuan; Ding, Jie; Fung, Wing K.; Lin, Shili
2010-01-01
Genomic imprinting is an important epigenetic factor in complex traits study, which has generally been examined by testing for parent-of-origin effects of alleles. For a diallelic marker locus, the parental-asymmetry test (PAT) based on case-parents trios and its extensions to incomplete nuclear families (1-PAT and C-PAT) are simple and powerful for detecting parent-of-origin effects. However, these methods are suitable only for nuclear families and thus are not amenable to general pedigree data. Use of data from extended pedigrees, if available, may lead to more powerful methods than randomly selecting one two-generation nuclear family from each pedigree. In this study, we extend PAT to accommodate general pedigree data by proposing the pedigree PAT (PPAT) statistic, which uses all informative family trios from pedigrees. To fully utilize pedigrees with some missing genotypes, we further develop the Monte Carlo (MC) PPAT (MCPPAT) statistic based on MC sampling and estimation. Extensive simulations were carried out to evaluate the performance of the proposed methods. Under the assumption that the pedigrees and their associated affection patterns are randomly drawn from a population of pedigrees with at least one affected offspring, we demonstrated that MCPPAT is a valid test for parent-of-origin effects in the presence of association. Further, MCPPAT is much more powerful compared to PAT for trios or even PPAT for all informative family trios from the same pedigrees if there is missing data. Application of the proposed methods to a rheumatoid arthritis dataset further demonstrates the advantage of MCPPAT. PMID:19676055
To b or not to b? A nonextensive view of b-value in the Gutenberg-Richter law.
NASA Astrophysics Data System (ADS)
Vallianatos, Filippos
2014-05-01
The Gutenberg-Richter (GR) law (Gutenberg and Richter, 1944), one of the cornerstones of modern seismology, has been considered as a paradigm of the manifestation of self-organized criticality, since the dependence of the cumulative number of earthquakes on energy, i.e., the number of earthquakes with energy greater than E, behaves as a power law with the b value related to the critical exponent. A great number of seismic hazard studies have originated from this law. The Gutenberg-Richter (GR) law is an empirical relationship, which recent efforts have related to general physical principles (Kagan and Knopoff, 1981; Wesnousky, 1999; Sarlis et al., 2010; Telesca, 2012; Vallianatos and Sammonds, 2013). Nonextensive statistical mechanics, pioneered by Tsallis (Tsallis, 2009), provides a consistent theoretical framework for the study of complex systems in their nonequilibrium stationary states, systems with multifractal and self-similar structures, long-range interacting systems, etc. The Earth is such a system. In the present work we analyze the different pathways (originating in Sotolongo-Costa and Posadas, 2004; Silva et al., 2006) for extracting the generalization of the GR law as obtained in the frame of nonextensive statistical physics. We estimate the b-value and discuss its underlying physics. This research has been funded by the European Union (European Social Fund) and Greek national resources under the framework of the "THALES Program: SEISMO FEAR HELLARC" project of the "Education & Lifelong Learning" Operational Programme. References Gutenberg, B. and C. F. Richter (1944). Bull. Seismol. Soc. Am. 34, 185-188. Kagan, Y. Y. and L. Knopoff (1981). J. Geophys. Res. 86, 2853-2862. Sarlis, N., E. Skordas and P. Varotsos (2010). Physical Review E - Statistical, Nonlinear, and Soft Matter Physics 82 (2), 021110. Silva, R., G. Franca, C. Vilar and J. Alcaniz (2006). Phys. Rev. E, 73, 026102. Sotolongo-Costa, O. and A. Posadas (2004). Phys. Rev. Lett., 92, 048501. Telesca, L. (2012). Bull. Seismol. Soc. Amer., 102, 886-891. Tsallis, C. (2009). Introduction to Nonextensive Statistical Mechanics, Approaching a Complex World. Springer, New York. Vallianatos, F. and P. Sammonds (2013). Tectonophysics 590, 52-58. Wesnousky, S. G. (1999). Bull. Seismol. Soc. Am. 89, 1131-1137.
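For orientation, the classical GR frequency-magnitude relation and the standard Aki/Utsu maximum-likelihood b-value estimator can be sketched as below; this is a minimal illustration on a synthetic catalogue, not the nonextensive analysis of the abstract, and the completeness magnitude, bin width and catalogue size are arbitrary assumptions:

# Minimal sketch: classical GR law, log10 N(>=M) = a - b*M, and the Aki/Utsu
# maximum-likelihood estimate b = log10(e) / (mean(M) - (Mc - dM/2)).
import numpy as np

rng = np.random.default_rng(1)
b_true, Mc, dM = 1.0, 2.0, 0.1
beta = b_true * np.log(10.0)
# synthetic catalogue: magnitudes above the completeness level Mc follow
# an exponential distribution, which is equivalent to the GR power law
M = Mc + rng.exponential(scale=1.0 / beta, size=5000)
M = np.round(M / dM) * dM            # binned magnitudes, as in real catalogues

b_hat = np.log10(np.e) / (M.mean() - (Mc - dM / 2.0))
print(f"estimated b-value = {b_hat:.3f} (true value {b_true})")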
Skinner, Carl G; Patel, Manish M; Thomas, Jerry D; Miller, Michael A
2011-01-01
Statistical methods are pervasive in medical research and general medical literature. Understanding general statistical concepts will enhance our ability to critically appraise the current literature and ultimately improve the delivery of patient care. This article intends to provide an overview of the common statistical methods relevant to medicine.
Proof of the Spin Statistics Connection 2: Relativistic Theory
NASA Astrophysics Data System (ADS)
Santamato, Enrico; De Martini, Francesco
2017-12-01
The traditional standard theory of quantum mechanics is unable to solve the spin-statistics problem, i.e. to justify the utterly important "Pauli Exclusion Principle", except by the adoption of the complex standard relativistic quantum field theory. In a recent paper (Santamato and De Martini in Found Phys 45(7):858-873, 2015) we presented a proof of the spin-statistics problem in the nonrelativistic approximation on the basis of the "Conformal Quantum Geometrodynamics". In the present paper, by the same theory the proof of the spin-statistics theorem is extended to the relativistic domain in the general scenario of curved spacetime. The relativistic approach allows one to formulate, step by step, a manifestly Weyl gauge-invariant theory and to emphasize some fundamental aspects of group theory in the demonstration. No relativistic quantum field operators are used and the particle exchange properties are drawn from the conservation of the intrinsic helicity of elementary particles. It is therefore this property, not considered in the standard quantum mechanics, which determines the correct spin-statistics connection observed in Nature (Santamato and De Martini in Found Phys 45(7):858-873, 2015). The present proof of the spin-statistics theorem is simpler than the one presented in Santamato and De Martini (Found Phys 45(7):858-873, 2015), because it is based on symmetry group considerations only, without having recourse to frames attached to the particles. Second quantization and anticommuting operators are not necessary.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
Functional neuroimaging (FNI) provides experimental access to the intact living brain, making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. Several methods are available to analyse FNI data, indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview of some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149
A generalized concept of power helped to choose optimal endpoints in clinical trials.
Borm, George F; van der Wilt, Gert J; Kremer, Jan A M; Zielhuis, Gerhard A
2007-04-01
A clinical trial may have multiple objectives. Sometimes the results for several parameters may need to be significant or meet certain other criteria. In such cases, it is important to evaluate the probability that all these objectives will be met, rather than the probability that each will be met. The purpose of this article is to introduce a definition of power that is tailored to handle this situation and that is helpful for the design of such trials. We introduce a generalized concept of power. It can handle complex situations, for example, in which there is a logical combination of partial objectives. These may be formulated not only in terms of statistical tests and confidence intervals, but also in nonstatistical terms, such as "selecting the optimal dose." The power of a trial was calculated for various objectives and combinations of objectives. The generalized concept of power may lead to power calculations that closely match the objectives of the trial and contribute to choosing more efficient endpoints and designs.
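A minimal Monte Carlo sketch of such a generalized (joint) power calculation follows; the two-endpoint layout, effect sizes, sample size and significance level are hypothetical choices, not values from the article:

# Hedged sketch: estimate the probability that *all* trial objectives are met
# (here, two endpoints both significant), i.e. a generalized power.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_per_arm, n_sim, alpha = 100, 5000, 0.05
delta1, delta2 = 0.35, 0.30          # assumed standardized effects for two endpoints

hits_first, hits_joint = 0, 0
for _ in range(n_sim):
    x1, y1 = rng.normal(0, 1, n_per_arm), rng.normal(delta1, 1, n_per_arm)
    x2, y2 = rng.normal(0, 1, n_per_arm), rng.normal(delta2, 1, n_per_arm)
    s1 = stats.ttest_ind(y1, x1).pvalue < alpha
    s2 = stats.ttest_ind(y2, x2).pvalue < alpha
    hits_first += s1
    hits_joint += (s1 and s2)        # joint objective: both endpoints significant

print(f"power for endpoint 1 alone : {hits_first / n_sim:.3f}")
print(f"generalized (joint) power  : {hits_joint / n_sim:.3f}")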
Perspective: Maximum caliber is a general variational principle for dynamical systems
NASA Astrophysics Data System (ADS)
Dixit, Purushottam D.; Wagoner, Jason; Weistuch, Corey; Pressé, Steve; Ghosh, Kingshuk; Dill, Ken A.
2018-01-01
We review here Maximum Caliber (Max Cal), a general variational principle for inferring distributions of paths in dynamical processes and networks. Max Cal is to dynamical trajectories what the principle of maximum entropy is to equilibrium states or stationary populations. In Max Cal, you maximize a path entropy over all possible pathways, subject to dynamical constraints, in order to predict relative path weights. Many well-known relationships of non-equilibrium statistical physics—such as the Green-Kubo fluctuation-dissipation relations, Onsager's reciprocal relations, and Prigogine's minimum entropy production—are limited to near-equilibrium processes. Max Cal is more general. While it can readily derive these results under those limits, Max Cal is also applicable far from equilibrium. We give examples of Max Cal as a method of inference about trajectory distributions from limited data, finding reaction coordinates in bio-molecular simulations, and modeling the complex dynamics of non-thermal systems such as gene regulatory networks or the collective firing of neurons. We also survey its basis in principle and some limitations.
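Schematically, and using illustrative notation rather than the authors' own, the Max Cal variational problem can be written as a path-entropy maximization under a path-averaged constraint, whose stationary solution is an exponential weighting of trajectories:

% Schematic Max Cal functional (notation illustrative): trajectory probabilities
% p_Gamma maximize the path entropy subject to normalization and one dynamical constraint.
\mathcal{C}[\{p_\Gamma\}] = -\sum_{\Gamma} p_\Gamma \ln p_\Gamma
  - \lambda \Big(\sum_{\Gamma} p_\Gamma A_\Gamma - \langle A \rangle\Big)
  - \mu \Big(\sum_{\Gamma} p_\Gamma - 1\Big),
\qquad
\frac{\partial \mathcal{C}}{\partial p_\Gamma} = 0
\;\Longrightarrow\;
p_\Gamma = \frac{e^{-\lambda A_\Gamma}}{\sum_{\Gamma'} e^{-\lambda A_{\Gamma'}}}.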
Perspective: Maximum caliber is a general variational principle for dynamical systems.
Dixit, Purushottam D; Wagoner, Jason; Weistuch, Corey; Pressé, Steve; Ghosh, Kingshuk; Dill, Ken A
2018-01-07
We review here Maximum Caliber (Max Cal), a general variational principle for inferring distributions of paths in dynamical processes and networks. Max Cal is to dynamical trajectories what the principle of maximum entropy is to equilibrium states or stationary populations. In Max Cal, you maximize a path entropy over all possible pathways, subject to dynamical constraints, in order to predict relative path weights. Many well-known relationships of non-equilibrium statistical physics-such as the Green-Kubo fluctuation-dissipation relations, Onsager's reciprocal relations, and Prigogine's minimum entropy production-are limited to near-equilibrium processes. Max Cal is more general. While it can readily derive these results under those limits, Max Cal is also applicable far from equilibrium. We give examples of Max Cal as a method of inference about trajectory distributions from limited data, finding reaction coordinates in bio-molecular simulations, and modeling the complex dynamics of non-thermal systems such as gene regulatory networks or the collective firing of neurons. We also survey its basis in principle and some limitations.
Analysis of in vitro fertilization data with multiple outcomes using discrete time-to-event analysis
Maity, Arnab; Williams, Paige; Ryan, Louise; Missmer, Stacey; Coull, Brent; Hauser, Russ
2014-01-01
In vitro fertilization (IVF) is an increasingly common method of assisted reproductive technology. Because of the careful observation and follow-up required as part of the procedure, IVF studies provide an ideal opportunity to identify and assess clinical and demographic factors along with environmental exposures that may impact successful reproduction. A major challenge in analyzing data from IVF studies is handling the complexity and multiplicity of outcomes, resulting both from multiple opportunities for pregnancy loss within a single IVF cycle and from multiple IVF cycles. To date, most evaluations of IVF studies do not make use of the full data because of their complex structure. In this paper, we develop statistical methodology for analysis of IVF data with multiple cycles and possibly multiple failure types observed for each individual. We develop a general analysis framework based on a generalized linear modeling formulation that allows implementation of various types of models including shared frailty models, failure specific frailty models, and transitional models, using standard software. We apply our methodology to data from an IVF study conducted at the Brigham and Women’s Hospital, Massachusetts. We also summarize the performance of our proposed methods based on a simulation study. PMID:24317880
Improved disparity map analysis through the fusion of monocular image segmentations
NASA Technical Reports Server (NTRS)
Perlant, Frederic P.; Mckeown, David M.
1991-01-01
The focus is to examine how estimates of three dimensional scene structure, as encoded in a scene disparity map, can be improved by the analysis of the original monocular imagery. The utilization of surface illumination information is provided by the segmentation of the monocular image into fine surface patches of nearly homogeneous intensity to remove mismatches generated during stereo matching. These patches are used to guide a statistical analysis of the disparity map based on the assumption that such patches correspond closely with physical surfaces in the scene. Such a technique is quite independent of whether the initial disparity map was generated by automated area-based or feature-based stereo matching. Stereo analysis results are presented on a complex urban scene containing various man-made and natural features. This scene contains a variety of problems including low building height with respect to the stereo baseline, buildings and roads in complex terrain, and highly textured buildings and terrain. The improvements are demonstrated due to monocular fusion with a set of different region-based image segmentations. The generality of this approach to stereo analysis and its utility in the development of general three dimensional scene interpretation systems are also discussed.
Fitting direct covariance structures by the MSTRUCT modeling language of the CALIS procedure.
Yung, Yiu-Fai; Browne, Michael W; Zhang, Wei
2015-02-01
This paper demonstrates the usefulness and flexibility of the general structural equation modelling (SEM) approach to fitting direct covariance patterns or structures (as opposed to fitting implied covariance structures from functional relationships among variables). In particular, the MSTRUCT modelling language (or syntax) of the CALIS procedure (SAS/STAT version 9.22 or later: SAS Institute, 2010) is used to illustrate the SEM approach. The MSTRUCT modelling language supports a direct covariance pattern specification of each covariance element. It also supports the input of additional independent and dependent parameters. Model tests, fit statistics, estimates, and their standard errors are then produced under the general SEM framework. By using numerical and computational examples, the following tests of basic covariance patterns are illustrated: sphericity, compound symmetry, and multiple-group covariance patterns. Specification and testing of two complex correlation structures, the circumplex pattern and the composite direct product models with or without composite errors and scales, are also illustrated by the MSTRUCT syntax. It is concluded that the SEM approach offers a general and flexible modelling of direct covariance and correlation patterns. In conjunction with the use of SAS macros, the MSTRUCT syntax provides an easy-to-use interface for specifying and fitting complex covariance and correlation structures, even when the number of variables or parameters becomes large. © 2014 The British Psychological Society.
Rajab, Ghada Z; Suh, Soh Youn; Demer, Joseph L
2017-06-01
Dissociated strabismus complex (DSC) is an enigmatic form of strabismus that includes dissociated vertical deviation (DVD) and dissociated horizontal deviation (DHD). We employed magnetic resonance imaging (MRI) to evaluate the extraocular muscles in DSC. We studied 5 patients with DSC and mean age of 25 years (range, 12-42 years), and 15 age-matched, orthotropic control subjects. All patients had DVD; 4 also had DHD. We employed high-resolution, surface coil MRI with thin, 2 mm slices and central target fixation. Volumes of the rectus and superior oblique muscles in the region 12 mm posterior to 4 mm anterior to the globe-optic nerve junction were measured in quasi-coronal planes in central gaze. Patients with DSC had no structural abnormalities of rectus muscles or rectus pulleys or the superior oblique muscle but exhibited modest, statistically significant increased volume of all rectus muscles ranging from 20% for medial rectus to 9% for lateral rectus (P < 0.05). DSC includes various combinations of sursumduction, excycloduction, and abduction not conforming to Hering's law. We have found modest generalized enlargement of all rectus muscles. DSC is associated with generalized rectus extraocular muscle hypertrophy in the absence of other orbital abnormalities. Copyright © 2017 American Association for Pediatric Ophthalmology and Strabismus. Published by Elsevier Inc. All rights reserved.
Shepherd, Allyson R; Ali, Halimah
2015-05-01
Dental treatment is the commonest reason for a child to be in hospital in the UK. This is a shocking statistic for a preventable disease. How can we reduce the high numbers of dental general anaesthetics? It is essential that dental treatment under general anaesthesia (GA) is fully justifiable, ensuring that the right patients receive the right treatment. Guidance for general dental practitioners on when to refer a child for a dental GA is discussed. Treatment planning for this dentally high-risk group of children requires a holistic approach. It is complex and requires an experienced and competent clinical team, including dental care professionals with additional postgraduate qualifications. Often, alternative treatments are successful and a GA can be avoided. An audit of 85 patients referred for GA with Oldham Community Dental Service demonstrated 35% of patients accepted treatment with local anaesthesia only, 25% required inhalation sedation and only 25% were actually referred on for GA. Treatment for this group of patients must include the availability and provision of appropriate alternative treatment modalities, with the right staff and facilities, including those for dental general anaesthetic sessions. Ongoing follow-up within the general dental services is essential for this group of patients.
NASA Astrophysics Data System (ADS)
Glushak, P. A.; Markiv, B. B.; Tokarchuk, M. V.
2018-01-01
We present a generalization of Zubarev's nonequilibrium statistical operator method based on the principle of maximum Rényi entropy. In the framework of this approach, we obtain transport equations for the basic set of parameters of the reduced description of nonequilibrium processes in a classical system of interacting particles using Liouville equations with fractional derivatives. For a classical system of particles in a medium with a fractal structure, we obtain a non-Markovian diffusion equation with fractional spatial derivatives. For a concrete model of the frequency dependence of a memory function, we obtain a generalized Cattaneo-type diffusion equation with the spatial and temporal fractality taken into account. We present a generalization of nonequilibrium thermofield dynamics in Zubarev's nonequilibrium statistical operator method in the framework of Rényi statistics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pant, Nidhi; Das, Santanu; Mitra, Sanjit
Mild, unavoidable deviations from circular symmetry of instrumental beams along with scan strategy can give rise to measurable Statistical Isotropy (SI) violation in Cosmic Microwave Background (CMB) experiments. If not accounted for properly, this spurious signal can complicate the extraction of other SI violation signals (if any) in the data. However, estimation of this effect through exact numerical simulation is computationally intensive and time consuming. A generalized analytical formalism not only provides a quick way of estimating this signal, but also gives a detailed understanding connecting the leading beam anisotropy components to a measurable BipoSH characterisation of SI violation. In this paper, we provide an approximate generic analytical method for estimating the SI violation generated due to a non-circular (NC) beam and arbitrary scan strategy, in terms of the Bipolar Spherical Harmonic (BipoSH) spectra. Our analytical method can predict almost all the features introduced by a NC beam in a complex scan and thus reduces the need for extensive numerical simulations worth tens of thousands of CPU hours to minutes-long calculations. As an illustrative example, we use WMAP beams and scanning strategy to demonstrate the ease of use and efficiency of our method. We test all our analytical results against those from exact numerical simulations.
Causes and correlations in cambium phenology: towards an integrated framework of xylogenesis.
Rossi, Sergio; Morin, Hubert; Deslauriers, Annie
2012-03-01
Although habitually considered as a whole, xylogenesis is a complex process of division and maturation of a pool of cells where the relationship between the phenological phases generating such a growth pattern remains essentially unknown. This study investigated the causal relationships in cambium phenology of black spruce [Picea mariana (Mill.) BSP] monitored for 8 years on four sites of the boreal forest of Quebec, Canada. The dependency links connecting the timing of xylem cell differentiation and cell production were defined and the resulting causal model was analysed with d-sep tests and generalized mixed models with repeated measurements, and tested with Fisher's C statistics to determine whether and how causality propagates through the measured variables. The higher correlations were observed between the dates of emergence of the first developing cells and between the ending of the differentiation phases, while the number of cells was significantly correlated with all phenological phases. The model with eight dependency links was statistically valid for explaining the causes and correlations between the dynamics of cambium phenology. Causal modelling suggested that the phenological phases involved in xylogenesis are closely interconnected by complex relationships of cause and effect, with the onset of cell differentiation being the main factor directly or indirectly triggering all successive phases of xylem maturation.
Causes and correlations in cambium phenology: towards an integrated framework of xylogenesis
Rossi, Sergio; Morin, Hubert; Deslauriers, Annie
2012-01-01
Although habitually considered as a whole, xylogenesis is a complex process of division and maturation of a pool of cells where the relationship between the phenological phases generating such a growth pattern remains essentially unknown. This study investigated the causal relationships in cambium phenology of black spruce [Picea mariana (Mill.) BSP] monitored for 8 years on four sites of the boreal forest of Quebec, Canada. The dependency links connecting the timing of xylem cell differentiation and cell production were defined and the resulting causal model was analysed with d-sep tests and generalized mixed models with repeated measurements, and tested with Fisher’s C statistics to determine whether and how causality propagates through the measured variables. The higher correlations were observed between the dates of emergence of the first developing cells and between the ending of the differentiation phases, while the number of cells was significantly correlated with all phenological phases. The model with eight dependency links was statistically valid for explaining the causes and correlations between the dynamics of cambium phenology. Causal modelling suggested that the phenological phases involved in xylogenesis are closely interconnected by complex relationships of cause and effect, with the onset of cell differentiation being the main factor directly or indirectly triggering all successive phases of xylem maturation. PMID:22174441
Statistical Physics of Complex Substitutive Systems
NASA Astrophysics Data System (ADS)
Jin, Qing
Diffusion processes are central to human interactions. Despite extensive studies that span multiple disciplines, our knowledge is limited to spreading processes in non-substitutive systems. Yet, a considerable number of ideas, products, and behaviors spread by substitution; to adopt a new one, agents must give up an existing one. This captures the spread of scientific constructs--forcing scientists to choose, for example, a deterministic or probabilistic worldview, as well as the adoption of durable items, such as mobile phones, cars, or homes. In this dissertation, I develop a statistical physics framework to describe, quantify, and understand substitutive systems. By empirically exploring three collected high-resolution datasets pertaining to such systems, I build a mechanistic model describing substitutions, which not only analytically predicts the universal macroscopic phenomenon discovered in the collected datasets, but also accurately captures the trajectories of individual items in a complex substitutive system, demonstrating a high degree of regularity and universality in substitutive systems. I also discuss the origins and insights of the parameters in the substitution model and possible generalization form of the mathematical framework. The systematical study of substitutive systems presented in this dissertation could potentially guide the understanding and prediction of all spreading phenomena driven by substitutions, from electric cars to scientific paradigms, and from renewable energy to new healthy habits.
Zikou, Anastasia K; Xydis, Vasileios G; Astrakas, Loukas G; Nakou, Iliada; Tzarouchi, Loukia C; Tzoufi, Meropi; Argyropoulou, Maria I
2016-07-01
There is evidence of microstructural changes in normal-appearing white matter of patients with tuberous sclerosis complex. To evaluate major white matter tracts in children with tuberous sclerosis complex using tract-based spatial statistics diffusion tensor imaging (DTI) analysis. Eight children (mean age ± standard deviation: 8.5 ± 5.5 years) with an established diagnosis of tuberous sclerosis complex and 8 age-matched controls were studied. The imaging protocol consisted of T1-weighted high-resolution 3-D spoiled gradient-echo sequence and a spin-echo, echo-planar diffusion-weighted sequence. Differences in the diffusion indices were evaluated using tract-based spatial statistics. Tract-based spatial statistics showed increased axial diffusivity in the children with tuberous sclerosis complex in the superior and anterior corona radiata, the superior longitudinal fascicle, the inferior fronto-occipital fascicle, the uncinate fascicle and the anterior thalamic radiation. No significant differences were observed in fractional anisotropy, mean diffusivity and radial diffusivity between patients and control subjects. No difference was found in the diffusion indices between the baseline and follow-up examination in the patient group. Patients with tuberous sclerosis complex have increased axial diffusivity in major white matter tracts, probably related to reduced axonal integrity.
Stochastic modeling of sunshine number data
NASA Astrophysics Data System (ADS)
Brabec, Marek; Paulescu, Marius; Badescu, Viorel
2013-11-01
In this paper, we will present a unified statistical modeling framework for estimation and forecasting of sunshine number (SSN) data. Sunshine number has been proposed earlier to describe sunshine time series in qualitative terms (Theor Appl Climatol 72 (2002) 127-136) and since then it has been shown to be useful not only for theoretical purposes but also for practical considerations, e.g. those related to the development of photovoltaic energy production. Statistical modeling and prediction of SSN as a binary time series has been a challenging problem, however. Our statistical model for SSN time series is based on an underlying stochastic process formulation of Markov chain type. We will show how its transition probabilities can be efficiently estimated within a logistic regression framework. In fact, our logistic Markovian model can be relatively easily fitted via a maximum likelihood approach. This is optimal in many respects and it also enables us to use formalized statistical inference theory to obtain not only the point estimates of transition probabilities and their functions of interest, but also related uncertainties, as well as to test various hypotheses of practical interest, etc. It is straightforward to deal with non-homogeneous transition probabilities in this framework. Very importantly from both physical and practical points of view, the logistic Markov model class allows us to test hypotheses about how SSN depends on various external covariates (e.g. elevation angle, solar time, etc.) and about details of the dynamic model (order and functional shape of the Markov kernel, etc.). Therefore, using the generalized additive model (GAM) approach, we can fit and compare models of various complexity while insisting on keeping the physical interpretation of the statistical model and its parts. After introducing the Markovian model and the general approach for identification of its parameters, we will illustrate its use and performance on high resolution SSN data from the Solar Radiation Monitoring Station of the West University of Timisoara.
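A minimal sketch of the logistic-Markov idea (a synthetic binary series, a single hypothetical covariate, and no GAM smoothing, so only loosely analogous to the paper's full model) could look like this in Python:

# Hedged sketch: SSN_t in {0,1} modelled as a Markov chain whose transition
# probability depends on the previous state and an external covariate,
# estimated via logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
T = 5000
solar_elev = np.sin(2 * np.pi * np.arange(T) / 240)   # stand-in external covariate

# simulate a binary series whose "true" dynamics we then try to recover
ssn = np.zeros(T, dtype=int)
for i in range(1, T):
    logit = -0.5 + 2.0 * ssn[i - 1] + 1.5 * solar_elev[i]
    ssn[i] = rng.random() < 1.0 / (1.0 + np.exp(-logit))

# logistic-Markov fit: regress SSN_t on SSN_{t-1} and the covariate
X = np.column_stack([ssn[:-1], solar_elev[1:]])
y = ssn[1:]
model = LogisticRegression().fit(X, y)
print("coefficients (lagged SSN, covariate):", model.coef_[0], "intercept:", model.intercept_[0])
# fitted transition probabilities P(SSN_t = 1 | SSN_{t-1} = 0 or 1, covariate = 0)
print(model.predict_proba(np.array([[0, 0.0], [1, 0.0]]))[:, 1])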
Stochastic modeling of sunshine number data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brabec, Marek, E-mail: mbrabec@cs.cas.cz; Paulescu, Marius; Badescu, Viorel
2013-11-13
In this paper, we will present a unified statistical modeling framework for estimation and forecasting of sunshine number (SSN) data. Sunshine number has been proposed earlier to describe sunshine time series in qualitative terms (Theor Appl Climatol 72 (2002) 127-136) and since then it has been shown to be useful not only for theoretical purposes but also for practical considerations, e.g. those related to the development of photovoltaic energy production. Statistical modeling and prediction of SSN as a binary time series has been a challenging problem, however. Our statistical model for SSN time series is based on an underlying stochastic process formulation of Markov chain type. We will show how its transition probabilities can be efficiently estimated within a logistic regression framework. In fact, our logistic Markovian model can be relatively easily fitted via a maximum likelihood approach. This is optimal in many respects and it also enables us to use formalized statistical inference theory to obtain not only the point estimates of transition probabilities and their functions of interest, but also related uncertainties, as well as to test various hypotheses of practical interest, etc. It is straightforward to deal with non-homogeneous transition probabilities in this framework. Very importantly from both physical and practical points of view, the logistic Markov model class allows us to test hypotheses about how SSN depends on various external covariates (e.g. elevation angle, solar time, etc.) and about details of the dynamic model (order and functional shape of the Markov kernel, etc.). Therefore, using the generalized additive model (GAM) approach, we can fit and compare models of various complexity while insisting on keeping the physical interpretation of the statistical model and its parts. After introducing the Markovian model and the general approach for identification of its parameters, we will illustrate its use and performance on high resolution SSN data from the Solar Radiation Monitoring Station of the West University of Timisoara.
NASA Astrophysics Data System (ADS)
Tayurskii, Dmitrii; Abe, Sumiyoshi; Alexandre Wang, Q.
2012-11-01
The 3rd International Workshop on Statistical Physics and Mathematics for Complex Systems (SPMCS2012) was held on 25-30 August at Kazan (Volga Region) Federal University, Kazan, Russian Federation. This workshop was jointly organized by Kazan Federal University and Institut Supérieur des Matériaux et Mécaniques Avancées (ISMANS), France. The series of SPMCS workshops was created in 2008 with the aim of being an interdisciplinary incubator for the worldwide exchange of innovative ideas and information about the latest results. The first workshop was held at ISMANS, Le Mans (France) in 2008, and the second at Huazhong Normal University, Wuhan (China) in 2010. At SPMCS2012, we wished to bring together a broad community of researchers from the different branches of the rapidly developing complexity science to discuss the fundamental theoretical challenges (geometry/topology, number theory, statistical physics, dynamical systems, etc) as well as experimental and applied aspects of many practical problems (condensed matter, disordered systems, financial markets, chemistry, biology, geoscience, etc). The program of SPMCS2012 was prepared based on three categories: (i) physical and mathematical studies (quantum mechanics, generalized nonequilibrium thermodynamics, nonlinear dynamics, condensed matter physics, nanoscience); (ii) natural complex systems (physical, geophysical, chemical and biological); (iii) social, economical, political agent systems and man-made complex systems. The conference attracted 64 participants from 10 countries. There were 10 invited lectures, 12 invited talks and 28 regular oral talks in the morning and afternoon sessions. The book of Abstracts is available from the conference website (http://www.ksu.ru/conf/spmcs2012/?id=3). A round table was also held, the topic of which was 'Recent and Anticipated Future Progress in Science of Complexity', discussing a variety of questions and opinions important for the understanding of the concept of complexity itself, the behaviours of complex systems, as well as for the finding of new theoretical methods. The papers submitted to this volume were carefully reviewed by referees. We are very grateful to the referees for their very efficient and thoughtful actions. A few submitted papers were unfortunately not included based on the referee reports. As a result, 34 papers are included here. We are very grateful to the members of the international advisory committee for their recommendations of speakers for SPMCS2012. We also appreciate the behind-the-scenes work of the members of the local organizing committee in preparing the conference site, web page, mail correspondence, arrangements for excursions and accommodation, handling the financial support for participants, and so on. Finally, we acknowledge the support from Kazan Federal University. Sumiyoshi Abe Alain Le Méhauté Dmitrii Tayurskii
NASA Astrophysics Data System (ADS)
Boulton, Chris A.; Allison, Lesley C.; Lenton, Timothy M.
2014-12-01
The Atlantic Meridional Overturning Circulation (AMOC) exhibits two stable states in models of varying complexity. Shifts between alternative AMOC states are thought to have played a role in past abrupt climate changes, but the proximity of the climate system to a threshold for future AMOC collapse is unknown. Generic early warning signals of critical slowing down before AMOC collapse have been found in climate models of low and intermediate complexity. Here we show that early warning signals of AMOC collapse are present in a fully coupled atmosphere-ocean general circulation model, subject to a freshwater hosing experiment. The statistical significance of signals of increasing lag-1 autocorrelation and variance varies with latitude. They give up to 250 years of warning before AMOC collapse, after ~550 years of monitoring. Future work is needed to clarify suggested dynamical mechanisms driving critical slowing down as the AMOC collapse is approached.
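The early warning indicators mentioned above can be illustrated with a short, hedged sketch: sliding-window lag-1 autocorrelation and variance, with Kendall tau trend statistics, computed on a synthetic AR(1) series whose autocorrelation drifts towards 1; the window length and parameters are arbitrary choices, not the model experiment of the paper:

# Illustrative sketch of generic early warning signals of critical slowing down.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(4)
T = 2000
phi = np.linspace(0.2, 0.97, T)          # AR(1) coefficient drifting towards 1
x = np.zeros(T)
for i in range(1, T):
    x[i] = phi[i] * x[i - 1] + rng.normal(scale=0.1)

win = 300
ac1, var = [], []
for start in range(T - win):
    w = x[start:start + win] - x[start:start + win].mean()
    ac1.append(np.corrcoef(w[:-1], w[1:])[0, 1])   # lag-1 autocorrelation
    var.append(w.var())                            # variance

tau_ac, _ = kendalltau(np.arange(len(ac1)), ac1)
tau_var, _ = kendalltau(np.arange(len(var)), var)
print("Kendall tau trend in lag-1 autocorrelation:", round(tau_ac, 3))
print("Kendall tau trend in variance:             ", round(tau_var, 3))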
Boulton, Chris A.; Allison, Lesley C.; Lenton, Timothy M.
2014-01-01
The Atlantic Meridional Overturning Circulation (AMOC) exhibits two stable states in models of varying complexity. Shifts between alternative AMOC states are thought to have played a role in past abrupt climate changes, but the proximity of the climate system to a threshold for future AMOC collapse is unknown. Generic early warning signals of critical slowing down before AMOC collapse have been found in climate models of low and intermediate complexity. Here we show that early warning signals of AMOC collapse are present in a fully coupled atmosphere-ocean general circulation model, subject to a freshwater hosing experiment. The statistical significance of signals of increasing lag-1 autocorrelation and variance varies with latitude. They give up to 250 years of warning before AMOC collapse, after ~550 years of monitoring. Future work is needed to clarify suggested dynamical mechanisms driving critical slowing down as the AMOC collapse is approached. PMID:25482065
Boulton, Chris A; Allison, Lesley C; Lenton, Timothy M
2014-12-08
The Atlantic Meridional Overturning Circulation (AMOC) exhibits two stable states in models of varying complexity. Shifts between alternative AMOC states are thought to have played a role in past abrupt climate changes, but the proximity of the climate system to a threshold for future AMOC collapse is unknown. Generic early warning signals of critical slowing down before AMOC collapse have been found in climate models of low and intermediate complexity. Here we show that early warning signals of AMOC collapse are present in a fully coupled atmosphere-ocean general circulation model, subject to a freshwater hosing experiment. The statistical significance of signals of increasing lag-1 autocorrelation and variance varies with latitude. They give up to 250 years of warning before AMOC collapse, after ~550 years of monitoring. Future work is needed to clarify suggested dynamical mechanisms driving critical slowing down as the AMOC collapse is approached.
Various complexity measures in confined hydrogen atom
NASA Astrophysics Data System (ADS)
Majumdar, Sangita; Mukherjee, Neetik; Roy, Amlan K.
2017-11-01
Several well-known statistical measures similar to LMC and Fisher-Shannon complexity have been computed for the confined hydrogen atom in both position (r) and momentum (p) spaces. Further, a more generalized form of these quantities with the Rényi entropy (R) is explored here. The role of the scaling parameter in the exponential part is also pursued. R is evaluated taking the orders of the entropic moments α, β as (2/3, 3) in r and p spaces. Detailed systematic results of these measures with respect to variation of the confinement radius rc are presented for low-lying states such as 1s-3d, 4f and 5g. For nodal states, such as 2s, 3s and 3p, as rc increases there appears a maximum followed by a minimum in r space, for certain values of the scaling parameter. However, the corresponding p-space results lack such distinct patterns. This study reveals many other interesting features.
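As a rough numerical illustration of the measures involved (using a toy confined radial density, not the actual confined-hydrogen wavefunctions of the paper; the confinement radius and entropic order are arbitrary), the Shannon entropy, disequilibrium, an LMC-type complexity and a Rényi entropy can be computed from a discretized density as follows:

# Hedged sketch: information and complexity measures of a discretized 1D density.
import numpy as np

r_c = 5.0                                   # confinement radius (toy value)
r = np.linspace(1e-6, r_c, 4000)
dr = r[1] - r[0]
psi = np.exp(-r) * (r_c - r)                # crude confined 1s-like radial factor
rho = (psi ** 2) * r ** 2                   # radial probability density
rho /= rho.sum() * dr                       # normalize so the integral is 1

S = -(rho * np.log(rho + 1e-300)).sum() * dr          # Shannon entropy
D = (rho ** 2).sum() * dr                              # disequilibrium
C_LMC = D * np.exp(S)                                  # LMC-type statistical complexity
alpha = 2.0 / 3.0
R_alpha = np.log((rho ** alpha).sum() * dr) / (1.0 - alpha)   # Renyi entropy of order 2/3

print(f"S = {S:.4f}  D = {D:.4f}  C_LMC = {C_LMC:.4f}  R_2/3 = {R_alpha:.4f}")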
Estimation of Dynamic Systems for Gene Regulatory Networks from Dependent Time-Course Data.
Kim, Yoonji; Kim, Jaejik
2018-06-15
A dynamic system consisting of ordinary differential equations (ODEs) is a well-known tool for describing the dynamic nature of gene regulatory networks (GRNs), and the dynamic features of GRNs are usually captured through time-course gene expression data. Owing to high-throughput technologies, time-course gene expression data have complex structures such as heteroscedasticity, correlations between genes, and time dependence. Since gene experiments typically yield highly noisy data with small sample size, for a more accurate prediction of the dynamics, the complex structures should be taken into account in ODE models. Hence, this study proposes an ODE model considering such data structures and a fast and stable estimation method for the ODE parameters based on the generalized profiling approach with data smoothing techniques. The proposed method also provides statistical inference for the ODE estimator, and it is applied to a zebrafish retina cell network.
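A simplified relative of the estimation strategy described above is gradient matching: smooth the noisy trajectory with a spline, then pick ODE parameters whose right-hand side best matches the smoothed derivative. The sketch below uses a one-gene toy model with synthetic data and is only loosely analogous to the paper's generalized profiling approach:

# Minimal gradient-matching sketch for ODE parameter estimation (toy model).
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.optimize import least_squares

rng = np.random.default_rng(5)
t = np.linspace(0, 10, 40)
k_true, d_true = 1.2, 0.4                                   # production and degradation rates
x_true = (k_true / d_true) * (1 - np.exp(-d_true * t))      # solution of dx/dt = k - d*x, x(0)=0
y = x_true + rng.normal(scale=0.05, size=t.size)            # noisy expression measurements

spline = UnivariateSpline(t, y, s=0.05 * t.size)            # data smoothing step
x_hat, dx_hat = spline(t), spline.derivative()(t)

def residual(theta):
    k, d = theta
    return dx_hat - (k - d * x_hat)          # mismatch between spline slope and ODE right-hand side

fit = least_squares(residual, x0=[1.0, 1.0])
print("estimated (k, d):", fit.x, " true:", (k_true, d_true))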
POWER ANALYSIS FOR COMPLEX MEDIATIONAL DESIGNS USING MONTE CARLO METHODS
Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.
2013-01-01
Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex mediational models. The approach is based on the well known technique of generating a large number of samples in a Monte Carlo study, and estimating power as the percentage of cases in which an estimate of interest is significantly different from zero. Examples of power calculation for commonly used mediational models are provided. Power analyses for the single mediator, multiple mediators, three-path mediation, mediation with latent variables, moderated mediation, and mediation in longitudinal designs are described. Annotated sample syntax for Mplus is appended and tabled values of required sample sizes are shown for some models. PMID:23935262
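A hedged sketch of this Monte Carlo power strategy for the single-mediator case is given below; the sample size, path coefficients and the joint-significance test are illustrative choices rather than the article's tabled settings:

# Monte Carlo power for a single-mediator model X -> M -> Y (illustrative sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, n_sim, alpha = 200, 5000, 0.05
a, b, c_prime = 0.3, 0.3, 0.1            # assumed path coefficients

def coef_pvalue(predictors, y, j):
    """Two-sided p-value of the j-th predictor in an OLS fit with intercept."""
    Z = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    dof = len(y) - Z.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(Z.T @ Z)
    tval = beta[j + 1] / np.sqrt(cov[j + 1, j + 1])
    return 2.0 * stats.t.sf(abs(tval), dof)

hits = 0
for _ in range(n_sim):
    X = rng.normal(size=n)
    M = a * X + rng.normal(size=n)
    Y = b * M + c_prime * X + rng.normal(size=n)
    p_a = coef_pvalue(X[:, None], M, 0)                     # a-path: M ~ X
    p_b = coef_pvalue(np.column_stack([M, X]), Y, 0)        # b-path: Y ~ M + X
    hits += (p_a < alpha) and (p_b < alpha)                 # joint significance of a*b

print(f"estimated power to detect mediation: {hits / n_sim:.3f}")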
Sherman, Aleksandra; Grabowecky, Marcia; Suzuki, Satoru
2015-08-01
What shapes art appreciation? Much research has focused on the importance of visual features themselves (e.g., symmetry, natural scene statistics) and of the viewer's experience and expertise with specific artworks. However, even after taking these factors into account, there are considerable individual differences in art preferences. Our new result suggests that art preference is also influenced by the compatibility between visual properties and the characteristics of the viewer's visual system. Specifically, we have demonstrated, using 120 artworks from diverse periods, cultures, genres, and styles, that art appreciation is increased when the level of visual complexity within an artwork is compatible with the viewer's visual working memory capacity. The result highlights the importance of the interaction between visual features and the beholder's general visual capacity in shaping art appreciation. (c) 2015 APA, all rights reserved.
Subcontinuum mass transport of condensed hydrocarbons in nanoporous media
Falk, Kerstin; Coasne, Benoit; Pellenq, Roland; Ulm, Franz-Josef; Bocquet, Lydéric
2015-01-01
Although hydrocarbon production from unconventional reservoirs, the so-called shale gas, has exploded recently, reliable predictions of resource availability and extraction are missing because conventional tools fail to account for their ultra-low permeability and complexity. Here, we use molecular simulation and statistical mechanics to show that the continuum description—Darcy's law—fails to predict transport in the nanoporous matrix of shales (kerogen). The non-Darcy behaviour arises from strong adsorption in kerogen and the breakdown of hydrodynamics at the nanoscale, which contradict the assumption of viscous flow. Despite this complexity, all permeances collapse on a master curve with an unexpected dependence on alkane length. We rationalize this non-hydrodynamic behaviour using a molecular description capturing the scaling of permeance with alkane length and density. These results, which stress the need for a change of paradigm from classical descriptions to nanofluidic transport, have implications for shale gas but more generally for transport in nanoporous media. PMID:25901931
NASA Astrophysics Data System (ADS)
Sokolovskiy, Vladimir; Grünebohm, Anna; Buchelnikov, Vasiliy; Entel, Peter
2014-09-01
This special issue collects contributions from the participants of the "Information in Dynamical Systems and Complex Systems" workshop, which cover a wide range of important problems and new approaches that lie in the intersection of information theory and dynamical systems. The contributions include theoretical characterization and understanding of the different types of information flow and causality in general stochastic processes, inference and identification of coupling structure and parameters of system dynamics, rigorous coarse-grain modeling of network dynamical systems, and exact statistical testing of fundamental information-theoretic quantities such as the mutual information. The collective efforts reported herein reflect a modern perspective of the intimate connection between dynamical systems and information flow, leading to the promise of better understanding and modeling of natural complex systems and better/optimal design of engineering systems.
Valente, Carlo C; Bauer, Florian F; Venter, Fritz; Watson, Bruce; Nieuwoudt, Hélène H
2018-03-21
The increasingly large volumes of publicly available sensory descriptions of wine raise the question of whether this source of data can be mined to extract meaningful domain-specific information about the sensory properties of wine. We introduce a novel application of formal concept lattices, in combination with traditional statistical tests, to visualise the sensory attributes of a big data set of some 7,000 Chenin blanc and Sauvignon blanc wines. Complexity was identified as an important driver of style in hitherto uncharacterised Chenin blanc, and the sensory cues for specific styles were identified. This is the first study to apply these methods for the purpose of identifying styles within varietal wines. More generally, our interactive data visualisation and mining-driven approach opens up new investigations towards a better understanding of the complex field of sensory science.
Ion specific correlations in bulk and at biointerfaces.
Kalcher, I; Horinek, D; Netz, R R; Dzubiella, J
2009-10-21
Ion specific effects are ubiquitous in any complex colloidal or biological fluid in bulk or at interfaces. The molecular origins of these 'Hofmeister effects' are not well understood and their theoretical description poses a formidable challenge to the modeling and simulation community. On the basis of the combination of atomistically resolved molecular dynamics (MD) computer simulations and statistical mechanics approaches, we present a few selected examples of specific electrolyte effects in bulk, at simple neutral and charged interfaces, and on a short α-helical peptide. The structural complexity in these strongly Coulomb-correlated systems is highlighted and analyzed in the light of available experimental data. While in general the comparison of MD simulations to experiments often lacks quantitative agreement, mostly because molecular force fields and coarse-graining procedures remain to be optimized, the consensus as regards trends provides important insights into microscopic hydration and binding mechanisms.
The highly intelligent virtual agents for modeling financial markets
NASA Astrophysics Data System (ADS)
Yang, G.; Chen, Y.; Huang, J. P.
2016-02-01
Researchers have borrowed many theories from statistical physics, like ensemble theory, the Ising model, etc., to study complex adaptive systems through agent-based modeling. However, one fundamental difference between entities (such as spins) in physics and micro-units in complex adaptive systems is that the latter usually possess high intelligence, such as investors in financial markets. Although highly intelligent virtual agents are essential for agent-based modeling to play a full role in the study of complex adaptive systems, how to create such agents is still an open question. Hence, we propose three principles for designing high artificial intelligence in financial markets and then build a specific class of agents called iAgents based on these three principles. Finally, we evaluate the intelligence of iAgents through virtual index trading in two different stock markets. For comparison, we also include three other types of agents in this contest, namely, random traders, agents from the wealth game (modified from the famous minority game), and agents from an upgraded wealth game. As a result, iAgents perform the best, which lends strong support to the three principles. This work offers a general framework for the further development of agent-based modeling for various kinds of complex adaptive systems.
Information and complexity measures in the interface of a metal and a superconductor
NASA Astrophysics Data System (ADS)
Moustakidis, Ch. C.; Panos, C. P.
2018-06-01
Fisher information, Shannon information entropy and statistical complexity are calculated for the interface of a normal metal and a superconductor, as a function of the temperature for several materials. The order parameter Ψ(r) derived from the Ginzburg-Landau theory is used as an input, together with experimental values of the critical transition temperature Tc and the superconducting coherence length ξ0. Analytical expressions are obtained for the information and complexity measures. Thus Tc is directly related in a simple way to disorder and complexity. An analytical relation is found between the Fisher information and the energy profile of superconductivity, i.e., the ratio of the surface free energy to the bulk free energy. We verify that a simple relation holds between Shannon and Fisher information, i.e., a decomposition of a global information quantity (Shannon) in terms of two local ones (Fisher information), previously derived and verified for atoms and molecules by Liu et al. Finally, we find analytical expressions for generalized information measures like the Tsallis entropy and Fisher information. We conclude that the proper value of the non-extensivity parameter is q ≃ 1, in agreement with previous work using a different model, where q ≃ 1.005.
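As a purely numerical illustration (not the paper's analytical derivation), similar measures can be computed from the textbook one-dimensional Ginzburg-Landau kink profile, treating the normalized |Ψ|² on a finite slab as a probability density; the coherence length and slab size are arbitrary toy values:

# Hedged sketch: Shannon entropy, Fisher information and an LMC-type complexity
# for the 1D Ginzburg-Landau kink psi(x) = tanh(x / (sqrt(2)*xi)).
import numpy as np

xi = 1.0                                    # coherence length (toy units)
L = 20.0 * xi
x = np.linspace(-L, L, 8000)
dx = x[1] - x[0]
psi = np.tanh(x / (np.sqrt(2.0) * xi))      # standard 1D GL interface profile

rho = psi ** 2
rho /= rho.sum() * dx                       # normalize |psi|^2 on the slab as a density
drho = np.gradient(rho, dx)

S = -(rho * np.log(rho + 1e-300)).sum() * dx           # Shannon entropy
F = ((drho ** 2) / (rho + 1e-300)).sum() * dx          # Fisher information
D = (rho ** 2).sum() * dx                              # disequilibrium
print(f"Shannon S = {S:.4f}  Fisher I = {F:.4f}  complexity C = {D * np.exp(S):.4f}")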
Introduction to the topical issue: Nonadditive entropy and nonextensive statistical mechanics
NASA Astrophysics Data System (ADS)
Sugiyama, Masaru
Dear CMT readers, it is my pleasure to introduce you to this topical issue dealing with a research field of great current interest, nonextensive statistical mechanics. This theory was initiated by Constantino Tsallis' work in 1988 as a possible generalization of Boltzmann-Gibbs thermostatistics. It is based on a nonadditive entropy, nowadays referred to as the Tsallis entropy. Nonextensive statistical mechanics is expected to be a consistent and unified theoretical framework for describing the macroscopic properties of complex systems that are anomalous from the viewpoint of ordinary thermostatistics. In such systems, the long-standing problem regarding the relationship between statistical and dynamical laws becomes highlighted, since ergodicity and mixing may not be well realized in situations such as the edge of chaos. The phase space appears to self-organize in a structure that is not simply Euclidean but (multi)fractal. Due to this nontrivial structure, the concept of homogeneity of the system, which is the basic premise in ordinary thermodynamics, is violated, and accordingly the additivity postulate for thermodynamic quantities such as the internal energy and entropy may not be justified in general. (Physically, nonadditivity is deeply relevant to nonextensivity of a system, in which the thermodynamic quantities do not scale with size in a simple way. Typical examples are systems with long-range interactions, like self-gravitating systems as well as nonneutral charged ones.) A point of crucial importance here is that, phenomenologically, such an exotic phase-space structure has a fairly long lifetime. Therefore this state, referred to as a metaequilibrium state or a nonequilibrium stationary state, appears to be described by a generalized entropic principle different from the traditional Boltzmann-Gibbs form, even though it may eventually approach the Boltzmann-Gibbs equilibrium state. The limits t → ∞ and N → ∞ do not commute, where t and N are time and the number of particles, respectively. The present topical issue is devoted to summarizing the current status of nonextensive statistical mechanics from various perspectives. It is my hope that this issue can inform the reader about one of the foremost research areas in thermostatistics. This issue consists of eight articles. The first one, by Tsallis and Brigatti, presents a general introduction and an overview of nonextensive statistical mechanics. At first glance, generalization of the ordinary Boltzmann-Gibbs-Shannon entropy might seem completely arbitrary, but Abe's article explains how Tsallis' generalization of the statistical entropy can be uniquely characterized by both physical and mathematical principles. The article by Pluchino, Latora, and Rapisarda then presents strong evidence that nonextensive statistical mechanics is in fact relevant to nonextensive systems with long-range interactions. The articles by Rajagopal, by Wada, and by Plastino, Miller, and Plastino are concerned with the macroscopic thermodynamic properties of nonextensive statistical mechanics. Rajagopal discusses the first and second laws of thermodynamics. Wada develops a discussion of the conditions under which the nonextensive statistical-mechanical formalism is thermodynamically stable. The work of Plastino, Miller, and Plastino addresses the thermodynamic Legendre-transform structure and its robustness under generalizations of entropy. After these fundamental investigations, Sakagami and Taruya examine the theory for self-gravitating systems.
Finally, Beck presents the novel idea of so-called superstatistics, which provides nonextensive statistical mechanics with a physical interpretation based on nonequilibrium concepts including temperature fluctuations. Its applications to hydrodynamic turbulence and to pattern formation in thermal convection are also discussed. Nonextensive statistical mechanics is already a well-studied field, and a large number of works are available in the literature. The interested reader is recommended to visit the URL http://tsallis.cat.cbpf.br/TEMUCO.pdf, where one can find a comprehensive list of references to more than one thousand papers, including important results that, for lack of space, could not be mentioned in the present issue. Although there are so many published works, nonextensive statistical mechanics is still a developing field. This can naturally be understood, since the program that has been undertaken is an extremely ambitious one, making a serious attempt to enlarge the horizons of the realm of statistical mechanics. The possible influence of nonextensive statistical mechanics on continuum mechanics and thermodynamics seems to be wide and deep. I will therefore be happy if this issue contributes to attracting the interest of researchers and stimulates research activities, not only in nonextensive statistical mechanics itself but also in continuum mechanics and thermodynamics in a wider context. As the editor of the present topical issue, I would like to express my sincere thanks to all those who contributed to making this issue possible. I cordially thank Professor S. Abe for advising me on the editorial policy. Without his help, the present topical issue would never have been brought out.
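For reference, the nonadditive entropy at the center of the issue, written here in the standard discrete form with the Boltzmann constant set to one, and the nonadditivity relation that gives it its name, are

```latex
S_q[p] \;=\; \frac{1-\sum_i p_i^{\,q}}{q-1},
\qquad
\lim_{q\to 1} S_q[p] \;=\; -\sum_i p_i \ln p_i ,
\qquad
S_q(A+B) \;=\; S_q(A)+S_q(B)+(1-q)\,S_q(A)\,S_q(B)\quad(\text{$A$, $B$ independent}).
```

The last relation reduces to ordinary additivity only in the limit q → 1, which is what "nonadditive" refers to for two independent subsystems A and B.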
NASA Astrophysics Data System (ADS)
Stan Development Team
2018-01-01
Stan facilitates statistical inference at the frontiers of applied statistics and provides both a modeling language for specifying complex statistical models and a library of statistical algorithms for computing inferences with those models. These components are exposed through interfaces in environments such as R, Python, and the command line.
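As a minimal sketch of the workflow described above, the following uses the Python interface CmdStanPy (which wraps the command-line interface and assumes a working CmdStan installation); the Bernoulli model, the file name, and the data are illustrative placeholders.

```python
from cmdstanpy import CmdStanModel

# Illustrative Stan program: Bernoulli likelihood with a Beta(1, 1) prior on theta.
stan_program = """
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  theta ~ beta(1, 1);
  y ~ bernoulli(theta);
}
"""
with open("bernoulli.stan", "w") as f:
    f.write(stan_program)

model = CmdStanModel(stan_file="bernoulli.stan")   # compiles the model
data = {"N": 10, "y": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]}
fit = model.sample(data=data)                      # runs Stan's HMC/NUTS sampler
print(fit.summary())                               # posterior summary for theta
```

The same program can be run unchanged from R or from the command line, which is the point of separating the modeling language from the interfaces.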
Using complexity metrics with R-R intervals and BPM heart rate measures.
Wallot, Sebastian; Fusaroli, Riccardo; Tylén, Kristian; Jegindø, Else-Marie
2013-01-01
Lately, growing attention in the health sciences has been paid to the dynamics of heart rate as an indicator of impending failures and for prognoses. Likewise, in the social and cognitive sciences, heart rate is increasingly employed as a measure of arousal and emotional engagement, and as a marker of interpersonal coordination. However, there is no consensus about which measurements and analytical tools are most appropriate for mapping the temporal dynamics of heart rate, and quite different metrics are reported in the literature. As complexity metrics of heart rate variability depend critically on the variability of the data, different choices regarding the kind of measure can have a substantial impact on the results. In this article we compare linear and non-linear statistics on two prominent types of heart beat data, beat-to-beat intervals (R-R intervals) and beats-per-minute (BPM). As a proof of concept, we employ a simple rest-exercise-rest task and show that non-linear statistics, fractal (DFA) and recurrence (RQA) analyses, reveal information about heart beat activity above and beyond the simple level of heart rate. Non-linear statistics unveil sustained post-exercise effects on heart rate dynamics, but their power to do so critically depends on the type of data employed: while R-R intervals are readily amenable to non-linear analyses, the success of non-linear methods for BPM data critically depends on how the BPM series is constructed. Generally, "oversampled" BPM time series can be recommended, as they retain most of the information about non-linear aspects of heart beat dynamics.
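One of the non-linear measures mentioned above, detrended fluctuation analysis (DFA), is compact enough to sketch. The implementation below is a minimal first-order DFA; the synthetic R-R series, the window sizes, and the random seed are placeholders for real beat-to-beat data.

```python
import numpy as np

def dfa_exponent(series, scales):
    """First-order DFA: return the scaling exponent alpha of a 1D series."""
    y = np.cumsum(series - np.mean(series))          # integrated profile
    fluct = []
    for s in scales:
        n_win = len(y) // s
        rms = []
        for k in range(n_win):
            seg = y[k * s:(k + 1) * s]
            t = np.arange(s)
            coef = np.polyfit(t, seg, 1)              # local linear trend
            rms.append(np.sqrt(np.mean((seg - np.polyval(coef, t)) ** 2)))
        fluct.append(np.mean(rms))
    # Slope of log F(s) versus log s is the DFA exponent alpha.
    return np.polyfit(np.log(scales), np.log(fluct), 1)[0]

# Placeholder R-R series (seconds); real use would load beat-to-beat intervals.
rng = np.random.default_rng(0)
rr = 0.8 + 0.05 * rng.standard_normal(3000)
scales = np.unique(np.logspace(np.log10(16), np.log10(256), 10).astype(int))
print("DFA alpha:", dfa_exponent(rr, scales))
```

For an uncorrelated series the exponent comes out near 0.5, while resting R-R intervals from healthy subjects are typically reported with exponents closer to 1; RQA would require a separate embedding and recurrence-plot step not shown here.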
A functional U-statistic method for association analysis of sequencing data.
Jadhav, Sneha; Tong, Xiaoran; Lu, Qing
2017-11-01
Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases used to study a common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, the functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST) and found that our test attained better performance than MOST. In a real data application, we applied our method to sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.
Visualizing Teacher Education as a Complex System: A Nested Simplex System Approach
ERIC Educational Resources Information Center
Ludlow, Larry; Ell, Fiona; Cochran-Smith, Marilyn; Newton, Avery; Trefcer, Kaitlin; Klein, Kelsey; Grudnoff, Lexie; Haigh, Mavis; Hill, Mary F.
2017-01-01
Our purpose is to provide an exploratory statistical representation of initial teacher education as a complex system comprised of dynamic influential elements. More precisely, we reveal what the system looks like for differently-positioned teacher education stakeholders based on our framework for gathering, statistically analyzing, and graphically…
Impulse Response Operators for Structural Complexes
1990-05-12
systems of the complex. The statistical energy analysis (SEA) is one such device [13, 14]. The rendering of SEA from equation (21) and/or (25) lies... Propagation.] 13. L. Cremer, M. Heckl, and E.E. Ungar 1973 Structure-Borne Sound (Springer Verlag). 14. R. H. Lyon 1975 Statistical Energy Analysis of
Grozdanov, Daniel; Herascu, Nicoleta; Reinot, Tõnu; Jankowiak, Ryszard; Zazubovich, Valter
2010-03-18
Previously published and new spectral hole burning (SHB) data on the B800 band of the LH2 light-harvesting antenna complex of Rps. acidophila are analyzed in light of recent single photosynthetic complex spectroscopy (SPCS) results (for a review, see Berlin et al. Phys. Life Rev. 2007, 4, 64). It is demonstrated that, in general, SHB-related phenomena observed for the B800 band are in qualitative agreement with the SPCS data and with protein models involving multiwell multitier protein energy landscapes. Regarding the quantitative agreement, we argue that the single-molecule behavior associated with the fastest spectral diffusion (smallest barrier) tier of the protein energy landscape is inconsistent with the SHB data. The latter discrepancy can be attributed to SPCS probing not only the dynamics of the protein complex per se, but also that of the surrounding amorphous host and/or of the host-protein interface. It is argued that SHB (once improved models are developed) should also be able to provide the average magnitudes and probability distributions of light-induced spectral shifts, and could be used to determine whether SPCS probes a set of protein complexes that are both intact and statistically relevant. SHB results are consistent with B800 → B850 energy-transfer models that take into account the whole B850 density of states.
NASA Astrophysics Data System (ADS)
Matsubara, Takahiko
2003-02-01
We formulate a general method for perturbative evaluations of the statistics of smoothed cosmic fields and provide useful formulae for applying perturbation theory to various statistics. This formalism is an extensive generalization of the method used by Matsubara, who derived a weakly nonlinear formula for the genus statistic in a three-dimensional density field. After describing the general method, we apply the formalism to a series of statistics, including genus statistics, level-crossing statistics, Minkowski functionals, and a density extrema statistic, regardless of the dimensions in which each statistic is defined. The relation between the Minkowski functionals and other geometrical statistics is clarified. These statistics can be applied to several cosmic fields, including the three-dimensional density field, the three-dimensional velocity field, the two-dimensional projected density field, and so forth. The results are detailed for the second-order theory of the formalism. The effect of bias is discussed. The statistics of smoothed cosmic fields as functions of the threshold rescaled by volume fraction are discussed in the framework of second-order perturbation theory. In CDM-like models, their functional deviations from linear predictions are generally much smaller when plotted against the rescaled threshold than when plotted against the direct threshold. There is still a slight meatball shift against the rescaled threshold, which is characterized by an asymmetry in the depths of the troughs in the genus curve. A theory-motivated asymmetry factor in the genus curve is proposed.
MAGMA: Generalized Gene-Set Analysis of GWAS Data
de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle
2015-01-01
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710
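The gene analysis described above is based on a multiple regression model of the phenotype on the SNPs in a gene. One common way to make such a multi-SNP regression stable under linkage disequilibrium is to project the genotype matrix onto its principal components before regressing; the sketch below follows that route on synthetic data and is not the MAGMA implementation (the data, the variance cutoff, and the helper name are illustrative).

```python
import numpy as np
from scipy import stats

def gene_f_test(genotypes, phenotype, var_keep=0.999):
    """Gene-level association p-value: regress the phenotype on principal
    components of the gene's SNP matrix and F-test the joint fit."""
    X = genotypes - genotypes.mean(axis=0)
    # Project onto principal components to remove collinearity from LD.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = np.cumsum(s**2) / np.sum(s**2) <= var_keep
    keep[0] = True
    PC = U[:, keep] * s[keep]
    n, k = PC.shape
    # OLS of the phenotype on the PCs plus an intercept.
    D = np.column_stack([np.ones(n), PC])
    beta, *_ = np.linalg.lstsq(D, phenotype, rcond=None)
    rss1 = np.sum((phenotype - D @ beta) ** 2)
    rss0 = np.sum((phenotype - phenotype.mean()) ** 2)
    F = ((rss0 - rss1) / k) / (rss1 / (n - k - 1))
    return stats.f.sf(F, k, n - k - 1)

# Synthetic example: 500 individuals, 20 SNPs in one gene, one causal SNP.
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.3, size=(500, 20)).astype(float)
y = 0.3 * G[:, 4] + rng.standard_normal(500)
print("gene p-value:", gene_f_test(G, y))
```

A gene-set analysis, as the abstract notes, would then be built as a second regression layer on top of these gene-level statistics.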
The use of algorithmic behavioural transfer functions in parametric EO system performance models
NASA Astrophysics Data System (ADS)
Hickman, Duncan L.; Smith, Moira I.
2015-10-01
The use of mathematical models to predict the overall performance of an electro-optic (EO) system is well established as a methodology and is widely used to support requirements definition and system design and to produce performance predictions. Traditionally these models have been built from cascades of transfer functions derived from established physical theory, such as the calculation of signal levels from radiometry equations, as well as from statistical models. However, the performance of an EO system is increasingly dominated by the on-board processing of the image data, and this automated interpretation of image content is complex in nature and presents significant modelling challenges. Models and simulations of EO systems tend either to involve processing of image data as part of a performance simulation (image-flow) or else to use a series of mathematical functions that attempt to define the overall system characteristics (parametric). The former approach is generally more accurate but statistically and theoretically weak in terms of specific operational scenarios, and is also time consuming. The latter approach is generally faster but is unable to provide accurate predictions of a system's performance under operational conditions. An alternative and novel architecture is presented in this paper which combines the processing-speed advantages of parametric models with the accuracy of image-flow representations in a statistically valid framework. An additional requirement for an effective simulation is a robust software design whose architecture reflects the structure of the EO system and its interfaces. As such, the design of the simulator can be viewed as a software prototype of a new EO system or as an abstraction of an existing design. This new approach has been used successfully to model a number of complex military systems and has been shown to combine improved performance estimation with speed of computation. The paper describes the approach and architecture in detail, and example results based on a practical application are then given to illustrate the performance benefits. Finally, conclusions are drawn and comments given regarding the benefits and uses of the new approach.
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970s, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960s, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching strategy, the dependence of the result on the order in which variables enter or leave the model, and its tendency to overfit the data. We implement an information-scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information-based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960s and 1970s respectively, formed the basis of multivariate cluster analysis methodology for many years. However, shortcomings of these methods include a strong dependence on initial seed values and inaccurate results when the data depart seriously from hypersphericity. We propose new cluster analysis methods, based on genetic algorithms, that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic-algorithm-based Expectation-Maximization process that can accurately calculate the parameters describing complex clusters in a mixture-model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We showcase how these algorithms can be used to process multivariate data from astronomical observations.
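The information-scoring alternative to stepwise selection can be illustrated in a few lines: score every candidate subset of predictors with a criterion such as AICc and keep the best, instead of relying on entry/exit thresholds. The sketch below is a toy version on synthetic data; the criterion, the data, and the helper name are illustrative choices, not the dissertation's code.

```python
import itertools
import numpy as np

def aicc(y, X_cols):
    """Corrected Akaike information criterion for an OLS fit of y on the
    chosen predictor columns (plus an intercept), assuming Gaussian errors."""
    n = len(y)
    D = np.column_stack([np.ones(n)] + list(X_cols)) if X_cols else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    rss = np.sum((y - D @ beta) ** 2)
    k = D.shape[1] + 1                          # coefficients plus error variance
    return n * np.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

# Placeholder data: five candidate predictors, only x0 and x2 matter.
rng = np.random.default_rng(2)
X = rng.standard_normal((80, 5))
y = 1.5 * X[:, 0] - 2.0 * X[:, 2] + rng.standard_normal(80)

# All-subsets search scored by AICc instead of stepwise entry/exit thresholds.
best = min(
    (frozenset(sub) for r in range(X.shape[1] + 1)
     for sub in itertools.combinations(range(X.shape[1]), r)),
    key=lambda sub: aicc(y, [X[:, j] for j in sorted(sub)]),
)
print("selected predictors:", sorted(best))
```

For problems with many candidate terms the exhaustive search becomes expensive, which is where the genetic-algorithm machinery described in the second section of the dissertation becomes relevant.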
Uricchio, Lawrence H; Zaitlen, Noah A; Ye, Chun Jimmie; Witte, John S; Hernandez, Ryan D
2016-07-01
The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature. © 2016 Uricchio et al.; Published by Cold Spring Harbor Laboratory Press.
Community structural characteristics and the adoption of fluoridation.
Smith, R A
1981-01-01
A study of community structural characteristics associated with fluoridation outcomes was conducted in 47 communities. A three-part outcome distinction was utilized: communities never having publicly considered the fluoridation issue, those rejecting it, and those accepting it. The independent variables reflect the complexity of the community social and economic structure, social integration, and the centralization of authority. Results of mean comparisons show statistically significant differences between the three outcome types on the independent variables. A series of discriminant analyses provides further evidence of how the independent variables are associated with each outcome type. Non-considering communities are shown to be low in complexity and high in social integration and the centralization of governmental authority. Rejecters are shown to be high in complexity but low in social integration and centralized authority. Adopters are relatively high on all three sets of variables. Theoretical reasoning is provided to support the hypotheses and explain why these results are expected. The utility of these results, and of structural explanations in general, is discussed, especially for public/environmental health planning and political activities. PMID:7258427
International Community Development Statistical Bulletin. Spring 1968 General Edition.
ERIC Educational Resources Information Center
Community Development Foundation, New York, NY.
The Spring 1968 general edition of the International Community Development Statistical Bulletin describes its reporting system based on the International Standard Classification of Community Development Activities and a special project registration and progress form; briefly summarizes overall international data; and presents statistics on…
NASA Astrophysics Data System (ADS)
Griffiths, John D.
2015-12-01
The modern understanding of the brain as a large, complex network of interacting elements is a natural consequence of the Neuron Doctrine [1,2] that has been bolstered in recent years by the tools and concepts of connectomics. In this abstracted, network-centric view, the essence of neural and cognitive function derives from the flows of activity and information (or, more generally, causal influence) between network elements. The appropriate characterization of causality in neural systems, therefore, is a question at the very heart of systems neuroscience.
General statistical considerations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eberhardt, L L; Gilbert, R O
From NAEG plutonium environmental studies program meeting; Las Vegas, Nevada, USA (2 Oct 1973). The high sampling variability encountered in environmental plutonium studies along with high analytical costs makes it very important that efficient soil sampling plans be used. However, efficient sampling depends on explicit and simple statements of the objectives of the study. When there are multiple objectives it may be difficult to devise a wholly suitable sampling scheme. Sampling for long-term changes in plutonium concentration in soils may also be complex and expensive. Further attention to problems associated with compositing samples is recommended, as is the consistent use of random sampling as a basic technique. (auth)
Assessing physician job satisfaction and mental workload.
Boultinghouse, Oscar W; Hammack, Glenn G; Vo, Alexander H; Dittmar, Mary Lynne
2007-12-01
Physician job satisfaction and mental workload were evaluated in a pilot study of five physicians engaged in a telemedicine practice at The University of Texas Medical Branch at Galveston Electronic Health Network. Several previous studies have examined physician satisfaction with specific telemedicine applications; however, few have attempted to identify the underlying factors that contribute to physician satisfaction or lack thereof. One factor that has been found to affect well-being and functionality in the workplace, particularly with regard to human interaction with complex systems and tasks as seen in telemedicine, is mental workload. Workload is generally defined as the "cost" to a person for performing a complex task or tasks; however, prior to this study, it was unexplored as a variable that influences physician satisfaction. Two measures of job satisfaction were used: The Job Descriptive Index and the Job In General scales. Mental workload was evaluated by means of the National Aeronautics and Space Administration Task Load Index. The measures were administered by means of Web-based surveys and were given twice over a 6-month period. Nonparametric statistical analyses revealed that physician job satisfaction was generally high relative to that of the general population and other professionals. Mental workload scores associated with the practice of telemedicine in this environment are also high, and appeared stable over time. In addition, they are commensurate with scores found in individuals practicing tasks with elevated information-processing demands, such as quality control engineers and air traffic controllers. No relationship was found between the measures of job satisfaction and mental workload.
Quantifying networks complexity from information geometry viewpoint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Felice, Domenico, E-mail: domenico.felice@unicam.it; Mancini, Stefano; INFN-Sezione di Perugia, Via A. Pascoli, I-06123 Perugia
We consider a Gaussian statistical model whose parameter space is given by the variances of random variables. Underlying this model we identify networks by interpreting random variables as sitting on vertices and their correlations as weighted edges among vertices. We then associate to the parameter space a statistical manifold endowed with a Riemannian metric structure (that of Fisher-Rao). Then, in analogy with the microcanonical definition of entropy in statistical mechanics, we introduce an entropic measure of network complexity. We prove that it is invariant under network isomorphism. Above all, considering networks as simplicial complexes, we evaluate this entropy on simplexes and find that it monotonically increases with their dimension.
Hall, Lenwood W; Anderson, Ronald D
2013-01-01
The objectives of this study were to determine the relationship between Hyalella sp. abundance in four urban California streams and the following parameters: (1) 8 bulk metals (As, Cd, Cr, Cu, Pb, Hg, Ni, and Zn) and their associated sediment Threshold Effect Levels (TELs); (2) bifenthrin sediment concentrations; (3) 10 habitat metrics and total score; (4) grain size (% sand, silt and clay); (5) Total Organic Carbon (TOC); (6) dissolved oxygen; and (7) conductivity. California stream data used for this study were collected from Kirker Creek (2006 and 2007), Pleasant Grove Creek (2006, 2007 and 2008), Salinas streams (2009 and 2010) and Arcade Creek (2009 and 2010). Hyalella abundance in the four California streams generally declined when metals concentrations were elevated beyond the TELs. There was also a statistically significant negative relationship between Hyalella abundance and % silt for these 4 California streams as Hyalella were generally not present in silt areas. No statistically significant relationships were reported between Hyalella abundance and metals concentrations, bifenthrin concentrations, habitat metrics, % sand, % clay, TOC, dissolved oxygen and conductivity. The results from this study highlight the complexity of assessing which factors are responsible for determining the abundance of amphipods, such as Hyalella sp., in the natural environment.
Hawkins, Robert J; Kremer, Michael J; Swanson, Barbara; Fogg, Lou; Pierce, Penny; Pearson, Julie
2014-01-01
The level of patient satisfaction is the result of a complex set of interactions between the patient and the health care provider. It is important to quantify satisfaction with care because it involves the patient in the care experience and decreases the potential gap between expected and actual care delivered. We tested a preliminary 23-item instrument to measure patient satisfaction with general anesthesia care. The rating-scale Rasch model was chosen as the framework. Ten items were found to have sufficient evidence of stable fit statistics. Items included 2 questions related to information provided, 2 questions related to concern and kindness of the provider, and 1 question each for interpersonal skills of the provider, attention by the provider, feeling safe, well-being, privacy, and overall anesthesia satisfaction. Actions such as providing enough time to understand the anesthesia plan, answering questions related to the anesthetic, showing kindness and concern for the patient, displaying good interpersonal skills, providing adequate attention to the patient, and providing a safe environment that maintains privacy and a sense of well-being are well within the control of individual anesthesia providers and may lead to improved care as perceived by the patient.
Mi, Zhibao; Novitzky, Dimitri; Collins, Joseph F; Cooper, David KC
2015-01-01
The management of brain-dead organ donors is complex. The use of inotropic agents and replacement of depleted hormones (hormonal replacement therapy) is crucial for successful multiple organ procurement, yet the optimal hormonal replacement has not been identified, and the statistical adjustment needed to determine the best selection is not trivial. Traditional pair-wise comparisons between every pair of treatments, and multiple comparisons to all (MCA), are statistically conservative. Hsu's multiple comparisons with the best (MCB), adapted from Dunnett's multiple comparisons with control (MCC), has been used for selecting the best treatment based on continuous variables. We selected the best hormonal replacement modality for successful multiple organ procurement using a two-step approach. First, we estimated the predicted margins by constructing generalized linear models (GLM) or generalized linear mixed models (GLMM), and then we applied the multiple comparison methods to identify the best hormonal replacement modality, given that the testing of hormonal replacement modalities is independent. Based on 10-year data from the United Network for Organ Sharing (UNOS), among 16 hormonal replacement modalities, and using 95% simultaneous confidence intervals, we found that the combination of thyroid hormone, a corticosteroid, antidiuretic hormone, and insulin was the best modality for multiple organ procurement for transplantation. PMID:25565890
Bruno de Finetti: the mathematician, the statistician, the economist, the forerunner.
Rossi, C
2001-12-30
Bruno de Finetti is possibly the best-known Italian applied mathematician of the 20th century, but was he really just a mathematician? Looking at his papers it is always possible to find original and pioneering contributions to the various fields he was interested in, where he always put his mathematical "forma mentis" and skills at the service of applications, often extending standard theories and models in order to achieve more general results. Many contributions are also devoted to educational issues, in mathematics in general and in probability and statistics in particular. He really thought that mathematics, and in particular those topics related to uncertainty, should enter everyday life as a useful support to everyone's decision making. He always imagined and lived mathematics as a basic tool both for better understanding and describing complex phenomena and for helping decision makers take coherent and feasible actions. His many important contributions to the theory of probability and to mathematical statistics are well known all over the world; in the following, therefore, minor but still pioneering aspects of his work, related both to theory and to applications of mathematical tools, and to his work in the field of education and training of teachers, are presented. Copyright 2001 John Wiley & Sons, Ltd.
Datamining approaches for modeling tumor control probability.
Naqa, Issam El; Deasy, Joseph O; Mu, Yi; Huang, Ellen; Hope, Andrew J; Lindsay, Patricia E; Apte, Aditya; Alaly, James; Bradley, Jeffrey D
2010-11-01
Tumor control probability (TCP) in radiotherapy is determined by complex interactions between tumor biology, the tumor microenvironment, radiation dosimetry, and patient-related variables. The complexity of these heterogeneous variable interactions constitutes a challenge for building predictive models for routine clinical practice. We describe a datamining framework that can unravel the higher-order relationships among dosimetric dose-volume prognostic variables, interrogate various radiobiological processes, and generalize to unseen data when applied prospectively. Several datamining approaches are discussed, including dose-volume metrics, equivalent uniform dose, a mechanistic Poisson model, and model-building methods using statistical regression and machine learning techniques. Institutional datasets of non-small cell lung cancer (NSCLC) patients are used to demonstrate these methods. The performance of the different methods was evaluated using bivariate Spearman rank correlations (rs). Over-fitting was controlled via resampling methods. Using a dataset of 56 patients with primary NSCLC tumors and 23 candidate variables, we estimated GTV volume and V75 to be the best model parameters for predicting TCP using statistical resampling and a logistic model. Using these variables, the support vector machine (SVM) kernel method provided superior performance for TCP prediction, with rs=0.68 on leave-one-out testing, compared to logistic regression (rs=0.4), Poisson-based TCP (rs=0.33), and the cell-kill equivalent uniform dose model (rs=0.17). The prediction of treatment response can be improved by utilizing datamining approaches, which are able to unravel important non-linear complex interactions among model variables and have the capacity to predict on unseen data for prospective clinical applications.
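The evaluation loop described above, leave-one-out predictions scored by a Spearman rank correlation against outcome, can be sketched as follows; the features, labels, and model settings are placeholders, not the institutional NSCLC data or the paper's tuned models.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

# Placeholder dose-volume features (standing in for, e.g., GTV volume and V75)
# and a binary tumor-control outcome for 56 hypothetical patients.
rng = np.random.default_rng(3)
X = rng.standard_normal((56, 2))
y = (X[:, 0] + 0.8 * X[:, 1] + 0.7 * rng.standard_normal(56) > 0).astype(int)

def loo_spearman(model):
    """Leave-one-out predicted control probabilities correlated with the outcome."""
    preds = np.empty(len(y))
    for train, test in LeaveOneOut().split(X):
        fitted = model.fit(X[train], y[train])
        preds[test] = fitted.predict_proba(X[test])[:, 1]
    rho, _ = spearmanr(preds, y)
    return rho

print("logistic  rs:", loo_spearman(LogisticRegression()))
print("SVM (RBF) rs:", loo_spearman(SVC(kernel="rbf", probability=True)))
```

On real data the feature selection step (choosing GTV volume and V75 out of 23 candidates) would itself sit inside the resampling loop to keep the over-fitting control honest.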
Theoretical investigations on dual-beam illumination electronic speckle pattern interferometry
NASA Astrophysics Data System (ADS)
Goudemand, Nicolas
2006-07-01
Contrary to what is found in most of the existing scientific literature, where a dedicated framework is developed, the theory of speckle interferometry is (conveniently) presented here as a particular case of the more general theory of holographic interferometry. In addition to the intellectual benefit of dealing with a single unified theory, this brings many advantages when it comes to discussing fundamental topics such as the three-dimensional evolution of the complex amplitude of the diffuse optical wavefronts, the degree of approximation of the leading formulas, the loss of fringe contrast, the decorrelation effects, and the real influence of the terms generally neglected in out-of-focus regions. In the same way, the statistical properties of the speckle fields, usually treated as a separate subject, are also integrated in the theory, thus providing a comprehensive understanding of the qualitative features of speckle interferometry methods that is otherwise difficult to obtain.
NASA Astrophysics Data System (ADS)
Park, Jaeheung; Kwak, Young-Sil; Mun, Jun-Chul; Min, Kyoung-Wook
2015-12-01
In this study, we estimated the topside scale height of plasma density (Hm) using the Swarm constellation and ionosondes in Korea. The Hm above the Korean Peninsula is generally around 50 km. Statistical distributions of the topside scale height exhibited a complex dependence upon local time and season. The results were in general agreement with those of Tulasi Ram et al. (2009), who used the same method to calculate the topside scale height in a mid-latitude region. By contrast, our results did not fully coincide with those obtained by Liu et al. (2007), who used electron density profiles from the Arecibo Incoherent Scatter Radar (ISR) between 1966 and 2002. The disagreement may result from the limitations of the approximation method and data coverage used for the estimates, as well as from the inherent dependence of Hm on Geographic LONgitude (GLON).
Statistical moments of quantum-walk dynamics reveal topological quantum transitions.
Cardano, Filippo; Maffei, Maria; Massa, Francesco; Piccirillo, Bruno; de Lisio, Corrado; De Filippis, Giulio; Cataudella, Vittorio; Santamato, Enrico; Marrucci, Lorenzo
2016-04-22
Many phenomena in solid-state physics can be understood in terms of their topological properties. Recently, controlled protocols of quantum walk (QW) are proving to be effective simulators of such phenomena. Here we report the realization of a photonic QW showing both the trivial and the non-trivial topologies associated with chiral symmetry in one-dimensional (1D) periodic systems. We find that the probability distribution moments of the walker position after many steps can be used as direct indicators of the topological quantum transition: while varying a control parameter that defines the system phase, these moments exhibit a slope discontinuity at the transition point. Numerical simulations strongly support the conjecture that these features are general of 1D topological systems. Extending this approach to higher dimensions, different topological classes, and other typologies of quantum phases may offer general instruments for investigating and experimentally detecting quantum transitions in such complex systems.
Ling, Ying; Zhang, Minqiang; Locke, Kenneth D; Li, Guangming; Li, Zonglong
2016-01-01
The Circumplex Scales of Interpersonal Values (CSIV) is a 64-item self-report measure of goals from each octant of the interpersonal circumplex. We used item response theory methods to compare whether dominance models or ideal point models best described how people respond to CSIV items. Specifically, we fit a polytomous dominance model called the generalized partial credit model and an ideal point model of similar complexity called the generalized graded unfolding model to the responses of 1,893 college students. The results of both graphical comparisons of item characteristic curves and statistical comparisons of model fit suggested that an ideal point model best describes the process of responding to CSIV items. The different models produced different rank orderings of high-scoring respondents, but overall the models did not differ in their prediction of criterion variables (agentic and communal interpersonal traits and implicit motives).
Disequilibrium, complexity, the Schottky effect, and q-entropies, in paramagnetism
NASA Astrophysics Data System (ADS)
Pennini, F.; Plastino, A.
2017-12-01
We investigate connections between statistical quantifiers and paramagnetism. More concretely, we apply the notions of (i) disequilibrium and (ii) statistical complexity to a paramagnetic system of non-coupled dipoles. Interesting insights are thereby obtained. In particular, we encounter a kind of criticality associated not with the temperature but with the disequilibrium.
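The quantifiers involved can be made concrete for the simplest Schottky-type unit, a single two-level dipole in a field. The sketch below uses standard LMC-style ingredients (Shannon entropy, Euclidean disequilibrium with respect to equiprobability, and their product as the statistical complexity); the normalizations and the energy scale are illustrative choices rather than the paper's exact conventions.

```python
import numpy as np

def schottky_quantifiers(beta_eps):
    """Shannon entropy H, disequilibrium D, and LMC-style complexity C = H*D
    for a two-level system with energies +/- eps at inverse temperature beta."""
    w = np.exp([-beta_eps, beta_eps])          # Boltzmann weights for +eps and -eps
    p = w / w.sum()
    H = -np.sum(p * np.log(p)) / np.log(2)     # normalized to [0, 1] for two states
    D = np.sum((p - 0.5) ** 2)                 # distance to equiprobability
    return H, D, H * D

for x in (0.1, 1.0, 5.0):                      # x = eps / (k_B T)
    H, D, C = schottky_quantifiers(x)
    print(f"eps/kT = {x:>4}:  H = {H:.3f}  D = {D:.3f}  C = {C:.4f}")
```

At very high and very low temperature either D or H vanishes, so the complexity peaks at an intermediate temperature, which is the kind of disequilibrium-driven structure the abstract refers to.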
Rossi, Pierre; Gillet, François; Rohrbach, Emmanuelle; Diaby, Nouhou; Holliger, Christof
2009-01-01
The variability of terminal restriction fragment polymorphism analysis applied to complex microbial communities was assessed statistically. Recent technological improvements were implemented in the successive steps of the procedure, resulting in a standardized procedure which provided a high level of reproducibility. PMID:19749066
Wigner surmises and the two-dimensional homogeneous Poisson point process.
Sakhr, Jamal; Nieminen, John M
2006-04-01
We derive a set of identities that relate the higher-order interpoint spacing statistics of the two-dimensional homogeneous Poisson point process to the Wigner surmises for the higher-order spacing distributions of eigenvalues from the three classical random matrix ensembles. We also report a remarkable identity that equates the second-nearest-neighbor spacing statistics of the points of the Poisson process and the nearest-neighbor spacing statistics of complex eigenvalues from Ginibre's ensemble of 2 x 2 complex non-Hermitian random matrices.
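The lowest-order identity of this family can be written down directly: for a homogeneous two-dimensional Poisson point process, the nearest-neighbor distance r, rescaled by its mean so that s = r/⟨r⟩, is distributed as

```latex
p(s) \;=\; \frac{\pi}{2}\, s\, e^{-\pi s^{2}/4},
\qquad s \equiv \frac{r}{\langle r \rangle},
```

which is precisely the Wigner surmise for the GOE nearest-neighbor spacing distribution; the higher-order identities reported above extend this correspondence to kth-nearest-neighbor spacings and to the other classical ensembles, with the Ginibre identity playing the analogous role for complex eigenvalues.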
Safety of robotic general surgery in elderly patients.
Buchs, Nicolas C; Addeo, Pietro; Bianco, Francesco M; Ayloo, Subhashini; Elli, Enrique F; Giulianotti, Pier C
2010-08-01
As the life expectancy of people in Western countries continues to rise, so too does the number of elderly patients. In parallel, robotic surgery continues to gain increasing acceptance, allowing for more complex operations to be performed by minimally invasive approach and extending indications for surgery to this population. The aim of this study is to assess the safety of robotic general surgery in patients 70 years and older. From April 2007 to December 2009, patients 70 years and older, who underwent various robotic procedures at our institution, were stratified into three categories of surgical complexity (low, intermediate, and high). There were 73 patients, including 39 women (53.4%) and 34 men (46.6%). The median age was 75 years (range 70-88 years). There were 7, 24, and 42 patients included, respectively, in the low, intermediate, and high surgical complexity categories. Approximately 50% of patients underwent hepatic and pancreatic resections. There was no statistically significant difference between the three groups in terms of morbidity, mortality, readmission or transfusion. Mean overall operative time was 254 ± 133 min (range 15-560 min). Perioperative mortality and morbidity was 1.4% and 15.1%, respectively. Transfusion rate was 9.6%, and median length of stay was 6 days (range 0-30 days). Robotic surgery can be performed safely in the elderly population with low mortality, acceptable morbidity, and short hospital stay. Age should not be considered as a contraindication to robotic surgery even for advanced procedures.
Ionospheric scintillation studies
NASA Technical Reports Server (NTRS)
Rino, C. L.; Freemouw, E. J.
1973-01-01
The diffracted field of a monochromatic plane wave was characterized by two complex correlation functions. For a Gaussian complex field, these quantities suffice to completely define the statistics of the field. Thus, one can in principle calculate the statistics of any measurable quantity in terms of the model parameters. The best data fits were achieved for intensity statistics derived under the Gaussian statistics hypothesis. The signal structure that achieved the best fit was nearly invariant with scintillation level and irregularity source (ionosphere or solar wind). It was characterized by the fact that more than 80% of the scattered signal power is in phase quadrature with the undeviated or coherent signal component. Thus, the Gaussian-statistics hypothesis is both convenient and accurate for channel modeling work.
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (IBM VERSION)
NASA Technical Reports Server (NTRS)
Manteufel, R.
1994-01-01
The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and for providing an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module-by-module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features of the following languages is also accepted: VAX-11 FORTRAN; IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus giving the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
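The weighting mechanism described above (per-statistic weights combined into a single figure of complexity) amounts to a weighted sum over the gathered counts. The sketch below illustrates the idea; the statistic names and weights are hypothetical, not SAP's actual keyword or weight files.

```python
# Hypothetical per-module statement counts, in the spirit of SAP's module statistics
# (names and numbers are illustrative, not SAP's actual output).
module_stats = {
    "executable_statements": 120,
    "if_statements": 14,
    "goto_statements": 6,
    "do_loops": 9,
    "comment_lines": 40,
}

# Hypothetical weight table, analogous to SAP's user-editable statistical weight file.
weights = {
    "executable_statements": 1.0,
    "if_statements": 3.0,
    "goto_statements": 5.0,
    "do_loops": 2.0,
    "comment_lines": -0.2,   # in this toy scheme, comments reduce the figure of complexity
}

figure_of_complexity = sum(weights[name] * count for name, count in module_stats.items())
print("figure of complexity:", figure_of_complexity)
```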
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (DEC VAX VERSION)
NASA Technical Reports Server (NTRS)
Merwarth, P. D.
1994-01-01
The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and for providing an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module-by-module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features of the following languages is also accepted: VAX-11 FORTRAN; IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus giving the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
A New Approach to Monte Carlo Simulations in Statistical Physics
NASA Astrophysics Data System (ADS)
Landau, David P.
2002-08-01
Monte Carlo simulations [1] have become a powerful tool for the study of diverse problems in statistical/condensed matter physics. Standard methods sample the probability distribution for the states of the system, most often in the canonical ensemble, and over the past several decades enormous improvements have been made in performance. Nonetheless, difficulties arise near phase transitions, due to critical slowing down near second-order transitions and to metastability near first-order transitions, and these complications limit the applicability of the method. We shall describe a new Monte Carlo approach [2] that uses a random walk in energy space to determine the density of states directly. Once the density of states is known, all thermodynamic properties can be calculated. This approach can be extended to multi-dimensional parameter spaces and should be effective for systems with complex energy landscapes, e.g., spin glasses, protein folding models, etc. Generalizations should produce a broadly applicable optimization tool. 1. A Guide to Monte Carlo Simulations in Statistical Physics, D. P. Landau and K. Binder (Cambridge U. Press, Cambridge, 2000). 2. Fugao Wang and D. P. Landau, Phys. Rev. Lett. 86, 2050 (2001); Phys. Rev. E 64, 056101 (2001).
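The random walk in energy space described above (the Wang-Landau method of Ref. [2]) is compact enough to sketch for a small 2D Ising model. The lattice size, flatness criterion, and stopping rule below are illustrative choices; a production run would tune all three.

```python
import numpy as np

# Wang-Landau estimate of ln g(E) for a small periodic 2D Ising model,
# following the flat-histogram random walk in energy space of Ref. [2].
L = 6
N = L * L
rng = np.random.default_rng(4)
spins = rng.choice([-1, 1], size=(L, L))

def total_energy(s):
    # Nearest-neighbor Ising energy, counting each bond once via rolled arrays.
    return -int(np.sum(s * (np.roll(s, 1, axis=0) + np.roll(s, 1, axis=1))))

# For even L the energy is a multiple of 4 between -2N and 2N: bin index (E + 2N) // 4.
ln_g = np.zeros(N + 1)       # running estimate of ln g(E)
hist = np.zeros(N + 1)       # visit histogram used for the flatness check
ln_f = 1.0                   # modification factor ln f
E = total_energy(spins)
bin_of = lambda e: (e + 2 * N) // 4

while ln_f > 1e-3:                               # illustrative stopping rule
    for _ in range(5000 * N):
        i, j = rng.integers(0, L, size=2)
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        E_new = E + 2 * spins[i, j] * nb
        # Accept the flip with probability min(1, g(E) / g(E_new)).
        if np.log(rng.random()) <= ln_g[bin_of(E)] - ln_g[bin_of(E_new)]:
            spins[i, j] *= -1
            E = E_new
        ln_g[bin_of(E)] += ln_f
        hist[bin_of(E)] += 1
    visited = hist > 0
    if hist[visited].min() > 0.8 * hist[visited].mean():   # histogram "flat enough"
        hist[:] = 0.0
        ln_f /= 2.0                                        # refine and continue

# ln_g now gives g(E) up to a constant; all thermodynamic averages follow from it.
```

Once ln g(E) is known, canonical averages at any temperature are obtained by reweighting with exp(ln g(E) − E/kT), which is what makes the single run so economical.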
Gap Shape Classification using Landscape Indices and Multivariate Statistics
Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung
2016-01-01
This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics of canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters, and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated, and gap type 3 comprised the largest gaps, which were more regular in shape. Evaluation of the results using Wilks' lambda showed them to be satisfactory (p < 0.001). The agreement rate of the confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types, assessed using a one-way ANOVA, were statistically significant for all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrate the feasibility and applicability of the proposed methodology for classifying the shape of a gap. PMID:27901127
NASA Astrophysics Data System (ADS)
Lee, An-Sheng; Lu, Wei-Li; Huang, Jyh-Jaan; Chang, Queenie; Wei, Kuo-Yen; Lin, Chin-Jung; Liou, Sofia Ya Hsuan
2016-04-01
Owing to the geology and climate of Taiwan, rivers there generally carry large loads of suspended particles. After these particles settle, they become sediments, which are good sorbents for heavy metals in river systems. Consequently, sediments record the contamination footprint in regions of low flow energy, such as estuaries. Seven sediment cores were collected along the Nankan River, northern Taiwan, which is seriously contaminated by industrial, household, and agricultural inputs. Physico-chemical properties of these cores were derived from the Itrax-XRF Core Scanner and grain size analysis. In order to interpret these complex data matrices, multivariate statistical techniques (cluster analysis, factor analysis, and discriminant analysis) were applied. The statistical analysis indicates four types of sediment. One of them represents a contamination event, showing high concentrations of Cu, Zn, Pb, Ni, and Fe, and low concentrations of Si and Zr. Furthermore, three possible contamination sources for this type of sediment were revealed by factor analysis. The combination of sediment analysis and multivariate statistical techniques provides new insights into the contamination and depositional history of the Nankan River and could be similarly applied to other river systems to determine the scale of anthropogenic contamination.
NASA Astrophysics Data System (ADS)
Aldrin, John C.; Annis, Charles; Sabbagh, Harold A.; Lindgren, Eric A.
2016-02-01
A comprehensive approach to NDE and SHM characterization error (CE) evaluation is presented that follows the framework of the 'ahat-versus-a' regression analysis for POD assessment. Characterization capability evaluation is typically more complex than current POD evaluations and thus requires engineering and statistical expertise in the model-building process to ensure all key effects and interactions are addressed. Justifying the choice of statistical model and its underlying assumptions is key. Several sizing case studies are presented with detailed evaluations of the most appropriate statistical model for each data set. A model-assisted approach is introduced to help assess the reliability of NDE and SHM characterization capability under a wide range of part, environmental, and damage conditions. Best practices for using models are presented for both an eddy-current NDE sizing case study and a vibration-based SHM case study. The results of these studies demonstrate the general feasibility of the protocol, emphasize the importance of evaluating key application characteristics prior to the study, and demonstrate an approach to quantifying the role of varying SHM sensor durability and environmental conditions in characterization performance.
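In the ahat-versus-a framework, a linear regression of measured response on true flaw size, together with a decision threshold and the residual scatter, yields a POD curve. The sketch below is a simplified, censoring-free version with hypothetical numbers, not the protocol's full analysis.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical true flaw sizes a and measured responses ahat (e.g., eddy-current amplitude).
rng = np.random.default_rng(5)
a = np.linspace(0.2, 2.0, 60)
ahat = 0.1 + 0.9 * a + 0.15 * rng.standard_normal(a.size)

# ahat-versus-a regression: ahat = b0 + b1 * a + eps, eps ~ N(0, sigma^2).
b1, b0 = np.polyfit(a, ahat, 1)
sigma = np.std(ahat - (b0 + b1 * a), ddof=2)

# POD(a): probability that the response exceeds a hypothetical decision threshold.
ahat_threshold = 0.5
a_grid = np.linspace(0.2, 2.0, 200)
pod = norm.cdf((b0 + b1 * a_grid - ahat_threshold) / sigma)

# a90: smallest size detected with 90% probability (no confidence bound computed here).
a90 = a_grid[np.searchsorted(pod, 0.90)]
print(f"POD(a=1.0) = {norm.cdf((b0 + b1 * 1.0 - ahat_threshold) / sigma):.3f}, a90 ~ {a90:.2f}")
```

Characterization (sizing) error analysis extends the same regression view: instead of only asking whether the threshold is exceeded, the residual spread of ahat about the fit is itself the quantity of interest across part, environmental, and damage conditions.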
Statistical Modeling of Extreme Values and Evidence of Presence of Dragon King (DK) in Solar Wind
NASA Astrophysics Data System (ADS)
Gomes, T.; Ramos, F.; Rempel, E. L.; Silva, S.; C-L Chian, A.
2017-12-01
The solar wind constitutes a nonlinear dynamical system presenting intermittent turbulence, multifractality, and chaotic dynamics. One characteristic shared by many such complex systems is the presence of extreme events, which play an important role in several geophysical phenomena; their statistical characterization is a problem of great practical relevance. This work investigates the presence of extreme events in time series of the modulus of the interplanetary magnetic field measured by the Cluster spacecraft on February 2, 2002. One of the main results is that the solar wind near the Earth's bow shock can be modeled by the Generalized Pareto (GP) and Generalized Extreme Values (GEV) distributions. Both models present a statistically significant positive shape parameter, which implies a heavy tail in the probability distribution functions and an unbounded growth in return values as return periods become longer. There is evidence that current sheets are mainly responsible for positive values of the shape parameter. It is also shown that magnetic reconnection at the interface between two interplanetary magnetic flux ropes in the solar wind can be considered a Dragon King (DK), a class of extreme events whose formation mechanisms are fundamentally different from those of other extreme events. If magnetic reconnection can be classified as a Dragon King, there is the possibility of its identification and even its prediction. Dragon Kings had previously been identified in time series of financial crashes, nuclear power generation accidents, stock markets, and so on. It is believed that they are associated with the occurrence of extreme events in dynamical systems at phase transitions, bifurcations, crises, or tipping points.
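A minimal peaks-over-threshold sketch of the kind of Generalized Pareto modelling described above, using synthetic heavy-tailed data in place of the Cluster measurements; the threshold choice, sample size, and return-level helper are illustrative assumptions rather than the authors' pipeline. A fitted shape parameter above zero signals the heavy tail discussed in the abstract.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for |B| measurements (the real analysis used Cluster data).
rng = np.random.default_rng(0)
b_mod = rng.pareto(3.0, 50_000) + 1.0            # heavy-tailed toy series

threshold = np.quantile(b_mod, 0.99)             # peaks-over-threshold selection
excess = b_mod[b_mod > threshold] - threshold    # exceedances above the threshold

# Fit the Generalized Pareto distribution; 'shape' is the shape parameter xi.
shape, loc, scale = stats.genpareto.fit(excess, floc=0.0)
print(f"shape xi = {shape:.3f}  (xi > 0 -> heavy tail, unbounded return values)")

def return_level(period, rate, xi, sigma, u):
    """Return level for a given return period (in samples), valid for xi != 0."""
    return u + sigma / xi * ((period * rate) ** xi - 1.0)

rate = excess.size / b_mod.size                  # exceedance probability per sample
print("100000-sample return level:", return_level(1e5, rate, shape, scale, threshold))
```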
Testing alternative ground water models using cross-validation and other methods
Foglia, L.; Mehl, S.W.; Hill, M.C.; Perona, P.; Burlando, P.
2007-01-01
Many methods can be used to test alternative ground water models. Of concern in this work are methods able to (1) rank alternative models (also called model discrimination) and (2) identify observations important to parameter estimates and predictions (equivalent to the purpose served by some types of sensitivity analysis). Some of the measures investigated are computationally efficient; others are computationally demanding. The latter are generally needed to account for model nonlinearity. The efficient model discrimination methods investigated include the information criteria: the corrected Akaike information criterion, Bayesian information criterion, and generalized cross-validation. The efficient sensitivity analysis measures used are dimensionless scaled sensitivity (DSS), composite scaled sensitivity, and parameter correlation coefficient (PCC); the other statistics are DFBETAS, Cook's D, and the observation-prediction statistic. Acronyms are explained in the introduction. Cross-validation (CV) is a computationally intensive nonlinear method that is used for both model discrimination and sensitivity analysis. The methods are tested using up to five alternative parsimoniously constructed models of the ground water system of the Maggia Valley in southern Switzerland. The alternative models differ in their representation of hydraulic conductivity. A new method for graphically representing CV and sensitivity analysis results for complex models is presented and used to evaluate the utility of the efficient statistics. The results indicate that for model selection, the information criteria produce similar results at much smaller computational cost than CV. For identifying important observations, the only obviously inferior linear measure is DSS; the poor performance was expected because DSS does not include the effects of parameter correlation and PCC reveals large parameter correlations. © 2007 National Ground Water Association.
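For readers unfamiliar with the efficient model-discrimination statistics named above, a hedged sketch of the corrected Akaike information criterion (AICc) and Bayesian information criterion (BIC) for a least-squares calibration follows; the polynomial toy models and data are hypothetical stand-ins for the alternative ground water models.

```python
import numpy as np

def information_criteria(residuals, n_params):
    """AICc and BIC for a least-squares model fit (Gaussian error assumption)."""
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    sse = np.sum(residuals ** 2)
    k = n_params + 1                                   # +1 for the estimated error variance
    aic = n * np.log(sse / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = n * np.log(sse / n) + k * np.log(n)
    return aicc, bic

# Two alternative parsimonious fits to the same synthetic observations.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 40)
y = 2.0 + 1.5 * x + rng.normal(scale=0.2, size=x.size)
for degree in (1, 3):
    coef = np.polyfit(x, y, degree)
    res = y - np.polyval(coef, x)
    aicc, bic = information_criteria(res, degree + 1)
    print(f"degree {degree}: AICc = {aicc:.1f}  BIC = {bic:.1f}")
```

Lower values indicate the preferred model; both criteria penalize the extra parameters of the cubic fit when the underlying relationship is truly linear.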
Generalized theory of semiflexible polymers.
Wiggins, Paul A; Nelson, Philip C
2006-03-01
DNA bending on length scales shorter than a persistence length plays an integral role in the translation of genetic information from DNA to cellular function. Quantitative experimental studies of these biological systems have led to a renewed interest in the polymer mechanics relevant for describing the conformational free energy of DNA bending induced by protein-DNA complexes. Recent experimental results from DNA cyclization studies have cast doubt on the applicability of the canonical semiflexible polymer theory, the wormlike chain (WLC) model, to DNA bending on biologically relevant length scales. This paper develops a theory of the chain statistics of a class of generalized semiflexible polymer models. Our focus is on the theoretical development of these models and the calculation of experimental observables. To illustrate our methods, we focus on a specific, illustrative model of DNA bending. We show that the WLC model generically describes the long-length-scale chain statistics of semiflexible polymers, as predicted by renormalization group arguments. In particular, we show that either the WLC or our present model adequately describes force-extension, solution scattering, and long-contour-length cyclization experiments, regardless of the details of DNA bend elasticity. In contrast, experiments sensitive to short-length-scale chain behavior can in principle reveal dramatic departures from the linear elastic behavior assumed in the WLC model. We demonstrate this explicitly by showing that our toy model can reproduce the anomalously large short-contour-length cyclization factors recently measured by Cloutier and Widom. Finally, we discuss the applicability of these models to DNA chain statistics in the context of future experiments.
Wang, Guanghui; Wu, Wells W; Zeng, Weihua; Chou, Chung-Lin; Shen, Rong-Fong
2006-05-01
A critical step in protein biomarker discovery is the ability to contrast proteomes, a process generally referred to as quantitative proteomics. While stable-isotope labeling (e.g., ICAT, 18O- or 15N-labeling, or AQUA) remains the core technology used in mass spectrometry-based proteomic quantification, increasing efforts have been directed to the label-free approach that relies on direct comparison of peptide peak areas between LC-MS runs. This latter approach is attractive to investigators for its simplicity as well as cost effectiveness. In the present study, the reproducibility and linearity of applying a label-free approach to highly complex proteomes were evaluated. Various amounts of proteins from different proteomes were subjected to repeated LC-MS analyses using an ion trap or Fourier transform mass spectrometer. Highly reproducible data were obtained between replicated runs, as evidenced by nearly ideal Pearson's correlation coefficients (for ion peak areas or retention times) and average peak area ratios. In general, more than 50% and nearly 90% of the peptide ion ratios deviated less than 10% and 20%, respectively, from the average in duplicate runs. In addition, the ratios of the amounts of protein used correlated well with the observed average ratios of peak areas calculated from detected peptides. Furthermore, the removal of abundant proteins from the samples led to an improvement in reproducibility and linearity. A computer program has been written to automate the processing of data sets from experiments with groups of multiple samples for statistical analysis. Algorithms for outlier-resistant mean estimation and for adjusting the statistical significance threshold for multiple testing were incorporated to minimize the rate of false positives. The program was applied to quantify changes in proteomes of parental and p53-deficient HCT-116 human cells and found to yield reproducible results. Overall, this study demonstrates an alternative approach that allows global quantification of differentially expressed proteins in complex proteomes. The utility of this method for biomarker discovery is likely to synergize with future improvements in the detection sensitivity of mass spectrometers.
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences between this approach and approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.
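The core idea of a parametric likelihood approximation inside MCMC can be sketched as follows, assuming a toy stochastic simulator in place of FORMIND and Gaussian-distributed summary statistics; this is an illustration of the general recipe, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulator(theta, n_rep=100):
    """Toy stochastic model standing in for a forest simulator: returns summary statistics."""
    lam = np.exp(theta)
    counts = rng.poisson(lam, size=(n_rep, 50))
    return np.column_stack([counts.mean(axis=1), counts.var(axis=1)])

def approx_loglik(theta, observed_summary):
    """Parametric (Gaussian) likelihood approximation built from repeated simulations."""
    sims = simulator(theta)
    mu, cov = sims.mean(axis=0), np.cov(sims, rowvar=False)
    diff = observed_summary - mu
    sign, logdet = np.linalg.slogdet(cov)
    return -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet)

observed = np.array([5.1, 5.3])          # hypothetical observed summary statistics
theta, ll = np.log(3.0), -np.inf
chain = []
for _ in range(2000):                    # random-walk Metropolis using the approximate likelihood
    prop = theta + 0.1 * rng.normal()
    ll_prop = approx_loglik(prop, observed)
    if np.log(rng.uniform()) < ll_prop - ll:
        theta, ll = prop, ll_prop
    chain.append(theta)
print("posterior mean of lambda ~", np.exp(np.mean(chain[500:])))
```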
Physics-based statistical model and simulation method of RF propagation in urban environments
Pao, Hsueh-Yuan; Dvorak, Steven L.
2010-09-14
A physics-based statistical model and simulation/modeling method and system of electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient closed-form parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields from which predictions of communications capability may be made.
Generalized massive optimal data compression
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin
2018-05-01
In this paper, we provide a general procedure for optimally compressing N data down to n summary statistics, where n is equal to the number of parameters of interest. We show that compression to the score function - the gradient of the log-likelihood with respect to the parameters - yields n compressed statistics that are optimal in the sense that they preserve the Fisher information content of the data. Our method generalizes earlier work on linear Karhunen-Loève compression for Gaussian data whilst recovering both lossless linear compression and quadratic estimation as special cases when they are optimal. We give a unified treatment that also includes the general non-Gaussian case as long as mild regularity conditions are satisfied, producing optimal non-linear summary statistics when appropriate. As a worked example, we derive explicitly the n optimal compressed statistics for Gaussian data in the general case where both the mean and covariance depend on the parameters.
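A minimal numerical sketch of score compression for the simplest case, Gaussian data whose mean depends linearly on two parameters and whose covariance is fixed; all model choices below are hypothetical.

```python
import numpy as np

# Toy setup: N = 500 data points, n = 2 parameters (amplitude A and slope B) with
# mean mu(theta) = A + B*x and fixed noise covariance C = sigma^2 * I.
x = np.linspace(0.0, 1.0, 500)
sigma = 0.3
C_inv = np.eye(x.size) / sigma**2

def mean_model(theta):
    A, B = theta
    return A + B * x

def mean_gradient(theta):
    """d mu / d theta, shape (n_params, N); parameter-independent for a linear model."""
    return np.vstack([np.ones_like(x), x])

theta_fid = np.array([1.0, 2.0])                     # fiducial expansion point
rng = np.random.default_rng(3)
data = mean_model(np.array([1.1, 1.9])) + sigma * rng.normal(size=x.size)

# Score compression: t = grad(mu)^T C^{-1} (d - mu_fid), i.e. n numbers from N data.
grad = mean_gradient(theta_fid)
t = grad @ C_inv @ (data - mean_model(theta_fid))
fisher = grad @ C_inv @ grad.T                       # Fisher matrix at the fiducial point
print("compressed statistics t:", t)
print("parameter estimate:", theta_fid + np.linalg.solve(fisher, t))
```

For this linear-Gaussian toy case the compressed statistics are lossless, and the quasi-maximum-likelihood shift recovers the generating parameters up to noise.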
Potentiation Following Ballistic and Nonballistic Complexes: The Effect of Strength Level.
Suchomel, Timothy J; Sato, Kimitake; DeWeese, Brad H; Ebben, William P; Stone, Michael H
2016-07-01
Suchomel, TJ, Sato, K, DeWeese, BH, Ebben, WP, and Stone, MH. Potentiation following ballistic and nonballistic complexes: the effect of strength level. J Strength Cond Res 30(7): 1825-1833, 2016-The purpose of this study was to compare the temporal profile of strong and weak subjects during ballistic and nonballistic potentiation complexes. Eight strong (relative back squat = 2.1 ± 0.1 times body mass) and 8 weak (relative back squat = 1.6 ± 0.2 times body mass) males performed squat jumps immediately and every minute up to 10 minutes following potentiation complexes that included ballistic or nonballistic concentric-only half-squats (COHS) performed at 90% of their 1 repetition maximum COHS. Jump height (JH) and allometrically scaled peak power (PPa) were compared using a series of 2 × 12 repeated measures analyses of variance. No statistically significant strength level main effects for JH (p = 0.442) or PPa (p = 0.078) existed during the ballistic condition. In contrast, statistically significant main effects for time existed for both JH (p = 0.014) and PPa (p < 0.001); however, no statistically significant pairwise comparisons were present (p > 0.05). Statistically significant strength level main effects existed for PPa (p = 0.039) but not for JH (p = 0.137) during the nonballistic condition. Post hoc analysis revealed that the strong subjects produced statistically greater PPa than the weaker subjects (p = 0.039). Statistically significant time main effects existed for PPa (p = 0.015), but not for JH (p = 0.178). No statistically significant strength level × time interaction effects for JH (p = 0.319) or PPa (p = 0.203) were present for the ballistic or nonballistic conditions. Practical significance indicated by effect sizes and the relationships between maximum potentiation and relative strength suggest that stronger subjects potentiate earlier and to a greater extent than weaker subjects during ballistic and nonballistic potentiation complexes.
Feature maps driven no-reference image quality prediction of authentically distorted images
NASA Astrophysics Data System (ADS)
Ghadiyaram, Deepti; Bovik, Alan C.
2015-03-01
Current blind image quality prediction models rely on benchmark databases comprised of singly and synthetically distorted images, thereby learning image features that are only adequate to predict human perceived visual quality on such inauthentic distortions. However, real world images often contain complex mixtures of multiple distortions. Rather than a) discounting the effect of these mixtures of distortions on an image's perceptual quality and considering only the dominant distortion or b) using features that are only proven to be efficient for singly distorted images, we deeply study the natural scene statistics of authentically distorted images, in different color spaces and transform domains. We propose a feature-maps-driven statistical approach which avoids any latent assumptions about the type of distortion(s) contained in an image, and focuses instead on modeling the remarkable consistencies in the scene statistics of real world images in the absence of distortions. We design a deep belief network that takes model-based statistical image features derived from a very large database of authentically distorted images as input and discovers good feature representations by generalizing over different distortion types, mixtures, and severities, which are later used to learn a regressor for quality prediction. We demonstrate the remarkable competence of our features for improving automatic perceptual quality prediction on a benchmark database and on the newly designed LIVE Authentic Image Quality Challenge Database and show that our approach of combining robust statistical features and the deep belief network dramatically outperforms the state-of-the-art.
Autonomous perception and decision making in cyber-physical systems
NASA Astrophysics Data System (ADS)
Sarkar, Soumik
2011-07-01
The cyber-physical system (CPS) is a relatively new interdisciplinary technology area that includes the general class of embedded and hybrid systems. CPSs require integration of computation and physical processes that involves the aspects of physical quantities such as time, energy and space during information processing and control. The physical space is the source of information and the cyber space makes use of the generated information to make decisions. This dissertation proposes an overall architecture of autonomous perception-based decision and control of complex cyber-physical systems. Perception involves the recently developed framework of Symbolic Dynamic Filtering for abstraction of the physical world in the cyber space. For example, under this framework, sensor observations from a physical entity are discretized temporally and spatially to generate blocks of symbols, also called words, that form a language. A grammar of a language is the set of rules that determine the relationships among words to build sentences. Subsequently, a physical system is conjectured to be a linguistic source that is capable of generating a specific language. The proposed technology is validated on various (experimental and simulated) case studies that include health monitoring of aircraft gas turbine engines, detection and estimation of fatigue damage in polycrystalline alloys, and parameter identification. Control of complex cyber-physical systems involves distributed sensing, computation, and control, as well as complexity analysis. A novel statistical mechanics-inspired complexity analysis approach is proposed in this dissertation. In such a scenario of networked physical systems, the distribution of physical entities determines the underlying network topology and the interaction among the entities forms the abstract cyber space. It is envisioned that the general contributions made in this dissertation will be useful for potential application areas such as smart power grids and buildings, distributed energy systems, advanced health care procedures and future ground and air transportation systems.
Floodplain complexity and surface metrics: influences of scale and geomorphology
Scown, Murray W.; Thoms, Martin C.; DeJager, Nathan R.
2015-01-01
Many studies of fluvial geomorphology and landscape ecology examine a single river or landscape and thus lack generality, making it difficult to develop a general understanding of the linkages between landscape patterns and larger-scale driving variables. We examined the spatial complexity of eight floodplain surfaces in widely different geographic settings and determined how patterns measured at different scales relate to different environmental drivers. Floodplain surface complexity is defined as having highly variable surface conditions that are also highly organised in space. These two components of floodplain surface complexity were measured across multiple sampling scales from LiDAR-derived DEMs. The surface character and variability of each floodplain were measured using four surface metrics, namely standard deviation, skewness, coefficient of variation, and standard deviation of curvature, computed from a series of moving-window analyses with windows ranging from 50 to 1000 m in radius. The spatial organisation of each floodplain surface was measured using spatial correlograms of the four surface metrics. Surface character, variability, and spatial organisation differed among the eight floodplains, and random, fragmented, highly patchy, and simple gradient spatial patterns were exhibited, depending upon the metric and window size. Differences in surface character and variability among the floodplains became statistically stronger with increasing sampling scale (window size), as did their associations with environmental variables. Sediment yield was consistently associated with differences in surface character and variability, as were flow discharge and variability at smaller sampling scales. Floodplain width was associated with differences in the spatial organization of surface conditions at smaller sampling scales, while valley slope was weakly associated with differences in spatial organisation at larger scales. A comparison of floodplain landscape patterns measured at different scales would improve our understanding of the role that different environmental variables play at different scales and in different geomorphic settings.
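A hedged sketch of computing such moving-window surface metrics on a gridded elevation model; the synthetic array stands in for a LiDAR-derived DEM, the window sizes are in pixels rather than metres, and the coefficient of variation and curvature metrics of the study are omitted for brevity.

```python
import numpy as np
from scipy.ndimage import generic_filter
from scipy.stats import skew

# Synthetic stand-in for a LiDAR-derived floodplain DEM (the study used real DEMs).
rng = np.random.default_rng(4)
dem = np.cumsum(rng.normal(size=(100, 100)), axis=0)   # spatially correlated toy surface

for size in (5, 15, 31):                                # analogue of increasing window radius
    sd = generic_filter(dem, np.std, size=size)         # local elevation variability
    sk = generic_filter(dem, skew, size=size)           # local asymmetry of elevations
    print(f"window {size:2d}px: mean SD = {sd.mean():.2f}  mean |skewness| = {np.abs(sk).mean():.2f}")
```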
An Easy Tool to Predict Survival in Patients Receiving Radiation Therapy for Painful Bone Metastases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Westhoff, Paulien G., E-mail: p.g.westhoff@umcutrecht.nl; Graeff, Alexander de; Monninkhof, Evelyn M.
2014-11-15
Purpose: Patients with bone metastases have a widely varying survival. A reliable estimation of survival is needed for appropriate treatment strategies. Our goal was to assess the value of simple prognostic factors, namely, patient and tumor characteristics, Karnofsky performance status (KPS), and patient-reported scores of pain and quality of life, to predict survival in patients with painful bone metastases. Methods and Materials: In the Dutch Bone Metastasis Study, 1157 patients were treated with radiation therapy for painful bone metastases. At randomization, physicians determined the KPS; patients rated general health on a visual analogue scale (VAS-gh), valuation of life on a verbal rating scale (VRS-vl), and pain intensity. To assess the predictive value of the variables, we used multivariate Cox proportional hazard analyses and C-statistics for discriminative value. Of the final model, calibration was assessed. External validation was performed on a dataset of 934 patients who were treated with radiation therapy for vertebral metastases. Results: Patients had mainly breast (39%), prostate (23%), or lung cancer (25%). After a maximum of 142 weeks' follow-up, 74% of patients had died. The best predictive model included sex, primary tumor, visceral metastases, KPS, VAS-gh, and VRS-vl (C-statistic = 0.72, 95% CI = 0.70-0.74). A reduced model, with only KPS and primary tumor, showed comparable discriminative capacity (C-statistic = 0.71, 95% CI = 0.69-0.72). External validation showed a C-statistic of 0.72 (95% CI = 0.70-0.73). Calibration of the derivation and the validation dataset showed underestimation of survival. Conclusion: In predicting survival in patients with painful bone metastases, KPS combined with primary tumor was comparable to a more complex model. Considering the amount of variables in complex models and the additional burden on patients, the simple model is preferred for daily use. In addition, a risk table for survival is provided.
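As a hedged illustration of the modelling workflow (not the Dutch Bone Metastasis data or the published model), fitting a Cox proportional hazards model and computing Harrell's C-statistic might look like this, assuming the lifelines package and entirely synthetic predictors loosely analogous to KPS and primary tumour group.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.utils import concordance_index

# Hypothetical cohort: survival time (weeks), death indicator, and two predictors.
rng = np.random.default_rng(5)
n = 400
kps = rng.integers(40, 100, n)
tumour_favourable = rng.integers(0, 2, n)
hazard = np.exp(-0.03 * kps - 0.8 * tumour_favourable)
time = rng.exponential(1.0 / hazard)
event = (time < 142).astype(int)                    # administrative censoring at 142 weeks
time = np.minimum(time, 142)

df = pd.DataFrame({"time": time, "event": event, "kps": kps, "tumour": tumour_favourable})
cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print(cph.summary[["coef", "exp(coef)", "p"]])

# Discrimination: Harrell's C-statistic of the fitted risk scores.
c_stat = concordance_index(df["time"], -cph.predict_partial_hazard(df), df["event"])
print("C-statistic:", round(c_stat, 3))
```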
NASA Astrophysics Data System (ADS)
Tsallis, Constantino
2006-03-01
Boltzmann-Gibbs (BG) statistical mechanics has, for well over a century, been successfully used for many nonlinear dynamical systems which, in one way or another, exhibit strong chaos. A typical case is a classical many-body short-range-interacting Hamiltonian system (e.g., the Lennard-Jones model for a real gas at moderately high temperature). Its Lyapunov spectrum (which characterizes the sensitivity to initial conditions) includes positive values. This leads to ergodicity, the stationary state being thermal equilibrium, hence standard applicability of the BG theory is verified. The situation appears to be of a different nature for various phenomena occurring in living organisms. Indeed, such systems exhibit a complexity which does not really accommodate this standard dynamical behavior. Life appears to emerge and evolve in a kind of delicate situation, at the frontier between large order (low adaptability and long memory; typically characterized by regular dynamics, hence only nonpositive Lyapunov exponents) and large disorder (high adaptability and short memory; typically characterized by strong chaos, hence at least one positive Lyapunov exponent). Along this frontier, the maximal relevant Lyapunov exponents are either zero or close to it, characterizing what is currently referred to as weak chaos. This type of situation is shared by a great variety of similar complex phenomena in economics and linguistics, to cite but a few. BG statistical mechanics is built upon the entropy S_BG = -k ∑_i p_i ln p_i. A generalization of this form, S_q = k (1 - ∑_i p_i^q)/(q - 1) (with S_1 = S_BG), was proposed in 1988 as a basis for formulating what is nowadays called nonextensive statistical mechanics. This theory appears to be particularly adapted to nonlinear dynamical systems exhibiting, precisely, weak chaos. Here, we briefly review the theory, its dynamical foundation, its applications in a variety of disciplines (with special emphasis on living systems), and its connections with the ubiquitous scale-free networks.
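A small sketch of the two entropic forms quoted above, showing numerically that the nonextensive entropy S_q recovers the Boltzmann-Gibbs-Shannon form as q approaches 1; the probability vector is an arbitrary example.

```python
import numpy as np

def bg_entropy(p, k=1.0):
    """Boltzmann-Gibbs-Shannon entropy S_BG = -k * sum_i p_i ln p_i."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -k * np.sum(p * np.log(p))

def tsallis_entropy(p, q, k=1.0):
    """Nonextensive entropy S_q = k (1 - sum_i p_i^q) / (q - 1); S_q -> S_BG as q -> 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return bg_entropy(p, k)
    return k * (1.0 - np.sum(p ** q)) / (q - 1.0)

p = np.array([0.5, 0.25, 0.125, 0.125])
for q in (0.5, 0.999, 1.001, 2.0):
    print(f"q = {q}: S_q = {tsallis_entropy(p, q):.4f}   (S_BG = {bg_entropy(p):.4f})")
```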
de la Rúa, Nicholas M.; Bustamante, Dulce M.; Menes, Marianela; Stevens, Lori; Monroy, Carlota; Kilpatrick, William; Rizzo, Donna; Klotz, Stephen A.; Schmidt, Justin; Axen, Heather J.; Dorn, Patricia L.
2014-01-01
Phylogenetic relationships of insect vectors of parasitic diseases are important for understanding the evolution of epidemiologically relevant traits, and may be useful in vector control. The subfamily Triatominae (Hemiptera:Reduviidae) includes ~140 extant species arranged in five tribes comprised of 15 genera. The genus Triatoma is the most species-rich and contains important vectors of Trypanosoma cruzi, the causative agent of Chagas disease. Triatoma species were grouped into complexes originally by morphology and more recently with the addition of information from molecular phylogenetics (the four-complex hypothesis); however, without a strict adherence to monophyly. To date, the validity of proposed species complexes has not been tested by statistical tests of topology. The goal of this study was to clarify the systematics of 19 Triatoma species from North and Central America. We inferred their evolutionary relatedness using two independent data sets: the complete nuclear Internal Transcribed Spacer-2 ribosomal DNA (ITS-2 rDNA) and head morphometrics. In addition, we used the Shimodaira-Hasegawa statistical test of topology to assess the fit of the data to a set of competing systematic hypotheses (topologies). An unconstrained topology inferred from the ITS-2 data was compared to topologies constrained based on the four-complex hypothesis or one inferred from our morphometry results. The unconstrained topology represents a statistically significant better fit of the molecular data than either the four-complex or the morphometric topology. We propose an update to the composition of species complexes in the North and Central American Triatoma, based on a phylogeny inferred from ITS-2 as a first step towards updating the phylogeny of the complexes based on monophyly and statistical tests of topologies. PMID:24681261
Stochastic Averaging for Constrained Optimization With Application to Online Resource Allocation
NASA Astrophysics Data System (ADS)
Chen, Tianyi; Mokhtari, Aryan; Wang, Xin; Ribeiro, Alejandro; Giannakis, Georgios B.
2017-06-01
Existing approaches to resource allocation for today's stochastic networks are challenged to meet fast convergence and tolerable delay requirements. The present paper leverages online learning advances to facilitate stochastic resource allocation tasks. By recognizing the central role of Lagrange multipliers, the underlying constrained optimization problem is formulated as a machine learning task involving both training and operational modes, with the goal of learning the sought multipliers in a fast and efficient manner. To this end, an order-optimal offline learning approach is developed first for batch training, and it is then generalized to the online setting with a procedure termed learn-and-adapt. The novel resource allocation protocol combines the benefits of stochastic approximation and statistical learning to obtain low-complexity online updates with learning errors close to the statistical accuracy limits, while still preserving adaptation performance, which in the stochastic network optimization context guarantees queue stability. Analysis and simulated tests demonstrate that the proposed data-driven approach improves the delay and convergence performance of existing resource allocation schemes.
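A much-simplified sketch of the Lagrange-multiplier (dual) stochastic update underlying such schemes, for a single-resource toy problem with hypothetical costs and arrival statistics; this is a generic stochastic dual subgradient loop, not the learn-and-adapt protocol itself.

```python
import numpy as np

rng = np.random.default_rng(6)

T, eta = 20_000, 0.01          # horizon and dual stepsize
cost, x_max = 1.0, 2.0         # per-unit transmission cost and rate cap
lam = 0.0                      # Lagrange multiplier (scaled virtual queue)
served = arrived = 0.0

for t in range(T):
    a = rng.uniform(0.0, 1.6)                  # random workload arrival this slot
    # Primal step: minimize the instantaneous Lagrangian cost*x - lam*x over [0, x_max].
    x = x_max if lam > cost else 0.0
    # Dual (stochastic subgradient) step on the average-service constraint a - x <= 0.
    lam = max(0.0, lam + eta * (a - x))
    served += x
    arrived += a

print(f"average service {served/T:.3f} vs average arrivals {arrived/T:.3f}, final lambda {lam:.3f}")
```

The multiplier settles around the marginal cost, and the long-run service rate tracks the arrival rate, which is the queue-stability role played by the learned multipliers in the abstract.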
Inference in the brain: Statistics flowing in redundant population codes
Pitkow, Xaq; Angelaki, Dora E
2017-01-01
It is widely believed that the brain performs approximate probabilistic inference to estimate causal variables in the world from ambiguous sensory data. To understand these computations, we need to analyze how information is represented and transformed by the actions of nonlinear recurrent neural networks. We propose that these probabilistic computations function by a message-passing algorithm operating at the level of redundant neural populations. To explain this framework, we review its underlying concepts, including graphical models, sufficient statistics, and message-passing, and then describe how these concepts could be implemented by recurrently connected probabilistic population codes. The relevant information flow in these networks will be most interpretable at the population level, particularly for redundant neural codes. We therefore outline a general approach to identify the essential features of a neural message-passing algorithm. Finally, we argue that to reveal the most important aspects of these neural computations, we must study large-scale activity patterns during moderately complex, naturalistic behaviors. PMID:28595050
Physiological time-series analysis: what does regularity quantify?
NASA Technical Reports Server (NTRS)
Pincus, S. M.; Goldberger, A. L.
1994-01-01
Approximate entropy (ApEn) is a recently developed statistic quantifying regularity and complexity that appears to have potential application to a wide variety of physiological and clinical time-series data. The focus here is to provide a better understanding of ApEn to facilitate its proper utilization, application, and interpretation. After giving the formal mathematical description of ApEn, we provide a multistep description of the algorithm as applied to two contrasting clinical heart rate data sets. We discuss algorithm implementation and interpretation and introduce a general mathematical hypothesis of the dynamics of a wide class of diseases, indicating the utility of ApEn to test this hypothesis. We indicate the relationship of ApEn to variability measures, the Fourier spectrum, and algorithms motivated by study of chaotic dynamics. We discuss further mathematical properties of ApEn, including the choice of input parameters, statistical issues, and modeling considerations, and we conclude with a section on caveats to ensure correct ApEn utilization.
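A minimal implementation of ApEn as commonly defined (embedding dimension m, tolerance r chosen as a fraction of the series standard deviation); the synthetic white-noise and sine-wave series stand in for the clinical heart rate data discussed in the paper.

```python
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """ApEn(m, r) of a 1-D series: phi(m) - phi(m+1), with phi the average log
    frequency of template matches within tolerance r (Pincus' definition)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)                     # common heuristic choice of tolerance

    def phi(m):
        n = len(x) - m + 1
        templates = np.array([x[i:i + m] for i in range(n)])
        # Chebyshev distance between every pair of templates (self-matches included).
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        c = np.mean(dist <= r, axis=1)          # match frequency for each template
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(7)
white = rng.normal(size=1000)
regular = np.sin(0.2 * np.arange(1000))
print("ApEn white noise :", round(approximate_entropy(white), 3))    # high -> irregular
print("ApEn sine wave   :", round(approximate_entropy(regular), 3))  # low  -> regular
```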
Rejoinder: Sifting through model space
Heisey, Dennis M.; Osnas, Erik E.; Cross, Paul C.; Joly, Damien O.; Langenberg, Julia A.; Miller, Michael W.
2010-01-01
Observational data sets generated by complex processes are common in ecology. Traditionally these have been very challenging to analyze because of the limitations of available statistical tools. This seems to be changing, and these are exciting times to be involved with ecological statistics, not just because of the neo-Bayesian revival but also because of the proliferation of computationally intensive methods in general. It is now possible to fit much richer models to observational data than in the relatively recent past, which in turn has stimulated much interest in how to evaluate and compare such models. In such an immature, vibrant, and rapidly growing field, not everyone is going to agree on the best way to do things. This is reflected in the contrast of opinions offered by the discussants. Each offers a thoughtful and thought-provoking critique of our work that reflects the current thinking in a non-negligible segment of the ecological data analysis community. We want to thank them for their insights.
Stochastic Spatial Models in Ecology: A Statistical Physics Approach
NASA Astrophysics Data System (ADS)
Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.
2018-07-01
Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.
The Small World of Psychopathology
Borsboom, Denny; Cramer, Angélique O. J.; Schmittmann, Verena D.; Epskamp, Sacha; Waldorp, Lourens J.
2011-01-01
Background: Mental disorders are highly comorbid: people having one disorder are likely to have another as well. We explain empirical comorbidity patterns based on a network model of psychiatric symptoms, derived from an analysis of symptom overlap in the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV). Principal Findings: We show that a) half of the symptoms in the DSM-IV network are connected, b) the architecture of these connections conforms to a small world structure, featuring a high degree of clustering but a short average path length, and c) distances between disorders in this structure predict empirical comorbidity rates. Network simulations of Major Depressive Episode and Generalized Anxiety Disorder show that the model faithfully reproduces empirical population statistics for these disorders. Conclusions: In the network model, mental disorders are inherently complex. This explains the limited successes of genetic, neuroscientific, and etiological approaches to unravel their causes. We outline a psychosystems approach to investigate the structure and dynamics of mental disorders. PMID:22114671
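A hedged sketch of the two small-world ingredients mentioned above, high clustering and short average path length, computed with networkx on stand-in graphs rather than the DSM-IV symptom network.

```python
import networkx as nx

# Compare a Watts-Strogatz small-world graph with a random graph of equal size and density.
n, k, p = 200, 6, 0.1
sw = nx.watts_strogatz_graph(n, k, p, seed=1)
rnd = nx.gnm_random_graph(n, sw.number_of_edges(), seed=1)

for name, g in [("small-world", sw), ("random", rnd)]:
    cc = nx.average_clustering(g)
    # Average shortest path length requires a connected graph; use the giant component.
    giant = g.subgraph(max(nx.connected_components(g), key=len))
    spl = nx.average_shortest_path_length(giant)
    print(f"{name:12s} clustering = {cc:.3f}  mean path length = {spl:.2f}")
```

The small-world graph keeps path lengths comparable to the random graph while retaining an order-of-magnitude higher clustering, which is the structural signature the abstract reports for the symptom network.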
NASA Astrophysics Data System (ADS)
Barreiro, Andrea K.; Ly, Cheng
2017-08-01
Rapid experimental advances now enable simultaneous electrophysiological recording of neural activity at single-cell resolution across large regions of the nervous system. Models of this neural network activity will necessarily increase in size and complexity, thus increasing the computational cost of simulating them and the challenge of analyzing them. Here we present a method to approximate the activity and firing statistics of a general firing rate network model (of the Wilson-Cowan type) subject to noisy correlated background inputs. The method requires solving a system of transcendental equations and is fast compared to Monte Carlo simulations of coupled stochastic differential equations. We implement the method with several examples of coupled neural networks and show that the results are quantitatively accurate even with moderate coupling strengths and an appreciable amount of heterogeneity in many parameters. This work should be useful for investigating how various neural attributes qualitatively affect the spiking statistics of coupled neural networks.
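As a much-reduced illustration, the deterministic core of such a calculation, finding the fixed point of a Wilson-Cowan-type rate equation by solving a small nonlinear system, can be sketched as follows; the coupling weights and drives are hypothetical, and the paper's method additionally solves for firing statistics under correlated noisy inputs.

```python
import numpy as np
from scipy.optimize import fsolve

W = np.array([[ 2.0, -2.5],          # E->E, I->E coupling
              [ 2.5, -1.5]])          # E->I, I->I coupling
mu = np.array([0.5, 0.3])            # background drive to the E and I populations

def transfer(u):
    return 1.0 / (1.0 + np.exp(-u))             # sigmoidal rate transfer function

def fixed_point_eqs(r):
    return r - transfer(W @ r + mu)              # steady state satisfies r* = F(W r* + mu)

r_star = fsolve(fixed_point_eqs, x0=np.array([0.5, 0.5]))
print("steady-state rates (E, I):", np.round(r_star, 4))
```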
Analysis and generation of groundwater concentration time series
NASA Astrophysics Data System (ADS)
Crăciun, Maria; Vamoş, Călin; Suciu, Nicolae
2018-01-01
Concentration time series are provided by simulated concentrations of a nonreactive solute transported in groundwater, integrated over the transverse direction of a two-dimensional computational domain and recorded at the plume center of mass. The analysis of a statistical ensemble of time series reveals subtle features that are not captured by the first two moments which characterize the approximate Gaussian distribution of the two-dimensional concentration fields. The concentration time series exhibit a complex preasymptotic behavior driven by a nonstationary trend and correlated fluctuations with time-variable amplitude. Time series with almost the same statistics are generated by successively adding to a time-dependent trend a sum of linear regression terms, accounting for correlations between fluctuations around the trend and their increments in time, and terms of an amplitude modulated autoregressive noise of order one with time-varying parameter. The algorithm generalizes mixing models used in probability density function approaches. The well-known interaction by exchange with the mean mixing model is a special case consisting of a linear regression with constant coefficients.
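A simplified generator in the spirit of the algorithm described above, adding an amplitude-modulated AR(1) noise with a slowly varying coefficient to a nonstationary trend; the trend shape, modulation, and parameters are all hypothetical, and the linear regression terms of the full algorithm are omitted.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
t = np.arange(n)

trend = 1.0 - np.exp(-t / 400.0)                       # hypothetical breakthrough-like trend
phi_t = 0.95 - 0.3 * t / n                             # time-varying AR(1) parameter
amp_t = 0.05 * (1.0 + np.sin(2 * np.pi * t / 500.0))   # time-varying fluctuation amplitude

noise = np.zeros(n)
for i in range(1, n):
    noise[i] = phi_t[i] * noise[i - 1] + rng.normal()  # AR(1) with non-constant coefficient
series = trend + amp_t * noise

print("series mean/std:", series.mean().round(3), series.std().round(3))
```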
The lawful imprecision of human surface tilt estimation in natural scenes.
Kim, Seha; Burge, Johannes
2018-01-31
Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. © 2018, Kim et al.
On vital aid: the why, what and how of validation
Kleywegt, Gerard J.
2009-01-01
Limitations to the data and subjectivity in the structure-determination process may cause errors in macromolecular crystal structures. Appropriate validation techniques may be used to reveal problems in structures, ideally before they are analysed, published or deposited. Additionally, such techniques may be used a posteriori to assess the (relative) merits of a model by potential users. Weak validation methods and statistics assess how well a model reproduces the information that was used in its construction (i.e. experimental data and prior knowledge). Strong methods and statistics, on the other hand, test how well a model predicts data or information that were not used in the structure-determination process. These may be data that were excluded from the process on purpose, general knowledge about macromolecular structure, information about the biological role and biochemical activity of the molecule under study or its mutants or complexes and predictions that are based on the model and that can be tested experimentally. PMID:19171968
Statistical Physics of Cascading Failures in Complex Networks
NASA Astrophysics Data System (ADS)
Panduranga, Nagendra Kumar
Systems such as the power grid, the World Wide Web (WWW), and the internet are categorized as complex systems because of the presence of a large number of interacting elements. For example, the WWW is estimated to have a billion webpages and understanding the dynamics of such a large number of individual agents (whose individual interactions might not be fully known) is a challenging task. Complex network representations of these systems have proved to be of great utility. Statistical physics is the study of emergence of macroscopic properties of systems from the characteristics of the interactions between individual molecules. Hence, statistical physics of complex networks has been an effective approach to study these systems. In this dissertation, I have used statistical physics to study two distinct phenomena in complex systems: i) Cascading failures and ii) Shortest paths in complex networks. Understanding cascading failures is considered to be one of the "holy grails" in the study of complex systems such as the power grid, transportation networks, and economic systems. Studying failures of these systems as percolation on complex networks has proved to be insightful. Previously, cascading failures have been studied extensively using two different models: k-core percolation and interdependent networks. The first part of this work combines the two models into a general model, solves it analytically, and validates the theoretical predictions through extensive computer simulations. The phase diagram of the percolation transition has been systematically studied as one varies the average local k-core threshold and the coupling between networks. The phase diagram of the combined processes is very rich and includes novel features that do not appear in the models which study each of the processes separately. For example, the phase diagram consists of first- and second-order transition regions separated by two tricritical lines that merge together and enclose a two-stage transition region. In the two-stage transition, the size of the giant component undergoes a first-order jump at a certain occupation probability followed by a continuous second-order transition at a smaller occupation probability. Furthermore, at certain fixed interdependencies, the percolation transition cycles from first-order to second-order to two-stage to first-order as the k-core threshold is increased. We set up the analytical equations describing the phase boundaries of the two-stage transition region and we derive the critical exponents for each type of transition. Understanding the shortest paths between individual elements in systems like communication networks and social media networks is important in the study of information cascades in these systems. Often, large heterogeneity can be present in the connections between nodes in these networks. Certain sets of nodes can be more highly connected among themselves than with the nodes from other sets. These sets of nodes are often referred to as 'communities'. The second part of this work studies the effect of the presence of communities on the distribution of shortest paths in a network using a modular Erdős–Rényi network model. In this model, the number of communities and the degree of modularity of the network can be tuned using the parameters of the model. We find that the model reaches a percolation threshold while tuning the degree of modularity of the network, and the distribution of the shortest paths in the network can be used as an indicator of how the communities are connected.
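A hedged sketch of the modular Erdős–Rényi construction and the resulting shortest-path statistics, using networkx's stochastic block model with hypothetical community sizes and edge probabilities.

```python
import numpy as np
import networkx as nx

# Modular Erdos-Renyi network via a stochastic block model: m communities with
# intra-community edge probability p_in and inter-community probability p_out.
m, size = 4, 100
p_in, p_out = 0.08, 0.002                 # lowering p_out weakens inter-community links
probs = np.full((m, m), p_out)
np.fill_diagonal(probs, p_in)
G = nx.stochastic_block_model([size] * m, probs.tolist(), seed=42)

# Shortest-path length distribution on the giant component.
giant = G.subgraph(max(nx.connected_components(G), key=len))
lengths = [d for _, targets in nx.shortest_path_length(giant) for d in targets.values()]
counts = np.bincount(lengths)
print("shortest-path length distribution:", dict(enumerate(counts)))
```

Sweeping p_out toward zero in this toy model plays the role of tuning the modularity parameter discussed in the abstract: the inter-community links thin out until the giant component fragments.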
Statistical Handbook on Consumption and Wealth in the United States.
ERIC Educational Resources Information Center
Kaul, Chandrika, Ed.; Tomaselli-Moschovitis, Valerie, Ed.
This easy-to-use statistical handbook features the most up-to-date and comprehensive data related to U.S. wealth and consumer spending patterns. More than 300 statistical tables and charts are organized into 8 detailed sections. Intended for students, teachers, and general users, the handbook contains these sections: (1) "General Economic…
ERIC Educational Resources Information Center
Center for Education Statistics (ED/OERI), Washington, DC.
The Financial Statistics machine-readable data file (MRDF) is a subfile of the larger Higher Education General Information Survey (HEGIS). It contains basic financial statistics for over 3,000 institutions of higher education in the United States and its territories. The data are arranged sequentially by institution, with institutional…
Fish: A New Computer Program for Friendly Introductory Statistics Help
ERIC Educational Resources Information Center
Brooks, Gordon P.; Raffle, Holly
2005-01-01
All introductory statistics students must master certain basic descriptive statistics, including means, standard deviations and correlations. Students must also gain insight into such complex concepts as the central limit theorem and standard error. This article introduces and describes the Friendly Introductory Statistics Help (FISH) computer…
NASA Astrophysics Data System (ADS)
Di Lorenzo, R.; Ingarao, G.; Fonti, V.
2007-05-01
The crucial task in the prevention of ductile fracture is the availability of a tool for predicting the occurrence of such defects. The technical literature reports extensive investigation of this topic, with contributions from many authors following different approaches. The main class of approaches concerns the development of fracture criteria: generally, such criteria are expressed by determining a critical value of a damage function which depends on the stress and strain paths, and ductile fracture is assumed to occur when this critical value is reached during the analysed process. There is a relevant drawback in the utilization of ductile fracture criteria: each criterion usually performs well in predicting fracture for particular stress-strain paths, i.e., it works very well for certain processes but may give poor results for others. On the other hand, approaches based on damage mechanics formulations are very effective from a theoretical point of view, but they are very complex and their proper calibration is quite difficult. In this paper, two different approaches are investigated to predict fracture occurrence in cold forming operations. The final aim of the proposed method is a tool with general reliability, i.e., one able to predict fracture for different forming processes. The proposed approach represents a step forward within a research project focused on the utilization of innovative predictive tools for ductile fracture. The paper presents a comparison between an artificial neural network design procedure and an approach based on statistical tools; both approaches aim to predict fracture occurrence or absence based on a set of stress and strain path data. The proposed approach is based on the utilization of experimental data available, for a given material, on fracture occurrence in different processes. In more detail, the approach consists of analysing experimental tests in which fracture occurs, followed by numerical simulations of those processes in order to track the stress-strain paths in the workpiece region where fracture is expected. These data are used to build a data set that was employed both to train an artificial neural network and to perform a statistical analysis aimed at predicting fracture occurrence. The developed statistical tool is properly designed and optimized and is able to recognize fracture occurrence. The reliability and predictive capability of the statistical method were compared with those of an artificial neural network developed to predict fracture occurrence. Moreover, the approach is also validated on forming processes characterized by complex fracture mechanics.
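A minimal sketch of the kind of comparison described above, training a small neural network and a logistic-regression classifier on features summarizing stress-strain paths; the features, labels, and their relationship are entirely synthetic stand-ins for the experimental and numerical data set used in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical feature set: each row summarizes a simulated stress-strain path
# (e.g. peak triaxiality, accumulated plastic strain, maximum principal stress);
# the label marks whether fracture occurred in the corresponding experiment.
rng = np.random.default_rng(9)
n = 600
X = rng.normal(size=(n, 3))
y = (1.2 * X[:, 0] + 0.8 * X[:, 1] - 0.5 * X[:, 2] + 0.3 * rng.normal(size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

ann = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
logit = LogisticRegression().fit(X_tr, y_tr)        # simple "statistical" counterpart

print("ANN accuracy      :", accuracy_score(y_te, ann.predict(X_te)))
print("logistic accuracy :", accuracy_score(y_te, logit.predict(X_te)))
```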
Enabling quaternion derivatives: the generalized HR calculus
Xu, Dongpo; Jahanchahi, Cyrus; Took, Clive C.; Mandic, Danilo P.
2015-01-01
Quaternion derivatives exist only for a very restricted class of analytic (regular) functions; however, in many applications, functions of interest are real-valued and hence not analytic, a typical case being the standard real mean square error objective function. The recent HR calculus is a step forward and provides a way to calculate derivatives and gradients of both analytic and non-analytic functions of quaternion variables; however, the HR calculus can become cumbersome in complex optimization problems due to the lack of rigorous product and chain rules, a consequence of the non-commutativity of quaternion algebra. To address this issue, we introduce the generalized HR (GHR) derivatives which employ quaternion rotations in a general orthogonal system and provide the left- and right-hand versions of the quaternion derivative of general functions. The GHR calculus also solves the long-standing problems of product and chain rules, mean-value theorem and Taylor's theorem in the quaternion field. At the core of the proposed GHR calculus is quaternion rotation, which makes it possible to extend the principle to other functional calculi in non-commutative settings. Examples in statistical learning theory and adaptive signal processing support the analysis. PMID:26361555
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
Permutation entropy of finite-length white-noise time series.
Little, Douglas J; Kane, Deb M
2016-08-01
Permutation entropy (PE) is commonly used to discriminate complex structure from white noise in a time series. While the PE of white noise is well understood in the long time-series limit, analysis in the general case is currently lacking. Here the expectation value and variance of white-noise PE are derived as functions of the number of ordinal pattern trials, N, and the embedding dimension, D. It is demonstrated that the probability distribution of the white-noise PE converges to a χ^{2} distribution with D!-1 degrees of freedom as N becomes large. It is further demonstrated that the PE variance for an arbitrary time series can be estimated as the variance of a related metric, the Kullback-Leibler entropy (KLE), allowing the qualitative N≫D! condition to be recast as a quantitative estimate of the N required to achieve a desired PE calculation precision. Application of this theory to statistical inference is demonstrated in the case of an experimentally obtained noise series, where the probability of obtaining the observed PE value was calculated assuming a white-noise time series. Standard statistical inference can be used to draw conclusions whether the white-noise null hypothesis can be accepted or rejected. This methodology can be applied to other null hypotheses, such as discriminating whether two time series are generated from different complex system states.
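A small numerical check of the white-noise behaviour described above can be written directly; the χ² comparison below uses the standard multinomial G-statistic identity 2N(ln D! - H), which is our reading of the result rather than the authors' derivation, and non-overlapping patterns are used so that the N trials are roughly independent.

```python
import math
from itertools import permutations
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
D = 3                                   # embedding dimension
patterns = {p: i for i, p in enumerate(permutations(range(D)))}

def permutation_entropy(x, D):
    """Unnormalized PE (natural log) from non-overlapping ordinal patterns."""
    N = len(x) // D
    counts = np.zeros(math.factorial(D))
    for i in range(N):
        w = x[i * D:(i + 1) * D]
        counts[patterns[tuple(np.argsort(w))]] += 1
    p = counts[counts > 0] / N
    return -np.sum(p * np.log(p)), N

# Many short white-noise series: 2N(ln D! - H) should behave like chi2(D!-1).
G = []
for _ in range(2000):
    H, N = permutation_entropy(rng.normal(size=3 * 100), D)
    G.append(2 * N * (math.log(math.factorial(D)) - H))
G = np.array(G)
df = math.factorial(D) - 1
print("sample mean/var of G :", G.mean(), G.var())
print("chi2(%d)  mean/var   :" % df, chi2(df).mean(), chi2(df).var())
```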
Yuan, Zhongshang; Liu, Hong; Zhang, Xiaoshuai; Li, Fangyu; Zhao, Jinghua; Zhang, Furen; Xue, Fuzhong
2013-01-01
Currently, the genetic variants identified by genome-wide association studies (GWAS) generally account for only a small proportion of the total heritability of complex disease. One crucial reason is the underutilization of gene-gene joint effects commonly encountered in GWAS, which include both main effects and co-association. However, gene-gene co-association is customarily, and rather vaguely, subsumed under the framework of gene-gene interaction. From the causal graph perspective, we elucidate in detail the concept and rationale of gene-gene co-association as well as its relationship with traditional gene-gene interaction, and propose two simple statistics based on Fisher's r-to-z transformation to detect it. Three series of simulations further highlight that gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, arising not only from traditional interaction under near-independence but also from the correlation between the two genes. The proposed statistics are more powerful than logistic regression under various situations, are not affected by linkage disequilibrium, and maintain an acceptable false-positive rate as long as the standard GWAS data-analysis roadmap is strictly followed. Furthermore, an application to gene pathway analysis associated with leprosy confirms in practice that the proposed gene-gene co-association concepts, as well as the corresponding statistics, are strongly in line with reality. PMID:23923021
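A generic Fisher r-to-z comparison of the SNP-SNP correlation in cases versus controls, the kind of statistic the abstract alludes to, can be sketched as follows; the authors' two actual statistics may be defined differently, and the toy genotypes below are synthetic.

```python
import numpy as np
from scipy.stats import pearsonr, norm

def co_association_z(g1_cases, g2_cases, g1_ctrls, g2_ctrls):
    """Compare the SNP1-SNP2 correlation between cases and controls
    via Fisher's r-to-z transformation."""
    r_case, _ = pearsonr(g1_cases, g2_cases)
    r_ctrl, _ = pearsonr(g1_ctrls, g2_ctrls)
    z_case, z_ctrl = np.arctanh(r_case), np.arctanh(r_ctrl)
    n_case, n_ctrl = len(g1_cases), len(g1_ctrls)
    z = (z_case - z_ctrl) / np.sqrt(1.0 / (n_case - 3) + 1.0 / (n_ctrl - 3))
    return z, 2 * norm.sf(abs(z))           # two-sided p-value

# Toy example: additive genotypes (0/1/2), correlated only among cases.
rng = np.random.default_rng(2)
g1c = rng.integers(0, 3, 1000)
g2c = np.clip(g1c + rng.integers(-1, 2, 1000), 0, 2)   # correlated in cases
g1u = rng.integers(0, 3, 1000)
g2u = rng.integers(0, 3, 1000)                          # independent in controls
print(co_association_z(g1c, g2c, g1u, g2u))
```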
Gene- and pathway-based association tests for multiple traits with GWAS summary statistics.
Kwak, Il-Youp; Pan, Wei
2017-01-01
To identify novel genetic variants associated with complex traits and to shed new insights on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying association patterns (e.g. signal sparsity levels) across SNPs and traits, thus maintaining high power across a wide range of situations. Furthermore, the proposed methods are general: they can be applied to mixed types of traits, and to Z-statistics or P-values as summary statistics obtained from either a single GWAS or a meta-analysis of multiple GWAS. Our numerical studies with simulated and real data demonstrated the promising performance of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available at: https://cran.r-project.org/web/packages/aSPU/ CONTACT: weip@biostat.umn.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Simulating statistics of lightning-induced and man-made fires
NASA Astrophysics Data System (ADS)
Krenn, R.; Hergarten, S.
2009-04-01
The frequency-area distributions of forest fires show power-law behavior with scaling exponents α in a quite narrow range, relating wildfire research to the theoretical framework of self-organized criticality. Examples of self-organized critical behavior can be found in computer simulations of simple cellular automata. The established self-organized critical Drossel-Schwabl forest fire model (DS-FFM) is one of the most widespread models in this context. Despite its qualitative agreement with event-size statistics from nature, its applicability is still questioned. Apart from general concerns that the DS-FFM apparently oversimplifies the complex nature of forest dynamics, it significantly overestimates the frequency of large fires. We present a straightforward modification of the model rules that increases the scaling exponent α by approximately 1/3 and brings the simulated event-size statistics close to those observed in nature. In addition, combined simulations of both the original and the modified model predict a dependence of the overall distribution on the ratio of lightning-induced and man-made fires as well as a difference between their respective event-size statistics. The increase of the scaling exponent with decreasing lightning probability as well as the splitting of the partial distributions are confirmed by the analysis of the Canadian Large Fire Database. As a consequence, lightning-induced and man-made forest fires cannot be treated separately in wildfire modeling, hazard assessment and forest management.
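For reference, the event-size statistics of the original (unmodified) DS-FFM can be generated with a minimal cellular-automaton sketch like the one below; the grid size and the growth-to-lightning ratio are arbitrary, and the authors' rule modification is not reproduced here because the abstract does not specify it.

```python
import numpy as np
from scipy.ndimage import label

rng = np.random.default_rng(3)
L, theta, steps = 128, 200, 400_000   # grid size, growth/lightning ratio, growth attempts
forest = np.zeros((L, L), dtype=bool)
fire_sizes = []

for _ in range(steps):
    i, j = rng.integers(0, L, 2)
    forest[i, j] = True                  # tree growth attempt at a random site
    if rng.random() < 1.0 / theta:       # lightning strike
        i, j = rng.integers(0, L, 2)
        if forest[i, j]:
            clusters, _ = label(forest)  # 4-neighbour connected tree clusters
            burning = clusters == clusters[i, j]
            fire_sizes.append(int(burning.sum()))
            forest[burning] = False      # the whole cluster burns instantly

sizes = np.array(fire_sizes)
print("number of fires:", sizes.size, " largest fire:", sizes.max())
# A log-log histogram of `sizes` shows the (approximate) power-law regime.
```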
Long-range memory and non-Markov statistical effects in human sensorimotor coordination
NASA Astrophysics Data System (ADS)
Yulmetyev, Renat M.; Emelyanova, Natalya; Hänggi, Peter; Gafarov, Fail; Prokhorov, Alexander
2002-12-01
In this paper, the non-Markov statistical processes and long-range memory effects in human sensorimotor coordination are investigated. The theoretical basis of this study is the statistical theory of non-stationary discrete non-Markov processes in complex systems (Phys. Rev. E 62, 6178 (2000)). Human sensorimotor coordination was studied experimentally by means of a standard dynamical tapping test on a group of 32 young people, with tap numbers up to 400. The test was carried out separately for the right and the left hand according to the degree of domination of each brain hemisphere. The numerical analysis of the experimental results was made with the help of power spectra of the initial time correlation function, the memory functions of low orders, and the first three points of the statistical spectrum of the non-Markovity parameter. Our observations demonstrate that, with regard to the results of the standard dynamic tapping test, it is possible to divide all examinees into five different dynamic types. We have introduced the conflict coefficient to estimate quantitatively the order-disorder effects underlying living systems; it reflects the existence of an imbalance between nervous and motor coordination in humans. The suggested classification of neurophysiological activity represents a dynamic generalization of the well-known neuropsychological types and provides a new approach in modern neuropsychology.
Cortical Surround Interactions and Perceptual Salience via Natural Scene Statistics
Coen-Cagli, Ruben; Dayan, Peter; Schwartz, Odelia
2012-01-01
Spatial context in images induces perceptual phenomena associated with salience and modulates the responses of neurons in primary visual cortex (V1). However, the computational and ecological principles underlying contextual effects are incompletely understood. We introduce a model of natural images that includes grouping and segmentation of neighboring features based on their joint statistics, and we interpret the firing rates of V1 neurons as performing optimal recognition in this model. We show that this leads to a substantial generalization of divisive normalization, a computation that is ubiquitous in many neural areas and systems. A main novelty in our model is that the influence of the context on a target stimulus is determined by their degree of statistical dependence. We optimized the parameters of the model on natural image patches, and then simulated neural and perceptual responses on stimuli used in classical experiments. The model reproduces some rich and complex response patterns observed in V1, such as the contrast dependence, orientation tuning and spatial asymmetry of surround suppression, while also allowing for surround facilitation under conditions of weak stimulation. It also mimics the perceptual salience produced by simple displays, and leads to readily testable predictions. Our results provide a principled account of orientation-based contextual modulation in early vision and its sensitivity to the homogeneity and spatial arrangement of inputs, and lends statistical support to the theory that V1 computes visual salience. PMID:22396635
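As a minimal illustration of the divisive-normalization computation that the model generalizes, a standard (ungated) form is sketched below; the gains, exponents, and surround weights are arbitrary illustrative values, and the paper's statistical-dependence-gated version is not reproduced here.

```python
import numpy as np

def divisive_normalization(center, surround, gain=1.0, sigma=0.1, weights=None):
    """Standard divisive normalization: the center response is divided by a
    pooled signal that includes the surround, so suppression grows with
    surround drive."""
    surround = np.asarray(surround, dtype=float)
    if weights is None:
        weights = np.ones_like(surround)
    pool = sigma**2 + center**2 + np.sum(weights * surround**2)
    return gain * center**2 / pool

# Same center stimulus, increasingly strong surround:
for s in (0.0, 0.5, 1.0):
    print(s, divisive_normalization(center=1.0, surround=[s, s, s, s]))
```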
Statistics without Tears: Complex Statistics with Simple Arithmetic
ERIC Educational Resources Information Center
Smith, Brian
2011-01-01
One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…
Hon, K L; Tsang, Y C; Pong, N H; Lee, Vivian W Y; Luk, N M; Chow, C M; Leung, T F
2015-10-01
To investigate patient acceptability, efficacy, and skin biophysiological effects of a cream/cleanser combination for childhood atopic dermatitis. Paediatric dermatology clinic at a university teaching hospital in Hong Kong. Consecutive paediatric patients with atopic dermatitis who were interested in trying a new moisturiser were recruited between 1 April 2013 and 31 March 2014. Swabs and cultures from the right antecubital fossa and the worst eczematous area, disease severity (SCORing Atopic Dermatitis index), skin hydration, and transepidermal water loss were obtained prior to and following 4-week usage of a cream/cleanser containing lipid complex with shea butter extract (Ezerra cream; Hoe Pharma, Petaling Jaya, Malaysia). Global or general acceptability of treatment was documented as 'very good', 'good', 'fair', or 'poor'. A total of 34 patients with atopic dermatitis were recruited; 74% reported 'very good' or 'good', whereas 26% reported 'fair' or 'poor' general acceptability of treatment of the Ezerra cream; and 76% reported 'very good' or 'good', whereas 24% reported 'fair' or 'poor' general acceptability of treatment of the Ezerra cleanser. There were no intergroup differences in pre-usage clinical parameters of age, objective SCORing Atopic Dermatitis index, pruritus, sleep loss, skin hydration, transepidermal water loss, topical corticosteroid usage, oral antihistamine usage, or general acceptability of treatment of the prior emollient. Following use of the Ezerra cream, mean pruritus score decreased from 6.7 to 6.0 (P=0.036) and mean Children's Dermatology Life Quality Index improved from 10.0 to 8.0 (P=0.021) in the 'very good'/'good' group. There were no statistically significant differences in the acceptability of wash (P=0.526) and emollients (P=0.537) with pre-trial products. When compared with the data of another ceramide-precursor moisturiser in a previous study, there was no statistical difference in efficacy and acceptability between the two products. The trial cream was acceptable in three quarters of patients with atopic dermatitis. Patients who accepted the cream had less pruritus and improved quality of life than the non-accepting patients following its usage. The cream containing shea butter extract did not differ in acceptability or efficacy from a ceramide-precursor product. Patient acceptability is an important factor for treatment efficacy. There is a general lack of published clinical trials to document the efficacy and skin biophysiological effects of many of the proprietary moisturisers.
Wen, Haoyu; Ciamarra, Massimo Pica; Cheong, Siew Ann
2018-01-01
There is growing interest in the use of critical slowing down and critical fluctuations as early warning signals for critical transitions in different complex systems. However, while some studies found them effective, others found the opposite. In this paper, we investigated why this might be so, by testing three commonly used indicators: lag-1 autocorrelation, variance, and low-frequency power spectrum at anticipating critical transitions in the very-high-frequency time series data of the Australian Dollar-Japanese Yen and Swiss Franc-Japanese Yen exchange rates. Besides testing rising trends in these indicators at a strict level of confidence using the Kendall-tau test, we also required statistically significant early warning signals to be concurrent in the three indicators, which must rise to appreciable values. We then found for our data set the optimum parameters for discovering critical transitions, and showed that the set of critical transitions found is generally insensitive to variations in the parameters. Suspecting that negative results in the literature are the results of low data frequencies, we created time series with time intervals over three orders of magnitude from the raw data, and tested them for early warning signals. Early warning signals can be reliably found only if the time interval of the data is shorter than the time scale of critical transitions in our complex system of interest. Finally, we compared the set of time windows with statistically significant early warning signals with the set of time windows followed by large movements, to conclude that the early warning signals indeed provide reliable information on impending critical transitions. This reliability becomes more compelling statistically the more events we test.
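The three indicators used above (lag-1 autocorrelation, variance, low-frequency spectral power) together with the Kendall-tau trend test can be computed on rolling windows as in the sketch below; the window length, step, and toy test series are illustrative choices, not the authors' tuned parameters or exchange-rate data.

```python
import numpy as np
from scipy.stats import kendalltau

def rolling_indicators(x, win=200):
    """Lag-1 autocorrelation, variance and low-frequency power per window."""
    ac1, var, lowf = [], [], []
    for start in range(0, len(x) - win + 1, win // 4):
        w = x[start:start + win] - np.mean(x[start:start + win])
        ac1.append(np.corrcoef(w[:-1], w[1:])[0, 1])
        var.append(np.var(w))
        spec = np.abs(np.fft.rfft(w))**2
        lowf.append(spec[1:win // 20].sum() / spec[1:].sum())
    return map(np.array, (ac1, var, lowf))

def rising_trend(indicator):
    """Kendall-tau test for a rising trend over the analysis period."""
    tau, p = kendalltau(np.arange(len(indicator)), indicator)
    return tau, p

# Toy series with critical slowing down: AR(1) coefficient drifts toward 1.
rng = np.random.default_rng(4)
phi = np.linspace(0.2, 0.97, 5000)
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = phi[t] * x[t - 1] + rng.normal()

for name, ind in zip(("lag-1 AC", "variance", "low-freq power"),
                     rolling_indicators(x)):
    print(name, rising_trend(ind))
```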
Comparison of two fractal interpolation methods
NASA Astrophysics Data System (ADS)
Fu, Yang; Zheng, Zeyu; Xiao, Rui; Shi, Haibo
2017-03-01
As a tool for studying complex shapes and structures in nature, fractal theory plays a critical role in revealing the organizational structure of complex phenomena. Numerous fractal interpolation methods have been proposed over the past few decades, but they differ substantially in form features and statistical properties. In this study, we simulated one- and two-dimensional fractal surfaces by using the midpoint displacement method and the Weierstrass-Mandelbrot fractal function method, and observed great differences between the two methods in their statistical characteristics and autocorrelation features. In terms of form features, the simulations of the midpoint displacement method showed a relatively flat surface with peaks of different heights as the fractal dimension increases, whereas the simulations of the Weierstrass-Mandelbrot fractal function method showed a rough surface with dense and highly similar peaks as the fractal dimension increases. In terms of statistical properties, the peak heights from the Weierstrass-Mandelbrot simulations are greater than those of the midpoint displacement method at the same fractal dimension, and the variances are approximately two times larger. When the fractal dimension equals 1.2, 1.4, 1.6, and 1.8, the skewness is positive with the midpoint displacement method and the peaks are all convex, whereas for the Weierstrass-Mandelbrot fractal function method the skewness takes both positive and negative values, fluctuating in the vicinity of zero. The kurtosis is less than one with the midpoint displacement method, and generally less than that of the Weierstrass-Mandelbrot fractal function method. The autocorrelation analysis indicated that the simulation of the midpoint displacement method is not periodic and shows prominent randomness, making it suitable for simulating aperiodic surfaces, while the simulation of the Weierstrass-Mandelbrot fractal function method has strong periodicity, making it suitable for simulating periodic surfaces.
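The two simulation methods compared above can be reproduced in one dimension with a short sketch; the particular real-valued Weierstrass-type series used below, and its parameters, is one common variant and may differ from the exact Weierstrass-Mandelbrot form used by the authors.

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(5)

def midpoint_displacement(n_levels=12, H=0.5):
    """1-D random midpoint displacement; roughness set by H = 2 - D."""
    x = np.array([0.0, 0.0])
    scale = 1.0
    for _ in range(n_levels):
        mid = 0.5 * (x[:-1] + x[1:]) + rng.normal(0.0, scale, len(x) - 1)
        x = np.insert(x, range(1, len(x)), mid)   # interleave the new midpoints
        scale *= 0.5**H
    return x

def weierstrass_like(t, H=0.5, gamma=1.5, n_terms=40):
    """Real-valued Weierstrass-type series with roughness set by H = 2 - D."""
    w = np.zeros_like(t)
    for n in range(n_terms):
        w += gamma**(-n * H) * np.cos(gamma**n * t + rng.uniform(0, 2 * np.pi))
    return w

t = np.linspace(0, 10, 4097)
for name, y in (("midpoint", midpoint_displacement()),
                ("weierstrass", weierstrass_like(t))):
    print(name, "var=%.3f skew=%.3f kurt=%.3f" % (np.var(y), skew(y), kurtosis(y)))
```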
ON THE DYNAMICAL DERIVATION OF EQUILIBRIUM STATISTICAL MECHANICS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prigogine, I.; Balescu, R.; Henin, F.
1960-12-01
Work on nonequilibrium statistical mechanics, which allows an extension of the kinetic proof to all results of equilibrium statistical mechanics involving a finite number of degrees of freedom, is summarized. As an introduction to the general N-body problem, the scattering theory in classical mechanics is considered. The general N-body problem is considered for the case of classical mechanics, quantum mechanics with Boltzmann statistics, and quantum mechanics including quantum statistics. Six basic diagrams, which describe the elementary processes of the dynamics of correlations, were obtained. (M.C.G.)
Subband Image Coding with Jointly Optimized Quantizers
NASA Technical Reports Server (NTRS)
Kossentini, Faouzi; Chung, Wilson C.; Smith, Mark J. T.
1995-01-01
An iterative design algorithm for the joint design of complexity- and entropy-constrained subband quantizers and associated entropy coders is proposed. Unlike conventional subband design algorithms, the proposed algorithm does not require the use of various bit allocation algorithms. Multistage residual quantizers are employed here because they provide greater control of the complexity-performance tradeoffs, and also because they allow efficient and effective high-order statistical modeling. The resulting subband coder exploits statistical dependencies within subbands, across subbands, and across stages, mainly through complexity-constrained high-order entropy coding. Experimental results demonstrate that the complexity-rate-distortion performance of the new subband coder is exceptional.
Synchronization in human musical rhythms and mutually interacting complex systems
Hennig, Holger
2014-01-01
Though the music produced by an ensemble is influenced by multiple factors, including musical genre, musician skill, and individual interpretation, rhythmic synchronization is at the foundation of musical interaction. Here, we study the statistical nature of the mutual interaction between two humans synchronizing rhythms. We find that the interbeat intervals of both laypeople and professional musicians exhibit scale-free (power law) cross-correlations. Surprisingly, the next beat to be played by one person is dependent on the entire history of the other person’s interbeat intervals on timescales up to several minutes. To understand this finding, we propose a general stochastic model for mutually interacting complex systems, which suggests a physiologically motivated explanation for the occurrence of scale-free cross-correlations. We show that the observed long-term memory phenomenon in rhythmic synchronization can be imitated by fractal coupling of separately recorded or synthesized audio tracks and thus applied in electronic music. Though this study provides an understanding of fundamental characteristics of timing and synchronization at the interbrain level, the mutually interacting complex systems model may also be applied to study the dynamics of other complex systems where scale-free cross-correlations have been observed, including econophysics, physiological time series, and collective behavior of animal flocks. PMID:25114228
SNP by SNP by environment interaction network of alcoholism.
Zollanvari, Amin; Alterovitz, Gil
2017-03-14
Alcoholism has a strong genetic component. Twin studies have demonstrated the heritability of a large proportion of phenotypic variance of alcoholism ranging from 50-80%. The search for genetic variants associated with this complex behavior has epitomized sequence-based studies for nearly a decade. The limited success of genome-wide association studies (GWAS), possibly precipitated by the polygenic nature of complex traits and behaviors, however, has demonstrated the need for novel, multivariate models capable of quantitatively capturing interactions between a host of genetic variants and their association with non-genetic factors. In this regard, capturing the network of SNP by SNP or SNP by environment interactions has recently gained much interest. Here, we assessed 3,776 individuals to construct a network capable of detecting and quantifying the interactions within and between plausible genetic and environmental factors of alcoholism. In this regard, we propose the use of first-order dependence tree of maximum weight as a potential statistical learning technique to delineate the pattern of dependencies underpinning such a complex trait. Using a predictive based analysis, we further rank the genes, demographic factors, biological pathways, and the interactions represented by our SNP × SNP × E network. The proposed framework is quite general and can be potentially applied to the study of other complex traits.
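The "first-order dependence tree of maximum weight" mentioned above is the classical Chow-Liu construction: pairwise mutual information defines edge weights and a maximum-weight spanning tree is retained. A generic sketch on synthetic binary variables is shown below; it is not the authors' pipeline.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (nats) between two discrete variables."""
    joint = np.histogram2d(x, y, bins=(np.unique(x).size, np.unique(y).size))[0]
    joint /= joint.sum()
    px, py = joint.sum(axis=1, keepdims=True), joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def chow_liu_tree(data):
    """Maximum-weight spanning tree over columns, via Prim's algorithm."""
    d = data.shape[1]
    mi = np.array([[mutual_information(data[:, i], data[:, j]) if i != j else 0.0
                    for j in range(d)] for i in range(d)])
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        best = max(((mi[i, j], i, j) for i in in_tree for j in range(d)
                    if j not in in_tree))
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges

# Synthetic SNP/environment-like binary data with a chain dependence 0 -> 1 -> 2.
rng = np.random.default_rng(6)
a = rng.integers(0, 2, 5000)
b = (a + (rng.random(5000) < 0.1)) % 2
c = (b + (rng.random(5000) < 0.1)) % 2
e = rng.integers(0, 2, 5000)               # independent variable
print(chow_liu_tree(np.column_stack([a, b, c, e])))
```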
Symmetry-Adapted Machine Learning for Tensorial Properties of Atomistic Systems.
Grisafi, Andrea; Wilkins, David M; Csányi, Gábor; Ceriotti, Michele
2018-01-19
Statistical learning methods show great promise in providing an accurate prediction of materials and molecular properties, while minimizing the need for computationally demanding electronic structure calculations. The accuracy and transferability of these models are increased significantly by encoding into the learning procedure the fundamental symmetries of rotational and permutational invariance of scalar properties. However, the prediction of tensorial properties requires that the model respects the appropriate geometric transformations, rather than invariance, when the reference frame is rotated. We introduce a formalism that extends existing schemes and makes it possible to perform machine learning of tensorial properties of arbitrary rank, and for general molecular geometries. To demonstrate it, we derive a tensor kernel adapted to rotational symmetry, which is the natural generalization of the smooth overlap of atomic positions kernel commonly used for the prediction of scalar properties at the atomic scale. The performance and generality of the approach is demonstrated by learning the instantaneous response to an external electric field of water oligomers of increasing complexity, from the isolated molecule to the condensed phase.
Son, Ji Y; Ramos, Priscilla; DeWolf, Melissa; Loftus, William; Stigler, James W
2018-01-01
In this article, we begin to lay out a framework and approach for studying how students come to understand complex concepts in rich domains. Grounded in theories of embodied cognition, we advance the view that understanding of complex concepts requires students to practice, over time, the coordination of multiple concepts, and the connection of this system of concepts to situations in the world. Specifically, we explore the role that a teacher's gesture might play in supporting students' coordination of two concepts central to understanding in the domain of statistics: mean and standard deviation. In Study 1 we show that university students who have just taken a statistics course nevertheless have difficulty taking both mean and standard deviation into account when thinking about a statistical scenario. In Study 2 we show that presenting the same scenario with an accompanying gesture to represent variation significantly impacts students' interpretation of the scenario. Finally, in Study 3 we present evidence that instructional videos on the internet fail to leverage gesture as a means of facilitating understanding of complex concepts. Taken together, these studies illustrate an approach to translating current theories of cognition into principles that can guide instructional design.
Kaiser, Ulrike; Sabatowski, Rainer; Balck, Friedrich
2017-08-01
The assessment of treatment effectiveness in public health settings is ensured by indicators that reflect the changes caused by specific interventions. These indicators are also applied in benchmarking systems. The selection of constructs should be guided by their relevance for affected patients (patient reported outcomes). The interdisciplinary multimodal pain therapy (IMPT) is a complex intervention based on a biopsychosocial understanding of chronic pain. For quality assurance purposes, psychological parameters (depression, general anxiety, health-related quality of life) are included in standardized therapy assessment in pain medicine (KEDOQ), which can also be used for comparative analyses in a benchmarking system. The aim of the present study was to investigate the relevance of depressive symptoms, general anxiety and mental quality of life in patients undergoing IMPT under real life conditions. In this retrospective, one-armed and exploratory observational study we used secondary data of a routine documentation of IMST in routine care, applying several variables of the German Pain Questionnaire and the facility's comprehensive basic documentation. 352 participants with IMPT (from 2006 to 2010) were included, and the follow-up was performed over two years with six assessments. Because of statistically heterogeneous characteristics a complex analysis consisting of factor and cluster analyses was applied to build subgroups. These subgroups were explored to identify differences in depressive symptoms (HADS-D), general anxiety (HADS-A), and mental quality of life (SF 36 PSK) at the time of therapy admission and their development estimated by means of effect sizes. Analyses were performed using SPSS 21.0®. Six subgroups were derived and mainly proved to be clinically and psychologically normal, with the exception of one subgroup that consistently showed psychological impairment for all three parameters. The follow-up of the total study population revealed medium or large effects; changes in the subgroups were consistently caused by two subgroups, while the other four showed little or no change. In summary, only a small proportion of the target population (20 %) demonstrated clinically relevant scores in the psychological parameters applied. When selecting indicators for quality assurance, the heterogeneity of the target populations as well as conceptual and methodological aspects should be considered. The characteristics of the parameters intended, along with clinical and personal relevance of indicators for patients, should be investigated by specific procedures such as patient surveys and statistical analyses. Copyright © 2017. Published by Elsevier GmbH.
Bowe, Conor M; Gargan, Mary Louise; Kearns, Gerard J; Stassen, Leo F A
2015-01-01
This is a retrospective study to review the treatment and management of patients presenting with odontogenic infections in a large urban teaching hospital over a four-year period, comparing the number and complexity of odontogenic infections presenting to an acute general hospital in two periods, as follows: Group A (January 2008 to March 2010) versus Group B (April 2010 to December 2011). The background to the study is 'An alteration in patient access to primary dental care instituted by the Department of Health in April 2010'. a) to identify any alteration in the pattern and complexity of patients' presentation with odontogenic infections following recent changes in access to treatment via the Dental Treatment Services Scheme (DTSS) and the Dental Treatment Benefit Scheme (DTBS) in April 2010; and, b) to evaluate the management of severe odontogenic infections. Data was collated by a combination of a comprehensive chart review and electronic patient record analysis based on the primary discharge diagnosis as recorded in the Hospital In-Patient Enquiry (HIPE) system. Fifty patients were admitted to the National Maxillofacial Unit, St James's Hospital, under the oral and maxillofacial service over a four-year period, with an odontogenic infection as the primary diagnosis. There was an increased number of patients presenting with odontogenic infections during Group B of the study. These patients showed an increased complexity and severity of infection. Although there was an upward trend in the numbers and complexity of infections, this trending did not reach statistical significance. The primary cause of infection was dental caries in all patients. Dental caries is a preventable and treatable disease. Increased resources should be made available to support access to dental care, and thereby lessen the potential for the morbidity and mortality associated with serious odontogenic infections. The study at present continues as a prospective study.
Control Characteristics of Alcohol-Impaired Operators
NASA Technical Reports Server (NTRS)
Jex, Henry R.; McRuer, Duane T.; Allen, R. Wade; Klein, Richard H.
1974-01-01
Although the operation of vehicles like airplanes, cars, and bicycles involves a complex array of perceptual, decision and control activities, most accident statistics clearly show that intoxicated operators are a dominant cause of accidents, and not the difficulty of the task itself. This paper summarizes some recent research on the nature of the impairment of operator control under blood alcohol concentrations (BAC) up to above 0.16 percent. Alcohol toxicity is shown to be quite specific with respect to visual-motor functions involved in control of a vehicle, and experiments with a generalized workload task and special driving simulator show how these are reflected in terms of changes in operator control parameters such as response latency, gains, stability margins, and coherency.
CERN experience and strategy for the maintenance of cryogenic plants and distribution systems
NASA Astrophysics Data System (ADS)
Serio, L.; Bremer, J.; Claudet, S.; Delikaris, D.; Ferlin, G.; Pezzetti, M.; Pirotte, O.; Tavian, L.; Wagner, U.
2015-12-01
CERN operates and maintains the world's largest cryogenic infrastructure, ranging from ageing installations feeding detectors, test facilities and general services, to the state-of-the-art cryogenic system serving the flagship LHC machine complex. After several years of exploitation of a wide range of cryogenic installations, and in particular following the last two years' major shutdown to maintain and consolidate the LHC machine, we have analysed and reviewed the maintenance activities in order to implement an efficient and reliable exploitation of the installations. We report the results, statistics and lessons learned from the maintenance activities performed, in particular the required consolidations and major overhauls, as well as the organization, management and methodologies implemented.
Hayes, Brett K; Heit, Evan
2018-05-01
Inductive reasoning entails using existing knowledge to make predictions about novel cases. The first part of this review summarizes key inductive phenomena and critically evaluates theories of induction. We highlight recent theoretical advances, with a special emphasis on the structured statistical approach, the importance of sampling assumptions in Bayesian models, and connectionist modeling. A number of new research directions in this field are identified including comparisons of inductive and deductive reasoning, the identification of common core processes in induction and memory tasks and induction involving category uncertainty. The implications of induction research for areas as diverse as complex decision-making and fear generalization are discussed. This article is categorized under: Psychology > Reasoning and Decision Making Psychology > Learning. © 2017 Wiley Periodicals, Inc.
The infinite limit as an eliminable approximation for phase transitions
NASA Astrophysics Data System (ADS)
Ardourel, Vincent
2018-05-01
It is generally claimed that infinite idealizations are required for explaining phase transitions within statistical mechanics (e.g. Batterman 2011). Nevertheless, Menon and Callender (2013) have outlined theoretical approaches that describe phase transitions without using the infinite limit. This paper closely investigates one of these approaches, which consists of studying the complex zeros of the partition function (Borrmann et al., 2000). Based on this theory, I argue for the plausibility of eliminating the infinite limit when studying phase transitions. I offer a new account of phase transitions in finite systems, and I argue for the use of the infinite limit as an approximation for studying phase transitions in large systems.
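The finite-system strategy referred to above, locating the complex zeros of the partition function, can be illustrated with a toy spectrum; the example below (independent two-level units, whose zeros stay away from the real axis, consistent with the absence of a true transition) is our own and is not one of the systems analysed by Borrmann et al.

```python
import numpy as np
from math import comb

def partition_zeros(degeneracies):
    """Zeros of Z(u) = sum_n g_n u^n with u = exp(-beta * Delta):
    Z is a polynomial in u, so its zeros follow from the coefficients."""
    coeffs = np.asarray(degeneracies, dtype=float)[::-1]  # highest power first
    return np.roots(coeffs)

def two_level_degeneracies(N):
    """N independent two-level units with spacing Delta: g_k = C(N, k)."""
    return [comb(N, k) for k in range(N + 1)]

for N in (4, 8, 12):
    u = partition_zeros(two_level_degeneracies(N))
    beta = -np.log(u.astype(complex))   # complex inverse temperature, units of 1/Delta
    print(N, "smallest |Im beta| among the zeros: %.3f" % np.min(np.abs(beta.imag)))
# For this non-interacting toy the exact zeros of Z(u) = (1 + u)^N all sit at
# u = -1, i.e. Im(beta) = +-pi: they never approach the real beta axis, so no
# phase transition is signalled even as N grows.
```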
General Multivariate Linear Modeling of Surface Shapes Using SurfStat
Chung, Moo K.; Worsley, Keith J.; Nacewicz, Brendon M.; Dalton, Kim M.; Davidson, Richard J.
2010-01-01
Although there are many imaging studies on traditional ROI-based amygdala volumetry, there are very few studies on modeling amygdala shape variations. This paper presents a unified computational and statistical framework for modeling amygdala shape variations in a clinical population. The weighted spherical harmonic representation is used to parameterize, smooth, and normalize amygdala surfaces. The representation is subsequently used as an input for multivariate linear models accounting for nuisance covariates such as age and brain size difference, using the SurfStat package, which completely avoids the complexity of specifying design matrices. The methodology has been applied to quantify abnormal local amygdala shape variations in 22 high-functioning autistic subjects. PMID:20620211
Fast method for reactor and feature scale coupling in ALD and CVD
Yanguas-Gil, Angel; Elam, Jeffrey W.
2017-08-08
Transport and surface chemistry of certain deposition techniques is modeled. Methods provide a model of the transport inside nanostructures as a single-particle discrete Markov chain process. This approach decouples the complexity of the surface chemistry from the transport model, thus allowing its application under general surface chemistry conditions, including atomic layer deposition (ALD) and chemical vapor deposition (CVD). Methods provide for the determination of statistical information about the trajectory of individual molecules, such as the average interaction time or the number of wall collisions for molecules entering the nanostructures, as well as for tracking the relative contributions to thin-film growth of different independent reaction pathways at each point of the feature.
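The single-particle picture described above can be mimicked with a short Monte Carlo over a discretized high-aspect-ratio feature; the geometry, sticking probability, and hopping rule below are illustrative assumptions of ours, not the method's actual transition matrix.

```python
import numpy as np

def simulate_particle(depth=50, stick_prob=0.01, rng=None):
    """One precursor molecule entering a trench discretized into `depth`
    segments; at each step it hops up or down one segment and may react
    (stick) at the wall it touches. Returns (reacted?, segment, wall hits)."""
    rng = rng or np.random.default_rng()
    pos, hits = 0, 0
    while True:
        hits += 1
        if rng.random() < stick_prob:            # surface reaction at this segment
            return True, pos, hits
        pos += 1 if rng.random() < 0.5 else -1   # diffusive hop along the trench
        if pos < 0:                              # escaped back out of the opening
            return False, None, hits
        pos = min(pos, depth - 1)                # reflect at the trench bottom

rng = np.random.default_rng(7)
results = [simulate_particle(rng=rng) for _ in range(20000)]
reacted = [r for r in results if r[0]]
print("reaction probability :", len(reacted) / len(results))
print("mean wall collisions :", np.mean([r[2] for r in results]))
print("mean reaction depth  :", np.mean([r[1] for r in reacted]))
```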
[Factor Analysis: Principles to Evaluate Measurement Tools for Mental Health].
Campo-Arias, Adalberto; Herazo, Edwin; Oviedo, Heidi Celina
2012-09-01
The validation of a measurement tool in mental health is a complex process that usually starts with the estimation of reliability and later addresses validity. Factor analysis is a way to determine the number of dimensions, domains or factors of a measuring tool, and is generally related to the construct validity of the scale. The analysis can be exploratory or confirmatory, and helps in the selection of the items with the best performance. For an acceptable factor analysis, it is necessary to follow certain steps and recommendations, to conduct some statistical tests, and to rely on a proper sample of participants. Copyright © 2012 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
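A minimal exploratory factor analysis of the kind described can be run with standard tooling; the synthetic "questionnaire" below (two latent factors, six items) is only an illustration of the workflow, not a recommendation of a specific package or rotation.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic questionnaire: 6 items driven by 2 latent factors plus noise.
rng = np.random.default_rng(8)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
items = np.column_stack([
    0.8 * f1, 0.7 * f1, 0.6 * f1,      # items loading on factor 1
    0.8 * f2, 0.7 * f2, 0.6 * f2,      # items loading on factor 2
]) + 0.4 * rng.normal(size=(n, 6))

fa = FactorAnalysis(n_components=2).fit(items)
print("estimated loadings (rows = factors, columns = items):")
print(np.round(fa.components_, 2))
print("item uniquenesses:", np.round(fa.noise_variance_, 2))
```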
NASA Astrophysics Data System (ADS)
Chernyak, Vladimir Y.; Chertkov, Michael; Bierkens, Joris; Kappen, Hilbert J.
2014-01-01
In stochastic optimal control (SOC) one minimizes the average cost-to-go, which consists of the cost-of-control (amount of effort), the cost-of-space (where one wants the system to be) and the target cost (where one wants the system to arrive), for a system participating in forced and controlled Langevin dynamics. We extend the SOC problem by introducing an additional cost-of-dynamics, characterized by a vector potential. We propose a derivation of the generalized gauge-invariant Hamilton-Jacobi-Bellman equation as a variation over density and current, suggest a hydrodynamic interpretation, and discuss examples, e.g., ergodic control of a particle-within-a-circle, illustrating non-equilibrium space-time complexity.
Communication: The simplified generalized entropy theory of glass-formation in polymer melts.
Freed, Karl F
2015-08-07
While a wide range of non-trivial predictions of the generalized entropy theory (GET) of glass-formation in polymer melts agree with a large number of observed universal and non-universal properties of these glass-formers and even for the dependence of these properties on monomer molecular structure, the huge mathematical complexity of the theory precludes its extension to describe, for instance, the perplexing, complex behavior observed for technologically important polymer films with thickness below ∼100 nm and for which a fundamental molecular theory is lacking for the structural relaxation. The present communication describes a hugely simplified version of the theory, called the simplified generalized entropy theory (SGET) that provides one component necessary for devising a theory for the structural relaxation of thin polymer films and thereby supplements the first required ingredient, the recently developed Flory-Huggins level theory for the thermodynamic properties of thin polymer films, before the concluding third step of combining all the components into the SGET for thin polymer films. Comparisons between the predictions of the SGET and the full GET for the four characteristic temperatures of glass-formation provide good agreement for a highly non-trivial model system of polymer melts with chains of the structure of poly(n-α olefins) systems where the GET has produced good agreement with experiment. The comparisons consider values of the relative backbone and side group stiffnesses such that the glass transition temperature decreases as the amount of excess free volume diminishes, contrary to general expectations but in accord with observations for poly(n-alkyl methacrylates). Moreover, the SGET is sufficiently concise to enable its discussion in a standard course on statistical mechanics or polymer physics.
Esterman, Adrian J; Fountaine, Tim; McDermott, Robyn
2016-01-18
To determine whether certain characteristics of general practices are associated with good glycaemic control in patients with diabetes and with completing an annual cycle of care (ACC). Our cross-sectional analysis used baseline data from the Australian Diabetes Care Project conducted between 2011 and 2014. Practice characteristics were self-reported. Characteristics of the patients that were assessed included glycaemic control (HbA1c level ≤ 53 mmol/mol), age, sex, duration of diabetes, socio-economic disadvantage (SEIFA) score, the complexity of the patient's condition, and whether the patient had completed an ACC for diabetes in the past 18 months. Clustered logistic regression was used to establish predictors of glycaemic control and of a completed ACC. Data were available from 147 general practices and 5455 patients with established type 1 or type 2 diabetes in three Australian states. After adjustment for other patient characteristics, only the patient completing an ACC was statistically significant as a predictor of glycaemic control (P = 0.011). In a multivariate model, the practice having a chronic disease-focused practice nurse (P = 0.036) and running educational events for patients with diabetes (P = 0.004) were statistically significant predictors of the patient having completed an ACC. Patient characteristics are moderately good predictors of whether the patient is in glycaemic control, whereas practice characteristics appear to predict only the likelihood of patients completing an ACC. The ACC is an established indicator of good diabetes management. This is the first study to report a positive association between having completed an ACC and the patient being in glycaemic control.
Learning predictive statistics from temporal sequences: Dynamics and strategies.
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe
2017-10-01
Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics-that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
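The contrast between frequency statistics and context-based (first-order Markov) statistics, and between the "maximizing" and "matching" response strategies discussed above, can be made concrete with a small simulation; the transition matrix and symbol set are arbitrary examples, not the study's stimuli.

```python
import numpy as np

rng = np.random.default_rng(9)
T = np.array([[0.1, 0.8, 0.1],     # context-based statistics: P(next | previous)
              [0.7, 0.1, 0.2],
              [0.3, 0.3, 0.4]])

# Generate a symbol sequence from the first-order Markov chain.
seq = [0]
for _ in range(5000):
    seq.append(rng.choice(3, p=T[seq[-1]]))
seq = np.array(seq)

# Estimate the conditional distribution from the first half ("learning").
counts = np.zeros((3, 3))
for a, b in zip(seq[:2500], seq[1:2501]):
    counts[a, b] += 1
P_hat = counts / counts.sum(axis=1, keepdims=True)

# Predict the second half with two strategies.
prev, nxt = seq[2500:-1], seq[2501:]
maximizing = P_hat[prev].argmax(axis=1)                        # always the modal symbol
matching = np.array([rng.choice(3, p=P_hat[a]) for a in prev]) # sample from the estimate
print("maximizing accuracy:", np.mean(maximizing == nxt))
print("matching   accuracy:", np.mean(matching == nxt))
```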
Moore, Jason H; Amos, Ryan; Kiralis, Jeff; Andrews, Peter C
2015-01-01
Simulation plays an essential role in the development of new computational and statistical methods for the genetic analysis of complex traits. Most simulations start with a statistical model using methods such as linear or logistic regression that specify the relationship between genotype and phenotype. This is appealing due to its simplicity and because these statistical methods are commonly used in genetic analysis. It is our working hypothesis that simulations need to move beyond simple statistical models to more realistically represent the biological complexity of genetic architecture. The goal of the present study was to develop a prototype genotype–phenotype simulation method and software that are capable of simulating complex genetic effects within the context of a hierarchical biology-based framework. Specifically, our goal is to simulate multilocus epistasis or gene–gene interaction where the genetic variants are organized within the framework of one or more genes, their regulatory regions and other regulatory loci. We introduce here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating data in this manner. This approach combines a biological hierarchy, a flexible mathematical framework, a liability threshold model for defining disease endpoints, and a heuristic search strategy for identifying high-order epistatic models of disease susceptibility. We provide several simulation examples using genetic models exhibiting independent main effects and three-way epistatic effects. PMID:25395175
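A bare-bones version of the kind of data such simulators aim to generate, two interacting loci acting on a liability threshold, is sketched below; the penetrance function and effect sizes are arbitrary, and the hierarchical gene/regulatory structure of the HIBACHI method is not modelled here.

```python
import numpy as np

def simulate_epistatic_cases(n=10000, maf1=0.3, maf2=0.4, h=0.5, prevalence=0.1,
                             rng=None):
    """Two biallelic SNPs; liability = pure interaction term + Gaussian noise;
    disease status assigned by thresholding the liability."""
    rng = rng or np.random.default_rng()
    g1 = rng.binomial(2, maf1, n)                   # genotypes coded 0/1/2
    g2 = rng.binomial(2, maf2, n)
    # XOR-like epistasis: risk only when exactly one locus carries a variant.
    interaction = ((g1 > 0) ^ (g2 > 0)).astype(float)
    liability = h * interaction + rng.normal(0.0, 1.0, n)
    threshold = np.quantile(liability, 1.0 - prevalence)
    return g1, g2, (liability > threshold).astype(int)

g1, g2, y = simulate_epistatic_cases(rng=np.random.default_rng(10))
# Marginal effects are weak; the signal lives in the joint genotype table.
for g, name in ((g1, "SNP1"), (g2, "SNP2")):
    print(name, "case rate by genotype:",
          [round(y[g == k].mean(), 3) for k in (0, 1, 2)])
```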
Statistical speed of quantum states: Generalized quantum Fisher information and Schatten speed
NASA Astrophysics Data System (ADS)
Gessner, Manuel; Smerzi, Augusto
2018-02-01
We analyze families of measures for the quantum statistical speed which include as special cases the quantum Fisher information, the trace speed, i.e., the quantum statistical speed obtained from the trace distance, and more general quantifiers obtained from the family of Schatten norms. These measures quantify the statistical speed under generic quantum evolutions and are obtained by maximizing classical measures over all possible quantum measurements. We discuss general properties, optimal measurements, and upper bounds on the speed of separable states. We further provide a physical interpretation for the trace speed by linking it to an analog of the quantum Cramér-Rao bound for median-unbiased quantum phase estimation.
The effect of inclusion classrooms on the science achievement of general education students
NASA Astrophysics Data System (ADS)
Dodd, Matthew Robert
General education and special education students from three high schools in Rutherford County were sampled to determine the effect of Inclusion classrooms on their academic achievement on the Tennessee Biology I Gateway Exam. Each student's predicted and actual Gateway Exam scores from the academic year 2006--2007 were used to determine the effect the student's classroom had on his or her academic achievement. Independent variables used in the study were gender, ethnicity, socioeconomic level, grade point average, type of classroom (general or Inclusion), and type of student (general education or special education). The statistical tests used in this study were a t-test and a Mann-Whitney U test. In this study, the effect of the Inclusion classroom on general education students was not statistically significant. Although the Inclusion classroom allows special education students to succeed in the classroom, the effect on general education students is negligible. This study also provided statistical evidence that the Inclusion classroom did not improve the special education students' academic performance on the Gateway Exam. Students in a general education classroom with a GPA above 3.000 and those from a household without low socioeconomic status performed at a statistically different level in this study.
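The two tests named above are one-line calls in standard statistical software; the sketch below uses made-up score vectors purely to show the workflow, not the study's data.

```python
from scipy.stats import ttest_ind, mannwhitneyu

# Hypothetical Gateway-exam score gains (actual minus predicted) for students
# in Inclusion versus non-Inclusion classrooms; values are illustrative only.
inclusion     = [2, -1, 0, 3, 1, -2, 4, 0, 1, 2]
non_inclusion = [1,  0, 2, 1, 3, -1, 2, 1, 0, 2]

t_stat, t_p = ttest_ind(inclusion, non_inclusion, equal_var=False)
u_stat, u_p = mannwhitneyu(inclusion, non_inclusion, alternative="two-sided")
print(f"Welch t-test:   t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"Mann-Whitney U: U = {u_stat:.1f}, p = {u_p:.3f}")
```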
GaussianCpG: a Gaussian model for detection of CpG island in human genome sequences.
Yu, Ning; Guo, Xuan; Zelikovsky, Alexander; Pan, Yi
2017-05-24
As crucial markers in identifying biological elements and processes in mammalian genomes, CpG islands (CGI) play important roles in DNA methylation, gene regulation, epigenetic inheritance, gene mutation, chromosome inactivation and nucleosome retention. The generally accepted criteria for a CGI are: (a) the %G+C content is ≥ 50%, (b) the ratio of the observed CpG content to the expected CpG content (CpG_obs/CpG_exp) is ≥ 0.6, and (c) the length of the CGI is generally greater than 200 nucleotides. Most existing computational methods for the prediction of CpG islands are programmed on these rules. However, many experimentally verified CpG islands deviate from these artificial criteria. Experiments indicate that in many cases %G+C is < 50%, CpG_obs/CpG_exp varies, and the length of a CGI ranges from eight nucleotides to a few thousand nucleotides. This implies that CGI detection is not a purely statistical task and that some rules probably remain hidden. A novel Gaussian model, GaussianCpG, is developed for the detection of CpG islands in the human genome. We analyze the energy distribution over the genomic primary structure for each CpG site and adopt the parameters from statistics of the human genome. The evaluation results show that the new model can predict CpG islands efficiently by balancing both sensitivity and specificity over known human CGI data sets. Compared with other models, GaussianCpG can achieve better performance in CGI detection. Our Gaussian model aims to simplify the complex interaction between nucleotides. The model is computed not by a linear statistical method but by Gaussian energy distribution and accumulation. The parameters of the Gaussian function are not arbitrarily designated but deliberately chosen by optimizing the biological statistics. By using pseudopotential analysis on CpG islands, the novel model is validated on both real and artificial data sets.
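The rule-based criteria (a)-(c) quoted above translate directly into a sliding-window scan, shown below for reference; this is the conventional threshold method that GaussianCpG is designed to improve on, not the Gaussian model itself, and the toy sequence is fabricated.

```python
def cpg_windows(seq, window=200, step=50):
    """Flag windows meeting the classical CpG-island criteria:
    GC content >= 0.5 and observed/expected CpG ratio >= 0.6."""
    hits = []
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        c, g, cg = w.count("C"), w.count("G"), w.count("CG")
        gc = (c + g) / window
        obs_exp = cg * window / (c * g) if c and g else 0.0
        if gc >= 0.5 and obs_exp >= 0.6:
            hits.append((start, round(gc, 2), round(obs_exp, 2)))
    return hits

# Toy sequence: an AT-rich stretch followed by a CpG-rich stretch.
toy = "AT" * 300 + "CGG" * 200 + "TA" * 300
print(cpg_windows(toy)[:3])
```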
Zhang, Han; Wheeler, William; Hyland, Paula L; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai
2016-06-01
Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed 7 out of the 43 pathways identified in European populations remained to be significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs.
ERIC Educational Resources Information Center
DeMark, Sarah F.; Behrens, John T.
2004-01-01
Whereas great advances have been made in the statistical sophistication of assessments in terms of evidence accumulation and task selection, relatively little statistical work has explored the possibility of applying statistical techniques to data for the purposes of determining appropriate domain understanding and to generate task-level scoring…
Measuring the intangibles: a metrics for the economic complexity of countries and products.
Cristelli, Matthieu; Gabrielli, Andrea; Tacchella, Andrea; Caldarelli, Guido; Pietronero, Luciano
2013-01-01
We investigate a recent methodology we have proposed to extract valuable information on the competitiveness of countries and complexity of products from trade data. Standard economic theories predict a high level of specialization of countries in specific industrial sectors. However, a direct analysis of the official databases of exported products by all countries shows that the actual situation is very different. Countries commonly considered as developed ones are extremely diversified, exporting a large variety of products from very simple to very complex. At the same time countries generally considered as less developed export only the products also exported by the majority of countries. This situation calls for the introduction of a non-monetary and non-income-based measure for country economy complexity which uncovers the hidden potential for development and growth. The statistical approach we present here consists of coupled non-linear maps relating the competitiveness/fitness of countries to the complexity of their products. The fixed point of this transformation defines a metrics for the fitness of countries and the complexity of products. We argue that the key point to properly extract the economic information is the non-linearity of the map which is necessary to bound the complexity of products by the fitness of the less competitive countries exporting them. We present a detailed comparison of the results of this approach directly with those of the Method of Reflections by Hidalgo and Hausmann, showing the better performance of our method and a more solid economic, scientific and consistent foundation.
Measuring the Intangibles: A Metrics for the Economic Complexity of Countries and Products
Cristelli, Matthieu; Gabrielli, Andrea; Tacchella, Andrea; Caldarelli, Guido; Pietronero, Luciano
2013-01-01
We investigate a recent methodology we have proposed to extract valuable information on the competitiveness of countries and complexity of products from trade data. Standard economic theories predict a high level of specialization of countries in specific industrial sectors. However, a direct analysis of the official databases of exported products by all countries shows that the actual situation is very different. Countries commonly considered as developed ones are extremely diversified, exporting a large variety of products from very simple to very complex. At the same time countries generally considered as less developed export only the products also exported by the majority of countries. This situation calls for the introduction of a non-monetary and non-income-based measure for country economy complexity which uncovers the hidden potential for development and growth. The statistical approach we present here consists of coupled non-linear maps relating the competitiveness/fitness of countries to the complexity of their products. The fixed point of this transformation defines a metrics for the fitness of countries and the complexity of products. We argue that the key point to properly extract the economic information is the non-linearity of the map which is necessary to bound the complexity of products by the fitness of the less competitive countries exporting them. We present a detailed comparison of the results of this approach directly with those of the Method of Reflections by Hidalgo and Hausmann, showing the better performance of our method and a more solid economic, scientific and consistent foundation. PMID:23940633
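As a rough sketch of the coupled non-linear maps described above, the following Python snippet iterates country fitness and product complexity to a fixed point, normalizing at each step so that the fixed point is defined up to scale; the toy export matrix and iteration count are illustrative assumptions.

```python
import numpy as np

def fitness_complexity(M, n_iter=200):
    """Iterate the coupled non-linear maps to their fixed point.

    M: binary country-by-product export matrix (1 if the country exports the product).
    Returns country fitness F and product complexity Q.
    """
    n_countries, n_products = M.shape
    F = np.ones(n_countries)
    Q = np.ones(n_products)
    for _ in range(n_iter):
        # fitness: sum of the complexities of the products a country exports
        F_new = M @ Q
        # complexity: bounded by the fitness of the least competitive exporters
        Q_new = 1.0 / (M.T @ (1.0 / F))
        # normalize at each step
        F = F_new / F_new.mean()
        Q = Q_new / Q_new.mean()
    return F, Q

# toy usage: 3 countries x 4 products
M = np.array([[1, 1, 1, 1],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
F, Q = fitness_complexity(M)
```

The non-linearity lies in the harmonic-mean-like update of Q, which lets the least fit exporters bound a product's complexity, as the abstract argues.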
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hero, Alfred O.; Rajaratnam, Bala
When can reliable inference be drawn in the “Big Data” context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data.” Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that is of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining
Hero, Alfred O.; Rajaratnam, Bala
2015-01-01
When can reliable inference be drawn in the “Big Data” context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data.” Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that is of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks. PMID:27087700
Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining
Hero, Alfred O.; Rajaratnam, Bala
2015-12-09
When can reliable inference be drawn in the “Big Data” context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data.” Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that is of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
Dietrich, Stefan; Floegel, Anna; Troll, Martina; Kühn, Tilman; Rathmann, Wolfgang; Peters, Anette; Sookthai, Disorn; von Bergen, Martin; Kaaks, Rudolf; Adamski, Jerzy; Prehn, Cornelia; Boeing, Heiner; Schulze, Matthias B; Illig, Thomas; Pischon, Tobias; Knüppel, Sven; Wang-Sattler, Rui; Drogan, Dagmar
2016-10-01
The application of metabolomics in prospective cohort studies is statistically challenging. Given the importance of appropriate statistical methods for selection of disease-associated metabolites in highly correlated complex data, we combined random survival forest (RSF) with an automated backward elimination procedure that addresses such issues. Our RSF approach was illustrated with data from the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam study, with concentrations of 127 serum metabolites as exposure variables and time to development of type 2 diabetes mellitus (T2D) as the outcome variable. An analysis of this data set using Cox regression with a stepwise selection method was recently published. The methodological comparison (RSF versus Cox regression) was replicated in two independent cohorts. Finally, the R code implementing the metabolite selection procedure within the RSF syntax is provided. The application of the RSF approach in EPIC-Potsdam resulted in the identification of 16 incident T2D-associated metabolites, which slightly improved prediction of T2D when used in addition to traditional T2D risk factors and also when used together with classical biomarkers. The identified metabolites partly agreed with previous findings using Cox regression, though RSF selected a higher number of highly correlated metabolites. The RSF method appeared to be a promising approach for identification of disease-associated variables in complex data with time to event as outcome. The demonstrated RSF approach provides findings comparable to those of the generally used Cox regression, but also addresses the problem of multicollinearity and is suitable for high-dimensional data. © The Author 2016; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association.
TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies
van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.
2013-01-01
To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
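As a hedged illustration of the kind of extended Simes combination TATES performs, the Python sketch below turns a set of correlated component p-values into a single trait-based p-value. The eigenvalue-based estimator of the effective number of tests shown here is one common choice and may differ in detail from the approximation actually used by TATES; all names are illustrative.

```python
import numpy as np

def effective_number_of_tests(R):
    """One common eigenvalue-based estimate of the effective number of
    independent tests implied by the correlation matrix R."""
    lam = np.linalg.eigvalsh(R)
    return len(lam) - np.sum((lam - 1.0) * (lam > 1.0))

def extended_simes(pvals, R):
    """Simes-type trait-based p-value correcting for correlated components."""
    pvals = np.asarray(pvals)
    order = np.argsort(pvals)
    p_sorted = pvals[order]
    m_all = effective_number_of_tests(R)
    combined = []
    for j in range(len(p_sorted)):
        # effective number of tests among the j+1 most significant components
        idx = order[: j + 1]
        m_j = effective_number_of_tests(R[np.ix_(idx, idx)])
        combined.append(m_all * p_sorted[j] / m_j)
    return min(combined)
```

A usage example would pass the univariate GWAS p-values for the phenotype components together with an estimate of the correlation among those p-values.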
Lin, Tin-Chi; Marucci-Wellman, Helen R; Willetts, Joanna L; Brennan, Melanye J; Verma, Santosh K
2016-12-01
A common issue in descriptive injury epidemiology is that in order to calculate injury rates that account for the time spent in an activity, both injury cases and exposure time of specific activities need to be collected. In reality, few national surveys have this capacity. To address this issue, we combined statistics from two different national complex surveys as inputs for the numerator and denominator to estimate injury rate, accounting for the time spent in specific activities and included a procedure to estimate variance using the combined surveys. The 2010 National Health Interview Survey (NHIS) was used to quantify injuries, and the 2010 American Time Use Survey (ATUS) was used to quantify time of exposure to specific activities. The injury rate was estimated by dividing the average number of injuries (from NHIS) by average exposure hours (from ATUS), both measured for specific activities. The variance was calculated using the 'delta method', a general method for variance estimation with complex surveys. Among the five types of injuries examined, 'sport and exercise' had the highest rate (12.64 injuries per 100 000 h), followed by 'working around house/yard' (6.14), driving/riding a motor vehicle (2.98), working (1.45) and sleeping/resting/eating/drinking (0.23). The results show a ranking of injury rate by activity quite different from estimates using population as the denominator. Our approach produces an estimate of injury risk which includes activity exposure time and may more reliably reflect the underlying injury risks, offering an alternative method for injury surveillance and research. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
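A minimal numeric sketch of the rate and delta-method variance described above, under the stated assumption that the numerator and denominator come from independent surveys (so their covariance is zero); all numbers are illustrative, not NHIS or ATUS estimates.

```python
import numpy as np

def rate_and_variance(injuries_mean, injuries_var, hours_mean, hours_var):
    """Injury rate per hour of exposure, with a delta-method variance.

    Because the numerator (injury) survey and the denominator (time-use)
    survey are independent, the covariance term is dropped.
    """
    rate = injuries_mean / hours_mean
    # Var(A/B) ~ Var(A)/B^2 + A^2 Var(B)/B^4 for independent A and B
    var = injuries_var / hours_mean**2 + (injuries_mean**2 / hours_mean**4) * hours_var
    return rate, var

# illustrative values only
rate, var = rate_and_variance(injuries_mean=2.5, injuries_var=0.04,
                              hours_mean=1.9e4, hours_var=2.5e5)
se = np.sqrt(var)
```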
Nonlinear decoding of a complex movie from the mammalian retina
Deny, Stéphane; Martius, Georg
2018-01-01
The retina is a paradigmatic system for studying sensory encoding: the transformation of light into spiking activity of ganglion cells. The inverse problem, where the stimulus is reconstructed from spikes, has received less attention, especially for complex stimuli that should be reconstructed “pixel-by-pixel”. We recorded around a hundred neurons from a dense patch in a rat retina and decoded movies of multiple small randomly moving discs. We constructed nonlinear (kernelized and neural network) decoders that improved significantly over linear results. An important contribution to this was the ability of nonlinear decoders to reliably separate neural responses driven by locally fluctuating light signals from responses at locally constant light driven by spontaneous-like activity. This improvement crucially depended on the precise, non-Poisson temporal structure of individual spike trains, which originated in the spike-history dependence of neural responses. We propose a general principle by which downstream circuitry could discriminate between spontaneous and stimulus-driven activity based solely on higher-order statistical structure in the incoming spike trains. PMID:29746463
Culture of science: strange history of the methodological thinking in psychology.
Toomela, Aaro
2007-03-01
In pre-World-War-II psychology, two directions in methodological thought, the German-Austrian and the North American, could be differentiated. After the war, the German-Austrian methodological orientation was largely abandoned. Compared to pre-WWII German-Austrian psychology, modern mainstream psychology is more concerned with accumulation of facts than with general theory. Furthermore, the focus on qualitative data, in addition to quantitative data, is rarely visible. Only external (physical or statistical) rather than psychological controls are taken into account in empirical studies. Fragments, rather than wholes, and relationships are studied, and single cases that contradict group data are not analyzed. Instead of complex psychological types, simple trait differences are studied, and prediction is not followed by thorough analysis of the whole situation. Last (but not least), data are not systematically related to complex theory. These limits have hindered the growth of knowledge in the behavioral sciences. A return to an updated version of the German-Austrian methodological trajectory is suggested.
A permutation testing framework to compare groups of brain networks.
Simpson, Sean L; Lyday, Robert G; Hayasaka, Satoru; Marsh, Anthony P; Laurienti, Paul J
2013-01-01
Brain network analyses have moved to the forefront of neuroimaging research over the last decade. However, methods for statistically comparing groups of networks have lagged behind. These comparisons have great appeal for researchers interested in gaining further insight into complex brain function and how it changes across different mental states and disease conditions. Current comparison approaches generally either rely on a summary metric or on mass-univariate nodal or edge-based comparisons that ignore the inherent topological properties of the network, yielding little power and failing to make network level comparisons. Gleaning deeper insights into normal and abnormal changes in complex brain function demands methods that take advantage of the wealth of data present in an entire brain network. Here we propose a permutation testing framework that allows comparing groups of networks while incorporating topological features inherent in each individual network. We validate our approach using simulated data with known group differences. We then apply the method to functional brain networks derived from fMRI data.
Microscale Obstacle Resolving Air Quality Model Evaluation with the Michelstadt Case
Rakai, Anikó; Kristóf, Gergely
2013-01-01
Modelling pollutant dispersion in cities is challenging for air quality models, as the urban obstacles have an important effect on the flow field and thus the dispersion. Computational Fluid Dynamics (CFD) models with an additional scalar dispersion transport equation are a possible way to resolve the flow field in the urban canopy and model dispersion while taking the effect of the buildings into account explicitly. These models need detailed evaluation through verification and validation to gain confidence in their reliability and to use them as a regulatory-purpose tool in complex urban geometries. This paper shows the performance of an open-source, general-purpose CFD code, OpenFOAM, for a complex urban geometry, Michelstadt, which has both flow field and dispersion measurement data. Continuous-release dispersion results are discussed to show the strengths and weaknesses of the modelling approach, focusing on the value of the turbulent Schmidt number, which was found to give the best statistical metric results at a value of 0.7. PMID:24027450
Microscale obstacle resolving air quality model evaluation with the Michelstadt case.
Rakai, Anikó; Kristóf, Gergely
2013-01-01
Modelling pollutant dispersion in cities is challenging for air quality models, as the urban obstacles have an important effect on the flow field and thus the dispersion. Computational Fluid Dynamics (CFD) models with an additional scalar dispersion transport equation are a possible way to resolve the flow field in the urban canopy and model dispersion while taking the effect of the buildings into account explicitly. These models need detailed evaluation through verification and validation to gain confidence in their reliability and to use them as a regulatory-purpose tool in complex urban geometries. This paper shows the performance of an open-source, general-purpose CFD code, OpenFOAM, for a complex urban geometry, Michelstadt, which has both flow field and dispersion measurement data. Continuous-release dispersion results are discussed to show the strengths and weaknesses of the modelling approach, focusing on the value of the turbulent Schmidt number, which was found to give the best statistical metric results at a value of 0.7.
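For orientation, the turbulent Schmidt number enters the Reynolds-averaged scalar transport equation that such CFD dispersion models typically solve; the form below is the standard one and is shown as an assumption rather than the exact OpenFOAM configuration used in the study.

```latex
\frac{\partial C}{\partial t} + \nabla \cdot (\mathbf{u}\, C)
  = \nabla \cdot \left[\left(\frac{\nu}{\mathrm{Sc}} + \frac{\nu_t}{\mathrm{Sc}_t}\right)\nabla C\right] + S_C,
\qquad \mathrm{Sc}_t \equiv \frac{\nu_t}{D_t},
```

where C is the mean concentration, u the mean velocity, ν and ν_t the molecular and turbulent kinematic viscosities, D_t the turbulent mass diffusivity, and S_C the source term; the reported best-performing value Sc_t = 0.7 sets the turbulent diffusivity relative to the turbulent viscosity.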
A new test method for the evaluation of total antioxidant activity of herbal products.
Zaporozhets, Olga A; Krushynska, Olena A; Lipkovska, Natalia A; Barvinchenko, Valentina N
2004-01-14
A new test method for measuring the antioxidant power of herbal products, based on solid-phase spectrophotometry using a tetrabenzo-[b,f,j,n][1,5,9,13]-tetraazacyclohexadecine-Cu(II) complex immobilized on silica gel, is proposed. The absorbance of the modified sorbent (lambda(max) = 712 nm) increases in proportion to the total antioxidant activity of the sample solution. The method represents an attractive alternative to the most widely used radical scavenging capacity assays, which generally require complex, long-lasting stages to be carried out. The proposed test method is simple (a "drop and measure" procedure is applied), rapid (10 min/sample), requires only the monitoring of time and absorbance, and provides good statistical parameters (s(r)
Martian lineaments from Mariner 6 and 7 photographs
NASA Technical Reports Server (NTRS)
Schultz, P. H.; Ingerson, F. E.
1973-01-01
Mariner 6 and 7 photographs were used to investigate the nature and importance of linear surface trends on Mars. Cross correlations of frequency-azimuth distributions of linear trends from different Mariner frames indicate that lineations not recognized as topographic features have a component of pseudoforms, probably introduced during digital reconstruction of the pictures. Similar statistical tests may aid in the analysis of surface trends from future satellites and space probes. The most reliable data were separated into photometrically defined provinces. Meridiani Sinus and Margaritifer Sinus display five major trends in common, which are interpreted as extensions of crustal weaknesses related to the enormous equatorial canyon revealed in Mariner 6 and 9 pictures. Alignments of crater wall segments generally match these trends and suggest structural control of crater plan. Crater chains, however, do not match these trends and are interpreted as secondary impacts. Rose diagrams of lineations in Deucalionis Regio exhibit much more complexity and are believed to reflect a better preserved or more complex geologic history.
White, Thomas E; Rojas, Bibiana; Mappes, Johanna; Rautiala, Petri; Kemp, Darrell J
2017-09-01
Much of what we know about human colour perception has come from psychophysical studies conducted in tightly controlled laboratory settings. An enduring challenge, however, lies in extrapolating this knowledge to the noisy conditions that characterize our actual visual experience. Here we combine statistical models of visual perception with empirical data to explore how chromatic (hue/saturation) and achromatic (luminance) information underpins the detection and classification of stimuli in a complex forest environment. The data best support a simple linear model of stimulus detection as an additive function of both luminance and saturation contrast. The strength of each predictor is modest yet consistent across gross variation in viewing conditions, which accords with expectation based upon general primate psychophysics. Our findings implicate simple visual cues in the guidance of perception amidst natural noise, and highlight the potential for informing human vision via a fusion between psychophysical modelling and real-world behaviour. © 2017 The Author(s).
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
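A minimal numpy sketch of the fixed-effect case of multivariate meta-analysis, the generalized-least-squares pooling to which the second stage of such a two-stage analysis reduces when between-study heterogeneity is ignored; the mvmeta package also fits random-effects models, which are not shown, and all numbers here are illustrative.

```python
import numpy as np

def fixed_effect_mvmeta(thetas, covs):
    """Fixed-effect multivariate meta-analysis by generalized least squares.

    thetas: list of k-vectors of study-specific coefficients (e.g. spline terms)
    covs:   list of k x k within-study covariance matrices
    """
    W = [np.linalg.inv(S) for S in covs]      # precision weights
    V = np.linalg.inv(sum(W))                 # pooled covariance matrix
    pooled = V @ sum(Wi @ t for Wi, t in zip(W, thetas))
    return pooled, V

# two studies, two coefficients each (illustrative numbers)
thetas = [np.array([0.10, -0.03]), np.array([0.14, -0.01])]
covs = [np.diag([0.02, 0.01]), np.diag([0.03, 0.02])]
pooled, V = fixed_effect_mvmeta(thetas, covs)
```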
Self-replication with magnetic dipolar colloids
NASA Astrophysics Data System (ADS)
Dempster, Joshua M.; Zhang, Rui; Olvera de la Cruz, Monica
2015-10-01
Colloidal self-replication represents an exciting research frontier in soft matter physics. Currently, all reported self-replication schemes involve coating colloidal particles with stimuli-responsive molecules to allow switchable interactions. In this paper, we introduce a scheme using ferromagnetic dipolar colloids and preprogrammed external magnetic fields to create an autonomous self-replication system. Interparticle dipole-dipole forces and periodically varying weak-strong magnetic fields cooperate to drive colloid monomers from the solute onto templates, bind them into replicas, and dissolve template complexes. We present three general design principles for autonomous linear replicators, derived from a focused study of a minimalist sphere-dimer magnetic system in which single binding sites allow formation of dimeric templates. We show via statistical models and computer simulations that our system exhibits nonlinear growth of templates and produces nearly exponential growth (low error rate) upon adding an optimized competing electrostatic potential. We devise experimental strategies for constructing the required magnetic colloids based on documented laboratory techniques. We also present qualitative ideas about building more complex self-replicating structures utilizing magnetic colloids.
ERIC Educational Resources Information Center
Olsen, Robert J.
2008-01-01
I describe how data pooling and data visualization can be employed in the first-semester general chemistry laboratory to introduce core statistical concepts such as central tendency and dispersion of a data set. The pooled data are plotted as a 1-D scatterplot, a purpose-designed number line through which statistical features of the data are…
NASA Astrophysics Data System (ADS)
Bianchi, Eugenio; Haggard, Hal M.; Rovelli, Carlo
2017-08-01
We show that in Oeckl's boundary formalism the boundary vectors that do not have a tensor form represent, in a precise sense, statistical states. Therefore the formalism incorporates quantum statistical mechanics naturally. We formulate general-covariant quantum statistical mechanics in this language. We illustrate the formalism by showing how it accounts for the Unruh effect. We observe that the distinction between pure and mixed states weakens in the general covariant context, suggesting that local gravitational processes are naturally statistical without a sharp quantal versus probabilistic distinction.
Learning the Language of Statistics: Challenges and Teaching Approaches
ERIC Educational Resources Information Center
Dunn, Peter K.; Carey, Michael D.; Richardson, Alice M.; McDonald, Christine
2016-01-01
Learning statistics requires learning the language of statistics. Statistics draws upon words from general English, mathematical English, discipline-specific English and words used primarily in statistics. This leads to many linguistic challenges in teaching statistics and the way in which the language is used in statistics creates an extra layer…
Marin, Manuela M.; Leder, Helmut
2013-01-01
Subjective complexity has been found to be related to hedonic measures of preference, pleasantness and beauty, but there is no consensus about the nature of this relationship in the visual and musical domains. Moreover, the affective content of stimuli has been largely neglected so far in the study of complexity but is crucial in many everyday contexts and in aesthetic experiences. We thus propose a cross-domain approach that acknowledges the multidimensional nature of complexity and that uses a wide range of objective complexity measures combined with subjective ratings. In four experiments, we employed pictures of affective environmental scenes, representational paintings, and Romantic solo and chamber music excerpts. Stimuli were pre-selected to vary in emotional content (pleasantness and arousal) and complexity (low versus high number of elements). For each set of stimuli, in a between-subjects design, ratings of familiarity, complexity, pleasantness and arousal were obtained for a presentation time of 25 s from 152 participants. In line with Berlyne’s collative-motivation model, statistical analyses controlling for familiarity revealed a positive relationship between subjective complexity and arousal, and the highest correlations were observed for musical stimuli. Evidence for a mediating role of arousal in the complexity-pleasantness relationship was demonstrated in all experiments, but was only significant for females with regard to music. The direction and strength of the linear relationship between complexity and pleasantness depended on the stimulus type and gender. For environmental scenes, the root mean square contrast measures and measures of compressed file size correlated best with subjective complexity, whereas only edge detection based on phase congruency yielded equivalent results for representational paintings. Measures of compressed file size and event density also showed positive correlations with complexity and arousal in music, which is relevant for the discussion on which aspects of complexity are domain-specific and which are domain-general. PMID:23977295
Marin, Manuela M; Leder, Helmut
2013-01-01
Subjective complexity has been found to be related to hedonic measures of preference, pleasantness and beauty, but there is no consensus about the nature of this relationship in the visual and musical domains. Moreover, the affective content of stimuli has been largely neglected so far in the study of complexity but is crucial in many everyday contexts and in aesthetic experiences. We thus propose a cross-domain approach that acknowledges the multidimensional nature of complexity and that uses a wide range of objective complexity measures combined with subjective ratings. In four experiments, we employed pictures of affective environmental scenes, representational paintings, and Romantic solo and chamber music excerpts. Stimuli were pre-selected to vary in emotional content (pleasantness and arousal) and complexity (low versus high number of elements). For each set of stimuli, in a between-subjects design, ratings of familiarity, complexity, pleasantness and arousal were obtained for a presentation time of 25 s from 152 participants. In line with Berlyne's collative-motivation model, statistical analyses controlling for familiarity revealed a positive relationship between subjective complexity and arousal, and the highest correlations were observed for musical stimuli. Evidence for a mediating role of arousal in the complexity-pleasantness relationship was demonstrated in all experiments, but was only significant for females with regard to music. The direction and strength of the linear relationship between complexity and pleasantness depended on the stimulus type and gender. For environmental scenes, the root mean square contrast measures and measures of compressed file size correlated best with subjective complexity, whereas only edge detection based on phase congruency yielded equivalent results for representational paintings. Measures of compressed file size and event density also showed positive correlations with complexity and arousal in music, which is relevant for the discussion on which aspects of complexity are domain-specific and which are domain-general.
Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan
2015-01-08
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10^-8) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10^-7) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross-phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Vajawat, Mayuri; Deepika, P. C.; Kumar, Vijay; Rajeshwari, P.
2015-01-01
Aim: To compare the efficacy of powered toothbrushes with that of manual toothbrushes in improving gingival health and reducing salivary red complex counts among autistic individuals. Materials and Methods: Forty autistic individuals were selected. The test group received powered toothbrushes, and the control group received manual toothbrushes. Plaque index and gingival index were recorded. Unstimulated saliva was collected for analysis of red complex organisms using polymerase chain reaction. Results: A statistically significant reduction in plaque scores was seen over a period of 12 weeks in both groups (P < 0.001 for the test group and P = 0.002 for the control group). This reduction was statistically more significant in the test group (P = 0.024). A statistically significant reduction in gingival scores was seen over a period of 12 weeks in both groups (P < 0.001 for the test group and P = 0.001 for the control group). This reduction was statistically more significant in the test group (P = 0.042). No statistically significant reduction in the detection rate of red complex organisms was seen at 4 weeks in either group. Conclusion: Powered toothbrushes result in a significant overall improvement in gingival health when constant reinforcement of oral hygiene instructions is given. PMID:26681855
AlignNemo: a local network alignment method to integrate homology and topology.
Ciriello, Giovanni; Mina, Marco; Guzzi, Pietro H; Cannataro, Mario; Guerra, Concettina
2012-01-01
Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionarily related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo.
OPTIMIZING THROUGH CO-EVOLUTIONARY AVALANCHES
DOE Office of Scientific and Technical Information (OSTI.GOV)
S. BOETTCHER; A. PERCUS
2000-08-01
We explore a new general-purpose heuristic for finding high-quality solutions to hard optimization problems. The method, called extremal optimization, is inspired by "self-organized criticality," a concept introduced to describe emergent complexity in many physical systems. In contrast to Genetic Algorithms, which operate on an entire "gene pool" of possible solutions, extremal optimization successively replaces extremely undesirable elements of a sub-optimal solution with new, random ones. Large fluctuations, called "avalanches," ensue that efficiently explore many local optima. Drawing upon models used to simulate far-from-equilibrium dynamics, extremal optimization complements approximation methods inspired by equilibrium statistical physics, such as simulated annealing. With only one adjustable parameter, its performance has proved competitive with more elaborate methods, especially near phase transitions. Those phase transitions are found in the parameter space of most optimization problems, and have recently been conjectured to be the origin of some of the hardest instances in computational complexity. We will demonstrate how extremal optimization can be implemented for a variety of combinatorial optimization problems. We believe that extremal optimization will be a useful tool in the investigation of phase transitions in combinatorial optimization problems, hence valuable in elucidating the origin of computational complexity.
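A minimal Python sketch of the extremal optimization loop described above, applied to a toy Ising spin-glass energy: each spin is assigned a local fitness, the spins are ranked, and one of the worst is replaced (flipped) at every step with a power-law bias over ranks (the single adjustable parameter tau). The fitness definition, tau value, and toy instance are illustrative assumptions.

```python
import numpy as np

def extremal_optimization(J, n_steps=10000, tau=1.4, seed=0):
    """tau-EO on an Ising spin glass with energy H = -sum_{i<j} J_ij s_i s_j."""
    rng = np.random.default_rng(seed)
    n = J.shape[0]
    s = rng.choice([-1, 1], size=n)
    best_s, best_E = s.copy(), -0.5 * s @ J @ s
    # power-law probabilities over ranks, rank 1 = worst fitness
    ranks = np.arange(1, n + 1, dtype=float)
    probs = ranks ** (-tau)
    probs /= probs.sum()
    for _ in range(n_steps):
        fitness = s * (J @ s)               # local satisfaction of each spin
        order = np.argsort(fitness)         # most frustrated spins first
        k = order[rng.choice(n, p=probs)]   # pick a low-fitness spin
        s[k] = -s[k]                        # replace it unconditionally
        E = -0.5 * s @ J @ s
        if E < best_E:
            best_s, best_E = s.copy(), E
    return best_s, best_E

# toy instance: symmetric random couplings with zero diagonal
rng = np.random.default_rng(1)
J = rng.normal(size=(30, 30)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
state, energy = extremal_optimization(J)
```

Unlike simulated annealing there is no acceptance criterion; the large fluctuations ("avalanches") arise because even good configurations are perturbed whenever their worst element is selected.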
Protonation free energy levels in complex molecular systems.
Antosiewicz, Jan M
2008-04-01
All proteins, nucleic acids, and other biomolecules contain residues capable of exchanging protons with their environment. These proton transfer phenomena lead to pH sensitivity of many molecular processes underlying biological phenomena. In the course of biological evolution, Nature has invented some mechanisms to use pH gradients to regulate biomolecular processes inside cells or in interstitial fluids. Therefore, an ability to model protonation equilibria in molecular systems accurately would be of enormous value for our understanding of biological processes and for possible rational influence on them, like in developing pH dependent drugs to treat particular diseases. This work presents a derivation, by thermodynamic and statistical mechanical methods, of an expression for the free energy of a complex molecular system at arbitrary ionization state of its titratable residues. This constitutes one of the elements of modeling protonation equilibria. Starting from a consideration of a simple acid-base equilibrium of a model compound with a single tritratable group, we arrive at an expression which is of general validity for complex systems. The only approximation used in this derivation is the postulating that the interaction energy between any pair of titratable sites does not depend on the protonation states of all the remaining ionizable groups.
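For orientation, a commonly used form of the free energy of a protonation microstate in continuum-electrostatics treatments is shown below; sign conventions and reference states vary between formulations, so this should be read as a generic illustration rather than the exact expression derived in the paper.

```latex
G(\{x\}) \;=\; \sum_{i} x_i \, k_B T \ln 10 \,\bigl(\mathrm{pH} - \mathrm{p}K^{\mathrm{intr}}_{a,i}\bigr)
\;+\; \tfrac{1}{2} \sum_{i \neq j} W_{ij}\, x_i x_j,
```

where x_i ∈ {0,1} is the protonation state of titratable site i, pK^intr_{a,i} its intrinsic pKa in the otherwise unprotonated molecule, and W_ij the pairwise interaction energy between protonated sites.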
The Problem of Size in Robust Design
NASA Technical Reports Server (NTRS)
Koch, Patrick N.; Allen, Janet K.; Mistree, Farrokh; Mavris, Dimitri
1997-01-01
To facilitate the effective solution of multidisciplinary, multiobjective complex design problems, a departure from the traditional parametric design analysis and single objective optimization approaches is necessary in the preliminary stages of design. A necessary tradeoff becomes one of efficiency vs. accuracy as approximate models are sought to allow fast analysis and effective exploration of a preliminary design space. In this paper we apply a general robust design approach for efficient and comprehensive preliminary design to a large complex system: a high speed civil transport (HSCT) aircraft. Specifically, we investigate the HSCT wing configuration design, incorporating life cycle economic uncertainties to identify economically robust solutions. The approach is built on the foundation of statistical experimentation and modeling techniques and robust design principles, and is specialized through incorporation of the compromise Decision Support Problem for multiobjective design. For large problems, however, as in the HSCT example, this robust design approach developed for efficient and comprehensive design breaks down with the problem of size (combinatorial explosion in experimentation and model building with the number of variables), and both efficiency and accuracy are sacrificed. Our focus in this paper is on identifying and discussing the implications and open issues associated with the problem of size for the preliminary design of large complex systems.
Optimal causal inference: estimating stored information and approximating causal architecture.
Still, Susanne; Crutchfield, James P; Ellison, Christopher J
2010-09-01
We introduce an approach to inferring the causal architecture of stochastic dynamical systems that extends rate-distortion theory to use causal shielding, a natural principle of learning. We study two distinct cases of causal inference: optimal causal filtering and optimal causal estimation. Filtering corresponds to the ideal case in which the probability distribution of measurement sequences is known, giving a principled method to approximate a system's causal structure at a desired level of representation. We show that in the limit in which a model-complexity constraint is relaxed, filtering finds the exact causal architecture of a stochastic dynamical system, known as the causal-state partition. From this, one can estimate the amount of historical information the process stores. More generally, causal filtering finds a graded model-complexity hierarchy of approximations to the causal architecture. Abrupt changes in the hierarchy, as a function of approximation, capture distinct scales of structural organization. For nonideal cases with finite data, we show how the correct number of the underlying causal states can be found by optimal causal estimation. A previously derived model-complexity control term allows us to correct for the effect of statistical fluctuations in probability estimates and thereby avoid overfitting.
A brief introduction to mixed effects modelling and multi-model inference in ecology
Donaldson, Lynda; Correa-Cano, Maria Eugenia; Goodwin, Cecily E.D.
2018-01-01
The use of linear mixed effects models (LMMs) is increasingly common in the analysis of biological data. Whilst LMMs offer a flexible approach to modelling a broad range of data types, ecological data are often complex and require complex model structures, and the fitting and interpretation of such models is not always straightforward. The ability to achieve robust biological inference requires that practitioners know how and when to apply these tools. Here, we provide a general overview of current methods for the application of LMMs to biological data, and highlight the typical pitfalls that can be encountered in the statistical modelling process. We tackle several issues regarding methods of model selection, with particular reference to the use of information theory and multi-model inference in ecology. We offer practical solutions and direct the reader to key references that provide further technical detail for those seeking a deeper understanding. This overview should serve as a widely accessible code of best practice for applying LMMs to complex biological problems and model structures, and in doing so improve the robustness of conclusions drawn from studies investigating ecological and evolutionary questions. PMID:29844961
A brief introduction to mixed effects modelling and multi-model inference in ecology.
Harrison, Xavier A; Donaldson, Lynda; Correa-Cano, Maria Eugenia; Evans, Julian; Fisher, David N; Goodwin, Cecily E D; Robinson, Beth S; Hodgson, David J; Inger, Richard
2018-01-01
The use of linear mixed effects models (LMMs) is increasingly common in the analysis of biological data. Whilst LMMs offer a flexible approach to modelling a broad range of data types, ecological data are often complex and require complex model structures, and the fitting and interpretation of such models is not always straightforward. The ability to achieve robust biological inference requires that practitioners know how and when to apply these tools. Here, we provide a general overview of current methods for the application of LMMs to biological data, and highlight the typical pitfalls that can be encountered in the statistical modelling process. We tackle several issues regarding methods of model selection, with particular reference to the use of information theory and multi-model inference in ecology. We offer practical solutions and direct the reader to key references that provide further technical detail for those seeking a deeper understanding. This overview should serve as a widely accessible code of best practice for applying LMMs to complex biological problems and model structures, and in doing so improve the robustness of conclusions drawn from studies investigating ecological and evolutionary questions.
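A minimal Python sketch of fitting a random-intercept linear mixed effects model of the kind discussed above, using statsmodels; the simulated data, grouping variable, and formula are illustrative assumptions (analyses of this kind are also commonly run in R, for example with lme4).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# illustrative data: a response measured across sites, with a site-level random intercept
rng = np.random.default_rng(0)
n_sites, n_per_site = 12, 20
site = np.repeat(np.arange(n_sites), n_per_site)
x = rng.normal(size=n_sites * n_per_site)
site_effect = rng.normal(scale=0.8, size=n_sites)[site]
y = 1.0 + 0.5 * x + site_effect + rng.normal(scale=1.0, size=len(x))
df = pd.DataFrame({"y": y, "x": x, "site": site})

# random intercept for each site; REML fit by default
model = smf.mixedlm("y ~ x", data=df, groups=df["site"])
result = model.fit()
print(result.summary())
```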
A statistical physics perspective on criticality in financial markets
NASA Astrophysics Data System (ADS)
Bury, Thomas
2013-11-01
Stock markets are complex systems exhibiting collective phenomena and particular features such as synchronization, fluctuations distributed as power-laws, non-random structures and similarity to neural networks. Such specific properties suggest that markets operate at a very special point. Financial markets are believed to be critical by analogy to physical systems, but little statistically founded evidence has been given. Through a data-based methodology and comparison to simulations inspired by the statistical physics of complex systems, we show that the Dow Jones and index sets are not rigorously critical. However, financial systems are closer to criticality in the crash neighborhood.
Review of Statistical Methods for Analysing Healthcare Resources and Costs
Mihaylova, Borislava; Briggs, Andrew; O'Hagan, Anthony; Thompson, Simon G
2011-01-01
We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of above data characteristics, may be preferable but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work. Copyright © 2010 John Wiley & Sons, Ltd. PMID:20799344
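A minimal sketch of one of the simpler options listed above, category (III): a single-distribution generalized linear model with a Gamma family and log link for positive, right-skewed costs. The simulated data and names are illustrative; zero costs would instead call for a two-part or Tobit model (category VI).

```python
import numpy as np
import statsmodels.api as sm

# illustrative positive, right-skewed cost data for two trial arms
rng = np.random.default_rng(0)
n = 500
treat = rng.integers(0, 2, size=n)
costs = rng.gamma(shape=2.0, scale=np.exp(7.0 + 0.3 * treat) / 2.0)

X = sm.add_constant(treat.astype(float))
# Gamma family with log link: E[cost] = exp(b0 + b1 * treat)
model = sm.GLM(costs, X, family=sm.families.Gamma(link=sm.families.links.Log()))
result = model.fit()
print(result.summary())
```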
A Hilly path through the thermodynamics and statistical mechanics of protein solutions.
Wills, Peter R
2016-12-01
The opus of Don Winzor in the fields of physical and analytical biochemistry is a major component of that certain antipodean approach to this broad area of research that blossomed in the second half of the twentieth century. The need to formulate problems in terms of thermodynamic nonideality posed the challenge of describing a clear route from molecular interactions to the parameters that biochemists routinely measure. Mapping out this route required delving into the statistical mechanics of solutions of macromolecules, and at every turn mathematically complex, rigorous, general results that had been derived previously, often by Terrell Hill, came to the fore. Central to this work were the definition of the "thermodynamic activity", the pivotal position of the polynomial expansion of the osmotic pressure in terms of molar concentration, and the relationship of virial coefficients to details of the forces between limited-size groups of interacting molecules. All of this was richly exploited in the task of taking account of excluded volume and electrostatic interactions, especially in the use of sedimentation equilibrium to determine values of constants for molecular association reactions. Such an approach has proved relevant to the study of molecular interactions generally, even those between the main macromolecular solute and components of the solvent, by using techniques such as exclusion and affinity chromatography as well as light scattering.
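For orientation, the polynomial (virial) expansion of the osmotic pressure in molar concentration referred to above is commonly written as

```latex
\frac{\Pi}{RT} \;=\; C \;+\; B_2\, C^{2} \;+\; B_3\, C^{3} \;+\; \cdots,
```

where C is the molar concentration of the macromolecular solute and the virial coefficients B_2, B_3, ... encode pair and triplet interactions such as excluded volume and electrostatic repulsion.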
Towards tests of quark-hadron duality with functional analysis and spectral function data
NASA Astrophysics Data System (ADS)
Boito, Diogo; Caprini, Irinel
2017-04-01
The presence of terms that violate quark-hadron duality in the expansion of QCD Green's functions is a generally accepted fact. Recently, a new approach was proposed for the study of duality violations (DVs), which exploits the existence of a rigorous lower bound on the functional distance, measured in a certain norm, between a "true" correlator and its approximant calculated theoretically along a contour in the complex energy plane. In the present paper, we pursue the investigation of functional-analysis-based tests towards their application to real spectral function data. We derive a closed analytic expression for the minimal functional distance based on the general weighted L2 norm and discuss its relation with the distance measured in the L∞ norm. Using fake data sets obtained from a realistic toy model in which we allow for covariances inspired from the publicly available ALEPH spectral functions, we obtain, by Monte Carlo simulations, the statistical distribution of the strength parameter that measures the magnitude of the DV term added to the usual operator product expansion. The results show that, if the region with large errors near the end point of the spectrum in τ decays is excluded, the functional-analysis-based tests using either L2 or L∞ norms are able to detect, in a statistically significant way, the presence of DVs in realistic spectral function pseudodata.
Statistical Analysis of Complexity Generators for Cost Estimation
NASA Technical Reports Server (NTRS)
Rowell, Ginger Holmes
1999-01-01
Predicting the cost of cutting edge new technologies involved with spacecraft hardware can be quite complicated. A new feature of the NASA Air Force Cost Model (NAFCOM), called the Complexity Generator, is being developed to model the complexity factors that drive the cost of space hardware. This parametric approach is also designed to account for the differences in cost, based on factors that are unique to each system and subsystem. The cost driver categories included in this model are weight, inheritance from previous missions, technical complexity, and management factors. This paper explains the Complexity Generator framework, the statistical methods used to select the best model within this framework, and the procedures used to find the region of predictability and the prediction intervals for the cost of a mission.
NASA Astrophysics Data System (ADS)
Ji, Xinye; Shen, Chaopeng; Riley, William J.
2015-12-01
The soil moisture statistical fractal is an important tool for downscaling remotely sensed observations and has the potential to play a key role in multi-scale hydrologic modeling. The fractal was first introduced two decades ago, but relatively little is known regarding how its scaling exponents evolve in time in response to climatic forcings. Previous studies have neglected the process of moisture re-distribution due to regional groundwater flow. In this study we used a physically-based surface-subsurface processes model and numerical experiments to elucidate the patterns and controls of fractal temporal evolution in two U.S. Midwest basins. Groundwater flow was found to introduce large-scale spatial structure, thereby reducing the scaling exponents (τ), which has implications for the transferability of calibrated parameters to predict τ. However, the groundwater effects depend on complex interactions with other physical controls such as soil texture and land use. The fractal scaling exponents, while in general showing a seasonal mode that correlates with mean moisture content, display hysteresis after storm events that can be divided into three phases, consistent with literature findings: (a) wetting, (b) re-organizing, and (c) dry-down. Modeling experiments clearly show that the hysteresis is attributed to soil texture, whose "patchiness" is the primary contributing factor. We generalized phenomenological rules for the impacts of rainfall, soil texture, groundwater flow, and land use on τ evolution. Grid resolution has a mild influence on the results, and there is a strong correlation between predictions of τ from different resolutions. Overall, our results suggest that groundwater flow should be given more consideration in studies of the soil moisture statistical fractal, especially in regions with a shallow water table.
Resolving Structural Variability in Network Models and the Brain
Klimm, Florian; Bassett, Danielle S.; Carlson, Jean M.; Mucha, Peter J.
2014-01-01
Large-scale white matter pathways crisscrossing the cortex create a complex pattern of connectivity that underlies human cognitive function. Generative mechanisms for this architecture have been difficult to identify in part because little is known in general about mechanistic drivers of structured networks. Here we contrast network properties derived from diffusion spectrum imaging data of the human brain with 13 synthetic network models chosen to probe the roles of physical network embedding and temporal network growth. We characterize both the empirical and synthetic networks using familiar graph metrics, but presented here in a more complete statistical form, as scatter plots and distributions, to reveal the full range of variability of each measure across scales in the network. We focus specifically on the degree distribution, degree assortativity, hierarchy, topological Rentian scaling, and topological fractal scaling—in addition to several summary statistics, including the mean clustering coefficient, the shortest path-length, and the network diameter. The models are investigated in a progressive, branching sequence, aimed at capturing different elements thought to be important in the brain, and range from simple random and regular networks, to models that incorporate specific growth rules and constraints. We find that synthetic models that constrain the network nodes to be physically embedded in anatomical brain regions tend to produce distributions that are most similar to the corresponding measurements for the brain. We also find that network models hardcoded to display one network property (e.g., assortativity) do not in general simultaneously display a second (e.g., hierarchy). This relative independence of network properties suggests that multiple neurobiological mechanisms might be at play in the development of human brain network architecture. Together, the network models that we develop and employ provide a potentially useful starting point for the statistical inference of brain network structure from neuroimaging data. PMID:24675546
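A minimal sketch of how several of the graph statistics named above can be computed with networkx on a synthetic network; the generative model and its parameters are placeholders, not the empirical brain data or the 13 models studied in the paper.

```python
# Illustrative only: compute degree, assortativity, clustering, path length and
# diameter for a synthetic stand-in network.
import networkx as nx
import numpy as np

G = nx.barabasi_albert_graph(200, 3, seed=0)   # placeholder generative model

degrees = np.array([d for _, d in G.degree()])
print("mean degree:            ", degrees.mean())
print("degree assortativity:   ", nx.degree_assortativity_coefficient(G))
print("mean clustering coeff.: ", nx.average_clustering(G))
print("average shortest path:  ", nx.average_shortest_path_length(G))
print("diameter:               ", nx.diameter(G))
```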
Generalized functional linear models for gene-based case-control association studies.
Fan, Ruzong; Wang, Yifan; Mills, James L; Carter, Tonia C; Lobach, Iryna; Wilson, Alexander F; Bailey-Wilson, Joan E; Weeks, Daniel E; Xiong, Momiao
2014-11-01
By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. © 2014 WILEY PERIODICALS, INC.
A SURVEY OF LABORATORY AND STATISTICAL ISSUES RELATED TO FARMWORKER EXPOSURE STUDIES
Developing internally valid, and perhaps generalizable, farmworker exposure studies is a complex process that involves many statistical and laboratory considerations. Statistics are an integral component of each study beginning with the design stage and continuing to the final da...
Smith, Joseph M.; Mather, Martha E.
2012-01-01
Ecological indicators are science-based tools used to assess how human activities have impacted environmental resources. For monitoring and environmental assessment, existing species assemblage data can be used to make these comparisons through time or across sites. An impediment to using assemblage data, however, is that these data are complex and need to be simplified in an ecologically meaningful way. Because multivariate statistics are mathematical relationships, statistical groupings may not make ecological sense and will not have utility as indicators. Our goal was to define a process to select defensible and ecologically interpretable statistical simplifications of assemblage data in which researchers and managers can have confidence. For this, we chose a suite of statistical methods, compared the groupings that resulted from these analyses, identified convergence among groupings, then we interpreted the groupings using species and ecological guilds. When we tested this approach using a statewide stream fish dataset, not all statistical methods worked equally well. For our dataset, logistic regression (Log), detrended correspondence analysis (DCA), cluster analysis (CL), and non-metric multidimensional scaling (NMDS) provided consistent, simplified output. Specifically, the Log, DCA, CL-1, and NMDS-1 groupings were ≥60% similar to each other, overlapped with the fluvial-specialist ecological guild, and contained a common subset of species. Groupings based on number of species (e.g., Log, DCA, CL and NMDS) outperformed groupings based on abundance [e.g., principal components analysis (PCA) and Poisson regression]. Although the specific methods that worked on our test dataset have generality, here we are advocating a process (e.g., identifying convergent groupings with redundant species composition that are ecologically interpretable) rather than the automatic use of any single statistical tool. We summarize this process in step-by-step guidance for the future use of these commonly available ecological and statistical methods in preparing assemblage data for use in ecological indicators.
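The convergence check described above can be illustrated with a toy example: groupings of species produced by different multivariate methods are compared with a simple Jaccard similarity and flagged when they are at least 60% similar. The species lists and method labels below are hypothetical.

```python
# Toy sketch of identifying convergent groupings across statistical methods.
from itertools import combinations

groupings = {
    "Log":  {"creek chub", "white sucker", "longnose dace", "blacknose dace"},
    "DCA":  {"creek chub", "white sucker", "longnose dace", "fallfish"},
    "CL-1": {"creek chub", "white sucker", "blacknose dace", "fallfish"},
    "PCA":  {"bluegill", "largemouth bass", "pumpkinseed"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

for (m1, s1), (m2, s2) in combinations(groupings.items(), 2):
    sim = jaccard(s1, s2)
    flag = "convergent" if sim >= 0.60 else ""
    print(f"{m1:5s} vs {m2:5s}: similarity = {sim:.2f} {flag}")
```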
Analysis of Immune Complex Structure by Statistical Mechanics and Light Scattering Techniques.
NASA Astrophysics Data System (ADS)
Busch, Nathan Adams
1995-01-01
The size and structure of immune complexes determine their behavior in the immune system. The chemical physics of the complex formation is not well understood; this is due in part to inadequate characterization of the proteins involved, and in part to the lack of sufficiently well-developed theoretical techniques. Understanding the complex formation will permit rational design of strategies for inhibiting tissue deposition of the complexes. A statistical mechanical model of the proteins based upon the theory of associating fluids was developed. The multipole electrostatic potential for each protein used in this study was characterized for net protein charge, dipole moment magnitude, and dipole moment direction. The binding sites, between the model antigen and antibodies, were characterized for their net surface area, energy, and position relative to the dipole moment of the protein. The equilibrium binding graphs generated with the protein statistical mechanical model compare favorably with experimental data obtained from radioimmunoassay results. The isothermal compressibility predicted by the model agrees with results obtained from dynamic light scattering. The statistical mechanics model was used to investigate association between the model antigen and selected pairs of antibodies. It was found that, in accordance with expectations from thermodynamic arguments, the highest total binding energy yielded complex distributions which were skewed to higher complex size. From examination of the simulated formation of ring structures from linear chain complexes, and from the joint shape probability surfaces, it was found that ring configurations were formed by the "folding" of linear chains until the ends came within binding distance. By comparing single antigen/two antibody systems that differ only in their respective binding site locations, it was found that binding site location influences complex size and shape distributions only when ring formation occurs. The internal potential energy of a ring complex is considerably less than that of the non-associating system; therefore the ring complexes are quite stable and show no evidence of breaking up and collapsing into smaller complexes. Ring formation will occur only in systems where the total free energy of each complex may be minimized. Thus, ring formation occurs, even though it results in entropically unfavorable conformations, if the total free energy can be minimized by doing so.
Dose fractionation theorem in 3-D reconstruction (tomography)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glaeser, R.M.
It is commonly assumed that the large number of projections for single-axis tomography precludes its application to most beam-labile specimens. However, Hegerl and Hoppe have pointed out that the total dose required to achieve statistical significance for each voxel of a computed 3-D reconstruction is the same as that required to obtain a single 2-D image of that isolated voxel, at the same level of statistical significance. Thus a statistically significant 3-D image can be computed from statistically insignificant projections, as long as the total dosage that is distributed among these projections is high enough that it would have resulted in a statistically significant projection, if applied to only one image. We have tested this critical theorem by simulating the tomographic reconstruction of a realistic 3-D model created from an electron micrograph. The simulations verify the basic conclusions of the theorem under conditions of high absorption, signal-dependent noise, varying specimen contrast and missing angular range. Furthermore, the simulations demonstrate that individual projections in the series of fractionated-dose images can be aligned by cross-correlation because they contain significant information derived from the summation of features from different depths in the structure. This latter information is generally not useful for structural interpretation prior to 3-D reconstruction, owing to the complexity of most specimens investigated by single-axis tomography. These results, in combination with dose estimates for imaging single voxels and measurements of radiation damage in the electron microscope, demonstrate that it is feasible to use single-axis tomography with soft X-ray microscopy of frozen-hydrated specimens.
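A toy simulation of the dose fractionation idea (not the original simulation code): with Poisson counting noise, the signal-to-noise ratio of the sum of many low-dose images of the same object is essentially the same as that of a single image recorded with the full dose. Object, dose and frame count are arbitrary choices for illustration.

```python
# Toy sketch: compare feature SNR for one full-dose image vs. the sum of many
# fractionated low-dose images under Poisson noise.
import numpy as np

rng = np.random.default_rng(3)
obj = np.ones((64, 64))
obj[24:40, 24:40] = 1.2           # weak-contrast feature

full_dose = 400.0                 # mean counts per pixel for a single exposure
n_frames = 100                    # number of fractionated, low-dose exposures

single = rng.poisson(full_dose * obj)
frames = rng.poisson((full_dose / n_frames) * obj, size=(n_frames, 64, 64))
summed = frames.sum(axis=0)

def feature_snr(img):
    inside, outside = img[24:40, 24:40].mean(), img[:16, :16].mean()
    return (inside - outside) / np.sqrt(outside)

print("SNR, single full-dose image:     ", feature_snr(single))
print("SNR, sum of fractionated images: ", feature_snr(summed))
```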
NASA Astrophysics Data System (ADS)
Salvi, Kaustubh; Villarini, Gabriele; Vecchi, Gabriel A.
2017-10-01
Unprecedented alterations in precipitation characteristics over the last century and especially in the last two decades have posed serious socio-economic problems to society in terms of hydro-meteorological extremes, in particular flooding and droughts. The origin of these alterations has its roots in changing climatic conditions; however, its threatening implications can only be dealt with through meticulous planning that is based on realistic and skillful decadal precipitation predictions (DPPs). Skillful DPPs represent a very challenging prospect because of the complexities associated with precipitation predictions. Because of the limited skill and coarse spatial resolution, the DPPs provided by General Circulation Models (GCMs) fail to be directly applicable for impact assessment. Here, we focus on nine GCMs and quantify the seasonally and regionally averaged skill in DPPs over the continental United States. We address the problems pertaining to the limited skill and resolution by applying linear and kernel regression-based statistical downscaling approaches. For both the approaches, statistical relationships established over the calibration period (1961-1990) are applied to the retrospective and near future decadal predictions by GCMs to obtain DPPs at ∼4 km resolution. The skill is quantified across different metrics that evaluate potential skill, biases, long-term statistical properties, and uncertainty. Both the statistical approaches show improvements with respect to the raw GCM data, particularly in terms of the long-term statistical properties and uncertainty, irrespective of lead time. The outcome of the study is monthly DPPs from nine GCMs with 4-km spatial resolution, which can be used as a key input for impacts assessments.
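A stripped-down sketch of the linear statistical downscaling step described above: a regression between coarse GCM precipitation and fine-scale observations is calibrated over 1961-1990 and then applied to a decadal prediction. The data, units and single-predictor form are assumptions for illustration only, not the study's configuration.

```python
# Assumed, simplified form of linear statistical downscaling.
import numpy as np

rng = np.random.default_rng(4)

# calibration period: monthly GCM output vs. observed fine-scale precipitation
gcm_cal = rng.gamma(2.0, 30.0, size=360)                 # 30 years x 12 months
obs_cal = 0.8 * gcm_cal + rng.normal(0.0, 10.0, size=360)

slope, intercept = np.polyfit(gcm_cal, obs_cal, 1)

# apply the calibrated relationship to a retrospective decadal prediction
gcm_pred = rng.gamma(2.0, 32.0, size=120)                # one decade of months
downscaled = np.clip(intercept + slope * gcm_pred, 0.0, None)   # no negative rain

print("calibrated slope/intercept:", round(slope, 2), round(intercept, 2))
print("mean downscaled precipitation:", round(downscaled.mean(), 1))
```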
Structural Behavioral Study on the General Aviation Network Based on Complex Network
NASA Astrophysics Data System (ADS)
Zhang, Liang; Lu, Na
2017-12-01
The general aviation system is an open and dissipative system with complex structures and behavioral features. This paper establishes the system model and network model for general aviation. We analyze integral and individual attributes by applying complex network theory and conclude that the general aviation network has influential enterprise factors and node relations. By examining the degree distribution, we check whether the network exhibits the small-world effect, scale-free property, and network centrality that a complex network should have, and show that the general aviation network is indeed a complex network. We therefore propose to drive the evolution of the general aviation industrial chain toward a collaborative innovation cluster of advanced industries by strengthening the network multiplication effect, stimulating innovation performance, and spanning structural holes.
Baker, Bruce W.; Augustine, David J.; Sedgwick, James A.; Lubow, Bruce C.
2013-01-01
Colonial, burrowing herbivores can be engineers of grassland and shrubland ecosystems worldwide. Spatial variation in landscapes suggests caution when extrapolating single-place studies of single species, but lack of data and the need to generalize often leads to ‘model system’ thinking and application of results beyond appropriate statistical inference. Generalizations about the engineering effects of prairie dogs (Cynomys sp.) developed largely from intensive study at a single complex of black-tailed prairie dogs C. ludovicianus in northern mixed prairie, but have been extrapolated to other ecoregions and prairie dog species in North America, and other colonial, burrowing herbivores. We tested the paradigm that prairie dogs decrease vegetation volume and the cover of grasses and tall shrubs, and increase bare ground and forb cover. We sampled vegetation on and off 279 colonies at 13 complexes of 3 prairie dog species widely distributed across 5 ecoregions in North America. The paradigm was generally supported at 7 black-tailed prairie dog complexes in northern mixed prairie, where vegetation volume, grass cover, and tall shrub cover were lower, and bare ground and forb cover were higher, on colonies than at paired off-colony sites. Outside the northern mixed prairie, all 3 prairie dog species consistently reduced vegetation volume, but their effects on cover of plant functional groups varied with prairie dog species and the grazing tolerance of dominant perennial grasses. White-tailed prairie dogs C. leucurus in sagebrush steppe did not reduce shrub cover, whereas black-tailed prairie dogs suppressed shrub cover at all complexes with tall shrubs in the surrounding habitat matrix. Black-tailed prairie dogs in shortgrass steppe and Gunnison's prairie dogs C. gunnisoni in Colorado Plateau grassland both had relatively minor effects on grass cover, which may reflect the dominance of grazing-tolerant shortgrasses at both complexes. Variation in modification of vegetation structure may be understood in terms of the responses of different dominant perennial grasses to intense defoliation and differences in foraging behavior among prairie dog species. Spatial variation in the engineering role of prairie dogs suggests spatial variation in their keystone role, and spatial variation in the roles of other ecosystem engineers. Thus, ecosystem engineering can have a spatial component not evident from single-place studies.
Multiple commodities in statistical microeconomics: Model and market
NASA Astrophysics Data System (ADS)
Baaquie, Belal E.; Yu, Miao; Du, Xin
2016-11-01
A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied here, and a parsimonious generalization of the single-commodity model is made for the multiple-commodity case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, one that is independent of the mainstream formulation of microeconomics.
Cohen, Alan A.; Leroux, Maxime; Faucher, Samuel; Morissette-Thomas, Vincent; Legault, Véronique; Fried, Linda P.; Ferrucci, Luigi
2015-01-01
Physiological dysregulation may underlie aging and many chronic diseases, but is challenging to quantify because of the complexity of the underlying systems. Recently, we described a measure of physiological dysregulation, DM, that uses statistical distance to assess the degree to which an individual’s biomarker profile is normal versus aberrant. However, the sensitivity of DM to details of the calculation method has not yet been systematically assessed. In particular, the number and choice of biomarkers and the definition of the reference population (RP, the population used to define a “normal” profile) may be important. Here, we address this question by validating the method on 44 common clinical biomarkers from three longitudinal cohort studies and one cross-sectional survey. DMs calculated on different biomarker subsets show that while the signal of physiological dysregulation increases with the number of biomarkers included, the value of additional markers diminishes as more are added and inclusion of 10-15 is generally sufficient. As long as enough markers are included, individual markers have little effect on the final metric, and even DMs calculated from mutually exclusive groups of markers correlate with each other at r~0.4-0.5. We also used data subsets to generate thousands of combinations of study populations and RPs to address sensitivity to differences in age range, sex, race, data set, sample size, and their interactions. Results were largely consistent (but not identical) regardless of the choice of RP; however, the signal was generally clearer with a younger and healthier RP, and RPs too different from the study population performed poorly. Accordingly, biomarker and RP choice are not particularly important in most cases, but caution should be used across very different populations or for fine-scale analyses. Biologically, the lack of sensitivity to marker choice and better performance of younger, healthier RPs confirm an interpretation of DM as physiological dysregulation and as an emergent property of a complex system. PMID:25875923
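The statistical-distance measure DM described above can be illustrated as a Mahalanobis-type distance of each individual's biomarker profile from the mean of a reference population (RP); the marker count and data below are simulated, and the exact construction used in the paper may differ in detail.

```python
# Illustrative sketch: Mahalanobis-type distance of biomarker profiles from a
# reference population (simulated data).
import numpy as np

rng = np.random.default_rng(5)
n_markers = 12

ref = rng.normal(0.0, 1.0, size=(500, n_markers))      # "young, healthy" RP
study = rng.normal(0.4, 1.2, size=(200, n_markers))    # study population

mu = ref.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(ref, rowvar=False))

diff = study - mu
dm = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

print("median DM in study population:", round(np.median(dm), 2))
```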
Maher, Anthony D; Fonville, Judith M; Coen, Muireann; Lindon, John C; Rae, Caroline D; Nicholson, Jeremy K
2012-01-17
The high level of complexity in nuclear magnetic resonance (NMR) metabolic spectroscopic data sets has fueled the development of experimental and mathematical techniques that enhance latent biomarker recovery and improve model interpretability. We previously showed that statistical total correlation spectroscopy (STOCSY) can be used to edit NMR spectra to remove drug metabolite signatures that obscure metabolic variation of diagnostic interest. Here, we extend this "STOCSY editing" concept to a generalized scaling procedure for NMR data that enhances recovery of latent biochemical information and improves biological classification and interpretation. We call this new procedure STOCSY-scaling (STOCSY(S)). STOCSY(S) exploits the fixed proportionality in a set of NMR spectra between resonances from the same molecule to suppress or enhance features correlated with a resonance of interest. We demonstrate this new approach using two exemplar data sets: (a) a streptozotocin rat model (n = 30) of type 1 diabetes and (b) a human epidemiological study utilizing plasma NMR spectra of patients with metabolic syndrome (n = 67). In both cases significant biomarker discovery improvement was observed by using STOCSY(S): the approach successfully suppressed interfering NMR signals from glucose and lactate that otherwise dominate the variation in the streptozotocin study, which then allowed recovery of biomarkers such as glycine, which were otherwise obscured. In the metabolic syndrome study, we used STOCSY(S) to enhance variation from the high-density lipoprotein cholesterol peak, improving the prediction of individuals with metabolic syndrome from controls in orthogonal projections to latent structures discriminant analysis models and facilitating the biological interpretation of the results. Thus, STOCSY(S) is a versatile technique that is applicable in any situation in which variation, either biological or otherwise, dominates a data set at the expense of more interesting or important features. This approach is generally appropriate for many types of NMR-based complex mixture analyses and hence for wider applications in bioanalytical science.
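As a schematic illustration of correlation-based editing/scaling of spectra, the sketch below weights each spectral variable by (1 - r^2), where r is its correlation with a chosen driver resonance, to suppress a dominating signal such as glucose. The (1 - r^2) weighting and the simulated spectra are illustrative choices and are not claimed to be the exact STOCSY(S) formula or data.

```python
# Schematic sketch of correlation-based suppression of a dominating resonance.
import numpy as np

rng = np.random.default_rng(6)
n_spectra, n_points = 60, 500
spectra = rng.normal(0.0, 1.0, size=(n_spectra, n_points))

# add a dominating "glucose-like" signal spread over several correlated points
driver_amp = rng.normal(5.0, 2.0, size=n_spectra)
glucose_idx = np.arange(100, 140)
spectra[:, glucose_idx] += driver_amp[:, None]

driver = spectra[:, 120]                     # resonance of interest (driver peak)
r = np.array([np.corrcoef(driver, spectra[:, j])[0, 1] for j in range(n_points)])

suppressed = spectra * (1.0 - r**2)          # down-weight variables tracking the driver
print("variance in glucose region before/after:",
      round(spectra[:, glucose_idx].var(), 2), round(suppressed[:, glucose_idx].var(), 2))
```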
Does speed matter? The impact of operative time on outcome in laparoscopic surgery.
Jackson, Timothy D; Wannares, Jeffrey J; Lancaster, R Todd; Rattner, David W; Hutter, Matthew M
2011-07-01
Controversy exists concerning the importance of operative time on patient outcomes. It is unclear whether faster is better or haste makes waste or similarly whether slower procedures represent a safe, meticulous approach or inexperienced dawdling. The objective of the present study was to determine the effect of operative time on 30-day outcomes in laparoscopic surgery. Patients who underwent laparoscopic general surgery procedures (colectomy, cholecystectomy, Nissen fundoplication, inguinal hernia, and gastric bypass) from the ACS-NSQIP 2005-2008 participant use file were identified. Exclusion criteria were defined a priori to identify same-day admission, elective procedures. Operative time was divided into deciles and summary statistics were analyzed. Univariate analyses using a Cochran-Armitage test for trend were completed. The effect of operative time on 30-day morbidity was further analyzed for each procedure type using multivariate regression controlling for case complexity and additional patient factors. Patients within the highest deciles were excluded to reduce outlier effect. A total of 76,748 elective general surgical patients who underwent laparoscopic procedures were analyzed. Univariate analyses of deciles of operative time demonstrated a statistically significant trend (p<0.0001) toward increasing odds of complications with increasing operative time for laparoscopic colectomy (n=10,135), cholecystectomy (n=37,407), Nissen fundoplication (n=4,934), and gastric bypass (n=17,842). The trend was not found to be significant for laparoscopic inguinal hernia repair (n=6,430; p=0.14). Multivariate modeling revealed the effect of operative time to remain significant after controlling for additional patient factors. Increasing operative time was associated with increased odds of complications and, therefore, it appears that speed may matter in laparoscopic surgery. These analyses are limited in their inability to adjust for all patient factors, potential confounders, and case complexities. Additional hierarchical multivariate analyses at the surgeon level would be important to examine this relationship further.
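The univariate trend analysis mentioned above can be illustrated with a compact implementation of the Cochran-Armitage test for trend across deciles of operative time; the complication counts below are made up for the example and are not the study's data.

```python
# Sketch of a Cochran-Armitage trend test across ordered deciles (simulated counts).
import numpy as np
from scipy.stats import norm

# complications / totals in 10 ordered deciles of operative time (hypothetical)
events = np.array([30, 33, 35, 40, 44, 49, 52, 60, 66, 75])
totals = np.array([800] * 10)
scores = np.arange(1, 11, dtype=float)      # decile scores

N, R = totals.sum(), events.sum()
p = R / N
T = np.sum(scores * (events - totals * p))
var_T = p * (1 - p) * (np.sum(scores**2 * totals) - np.sum(scores * totals)**2 / N)
z = T / np.sqrt(var_T)
p_value = 2 * norm.sf(abs(z))

print(f"Cochran-Armitage z = {z:.2f}, two-sided p = {p_value:.2e}")
```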
PREFACE: Counting Complexity: An international workshop on statistical mechanics and combinatorics
NASA Astrophysics Data System (ADS)
de Gier, Jan; Warnaar, Ole
2006-07-01
On 10-15 July 2005 the conference `Counting Complexity: An international workshop on statistical mechanics and combinatorics' was held on Dunk Island, Queensland, Australia in celebration of Tony Guttmann's 60th birthday. Dunk Island provided the perfect setting for engaging in almost all of Tony's life-long passions: swimming, running, food, wine and, of course, plenty of mathematics and physics. The conference was attended by many of Tony's close scientific friends from all over the world, and most talks were presented by his past and present collaborators. This volume contains the proceedings of the meeting and consists of 24 refereed research papers in the fields of statistical mechanics, condensed matter physics and combinatorics. These papers provide an excellent illustration of the breadth and scope of Tony's work. The very first contribution, written by Stu Whittington, contains an overview of the many scientific achievements of Tony over the past 40 years in mathematics and physics. The organizing committee, consisting of Richard Brak, Aleks Owczarek, Jan de Gier, Emma Lockwood, Andrew Rechnitzer and Ole Warnaar, gratefully acknowledges the Australian Mathematical Society (AustMS), the Australian Mathematical Sciences Institute (AMSI), the ARC Centre of Excellence for Mathematics and Statistics of Complex Systems (MASCOS), the ARC Complex Open Systems Research Network (COSNet), the Institute of Physics (IOP) and the Department of Mathematics and Statistics of The University of Melbourne for financial support in organizing the conference. Tony, we hope that your future years in mathematics will be numerous. Count yourself lucky! Tony Guttmann
Miernik, Marta; Madziarska, Katarzyna; Klinger, Marian; Weyde, Wacław; Więckiewicz, Włodzimierz
2017-08-01
End-stage renal disease (ESRD) patients are considered a group at high risk of oral cavity diseases. One of the determinants of alveolar bone loss and increased tooth mobility in ESRD patients might be the bone abnormalities associated with chronic kidney disease-mineral and bone disorder (CKD-MBD). The aim of the study was to compare the general health condition and the number and location of teeth in a group of ESRD patients with a group of peers from the general population, and to reveal the risk factors for tooth loss. The ESRD group included 63 patients, 23 females and 40 males, undergoing dialysis, with a mean age of 62.4 ± 15.6 years. The general population sample consisted of 37 people, 20 females and 17 males, attending a general practitioner, with a mean age of 65.5 ± 11.1 years. All the participants relied solely on public health care insurance. The data analysis was based on anamnesis, history of CKD, selected biochemical blood parameters and clinical examination. There was no statistical difference in the prosthetic needs of patients undergoing dialysis and the general population; in both groups the situation is alarming. New procedures are needed to develop comprehensive health care for ESRD and general population patients, emphasizing prophylaxis of tooth loss and prosthetic treatment in order to maintain a good quality of life.
Uncovering Local Trends in Genetic Effects of Multiple Phenotypes via Functional Linear Models.
Vsevolozhskaya, Olga A; Zaykin, Dmitri V; Barondess, David A; Tong, Xiaoren; Jadhav, Sneha; Lu, Qing
2016-04-01
Recent technological advances equipped researchers with capabilities that go beyond traditional genotyping of loci known to be polymorphic in a general population. Genetic sequences of study participants can now be assessed directly. This capability removed technology-driven bias toward scoring predominantly common polymorphisms and let researchers reveal a wealth of rare and sample-specific variants. Although the relative contributions of rare and common polymorphisms to trait variation are being debated, researchers are faced with the need for new statistical tools for simultaneous evaluation of all variants within a region. Several research groups demonstrated flexibility and good statistical power of the functional linear model approach. In this work we extend previous developments to allow inclusion of multiple traits and adjustment for additional covariates. Our functional approach is unique in that it provides a nuanced depiction of effects and interactions for the variables in the model by representing them as curves varying over a genetic region. We demonstrate flexibility and competitive power of our approach by contrasting its performance with commonly used statistical tools and illustrate its potential for discovery and characterization of genetic architecture of complex traits using sequencing data from the Dallas Heart Study. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
Strategies for mapping heterogeneous recessive traits by allele-sharing methods.
Feingold, E; Siegmund, D O
1997-01-01
We investigate strategies for detecting linkage of recessive and partially recessive traits, using sibling pairs and inbred individuals. We assume that a genomewide search is being conducted and that locus heterogeneity of the trait is likely. For sibling pairs, we evaluate the efficiency of different statistics under the assumption that one does not know the true degree of recessiveness of the trait. We recommend a sibling-pair statistic that is a linear compromise between two previously suggested statistics. We also compare the power of sibling pairs to that of more distant relatives, such as cousins. For inbred individuals, we evaluate the power of offspring of different types of matings and compare them to sibling pairs. Over a broad range of trait etiologies, sibling pairs are more powerful than inbred individuals, but for traits caused by very rare alleles, particularly in the case of heterogeneity, inbred individuals can be much more powerful. The models we develop can also be used to examine specific situations other than those we look at. We present this analysis in the idealized context of a dense set of highly polymorphic markers. In general, incorporation of real-world complexities makes inbred individuals, particularly offspring of distant relatives, look slightly less useful than our results imply. PMID:9106544
NASA Astrophysics Data System (ADS)
Spampinato, A.; Axinte, D. A.
2017-12-01
The mechanisms of interaction between bodies with statistically arranged features present characteristics common to different abrasive processes, such as dressing of abrasive tools. In contrast with the current empirical approach used to estimate the results of operations based on attritive interactions, the method we present in this paper allows us to predict the output forces and the topography of a simulated grinding wheel for a set of specific operational parameters (speed ratio and radial feed-rate), providing a thorough understanding of the complex mechanisms regulating these processes. In modelling the dressing mechanisms, the abrasive characteristics of both bodies (grain size, geometry, inter-space and protrusion) are first simulated; their interaction is then simulated in terms of grain collisions. Exploiting a specifically designed contact/impact evaluation algorithm, the model simulates the collisional effects of the dresser abrasives on the grinding wheel topography (grain fracture/break-out). The method has been tested for the case of a diamond rotary dresser, predicting output forces with less than 10% error and obtaining experimentally validated grinding wheel topographies. The study provides a fundamental understanding of the dressing operation, enabling the improvement of its performance in an industrial scenario, while being of general interest in modelling collision-based processes involving statistically distributed elements.
Resolving the Antarctic contribution to sea-level rise: a hierarchical modelling framework.
Zammit-Mangion, Andrew; Rougier, Jonathan; Bamber, Jonathan; Schön, Nana
2014-06-01
Determining the Antarctic contribution to sea-level rise from observational data is a complex problem. The number of physical processes involved (such as ice dynamics and surface climate) exceeds the number of observables, some of which have very poor spatial definition. This has led, in general, to solutions that utilise strong prior assumptions or physically based deterministic models to simplify the problem. Here, we present a new approach for estimating the Antarctic contribution, which only incorporates descriptive aspects of the physically based models in the analysis and in a statistical manner. By combining physical insights with modern spatial statistical modelling techniques, we are able to provide probability distributions on all processes deemed to play a role in both the observed data and the contribution to sea-level rise. Specifically, we use stochastic partial differential equations and their relation to geostatistical fields to capture our physical understanding and employ a Gaussian Markov random field approach for efficient computation. The method, an instantiation of Bayesian hierarchical modelling, naturally incorporates uncertainty in order to reveal credible intervals on all estimated quantities. The estimated sea-level rise contribution using this approach corroborates those found using a statistically independent method. © 2013 The Authors. Environmetrics Published by John Wiley & Sons, Ltd.
Liu, Wei; Ding, Jinhui
2018-04-01
The application of the principle of the intention-to-treat (ITT) to the analysis of clinical trials is challenged in the presence of missing outcome data. The consequences of stopping an assigned treatment in a withdrawn subject are unknown. It is difficult to make a single assumption about missing mechanisms for all clinical trials because there are complicated reactions in the human body to drugs due to the presence of complex biological networks, leading to data missing randomly or non-randomly. Currently there is no statistical method that can tell whether a difference between two treatments in the ITT population of a randomized clinical trial with missing data is significant at a pre-specified level. Making no assumptions about the missing mechanisms, we propose a generalized complete-case (GCC) analysis based on the data of completers. An evaluation of the impact of missing data on the ITT analysis reveals that a statistically significant GCC result implies a significant treatment effect in the ITT population at a pre-specified significance level unless, relative to the comparator, the test drug is poisonous to the non-completers as documented in their medical records. Applications of the GCC analysis are illustrated using literature data, and its properties and limits are discussed.
Statistical and methodological considerations for the interpretation of intranasal oxytocin studies
Walum, Hasse; Waldman, Irwin D.; Young, Larry J.
2015-01-01
Over the last decade, oxytocin (OT) has received focus in numerous studies associating intranasal administration of this peptide with various aspects of human social behavior. These studies in humans are inspired by animal research, especially in rodents, showing that central manipulations of the OT system affect behavioral phenotypes related to social cognition, including parental behavior, social bonding and individual recognition. Taken together, these studies in humans appear to provide compelling, but sometimes bewildering evidence for the role of OT in influencing a vast array of complex social cognitive processes in humans. In this paper we investigate to what extent the human intranasal OT literature lends support to the hypothesis that intranasal OT consistently influences a wide spectrum of social behavior in humans. We do this by considering statistical features of studies within this field, including factors like statistical power, pre-study odds and bias. Our conclusion is that intranasal OT studies are generally underpowered and that there is a high probability that most of the published intranasal OT findings do not represent true effects. Thus the remarkable reports that intranasal OT influences a large number of human social behaviors should be viewed with healthy skepticism, and we make recommendations to improve the reliability of human OT studies in the future. PMID:26210057
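The argument about power and pre-study odds can be made concrete with the standard relation PPV = (power × R) / (power × R + alpha), where R is the pre-study odds of a true effect and bias is ignored; the numbers below are hypothetical examples, not estimates taken from the paper.

```python
# Illustration: probability that a nominally significant finding is true (PPV)
# as a function of statistical power and pre-study odds, ignoring bias.
alpha = 0.05

for power in (0.80, 0.20, 0.10):
    for prestudy_odds in (1.0, 0.25, 0.10):
        ppv = (power * prestudy_odds) / (power * prestudy_odds + alpha)
        print(f"power={power:.2f}, pre-study odds={prestudy_odds:.2f} -> PPV={ppv:.2f}")
```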
Statistical analysis of magnetically soft particles in magnetorheological elastomers
NASA Astrophysics Data System (ADS)
Gundermann, T.; Cremer, P.; Löwen, H.; Menzel, A. M.; Odenbach, S.
2017-04-01
The physical properties of magnetorheological elastomers (MRE) are a complex issue and can be influenced and controlled in many ways, e.g. by applying a magnetic field, by external mechanical stimuli, or by an electric potential. In general, the response of MRE materials to these stimuli is crucially dependent on the distribution of the magnetic particles inside the elastomer. Specific knowledge of the interactions between particles or particle clusters is of high relevance for understanding the macroscopic rheological properties and provides an important input for theoretical calculations. In order to gain a better insight into the correlation between the macroscopic effects and microstructure and to generate a database for theoretical analysis, x-ray micro-computed tomography (X-μCT) investigations as a base for a statistical analysis of the particle configurations were carried out. Different MREs with quantities of 2-15 wt% (0.27-2.3 vol%) of iron powder and different allocations of the particles inside the matrix were prepared. The X-μCT results were edited by an image processing software regarding the geometrical properties of the particles with and without the influence of an external magnetic field. Pair correlation functions for the positions of the particles inside the elastomer were calculated to statistically characterize the distributions of the particles in the samples.
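A minimal sketch (not the study's analysis code) of estimating a radial pair correlation function g(r) from 3-D particle centre coordinates, ignoring edge corrections for brevity; the box size, particle count and units are placeholders.

```python
# Sketch: radial pair correlation function g(r) from particle positions
# (uniform random positions as a stand-in; edge effects neglected).
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(7)
L = 100.0                                    # box edge length (arbitrary units)
pos = rng.uniform(0.0, L, size=(2000, 3))    # particle centres, e.g. from X-muCT

dists = pdist(pos)
r_edges = np.linspace(1.0, 20.0, 40)
counts, _ = np.histogram(dists, bins=r_edges)

n = len(pos)
density = n / L**3
shell_vol = 4.0 / 3.0 * np.pi * (r_edges[1:]**3 - r_edges[:-1]**3)
expected = 0.5 * n * density * shell_vol     # expected pair counts if uncorrelated
g_r = counts / expected

r_mid = 0.5 * (r_edges[1:] + r_edges[:-1])
print(np.round(r_mid[:5], 1), np.round(g_r[:5], 2), "... g(r) near 1 for an uncorrelated sample")
```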
50 CFR 600.410 - Collection and maintenance of statistics.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 50 Wildlife and Fisheries 10 2011-10-01 2011-10-01 false Collection and maintenance of statistics... of Statistics § 600.410 Collection and maintenance of statistics. (a) General. (1) All statistics..., the Assistant Administrator will remove all identifying particulars from the statistics if doing so is...
50 CFR 600.410 - Collection and maintenance of statistics.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 50 Wildlife and Fisheries 8 2010-10-01 2010-10-01 false Collection and maintenance of statistics... of Statistics § 600.410 Collection and maintenance of statistics. (a) General. (1) All statistics..., the Assistant Administrator will remove all identifying particulars from the statistics if doing so is...