Random matrix ensembles for many-body quantum systems
NASA Astrophysics Data System (ADS)
Vyas, Manan; Seligman, Thomas H.
2018-04-01
Classical random matrix ensembles were originally introduced in physics to approximate quantum many-particle nuclear interactions. However, there exists a plethora of quantum systems whose dynamics is explained in terms of few-particle (predominantly two-particle) interactions. The random matrix models incorporating the few-particle nature of interactions are known as embedded random matrix ensembles. In the present paper, we provide a brief overview of these two ensembles and illustrate how the embedded ensembles can be successfully used to study decoherence of a qubit interacting with an environment, both for fermionic and bosonic embedded ensembles. Numerical calculations show the dependence of decoherence on the nature of the environment.
Symmetry Transition Preserving Chirality in QCD: A Versatile Random Matrix Model
NASA Astrophysics Data System (ADS)
Kanazawa, Takuya; Kieburg, Mario
2018-06-01
We consider a random matrix model which interpolates between the chiral Gaussian unitary ensemble and the Gaussian unitary ensemble while preserving chiral symmetry. This ensemble describes flavor symmetry breaking for staggered fermions in 3D QCD as well as in 4D QCD at high temperature or in 3D QCD at a finite isospin chemical potential. Our model is an Osborn-type two-matrix model which is equivalent to the elliptic ensemble but we consider the singular value statistics rather than the complex eigenvalue statistics. We report on exact results for the partition function and the microscopic level density of the Dirac operator in the ɛ regime of QCD. We compare these analytical results with Monte Carlo simulations of the matrix model.
Chaos and random matrices in supersymmetric SYK
NASA Astrophysics Data System (ADS)
Hunter-Jones, Nicholas; Liu, Junyu
2018-05-01
We use random matrix theory to explore late-time chaos in supersymmetric quantum mechanical systems. Motivated by the recent study of supersymmetric SYK models and their random matrix classification, we consider the Wishart-Laguerre unitary ensemble and compute the spectral form factors and frame potentials to quantify chaos and randomness. Compared to the Gaussian ensembles, we observe the absence of a dip regime in the form factor and a slower approach to Haar-random dynamics. We find agreement between our random matrix analysis and predictions from the supersymmetric SYK model, and discuss the implications for supersymmetric chaotic systems.
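A minimal numerical sketch of the central quantity above: sampling the Wishart-Laguerre unitary ensemble (LUE) and estimating its spectral form factor. The matrix size, sample count, and time grid are illustrative choices, not taken from the paper.

```python
import numpy as np

def lue_eigenvalues(n, rng):
    """Eigenvalues of a Wishart-Laguerre (LUE) matrix W = G G^dagger,
    with G an n x n complex Ginibre matrix."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return np.linalg.eigvalsh(g @ g.conj().T)

def spectral_form_factor(times, n=64, samples=200, seed=0):
    """Ensemble-averaged |Z(it)|^2 / n^2 with Z(it) = sum_k exp(-i t E_k)."""
    rng = np.random.default_rng(seed)
    sff = np.zeros_like(times)
    for _ in range(samples):
        ev = lue_eigenvalues(n, rng)
        z = np.exp(-1j * np.outer(times, ev)).sum(axis=1)
        sff += np.abs(z) ** 2
    return sff / (samples * n**2)

times = np.logspace(-2, 2, 200)
g = spectral_form_factor(times)
# For Gaussian ensembles one sees a dip-ramp-plateau shape; the abstract
# reports the absence of a dip regime for the Wishart-Laguerre case.
```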
Gravitational lensing by eigenvalue distributions of random matrix models
NASA Astrophysics Data System (ADS)
Martínez Alonso, Luis; Medina, Elena
2018-05-01
We propose to use eigenvalue densities of unitary random matrix ensembles as mass distributions in gravitational lensing. The corresponding lens equations reduce to algebraic equations in the complex plane which can be treated analytically. We prove that these models can be applied to describe lensing by systems of edge-on galaxies. We illustrate our analysis with the Gaussian and the quartic unitary matrix ensembles.
NASA Astrophysics Data System (ADS)
Olekhno, N. A.; Beltukov, Y. M.
2018-05-01
Random impedance networks are widely used as a model to describe plasmon resonances in disordered metal-dielectric and other two-component nanocomposites. In the present work, the spectral properties of resonances in random networks are studied within the framework of random matrix theory. We have shown that the appropriate ensemble of random matrices for the considered problem is the Jacobi ensemble (the MANOVA ensemble). The obtained analytical expressions for the density of states in such resonant networks show a good agreement with the results of numerical simulations in a wide range of metal filling fractions 0 < p < 1.
Random density matrices versus random evolution of open system
NASA Astrophysics Data System (ADS)
Pineda, Carlos; Seligman, Thomas H.
2015-10-01
We present and compare two families of ensembles of random density matrices. The first, static ensemble, is obtained by foliating an unbiased ensemble of density matrices. As the criterion we use fixed purity, the simplest example of a useful convex function. The second, dynamic ensemble, is inspired by random matrix models for decoherence, where one evolves a separable pure state with a random Hamiltonian until a given value of purity in the central system is achieved. Several families of Hamiltonians, adequate for different physical situations, are studied. We focus on a two-qubit central system, and obtain exact expressions for the static case. The ensemble displays a peak around Werner-like states, modulated by nodes on the degeneracies of the density matrices. For moderate and strong interactions good agreement between the static and the dynamic ensembles is found. Even in a model where one qubit does not interact with the environment, excellent agreement is found, but only if there is maximal entanglement with the interacting one. The discussion opens by recalling similar considerations for scattering theory. At the end, we comment on the reach of the results for other convex functions of the density matrix, and exemplify the situation with the von Neumann entropy.
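The static ensemble described here can be illustrated with a crude accept/reject sketch: draw unbiased (Hilbert-Schmidt) random density matrices and keep those whose purity lands near a fixed value. This is a numerical stand-in, not the paper's analytic foliation; dimensions and tolerances are arbitrary.

```python
import numpy as np

def hs_random_density_matrix(dim, rng):
    """Unbiased (Hilbert-Schmidt) random density matrix rho = G G^dag / tr(G G^dag)."""
    g = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    w = g @ g.conj().T
    return w / np.trace(w).real

def static_ensemble(dim=4, target_purity=0.5, tol=0.01, samples=20000, seed=0):
    """Crude foliation: accept members of the unbiased ensemble whose purity
    tr(rho^2) lies within `tol` of the target value."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(samples):
        rho = hs_random_density_matrix(dim, rng)
        if abs(np.trace(rho @ rho).real - target_purity) < tol:
            out.append(rho)
    return out

ensemble = static_ensemble()
print(len(ensemble), "two-qubit density matrices near purity 0.5")
```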
A random matrix approach to credit risk.
Münnix, Michael C; Schäfer, Rudi; Guhr, Thomas
2014-01-01
We estimate generic statistical properties of a structural credit risk model by considering an ensemble of correlation matrices. This ensemble is set up by Random Matrix Theory. We demonstrate analytically that the presence of correlations severely limits the effect of diversification in a credit portfolio if the correlations are not identically zero. The existence of correlations alters the tails of the loss distribution considerably, even if their average is zero. Under the assumption of randomly fluctuating correlations, a lower bound for the estimation of the loss distribution is provided.
Crossover ensembles of random matrices and skew-orthogonal polynomials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumar, Santosh, E-mail: skumar.physics@gmail.com; Pandey, Akhilesh, E-mail: ap0700@mail.jnu.ac.in
2011-08-15
Highlights: We study crossover ensembles of the Jacobi family of random matrices. We consider correlations for orthogonal-unitary and symplectic-unitary crossovers. We use the method of skew-orthogonal polynomials and quaternion determinants. We prove universality of spectral correlations in crossover ensembles. We discuss applications to quantum conductance and communication theory problems. Abstract: In a recent paper (S. Kumar, A. Pandey, Phys. Rev. E 79 (2009) 026211) we considered the Jacobi family (including the Laguerre and Gaussian cases) of random matrix ensembles and reported exact solutions of crossover problems involving time-reversal symmetry breaking. In the present paper we give details of the work. We start with Dyson's Brownian motion description of random matrix ensembles and obtain universal hierarchic relations among the unfolded correlation functions. For arbitrary dimensions we derive the joint probability density (jpd) of eigenvalues for all transitions leading to unitary ensembles as equilibrium ensembles. We focus on the orthogonal-unitary and symplectic-unitary crossovers and give generic expressions for the jpd of eigenvalues, two-point kernels and n-level correlation functions. This involves a generalization of the theory of skew-orthogonal polynomials to crossover ensembles. We also consider crossovers in the circular ensembles to show the generality of our method. In the large dimensionality limit, correlations in spectra with arbitrary initial density are shown to be universal when expressed in terms of a rescaled symmetry-breaking parameter. Applications of our crossover results to communication theory and quantum conductance problems are also briefly discussed.
Distribution of Schmidt-like eigenvalues for Gaussian ensembles of the random matrix theory
NASA Astrophysics Data System (ADS)
Pato, Mauricio P.; Oshanin, Gleb
2013-03-01
We study the probability distribution function P_n^{(β)}(w) of the Schmidt-like random variable w = x_1^2/(n^{-1}∑_{j=1}^n x_j^2), where the x_j (j = 1, 2, …, n) are unordered eigenvalues of a given n × n β-Gaussian random matrix, β being the Dyson symmetry index. This variable, by definition, can be considered as a measure of how any individual (randomly chosen) eigenvalue deviates from the arithmetic mean value of all eigenvalues of a given random matrix, and its distribution is calculated with respect to the ensemble of such β-Gaussian random matrices. We show that in the asymptotic limit n → ∞ and for arbitrary β the distribution P_n^{(β)}(w) converges to the Marčenko-Pastur form, i.e. P_n^{(β)}(w) ∼ √((4 − w)/w) for w ∈ [0, 4] and equals zero outside of this support, despite the fact that formally w is defined on the interval [0, n]. Furthermore, for the Gaussian unitary ensemble (β = 2) we present exact explicit expressions for P_n^{(β=2)}(w) which are valid for arbitrary n and analyse their behaviour.
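A quick Monte Carlo check of the claimed Marčenko-Pastur limit, under the assumption that the β = 2 case is realized by standard GUE matrices; matrix size and sample counts are illustrative.

```python
import numpy as np

def gue_eigenvalues(n, rng):
    """Eigenvalues of an n x n GUE (beta = 2) matrix."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = (a + a.conj().T) / 2
    return np.linalg.eigvalsh(h)

def sample_w(n=50, samples=5000, seed=0):
    """w = x_1^2 / (n^{-1} sum_j x_j^2) for a randomly chosen (unordered) eigenvalue."""
    rng = np.random.default_rng(seed)
    ws = np.empty(samples)
    for i in range(samples):
        x = gue_eigenvalues(n, rng)
        x1 = x[rng.integers(n)]          # unordered: pick one eigenvalue at random
        ws[i] = x1**2 / np.mean(x**2)
    return ws

w = sample_w()
# Compare the histogram of w on [0, 4] with the normalized Marcenko-Pastur
# density p(w) = sqrt((4 - w)/w) / (2*pi).
```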
Spectral statistics of random geometric graphs
NASA Astrophysics Data System (ADS)
Dettmann, C. P.; Georgiou, O.; Knight, G.
2017-04-01
We use random matrix theory to study the spectrum of random geometric graphs, a fundamental model of spatial networks. Considering ensembles of random geometric graphs we look at short-range correlations in the level spacings of the spectrum via the nearest-neighbour and next-nearest-neighbour spacing distributions, and at long-range correlations via the spectral rigidity Δ3 statistic. These correlations in the level spacings give information about the localisation of eigenvectors, the level of community structure and the level of randomness within the networks. We find a parameter-dependent transition between Poisson and Gaussian orthogonal ensemble statistics. That is, the spectral statistics of spatial random geometric graphs fit the universality of random matrix theory found in other models such as Erdős-Rényi, Barabási-Albert and Watts-Strogatz random graphs.
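The Poisson-to-GOE transition can be probed with a short numerical sketch. Instead of the unfolded spacing distributions and Δ3 used in the paper, the sketch below uses adjacency-spectrum spacing ratios, an unfolding-free proxy; graph sizes and connection radii are arbitrary choices, and exact degeneracies are filtered out.

```python
import numpy as np

def rgg_adjacency(n, radius, rng):
    """Adjacency matrix of a random geometric graph: n points in the unit
    square, joined when their Euclidean distance is below `radius`."""
    pts = rng.random((n, 2))
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    a = (d < radius).astype(float)
    np.fill_diagonal(a, 0.0)
    return a

def mean_spacing_ratio(evals):
    """<r> with r_i = min(s_i, s_{i+1}) / max(s_i, s_{i+1}); roughly 0.39 for
    Poisson statistics and 0.536 for GOE statistics."""
    s = np.diff(np.sort(evals))
    s = s[s > 1e-12]                     # drop exact degeneracies
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    return r.mean()

rng = np.random.default_rng(1)
for radius in (0.05, 0.1, 0.2, 0.4):
    ratios = [mean_spacing_ratio(np.linalg.eigvalsh(rgg_adjacency(500, radius, rng)))
              for _ in range(10)]
    print(f"radius = {radius:4.2f}  <r> = {np.mean(ratios):.3f}")
```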
Products of random matrices from fixed trace and induced Ginibre ensembles
NASA Astrophysics Data System (ADS)
Akemann, Gernot; Cikovic, Milan
2018-05-01
We investigate the microcanonical version of the complex induced Ginibre ensemble, by introducing a fixed trace constraint for its second moment. Like for the canonical Ginibre ensemble, its complex eigenvalues can be interpreted as a two-dimensional Coulomb gas, which are now subject to a constraint and a modified, collective confining potential. Despite the lack of determinantal structure in this fixed trace ensemble, we compute all its density correlation functions at finite matrix size and compare to a fixed trace ensemble of normal matrices, representing a different Coulomb gas. Our main tool of investigation is the Laplace transform, that maps back the fixed trace to the induced Ginibre ensemble. Products of random matrices have been used to study the Lyapunov and stability exponents for chaotic dynamical systems, where the latter are based on the complex eigenvalues of the product matrix. Because little is known about the universality of the eigenvalue distribution of such product matrices, we then study the product of m induced Ginibre matrices with a fixed trace constraint—which are clearly non-Gaussian—and M − m such Ginibre matrices without constraint. Using an m-fold inverse Laplace transform, we obtain a concise result for the spectral density of such a mixed product matrix at finite matrix size, for arbitrary fixed m and M. Very recently local and global universality was proven by the authors and their coworker for a more general, single elliptic fixed trace ensemble in the bulk of the spectrum. Here, we argue that the spectral density of mixed products is in the same universality class as the product of M independent induced Ginibre ensembles.
NASA Astrophysics Data System (ADS)
Hu, Xing-Biao; Li, Shi-Hao
2017-07-01
The relationship between matrix integrals and integrable systems was revealed more than 20 years ago. As is known, matrix integrals over a Gaussian ensemble used in random matrix theory could act as the τ-function of several hierarchies of integrable systems. In this article, we will show that the time-dependent partition function of the Bures ensemble, whose measure has many interesting geometric properties, could act as the τ-function of BKP and DKP hierarchies. In addition, if discrete time variables are introduced, then this partition function could act as the τ-function of discrete BKP and DKP hierarchies. In particular, there are some links between the partition function of the Bures ensemble and Toda-type equations.
Embedded random matrix ensembles from nuclear structure and their recent applications
NASA Astrophysics Data System (ADS)
Kota, V. K. B.; Chavda, N. D.
Embedded random matrix ensembles generated by random interactions (of low body rank, usually two-body) in the presence of a one-body mean field, introduced in nuclear structure physics, are now established to be indispensable in describing statistical properties of a large number of isolated finite quantum many-particle systems. Lie algebra symmetries of the interactions, as identified from the nuclear shell model and the interacting boson model, led to the introduction of a variety of embedded ensembles (EEs). These ensembles, with a mean field and a chaos-generating two-body interaction, generate delocalization of wave functions in the Fock space of the mean-field basis states in three different stages. The last stage corresponds to what one may call thermalization, and complex nuclei, as seen from many shell model calculations, lie in this region. Besides briefly describing these ensembles, we present their recent applications to nuclear structure: (i) nuclear level densities with interactions; (ii) orbit occupancies; (iii) neutrinoless double beta decay nuclear transition matrix elements as transition strengths. We also briefly present applications that go beyond nuclear structure: (i) fidelity, decoherence, entanglement and thermalization in isolated finite quantum systems with interactions; (ii) quantum transport in disordered networks connected by many-body interactions with centrosymmetry; (iii) semicircle to Gaussian transition in eigenvalue densities with k-body random interactions and its relation to the Sachdev-Ye-Kitaev (SYK) model for Majorana fermions.
Constructing acoustic timefronts using random matrix theory.
Hegewisch, Katherine C; Tomsovic, Steven
2013-10-01
In a recent letter [Hegewisch and Tomsovic, Europhys. Lett. 97, 34002 (2012)], random matrix theory is introduced for long-range acoustic propagation in the ocean. The theory is expressed in terms of unitary propagation matrices that represent the scattering between acoustic modes due to sound speed fluctuations induced by the ocean's internal waves. The scattering exhibits a power-law decay as a function of the differences in mode numbers thereby generating a power-law, banded, random unitary matrix ensemble. This work gives a more complete account of that approach and extends the methods to the construction of an ensemble of acoustic timefronts. The result is a very efficient method for studying the statistical properties of timefronts at various propagation ranges that agrees well with propagation based on the parabolic equation. It helps identify which information about the ocean environment can be deduced from the timefronts and how to connect features of the data to that environmental information. It also makes direct connections to methods used in other disordered waveguide contexts where the use of random matrix theory has a multi-decade history.
On Connected Diagrams and Cumulants of Erdős-Rényi Matrix Models
NASA Astrophysics Data System (ADS)
Khorunzhiy, O.
2008-08-01
Regarding the adjacency matrices of n-vertex graphs and the related graph Laplacians, we introduce two families of discrete matrix models, both constructed with the help of the Erdős-Rényi ensemble of random graphs. The corresponding matrix sums represent the characteristic functions of the average number of walks and closed walks over the random graph. These sums can be considered as discrete analogues of the matrix integrals of random matrix theory. We study the diagram structure of the cumulant expansions of the logarithms of these matrix sums and analyze the limiting expressions as n → ∞ in the cases of constant and vanishing edge probabilities.
Fidelity decay of the two-level bosonic embedded ensembles of random matrices
NASA Astrophysics Data System (ADS)
Benet, Luis; Hernández-Quiroz, Saúl; Seligman, Thomas H.
2010-12-01
We study the fidelity decay of the k-body embedded ensembles of random matrices for bosons distributed over two single-particle states. Fidelity is defined in terms of a reference Hamiltonian, which is a purely diagonal matrix consisting of a fixed one-body term and includes the diagonal of the perturbing k-body embedded ensemble matrix, and the perturbed Hamiltonian which includes the residual off-diagonal elements of the k-body interaction. This choice mimics the typical mean-field basis used in many calculations. We study separately the cases k = 2 and 3. We compute the ensemble-averaged fidelity decay as well as the fidelity of typical members with respect to an initial random state. Average fidelity displays a revival at the Heisenberg time, t = tH = 1, and a freeze in the fidelity decay, during which periodic revivals of period tH are observed. We obtain the relevant scaling properties with respect to the number of bosons and the strength of the perturbation. For certain members of the ensemble, we find that the period of the revivals during the freeze of fidelity occurs at fractional times of tH. These fractional periodic revivals are related to the dominance of specific k-body terms in the perturbation.
Time series, correlation matrices and random matrix models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vinayak; Seligman, Thomas H.
2014-01-08
In this set of five lectures the authors have presented techniques to analyze open classical and quantum systems using correlation matrices. For diverse reasons we shall see that random matrices play an important role to describe a null hypothesis or a minimum information hypothesis for the description of a quantum system or subsystem. In the former case we consider various forms of correlation matrices of time series associated with the classical observables of some system. The fact that such series are necessarily finite inevitably introduces noise, and this finite-time influence leads to a random or stochastic component in these time series. By consequence random correlation matrices have a random component, and corresponding ensembles are used. In the latter we use random matrices to describe a high-temperature environment or uncontrolled perturbations, ensembles of differing chaotic systems, etc. The common theme of the lectures is thus the importance of random matrix theory in a wide range of fields in and around physics.
NASA Astrophysics Data System (ADS)
Paramonov, L. E.
2012-05-01
Light scattering by isotropic ensembles of ellipsoidal particles is considered in the Rayleigh-Gans-Debye approximation. It is proved that randomly oriented ellipsoidal particles are optically equivalent to polydisperse randomly oriented spheroidal particles and polydisperse spherical particles. Density functions of the shape and size distributions for equivalent ensembles of spheroidal and spherical particles are presented. In the anomalous diffraction approximation, equivalent ensembles of particles are shown to also have equal extinction, scattering, and absorption coefficients. Consequences of optical equivalence are considered. The results are illustrated by numerical calculations of the angular dependence of the scattering phase function using the T-matrix method and the Mie theory.
Fidelity under isospectral perturbations: a random matrix study
NASA Astrophysics Data System (ADS)
Leyvraz, F.; García, A.; Kohler, H.; Seligman, T. H.
2013-07-01
The set of Hamiltonians generated by all unitary transformations from a single Hamiltonian is the largest set of isospectral Hamiltonians we can form. Taking advantage of the fact that the unitary group can be generated from Hermitian matrices we can take the ones generated by the Gaussian unitary ensemble with a small parameter as small perturbations. Similarly, the transformations generated by Hermitian antisymmetric matrices from orthogonal matrices form isospectral transformations among symmetric matrices. Based on this concept we can obtain the fidelity decay of a system that decays under a random isospectral perturbation with well-defined properties regarding time-reversal invariance. If we choose the Hamiltonian itself also from a classical random matrix ensemble, then we obtain solutions in terms of form factors in the limit of large matrices.
A stochastic Markov chain model to describe lung cancer growth and metastasis.
Newton, Paul K; Mason, Jeremy; Bethel, Kelly; Bazhenova, Lyudmila A; Nieva, Jorge; Kuhn, Peter
2012-01-01
A stochastic Markov chain model for metastatic progression is developed for primary lung cancer based on a network construction of metastatic sites with dynamics modeled as an ensemble of random walkers on the network. We calculate a transition matrix, with entries (transition probabilities) interpreted as random variables, and use it to construct a circular bi-directional network of primary and metastatic locations based on postmortem tissue analysis of 3827 autopsies on untreated patients documenting all primary tumor locations and metastatic sites from this population. The resulting 50 potential metastatic sites are connected by directed edges with distributed weightings, where the site connections and weightings are obtained by calculating the entries of an ensemble of transition matrices so that the steady-state distribution obtained from the long-time limit of the Markov chain dynamical system corresponds to the ensemble metastatic distribution obtained from the autopsy data set. We condition our search for a transition matrix on an initial distribution of metastatic tumors obtained from the data set. Through an iterative numerical search procedure, we adjust the entries of a sequence of approximations until a transition matrix with the correct steady-state is found (up to a numerical threshold). Since this constrained linear optimization problem is underdetermined, we characterize the statistical variance of the ensemble of transition matrices calculated using the means and variances of their singular value distributions as a diagnostic tool. We interpret the ensemble averaged transition probabilities as (approximately) normally distributed random variables. The model allows us to simulate and quantify disease progression pathways and timescales of progression from the lung position to other sites and we highlight several key findings based on the model.
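The core computational step, extracting the steady state of a transition matrix as the long-time limit of the Markov chain, can be sketched as follows. The 4×4 matrix is a hypothetical toy stand-in for the paper's 50-site network; the function name and parameters are illustrative.

```python
import numpy as np

# Hypothetical 4-site toy version of the 50-site network: a row-stochastic
# transition matrix T. The steady state is the distribution left invariant
# by T, matching the long-time limit of the Markov chain dynamical system.
T = np.array([[0.0, 0.5, 0.3, 0.2],
              [0.4, 0.0, 0.4, 0.2],
              [0.3, 0.3, 0.0, 0.4],
              [0.2, 0.5, 0.3, 0.0]])

def steady_state(T, n_iter=10_000, tol=1e-12):
    """Power iteration: propagate an initial distribution until convergence."""
    p = np.full(T.shape[0], 1.0 / T.shape[0])
    for _ in range(n_iter):
        q = p @ T
        if np.abs(q - p).max() < tol:
            break
        p = q
    return p

print(steady_state(T))  # toy analogue of the ensemble metastatic distribution
```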
Finite-range Coulomb gas models of banded random matrices and quantum kicked rotors
NASA Astrophysics Data System (ADS)
Pandey, Akhilesh; Kumar, Avanish; Puri, Sanjay
2017-11-01
Dyson demonstrated an equivalence between infinite-range Coulomb gas models and classical random matrix ensembles for the study of eigenvalue statistics. We introduce finite-range Coulomb gas (FRCG) models via a Brownian matrix process, and study them analytically and by Monte Carlo simulations. These models yield new universality classes, and provide a theoretical framework for the study of banded random matrices (BRMs) and quantum kicked rotors (QKRs). We demonstrate that, for a BRM of bandwidth b and a QKR of chaos parameter α, the appropriate FRCG model has the effective range d = b²/N = α²/N, for large matrix dimensionality N. As d increases, there is a transition from Poisson to classical random matrix statistics.
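The reported Poisson-to-RMT transition with effective range d = b²/N can be observed numerically. The sketch below builds banded real symmetric (GOE-like) matrices and tracks the mean spacing ratio of bulk eigenvalues as the bandwidth b grows; it is an illustration of the BRM side of the correspondence, not the authors' FRCG construction, and all sizes are arbitrary.

```python
import numpy as np

def banded_goe(n, b, rng):
    """Real symmetric matrix with Gaussian entries inside bandwidth b."""
    a = rng.standard_normal((n, n))
    h = (a + a.T) / 2
    i, j = np.indices((n, n))
    h[np.abs(i - j) > b] = 0.0
    return h

def mean_spacing_ratio(evals):
    """~0.39 for Poisson statistics, ~0.536 for GOE statistics."""
    s = np.diff(np.sort(evals))
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    return r.mean()

rng = np.random.default_rng(0)
n = 1000
for b in (1, 4, 16, 64):
    ev = np.linalg.eigvalsh(banded_goe(n, b, rng))
    bulk = ev[n // 4 : 3 * n // 4]          # avoid spectrum edges
    print(f"b = {b:3d}  d = b^2/N = {b**2 / n:6.3f}  <r> = {mean_spacing_ratio(bulk):.3f}")
```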
NASA Astrophysics Data System (ADS)
Kota, V. K. B.
2003-07-01
Smoothed forms for expectation values ⟨K⟩^E of positive definite operators K follow from the K-density moments either directly or in many other ways, each giving a series expansion (involving polynomials in E). In large spectroscopic spaces one has to partition the many-particle spaces into subspaces. Partitioning leads to new expansions for expectation values. It is shown that all the expansions converge to compact forms depending on the nature of the operator K and the operation of embedded random matrix ensembles and quantum chaos in many-particle spaces. Explicit results are given for occupancies ⟨n_i⟩^E, spin-cutoff factors ⟨J_z^2⟩^E and strength sums ⟨O†O⟩^E, where O is a one-body transition operator.
Wigner surmises and the two-dimensional homogeneous Poisson point process.
Sakhr, Jamal; Nieminen, John M
2006-04-01
We derive a set of identities that relate the higher-order interpoint spacing statistics of the two-dimensional homogeneous Poisson point process to the Wigner surmises for the higher-order spacing distributions of eigenvalues from the three classical random matrix ensembles. We also report a remarkable identity that equates the second-nearest-neighbor spacing statistics of the points of the Poisson process and the nearest-neighbor spacing statistics of complex eigenvalues from Ginibre's ensemble of 2 × 2 complex non-Hermitian random matrices.
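A sketch of the Poisson side of these identities: estimating higher-order (k-th nearest-neighbour) distance statistics of a 2D homogeneous Poisson process, which can then be rescaled to unit mean and compared with the Wigner surmises. Boundary effects are ignored and the point count is an arbitrary choice.

```python
import numpy as np
from scipy.spatial import cKDTree

def poisson_kth_neighbor_distances(n_points=20000, k=2, seed=0):
    """k-th nearest-neighbour distances of a homogeneous Poisson process,
    approximated by n_points uniform points in the unit square."""
    rng = np.random.default_rng(seed)
    pts = rng.random((n_points, 2))
    tree = cKDTree(pts)
    d, _ = tree.query(pts, k=k + 1)   # column 0 is the point itself (distance 0)
    return d[:, k]

# For intensity rho, the k-th neighbour distance density is
#   p_k(r) = 2 (rho*pi)^k r^(2k-1) exp(-rho*pi*r^2) / (k-1)!
# After rescaling to unit mean, these densities are the objects related by
# the paper's identities to the higher-order Wigner surmises.
d2 = poisson_kth_neighbor_distances(k=2)
print("mean 2nd-neighbour distance:", d2.mean())
```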
Semistochastic approach to many electron systems
NASA Astrophysics Data System (ADS)
Grossjean, M. K.; Grossjean, M. F.; Schulten, K.; Tavan, P.
1992-08-01
A Pariser-Parr-Pople (PPP) Hamiltonian of the 8π-electron system of the molecule octatetraene, represented in a configuration-interaction basis (CI basis), is analyzed with respect to the statistical properties of its matrix elements. Based on this analysis we develop an effective Hamiltonian, which represents virtual excitations by a Gaussian orthogonal ensemble (GOE). We also examine numerical approaches which replace the original Hamiltonian by a semistochastically generated CI matrix. In that CI matrix, the matrix elements of high-energy excitations are chosen randomly according to distributions reflecting the statistics of the original CI matrix.
Almost sure convergence in quantum spin glasses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buzinski, David, E-mail: dab197@case.edu; Meckes, Elizabeth, E-mail: elizabeth.meckes@case.edu
2015-12-15
Recently, Keating, Linden, and Wells [Markov Processes Relat. Fields 21(3), 537-555 (2015)] showed that the density of states measure of a nearest-neighbor quantum spin glass model is approximately Gaussian when the number of particles is large. The density of states measure is the ensemble average of the empirical spectral measure of a random matrix; in this paper, we use concentration of measure and entropy techniques together with the result of Keating, Linden, and Wells to show that in fact the empirical spectral measure of such a random matrix is almost surely approximately Gaussian itself, with no ensemble averaging. We also extend this result to a spherical quantum spin glass model and to the more general coupling geometries investigated by Erdős and Schröder [Math. Phys., Anal. Geom. 17(3-4), 441-464 (2014)].
Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices
NASA Astrophysics Data System (ADS)
Passemier, Damien; McKay, Matthew R.; Chen, Yang
2015-07-01
Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce a correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.
Schweiner, Frank; Laturner, Jeanine; Main, Jörg; Wunner, Günter
2017-11-01
Until now, analytical formulas for the level spacing distribution function have been derived within random matrix theory only for specific crossovers between Poissonian statistics (P), the statistics of a Gaussian orthogonal ensemble (GOE), and the statistics of a Gaussian unitary ensemble (GUE). We investigate arbitrary crossovers in the triangle between all three statistics. To this aim we propose a corresponding formula for the level spacing distribution function depending on two parameters. Comparing the behavior of our formula for the special cases of P→GUE, P→GOE, and GOE→GUE with the results from random matrix theory, we prove that these crossovers are described reasonably well. Recent investigations by F. Schweiner et al. [Phys. Rev. E 95, 062205 (2017)] have shown that the Hamiltonian of magnetoexcitons in cubic semiconductors can exhibit all three statistics in dependence on the system parameters. Evaluating the numerical results for magnetoexcitons in dependence on the excitation energy and on a parameter connected with the cubic valence band structure, and comparing the results with the proposed formula, allows us to distinguish between regular and chaotic behavior as well as between existent or broken antiunitary symmetries. Increasing one of the two parameters, transitions between different crossovers, e.g., from the P→GOE to the P→GUE crossover, are observed and discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adachi, Satoshi; Toda, Mikito; Kubotani, Hiroto
The fixed-trace ensemble of random complex matrices is the fundamental model that excellently describes the entanglement in the quantum states realized in a coupled system by its strongly chaotic dynamical evolution [see H. Kubotani, S. Adachi, M. Toda, Phys. Rev. Lett. 100 (2008) 240501]. The fixed-trace ensemble fully takes into account the conservation of probability for quantum states. The present paper derives for the first time the exact analytical formula of the one-body distribution function of singular values of random complex matrices in the fixed-trace ensemble. The distribution function of singular values (i.e. Schmidt eigenvalues) of a quantum state is so important since it describes characteristics of the entanglement in the state. The derivation of the exact analytical formula utilizes two recent achievements in mathematics, which appeared in the 1990s. The first is the Kaneko theory that extends the famous Selberg integral by inserting a hypergeometric-type weight factor into the integrand to obtain an analytical formula for the extended integral. The second is the Petkovsek-Wilf-Zeilberger theory that calculates definite hypergeometric sums in a closed form.
Universal shocks in the Wishart random-matrix ensemble.
Blaizot, Jean-Paul; Nowak, Maciej A; Warchoł, Piotr
2013-05-01
We show that the derivative of the logarithm of the average characteristic polynomial of a diffusing Wishart matrix obeys an exact partial differential equation valid for an arbitrary value of N, the size of the matrix. In the large N limit, this equation generalizes the simple inviscid Burgers equation that has been obtained earlier for Hermitian or unitary matrices. The solution, through the method of characteristics, presents singularities that we relate to the precursors of shock formation in the Burgers equation. The finite N effects appear as a viscosity term in the Burgers equation. Using a scaling analysis of the complete equation for the characteristic polynomial, in the vicinity of the shocks, we recover in a simple way the universal Bessel oscillations (so-called hard-edge singularities) familiar in random-matrix theory.
Bootstrapping on Undirected Binary Networks Via Statistical Mechanics
NASA Astrophysics Data System (ADS)
Fushing, Hsieh; Chen, Chen; Liu, Shan-Yu; Koehl, Patrice
2014-09-01
We propose a new method inspired from statistical mechanics for extracting geometric information from undirected binary networks and generating random networks that conform to this geometry. In this method an undirected binary network is perceived as a thermodynamic system with a collection of permuted adjacency matrices as its states. The task of extracting information from the network is then reformulated as a discrete combinatorial optimization problem of searching for its ground state. To solve this problem, we apply multiple ensembles of temperature regulated Markov chains to establish an ultrametric geometry on the network. This geometry is equipped with a tree hierarchy that captures the multiscale community structure of the network. We translate this geometry into a Parisi adjacency matrix, which has a relative low energy level and is in the vicinity of the ground state. The Parisi adjacency matrix is then further optimized by making block permutations subject to the ultrametric geometry. The optimal matrix corresponds to the macrostate of the original network. An ensemble of random networks is then generated such that each of these networks conforms to this macrostate; the corresponding algorithm also provides an estimate of the size of this ensemble. By repeating this procedure at different scales of the ultrametric geometry of the network, it is possible to compute its evolution entropy, i.e. to estimate the evolution of its complexity as we move from a coarse to a fine description of its geometric structure. We demonstrate the performance of this method on simulated as well as real data networks.
Random pure states: Quantifying bipartite entanglement beyond the linear statistics.
Vivo, Pierpaolo; Pato, Mauricio P; Oshanin, Gleb
2016-05-01
We analyze the properties of entangled random pure states of a quantum system partitioned into two smaller subsystems of dimensions N and M. Framing the problem in terms of random matrices with a fixed-trace constraint, we establish, for arbitrary N ≤ M, a general relation between the n-point densities and the cross moments of the eigenvalues of the reduced density matrix, i.e., the so-called Schmidt eigenvalues, and the analogous functionals of the eigenvalues of the Wishart-Laguerre ensemble of random matrix theory. This allows us to derive explicit expressions for two-level densities, and also an exact expression for the variance of the von Neumann entropy at finite N, M. Then, we focus on the moments E[K^a] of the Schmidt number K, the reciprocal of the purity. This is a random variable supported on [1, N], which quantifies the number of degrees of freedom effectively contributing to the entanglement. We derive a wealth of analytical results for E[K^a] for N = 2 and 3 and arbitrary M, and also for square N = M systems by spotting for the latter a connection with the probability P(x_min^GUE ≥ √(2N) ξ) that the smallest eigenvalue x_min^GUE of an N × N matrix belonging to the Gaussian unitary ensemble is larger than √(2N) ξ. As a by-product, we present an exact asymptotic expansion for P(x_min^GUE ≥ √(2N) ξ) for finite N as ξ → ∞. Our results are corroborated by numerical simulations whenever possible, with excellent agreement.
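A Monte Carlo sketch of the central objects: Schmidt eigenvalues of a random bipartite pure state, obtained from a fixed-trace Wishart-type construction, and the moments E[K^a] of the Schmidt number. Dimensions and sample sizes are illustrative.

```python
import numpy as np

def schmidt_eigenvalues(n, m, rng):
    """Schmidt eigenvalues of a random bipartite pure state: eigenvalues of
    the reduced density matrix G G^dag / tr(G G^dag), G an n x m Ginibre matrix."""
    g = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
    w = g @ g.conj().T
    return np.linalg.eigvalsh(w / np.trace(w).real)

def schmidt_number_moment(a, n=3, m=10, samples=20000, seed=0):
    """Monte Carlo estimate of E[K^a], with K = 1/purity = 1/sum(lambda_i^2)."""
    rng = np.random.default_rng(seed)
    k = np.empty(samples)
    for i in range(samples):
        lam = schmidt_eigenvalues(n, m, rng)
        k[i] = 1.0 / np.sum(lam**2)
    return np.mean(k**a)

print(schmidt_number_moment(a=1))   # typical effective number of degrees of freedom
```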
On the Wigner law in dilute random matrices
NASA Astrophysics Data System (ADS)
Khorunzhy, A.; Rodgers, G. J.
1998-12-01
We consider ensembles of N × N symmetric matrices whose entries are weakly dependent random variables. We show that random dilution can change the limiting eigenvalue distribution of such matrices. We prove that under general and natural conditions the normalised eigenvalue counting function coincides with the semicircle (Wigner) distribution in the limit N → ∞. This can be explained by the observation that dilution (or more generally, random modulation) eliminates the weak dependence (or correlations) between random matrix entries. It also supports our earlier conjecture that the Wigner distribution is stable to random dilution and modulation.
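The claimed stability of the semicircle law under dilution is easy to test numerically. The sketch below keeps each Gaussian entry with probability p and rescales so the limiting support stays [-2, 2] (valid while np ≫ 1); all parameters are illustrative.

```python
import numpy as np

def diluted_wigner(n, p, rng):
    """Symmetric matrix with i.i.d. Gaussian entries, each kept with
    probability p; rescaled so the limiting support stays [-2, 2]."""
    a = rng.standard_normal((n, n))
    mask = rng.random((n, n)) < p
    h = np.triu(a * mask, 1)
    h = h + h.T
    np.fill_diagonal(h, rng.standard_normal(n) * mask.diagonal())
    return h / np.sqrt(n * p)

rng = np.random.default_rng(0)
for p in (1.0, 0.1, 0.01):
    ev = np.linalg.eigvalsh(diluted_wigner(2000, p, rng))
    # The histogram of ev should approach the semicircle density
    # rho(x) = sqrt(4 - x^2) / (2*pi) at every dilution level, while n*p >> 1.
    print(f"p = {p:5.2f}  support ~ [{ev.min():.2f}, {ev.max():.2f}]")
```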
Localization in covariance matrices of coupled heterogenous Ornstein-Uhlenbeck processes
NASA Astrophysics Data System (ADS)
Barucca, Paolo
2014-12-01
We define a random matrix ensemble given by the infinite-time covariance matrices of Ornstein-Uhlenbeck processes at different temperatures coupled by a Gaussian symmetric matrix. The spectral properties of this ensemble are shown to be in qualitative agreement with some stylized facts of financial markets. Through the presented model, formulas are given for the analysis of heterogeneous time series. Furthermore, evidence for a localization transition in the eigenvectors related to small and large eigenvalues in the cross-correlation analysis of this model is found, and a simple explanation of localization phenomena in financial time series is provided. Finally, we identify, both in our model and in real financial data, an inverted-bell effect in the correlation between localized components and their local temperature: high- and low-temperature components are the most localized ones.
Eigenvalue density of cross-correlations in Sri Lankan financial market
NASA Astrophysics Data System (ADS)
Nilantha, K. G. D. R.; Ranasinghe; Malmini, P. K. C.
2007-05-01
We apply the universal properties of the Gaussian orthogonal ensemble (GOE) of random matrices, namely the spectral properties, distribution of eigenvalues and eigenvalue spacings predicted by random matrix theory (RMT), to compare cross-correlation matrix estimators from emerging market data. The daily stock prices of the Sri Lankan All Share Price Index and Milanka Price Index from August 2004 to March 2005 were analyzed. Most eigenvalues in the spectrum of the cross-correlation matrix of stock price changes agree with the universal predictions of RMT. We find that the cross-correlation matrix satisfies the universal properties of the GOE of real symmetric random matrices. The eigenvalue distribution follows the RMT predictions in the bulk, but there are some deviations at the large eigenvalues. The nearest-neighbor and next-nearest-neighbor spacings of the eigenvalues were examined and found to follow the GOE universality. Within RMT with deterministic correlations, each eigenvalue arising from the deterministic correlations is observed at values repelled from the bulk distribution.
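The standard RMT comparison used in such studies can be sketched as follows: diagonalize the empirical cross-correlation matrix and compare its spectrum with the Marčenko-Pastur bulk [(1 - √q)², (1 + √q)²], q = N/T. The sketch uses synthetic Gaussian returns in place of the Sri Lankan price data, and the sample sizes are illustrative.

```python
import numpy as np

def correlation_spectrum(returns):
    """Eigenvalues of the cross-correlation matrix of a (T x N) array of
    standardized returns."""
    z = (returns - returns.mean(0)) / returns.std(0)
    c = z.T @ z / len(z)
    return np.linalg.eigvalsh(c)

T, N = 160, 30                      # trading days x stocks (illustrative sizes)
rng = np.random.default_rng(0)
ev = correlation_spectrum(rng.standard_normal((T, N)))

q = N / T
lam_min, lam_max = (1 - np.sqrt(q))**2, (1 + np.sqrt(q))**2
outliers = ev[(ev < lam_min) | (ev > lam_max)]
# For real market data, eigenvalues beyond [lam_min, lam_max] signal genuine
# correlations (e.g. the market mode); with pure noise, few should appear.
print(f"RMT bulk: [{lam_min:.2f}, {lam_max:.2f}], outliers: {outliers}")
```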
Simple Emergent Power Spectra from Complex Inflationary Physics
NASA Astrophysics Data System (ADS)
Dias, Mafalda; Frazer, Jonathan; Marsh, M. C. David
2016-09-01
We construct ensembles of random scalar potentials for N_f interacting scalar fields using nonequilibrium random matrix theory, and use these to study the generation of observables during small-field inflation. For N_f = O(few), these heavily featured scalar potentials give rise to power spectra that are highly nonlinear, at odds with observations. For N_f ≫ 1, the superhorizon evolution of the perturbations is generically substantial, yet the power spectra simplify considerably and become more predictive, with most realizations being well approximated by a linear power spectrum. This provides proof of principle that complex inflationary physics can give rise to simple emergent power spectra. We explain how these results can be understood in terms of large-N_f universality of random matrix theory.
QCD dirac operator at nonzero chemical potential: lattice data and matrix model.
Akemann, Gernot; Wettig, Tilo
2004-03-12
Recently, a non-Hermitian chiral random matrix model was proposed to describe the eigenvalues of the QCD Dirac operator at nonzero chemical potential. This matrix model can be constructed from QCD by mapping it to an equivalent matrix model which has the same symmetries as QCD with chemical potential. Its microscopic spectral correlations are conjectured to be identical to those of the QCD Dirac operator. We investigate this conjecture by comparing large ensembles of Dirac eigenvalues in quenched SU(3) lattice QCD at a nonzero chemical potential to the analytical predictions of the matrix model. Excellent agreement is found in the two regimes of weak and strong non-Hermiticity, for several different lattice volumes.
Tensor Minkowski Functionals for random fields on the sphere
NASA Astrophysics Data System (ADS)
Chingangbam, Pravabati; Yogendran, K. P.; Joby, P. K.; Ganesan, Vidhya; Appleby, Stephen; Park, Changbom
2017-12-01
We generalize the translation invariant tensor-valued Minkowski Functionals which are defined on two-dimensional flat space to the unit sphere. We apply them to level sets of random fields. The contours enclosing boundaries of level sets of random fields give a spatial distribution of random smooth closed curves. We outline a method to compute the tensor-valued Minkowski Functionals numerically for any random field on the sphere. Then we obtain analytic expressions for the ensemble expectation values of the matrix elements for isotropic Gaussian and Rayleigh fields. The results hold on flat as well as any curved space with affine connection. We elucidate the way in which the matrix elements encode information about the Gaussian nature and statistical isotropy (or departure from isotropy) of the field. Finally, we apply the method to maps of the Galactic foreground emissions from the 2015 PLANCK data and demonstrate their high level of statistical anisotropy and departure from Gaussianity.
Inflation with a graceful exit in a random landscape
NASA Astrophysics Data System (ADS)
Pedro, F. G.; Westphal, A.
2017-03-01
We develop a stochastic description of small-field inflationary histories with a graceful exit in a random potential whose Hessian is a Gaussian random matrix as a model of the unstructured part of the string landscape. The dynamical evolution in such a random potential from a small-field inflation region towards a viable late-time de Sitter (dS) minimum maps to the dynamics of Dyson Brownian motion describing the relaxation of non-equilibrium eigenvalue spectra in random matrix theory. We analytically compute the relaxation probability in a saddle point approximation of the partition function of the eigenvalue distribution of the Wigner ensemble describing the mass matrices of the critical points. When applied to small-field inflation in the landscape, this leads to an exponentially strong bias against small-field ranges and an upper bound N ≪ 10 on the number of light fields N participating during inflation from the non-observation of negative spatial curvature.
2014-09-01
optimal diagonal loading which minimizes the MSE. The behavior of optimal diagonal loading when the arrival process is composed of plane waves embedded ... observation vectors. The examples of the ensemble correlation matrix corresponding to the input process consisting of a single or multiple plane waves ... Y*_ij is the complex conjugate of Y_ij. This result is used in order to evaluate the expectations of different quadratic forms. The Poincaré-Nash
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kawano, Toshihiko
2015-11-10
This theoretical treatment of low-energy compound nucleus reactions begins with the Bohr hypothesis, with corrections, and various statistical theories. The author investigates the statistical properties of the scattering matrix containing a Gaussian Orthogonal Ensemble (GOE) Hamiltonian in the propagator. The following conclusions are reached: For all parameter values studied, the numerical average of MC-generated cross sections coincides with the result of the Verbaarschot, Weidenmueller, Zirnbauer triple-integral formula. Energy average and ensemble average agree reasonably well when the width Γ is one or two orders of magnitude larger than the average resonance spacing d. In the strong-absorption limit, the channel degree of freedom ν_a is 2. The direct reaction increases the inelastic cross sections while the elastic cross section is reduced.
Marino, Ricardo; Majumdar, Satya N; Schehr, Grégory; Vivo, Pierpaolo
2016-09-01
Let P_β^(V)(N_I) be the probability that an N × N β-ensemble of random matrices with confining potential V(x) has N_I eigenvalues inside an interval I = [a, b] on the real line. We introduce a general formalism, based on the Coulomb gas technique and the resolvent method, to compute P_β^(V)(N_I) analytically for large N. We show that this probability scales for large N as P_β^(V)(N_I) ≈ exp[−βN²ψ^(V)(N_I/N)], where β is the Dyson index of the ensemble. The rate function ψ^(V)(k_I), independent of β, is computed in terms of single integrals that can be easily evaluated numerically. The general formalism is then applied to the classical β-Gaussian (I = [−L, L]), β-Wishart (I = [1, L]), and β-Cauchy (I = [−L, L]) ensembles. Expanding the rate function around its minimum, we find that generically the number variance var(N_I) exhibits a nonmonotonic behavior as a function of the size of the interval, with a maximum that can be precisely characterized. These analytical results, corroborated by numerical simulations, provide the full counting statistics of many systems where random matrix models apply. In particular, we present results for the full counting statistics of zero-temperature one-dimensional spinless fermions in a harmonic trap.
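The number variance var(N_I) at the heart of this formalism can be estimated directly by Monte Carlo for the β = 2 Gaussian case; the nonmonotonic behavior in the interval size is visible already at modest N. Matrix size and sample count are illustrative.

```python
import numpy as np

def gue_eigenvalues(n, rng):
    """Eigenvalues of an n x n GUE matrix, rescaled to support roughly [-2, 2]."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = (a + a.conj().T) / 2
    return np.linalg.eigvalsh(h) / np.sqrt(n)

def counting_variance(L, n=100, samples=500, seed=0):
    """var(N_I) for I = [-L, L] in the beta = 2 Gaussian ensemble."""
    rng = np.random.default_rng(seed)
    counts = [np.sum(np.abs(gue_eigenvalues(n, rng)) <= L) for _ in range(samples)]
    return np.var(counts)

for L in (0.25, 0.5, 1.0, 1.5, 2.0):
    print(f"L = {L:4.2f}  var(N_I) = {counting_variance(L):.3f}")
# The variance is nonmonotonic in L: it vanishes as the interval covers the
# full support, with a maximum in between, as the abstract describes.
```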
Horizon in random matrix theory, the Hawking radiation, and flow of cold atoms.
Franchini, Fabio; Kravtsov, Vladimir E
2009-10-16
We propose a Gaussian scalar field theory in a curved 2D metric with an event horizon as the low-energy effective theory for a weakly confined, invariant random matrix ensemble (RME). The presence of an event horizon naturally generates a bath of Hawking radiation, which introduces a finite temperature in the model in a nontrivial way. A similar mapping with a gravitational analogue model has been constructed for a Bose-Einstein condensate (BEC) pushed to flow at a velocity higher than its speed of sound, with Hawking radiation as sound waves propagating over the cold atoms. Our work suggests a threefold connection between a moving BEC system, black-hole physics and unconventional RMEs with possible experimental applications.
Schur polynomials and biorthogonal random matrix ensembles
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tierz, Miguel
The study of the average of Schur polynomials over a Stieltjes-Wigert ensemble has been carried out by Dolivet and Tierz [J. Math. Phys. 48, 023507 (2007); e-print arXiv:hep-th/0609167], where it was shown that it is equal to quantum dimensions. Using the same approach, we extend the result to the biorthogonal case. We also study, using the Littlewood-Richardson rule, some particular cases of the quantum dimension result. Finally, we show that the notion of Giambelli compatibility of Schur averages, introduced by Borodin et al. [Adv. Appl. Math. 37, 209 (2006); e-print arXiv:math-ph/0505021], also holds in the biorthogonal setting.
Statistics of the epoch of reionization 21-cm signal - I. Power spectrum error-covariance
NASA Astrophysics Data System (ADS)
Mondal, Rajesh; Bharadwaj, Somnath; Majumdar, Suman
2016-02-01
The non-Gaussian nature of the epoch of reionization (EoR) 21-cm signal has a significant impact on the error variance of its power spectrum P(k). We have used a large ensemble of seminumerical simulations and an analytical model to estimate the effect of this non-Gaussianity on the entire error-covariance matrix C_ij. Our analytical model shows that C_ij has contributions from two sources. One is the usual variance for a Gaussian random field, which scales as the inverse of the number of modes that go into the estimation of P(k). The other is the trispectrum of the signal. Using the simulated 21-cm Signal Ensemble, an ensemble of the Randomized Signal, and Ensembles of Gaussian Random Ensembles, we have quantified the effect of the trispectrum on the error variance C_ii. We find that its relative contribution is comparable to or larger than that of the Gaussian term for the k range 0.3 ≤ k ≤ 1.0 Mpc⁻¹, and can be even ~200 times larger at k ~ 5 Mpc⁻¹. We also establish that the off-diagonal terms of C_ij have statistically significant non-zero values which arise purely from the trispectrum. This further signifies that the errors in different k modes are not independent. We find a strong correlation between the errors at large k values (≥ 0.5 Mpc⁻¹), and a weak correlation between the smallest and largest k values. There is also a small anticorrelation between the errors in the smallest and intermediate k values. These results are relevant for the k range that will be probed by the current and upcoming EoR 21-cm experiments.
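The ensemble estimate of the error covariance C_ij can be sketched for a purely Gaussian random field, for which only the mode-counting term should survive; grid size, binning, and ensemble size are arbitrary choices, and the white-noise field is a stand-in for the simulated 21-cm signal.

```python
import numpy as np

def power_spectrum(field, nbins=8):
    """Bin-averaged P(k) of a 2D periodic field, with k in FFT frequency units."""
    n = field.shape[0]
    fk = np.fft.fftn(field) / n
    pk2d = np.abs(fk) ** 2
    k = np.sqrt(np.add.outer(np.fft.fftfreq(n)**2, np.fft.fftfreq(n)**2))
    bins = np.linspace(0, k.max(), nbins + 1)
    idx = np.digitize(k.ravel(), bins) - 1
    return np.array([pk2d.ravel()[idx == i].mean() for i in range(nbins)])

rng = np.random.default_rng(0)
spectra = np.array([power_spectrum(rng.standard_normal((64, 64))) for _ in range(500)])
C = np.cov(spectra.T)    # error-covariance matrix C_ij of the binned P(k)
# For a Gaussian field the off-diagonal C_ij are consistent with zero and the
# diagonal scales as the inverse of the number of modes per bin; a non-Gaussian
# signal's trispectrum would add to both, as the abstract describes.
print(np.round(C / np.sqrt(np.outer(C.diagonal(), C.diagonal())), 2))
```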
The difference between two random mixed quantum states: exact and asymptotic spectral analysis
NASA Astrophysics Data System (ADS)
Mejía, José; Zapata, Camilo; Botero, Alonso
2017-01-01
We investigate the spectral statistics of the difference of two density matrices, each of which is independently obtained by partially tracing a random bipartite pure quantum state. We first show how a closed-form expression for the exact joint eigenvalue probability density function for arbitrary dimensions can be obtained from the joint probability density function of the diagonal elements of the difference matrix, which is straightforward to compute. Subsequently, we use standard results from free probability theory to derive a relatively simple analytic expression for the asymptotic eigenvalue density (AED) of the difference matrix ensemble, and using Carlson’s theorem, we obtain an expression for its absolute moments. These results allow us to quantify the typical asymptotic distance between the two random mixed states using various distance measures; in particular, we obtain the almost sure asymptotic behavior of the operator norm distance and the trace distance.
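A sketch of the ensemble itself: draw two independent reduced density matrices by partially tracing random bipartite pure states, then accumulate the spectra and trace distances of their difference; all dimensions and sample counts are illustrative.

```python
import numpy as np

def reduced_density_matrix(n, m, rng):
    """Partial trace of a random bipartite pure state on C^n x C^m."""
    g = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
    w = g @ g.conj().T
    return w / np.trace(w).real

rng = np.random.default_rng(0)
n, m, samples = 8, 32, 2000
eig, trace_dist = [], []
for _ in range(samples):
    delta = reduced_density_matrix(n, m, rng) - reduced_density_matrix(n, m, rng)
    ev = np.linalg.eigvalsh(delta)
    eig.extend(ev)
    trace_dist.append(0.5 * np.abs(ev).sum())

# The histogram of `eig` approximates the asymptotic eigenvalue density of the
# difference ensemble; `trace_dist` concentrates around the typical distance.
print("mean trace distance:", np.mean(trace_dist))
```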
Anderson Localization in Quark-Gluon Plasma
NASA Astrophysics Data System (ADS)
Kovács, Tamás G.; Pittler, Ferenc
2010-11-01
At low temperature the low end of the QCD Dirac spectrum is well described by chiral random matrix theory. In contrast, at high temperature there is no similar statistical description of the spectrum. We show that at high temperature the lowest part of the spectrum consists of a band of statistically uncorrelated eigenvalues obeying essentially Poisson statistics and the corresponding eigenvectors are extremely localized. Going up in the spectrum the spectral density rapidly increases and the eigenvectors become more and more delocalized. At the same time the spectral statistics gradually crosses over to the bulk statistics expected from the corresponding random matrix ensemble. This phenomenon is reminiscent of Anderson localization in disordered conductors. Our findings are based on staggered Dirac spectra in quenched lattice simulations with the SU(2) gauge group.
Neural Representation of Spatial Topology in the Rodent Hippocampus
Chen, Zhe; Gomperts, Stephen N.; Yamamoto, Jun; Wilson, Matthew A.
2014-01-01
Pyramidal cells in the rodent hippocampus often exhibit clear spatial tuning in navigation. Although it has been long suggested that pyramidal cell activity may underlie a topological code rather than a topographic code, it remains unclear whether an abstract spatial topology can be encoded in the ensemble spiking activity of hippocampal place cells. Using a statistical approach developed previously, we investigate this question and related issues in greater details. We recorded ensembles of hippocampal neurons as rodents freely foraged in one and two-dimensional spatial environments, and we used a “decode-to-uncover” strategy to examine the temporally structured patterns embedded in the ensemble spiking activity in the absence of observed spatial correlates during periods of rodent navigation or awake immobility. Specifically, the spatial environment was represented by a finite discrete state space. Trajectories across spatial locations (“states”) were associated with consistent hippocampal ensemble spiking patterns, which were characterized by a state transition matrix. From this state transition matrix, we inferred a topology graph that defined the connectivity in the state space. In both one and two-dimensional environments, the extracted behavior patterns from the rodent hippocampal population codes were compared against randomly shuffled spike data. In contrast to a topographic code, our results support the efficiency of topological coding in the presence of sparse sample size and fuzzy space mapping. This computational approach allows us to quantify the variability of ensemble spiking activity, to examine hippocampal population codes during off-line states, and to quantify the topological complexity of the environment. PMID:24102128
Universality and Thouless energy in the supersymmetric Sachdev-Ye-Kitaev model
NASA Astrophysics Data System (ADS)
García-García, Antonio M.; Jia, Yiyang; Verbaarschot, Jacobus J. M.
2018-05-01
We investigate the supersymmetric Sachdev-Ye-Kitaev (SYK) model, N Majorana fermions with infinite range interactions in 0+1 dimensions. We have found that, close to the ground state E ≈ 0, discrete symmetries alter qualitatively the spectral properties with respect to the non-supersymmetric SYK model. The average spectral density at finite N, which we compute analytically and numerically, grows exponentially with N for E ≈ 0. However the chiral condensate, which is normalized with respect to the total number of eigenvalues, vanishes in the thermodynamic limit. Slightly above E ≈ 0, the spectral density grows exponentially with the energy. Deep in the quantum regime, corresponding to the first O(N) eigenvalues, the average spectral density is universal and well described by random matrix ensembles with chiral and superconducting discrete symmetries. The dynamics for E ≈ 0 is investigated by level fluctuations. Also in this case we find excellent agreement with the prediction of chiral and superconducting random matrix ensembles for eigenvalue separations smaller than the Thouless energy, which seems to scale linearly with N. Deviations beyond the Thouless energy, which describes how ergodicity is approached, are universally characterized by a quadratic growth of the number variance. In the time domain, we have found analytically that the spectral form factor g(t), obtained from the connected two-level correlation function of the unfolded spectrum, decays as 1/t² for times shorter but comparable to the Thouless time, with g(0) related to the coefficient of the quadratic growth of the number variance. Our results provide further support that quantum black holes are ergodic and therefore can be classified by random matrix theory.
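As an illustration of the form factor g(t) discussed above, the following sketch estimates the spectral form factor for a GUE surrogate ensemble; the matrix size, sample count, and time grid are arbitrary placeholders, since the paper's analysis concerns SYK spectra rather than plain GUE.

```python
import numpy as np

rng = np.random.default_rng(1)
N, samples = 200, 50
ts = np.linspace(0.1, 50.0, 300)
g = np.zeros_like(ts)
for _ in range(samples):
    h = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    E = np.linalg.eigvalsh((h + h.conj().T) / 2)     # GUE eigenvalues
    z = np.exp(-1j * np.outer(ts, E)).sum(axis=1)    # sum_n exp(-i E_n t)
    g += np.abs(z) ** 2 / N**2
g /= samples   # ensemble-averaged form factor; exhibits the dip-ramp-plateau shape
```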
Ensembles of physical states and random quantum circuits on graphs
NASA Astrophysics Data System (ADS)
Hamma, Alioscia; Santra, Siddhartha; Zanardi, Paolo
2012-11-01
In this paper we continue and extend the investigations of the ensembles of random physical states introduced in Hamma et al. [Phys. Rev. Lett. 109, 040502 (2012)]. These ensembles are constructed by finite-length random quantum circuits (RQC) acting on the (hyper)edges of an underlying (hyper)graph structure. The latter encodes the locality structure associated with finite-time quantum evolutions generated by physical, i.e., local, Hamiltonians. Our goal is to analyze physical properties of typical states in these ensembles; in particular here we focus on proxies of quantum entanglement such as purity and α-Rényi entropies. The problem is formulated in terms of matrix elements of superoperators which depend on the graph structure, the choice of probability measure over the local unitaries, and the circuit length. In the α=2 case these superoperators act on a restricted multiqubit space generated by permutation operators associated to the subsets of vertices of the graph. For permutationally invariant interactions the dynamics can be further restricted to an exponentially smaller subspace. We consider different families of RQCs and study their typical entanglement properties for finite time as well as their asymptotic behavior. We find that the area law holds on average and that the volume law is a typical property of physical states (that is, it holds on average and the fluctuations around the average vanish for large systems). The area law arises when the evolution time is O(1) with respect to the size L of the system, while the volume law arises when the evolution time scales like O(L).
Random SU(2) invariant tensors
NASA Astrophysics Data System (ADS)
Li, Youning; Han, Muxin; Ruan, Dong; Zeng, Bei
2018-04-01
SU(2) invariant tensors are states in the (local) SU(2) tensor product representation but invariant under the global group action. They are of importance in the study of loop quantum gravity. A random tensor is an ensemble of tensor states. An average over the ensemble is carried out when computing any physical quantities. The random tensor exhibits a phenomenon known as ‘concentration of measure’, which states that for any bipartition the average value of the entanglement entropy of its reduced density matrix is asymptotically maximal as the local dimensions go to infinity. We show that this phenomenon is also true when the average is over the SU(2) invariant subspace instead of the entire space for rank-n tensors in general. It is shown in our earlier work Li et al (2017 New J. Phys. 19 063029) that the subleading correction of the entanglement entropy has a mild logarithmic divergence when n = 4. In this paper, we show that for n > 4 the subleading correction is not divergent but a finite number. In some special situations, the number can even be smaller than 1/2, which is the subleading correction for a random state over the entire Hilbert space of tensors.
NASA Astrophysics Data System (ADS)
Warchoł, Piotr
2018-06-01
The public transportation system of Cuernavaca, Mexico, exhibits random matrix theory statistics. In particular, the fluctuation of times between the arrivals of buses at a given bus stop follows the Wigner surmise for the Gaussian unitary ensemble. To model this, we propose an agent-based approach in which each bus driver tries to optimize his arrival time at the next stop with respect to an estimated arrival time of his predecessor. We choose a particular form of the associated utility function and recover the appropriate distribution in numerical experiments for a certain value of the only parameter of the model. We then investigate whether this value of the parameter is otherwise distinguished within an information theoretic approach and give numerical evidence that indeed it is associated with a minimum of averaged pairwise mutual information.
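For reference, the GUE Wigner surmise invoked here is P(s) = (32/π²) s² exp(−4s²/π), and it is exact for the level spacing of a 2×2 GUE matrix. The sketch below (sample size and binning are illustrative) checks a histogram of normalized spacings against the surmise.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 100_000
a, b = rng.normal(size=m), rng.normal(size=m)        # diagonal entries
u, v = rng.normal(size=(2, m)) * np.sqrt(0.5)        # off-diagonal real/imag parts
s = np.sqrt((a - b) ** 2 + 4 * (u**2 + v**2))        # 2x2 GUE level spacing
s /= s.mean()                                        # normalize to unit mean spacing
hist, edges = np.histogram(s, bins=60, density=True)
mid = 0.5 * (edges[1:] + edges[:-1])
surmise = (32 / np.pi**2) * mid**2 * np.exp(-4 * mid**2 / np.pi)
print("max deviation from surmise:", np.abs(hist - surmise).max())  # small
```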
NASA Astrophysics Data System (ADS)
Gong, Ming; Hofer, B.; Zallo, E.; Trotta, R.; Luo, Jun-Wei; Schmidt, O. G.; Zhang, Chuanwei
2014-05-01
We develop an effective model to describe the statistical properties of exciton fine structure splitting (FSS) and polarization angle in quantum dot ensembles (QDEs) using only a few symmetry-related parameters. The connection between the effective model and random matrix theory is established. Such an effective model is verified both theoretically and experimentally using several rather different types of QDEs, each of which contains hundreds to thousands of QDs. The model naturally addresses three fundamental issues regarding the FSS and polarization angles of QDEs, which are frequently encountered in both theories and experiments. The answers to these fundamental questions yield an approach to characterize the optical properties of QDEs. Potential applications of the effective model are also discussed.
NASA Astrophysics Data System (ADS)
Torres-Herrera, E. J.; García-García, Antonio M.; Santos, Lea F.
2018-02-01
We study numerically and analytically the quench dynamics of isolated many-body quantum systems. Using full random matrices from the Gaussian orthogonal ensemble, we obtain analytical expressions for the evolution of the survival probability, density imbalance, and out-of-time-ordered correlator. They are compared with numerical results for a one-dimensional disordered model with two-body interactions and shown to bound the decay rate of this realistic system. Power-law decays are seen at intermediate times, and dips below the infinite time averages (correlation holes) occur at long times for all three quantities when the system exhibits level repulsion. The fact that these features are shared by both the random matrix and the realistic disordered model indicates that they are generic to nonintegrable interacting quantum systems out of equilibrium. Assisted by the random matrix analytical results, we propose expressions that describe extremely well the dynamics of the realistic chaotic system at different time scales.
Deterministic matrices matching the compressed sensing phase transitions of Gaussian random matrices
Monajemi, Hatef; Jafarpour, Sina; Gavish, Matan; Donoho, David L.; Ambikasaran, Sivaram; Bacallado, Sergio; Bharadia, Dinesh; Chen, Yuxin; Choi, Young; Chowdhury, Mainak; Chowdhury, Soham; Damle, Anil; Fithian, Will; Goetz, Georges; Grosenick, Logan; Gross, Sam; Hills, Gage; Hornstein, Michael; Lakkam, Milinda; Lee, Jason; Li, Jian; Liu, Linxi; Sing-Long, Carlos; Marx, Mike; Mittal, Akshay; Monajemi, Hatef; No, Albert; Omrani, Reza; Pekelis, Leonid; Qin, Junjie; Raines, Kevin; Ryu, Ernest; Saxe, Andrew; Shi, Dai; Siilats, Keith; Strauss, David; Tang, Gary; Wang, Chaojun; Zhou, Zoey; Zhu, Zhen
2013-01-01
In compressed sensing, one takes n < N samples of an N-dimensional vector x0 using an n × N matrix A, obtaining undersampled measurements y = Ax0. For random matrices with independent standard Gaussian entries, it is known that, when x0 is k-sparse, there is a precisely determined phase transition: for a certain region in the (δ, ρ)-phase diagram, with δ = n/N and ρ = k/n, convex optimization typically finds the sparsest solution, whereas outside that region, it typically fails. It has been shown empirically that the same property—with the same phase transition location—holds for a wide range of non-Gaussian random matrix ensembles. We report extensive experiments showing that the Gaussian phase transition also describes numerous deterministic matrices, including Spikes and Sines, Spikes and Noiselets, Paley Frames, Delsarte-Goethals Frames, Chirp Sensing Matrices, and Grassmannian Frames. Namely, for each of these deterministic matrices in turn, for a typical k-sparse object, we observe that convex optimization is successful over a region of the phase diagram that coincides with the region known for Gaussian random matrices. Our experiments considered coefficients constrained to a set X for four different sets X, and the results establish our finding for each of the four associated phase transitions. PMID:23277588
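One cell of such a phase-transition experiment can be sketched with a Gaussian matrix and basis pursuit posed as a linear program, here via SciPy's linprog. The dimensions n, N, k and the tolerance are illustrative assumptions, not the protocol of the study.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
N, n, k = 80, 40, 8                                   # ambient dim, samples, sparsity
A = rng.normal(size=(n, N)) / np.sqrt(n)              # Gaussian measurement matrix
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = rng.normal(size=k)
y = A @ x0
# Basis pursuit as an LP: min 1'(xp + xm)  s.t.  A(xp - xm) = y,  xp, xm >= 0
res = linprog(np.ones(2 * N), A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=[(0, None)] * (2 * N))
x_hat = res.x[:N] - res.x[N:]
print("recovered:", np.allclose(x_hat, x0, atol=1e-4))  # succeeds inside the phase
```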
Spectra of Adjacency Matrices in Networks of Extreme Introverts and Extroverts
NASA Astrophysics Data System (ADS)
Bassler, Kevin E.; Ezzatabadipour, Mohammadmehdi; Zia, R. K. P.
Many interesting properties were discovered in recent studies of preferred degree networks, suitable for describing the social behavior of individuals who tend to prefer a certain number of contacts. In an extreme version (coined the XIE model), introverts always cut links while extroverts always add them. While the intra-group links are static, the cross-links are dynamic and lead to an ensemble of bipartite graphs, with extraordinary correlations between elements of the incidence matrix {n_ij}. In the steady state, this system can be regarded as one in thermal equilibrium with long-ranged interactions between the n_ij's, and displays an extreme Thouless effect. Here, we report simulation studies of a different perspective on these networks, namely, the spectra associated with this ensemble of adjacency matrices {a_ij}. As a baseline, we first consider the spectra associated with a simple random (Erdős-Rényi) ensemble of bipartite graphs, where simulation results can be understood analytically. Work supported by the NSF through Grant DMR-1507371.
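The Erdős-Rényi baseline mentioned in the last sentence is easy to reproduce numerically. The sketch below (graph size, connection probability, and sample count are arbitrary) accumulates adjacency spectra of random bipartite graphs, the reference against which the correlated XIE ensemble would be compared.

```python
import numpy as np

rng = np.random.default_rng(4)
L, p, samples = 100, 0.5, 20                        # nodes per group, link prob, graphs
eigs = []
for _ in range(samples):
    n_ij = (rng.random((L, L)) < p).astype(float)   # random incidence (cross-link) matrix
    A = np.zeros((2 * L, 2 * L))
    A[:L, L:] = n_ij                                # bipartite adjacency: off-diagonal blocks
    A[L:, :L] = n_ij.T
    eigs.append(np.linalg.eigvalsh(A))
density = np.concatenate(eigs)                      # histogram this for the spectral density
```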
Work distributions for random sudden quantum quenches
NASA Astrophysics Data System (ADS)
Łobejko, Marcin; Łuczka, Jerzy; Talkner, Peter
2017-05-01
The statistics of work performed on a system by a sudden random quench is investigated. Considering systems with finite dimensional Hilbert spaces, we model a sudden random quench by randomly choosing elements from a Gaussian unitary ensemble (GUE) consisting of Hermitian matrices with identically Gaussian-distributed matrix elements. A probability density function (pdf) of work in terms of initial and final energy distributions is derived and evaluated for a two-level system. Explicit results are obtained for quenches with a sharply given initial Hamiltonian, while the work pdfs for quenches between Hamiltonians from two independent GUEs can only be determined in explicit form in the limits of zero and infinite temperature. The same work distribution as for a sudden random quench is obtained for an adiabatic, i.e., infinitely slow, protocol connecting the same initial and final Hamiltonians.
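For the two-level case with a sharply given initial Hamiltonian, the work pdf can be sampled directly. The Monte Carlo sketch below assumes a zero-temperature initial state and illustrative parameters; it implements the standard two-measurement definition of work.

```python
import numpy as np

rng = np.random.default_rng(5)
E0 = np.array([-1.0, 1.0])                          # fixed initial Hamiltonian (diagonal)
works = []
for _ in range(20_000):
    h = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    Ef, V = np.linalg.eigh((h + h.conj().T) / 2)    # random GUE final Hamiltonian
    p = np.abs(V[0, :]) ** 2                        # overlaps with the initial ground state
    p /= p.sum()                                    # guard against rounding
    works.append(Ef[rng.choice(2, p=p)] - E0[0])    # two-measurement work value
# a histogram of `works` approximates the work pdf p(w)
```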
Single-qubit decoherence under a separable coupling to a random matrix environment
NASA Astrophysics Data System (ADS)
Carrera, M.; Gorin, T.; Seligman, T. H.
2014-08-01
This paper describes the dynamics of a quantum two-level system (qubit) under the influence of an environment modeled by an ensemble of random matrices. In distinction to earlier work, we consider here separable couplings and focus on a regime where the decoherence time is of the same order of magnitude as the environmental Heisenberg time. We derive an analytical expression in the linear response approximation, and study its accuracy by comparison with numerical simulations. We discuss a series of unusual properties, such as purity oscillations, strong signatures of spectral correlations (in the environment Hamiltonian), memory effects, and symmetry-breaking equilibrium states.
NASA Astrophysics Data System (ADS)
Zhang, Hongqin; Tian, Xiangjun
2018-04-01
Ensemble-based data assimilation methods often use the so-called localization scheme to improve the representation of the ensemble background error covariance (Be). Extensive research has been undertaken to reduce the computational cost of these methods by using the localized ensemble samples to localize Be by means of a direct decomposition of the local correlation matrix C. However, the computational cost of the direct decomposition of the local correlation matrix C is still extremely high due to its high dimension. In this paper, we propose an efficient local correlation matrix decomposition approach based on the concept of alternating directions. This approach is intended to avoid direct decomposition of the correlation matrix. Instead, we first decompose the correlation matrix into 1-D correlation matrices in the three coordinate directions, then construct their empirical orthogonal function decompositions at low resolution. This procedure is followed by a 1-D spline interpolation process to transform the above decompositions to the high-resolution grid. Finally, an efficient correlation matrix decomposition is achieved by computing the Kronecker product of the 1-D decompositions. We conducted a series of comparison experiments to illustrate the validity and accuracy of the proposed local correlation matrix decomposition approach. The effectiveness of the proposed correlation matrix decomposition approach and its efficient localization implementation in the nonlinear least-squares four-dimensional variational assimilation are further demonstrated by several groups of numerical experiments based on the Advanced Research Weather Research and Forecasting model.
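The gain from the alternating-directions idea is that one only ever factorizes small 1-D matrices whose Kronecker product represents the full 3-D correlation. A minimal sketch under assumed exponential 1-D correlations and arbitrary grid sizes:

```python
import numpy as np

def corr_1d(n, length):
    """Exponential 1-D correlation matrix on a grid of n points."""
    x = np.arange(n)
    return np.exp(-np.abs(x[:, None] - x[None, :]) / length)

Cx, Cy, Cz = corr_1d(10, 2.0), corr_1d(12, 3.0), corr_1d(8, 1.5)
# Decompose the small 1-D factors instead of the full 960 x 960 matrix:
eig_1d = [np.linalg.eigh(C) for C in (Cx, Cy, Cz)]
C_full = np.kron(np.kron(Cx, Cy), Cz)      # formed here only to check the size
print(C_full.shape, [w.shape for w, _ in eig_1d])
```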
The supersymmetric method in random matrix theory and applications to QCD
NASA Astrophysics Data System (ADS)
Verbaarschot, Jacobus
2004-12-01
The supersymmetric method is a powerful method for the nonperturbative evaluation of quenched averages in disordered systems. Among others, this method has been applied to the statistical theory of S-matrix fluctuations, the theory of universal conductance fluctuations and the microscopic spectral density of the QCD Dirac operator. We start this series of lectures with a general review of Random Matrix Theory and the statistical theory of spectra. An elementary introduction to the supersymmetric method in Random Matrix Theory is given in the second and third lectures. We will show that a Random Matrix Theory can be rewritten as an integral over a supermanifold. This integral will be worked out in detail for the Gaussian Unitary Ensemble that describes level correlations in systems with broken time-reversal invariance. We especially emphasize the role of symmetries. As a second example of the application of the supersymmetric method we discuss the calculation of the microscopic spectral density of the QCD Dirac operator. This is the eigenvalue density near zero on the scale of the average level spacing, which is known to be given by chiral Random Matrix Theory. Also in this case we use symmetry considerations to rewrite the generating function for the resolvent as an integral over a supermanifold. The main topic of the penultimate lecture is recent developments on the relation between the supersymmetric partition function and integrable hierarchies (in our case the Toda lattice hierarchy). We will show that this relation is an efficient way to calculate superintegrals. Several examples that were given in previous lectures will be worked out by means of this new method. Finally, we will discuss the quenched QCD Dirac spectrum at nonzero chemical potential. Because of the nonhermiticity of the Dirac operator the usual supersymmetric method has not been successful in this case. However, we will show that the supersymmetric partition function can be evaluated by means of the replica limit of the Toda lattice equation.
Parametric number covariance in quantum chaotic spectra.
Vinayak; Kumar, Sandeep; Pandey, Akhilesh
2016-03-01
We study spectral parametric correlations in quantum chaotic systems and introduce the number covariance as a measure of such correlations. We derive analytic results for the classical random matrix ensembles using the binary correlation method and obtain compact expressions for the covariance. We illustrate the universality of this measure by presenting the spectral analysis of the quantum kicked rotors for the time-reversal invariant and time-reversal noninvariant cases. A local version of the parametric number variance introduced earlier is also investigated.
Random matrix theory for transition strengths: Applications and open questions
NASA Astrophysics Data System (ADS)
Kota, V. K. B.
2017-12-01
Embedded random matrix ensembles are generic models for describing statistical properties of finite isolated interacting quantum many-particle systems. A finite quantum system, induced by a transition operator, makes transitions from its states to the states of the same system or to those of another system. Examples are electromagnetic transitions (then the initial and final systems are the same), nuclear beta and double beta decay (then the initial and final systems are different) and so on. Using embedded ensembles (EE), there are efforts to derive a good statistical theory for transition strengths. With m fermions (or bosons) in N mean-field single particle levels and interacting via two-body forces, we have, with GOE embedding, the so-called EGOE(1+2). Now, the transition strength density (transition strength multiplied by the density of states at the initial and final energies) is a convolution of the density generated by the mean-field one-body part with a bivariate spreading function due to the two-body interaction. Using the embedding U(N) algebra, it is established, for a variety of transition operators, that the spreading function, for sufficiently strong interactions, is close to a bivariate Gaussian. Also, as the interaction strength increases, the spreading function exhibits a transition from bivariate Breit-Wigner to bivariate Gaussian form. In appropriate limits, this EE theory reduces to the polynomial theory of Draayer, French and Wong on one hand and to the theory due to Flambaum and Izrailev for one-body transition operators on the other. Using spin-cutoff factors for projecting angular momentum, the theory is applied to nuclear matrix elements for neutrinoless double beta decay (NDBD). In this paper we will describe: (i) various developments in the EE theory for transition strengths; (ii) results for nuclear matrix elements for 130Te and 136Xe NDBD; (iii) important open questions in the current form of the EE theory.
Generalized Pauli constraints in reduced density matrix functional theory.
Theophilou, Iris; Lathiotakis, Nektarios N; Marques, Miguel A L; Helbig, Nicole
2015-04-21
Functionals of the one-body reduced density matrix (1-RDM) are routinely minimized under Coleman's ensemble N-representability conditions. Recently, the topic of pure-state N-representability conditions, also known as generalized Pauli constraints, received increased attention following the discovery of a systematic way to derive them for any number of electrons and any finite dimensionality of the Hilbert space. The target of this work is to assess the potential impact of the enforcement of the pure-state conditions on the results of reduced density-matrix functional theory calculations. In particular, we examine whether the standard minimization of typical 1-RDM functionals under the ensemble N-representability conditions violates the pure-state conditions for prototype 3-electron systems. We also enforce the pure-state conditions, in addition to the ensemble ones, for the same systems and functionals and compare the correlation energies and optimal occupation numbers with those obtained by the enforcement of the ensemble conditions alone.
Statistical Methods in AI: Rare Event Learning Using Associative Rules and Higher-Order Statistics
NASA Astrophysics Data System (ADS)
Iyer, V.; Shetty, S.; Iyengar, S. S.
2015-07-01
Rare event learning has not been actively researched lately due to the unavailability of algorithms that deal with big samples. This research addresses spatio-temporal streams from multi-resolution sensors to find actionable items from the perspective of real-time algorithms. The computing framework is independent of the number of input samples, application domain, and labelled or label-less streams. A sampling overlap algorithm such as Brooks-Iyengar is used for dealing with noisy sensor streams. We extend the existing noise pre-processing algorithms using Data-Cleaning trees. Pre-processing using an ensemble of trees with bagging and multi-target regression showed robustness to random noise and missing data. As spatio-temporal streams are highly statistically correlated, we prove that a temporal-window-based sampling from sensor data streams converges after n samples using Hoeffding bounds, which can be used for fast prediction of new samples in real time. The Data-Cleaning tree model uses a nonparametric node splitting technique, which can be learned iteratively and scales linearly in memory consumption for any size of input stream. The improved task-based ensemble extraction is compared with non-linear computation models using various SVM kernels for speed and accuracy. We show using empirical datasets that the explicit rule learning computation is linear in time and depends only on the number of leafs present in the tree ensemble. The use of unpruned trees (t) in our proposed ensemble always yields a minimum number (m) of leafs, keeping pre-processing computation to n × t log m compared to N² for the Gram matrix. We also show that the task-based feature induction yields higher Quality of Data (QoD) in the feature space compared to kernel methods using the Gram matrix.
Ensemble solute transport in two-dimensional operator-scaling random fields
NASA Astrophysics Data System (ADS)
Monnig, Nathan D.; Benson, David A.; Meerschaert, Mark M.
2008-02-01
Motivated by field measurements of aquifer hydraulic conductivity (K), recent techniques were developed to construct anisotropic fractal random fields in which the scaling, or self-similarity parameter, varies with direction and is defined by a matrix. Ensemble numerical results are analyzed for solute transport through these two-dimensional "operator-scaling" fractional Brownian motion ln(K) fields. Both the longitudinal and transverse Hurst coefficients, as well as the "radius of isotropy" are important to both plume growth rates and the timing and duration of breakthrough. It is possible to create operator-scaling fractional Brownian motion fields that have more "continuity" or stratification in the direction of transport. The effects on a conservative solute plume are continually faster-than-Fickian growth rates, highly non-Gaussian shapes, and a heavier tail early in the breakthrough curve. Contrary to some analytic stochastic theories for monofractal K fields, the plume growth rates never exceed A. Mercado's (1967) purely stratified aquifer growth rate of plume apparent dispersivity proportional to mean distance. Apparent superstratified growth must be the result of other demonstrable factors, such as initial plume size.
NASA Astrophysics Data System (ADS)
Kuijlaars, A. B. J.
2001-08-01
The asymptotic behavior of polynomials that are orthogonal with respect to a slowly decaying weight is very different from the asymptotic behavior of polynomials that are orthogonal with respect to a Freud-type weight. While the latter has been extensively studied, much less is known about the former. Following an earlier investigation into the zero behavior, we study here the asymptotics of the density of states in a unitary ensemble of random matrices with a slowly decaying weight. This measure is also naturally connected with the orthogonal polynomials. It is shown that, after suitable rescaling, the weak limit is the same as the weak limit of the rescaled zeros.
A new method for determining the optimal lagged ensemble
DelSole, T.; Tippett, M. K.; Pegion, K.
2017-01-01
We propose a general methodology for determining the lagged ensemble that minimizes the mean square forecast error. The MSE of a lagged ensemble is shown to depend only on a quantity called the cross-lead error covariance matrix, which can be estimated from a short hindcast data set and parameterized in terms of analytic functions of time. The resulting parameterization allows the skill of forecasts to be evaluated for an arbitrary ensemble size and initialization frequency. Remarkably, the parameterization also can estimate the MSE of a burst ensemble simply by taking the limit of an infinitely small interval between initialization times. This methodology is applied to forecasts of the Madden Julian Oscillation (MJO) from version 2 of the Climate Forecast System (CFSv2). For leads greater than a week, little improvement is found in the MJO forecast skill when lagged ensembles longer than 5 days or initializations more frequent than 4 times per day are used. We find that if the initialization frequency is too low, important structures of the lagged error covariance matrix are lost. Lastly, we demonstrate that the forecast error at leads ≥10 days can be reduced by optimally weighting the lagged ensemble members. The weights are shown to depend only on the cross-lead error covariance matrix. While the methodology developed here is applied to CFSv2, the technique can be easily adapted to other forecast systems. PMID:28580050
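When the MSE of the weighted ensemble is w'Cw under the constraint that the weights sum to one, the optimal weighting has the standard closed form w = C⁻¹1 / (1'C⁻¹1). The sketch below uses a made-up cross-lead error covariance matrix, not CFSv2 estimates, to show the computation.

```python
import numpy as np

C = np.array([[1.0, 0.8, 0.6],
              [0.8, 1.1, 0.7],
              [0.6, 0.7, 1.3]])        # hypothetical cross-lead error covariance
ones = np.ones(len(C))
w = np.linalg.solve(C, ones)
w /= ones @ w                          # minimum-MSE lagged-ensemble weights, sum to 1
print("weights:", w, " weighted MSE:", w @ C @ w)
```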
Conditional random matrix ensembles and the stability of dynamical systems
NASA Astrophysics Data System (ADS)
Kirk, Paul; Rolando, Delphine M. Y.; MacLean, Adam L.; Stumpf, Michael P. H.
2015-08-01
Random matrix theory (RMT) has found applications throughout physics and applied mathematics, in subject areas as diverse as communications networks, population dynamics, neuroscience, and models of the banking system. Many of these analyses exploit elegant analytical results, particularly the circular law and its extensions. In order to apply these results, assumptions must be made about the distribution of matrix elements. Here we demonstrate that the choice of matrix distribution is crucial. In particular, adopting an unrealistic matrix distribution for the sake of analytical tractability is liable to lead to misleading conclusions. We focus on the application of RMT to the long-standing, and at times fractious, ‘diversity-stability debate’, which is concerned with establishing whether large complex systems are likely to be stable. Early work (and subsequent elaborations) brought RMT to bear on the debate by modelling the entries of a system’s Jacobian matrix as independent and identically distributed (i.i.d.) random variables. These analyses were successful in yielding general results that were not tied to any specific system, but relied upon a restrictive i.i.d. assumption. Other studies took an opposing approach, seeking to elucidate general principles of stability through the analysis of specific systems. Here we develop a statistical framework that reconciles these two contrasting approaches. We use a range of illustrative dynamical systems examples to demonstrate that: (i) stability probability cannot be summarily deduced from any single property of the system (e.g. its diversity); and (ii) our assessment of stability depends on adequately capturing the details of the systems analysed. Failing to condition on the structure of dynamical systems will skew our analysis and can, even for very small systems, result in an unnecessarily pessimistic diagnosis of their stability.
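The point that stability conclusions hinge on the assumed matrix distribution can be demonstrated in a few lines. This toy (arbitrary dimension, diagonal self-regulation, and two invented element distributions) estimates the probability that all eigenvalues of a random Jacobian have negative real part.

```python
import numpy as np

rng = np.random.default_rng(6)

def stability_prob(draw, d=10, trials=2000):
    """Fraction of random Jacobians whose spectrum lies in the left half-plane."""
    stable = 0
    for _ in range(trials):
        J = draw((d, d)) - 2.0 * np.eye(d)      # off-diagonal noise + self-regulation
        stable += np.all(np.linalg.eigvals(J).real < 0)
    return stable / trials

print("gaussian  :", stability_prob(lambda s: rng.normal(size=s)))
print("heavy-tail:", stability_prob(lambda s: rng.standard_t(2, size=s)))
```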
Effect of chiral symmetry on chaotic scattering from Majorana zero modes.
Schomerus, H; Marciani, M; Beenakker, C W J
2015-04-24
In many of the experimental systems that may host Majorana zero modes, a so-called chiral symmetry exists that protects overlapping zero modes from splitting up. This symmetry is operative in a superconducting nanowire that is narrower than the spin-orbit scattering length, and at the Dirac point of a superconductor-topological insulator heterostructure. Here we show that chiral symmetry strongly modifies the dynamical and spectral properties of a chaotic scatterer, even if it binds only a single zero mode. These properties are quantified by the Wigner-Smith time-delay matrix Q=-iℏS^{†}dS/dE, the Hermitian energy derivative of the scattering matrix, related to the density of states by ρ=(2πℏ)^{-1}TrQ. We compute the probability distribution of Q and ρ, dependent on the number ν of Majorana zero modes, in the chiral ensembles of random-matrix theory. Chiral symmetry is essential for a significant ν dependence.
Pai, Priyadarshini P; Mondal, Sukanta
2016-10-01
Proteins interact with carbohydrates to perform various cellular interactions. Of the many carbohydrate ligands that proteins bind, mannose constitutes an important class, playing important roles in host defense mechanisms. Accurate identification of mannose-interacting residues (MIR) may provide important clues to decipher the underlying mechanisms of protein-mannose interactions during infections. This study proposes an approach using an ensemble of base classifiers for prediction of MIR using their evolutionary information in the form of a position-specific scoring matrix. The base classifiers are random forests trained on different subsets of the training data set Dset128 using 10-fold cross-validation. The optimized ensemble of base classifiers, MOWGLI, is then used to predict MIR on protein chains of the test data set Dtestset29, where it showed a promising performance with 92.0% accurate prediction. An overall improvement of 26.6% in precision was observed upon comparison with the state of the art. It is hoped that this approach, yielding enhanced predictions, could eventually be used for applications in drug design and vaccine development.
Random Matrix Theory and Econophysics
NASA Astrophysics Data System (ADS)
Rosenow, Bernd
2000-03-01
Random Matrix Theory (RMT) [1] is used in many branches of physics as a ``zero information hypothesis''. It describes generic behavior of different classes of systems, while deviations from its universal predictions allow one to identify system specific properties. We use methods of RMT to analyze the cross-correlation matrix C of stock price changes [2] of the largest 1000 US companies. In addition to its scientific interest, the study of correlations between the returns of different stocks is also of practical relevance in quantifying the risk of a given stock portfolio. We find [3,4] that the statistics of most of the eigenvalues of the spectrum of C agree with the predictions of RMT, while there are deviations for some of the largest eigenvalues. We interpret these deviations as a system specific property, e.g. containing genuine information about correlations in the stock market. We demonstrate that C shares universal properties with the Gaussian orthogonal ensemble of random matrices. Furthermore, we analyze the eigenvectors of C through their inverse participation ratio and find eigenvectors with large ratios at both edges of the eigenvalue spectrum - a situation reminiscent of localization theory results. This work was done in collaboration with V. Plerou, P. Gopikrishnan, T. Guhr, L.A.N. Amaral, and H. E. Stanley and is related to recent work of Laloux et al. 1. T. Guhr, A. Müller-Groeling, and H.A. Weidenmüller, ``Random Matrix Theories in Quantum Physics: Common Concepts'', Phys. Rep. 299, 190 (1998). 2. See, e.g. R.N. Mantegna and H.E. Stanley, Econophysics: Correlations and Complexity in Finance (Cambridge University Press, Cambridge, England, 1999). 3. V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, and H.E. Stanley, ``Universal and Nonuniversal Properties of Cross Correlations in Financial Time Series'', Phys. Rev. Lett. 83, 1471 (1999). 4. V. Plerou, P. Gopikrishnan, T. Guhr, B. Rosenow, L.A.N. Amaral, and H.E. Stanley, ``Random Matrix Theory Analysis of Diffusion in Stock Price Dynamics'', preprint.
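The RMT null hypothesis in this kind of analysis is that the eigenvalues of the correlation matrix of uncorrelated returns fall inside the Marchenko-Pastur band [(1−√q)², (1+√q)²] with q = N/T, while real market data shows outliers above the band. A sketch with surrogate data (N and T are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
N, T = 100, 400                        # stocks, time points
R = rng.normal(size=(N, T))            # surrogate unit-variance returns
C = R @ R.T / T                        # sample correlation matrix
lam = np.linalg.eigvalsh(C)
q = N / T
lo, hi = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2
print(f"MP band [{lo:.2f}, {hi:.2f}]  spectrum [{lam.min():.2f}, {lam.max():.2f}]")
```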
Random matrix approach to cross correlations in financial data
NASA Astrophysics Data System (ADS)
Plerou, Vasiliki; Gopikrishnan, Parameswaran; Rosenow, Bernd; Amaral, Luís A.; Guhr, Thomas; Stanley, H. Eugene
2002-06-01
We analyze cross correlations between price fluctuations of different stocks using methods of random matrix theory (RMT). Using two large databases, we calculate cross-correlation matrices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hayashi, A.; Hashimoto, T.; Horibe, M.
The quantum color coding scheme proposed by Korff and Kempe [e-print quant-ph/0405086] is easily extended so that the color coding quantum system is allowed to be entangled with an extra auxiliary quantum system. It is shown that in the extended scheme we need only ≈2√N quantum colors to order N objects in the large N limit, whereas ≈N/e quantum colors are required in the original nonextended version. The maximum success probability has asymptotics expressed by the Tracy-Widom distribution of the largest eigenvalue of a random Gaussian unitary ensemble (GUE) matrix.
Compressed sensing of hyperspectral images based on scrambled block Hadamard ensemble
NASA Astrophysics Data System (ADS)
Wang, Li; Feng, Yan
2016-11-01
A fast measurement matrix based on scrambled block Hadamard ensemble for compressed sensing (CS) of hyperspectral images (HSI) is investigated. The proposed measurement matrix offers several attractive features. First, the proposed measurement matrix possesses Gaussian behavior, which illustrates that the matrix is universal and requires a near-optimal number of samples for exact reconstruction. In addition, it could be easily implemented in the optical domain due to its integer-valued elements. More importantly, the measurement matrix only needs small memory for storage in the sampling process. Experimental results on HSIs reveal that the reconstruction performance of the proposed measurement matrix is comparable or better than Gaussian matrix and Bernoulli matrix using different reconstruction algorithms while consuming less computational time. The proposed matrix could be used in CS of HSI, which would save the storage memory on board, improve the sampling efficiency, and ameliorate the reconstruction quality.
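A generic scrambled block Hadamard measurement can be sketched as: permute the signal, apply a small Hadamard transform block by block, then keep a random subset of the outputs. The block size, dimensions, and subsampling below are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(8)
n, B, m = 1024, 32, 128               # signal length, Hadamard block size, measurements
perm = rng.permutation(n)             # scrambling permutation
rows = rng.choice(n, m, replace=False)
H = hadamard(B) / np.sqrt(B)          # orthonormal block with +/-1 entries (up to scale)

def measure(x):
    xs = x[perm].reshape(-1, B)       # scramble, then split into length-B blocks
    return (xs @ H.T).ravel()[rows]   # block-diagonal transform, random subsampling

y = measure(rng.normal(size=n))
print(y.shape)                        # (128,) compressive measurements
```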
Random Matrix Approach to Quantum Adiabatic Evolution Algorithms
NASA Technical Reports Server (NTRS)
Boulatov, Alexei; Smelyanskiy, Vadier N.
2004-01-01
We analyze the power of quantum adiabatic evolution algorithms (QAEA) for solving random NP-hard optimization problems within a theoretical framework based on random matrix theory (RMT). We present two types of driven RMT models. In the first model, the driving Hamiltonian is represented by Brownian motion in the matrix space. We use the Brownian motion model to obtain a description of multiple avoided crossing phenomena. We show that the failure mechanism of the QAEA is due to the interaction of the ground state with the "cloud" formed by all the excited states, confirming that in the driven RMT models the Landau-Zener mechanism of dissipation is not important. We show that the QAEA has a finite probability of success in a certain range of parameters, implying the polynomial complexity of the algorithm. The second model corresponds to the standard QAEA with the problem Hamiltonian taken from the Gaussian Unitary RMT ensemble (GUE). We show that the level dynamics in this model can be mapped onto the dynamics in the Brownian motion model. However, the driven RMT model always leads to the exponential complexity of the algorithm due to the presence of long-range intertemporal correlations of the eigenvalues. Our results indicate that the weakness of effective transitions is the leading effect that can make the Markovian type QAEA successful.
A method for determining the weak statistical stationarity of a random process
NASA Technical Reports Server (NTRS)
Sadeh, W. Z.; Koper, C. A., Jr.
1978-01-01
A method for determining the weak statistical stationarity of a random process is presented. The core of this testing procedure consists of generating an equivalent ensemble which approximates a true ensemble. Formation of an equivalent ensemble is accomplished through segmenting a sufficiently long time history of a random process into equal, finite, and statistically independent sample records. The weak statistical stationarity is ascertained based on the time invariance of the equivalent-ensemble averages. Comparison of these averages with their corresponding time averages over a single sample record leads to a heuristic estimate of the ergodicity of a random process. Specific variance tests are introduced for evaluating the statistical independence of the sample records, the time invariance of the equivalent-ensemble autocorrelations, and the ergodicity. Examination and substantiation of these procedures were conducted utilizing turbulent velocity signals.
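A minimal version of the equivalent-ensemble test can be written directly: slice one long record into equal, statistically independent sample records and check that the ensemble average is time invariant. The signal and the segmentation below are placeholders for the turbulent velocity data used in the paper.

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.normal(size=100_000)                  # stand-in for a long time history
records = x.reshape(100, 1000)                # equal, finite sample records
ens_mean = records.mean(axis=0)               # equivalent-ensemble average vs. time
drift = np.abs(ens_mean - ens_mean.mean()).max()
print("max drift of ensemble mean:", drift)   # small drift -> weakly stationary
```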
Correlations of RMT characteristic polynomials and integrability: Hermitean matrices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osipov, Vladimir Al., E-mail: Vladimir.Osipov@uni-due.d; Kanzieper, Eugene, E-mail: Eugene.Kanzieper@hit.ac.i; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100
Integrable theory is formulated for correlation functions of characteristic polynomials associated with invariant non-Gaussian ensembles of Hermitean random matrices. By embedding the correlation functions of interest into a more general theory of τ functions, we (i) identify a zoo of hierarchical relations satisfied by τ functions in an abstract infinite-dimensional space and (ii) present a technology to translate these relations into hierarchically structured nonlinear differential equations describing the correlation functions of characteristic polynomials in the physical, spectral space. Implications of this formalism for fermionic, bosonic, and supersymmetric variations of zero-dimensional replica field theories are discussed at length. A particular emphasis is placed on the phenomenon of fermionic-bosonic factorisation of random-matrix-theory correlation functions.
RSAT: regulatory sequence analysis tools.
Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques
2008-07-01
The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.
Randomized central limit theorems: A unified theory.
Eliazar, Iddo; Klafter, Joseph
2010-08-01
The central limit theorems (CLTs) characterize the macroscopic statistical behavior of large ensembles of independent and identically distributed random variables. The CLTs assert that the universal probability laws governing ensembles' aggregate statistics are either Gaussian or Lévy, and that the universal probability laws governing ensembles' extreme statistics are Fréchet, Weibull, or Gumbel. The scaling schemes underlying the CLTs are deterministic: all ensemble components are scaled by a common deterministic scale. However, there are "random environment" settings in which the underlying scaling schemes are stochastic: the ensemble components are scaled by different random scales. Examples of such settings include Holtsmark's law for gravitational fields and the Stretched Exponential law for relaxation times. In this paper we establish a unified theory of randomized central limit theorems (RCLTs), in which the deterministic CLT scaling schemes are replaced with stochastic scaling schemes, and present "randomized counterparts" to the classic CLTs. The RCLT scaling schemes are shown to be governed by Poisson processes with power-law statistics, and the RCLTs are shown to universally yield the Lévy, Fréchet, and Weibull probability laws.
Vyas, Manan; Kota, V K B; Chavda, N D
2010-03-01
Finite interacting Fermi systems with a mean field and a chaos-generating two-body interaction are modeled by the one plus two-body embedded Gaussian orthogonal ensemble of random matrices with spin degree of freedom [called EGOE(1+2)-s]. Numerical calculations are used to demonstrate that, as λ, the strength of the interaction (measured in units of the average spacing of the single-particle levels defining the mean field), increases, generically there is a Poisson to GOE transition in level fluctuations, a Breit-Wigner to Gaussian transition in strength functions (also called local density of states), and also a duality region where the information entropy will be the same in both the mean-field and interaction defined bases. The spin dependence of the transition points λ_c, λ_F, and λ_d, respectively, is described using the propagator for the spectral variances, and the formula for the propagator is derived. We further establish that the duality region corresponds to a region of thermalization. For this purpose we compared the single-particle entropy defined by the occupancies of the single-particle orbitals with the thermodynamic entropy and the information entropy for various λ values, and they are very close to each other at λ = λ_d.
Ensemble-type numerical uncertainty information from single model integrations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rauser, Florian, E-mail: florian.rauser@mpimet.mpg.de; Marotzke, Jochem; Korn, Peter
2015-07-01
We suggest an algorithm that quantifies the discretization error of time-dependent physical quantities of interest (goals) for numerical models of geophysical fluid dynamics. The goal discretization error is estimated using a sum of weighted local discretization errors. The key feature of our algorithm is that these local discretization errors are interpreted as realizations of a random process. The random process is determined by the model and the flow state. From a class of local error random processes we select a suitable specific random process by integrating the model over a short time interval at different resolutions. The weights of the influences of the local discretization errors on the goal are modeled as goal sensitivities, which are calculated via automatic differentiation. The integration of the weighted realizations of local error random processes yields a posterior ensemble of goal approximations from a single run of the numerical model. From the posterior ensemble we derive the uncertainty information of the goal discretization error. This algorithm bypasses the requirement of detailed knowledge about the model's discretization to generate numerical error estimates. The algorithm is evaluated for the spherical shallow-water equations. For two standard test cases we successfully estimate the error of regional potential energy, track its evolution, and compare it to standard ensemble techniques. The posterior ensemble shares linear-error-growth properties with ensembles of multiple model integrations when comparably perturbed. The posterior ensemble numerical error estimates are of comparable size as those of a stochastic physics ensemble.
A Random Forest-based ensemble method for activity recognition.
Feng, Zengtao; Mo, Lingfei; Li, Meng
2015-01-01
This paper presents a multi-sensor ensemble approach to human physical activity (PA) recognition, using random forest. We designed an ensemble learning algorithm, which integrates several independent Random Forest classifiers based on different sensor feature sets to build a more stable, more accurate and faster classifier for human activity recognition. To evaluate the algorithm, PA data collected from the PAMAP (Physical Activity Monitoring for Aging People), which is a standard, publicly available database, was utilized to train and test. The experimental results show that the algorithm is able to correctly recognize 19 PA types with an accuracy of 93.44%, while the training is faster than others. The ensemble classifier system based on the RF (Random Forest) algorithm can achieve high recognition accuracy and fast calculation.
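The structure described, independent random forests per sensor feature subset combined by voting, can be sketched with scikit-learn. The data, feature slices, and hyperparameters below are synthetic placeholders, not the PAMAP setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(10)
X = rng.normal(size=(500, 12))                       # synthetic sensor features
y = rng.integers(0, 3, size=500)                     # synthetic activity labels
subsets = [slice(0, 4), slice(4, 8), slice(8, 12)]   # one feature block per sensor
forests = [RandomForestClassifier(n_estimators=50, random_state=0).fit(X[:, s], y)
           for s in subsets]
votes = np.stack([f.predict(X[:, s]) for f, s in zip(forests, subsets)])
y_hat = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)  # majority vote
print("training vote accuracy:", (y_hat == y).mean())
```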
Modeling cometary photopolarimetric characteristics with Sh-matrix method
NASA Astrophysics Data System (ADS)
Kolokolova, L.; Petrov, D.
2017-12-01
Cometary dust is dominated by particles of complex shape and structure, which are often considered as fractal aggregates. Rigorous modeling of light scattering by such particles, even using parallelized codes and NASA supercomputer resources, consumes a great deal of computer time and memory. We are presenting a new approach to modeling cometary dust that is based on the Sh-matrix technique (e.g., Petrov et al., JQSRT, 112, 2012). This method is based on the T-matrix technique (e.g., Mishchenko et al., JQSRT, 55, 1996) and was developed after it had been found that the shape-dependent factors could be separated from the size- and refractive-index-dependent factors and presented as a shape matrix, or Sh-matrix. Size and refractive index dependences are incorporated through analytical operations on the Sh-matrix to produce the elements of the T-matrix. The Sh-matrix method keeps all advantages of the T-matrix method, including analytical averaging over particle orientation. Moreover, the surface integrals describing the Sh-matrix elements themselves can be solved analytically for particles of any shape. This makes the Sh-matrix approach an effective technique to simulate light scattering by particles of complex shape and surface structure. In this paper, we present cometary dust as an ensemble of Gaussian random particles. The shape of these particles is described by a log-normal distribution of their radius length and direction (Muinonen, EMP, 72, 1996). Changing one of the parameters of this distribution, the correlation angle, from 0 to 90 deg., we can model a variety of particles from spheres to particles of random complex shape. We survey the angular and spectral dependencies of intensity and polarization resulting from light scattering by such particles, studying how they depend on the particle shape, size, and composition (including porous particles to simulate aggregates) to find the best fit to the cometary observations.
Targeting functional motifs of a protein family
NASA Astrophysics Data System (ADS)
Bhadola, Pradeep; Deo, Nivedita
2016-10-01
The structural organization of a protein family is investigated by devising a method based on random matrix theory (RMT), which uses the physiochemical properties of the amino acids with multiple sequence alignment. A graphical method to represent protein sequences using physiochemical properties is devised that gives a fast, easy, and informative way of comparing the evolutionary distances between protein sequences. A correlation matrix associated with each property is calculated, where the noise reduction and information filtering is done using RMT involving an ensemble of Wishart matrices. The analysis of the eigenvalue statistics of the correlation matrix for the β-lactamase family shows the universal features observed in the Gaussian orthogonal ensemble (GOE). The property-based approach captures the short- as well as the long-range correlations (approximately following GOE) between the eigenvalues, whereas the previous approach (treating amino acids as characters) gives the usual short-range correlations, while the long-range correlations are the same as those of an uncorrelated series. The distribution of the eigenvector components for the eigenvalues outside the bulk (RMT bound) deviates significantly from RMT observations and contains important information about the system. The information content of each eigenvector of the correlation matrix is quantified by introducing an entropic estimate, which shows that for the β-lactamase family the smallest eigenvectors (low eigenmodes) are highly localized as well as informative. These small eigenvectors, when processed, give clusters involving positions that have well-defined biological and structural importance, matching experiments. The approach is crucial for the recognition of structural motifs, as shown for β-lactamase (and other families), and selectively identifies important positions as targets to deactivate (activate) the enzymatic actions.
NASA Astrophysics Data System (ADS)
Deelan Cunden, Fabio; Facchi, Paolo; Florio, Giuseppe; Pascazio, Saverio
2013-05-01
Let a pure state |ψ⟩ be chosen randomly in an NM-dimensional Hilbert space, and consider the reduced density matrix ρ_A of an N-dimensional subsystem. The bipartite entanglement properties of |ψ⟩ are encoded in the spectrum of ρ_A. By means of a saddle point method and using a "Coulomb gas" model for the eigenvalues, we obtain the typical spectrum of reduced density matrices. We consider the cases of an unbiased ensemble of pure states and of a fixed value of the purity. We finally obtain the eigenvalue distribution by using a statistical mechanics approach based on the introduction of a partition function.
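Sampling from the unbiased ensemble is straightforward, which makes the typical spectrum easy to probe numerically. In this sketch (dimensions are illustrative), a Haar-random pure state is drawn via complex Gaussians and the spectrum of ρ_A is computed directly.

```python
import numpy as np

rng = np.random.default_rng(11)
N, M = 8, 32                                    # subsystem and environment dimensions
psi = rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M))
psi /= np.linalg.norm(psi)                      # random pure state in NM dimensions
rho_A = psi @ psi.conj().T                      # reduced density matrix of subsystem A
spec = np.linalg.eigvalsh(rho_A)                # the "entanglement spectrum"
print("trace:", spec.sum(), " largest eigenvalue:", spec.max())
```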
Polarized ensembles of random pure states
NASA Astrophysics Data System (ADS)
Deelan Cunden, Fabio; Facchi, Paolo; Florio, Giuseppe
2013-08-01
A new family of polarized ensembles of random pure states is presented. These ensembles are obtained by linear superposition of two random pure states with suitable distributions, and are quite manageable. We will use the obtained results for two purposes: on the one hand we will be able to derive an efficient strategy for sampling states from isopurity manifolds. On the other, we will characterize the deviation of a pure quantum state from separability under the influence of noise.
Entropy of spatial network ensembles
NASA Astrophysics Data System (ADS)
Coon, Justin P.; Dettmann, Carl P.; Georgiou, Orestis
2018-04-01
We analyze complexity in spatial network ensembles through the lens of graph entropy. Mathematically, we model a spatial network as a soft random geometric graph, i.e., a graph with two sources of randomness, namely nodes located randomly in space and links formed independently between pairs of nodes with probability given by a specified function (the "pair connection function") of their mutual distance. We consider the general case where randomness arises in node positions as well as pairwise connections (i.e., for a given pair distance, the corresponding edge state is a random variable). Classical random geometric graph and exponential graph models can be recovered in certain limits. We derive a simple bound for the entropy of a spatial network ensemble and calculate the conditional entropy of an ensemble given the node location distribution for hard and soft (probabilistic) pair connection functions. Under this formalism, we derive the connection function that yields maximum entropy under general constraints. Finally, we apply our analytical framework to study two practical examples: ad hoc wireless networks and the US flight network. Through the study of these examples, we illustrate that both exhibit properties that are indicative of nearly maximally entropic ensembles.
Signatures of bifurcation on quantum correlations: Case of the quantum kicked top
NASA Astrophysics Data System (ADS)
Bhosale, Udaysinh T.; Santhanam, M. S.
2017-01-01
Quantum correlations reflect the quantumness of a system and are useful resources for quantum information and computational processes. Measures of quantum correlations do not have a classical analog and yet are influenced by classical dynamics. In this work, by modeling the quantum kicked top as a multiqubit system, the effect of classical bifurcations on measures of quantum correlations such as the quantum discord, geometric discord, and Meyer and Wallach Q measure is studied. The quantum correlation measures change rapidly in the vicinity of a classical bifurcation point. If the classical system is largely chaotic, time averages of the correlation measures are in good agreement with the values obtained by considering the appropriate random matrix ensembles. The quantum correlations scale with the total spin of the system, representing its semiclassical limit. In the vicinity of trivial fixed points of the kicked top, the scaling function decays as a power law. In the chaotic limit, for large total spin, quantum correlations saturate to a constant, which we obtain analytically, based on random matrix theory, for the Q measure. We also suggest that it can have experimental consequences.
Analysis of network clustering behavior of the Chinese stock market
NASA Astrophysics Data System (ADS)
Chen, Huan; Mai, Yong; Li, Sai-Ping
2014-11-01
Random Matrix Theory (RMT) and the decomposition of the correlation matrix method are employed to analyze the spatial structure of stock interactions and collective behavior in the Shanghai and Shenzhen stock markets in China. The result shows that there exist prominent sector structures, with subsectors including the Real Estate (RE), Commercial Banks (CB), Pharmaceuticals (PH), Distillers & Vintners (DV) and Steel (ST) industries. Furthermore, the RE and CB subsectors are mostly anti-correlated. We further study the temporal behavior of the dataset and find that while the sector structures are relatively stable from 2007 through 2013, the correlation between the real estate and commercial bank stocks shows large variations. By employing the ensemble empirical mode decomposition (EEMD) method, we show that this anti-correlation behavior is closely related to the monetary and austerity policies of the Chinese government during the period of study.
Lévy Matrices and Financial Covariances
NASA Astrophysics Data System (ADS)
Burda, Zdzislaw; Jurkiewicz, Jerzy; Nowak, Maciej A.; Papp, Gabor; Zahed, Ismail
2003-10-01
In a given market, financial covariances capture the intra-stock correlations and can be used to address statistically the bulk nature of the market as a complex system. We provide a statistical analysis of three SP500 covariances with evidence for raw tail distributions. We study the stability of these tails against reshuffling for the SP500 data and show that the covariance with the strongest tails is robust, with a spectral density in remarkable agreement with random Lévy matrix theory. We study the inverse participation ratio for the three covariances. The strong localization observed at both ends of the spectral density is analogous to the localization exhibited in the random Lévy matrix ensemble. We discuss two competitive mechanisms responsible for the occurrence of an extensive and delocalized eigenvalue at the edge of the spectrum: (a) the Lévy character of the entries of the correlation matrix and (b) a sort of off-diagonal order induced by underlying inter-stock correlations. (b) can be destroyed by reshuffling, while (a) cannot. We show that the stocks with the largest scattering are the least susceptible to correlations, and likely candidates for the localized states. We introduce a simple model for price fluctuations which captures behavior of the SP500 covariances. It may be of importance for assets diversification.
NASA Astrophysics Data System (ADS)
Müller, Christian L.; Sbalzarini, Ivo F.; van Gunsteren, Wilfred F.; Žagrović, Bojan; Hünenberger, Philippe H.
2009-06-01
The concept of high-resolution shapes (also referred to as folds or states, depending on the context) of a polymer chain plays a central role in polymer science, structural biology, bioinformatics, and biopolymer dynamics. However, although the idea of shape is intuitively very useful, there is no unambiguous mathematical definition for this concept. In the present work, the distributions of high-resolution shapes within the ideal random-walk ensembles with N = 3,…,6 beads (or up to N = 10 for some properties) are investigated using a systematic (grid-based) approach based on a simple working definition of shapes relying on the root-mean-square atomic positional deviation as a metric (i.e., to define the distance between pairs of structures) and a single cutoff criterion for the shape assignment. Although the random-walk ensemble appears to represent the paramount of homogeneity and randomness, this analysis reveals that the distribution of shapes within this ensemble, i.e., in the total absence of interatomic interactions characteristic of a specific polymer (beyond the generic connectivity constraint), is significantly inhomogeneous. In particular, a specific (densest) shape occurs with a local probability that is 1.28, 1.79, 2.94, and 10.05 times (N = 3,…,6) higher than the corresponding average over all possible shapes (these results can tentatively be extrapolated to a factor as large as about 10^28 for N = 100). The qualitative results of this analysis lead to a few rather counterintuitive suggestions, namely, that, e.g., (i) a fold classification analysis applied to the random-walk ensemble would lead to the identification of random-walk "folds;" (ii) a clustering analysis applied to the random-walk ensemble would also lead to the identification of random-walk "states" and associated relative free energies; and (iii) a random-walk ensemble of polymer chains could lead to well-defined diffraction patterns in hypothetical fiber or crystal diffraction experiments. The inhomogeneous nature of the shape probability distribution identified here for random walks may represent a significant underlying baseline effect in the analysis of real polymer chain ensembles (i.e., in the presence of specific interatomic interactions). As a consequence, a part of what is called a polymer shape may actually reside just "in the eye of the beholder" rather than in the nature of the interactions between the constituting atoms, and the corresponding observation-related bias should be taken into account when drawing conclusions from shape analyses as applied to real structural ensembles.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Olsen, Seth, E-mail: seth.olsen@uq.edu.au
2015-01-28
This paper reviews basic results from a theory of the a priori classical probabilities (weights) in state-averaged complete active space self-consistent field (SA-CASSCF) models. It addresses how the classical probabilities limit the invariance of the self-consistency condition to transformations of the complete active space configuration interaction (CAS-CI) problem. Such transformations are of interest for choosing representations of the SA-CASSCF solution that are diabatic with respect to some interaction. I achieve the known result that a SA-CASSCF can be self-consistently transformed only within degenerate subspaces of the CAS-CI ensemble density matrix. For uniformly distributed (“microcanonical”) SA-CASSCF ensembles, self-consistency is invariant to any unitary CAS-CI transformation that acts locally on the ensemble support. Most SA-CASSCF applications in current literature are microcanonical. A problem with microcanonical SA-CASSCF models for problems with “more diabatic than adiabatic” states is described. The problem is that not all diabatic energies and couplings are self-consistently resolvable. A canonical-ensemble SA-CASSCF strategy is proposed to solve the problem. For canonical-ensemble SA-CASSCF, the equilibrated ensemble is a Boltzmann density matrix parametrized by its own CAS-CI Hamiltonian and a Lagrange multiplier acting as an inverse “temperature,” unrelated to the physical temperature. Like the convergence criterion for microcanonical-ensemble SA-CASSCF, the equilibration condition for canonical-ensemble SA-CASSCF is invariant to transformations that act locally on the ensemble CAS-CI density matrix. The advantage of a canonical-ensemble description is that more adiabatic states can be included in the support of the ensemble without running into convergence problems. The constraint on the dimensionality of the problem is relieved by the introduction of an energy constraint. The method is illustrated with a complete active space valence-bond (CASVB) analysis of the charge/bond resonance electronic structure of a monomethine cyanine: Michler’s hydrol blue. The diabatic CASVB representation is shown to vary weakly for “temperatures” corresponding to visible photon energies. Canonical-ensemble SA-CASSCF enables the resolution of energies and couplings for all covalent and ionic CASVB structures contributing to the SA-CASSCF ensemble. The CASVB solution describes resonance of charge- and bond-localized electronic structures interacting via bridge resonance superexchange. The resonance couplings can be separated into channels associated with either covalent charge delocalization or chemical bonding interactions, with the latter significantly stronger than the former.
Using histograms to introduce randomization in the generation of ensembles of decision trees
Kamath, Chandrika; Cantu-Paz, Erick; Littau, David
2005-02-22
A system for decision tree ensembles that includes a module to read the data, a module to create a histogram, a module to evaluate a potential split according to some criterion using the histogram, a module to select a split point randomly in an interval around the best split, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method includes the steps of reading the data; creating a histogram; evaluating a potential split according to some criterion using the histogram, selecting a split point randomly in an interval around the best split, splitting the data, and combining multiple decision trees in ensembles.
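A rough Python sketch of the randomized histogram split described in this record may help; the Gini criterion, the bin count, and the "neighboring bins" interval are our illustrative assumptions, not details taken from the patent:

```python
import numpy as np

def random_histogram_split(x, y, n_bins=32, rng=None):
    """Pick a split point for feature x: find the best histogram boundary by
    Gini gain, then draw the split uniformly from the interval around it."""
    rng = rng or np.random.default_rng()
    gini = lambda t: 1.0 - ((np.bincount(t, minlength=2) / len(t)) ** 2).sum()
    edges = np.histogram_bin_edges(x, bins=n_bins)
    n, best_gain, best_j = len(y), -np.inf, None
    for j in range(1, n_bins):
        mask = x < edges[j]
        if 0 < mask.sum() < n:
            gain = gini(y) - (mask.sum() * gini(y[mask])
                              + (n - mask.sum()) * gini(y[~mask])) / n
            if gain > best_gain:
                best_gain, best_j = gain, j
    # Randomize within the bins adjacent to the best boundary.
    lo, hi = edges[best_j - 1], edges[best_j + 1]
    return rng.uniform(lo, hi)

# Example: one randomized split on a noisy threshold feature.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
y = (x > 0.6).astype(int)
print(random_histogram_split(x, y, rng=rng))
```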
DOE Office of Scientific and Technical Information (OSTI.GOV)
Theophilou, Iris; Helbig, Nicole; Lathiotakis, Nektarios N.
Functionals of the one-body reduced density matrix (1-RDM) are routinely minimized under Coleman’s ensemble N-representability conditions. Recently, the topic of pure-state N-representability conditions, also known as generalized Pauli constraints, received increased attention following the discovery of a systematic way to derive them for any number of electrons and any finite dimensionality of the Hilbert space. The target of this work is to assess the potential impact of the enforcement of the pure-state conditions on the results of reduced density-matrix functional theory calculations. In particular, we examine whether the standard minimization of typical 1-RDM functionals under the ensemble N-representability conditions violates the pure-state conditions for prototype 3-electron systems. We also enforce the pure-state conditions, in addition to the ensemble ones, for the same systems and functionals and compare the correlation energies and optimal occupation numbers with those obtained by the enforcement of the ensemble conditions alone.
Ensemble Solute Transport in 2-D Operator-Stable Random Fields
NASA Astrophysics Data System (ADS)
Monnig, N. D.; Benson, D. A.
2006-12-01
The heterogeneous velocity field that exists at many scales in an aquifer will typically cause a dissolved solute plume to grow at a rate faster than Fick's Law predicts. Some statistical model must be adopted to account for the aquifer structure that engenders the velocity heterogeneity. A fractional Brownian motion (fBm) model has been shown to create the long-range correlation that can produce continually faster-than-Fickian plume growth. Previous fBm models have assumed isotropic scaling (defined here by a scalar Hurst coefficient). Motivated by field measurements of aquifer hydraulic conductivity, recent techniques were developed to construct random fields with anisotropic scaling with a self-similarity parameter that is defined by a matrix. The growth of ensemble plumes is analyzed for transport through 2-D "operator-stable" fBm hydraulic conductivity (K) fields. Both the longitudinal and transverse Hurst coefficients are important to both plume growth rates and the timing and duration of breakthrough. Smaller Hurst coefficients in the transverse direction lead to more "continuity" or stratification in the direction of transport. The result is continually faster-than-Fickian growth rates, highly non-Gaussian ensemble plumes, and a longer tail early in the breakthrough curve. Contrary to some analytic stochastic theories for monofractal K fields, the plume growth rate never exceeds Mercado's [1967] purely stratified aquifer growth rate of plume apparent dispersivity proportional to mean distance. Apparent super-Mercado growth must be the result of other factors, such as larger plumes corresponding to either a larger initial plume size or greater variance of the ln(K) field.
A simple new filter for nonlinear high-dimensional data assimilation
NASA Astrophysics Data System (ADS)
Tödter, Julian; Kirchgessner, Paul; Ahrens, Bodo
2015-04-01
The ensemble Kalman filter (EnKF) and its deterministic variants, mostly square root filters such as the ensemble transform Kalman filter (ETKF), represent a popular alternative to variational data assimilation schemes and are applied in a wide range of operational and research activities. Their forecast step employs an ensemble integration that fully respects the nonlinear nature of the analyzed system. In the analysis step, they implicitly assume the prior state and observation errors to be Gaussian. Consequently, in nonlinear systems, the analysis mean and covariance are biased, and these filters remain suboptimal. In contrast, the fully nonlinear, non-Gaussian particle filter (PF) only relies on Bayes' theorem, which guarantees an exact asymptotic behavior, but because of the so-called curse of dimensionality it is exposed to weight collapse. This work shows how to obtain a new analysis ensemble whose mean and covariance exactly match the Bayesian estimates. This is achieved by a deterministic matrix square root transformation of the forecast ensemble, and subsequently a suitable random rotation that significantly contributes to filter stability while preserving the required second-order statistics. The forecast step remains as in the ETKF. The proposed algorithm, which is fairly easy to implement and computationally efficient, is referred to as the nonlinear ensemble transform filter (NETF). The properties and performance of the proposed algorithm are investigated via a set of Lorenz experiments. They indicate that such a filter formulation can increase the analysis quality, even for relatively small ensemble sizes, compared to other ensemble filters in nonlinear, non-Gaussian scenarios. Furthermore, localization enhances the potential applicability of this PF-inspired scheme in larger-dimensional systems. Finally, the novel algorithm is coupled to a large-scale ocean general circulation model. The NETF is stable, behaves reasonably and shows a good performance with a realistic ensemble size. The results confirm that, in principle, it can be applied successfully, and as simply as the ETKF, in high-dimensional problems without further modifications of the algorithm, even though it is only based on the particle weights. This proves that the suggested method constitutes a useful filter for nonlinear, high-dimensional data assimilation, and is able to overcome the curse of dimensionality even in deterministic systems.
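A compact sketch of the NETF analysis step is given below, assuming Gaussian observation errors and following the construction described above: the particle weights give the Bayesian mean, a matrix square root of diag(w) − ww^T supplies deviations with exactly the Bayesian covariance, and a mean-preserving random rotation is applied. Function and variable names are ours:

```python
import numpy as np
from scipy.linalg import sqrtm, qr

def mean_preserving_rotation(m, rng):
    """Random orthogonal matrix Lambda with Lambda @ ones = ones."""
    v = np.full(m, 1.0 / np.sqrt(m))
    u = v - np.eye(m)[0]                      # Householder: e1 -> v
    if np.linalg.norm(u) > 1e-12:
        u /= np.linalg.norm(u)
    B = np.eye(m) - 2.0 * np.outer(u, u)
    core = np.eye(m)
    core[1:, 1:], _ = qr(rng.standard_normal((m - 1, m - 1)))
    return B @ core @ B.T

def netf_analysis(Xf, y, H, obs_var, rng):
    """One NETF analysis step (sketch). Xf: n x m, members as columns."""
    n, m = Xf.shape
    innov = y[:, None] - H @ Xf
    logw = -0.5 * np.sum(innov**2, axis=0) / obs_var   # Gaussian obs errors
    w = np.exp(logw - logw.max()); w /= w.sum()        # particle weights
    xa = Xf @ w                                        # Bayesian mean
    A = np.diag(w) - np.outer(w, w)                    # Pa = Xf A Xf^T
    T = np.sqrt(m) * np.real(sqrtm(A))                 # square root transform
    Lam = mean_preserving_rotation(m, rng)             # stabilizing rotation
    return xa[:, None] + Xf @ T @ Lam

# Tiny demo: 3-variable state, 20 members, one scalar observation.
rng = np.random.default_rng(0)
Xf = rng.standard_normal((3, 20))
H = np.array([[1.0, 0.0, 0.0]])
Xa = netf_analysis(Xf, y=np.array([0.5]), H=H, obs_var=0.1, rng=rng)
```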
Gaussian memory in kinematic matrix theory for self-propellers.
Nourhani, Amir; Crespi, Vincent H; Lammert, Paul E
2014-12-01
We extend the kinematic matrix ("kinematrix") formalism [Phys. Rev. E 89, 062304 (2014)], which via simple matrix algebra accesses ensemble properties of self-propellers influenced by uncorrelated noise, to treat Gaussian correlated noises. This extension brings into reach many real-world biological and biomimetic self-propellers for which inertia is significant. Applying the formalism, we analyze in detail ensemble behaviors of a 2D self-propeller with velocity fluctuations and orientation evolution driven by an Ornstein-Uhlenbeck process. On the basis of exact results, a variety of dynamical regimes determined by the inertial, speed-fluctuation, orientational diffusion, and emergent disorientation time scales are delineated and discussed.
NASA Astrophysics Data System (ADS)
Goyal, Sandeep K.; Singh, Rajeev; Ghosh, Sibasish
2016-01-01
Mixed states of a quantum system, represented by density operators, can be decomposed as a statistical mixture of pure states in a number of ways where each decomposition can be viewed as a different preparation recipe. However, the fact that the density matrix contains full information about the ensemble makes it impossible to estimate the preparation basis for the quantum system. Here we present a measurement scheme to (seemingly) improve the performance of unsharp measurements. We argue that in some situations this scheme is capable of providing statistics from a single copy of the quantum system, thus making it possible to perform state tomography from a single copy. One of the by-products of the scheme is a way to distinguish between different preparation methods used to prepare the state of the quantum system. However, our numerical simulations disagree with our intuitive predictions. We show that a counterintuitive property of a biased classical random walk is responsible for the proposed mechanism not working.
Exploring diversity in ensemble classification: Applications in large area land cover mapping
NASA Astrophysics Data System (ADS)
Mellor, Andrew; Boukir, Samia
2017-07-01
Ensemble classifiers, such as random forests, are now commonly applied in the field of remote sensing, and have been shown to perform better than single classifier systems, resulting in reduced generalisation error. Diversity across the members of ensemble classifiers is known to have a strong influence on classification performance - whereby classifier errors are uncorrelated and more uniformly distributed across ensemble members. The relationship between ensemble diversity and classification performance has not yet been fully explored in the fields of information science and machine learning and has never been examined in the field of remote sensing. This study is a novel exploration of ensemble diversity and its link to classification performance, applied to a multi-class canopy cover classification problem using random forests and multisource remote sensing and ancillary GIS data, across seven million hectares of diverse dry-sclerophyll dominated public forests in Victoria, Australia. A particular emphasis is placed on analysing the relationship between ensemble diversity and ensemble margin - two key concepts in ensemble learning. The main novelty of our work lies in boosting diversity by emphasizing the contribution of lower margin instances used in the learning process. Exploring the influence of tree pruning on diversity is also a new empirical analysis that contributes to a better understanding of ensemble performance. Results reveal insights into the trade-off between ensemble classification accuracy and diversity, and through the ensemble margin, demonstrate how inducing diversity by targeting lower margin training samples is a means of achieving better classifier performance for more difficult or rarer classes and reducing information redundancy in classification problems. Our findings inform strategies for collecting training data and designing and parameterising ensemble classifiers, such as random forests. This is particularly important in large area remote sensing applications, for which training data is costly and resource intensive to collect.
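The ensemble margin that the study builds on can be computed directly from the per-tree votes of a random forest. A scikit-learn sketch on synthetic three-class data (illustrative only; the paper uses multisource remote sensing features):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Ensemble margin: (votes for the true class minus the largest vote count
# for any other class) / n_trees, a number in [-1, 1] per sample.
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
votes = np.stack([t.predict(X).astype(int) for t in rf.estimators_], axis=1)
counts = np.stack([np.bincount(v, minlength=3) for v in votes])
true_votes = counts[np.arange(len(y)), y]
rest = counts.copy()
rest[np.arange(len(y)), y] = -1
margin = (true_votes - rest.max(axis=1)) / votes.shape[1]
# Low-margin instances are the "difficult" samples targeted when boosting
# diversity in the paper.
print("fraction with margin < 0.2:", (margin < 0.2).mean())
```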
Ensemble Bayesian forecasting system Part I: Theory and algorithms
NASA Astrophysics Data System (ADS)
Herr, Henry D.; Krzysztofowicz, Roman
2015-05-01
The ensemble Bayesian forecasting system (EBFS), whose theory was published in 2001, is developed for the purpose of quantifying the total uncertainty about a discrete-time, continuous-state, non-stationary stochastic process such as a time series of stages, discharges, or volumes at a river gauge. The EBFS is built of three components: an input ensemble forecaster (IEF), which simulates the uncertainty associated with random inputs; a deterministic hydrologic model (of any complexity), which simulates physical processes within a river basin; and a hydrologic uncertainty processor (HUP), which simulates the hydrologic uncertainty (an aggregate of all uncertainties except input). It works as a Monte Carlo simulator: an ensemble of time series of inputs (e.g., precipitation amounts) generated by the IEF is transformed deterministically through a hydrologic model into an ensemble of time series of outputs, which is next transformed stochastically by the HUP into an ensemble of time series of predictands (e.g., river stages). Previous research indicated that in order to attain an acceptable sampling error, the ensemble size must be on the order of hundreds (for probabilistic river stage forecasts and probabilistic flood forecasts) or even thousands (for probabilistic stage transition forecasts). The computing time needed to run the hydrologic model this many times renders the straightforward simulations operationally infeasible. This motivates the development of the ensemble Bayesian forecasting system with randomization (EBFSR), which takes full advantage of the analytic meta-Gaussian HUP and generates multiple ensemble members after each run of the hydrologic model; this auxiliary randomization reduces the required size of the meteorological input ensemble and makes it operationally feasible to generate a Bayesian ensemble forecast of large size. Such a forecast quantifies the total uncertainty, is well calibrated against the prior (climatic) distribution of predictand, possesses a Bayesian coherence property, constitutes a random sample of the predictand, and has an acceptable sampling error, which makes it suitable for rational decision making under uncertainty.
NASA Technical Reports Server (NTRS)
Mishchenko, Michael I.; Dlugach, Janna M.; Zakharova, Nadezhda T.
2016-01-01
The numerically exact superposition T-matrix method is used to model far-field electromagnetic scattering by two types of particulate object. Object 1 is a fixed configuration which consists of N identical spherical particles (with N = 200 or 400) quasi-randomly populating a spherical volume V having a median size parameter of 50. Object 2 is a true discrete random medium (DRM) comprising the same number N of particles randomly moving throughout V. The median particle size parameter is fixed at 4. We show that if Object 1 is illuminated by a quasi-monochromatic parallel beam then it generates a typical speckle pattern having no resemblance to the scattering pattern generated by Object 2. However, if Object 1 is illuminated by a parallel polychromatic beam with a 10% bandwidth then it generates a scattering pattern that is largely devoid of speckles and closely reproduces the quasi-monochromatic pattern generated by Object 2. This result serves to illustrate the capacity of the concept of electromagnetic scattering by a DRM to encompass fixed quasi-random particulate samples provided that they are illuminated by polychromatic light.
Kinematic matrix theory and universalities in self-propellers and active swimmers.
Nourhani, Amir; Lammert, Paul E; Borhan, Ali; Crespi, Vincent H
2014-06-01
We describe an efficient and parsimonious matrix-based theory for studying the ensemble behavior of self-propellers and active swimmers, such as nanomotors or motile bacteria, that are typically studied by differential-equation-based Langevin or Fokker-Planck formalisms. The kinematic effects for elementary processes of motion are incorporated into a matrix, called the "kinematrix," from which we immediately obtain correlators and the mean and variance of angular and position variables (and thus effective diffusivity) by simple matrix algebra. The kinematrix formalism enables us to recast the behaviors of a diverse range of self-propellers into a unified form, revealing universalities in their ensemble behavior in terms of new emergent time scales. Active fluctuations and hydrodynamic interactions can be expressed as an additive composition of separate self-propellers.
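The kinematrix machinery itself is beyond a short example, but one of its simplest ensemble-level outputs is easy to verify by simulation: for a 2D self-propeller with speed v and rotational diffusion D_r, the orientation correlator decays as exp(-D_r t) and the long-time effective diffusivity is v²/(2D_r). A numpy sketch under those assumptions (no translational noise):

```python
import numpy as np

rng = np.random.default_rng(0)
v, Dr, dt, steps, walkers = 1.0, 0.5, 1e-2, 20000, 300
theta = rng.uniform(0, 2 * np.pi, walkers)
pos = np.zeros((walkers, 2))
for _ in range(steps):
    pos[:, 0] += v * np.cos(theta) * dt
    pos[:, 1] += v * np.sin(theta) * dt
    theta += np.sqrt(2 * Dr * dt) * rng.standard_normal(walkers)

# Long-time mean-square displacement ~ 4 * D_eff * t in two dimensions.
t = steps * dt
msd = (pos**2).sum(axis=1).mean()
print("D_eff (sim):", msd / (4 * t), " theory v^2/(2 Dr):", v**2 / (2 * Dr))
```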
Kota, V K B; Chavda, N D; Sahu, R
2006-04-01
Interacting many-particle systems with a mean-field one-body part plus a chaos-generating random two-body interaction of strength λ exhibit Poisson to Gaussian orthogonal ensemble and Breit-Wigner (BW) to Gaussian transitions in level fluctuations and strength functions, with transition points marked by λ = λ_c and λ = λ_F, respectively; λ_F > λ_c. For these systems a theory for the matrix elements of one-body transition operators is available, valid in the Gaussian domain with λ > λ_F, in terms of orbital occupation numbers, level densities, and an integral involving a bivariate Gaussian in the initial and final energies. Here we show that, using a bivariate t-distribution, the theory extends from the Gaussian regime down into the BW regime, up to λ = λ_c. This is well tested in numerical calculations for 6 spinless fermions in 12 single-particle states.
Adjoints and Low-rank Covariance Representation
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; Cohn, Stephen E.
2000-01-01
Quantitative measures of the uncertainty of Earth System estimates can be as important as the estimates themselves. Second moments of estimation errors are described by the covariance matrix, whose direct calculation is impractical when the number of degrees of freedom of the system state is large. Ensemble and reduced-state approaches to prediction and data assimilation replace full estimation error covariance matrices by low-rank approximations. The appropriateness of such approximations depends on the spectrum of the full error covariance matrix, whose calculation is also often impractical. Here we examine the situation where the error covariance is a linear transformation of a forcing error covariance. We use operator norms and adjoints to relate the appropriateness of low-rank representations to the conditioning of this transformation. The analysis is used to investigate low-rank representations of the steady-state response to random forcing of an idealized discrete-time dynamical system.
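The core object here, an error covariance that is a linear transformation of a forcing covariance, can be explored numerically: with P = LQL^T, the optimal rank-k representation keeps the leading singular vectors of LQ^{1/2}, and its quality depends on how fast the singular values decay. A toy numpy sketch (random operators standing in for the dynamics):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
L = rng.standard_normal((n, n)) / np.sqrt(n)   # toy linear response operator
Qh = rng.standard_normal((n, n)) / np.sqrt(n)
Q = Qh @ Qh.T                                  # forcing error covariance

# P = L Q L^T = U diag(s^2) U^T with U, s from the SVD of L Q^{1/2}.
U, s, _ = np.linalg.svd(L @ np.linalg.cholesky(Q))
P = L @ Q @ L.T
for k in (5, 20, 50):
    Pk = (U[:, :k] * s[:k]**2) @ U[:, :k].T    # rank-k representation
    print(k, round(np.linalg.norm(P - Pk) / np.linalg.norm(P), 3))
```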
NASA Astrophysics Data System (ADS)
Yamamoto, Takuya; Nishigaki, Shinsuke M.
2018-02-01
We compute individual distributions of low-lying eigenvalues of a chiral random matrix ensemble interpolating symplectic and unitary symmetry classes by the Nyström-type method of evaluating the Fredholm Pfaffian and resolvents of the quaternion kernel. The one-parameter family of these distributions is shown to fit excellently the Dirac spectra of SU(2) lattice gauge theory with a constant U(1) background or dynamically fluctuating U(1) gauge field, which weakly breaks the pseudoreality of the unperturbed SU(2) Dirac operator. The observed linear dependence of the crossover parameter on the strength of the U(1) perturbations leads to a precise determination of the pseudo-scalar decay constant, as well as the chiral condensate in the effective chiral Lagrangian of the AI class.
In silico prediction of splice-altering single nucleotide variants in the human genome.
Jian, Xueqiu; Boerwinkle, Eric; Liu, Xiaoming
2014-12-16
In silico tools have been developed to predict variants that may have an impact on pre-mRNA splicing. The major limitation of the application of these tools to basic research and clinical practice is the difficulty in interpreting the output. Most tools only predict potential splice sites given a DNA sequence without measuring splicing signal changes caused by a variant. Another limitation is the lack of large-scale evaluation studies of these tools. We compared eight in silico tools on 2959 single nucleotide variants within splicing consensus regions (scSNVs) using receiver operating characteristic analysis. The Position Weight Matrix model and MaxEntScan outperformed other methods. Two ensemble learning methods, adaptive boosting and random forests, were used to construct models that take advantage of individual methods. Both models further improved prediction, with outputs of directly interpretable prediction scores. We applied our ensemble scores to scSNVs from the Catalogue of Somatic Mutations in Cancer database. Analysis showed that predicted splice-altering scSNVs are enriched in recurrent scSNVs and known cancer genes. We pre-computed our ensemble scores for all potential scSNVs across the human genome, providing a whole genome level resource for identifying splice-altering scSNVs discovered from large-scale sequencing studies.
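The ensemble-scoring idea, averaging calibrated outputs of boosted and bagged learners, can be sketched with scikit-learn; the synthetic features below merely stand in for the splice-site scores (e.g., Position Weight Matrix and MaxEntScan outputs) used in the paper:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Toy stand-in features for splice-altering vs benign variants.
X, y = make_classification(n_samples=2000, n_features=8, weights=[0.7],
                           random_state=1)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=1)

ada = AdaBoostClassifier(n_estimators=200, random_state=1).fit(Xtr, ytr)
rf = RandomForestClassifier(n_estimators=300, random_state=1).fit(Xtr, ytr)

# Average the two probability outputs into one interpretable ensemble score.
score = 0.5 * (ada.predict_proba(Xte)[:, 1] + rf.predict_proba(Xte)[:, 1])
print("ensemble AUC:", roc_auc_score(yte, score))
```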
Kumar, Sanjeev; Karmeshu
2018-04-01
A theoretical investigation is presented that characterizes the emerging sub-threshold membrane potential and inter-spike interval (ISI) distributions of an ensemble of IF neurons that group together and fire together. The squared-noise intensity σ² of the ensemble of neurons is treated as a random variable to account for the electrophysiological variations across a population of nearly identical neurons. Employing a superstatistical framework, both the ISI distribution and the sub-threshold membrane potential distribution of the neuronal ensemble are obtained in terms of the generalized K-distribution. The resulting distributions exhibit asymptotic behavior akin to the stretched exponential family. Extensive simulations of the underlying SDE with random σ² are carried out. The results are found to be in excellent agreement with the analytical results. The analysis has been extended to cover the case corresponding to independent random fluctuations in drift in addition to random squared-noise intensity. The novelty of the proposed analytical investigation for the ensemble of IF neurons is that it yields closed-form expressions of probability distributions in terms of the generalized K-distribution. Based on a record of spiking activity of thousands of neurons, the findings of the proposed model are validated. The squared-noise intensity σ² of identified neurons from the data is found to follow a gamma distribution. The proposed generalized K-distribution is found to be in excellent agreement with the empirically obtained ISI distribution of the neuronal ensemble.
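The superstatistical construction can be mimicked in a few lines: draw σ² per neuron from a gamma distribution, then run a drifted random walk to threshold (Euler-Maruyama). The parameters below are arbitrary choices for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, mu, v_th = 1e-4, 15.0, 1.0          # time step, drift, firing threshold
isis = []
for _ in range(500):                    # ensemble of nearly identical neurons
    sigma2 = rng.gamma(shape=3.0, scale=0.02)   # per-neuron noise intensity
    v, t = 0.0, 0.0
    while v < v_th:                     # integrate-and-fire first passage
        v += mu * dt + np.sqrt(sigma2 * dt) * rng.standard_normal()
        t += dt
    isis.append(t)
# Mixing over sigma^2 turns the single-neuron inverse-Gaussian ISI law into a
# heavier-tailed ensemble distribution (the generalized K form in the paper).
print(np.mean(isis), np.std(isis))
```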
Creating ensembles of decision trees through sampling
Kamath, Chandrika; Cantu-Paz, Erick
2005-08-30
A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.
Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy
Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing
2016-01-01
Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and offer potential therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651
Multivariate localization methods for ensemble Kalman filtering
NASA Astrophysics Data System (ADS)
Roh, S.; Jun, M.; Szunyogh, I.; Genton, M. G.
2015-05-01
In ensemble Kalman filtering (EnKF), the small number of ensemble members that is feasible to use in a practical data assimilation application leads to sampling variability of the estimates of the background error covariances. The standard approach to reducing the effects of this sampling variability, which has also been found to be highly efficient in improving the performance of EnKF, is the localization of the estimates of the covariances. One family of localization techniques is based on taking the Schur (entry-wise) product of the ensemble-based sample covariance matrix and a correlation matrix whose entries are obtained by the discretization of a distance-dependent correlation function. While the proper definition of the localization function for a single state variable has been extensively investigated, a rigorous definition of the localization function for multiple state variables has been seldom considered. This paper introduces two strategies for the construction of localization functions for multiple state variables. The proposed localization functions are tested by assimilating simulated observations into the bivariate Lorenz 95 model.
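A minimal sketch of the single-variable building block, Schur-product localization with the widely used Gaspari-Cohn correlation function (our choice of function; the paper's multivariate constructions extend this idea):

```python
import numpy as np

def gaspari_cohn(d, c):
    """Gaspari-Cohn (1999) compactly supported correlation; zero for d >= 2c."""
    r = np.abs(d) / c
    f = np.zeros_like(r)
    m = r <= 1.0
    f[m] = 1 - 5/3*r[m]**2 + 5/8*r[m]**3 + 1/2*r[m]**4 - 1/4*r[m]**5
    m = (r > 1.0) & (r < 2.0)
    f[m] = (4 - 5*r[m] + 5/3*r[m]**2 + 5/8*r[m]**3
            - 1/2*r[m]**4 + 1/12*r[m]**5 - 2/(3*r[m]))
    return f

rng = np.random.default_rng(0)
n, m = 40, 10                               # state size, ensemble size
X = rng.standard_normal((n, m))
Xp = X - X.mean(axis=1, keepdims=True)
P = Xp @ Xp.T / (m - 1)                     # noisy sample covariance
d = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
P_loc = P * gaspari_cohn(d, c=5.0)          # entry-wise (Schur) product
```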
Inoue, R; Yonehara, T; Miyamoto, Y; Koashi, M; Kozuma, M
2009-09-11
Three-dimensional entanglement of orbital angular momentum states of an atomic qutrit and a single photon qutrit has been observed. Their full state was reconstructed using quantum state tomography. The fidelity to the maximally entangled state of Schmidt rank 3 exceeds the threshold 2/3. This result confirms that the density matrix cannot be decomposed into an ensemble of pure states of Schmidt rank 1 or 2. That is, the Schmidt number of the density matrix must be equal to or greater than 3.
Distribution law of the Dirac eigenmodes in QCD
NASA Astrophysics Data System (ADS)
Catillo, Marco; Glozman, Leonid Ya.
2018-04-01
The near-zero modes of the Dirac operator are connected to spontaneous breaking of chiral symmetry in QCD (SBCS) via the Banks-Casher relation. At the same time, the distribution of the near-zero modes is well described by the Random Matrix Theory (RMT) with the Gaussian Unitary Ensemble (GUE). It has thus become standard lore that randomness, as observed through distributions of the near-zero modes of the Dirac operator, is a consequence of SBCS. The higher-lying modes of the Dirac operator are not affected by SBCS and are sensitive to confinement physics and the related SU(2)_CS and SU(2N_F) symmetries. We study the distribution of the near-zero and higher-lying eigenmodes of the overlap Dirac operator within N_F = 2 dynamical simulations. We find that both the distributions of the near-zero and higher-lying modes are perfectly described by the GUE of RMT. This means that randomness, while consistent with SBCS, is not a consequence of SBCS and is linked to the confining chromo-electric field.
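The GUE comparison reported here is straightforward to reproduce for pure random matrices: sample GUE spectra, take bulk nearest-neighbor spacings, and compare with the Wigner surmise P(s) = (32/π²)s²e^(−4s²/π). A numpy sketch with a crude mean-spacing unfolding (lattice Dirac spectra would of course replace the synthetic matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials, spacings = 200, 100, []
for _ in range(trials):
    A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    ev = np.linalg.eigvalsh((A + A.conj().T) / 2)   # GUE spectrum
    bulk = ev[N // 4: 3 * N // 4]                   # avoid the spectral edges
    s = np.diff(bulk)
    spacings.extend(s / s.mean())                   # crude local unfolding

edges = np.linspace(0, 3, 51)
hist, _ = np.histogram(spacings, bins=edges, density=True)
c = (edges[:-1] + edges[1:]) / 2
wigner = 32 / np.pi**2 * c**2 * np.exp(-4 * c**2 / np.pi)
print("max |empirical - surmise|:", np.abs(hist - wigner).max())
```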
Scaled Particle Theory for Multicomponent Hard Sphere Fluids Confined in Random Porous Media.
Chen, W; Zhao, S L; Holovko, M; Chen, X S; Dong, W
2016-06-23
The formulation of scaled particle theory (SPT) is presented for a quite general model of fluids confined in random porous media, i.e., a multicomponent hard sphere (HS) fluid in a multicomponent hard sphere or a multicomponent overlapping hard sphere (OHS) matrix. The analytical expressions for pressure, Helmholtz free energy, and chemical potential are derived. The thermodynamic consistency of the proposed theory is established. Moreover, we show that there is an isomorphism between the SPT for a multicomponent system and that for a one-component system. Results from grand canonical ensemble Monte Carlo simulations are also presented for a binary HS mixture in a one-component HS or a one-component OHS matrix. The accuracy of various variants derived from the basic SPT formulation is appraised against the simulation results. Scaled particle theory, initially formulated for a bulk HS fluid, has not only provided an analytical tool for calculating thermodynamic properties of the HS fluid but also helped to gain very useful insight for elaborating other theoretical approaches such as the fundamental measure theory (FMT). We expect that the general SPT for multicomponent systems developed in this work can contribute to the study of confined fluids in a similar way.
NASA Astrophysics Data System (ADS)
Grayver, Alexander V.; Kuvshinov, Alexey V.
2016-05-01
This paper presents a methodology to sample the equivalence domain (ED) in nonlinear partial differential equation (PDE)-constrained inverse problems. For this purpose, we first applied the state-of-the-art stochastic optimization algorithm called Covariance Matrix Adaptation Evolution Strategy (CMAES) to identify low-misfit regions of the model space. These regions were then randomly sampled to create an ensemble of equivalent models and quantify uncertainty. CMAES is aimed at exploring model space globally and is robust on very ill-conditioned problems. We show that the number of iterations required to converge grows at a moderate rate with respect to the number of unknowns and the algorithm is embarrassingly parallel. We formulated the problem by using the generalized Gaussian distribution. This enabled us to seamlessly use arbitrary norms for residual and regularization terms. We show that various regularization norms facilitate studying different classes of equivalent solutions. We further show how the performance of the standard Metropolis-Hastings Markov chain Monte Carlo algorithm can be substantially improved by using information CMAES provides. This methodology was tested by using individual and joint inversions of magnetotelluric, controlled-source electromagnetic (EM) and global EM induction data.
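The two-stage strategy (CMA-ES to find low-misfit regions, then sampling) can be outlined with the third-party `cma` package; the quadratic `misfit` below is a toy stand-in for the PDE-constrained forward problem, and the acceptance threshold is arbitrary:

```python
import numpy as np
import cma  # third-party package: pip install cma

# Toy quadratic misfit standing in for the expensive forward modelling.
def misfit(x):
    return float(np.sum((np.asarray(x) - 1.0) ** 2))

es = cma.CMAEvolutionStrategy(x0=[0.0] * 20, sigma0=0.5)
while not es.stop():
    xs = es.ask()                           # sample candidate models
    es.tell(xs, [misfit(x) for x in xs])    # rank by misfit, adapt covariance

# Stage two (sketch): draw candidates near the optimum and keep the
# low-misfit ones as an ensemble of (approximately) equivalent models.
ensemble = [x for x in es.ask(200) if misfit(x) < 2.0]
print(len(ensemble), "equivalent models kept")
```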
Beyond the excised ensemble: modelling elliptic curve L-functions with random matrices
NASA Astrophysics Data System (ADS)
Cooper, I. A.; Morris, Patrick W.; Snaith, N. C.
2016-02-01
The ‘excised ensemble’, a random matrix model for the zeros of quadratic twist families of elliptic curve L-functions, was introduced by Dueñez et al (2012 J. Phys. A: Math. Theor. 45 115207) The excised model is motivated by a formula for central values of these L-functions in a paper by Kohnen and Zagier (1981 Invent. Math. 64 175-98). This formula indicates that for a finite set of L-functions from a family of quadratic twists, the central values are all either zero or are greater than some positive cutoff. The excised model imposes this same condition on the central values of characteristic polynomials of matrices from {SO}(2N). Strangely, the cutoff on the characteristic polynomials that results in a convincing model for the L-function zeros is significantly smaller than that which we would obtain by naively transferring Kohnen and Zagier’s cutoff to the {SO}(2N) ensemble. In this current paper we investigate a modification to the excised model. It lacks the simplicity of the original excised ensemble, but it serves to explain the reason for the unexpectedly low cutoff in the original excised model. Additionally, the distribution of central L-values is ‘choppier’ than the distribution of characteristic polynomials, in the sense that it is a superposition of a series of peaks: the characteristic polynomial distribution is a smooth approximation to this. The excised model did not attempt to incorporate these successive peaks, only the initial cutoff. Here we experiment with including some of the structure of the L-value distribution. The conclusion is that a critical feature of a good model is to associate the correct mass to the first peak of the L-value distribution.
Analysis of Commercially Available Helmet and Boot Options for the Joint Firefighter Integrated Response Ensemble
2012-02-01
Interim Technical Report AFRL-RX-TY-TR-2012-0022, covering 01-SEP-2010 to 31-JAN-2011. A requirements correlation matrix was generated and sent to industry detailing objective and threshold measurements for both the helmet and the boot.
NASA Astrophysics Data System (ADS)
Durner, Maximilian; Márton, Zoltán; Hillenbrand, Ulrich; Ali, Haider; Kleinsteuber, Martin
2017-03-01
In this work, a new ensemble method for the task of category recognition in different environments is presented. The focus is on service robotic perception in an open environment, where the robot's task is to recognize previously unseen objects of predefined categories, based on training on a public dataset. We propose an ensemble learning approach to be able to flexibly combine complementary sources of information (different state-of-the-art descriptors computed on color and depth images), based on a Markov Random Field (MRF). By exploiting its specific characteristics, the MRF ensemble method can also be executed as a Dynamic Classifier Selection (DCS) system. In the experiments, the committee- and topology-dependent performance boost of our ensemble is shown. Despite reduced computational costs and using less information, our strategy performs on the same level as common ensemble approaches. Finally, the impact of large differences between datasets is analyzed.
An analytical approach to gravitational lensing by an ensemble of axisymmetric lenses
NASA Technical Reports Server (NTRS)
Lee, Man Hoi; Spergel, David N.
1990-01-01
The problem of gravitational lensing by an ensemble of identical axisymmetric lenses randomly distributed on a single lens plane is considered and a formal expression is derived for the joint probability density of finding shear and convergence at a random point on the plane. The amplification probability for a source can be accurately estimated from the distribution in shear and convergence. This method is applied to two cases: lensing by an ensemble of point masses and by an ensemble of objects with Gaussian surface mass density. There is no convergence for point masses whereas shear is negligible for wide Gaussian lenses.
Ensemble Pruning for Glaucoma Detection in an Unbalanced Data Set.
Adler, Werner; Gefeller, Olaf; Gul, Asma; Horn, Folkert K; Khan, Zardad; Lausen, Berthold
2016-12-07
Random forests are successful classifier ensemble methods consisting of typically 100 to 1000 classification trees. Ensemble pruning techniques reduce the computational cost, especially the memory demand, of random forests by reducing the number of trees without relevant loss of performance or even with increased performance of the sub-ensemble. The application to the problem of an early detection of glaucoma, a severe eye disease with low prevalence, based on topographical measurements of the eye background faces specific challenges. We examine the performance of ensemble pruning strategies for glaucoma detection in an unbalanced data situation. The data set consists of 102 topographical features of the eye background of 254 healthy controls and 55 glaucoma patients. We compare the area under the receiver operating characteristic curve (AUC), and the Brier score on the total data set, in the majority class, and in the minority class of pruned random forest ensembles obtained with strategies based on the prediction accuracy of greedily grown sub-ensembles, the uncertainty weighted accuracy, and the similarity between single trees. To validate the findings and to examine the influence of the prevalence of glaucoma in the data set, we additionally perform a simulation study with lower prevalences of glaucoma. In glaucoma classification all three pruning strategies lead to improved AUC and smaller Brier scores on the total data set with sub-ensembles as small as 30 to 80 trees compared to the classification results obtained with the full ensemble consisting of 1000 trees. In the simulation study, we were able to show that the prevalence of glaucoma is a critical factor and lower prevalence decreases the performance of our pruning strategies. The memory demand for glaucoma classification in an unbalanced data situation based on random forests could effectively be reduced by the application of pruning strategies without loss of performance in a population with increased risk of glaucoma.
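One of the pruning criteria examined, greedy growth of a sub-ensemble by validation performance, is easy to sketch with scikit-learn; the data, the AUC criterion on a held-out split, and the 30-tree budget are our illustrative choices:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Unbalanced toy data standing in for the glaucoma set (minority ~15%).
X, y = make_classification(n_samples=600, weights=[0.85], random_state=2)
Xtr, Xval, ytr, yval = train_test_split(X, y, test_size=0.4, stratify=y,
                                        random_state=2)
rf = RandomForestClassifier(n_estimators=200, random_state=2).fit(Xtr, ytr)

# Greedy forward selection: repeatedly add the tree that most improves the
# sub-ensemble's validation AUC.
probs = np.array([t.predict_proba(Xval)[:, 1] for t in rf.estimators_])
chosen, current = [], np.zeros(len(yval))
for _ in range(30):
    best, best_auc = None, -1.0
    for i in set(range(len(probs))) - set(chosen):
        auc = roc_auc_score(yval, (current + probs[i]) / (len(chosen) + 1))
        if auc > best_auc:
            best, best_auc = i, auc
    chosen.append(best)
    current += probs[best]
print("pruned AUC:", best_auc, "with", len(chosen), "trees")
```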
Energetic Consistency and Coupling of the Mean and Covariance Dynamics
NASA Technical Reports Server (NTRS)
Cohn, Stephen E.
2008-01-01
The dynamical state of the ocean and atmosphere is taken to be a large dimensional random vector in a range of large-scale computational applications, including data assimilation, ensemble prediction, sensitivity analysis, and predictability studies. In each of these applications, numerical evolution of the covariance matrix of the random state plays a central role, because this matrix is used to quantify uncertainty in the state of the dynamical system. Since atmospheric and ocean dynamics are nonlinear, there is no closed evolution equation for the covariance matrix, nor for the mean state. Therefore approximate evolution equations must be used. This article studies theoretical properties of the evolution equations for the mean state and covariance matrix that arise in the second-moment closure approximation (third- and higher-order moment discard). This approximation was introduced by EPSTEIN [1969] in an early effort to introduce a stochastic element into deterministic weather forecasting, and was studied further by FLEMING [1971a,b], EPSTEIN and PITCHER [1972], and PITCHER [1977], also in the context of atmospheric predictability. It has since fallen into disuse, with a simpler one being used in current large-scale applications. The theoretical results of this article make a case that this approximation should be reconsidered for use in large-scale applications, however, because the second moment closure equations possess a property of energetic consistency that the approximate equations now in common use do not possess. A number of properties of solutions of the second-moment closure equations that result from this energetic consistency will be established.
Abuassba, Adnan O M; Zhang, Dezheng; Luo, Xiong; Shaheryar, Ahmad; Ali, Hazrat
2017-01-01
Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets.
Improving ensemble decision tree performance using Adaboost and Bagging
NASA Astrophysics Data System (ADS)
Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie
2015-12-01
Ensemble classifier systems are considered among the most promising for medical data classification, and the performance of decision tree classifiers can be increased by ensemble methods, which are proven to be better than single classifiers. However, in an ensemble setting the performance depends on the selection of suitable base classifiers. This research employed two prominent ensemble methods, namely Adaboost and Bagging, with base classifiers such as Random Forest, Random Tree, J48, J48graft and Logistic Model Trees (LMT) that were selected independently. The empirical study shows that the performance varies when different base classifiers are selected, and in some cases overfitting was also noted. The evidence shows that ensemble decision tree classifiers using Adaboost and Bagging improve the performance of selected medical data sets.
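A sketch of this kind of comparison in scikit-learn is shown below (it assumes scikit-learn >= 1.2, where the base-learner argument is named `estimator`; the dataset and tree depths are illustrative choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Compare Bagging and AdaBoost over shallow and deeper base trees on a
# medical dataset, cross-validated accuracy as the yardstick.
X, y = load_breast_cancer(return_X_y=True)
for depth in (1, 5):
    base = DecisionTreeClassifier(max_depth=depth, random_state=0)
    for Ens in (BaggingClassifier, AdaBoostClassifier):
        clf = Ens(estimator=base, n_estimators=50, random_state=0)
        acc = cross_val_score(clf, X, y, cv=5).mean()
        print(f"{Ens.__name__:20s} depth={depth}  acc={acc:.3f}")
```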
A Statistical Test of Walrasian Equilibrium by Means of Complex Networks Theory
NASA Astrophysics Data System (ADS)
Bargigli, Leonardo; Viaggiu, Stefano; Lionetto, Andrea
2016-10-01
We represent an exchange economy in terms of statistical ensembles for complex networks by introducing the concept of market configuration. This is defined as a sequence of nonnegative discrete random variables {w_{ij}} describing the flow of a given commodity from agent i to agent j. This sequence can be arranged in a nonnegative matrix W which we can regard as the representation of a weighted and directed network or digraph G. Our main result consists in showing that general equilibrium theory imposes highly restrictive conditions upon market configurations, which are in most cases not fulfilled by real markets. An explicit example with reference to the e-MID interbank credit market is provided.
Ranking and combining multiple predictors without labeled data
Parisi, Fabio; Strino, Francesco; Nadler, Boaz; Kluger, Yuval
2014-01-01
In a broad range of classification and decision-making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard supervised setting, where each classifier’s accuracy can be assessed using available labeled data, and raises two questions: Given only the predictions of several classifiers over a large set of unlabeled test data, is it possible to (i) reliably rank them and (ii) construct a metaclassifier more accurate than most classifiers in the ensemble? Here we present a spectral approach to address these questions. First, assuming conditional independence between classifiers, we show that the off-diagonal entries of their covariance matrix correspond to a rank-one matrix. Moreover, the classifiers can be ranked using the leading eigenvector of this covariance matrix, because its entries are proportional to their balanced accuracies. Second, via a linear approximation to the maximum likelihood estimator, we derive the Spectral Meta-Learner (SML), an unsupervised ensemble classifier whose weights are equal to these eigenvector entries. On both simulated and real data, SML typically achieves a higher accuracy than most classifiers in the ensemble and can provide a better starting point than majority voting for estimating the maximum likelihood solution. Furthermore, SML is robust to the presence of small malicious groups of classifiers designed to veer the ensemble prediction away from the (unknown) ground truth. PMID:24474744
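The SML recipe can be condensed to a few lines: form the sample covariance of the classifiers' ±1 predictions, extract the leading eigenvector while ignoring the unreliable diagonal, and use its entries as voting weights. A numpy sketch on simulated conditionally independent classifiers:

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.choice([-1, 1], size=5000)
accs = rng.uniform(0.55, 0.9, size=12)            # hidden accuracies
flip = rng.random((12, 5000)) > accs[:, None]
f = np.where(flip, -truth, truth)                 # f[i, j]: classifier i on item j

Q = np.cov(f)                                     # off-diagonal ~ rank one
np.fill_diagonal(Q, np.nan)                       # diagonal is uninformative
v = rng.standard_normal(12)
for _ in range(200):                              # power iteration, off-diag only
    w = np.nansum(Q * v, axis=1)
    v = w / np.linalg.norm(w)
v = v * np.sign(v.sum())                          # fix the overall sign

print("rank by eigenvector:  ", np.argsort(-v))
print("rank by true accuracy:", np.argsort(-accs))
sml = np.sign(v @ f)                              # SML meta-classifier
print("SML accuracy:", (sml == truth).mean())
```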
NASA Astrophysics Data System (ADS)
Ebrahimi, R.; Zohren, S.
2018-03-01
In this paper we extend the orthogonal polynomials approach for extreme value calculations of Hermitian random matrices, developed by Nadal and Majumdar (J. Stat. Mech. P04001 arXiv:1102.0738), to normal random matrices and 2D Coulomb gases in general. Firstly, we show that this approach provides an alternative derivation of results in the literature. More precisely, we show convergence of the rescaled eigenvalue with largest modulus of a normal Gaussian ensemble to a Gumbel distribution, as well as universality for an arbitrary radially symmetric potential. Secondly, it is shown that this approach can be generalised to obtain convergence of the eigenvalue with smallest modulus and its universality for ring distributions. Most interestingly, the techniques presented here are used to compute all slowly varying finite-N corrections of the above distributions, which is important for practical applications, given the slow convergence. Another interesting aspect of this work is the fact that we can use standard techniques from Hermitian random matrices to obtain the extreme value statistics of non-Hermitian random matrices, resembling the large N expansion used in the context of the double scaling limit of Hermitian matrix models in string theory.
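An empirical flavor of the Gumbel limit, and of its slow convergence, can be had by sampling the largest eigenvalue modulus of Ginibre matrices and checking a shape statistic; the skewness comparison below is our own crude check, not the authors' rescaling:

```python
import numpy as np

# Largest eigenvalue modulus of complex Ginibre matrices scaled to the unit
# disk. The limit law is Gumbel; convergence in N is slow, so we only check
# the sample skewness against the Gumbel value ~1.14.
rng = np.random.default_rng(1)
N, trials = 100, 500
r = np.empty(trials)
for t in range(trials):
    G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    r[t] = np.abs(np.linalg.eigvals(G / np.sqrt(2 * N))).max()
x = (r - r.mean()) / r.std()
print(f"sample skewness {np.mean(x**3):.2f} vs Gumbel 1.14")
```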
Multivariate localization methods for ensemble Kalman filtering
NASA Astrophysics Data System (ADS)
Roh, S.; Jun, M.; Szunyogh, I.; Genton, M. G.
2015-12-01
In ensemble Kalman filtering (EnKF), the small number of ensemble members that is feasible to use in a practical data assimilation application leads to sampling variability of the estimates of the background error covariances. The standard approach to reducing the effects of this sampling variability, which has also been found to be highly efficient in improving the performance of EnKF, is the localization of the estimates of the covariances. One family of localization techniques is based on taking the Schur (element-wise) product of the ensemble-based sample covariance matrix and a correlation matrix whose entries are obtained by the discretization of a distance-dependent correlation function. While the proper definition of the localization function for a single state variable has been extensively investigated, a rigorous definition of the localization function for multiple state variables that exist at the same locations has seldom been considered. This paper introduces two strategies for the construction of localization functions for multiple state variables. The proposed localization functions are tested in experiments that assimilate simulated observations into the bivariate Lorenz 95 model.
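A minimal sketch of the Schur-product localization for a single state variable, using the widely used Gaspari-Cohn correlation function; the grid, ensemble size and localization radius below are illustrative assumptions:

```python
import numpy as np

def gaspari_cohn(r):
    """Gaspari-Cohn (1999) 5th-order compactly supported correlation
    function; r = distance over the localization half-width, support [0, 2)."""
    r = np.abs(np.asarray(r, dtype=float))
    f = np.zeros_like(r)
    m1 = r < 1.0
    m2 = (r >= 1.0) & (r < 2.0)
    x = r[m1]
    f[m1] = 1 - (5/3)*x**2 + (5/8)*x**3 + (1/2)*x**4 - (1/4)*x**5
    x = r[m2]
    f[m2] = 4 - 5*x + (5/3)*x**2 + (5/8)*x**3 - (1/2)*x**4 + (1/12)*x**5 - 2/(3*x)
    return f

# Schur (element-wise) localization of an ensemble covariance estimate
# on a 1-D cyclic domain of n grid points.
n, n_ens, c = 40, 10, 5.0
rng = np.random.default_rng(2)
X = rng.standard_normal((n, n_ens))            # ensemble members
P_ens = np.cov(X)                              # noisy sample covariance
d = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
d = np.minimum(d, n - d)                       # cyclic distance
P_loc = gaspari_cohn(d / c) * P_ens            # tapered covariance
print("entries zeroed by the taper:", (P_loc == 0).mean())
```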
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aleksandrov, I. A., E-mail: Aleksandrov@isp.nsc.ru; Mansurov, V. G.; Zhuravlev, K. S.
2016-08-15
The carrier recombination dynamics in an ensemble of GaN/AlN quantum dots is studied. The model proposed for describing this dynamics takes into account the transition of carriers between quantum dots and defects in a matrix. Comparison of the experimental and calculated photoluminescence decay curves shows that the interaction between quantum dots and defects slows down photoluminescence decay in the ensemble of GaN/AlN quantum dots.
Automatic Estimation of Osteoporotic Fracture Cases by Using Ensemble Learning Approaches.
Kilic, Niyazi; Hosgormez, Erkan
2016-03-01
Ensemble learning methods are among the most powerful tools for pattern classification problems. In this paper, the effects of ensemble learning methods and some physical bone densitometry parameters on osteoporotic fracture detection were investigated. Six feature set models including different physical parameters were constructed and fed into the ensemble classifiers as input features. Bagging, gradient boosting and random subspace (RSM) were used as ensemble learning techniques, with instance-based learning (IBk) and random forest (RF) classifiers applied to the six feature set models. The patients were classified into three groups: osteoporosis, osteopenia and control (healthy). Total classification accuracy and F-measure were used to evaluate the diagnostic performance of the proposed ensemble classification system. The classification accuracy reached 98.85% for model 6 (five BMD + five T-score values) with the RSM-RF classifier. The findings of this paper suggest that patients can be warned before a bone fracture occurs by examining a few physical parameters that are easily measured without invasive procedures.
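A sketch of the RSM-RF combination in scikit-learn terms; the data below are placeholders, not the study's BMD/T-score measurements, and the `estimator` keyword assumes scikit-learn 1.2 or later:

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 10))          # placeholder: five BMD + five T-score values
y = rng.integers(0, 3, size=200)            # placeholder: osteoporosis/osteopenia/control

# Random subspace method (RSM): each base learner sees a random subset of the
# features (max_features < 1.0) with no bootstrapping of the samples.
rsm_rf = BaggingClassifier(
    estimator=RandomForestClassifier(n_estimators=50),
    n_estimators=10, max_features=0.6, bootstrap=False,
)
print(cross_val_score(rsm_rf, X, y, cv=5).mean())
```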
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improves physical activity recognition accuracy compared to single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one-subject-out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition accuracy; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
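Of the three fusion rules, weighted majority voting is the simplest to sketch; the votes and weights below are placeholders rather than the study's data:

```python
import numpy as np

def weighted_majority_vote(votes, weights, n_classes):
    """votes: (m, n) integer class predictions from m classifiers;
    weights: (m,) per-classifier weights, e.g. validation F1 scores."""
    n = votes.shape[1]
    scores = np.zeros((n_classes, n))
    for w, v in zip(weights, votes):
        scores[v, np.arange(n)] += w       # add the classifier's weight to its chosen class
    return scores.argmax(axis=0)

# Fuse decision tree, kNN, SVM and neural-net outputs (placeholder votes).
rng = np.random.default_rng(4)
votes = rng.integers(0, 4, size=(4, 12))   # 4 classifiers, 12 windows, 4 activities
weights = np.array([0.71, 0.74, 0.80, 0.77])
print(weighted_majority_vote(votes, weights, n_classes=4))
```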
Random matrices with external source and the asymptotic behaviour of multiple orthogonal polynomials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aptekarev, Alexander I; Lysov, Vladimir G; Tulyakov, Dmitrii N
2011-02-28
Ensembles of random Hermitian matrices with a distribution measure defined by an anharmonic potential perturbed by an external source are considered. The limiting characteristics of the eigenvalue distribution of the matrices in these ensembles are related to the asymptotic behaviour of a certain system of multiple orthogonal polynomials. Strong asymptotic formulae are derived for this system. As a consequence, for matrices in this ensemble the limit mean eigenvalue density is found, and a variational principle is proposed to characterize this density. Bibliography: 35 titles.
NASA Astrophysics Data System (ADS)
Wirtz, Tim; Kieburg, Mario; Guhr, Thomas
2017-06-01
The correlated Wishart model provides the standard benchmark when analyzing time series of any kind. Unfortunately, the real case, which is the most relevant one in applications, poses serious challenges for analytical calculations. Often these challenges are due to square root singularities which cannot be handled using common random matrix techniques. We present a new way to tackle this issue. Using supersymmetry, we carry out an analytical study which we support by numerical simulations. For large but finite matrix dimensions, we show that statistical properties of the fully correlated real Wishart model generically approach those of a correlated real Wishart model with doubled matrix dimensions and doubly degenerate empirical eigenvalues. This holds for the local and global spectral statistics. With Monte Carlo simulations we show that this is even approximately true for small matrix dimensions. We explicitly investigate the k-point correlation function as well as the distribution of the largest eigenvalue, for which we find a surprisingly compact formula in the doubly degenerate case. Moreover, we show that on the local scale the k-point correlation function exhibits the sine and the Airy kernel in the bulk and at the soft edges, respectively. We also address the positions and the fluctuations of the possible outliers in the data.
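Sampling the correlated real Wishart model is straightforward and is a natural companion to the Monte Carlo checks mentioned above; the AR(1) correlation matrix here is an illustrative assumption:

```python
import numpy as np

# Sample correlated real Wishart matrices W = C^{1/2} A A^T C^{1/2} / T and
# collect their eigenvalues, e.g. for an empirical spectral density.
rng = np.random.default_rng(5)
p, T, n_samp = 50, 200, 200
C = 0.3 ** np.abs(np.arange(p)[:, None] - np.arange(p)[None, :])  # AR(1) correlations
L = np.linalg.cholesky(C)
eigs = []
for _ in range(n_samp):
    A = rng.standard_normal((p, T))
    W = L @ A @ A.T @ L.T / T
    eigs.append(np.linalg.eigvalsh(W))
hist, edges = np.histogram(np.concatenate(eigs), bins=60, density=True)
print("mean eigenvalue:", np.concatenate(eigs).mean())   # ~ trace(C)/p = 1
```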
NASA Astrophysics Data System (ADS)
Li, Hui; Hong, Lu-Yao; Zhou, Qing; Yu, Hai-Jie
2015-08-01
The business failure of numerous companies results in financial crises. The high social costs associated with such crises have made people search for effective tools for business risk prediction, among which support vector machine is very effective. Several modelling means, including single-technique modelling, hybrid modelling, and ensemble modelling, have been suggested for forecasting business risk with support vector machine. However, the existing literature seldom focuses on a general modelling frame for business risk prediction, and seldom investigates performance differences among different modelling means. We reviewed research on forecasting business risk with support vector machine, proposed the general assisted prediction modelling frame with hybridisation and ensemble (APMF-WHAE), and finally investigated the use of principal components analysis, support vector machine, random sampling, and group decision under the general frame in forecasting business risk. Under the APMF-WHAE frame with support vector machine as the base predictive model, four specific predictive models were produced: pure support vector machine; a hybrid support vector machine involving principal components analysis; a support vector machine ensemble involving random sampling and group decision; and an ensemble of hybrid support vector machines using group decision to integrate various hybrid support vector machines built on variables produced by principal components analysis and samples drawn by random sampling. The experimental results indicate that the hybrid support vector machine and the ensemble of hybrid support vector machines produced better performance than the pure support vector machine and the support vector machine ensemble.
A numerical approximation to the elastic properties of sphere-reinforced composites
NASA Astrophysics Data System (ADS)
Segurado, J.; Llorca, J.
2002-10-01
Three-dimensional cubic unit cells containing 30 non-overlapping identical spheres randomly distributed were generated using a new, modified random sequential adsorption algorithm suitable for particle volume fractions of up to 50%. The elastic constants of the ensemble of spheres embedded in a continuous and isotropic elastic matrix were computed through the finite element analysis of the three-dimensional periodic unit cells, whose size was chosen as a compromise between the minimum size required to obtain accurate results in the statistical sense and the maximum one imposed by the computational cost. Three types of materials were studied: rigid spheres and spherical voids in an elastic matrix, and a typical composite made up of glass spheres in an epoxy resin. The moduli obtained for different unit cells showed very little scatter, and the average values obtained from the analysis of four unit cells could be considered very close to the "exact" solution to the problem, in agreement with the results of Drugan and Willis (J. Mech. Phys. Solids 44 (1996) 497) referring to the size of the representative volume element for elastic composites. They were used to assess the accuracy of three classical analytical models: the Mori-Tanaka mean-field analysis, the generalized self-consistent method, and Torquato's third-order approximation.
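For reference, a plain random sequential adsorption loop in a periodic cell looks as follows; this is the textbook RSA scheme, not the authors' modified algorithm (which adds further moves to reach volume fractions of up to 50%):

```python
import numpy as np

def rsa_spheres(n_spheres, radius, max_tries=100000, rng=None):
    """Random sequential adsorption of non-overlapping identical spheres
    in a periodic unit cube: propose random centers, reject overlaps."""
    rng = rng or np.random.default_rng()
    centers = []
    for _ in range(max_tries):
        if len(centers) == n_spheres:
            break
        c = rng.random(3)
        d = np.array(centers) - c if centers else np.empty((0, 3))
        d -= np.rint(d)                    # minimum-image convention (periodic cell)
        if centers and (np.linalg.norm(d, axis=1) < 2 * radius).any():
            continue                       # overlap: reject and retry
        centers.append(c)
    return np.array(centers)

centers = rsa_spheres(30, radius=0.12)     # 30 spheres ~ 22% volume fraction
print(len(centers))
```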
The emergence of collective phenomena in systems with random interactions
NASA Astrophysics Data System (ADS)
Abramkina, Volha
Emergent phenomena are one of the most profound topics in modern science, addressing the ways that collectivities and complex patterns appear due to the multiplicity of components and simple interactions. Ensembles of random Hamiltonians allow one to explore emergent phenomena in a statistical way. In this work we adopt a shell model approach with a two-body interaction Hamiltonian. The sets of two-body interaction strengths are selected at random, resulting in the two-body random ensemble (TBRE). Symmetries such as angular momentum, isospin, and parity, entangled with complex many-body dynamics, result in surprising order in the spectrum of low-lying excitations. The statistical patterns exhibited in the TBRE are remarkably similar to those observed in real nuclei. Signs of almost every collective feature seen in nuclei, namely pairing superconductivity, deformation, and vibration, have been observed in random ensembles [3, 4, 5, 6]. In what follows, a systematic investigation of nuclear shape collectivities in random ensembles is conducted. The development of the mean field, its geometry, multipole collectivities and their dependence on the underlying two-body interaction are explored. Apart from the role of static symmetries such as the SU(2) angular momentum and isospin groups, the emergence of dynamical symmetries, including the seniority SU(2), rotational symmetry, as well as the Elliott SU(3), is shown to be an important precursor for the existence of geometric collectivities.
Finite plateau in spectral gap of polychromatic constrained random networks
NASA Astrophysics Data System (ADS)
Avetisov, V.; Gorsky, A.; Nechaev, S.; Valba, O.
2017-12-01
We consider critical behavior in the ensemble of polychromatic Erdős-Rényi networks and regular random graphs, where network vertices are painted in different colors. Links can be randomly removed from and added to the network subject to the condition of vertex degree conservation. In these constrained graphs we run the Metropolis procedure, which favors connected unicolor triads of nodes. Changing the chemical potential μ of such triads, we find, over a wide region of μ, the formation of a finite plateau in the number of intercolor links, which exactly matches a finite plateau in the network algebraic connectivity (the value of the first nonvanishing eigenvalue of the Laplacian matrix, λ2). We claim that at the plateau the spontaneously broken Z2 symmetry is restored by the mechanism of mode collectivization in clusters of different colors. The phenomenon of finite plateau formation also holds for polychromatic networks with M ≥ 2 colors. The behavior of polychromatic networks is analyzed via the spectral properties of their adjacency and Laplacian matrices.
The random coding bound is tight for the average code.
NASA Technical Reports Server (NTRS)
Gallager, R. G.
1973-01-01
The random coding bound of information theory provides a well-known upper bound to the probability of decoding error for the best code of a given rate and block length. The bound is constructed by upper-bounding the average error probability over an ensemble of codes. The bound is known to give the correct exponential dependence of error probability on block length for transmission rates above the critical rate, but it gives an incorrect exponential dependence at rates below a second lower critical rate. Here we derive an asymptotic expression for the average error probability over the ensemble of codes used in the random coding bound. The result shows that the weakness of the random coding bound at rates below the second critical rate is due not to upper-bounding the ensemble average, but rather to the fact that the best codes are much better than the average at low rates.
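In standard Gallager notation (a reminder for orientation, not quoted from the paper), the bound reads:

```latex
% Ensemble-average block error probability over random codes of block length N:
\overline{P}_e \;\le\; e^{-N E_r(R)},
\qquad
E_r(R) \;=\; \max_{0 \le \rho \le 1}\,\max_{Q}\,\bigl[\,E_0(\rho, Q) - \rho R\,\bigr],
```

where E_0(ρ, Q) is the Gallager function of the channel and input distribution Q. The paper's point, in these terms, is that below the second critical rate the ensemble average is dominated by atypically poor codes, so the exponent remains correct for the average even where it is loose for the best code.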
Zheng, Lianqing; Chen, Mengen; Yang, Wei
2009-06-21
To overcome the pseudoergodicity problem, conformational sampling can be accelerated via generalized ensemble methods, e.g., through the realization of random walks along prechosen collective variables, such as spatial order parameters, energy scaling parameters, or even system temperatures or pressures. As usually observed in generalized ensemble simulations, hidden barriers are likely to exist in the space perpendicular to the collective variable direction, and these residual free energy barriers can greatly degrade sampling efficiency. This sampling issue is particularly severe when the collective variable is defined in a low-dimensional subset of the target system; the "Hamiltonian lagging" problem, in which necessary structural relaxation falls behind the motion of the collective variable, is then likely to occur. To overcome this problem in equilibrium conformational sampling, we adopted the orthogonal space random walk (OSRW) strategy, which was originally developed in the context of free energy simulation [L. Zheng, M. Chen, and W. Yang, Proc. Natl. Acad. Sci. U.S.A. 105, 20227 (2008)]. Thereby, generalized ensemble simulations can simultaneously escape both the explicit barriers along the collective variable direction and the hidden barriers that are strongly coupled with the collective variable move. As demonstrated in our model studies, the present OSRW-based generalized ensemble treatments show improved sampling capability over the corresponding classical generalized ensemble treatments.
Spectral statistics of the uni-modular ensemble
NASA Astrophysics Data System (ADS)
Joyner, Christopher H.; Smilansky, Uzy; Weidenmüller, Hans A.
2017-09-01
We investigate the spectral statistics of Hermitian matrices in which the elements are chosen uniformly from U(1), called the uni-modular ensemble (UME), in the limit of large matrix size. Using three complementary methods, namely a supersymmetric integration method, a combinatorial graph-theoretical analysis and a Brownian motion approach, we derive expressions for the 1/N corrections to the mean spectral moments and also analyse the fluctuations about this mean. By addressing the same ensemble from three different points of view, we can critically compare the relative advantages of these methods and derive some new results.
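Sampling the UME is simple; the sketch below estimates a scaled spectral moment. Hermiticity forces the diagonal entries to be real, and taking them as random signs is an assumption of this illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
N, n_samp = 200, 100
moment2 = []
for _ in range(n_samp):
    phi = rng.uniform(0.0, 2.0 * np.pi, size=(N, N))
    H = np.exp(1j * np.tril(phi, -1))
    H = np.tril(H, -1)                         # strictly lower entries on U(1)
    H = H + H.conj().T                         # Hermitian completion
    np.fill_diagonal(H, rng.choice([-1.0, 1.0], size=N))
    lam = np.linalg.eigvalsh(H) / np.sqrt(N)   # semicircle scaling
    moment2.append((lam**2).mean())
print("mean 2nd moment:", np.mean(moment2))    # ~1 for unit-modulus entries
```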
Evolutionary Ensemble for In Silico Prediction of Ames Test Mutagenicity
NASA Astrophysics Data System (ADS)
Chen, Huanhuan; Yao, Xin
Driven by new regulations and animal welfare, the need to develop in silico models has increased recently as alternative approaches to safety assessment of chemicals without animal testing. This paper describes a novel machine learning ensemble approach to building an in silico model for the prediction of Ames test mutagenicity, one of a battery of the most commonly used experimental in vitro and in vivo genotoxicity tests for safety evaluation of chemicals. Evolutionary random neural ensemble with negative correlation learning (ERNE) [1] was developed based on neural networks and evolutionary algorithms. ERNE combines bootstrap sampling on the training data with random subspace feature selection to ensure diversity when creating the individuals of the initial ensemble. Furthermore, while evolving individuals within the ensemble, it makes use of negative correlation learning, enabling the individual NNs to be trained as accurately as possible while remaining as diverse as possible. Therefore, the resulting individuals in the final ensemble are capable of cooperating collectively to achieve better generalization of prediction. Empirical experiments suggest that ERNE is an effective ensemble approach for predicting the Ames test mutagenicity of chemicals.
NASA Astrophysics Data System (ADS)
Forrester, Peter J.; Trinh, Allan K.
2018-05-01
The neighbourhood of the largest eigenvalue λmax in the Gaussian unitary ensemble (GUE) and Laguerre unitary ensemble (LUE) is referred to as the soft edge. It is known that there exists a particular centring and scaling such that the distribution of λmax tends to a universal form, with an error term bounded by 1/N2/3. We take up the problem of computing the exact functional form of the leading error term in a large N asymptotic expansion for both the GUE and LUE—two versions of the LUE are considered, one with the parameter a fixed and the other with a proportional to N. Both settings in the LUE case allow for an interpretation in terms of the distribution of a particular weighted path length in a model involving exponential variables on a rectangular grid, as the grid size gets large. We give operator theoretic forms of the corrections, which are corollaries of knowledge of the first two terms in the large N expansion of the scaled kernel and are readily computed using a method due to Bornemann. We also give expressions in terms of the solutions of particular systems of coupled differential equations, which provide an alternative method of computation. Both characterisations are well suited to a thinned generalisation of the original ensemble, whereby each eigenvalue is deleted independently with probability (1 - ξ). In Sec. V, we investigate using simulation the question of whether upon an appropriate centring and scaling a wider class of complex Hermitian random matrix ensembles have their leading correction to the distribution of λmax proportional to 1/N2/3.
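The soft-edge scaling itself is easy to probe by simulation; a sketch for the GUE (sizes and sample counts are arbitrary, and no claim is made about reproducing the paper's exact correction term):

```python
import numpy as np

# With off-diagonal entry variance 1, lambda_max ~ 2 sqrt(N) + N^{-1/6} * TW2;
# watch how the centred, scaled samples settle as N grows (the leading
# correction to the distribution is of order N^{-2/3}).
rng = np.random.default_rng(7)
for N in (50, 100, 200, 400):
    s = []
    for _ in range(300):
        A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
        H = (A + A.conj().T) / 2               # GUE, <|H_ij|^2> = 1 off-diagonal
        lam_max = np.linalg.eigvalsh(H).max()
        s.append((lam_max - 2 * np.sqrt(N)) * N ** (1 / 6))
    print(N, np.mean(s))   # should drift toward the Tracy-Widom beta=2 mean ~ -1.77
```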
Fast Constrained Spectral Clustering and Cluster Ensemble with Random Projection
Liu, Wenfen
2017-01-01
Constrained spectral clustering (CSC) can greatly improve clustering accuracy by incorporating constraint information into spectral clustering, and has thus received wide academic attention. In this paper, we propose a fast CSC algorithm that encodes landmark-based graph construction into a new CSC model and applies random sampling to decrease the data size after spectral embedding. Compared with the original model, the new algorithm yields asymptotically similar results as its model size increases; compared with the most efficient CSC algorithm known, the new algorithm runs faster and suits a wider range of data sets. Meanwhile, a scalable semisupervised cluster ensemble algorithm is also proposed via the combination of our fast CSC algorithm and dimensionality reduction with random projection in the process of spectral ensemble clustering. We demonstrate, through theoretical analysis and empirical results, that the new cluster ensemble algorithm has advantages in terms of efficiency and effectiveness. Furthermore, the approximate preservation of clustering accuracy under random projection, proved for the consensus clustering stage, also holds for weighted k-means clustering, giving a theoretical guarantee for this special kind of k-means clustering in which each point carries a weight. PMID:29312447
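The random projection step relies on Johnson-Lindenstrauss-type distance preservation; a minimal sketch with a Gaussian projection matrix (dimensions are placeholders):

```python
import numpy as np

def random_projection(X, k, rng=None):
    """Project n points in R^d down to R^k with a Gaussian random matrix;
    pairwise distances are preserved up to (1 +/- eps) with high probability
    (Johnson-Lindenstrauss), which is what keeps the post-embedding
    k-means step cheap."""
    rng = rng or np.random.default_rng()
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R

rng = np.random.default_rng(8)
X = rng.standard_normal((1000, 500))       # e.g. rows = spectral embeddings
Y = random_projection(X, 20, rng)
# Distance distortion check on a few random pairs:
i, j = rng.integers(0, 1000, size=(2, 5))
print(np.linalg.norm(Y[i] - Y[j], axis=1) / np.linalg.norm(X[i] - X[j], axis=1))
```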
Randomized central limit theorems: A unified theory
NASA Astrophysics Data System (ADS)
Eliazar, Iddo; Klafter, Joseph
2010-08-01
The central limit theorems (CLTs) characterize the macroscopic statistical behavior of large ensembles of independent and identically distributed random variables. The CLTs assert that the universal probability laws governing ensembles’ aggregate statistics are either Gaussian or Lévy, and that the universal probability laws governing ensembles’ extreme statistics are Fréchet, Weibull, or Gumbel. The scaling schemes underlying the CLTs are deterministic—scaling all ensemble components by a common deterministic scale. However, there are “random environment” settings in which the underlying scaling schemes are stochastic—scaling the ensemble components by different random scales. Examples of such settings include Holtsmark’s law for gravitational fields and the Stretched Exponential law for relaxation times. In this paper we establish a unified theory of randomized central limit theorems (RCLTs)—in which the deterministic CLT scaling schemes are replaced with stochastic scaling schemes—and present “randomized counterparts” to the classic CLTs. The RCLT scaling schemes are shown to be governed by Poisson processes with power-law statistics, and the RCLTs are shown to universally yield the Lévy, Fréchet, and Weibull probability laws.
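Schematically, in our notation rather than the paper's, the replacement made is:

```latex
% Deterministic CLT scaling vs. the randomized (stochastic) scaling scheme:
S_n \;=\; \frac{1}{a_n} \sum_{i=1}^{n} X_i
\qquad \longrightarrow \qquad
S \;=\; \sum_{i} \frac{X_i}{A_i},
```

with the random scales {A_i} forming a Poisson process with power-law intensity, as stated above.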
Ensemble approach for differentiation of malignant melanoma
NASA Astrophysics Data System (ADS)
Rastgoo, Mojdeh; Morel, Olivier; Marzani, Franck; Garcia, Rafael
2015-04-01
Melanoma is the deadliest type of skin cancer, yet it is also the most treatable kind when diagnosed early. The early prognosis of melanoma is a challenging task for both clinicians and dermatologists. Given the importance of early diagnosis, and in order to assist dermatologists, we propose an automated framework based on ensemble learning methods and dermoscopy images to differentiate melanoma from dysplastic and benign lesions. The evaluation of our framework on a recent public dermoscopy benchmark (the PH2 dataset) indicates the potential of the proposed method. Our evaluation, using only global features, revealed that ensembles such as random forest perform better than a single learner. Using a random forest ensemble and a combination of color and texture features, our framework achieved its highest sensitivity of 94% and specificity of 92%.
Ling, Qing-Hua; Song, Yu-Qing; Han, Fei; Yang, Dan; Huang, De-Shuang
2016-01-01
For ensemble learning, how to select and combine the candidate classifiers are two key issues that dramatically influence the performance of the ensemble system. Random vector functional link networks (RVFL) without direct input-to-output links are suitable base classifiers for ensemble systems because of their fast learning speed, simple structure and good generalization performance. In this paper, to obtain a more compact ensemble system with improved convergence performance, an improved ensemble of RVFL based on attractive and repulsive particle swarm optimization (ARPSO) with a double optimization strategy is proposed. In the proposed method, ARPSO is applied to select and combine the candidate RVFLs. When using ARPSO to select the optimal base RVFLs, ARPSO considers both the convergence accuracy on the validation data and the diversity of the candidate ensemble system. In the process of combining the RVFLs, the ensemble weights corresponding to the base RVFLs are initialized by the minimum-norm least-squares method and then further optimized by ARPSO. Finally, a few redundant RVFLs are pruned, and thus a more compact ensemble of RVFL is obtained. Moreover, theoretical analysis and justification on how to prune the base classifiers for classification problems are presented, and a simple and practically feasible strategy for pruning redundant base classifiers on both classification and regression problems is proposed. Since the double optimization is performed on the basis of the single optimization, the ensemble of RVFL built by the proposed method outperforms those built by some single optimization methods. Experimental results on function approximation and classification problems verify that the proposed method improves convergence accuracy as well as reducing the complexity of the ensemble system. PMID:27835638
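A single RVFL base learner of the kind described, with fixed random hidden weights and minimum-norm least-squares output weights (via the pseudoinverse), can be sketched as follows; the ARPSO selection and weight refinement are omitted:

```python
import numpy as np

rng = np.random.default_rng(9)

def rvfl_train(X, Y, n_hidden=100):
    W = rng.standard_normal((X.shape[1], n_hidden))   # fixed random input weights
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                            # random feature layer
    beta = np.linalg.pinv(H) @ Y                      # minimum-norm least squares
    return W, b, beta

def rvfl_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

X = rng.standard_normal((300, 8))
Y = np.where(X[:, :2].sum(axis=1, keepdims=True) > 0, 1.0, -1.0)
W, b, beta = rvfl_train(X, Y)
print((np.sign(rvfl_predict(X, W, b, beta)) == Y).mean())
```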
NASA Astrophysics Data System (ADS)
Durazo, Juan A.; Kostelich, Eric J.; Mahalov, Alex
2017-09-01
We propose a targeted observation strategy, based on the influence matrix diagnostic, that optimally selects where additional observations may be placed to improve ionospheric forecasts. This strategy is applied in data assimilation observing system experiments, where synthetic electron density vertical profiles, which represent those of Constellation Observing System for Meteorology, Ionosphere, and Climate/Formosa satellite 3, are assimilated into the Thermosphere-Ionosphere-Electrodynamics General Circulation Model using the local ensemble transform Kalman filter during the 26 September 2011 geomagnetic storm. During each analysis step, the observation vector is augmented with five synthetic vertical profiles optimally placed to target electron density errors, using our targeted observation strategy. Forecast improvement due to assimilation of augmented vertical profiles is measured with the root-mean-square error (RMSE) of analyzed electron density, averaged over 600 km regions centered around the augmented vertical profile locations. Assimilating vertical profiles with targeted locations yields about 60%-80% reduction in electron density RMSE, compared to a 15% average reduction when assimilating randomly placed vertical profiles. Assimilating vertical profiles whose locations target the zonal component of neutral winds (Un) yields on average a 25% RMSE reduction in Un estimates, compared to a 2% average improvement obtained with randomly placed vertical profiles. These results demonstrate that our targeted strategy can improve data assimilation efforts during extreme events by detecting regions where additional observations would provide the largest benefit to the forecast.
NASA Astrophysics Data System (ADS)
Zapata Norberto, B.; Morales-Casique, E.; Herrera, G. S.
2017-12-01
Severe land subsidence due to groundwater extraction may occur in multiaquifer systems where highly compressible aquitards are present. The highly compressible nature of the aquitards leads to nonlinear consolidation where the groundwater flow parameters are stress-dependent. The case is further complicated by the heterogeneity of the hydrogeologic and geotechnical properties of the aquitards. We explore the effect of realistic vertical heterogeneity of hydrogeologic and geotechnical parameters on the consolidation of highly compressible aquitards by means of 1-D Monte Carlo numerical simulations. 2000 realizations are generated for each of the following parameters: hydraulic conductivity (K), compression index (Cc) and void ratio (e). The correlation structure, the mean and the variance for each parameter were obtained from a literature review about field studies in the lacustrine sediments of Mexico City. The results indicate that among the parameters considered, random K has the largest effect on the ensemble average behavior of the system. Random K leads to the largest variance (and therefore largest uncertainty) of total settlement, groundwater flux and time to reach steady state conditions. We further propose a data assimilation scheme by means of ensemble Kalman filter to estimate the ensemble mean distribution of K, pore-pressure and total settlement. We consider the case where pore-pressure measurements are available at given time intervals. We test our approach by generating a 1-D realization of K with exponential spatial correlation, and solving the nonlinear flow and consolidation problem. These results are taken as our "true" solution. We take pore-pressure "measurements" at different times from this "true" solution. The ensemble Kalman filter method is then employed to estimate ensemble mean distribution of K, pore-pressure and total settlement based on the sequential assimilation of these pore-pressure measurements. The ensemble-mean estimates from this procedure closely approximate those from the "true" solution. This procedure can be easily extended to other random variables such as compression index and void ratio.
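The assimilation step itself can be sketched generically; below is a stochastic (perturbed-observation) EnKF analysis update, with the state standing in for pore pressures and hypothetical piezometer locations as observations. Parameter estimation as described above would augment the state vector with, e.g., log K; that augmentation is omitted here:

```python
import numpy as np

def enkf_update(E, y, H, R, rng):
    """Stochastic EnKF analysis step (a generic sketch, not the authors' code).
    E: (n, m) state ensemble; y: (p,) observations; H: (p, n) observation
    operator; R: (p, p) observation-error covariance."""
    m = E.shape[1]
    X = E - E.mean(axis=1, keepdims=True)
    P = X @ X.T / (m - 1)                       # ensemble covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, m).T  # perturbed obs
    return E + K @ (Y - H @ E)

rng = np.random.default_rng(10)
E = rng.standard_normal((20, 50))               # e.g. pore pressures at 20 depths
H = np.zeros((3, 20)); H[0, 2] = H[1, 9] = H[2, 15] = 1.0   # 3 piezometer locations
y = np.array([0.4, -0.1, 0.2])
E_a = enkf_update(E, y, H, 0.01 * np.eye(3), rng)
```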
Ensemble Smoother implemented in parallel for groundwater problems applications
NASA Astrophysics Data System (ADS)
Leyva, E.; Herrera, G. S.; de la Cruz, L. M.
2013-05-01
Data assimilation is a process that links forecasting models and measurements, drawing on the benefits of both sources. The Ensemble Kalman Filter (EnKF) is a sequential data-assimilation method designed to address two of the main problems related to the use of the Extended Kalman Filter (EKF) with nonlinear models in large state spaces, i.e., the need for a closure assumption and the massive computational requirements associated with the storage and subsequent integration of the error covariance matrix. The EnKF has gained popularity because of its simple conceptual formulation and relative ease of implementation. It has been used successfully in various applications in meteorology and oceanography and, more recently, in petroleum engineering and hydrogeology. The Ensemble Smoother (ES) is a method similar to the EnKF, proposed by Van Leeuwen and Evensen (1996). Herrera (1998) proposed a version of the ES which we call the Ensemble Smoother of Herrera (ESH) to distinguish it from the former; it was introduced for space-time optimization of groundwater monitoring networks. In recent years, this method has been used for data assimilation and parameter estimation in groundwater flow and transport models. The ES method uses Monte Carlo simulation, which consists of generating repeated realizations of the random variable considered using a flow and transport model. However, a large number of model runs is often required for the moments of the variable to converge, so depending on the complexity of the problem a serial computer may require many hours of continuous use to apply the ES. For this reason, the process must be parallelized to finish in a reasonable time. In this work we present the results of a parallelization strategy that reduces the execution time when running a large number of realizations. The software GWQMonitor by Herrera (1998) implements all the algorithms required for the ESH in Fortran 90. We developed a script in Python using mpi4py in order to execute GWQMonitor in parallel with the MPI library. Our approach is to calculate the initial inputs for each realization and run groups of realizations on separate processors; the only modification to GWQMonitor was the final calculation of the covariance matrix. This strategy was applied to the study of a simplified aquifer in a rectangular single-layer domain. We show the speedup and efficiency for different numbers of processors.
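The strategy described (root prepares inputs, ranks run disjoint subsets of realizations, results are gathered at the end) can be sketched with mpi4py as follows; `run_realization` is a placeholder for a GWQMonitor invocation, not its actual interface:

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_real = 1000
my_ids = range(rank, n_real, size)           # round-robin assignment of realizations

def run_realization(i):                      # placeholder for the model run
    rng = np.random.default_rng(i)           # per-realization seed
    return rng.standard_normal(50)           # e.g. simulated concentrations

local = [run_realization(i) for i in my_ids]
all_results = comm.gather(local, root=0)
if rank == 0:
    fields = np.array([r for chunk in all_results for r in chunk])
    cov = np.cov(fields.T)                   # final covariance on the root process
```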
Wavelet analysis of biological tissue's Mueller-matrix images
NASA Astrophysics Data System (ADS)
Tomka, Yu. Ya.
2008-05-01
The interrelations between the statistics of the 1st-4th orders of the ensemble of Mueller-matrix images and the geometric structure of birefringent architectonic nets of different morphological structure have been analyzed. The sensitivity of the asymmetry (skewness) and excess (kurtosis) of the statistical distributions of matrix elements Cik to changes in the orientation structure of optically anisotropic protein fibrils of physiologically normal and pathologically changed biological tissue architectonics has been shown.
Hartree and Exchange in Ensemble Density Functional Theory: Avoiding the Nonuniqueness Disaster.
Gould, Tim; Pittalis, Stefano
2017-12-15
Ensemble density functional theory is a promising method for the efficient and accurate calculation of excitations of quantum systems, at least if useful functionals can be developed to broaden its domain of practical applicability. Here, we introduce a guaranteed single-valued "Hartree-exchange" ensemble density functional, E_{Hx}[n], in terms of the right derivative of the universal ensemble density functional with respect to the coupling constant at vanishing interaction. We show that E_{Hx}[n] is straightforwardly expressible using block eigenvalues of a simple matrix [Eq. (14)]. Specialized expressions for E_{Hx}[n] from the literature, including those involving superpositions of Slater determinants, can now be regarded as originating from the unifying picture presented here. We thus establish a clear and practical description for Hartree and exchange in ensemble systems.
NASA Astrophysics Data System (ADS)
Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo
2016-07-01
Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
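The core idea, one coefficient matrix and many right-hand sides solved together via matrix-matrix products, can be sketched with a block-style conjugate gradient in which converged columns are frozen (a stand-in for the convergence-control method mentioned above; assumes A is symmetric positive definite):

```python
import numpy as np

def cg_multi(A, B, tol=1e-10, max_iter=500):
    """Conjugate gradients run on all right-hand sides at once: one shared
    SPD matrix A, matrix-matrix instead of matrix-vector products, and
    already-converged columns frozen."""
    X = np.zeros_like(B)
    R = B - A @ X
    P = R.copy()
    rs = (R * R).sum(axis=0)                       # squared residual per column
    for _ in range(max_iter):
        active = rs > tol
        if not active.any():
            break
        AP = A @ P
        denom = (P * AP).sum(axis=0)
        alpha = np.where(active, rs / np.where(denom != 0, denom, 1.0), 0.0)
        X = X + alpha * P
        R = R - alpha * AP
        rs_new = (R * R).sum(axis=0)
        beta = np.where(active, rs_new / np.where(rs != 0, rs, 1.0), 0.0)
        P = R + beta * P
        rs = rs_new
    return X

rng = np.random.default_rng(11)
M = rng.standard_normal((100, 100))
A = M @ M.T + 100 * np.eye(100)                    # SPD test matrix
B = rng.standard_normal((100, 8))                  # 8 simultaneous linear systems
print(np.abs(A @ cg_multi(A, B) - B).max())
```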
Building Diversified Multiple Trees for classification in high dimensional noisy biomedical data.
Li, Jiuyong; Liu, Lin; Liu, Jixue; Green, Ryan
2017-12-01
It is common for a trained classification model to be applied to operating data that deviates from the training data because of noise. This paper tests an ensemble method, Diversified Multiple Tree (DMT), on its capability to classify instances from a new laboratory using a classifier built on instances from another laboratory. DMT is tested on three real-world biomedical data sets from different laboratories in comparison with four benchmark ensemble methods: AdaBoost, Bagging, Random Forests, and Random Trees. Experiments have also been conducted to study the limitations of DMT and its possible variations. Experimental results show that DMT is significantly more accurate than the other benchmark ensemble classifiers when classifying new instances from a laboratory different from the one whose instances were used to build the classifier. This paper demonstrates that the ensemble classifier DMT is more robust in classifying noisy data than other widely used ensemble methods. DMT works on data sets that support multiple simple trees.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reboiro, M., E-mail: reboiro@fisica.unlp.edu.ar; Civitarese, O., E-mail: osvaldo.civitarese@fisica.unlp.edu.ar; Ramírez, R.
2017-03-15
The degree of coherence in a hybrid system composed of superconducting flux-qubits and an electron ensemble is analysed. Both the interactions among the electrons and among the superconducting flux-qubits are taken into account. The time evolution of the hybrid system is solved exactly, and discussed in terms of the reduced density matrix of each subsystem. It is seen that the inclusion of a line width, for the electrons and for the superconducting flux-qubits, influences the pattern of spin-squeezing and the coherence of the superconducting flux qubits. - Highlights: • The degree of coherence in a hybrid system, composed of superconducting flux qubits and an electron ensemble, is analysed. • The time evolution of the hybrid system is solved exactly and discussed in terms of the reduced density matrix of each subsystem. • It is shown that the initial state of the system evolves to a stationary squeezed state.
Ensemble Feature Learning of Genomic Data Using Support Vector Machine
Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R.; Braytee, Ali; Kennedy, Paul J.
2016-01-01
The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest, which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention, and mostly for classification rather than gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) method for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy that is the rationale of the RFE algorithm. The rationale behind this is that building ensemble SVM models using randomly drawn bootstrap samples from the training set will produce different feature rankings, which are subsequently aggregated into one feature ranking. As a result, the decision to eliminate features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach addresses the problem of imbalanced datasets by constructing nearly balanced bootstrap samples. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that on average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over the random forest based approach. The genes selected by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD), which reveals significant clusters within the selected data. PMID:27304923
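A compact sketch of the ESVM-RFE idea using scikit-learn building blocks (synthetic data, not the paper's microarray sets; averaging the per-run rankings is one simple aggregation choice):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.feature_selection import RFE
from sklearn.utils import resample

rng = np.random.default_rng(12)
X = rng.standard_normal((120, 200))                 # 120 samples, 200 "genes"
y = (X[:, :5].sum(axis=1) > 0).astype(int)          # 5 informative features

rankings = []
for b in range(10):
    # Nearly balanced bootstrap sample, as described above.
    Xb, yb = resample(X, y, stratify=y, random_state=b)
    rfe = RFE(LinearSVC(dual=False, max_iter=5000), n_features_to_select=10, step=0.2)
    rfe.fit(Xb, yb)
    rankings.append(rfe.ranking_)
consensus = np.mean(rankings, axis=0)               # aggregate the bootstrap rankings
print("top genes:", np.argsort(consensus)[:10])
```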
Ozçift, Akin
2011-05-01
Supervised classification algorithms are commonly used in the design of computer-aided diagnosis systems. In this study, we present a resampling-strategy-based Random Forests (RF) ensemble classifier to improve the diagnosis of cardiac arrhythmia. Random Forests is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes output by the individual trees; in this way, an RF ensemble classifier performs better than a single tree in terms of classification performance. In general, multiclass datasets with unbalanced sample-size distributions are difficult to analyze in terms of class discrimination. Cardiac arrhythmia is such a dataset, having multiple classes with small sample sizes, and it is therefore well suited to testing our resampling-based training strategy. The dataset contains 452 samples across fourteen types of arrhythmia, and eleven of these classes have fewer than 15 samples. Our diagnosis strategy consists of two parts: (i) a correlation-based feature selection algorithm is used to select relevant features from the cardiac arrhythmia dataset; (ii) the RF machine learning algorithm is used to evaluate the performance of the selected features with and without simple random sampling, to assess the efficiency of the proposed training strategy. The resulting accuracy of the classifier is 90.0%, which is a quite high diagnostic performance for cardiac arrhythmia. Furthermore, three case studies, i.e., thyroid, cardiotocography and audiology, are used to benchmark the effectiveness of the proposed method. The experimental results demonstrate the efficiency of the random sampling strategy in training the RF ensemble classification algorithm.
NASA Astrophysics Data System (ADS)
Bouhaj, M.; von Estorff, O.; Peiffer, A.
2017-09-01
In the application of Statistical Energy Analysis (SEA) to complex assembled structures, a purely predictive model often exhibits errors. These errors are mainly due to a lack of accurate modelling of the power transmission mechanism described through the Coupling Loss Factors (CLF). Experimental SEA (ESEA) is used in practice by the automotive and aerospace industry to verify and update the model, or to derive the CLFs for use in an SEA predictive model when analytical estimates cannot be made. This work is particularly motivated by the lack of procedures for estimating the variance and confidence intervals of the statistical quantities when using the ESEA technique. The aim of this paper is to introduce procedures enabling a statistical description of measured power input, vibration energies and the derived SEA parameters. Particular emphasis is placed on the identification of structural CLFs of complex built-up structures, comparing different methods. By adopting a Stochastic Energy Model (SEM), the ensemble average in ESEA is also addressed. For this purpose, expressions are obtained to randomly perturb the energy matrix elements and generate individual samples for the Monte Carlo (MC) technique applied to derive the ensemble-averaged CLF. From results of ESEA tests conducted on an aircraft fuselage section, the SEM approach provides better estimates of the CLFs than classical matrix inversion methods. The expected range of CLF values and the synthesized energy are used as quality criteria for the matrix inversion, allowing one to assess critical SEA subsystems which might require a more refined statistical description of the excitation and response fields. Moreover, the impact of the variance of the normalized vibration energy on the uncertainty of the derived CLFs is outlined.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Okunev, V. D.; Samoilenko, Z. A.; Burkhovetski, V. V.
The growth of La0.7Sr0.3MnO3 films in magnetron plasma, under special conditions, leads to the appearance of ensembles of micron-sized spherical crystalline clusters with fractal structure, which we consider to be a new form of self-organization in solids. Each ensemble contains 10^5-10^6 elementary clusters, 100-250 Å in diameter. Interaction of the clusters in the ensemble is realized through the interatomic chemical bonds intrinsic to the manganites. Integration of the peripheral areas of interacting clusters results in the formation of a common intercluster medium in the ensemble. We argue that the ensembles with fractal structure built into the paramagnetic disordered matrix have ferromagnetic properties. The absence of sharp borders between elementary clusters and the presence of a common intercluster medium inside each ensemble permit rearrangement of the magnetic order and changes in the volume of the ferromagnetic phase, automatically providing a high sensitivity of the material to the external field.
A mesoscale hybrid data assimilation system based on the JMA nonhydrostatic model
NASA Astrophysics Data System (ADS)
Ito, K.; Kunii, M.; Kawabata, T. T.; Saito, K. K.; Duc, L. L.
2015-12-01
This work evaluates the potential of a hybrid ensemble Kalman filter and four-dimensional variational (4D-Var) data assimilation system for predicting severe weather events from a deterministic point of view. This hybrid system is an adjoint-based 4D-Var system using a background error covariance matrix constructed from the mixture of a so-called NMC method and perturbations in a local ensemble transform Kalman filter data assimilation system, both of which are based on the Japan Meteorological Agency nonhydrostatic model. To construct the background error covariance matrix, we investigated two types of schemes. One is a spatial localization scheme and the other is neighboring ensemble approach, which regards the result at a horizontally spatially shifted point in each ensemble member as that obtained from a different realization of ensemble simulation. An assimilation of a pseudo single-observation located to the north of a tropical cyclone (TC) yielded an analysis increment of wind and temperature physically consistent with what is expected for a mature TC in both hybrid systems, whereas an analysis increment in a 4D-Var system using a static background error covariance distorted a structure of the mature TC. Real data assimilation experiments applied to 4 TCs and 3 local heavy rainfall events showed that hybrid systems and EnKF provided better initial conditions than the NMC-based 4D-Var, both for predicting the intensity and track forecast of TCs and for the location and amount of local heavy rainfall events.
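Schematically, and in our notation rather than the paper's, the background error covariance in such hybrid systems blends the static and ensemble estimates:

```latex
% Generic hybrid background error covariance (weights and the localization
% operator are our notation, not the paper's):
\mathbf{B}_{\text{hybrid}}
\;=\;
\beta_{1}\,\mathbf{B}_{\text{NMC}}
\;+\;
\beta_{2}\,\bigl(\mathbf{C} \circ \mathbf{P}^{f}_{\text{ens}}\bigr),
\qquad \beta_{1} + \beta_{2} = 1,
```

where C ∘ denotes Schur-product localization of the ensemble covariance P^f_ens; the neighboring-ensemble variant instead enlarges the effective ensemble by treating horizontally shifted model columns of each member as additional realizations.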
Virial expansion for almost diagonal random matrices
NASA Astrophysics Data System (ADS)
Yevtushenko, Oleg; Kravtsov, Vladimir E.
2003-08-01
Energy level statistics of Hermitian random matrices $\hat{H}$ with Gaussian independent random entries $H_{i \ge j}$ is studied for a generic ensemble of almost diagonal random matrices with $\langle |H_{ii}|^{2} \rangle \sim 1$ and $\langle |H_{i \ne j}|^{2} \rangle \ll 1$.
Universality in the dynamical properties of seismic vibrations
NASA Astrophysics Data System (ADS)
Chatterjee, Soumya; Barat, P.; Mukherjee, Indranil
2018-02-01
We have studied the statistical properties of the observed magnitudes of seismic vibration data in discrete time, in an attempt to understand the underlying complex dynamical processes. The observed magnitude data are taken from six different geographical locations. All possible magnitudes are considered in the analysis, including catastrophic vibrations, foreshocks, aftershocks and commonplace daily vibrations. The probability distribution functions of these data sets obey a scaling law and display a certain universality characteristic. To investigate the universality features in the observed data generated by a complex process, we applied Random Matrix Theory (RMT) in the framework of the Gaussian Orthogonal Ensemble (GOE). For all six locations, the observed data show a close fit with the predictions of RMT. This reinforces the idea of universality in the dynamical processes generating seismic vibrations.
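The GOE benchmark in such comparisons is typically the nearest-neighbour spacing distribution; a sketch that generates it and compares against the Wigner surmise:

```python
import numpy as np

rng = np.random.default_rng(13)
N, n_samp, spacings = 200, 100, []
for _ in range(n_samp):
    A = rng.standard_normal((N, N))
    H = (A + A.T) / np.sqrt(2)                  # GOE
    lam = np.sort(np.linalg.eigvalsh(H))
    bulk = lam[N//4 : 3*N//4]                   # stay in the spectral bulk
    s = np.diff(bulk)
    spacings.append(s / s.mean())               # crude local unfolding
s = np.concatenate(spacings)
hist, edges = np.histogram(s, bins=40, density=True)
x = 0.5 * (edges[1:] + edges[:-1])
wigner = 0.5 * np.pi * x * np.exp(-0.25 * np.pi * x**2)   # GOE Wigner surmise
print("max deviation from Wigner surmise:", np.abs(hist - wigner).max())
```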
NASA Astrophysics Data System (ADS)
Gelfan, Alexander; Moreido, Vsevolod
2017-04-01
Ensemble hydrological forecasting allows for describing the uncertainty caused by the variability of meteorological conditions in the river basin over the forecast lead time. At the same time, in snowmelt-dependent river basins another significant source of uncertainty relates to the variability of the initial conditions of the basin (snow water equivalent, soil moisture content, etc.) prior to the forecast issue date. Accurate long-term hydrological forecasts are most crucial for large water management systems, such as the Cheboksary reservoir (catchment area 374 000 sq. km) located on the Middle Volga river in Russia. Accurate forecasts of water inflow volume, maximum discharge and other flow characteristics are of great value for this basin, especially before the beginning of the spring freshet season, which lasts here from April to June. The semi-distributed hydrological model ECOMAG was used to develop a long-term ensemble forecast of daily water inflow into the Cheboksary reservoir. To describe the variability of the meteorological conditions and construct an ensemble of possible weather scenarios for the lead time of the forecast, two approaches were applied. The first utilizes 50 weather scenarios observed in previous years (similar to the ensemble streamflow prediction (ESP) procedure); the second uses 1000 synthetic scenarios simulated by a stochastic weather generator. We investigated the evolution of forecast uncertainty reduction, expressed as forecast efficiency, over consecutive forecast issue dates and lead times. We analyzed the Nash-Sutcliffe efficiency of inflow hindcasts for the period 1982 to 2016, starting from the 1st of March at 15-day intervals, for lead times of 1 to 6 months. This resulted in a forecast efficiency matrix of issue dates versus lead times that allows the predictability of the basin to be identified. The matrix was constructed separately for the observed and synthetic weather ensembles.
Comparing ensemble learning methods based on decision tree classifiers for protein fold recognition.
Bardsiri, Mahshid Khatibi; Eftekhari, Mahdi
2014-01-01
In this paper, some methods for ensemble learning of protein fold recognition based on a decision tree (DT) are compared and contrasted against each other over three datasets taken from the literature. According to previously reported studies, the features of the datasets are divided into some groups. Then, for each of these groups, three ensemble classifiers, namely, random forest, rotation forest and AdaBoost.M1 are employed. Also, some fusion methods are introduced for combining the ensemble classifiers obtained in the previous step. After this step, three classifiers are produced based on the combination of classifiers of types random forest, rotation forest and AdaBoost.M1. Finally, the three different classifiers achieved are combined to make an overall classifier. Experimental results show that the overall classifier obtained by the genetic algorithm (GA) weighting fusion method, is the best one in comparison to previously applied methods in terms of classification accuracy.
Spectral partitioning in equitable graphs.
Barucca, Paolo
2017-06-01
Graph partitioning problems emerge in a wide variety of complex systems, ranging from biology to finance, but can be rigorously analyzed and solved only for a few graph ensembles. Here, an ensemble of equitable graphs, i.e., random graphs with a block-regular structure, is studied, for which analytical results can be obtained. In particular, the spectral density of this ensemble is computed exactly for a modular and bipartite structure. Kesten-McKay's law for random regular graphs is found analytically to apply also for modular and bipartite structures when blocks are homogeneous. An exact solution to graph partitioning for two equal-sized communities is proposed and verified numerically, and a conjecture on the absence of an efficient recovery detectability transition in equitable graphs is suggested. A final discussion summarizes results and outlines their relevance for the solution of graph partitioning problems in other graph ensembles, in particular for the study of detectability thresholds and resolution limits in stochastic block models.
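The homogeneous-block baseline is easy to check numerically: the adjacency spectrum of a random d-regular graph against the Kesten-McKay density (sizes are arbitrary):

```python
import numpy as np
import networkx as nx

d, n = 4, 1000
G = nx.random_regular_graph(d, n, seed=0)
lam = np.sort(np.linalg.eigvalsh(nx.to_numpy_array(G)))[:-1]  # drop the trivial eigenvalue d

xmax = 2 * np.sqrt(d - 1)                       # Kesten-McKay support edge
hist, edges = np.histogram(lam, bins=60, range=(-xmax, xmax), density=True)
x = 0.5 * (edges[1:] + edges[:-1])
km = d * np.sqrt(np.clip(4 * (d - 1) - x**2, 0, None)) / (2 * np.pi * (d**2 - x**2))
print("max deviation from Kesten-McKay:", np.abs(hist - km).max())
```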
Machine Learning Predictions of a Multiresolution Climate Model Ensemble
NASA Astrophysics Data System (ADS)
Anderson, Gemma J.; Lucas, Donald D.
2018-05-01
Statistical models of high-resolution climate models are useful for many purposes, including sensitivity and uncertainty analyses, but building them can be computationally prohibitive. We generated a unique multiresolution perturbed parameter ensemble of a global climate model. We use a novel application of a machine learning technique known as random forests to train a statistical model on the ensemble to make high-resolution model predictions of two important quantities: global mean top-of-atmosphere energy flux and precipitation. The random forests leverage cheaper low-resolution simulations, greatly reducing the number of high-resolution simulations required to train the statistical model. We demonstrate that high-resolution predictions of these quantities can be obtained by training on an ensemble that includes only a small number of high-resolution simulations. We also find that global annually averaged precipitation is more sensitive to resolution changes than to any of the model parameters considered.
Path statistics, memory, and coarse-graining of continuous-time random walks on networks
Kion-Crosby, Willow; Morozov, Alexandre V.
2015-01-01
Continuous-time random walks (CTRWs) on discrete state spaces, ranging from regular lattices to complex networks, are ubiquitous across physics, chemistry, and biology. Models with coarse-grained states (for example, those employed in studies of molecular kinetics) or spatial disorder can give rise to memory and non-exponential distributions of waiting times and first-passage statistics. However, existing methods for analyzing CTRWs on complex energy landscapes do not address these effects. Here we use statistical mechanics of the nonequilibrium path ensemble to characterize first-passage CTRWs on networks with arbitrary connectivity, energy landscape, and waiting time distributions. Our approach can be applied to calculating higher moments (beyond the mean) of path length, time, and action, as well as statistics of any conservative or non-conservative force along a path. For homogeneous networks, we derive exact relations between length and time moments, quantifying the validity of approximating a continuous-time process with its discrete-time projection. For more general models, we obtain recursion relations, reminiscent of transfer matrix and exact enumeration techniques, to efficiently calculate path statistics numerically. We have implemented our algorithm in PathMAN (Path Matrix Algorithm for Networks), a Python script that users can apply to their model of choice. We demonstrate the algorithm on a few representative examples which underscore the importance of non-exponential distributions, memory, and coarse-graining in CTRWs. PMID:26646868
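For the special case of mean first-passage times with per-node mean waiting times, the recursion collapses to a single linear solve; a sketch with illustrative names, not the PathMAN interface:

```python
import numpy as np

rng = np.random.default_rng(14)
n, target = 6, 5
T = rng.random((n, n))
np.fill_diagonal(T, 0.0)
T /= T.sum(axis=1, keepdims=True)            # jump probabilities i -> j
tau = rng.uniform(0.5, 2.0, size=n)          # mean waiting time at each node

# Mean first-passage times t to the target satisfy t = tau + Q t, with Q the
# jump matrix restricted to the non-target nodes: one linear solve.
keep = [i for i in range(n) if i != target]
Q = T[np.ix_(keep, keep)]
t = np.linalg.solve(np.eye(n - 1) - Q, tau[keep])
print("mean first-passage time from node 0:", t[0])
```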
Entanglement transitions induced by large deviations
NASA Astrophysics Data System (ADS)
Bhosale, Udaysinh T.
2017-12-01
The probability of large deviations of the smallest Schmidt eigenvalue for random pure states of bipartite systems, denoted as A and B, is computed analytically using a Coulomb gas method. It is shown that this probability, for large N, goes as exp[-βN^2 Φ(ζ)], where the parameter β is the Dyson index of the ensemble, ζ is the large deviation parameter, and the rate function Φ(ζ) is calculated exactly. The corresponding equilibrium Coulomb charge density is derived for its large deviations. Effects of the large deviations of the extreme (largest and smallest) Schmidt eigenvalues on the bipartite entanglement are studied using the von Neumann entropy. The effect of these deviations is also studied on the entanglement between subsystems 1 and 2, obtained by further partitioning the subsystem A, using the properties of the density matrix's partial transpose ρ_12^Γ. The density of states of ρ_12^Γ is found to be close to Wigner's semicircle law with these large deviations. The entanglement properties are captured very well by a simple random matrix model for the partial transpose. The model predicts the entanglement transition across a critical large deviation parameter ζ. Log negativity is used to quantify the entanglement between subsystems 1 and 2. Analytical formulas for it are derived using the simple model. Numerical simulations are in excellent agreement with the analytical results.
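The object whose tail the rate function describes is easy to sample directly (a Monte Carlo sketch for β = 2 via the complex Ginibre construction; N and the sample size are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(3)
    N, trials = 8, 5000
    smallest = np.empty(trials)
    for i in range(trials):
        G = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
        W = G @ G.conj().T                   # unnormalized reduced density matrix
        w = np.linalg.eigvalsh(W).real
        smallest[i] = w.min() / w.sum()      # smallest Schmidt eigenvalue
    # the empirical tail of `smallest` approximates exp(-beta * N^2 * Phi(zeta))
    print(smallest.min(), smallest.mean())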
Chaos and complexity by design
Roberts, Daniel A.; Yoshida, Beni
2017-04-20
We study the relationship between quantum chaos and pseudorandomness by developing probes of unitary design. A natural probe of randomness is the “frame potential,” which is minimized by unitary k-designs and measures the 2-norm distance between the Haar random unitary ensemble and another ensemble. A natural probe of quantum chaos is out-of-time-order (OTO) four-point correlation functions. We also show that the norm squared of a generalization of out-of-time-order 2k-point correlators is proportional to the kth frame potential, providing a quantitative connection between chaos and pseudorandomness. In addition, we prove that these 2k-point correlators for Pauli operators completely determine the k-fold channel of an ensemble of unitary operators. Finally, we use a counting argument to obtain a lower bound on the quantum circuit complexity in terms of the frame potential. This provides a direct link between chaos, complexity, and randomness.
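As a concrete companion, the frame potential of an ensemble can be estimated by Monte Carlo; the sketch below does this for the Haar ensemble (drawn via QR of a Ginibre matrix with a phase fix), where the k-th frame potential approaches k!:

    import numpy as np

    rng = np.random.default_rng(4)

    def haar_unitary(d):
        # QR decomposition of a complex Ginibre matrix, with phases corrected
        Z = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2)
        Q, R = np.linalg.qr(Z)
        ph = np.diag(R) / np.abs(np.diag(R))
        return Q * ph

    def frame_potential(sample, d, k=1, pairs=2000):
        # F_k = E_{U,V} |Tr(U^dag V)|^(2k), averaged over independent pairs
        vals = [np.abs(np.trace(sample(d).conj().T @ sample(d)))**(2*k)
                for _ in range(pairs)]
        return np.mean(vals)

    print(frame_potential(haar_unitary, d=16, k=2))  # ~ 2! = 2 for Haar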
Efficient quantum pseudorandomness with simple graph states
NASA Astrophysics Data System (ADS)
Mezher, Rawad; Ghalbouni, Joe; Dgheim, Joseph; Markham, Damian
2018-02-01
Measurement-based (MB) quantum computation allows for universal quantum computing by measuring individual qubits prepared in entangled multipartite states, known as graph states. Unless corrected for, the randomness of the measurements leads to the generation of ensembles of random unitaries, where each random unitary is identified with a string of possible measurement results. We show that repeating an MB scheme an efficient number of times, on a simple graph state, with measurements at fixed angles and no feedforward corrections, produces a random unitary ensemble that is an ɛ-approximate t-design on n qubits. Unlike previous constructions, the graph is regular and is also a universal resource for measurement-based quantum computing, closely related to the brickwork state.
Characteristics of level-spacing statistics in chaotic graphene billiards.
Huang, Liang; Lai, Ying-Cheng; Grebogi, Celso
2011-03-01
A fundamental result in nonrelativistic quantum nonlinear dynamics is that the spectral statistics of quantum systems that possess no geometric symmetry, but whose classical dynamics are chaotic, are described by those of the Gaussian orthogonal ensemble (GOE) or the Gaussian unitary ensemble (GUE), in the presence or absence of time-reversal symmetry, respectively. For massless spin-half particles such as neutrinos in relativistic quantum mechanics in a chaotic billiard, the seminal work of Berry and Mondragon established the GUE nature of the level-spacing statistics, due to the combination of the chirality of Dirac particles and the confinement, which breaks the time-reversal symmetry. A question is whether the GOE or the GUE statistics can be observed in experimentally accessible, relativistic quantum systems. We demonstrate, using graphene confinements in which the quasiparticle motions are governed by the Dirac equation in the low-energy regime, that the level-spacing statistics are persistently those of GOE random matrices. We present extensive numerical evidence obtained from the tight-binding approach and a physical explanation for the GOE statistics. We also find that the presence of a weak magnetic field switches the statistics to those of GUE. For a strong magnetic field, Landau levels become influential, causing the level-spacing distribution to deviate markedly from the random-matrix predictions. Issues addressed also include the effects of a number of realistic factors on level-spacing statistics such as next nearest-neighbor interactions, different lattice orientations, enhanced hopping energy for atoms on the boundary, and staggered potential due to graphene-substrate interactions.
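A minimal numerical companion: generate GOE spacings and compare them with the Wigner surmise (the crude local unfolding by the mean bulk spacing is an assumption made for brevity):

    import numpy as np

    rng = np.random.default_rng(5)
    N, trials, spacings = 200, 50, []
    for _ in range(trials):
        A = rng.normal(size=(N, N))
        H = (A + A.T) / 2                     # GOE matrix
        e = np.sort(np.linalg.eigvalsh(H))
        bulk = np.diff(e[N//4: 3*N//4])       # stay away from spectral edges
        spacings.extend(bulk / bulk.mean())   # crude unfolding
    spacings = np.array(spacings)

    s = np.linspace(0, 3, 100)
    wigner = (np.pi/2) * s * np.exp(-np.pi * s**2 / 4)   # GOE surmise
    # a histogram of `spacings` should follow `wigner`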
Treating Sample Covariances for Use in Strongly Coupled Atmosphere-Ocean Data Assimilation
NASA Astrophysics Data System (ADS)
Smith, Polly J.; Lawless, Amos S.; Nichols, Nancy K.
2018-01-01
Strongly coupled data assimilation requires cross-domain forecast error covariances; information from ensembles can be used, but limited sampling means that ensemble-derived error covariances are routinely rank deficient and/or ill-conditioned and marred by noise. Thus, they require modification before they can be incorporated into a standard assimilation framework. Here we compare methods for improving the rank and conditioning of multivariate sample error covariance matrices for coupled atmosphere-ocean data assimilation. The first method, reconditioning, alters the matrix eigenvalues directly; this preserves the correlation structures but does not remove sampling noise. We show that it is better to recondition the correlation matrix rather than the covariance matrix, as this prevents small but dynamically important modes from being lost. The second method, model state-space localization via the Schur product, effectively removes sample noise but can dampen small cross-correlation signals. A combination that exploits the merits of each is found to offer an effective alternative.
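One way to picture reconditioning (a sketch of a ridge-type eigenvalue shift applied to the correlation matrix and mapped back to covariance; the specific variant and target condition number used in the paper may differ):

    import numpy as np

    def recondition(S, kappa_max=100.0):
        std = np.sqrt(np.diag(S))
        C = S / np.outer(std, std)               # work on the correlation matrix
        lam, V = np.linalg.eigh(C)
        delta = (lam.max() - kappa_max * lam.min()) / (kappa_max - 1.0)
        if delta > 0:
            lam = lam + delta                    # uniform shift: cond -> kappa_max
        C = V @ np.diag(lam) @ V.T
        d = np.sqrt(np.diag(C))
        C = C / np.outer(d, d)                   # restore unit diagonal (approx.)
        return C * np.outer(std, std)            # map back to covariance

    rng = np.random.default_rng(6)
    X = rng.normal(size=(10, 40))                # 10 members, 40 variables: rank deficient
    S = np.cov(X, rowvar=False) + 1e-10 * np.eye(40)
    print(np.linalg.cond(S), np.linalg.cond(recondition(S)))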
Mazurowski, Maciej A; Zurada, Jacek M; Tourassi, Georgia D
2009-07-01
Ensemble classifiers have been shown efficient in multiple applications. In this article, the authors explore the effectiveness of ensemble classifiers in a case-based computer-aided diagnosis system for detection of masses in mammograms. They evaluate two general ways of constructing subclassifiers by resampling of the available development dataset: Random division and random selection. Furthermore, they discuss the problem of selecting the ensemble size and propose two adaptive incremental techniques that automatically select the size for the problem at hand. All the techniques are evaluated with respect to a previously proposed information-theoretic CAD system (IT-CAD). The experimental results show that the examined ensemble techniques provide a statistically significant improvement (AUC = 0.905 +/- 0.024) in performance as compared to the original IT-CAD system (AUC = 0.865 +/- 0.029). Some of the techniques allow for a notable reduction in the total number of examples stored in the case base (to 1.3% of the original size), which, in turn, results in lower storage requirements and a shorter response time of the system. Among the methods examined in this article, the two proposed adaptive techniques are by far the most effective for this purpose. Furthermore, the authors provide some discussion and guidance for choosing the ensemble parameters.
Hamiltonian mean-field model: effect of temporal perturbation in coupling matrix
NASA Astrophysics Data System (ADS)
Bhadra, Nivedita; Patra, Soumen K.
2018-05-01
The Hamiltonian mean-field (HMF) model is a system of fully coupled rotators which exhibits a second-order phase transition at some critical energy in its canonical ensemble. We investigate the case where the interaction between the rotors is governed by a time-dependent coupling matrix. Our numerical study reveals a shift in the critical point due to the temporal modulation. The shift in the critical point is shown to be independent of the modulation frequency above some threshold value, whereas the impact of the amplitude of modulation is dominant. In the microcanonical ensemble, the system with constant coupling reaches a quasi-stationary state (QSS) at an energy near the critical point. Our result indicates that the QSS subsists in the presence of such temporal modulation of the coupling parameter.
Stabilizing canonical-ensemble calculations in the auxiliary-field Monte Carlo method
NASA Astrophysics Data System (ADS)
Gilbreth, C. N.; Alhassid, Y.
2015-03-01
Quantum Monte Carlo methods are powerful techniques for studying strongly interacting Fermi systems. However, implementing these methods on computers with finite-precision arithmetic requires careful attention to numerical stability. In the auxiliary-field Monte Carlo (AFMC) method, low-temperature or large-model-space calculations require numerically stabilized matrix multiplication. When adapting methods used in the grand-canonical ensemble to the canonical ensemble of fixed particle number, the numerical stabilization increases the number of required floating-point operations for computing observables by a factor of the size of the single-particle model space, and thus can greatly limit the systems that can be studied. We describe an improved method for stabilizing canonical-ensemble calculations in AFMC that exhibits better scaling, and present numerical tests that demonstrate the accuracy and improved performance of the method.
Cluster ensemble based on Random Forests for genetic data.
Alhusain, Luluah; Hafez, Alaaeldin M
2017-01-01
Clustering plays a crucial role in several application domains, such as bioinformatics. In bioinformatics, clustering has been extensively used as an approach for detecting interesting patterns in genetic data. One application is population structure analysis, which aims to group individuals into subpopulations based on shared genetic variations, such as single nucleotide polymorphisms. Advances in DNA sequencing technology have facilitated the obtainment of genetic datasets of exceptional size. Genetic data usually contain hundreds of thousands of genetic markers genotyped for thousands of individuals, making an efficient means for handling such data desirable. Random Forests (RF) has emerged as an efficient algorithm capable of handling high-dimensional data. RF provides a proximity measure that can capture different levels of co-occurring relationships between variables. RF has been widely considered a supervised learning method, although it can be converted into an unsupervised learning method. Therefore, an RF-derived proximity measure combined with a clustering technique may be well suited for determining the underlying structure of unlabeled data. This paper proposes RFcluE, a cluster ensemble approach for determining the underlying structure of genetic data based on RF clustering, to address the problem of population structure analysis. The approach comprises a cluster ensemble framework to combine multiple runs of RF clustering. Experiments were conducted on a high-dimensional, real genetic dataset to evaluate the proposed approach. The experiments included an examination of the impact of parameter changes, a comparison of RFcluE performance against other clustering methods, and an assessment of the relationship between the diversity and quality of the ensemble and its effect on RFcluE performance. The paper demonstrates the effectiveness of the approach and illustrates that applying a cluster ensemble approach, combining multiple RF clusterings, produces more robust and higher-quality results as a consequence of feeding the ensemble with diverse views of high-dimensional genetic data obtained through bagging and random subspace, the two key features of the RF algorithm.
Fitting a function to time-dependent ensemble averaged data.
Fogelmark, Karl; Lomholt, Michael A; Irbäck, Anders; Ambjörnsson, Tobias
2018-05-03
Time-dependent ensemble averages, i.e., trajectory-based averages of some observable, are of importance in many fields of science. A crucial objective when interpreting such data is to fit these averages (for instance, squared displacements) with a function and extract parameters (such as diffusion constants). A commonly overlooked challenge in such function fitting procedures is that fluctuations around mean values, by construction, exhibit temporal correlations. We show that the only available general-purpose function fitting methods, the correlated chi-square method and the weighted least squares method (which neglects correlation), fail at either robust parameter estimation or accurate error estimation. We remedy this by deriving a new closed-form error estimation formula for weighted least squares fitting. The new formula uses the full covariance matrix, i.e., rigorously includes temporal correlations, but is free of the robustness issues inherent to the correlated chi-square method. We demonstrate its accuracy in four examples of importance in many fields: Brownian motion, damped harmonic oscillation, fractional Brownian motion and continuous time random walks. We also successfully apply our method, weighted least squares including correlation in error estimation (WLS-ICE), to particle tracking data. The WLS-ICE method is applicable to arbitrary fit functions, and we provide a publicly available WLS-ICE software.
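The flavor of the WLS-ICE formula can be conveyed with the standard "sandwich" covariance: fit with simple weights, but propagate the full data covariance into the parameter errors. A toy diffusion fit follows (all numbers invented):

    import numpy as np

    def wls_with_correlated_errors(A, y, w, Sigma):
        # A: (n, p) design matrix, w: (n,) fit weights, Sigma: (n, n) data covariance
        W = np.diag(w)
        M = np.linalg.inv(A.T @ W @ A)
        theta = M @ A.T @ W @ y                  # ordinary WLS point estimate
        cov = M @ A.T @ W @ Sigma @ W @ A @ M    # correlation-aware parameter errors
        return theta, cov

    rng = np.random.default_rng(7)
    t = np.linspace(0.1, 10, 50)
    Sigma = 0.01 * np.exp(-np.abs(t[:, None] - t[None, :]))   # temporal correlations
    y = 2 * 0.5 * t + np.linalg.cholesky(Sigma) @ rng.normal(size=50)  # MSD = 2 D t, D = 0.5
    theta, cov = wls_with_correlated_errors(t[:, None], y, np.ones(50), Sigma)
    print(theta[0] / 2, np.sqrt(cov[0, 0]) / 2)  # D estimate and its std error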
Locally Weighted Ensemble Clustering.
Huang, Dong; Wang, Chang-Dong; Lai, Jian-Huang
2018-05-01
Due to its ability to combine multiple base clusterings into a probably better and more robust clustering, the ensemble clustering technique has been attracting increasing attention in recent years. Despite the significant success, one limitation to most of the existing ensemble clustering methods is that they generally treat all base clusterings equally regardless of their reliability, which makes them vulnerable to low-quality base clusterings. Although some efforts have been made to (globally) evaluate and weight the base clusterings, these methods tend to view each base clustering as an individual and neglect the local diversity of clusters inside the same base clustering. It remains an open problem how to evaluate the reliability of clusters and exploit the local diversity in the ensemble to enhance the consensus performance, especially in the case when there is no access to data features or specific assumptions on data distribution. To address this, in this paper, we propose a novel ensemble clustering approach based on ensemble-driven cluster uncertainty estimation and a local weighting strategy. In particular, the uncertainty of each cluster is estimated by considering the cluster labels in the entire ensemble via an entropic criterion. A novel ensemble-driven cluster validity measure is introduced, and a locally weighted co-association matrix is presented to serve as a summary for the ensemble of diverse clusters. With the local diversity in ensembles exploited, two novel consensus functions are further proposed. Extensive experiments on a variety of real-world datasets demonstrate the superiority of the proposed approach over the state-of-the-art.
Quantum State Transfer via Noisy Photonic and Phononic Waveguides
NASA Astrophysics Data System (ADS)
Vermersch, B.; Guimond, P.-O.; Pichler, H.; Zoller, P.
2017-03-01
We describe a quantum state transfer protocol, where a quantum state of photons stored in a first cavity can be faithfully transferred to a second distant cavity via an infinite 1D waveguide, while being immune to arbitrary noise (e.g., thermal noise) injected into the waveguide. We extend the model and protocol to a cavity QED setup, where atomic ensembles, or single atoms representing quantum memory, are coupled to a cavity mode. We present a detailed study of sensitivity to imperfections, and apply a quantum error correction protocol to account for random losses (or additions) of photons in the waveguide. Our numerical analysis is enabled by matrix product state techniques to simulate the complete quantum circuit, which we generalize to include thermal input fields. Our discussion applies both to photonic and phononic quantum networks.
Bacterial accumulation in viscosity gradients
NASA Astrophysics Data System (ADS)
Waisbord, Nicolas; Guasto, Jeffrey
2016-11-01
Cell motility is greatly modified by fluid rheology. In particular, the physical environments in which cells function are often characterized by gradients of viscous biopolymers, such as mucus and extracellular matrix, which impact processes ranging from reproduction to digestion to biofilm formation. To understand how spatial heterogeneity of fluid rheology affects the motility and transport of swimming cells, we use hydrogel microfluidic devices to generate viscosity gradients in a simple, polymeric, Newtonian fluid. Using video microscopy, we characterize the random walk motility patterns of model bacteria (Bacillus subtilis), showing that both wild-type ('run-and-tumble') cells and smooth-swimming mutants accumulate in the viscous region of the fluid. Through statistical analysis of individual cell trajectories and body kinematics in both homogeneous and heterogeneous viscous environments, we discriminate passive, physical effects from active sensing processes to explain the observed cell accumulation at the ensemble level.
Fidelity decay in interacting two-level boson systems: Freezing and revivals
NASA Astrophysics Data System (ADS)
Benet, Luis; Hernández-Quiroz, Saúl; Seligman, Thomas H.
2011-05-01
We study the fidelity decay in the k-body embedded ensembles of random matrices for bosons distributed in two single-particle states, considering the reference or unperturbed Hamiltonian as the one-body terms plus the diagonal part of the k-body embedded ensemble of random matrices, and the perturbation as the residual off-diagonal part of the interaction. We calculate the ensemble-averaged fidelity with respect to an initial random state within linear response theory to second order in the perturbation strength and demonstrate that it displays the freeze of the fidelity. During the freeze, the average fidelity exhibits periodic revivals at integer values of the Heisenberg time tH. By selecting specific k-body terms of the residual interaction, we find that the periodicity of the revivals during the freeze of fidelity is an integer fraction of tH, thus relating the period of the revivals with the range of the interaction k of the perturbing terms. Numerical calculations confirm the analytical results.
A scattering database of marine particles and its application in optical analysis
NASA Astrophysics Data System (ADS)
Xu, G.; Yang, P.; Kattawar, G.; Zhang, X.
2016-12-01
In modeling the scattering properties of marine particles (e.g. phytoplankton), laboratory studies imply a need to properly account for the influence of particle morphology, in addition to size and composition. In this study, a marine particle scattering database is constructed using a collection of distorted hexahedral shapes. Specifically, the scattering properties of each size bin and refractive index are obtained by the ensemble average associated with distorted hexahedra with randomly tilted facets and selected aspect ratios (from elongated to flattened). The degree of randomness in the shape-generation process defines the geometric irregularity of the particles in the group. The geometric irregularity and particle aspect ratios constitute a set of "shape factors" to be accounted for (e.g. in best-fit analysis). To cover most of the marine particle size range, we combine the Invariant Imbedding T-matrix (II-TM) method and the Physical-Geometric Optics Hybrid (PGOH) method in the calculations. The simulated optical properties are shown and compared with those obtained from Lorenz-Mie theory. Using the scattering database, we present a preliminary optical analysis of laboratory-measured optical properties of marine particles.
Kalman filter data assimilation: targeting observations and parameter estimation.
Bellsky, Thomas; Kostelich, Eric J; Mahalov, Alex
2014-06-01
This paper studies the effect of targeted observations on state and parameter estimates determined with Kalman filter data assimilation (DA) techniques. We first provide an analytical result demonstrating that targeting observations within the Kalman filter for a linear model can significantly reduce state estimation error as opposed to fixed or randomly located observations. We next conduct observing system simulation experiments for a chaotic model of meteorological interest, where we demonstrate that the local ensemble transform Kalman filter (LETKF) with targeted observations based on largest ensemble variance is skillful in providing more accurate state estimates than the LETKF with randomly located observations. Additionally, we find that a hybrid ensemble Kalman filter parameter estimation method accurately updates model parameters within the targeted observation context to further improve state estimation.
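A stripped-down stochastic EnKF update conveys the targeting idea, with the single observation placed at the component of largest ensemble variance (the paper uses the LETKF; everything below is a simplified stand-in with invented numbers):

    import numpy as np

    rng = np.random.default_rng(8)
    n, m = 40, 25                           # state dimension, ensemble size
    X = rng.normal(size=(n, m))             # forecast ensemble (columns = members)

    j = np.argmax(X.var(axis=1))            # target the largest-variance component
    H = np.zeros((1, n)); H[0, j] = 1.0     # observe that component only
    R = np.array([[0.1]])
    y = np.array([0.7])                     # the observed value (invented)

    Xp = X - X.mean(axis=1, keepdims=True)
    P = Xp @ Xp.T / (m - 1)                 # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    Y = y[:, None] + rng.normal(scale=np.sqrt(R[0, 0]), size=(1, m))
    Xa = X + K @ (Y - H @ X)                # perturbed-observation update
    print(X[j].var(), Xa[j].var())          # variance shrinks at the targeted site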
HLPI-Ensemble: Prediction of human lncRNA-protein interactions based on ensemble strategy.
Hu, Huan; Zhang, Li; Ai, Haixin; Zhang, Hui; Fan, Yetian; Zhao, Qi; Liu, Hongsheng
2018-03-27
LncRNA plays an important role in many biological processes and in disease progression by binding to related proteins. However, the experimental methods for studying lncRNA-protein interactions are time-consuming and expensive. Although there are a few models designed to predict ncRNA-protein interactions, they all have some common drawbacks that limit their predictive performance. In this study, we present a model called HLPI-Ensemble designed specifically for human lncRNA-protein interactions. HLPI-Ensemble adopts an ensemble strategy based on three mainstream machine learning algorithms, Support Vector Machines (SVM), Random Forests (RF) and Extreme Gradient Boosting (XGB), to generate HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble, respectively. The results of 10-fold cross-validation show that HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble achieved AUCs of 0.95, 0.96 and 0.96, respectively, on the test dataset. Furthermore, we compared the performance of the HLPI-Ensemble models with previous models on an external validation dataset. The results show that the false positives (FPs) of the HLPI-Ensemble models are much lower than those of the previous models, and the other evaluation indicators of the HLPI-Ensemble models are also higher than those of the previous models. This further shows that the HLPI-Ensemble models are superior in predicting human lncRNA-protein interactions compared with previous models. HLPI-Ensemble is publicly available at: http://ccsipb.lnu.edu.cn/hlpiensemble/
Classroom Environment as Related to Contest Ratings among High School Performing Ensembles.
ERIC Educational Resources Information Center
Hamann, Donald L.; And Others
1990-01-01
Examines influence of classroom environments, measured by the Classroom Environment Scale, Form R (CESR), on vocal and instrumental ensembles' musical achievement at festival contests. Using random sample, reveals subjects with higher scores on CESR scales of involvement, affiliation, teacher support, and organization received better contest…
Hwang, Yoo Na; Lee, Ju Hwan; Kim, Ga Young; Shin, Eun Seok; Kim, Sung Min
2018-01-01
The purpose of this study was to propose a hybrid ensemble classifier to characterize coronary plaque regions in intravascular ultrasound (IVUS) images. Pixels were allocated to one of four tissues (fibrous tissue (FT), fibro-fatty tissue (FFT), necrotic core (NC), and dense calcium (DC)) through processes of border segmentation, feature extraction, feature selection, and classification. Grayscale IVUS images and their corresponding virtual histology images were acquired from 11 patients with known or suspected coronary artery disease using a 20 MHz catheter. A total of 102 hybrid textural features, including first order statistics (FOS), gray level co-occurrence matrix (GLCM), extended gray level run-length matrix (GLRLM), Laws, local binary pattern (LBP), intensity, and discrete wavelet features (DWF), were extracted from the IVUS images. To select optimal feature sets, a genetic algorithm was implemented. A hybrid ensemble classifier based on histogram and texture information was then used for plaque characterization in this study. The optimal feature set was used as the input of this ensemble classifier. After tissue characterization, parameters including sensitivity, specificity, and accuracy were calculated to validate the proposed approach. A ten-fold cross validation approach was used to determine the statistical significance of the proposed method. Our experimental results showed that the proposed method had reliable performance for tissue characterization in IVUS images. The hybrid ensemble classification method outperformed other existing methods by achieving characterization accuracy of 81% for FFT and 75% for NC. In addition, this study showed that Laws features (SSV and SAV) were key indicators for coronary tissue characterization. The proposed method has high clinical applicability for image-based tissue characterization.
From deep TLS validation to ensembles of atomic models built from elemental motions
Urzhumtsev, Alexandre; Afonine, Pavel V.; Van Benschoten, Andrew H.; ...
2015-07-28
The translation–libration–screw model, first introduced by Cruickshank, Schomaker and Trueblood, describes the concerted motions of atomic groups. Using TLS models can improve the agreement between calculated and experimental diffraction data. Because the T, L and S matrices describe a combination of atomic vibrations and librations, TLS models can also potentially shed light on molecular mechanisms involving correlated motions. However, this use of TLS models in mechanistic studies is hampered by the difficulties in translating the results of refinement into molecular movement or a structural ensemble. To convert the matrices into a constituent molecular movement, the matrix elements must satisfy several conditions. Refining the T, L and S matrix elements as independent parameters without taking these conditions into account may result in matrices that do not represent concerted molecular movements. Here, a mathematical framework and the computational tools to analyze TLS matrices, resulting in either explicit decomposition into descriptions of the underlying motions or a report of broken conditions, are described. The description of valid underlying motions can then be output as a structural ensemble. All methods are implemented as part of the PHENIX project.
Emergence of a spectral gap in a class of random matrices associated with split graphs
NASA Astrophysics Data System (ADS)
Bassler, Kevin E.; Zia, R. K. P.
2018-01-01
Motivated by the intriguing behavior displayed in a dynamic network that models a population of extreme introverts and extroverts (XIE), we consider the spectral properties of ensembles of random split graph adjacency matrices. We discover that, in general, a gap emerges in the bulk spectrum between -1 and 0 that contains a single eigenvalue. An analytic expression for the bulk distribution is derived and verified with numerical analysis. We also examine their relation to chiral ensembles, which are associated with bipartite graphs.
Amozegar, M; Khorasani, K
2016-04-01
In this paper, a new approach for Fault Detection and Isolation (FDI) of gas turbine engines is proposed by developing an ensemble of dynamic neural network identifiers. For health monitoring of the gas turbine engine, its dynamics is first identified by constructing three separate or individual dynamic neural network architectures. Specifically, a dynamic multi-layer perceptron (MLP), a dynamic radial-basis function (RBF) neural network, and a dynamic support vector machine (SVM) are trained to individually identify and represent the gas turbine engine dynamics. Next, three ensemble-based techniques are developed to represent the gas turbine engine dynamics, namely, two heterogeneous ensemble models and one homogeneous ensemble model. It is first shown that all ensemble approaches do significantly improve the overall performance and accuracy of the developed system identification scheme when compared to each of the stand-alone solutions. The best selected stand-alone model (i.e., the dynamic RBF network) and the best selected ensemble architecture (i.e., the heterogeneous ensemble) in terms of their performances in achieving an accurate system identification are then selected for solving the FDI task. The required residual signals are generated by using both a single model-based solution and an ensemble-based solution under various gas turbine engine health conditions. Our extensive simulation studies demonstrate that the fault detection and isolation task achieved by using the residuals that are obtained from the dynamic ensemble scheme results in a significantly more accurate and reliable performance as illustrated through detailed quantitative confusion matrix analysis and comparative studies.
Randomizing Genome-Scale Metabolic Networks
Samal, Areejit; Martin, Olivier C.
2011-01-01
Networks coming from protein-protein interactions, transcriptional regulation, signaling, or metabolism may appear to have “unusual” properties. To quantify this, it is appropriate to randomize the network and test the hypothesis that the network is not statistically different from expected in a motivated ensemble. However, when dealing with metabolic networks, the randomization of the network using edge exchange generates fictitious reactions that are biochemically meaningless. Here we provide several natural ensembles of randomized metabolic networks. A first constraint is to use valid biochemical reactions. Further constraints correspond to imposing appropriate functional constraints. We explain how to perform these randomizations with the help of Markov Chain Monte Carlo (MCMC) and show that they allow one to approach the properties of biological metabolic networks. The implication of the present work is that the observed global structural properties of real metabolic networks are likely to be the consequence of simple biochemical and functional constraints.
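The generic degree-preserving move underlying such randomizations is the double-edge swap; the sketch below shows it for a simple graph (the biochemical-validity and functional constraints the paper imposes on metabolic networks are not modeled here):

    import random
    import networkx as nx

    def randomize_preserving_degrees(G, n_swaps=None, seed=9):
        rng = random.Random(seed)
        H = G.copy()
        n_swaps = n_swaps or 10 * H.number_of_edges()
        done = 0
        while done < n_swaps:
            (a, b), (c, d) = rng.sample(list(H.edges()), 2)
            # propose rewiring a-b, c-d  ->  a-d, c-b
            if len({a, b, c, d}) == 4 and not H.has_edge(a, d) and not H.has_edge(c, b):
                H.remove_edges_from([(a, b), (c, d)])
                H.add_edges_from([(a, d), (c, b)])
                done += 1
        return H

    G = nx.karate_club_graph()
    H = randomize_preserving_degrees(G)
    assert dict(G.degree()) == dict(H.degree())   # degree sequence preserved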
The Principle of Energetic Consistency
NASA Technical Reports Server (NTRS)
Cohn, Stephen E.
2009-01-01
A basic result in estimation theory is that the minimum variance estimate of the dynamical state, given the observations, is the conditional mean estimate. This result holds independently of the specifics of any dynamical or observation nonlinearity or stochasticity, requiring only that the probability density function of the state, conditioned on the observations, has two moments. For nonlinear dynamics that conserve a total energy, this general result implies the principle of energetic consistency: if the dynamical variables are taken to be the natural energy variables, then the sum of the total energy of the conditional mean and the trace of the conditional covariance matrix (the total variance) is constant between observations. Ensemble Kalman filtering methods are designed to approximate the evolution of the conditional mean and covariance matrix. For them the principle of energetic consistency holds independently of ensemble size, even with covariance localization. However, full Kalman filter experiments with advection dynamics have shown that a small amount of numerical dissipation can cause a large, state-dependent loss of total variance, to the detriment of filter performance. The principle of energetic consistency offers a simple way to test whether this spurious loss of variance limits ensemble filter performance in full-blown applications. The classical second-moment closure (third-moment discard) equations also satisfy the principle of energetic consistency, independently of the rank of the conditional covariance matrix. Low-rank approximation of these equations offers an energetically consistent, computationally viable alternative to ensemble filtering. Current formulations of long-window, weak-constraint, four-dimensional variational methods are designed to approximate the conditional mode rather than the conditional mean. Thus they neglect the nonlinear bias term in the second-moment closure equation for the conditional mean. The principle of energetic consistency implies that, to precisely the extent that growing modes are important in data assimilation, this term is also important.
Designing boosting ensemble of relational fuzzy systems.
Scherer, Rafał
2010-10-01
A method frequently used in classification systems for improving classification accuracy is to combine outputs of several classifiers. Among various types of classifiers, fuzzy ones are tempting because of their use of intelligible fuzzy if-then rules. In the paper we build an AdaBoost ensemble of relational neuro-fuzzy classifiers. Relational fuzzy systems bond input and output fuzzy linguistic values by a binary relation; thus, fuzzy rules have additional weights, compared to traditional fuzzy systems - the elements of a fuzzy relation matrix. Thanks to this, the system is better adjustable to data during learning. In the paper an ensemble of relational fuzzy systems is proposed. The problem is that such an ensemble contains separate rule bases which cannot be directly merged. As the systems are separate, we cannot treat fuzzy rules coming from different systems as rules from the same (single) system. In the paper, the problem is addressed by a novel design of the fuzzy systems constituting the ensemble, resulting in normalization of the individual rule bases during learning. The method described in the paper is tested on several known benchmarks and compared with other machine learning solutions from the literature.
Power to Detect Intervention Effects on Ensembles of Social Networks
ERIC Educational Resources Information Center
Sweet, Tracy M.; Junker, Brian W.
2016-01-01
The hierarchical network model (HNM) is a framework introduced by Sweet, Thomas, and Junker for modeling interventions and other covariate effects on ensembles of social networks, such as what would be found in randomized controlled trials in education research. In this article, we develop calculations for the power to detect an intervention…
Experimental Study of Quantum Graphs with Microwave Networks
NASA Astrophysics Data System (ADS)
Fu, Ziyuan; Koch, Trystan; Antonsen, Thomas; Ott, Edward; Anlage, Steven; Wave Chaos Team
An experimental setup consisting of microwave networks is used to simulate quantum graphs. The networks are constructed from coaxial cables connected by T junctions. The networks are built for operation both at room temperature and superconducting versions that operate at cryogenic temperatures. In the experiments, a phase shifter is connected to one of the network bonds to generate an ensemble of quantum graphs by varying the phase delay. The eigenvalue spectrum is found from S-parameter measurements on one-port graphs. With the experimental data, the nearest-neighbor spacing statistics and the impedance statistics of the graphs are examined. It is also demonstrated that time-reversal invariance for microwave propagation in the graphs can be broken without increasing dissipation significantly by making nodes with circulators. Random matrix theory (RMT) successfully describes universal statistical properties of the system. We acknowledge support under contract AFOSR COE Grant FA9550-15-1-0171.
Theory of Stochastic Laplacian Growth
NASA Astrophysics Data System (ADS)
Alekseev, Oleg; Mineev-Weinstein, Mark
2017-07-01
We generalize diffusion-limited aggregation by issuing many randomly walking particles, which stick to a cluster at each discrete time unit, providing its growth. Using simple combinatorial arguments we determine the probabilities of different growth scenarios and prove that the most probable evolution is governed by the deterministic Laplacian growth equation. A potential-theoretical analysis of the growth probabilities reveals connections with the tau-function of the integrable dispersionless limit of the two-dimensional Toda hierarchy, normal matrix ensembles, and the two-dimensional Dyson gas confined in a non-uniform magnetic field. We introduce the time-dependent Hamiltonian, which generates transitions between different classes of equivalence of closed curves, and prove the Hamiltonian structure of the interface dynamics. Finally, we propose a relation between the probabilities of growth scenarios and the semi-classical limit of certain correlation functions of "light" exponential operators in the Liouville conformal field theory on a pseudosphere.
NASA Astrophysics Data System (ADS)
Bianconi, Ginestra
2009-03-01
In this paper we generalize the concept of random networks to describe network ensembles with nontrivial features by a statistical mechanics approach. This framework is able to describe undirected and directed network ensembles as well as weighted network ensembles. These networks might have nontrivial community structure or, in the case of networks embedded in a given space, they might have a link probability with a nontrivial dependence on the distance between the nodes. These ensembles are characterized by their entropy, which evaluates the cardinality of networks in the ensemble. In particular, in this paper we define and evaluate the structural entropy, i.e., the entropy of the ensembles of undirected uncorrelated simple networks with given degree sequence. We stress the apparent paradox that scale-free degree distributions are characterized by having small structural entropy while they are so widely encountered in natural, social, and technological complex systems. We propose a solution to the paradox by proving that scale-free degree distributions are the most likely degree distribution with the corresponding value of the structural entropy. Finally, the general framework we present in this paper is able to describe microcanonical ensembles of networks as well as canonical or hidden-variable network ensembles with significant implications for the formulation of network-constructing algorithms.
Mass Conservation and Positivity Preservation with Ensemble-type Kalman Filter Algorithms
NASA Technical Reports Server (NTRS)
Janjic, Tijana; McLaughlin, Dennis B.; Cohn, Stephen E.; Verlaan, Martin
2013-01-01
Maintaining conservative physical laws numerically has long been recognized as being important in the development of numerical weather prediction (NWP) models. In the broader context of data assimilation, concerted efforts to maintain conservation laws numerically and to understand the significance of doing so have begun only recently. In order to enforce physically based conservation laws of total mass and positivity in the ensemble Kalman filter, we incorporate constraints to ensure that the filter ensemble members and the ensemble mean conserve mass and remain nonnegative through measurement updates. We show that the analysis steps of the ensemble transform Kalman filter (ETKF) and the ensemble Kalman filter (EnKF) algorithms can conserve the mass integral, but do not preserve positivity. Further, if localization is applied or if negative values are simply set to zero, then the total mass is not conserved either. In order to ensure mass conservation, a projection matrix that corrects for localization effects is constructed. In order to maintain both mass conservation and positivity preservation through the analysis step, we construct data assimilation algorithms based on quadratic programming and ensemble Kalman filtering. Mass and positivity are both preserved by formulating the filter update as a set of quadratic programming problems that incorporate constraints. Some simple numerical experiments indicate that this approach can have a significant positive impact on the posterior ensemble distribution, giving results that are more physically plausible both for individual ensemble members and for the ensemble mean. The results show clear improvements in both analyses and forecasts, particularly in the presence of localized features. Behavior of the algorithm is also tested in the presence of model error.
Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling
NASA Astrophysics Data System (ADS)
Galelli, S.; Castelletti, A.
2013-02-01
Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modeling. In this paper we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modeling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalization property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally very efficient; and, (iii) allows one to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analyzed on two real-world case studies (Marina catchment (Singapore) and Canning River (Western Australia)) representing two different morphoclimatic contexts comparatively with other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variables provided can be given a physically meaningful interpretation.
Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling
NASA Astrophysics Data System (ADS)
Galelli, S.; Castelletti, A.
2013-07-01
Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and, (iii) allows one to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variables provided can be given a physically meaningful interpretation.
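A small Extra-Trees regression shows the two features these papers highlight, cheap training and an ex-post ranking of inputs; the predictors below are invented stand-ins for catchment data:

    import numpy as np
    from sklearn.ensemble import ExtraTreesRegressor

    rng = np.random.default_rng(10)
    n = 2000
    rain = rng.gamma(2.0, 5.0, n)                   # synthetic rainfall
    rain_lag = np.roll(rain, 1)                     # previous-step rainfall
    temp = rng.normal(20, 5, n)                     # synthetic temperature
    X = np.column_stack([rain, rain_lag, temp])
    q = 0.6*rain + 0.3*rain_lag - 0.05*temp + rng.normal(0, 1, n)  # toy streamflow

    et = ExtraTreesRegressor(n_estimators=300, random_state=0).fit(X, q)
    for name, imp in zip(["rain", "rain_lag", "temp"], et.feature_importances_):
        print(name, round(imp, 3))                  # relative input importance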
The Cauchy Two-Matrix Model, C-Toda Lattice and CKP Hierarchy
NASA Astrophysics Data System (ADS)
Li, Chunxia; Li, Shi-Hao
2018-06-01
This paper studies the Cauchy two-matrix model and its corresponding integrable hierarchy with the help of orthogonal polynomial theory and Toda-type equations. Starting from the symmetric reduction in Cauchy biorthogonal polynomials, we derive the Toda equation of CKP type (or the C-Toda lattice) as well as its Lax pair by introducing time flows. Then, matrix integral solutions to the C-Toda lattice are extended to give solutions to the CKP hierarchy, which reveals that the time-dependent partition function of the Cauchy two-matrix model is nothing but the τ-function of the CKP hierarchy. Finally, the connection between the Cauchy two-matrix model and the Bures ensemble is established from the point of view of integrable systems.
Quiet planting in the locked constraints satisfaction problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zdeborova, Lenka; Krzakala, Florent
2009-01-01
We study the planted ensemble of locked constraint satisfaction problems. We describe the connection between the random and planted ensembles. The use of the cavity method is combined with arguments from reconstruction on trees and first and second moment considerations; in particular, the connection with reconstruction on trees appears to be crucial. Our main result is the location of the hard region in the planted ensemble, thus providing hard satisfiable benchmarks. In a part of that hard region, instances have, with high probability, a single satisfying assignment.
NASA Astrophysics Data System (ADS)
Deng, Ziwang; Liu, Jinliang; Qiu, Xin; Zhou, Xiaolan; Zhu, Huaiping
2017-10-01
A novel method for daily temperature and precipitation downscaling is proposed in this study, combining the Ensemble Optimal Interpolation (EnOI) and bias correction techniques. For downscaling temperature, the day-to-day seasonal cycle of high-resolution temperature from the NCEP climate forecast system reanalysis (CFSR) is used as the background state. An enlarged ensemble of daily temperature anomalies relative to this seasonal cycle and information from global climate models (GCMs) are used to construct a gain matrix for each calendar day. Consequently, the relationship between large- and local-scale processes represented by the gain matrix will change accordingly. The gain matrix contains information on the realistic spatial correlation of temperature between different CFSR grid points, between CFSR grid points and GCM grid points, and between different GCM grid points. Therefore, this downscaling method keeps spatial consistency and reflects the interaction between local geographic and atmospheric conditions. Maximum and minimum temperatures are downscaled using the same method. For precipitation, because of the non-Gaussianity issue, a logarithmic transformation is applied to daily total precipitation prior to downscaling. Cross validation and independent data validation are used to evaluate this algorithm. Finally, data from a 29-member ensemble of phase 5 of the Coupled Model Intercomparison Project (CMIP5) GCMs are downscaled to CFSR grid points in Ontario for the period from 1981 to 2100. The results show that this method is capable of generating high-resolution details without changing large-scale characteristics. It results in much lower absolute errors in local-scale details at most grid points than simple spatial downscaling methods. Biases in the downscaled data inherited from GCMs are corrected with a linear method for temperatures and distribution mapping for precipitation. The downscaled ensemble projects significant warming with amplitudes of 3.9 and 6.5 °C for the 2050s and 2080s relative to the 1990s in Ontario, respectively; cooling degree days and hot days will significantly increase over southern Ontario, and heating degree days and cold days will significantly decrease in northern Ontario. Annual total precipitation will increase over Ontario and heavy precipitation events will increase as well. These results are consistent with conclusions in many other studies in the literature.
A fast ergodic algorithm for generating ensembles of equilateral random polygons
NASA Astrophysics Data System (ADS)
Varela, R.; Hinson, K.; Arsuaga, J.; Diao, Y.
2009-03-01
Knotted structures are commonly found in circular DNA and along the backbone of certain proteins. In order to properly estimate properties of these three-dimensional structures it is often necessary to generate large ensembles of simulated closed chains (i.e. polygons) of equal edge lengths (such polygons are called equilateral random polygons). However, finding efficient algorithms that properly sample the space of equilateral random polygons is a difficult problem. Currently there are no proven algorithms that generate equilateral random polygons according to their theoretical distribution. In this paper we propose a method that generates equilateral random polygons in a 'step-wise uniform' way. We prove that this method is ergodic in the sense that any given equilateral random polygon can be generated by this method, and we show that the time needed to generate an equilateral random polygon of length n is linear in n. These two properties make this algorithm a substantial improvement over existing generation methods. Detailed numerical comparisons of our algorithm with other widely used algorithms are provided.
Kim, Hyoungrae; Jang, Cheongyun; Yadav, Dharmendra K; Kim, Mi-Hyun
2017-03-23
The accuracy of 3D-QSAR, pharmacophore and 3D-similarity based chemometric target fishing models is highly dependent on a reasonable sample of active conformations. Although a number of diverse conformational sampling algorithms exist that exhaustively generate enough conformers, model building methods rely on an explicit number of common conformers. In this work, we attempted to devise clustering algorithms that automatically find a reasonable number of representative conformer ensembles from an asymmetric dissimilarity matrix generated with the OpenEye toolkit. RMSD was the key descriptor (variable): each column of the N × N matrix is treated as one of N variables describing the relationship (network) between the conformer (in a row) and the other N conformers. This approach was used to evaluate the performance of well-known clustering algorithms by comparing them in terms of generating representative conformer ensembles, and to test them over different matrix transformation functions with respect to stability. In the network, the representative conformer group could be resampled by four kinds of algorithms with implicit parameters. The directed dissimilarity matrix becomes the only input to the clustering algorithms. The Dunn index, Davies-Bouldin index, eta-squared values and omega-squared values were used to evaluate the clustering algorithms with respect to compactness and explanatory power. The evaluation includes the reduction (abstraction) rate of the data, the correlation between the sizes of the population and the samples, the computational complexity and the memory usage as well. Every algorithm could find representative conformers automatically without any user intervention, and they reduced the data to 14-19% of the original values within 1.13 s per sample at the most. The clustering methods are simple and practical as they are fast and do not ask for any explicit parameters. RCDTC presented the maximum Dunn and omega-squared values among the four algorithms, in addition to a consistent reduction rate between the population size and the sample size. The performance of the clustering algorithms was consistent over different transformation functions. Moreover, the clustering method can also be applied to molecular dynamics sampling simulation results.
Ensemble habitat mapping of invasive plant species
Stohlgren, T.J.; Ma, P.; Kumar, S.; Rocca, M.; Morisette, J.T.; Jarnevich, C.S.; Benson, N.
2010-01-01
Ensemble species distribution models combine the strengths of several species environmental matching models, while minimizing the weakness of any one model. Ensemble models may be particularly useful in risk analysis of recently arrived, harmful invasive species because species may not yet have spread to all suitable habitats, leaving species-environment relationships difficult to determine. We tested five individual models (logistic regression, boosted regression trees, random forest, multivariate adaptive regression splines (MARS), and maximum entropy model or Maxent) and ensemble modeling for selected nonnative plant species in Yellowstone and Grand Teton National Parks, Wyoming; Sequoia and Kings Canyon National Parks, California, and areas of interior Alaska. The models are based on field data provided by the park staffs, combined with topographic, climatic, and vegetation predictors derived from satellite data. For the four invasive plant species tested, ensemble models were the only models that ranked in the top three models for both field validation and test data. Ensemble models may be more robust than individual species-environment matching models for risk analysis.
On the use of transition matrix methods with extended ensembles.
Escobedo, Fernando A; Abreu, Charlles R A
2006-03-14
Different extended ensemble schemes for non-Boltzmann sampling (NBS) of a selected reaction coordinate λ were formulated so that they employ (i) "variable" sampling-window schemes (including the "successive umbrella sampling" method) to comprehensively explore the λ domain and (ii) transition matrix methods to iteratively obtain the underlying free-energy landscape η (or "importance" weights) associated with λ. The connection between "acceptance ratio" and transition matrix methods was first established to form the basis of the approach for estimating η(λ). The validity and performance of the different NBS schemes were then assessed using the configurational energy of the Lennard-Jones fluid as the λ coordinate. For the cases studied, the convergence rate in the estimation of η was found to be little affected by the use of data from high-order transitions, while it was noticeably improved by the use of a broader sampling window in the variable-window methods. Finally, it is shown how an "elastic" sampling window can be used to effectively enact (nonuniform) preferential sampling over the λ domain, and how to stitch the weights from separate one-dimensional NBS runs to produce an η surface over a two-dimensional domain.
Helmholtz and Gibbs ensembles, thermodynamic limit and bistability in polymer lattice models
NASA Astrophysics Data System (ADS)
Giordano, Stefano
2017-12-01
Representing polymers by random walks on a lattice is a fruitful approach, largely exploited to study the configurational statistics of polymer chains and to develop efficient Monte Carlo algorithms. Nevertheless, the stretching and the folding/unfolding of polymer chains within the Gibbs (isotensional) and Helmholtz (isometric) ensembles of statistical mechanics have not yet been thoroughly analysed by means of the lattice methodology. This topic, motivated by the recent introduction of several single-molecule force spectroscopy techniques, is investigated in the present paper. In particular, we analyse the force-extension curves under Gibbs and Helmholtz conditions and we give a proof of the equivalence of the ensembles in the thermodynamic limit for polymers represented by a standard random walk on a lattice. Then, we generalize these concepts to lattice polymers that can undergo conformational transitions or, equivalently, to chains composed of bistable or two-state elements (which can be either folded or unfolded). In this case, the isotensional condition leads to a plateau-like force-extension response, whereas the isometric condition causes a sawtooth-like force-extension curve, as predicted by numerous experiments. The equivalence of the ensembles is finally proved also for lattice polymer systems exhibiting conformational transitions.
Application of Ensemble Detection and Analysis to Modeling Uncertainty in Non Stationary Process
NASA Technical Reports Server (NTRS)
Racette, Paul
2010-01-01
Characterization of non-stationary and nonlinear processes is a challenge in many engineering and scientific disciplines. Climate change modeling and projection, retrieval of information from Doppler measurements of hydrometeors, and modeling of calibration architectures and algorithms in microwave radiometers are example applications that can benefit from improvements in the modeling and analysis of non-stationary processes. Analyses of measured signals have traditionally been limited to a single measurement series. Ensemble Detection is a technique whereby calibrated noise is mixed into the measurement to produce an ensemble measurement set. The collection of ensemble data sets enables new methods for analyzing random signals and offers powerful new approaches to studying and analyzing non-stationary processes. Derived information contained in the dynamic stochastic moments of a process will enable many novel applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tyson, Jon
2009-06-15
Matrix monotonicity is used to obtain upper bounds on minimum-error distinguishability of arbitrary ensembles of mixed quantum states. This generalizes one direction of a two-sided bound recently obtained by the author [J. Tyson, J. Math. Phys. 50, 032106 (2009)]. It is shown that the previously obtained special case has unique properties.
Multiple-instance ensemble learning for hyperspectral images
NASA Astrophysics Data System (ADS)
Ergul, Ugur; Bilgin, Gokhan
2017-10-01
An ensemble framework for multiple-instance (MI) learning (MIL) is introduced for use with hyperspectral images (HSIs), inspired by the bagging (bootstrap aggregation) method of ensemble learning. Ensemble-based bagging is performed with a small percentage of the training samples, and MI bags are formed by a local windowing process with variable window sizes on the selected instances. In addition to bootstrap aggregation, random subspace selection is used to diversify the base classifiers. The proposed method is implemented using four MIL classification algorithms. The classifier model learning phase is carried out with MI bags, and the estimation phase is performed over single test instances. In the experimental part of the study, two different HSIs with ground-truth information are used, and comparative results are demonstrated against state-of-the-art classification methods. In general, the MI ensemble approach produces more compact results in terms of both diversity and error compared to equipollent non-MIL algorithms.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we establish several results about the prediction accuracy of ensemble methods for binary classification that have been missed or misinterpreted in the previous literature. First, we derive the upper and lower bounds on the prediction accuracy (i.e., the best and worst possible prediction accuracies) of ensemble methods. Next, we show that an ensemble method can achieve > 0.5 prediction accuracy even when the individual classifiers all have < 0.5 prediction accuracy. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results, and show that it is hard to reach the upper and lower bound accuracies with randomly chosen individual classifiers; better algorithms need to be developed. PMID:21853162
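The counterintuitive claim that a majority-vote ensemble can exceed 0.5 accuracy while every base classifier is below 0.5 is easy to verify with a constructed example. The following sketch (illustrative, not from the paper) arranges correct votes so that they concentrate on a majority of samples:

```python
import numpy as np

# Rows: 3 classifiers; columns: 5 test samples.
# Entry 1 = classifier predicts the sample correctly, 0 = incorrectly.
votes = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
])

individual_acc = votes.mean(axis=1)        # each classifier: 0.4 < 0.5
majority_correct = votes.sum(axis=0) >= 2  # majority vote per sample
ensemble_acc = majority_correct.mean()     # ensemble: 0.6 > 0.5

print(individual_acc, ensemble_acc)
```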
Random-fractal Ansatz for the configurations of two-dimensional critical systems
NASA Astrophysics Data System (ADS)
Lee, Ching Hua; Ozaki, Dai; Matsueda, Hiroaki
2016-12-01
Critical systems have always intrigued physicists and precipitated the development of new techniques. Recently, there has been renewed interest in the information contained in the configurations of classical critical systems, whose computation does not require full knowledge of the wave function. Inspired by holographic duality, we investigated the entanglement properties of the classical configurations (snapshots) of the Potts model by introducing an Ansatz ensemble of random fractal images. By virtue of the central limit theorem, our Ansatz accurately reproduces the entanglement spectra of actual Potts snapshots without any fine tuning of parameters or artificial restrictions on ensemble choice. It provides a microscopic interpretation of the results of previous studies, which established a relation between the scaling behavior of the snapshot entropy and the critical exponent. More importantly, it elucidates the role of ensemble disorder in restoring conformal invariance, an aspect previously ignored. Away from criticality, the breakdown of scale invariance leads to a renormalization of the parameter Σ in the random fractal Ansatz, whose variation can be used as an alternative determination of the critical exponent. We conclude by providing a recipe for the explicit construction of fractal unit cells consistent with a given scaling exponent.
Barvinsky, A O
2007-08-17
The density matrix of the Universe for the microcanonical ensemble in quantum cosmology describes an equipartition in the physical phase space of the theory (sum over everything), but in terms of the observable spacetime geometry this ensemble is peaked about the set of recently obtained cosmological instantons limited to a bounded range of the cosmological constant. This suggests the mechanism of constraining the landscape of string vacua and a possible solution to the dark energy problem in the form of the quasiequilibrium decay of the microcanonical state of the Universe.
Thermodynamic characterization of synchronization-optimized oscillator networks
NASA Astrophysics Data System (ADS)
Yanagita, Tatsuo; Ichinomiya, Takashi
2014-12-01
We consider a canonical ensemble of synchronization-optimized networks of identical oscillators under external noise. By performing a Markov chain Monte Carlo simulation using the Kirchhoff index, i.e., the sum of the inverse eigenvalues of the Laplacian matrix (as a graph Hamiltonian of the network), we construct more than 1000 different synchronization-optimized networks. We then show that the transition from star to core-periphery structure depends on the connectivity of the network, and is characterized by the node-degree variance of the synchronization-optimized ensemble. We find that thermodynamic properties such as the heat capacity show anomalies for sparse networks.
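As a sketch of the graph Hamiltonian used here, the following Python fragment computes a Kirchhoff index from the Laplacian spectrum and performs one Metropolis rewiring step. The names, the N-times-sum convention for the index, and the rewiring move are illustrative assumptions, not the authors' code.

```python
import numpy as np
import networkx as nx

def kirchhoff_index(G):
    """N times the sum of inverse nonzero Laplacian eigenvalues
    (one common convention); serves as the graph 'Hamiltonian'."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    eig = np.linalg.eigvalsh(L)
    return len(G) * np.sum(1.0 / eig[eig > 1e-10])

def metropolis_step(G, beta, rng):
    """One MCMC move: rewire a random edge, accept with
    probability min(1, exp(-beta * dH)), keeping the graph connected."""
    H0 = kirchhoff_index(G)
    G2 = G.copy()
    u, v = list(G2.edges())[rng.integers(G2.number_of_edges())]
    candidates = [w for w in G2.nodes() if w != u and not G2.has_edge(u, w)]
    if not candidates:
        return G
    G2.remove_edge(u, v)
    G2.add_edge(u, rng.choice(candidates))
    if not nx.is_connected(G2):
        return G
    if rng.random() < np.exp(-beta * (kirchhoff_index(G2) - H0)):
        return G2
    return G

rng = np.random.default_rng(0)
G = nx.connected_watts_strogatz_graph(20, 4, 0.3, seed=1)
for _ in range(50):
    G = metropolis_step(G, beta=0.1, rng=rng)
print("final Kirchhoff index:", kirchhoff_index(G))
```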
Single-ping ADCP measurements in the Strait of Gibraltar
NASA Astrophysics Data System (ADS)
Sammartino, Simone; García Lafuente, Jesús; Naranjo, Cristina; Sánchez Garrido, José Carlos; Sánchez Leal, Ricardo
2016-04-01
Most Acoustic Doppler Current Profiler (ADCP) user manuals recommend applying ensemble averaging to the single-ping measurements in order to obtain reliable observations of the current speed. The random error of a single-ping measurement is typically too high for the ping to be used directly, while averaging reduces the ensemble error by a factor of approximately √N, with N the number of averaged pings. A 75 kHz ADCP moored at the western exit of the Strait of Gibraltar, part of the long-term monitoring of the Mediterranean outflow, recently served as a test setup for a different approach to current measurements. Ensemble averaging was disabled, while the internal coordinate conversion made by the instrument was maintained, and a series of single-ping measurements was collected every 36 seconds over a period of approximately 5 months. The huge amount of data was handled smoothly by the instrument, no abnormal battery consumption was recorded, and a long and unique series of very high frequency current measurements was collected. Results of this novel approach have been exploited in two ways. From a statistical point of view, the availability of single-ping measurements allows a real (a posteriori) estimate of the ensemble average error of both current and ancillary variables. While the theoretical random error for horizontal velocity is estimated a priori as ~2 cm s⁻¹ for a 50-ping ensemble, the value obtained by a posteriori averaging is ~15 cm s⁻¹, with asymptotic behavior starting from an averaging size of 10 pings per ensemble. This result suggests the presence of external sources of random error (e.g., turbulence) of higher magnitude than the internal sources (ADCP intrinsic precision), which cannot be reduced by ensemble averaging. On the other hand, although the instrumental configuration is clearly not suitable for a precise estimation of turbulent parameters, some hints of the turbulent structure of the flow can be obtained from the empirical computation of the zonal Reynolds stress (along the predominant direction of the current) and the rates of production and dissipation of turbulent kinetic energy. All the parameters show a clear correlation with tidal fluctuations of the current, with maximum values coinciding with flood tides, during the maxima of the outflow Mediterranean current.
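The a posteriori error estimate amounts to block-averaging pings and measuring the spread of the block means. A minimal synthetic sketch (illustrative values, not the mooring data) shows the √N reduction that holds only when the noise is uncorrelated, unlike the turbulent case described above:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic "single-ping" velocity series: true current plus ping noise.
true_u, ping_std, n_pings = 0.3, 1.0, 50_000
pings = true_u + ping_std * rng.normal(size=n_pings)

for n in (1, 5, 10, 50):
    # Block-average n pings per ensemble, then measure the spread of
    # the ensemble means a posteriori: it shrinks roughly as 1/sqrt(n).
    means = pings[: n_pings - n_pings % n].reshape(-1, n).mean(axis=1)
    print(n, means.std(), ping_std / np.sqrt(n))
```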
Understanding genetic regulatory networks
NASA Astrophysics Data System (ADS)
Kauffman, Stuart
2003-04-01
Random Boolean networks (RBNs) were introduced about 35 years ago as first crude models of genetic regulatory networks. RBNs comprise N on-off genes, connected by a randomly assigned regulatory wiring diagram in which each gene has K inputs and is controlled by a randomly assigned Boolean function. This procedure samples at random from the ensemble of all possible NK Boolean networks. The central ideas are to study the typical, or generic, properties of this ensemble and to see (1) whether characteristic differences appear as K and biases in the Boolean functions are introduced, and (2) whether a subclass of this ensemble has properties matching real cells. Such networks behave in an ordered or a chaotic regime, with a phase transition, "the edge of chaos", between the two regimes. Networks with continuous variables exhibit the same two regimes. Substantial evidence suggests that real cells are in the ordered regime. A key concept is that of an attractor: a reentrant trajectory of states of the network, called a state cycle. The central biological interpretation is that cell types are attractors. A number of properties differentiate the ordered and chaotic regimes. These include the size and number of attractors; the existence in the ordered regime of a percolating "sea" of genes frozen in the on or off state, with a remainder of isolated twinkling islands of genes; a power-law distribution of avalanches of gene activity changes following perturbation of a single gene in the ordered regime, versus a similar power-law distribution plus a spike of enormous avalanches of gene changes in the chaotic regime; and the existence of branching pathways of "differentiation" between attractors induced by perturbations in the ordered regime. Noise is a serious issue, since noise disrupts attractors. But numerical evidence suggests that attractors can be made very stable to noise, and meanwhile, metaplasias may be a biological manifestation of noise. As we learn more about the wiring diagram and the constraints on the rules controlling real genes, we can build refined ensembles reflecting these properties, study their generic properties, and hope to gain insight into the dynamics of real cells.
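Sampling one member of the NK ensemble and finding a state cycle is a short exercise; the sketch below (an illustrative implementation under the synchronous-update assumption) iterates a random initial state until it revisits a state and returns the attractor length.

```python
import numpy as np

def rbn_attractor_length(N=12, K=2, max_steps=5000, seed=0):
    """Sample one NK random Boolean network and iterate a random initial
    state until a state repeats; the gap between the two visits is the
    length of the attractor (state cycle)."""
    rng = np.random.default_rng(seed)
    inputs = np.stack([rng.choice(N, size=K, replace=False) for _ in range(N)])
    tables = rng.integers(0, 2, size=(N, 2 ** K))  # one Boolean function per gene
    powers = 2 ** np.arange(K)
    state = rng.integers(0, 2, size=N)
    seen = {}
    for t in range(max_steps):
        key = tuple(state)
        if key in seen:
            return t - seen[key]
        seen[key] = t
        addr = state[inputs] @ powers              # decode each gene's K inputs
        state = tables[np.arange(N), addr]         # synchronous update
    return None

print([rbn_attractor_length(seed=s) for s in range(5)])
```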
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kroeninger, Kevin Alexander; /Bonn U.
2004-04-01
Using data sets of 158 and 169 pb⁻¹ of D0 Run-II data in the electron and muon plus jets channels, respectively, the top quark mass has been measured using the Matrix Element Method. The method and its implementation are described. Its performance is studied in Monte Carlo using ensemble tests, and the method is applied to the Moriond 2004 data set.
Density matrix Monte Carlo modeling of quantum cascade lasers
NASA Astrophysics Data System (ADS)
Jirauschek, Christian
2017-10-01
By including elements of the density matrix formalism, the semiclassical ensemble Monte Carlo method for carrier transport is extended to incorporate incoherent tunneling, known to play an important role in quantum cascade lasers (QCLs). In particular, this effect dominates electron transport across thick injection barriers, which are frequently used in terahertz QCL designs. A self-consistent model for quantum mechanical dephasing is implemented, eliminating the need for empirical simulation parameters. Our modeling approach is validated against available experimental data for different types of terahertz QCL designs.
Coherent Magnetic Response at Optical Frequencies Using Atomic Transitions
NASA Astrophysics Data System (ADS)
Brewer, Nicholas R.; Buckholtz, Zachary N.; Simmons, Zachary J.; Mueller, Eli A.; Yavuz, Deniz D.
2017-01-01
In optics, the interaction of atoms with the magnetic field of light is almost always ignored since its strength is many orders of magnitude weaker than the interaction with the electric field. In this article, by using a magnetic-dipole transition within the 4f shell of europium ions, we show a strong interaction between a green laser and an ensemble of atomic ions. The electrons move coherently between the ground and excited ionic levels (Rabi flopping) by interacting with the magnetic field of the laser. By measuring the Rabi flopping frequency as the laser intensity is varied, we report the first direct measurement of a magnetic-dipole matrix element in the optical region of the spectrum. Using density-matrix simulations of the ensemble, we infer the generation of coherent magnetization with magnitude 5.5 × 10⁻³ A/m, which is capable of generating left-handed electromagnetic waves of intensity 1 nW/cm². These results open up the prospect of constructing left-handed materials using sharp transitions of atoms.
Large unbalanced credit scoring using Lasso-logistic regression ensemble.
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
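A minimal scikit-learn sketch of the balance-then-bag idea follows; for brevity the paper's clustering-based diversification is replaced here by simple random undersampling (an assumption), and the data set is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for a large unbalanced credit data set.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9],
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, stratify=y,
                                      random_state=0)
rng = np.random.default_rng(0)
pos, neg = np.where(ytr == 1)[0], np.where(ytr == 0)[0]

probs = []
for _ in range(15):
    # Balance each bag by undersampling the majority class, then bootstrap.
    bag = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
    bag = rng.choice(bag, size=len(bag), replace=True)
    # L1 penalty gives the Lasso-style regularized logistic base learner.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    clf.fit(Xtr[bag], ytr[bag])
    probs.append(clf.predict_proba(Xte)[:, 1])

print("ensemble AUC:", roc_auc_score(yte, np.mean(probs, axis=0)))
```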
NASA Astrophysics Data System (ADS)
Rowley, C. D.; Hogan, P. J.; Martin, P.; Thoppil, P.; Wei, M.
2017-12-01
An extended-range ensemble forecast system is being developed in the US Navy Earth System Prediction Capability (ESPC), and a global ocean ensemble generation capability to represent uncertainty in the ocean initial conditions has been developed. At extended forecast times, uncertainty due to model error overtakes initial-condition uncertainty as the primary source of forecast uncertainty. Recently, stochastic parameterization or stochastic forcing techniques have been applied to represent model error in research and operational atmospheric, ocean, and coupled ensemble forecasts. A simple stochastic forcing technique has been developed for application to US Navy high-resolution regional and global ocean models, for use in ocean-only and coupled atmosphere-ocean-ice-wave ensemble forecast systems. Perturbation forcing is added to the tendency equations for state variables, with the forcing defined by random 3- or 4-dimensional fields whose horizontal, vertical, and temporal correlations are specified to characterize different possible kinds of error. Here, we demonstrate the stochastic forcing in regional and global ensemble forecasts with varying perturbation amplitudes and length and time scales, and assess the change in ensemble skill as measured by a range of deterministic and probabilistic metrics.
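The flavor of such a perturbation can be sketched in a few lines: a hypothetical 2D forcing field with a prescribed horizontal correlation length (Gaussian smoothing) and first-order autoregressive temporal correlation. Amplitudes and scales below are illustrative, not the Navy system's.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def stochastic_forcing(shape=(64, 64), length_scale=4.0, tau=10.0,
                       amplitude=1e-6, steps=5, seed=0):
    """Yield a time sequence of 2D random forcing fields with spatial
    correlation (Gaussian smoothing) and AR(1) temporal correlation."""
    rng = np.random.default_rng(seed)
    alpha = np.exp(-1.0 / tau)                    # AR(1) memory coefficient
    f = gaussian_filter(rng.normal(size=shape), length_scale)
    for _ in range(steps):
        noise = gaussian_filter(rng.normal(size=shape), length_scale)
        f = alpha * f + np.sqrt(1 - alpha ** 2) * noise
        yield amplitude * f / f.std()             # normalized perturbation

for field in stochastic_forcing():
    print(field.shape, field.std())
```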
NASA Astrophysics Data System (ADS)
Spicer, Graham L. C.; Azarin, Samira M.; Yi, Ji; Young, Scott T.; Ellis, Ronald; Bauer, Greta M.; Shea, Lonnie D.; Backman, Vadim
2016-10-01
In cancer biology, there has been a recent effort to understand tumor formation in the context of the tissue microenvironment. In particular, recent progress has explored the mechanisms behind how changes in the cell-extracellular matrix ensemble influence progression of the disease. The extensive use of in vitro tissue culture models in simulant matrix has proven effective at studying such interactions, but modalities for non-invasively quantifying aspects of these systems are scant. We present the novel application of an imaging technique, Inverse Spectroscopic Optical Coherence Tomography, for the non-destructive measurement of in vitro biological samples during matrix remodeling. Our findings indicate that the nanoscale-sensitive mass density correlation shape factor D of cancer cells increases in response to a more crosslinked matrix. We present a facile technique for the non-invasive, quantitative study of the micro- and nano-scale structure of the extracellular matrix and its host cells.
Encoding of Spatial Attention by Primate Prefrontal Cortex Neuronal Ensembles
Treue, Stefan
2018-01-01
Single neurons in the primate lateral prefrontal cortex (LPFC) encode information about the allocation of visual attention and the features of visual stimuli. However, how this compares to the performance of neuronal ensembles at encoding the same information is poorly understood. Here, we recorded the responses of neuronal ensembles in the LPFC of two macaque monkeys while they performed a task that required attending to one of two moving random dot patterns positioned in different hemifields and ignoring the other pattern. We found single units selective for the location of the attended stimulus as well as for its motion direction. To determine the coding of both variables in the population of recorded units, we used a linear classifier and progressively built neuronal ensembles by iteratively adding units according to their individual performance (best single units), or by iteratively adding units based on their contribution to the ensemble performance (best ensemble). For both methods, ensembles of relatively small sizes (n < 60) yielded substantially higher decoding performance relative to individual single units. However, the decoder reached similar performance using fewer neurons with the best ensemble building method compared with the best single units method. Our results indicate that neuronal ensembles within the LPFC encode more information about the attended spatial and nonspatial features of visual stimuli than individual neurons. They further suggest that efficient coding of attention can be achieved by relatively small neuronal ensembles characterized by a certain relationship between signal and noise correlation structures. PMID:29568798
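The "best ensemble" construction can be illustrated with a greedy forward-selection sketch on synthetic data (hypothetical firing rates, not the authors' recordings or code): at each step, add the unit that most improves cross-validated decoding of the attended location.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Hypothetical firing-rate matrix: trials x units, plus attended location.
n_trials, n_units = 200, 40
y = rng.integers(0, 2, size=n_trials)
X = rng.normal(size=(n_trials, n_units)) \
    + 0.4 * y[:, None] * rng.normal(size=n_units)   # weak tuning per unit

def cv_acc(cols):
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, cols], y, cv=5).mean()

# "Best ensemble": greedily add the unit that most improves decoding.
chosen, remaining = [], list(range(n_units))
for _ in range(10):
    best = max(remaining, key=lambda u: cv_acc(chosen + [u]))
    chosen.append(best)
    remaining.remove(best)
    print(len(chosen), round(cv_acc(chosen), 3))
```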
NASA Astrophysics Data System (ADS)
Yan, Y.; Barth, A.; Beckers, J. M.; Candille, G.; Brankart, J. M.; Brasseur, P.
2015-07-01
Sea surface height, sea surface temperature, and temperature profiles at depth collected between January and December 2005 are assimilated into a realistic eddy-permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. Sixty ensemble members are generated by adding realistic noise to the forcing parameters related to temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histograms, before the assimilation experiments. An incremental analysis update scheme is applied in order to reduce spurious oscillations due to the model state correction. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with independent/semi-independent observations. For deterministic validation, the ensemble means, together with the ensemble spreads, are compared to the observations in order to diagnose the ensemble distribution properties in a deterministic way. For probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system in terms of reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centered random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system. The improvement from the assimilation is demonstrated using these validation metrics. Finally, the deterministic and probabilistic validations are analyzed jointly, and their consistency and complementarity are highlighted.
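Both validation tools used here are easy to state concretely. The sketch below gives illustrative implementations (assuming exchangeable ensemble members) of a sample-based CRPS and a rank histogram; a flat histogram indicates a well-dispersed, reliable ensemble.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Sample-based CRPS for one forecast against one observation:
    E|X - y| - 0.5 * E|X - X'| over ensemble members X, X'."""
    members = np.asarray(members)
    term1 = np.abs(members - obs).mean()
    term2 = np.abs(members[:, None] - members[None, :]).mean()
    return term1 - 0.5 * term2

def rank_histogram(ensembles, observations, n_members):
    """Rank of each observation within its sorted ensemble."""
    ranks = [(np.asarray(e) < o).sum() for e, o in zip(ensembles, observations)]
    return np.bincount(ranks, minlength=n_members + 1)

rng = np.random.default_rng(0)
ens = rng.normal(size=(500, 60))   # 60-member ensembles, 500 cases
obs = rng.normal(size=500)         # observations drawn from the same law
print(crps_ensemble(ens[0], obs[0]))
print(rank_histogram(ens, obs, 60))
```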
Shafizadeh-Moghadam, Hossein; Valavi, Roozbeh; Shahabi, Himan; Chapi, Kamran; Shirzadi, Ataollah
2018-07-01
In this research, eight individual machine learning and statistical models are implemented and compared, and based on their results, seven ensemble models for flood susceptibility assessment are introduced. The individual models included artificial neural networks, classification and regression trees, flexible discriminant analysis, generalized linear model, generalized additive model, boosted regression trees, multivariate adaptive regression splines, and maximum entropy, and the ensemble models were Ensemble Model committee averaging (EMca), Ensemble Model confidence interval Inferior (EMciInf), Ensemble Model confidence interval Superior (EMciSup), Ensemble Model to estimate the coefficient of variation (EMcv), Ensemble Model to estimate the mean (EMmean), Ensemble Model to estimate the median (EMmedian), and Ensemble Model based on weighted mean (EMwmean). The data set covered 201 flood events in the Haraz watershed (Mazandaran province, Iran) and 10,000 randomly selected non-occurrence points. Among the individual models, the highest Area Under the Receiver Operating Characteristic curve (AUROC) belonged to boosted regression trees (0.975) and the lowest value was recorded for the generalized linear model (0.642). On the other hand, the proposed EMmedian resulted in the highest accuracy (0.976) among all models. In spite of the outstanding performance of some individual models, variability among their predictions was considerable. Therefore, to reduce uncertainty and to create more generalizable, more stable, and less sensitive models, ensemble forecasting approaches, and in particular EMmedian, are recommended for flood susceptibility assessment. Copyright © 2018 Elsevier Ltd. All rights reserved.
Giuliani, Alessandro; Tomita, Masaru
2010-01-01
Cell fate decision remarkably generates a specific cell differentiation path among the multiple possibilities that can arise through the complex interplay of high-dimensional genome activities. The coordinated action of thousands of genes to switch cell fate decisions has indicated the existence of stable attractors guiding the process. However, the origins of the intracellular mechanisms that create the "cellular attractor" still remain unknown. Here, we examined the collective behavior of genome-wide expression for neutrophil differentiation through two different stimuli, dimethyl sulfoxide (DMSO) and all-trans-retinoic acid (atRA). To overcome the difficulties of dealing with single-gene expression noise, we grouped genes into ensembles and analyzed their expression dynamics in a correlation space defined by Pearson correlation and mutual information. The standard deviation of the correlation distributions of gene ensembles decreases as the ensemble size is increased, following an inverse square root law, both for ensembles chosen randomly from the whole genome and for ensembles ranked according to expression variance across time. Choosing an ensemble size of 200 genes, we show that the two probability distributions of correlations of randomly selected genes for the atRA and DMSO responses overlapped after 48 hours, defining the neutrophil attractor. Next, tracking the ranked ensembles' trajectories, we noticed that only certain ones, not all, fall into the attractor in a fractal-like manner. The removal of these genome elements from the whole genome, for both the atRA and DMSO responses, destroys the attractor, providing evidence for the existence of specific genome elements (named the "genome vehicle") responsible for the neutrophil attractor. Notably, within the genome vehicles, genes with low or moderate expression changes, which are often considered noisy and insignificant, are essential components for the creation of the neutrophil attractor. Further investigations along with our findings might provide a comprehensive mechanistic view of cell fate decision. PMID:20725638
The fast algorithm of spark in compressive sensing
NASA Astrophysics Data System (ADS)
Xie, Meihua; Yan, Fengxia
2017-01-01
Compressed Sensing (CS) is an advanced theory of signal sampling and reconstruction. In CS theory, the reconstruction condition for a signal is an important theoretical problem, and the spark is a good index for studying it. However, computing the spark is NP-hard. In this paper, we study the problem of computing the spark. For some special matrices, for example Gaussian random matrices and 0-1 random matrices, we obtain several conclusions. Furthermore, for a Gaussian random matrix with fewer rows than columns, we prove that its spark equals the number of its rows plus one with probability 1. For general matrices, two methods are given to compute the spark: direct search and dual-tree search. By simulating 24 Gaussian random matrices and 18 0-1 random matrices, we tested the computation time of the two methods. Numerical results showed that the dual-tree search had higher efficiency than direct search, especially for matrices with nearly as many rows as columns.
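The direct-search baseline is straightforward to write down. A brute-force sketch (exponential-time, consistent with the NP-hardness noted above; illustrative, not the paper's implementation) looks for the smallest linearly dependent column subset:

```python
import numpy as np
from itertools import combinations

def spark(A, tol=1e-10):
    """Smallest number of linearly dependent columns of A,
    found by exhaustive search over column subsets."""
    m, n = A.shape
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            if np.linalg.matrix_rank(A[:, cols], tol=tol) < k:
                return k
    return n + 1  # columns in general position

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))  # Gaussian: spark = rows + 1 with probability 1
print(spark(A))              # expected output: 5
```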
Quantifying polypeptide conformational space: sensitivity to conformation and ensemble definition.
Sullivan, David C; Lim, Carmay
2006-08-24
Quantifying the density of conformations over phase space (the conformational distribution) is needed to model important macromolecular processes such as protein folding. In this work, we quantify the conformational distribution for a simple polypeptide (N-mer polyalanine) using the cumulative distribution function (CDF), which gives the probability that two randomly selected conformations are separated by less than a "conformational" distance and whose inverse gives conformation counts as a function of conformational radius. An important finding is that the conformation counts obtained by the CDF inverse depend critically on the assignment of a conformation's distance span and on the ensemble (e.g., the unfolded-state model): varying the ensemble and conformation definition (1 → 2 Å) varies the CDF-based conformation counts for Ala50 from 10^11 to 10^69. In particular, relatively short molecular dynamics (MD) relaxation of Ala50's random-walk ensemble reduces the number of conformers from 10^55 to 10^14 (using a 1 Å root-mean-square-deviation radius conformation definition), pointing to potential disconnections in comparing the results from simplified models of unfolded proteins with those from all-atom MD simulations. Explicit waters are found to roughen the landscape considerably. Under some common conformation definitions, the results herein provide (i) an upper limit to the number of accessible conformations that compose the unfolded states of proteins, (ii) the optimal clustering radius/conformation radius for counting conformations for a given energy and solvent model, (iii) a means of comparing various studies, and (iv) an assessment of the applicability of random search in protein folding.
A random wave model for the Aharonov-Bohm effect
NASA Astrophysics Data System (ADS)
Houston, Alexander J. H.; Gradhand, Martin; Dennis, Mark R.
2017-05-01
We study an ensemble of random waves subject to the Aharonov-Bohm effect. The introduction of a point with a magnetic flux of arbitrary strength into a random wave ensemble gives a family of wavefunctions whose distribution of vortices (complex zeros) is responsible for the topological phase associated with the Aharonov-Bohm effect. Analytical expressions are found for the vortex number and topological charge densities as functions of distance from the flux point. Comparison is made with the distribution of vortices in the isotropic random wave model. The results indicate that as the flux approaches half-integer values, a vortex with the same sign as the fractional part of the flux is attracted to the flux point, merging with it in the limit of half-integer flux. We construct a statistical model of the neighbourhood of the flux point to study how this vortex-flux merger occurs in more detail. Other features of the Aharonov-Bohm vortex distribution are also explored.
Zhang, Duan Z.; Padrino, Juan C.
2017-06-01
The ensemble averaging technique is applied to model mass transport by diffusion in random networks. The system consists of an ensemble of random networks, where each network is made of pockets connected by tortuous channels. Inside a channel, fluid transport is assumed to be governed by the one-dimensional diffusion equation. Mass balance leads to an integro-differential equation for the pocket mass density. The so-called dual-porosity model is found to be equivalent to the leading-order approximation of the integration kernel when the diffusion time scale inside the channels is small compared to the macroscopic time scale. As a test problem, we consider one-dimensional mass diffusion in a semi-infinite domain. Because of the time required to establish the linear concentration profile inside a channel, for early times the similarity variable is x t^(-1/4) rather than x t^(-1/2) as in the traditional theory. We found that this early-time similarity can be explained by random walk theory through the network.
NASA Astrophysics Data System (ADS)
Zhang, Wei; Ding, Dong-Sheng; Shi, Shuai; Li, Yan; Zhou, Zhi-Yuan; Shi, Bao-Sen; Guo, Guang-Can
2016-02-01
Quantum memory is an essential building block for quantum communication and scalable linear quantum computation. Storing two-color entangled photons with one photon being at the telecommunication (telecom) wavelength while the other photon is compatible with quantum memory has great advantages toward the realization of the fiber-based long-distance quantum communication with the aid of quantum repeaters. Here, we report an experimental realization of storing a photon entangled with a telecom photon in polarization as an atomic spin wave in a cold atomic ensemble, thus establishing the entanglement between the telecom-band photon and the atomic-ensemble memory in a polarization degree of freedom. The reconstructed density matrix and the violation of the Clauser-Horne-Shimony-Holt inequality clearly show the preservation of quantum entanglement during storage. Our result is very promising for establishing a long-distance quantum network based on cold atomic ensembles.
Laser transit anemometer software development program
NASA Technical Reports Server (NTRS)
Abbiss, John B.
1989-01-01
Algorithms were developed for the extraction of two components of mean velocity, the standard deviations, and the associated correlation coefficient from laser transit anemometry (LTA) data ensembles. The solution method is based on an assumed two-dimensional Gaussian probability density function (PDF) model of the flow field under investigation. The procedure consists of transforming the data ensembles from the data acquisition domain (consisting of time and angle information) to the velocity space domain (consisting of velocity component information). The mean velocity results are obtained from the data ensemble centroid. Through a least-squares fit of the transformed data to an ellipse representing the intersection of a plane with the PDF, the standard deviations and the correlation coefficient are obtained. A data set simulation method is presented to test the data reduction process. Results of using the simulation system with a limited test matrix of input values are also given.
Information-Theoretic Uncertainty of SCFG-Modeled Folding Space of The Non-coding RNA
Manzourolajdad, Amirhossein; Wang, Yingfeng; Shaw, Timothy I.; Malmberg, Russell L.
2012-01-01
RNA secondary structure ensembles define probability distributions for alternative equilibrium secondary structures of an RNA sequence. Shannon's entropy is a measure of the amount of diversity present in any such ensemble. In this work, the Shannon entropy of the SCFG ensemble on an RNA sequence is derived and implemented in polynomial time for both structurally ambiguous and unambiguous grammars. MicroRNA sequences generally have low folding entropy, as previously discovered. Surprisingly, signs of significantly high folding entropy were observed in certain ncRNA families. More effective models coupled with targeted randomization tests can lead to better insight into the folding features of these families. PMID:23160142
NASA Astrophysics Data System (ADS)
Harrison, Judd; Davies, Christine T. H.; Wingate, Matthew; Hpqcd Collaboration
2018-03-01
We present results of a lattice QCD calculation of B→D* and Bs→Ds* axial vector matrix elements with both states at rest. These zero-recoil matrix elements provide the normalization necessary to infer a value for the CKM matrix element |V_cb| from experimental measurements of B̄⁰→D*⁺ℓ⁻ν̄ and B̄s⁰→Ds*⁺ℓ⁻ν̄ decay. Results are derived from correlation functions computed with highly improved staggered quarks (HISQ) for the light, strange, and charm quark propagators, and nonrelativistic QCD for the bottom quark propagator. The calculation of correlation functions employs MILC Collaboration ensembles over a range of three lattice spacings. These gauge field configurations include sea quark effects of charm, strange, and equal-mass up and down quarks. We use ensembles with physically light up and down quarks, as well as heavier values. Our main results are F^{B→D*}(1) = 0.895 ± 0.010(stat) ± 0.024(sys) and F^{Bs→Ds*}(1) = 0.883 ± 0.012(stat) ± 0.028(sys). We discuss the consequences for |V_cb| in light of recent investigations into the extrapolation of experimental data to zero recoil.
Statistical hadronization and microcanonical ensemble
Becattini, F.; Ferroni, L.
2004-01-01
We present a Monte Carlo calculation of the microcanonical ensemble of the ideal hadron-resonance gas including all known states up to a mass of 1.8 GeV and taking into account quantum statistics. The computing method is a development of a previous one based on a Metropolis Monte Carlo algorithm, with the grand-canonical limit of the multi-species multiplicity distribution as the proposal matrix. The microcanonical average multiplicities of the various hadron species are found to converge to the canonical ones for moderately low values of the total energy. This algorithm opens the way for event generators based on the statistical hadronization model.
CELES: CUDA-accelerated simulation of electromagnetic scattering by large ensembles of spheres
NASA Astrophysics Data System (ADS)
Egel, Amos; Pattelli, Lorenzo; Mazzamuto, Giacomo; Wiersma, Diederik S.; Lemmer, Uli
2017-09-01
CELES is a freely available MATLAB toolbox to simulate light scattering by many spherical particles. Aiming at high computational performance, CELES leverages block-diagonal preconditioning, a lookup-table approach to evaluate costly functions, and massively parallel execution on NVIDIA graphics processing units using the CUDA computing platform. The combination of these techniques allows large electrodynamic problems (>10^4 scatterers) to be addressed efficiently on inexpensive consumer hardware. In this paper, we validate near- and far-field distributions against the well-established multi-sphere T-matrix (MSTM) code and discuss the convergence behavior for ensembles of different sizes, including an exemplary system comprising 10^5 particles.
Impact of distributions on the archetypes and prototypes in heterogeneous nanoparticle ensembles.
Fernandez, Michael; Wilson, Hugh F; Barnard, Amanda S
2017-01-05
The magnitude and complexity of the structural and functional data available on nanomaterials require data analytics, statistical analysis and information technology to drive discovery. We demonstrate that multivariate statistical analysis can recognise the sets of truly significant nanostructures and their most relevant properties in heterogeneous ensembles with different probability distributions. The prototypical and archetypal nanostructures of five virtual ensembles of Si quantum dots (SiQDs) with Boltzmann, frequency, normal, Poisson and random distributions are identified using clustering and archetypal analysis, where we find that their diversity is defined by size and shape, regardless of the type of distribution. At the convex hull of the SiQD ensembles, simple configuration archetypes can efficiently describe a large number of SiQDs, whereas more complex shapes are needed to represent the average ordering of the ensembles. This approach provides a route towards the characterisation of computationally intractable virtual nanomaterial spaces, which can convert big data into smart data, and significantly reduce the workload to simulate experimentally relevant virtual samples.
Thermalization threshold in models of 1D fermions
NASA Astrophysics Data System (ADS)
Mukerjee, Subroto; Modak, Ranjan; Ramswamy, Sriram
2013-03-01
The question of how isolated quantum systems thermalize is an interesting and open one. In this study we equate thermalization with non-integrability to try to answer this question. In particular, we study the effect of system size on the integrability of 1D systems of interacting fermions on a lattice. We find that for a finite-sized system, a non-zero value of an integrability breaking parameter is required to make an integrable system appear non-integrable. Using exact diagonalization and diagnostics such as energy level statistics and the Drude weight, we find that the threshold value of the integrability breaking parameter scales to zero as a power law with system size. We find the exponent to be the same for different models with its value depending on the random matrix ensemble describing the non-integrable system. We also study a simple analytical model of a non-integrable system with an integrable limit to better understand how a power law emerges.
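One standard level-statistics diagnostic of the integrable-to-chaotic crossover is the consecutive-gap ratio, which needs no spectral unfolding. The short sketch below (illustrative, not the authors' code) compares a GOE spectrum with an uncorrelated (Poisson) one:

```python
import numpy as np

def mean_gap_ratio(levels):
    """Mean consecutive-gap ratio <r>: ~0.39 for Poisson (integrable)
    spectra, ~0.53 for GOE (non-integrable) spectra."""
    s = np.diff(np.sort(levels))
    r = np.minimum(s[1:], s[:-1]) / np.maximum(s[1:], s[:-1])
    return r.mean()

rng = np.random.default_rng(0)
M = rng.normal(size=(1000, 1000))
goe = np.linalg.eigvalsh((M + M.T) / 2)          # random real symmetric matrix
poisson = np.cumsum(rng.exponential(size=1000))  # uncorrelated spectrum
print(mean_gap_ratio(goe), mean_gap_ratio(poisson))
```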
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forest have increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...
Teh, Seng Khoon; Zheng, Wei; Lau, David P; Huang, Zhiwei
2009-06-01
In this work, we evaluated the diagnostic ability of near-infrared (NIR) Raman spectroscopy combined with an ensemble recursive partitioning algorithm based on random forests for identifying cancer from normal tissue in the larynx. A rapid-acquisition NIR Raman system was utilized for tissue Raman measurements at 785 nm excitation, and 50 human laryngeal tissue specimens (20 normal; 30 malignant tumors) were used for the NIR Raman studies. The random forests method was introduced to develop effective diagnostic algorithms for classification of Raman spectra of different laryngeal tissues. High-quality Raman spectra in the range of 800-1800 cm⁻¹ can be acquired from laryngeal tissue within 5 seconds. Raman spectra differed significantly between normal and malignant laryngeal tissues. Classification results obtained from the random forests algorithm on tissue Raman spectra yielded a diagnostic sensitivity of 88.0% and specificity of 91.4% for laryngeal malignancy identification. The random forests technique also provided variable importance measures that facilitate correlation of significant Raman spectral features with cancer transformation. This study shows that NIR Raman spectroscopy in conjunction with the random forests algorithm has great potential for the rapid diagnosis and detection of malignant tumors in the larynx.
NASA Astrophysics Data System (ADS)
Chen, Xin; Luo, Yong; Xing, Pei; Nie, Suping; Tian, Qinhua
2015-04-01
Two sets of gridded annual mean surface air temperature over the Northern Hemisphere for the past millennium were constructed by employing an optimal interpolation (OI) method to merge tree-ring proxy records with simulations from CMIP5 (the fifth phase of the Coupled Model Intercomparison Project). Both the uncertainties in the proxy reconstructions and in the model simulations can be taken into account by the OI algorithm. To better preserve physically coordinated features and the spatial-temporal completeness of climate variability in the seven model results, we perform an Empirical Orthogonal Function (EOF) analysis to truncate the ensemble mean field as the first guess (background field) for OI. 681 temperature-sensitive tree-ring chronologies were collected and screened from the International Tree Ring Data Bank (ITRDB) and the Past Global Changes (PAGES-2k) project. First, two methods (variance matching and linear regression) were employed to calibrate the tree-ring chronologies individually against instrumental data (CRUTEM4v); in addition, we removed the bias of both the background field and the proxy records relative to the instrumental dataset. Second, a time-varying background error covariance matrix (B) and a static "observation" error covariance matrix (R) were calculated for the OI frame. In our scheme, the matrix B is calculated locally, and "observation" error covariances are partially accounted for in the R matrix (covariances between pairs of tree-ring sites that are very close to each other are counted), which differs from the traditional assumption that R should be diagonal. Comparison of our results shows that the regional averaged series are not sensitive to the choice of calibration method. Quantile-quantile plots indicate that regional climatologies based on both methods tend to agree better with the regional reconstruction of PAGES-2k during the 20th century warming period than during the Little Ice Age (LIA). A larger volcanic cooling response over Asia and Europe in the context of the recent millennium is detected in our datasets than that revealed in the regional reconstructions from the PAGES-2k network. Verification experiments show that the merging approach reconciles the proxy data and the model ensemble simulations in an optimal way (with smaller errors than either of them alone). Further research is needed to improve the error estimation.
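The OI analysis step at the heart of this merging scheme is compact; a generic sketch with toy numbers follows (the B and R constructions here are illustrative placeholders, not the paper's local, proxy-derived covariances):

```python
import numpy as np

def oi_update(xb, B, y, H, R):
    """Optimal interpolation analysis: xa = xb + K (y - H xb), with gain
    K = B H^T (H B H^T + R)^{-1}; B and R are the background and
    observation error covariance matrices."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + K @ (y - H @ xb)

# Toy example: 5 grid cells (model first guess), 2 proxy "observations".
xb = np.array([0.1, 0.2, 0.0, -0.1, 0.3])
B = 0.25 * np.exp(-np.abs(np.subtract.outer(range(5), range(5))) / 2.0)
H = np.array([[1.0, 0, 0, 0, 0], [0, 0, 0, 1.0, 0]])
R = np.diag([0.1, 0.1])
y = np.array([0.5, -0.4])
print(oi_update(xb, B, y, H, R))
```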
Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction
Rahman, Raziur; Haider, Saad; Ghosh, Souparno; Pal, Ranadip
2015-01-01
Random forests consisting of an ensemble of regression trees with equal weights are frequently used for the design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees as probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the prediction of the ensemble of probabilistic trees both as a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic data and the Cancer Cell Line Encyclopedia dataset, and illustrated that tree weights can be selected to reduce the average length of the CI without an increase in mean error. PMID:27081304
Typical performance of approximation algorithms for NP-hard problems
NASA Astrophysics Data System (ADS)
Takabe, Satoshi; Hukushima, Koji
2016-11-01
Typical performance of approximation algorithms is studied for randomized minimum vertex cover problems. A wide class of random graph ensembles characterized by an arbitrary degree distribution is discussed with the presentation of a theoretical framework. Herein, three approximation algorithms are examined: linear-programming relaxation, loopy-belief propagation, and the leaf-removal algorithm. The former two algorithms are analyzed using a statistical-mechanical technique, whereas the average-case analysis of the last one is conducted using the generating function method. These algorithms have a threshold in the typical performance with increasing average degree of the random graph, below which they find true optimal solutions with high probability. Our study reveals that there exist only three cases, determined by the order of the typical performance thresholds. In addition, we provide some conditions for classification of the graph ensembles and demonstrate explicitly some examples for the difference in thresholds.
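Of the three algorithms, leaf removal is the easiest to sketch. The following illustrative implementation (for minimum vertex cover on an Erdős-Rényi graph, assuming the often-quoted mean-degree threshold c = e for these ensembles) peels degree-1 vertices and reports the leftover core:

```python
import networkx as nx

def leaf_removal_cover(G):
    """Leaf-removal heuristic for minimum vertex cover: while a degree-1
    vertex exists, put its neighbour in the cover and delete both.
    Below the threshold the leftover 'core' is negligible with high
    probability and the cover found is optimal."""
    G = G.copy()
    cover = set()
    while True:
        leaves = [v for v in G if G.degree(v) == 1]
        if not leaves:
            break
        v = leaves[0]
        u = next(iter(G[v]))          # the leaf's unique neighbour
        cover.add(u)
        G.remove_nodes_from([u, v])
    G.remove_nodes_from([v for v in list(G) if G.degree(v) == 0])
    return cover, G.number_of_nodes()  # remaining nodes form the core

G = nx.erdos_renyi_graph(200, 1.5 / 200, seed=0)  # mean degree 1.5 < e
cover, core = leaf_removal_cover(G)
print(len(cover), "cover vertices;", core, "core vertices remain")
```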
NASA Astrophysics Data System (ADS)
Resseguier, V.; Memin, E.; Chapron, B.; Fox-Kemper, B.
2017-12-01
In order to better observe and predict geophysical flows, ensemble-based data assimilation methods are of high importance. In such methods, an ensemble of random realizations represents the variety of the simulated flow's likely behaviors. For this purpose, randomness needs to be introduced in a suitable way, and physically-based stochastic subgrid parametrizations are promising paths. This talk proposes a new parametrization of this kind, referred to as modeling under location uncertainty. The fluid velocity is decomposed into a resolved large-scale component and an aliased small-scale one. The first component is possibly random but time-correlated, whereas the second is white-in-time but spatially correlated and possibly inhomogeneous and anisotropic. With such a velocity, the material derivative of any (possibly active) tracer is modified. Three new terms appear: a correction of the large-scale advection, a multiplicative noise, and a possibly heterogeneous and anisotropic diffusion. This parameterization naturally ensures attractive properties such as energy conservation for each realization. Additionally, this stochastic material derivative and the associated Reynolds transport theorem offer a systematic method to derive stochastic models. In particular, we will discuss the consequences of the Quasi-Geostrophic assumptions in our framework. Depending on the amount of turbulence, different models with different physical behaviors are obtained. Under strong turbulence assumptions, a simplified diagnosis of frontolysis and frontogenesis at the surface of the ocean is possible in this framework. A Surface Quasi-Geostrophic (SQG) model with a weaker noise influence has also been simulated. A single realization better represents small scales than a deterministic SQG model at the same resolution. Moreover, an ensemble accurately predicts extreme events, bifurcations, as well as the amplitudes and the positions of the simulation errors. Figure 1 highlights this last result and compares it to the strong error underestimation of an ensemble simulated from the deterministic dynamics with random initial conditions.
Universality of quantum information in chaotic CFTs
NASA Astrophysics Data System (ADS)
Lashkari, Nima; Dymarsky, Anatoly; Liu, Hong
2018-03-01
We study the Eigenstate Thermalization Hypothesis (ETH) in chaotic conformal field theories (CFTs) of arbitrary dimensions. Assuming local ETH, we compute the reduced density matrix of a ball-shaped subsystem of finite size in the infinite volume limit when the full system is an energy eigenstate. This reduced density matrix is close in trace distance to a density matrix, to which we refer as the ETH density matrix, that is independent of all the details of an eigenstate except its energy and charges under global symmetries. In two dimensions, the ETH density matrix is universal for all theories with the same value of central charge. We argue that the ETH density matrix is close in trace distance to the reduced density matrix of the (micro)canonical ensemble. We support the argument in higher dimensions by comparing the Von Neumann entropy of the ETH density matrix with the entropy of a black hole in holographic systems in the low temperature limit. Finally, we generalize our analysis to the coherent states with energy density that varies slowly in space, and show that locally such states are well described by the ETH density matrix.
Developing Novel Frameworks for Many-Body Ensembles
2016-03-17
Figure 2: Illustration of the dendrogram representation. Starting from random initial conditions, an ensemble of particle pairs was simulated to establish the long-time…
Radiation dosimetry using three-dimensional optical random access memories
NASA Technical Reports Server (NTRS)
Moscovitch, M.; Phillips, G. W.
2001-01-01
Three-dimensional optical random access memories (3D ORAMs) are a new generation of high-density data storage devices. Binary information is stored and retrieved via a light-induced reversible transformation of an ensemble of bistable photochromic molecules embedded in a polymer matrix. This paper describes the application of 3D ORAM materials to radiation dosimetry. It is shown, both theoretically and experimentally, that ionizing radiation in the form of heavy charged particles is capable of changing the information originally stored on the ORAM material. The magnitude and spatial distribution of these changes are used as a measure of the absorbed dose, particle type, and energy. The effects of exposure on 3D ORAM materials have been investigated for a variety of particle types and energies, including protons, alpha particles, and ¹²C ions. The exposed materials are observed to fluoresce when illuminated by laser light. The intensity and the depth of the fluorescence are dependent on the type and energy of the particle to which the materials were exposed. It is shown that these effects can be modeled using Monte Carlo calculations. The model provides a better understanding of the properties of these materials, which should prove useful for developing systems for charged-particle and neutron dosimetry/detector applications. © 2001 Published by Elsevier Science B.V.
Estimating the State of Aerodynamic Flows in the Presence of Modeling Errors
NASA Astrophysics Data System (ADS)
da Silva, Andre F. C.; Colonius, Tim
2017-11-01
The ensemble Kalman filter (EnKF) has proven successful in fields such as meteorology, in which high-dimensional nonlinear systems render classical estimation techniques impractical. When the model used to forecast state evolution misrepresents important aspects of the true dynamics, estimator performance may degrade. In this work, parametrization and state augmentation are used to track misspecified boundary conditions (e.g., free-stream perturbations). The resolution error is modeled as a Gaussian-distributed random variable with the mean (bias) and variance to be determined. The dynamics of the flow past a NACA 0009 airfoil at high angles of attack and moderate Reynolds number is represented by a Navier-Stokes solver with immersed-boundary capabilities. The pressure distribution on the airfoil or the velocity field in the wake, both randomized by synthetic noise, are sampled as measurement data and incorporated into the estimated state and bias following Kalman's analysis scheme. Insights about how to specify the modeling error covariance matrix, and about its impact on estimator performance, are conveyed. This work has been supported in part by a grant from AFOSR (FA9550-14-1-0328) with Dr. Douglas Smith as program manager, and by a Science without Borders scholarship from the Ministry of Education of Brazil (Capes Foundation - BEX 12966/13-4).
Observability under recurrent loss of data
NASA Technical Reports Server (NTRS)
Luck, Rogelio; Ray, Asok; Halevi, Yoram
1992-01-01
An account is given of the concept of extended observability in finite-dimensional linear time-invariant systems under recurrent loss of data, where the state vector has to be reconstructed from an ensemble of sensor data at nonconsecutive samples. A necessary and sufficient condition for extended observability, expressible via a recursive relation, is presented, together with related conditions on the characteristic polynomial of the state transition matrix in a discrete-time setting, or of the system matrix in a continuous-time setting.
Convergence in High Probability of the Quantum Diffusion in a Random Band Matrix Model
NASA Astrophysics Data System (ADS)
Margarint, Vlad
2018-06-01
We consider Hermitian random band matrices H in d ≥ 1 dimensions. The matrix elements H_{xy}, indexed by x, y \in Λ \subset Z^d, are independent, uniformly distributed random variables if |x-y| is less than the band width W, and zero otherwise. We strengthen previous results on quantum diffusion in this random band matrix model from convergence of the expectation to convergence in high probability. The result is uniform in the size |Λ| of the matrix.
Numerical approach on dynamic self-assembly of colloidal particles
NASA Astrophysics Data System (ADS)
Ibrahimi, Muhamet; Ilday, Serim; Makey, Ghaith; Pavlov, Ihor; Yavuz, Özgàn; Gulseren, Oguz; Ilday, Fatih Omer
Far-from-equilibrium systems of artificial ensembles are crucial for understanding many intelligent features in self-organized natural systems. However, the lack of an established theory underscores the need for numerical implementations. Inspired by a recent study, we simulate a solution-suspended colloidal system that dynamically self-assembles due to convective forces generated in the solvent when heated by a laser. To incorporate the random fluctuations of the particles and the continuously changing flow, we exploit a random-walk-based Brownian motion model and a fluid dynamics solver developed for games, respectively. The simulation results fit the experiments and reproduce many quantitative features of a nonequilibrium dynamic self-assembly, including phase-space compression and an ensemble-energy input feedback loop.
Exactly solvable random graph ensemble with extensively many short cycles
NASA Astrophysics Data System (ADS)
Aguirre López, Fabián; Barucca, Paolo; Fekom, Mathilde; Coolen, Anthony C. C.
2018-02-01
We introduce and analyse ensembles of 2-regular random graphs with a tuneable distribution of short cycles. The phenomenology of these graphs depends critically on the scaling of the ensembles' control parameters relative to the number of nodes. A phase diagram is presented, showing a second-order phase transition from a connected to a disconnected phase. We study both the canonical formulation, where the size is large but fixed, and the grand canonical formulation, where the size is sampled from a discrete distribution, and show their equivalence in the thermodynamic limit. We also compute analytically the spectral density, which consists of a discrete set of isolated eigenvalues, representing short cycles, and a continuous part, representing cycles of diverging size.
Alternative Approaches to Land Initialization for Seasonal Precipitation and Temperature Forecasts
NASA Technical Reports Server (NTRS)
Koster, Randal; Suarez, Max; Liu, Ping; Jambor, Urszula
2004-01-01
The seasonal prediction system of the NASA Global Modeling and Assimilation Office is used to generate ensembles of summer forecasts utilizing realistic soil moisture initialization. To derive the realistic land states, we drive the system's land model offline with realistic meteorological forcing over the period 1979-1993 (in cooperation with the Global Land Data Assimilation System project at GSFC) and then extract the state variables' values on the chosen forecast start dates. A parallel series of forecast ensembles is performed with a random (though climatologically consistent) set of land initial conditions; by comparing the two sets of ensembles, we can isolate the impact of land initialization on forecast skill from that of the imposed SSTs. The base initialization experiment is supplemented with several forecast ensembles that use alternative initialization techniques. One ensemble addresses the impact of minimizing climate drift in the system through the scaling of the initial conditions, and another is designed to isolate the importance of the precipitation signal from that of all other signals in the antecedent offline forcing. A third ensemble includes a more realistic initialization of the atmosphere along with the land initialization. The impact of each variation on forecast skill is quantified.
A variational ensemble scheme for noisy image data assimilation
NASA Astrophysics Data System (ADS)
Yang, Yin; Robinson, Cordelia; Heitz, Dominique; Mémin, Etienne
2014-05-01
Data assimilation techniques aim at recovering a system's state-variable trajectory, denoted X, along time from partially observed noisy measurements of the system, denoted Y. These procedures, which couple dynamics and noisy measurements of the system, fulfill a twofold objective. On one hand, they provide a denoising - or reconstruction - procedure of the data through a given model framework, and on the other hand, they provide estimation procedures for unknown parameters of the dynamics. A standard variational data assimilation problem can be formulated as the minimization of the following objective function with respect to the initial discrepancy, η, from the background initial guess:

J(\eta(x)) = \frac{1}{2}\,\|X_b(x) - X(t_0,x)\|_B^2 + \frac{1}{2}\int_{t_0}^{t_f}\|H(X(t,x)) - Y(t,x)\|_R^2\,dt,   (1)

where the observation operator H links the state variable and the measurements. The cost function can be interpreted as the log-likelihood function associated to the a posteriori distribution of the state given the past history of measurements and the background. In this work, we aim at studying ensemble-based optimal control strategies for data assimilation. Such a formulation nicely combines the ingredients of ensemble Kalman filters and variational data assimilation (4DVar). It is also formulated as the minimization of the objective function (1), but similarly to ensemble filters, it introduces in its objective function an empirical ensemble-based background-error covariance defined as: B ≡ <(X_b -
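Read this way, the empirical ensemble-based B is the sample covariance of the background anomalies. A minimal numpy sketch under that reading (variable names are illustrative):

```python
import numpy as np

def ensemble_background_covariance(Xb):
    """Empirical B from an ensemble of background states.

    Xb : (n, N) array, one background state per column.
    Returns the sample covariance of the anomalies with the usual
    1/(N-1) factor.
    """
    A = Xb - Xb.mean(axis=1, keepdims=True)   # anomaly matrix
    return A @ A.T / (Xb.shape[1] - 1)
```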
Ensemble of One-Class Classifiers for Personal Risk Detection Based on Wearable Sensor Data.
Rodríguez, Jorge; Barrera-Animas, Ari Y; Trejo, Luis A; Medina-Pérez, Miguel Angel; Monroy, Raúl
2016-09-29
This study introduces the One-Class K-means with Randomly-projected features Algorithm (OCKRA). OCKRA is an ensemble of one-class classifiers built over multiple projections of a dataset according to random feature subsets. Algorithms found in the literature spread over a wide range of applications where ensembles of one-class classifiers have been satisfactorily applied; however, none is oriented to the area under our study: personal risk detection. OCKRA has been designed with the aim of improving the detection performance in the problem posed by the Personal RIsk DEtection (PRIDE) dataset. PRIDE was built based on 23 test subjects, where the data for each user were captured using a set of sensors embedded in a wearable band. The performance of OCKRA was compared against a support vector machine and three versions of the Parzen window classifier. On average, experimental results show that OCKRA outperformed the other classifiers by at least 0.53% in the area under the curve (AUC). In addition, OCKRA achieved an AUC above 90% for more than 57% of the users.
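As a rough illustration of the random-feature-subset, one-class K-means construction described above (not the authors' implementation; the cluster count, subset fraction, and distance-based score are all assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

class OneClassKMeansEnsemble:
    """Ensemble of one-class K-means detectors on random feature subsets."""

    def __init__(self, n_members=20, n_clusters=5, subset_frac=0.5, seed=0):
        self.n_members = n_members
        self.n_clusters = n_clusters
        self.subset_frac = subset_frac
        self.rng = np.random.default_rng(seed)

    def fit(self, X):
        """X : (n_samples, n_features) of 'normal' training data."""
        d = X.shape[1]
        k = max(1, int(self.subset_frac * d))
        self.members = []
        for _ in range(self.n_members):
            feats = self.rng.choice(d, size=k, replace=False)
            km = KMeans(n_clusters=self.n_clusters, n_init=10).fit(X[:, feats])
            self.members.append((feats, km))
        return self

    def score(self, X):
        """Mean distance to the nearest centre across members;
        higher = more anomalous."""
        scores = [km.transform(X[:, f]).min(axis=1) for f, km in self.members]
        return np.mean(scores, axis=0)
```

Scoring by mean distance to the nearest centre is one plausible way to aggregate the projections; OCKRA's actual scoring rule may differ.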
Inhomogeneous diffusion and ergodicity breaking induced by global memory effects
NASA Astrophysics Data System (ADS)
Budini, Adrián A.
2016-11-01
We introduce a class of discrete random-walk models driven by global memory effects. At any time, the right-left transitions depend on the whole previous history of the walker, being defined by an urnlike memory mechanism. The characteristic function is calculated in an exact way, which allows us to demonstrate that the ensemble of realizations is ballistic. Asymptotically, each realization is equivalent to that of a biased Markovian diffusion process with transition rates that strongly differ from one trajectory to another. Using this "inhomogeneous diffusion" feature, the ergodic properties of the dynamics are analytically studied through the time-averaged moments. Even in the long-time regime, they remain random objects. While their average over realizations recovers the corresponding ensemble averages, the departure between time and ensemble averages is explicitly shown through their probability densities. For the density of the second time-averaged moment, the ergodic limit and the limit of infinite lag times do not commute. All of these effects are induced by the memory. A generalized Einstein fluctuation-dissipation relation is also obtained for the time-averaged moments.
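The qualitative behaviour described above, a ballistic ensemble whose individual trajectories look like differently biased diffusions, can be reproduced with a toy urn-like walk; the specific reinforcement rule below is an assumption for illustration, not the paper's exact model:

```python
import numpy as np

def memory_walk(T, q=0.6, rng=None):
    """Right/left walk whose step at time t copies, with probability q,
    a step drawn uniformly from its own past (urn-like global memory)."""
    rng = rng or np.random.default_rng()
    steps = np.empty(T, dtype=int)
    steps[0] = rng.choice([-1, 1])
    for t in range(1, T):
        if rng.random() < q:                  # recall a uniformly random past step
            steps[t] = steps[rng.integers(t)]
        else:                                 # fresh unbiased step
            steps[t] = rng.choice([-1, 1])
    return np.cumsum(steps)

# each realization behaves like a differently biased diffusion:
rng = np.random.default_rng(1)
finals = [memory_walk(5000, rng=rng)[-1] for _ in range(10)]
print(finals)   # broad spread of drifts across trajectories
```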
Multi-Conformer Ensemble Docking to Difficult Protein Targets
Ellingson, Sally R.; Miao, Yinglong; Baudry, Jerome; ...
2014-09-08
We investigate large-scale ensemble docking using five proteins from the Directory of Useful Decoys (DUD, dud.docking.org) for which docking to crystal structures has proven difficult. Molecular dynamics trajectories are produced for each protein and an ensemble of representative conformational structures extracted from the trajectories. Docking calculations are performed on these selected simulation structures and ensemble-based enrichment factors compared with those obtained using docking in crystal structures of the same protein targets or random selection of compounds. We also found simulation-derived snapshots with improved enrichment factors that increased the chemical diversity of docking hits for four of the five selected proteins. A combination of all the docking results obtained from molecular dynamics simulation followed by selection of top-ranking compounds appears to be an effective strategy for increasing the number and diversity of hits when using docking to screen large libraries of chemicals against difficult protein targets.
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)
2001-01-01
Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore input decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses them to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Proben1/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles (IDEs) outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains.
Novel layered clustering-based approach for generating ensemble of classifiers.
Rahman, Ashfaqur; Verma, Brijesh
2011-05-01
This paper introduces a novel concept for creating an ensemble of classifiers. The concept is based on generating an ensemble of classifiers through clustering of data at multiple layers. The ensemble classifier model generates a set of alternative clusterings of a dataset at different layers by randomly initializing the clustering parameters and trains a set of base classifiers on the patterns in different clusters in different layers. A test pattern is classified by first finding the appropriate cluster at each layer and then using the corresponding base classifier. The decisions obtained at different layers are fused into a final verdict using majority voting. As the base classifiers are trained on overlapping patterns at different layers, the proposed approach achieves diversity among the individual classifiers. Identification of difficult-to-classify patterns through clustering, as well as the achievement of diversity through layering, leads to better classification results, as evidenced by the experimental results.
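A compact sketch of the layered clustering-plus-voting scheme, assuming K-means for the clustering step and decision trees as base classifiers (both are placeholder choices, not mandated by the paper):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

def fit_layers(X, y, n_layers=5, n_clusters=4, seed=0):
    """One (KMeans, per-cluster classifier dict) pair per layer.
    Assumes integer class labels y and that every cluster is populated."""
    layers = []
    for layer in range(n_layers):
        km = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed + layer).fit(X)
        clfs = {}
        for c in range(n_clusters):
            mask = km.labels_ == c
            if mask.sum() > 0:
                clfs[c] = DecisionTreeClassifier().fit(X[mask], y[mask])
        layers.append((km, clfs))
    return layers

def predict(layers, X):
    """Majority vote over the per-layer, cluster-local classifiers."""
    votes = []
    for km, clfs in layers:
        c = km.predict(X)
        votes.append([clfs[ci].predict(x[None, :])[0]
                      for ci, x in zip(c, X)])
    votes = np.array(votes)                 # (n_layers, n_samples)
    return np.apply_along_axis(
        lambda v: np.bincount(v).argmax(), 0, votes)
```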
An Alternative Method for Computing Mean and Covariance Matrix of Some Multivariate Distributions
ERIC Educational Resources Information Center
Radhakrishnan, R.; Choudhury, Askar
2009-01-01
Computing the mean and covariance matrix of some multivariate distributions, in particular, multivariate normal distribution and Wishart distribution are considered in this article. It involves a matrix transformation of the normal random vector into a random vector whose components are independent normal random variables, and then integrating…
An optimal modification of a Kalman filter for time scales
NASA Technical Reports Server (NTRS)
Greenhall, C. A.
2003-01-01
The Kalman filter in question, which was implemented in the time scale algorithm TA(NIST), produces time scales with poor short-term stability. A simple modification of the error covariance matrix allows the filter to produce time scales with good stability at all averaging times, as verified by simulations of clock ensembles.
NASA Astrophysics Data System (ADS)
Yan, Yajing; Barth, Alexander; Beckers, Jean-Marie; Candille, Guillem; Brankart, Jean-Michel; Brasseur, Pierre
2016-04-01
In this paper, four assimilation schemes, including an intermittent assimilation scheme (INT) and three incremental assimilation schemes (IAU 0, IAU 50 and IAU 100), are compared in the same assimilation experiments with a realistic eddy-permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. The three IAU schemes differ from each other in the position of the increment update window, which has the same size as the assimilation window; 0, 50 and 100 correspond to the degree of superposition of the increment update window on the current assimilation window. Sea surface height, sea surface temperature, and temperature profiles at depth collected between January and December 2005 are assimilated. Sixty ensemble members are generated by adding realistic noise to the forcing parameters related to the temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histogram, before the assimilation experiments. The relevance of each assimilation scheme is evaluated through analyses of thermohaline variables and the current velocities. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with independent/semi-independent observations. For deterministic validation, the ensemble means, together with the ensemble spreads, are compared to the observations, in order to diagnose the ensemble distribution properties in a deterministic way. For probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centered random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system.
NASA Astrophysics Data System (ADS)
Yan, Yajing; Barth, Alexander; Beckers, Jean-Marie; Candille, Guillem; Brankart, Jean-Michel; Brasseur, Pierre
2015-04-01
Sea surface height, sea surface temperature and temperature profiles at depth collected between January and December 2005 are assimilated into a realistic eddy-permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. Sixty ensemble members are generated by adding realistic noise to the forcing parameters related to the temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histogram, before the assimilation experiments. An incremental analysis update scheme is applied in order to reduce spurious oscillations due to the model state correction. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with observations used in the assimilation experiments and independent observations, which goes further than most previous studies and constitutes one of the original points of this paper. Regarding the deterministic validation, the ensemble means, together with the ensemble spreads, are compared to the observations in order to diagnose the ensemble distribution properties in a deterministic way. Regarding the probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centred random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system. The improvement of the assimilation is demonstrated using these validation metrics. Finally, the deterministic validation and the probabilistic validation are analysed jointly. The consistency and complementarity between both validations are highlighted. Highly reliable situations, in which the RMS error and the CRPS give the same information, are identified for the first time in this paper.
A Simple Approach to Account for Climate Model Interdependence in Multi-Model Ensembles
NASA Astrophysics Data System (ADS)
Herger, N.; Abramowitz, G.; Angelil, O. M.; Knutti, R.; Sanderson, B.
2016-12-01
Multi-model ensembles are an indispensable tool for future climate projection and its uncertainty quantification. Ensembles containing multiple climate models generally have increased skill, consistency and reliability. Due to the lack of agreed-on alternatives, most scientists use the equally-weighted multi-model mean as they subscribe to model democracy ("one model, one vote"). Different research groups are known to share sections of code, parameterizations in their model, literature, or even whole model components. Therefore, individual model runs do not represent truly independent estimates. Ignoring this dependence structure might lead to a false model consensus and to wrong estimation of uncertainty and of the effective number of independent models. Here, we present a way to partially address this problem by selecting a subset of CMIP5 model runs so that its climatological mean minimizes the RMSE compared to a given observation product. Due to the cancelling out of errors, regional biases in the ensemble mean are reduced significantly. Using a model-as-truth experiment we demonstrate that those regional biases persist into the future and that we are not fitting noise, thus providing improved observationally-constrained projections of the 21st century. The optimally selected ensemble shows significantly higher global mean surface temperature projections than the original ensemble, where all the model runs are considered. Moreover, the spread is decreased well beyond that expected from the decreased ensemble size. Several previous studies have recommended an ensemble selection approach based on performance ranking of the model runs. Here, we show that this approach can perform even worse than randomly selecting ensemble members and can thus be harmful. We suggest that accounting for interdependence in the ensemble selection process is a necessary step for robust projections for use in impact assessments, adaptation and mitigation of climate change.
NASA Technical Reports Server (NTRS)
Keppenne, Christian L.; Rienecker, Michele M.; Koblinsky, Chester (Technical Monitor)
2001-01-01
A multivariate ensemble Kalman filter (MvEnKF) implemented on a massively parallel computer architecture has been developed for the Poseidon ocean circulation model and tested with a Pacific Basin model configuration. There are about two million prognostic state-vector variables. Parallelism for the data assimilation step is achieved by regionalization of the background-error covariances that are calculated from the phase-space distribution of the ensemble. Each processing element (PE) collects elements of a matrix measurement functional from nearby PEs. To avoid the introduction of spurious long-range covariances associated with finite ensemble sizes, the background-error covariances are given compact support by means of a Hadamard (element-by-element) product with a three-dimensional canonical correlation function. The methodology and the MvEnKF configuration are discussed. It is shown that the regionalization of the background covariances has a negligible impact on the quality of the analyses. The parallel algorithm is very efficient for large numbers of observations but does not scale well beyond 100 PEs at the current model resolution. On a platform with distributed memory, memory rather than speed is the limiting factor.
Relation Between Pore Size and the Compressibility of a Confined Fluid
Gor, Gennady Y.; Siderius, Daniel W.; Rasmussen, Christopher J.; Krekelberg, William P.; Shen, Vincent K.; Bernstein, Noam
2015-01-01
When a fluid is confined to a nanopore, its thermodynamic properties differ from the properties of a bulk fluid, so measuring such properties of the confined fluid can provide information about the pore sizes. Here we report a simple relation between the pore size and isothermal compressibility of argon confined in these pores. Compressibility is calculated from the fluctuations of the number of particles in the grand canonical ensemble using two different simulation techniques: conventional grand-canonical Monte Carlo and grand-canonical ensemble transition-matrix Monte Carlo. Our results provide a theoretical framework for extracting the information on the pore sizes of fluid-saturated samples by measuring the compressibility from ultrasonic experiments. PMID:26590541
Generalized Gibbs ensembles for quantum field theories
NASA Astrophysics Data System (ADS)
Essler, F. H. L.; Mussardo, G.; Panfil, M.
2015-05-01
We consider the nonequilibrium dynamics in quantum field theories (QFTs). After being prepared in a density matrix that is not an eigenstate of the Hamiltonian, such systems are expected to relax locally to a stationary state. In the presence of local conservation laws, these stationary states are believed to be described by appropriate generalized Gibbs ensembles. Here we demonstrate that in order to obtain a correct description of the stationary state, it is necessary to take into account conservation laws that are not (ultra)local in the usual sense of QFTs, but fulfill a significantly weaker form of locality. We discuss the implications of our results for integrable QFTs in one spatial dimension.
Level statistics of a noncompact cosmological billiard
NASA Astrophysics Data System (ADS)
Csordas, Andras; Graham, Robert; Szepfalusy, Peter
1991-08-01
A noncompact chaotic billiard on a two-dimensional space of constant negative curvature, the infinite equilateral triangle describing anisotropy oscillations in the very early universe, is studied quantum-mechanically. A Weyl formula with a logarithmic correction term is derived for the smoothed number-of-states function. For one symmetry class of the eigenfunctions, the level spacing distribution, the spectral rigidity Δ3, and the Σ2 statistics are determined numerically using the finite matrix approximation. Systematic deviations are found both from the Gaussian orthogonal ensemble (GOE) and the Poissonian ensemble. However, good agreement with the GOE is found if the fundamental triangle is deformed in such a way that it no longer tiles the space.
Subsurface characterization with localized ensemble Kalman filter employing adaptive thresholding
NASA Astrophysics Data System (ADS)
Delijani, Ebrahim Biniaz; Pishvaie, Mahmoud Reza; Boozarjomehry, Ramin Bozorgmehry
2014-07-01
The ensemble Kalman filter (EnKF), as a Monte Carlo sequential data assimilation method, has emerged promisingly for subsurface media characterization during the past decade. Due to the high computational cost of large ensemble sizes, EnKF is limited to small ensemble sets in practice. This results in the appearance of spurious correlations in the covariance structure, leading to incorrect updates or probable divergence of the updated realizations. In this paper, a universal/adaptive thresholding method is presented to remove and/or mitigate the spurious correlation problem in the forecast covariance matrix. This method is then extended to regularize the Kalman gain directly. Four different thresholding functions have been considered to threshold the forecast covariance and gain matrices: hard, soft, lasso and Smoothly Clipped Absolute Deviation (SCAD) functions. Three benchmarks are used to evaluate the performance of these methods: a small 1D linear model and two 2D water flooding (in petroleum reservoirs) cases whose levels of heterogeneity/nonlinearity differ. It should be noted that besides adaptive thresholding, standard distance-dependent localization and bootstrap Kalman gain are also implemented for comparison purposes. We assessed each setup with different ensemble sets to investigate the sensitivity of each method to ensemble size. The results indicate that thresholding of the forecast covariance yields more reliable performance than thresholding of the Kalman gain. Among the thresholding functions, SCAD is the most robust for both covariance and gain estimation. Our analyses emphasize that not all assimilation cycles require thresholding; it should be applied judiciously during the early assimilation cycles. The proposed adaptive thresholding scheme outperforms the other methods for subsurface characterization on the underlying benchmarks.
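For reference, the hard, soft, and SCAD rules named above have closed forms that fit in a few lines; λ and the SCAD parameter a are user-chosen here, so this is a sketch of the operators themselves, not the paper's adaptive selection scheme:

```python
import numpy as np

def hard(z, lam):
    return np.where(np.abs(z) > lam, z, 0.0)

def soft(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scad(z, lam, a=3.7):
    """Fan-Li SCAD thresholding: soft near zero, identity for large |z|."""
    az = np.abs(z)
    return np.where(az <= 2 * lam, soft(z, lam),
           np.where(az <= a * lam,
                    ((a - 1) * z - np.sign(z) * a * lam) / (a - 2),
                    z))

def threshold_covariance(Pf, lam, rule=soft):
    """Apply a rule elementwise to a forecast covariance estimate,
    keeping the variances (diagonal) untouched."""
    P = rule(Pf, lam)
    np.fill_diagonal(P, np.diag(Pf))
    return P
```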
A second-order unconstrained optimization method for canonical-ensemble density-functional methods
NASA Astrophysics Data System (ADS)
Nygaard, Cecilie R.; Olsen, Jeppe
2013-03-01
A second-order converging method of ensemble optimization (SOEO) in the framework of Kohn-Sham Density-Functional Theory is presented, where the energy is minimized with respect to an ensemble density matrix. It is general in the sense that the number of fractionally occupied orbitals is not predefined; rather, it is optimized by the algorithm. SOEO is a second-order Newton-Raphson method of optimization, where both the form of the orbitals and the occupation numbers are optimized simultaneously. To keep the occupation numbers between zero and two, a set of occupation angles is defined, from which the occupation numbers are expressed as trigonometric functions. The total number of electrons is controlled by a built-in second-order restriction of the Newton-Raphson equations, which can be deactivated in the case of a grand-canonical ensemble (where the total number of electrons is allowed to change). To test the optimization method, dissociation curves for diatomic carbon are produced using different functionals for the exchange-correlation energy. These curves show that SOEO favors symmetry-broken pure-state solutions when using functionals with exact exchange such as Hartree-Fock and Becke three-parameter Lee-Yang-Parr. This is explained by an unphysical contribution to the exact exchange energy from interactions between fractional occupations. For functionals without exact exchange, such as the local density approximation or Becke Lee-Yang-Parr, ensemble solutions are favored at interatomic distances larger than the equilibrium distance. Calculations on the chromium dimer are also discussed. They show that SOEO is able to converge to ensemble solutions for systems that are more complicated than diatomic carbon.
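One natural parametrization consistent with this description is to write each occupation number as a squared sine, which keeps it in [0, 2] automatically for any real angle; the specific trigonometric form below is an assumption for illustration, not taken from the paper:

```latex
n_p = 2\sin^2\theta_p, \qquad \theta_p \in \mathbb{R}, \qquad 0 \le n_p \le 2,
\qquad \sum_p n_p = N_{\mathrm{elec}} \ \ \text{(enforced by the built-in restriction)}
```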
Quantifying rapid changes in cardiovascular state with a moving ensemble average.
Cieslak, Matthew; Ryan, William S; Babenko, Viktoriya; Erro, Hannah; Rathbun, Zoe M; Meiring, Wendy; Kelsey, Robert M; Blascovich, Jim; Grafton, Scott T
2018-04-01
MEAP, the moving ensemble analysis pipeline, is a new open-source tool designed to perform multisubject preprocessing and analysis of cardiovascular data, including the electrocardiogram (ECG), impedance cardiogram (ICG), and continuous blood pressure (BP). In addition to traditional ensemble averaging, MEAP implements a moving ensemble averaging method that allows for the continuous estimation of indices related to cardiovascular state, including cardiac output, preejection period, heart rate variability, and total peripheral resistance, among others. Here, we define the moving ensemble technique mathematically, highlighting its differences from fixed-window ensemble averaging. We describe MEAP's interface and features for signal processing, artifact correction, and cardiovascular-based fMRI analysis. We demonstrate the accuracy of MEAP's novel B point detection algorithm on a large collection of hand-labeled ICG waveforms. As a proof of concept, two subjects completed a series of four physical and cognitive tasks (cold pressor, Valsalva maneuver, video game, random dot kinematogram) on 3 separate days while ECG, ICG, and BP were recorded. Critically, the moving ensemble method reliably captures the rapid cyclical cardiovascular changes related to the baroreflex during the Valsalva maneuver and the classic cold pressor response. Cardiovascular measures were seen to vary considerably within repetitions of the same cognitive task for each individual, suggesting that a carefully designed paradigm could be used to capture fast-acting event-related changes in cardiovascular state. © 2017 Society for Psychophysiological Research.
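The distinction between fixed-window and moving ensemble averaging can be illustrated with a toy beat-aligned average over a sliding window of heartbeats; this sketch assumes beats are already detected (R-peak indices are given) and is not MEAP's implementation:

```python
import numpy as np

def moving_ensemble_average(signal, beat_idx, beat_len, half_width=15):
    """For each beat b, average the beat-aligned segments of all beats
    whose index falls within +/- half_width beats of b."""
    beats = np.array([signal[i:i + beat_len] for i in beat_idx
                      if i + beat_len <= len(signal)])
    out = np.empty_like(beats, dtype=float)
    for b in range(len(beats)):
        lo = max(0, b - half_width)
        hi = min(len(beats), b + half_width + 1)
        out[b] = beats[lo:hi].mean(axis=0)   # local ensemble average
    return out                                # one smoothed beat per beat
```

A fixed-window ensemble average would instead collapse all beats in a task block into a single waveform; the moving version retains one averaged waveform per beat, which is what allows continuous tracking of rapidly changing indices.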
Pourhoseingholi, Mohamad Amin; Kheirian, Sedigheh; Zali, Mohammad Reza
2017-12-01
Colorectal cancer (CRC) is one of the most common malignancies and a leading cause of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting the 5-year survival of CRC patients using a variety of basic and ensemble data mining methods. The CRC dataset from the Shahid Beheshti University of Medical Sciences Research Center for Gastroenterology and Liver Diseases was used for the prediction and comparative study of the base and ensemble data mining techniques. Feature selection methods were used to select predictor attributes for classification. The WEKA toolkit and MedCalc software were respectively utilized for creating and comparing the models. The obtained results showed that the predictive performance of the developed models was high overall (all greater than 90%). Overall, the performance of the ensemble models was higher than that of the basic classifiers, and the best result was achieved by the ensemble voting model in terms of area under the ROC curve (AUC = 0.96). An AUC comparison of the models showed that the ensemble voting method significantly outperformed all models except for two methods, Random Forest (RF) and Bayesian Network (BN), considering the overlapping 95% confidence intervals. This result may indicate the high predictive power of these two methods, along with ensemble voting, for predicting the 5-year survival of CRC patients.
A molecular ensemble in the rER for procollagen maturation.
Ishikawa, Yoshihiro; Bächinger, Hans Peter
2013-11-01
Extracellular matrix (ECM) proteins create structural frameworks in tissues such as bone, skin, tendon and cartilage. These connective tissues play important roles in the development and homeostasis of organs. Collagen is the most abundant ECM protein and represents one third of all proteins in humans. The biosynthesis of ECM proteins occurs in the rough endoplasmic reticulum (rER). This review describes the current understanding of the biosynthesis and folding of procollagens, which are the precursor molecules of collagens, in the rER. Multiple folding enzymes and molecular chaperones are required for procollagen to establish specific posttranslational modifications, and to facilitate folding and transport to the cell surface. Thus, this molecular ensemble in the rER contributes to ECM maturation and to the development and homeostasis of tissues. Mutations in this ensemble are likely candidates for connective tissue disorders. This article is part of a Special Issue entitled: Functional and structural diversity of endoplasmic reticulum. Copyright © 2013 Elsevier B.V. All rights reserved.
Strong diffusion formulation of Markov chain ensembles and its optimal weaker reductions
NASA Astrophysics Data System (ADS)
Güler, Marifi
2017-10-01
Two self-contained diffusion formulations, in the form of coupled stochastic differential equations, are developed for the temporal evolution of state densities over an ensemble of Markov chains evolving independently under a common transition rate matrix. Our first formulation derives from Kurtz's strong approximation theorem of density-dependent Markov jump processes [Stoch. Process. Their Appl. 6, 223 (1978), 10.1016/0304-4149(78)90020-0] and, therefore, strongly converges with an error bound of the order of ln N/N for ensemble size N. The second formulation eliminates some fluctuation variables, and correspondingly some noise terms, within the governing equations of the strong formulation, with the objective of achieving a simpler analytic formulation and a faster computation algorithm when the transition rates are constant or slowly varying. There, the reduction of the structural complexity is optimal in the sense that the elimination of any given set of variables takes place with the lowest attainable increase in the error bound. The resultant formulations are supported by numerical simulations.
Localization of soft modes at the depinning transition
NASA Astrophysics Data System (ADS)
Cao, Xiangyu; Bouzat, Sebastian; Kolton, Alejandro B.; Rosso, Alberto
2018-02-01
We characterize the soft modes of the dynamical matrix at the depinning transition, and compare the matrix with the properties of the Anderson model (and long-range generalizations). The density of states at the edge of the spectrum displays a universal linear tail, different from the Lifshitz tails. The eigenvectors are instead very similar in the two matrix ensembles. We focus on the ground state (soft mode), which represents the epicenter of avalanche instabilities. We expect it to be localized in all finite dimensions, and make a clear connection between its localization length and the Larkin length of the depinning model. In the fully connected model, we show that the weak-strong pinning transition coincides with a peculiar localization transition of the ground state.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Novaes, Marcel
2015-06-15
We consider the statistics of time delay in a chaotic cavity having M open channels, in the absence of time-reversal invariance. In the random matrix theory approach, we compute the average value of polynomial functions of the time delay matrix Q = −iħ S†dS/dE, where S is the scattering matrix. Our results do not assume M to be large. In a companion paper, we develop a semiclassical approximation to S-matrix correlation functions, from which the statistics of Q can also be derived. Together, these papers contribute to establishing the conjectured equivalence between the random matrix and the semiclassical approaches.
Cell population modelling of yeast glycolytic oscillations.
Henson, Michael A; Müller, Dirk; Reuss, Matthias
2002-01-01
We investigated a cell-population modelling technique in which the population is constructed from an ensemble of individual cell models. The average value or the number distribution of any intracellular property captured by the individual cell model can be calculated by simulation of a sufficient number of individual cells. The proposed method is applied to a simple model of yeast glycolytic oscillations where synchronization of the cell population is mediated by the action of an excreted metabolite. We show that smooth one-dimensional distributions can be obtained with ensembles comprising 1000 individual cells. Random variations in the state and/or structure of individual cells are shown to produce complex dynamic behaviours which cannot be adequately captured by small ensembles. PMID:12206713
Synchronization of an ensemble of oscillators regulated by their spatial movement.
Sarkar, Sumantra; Parmananda, P
2010-12-01
Synchronization for a collection of oscillators residing in a finite two dimensional plane is explored. The coupling between any two oscillators in this array is unidirectional, viz., master-slave configuration. Initially the oscillators are distributed randomly in space and their autonomous time-periods follow a Gaussian distribution. The duty cycles of these oscillators, which work under an on-off scenario, are normally distributed as well. It is realized that random hopping of oscillators is a necessary condition for observing global synchronization in this ensemble of oscillators. Global synchronization in the context of the present work is defined as the state in which all the oscillators are rendered identical. Furthermore, there exists an optimal amplitude of random hopping for which the attainment of this global synchronization is the fastest. The present work is deemed to be of relevance to the synchronization phenomena exhibited by pulse coupled oscillators such as a collection of fireflies. © 2010 American Institute of Physics.
Random versus maximum entropy models of neural population activity
NASA Astrophysics Data System (ADS)
Ferrari, Ulisse; Obuchi, Tomoyuki; Mora, Thierry
2017-04-01
The principle of maximum entropy provides a useful method for inferring statistical mechanics models from observations in correlated systems, and is widely used in a variety of fields where accurate data are available. While the assumptions underlying maximum entropy are intuitive and appealing, its adequacy for describing complex empirical data has been little studied in comparison to alternative approaches. Here, data from the collective spiking activity of retinal neurons is reanalyzed. The accuracy of the maximum entropy distribution constrained by mean firing rates and pairwise correlations is compared to a random ensemble of distributions constrained by the same observables. For most of the tested networks, maximum entropy approximates the true distribution better than the typical or mean distribution from that ensemble. This advantage improves with population size, with groups as small as eight being almost always better described by maximum entropy. Failure of maximum entropy to outperform random models is found to be associated with strong correlations in the population.
Unimodular lattice triangulations as small-world and scale-free random graphs
NASA Astrophysics Data System (ADS)
Krüger, B.; Schmidt, E. M.; Mecke, K.
2015-02-01
Real-world networks, e.g., the social relations or world-wide-web graphs, exhibit both small-world and scale-free behaviour. We interpret lattice triangulations as planar graphs by identifying triangulation vertices with graph nodes and one-dimensional simplices with edges. Since these triangulations are ergodic with respect to a certain Pachner flip, applying different Monte Carlo simulations enables us to calculate average properties of random triangulations, as well as canonical ensemble averages, using an energy functional that is approximately the variance of the degree distribution. All considered triangulations have clustering coefficients comparable with real-world graphs; for the canonical ensemble there are inverse temperatures with small shortest path length independent of system size. Tuning the inverse temperature to a quasi-critical value leads to an indication of scale-free behaviour for degrees k ≥ 5. Using triangulations as a random graph model can improve the understanding of real-world networks, especially if the actual distance of the embedded nodes becomes important.
NASA Astrophysics Data System (ADS)
Akibue, Seiseki; Kato, Go
2018-04-01
For distinguishing quantum states sampled from a fixed ensemble, the gap in bipartite and single-party distinguishability can be interpreted as a nonlocality of the ensemble. In this paper, we consider bipartite state discrimination in a composite system consisting of N subsystems, where each subsystem is shared between two parties and the state of each subsystem is randomly sampled from a particular ensemble comprising the Bell states. We show that the success probability of perfectly identifying the state converges to 1 as N →∞ if the entropy of the probability distribution associated with the ensemble is less than 1, even if the success probability is less than 1 for any finite N . In other words, the nonlocality of the N -fold ensemble asymptotically disappears if the probability distribution associated with each ensemble is concentrated. Furthermore, we show that the disappearance of the nonlocality can be regarded as a remarkable counterexample of a fundamental open question in theoretical computer science, called a parallel repetition conjecture of interactive games with two classically communicating players. Measurements for the discrimination task include a projective measurement of one party represented by stabilizer states, which enable the other party to perfectly distinguish states that are sampled with high probability.
Ensemble of Thermostatically Controlled Loads: Statistical Physics Approach.
Chertkov, Michael; Chernyak, Vladimir
2017-08-17
Thermostatically controlled loads, e.g., air conditioners and heaters, are by far the most widespread consumers of electricity. Normally the devices are calibrated to provide the so-called bang-bang control - changing from on to off, and vice versa, depending on temperature. We considered aggregation of a large group of similar devices into a statistical ensemble, where the devices operate following the same dynamics, subject to stochastic perturbations and randomized, Poisson on/off switching policy. Using theoretical and computational tools of statistical physics, we analyzed how the ensemble relaxes to a stationary distribution and established a relationship between the relaxation and the statistics of the probability flux associated with devices' cycling in the mixed (discrete, switch on/off, and continuous temperature) phase space. This allowed us to derive the spectrum of the non-equilibrium (detailed balance broken) statistical system and uncover how switching policy affects oscillatory trends and the speed of the relaxation. Relaxation of the ensemble is of practical interest because it describes how the ensemble recovers from significant perturbations, e.g., forced temporary switching off aimed at utilizing the flexibility of the ensemble to provide "demand response" services to change consumption temporarily to balance a larger power grid. We discuss how the statistical analysis can guide further development of the emerging demand response technology.
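A minimal simulation of the ensemble described above, bang-bang band limits plus randomized Poisson on/off switching, might look as follows; all rates, setpoints, and noise amplitudes are illustrative placeholders:

```python
import numpy as np

def simulate_tcl_ensemble(N=10000, steps=2000, dt=0.01, seed=0):
    """Temperatures drift down when a device is on (cooling) and up when
    off; devices flip state at a Poisson rate and at the band limits."""
    rng = np.random.default_rng(seed)
    T = rng.uniform(19.0, 21.0, N)          # device temperatures
    on = rng.random(N) < 0.5                # on/off states
    rate = 0.5                              # random switching rate
    for _ in range(steps):
        drift = np.where(on, -1.0, 1.0)     # cooling when on, heating when off
        T += drift * dt + 0.05 * np.sqrt(dt) * rng.normal(size=N)
        flip = rng.random(N) < rate * dt    # Poisson switching events
        on ^= flip
        on[T > 21.0] = True                 # bang-bang band limits
        on[T < 19.0] = False
    return T, on
```

Tracking the histogram of (T, on) over time would show the relaxation toward the stationary distribution that the paper analyzes, along with the oscillatory transients controlled by the switching rate.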
NASA Astrophysics Data System (ADS)
Zhao, Yan; Stratt, Richard M.
2018-05-01
Surprisingly long-ranged intermolecular correlations begin to appear in isotropic (orientationally disordered) phases of liquid crystal forming molecules when the temperature or density starts to close in on the boundary with the nematic (ordered) phase. Indeed, the presence of slowly relaxing, strongly orientationally correlated, sets of molecules under putatively disordered conditions ("pseudo-nematic domains") has been apparent for some time from light-scattering and optical-Kerr experiments. Still, a fully microscopic characterization of these domains has been lacking. We illustrate in this paper how pseudo-nematic domains can be studied in even relatively small computer simulations by looking for order-parameter tensor fluctuations much larger than one would expect from random matrix theory. To develop this idea, we show that random matrix theory offers an exact description of how the probability distribution for liquid-crystal order parameter tensors converges to its macroscopic-system limit. We then illustrate how domain properties can be inferred from finite-size-induced deviations from these random matrix predictions. A straightforward generalization of time-independent random matrix theory also allows us to prove that the analogous random matrix predictions for the time dependence of the order-parameter tensor are similarly exact in the macroscopic limit, and that relaxation behavior of the domains can be seen in the breakdown of the finite-size scaling required by that random-matrix theory.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
NASA Astrophysics Data System (ADS)
Oh, Seok-Geun; Suh, Myoung-Seok
2017-07-01
The projection skills of five ensemble methods were analyzed according to simulation skills, training period, and number of ensemble members, using 198 sets of pseudo-simulation data (PSD) produced by random number generation assuming the simulated temperature of regional climate models. The PSD sets were classified into 18 categories according to the relative magnitude of bias, variance ratio, and correlation coefficient, where each category had 11 sets (including 1 truth set) with 50 samples. The ensemble methods used were as follows: equally weighted averaging without bias correction (EWA_NBC), EWA with bias correction (EWA_WBC), weighted ensemble averaging based on root mean square errors and correlation (WEA_RAC), WEA based on the Taylor score (WEA_Tay), and multivariate linear regression (Mul_Reg). The projection skills of the ensemble methods generally improved compared with the best member of each category. However, their projection skills are significantly affected by the simulation skills of the ensemble members. The weighted ensemble methods showed better projection skills than the non-weighted methods, in particular for the PSD categories having systematic biases and various correlation coefficients. The EWA_NBC showed considerably lower projection skills than the other methods, in particular for the PSD categories with systematic biases. Although Mul_Reg showed relatively good skills, it showed strong sensitivity to the PSD categories, training periods, and number of members. On the other hand, WEA_Tay and WEA_RAC showed relatively superior skills in both accuracy and reliability for all the sensitivity experiments. This indicates that WEA_Tay and WEA_RAC are applicable even for simulation data with systematic biases, a short training period, and a small number of ensemble members.
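A generic sketch of error-weighted ensemble averaging in the spirit of WEA_RAC follows; the study's exact weighting formula (which also involves the correlation or the Taylor score) is not reproduced here, and inverse-RMSE weights are an assumed, simple choice:

```python
import numpy as np

def weighted_ensemble_average(members_train, obs_train, members_proj):
    """Weight each member by inverse training-period RMSE, then average.

    members_train : (M, T) array, one training-period series per member
    obs_train     : (T,) observed series over the training period
    members_proj  : (M, T_proj) projection-period series per member
    """
    rmse = np.sqrt(((members_train - obs_train) ** 2).mean(axis=1))
    w = 1.0 / rmse               # skillful members get larger weights
    w /= w.sum()
    return w @ members_proj      # weighted projection, shape (T_proj,)
```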
NASA Astrophysics Data System (ADS)
Bi, Lei; Yang, Ping
2015-04-01
Understanding the inherent optical properties (IOPs) of coccoliths and coccolithophores is important in oceanic radiative transfer simulations and remote sensing implementations. In this study, the invariant imbedding T-matrix method (II-TM) is employed to investigate the IOPs of coccoliths and coccolithophores. The Emiliania huxleyi (Ehux) coccolith and coccolithophore models are built based on observed biometric parameters including the eccentricity, the number of slits, and the rim width of detached coccoliths. The calcification state, which specifies the amount of calcium in a single coccolith, is critical in the determination of the size-volume/mass relationship (note that the volume/mass of coccoliths at different calcification states differs even when the diameters are the same). The present results show that the calcification state, namely under-calcification, normal-calcification, or over-calcification, significantly influences the backscattering cross section and the phase matrix. Furthermore, the linear depolarization ratio of the light scattered by coccoliths is sensitive to the degree of calcification, and provides a potentially valuable parameter for interpreting oceanic remote sensing data. The phase function of an ensemble of randomly oriented coccolithophores has a similar pattern to that of individual coccoliths, but forward scattering is dominant in the coccolithophores due to their large geometric cross sections. The linear depolarization ratio associated with coccolithophores is found to be larger than that for coccoliths, as polarization is more sensitive to multiple scattering than the phase function. The simulated coccolithophore phase matrix results are compared with laboratory measurements. For scattering angles larger than 100°, an increase of the phase function with respect to the scattering angle is confirmed based on the present coccolithophore model, while the spherical approximation fails.
Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo
2016-01-01
Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. To combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) - k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study, we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction, and we expect that the proposed GA-EoC would perform consistently in other cases.
Prediction of drug synergy in cancer using ensemble-based machine learning techniques
NASA Astrophysics Data System (ADS)
Singh, Harpreet; Rana, Prashant Singh; Singh, Urvinder
2018-04-01
Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic success. Examination of different drug-drug interactions can be done via the drug synergy score, which calls for efficient regression-based machine learning approaches that minimize prediction errors. Numerous machine learning techniques such as neural networks, support vector machines, random forests, LASSO, Elastic Nets, etc., have been used in the past to meet this requirement. However, individually these techniques do not provide significant accuracy for the drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques were implemented on the drug synergy data. Based on the accuracy of each model, the four techniques with the highest accuracy were selected to develop an ensemble-based machine learning model. These models are Random Forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning (GFS.GCCL), the Adaptive-Network-Based Fuzzy Inference System (ANFIS) and the Dynamic Evolving Neural-Fuzzy Inference System (DENFIS). Ensembling is achieved by a biased weighted aggregation of the predictions of the selected models (i.e. adding more weight to models with a higher prediction score). The proposed and existing machine learning techniques have been evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms the others in terms of accuracy, root mean square error and coefficient of correlation.
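A minimal numpy rendering of the biased weighted aggregation step follows; the inverse-RMSE weighting is an assumption for illustration, since the abstract does not spell out the exact weighting formula, and the four "models" are random stand-ins for RF, GFS.GCCL, ANFIS and DENFIS.

```python
# Sketch of biased weighted aggregation: each model's prediction is
# weighted by its validation score, so better models count more.
import numpy as np

rng = np.random.default_rng(1)
y_val = rng.random(50)                       # validation drug-synergy scores
preds = y_val + rng.normal(0, [[0.05], [0.10], [0.20], [0.40]], (4, 50))

# Score each model; here the bias is inverse RMSE (an assumption).
rmse = np.sqrt(((preds - y_val) ** 2).mean(axis=1))
score = 1.0 / rmse
weights = score / score.sum()

y_ensemble = weights @ preds                 # weighted aggregation
print("model RMSEs:", rmse.round(3))
print("ensemble RMSE:", np.sqrt(((y_ensemble - y_val) ** 2).mean()).round(3))
```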
NASA Astrophysics Data System (ADS)
Liu, Danian; Zhu, Jiang; Shu, Yeqiang; Wang, Dongxiao; Wang, Weiqiang; Cai, Shuqun
2018-06-01
The Northwestern Tropical Pacific Ocean (NWTPO) mooring observing system, comprising 15 moorings, was established in 2013 to provide velocity profile data. Observing system simulation experiments (OSSEs) were carried out in a pilot study to assess the ability of the observing system to monitor intraseasonal variability; ideal "mooring-observed" velocities were assimilated using Ensemble Optimal Interpolation (EnOI) based on the Regional Oceanic Modeling System (ROMS). Because errors between the control and "nature" runs have a mesoscale structure, a random ensemble derived from 20-90-day bandpass-filtered nine-year model outputs proved more appropriate for the NWTPO mooring array assimilation than a random ensemble derived from a 30-day running mean. The simulation of the intraseasonal currents in the North Equatorial Current (NEC), North Equatorial Countercurrent (NECC), and Equatorial Undercurrent (EUC) areas can be improved by assimilating velocity profiles using the 20-90-day bandpass-filtered ensemble. The root mean square errors (RMSEs) of the intraseasonal zonal (U) and meridional (V) velocities above 500 m depth within the study area (0°N-18°N, 122°E-147°E) were reduced by 15.4% and 16.9%, respectively. Improvements were greatest in the downstream area of the NEC mooring transect, where the RMSEs of the intraseasonal velocities above 500 m were reduced by more than 30%. Assimilating velocity profiles can have a positive impact on the simulation and forecast of thermohaline structure and sea level anomalies in the ocean.
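A minimal sketch of a single EnOI analysis step follows, assuming a static ensemble supplies the background covariance; the state size, observation operator H and error levels are invented stand-ins, not the ROMS configuration used in the study.

```python
# Minimal EnOI analysis step: a static ensemble supplies the background
# covariance, and sparse "mooring" velocity observations update the state.
import numpy as np

rng = np.random.default_rng(2)
n, m, p = 200, 40, 15          # state size, static ensemble size, obs count

A = rng.standard_normal((n, m))            # e.g. bandpass-filtered model states
Xp = (A - A.mean(axis=1, keepdims=True)) / np.sqrt(m - 1)   # anomalies
H = np.zeros((p, n)); H[np.arange(p), rng.choice(n, p, False)] = 1.0
R = 0.1 * np.eye(p)                        # observation error covariance

xb = rng.standard_normal(n)                # background (control run) state
y = H @ xb + rng.multivariate_normal(np.zeros(p), R)   # synthetic obs

# Kalman gain K = P H^T (H P H^T + R)^(-1) with P = Xp Xp^T, kept factored.
HXp = H @ Xp
K = Xp @ HXp.T @ np.linalg.inv(HXp @ HXp.T + R)
xa = xb + K @ (y - H @ xb)                 # analysis state
print("innovation norm:", np.linalg.norm(y - H @ xb).round(3),
      "-> residual norm:", np.linalg.norm(y - H @ xa).round(3))
```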
NASA Astrophysics Data System (ADS)
Ushenko, Yu. A.; Prysyazhnyuk, V. P.; Gavrylyak, M. S.; Gorsky, M. P.; Bachinskiy, V. T.; Vanchuliak, O. Ya.
2015-02-01
A new information-optical technique for diagnosing the structure of polycrystalline films of blood plasma is proposed. A model of the Mueller-matrix description of the mechanisms of optical anisotropy of such objects (optical activity, birefringence, and linear and circular dichroism) is suggested. The ensemble of informationally topical, azimuthally stable Mueller-matrix invariants is determined. Statistical analysis of the distributions of these parameters yielded objective criteria for differentiating films of blood plasma taken from healthy donors and from patients with liver cirrhosis. From the standpoint of evidence-based medicine, the operational characteristics (sensitivity, specificity and accuracy) of the information-optical method of Mueller-matrix mapping of polycrystalline blood plasma films were found, and its efficiency in the diagnostics of liver cirrhosis was demonstrated. Prospects for applying the method in experimental medicine to differentiate postmortem changes of myocardial tissue were also examined.
Topology determines force distributions in one-dimensional random spring networks.
Heidemann, Knut M; Sageman-Furnas, Andrew O; Sharma, Abhinav; Rehfeldt, Florian; Schmidt, Christoph F; Wardetzky, Max
2018-02-01
Networks of elastic fibers are ubiquitous in biological systems and often provide mechanical stability to cells and tissues. Fiber-reinforced materials are also common in technology. An important characteristic of such materials is their resistance to failure under load. Rupture occurs when fibers break under excessive force and when that failure propagates. Therefore, it is crucial to understand force distributions. Force distributions within such networks are typically highly inhomogeneous and are not well understood. Here we construct a simple one-dimensional model system with periodic boundary conditions by randomly placing linear springs on a circle. We consider ensembles of such networks that consist of N nodes and have an average degree of connectivity z but vary in topology. Using a graph-theoretical approach that accounts for the full topology of each network in the ensemble, we show that, surprisingly, the force distributions can be fully characterized in terms of the parameters (N,z). Despite the universal properties of such (N,z) ensembles, our analysis further reveals that a classical mean-field approach fails to capture force distributions correctly. We demonstrate that network topology is a crucial determinant of force distributions in elastic spring networks.
Khalid, Shehzad; Arshad, Sannia; Jabbar, Sohail; Rho, Seungmin
2014-01-01
We present a classification framework that combines multiple heterogeneous classifiers in the presence of class-label noise. An extension of m-Mediods based modelling is presented that generates models of the various classes while identifying and filtering noisy training data. The noise-free data are then used to learn models for other classifiers such as GMM and SVM. A weight-learning method is introduced that learns per-class weights for the different classifiers to construct an ensemble. For this purpose, we applied a genetic algorithm to search for an optimal weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets. It is also compared with existing standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class-label noise and imbalanced classes.
Optical hyperpolarization of 13C nuclear spins in nanodiamond ensembles
NASA Astrophysics Data System (ADS)
Chen, Q.; Schwarz, I.; Jelezko, F.; Retzker, A.; Plenio, M. B.
2015-11-01
Dynamical nuclear polarization (DNP) holds the key to orders-of-magnitude enhancements of nuclear magnetic resonance signals which, in turn, would enable a wide range of novel applications in the biomedical sciences. However, current implementations of DNP require cryogenic temperatures and long times for achieving high polarization. Here we propose and analyze in detail protocols that can achieve rapid hyperpolarization of 13C nuclear spins in randomly oriented ensembles of nanodiamonds at room temperature. Our protocols exploit a combination of optical polarization of electron spins in nitrogen-vacancy centers and the transfer of this polarization to 13C nuclei by means of microwave control to overcome the severe challenges posed by the random orientation of the nanodiamonds and their nitrogen-vacancy centers. Specifically, these random orientations result in exceedingly large energy variations of the electron spin levels that render the polarization and coherent control of the nitrogen-vacancy center electron spins, as well as the control of their coherent interaction with the surrounding 13C nuclear spins, highly inefficient. We address these challenges by combining an off-resonant microwave double-resonance scheme with a realization of the integrated solid effect which, together with adiabatic rotations of external magnetic fields or rotations of the nanodiamonds, leads to a protocol that achieves high levels of hyperpolarization of the entire nuclear-spin bath in a randomly oriented ensemble of nanodiamonds even at room temperature. This hyperpolarization, together with the long nuclear-spin polarization lifetimes in nanodiamonds and the relatively high density of 13C nuclei, has the potential to yield a major signal enhancement in 13C nuclear magnetic resonance imaging and suggests functionalized, hyperpolarized nanodiamonds as a unique probe for molecular imaging both in vitro and in vivo.
Analytical Applications of Monte Carlo Techniques.
ERIC Educational Resources Information Center
Guell, Oscar A.; Holcombe, James A.
1990-01-01
Described are analytical applications of the theory of random processes, in particular solutions obtained by using statistical procedures known as Monte Carlo techniques. Supercomputer simulations, sampling, integration, ensemble, annealing, and explicit simulation are discussed. (CW)
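A minimal example of the Monte Carlo integration mentioned above, assuming nothing beyond NumPy: the integral of e^x on [0, 1] is estimated as a sample mean, with the usual 1/sqrt(N) statistical error.

```python
# Minimal Monte Carlo integration: estimate an integral by averaging
# the integrand over uniform random samples.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.random(n)                  # uniform samples on [0, 1]
estimate = np.exp(x).mean()        # Monte Carlo estimate of the integral
exact = np.e - 1.0
print(estimate, exact, abs(estimate - exact))
```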
On the error probability of general tree and trellis codes with applications to sequential decoding
NASA Technical Reports Server (NTRS)
Johannesson, R.
1973-01-01
An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random binary tree codes is derived and shown to be independent of the length of the tree. An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random L-branch binary trellis codes of rate R = 1/n is derived which separates the effects of the tail length T and the memory length M of the code. It is shown that the bound is independent of the length L of the information sequence. This implication is investigated by computer simulations of sequential decoding utilizing the stack algorithm. These simulations confirm the implication and further suggest an empirical formula for the true undetected decoding error probability with sequential decoding.
Wang, Qi; Xie, Zhiyi; Li, Fangbai
2015-11-01
This study identifies and apportions multi-source and multi-phase heavy metal pollution from natural and anthropogenic inputs in agricultural soils on the local scale, using ensemble models that include stochastic gradient boosting (SGB) and random forest (RF). The heavy metal pollution sources were quantitatively assessed, and the results illustrated the suitability of the ensemble models for assessing multi-source and multi-phase heavy metal pollution in agricultural soils on the local scale. The results of SGB and RF consistently demonstrated that anthropogenic sources contributed the most to the concentrations of Pb and Cd in agricultural soils in the study region, and that SGB performed better than RF.
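As a hedged sketch of the model comparison (the soil data here are simulated, and scikit-learn's estimators stand in for the study's SGB and RF implementations; gradient boosting with subsample < 1 is the stochastic variant):

```python
# Compare a stochastic gradient boosting regressor and a random forest
# by cross-validated R^2 on simulated regression data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)
sgb = GradientBoostingRegressor(subsample=0.5, random_state=0)  # stochastic GB
rf = RandomForestRegressor(n_estimators=200, random_state=0)
for name, model in [("SGB", sgb), ("RF", rf)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(name, "mean CV R^2:", round(r2, 3))
```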
NASA Astrophysics Data System (ADS)
Engeland, Kolbjorn; Steinsland, Ingelin
2016-04-01
The aim of this study is to investigate how the inclusion of uncertainties in inputs and observed streamflow influences parameter estimation, streamflow predictions and model evaluation. In particular, we wanted to answer the following research questions: • What is the effect of including a random error in the precipitation and temperature inputs? • What is the effect of decreased information about precipitation, obtained by excluding the nearest precipitation station? • What is the effect of the uncertainty in streamflow observations? • What is the effect of reduced information about the true streamflow, obtained by using a rating curve estimated without the highest and lowest streamflow measurements? To answer these questions, we designed a set of calibration experiments and evaluation strategies. We used the elevation-distributed HBV model operating on daily time steps, combined with a Bayesian formulation and the MCMC routine DREAM for parameter inference. The uncertainties in inputs were represented by creating ensembles of precipitation and temperature. The precipitation ensembles were created using a meta-Gaussian random field approach. The temperature ensembles were created using 3D Bayesian kriging with random sampling of the temperature lapse rate. The streamflow ensembles were generated by a Bayesian multi-segment rating curve model. Precipitation and temperature were randomly sampled for every day, whereas the streamflow ensembles were generated from rating curve ensembles, and the same rating curve was always used for the whole time series in a calibration or evaluation run. We chose a catchment with a meteorological station measuring precipitation and temperature, and a rating curve of relatively high quality. This allowed us to investigate and further test the effect of having less information on precipitation and streamflow during model calibration, prediction and evaluation. The results showed that including uncertainty in the precipitation and temperature inputs has a negligible effect on the posterior distribution of parameters and on the Nash-Sutcliffe (NS) efficiency of the predicted flows, while the reliability and the continuous ranked probability score (CRPS) improve. Reduced information in the precipitation input resulted in a shift in the water balance parameter Pcorr and in a model producing smoother streamflow predictions, giving poorer NS and CRPS but higher reliability. The effect of calibrating the hydrological model using wrong rating curves is mainly seen as variability in the water balance parameter Pcorr. When evaluating predictions obtained using a wrong rating curve, the evaluation scores vary depending on the true rating curve. Generally, the best evaluation scores were achieved not for the rating curve used for calibration, but for rating curves giving low variance in the streamflow observations. Reduced information in streamflow influenced the water balance parameter Pcorr and increased the spread in evaluation scores, giving both better and worse scores. This case study shows that estimating the water balance is challenging, since both precipitation inputs and streamflow observations have pronounced systematic components in their uncertainties.
Breaking time reversal in a simple smooth chaotic system.
Tomsovic, Steven; Ullmo, Denis; Nagano, Tatsuro
2003-06-01
Within random matrix theory, the statistics of the eigensolutions depend fundamentally on the presence (or absence) of time reversal symmetry. Accepting the Bohigas-Giannoni-Schmit conjecture, this statement extends to quantum systems with chaotic classical analogs. For practical reasons, most of the supporting numerical studies of symmetry breaking have been done with billiards or maps, and little with simple, smooth systems. There are two main difficulties in attempting to break time reversal invariance in a continuous-time system with a smooth potential. The first is avoiding false time reversal breaking. The second is locating a parameter regime in which the symmetry breaking is strong enough to transform the fluctuation properties fully to the broken-symmetry case, and yet remains weak enough so as not to regularize the dynamics sufficiently that the system is no longer chaotic. We give an example of a system of two coupled quartic oscillators whose energy level statistics closely match those of the Gaussian unitary ensemble, and which possesses only a minor proportion of regular motion in its phase space.
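A quick way to see Gaussian-unitary-ensemble statistics numerically is the nearest-neighbour spacing-ratio test, which avoids spectral unfolding; the sketch below samples one GUE matrix and checks against the known benchmark values (mean ratio near 0.60 for GUE, near 0.54 for GOE, near 0.39 for Poisson spectra).

```python
# Spacing-ratio diagnostic for GUE-like level statistics.
import numpy as np

rng = np.random.default_rng(4)
N = 400
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
H = (A + A.conj().T) / 2                   # GUE sample (broken time reversal)
E = np.sort(np.linalg.eigvalsh(H))
s = np.diff(E)                             # nearest-neighbour spacings
r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
print("mean spacing ratio:", r.mean().round(3))   # expect about 0.60 for GUE
```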
Diffusion, Dispersion, and Uncertainty in Anisotropic Fractal Porous Media
NASA Astrophysics Data System (ADS)
Monnig, N. D.; Benson, D. A.
2007-12-01
Motivated by field measurements of aquifer hydraulic conductivity (K), recent techniques were developed to construct anisotropic fractal random fields, in which the scaling, or self-similarity parameter, varies with direction and is defined by a matrix. Ensemble numerical results are analyzed for solute transport through these 2-D "operator-scaling" fractional Brownian motion (fBm) ln(K) fields. Contrary to some analytic stochastic theories for monofractal K fields, the plume growth rates never exceed Mercado's (1967) purely stratified aquifer growth rate of plume apparent dispersivity proportional to mean distance. Apparent super-stratified growth must be the result of other demonstrable factors, such as initial plume size. The addition of large local dispersion and diffusion does not significantly change the effective longitudinal dispersivity of the plumes. In the presence of significant local dispersion or diffusion, the concentration coefficient of variation CV = σ_c/⟨c⟩ remains large at the leading edge of the plumes. This indicates that even with considerable mixing due to dispersion or diffusion, there is still substantial uncertainty in the leading edge of a plume moving in fractal porous media.
Power flow prediction in vibrating systems via model reduction
NASA Astrophysics Data System (ADS)
Li, Xianhui
This dissertation focuses on power flow prediction in vibrating systems. Reduced order models (ROMs) are built via rational Krylov model reduction and preserve the power flow information of the original systems over a specified frequency band. Stiffness and mass matrices of the ROMs are obtained by projecting the original system matrices onto the subspaces spanned by forced responses. A matrix-free algorithm is designed to construct ROMs directly from the power quantities at selected interpolation frequencies. Strategies for parallel implementation of the algorithm via the Message Passing Interface are proposed. The quality of the ROMs is iteratively refined according to an error estimate based on residual norms. Band capacity is proposed to provide an a priori estimate of the sizes of good-quality ROMs. Frequency averaging is recast as ensemble averaging, and the Cauchy distribution is used to simplify the computation. Besides model reduction for deterministic systems, details of constructing ROMs for parametric and nonparametric random systems are also presented. Case studies have been conducted on testbeds from the Harwell-Boeing collection. Input and coupling power flows are computed for the original systems and the ROMs. Good agreement is observed in all cases.
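The core projection step can be illustrated with a toy damped spring chain; all sizes, damping and interpolation points below are invented, and this is a plain projection sketch rather than the dissertation's matrix-free algorithm.

```python
# Interpolatory (rational-Krylov-type) ROM built from forced responses,
# checked by comparing time-averaged input power at off-sample frequencies.
import numpy as np

n = 200
K = np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
M = np.eye(n)
C = 0.01 * K                                   # proportional damping
f = np.zeros(n); f[0] = 1.0                    # harmonic point force

def input_power(Kx, Mx, Cx, fx, w):
    x = np.linalg.solve(Kx + 1j * w * Cx - w**2 * Mx, fx)
    return 0.5 * np.real(np.conj(fx) @ (1j * w * x))   # time-averaged power

# Basis from forced responses at interpolation frequencies, orthonormalized.
ws = [0.2, 0.6, 1.0, 1.4]
X = np.column_stack([np.linalg.solve(K + 1j*w*C - w**2*M, f) for w in ws])
Q, _ = np.linalg.qr(np.column_stack([X.real, X.imag]))

Kr, Mr, Cr, fr = Q.T @ K @ Q, Q.T @ M @ Q, Q.T @ C @ Q, Q.T @ f
for w in [0.4, 0.8, 1.2]:                      # off-interpolation test points
    print(w, input_power(K, M, C, f, w), input_power(Kr, Mr, Cr, fr, w))
```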
NASA Astrophysics Data System (ADS)
Krishnan, Chethan; Pavan Kumar, K. V.; Rosa, Dario
2018-01-01
We contrast some aspects of various SYK-like models with large-N melonic behavior. First, we note that ungauged tensor models can exhibit symmetry breaking, even though these are 0+1 dimensional theories. Related to this, we show that when gauged, some of them admit no singlets, and are anomalous. The uncolored Majorana tensor model with even N is a simple case where gauge singlets can exist in the spectrum. We outline a strategy for solving for the singlet spectrum, taking advantage of the results in arXiv:1706.05364, and reproduce the singlet states expected in N = 2. In the second part of the paper, we contrast the random matrix aspects of some ungauged tensor models, the original SYK model, and a model due to Gross and Rosenhaus. The latter, even though disorder averaged, shows parallels with the Gurau-Witten model. In particular, the two models fall into identical Andreev ensembles as a function of N. In an appendix, we contrast the (expected) spectra of AdS2 quantum gravity, SYK and SYK-like tensor models, and the zeros of the Riemann zeta function.
Quantum Effects at a Proton Relaxation at Low Temperatures
NASA Astrophysics Data System (ADS)
Kalytka, V. A.; Korovkin, M. V.
2016-11-01
Quantum effects during migratory polarization in multi-well crystals (including multi-well silicates and crystalline hydrates) are investigated in a variable electric field at low temperatures by direct quantum-mechanical calculations. Based on analytical solution of the quantum Liouville kinetic equation in the linear approximation for the polarizing field, the non-stationary density matrix is calculated for an ensemble of non-interacting protons moving in the field of one-dimensional multi-well crystal potential relief of rectangular shape. An expression for the complex dielectric constant convenient for a comparison with experiment and calculation of relaxer parameters is derived using the nonequilibrium polarization density matrix. The density matrix apparatus can be used for analytical investigation of the quantum mechanism of spontaneous polarization of a ferroelectric material (KDP and DKDP).
Autofluorescent polarimetry of bile films in the liver pathology differentiation
NASA Astrophysics Data System (ADS)
Prysyazhnyuk, V. P.; Ushenko, Yu. O.; Dubolazov, O. V.; Ushenko, A. G.; Savich, V. O.; Karachevtsev, A. O.
2015-09-01
A new information-optical technique for diagnosing the structure of polycrystalline bile films is proposed. A model of the Mueller-matrix description of the mechanisms of optical anisotropy of such objects (optical activity, birefringence, and linear and circular dichroism) is suggested. The ensemble of informationally topical, azimuthally stable Mueller-matrix invariants is determined. Statistical analysis of the distributions of these parameters yielded objective criteria for differentiating polycrystalline bile films taken from patients with fatty degeneration (group 1) and chronic hepatitis (group 2) of the liver. From the standpoint of evidence-based medicine, the operational characteristics (sensitivity, specificity and accuracy) of the information-optical method of Mueller-matrix mapping of polycrystalline bile films were found, and its efficiency in the diagnostics of pathological changes was demonstrated.
Blending of Radial HF Radar Surface Current and Model Using ETKF Scheme For The Sunda Strait
NASA Astrophysics Data System (ADS)
Mujiasih, Subekti; Riyadi, Mochammad; Wandono, Dr; Wayan Suardana, I.; Nyoman Gede Wiryajaya, I.; Nyoman Suarsa, I.; Hartanto, Dwi; Barth, Alexander; Beckers, Jean-Marie
2017-04-01
A preliminary study of surface current data blending for the Sunda Strait, Indonesia, has been carried out using the analysis scheme of the Ensemble Transform Kalman Filter (ETKF). The method combines radial velocities from HF radar with the u and v velocity components from the global Copernicus Marine Environment Monitoring Service (CMEMS) model. The initial ensemble is based on the time variability of the CMEMS model results. The tested data come from two CODAR SeaSonde radar sites in the Sunda Strait on two dates, 09 September 2013 and 08 February 2016, at 12.00 UTC. The radial HF radar data have an hourly temporal resolution, a 20-60 km spatial range, 3 km range resolution, 5 degrees of angular resolution, and an 11.5-14 MHz frequency range. The u and v components of the model velocity represent a daily mean at 1/12 degree spatial resolution. The radial data from one HF radar site are analyzed, and the result is compared to the equivalent radial velocity from CMEMS for the second HF radar site. Errors are quantified by the root mean squared error (RMSE). The ensemble analysis and ensemble mean are calculated using the Sangoma software package. The tested observation error covariance matrix R is diagonal, with diagonal elements equal to 0.05, 0.5 or 1.0 m2/s2. The initial ensemble members come from a model simulation spanning one month (September 2013 or February 2016), one year (2013) or four years (2013-2016). The spatial distributions of the radial current are analyzed, and the RMSE values obtained from the independent HF radar station are optimized. It was verified that the analysis reproduces well the structure contained in the analyzed HF radar data. More importantly, the analysis also improved relative to the second, independent HF radar site: the RMSE of the improved analysis is better than that of the first HF radar site analysis. The best result of the blending exercise was obtained for an observation error variance equal to 0.05 m2/s2. This study is still a preliminary step, but it gives promising results for larger datasets, for combining other models, and for further development. Keywords: HF Radar, Sunda Strait, ETKF, CMEMS
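For reference, a minimal square-root ETKF analysis step is sketched below with a diagonal R = 0.05 I m2/s2, mirroring the best case above; the state, ensemble, observation operator and "radial" data are random stand-ins, not the Sangoma implementation.

```python
# Minimal ETKF analysis step (square-root form): the update is computed
# in the subspace spanned by the ensemble anomalies.
import numpy as np

rng = np.random.default_rng(6)
n, m, p = 500, 30, 40                    # state, ensemble size, radial obs

E = rng.standard_normal((n, m))          # ensemble of model velocity states
H = rng.standard_normal((p, n)) / np.sqrt(n)   # stand-in radial projection
R = 0.05 * np.eye(p)
y = rng.standard_normal(p)               # HF-radar radial velocities (fake)

xm = E.mean(axis=1)
Xp = E - xm[:, None]                     # ensemble anomalies
S = H @ Xp                               # observed anomalies
Pt = np.linalg.inv((m - 1) * np.eye(m) + S.T @ np.linalg.solve(R, S))
wm = Pt @ S.T @ np.linalg.solve(R, y - H @ xm)      # mean weight vector
lam, U = np.linalg.eigh((m - 1) * Pt)
W = U @ np.diag(np.sqrt(lam)) @ U.T      # symmetric square-root transform
Ea = xm[:, None] + Xp @ (wm[:, None] + W)           # analysis ensemble
print("analysis spread / prior spread:",
      (Ea.std(axis=1).mean() / E.std(axis=1).mean()).round(3))
```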
Chang, Chi-Ying; Chang, Chia-Chi; Hsiao, Tzu-Chien
2013-01-01
Excitation-emission matrix (EEM) fluorescence spectroscopy is a noninvasive method for tissue diagnosis and has become important in clinical use. However, the intrinsic characterization of EEM fluorescence remains unclear. Photobleaching and the complexity of the chemical compounds make it difficult to distinguish individual compounds due to overlapping features. Conventional studies use principal component analysis (PCA) for EEM fluorescence analysis, and the relationship between the EEM features extracted by PCA and diseases has been examined. The spectral features of different tissue constituents are not fully separable or clearly defined. Recently, a non-stationary method called multi-dimensional ensemble empirical mode decomposition (MEEMD) was introduced; this method can extract the intrinsic oscillations on multiple spatial scales without loss of information. The aim of this study was to propose a fluorescence spectroscopy system for EEM measurements and to describe a method for extracting the intrinsic characteristics of EEM by MEEMD. The results indicate that, although PCA provides the principal factor for the spectral features associated with chemical compounds, MEEMD can provide additional intrinsic features with more reliable mapping of the chemical compounds. MEEMD has the potential to extract intrinsic fluorescence features and improve the detection of biochemical changes. PMID:24240806
Ideas for a pattern-oriented approach towards a VERA analysis ensemble
NASA Astrophysics Data System (ADS)
Gorgas, T.; Dorninger, M.
2010-09-01
For many applications in meteorology, and especially for verification purposes, it is important to have information about the uncertainties of observation and analysis data. A high quality of these "reference data" is an absolute necessity, as the uncertainties are reflected in verification measures. The VERA (Vienna Enhanced Resolution Analysis) scheme includes a sophisticated quality control tool which accounts for the correction of observational data and provides an estimation of the observation uncertainty. It is crucial for meteorologically and physically reliable analysis fields. VERA is based on a variational principle and does not need any first-guess fields. It is therefore independent of NWP models and can also be used as an unbiased reference for real-time model verification. For downscaling purposes VERA uses a priori knowledge of small-scale physical processes over complex terrain, the so-called "fingerprint technique", which transfers information from data-rich to data-sparse regions. The enhanced joint D-PHASE and COPS data set forms the data base for the analysis ensemble study. For the WWRP projects D-PHASE and COPS, a joint activity has been started to collect GTS and non-GTS data from the national and regional meteorological services in Central Europe for 2007. Data from more than 11,000 stations are available for high-resolution analyses. The use of random numbers as perturbations for ensemble experiments is a common approach in meteorology. In most implementations, as for NWP-model ensemble systems, the focus lies on error growth and propagation on the spatial and temporal scale. When defining errors in analysis fields, we have to consider the fact that analyses are not time dependent and that no perturbation method aimed at temporal evolution is possible. Further, the method applied should respect the two major sources of analysis errors: observation errors AND analysis or interpolation errors. With the concept of an analysis ensemble we hope to get a more detailed view of both sources of analysis errors. For the computation of the VERA ensemble members, a sample of Gaussian random perturbations is produced for each station and parameter. The standard deviation of the perturbations is based on the correction proposals of the VERA QC scheme, to provide "natural" limits for the ensemble. In order to put more emphasis on the weather situation, we aim to integrate the main synoptic field structures as weighting factors for the perturbations. Two well-established approaches are used to define these main field structures: principal component analysis and a 2D discrete wavelet transform. The results of tests concerning the implementation of this pattern-supported analysis ensemble system, and a comparison of the different approaches, are given in the presentation.
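One possible reading of the pattern-supported perturbation idea is sketched below with invented sizes and QC scales: Gaussian station noise is projected onto the leading principal components of an analysis archive, so that only the main synoptic structures are perturbed. This is an illustration of the concept, not the VERA implementation.

```python
# Pattern-weighted station perturbations via a PCA (EOF) projection.
import numpy as np

rng = np.random.default_rng(7)
n_stations, n_cases, n_members, k = 300, 120, 20, 5

fields = rng.standard_normal((n_cases, n_stations))   # archive of analyses
sigma_qc = 0.5 + 0.5 * rng.random(n_stations)         # QC correction scales

# Leading PCA patterns of the field archive define the weighting basis.
anom = fields - fields.mean(axis=0)
_, _, Vt = np.linalg.svd(anom, full_matrices=False)
P = Vt[:k]                                            # k main patterns

members = []
for _ in range(n_members):
    eps = sigma_qc * rng.standard_normal(n_stations)  # raw station noise
    members.append(P.T @ (P @ eps))                   # keep pattern part only
members = np.array(members)
print("perturbation std before/after projection:",
      sigma_qc.mean().round(2), members.std().round(2))
```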
Staggered chiral random matrix theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osborn, James C.
2011-02-01
We present a random matrix theory for the staggered lattice QCD Dirac operator. The staggered random matrix theory is equivalent to the zero-momentum limit of the staggered chiral Lagrangian and includes all taste breaking terms at their leading order. This is an extension of previous work which only included some of the taste breaking terms. We will also present some results for the taste breaking contributions to the partition function and the Dirac eigenvalues.
NASA Astrophysics Data System (ADS)
Briseño, Jessica; Herrera, Graciela S.
2010-05-01
Herrera (1998) proposed a method for the optimal design of groundwater quality monitoring networks that involves space and time in a combined form. The method was applied later by Herrera et al. (2001) and by Herrera and Pinder (2005). To obtain estimates of the contaminant concentration being analyzed, this method uses a space-time ensemble Kalman filter based on a stochastic flow and transport model. When the method is applied, it is important that the characteristics of the stochastic model be congruent with field data, but, in general, it is laborious to achieve a good match between them manually. For this reason, the main objective of this work is to extend the space-time ensemble Kalman filter proposed by Herrera to estimate the hydraulic conductivity together with the hydraulic head and contaminant concentration, and to apply it to a synthetic example. The method has three steps: 1) Given the mean and the semivariogram of the natural logarithm of hydraulic conductivity (ln K), random realizations of this parameter are obtained through two alternatives: Gaussian simulation (SGSim) and the Latin hypercube sampling method (LHC). 2) The stochastic model is used to produce hydraulic head (h) and contaminant concentration (C) realizations for each of the conductivity realizations. With these realizations the means of ln K, h and C are obtained (for h and C, the mean is calculated in space and time), together with the space-time cross-covariance matrix of h-ln K-C. The covariance matrix is obtained by averaging products of the ln K, h and C realizations at the estimation points and times, and at the positions and times with data for the analyzed variables. The estimation points are the positions at which estimates of ln K, h or C are gathered; analogously, the estimation times are those at which estimates of any of the three variables are gathered. 3) Finally, the ln K, h and C estimates are obtained using the space-time ensemble Kalman filter. The realization mean for each of the variables is used as the prior space-time estimate for the Kalman filter, and the space-time cross-covariance matrix of h-ln K-C as the prior estimate-error covariance matrix. The synthetic example has a modeling area of 700 x 700 square meters; a triangular mesh model with 702 nodes and 1306 elements is used. A pumping well located in the central part of the study area is considered. For the contaminant transport model, a contaminant source area is present in the western part of the study area. The estimation points for hydraulic conductivity, hydraulic head and contaminant concentration are located on a submesh of the model mesh (the same locations for h, ln K and C), composed of 48 nodes spread throughout the study area with a separation of approximately 90 meters between nodes. The results were analyzed through the mean error, the root mean square error, the initial and final estimation maps of h, ln K and C at each time, and the initial and final variance maps of h, ln K and C. To obtain model convergence, 3000 realizations of ln K were required using SGSim, and only 1000 with LHC. The results show that for both alternatives the Kalman filter estimates of h, ln K and C using h and C data have errors whose magnitudes decrease as data are added. References: Herrera, G. S. (1998), Cost Effective Groundwater Quality Sampling Network Design, Ph.D. thesis, University of Vermont, Burlington, Vermont, 172 pp. Herrera, G., Guarnaccia, J., Pinder, G. and Simuta, R. (2001), "Diseño de redes de monitoreo de la calidad del agua subterránea eficientes" [Design of efficient groundwater quality monitoring networks], Proceedings of the 2001 International Symposium on Environmental Hydraulics, Arizona, U.S.A. Herrera, G. S. and Pinder, G. F. (2005), Space-time optimization of groundwater quality sampling networks, Water Resour. Res., 41(12), W12407, doi:10.1029/2004WR003626.
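Step 1 can be illustrated with a minimal unconditional Gaussian simulation, assuming an exponential covariance model with invented mean, sill and range parameters (the study's actual semivariogram and grid are not reproduced):

```python
# Unconditional Gaussian realizations of ln K via Cholesky factorization
# of an exponential covariance model on scattered points.
import numpy as np

rng = np.random.default_rng(8)
coords = rng.random((150, 2)) * 700.0        # points in a 700 m x 700 m area
mean_lnK, var, a = -4.0, 1.0, 90.0           # mean, sill, range (assumed)

d = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
Cov = var * np.exp(-d / a)                   # exponential covariance model
L = np.linalg.cholesky(Cov + 1e-10 * np.eye(len(coords)))

realizations = mean_lnK + (L @ rng.standard_normal((len(coords), 1000))).T
print(realizations.shape, realizations.mean().round(2))  # 1000 ln K fields
```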
Ren, Fulong; Cao, Peng; Li, Wei; Zhao, Dazhe; Zaiane, Osmar
2017-01-01
Diabetic retinopathy (DR) is a progressive disease, and its detection at an early stage is crucial for saving a patient's vision. An automated screening system for DR can help reduce the chance of complete blindness due to DR while lowering the workload on ophthalmologists. Among the earliest signs of DR are microaneurysms (MAs). However, current schemes for MA detection tend to report many false positives because the detection algorithms have high sensitivity; inevitably, some non-MA structures are labeled as MAs in the initial MA identification step. This is a typical "class imbalance problem". Class-imbalanced data have detrimental effects on the performance of conventional classifiers. In this work, we propose an ensemble-based adaptive over-sampling algorithm for overcoming the class imbalance problem in false-positive reduction, and we use Boosting, Bagging and Random Subspace as the ensemble frameworks to improve microaneurysm detection. The proposed ensemble-based over-sampling methods combine the strengths of adaptive over-sampling and ensembling. The objective of this amalgamation is to reduce the induction bias introduced by imbalanced data and to enhance the generalization performance of extreme learning machines (ELM). Experimental results show that our ASOBoost method has higher area under the ROC curve (AUC) and G-mean values than many existing class-imbalance learning methods.
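A hedged sketch of the general recipe (over-sampling inside each bag of an ELM ensemble) follows; the SMOTE-like interpolation and the tiny ELM are simplified stand-ins, not the ASOBoost algorithm itself, and all sizes are invented.

```python
# Over-sampling inside a bagging ensemble of extreme learning machines.
import numpy as np

rng = np.random.default_rng(9)

def smote_like(Xmin, n_new):
    # Interpolate each synthetic point between two minority samples
    # (a simplification of SMOTE, which uses nearest neighbours).
    i = rng.integers(0, len(Xmin), n_new)
    j = rng.integers(0, len(Xmin), n_new)
    lam = rng.random((n_new, 1))
    return Xmin[i] + lam * (Xmin[j] - Xmin[i])

def elm_fit(X, y, h=50):
    # Classic ELM: random hidden layer, ridge-regularized linear readout.
    W = rng.standard_normal((X.shape[1], h))
    Hn = np.tanh(X @ W)
    beta = np.linalg.solve(Hn.T @ Hn + 1e-3 * np.eye(h), Hn.T @ y)
    return W, beta

def elm_predict(model, X):
    W, beta = model
    return np.tanh(X @ W) @ beta

# Imbalanced toy data: 500 "non-MA" vs 40 "MA" candidates.
X0 = rng.standard_normal((500, 8)); X1 = rng.standard_normal((40, 8)) + 1.5
X = np.vstack([X0, X1]); y = np.r_[np.zeros(500), np.ones(40)]

ensemble = []
for _ in range(15):                          # bagging rounds
    idx = rng.integers(0, len(X), len(X))    # bootstrap sample
    Xb, yb = X[idx], y[idx]
    Xs = smote_like(Xb[yb == 1], 300)        # re-balance each bag
    ensemble.append(elm_fit(np.vstack([Xb, Xs]), np.r_[yb, np.ones(300)]))

scores = np.mean([elm_predict(m, X) for m in ensemble], axis=0)
print("mean score MA vs non-MA:", scores[y == 1].mean().round(2),
      scores[y == 0].mean().round(2))
```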
NASA Astrophysics Data System (ADS)
Wang, S.; Huang, G. H.; Baetz, B. W.; Huang, W.
2015-11-01
This paper presents a polynomial chaos ensemble hydrologic prediction system (PCEHPS) for efficient and robust uncertainty assessment of model parameters and predictions, in which possibilistic reasoning is infused into probabilistic parameter inference with simultaneous consideration of randomness and fuzziness. The PCEHPS is developed through a two-stage factorial polynomial chaos expansion (PCE) framework, which consists of an ensemble of PCEs that approximate the behavior of the hydrologic model, significantly speeding up the exhaustive sampling of the parameter space. Multiple hypothesis testing is then conducted to construct an ensemble of reduced-dimensionality PCEs with only the most influential terms, which is meaningful for achieving uncertainty reduction and further accelerating parameter inference. The PCEHPS is applied to the Xiangxi River watershed in China to demonstrate its validity and applicability. A detailed comparison between the HYMOD hydrologic model, the ensemble of PCEs, and the ensemble of reduced PCEs is performed in terms of accuracy and efficiency. Results reveal temporal and spatial variations in parameter sensitivities due to the dynamic behavior of hydrologic systems, and the effects (magnitude and direction) of parametric interactions depending on different hydrological metrics. The case study demonstrates that the PCEHPS is capable not only of capturing both expert knowledge and probabilistic information in the calibration process, but also of running more than 10 times faster than the hydrologic model without compromising predictive accuracy.
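The basic PCE building block can be sketched as a least-squares fit of Hermite polynomials to samples of a stand-in two-parameter model; the factorial and ensemble machinery of PCEHPS is not reproduced here, and the toy simulator is invented.

```python
# Least-squares polynomial chaos expansion (PCE) with probabilists'
# Hermite polynomials up to total degree 2 in two random parameters.
import numpy as np

rng = np.random.default_rng(10)
model = lambda a, b: np.exp(0.3 * a) + 0.5 * a * b   # stand-in simulator

xi = rng.standard_normal((500, 2))                   # standard normal inputs
y = model(xi[:, 0], xi[:, 1])

He = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2 - 1.0]
basis = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]  # total degree <= 2
Phi = np.column_stack([He[i](xi[:, 0]) * He[j](xi[:, 1]) for i, j in basis])
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)

xt = rng.standard_normal((5, 2))                     # test the surrogate
Phit = np.column_stack([He[i](xt[:, 0]) * He[j](xt[:, 1]) for i, j in basis])
print(np.c_[Phit @ coef, model(xt[:, 0], xt[:, 1])].round(3))
```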
Four dimensional variational inversion of atmospheric chemical sources in WRFDA
NASA Astrophysics Data System (ADS)
Guerrette, J. J.
Atmospheric aerosols are known to affect health, weather, and climate, but their impacts on regional scales are uncertain due to heterogeneous source, transport, and transformation mechanisms. The Weather Research and Forecasting model with chemistry (WRF-Chem) can account for aerosol-meteorology feedbacks as it simultaneously integrates equations of dynamical and chemical processes. Here we develop and apply incremental four-dimensional variational (4D-Var) data assimilation (DA) capabilities in WRF-Chem to constrain chemical emissions (WRFDA-Chem). We develop adjoint (ADM) and tangent linear (TLM) model descriptions of boundary layer mixing, emission, aging, dry deposition, and advection of black carbon (BC) aerosol. ADM and TLM performance is verified against finite-difference derivative approximations. A second-order checkpointing scheme is used to reduce memory costs and enable simulations longer than six hours. We apply WRFDA-Chem to constraining anthropogenic and biomass burning sources of BC throughout California during the 2008 Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) field campaign. Manual corrections to the prior emissions and subsequent inverse modeling reduce the spread in total emitted BC mass between two biomass burning inventories from a factor of 10 to only a factor of 2 across three days of measurements. We quantify the posterior emission variance using an eigendecomposition of the cost function Hessian matrix. We also address the limited scalability of 4D-Var, which traditionally uses a sequential optimization algorithm (e.g., conjugate gradient) to approximate these Hessian eigenmodes. The Randomized Incremental Optimal Technique (RIOT) uses an ensemble of TLM and ADM instances to perform a Hessian singular value decomposition. While RIOT requires more ensemble members than Lanczos requires iterations to converge to a comparable posterior control vector, the wall time of RIOT is about 10 times shorter since the ensemble is executed in parallel. This work demonstrates that RIOT improves the scalability of 4D-Var for high-dimensional nonlinear problems. Overall, WRFDA-Chem and RIOT provide a framework for air quality forecasting, campaign planning, and emissions constraint that can be used to refine our understanding of the interplay between atmospheric chemistry, meteorology, climate, and human health.
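The randomized decomposition idea behind RIOT can be sketched with a Halko-style randomized eigendecomposition of a stand-in symmetric Hessian, where each column of the random probe block corresponds to one independently executable TLM/ADM "ensemble member"; sizes and the matrix itself are invented.

```python
# Randomized approximation of the leading Hessian eigenmodes.
import numpy as np

rng = np.random.default_rng(11)
n, k = 400, 10
G = rng.standard_normal((n, n))
A = G @ G.T / n + np.eye(n)                 # SPD, Hessian-like stand-in

Omega = rng.standard_normal((n, k + 5))     # random probes (oversampled)
Y = A @ Omega                               # matvecs, parallelizable
Q, _ = np.linalg.qr(Y)                      # approximate range of A
B = Q.T @ A @ Q                             # small projected Hessian
lam, U = np.linalg.eigh(B)
modes = Q @ U[:, ::-1][:, :k]               # leading approximate eigenmodes
print("modes shape:", modes.shape)
print("top eigenvalues (approx):", np.sort(lam)[::-1][:3].round(3))
print("top eigenvalues (exact): ",
      np.sort(np.linalg.eigvalsh(A))[::-1][:3].round(3))
```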
Key-Generation Algorithms for Linear Piece In Hand Matrix Method
NASA Astrophysics Data System (ADS)
Tadaki, Kohtaro; Tsujii, Shigeo
The linear Piece In Hand (PH, for short) matrix method with random variables was proposed in our former work. It is a general prescription which can be applied to any type of multivariate public-key cryptosystem (MPKC) for the purpose of enhancing its security. We showed, in an experimental manner, that the linear PH matrix method with random variables can certainly enhance the security of HFE against the Gröbner basis attack, where HFE is one of the major variants of multivariate public-key cryptosystems. In 1998 Patarin, Goubin, and Courtois introduced the plus method as a general prescription which aims to enhance the security of any given MPKC, just like the linear PH matrix method with random variables. In this paper we prove the equivalence between the plus method and the primitive linear PH matrix method, which was introduced in our previous work to explain the notion of the PH matrix method in general in an illustrative manner, and not for practical use in enhancing the security of a given MPKC. Based on this equivalence, we show that the linear PH matrix method with random variables has a substantial advantage over the plus method with respect to security enhancement. In the linear PH matrix method with random variables, three matrices, including the PH matrix, play a central role in the secret key and public key. In this paper, we clarify how to generate these matrices, presenting two probabilistic polynomial-time algorithms for the purpose. The second one, in particular, has a concise form and is obtained as a byproduct of the proof of the equivalence between the plus method and the primitive linear PH matrix method.
Information flow in an atmospheric model and data assimilation
NASA Astrophysics Data System (ADS)
Yoon, Young-noh
2011-12-01
Weather forecasting consists of two processes, model integration and analysis (data assimilation). During the model integration, the state estimate produced by the analysis evolves to the next cycle time according to the atmospheric model to become the background estimate. The analysis then produces a new state estimate by combining the background state estimate with new observations, and the cycle repeats. In an ensemble Kalman filter, the probability distribution of the state estimate is represented by an ensemble of sample states, and the covariance matrix is calculated using the ensemble of sample states. We perform numerical experiments on toy atmospheric models introduced by Lorenz in 2005 to study the information flow in an atmospheric model in conjunction with ensemble Kalman filtering for data assimilation. This dissertation consists of two parts. The first part of this dissertation is about the propagation of information and the use of localization in ensemble Kalman filtering. If we can perform data assimilation locally by considering the observations and the state variables only near each grid point, then we can reduce the number of ensemble members necessary to cover the probability distribution of the state estimate, reducing the computational cost for the data assimilation and the model integration. Several localized versions of the ensemble Kalman filter have been proposed. Although tests applying such schemes have proven them to be extremely promising, a full basic understanding of the rationale and limitations of localization is currently lacking. We address these issues and elucidate the role played by chaotic wave dynamics in the propagation of information and the resulting impact on forecasts. The second part of this dissertation is about ensemble regional data assimilation using joint states. Assuming that we have a global model and a regional model of higher accuracy defined in a subregion inside the global region, we propose a data assimilation scheme that produces the analyses for the global and the regional model simultaneously, considering forecast information from both models. We show that our new data assimilation scheme produces better results both in the subregion and the global region than the data assimilation scheme that produces the analyses for the global and the regional model separately.
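The role of localization can be seen in a small stand-alone experiment: a 20-member ensemble drawn from a known covariance produces spurious long-range sample correlations, which a distance-based taper (Schur product) suppresses. All sizes and the taper function below are invented for illustration and do not reproduce the dissertation's experiments.

```python
# Covariance localization: elementwise taper of a noisy sample covariance.
import numpy as np

rng = np.random.default_rng(12)
n, m, L = 100, 20, 5.0                       # grid size, ensemble, length scale

d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
d = np.minimum(d, n - d)                     # periodic grid distance
P_true = np.exp(-(d / L) ** 2)               # smooth, localized truth

X = np.linalg.cholesky(P_true + 1e-9 * np.eye(n)) @ rng.standard_normal((n, m))
P_samp = np.cov(X)                           # noisy sample covariance
taper = np.exp(-(d / (2 * L)) ** 2)          # localization function
P_loc = taper * P_samp                       # Schur (elementwise) product

far = d > 4 * L                              # where true covariance ~ 0
print("spurious |cov|, raw:      ", np.abs(P_samp[far]).mean().round(3))
print("spurious |cov|, localized:", np.abs(P_loc[far]).mean().round(3))
```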
Entanglement in a solid-state spin ensemble.
Simmons, Stephanie; Brown, Richard M; Riemann, Helge; Abrosimov, Nikolai V; Becker, Peter; Pohl, Hans-Joachim; Thewalt, Mike L W; Itoh, Kohei M; Morton, John J L
2011-02-03
Entanglement is the quintessential quantum phenomenon. It is a necessary ingredient in most emerging quantum technologies, including quantum repeaters, quantum information processing and the strongest forms of quantum cryptography. Spin ensembles, such as those used in liquid-state nuclear magnetic resonance, have been important for the development of quantum control methods. However, these demonstrations contain no entanglement and ultimately constitute classical simulations of quantum algorithms. Here we report the on-demand generation of entanglement between an ensemble of electron and nuclear spins in isotopically engineered, phosphorus-doped silicon. We combined high-field (3.4 T), low-temperature (2.9 K) electron spin resonance with hyperpolarization of the 31P nuclear spin to obtain an initial state of sufficient purity to create a non-classical, inseparable state. The state was verified using density matrix tomography based on geometric phase gates, and had a fidelity of 98% relative to the ideal state at this field and temperature. The entanglement operation was performed simultaneously, with high fidelity, on 10^10 spin pairs; this fulfils one of the essential requirements for a silicon-based quantum information processor.
Unifying model for random matrix theory in arbitrary space dimensions
NASA Astrophysics Data System (ADS)
Cicuta, Giovanni M.; Krausser, Johannes; Milkus, Rico; Zaccone, Alessio
2018-03-01
A sparse random block matrix model suggested by the Hessian matrix used in the study of elastic vibrational modes of amorphous solids is presented and analyzed. By evaluating some moments, benchmarked against numerics, differences in the eigenvalue spectrum of this model in different limits of space dimension d, and for arbitrary values of the lattice coordination number Z, are shown and discussed. As a function of these two parameters (and their ratio Z/d), the most studied models in random matrix theory (Erdős-Rényi graphs, effective medium, and replicas) can be reproduced in the various limits of block dimensionality d. Remarkably, the Marchenko-Pastur spectral density (which is recovered by replica calculations for the Laplacian matrix) is reproduced exactly in the limit of infinite size of the blocks, or d → ∞, which clarifies the physical meaning of space dimension in these models. We feel that the approximate results for d = 3 provided by our method may have many potential applications in the future, from the vibrational spectrum of glasses and elastic networks to wave localization, disordered conductors, random resistor networks, and random walks.
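For concreteness, the Marchenko-Pastur density mentioned above can be checked empirically on a Wishart matrix, a standard random-matrix benchmark (not the block model of the paper):

```python
# Empirical spectrum of a Wishart matrix W = X X^T / m vs the
# Marchenko-Pastur density with aspect ratio q = n/m.
import numpy as np

rng = np.random.default_rng(13)
n, m = 500, 2000
q = n / m
X = rng.standard_normal((n, m))
lam = np.linalg.eigvalsh(X @ X.T / m)

lo, hi = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2
grid = np.linspace(lo + 1e-6, hi - 1e-6, 5)
mp = np.sqrt((hi - grid) * (grid - lo)) / (2 * np.pi * q * grid)  # MP density

hist, edges = np.histogram(lam, bins=40, range=(lo, hi), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
emp = np.interp(grid, centers, hist)
print(np.c_[grid, emp, mp].round(3))          # empirical vs MP density
```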
E. Freeman; G. Moisen; J. Coulston; B. Wilson
2014-01-01
Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...
Comparative study of feature selection with ensemble learning using SOM variants
NASA Astrophysics Data System (ADS)
Filali, Ameni; Jlassi, Chiraz; Arous, Najet
2017-03-01
Ensemble learning has improved stability and clustering accuracy, but its runtime prohibits scaling up to real-world applications. This study addresses the problem of selecting a subset of the most pertinent features for every cluster of a dataset. The proposed method is another extension of the Random Forests approach, using self-organizing map (SOM) variants on unlabeled data, that estimates out-of-bag feature importance from a set of partitions. Every partition is created using a different bootstrap sample and a random subset of the features. We then show that the internal estimates used to measure variable importance in Random Forests are also applicable to feature selection in unsupervised learning. The approach aims at dimensionality reduction, visualization and cluster characterization at the same time. We provide empirical results on nineteen benchmark data sets indicating that RFS can lead to significant improvements in clustering accuracy over several state-of-the-art unsupervised methods, with a very limited subset of features. The approach shows promise for very broad domains.
Generalization of one-dimensional solute transport: A stochastic-convective flow conceptualization
NASA Astrophysics Data System (ADS)
Simmons, C. S.
1986-04-01
A stochastic-convective representation of one-dimensional solute transport is derived. It is shown to conceptually encompass solutions of the conventional convection-dispersion equation. This stochastic approach, however, does not rely on the assumption that the dispersive flux satisfies Fick's diffusion law. Observable values of solute concentration and flux, which together satisfy a conservation equation, are expressed as expectations over a flow velocity ensemble representing the inherent random processes that govern dispersion. Solute concentration is determined by a Lagrangian pdf for random spatial displacements, while flux is determined by an equivalent Eulerian pdf for random travel times. A condition for such equivalence is derived for steady nonuniform flow, and it is proven that both Lagrangian and Eulerian pdfs are required to account for specified initial and boundary conditions on a global scale. Furthermore, simplified modeling of transport is justified by proving that an ensemble of effectively constant velocities always exists that constitutes an equivalent representation. An example of how a two-dimensional transport problem can be reduced to a one-dimensional stochastic viewpoint is also presented to further clarify the concepts.
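A minimal numerical rendering of this viewpoint follows, with an invented lognormal velocity ensemble: the observable concentration is the ensemble expectation of a purely convected initial pulse, with no Fickian dispersion term anywhere in the calculation.

```python
# Stochastic-convective transport: c(x, t) = <c0(x - V t)> over a
# velocity ensemble V; spreading emerges from velocity variability alone.
import numpy as np

rng = np.random.default_rng(14)
x = np.linspace(0.0, 10.0, 200)
c0 = lambda s: np.exp(-((s - 1.0) / 0.2) ** 2)     # initial solute pulse

V = rng.lognormal(mean=0.0, sigma=0.4, size=5000)  # velocity ensemble
t = 3.0
c = np.mean([c0(x - v * t) for v in V], axis=0)    # ensemble expectation
print("plume peak:", x[np.argmax(c)].round(2), "peak value:", c.max().round(3))
```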
Random center vortex lines in continuous 3D space-time
DOE Office of Scientific and Technical Information (OSTI.GOV)
Höllwieser, Roman (Institute of Atomic and Subatomic Physics, Vienna University of Technology, Operngasse 9, 1040 Vienna); Altarawneh, Derar
2016-01-22
We present a model of center vortices, represented by closed random lines in continuous 2+1-dimensional space-time. These random lines are modeled as being piece-wise linear, and an ensemble is generated by Monte Carlo methods. The physical space in which the vortex lines are defined is a cuboid with periodic boundary conditions. Besides moving, growing and shrinking of the vortex configuration, reconnections are also allowed. Our ensemble therefore contains not a fixed but a variable number of closed vortex lines. This is expected to be important for realizing the deconfining phase transition. Using the model, we study both vortex percolation and the potential V(R) between quark and anti-quark as a function of distance R at different vortex densities, vortex segment lengths, reconnection conditions and temperatures. We have found three deconfinement phase transitions: as a function of density, as a function of vortex segment length, and as a function of temperature. The model reproduces the qualitative features of confinement physics seen in SU(2) Yang-Mills theory.
Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif
2017-01-01
Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data on 32,555 patients who were free of any known coronary artery disease or heart failure, who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009, and who had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients had developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, multiple linear regression, and information gain ranking. The negative effect of class imbalance on the constructed model was handled by the Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model was improved by the ensemble machine learning approach using the Vote method with three decision trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree), achieving high prediction accuracy (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
Poissonian steady states: from stationary densities to stationary intensities.
Eliazar, Iddo
2012-10-01
Markov dynamics are the most elemental and omnipresent form of stochastic dynamics in the sciences, with applications ranging from physics to chemistry, from biology to evolution, and from economics to finance. Markov dynamics can be either stationary or nonstationary. Stationary Markov dynamics represent statistical steady states and are quantified by stationary densities. In this paper, we generalize the notion of steady state to the case of general Markov dynamics. Considering an ensemble of independent motions governed by common Markov dynamics, we establish that the entire ensemble attains Poissonian steady states which are quantified by stationary Poissonian intensities and which hold valid also in the case of nonstationary Markov dynamics. The methodology is applied to a host of Markov dynamics, including Brownian motion, birth-death processes, random walks, geometric random walks, renewal processes, growth-collapse dynamics, decay-surge dynamics, Ito diffusions, and Langevin dynamics.
NASA Astrophysics Data System (ADS)
Vyas, Manan; Kota, V. K. B.
2012-12-01
Following earlier studies on embedded unitary ensembles generated by random two-body interactions [EGUE(2)] with spin SU(2) and spin-isospin SU(4) symmetries, we develop a general formulation for deriving the lower-order moments of the one- and two-point correlation functions in eigenvalues, valid for any EGUE(2) and BEGUE(2) ("B" stands for bosons) with U(Ω)⊗SU(r) embedding and with two-body interactions preserving SU(r) symmetry. Using this formulation with r = 1, we recover the results derived by Asaga et al. [Ann. Phys. (N.Y.) 297, 344 (2002)], 10.1006/aphy.2002.6248 for spinless boson systems. Going further, new results are obtained for r = 2 (corresponding to two-species boson systems) and r = 3 (corresponding to spin-1 boson systems).
NASA Astrophysics Data System (ADS)
Baran, Anthony J.; Ishimoto, Hiroshi; Sourdeval, Odran; Hesse, Evelyn; Harlow, Chawn
2018-02-01
The bulk single-scattering properties of various randomly oriented aggregate ice crystal models are compared and contrasted at a number of frequencies between 89 and 874 GHz. The model ice particles consist of the ten-branched plate aggregate, five-branched plate aggregate, eight-branched hexagonal aggregate, Voronoi ice aggregate, six-branched hollow bullet rosette, hexagonal column of aspect ratio unity, and the ten-branched hexagonal aggregate. The bulk single-scattering properties of the latter two ice particle models have been calculated using the light scattering methods described in Part I, which represent the two most extreme members of an ensemble model of cirrus ice crystals. In Part I, it was shown that the method of physical optics could be combined with the T-matrix at a size parameter of about 18 to compute the bulk integral ice optical properties and the phase function in the microwave to sufficient accuracy to be of practical value. Here, the bulk single-scattering properties predicted by the two ensemble model members and the Voronoi model are shown to generally bound those of all other models at frequencies between 89 and 874 GHz, thus representing a three-component model of ice cloud that can be generally applied to the microwave, rather than using many differing ice particle models. Moreover, the Voronoi model and hollow bullet rosette scatter similarly to each other in the microwave. Furthermore, from the various comparisons, the importance of assumed shapes of the particle size distribution as well as cm-sized ice aggregates is demonstrated.
Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo
2016-01-01
Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with a random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) − k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction and we expect that the proposed GA-EoC would perform consistently in other cases. PMID:26764911
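A toy version of the GA-EoC search is easy to sketch. The following is a minimal illustration, not the published code: bit-string chromosomes select members from a small scikit-learn classifier pool, and fitness is the cross-validated accuracy of their hard majority vote (5-fold here for speed, rather than the paper's 10-fold).

```python
# Hedged sketch: genetic search over classifier subsets with majority voting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
pool = [("nb", GaussianNB()), ("knn", KNeighborsClassifier()),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000))]

def fitness(mask):
    chosen = [pool[i] for i in np.flatnonzero(mask)]
    if len(chosen) < 2:                       # need at least two voters
        return 0.0
    vote = VotingClassifier(chosen, voting="hard")
    return cross_val_score(vote, X, y, cv=5).mean()

# Tiny generational GA: tournament selection, uniform crossover, bit flips.
popsize, ngen = 10, 5
popn = rng.integers(0, 2, size=(popsize, len(pool)))
for _ in range(ngen):
    scores = np.array([fitness(m) for m in popn])
    winners = [max(rng.choice(popsize, 2), key=lambda i: scores[i])
               for _ in range(popsize)]
    parents = popn[winners]
    cross = rng.integers(0, 2, size=parents.shape).astype(bool)
    popn = np.where(cross, parents, np.roll(parents, 1, axis=0))
    popn ^= (rng.random(popn.shape) < 0.1).astype(popn.dtype)  # mutation

best = popn[np.argmax([fitness(m) for m in popn])]
print("selected:", [pool[i][0] for i in np.flatnonzero(best)])
```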
Understanding the Structural Ensembles of a Highly Extended Disordered Protein
Daughdrill, Gary W.; Kashtanov, Stepan; Stancik, Amber; Hill, Shannon E.; Helms, Gregory; Muschol, Martin
2013-01-01
Developing a comprehensive description of the equilibrium structural ensembles for intrinsically disordered proteins (IDPs) is essential to understanding their function. The p53 transactivation domain (p53TAD) is an IDP that interacts with multiple protein partners and contains numerous phosphorylation sites. Multiple techniques were used to investigate the equilibrium structural ensemble of p53TAD in its native and chemically unfolded states. The results from these experiments show that the native state of p53TAD has dimensions similar to a classical random coil while the chemically unfolded state is more extended. To investigate the molecular properties responsible for this behavior, a novel algorithm that generates diverse and unbiased structural ensembles of IDPs was developed. This algorithm was used to generate a large pool of plausible p53TAD structures that were reweighted to identify a subset of structures with the best fit to small angle X-ray scattering data. High-weight structures in the native state ensemble show features that are localized to protein binding sites and regions with high proline content. The features localized to the protein binding sites are mostly eliminated in the chemically unfolded ensemble, while the regions with high proline content remain relatively unaffected. Data from NMR experiments support these results, showing that residues from the protein binding sites experience larger environmental changes upon unfolding by urea than regions with high proline content. This behavior is consistent with the urea-induced exposure of nonpolar and aromatic side-chains in the protein binding sites that are partially excluded from solvent in the native state ensemble. PMID:21979461
Generalized Ensemble Sampling of Enzyme Reaction Free Energy Pathways
Wu, Dongsheng; Fajer, Mikolai I.; Cao, Liaoran; Cheng, Xiaolin; Yang, Wei
2016-01-01
Free energy path sampling plays an essential role in computational understanding of chemical reactions, particularly those occurring in enzymatic environments. Among a variety of molecular dynamics simulation approaches, the generalized ensemble sampling strategy is uniquely attractive for the fact that it not only can enhance the sampling of rare chemical events but also can naturally ensure consistent exploration of environmental degrees of freedom. In this review, we plan to provide a tutorial-like tour on an emerging topic: generalized ensemble sampling of enzyme reaction free energy path. The discussion is largely focused on our own studies, particularly ones based on the metadynamics free energy sampling method and the on-the-path random walk path sampling method. We hope that this mini presentation will provide interested practitioners some meaningful guidance for future algorithm formulation and application study. PMID:27498634
Evaluation of the Plant-Craig stochastic convection scheme in an ensemble forecasting system
NASA Astrophysics Data System (ADS)
Keane, R. J.; Plant, R. S.; Tennant, W. J.
2015-12-01
The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme, whose only stochastic element comes from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.
A generalization of random matrix theory and its application to statistical physics.
Wang, Duan; Zhang, Xin; Horvatic, Davor; Podobnik, Boris; Eugene Stanley, H
2017-02-01
To study the statistical structure of cross-correlations in empirical data, we generalize random matrix theory and propose a new method of cross-correlation analysis, known as autoregressive random matrix theory (ARRMT). ARRMT takes into account the influence of auto-correlations in the study of cross-correlations in multiple time series. We first analytically and numerically determine how auto-correlations affect the eigenvalue distribution of the correlation matrix. Then we introduce ARRMT with a detailed procedure for how to implement the method. Finally, we illustrate the method using two examples, one based on inflation rates and the other on air pressure data for 95 US cities.
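The effect that motivates ARRMT, auto-correlations distorting the eigenvalue spectrum of the empirical correlation matrix, can be reproduced in a few lines. The sketch below assumes AR(1) series (an assumption of this illustration) and compares the observed eigenvalue range with the Marchenko-Pastur interval that would bound it for i.i.d. data; the ARRMT procedure itself is not reproduced here.

```python
# Autocorrelation widens the correlation-matrix spectrum beyond the
# Marchenko-Pastur bounds that hold for i.i.d. series.
import numpy as np

rng = np.random.default_rng(0)
N, T, phi = 95, 1000, 0.6          # 95 series, echoing the air-pressure example

x = np.zeros((N, T))
eps = rng.standard_normal((N, T))
for t in range(1, T):              # independent AR(1) processes
    x[:, t] = phi * x[:, t - 1] + eps[:, t]

C = np.corrcoef(x)                 # empirical cross-correlation matrix
eig = np.linalg.eigvalsh(C)

q = N / T                          # Marchenko-Pastur edges for i.i.d. data
lo, hi = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2
print(f"eigenvalues in [{eig.min():.2f}, {eig.max():.2f}], "
      f"MP interval [{lo:.2f}, {hi:.2f}]")
```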
Reduced Kalman Filters for Clock Ensembles
NASA Technical Reports Server (NTRS)
Greenhall, Charles A.
2011-01-01
This paper summarizes the author's work on timescales based on Kalman filters that act upon the clock comparisons. The natural Kalman timescale algorithm tends to optimize long-term timescale stability at the expense of short-term stability. By subjecting each post-measurement error covariance matrix to a non-transparent reduction operation, one obtains corrected clocks with improved short-term stability and little sacrifice of long-term stability.
NASA Astrophysics Data System (ADS)
Roberge, S.; Chokmani, K.; De Sève, D.
2012-04-01
The snow cover plays an important role in the hydrological cycle of Quebec (Eastern Canada). Consequently, evaluating its spatial extent interests the authorities responsible for the management of water resources, especially hydropower companies. The main objective of this study is the development of a snow-cover mapping strategy using remote sensing data and ensemble-based system techniques. Planned to be tested in a near real-time operational mode, this snow-cover mapping strategy has the advantage of providing the probability that a pixel is snow covered, together with its uncertainty. Ensemble systems are made of two key components. First, a method is needed to build an ensemble of classifiers that is as diverse as possible. Second, an approach is required to combine the outputs of the individual classifiers that make up the ensemble in such a way that correct decisions are amplified and incorrect ones are cancelled out. In this study, we demonstrate the potential of ensemble systems for snow-cover mapping using remote sensing data. The chosen classifier is a sequential-thresholds algorithm using NOAA-AVHRR data adapted to conditions over Eastern Canada. Its special feature is the use of a combination of six sequential thresholds varying according to the day in the winter season. Two versions of the snow-cover mapping algorithm have been developed: one specific to autumn (from October 1st to December 31st) and the other to spring (from March 16th to May 31st). In order to build the ensemble-based system, different versions of the algorithm are created by randomly varying its parameters; one hundred versions are included in the ensemble. The probability of a pixel being snow, no-snow or cloud covered corresponds to the fraction of classifiers that assigned the pixel to that class. The overall performance of ensemble-based mapping is compared to the overall performance of the chosen classifier, and also with ground observations at meteorological stations.
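The vote-counting logic of such an ensemble is compact. Below is a schematic sketch under simplifying assumptions: a single synthetic reflectance band and one randomly perturbed threshold stand in for the six sequential NOAA-AVHRR thresholds, and the per-pixel snow probability is the fraction of classifier versions voting "snow".

```python
# Ensemble of randomly perturbed threshold classifiers with vote counting.
import numpy as np

rng = np.random.default_rng(0)
reflectance = rng.uniform(0.0, 1.0, size=(50, 50))   # fake image band

# 100 classifier versions: the nominal threshold is perturbed randomly.
thresholds = 0.5 + 0.05 * rng.standard_normal(100)
votes = np.stack([(reflectance > t) for t in thresholds])

# Per-pixel snow probability = fraction of classifiers voting "snow".
p_snow = votes.mean(axis=0)
uncertainty = p_snow * (1 - p_snow)   # highest where the vote is split
print(p_snow[:2, :2], uncertainty.max())
```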
NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges, the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available. PMID:24667482
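The subsampling construction can be illustrated independently of the specific regression method. The sketch below is an assumption-laden toy (univariate F-scores in place of GENIE3-style tree ensembles, SVR or the elastic net): each round ranks features on a random half of the samples, and ranks are averaged across rounds.

```python
# Turning any feature-ranking method into an ensemble importance score
# by repeated subsampling and rank averaging.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import f_regression

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=300, n_features=50, n_informative=5,
                       random_state=0)

n_rounds, rank_sum = 200, np.zeros(X.shape[1])
for _ in range(n_rounds):
    rows = rng.choice(len(X), size=len(X) // 2, replace=False)  # subsample
    f, _ = f_regression(X[rows], y[rows])
    rank_sum += np.argsort(np.argsort(-f))     # 0 = best rank this round

importance = 1.0 / (1.0 + rank_sum / n_rounds)  # rankwise-averaged score
print("top features:", np.argsort(-importance)[:5])
```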
Ali, Safdar; Majid, Abdul; Khan, Asifullah
2014-04-01
Development of an accurate and reliable intelligent decision-making method for the construction of a cancer diagnosis system is one of the fast growing research areas of health sciences. Such a decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose the exploitation of the physicochemical properties of amino acids in protein primary sequences such as hydrophobicity (Hd) and hydrophilicity (Hb) for breast cancer classification. Hd and Hb properties of amino acids, in recent literature, are reported to be quite effective in characterizing the constituent amino acids and are used to study protein foldings, interactions, structures, and sequence-order effects. Especially, using these physicochemical properties, we observed that proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The different feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using Hd and Hb properties of amino acids to develop an accurate method for classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN) trained on different feature spaces. We observed that ensemble-RF, in the case of cancer classification, performed better than ensemble-SVM and ensemble-KNN. Our analysis demonstrates that ensemble-RF, ensemble-SVM and ensemble-KNN are more effective than their individual counterparts. The proposed 'IDM-PhyChm-Ens' method has shown improved performance compared to existing techniques.
The feasibility and stability of large complex biological networks: a random matrix approach.
Stone, Lewi
2018-05-29
In the 1970s, Robert May demonstrated that complexity creates instability in generic models of ecological networks having random interaction matrices A. Similar random matrix models have since been applied in many disciplines. Central to assessing stability is the "circular law", since it describes the eigenvalue distribution for an important class of random matrices A. However, despite widespread adoption, the "circular law" does not apply for ecological systems in which density-dependence operates (i.e., where a species' growth is determined by its density). Instead one needs to study the far more complicated eigenvalue distribution of the community matrix S = DA, where D is a diagonal matrix of population equilibrium values. Here we obtain this eigenvalue distribution. We show that if the random matrix A is locally stable, the community matrix S = DA will also be locally stable, provided the system is feasible (i.e., all species have positive equilibria D > 0). This helps explain why, unusually, nearly all feasible systems studied here are locally stable. Large complex systems may thus be even more fragile than May predicted, given the difficulty of assembling a feasible system. It was also found that the degree of stability, or resilience of a system, depended on the minimum equilibrium population.
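The paper's central claim is easy to probe numerically. A minimal sketch follows, assuming a Gaussian May-type interaction matrix and uniformly distributed positive equilibria (both assumptions of this illustration, not the paper's setup): check whether the spectral abscissa of S = DA stays negative whenever that of A is.

```python
# Quick numerical probe: stability of A versus stability of S = DA.
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 200, 0.03                   # complexity sigma*sqrt(n) well below 1
A = -np.eye(n) + sigma * rng.standard_normal((n, n))   # random interactions
D = np.diag(rng.uniform(0.1, 2.0, n))  # feasible system: positive equilibria

spec = lambda M: np.linalg.eigvals(M).real.max()       # spectral abscissa
print("A stable:", spec(A) < 0, " S = DA stable:", spec(D @ A) < 0)
```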
Ensemble of trees approaches to risk adjustment for evaluating a hospital's performance.
Liu, Yang; Traskin, Mikhail; Lorch, Scott A; George, Edward I; Small, Dylan
2015-03-01
A commonly used method for evaluating a hospital's performance on an outcome is to compare the hospital's observed outcome rate to the hospital's expected outcome rate given its patient (case) mix and service. The process of calculating the hospital's expected outcome rate given its patient mix and service is called risk adjustment (Iezzoni 1997). Risk adjustment is critical for accurately evaluating and comparing hospitals' performances, since we would not want to unfairly penalize a hospital just because it treats sicker patients. The key to risk adjustment is accurately estimating the probability of an outcome given patient characteristics. For cases with binary outcomes, the method that is commonly used in risk adjustment is logistic regression. In this paper, we consider ensemble-of-trees methods as alternatives for risk adjustment, including random forests and Bayesian additive regression trees (BART). Both random forests and BART are modern machine learning methods that have been shown recently to have excellent performance for prediction of outcomes in many settings. We apply these methods to carry out risk adjustment for the performance of neonatal intensive care units (NICUs). We show that these ensemble-of-trees methods outperform logistic regression in predicting mortality among babies treated in NICUs, and provide a superior method of risk adjustment compared to logistic regression.
Polarization-correlation analysis of maps of optical anisotropy of biological layers
NASA Astrophysics Data System (ADS)
Ushenko, Yu. A.; Dubolazov, A. V.; Prysyazhnyuk, V. S.; Marchuk, Y. F.; Pashkovskaya, N. V.; Motrich, A. V.; Novakovskaya, O. Y.
2014-08-01
A new information-optical technique for diagnosing the structure of polycrystalline films of bile is proposed. A model of the Mueller-matrix description of the mechanisms of optical anisotropy of such objects, namely optical activity, birefringence, and linear and circular dichroism, is suggested. The ensemble of informationally relevant, azimuthally stable Mueller-matrix invariants is determined. Within the statistical analysis of the distributions of such parameters, objective criteria for differentiating films of bile taken from healthy donors and from patients with type 2 diabetes were determined. From the point of view of evidence-based medicine, the operational characteristics (sensitivity, specificity and accuracy) of the information-optical method of Mueller-matrix mapping of polycrystalline films of bile were found, and its efficiency in diagnosing the extent of type 2 diabetes was demonstrated. Prospects for applying this method to the diagnosis of cirrhosis are considered.
Robust electroencephalogram phase estimation with applications in brain-computer interface systems.
Seraj, Esmaeil; Sameni, Reza
2017-03-01
In this study, a robust method is developed for frequency-specific electroencephalogram (EEG) phase extraction using the analytic representation of the EEG. Based on recent theoretical findings in this area, it is shown that some of the phase variations, previously associated with the brain response, are systematic side-effects of the methods used for EEG phase calculation, especially during low analytical amplitude segments of the EEG. With this insight, the proposed method generates randomized ensembles of the EEG phase using minor perturbations in the zero-pole loci of narrow-band filters, followed by phase estimation using the signal's analytical form and ensemble averaging over the randomized ensembles to obtain a robust EEG phase and frequency. This Monte Carlo estimation method is shown to be very robust to noise and minor changes of the filter parameters, and reduces the effect of spurious EEG phase jumps, which do not have a cerebral origin. As proof of concept, the proposed method is used for extracting EEG phase features for a brain-computer interface (BCI) application. The results show significant improvement in classification rates using rather simple phase-related features and standard K-nearest neighbors and random forest classifiers, over a standard BCI dataset. The average performance was improved by 4-7% (in absence of additive noise) and 8-12% (in presence of additive noise). The significance of these improvements was statistically confirmed by a paired-sample t-test, with p-values of 0.01 and 0.03, respectively. The proposed method for EEG phase calculation is very generic and may be applied to other EEG phase-based studies.
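A minimal sketch of the Monte Carlo phase estimator follows, under stated assumptions: a 10 Hz test oscillation, and Butterworth band edges jittered randomly as a simple stand-in for the paper's perturbation of zero-pole loci; the ensemble phase is the circular mean across the randomized filters.

```python
# Randomized narrow-band filter ensemble for robust instantaneous phase.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

rng = np.random.default_rng(0)
fs = 250.0
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

phases = []
for _ in range(100):                       # randomized filter ensemble
    lo = 8 + 0.3 * rng.standard_normal()   # jittered band edges (Hz)
    hi = 12 + 0.3 * rng.standard_normal()
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    phases.append(np.angle(hilbert(filtfilt(b, a, eeg))))

# Circular (ensemble) average of the instantaneous phase.
phase = np.angle(np.mean(np.exp(1j * np.array(phases)), axis=0))
print(phase[:5])
```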
Tanner, Bertrand C.W.; McNabb, Mark; Palmer, Bradley M.; Toth, Michael J.; Miller, Mark S.
2014-01-01
Diminished skeletal muscle performance with aging, disuse, and disease may be partially attributed to the loss of myofilament proteins. Several laboratories have found a disproportionate loss of myosin protein content relative to other myofilament proteins, but due to methodological limitations, the structural manifestation of this protein loss is unknown. To investigate how variations in myosin content affect ensemble cross-bridge behavior and force production, we simulated muscle contraction in the half-sarcomere as myosin was removed either i) uniformly, from the Z-line end of thick filaments, or ii) randomly, along the length of thick filaments. Uniform myosin removal decreased force production, showing a slightly steeper force-to-myosin content relationship than the 1:1 relationship that would be expected from the loss of cross-bridges. Random myosin removal also decreased force production, but this decrease was less than observed with uniform myosin loss, largely due to increased myosin attachment time (ton) and fractional cross-bridge binding with random myosin loss. These findings support our prior observations that prolonged ton may augment force production in single fibers with randomly reduced myosin content from chronic heart failure patients. These simulations also illustrate that the pattern of myosin loss along thick filaments influences ensemble cross-bridge behavior and maintenance of force throughout the sarcomere. PMID:24486373
Using Support Vector Machine Ensembles for Target Audience Classification on Twitter
Lo, Siaw Ling; Chiong, Raymond; Cornforth, David
2015-01-01
The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space. PMID:25874768
Breaking of Ensemble Equivalence in Networks
NASA Astrophysics Data System (ADS)
Squartini, Tiziano; de Mol, Joey; den Hollander, Frank; Garlaschelli, Diego
2015-12-01
It is generally believed that, in the thermodynamic limit, the microcanonical description as a function of energy coincides with the canonical description as a function of temperature. However, various examples of systems for which the microcanonical and canonical ensembles are not equivalent have been identified. A complete theory of this intriguing phenomenon is still missing. Here we show that ensemble nonequivalence can manifest itself also in random graphs with topological constraints. We find that, while graphs with a given number of links are ensemble equivalent, graphs with a given degree sequence are not. This result holds irrespective of whether the energy is nonadditive (as in unipartite graphs) or additive (as in bipartite graphs). In contrast with previous expectations, our results show that (1) physically, nonequivalence can be induced by an extensive number of local constraints, and not necessarily by long-range interactions or nonadditivity, (2) mathematically, nonequivalence is determined by a different large-deviation behavior of microcanonical and canonical probabilities for a single microstate, and not necessarily for almost all microstates. The latter criterion, which is entirely local, is not restricted to networks and holds in general.
Yin, Yizhou; Kundu, Kunal; Pal, Lipika R; Moult, John
2017-09-01
CAGI (Critical Assessment of Genome Interpretation) conducts community experiments to determine the state of the art in relating genotype to phenotype. Here, we report results obtained using newly developed ensemble methods to address two CAGI4 challenges: enzyme activity for population missense variants found in NAGLU (Human N-acetyl-glucosaminidase) and random missense mutations in Human UBE2I (Human SUMO E2 ligase), assayed in a high-throughput competitive yeast complementation procedure. The ensemble methods are effective, ranking second for SUMO-ligase and third for NAGLU according to the CAGI independent assessors. However, in common with other methods used in CAGI, there are large discrepancies between predicted and experimental activities for a subset of variants; analysis of the structural context provides some insight into these. Post-challenge analysis shows that the ensemble methods are also effective at assigning pathogenicity for the NAGLU variants. In the clinic, providing an estimate of the reliability of pathogenicity assignments is key. We have also used the NAGLU dataset to show that ensemble methods have considerable potential for this task, and are already reliable enough for use with a subset of mutations.
NASA Astrophysics Data System (ADS)
Post, Evert Jan
1999-05-01
This essay presents conclusive evidence of the impermissibility of Copenhagen's single-system interpretation of the Schroedinger process. The latter needs to be viewed as a tool exclusively describing phase- and orientation-randomized ensembles and is not to be used for isolated single systems. Asymptotic closeness of single-system and ensemble behavior, and the rare nature of true single-system manifestations, have prevented a definitive identification of this Copenhagen deficiency over the past three-quarters of a century. Quantum uncertainty thus becomes a basic trademark of phase- and orientation-disordered ensembles. The ensuing void of usable single-system tools opens a new inquiry for tools without statistical connotations. Three period integrals, in part already known, here identified as flux, charge and action counters, emerge as diffeo-4 invariant tools fully compatible with the demands of the general theory of relativity. The discovery of the quantum Hall effect has been instrumental in forcing a distinction between ensemble disorder, as in the normal Hall effect, and ensemble order in the plateau states. Since the order of the latter permits a view of the plateau states as a macro- or mesoscopic single system, the period-integral description applies, yielding a straightforward unified description of integer and fractional quantum Hall effects.
Robust Face Recognition via Multi-Scale Patch-Based Matrix Regression.
Gao, Guangwei; Yang, Jian; Jing, Xiaoyuan; Huang, Pu; Hua, Juliang; Yue, Dong
2016-01-01
In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the limited training sample size is the most fundamental problem. By making use of the low-rank structural information of the reconstructed error image, the so-called nuclear norm-based matrix regression has been demonstrated to be effective for robust face recognition with continuous occlusions. However, the recognition performance of nuclear norm-based matrix regression degrades greatly in the face of the small sample size problem. An alternative solution to tackle this problem is performing matrix regression on each patch and then integrating the outputs from all patches. However, it is difficult to set an optimal patch size across different databases. To fully utilize the complementary information from different patch scales for the final decision, we propose a multi-scale patch-based matrix regression scheme based on which the ensemble of multi-scale outputs can be achieved optimally. Extensive experiments on benchmark face databases validate the effectiveness and robustness of our method, which outperforms several state-of-the-art patch-based face recognition algorithms.
Olesen, Alexander Neergaard; Christensen, Julie A E; Sorensen, Helge B D; Jennum, Poul J
2016-08-01
Reducing the number of recording modalities for sleep staging research can benefit both researchers and patients, under the condition that they provide results as accurate as conventional systems. This paper investigates the possibility of exploiting the multisource nature of electrooculography (EOG) signals by presenting a method for automatic sleep staging using the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) algorithm and a random forest classifier. It achieves a high overall accuracy of 82% and a Cohen's kappa of 0.74, indicating substantial agreement between automatic and manual scoring.
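A rough sketch of such a pipeline is given below, assuming the PyEMD package's CEEMDAN implementation and simple per-IMF energy and variance features; the feature set and validation of the actual study are richer, and all names here are illustrative.

```python
# CEEMDAN decomposition of EOG epochs, simple IMF features, random forest.
import numpy as np
from PyEMD import CEEMDAN                      # pip install EMD-signal
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
decomposer = CEEMDAN(trials=20)                # noise-assisted decomposition

def eog_features(epoch, n_keep=6):
    """Energy and variance of the first few IMFs, zero-padded to fixed length."""
    imfs = decomposer(epoch)[:n_keep]
    feats = np.concatenate([(imfs ** 2).mean(axis=1), imfs.var(axis=1)])
    out = np.zeros(2 * n_keep)
    out[:feats.size] = feats
    return out

# Fake EOG epochs; in practice these are 30 s segments with manual stage labels.
epochs = rng.standard_normal((20, 1000))
labels = rng.integers(0, 5, size=20)           # five sleep stages

X = np.stack([eog_features(e) for e in epochs])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.predict(X[:3]))
```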
Constrained Perturbation Regularization Approach for Signal Estimation Using Random Matrix Theory
NASA Astrophysics Data System (ADS)
Suliman, Mohamed; Ballal, Tarig; Kammoun, Abla; Al-Naffouri, Tareq Y.
2016-12-01
In this supplementary appendix we provide proofs and additional extensive simulations that complement the analysis of the main paper (constrained perturbation regularization approach for signal estimation using random matrix theory).
Mesoscale model response to random, surface-based perturbations — A sea-breeze experiment
NASA Astrophysics Data System (ADS)
Garratt, J. R.; Pielke, R. A.; Miller, W. F.; Lee, T. J.
1990-09-01
The introduction into a mesoscale model of random (in space) variations in roughness length, or random (in space and time) surface perturbations of temperature and friction velocity, produces a measurable, but barely significant, response in the simulated flow dynamics of the lower atmosphere. The perturbations are an attempt to include the effects of sub-grid variability into the ensemble-mean parameterization schemes used in many numerical models. Their magnitude is set in our experiments by appeal to real-world observations of the spatial variations in roughness length and daytime surface temperature over the land on horizontal scales of one to several tens of kilometers. With sea-breeze simulations, comparisons of a number of realizations forced by roughness-length and surface-temperature perturbations with the standard simulation reveal no significant change in ensemble mean statistics, and only small changes in the sea-breeze vertical velocity. Changes in the updraft velocity for individual runs, of up to several cm s-1 (compared to a mean of 14 cm s-1), are directly the result of prefrontal temperature changes of 0.1 to 0.2 K, produced by the random surface forcing. The correlation and magnitude of the changes are entirely consistent with a gravity-current interpretation of the sea breeze.
An Intelligent Ensemble Neural Network Model for Wind Speed Prediction in Renewable Energy Systems.
Ranganayaki, V; Deepa, S N
2016-01-01
Various criteria are proposed for selecting the number of hidden neurons in artificial neural network (ANN) models, and based on the evolved criteria an intelligent ensemble neural network model is proposed to predict wind speed in renewable energy applications. The intelligent ensemble neural model for wind speed forecasting is designed by averaging the forecasted values from multiple neural network models, which include the multilayer perceptron (MLP), multilayer adaptive linear neuron (Madaline), back propagation neural network (BPN), and probabilistic neural network (PNN), so as to obtain better accuracy in wind speed prediction with minimum error. Random selection of the number of hidden neurons in an artificial neural network results in overfitting or underfitting problems; this paper aims to avoid both. The selection of the number of hidden neurons is done employing 102 criteria, which are verified against various computed error values. The proposed criteria for fixing hidden neurons are validated employing the convergence theorem. The proposed intelligent ensemble neural model is applied to wind speed prediction using real-time wind data collected from nearby locations. The obtained simulation results substantiate that the proposed ensemble model reduces the error value to a minimum and enhances the accuracy. The computed results prove the effectiveness of the proposed ensemble neural network (ENN) model with respect to the considered error factors in comparison with earlier models available in the literature.
[Three-dimensional parallel collagen scaffold promotes tendon extracellular matrix formation].
Zheng, Zefeng; Shen, Weiliang; Le, Huihui; Dai, Xuesong; Ouyang, Hongwei; Chen, Weishan
2016-03-01
To investigate the effects of a three-dimensional parallel collagen scaffold on the cell shape, arrangement and extracellular matrix formation of tendon stem cells. A parallel collagen scaffold was fabricated by a unidirectional freezing technique, while a random collagen scaffold was fabricated by a freeze-drying technique. The effects of the two scaffolds on cell shape and extracellular matrix formation were investigated in vitro by seeding tendon stem/progenitor cells and in vivo by ectopic implantation. Parallel and random collagen scaffolds were produced successfully. The parallel collagen scaffold was more akin to tendon than the random collagen scaffold. Tendon stem/progenitor cells were spindle-shaped and uniformly orientated in the parallel collagen scaffold, while cells on the random collagen scaffold had disordered orientation. Two weeks after ectopic implantation, cells had nearly the same orientation as the collagen substance. In the parallel collagen scaffold, cells had a parallel arrangement and more spindly cells were observed; by contrast, cells in the random collagen scaffold were disordered. The parallel collagen scaffold can induce cells into a spindly shape and parallel arrangement, and promote parallel extracellular matrix formation, while the random collagen scaffold induces a random cell arrangement. The results indicate that the parallel collagen scaffold is an ideal structure to promote tendon repair.
Spectrum of walk matrix for Koch network and its application
NASA Astrophysics Data System (ADS)
Xie, Pinchen; Lin, Yuan; Zhang, Zhongzhi
2015-06-01
Various structural and dynamical properties of a network are encoded in the eigenvalues of the walk matrix describing random walks on the network. In this paper, we study the spectrum of the walk matrix of the Koch network, which displays prominent scale-free and small-world features. Utilizing the particular architecture of the network, we obtain all the eigenvalues and their corresponding multiplicities. Based on the link between the eigenvalues of the walk matrix and the random target access time, defined as the expected time for a walker to go from an arbitrary node to another one selected randomly according to the steady-state distribution, we then derive an explicit solution for the random target access time for random walks on the Koch network. Finally, we corroborate our computation of the eigenvalues by enumerating spanning trees in the Koch network, using the connection between eigenvalues and spanning trees, where a spanning tree of a network is a subgraph of the network that is a tree containing all the nodes.
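The eigenvalue-to-access-time link can be made concrete on any connected graph. The sketch below uses a small Erdos-Renyi graph purely for illustration rather than the Koch network (whose construction is omitted here), computing the random target access time as Kemeny's constant K = sum_{i>=2} 1/(1 - lambda_i) over the walk-matrix eigenvalues.

```python
# Random target access time (Kemeny's constant) from walk-matrix eigenvalues.
import numpy as np
import networkx as nx

G = nx.erdos_renyi_graph(30, 0.25, seed=1)
assert nx.is_connected(G)              # the formula needs a connected graph

A = nx.to_numpy_array(G)
d = A.sum(axis=1)
# D^{-1/2} A D^{-1/2} is symmetric and shares eigenvalues with the walk
# matrix D^{-1} A, so the symmetric eigensolver applies.
S = A / np.sqrt(np.outer(d, d))
lam = np.sort(np.linalg.eigvalsh(S))[::-1]

kemeny = np.sum(1.0 / (1.0 - lam[1:]))
print("random target access time (Kemeny's constant):", kemeny)
```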
Low-dimensional Representation of Error Covariance
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; Cohn, Stephen E.; Todling, Ricardo; Marchesin, Dan
2000-01-01
Ensemble and reduced-rank approaches to prediction and assimilation rely on low-dimensional approximations of the estimation error covariances. Here stability properties of the forecast/analysis cycle for linear, time-independent systems are used to identify factors that cause the steady-state analysis error covariance to admit a low-dimensional representation. A useful measure of forecast/analysis cycle stability is the bound matrix, a function of the dynamics, observation operator and assimilation method. Upper and lower estimates for the steady-state analysis error covariance matrix eigenvalues are derived from the bound matrix. The estimates generalize to time-dependent systems. If much of the steady-state analysis error variance is due to a few dominant modes, the leading eigenvectors of the bound matrix approximate those of the steady-state analysis error covariance matrix. The analytical results are illustrated in two numerical examples where the Kalman filter is carried to steady state. The first example uses the dynamics of a generalized advection equation exhibiting nonmodal transient growth. Failure to observe growing modes leads to increased steady-state analysis error variances. Leading eigenvectors of the steady-state analysis error covariance matrix are well approximated by leading eigenvectors of the bound matrix. The second example uses the dynamics of a damped baroclinic wave model. The leading eigenvectors of a lowest-order approximation of the bound matrix are shown to approximate well the leading eigenvectors of the steady-state analysis error covariance matrix.
Shallow cumuli ensemble statistics for development of a stochastic parameterization
NASA Astrophysics Data System (ADS)
Sakradzija, Mirjana; Seifert, Axel; Heus, Thijs
2014-05-01
According to a conventional deterministic approach to the parameterization of moist convection in numerical atmospheric models, a given large-scale forcing produces a unique response from the unresolved convective processes. This representation leaves out the small-scale variability of convection: as is known from empirical studies of deep and shallow convective cloud ensembles, there is a whole distribution of sub-grid states corresponding to a given large-scale forcing. Moreover, this distribution gets broader with increasing model resolution. This behavior is also consistent with our theoretical understanding of a coarse-grained nonlinear system. We propose an approach to represent the variability of the unresolved shallow-convective states, including the dependence of the spread and shape of the sub-grid state distribution on the model horizontal resolution. Starting from the Gibbs canonical ensemble theory, Craig and Cohen (2006) developed a theory for the fluctuations in a deep convective ensemble. The micro-states of a deep convective cloud ensemble are characterized by the cloud-base mass flux, which, according to the theory, is exponentially distributed (Boltzmann distribution). Following their work, we study the shallow cumulus ensemble statistics and the distribution of the cloud-base mass flux. We employ a Large-Eddy Simulation (LES) model and a cloud tracking algorithm, followed by conditional sampling of clouds at the cloud-base level, to retrieve information about the individual cloud life cycles and the cloud ensemble as a whole. In the case of a shallow cumulus cloud ensemble, the distribution of micro-states is a generalized exponential distribution. Based on the empirical and theoretical findings, a stochastic model has been developed to simulate the shallow convective cloud ensemble and to test the convective ensemble theory; a minimal sketch of such a generator is given below. The stochastic model simulates a compound random process, with the number of convective elements drawn from a Poisson distribution and cloud properties sub-sampled from a generalized ensemble distribution. We study the role of the different cloud subtypes in a shallow convective ensemble and how the diverse cloud properties and cloud lifetimes affect the system macro-state. To what extent does the cloud-base mass flux distribution deviate from the simple Boltzmann distribution, and how does this affect the results from the stochastic model? Is the memory provided by the finite lifetime of individual clouds important for the ensemble statistics? We also test for the minimal information that, given as input to the stochastic model, is able to reproduce the ensemble mean statistics and the variability in a convective ensemble. An important property of the resulting distribution of the sub-grid convective states is its scale-adaptivity: the smaller the grid size, the broader the compound distribution of the sub-grid states.
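The sketch below illustrates the compound generator under simplifying assumptions: exponential per-cloud mass fluxes as in the Craig-Cohen theory (the generalized distribution and finite cloud lifetimes are omitted), with assumed grid-box parameter values.

```python
# Compound Poisson sampler for grid-box total cloud-base mass flux.
import numpy as np

rng = np.random.default_rng(0)
mean_clouds, mean_flux, n_draws = 50, 0.07, 10000   # assumed grid-box values

totals = np.empty(n_draws)
for k in range(n_draws):
    n = rng.poisson(mean_clouds)                    # number of clouds
    totals[k] = rng.exponential(mean_flux, n).sum() # sum of per-cloud fluxes

# Relative variability grows as grid boxes shrink (fewer clouds per box).
print("mean:", totals.mean(), "std/mean:", totals.std() / totals.mean())
```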
Impact of Damping Uncertainty on SEA Model Response Variance
NASA Technical Reports Server (NTRS)
Schiller, Noah; Cabell, Randolph; Grosveld, Ferdinand
2010-01-01
Statistical Energy Analysis (SEA) is commonly used to predict high-frequency vibroacoustic levels. This statistical approach provides the mean response over an ensemble of random subsystems that share the same gross system properties such as density, size, and damping. Recently, techniques have been developed to predict the ensemble variance as well as the mean response. However these techniques do not account for uncertainties in the system properties. In the present paper uncertainty in the damping loss factor is propagated through SEA to obtain more realistic prediction bounds that account for both ensemble and damping variance. The analysis is performed on a floor-equipped cylindrical test article that resembles an aircraft fuselage. Realistic bounds on the damping loss factor are determined from measurements acquired on the sidewall of the test article. The analysis demonstrates that uncertainties in damping have the potential to significantly impact the mean and variance of the predicted response.
NASA Astrophysics Data System (ADS)
Keane, Richard J.; Plant, Robert S.; Tennant, Warren J.
2016-05-01
The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme, whose only stochastic element comes from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.
Changing precipitation in western Europe, climate change or natural variability?
NASA Astrophysics Data System (ADS)
Aalbers, Emma; Lenderink, Geert; van Meijgaard, Erik; van den Hurk, Bart
2017-04-01
Multi-model RCM-GCM ensembles provide high-resolution climate projections, valuable for, among other applications, climate impact assessment studies. While the application of multiple models (both GCMs and RCMs) provides a certain robustness with respect to model uncertainty, the interpretation of differences between ensemble members, which combine model uncertainty with the natural variability of the climate system, is not straightforward. Natural variability is intrinsic to the climate system and a potentially large source of uncertainty in climate change projections, especially for projections on the local to regional scale. To quantify the natural variability and obtain a robust estimate of the forced climate change response (given a certain model and forcing scenario), large ensembles of climate model simulations with the same model provide essential information. While for global climate models (GCMs) a number of such large single-model ensembles exist and have been analyzed, for regional climate models (RCMs) the number and size of single-model ensembles is limited, and the predictability of the forced climate response at the local to regional scale is still rather uncertain. We present a regional downscaling of a 16-member single-model ensemble over western Europe and the Alps at a resolution of 0.11 degrees (~12 km), similar to the highest-resolution EURO-CORDEX simulations. This 16-member ensemble was generated by the GCM EC-EARTH, which was downscaled with the RCM RACMO for the period 1951-2100. This single-model ensemble has been investigated in terms of the ensemble mean response (our estimate of the forced climate response), as well as the differences between ensemble members, which measure natural variability. We focus on the response in seasonal mean and extreme precipitation (seasonal maxima and extremes with a return period of up to 20 years) for the near to far future. For most precipitation indices we can reliably determine the climate change signal, given the applied model chain and forcing scenario. However, the analysis also shows how limited the information in single ensemble members is on the local-scale forced climate response, even for high levels of global warming when the forced response has emerged from natural variability. Analysis and application of multi-model ensembles like EURO-CORDEX should go hand-in-hand with single-model ensembles, like the one presented here, to correctly interpret the fine-scale information in terms of a forced signal and random noise due to natural variability.
Mathematical foundations of hybrid data assimilation from a synchronization perspective
NASA Astrophysics Data System (ADS)
Penny, Stephen G.
2017-12-01
The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.
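The hybrid construction itself is a one-line blend of covariances. A minimal sketch follows, with assumed notation (B_clim for the static climatological estimate, P_ens for the flow-dependent sample covariance, alpha for the blending weight, all names introduced here for illustration); it shows why the blend helps small ensembles: it restores full rank.

```python
# Hybrid background-error covariance: static estimate blended with a
# rank-deficient ensemble sample covariance.
import numpy as np

rng = np.random.default_rng(0)
n, n_ens, alpha = 40, 10, 0.5

B_clim = np.eye(n)                                # static error estimate
ens = rng.standard_normal((n_ens, n))             # ensemble of model states
P_ens = np.cov(ens, rowvar=False)                 # rank n_ens - 1 < n

B_hybrid = alpha * B_clim + (1 - alpha) * P_ens   # full-rank blend
print("ranks:", np.linalg.matrix_rank(P_ens), np.linalg.matrix_rank(B_hybrid))
```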
A stochastic diffusion process for Lochner's generalized Dirichlet distribution
Bakosi, J.; Ristorcelli, J. R.
2013-10-01
The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N stochastic variables with Lochner's generalized Dirichlet distribution as its asymptotic solution. Individual samples of a discrete ensemble, obtained from the system of stochastic differential equations equivalent to the Fokker-Planck equation developed here, satisfy a unit-sum constraint at all times and ensure a bounded sample space, similarly to the process developed previously for the Dirichlet distribution. Consequently, the generalized Dirichlet diffusion process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Compared to the Dirichlet distribution and process, the additional parameters of the generalized Dirichlet distribution allow a more general class of physical processes to be modeled, with a more general covariance matrix.
Multiple Scattering in Planetary Regoliths Using Incoherent Interactions
NASA Astrophysics Data System (ADS)
Muinonen, K.; Markkanen, J.; Vaisanen, T.; Penttilä, A.
2017-12-01
We consider scattering of light by a planetary regolith using novel numerical methods for discrete random media of particles. Understanding the scattering process is of key importance for spectroscopic, photometric, and polarimetric modeling of airless planetary objects, including radar studies. In our modeling, the size of the spherical random medium can range from microscopic to macroscopic sizes, whereas the particles are assumed to be of the order of the wavelength in size. We extend the radiative transfer and coherent backscattering method (RT-CB) to the case of dense packing of particles by adopting the ensemble-averaged first-order incoherent extinction, scattering, and absorption characteristics of a volume element of particles as input. In the radiative transfer part, at each absorption and scattering process, we account for absorption with the help of the single-scattering albedo and peel off the Stokes parameters of radiation emerging from the medium in predefined scattering angles. We then generate a new scattering direction using the joint probability density for the local polar and azimuthal scattering angles. In the coherent backscattering part, we utilize amplitude scattering matrices along the radiative-transfer path and the reciprocal path. Furthermore, we replace the far-field interactions of the RT-CB method with rigorous interactions facilitated by the Superposition T-matrix method (STMM). This gives rise to a new RT-RT method, radiative transfer with reciprocal interactions. For microscopic random media, we then compare the new results to asymptotically exact results computed using the STMM, succeeding in the numerical validation of the new methods. Acknowledgments: Research supported by the European Research Council with Advanced Grant No. 320773 SAEMPL, Scattering and Absorption of ElectroMagnetic waves in ParticuLate media. Computational resources provided by CSC - IT Centre for Science Ltd, Finland.
Dimitriadis, S I; Liparas, Dimitris; Tsolaki, Magda N
2018-05-15
In the era of computer-assisted diagnostic tools for various brain diseases, Alzheimer's disease (AD) covers a large percentage of neuroimaging research, with the main scope being its use in daily practice. However, there has been no study attempting to simultaneously discriminate among Healthy Controls (HC), early mild cognitive impairment (MCI), late MCI (cMCI) and stable AD, using features derived from a single modality, namely MRI. Based on preprocessed MRI images from the organizers of a neuroimaging challenge, we attempted to quantify the prediction accuracy of multiple morphological MRI features to simultaneously discriminate among HC, MCI, cMCI and AD. We explored the efficacy of a novel scheme that includes multiple feature selections via Random Forest from subsets of the whole set of features (e.g. whole set, left/right hemisphere, etc.), Random Forest classification using a fusion approach, and ensemble classification via majority voting. From the ADNI database, 60 HC, 60 MCI, 60 cMCI and 60 AD subjects were used as a training set with known labels. An extra dataset of 160 subjects (HC: 40, MCI: 40, cMCI: 40 and AD: 40) was used as an external blind validation dataset to evaluate the proposed machine learning scheme. On the blind validation dataset, we achieved a four-class classification accuracy of 61.9% by combining MRI-based features with a Random Forest-based ensemble strategy, the best classification accuracy of all teams that participated in this neuroimaging competition. The results demonstrate the effectiveness of the proposed scheme in simultaneously discriminating among four groups using morphological MRI features, for the first time in the literature. Hence, the proposed machine learning scheme can be used to define single and multi-modal biomarkers for AD.
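A schematic of the fusion recipe (feature subsets, one Random Forest each, then a majority vote), with synthetic data and arbitrary column halves standing in for the paper's anatomical feature groups:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic 4-class problem; the "subsets" mimic whole-set / left / right groups.
X, y = make_classification(n_samples=400, n_features=60, n_classes=4,
                           n_informative=12, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

subsets = [slice(None), slice(0, 30), slice(30, 60)]   # arbitrary feature groups
preds = []
for cols in subsets:
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(Xtr[:, cols], ytr)
    preds.append(rf.predict(Xte[:, cols]))

stacked = np.vstack(preds)                             # classifiers x samples
vote = np.array([np.bincount(col).argmax() for col in stacked.T])  # majority vote
print("ensemble accuracy:", (vote == yte).mean())
```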
Simultaneous Denoising, Deconvolution, and Demixing of Calcium Imaging Data
Pnevmatikakis, Eftychios A.; Soudry, Daniel; Gao, Yuanjun; Machado, Timothy A.; Merel, Josh; Pfau, David; Reardon, Thomas; Mu, Yu; Lacefield, Clay; Yang, Weijian; Ahrens, Misha; Bruno, Randy; Jessell, Thomas M.; Peterka, Darcy S.; Yuste, Rafael; Paninski, Liam
2016-01-01
We present a modular approach for analyzing calcium imaging recordings of large neuronal ensembles. Our goal is to simultaneously identify the locations of the neurons, demix spatially overlapping components, and denoise and deconvolve the spiking activity from the slow dynamics of the calcium indicator. Our approach relies on a constrained nonnegative matrix factorization that expresses the spatiotemporal fluorescence activity as the product of a spatial matrix that encodes the spatial footprint of each neuron in the optical field and a temporal matrix that characterizes the calcium concentration of each neuron over time. This framework is combined with a novel constrained deconvolution approach that extracts estimates of neural activity from fluorescence traces, to create a spatiotemporal processing algorithm that requires minimal parameter tuning. We demonstrate the general applicability of our method by applying it to in vitro and in vivo multineuronal imaging data, whole-brain light-sheet imaging data, and dendritic imaging data. PMID:26774160
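A toy stand-in for the factorization step, using plain NMF on synthetic data; the paper's spatial constraints and the deconvolution of calcium dynamics are omitted here.

```python
import numpy as np
from sklearn.decomposition import NMF

# The data matrix Y (pixels x frames) is modeled as Y ~ A @ C, where A holds
# one spatial footprint per neuron and C one calcium trace per neuron.
rng = np.random.default_rng(0)
n_pix, n_frames, n_neurons = 400, 300, 5
A_true = np.abs(rng.standard_normal((n_pix, n_neurons)))**3       # sparse-ish
C_true = np.maximum(rng.standard_normal((n_neurons, n_frames)), 0)
Y = A_true @ C_true + 0.05*np.abs(rng.standard_normal((n_pix, n_frames)))

model = NMF(n_components=n_neurons, init="nndsvda", max_iter=500)
A = model.fit_transform(Y)            # estimated spatial footprints
C = model.components_                 # estimated temporal traces
print("relative residual:", np.linalg.norm(Y - A @ C)/np.linalg.norm(Y))
```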
Noisy covariance matrices and portfolio optimization
NASA Astrophysics Data System (ADS)
Pafka, S.; Kondor, I.
2002-05-01
According to recent findings [Bouchaud et al.; Stanley et al.], empirical covariance matrices deduced from financial return series contain such a high amount of noise that, apart from a few large eigenvalues and the corresponding eigenvectors, their structure can essentially be regarded as random. Bouchaud and co-workers report, for example, that about 94% of the spectrum of these matrices can be fitted by that of a random matrix drawn from an appropriately chosen ensemble. In view of the fundamental role of covariance matrices in the theory of portfolio optimization as well as in industry-wide risk management practices, we analyze the possible implications of this effect. Simulation experiments with matrices having the structure described in these studies lead us to the conclusion that in the context of the classical portfolio problem (minimizing the portfolio variance under linear constraints) noise has relatively little effect. To leading order the solutions are determined by the stable, large eigenvalues, and the displacement of the solution (measured in variance) due to noise is rather small: depending on the size of the portfolio and on the length of the time series, it is of the order of 5 to 15%. The picture is completely different, however, if we attempt to minimize the variance under non-linear constraints, like those that arise, e.g., in the problem of margin accounts or in international capital adequacy regulation. In these problems the presence of noise leads to a serious instability and a high degree of degeneracy of the solutions.
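A small simulation in the spirit of the experiment described: minimum-variance weights are computed from the true covariance and from a noisy sample covariance, and the resulting variance displacement is measured. The one-factor covariance structure and all sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 500                                   # assets, return observations
beta = 0.5 + rng.random(N)                        # exposures to a "market" factor
Sigma = 0.04*np.outer(beta, beta) + np.diag(0.02 + 0.03*rng.random(N))
R = rng.multivariate_normal(np.zeros(N), Sigma, size=T)
S = np.cov(R, rowvar=False)                       # noisy sample covariance

def min_var_weights(C):
    w = np.linalg.solve(C, np.ones(len(C)))       # w proportional to C^{-1} 1
    return w/w.sum()                              # budget constraint 1'w = 1

w_true, w_noisy = min_var_weights(Sigma), min_var_weights(S)
true_var = lambda w: w @ Sigma @ w                # variance under the *true* Sigma
print("variance displacement: %.1f%%"
      % (100*(true_var(w_noisy)/true_var(w_true) - 1)))
```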
Hybrid vs Adaptive Ensemble Kalman Filtering for Storm Surge Forecasting
NASA Astrophysics Data System (ADS)
Altaf, M. U.; Raboudi, N.; Gharamti, M. E.; Dawson, C.; McCabe, M. F.; Hoteit, I.
2014-12-01
Recent storm surge events due to hurricanes in the Gulf of Mexico have motivated efforts to accurately forecast water levels. Toward this goal, a parallel architecture has been implemented based on a high-resolution storm surge model, ADCIRC. However, the accuracy of the model notably depends on the quality and recentness of the input data (mainly winds and bathymetry), the model parameters (e.g. wind and bottom drag coefficients), and the resolution of the model grid. Given all these uncertainties in the system, the challenge is to build an efficient prediction system capable of providing accurate forecasts far enough ahead of time for the authorities to evacuate the areas at risk. We have developed an ensemble-based data assimilation system to frequently assimilate available data into the ADCIRC model in order to improve its accuracy. In this contribution we study and analyze the performance of different ensemble Kalman filter methodologies for efficient short-range storm surge forecasting, the aim being to produce the most accurate forecasts at the lowest possible computing time. Using Hurricane Ike meteorological data to force the ADCIRC model over a domain including the Gulf of Mexico coastline, we implement and compare the forecasts of the standard EnKF, the hybrid EnKF, and an adaptive EnKF. The last two schemes have been introduced as efficient tools for enhancing the behavior of the EnKF when implemented with small ensembles, by exploiting information from a static background covariance matrix. Covariance inflation and localization are implemented in all these filters. Our results suggest that both the hybrid and the adaptive approach provide significantly better forecasts than the standard EnKF, even when implemented with much smaller ensembles.
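A minimal sketch of the hybrid covariance idea: blend the small-ensemble sample covariance with a static background matrix B before forming the Kalman gain. Dimensions, B, and the weight alpha are placeholders, not the values used with ADCIRC.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 10                                   # state dim, (small) ensemble size
ens = rng.standard_normal((n, m))               # forecast ensemble (columns)
anom = ens - ens.mean(axis=1, keepdims=True)
P_ens = anom @ anom.T/(m - 1)                   # rank-deficient, sampling-noisy
B = np.eye(n)                                   # static "climatological" covariance
alpha = 0.5                                     # hybrid weight (tunable/adaptive)
P_hyb = alpha*P_ens + (1 - alpha)*B             # hybrid background covariance

H = np.zeros((1, n)); H[0, 0] = 1.0             # observe the first state variable
Rv = np.array([[0.1]])                          # observation-error covariance
K = P_hyb @ H.T @ np.linalg.inv(H @ P_hyb @ H.T + Rv)   # Kalman gain
print("gain on observed variable:", K[0, 0])
```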
Graphic matching based on shape contexts and reweighted random walks
NASA Astrophysics Data System (ADS)
Zhang, Mingxuan; Niu, Dongmei; Zhao, Xiuyang; Liu, Mingjun
2018-04-01
Graphic matching is a critical issue in many areas of computer vision. In this paper, a new graphic matching algorithm combining shape contexts and reweighted random walks is proposed. Building on the shape-context local descriptor, the reweighted random walks algorithm is modified for stronger robustness and a more accurate final result. The main idea is to use the shape-context descriptors within the random-walk iteration to control the random-walk probability matrix: a bias matrix is calculated from the descriptors and used during the iteration to improve the accuracy of the random walks and random jumps, and the one-to-one registration result is finally obtained by discretization of the matrix. The algorithm not only preserves the noise robustness of reweighted random walks but also inherits the rotation, translation, and scale invariance of shape contexts. Extensive experiments on real images and random synthetic point sets, and comparisons with other algorithms, confirm that the new method produces excellent results in graphic matching.
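A schematic of this kind of biased, reweighted walk (not the paper's exact update rule): the jump distribution carries the unary shape-context similarity, while the transition matrix carries the pairwise affinities between candidate correspondences.

```python
import numpy as np

def matching_random_walk(W, bias, alpha=0.8, iters=100):
    """x holds match scores over candidate correspondences; W is the
    pairwise-affinity matrix; `bias` encodes shape-context similarity."""
    P = W/np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
    x = np.full(len(W), 1.0/len(W))
    for _ in range(iters):
        x = alpha*(P.T @ x) + (1 - alpha)*bias             # walk + biased jump
        x /= x.sum()
    return x

rng = np.random.default_rng(0)
W = rng.random((16, 16)); W = 0.5*(W + W.T)                # toy affinities
bias = rng.random(16); bias /= bias.sum()                  # toy unary similarity
scores = matching_random_walk(W, bias)
print("best candidate correspondence:", scores.argmax())   # before discretization
```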
Spectral density of mixtures of random density matrices for qubits
NASA Astrophysics Data System (ADS)
Zhang, Lin; Wang, Jiamei; Chen, Zhihua
2018-06-01
We derive the spectral density of the equiprobable mixture of two random density matrices of a two-level quantum system. We also work out the spectral density of the mixture under the so-called quantum addition rule. We use the spectral densities to calculate the average entropy of mixtures of random density matrices, and show that the average entropy of the arithmetic-mean state of n qubit density matrices randomly chosen from the Hilbert-Schmidt ensemble is never decreasing in the number n. We also obtain the exact value of the average squared fidelity. Some conjectures and open problems related to von Neumann entropy are also proposed.
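A quick Monte Carlo check of the monotonicity statement, sampling qubit states from the Hilbert-Schmidt (Ginibre-induced) ensemble:

```python
import numpy as np

rng = np.random.default_rng(0)

def hs_random_qubit():
    """Hilbert-Schmidt-random qubit state via a complex Ginibre matrix."""
    G = rng.standard_normal((2, 2)) + 1j*rng.standard_normal((2, 2))
    rho = G @ G.conj().T
    return rho/np.trace(rho).real

def entropy(rho):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam*np.log(lam)).sum())

for n in (1, 2, 3, 5):
    s = np.mean([entropy(sum(hs_random_qubit() for _ in range(n))/n)
                 for _ in range(5000)])
    print(f"n = {n}: average entropy = {s:.4f}")   # should not decrease with n
```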
Random-Forest Classification of High-Resolution Remote Sensing Images and Ndsm Over Urban Areas
NASA Astrophysics Data System (ADS)
Sun, X. F.; Lin, X. G.
2017-09-01
As an intermediate step between raw remote sensing data and digital urban maps, remote sensing data classification has been a challenging and long-standing research problem in the remote sensing community. In this work, an effective classification method is proposed for classifying high-resolution remote sensing data over urban areas. Starting from high-resolution multi-spectral images and 3D geometry data, our method proceeds in three main stages: feature extraction, classification, and refinement of the classified result. First, we extract color, vegetation index, and texture features from the multi-spectral image and compute the height, elevation texture, and differential morphological profile (DMP) features from the 3D geometry data. In the classification stage, multiple random forest (RF) classifiers are trained separately and then combined to form an RF ensemble that estimates each sample's category probabilities. Finally, the probabilities, along with the feature importance indicator output by the RF ensemble, are used to construct a fully connected conditional random field (FCCRF) graph model, by which the classification results are refined through mean-field-based statistical inference. Experiments on the ISPRS Semantic Labeling Contest dataset show that our proposed 3-stage method achieves 86.9% overall accuracy on the test data.
NASA Astrophysics Data System (ADS)
Sun, Fubao; Roderick, Michael L.; Lim, Wee Ho; Farquhar, Graham D.
2011-12-01
We assess hydroclimatic projections for the Murray-Darling Basin (MDB) using an ensemble of 39 Intergovernmental Panel on Climate Change AR4 climate model runs based on the A1B emissions scenario. The raw model output for precipitation, P, was adjusted using a quantile-based bias correction approach. We found that the projected change, ΔP, between two 30 year periods (2070-2099 minus 1970-1999) was little affected by bias correction. The range of ΔP among models was large (~±150 mm yr⁻¹), with the all-model-run and all-model-ensemble averages (4.9 and -8.1 mm yr⁻¹) near zero, against a background climatological P of ~500 mm yr⁻¹. We found that the time series of actually observed annual P over the MDB was indistinguishable from that generated by a purely random process. Importantly, nearly all the model runs showed similar behavior. We used these facts to develop a new approach to understanding variability in projections of ΔP. By plotting ΔP versus the variance of the time series, we could easily identify model runs with projections for ΔP that were beyond the bounds expected from purely random variations. For the MDB, we anticipate that a purely random process could lead to differences of ±57 mm yr⁻¹ (95% confidence) between successive 30 year periods. This is equivalent to ±11% of the climatological P and translates into variations in runoff of around ±29%. This sets a baseline for gauging modeled and/or observed changes.
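The ±57 mm yr⁻¹ figure can be sanity-checked against the standard result that the difference of two independent 30-year means of white noise has standard deviation σ√(2/30). The σ below is an assumed illustrative value, not the MDB estimate from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n_years = 112.0, 30                 # mm/yr (assumed), years per period
# Monte Carlo: difference between two independent 30-year means, scaled by sigma
diffs = (rng.standard_normal((100_000, n_years)).mean(axis=1)
         - rng.standard_normal((100_000, n_years)).mean(axis=1))*sigma
print("analytic 95%% bound: +/-%.1f mm/yr" % (1.96*sigma*np.sqrt(2/n_years)))
print("Monte Carlo 95%% bound: +/-%.1f mm/yr" % np.quantile(np.abs(diffs), 0.95))
```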
Performance Analysis of Local Ensemble Kalman Filter
NASA Astrophysics Data System (ADS)
Tong, Xin T.
2018-03-01
Ensemble Kalman filter (EnKF) is an important data assimilation method for high-dimensional geophysical systems. Efficient implementation of EnKF in practice often involves the localization technique, which updates each component using only information within a local radius. This paper rigorously analyzes the local EnKF (LEnKF) for linear systems and shows that the filter error can be dominated by the ensemble covariance, as long as (1) the sample size exceeds the logarithmic of state dimension and a constant that depends only on the local radius; (2) the forecast covariance matrix admits a stable localized structure. In particular, this indicates that with small system and observation noises, the filter error will be accurate in long time even if the initialization is not. The analysis also reveals an intrinsic inconsistency caused by the localization technique, and a stable localized structure is necessary to control this inconsistency. While this structure is usually taken for granted for the operation of LEnKF, it can also be rigorously proved for linear systems with sparse local observations and weak local interactions. These theoretical results are also validated by numerical implementation of LEnKF on a simple stochastic turbulence in two dynamical regimes.
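A minimal sketch of the localization step analyzed here: the sample covariance is tapered entrywise (a Schur product) so that correlations beyond a local radius are damped. A Gaussian taper stands in for the usual Gaspari-Cohn function, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, radius = 60, 15, 5.0                      # state dim, ensemble size, radius
ens = rng.standard_normal((n, m))
anom = ens - ens.mean(axis=1, keepdims=True)
P = anom @ anom.T/(m - 1)                       # noisy sample covariance

i = np.arange(n)
dist = np.abs(i[:, None] - i[None, :])          # 1-D state-space distance
taper = np.exp(-0.5*(dist/radius)**2)           # localization matrix
P_loc = taper*P                                 # Schur (entrywise) product
print("spurious long-range correlation before/after:",
      abs(P[0, n//2]), abs(P_loc[0, n//2]))
```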
Random matrices and the New York City subway system
NASA Astrophysics Data System (ADS)
Jagannath, Aukosh; Trogdon, Thomas
2017-09-01
We analyze subway arrival times in the New York City subway system. We find regimes where the gaps between trains are well modeled by (unitarily invariant) random matrix statistics and Poisson statistics. The departure from random matrix statistics is captured by the value of the Coulomb potential along the subway route. This departure becomes more pronounced as trains make more stops.
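The kind of diagnostic involved can be sketched in a few lines: normalized gaps are compared against the Wigner surmise and the Poisson (exponential) law, here for synthetic GUE data rather than measured subway arrivals.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((500, 500)) + 1j*rng.standard_normal((500, 500))
H = (H + H.conj().T)/2                        # GUE matrix
ev = np.sort(np.linalg.eigvalsh(H))
bulk = ev[100:400]                            # avoid spectral edges
s = np.diff(bulk); s /= s.mean()              # normalized "gaps between trains"

wigner = (32/np.pi**2)*s**2*np.exp(-4*s**2/np.pi)   # GUE Wigner surmise density
poisson = np.exp(-s)                                # Poisson (exponential) density
print("log-likelihood under Wigner: %.1f, under Poisson: %.1f"
      % (np.log(wigner).sum(), np.log(poisson).sum()))
print("fraction of spacings < 0.2 (level repulsion):", (s < 0.2).mean())
```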
Doyle, Suzanne R.; Donovan, Dennis M.
2014-01-01
Aims: The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed and the subagging procedure of random subsampling is applied. Methods: Among 234 individuals with stimulant use disorders randomized to a 12-Step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. Findings: The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower ASI Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. Conclusions: The application of an ensemble subsampling regression tree method exploits the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion that there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038
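A minimal sketch of subagging on synthetic data: each tree is fit on a random subsample drawn without replacement (rather than a bootstrap), and the unstable trees' predictions are averaged. The dataset and all sizes are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=234, n_features=50, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)

probs = np.zeros(len(Xte))
n_trees, frac = 200, 0.5                        # subsample half the data per tree
for _ in range(n_trees):
    idx = rng.choice(len(Xtr), size=int(frac*len(Xtr)), replace=False)
    tree = DecisionTreeClassifier(max_depth=4).fit(Xtr[idx], ytr[idx])
    probs += tree.predict(Xte)/n_trees          # average the unstable trees
print("subagged accuracy:", ((probs > 0.5).astype(int) == yte).mean())
```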
NASA Astrophysics Data System (ADS)
Liu, Haitao
The objective of the present study is to investigate damage mechanisms and thermal residual stresses in composites, and to establish frameworks to model particle-reinforced metal matrix composites with particle-matrix interfacial debonding, particle cracking, or thermal residual stresses. An evolutionary interfacial debonding model is proposed for composites with spheroidal particles. The construction of the equivalent stiffness is based on the fact that when debonding occurs in a certain direction, the load-transfer ability is lost in that direction. By this equivalence, the interfacial debonding problem can be converted into a composite problem with perfectly bonded inclusions. Considering that interfacial debonding is a progressive process in which the debonding area increases in proportion to external loading, a progressive interfacial debonding model is proposed. In this model, the relation between external loading and the debonding area is established using a normal-stress-controlled debonding criterion. Furthermore, an equivalent orthotropic stiffness tensor is constructed based on the debonding areas. This model can treat composites with randomly distributed spherical particles. The double-inclusion theory is invoked to model particle cracking problems. Cracks inside particles are treated as penny-shaped particles with zero stiffness. The disturbed stress field due to the existence of a double inclusion is expressed explicitly. Finally, a thermal mismatch eigenstrain is introduced to simulate the inconsistent expansions of the matrix and the particles due to the difference in their coefficients of thermal expansion. Micromechanical stress and strain fields are calculated for the combination of applied external loads and the prescribed thermal mismatch eigenstrains. For all of the above models, ensemble-volume averaging procedures are employed to derive the effective yield function of the composites. Numerical simulations are performed to analyze the effects of various parameters, and good agreement between the model predictions and experimental results is obtained. All expressions in these frameworks are derived explicitly, and the analytical results are easy to adopt in related investigations.
Quantum chaos for nonstandard symmetry classes in the Feingold-Peres model of coupled tops
NASA Astrophysics Data System (ADS)
Fan, Yiyun; Gnutzmann, Sven; Liang, Yuqi
2017-12-01
We consider two coupled quantum tops with angular momentum vectors L and M. The coupling Hamiltonian defines the Feingold-Peres model, which is a known paradigm of quantum chaos. We show that this model has a nonstandard symmetry with respect to the Altland-Zirnbauer tenfold symmetry classification of quantum systems, which extends the well-known threefold way of Wigner and Dyson (referred to as "standard" symmetry classes here). We identify the nonstandard symmetry classes BDI₀ (chiral orthogonal class with no zero modes), BDI₁ (chiral orthogonal class with one zero mode), and CI (antichiral orthogonal class) as well as the standard symmetry class AI (orthogonal class). We numerically analyze the specific spectral quantum signatures of chaos related to the nonstandard symmetries. In the microscopic density of states and in the distribution of the lowest positive energy eigenvalue, we show that the Feingold-Peres model follows the predictions of the Gaussian ensembles of random-matrix theory in the appropriate symmetry class if the corresponding classical dynamics is chaotic. In a crossover to mixed and near-integrable classical dynamics, we show that these signatures disappear or strongly change.
Wave Propagation and Localization via Quasi-Normal Modes and Transmission Eigenchannels
NASA Astrophysics Data System (ADS)
Wang, Jing; Shi, Zhou; Davy, Matthieu; Genack, Azriel Z.
2013-10-01
Field transmission coefficients for microwave radiation between arrays of points on the incident and output surfaces of random samples are analyzed to yield the underlying quasi-normal modes and transmission eigenchannels of each realization of the sample. The linewidths, central frequencies, and transmitted speckle patterns associated with each of the modes of the medium are found. Modal speckle patterns are found to be strongly correlated, leading to destructive interference between modes. This explains distinctive features of transmission spectra and pulsed transmission. An alternate description of wave transport is obtained from the eigenchannels and eigenvalues of the transmission matrix. The maximum transmission eigenvalue, τ1, is near unity for diffusive waves even in turbid samples. For localized waves, τ1 is nearly equal to the dimensionless conductance, which is the sum of all transmission eigenvalues, g = Στn. The spacings between the ensemble averages of successive values of lnτn are constant and equal to the inverse of the bare conductance, in accord with predictions by Dorokhov. The effective number of transmission eigenvalues Neff determines the contrast between the peak and background of radiation focused for maximum peak intensity. The connection between the mode and channel approaches is discussed.
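A toy version of the channel picture, with a normalized Gaussian matrix standing in for a measured transmission matrix (a true diffusive slab would instead follow Dorokhov's bimodal law):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                              # incident/output points
# Complex Gaussian stand-in, normalized so the top eigenvalue sits near unity.
t = (rng.standard_normal((N, N)) + 1j*rng.standard_normal((N, N)))/np.sqrt(8*N)
tau = np.linalg.eigvalsh(t.conj().T @ t)            # transmission eigenvalues
g = tau.sum()                                       # dimensionless conductance
N_eff = tau.sum()**2/np.sum(tau**2)                 # effective channel number
print(f"tau_1 = {tau.max():.3f}, g = {g:.3f}, N_eff = {N_eff:.1f}")
```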
Equilibrium problems for Raney densities
NASA Astrophysics Data System (ADS)
Forrester, Peter J.; Liu, Dang-Zheng; Zinn-Justin, Paul
2015-07-01
The Raney numbers are a class of combinatorial numbers generalising the Fuss-Catalan numbers. They are indexed by a pair of positive real numbers (p, r) with p > 1 and 0 < r ⩽ p, and form the moments of a probability density function. For certain (p, r) the latter has the interpretation as the density of squared singular values for certain random matrix ensembles, and in this context equilibrium problems characterising the Raney densities for (p, r) = (θ + 1, 1) and (θ/2 + 1, 1/2) have recently been proposed. Using two different techniques, one based on the Wiener-Hopf method for the solution of integral equations and the other on an analysis of the algebraic equation satisfied by the Green's function, we establish the validity of the equilibrium problems for general θ > 0 and similarly use both methods to identify the equilibrium problem for (p, r) = (θ/q + 1, 1/q), θ > 0 and q ∈ ℤ⁺. The Wiener-Hopf method is used to extend the latter to parameters (p, r) = (θ/q + 1, m + 1/q) for m a non-negative integer, and also to identify the equilibrium problem for a family of densities with moments given by certain binomial coefficients.
Law, Andrew J.; Rivlis, Gil
2014-01-01
Pioneering studies demonstrated that novel degrees of freedom could be controlled individually by directly encoding the firing rate of single motor cortex neurons, without regard to each neuron's role in controlling movement of the native limb. In contrast, recent brain-computer interface work has emphasized decoding outputs from large ensembles that include substantially more neurons than the number of degrees of freedom being controlled. To bridge the gap between direct encoding by single neurons and decoding output from large ensembles, we studied monkeys controlling one degree of freedom by comodulating up to four arbitrarily selected motor cortex neurons. Performance typically exceeded chance levels quite early in single sessions and then continued to improve to different degrees in different sessions. We therefore examined factors that might affect performance. Performance improved with larger ensembles. In contrast, other factors that might have reflected preexisting synaptic architecture, such as the similarity of preferred directions, had little if any effect on performance. Patterns of comodulation among ensemble neurons became more consistent across trials as performance improved over single sessions. Compared with the ensemble neurons, other simultaneously recorded neurons showed less modulation. Patterns of voluntarily comodulated firing among small numbers of arbitrarily selected primary motor cortex (M1) neurons thus can be found and improved rapidly, with little constraint based on the normal relationships of the individual neurons to native limb movement. This rapid flexibility in relationships among M1 neurons may in part underlie our ability to learn new movements and improve motor skill. PMID:24920030
Abawajy, Jemal; Kelarev, Andrei; Chowdhury, Morshed U; Jelinek, Herbert F
2016-01-01
Blood biochemistry attributes form an important class of tests, routinely collected several times per year for many patients with diabetes. The objective of this study is to investigate the role of blood biochemistry in improving the predictive accuracy of the diagnosis of cardiac autonomic neuropathy (CAN) progression. Blood biochemistry contributes to CAN, and so it is a causative factor that can provide additional power for the diagnosis of CAN, especially in the absence of a complete set of Ewing tests. We introduce automated iterative multitier ensembles (AIME) and investigate their performance in comparison to base classifiers and standard ensemble classifiers for blood biochemistry attributes. AIME incorporate diverse ensembles into several tiers simultaneously and combine them into one automatically generated integrated system, so that one ensemble acts as an integral part of another ensemble. We carried out extensive experimental analysis using large datasets from the diabetes screening research initiative (DiScRi) project. The results of our experiments show that several blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in situations where one or more of the Ewing tests cannot be completed because of the individual difficulties faced by each patient in performing the tests. The results show that AIME provide higher accuracy as a multitier CAN classification paradigm. The best predictive accuracy of 99.57% was obtained by the AIME combining Decorate on the top tier with bagging on the middle tier, based on random forest. Practitioners can use these findings to increase the accuracy of CAN diagnosis.
NASA Technical Reports Server (NTRS)
Maggioni, V.; Anagnostou, E. N.; Reichle, R. H.
2013-01-01
The contribution of rainfall forcing errors relative to model (structural and parameter) uncertainty in the prediction of soil moisture is investigated by integrating the NASA Catchment Land Surface Model (CLSM), forced with hydro-meteorological data, in the Oklahoma region. Rainfall-forcing uncertainty is introduced using a stochastic error model that generates ensemble rainfall fields from satellite rainfall products. The ensemble satellite rain fields are propagated through CLSM to produce soil moisture ensembles. Errors in CLSM are modeled with two different approaches: either by perturbing model parameters (representing model parameter uncertainty) or by adding randomly generated noise (representing model structure and parameter uncertainty) to the model prognostic variables. Our findings highlight that the method currently used in the NASA GEOS-5 Land Data Assimilation System to perturb CLSM variables poorly describes the uncertainty in the predicted soil moisture, even when combined with rainfall model perturbations. On the other hand, by adding model parameter perturbations to rainfall forcing perturbations, a better characterization of uncertainty in soil moisture simulations is observed. Specifically, an analysis of the rank histograms shows that the most consistent ensemble of soil moisture is obtained by combining rainfall and model parameter perturbations. When rainfall forcing and model prognostic perturbations are added, the rank histogram shows a U-shape at the domain average scale, which corresponds to a lack of variability in the forecast ensemble. The more accurate estimation of the soil moisture prediction uncertainty obtained by combining rainfall and parameter perturbations is encouraging for the application of this approach in ensemble data assimilation systems.
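The rank-histogram diagnostic mentioned above can be sketched in a few lines; the deliberately underdispersive synthetic ensemble below reproduces the U-shape that signals a lack of ensemble variability.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_members = 5000, 12
truth = rng.standard_normal(n_cases)             # verifying values
spread = 0.5                                     # < 1: underdispersive on purpose
ensemble = spread*rng.standard_normal((n_cases, n_members))

# Rank of the truth within each ensemble: count members below the truth.
ranks = (ensemble < truth[:, None]).sum(axis=1)
hist = np.bincount(ranks, minlength=n_members + 1)/n_cases
print("rank histogram:", np.round(hist, 3))      # mass piles up in the end bins
```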
Framework for cascade size calculations on random networks
NASA Astrophysics Data System (ADS)
Burkholz, Rebekka; Schweitzer, Frank
2018-04-01
We present a framework to calculate the cascade size evolution for a large class of cascade models on random network ensembles in the limit of infinite network size. Our method is exact and applies to network ensembles with almost arbitrary degree distribution, degree-degree correlations, and, in the case of threshold models, arbitrary threshold distribution. With our approach, we shift the perspective from the known branching-process approximations to the iterative update of suitable probability distributions. Such distributions are key to capture cascade dynamics that involve possibly continuous quantities and that depend on the cascade history, e.g., if load is accumulated over time. As a proof of concept, we provide two examples: (a) constant-load models that cover many of the analytically tractable cascade models, and, as a highlight, (b) a fiber bundle model that was not tractable by branching-process approximations before. Our derivations cover the whole cascade dynamics, not only their steady state. This allows us to include interventions in time or further model complexity in the analysis.
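For flavor, the sketch below iterates the standard self-consistency equations for the Watts threshold model on a configuration-model network with Poisson degrees; this is the simple constant-threshold special case, not the paper's distribution-update framework, and all parameter values are illustrative.

```python
import numpy as np
from scipy.stats import binom, poisson

z, phi, rho0, kmax = 4.0, 0.18, 1e-3, 60     # mean degree, threshold, seed fraction
k = np.arange(kmax + 1)
pk = poisson.pmf(k, z)                        # degree distribution
qk = k*pk/z                                   # excess-degree (edge) weights

def prob_active(k_total, n_nbrs, q):
    """P(at least ceil(phi*k_total) of n_nbrs independently active neighbors)."""
    m = np.arange(n_nbrs + 1)
    act = m >= np.ceil(phi*k_total)
    return float((binom.pmf(m, n_nbrs, q)*act).sum())

q = rho0                                      # edge-activation probability
for _ in range(300):                          # fixed-point iteration
    q = rho0 + (1 - rho0)*sum(qk[d]*prob_active(d, d - 1, q)
                              for d in range(1, kmax + 1))
rho = rho0 + (1 - rho0)*sum(pk[d]*prob_active(d, d, q)
                            for d in range(1, kmax + 1))
print("final cascade size rho =", round(rho, 4))
```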
Randomized subspace-based robust principal component analysis for hyperspectral anomaly detection
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Li, Jialin; Zhang, Dianfa
2018-01-01
A randomized subspace-based robust principal component analysis (RSRPCA) method for anomaly detection in hyperspectral imagery (HSI) is proposed. RSRPCA combines the advantages of randomized column subspaces and robust principal component analysis (RPCA). It assumes that the background has low-rank properties and that the anomalies are sparse and do not lie in the column subspace of the background. First, RSRPCA implements random sampling to sketch the original HSI dataset from columns and to construct a randomized column subspace of the background. Structured random projections are also adopted to sketch the HSI dataset from rows. Sketching from columns and rows greatly reduces the computational requirements of RSRPCA. Second, RSRPCA adopts columnwise RPCA (CWRPCA) to eliminate the negative effects of sampled anomaly pixels and to purify the previously constructed randomized column subspace by removing sampled anomaly columns. The CWRPCA decomposes the submatrix of the HSI data into a low-rank matrix (the background component), a noisy matrix (the noise component), and a sparse anomaly matrix (the anomaly component) with only a small proportion of nonzero columns. The inexact augmented Lagrange multiplier algorithm is utilized to optimize the CWRPCA problem and estimate the sparse matrix. Nonzero columns of the sparse anomaly matrix point to sampled anomaly columns in the submatrix. Third, all pixels are projected onto the complement of the purified randomized column subspace of the background, and the anomaly pixels in the original HSI data are finally located exactly. Several experiments on three real hyperspectral images are carefully designed to investigate the detection performance of RSRPCA, and the results are compared with four state-of-the-art methods. Experimental results show that the proposed RSRPCA outperforms the four comparison methods in both detection performance and computational time.
Ensemble of ground subsidence hazard maps using fuzzy logic
NASA Astrophysics Data System (ADS)
Park, Inhye; Lee, Jiyeong; Saro, Lee
2014-06-01
Hazard maps of ground subsidence around abandoned underground coal mines (AUCMs) in Samcheok, Korea, were constructed using fuzzy ensemble techniques and a geographical information system (GIS). To evaluate the factors related to ground subsidence, a spatial database was constructed from topographic, geologic, mine tunnel, land use, groundwater, and ground subsidence maps. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 70/30 for training and validation of the models. The relationships between the detected ground-subsidence area and the factors were identified and quantified by frequency ratio (FR), logistic regression (LR) and artificial neural network (ANN) models. The relationships were used as factor ratings in the overlay analysis to create ground-subsidence hazard indexes and maps. The three GSH maps were then used as new input factors and integrated using fuzzy-ensemble methods to make better hazard maps. All of the hazard maps were validated by comparison with known subsidence areas that were not used directly in the analysis. As a result, the ensemble model was found to be more effective in terms of prediction accuracy than the individual models.
Yang, Runtao; Zhang, Chengjin; Gao, Rui; Zhang, Lina
2015-09-07
Antifreeze proteins (AFPs) play a pivotal role in the antifreeze effect of overwintering organisms. They have a wide range of applications in numerous fields, such as improving the production of crops and the quality of frozen foods. Accurate identification of AFPs may provide important clues to decipher the underlying mechanisms of AFPs in ice-binding and to facilitate the selection of the most appropriate AFPs for several applications. Based on an ensemble learning technique, this study proposes an AFP identification system called AFP-Ensemble. In this system, random forest classifiers are trained by different training subsets and then aggregated into a consensus classifier by majority voting. The resulting predictor yields a sensitivity of 0.892, a specificity of 0.940, an accuracy of 0.938 and a balanced accuracy of 0.916 on an independent dataset, which are far better than the results obtained by previous methods. These results reveal that AFP-Ensemble is an effective and promising predictor for large-scale determination of AFPs. The detailed feature analysis in this study may give useful insights into the molecular mechanisms of AFP-ice interactions and provide guidance for the related experimental validation. A web server has been designed to implement the proposed method.
Can single classifiers be as useful as model ensembles to produce benthic seabed substratum maps?
NASA Astrophysics Data System (ADS)
Turner, Joseph A.; Babcock, Russell C.; Hovey, Renae; Kendrick, Gary A.
2018-05-01
Numerous machine-learning classifiers are available for benthic habitat map production, which can lead to different results. This study highlights the performance of the Random Forest (RF) classifier, which was significantly better than Classification Trees (CT), Naïve Bayes (NB), and a multi-model ensemble in terms of overall accuracy, Balanced Error Rate (BER), Kappa, and area under the curve (AUC) values. RF accuracy was often higher than 90% for each substratum class, even at the most detailed level of the substratum classification and AUC values also indicated excellent performance (0.8-1). Total agreement between classifiers was high at the broadest level of classification (75-80%) when differentiating between hard and soft substratum. However, this sharply declined as the number of substratum categories increased (19-45%) including a mix of rock, gravel, pebbles, and sand. The model ensemble, produced from the results of all three classifiers by majority voting, did not show any increase in predictive performance when compared to the single RF classifier. This study shows how a single classifier may be sufficient to produce benthic seabed maps and model ensembles of multiple classifiers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahn, Tae-Hyuk; Sandu, Adrian; Watson, Layne T.
2015-08-01
Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model are consistent with the theoretical analysis. It is especially significant that there is a provable global decrease in load imbalance for the local rebalancing algorithms, given the scalability concerns for the global rebalancing algorithms. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%.
Data-driven probability concentration and sampling on manifold
DOE Office of Scientific and Technical Information (OSTI.GOV)
Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu
2016-09-15
A new methodology is proposed for generating realizations of a random vector with values in a finite-dimensional Euclidean space that are statistically consistent with a dataset of observations of this vector. The probability distribution of this random vector, while a priori not known, is presumed to be concentrated on an unknown subset of the Euclidean space. A random matrix is introduced whose columns are independent copies of the random vector and for which the number of columns is the number of data points in the dataset. The approach is based on the use of (i) the multidimensional kernel-density estimation method for estimating the probability distribution of the random matrix, (ii) a MCMC method for generating realizations for the random matrix, (iii) the diffusion-maps approach for discovering and characterizing the geometry and the structure of the dataset, and (iv) a reduced-order representation of the random matrix, which is constructed using the diffusion-maps vectors associated with the first eigenvalues of the transition matrix relative to the given dataset. The convergence aspects of the proposed methodology are analyzed and a numerical validation is explored through three applications of increasing complexity. The proposed method is found to be robust to noise levels and data complexity as well as to the intrinsic dimension of data and the size of experimental datasets. Both the methodology and the underlying mathematical framework presented in this paper contribute new capabilities and perspectives at the interface of uncertainty quantification, statistical data analysis, stochastic modeling and associated statistical inverse problems.
2012-03-01
with each SVM discriminating between a pair of the N total speakers in the data set. The N(N + 1)/2 classifiers then vote on the final...classification of a test sample. The Random Forest classifier is an ensemble classifier that votes amongst decision trees generated with each node using...Forest vote, and the effects of overtraining will be mitigated by the fact that each decision tree is overtrained differently (due to the random
Diffusion in random networks: Asymptotic properties, and numerical and engineering approximations
NASA Astrophysics Data System (ADS)
Padrino, Juan C.; Zhang, Duan Z.
2016-11-01
The ensemble phase averaging technique is applied to model mass transport by diffusion in random networks. The system consists of an ensemble of random networks, where each network is made of a set of pockets connected by tortuous channels. Inside a channel, we assume that fluid transport is governed by the one-dimensional diffusion equation. Mass balance leads to an integro-differential equation for the pore mass density. The so-called dual porosity model is found to be equivalent to the leading-order approximation of the integration kernel when the diffusion time scale inside the channels is small compared to the macroscopic time scale. As a test problem, we consider one-dimensional mass diffusion in a semi-infinite domain, whose solution is sought numerically. Because of the time required to establish the linear concentration profile inside a channel, for early times the similarity variable is xt^{-1/4} rather than xt^{-1/2} as in the traditional theory. This early-time sub-diffusive similarity can be explained by random-walk theory through the network. In addition, by applying concepts of fractional calculus, we show that, for small time, the governing equation reduces to a fractional diffusion equation with known solution. We recast this solution in terms of special functions that are easier to compute. Comparison of the numerical and exact solutions shows excellent agreement.
Nonlinear consolidation in randomly heterogeneous highly compressible aquitards
NASA Astrophysics Data System (ADS)
Zapata-Norberto, Berenice; Morales-Casique, Eric; Herrera, Graciela S.
2018-05-01
Severe land subsidence due to groundwater extraction may occur in multiaquifer systems where highly compressible aquitards are present. The highly compressible nature of the aquitards leads to nonlinear consolidation where the groundwater flow parameters are stress-dependent. The case is further complicated by the heterogeneity of the hydrogeologic and geotechnical properties of the aquitards. The effect of realistic vertical heterogeneity of hydrogeologic and geotechnical parameters on the consolidation of highly compressible aquitards is investigated by means of one-dimensional Monte Carlo numerical simulations where the lower boundary represents the effect of an instant drop in hydraulic head due to groundwater pumping. Two thousand realizations are generated for each of the following parameters: hydraulic conductivity (K), compression index (Cc), void ratio (e), and m (an empirical parameter relating hydraulic conductivity and void ratio). The correlation structure, the mean, and the variance for each parameter were obtained from a literature review of field studies in the lacustrine sediments of Mexico City. The results indicate that among the parameters considered, random K has the largest effect on the ensemble average behavior of the system when compared to a nonlinear consolidation model with deterministic initial parameters. The deterministic solution underestimates the ensemble average of total settlement when initial K is random. In addition, random K leads to the largest variance (and therefore largest uncertainty) of total settlement, groundwater flux, and time to reach steady-state conditions.
Disentangling giant component and finite cluster contributions in sparse random matrix spectra.
Kühn, Reimer
2016-04-01
We describe a method for disentangling giant component and finite cluster contributions to sparse random matrix spectra, using sparse symmetric random matrices defined on Erdős-Rényi graphs as an example and test bed. Our methods apply to sparse matrices defined in terms of arbitrary graphs in the configuration model class, as long as they have finite mean degree.
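A direct numerical illustration of the decomposition (assuming networkx is available): build a sparse Erdős-Rényi graph above the percolation threshold (mean degree c > 1), so that a giant component coexists with finite clusters, and collect adjacency spectra for the two parts separately.

```python
import numpy as np
import networkx as nx

N, c = 2000, 2.0                                   # nodes, mean degree
G = nx.fast_gnp_random_graph(N, c/N, seed=0)
comps = sorted(nx.connected_components(G), key=len, reverse=True)

giant_ev, finite_ev = [], []
for i, nodes in enumerate(comps):
    A = nx.to_numpy_array(G.subgraph(nodes))       # adjacency of one component
    ev = np.linalg.eigvalsh(A)
    (giant_ev if i == 0 else finite_ev).extend(ev)

print("giant component: %d eigenvalues" % len(giant_ev))
print("finite clusters: %d eigenvalues" % len(finite_ev))
print("largest finite-cluster |eigenvalue|:", max(abs(e) for e in finite_ev))
```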
Anisotropic properties of the enamel organic extracellular matrix.
do Espírito Santo, Alexandre R; Novaes, Pedro D; Line, Sérgio R P
2006-05-01
Enamel biosynthesis is initiated by the secretion, processing, and self-assembly of a complex mixture of proteins. This supramolecular ensemble controls the nucleation of the crystalline mineral phase. The detection of anisotropic properties by polarizing microscopy has been extensively used to detect macromolecular organization in ordinary histological sections. The aim of this work was to study the birefringence of the enamel organic matrix during the development of rat molar and incisor teeth. Incisor and molar teeth of rats were fixed in 2% paraformaldehyde/0.5% glutaraldehyde in 0.2 M phosphate-buffered saline (PBS), pH 7.2, and decalcified in 5% nitric acid/4% formaldehyde. After paraffin embedding, 5-μm-thick sections were obtained, treated with xylene, and hydrated. Form-birefringence curves were obtained after measuring optical retardations in imbibing media with different refractive indices. Our observations showed that the enamel organic matrix of rat incisor and molar teeth is strongly birefringent, presenting an ordered supramolecular structure. The birefringence starts during the early secretion phase and disappears at the maturation phase. The analysis of enamel organic matrix birefringence may be used to detect the effects of genetic and environmental factors on the supramolecular orientation of the enamel matrix and their effects on the structure of mature enamel.
NASA Astrophysics Data System (ADS)
Davies, Christine; Harrison, Judd; Lepage, G. Peter; Monahan, Christopher; Shigemitsu, Junko; Wingate, Matthew
2018-03-01
We present lattice QCD results for the matrix elements of R2 and other dimension-7, ΔB = 2 operators relevant for calculations of ΔΓs, the Bs - B̅s width difference. We have computed correlation functions using 5 ensembles of the MILC Collaboration's 2+1+1-flavour gauge field configurations, spanning 3 lattice spacings and light sea-quark masses down to the physical point. The HISQ action is used for the valence strange quarks, and the NRQCD action is used for the bottom quarks. Once our analysis is complete, the theoretical uncertainty in the Standard Model prediction for ΔΓs will be substantially reduced.
D → Klν semileptonic decay using lattice QCD with HISQ at physical pion masses
NASA Astrophysics Data System (ADS)
Chakraborty, Bipasha; Davies, Christine; Koponen, Jonna; Lepage, G. Peter
2018-03-01
The quark flavor sector of the Standard Model is a fertile ground to look for new physics effects through a unitarity test of the Cabibbo-Kobayashi-Maskawa (CKM) matrix. We present a lattice QCD calculation of the scalar and vector form factors (over a large q2 region including q2 = 0) associated with the D → Klν semileptonic decay. This calculation will then allow us to determine the central CKM matrix element Vcs in the Standard Model, by comparing the lattice QCD results for the form factors with the experimental decay rate. The form factor calculation has been performed on the Nf = 2+1+1 MILC HISQ ensembles with physical light quark masses.
NASA Astrophysics Data System (ADS)
Ou, Jiemei; Yang, Yuzhao; Lin, Wensheng; Yuan, Zhongke; Gan, Lin; Lin, Xiaofeng; Chen, Xudong; Chen, Yujie
2015-03-01
We investigated the transitions of conformations and their effects on the emission properties of poly[2-methoxy-5-(2'-ethyl-hexyloxy)-1,4-phenylene vinylene] (MEH-PPV) single molecules in a PMMA matrix during thermal annealing. Total internal reflection fluorescence microscopy measurements reveal the transformation from collapsed conformations to extended, highly ordered rod-like structures of MEH-PPV single molecules during thermal annealing. The blue shifts in the ensemble single-molecule PL spectra support our hypothesis. The transition occurs as the annealing temperature exceeds 100 °C, implying that an annealing temperature near the glass transition temperature Tg of the matrix is ideal for the control and optimization of blend polymer films.
Improved method for predicting protein fold patterns with ensemble classifiers.
Chen, W; Liu, X; Huang, Y; Jiang, Y; Zou, Q; Lin, C
2012-01-27
Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physical-chemical property of proteins and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five cross-validations. The accuracy rose to 77% when we used a 20-dimensional feature vector. These methods were used on recent data, with 54.2% accuracy. Source codes and dataset, together with web server and software tools for prediction, are available at: http://datamining.xmu.edu.cn/main/~cwc/ProteinPredict.html.
Li, Shaobo; Liu, Guokai; Tang, Xianghong; Lu, Jianguang; Hu, Jianjun
2017-07-28
Intelligent machine health monitoring and fault diagnosis are becoming increasingly important for modern manufacturing industries. Current fault diagnosis approaches mostly depend on expert-designed features for building prediction models. In this paper, we proposed IDSCNN, a novel bearing fault diagnosis algorithm based on ensemble deep convolutional neural networks and an improved Dempster-Shafer theory based evidence fusion. The convolutional neural networks take the root mean square (RMS) maps from the FFT (Fast Fourier Transformation) features of the vibration signals from two sensors as inputs. The improved D-S evidence theory is implemented via distance matrix from evidences and modified Gini Index. Extensive evaluations of the IDSCNN on the Case Western Reserve Dataset showed that our IDSCNN algorithm can achieve better fault diagnosis performance than existing machine learning methods by fusing complementary or conflicting evidences from different models and sensors and adapting to different load conditions.
High temperature phonon dispersion in graphene using classical molecular dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anees, P., E-mail: anees@igcar.gov.in; Panigrahi, B. K.; Valsakumar, M. C., E-mail: anees@igcar.gov.in
2014-04-24
Phonon dispersion and phonon density of states of graphene are calculated using classical molecular dynamics simulations. In this method, the dynamical matrix is constructed based on linear response theory by computing the displacements of atoms during the simulations. The computed phonon dispersions show excellent agreement with experiments. The simulations are done in both NVT and NPT ensembles at 300 K, and we find that the LO/TO modes harden at the Γ point. The NPT ensemble simulations capture the anharmonicity of the crystal accurately, and the hardening of the LO/TO modes is more pronounced. We also find that at 300 K the C-C bond length falls below its equilibrium value and the ZA bending mode frequency becomes imaginary close to Γ along the K-Γ direction, which indicates an instability of flat 2D graphene sheets.
Quantum storage of orbital angular momentum entanglement in an atomic ensemble.
Ding, Dong-Sheng; Zhang, Wei; Zhou, Zhi-Yuan; Shi, Shuai; Xiang, Guo-Yong; Wang, Xi-Shi; Jiang, Yun-Kun; Shi, Bao-Sen; Guo, Guang-Can
2015-02-06
Constructing a quantum memory for a photonic entanglement is vital for realizing quantum communication and network. Because of the inherent infinite dimension of orbital angular momentum (OAM), the photon's OAM has the potential for encoding a photon in a high-dimensional space, enabling the realization of high channel capacity communication. Photons entangled in orthogonal polarizations or optical paths had been stored in a different system, but there have been no reports on the storage of a photon pair entangled in OAM space. Here, we report the first experimental realization of storing an entangled OAM state through the Raman protocol in a cold atomic ensemble. We reconstruct the density matrix of an OAM entangled state with a fidelity of 90.3%±0.8% and obtain the Clauser-Horne-Shimony-Holt inequality parameter S of 2.41±0.06 after a programed storage time. All results clearly show the preservation of entanglement during the storage.
Agarwal, Jayant P; Mendenhall, Shaun D; Anderson, Layla A; Ying, Jian; Boucher, Kenneth M; Liu, Ting; Neumayer, Leigh A
2015-01-01
Recent literature has focused on the advantages and disadvantages of using acellular dermal matrix in breast reconstruction. Many of the reported data are from low level-of-evidence studies, leaving many questions incompletely answered. The present randomized trial provides high-level data on the incidence and severity of complications in acellular dermal matrix breast reconstruction between two commonly used types of acellular dermal matrix. A prospective randomized trial was conducted to compare outcomes of immediate staged tissue expander breast reconstruction using either AlloDerm or DermaMatrix. The impact of body mass index, smoking, diabetes, mastectomy type, radiation therapy, and chemotherapy on outcomes was analyzed. Acellular dermal matrix biointegration was analyzed clinically and histologically. Patient satisfaction was assessed by means of preoperative and postoperative surveys. Logistic regression models were used to identify predictors of complications. This article reports on the study design, surgical technique, patient characteristics, and preoperative survey results, with outcomes data in a separate report. After 2.5 years, we successfully enrolled and randomized 128 patients (199 breasts). The majority of patients were healthy nonsmokers, with 41 percent of patients receiving radiation therapy and 49 percent receiving chemotherapy. Half of the mastectomies were prophylactic, with nipple-sparing mastectomy common in both cancer and prophylactic cases. Preoperative survey results indicate that patients were satisfied with their premastectomy breast reconstruction education. Results from the Breast Reconstruction Evaluation Using Acellular Dermal Matrix as a Sling Trial will assist plastic surgeons in making evidence-based decisions regarding acellular dermal matrix-assisted tissue expander breast reconstruction. Therapeutic, II.
Partial transpose of random quantum states: Exact formulas and meanders
NASA Astrophysics Data System (ADS)
Fukuda, Motohisa; Śniady, Piotr
2013-04-01
We investigate the asymptotic behavior of the empirical eigenvalue distribution of the partial transpose of a random quantum state. The limiting distribution was previously investigated indirectly via Wishart random matrices (by approximating the matrix of trace 1 by a Wishart matrix of random trace) and shown to be the semicircular distribution or the free difference of two free Poisson distributions, depending on how the dimensions of the spaces concerned grow. Our use of Wishart matrices gives exact combinatorial formulas for the moments of the partial transpose of the random state. We find three natural asymptotic regimes in terms of geodesics on the permutation groups. Two of them correspond to the above two cases; the third turns out to be a new matrix model for the meander polynomials. Moreover, we prove the convergence to the semicircular distribution, together with its extreme eigenvalues, under weaker assumptions, and show a large deviation bound for the latter.
NASA Astrophysics Data System (ADS)
Siegel, Z.; Siegel, Edward Carl-Ludwig
2011-03-01
RANDOMNESS of Numbers cognitive-semantics DEFINITION VIA Cognition QUERY: WHAT???, NOT HOW?) VS. computer-"science" mindLESS number-crunching (Harrel-Sipser-...) algorithmics Goldreich "PSEUDO-randomness" [Not.AMS(02)] mea-culpa is ONLY via MAXWELL-BOLTZMANN CLASSICAL-STATISTICS (NOT FDQS!!!) "hot-plasma" REPULSION VERSUS Newcomb(1881)-Weyl(1914;1916)-Benford(1938) "NeWBe" logarithmic-law digit-CLUMPING/CLUSTERING NON-Randomness simple Siegel [AMS Joint.Mtg.(02)-Abs. #973-60-124] algebraic-inversion to THE QUANTUM and ONLY BEQS preferentially SEQUENTIALLY lower-DIGITS CLUMPING/CLUSTERING with d = 0 BEC, is ONLY VIA Siegel-Baez FUZZYICS=CATEGORYICS (SON OF TRIZ)/"Category-Semantics" (C-S), latter intersection/union of Lawvere(1964)-Siegel(1964) category-theory (matrix: MORPHISMS V FUNCTORS) "+" "cognitive-semantics" (matrix: ANTONYMS V SYNONYMS) yields Siegel-Baez FUZZYICS=CATEGORYICS/C-S tabular list-format matrix truth-table analytics: MBCS RANDOMNESS TRUTH/EMET!!!
QCD-inspired spectra from Blue's functions
NASA Astrophysics Data System (ADS)
Nowak, Maciej A.; Papp, Gábor; Zahed, Ismail
1996-02-01
We use the law of addition in random matrix theory to analyze the spectral distributions of a variety of chiral random matrix models inspired by QCD, whether through symmetries or models. In terms of the Blue's functions recently discussed by Zee, we show that most of the spectral distributions in the macroscopic limit and the quenched approximation follow algebraically from the discontinuity of a pertinent solution to a cubic (Cardano) or a quartic (Ferrari) equation. We use the end-point equation of the energy spectra in chiral random matrix models to argue for novel phase structures, in which the Dirac density of states plays the role of an order parameter.
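The Cardano-class case can be made concrete numerically. The sketch below is a stand-in under our own assumptions rather than any specific model from the paper: a Gaussian (semicircular, unit-variance) random part added freely to a deterministic +/-m "mass" term. The subordination relation G(z) = G_A(z - G(z)) then closes into the cubic G^3 - 2 z G^2 + (z^2 - m^2 + 1) G - z = 0, and the density of states is read off the discontinuity of the physical root:

```python
import numpy as np

def dirac_density(lam: np.ndarray, m: float) -> np.ndarray:
    """Spectral density for H = W + A: W semicircular (unit variance),
    A deterministic with eigenvalues +/- m in equal proportion."""
    rho = np.empty_like(lam)
    for i, x in enumerate(lam):
        z = x + 1e-9j
        # Roots of G^3 - 2 z G^2 + (z^2 - m^2 + 1) G - z = 0.
        roots = np.roots([1.0, -2.0 * z, z * z - m * m + 1.0, -z])
        g = roots[np.argmin(roots.imag)]      # physical branch: Im G <= 0
        rho[i] = max(-g.imag / np.pi, 0.0)    # rho = -Im G / pi
    return rho

lam = np.linspace(-3.0, 3.0, 601)
print(dirac_density(lam, m=0.5).max())        # m = 0 recovers the semicircle
```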
Universality in chaos: Lyapunov spectrum and random matrix theory.
Hanada, Masanori; Shimada, Hidehiko; Tezuka, Masaki
2018-02-01
We propose the existence of a new universality in classical chaotic systems when the number of degrees of freedom is large: the statistical property of the Lyapunov spectrum is described by random matrix theory. We demonstrate it by studying the finite-time Lyapunov exponents of the matrix model of a stringy black hole and the mass-deformed models. The massless limit, which has a dual string theory interpretation, is special in that the universal behavior can be seen already at t=0, while in other cases it sets in at late time. The same pattern is demonstrated also in the product of random matrices.
Social patterns revealed through random matrix theory
NASA Astrophysics Data System (ADS)
Sarkar, Camellia; Jalan, Sarika
2014-11-01
Despite tremendous advances in the field of network theory, very few studies have taken into account the weights of interactions, which emerge naturally in all real-world systems. Using random matrix analysis of a weighted social network, we demonstrate the profound impact of interaction weights on emerging structural properties. The analysis reveals that randomness present in a particular time frame affects the decisions of individuals, affording them more freedom of choice in situations of financial security. While the structural organization of the networks remains the same across all datasets, random matrix theory provides insight into the interaction patterns of individuals in situations of crisis. The analysis also suggests that individual accountability, in terms of weighted interactions, remains a key to success unless segregation of tasks comes into play.
Zhang, Du; Su, Neil Qiang; Yang, Weitao
2017-07-20
The GW self-energy, especially G0W0 based on the particle-hole random phase approximation (phRPA), is widely used to study quasiparticle (QP) energies. Motivated by the desirable features of the particle-particle (pp) RPA compared to the conventional phRPA, we explore the pp counterpart of GW, that is, the T-matrix self-energy, formulated with the eigenvectors and eigenvalues of the ppRPA matrix. We demonstrate the accuracy of the T-matrix method for molecular QP energies, highlighting the importance of the pp channel for calculating QP spectra.
NASA Astrophysics Data System (ADS)
Hu, Guiqiang; Xiao, Di; Wang, Yong; Xiang, Tao; Zhou, Qing
2017-11-01
Recently, a new kind of image encryption approach combining compressive sensing (CS) and double random phase encoding has received much attention because of advantages such as compressibility and robustness. However, this approach is vulnerable to chosen-plaintext attack (CPA) if the CS measurement matrix is re-used. Designing an efficient measurement-matrix updating mechanism that ensures resistance to CPA is therefore of practical significance. In this paper, we provide a novel solution that updates the CS measurement matrix by altering the secret sparse basis with the help of a counter mode of operation. In particular, the secret sparse basis is implemented by a reality-preserving fractional cosine transform matrix. Compared with a conventional CS-based cryptosystem that regenerates all the random entries of the measurement matrix, our scheme offers superior efficiency while guaranteeing resistance to CPA. Experimental and analysis results show that the proposed scheme has good security performance and is robust against noise and occlusion.
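A minimal sketch of the counter-driven update idea, with an orthogonal DCT standing in for the reality-preserving fractional cosine transform and all names ours; the measurement matrix Phi stays fixed while the effective sensing matrix changes with every counter value:

```python
import hashlib
import numpy as np
from scipy.fftpack import dct

def updated_sensing_matrix(phi: np.ndarray, key: bytes, counter: int) -> np.ndarray:
    """Refresh the effective sensing matrix A_k = Phi * Psi_k by re-deriving
    the secret sparse basis Psi_k from (key, counter)."""
    n = phi.shape[1]
    digest = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    psi = dct(np.eye(n), norm="ortho")[:, rng.permutation(n)]  # keyed basis variant
    return phi @ psi

phi = np.random.default_rng(0).standard_normal((32, 128)) / np.sqrt(32)
A0, A1 = (updated_sensing_matrix(phi, b"secret", c) for c in (0, 1))
print(np.allclose(A0, A1))  # False: the sensing matrix differs per plaintext
```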
Randomized Dynamic Mode Decomposition
NASA Astrophysics Data System (ADS)
Erichson, N. Benjamin; Brunton, Steven L.; Kutz, J. Nathan
2017-11-01
The dynamic mode decomposition (DMD) is an equation-free, data-driven matrix decomposition that is capable of providing accurate reconstructions of spatio-temporal coherent structures arising in dynamical systems. We present randomized algorithms to compute the near-optimal low-rank dynamic mode decomposition for massive datasets. Randomized algorithms are simple, accurate and able to ease the computational challenges arising with `big data'. Moreover, randomized algorithms are amenable to modern parallel and distributed computing. The idea is to derive a smaller matrix from the high-dimensional input data matrix using randomness as a computational strategy. Then, the dynamic modes and eigenvalues are accurately learned from this smaller representation of the data, whereby the approximation quality can be controlled via oversampling and power iterations. Here, we present randomized DMD algorithms that are categorized by how many passes the algorithm takes through the data. Specifically, the single-pass randomized DMD does not require data to be stored for subsequent passes. Thus, it is possible to approximately decompose massive fluid flows (stored out of core memory, or not stored at all) using single-pass algorithms, which is infeasible with traditional DMD algorithms.
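A compact numpy sketch of the basic single-sketch randomized DMD idea described here (our variable names; oversampling p and power iterations q control the approximation quality):

```python
import numpy as np

def randomized_dmd(X, Y, r, p=10, q=2):
    """X, Y: snapshot matrices with columns x_k and y_k = F(x_k).
    r: target rank, p: oversampling, q: power iterations."""
    # Sketch the range of X with a Gaussian test matrix (re-orthonormalize
    # between power iterations for numerical stability in practice).
    Z = X @ np.random.standard_normal((X.shape[1], r + p))
    for _ in range(q):
        Z = X @ (X.T @ Z)
    Q, _ = np.linalg.qr(Z)
    # Solve the small projected DMD problem.
    Xs, Ys = Q.T @ X, Q.T @ Y
    U, s, Vh = np.linalg.svd(Xs, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    Atilde = U.T @ Ys @ Vh.T @ np.diag(1.0 / s)
    evals, W = np.linalg.eig(Atilde)
    modes = Q @ (Ys @ Vh.T @ np.diag(1.0 / s) @ W)   # lift exact DMD modes back
    return evals, modes
```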
Analytical solution of a stochastic model of risk spreading with global coupling
NASA Astrophysics Data System (ADS)
Morita, Satoru; Yoshimura, Jin
2013-11-01
We study a stochastic matrix model to understand the mechanics of risk spreading (or bet hedging) by dispersion. Up to now, this model has mostly been treated numerically, except in the well-mixed case. Here, we present an analytical result showing that optimal dispersion leads to Zipf's law. Moreover, we find that the arithmetic ensemble average of the total growth rate converges to the geometric one because the sample size is finite.
Leptonic-decay-constant ratio f(K+)/f(π+) from lattice QCD with physical light quarks.
Bazavov, A; Bernard, C; DeTar, C; Foley, J; Freeman, W; Gottlieb, Steven; Heller, U M; Hetrick, J E; Kim, J; Laiho, J; Levkova, L; Lightman, M; Osborn, J; Qiu, S; Sugar, R L; Toussaint, D; Van de Water, R S; Zhou, R
2013-04-26
A calculation of the ratio of leptonic decay constants f(K+)/f(π+) makes possible a precise determination of the ratio of Cabibbo-Kobayashi-Maskawa (CKM) matrix elements |V(us)|/|V(ud)| in the standard model, and places a stringent constraint on the scale of new physics that would lead to deviations from unitarity in the first row of the CKM matrix. We compute f(K+)/f(π+) numerically in unquenched lattice QCD using recently generated gauge-field ensembles that include four flavors of dynamical quarks: up, down, strange, and charm. We analyze data at four lattice spacings a ≈ 0.06, 0.09, 0.12, and 0.15 fm with simulated pion masses down to the physical value of 135 MeV. We obtain f(K+)/f(π+) = 1.1947(26)(37), where the errors are statistical and total systematic, respectively. This is the first physics result from our N(f) = 2+1+1 ensembles, and the first calculation of f(K+)/f(π+) from lattice-QCD simulations at the physical point. Our result is the most precise lattice-QCD determination of f(K+)/f(π+), with an error comparable to the current world average. When combined with experimental measurements of the leptonic branching fractions, it leads to a precise determination of |V(us)|/|V(ud)| = 0.2309(9)(4), where the errors are theoretical and experimental, respectively.
Deterministic Mean-Field Ensemble Kalman Filtering
Law, Kody J. H.; Tembine, Hamidou; Tempone, Raul
2016-05-03
The proof of convergence of the standard ensemble Kalman filter (EnKF) from Le Gland, Monbet, and Tran [Large sample asymptotics for the ensemble Kalman filter, in The Oxford Handbook of Nonlinear Filtering, Oxford University Press, Oxford, UK, 2011, pp. 598--631] is extended to non-Gaussian state-space models. In this paper, a density-based deterministic approximation of the mean-field limit EnKF (DMFEnKF) is proposed, consisting of a PDE solver and a quadrature rule. Given a certain minimal order of convergence κ between the two, this extends to the deterministic filter approximation, which is therefore asymptotically superior to standard EnKF for dimension d
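For orientation, the standard stochastic (perturbed-observation) EnKF analysis step whose mean-field limit is studied here can be sketched as follows; a minimal numpy version with our own notation:

```python
import numpy as np

def enkf_analysis(E, H, y, R, rng):
    """E: (d, m) state ensemble, H: (k, d) observation operator,
    y: (k,) observation, R: (k, k) observation error covariance."""
    d, m = E.shape
    A = E - E.mean(axis=1, keepdims=True)          # ensemble anomalies
    HA = H @ A
    S = HA @ HA.T / (m - 1) + R                    # innovation covariance
    K = A @ HA.T / (m - 1) @ np.linalg.inv(S)      # ensemble Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
    return E + K @ (Y - H @ E)                     # perturbed-observation update
```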
Yu, Hualong; Ni, Jun
2014-01-01
Training classifiers on skewed data is a technically challenging task, and it becomes more difficult when the data are simultaneously high-dimensional. Skewed data types often appear in the biomedical field. In this study, we address this problem by combining the asymmetric bagging ensemble classifier (asBagging) presented in previous work with an improved random subspace (RS) generation strategy called feature subspace (FSS). Specifically, FSS is a novel method to improve the balance between accuracy and diversity of the base classifiers in asBagging. In view of its strong generalization capability, we adopt the support vector machine (SVM) as the base classifier. Extensive experiments on four benchmark biomedical data sets indicate that the proposed ensemble learning method outperforms many baseline approaches in terms of accuracy, F-measure, G-mean and AUC, so it can be regarded as an effective and efficient tool for dealing with high-dimensional and imbalanced biomedical data.
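A sketch of the asBagging-with-feature-subspace idea in scikit-learn terms (our names; assumes the minority class is labeled 1 and has fewer samples than the majority class):

```python
import numpy as np
from sklearn.svm import SVC

def asbagging_fss(X, y, n_estimators=20, subspace_frac=0.3, seed=0):
    """Each SVM sees all minority samples, an equal-size random draw of
    majority samples, and a random subspace of the features."""
    rng = np.random.default_rng(seed)
    minority, majority = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    d = X.shape[1]
    members = []
    for _ in range(n_estimators):
        rows = np.concatenate([minority,
                               rng.choice(majority, size=minority.size, replace=False)])
        cols = rng.choice(d, size=max(1, int(subspace_frac * d)), replace=False)
        members.append((SVC(kernel="rbf").fit(X[np.ix_(rows, cols)], y[rows]), cols))
    return members

def predict(members, X):
    votes = np.mean([clf.predict(X[:, cols]) for clf, cols in members], axis=0)
    return (votes >= 0.5).astype(int)              # majority vote of the ensemble
```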
Learning ensemble classifiers for diabetic retinopathy assessment.
Saleh, Emran; Błaszczyński, Jerzy; Moreno, Antonio; Valls, Aida; Romero-Aroca, Pedro; de la Riva-Fernández, Sofia; Słowiński, Roman
2018-04-01
Diabetic retinopathy is one of the most common comorbidities of diabetes. Unfortunately, the recommended annual screening of the eye fundus of diabetic patients is too resource-consuming. It is therefore necessary to develop tools that may help doctors to determine each patient's risk of developing this condition, so that patients at low risk may be screened less frequently and the use of resources can be improved. This paper explores the use of two kinds of ensemble classifiers learned from data: fuzzy random forest and dominance-based rough set balanced rule ensemble. These classifiers use a small set of attributes representing the main risk factors to determine whether a patient is at risk of developing diabetic retinopathy. The levels of specificity and sensitivity obtained in the presented study are over 80%. This study is thus a first successful step towards the construction of a personalized decision support system that could help physicians in daily clinical practice. Copyright © 2017 Elsevier B.V. All rights reserved.
Xu, Xinyu; Tian, Yu; Wang, Guolin; Tian, Xin
2014-08-15
Working memory (WM) refers to the temporary storage and manipulation of information necessary for the performance of complex cognitive tasks. There is growing interest in whether and how propofol anesthesia inhibits WM function. The aim of this study was to investigate the possible inhibition mechanism of propofol anesthesia from the viewpoint of single-neuron and neuronal ensemble activities. Adult SD rats were randomly divided into two groups: a propofol group (0.9 mg kg(-1) min(-1), 2 h via a tail vein catheter) and a control group. All rats were tested for working memory performance in a Y-maze rewarded-alternation task (a delayed non-matched-to-sample task) at 24, 48, and 72 h after propofol anesthesia, and the behavioral results of the WM tasks were recorded at the same time. Spatio-temporal trains of action potentials were obtained from the original signals. Single-neuron activity was characterized by peri-event time histogram analysis, and neuronal ensemble activities were characterized by Granger causality to describe the interactions within the ensemble. The results show that, compared with the control group, the percentage of neurons excited and related to WM was significantly decreased (p<0.01 at 24 h, p<0.05 at 48 h) and the interactions within the neuronal ensemble were significantly weakened (p<0.01 at 24 h, p<0.05 at 48 h), whereas there was no significant difference at 72 h (p>0.05); these findings were consistent with the behavioral results. They could lead to an improved understanding of the mechanism of anesthetic inhibition of WM functions from the viewpoint of single-neuron activity and neuronal ensemble interactions. Copyright © 2014 Elsevier B.V. All rights reserved.
Park, Jae Hong; Kim, Chang-Eop; Shin, Jaewoo; Im, Changkyun; Koh, Chin Su; Seo, In Seok; Kim, Sang Jeong; Shin, Hyung-Cheul
2013-10-01
Chronic monitoring of the state of the bladder can be used to notify patients with urinary dysfunction when the bladder should be voided. Given that many spinal neurons respond to both somatic and visceral inputs, it is necessary to extract bladder information selectively from the spinal cord. Here, we hypothesize that sensory information with distinct modalities should be represented by distinct ensemble activity patterns within the neuronal population and, therefore, that analyzing the activity patterns of the neuronal population could distinguish bladder fullness from somatic stimuli. We simultaneously recorded 26-27 single-unit activities in response to bladder distension or tactile stimuli in the dorsal spinal cord of each Sprague-Dawley rat. To discriminate between bladder fullness and tactile stimulus inputs, we analyzed the ensemble activity patterns of the entire neuronal population. A support vector machine (SVM) was employed as a classifier, and discrimination performance was measured by k-fold cross-validation tests. Most of the units responding to bladder fullness also responded to the tactile stimuli (88.9-100%). The SVM classifier precisely distinguished bladder fullness from the somatic input (100%), indicating that the ensemble activity patterns of the unit population in the spinal cord are distinct enough to identify the current input modality. Moreover, our ensemble activity pattern-based classifier showed high robustness against random losses of signals. This study is the first to demonstrate that the two main issues of electroneurographic monitoring of bladder fullness, low signal levels and selectivity, can be solved by an ensemble activity pattern-based approach, improving the feasibility of chronic monitoring of bladder fullness by neural recording.
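The core classification protocol reduces to a few lines with scikit-learn; a toy sketch with synthetic spike counts in place of the recorded data (all numbers illustrative only):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Stand-in ensemble activity: one row per trial, one column per recorded unit
# (e.g., spike counts); label 0 = tactile stimulus, 1 = bladder distension.
rng = np.random.default_rng(0)
X = rng.poisson(lam=5.0, size=(200, 27)).astype(float)
y = rng.integers(0, 2, size=200)

scores = cross_val_score(SVC(kernel="linear"), X, y, cv=10)  # k-fold CV
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```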
Clark, M.R.; Gangopadhyay, S.; Hay, L.; Rajagopalan, B.; Wilby, R.
2004-01-01
A number of statistical methods that are used to provide local-scale ensemble forecasts of precipitation and temperature do not contain realistic spatial covariability between neighboring stations or realistic temporal persistence for subsequent forecast lead times. To demonstrate this point, output from a global-scale numerical weather prediction model is used in a stepwise multiple linear regression approach to downscale precipitation and temperature to individual stations located in and around four study basins in the United States. Output from the forecast model is downscaled for lead times up to 14 days. Residuals in the regression equation are modeled stochastically to provide 100 ensemble forecasts. The precipitation and temperature ensembles from this approach have a poor representation of the spatial variability and temporal persistence. The spatial correlations for downscaled output are considerably lower than observed spatial correlations at short forecast lead times (e.g., less than 5 days) when there is high accuracy in the forecasts. At longer forecast lead times, the downscaled spatial correlations are close to zero. Similarly, the observed temporal persistence is only partly present at short forecast lead times. A method is presented for reordering the ensemble output in order to recover the space-time variability in precipitation and temperature fields. In this approach, the ensemble members for a given forecast day are ranked and matched with the rank of precipitation and temperature data from days randomly selected from similar dates in the historical record. The ensembles are then reordered to correspond to the original order of the selection of historical data. Using this approach, the observed intersite correlations, intervariable correlations, and the observed temporal persistence are almost entirely recovered. This reordering methodology also has applications for recovering the space-time variability in modeled streamflow. © 2004 American Meteorological Society.
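The reordering step is essentially a rank-matching operation (often called the Schaake shuffle); a minimal sketch for one lead time, with our own array conventions:

```python
import numpy as np

def reorder_ensemble(forecasts, historical):
    """forecasts, historical: (n_members, n_stations). At each station the
    sorted forecast values are reassigned so that their rank order matches
    the rank order of the historical draw, restoring space-time dependence."""
    out = np.empty_like(forecasts)
    for j in range(forecasts.shape[1]):
        sorted_fc = np.sort(forecasts[:, j])
        ranks = historical[:, j].argsort().argsort()   # rank of each member
        out[:, j] = sorted_fc[ranks]
    return out
```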
NASA Astrophysics Data System (ADS)
Dong, Lieqian; Wang, Deying; Zhang, Yimeng; Zhou, Datong
2017-09-01
Signal enhancement is a necessary step in seismic data processing. In this paper we utilize the complementary ensemble empirical mode decomposition (CEEMD) and the complex curvelet transform (CCT) to separate signal from random noise and thereby improve the signal-to-noise (S/N) ratio. First, the original noisy data are decomposed into a series of intrinsic mode function (IMF) profiles with the aid of CEEMD. The noisy IMFs are then transformed into the CCT domain. By choosing a different threshold for each IMF profile, based on its noise level, the noise in the original data can be suppressed. Finally, we illustrate the effectiveness of the approach on simulated and field datasets.
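A schematic of the two-stage pipeline, assuming the PyEMD package (EMD-signal) for CEEMDAN and using Fourier-domain soft thresholding as a simple stand-in for the complex curvelet transform:

```python
import numpy as np
from PyEMD import CEEMDAN   # from the EMD-signal package (assumed installed)

def denoise(trace, k=3.0):
    """Decompose into IMFs, soft-threshold each IMF in a transform domain,
    then recombine. The per-IMF threshold scales with its noise level."""
    imfs = CEEMDAN()(trace)                      # (n_imfs, n_samples)
    out = np.zeros_like(trace)
    for imf in imfs:
        F = np.fft.rfft(imf)
        thr = k * np.median(np.abs(F)) / 0.6745  # robust noise-level estimate
        F *= np.maximum(1.0 - thr / np.maximum(np.abs(F), 1e-12), 0.0)
        out += np.fft.irfft(F, n=trace.size)
    return out
```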
Path planning in uncertain flow fields using ensemble method
NASA Astrophysics Data System (ADS)
Wang, Tong; Le Maître, Olivier P.; Hoteit, Ibrahim; Knio, Omar M.
2016-10-01
An ensemble-based approach is developed to conduct optimal path planning in unsteady ocean currents under uncertainty. We focus our attention on two-dimensional steady and unsteady uncertain flows, and adopt a sampling methodology that is well suited to operational forecasts, where an ensemble of deterministic predictions is used to model and quantify uncertainty. In an operational setting, much about dynamics, topography, and forcing of the ocean environment is uncertain. To address this uncertainty, the flow field is parametrized using a finite number of independent canonical random variables with known densities, and the ensemble is generated by sampling these variables. For each of the resulting realizations of the uncertain current field, we predict the path that minimizes the travel time by solving a boundary value problem (BVP), based on the Pontryagin maximum principle. A family of backward-in-time trajectories starting at the end position is used to generate suitable initial values for the BVP solver. This allows us to examine and analyze the performance of the sampling strategy and to develop insight into extensions dealing with general circulation ocean models. In particular, the ensemble method enables us to perform a statistical analysis of travel times and consequently develop a path planning approach that accounts for these statistics. The proposed methodology is tested for a number of scenarios. We first validate our algorithms by reproducing simple canonical solutions, and then demonstrate our approach in more complex flow fields, including idealized, steady and unsteady double-gyre flows.
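The ensemble statistics are straightforward to assemble once a travel-time functional is available; a toy Monte Carlo sketch (our own construction: a fixed discretized path, a rotational current field with uncertain strength, travel time from the along-track ground speed):

```python
import numpy as np

def travel_time(path, current_fn, v0=1.0):
    """Travel time along a discretized path (N, 2) for one current realization."""
    seg = np.diff(path, axis=0)
    ds = np.linalg.norm(seg, axis=1)
    tang = seg / ds[:, None]
    u = current_fn(0.5 * (path[:-1] + path[1:]))           # currents at midpoints
    v_eff = np.clip(v0 + np.einsum("ij,ij->i", u, tang), 1e-6, None)
    return np.sum(ds / v_eff)

rng = np.random.default_rng(1)
path = np.column_stack([np.linspace(0, 1, 200)] * 2)
times = [travel_time(path, lambda p, a=a: a * np.column_stack([-p[:, 1], p[:, 0]]))
         for a in rng.normal(0.2, 0.05, size=500)]         # sampled gyre strengths
print(np.mean(times), np.percentile(times, [5, 95]))
```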
New scaling relation for information transfer in biological networks
Kim, Hyunju; Davies, Paul; Walker, Sara Imari
2015-01-01
We quantify characteristics of the informational architecture of two representative biological networks: the Boolean network model for the cell-cycle regulatory network of the fission yeast Schizosaccharomyces pombe (Davidich et al. 2008 PLoS ONE 3, e1672 (doi:10.1371/journal.pone.0001672)) and that of the budding yeast Saccharomyces cerevisiae (Li et al. 2004 Proc. Natl Acad. Sci. USA 101, 4781–4786 (doi:10.1073/pnas.0305937101)). We compare our results for these biological networks with the same analysis performed on ensembles of two different types of random networks: Erdös–Rényi and scale-free. We show that both biological networks share features in common that are not shared by either random network ensemble. In particular, the biological networks in our study process more information than the random networks on average. Both biological networks also exhibit a scaling relation in information transferred between nodes that distinguishes them from random, where the biological networks stand out as distinct even when compared with random networks that share important topological properties, such as degree distribution, with the biological network. We show that the most biologically distinct regime of this scaling relation is associated with a subset of control nodes that regulate the dynamics and function of each respective biological network. Information processing in biological networks is therefore interpreted as an emergent property of topology (causal structure) and dynamics (function). Our results demonstrate quantitatively how the informational architecture of biologically evolved networks can distinguish them from other classes of network architecture that do not share the same informational properties. PMID:26701883
Supermodeling With A Global Atmospheric Model
NASA Astrophysics Data System (ADS)
Wiegerinck, Wim; Burgers, Willem; Selten, Frank
2013-04-01
In weather and climate prediction studies, it often turns out that the multi-model ensemble mean prediction has the best prediction skill scores. One possible explanation is that the major part of the model error is random and is averaged out in the ensemble mean. In the standard multi-model ensemble approach, the models are integrated in time independently and the predicted states are combined a posteriori. Recently, an alternative ensemble prediction approach has been proposed in which the models exchange information during the simulation and synchronize on a common solution that is closer to the truth than any of the individual model solutions in the standard multi-model ensemble approach, or a weighted average of these. This is called the supermodeling approach (SUMO). Its potential has been demonstrated in the context of simple, low-order, chaotic dynamical systems. The information exchange takes the form of linear nudging terms in the dynamical equations that nudge the solution of each model towards the solutions of all other models in the ensemble. With a suitable choice of the connection strengths, the models synchronize on a common solution that is indeed closer to the true system than any of the individual model solutions without nudging. This variant is called connected SUMO. An alternative is to integrate a weighted-average model, weighted SUMO: at each time step all models in the ensemble calculate their tendencies, these tendencies are weighted and averaged, and the state is integrated one time step into the future with this weighted-average tendency. It has been shown that when the connected SUMO synchronizes perfectly, it follows the weighted-average trajectory and the two approaches yield the same solution. In this study we pioneer both approaches in the context of a global, quasi-geostrophic, three-level atmosphere model that simulates the extra-tropical circulation in the Northern Hemisphere winter quite realistically.
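The weighted-SUMO step is particularly easy to state in code; a toy sketch with two hypothetical "models" of the same oscillator (forward Euler for clarity only):

```python
import numpy as np

def weighted_sumo_step(state, tendencies, weights, dt):
    """Every model evaluates its tendency at the shared state; the state is
    advanced with the weighted mean tendency."""
    mean_tendency = sum(w * f(state) for w, f in zip(weights, tendencies))
    return state + dt * mean_tendency

f1 = lambda x: np.array([x[1], -1.1 * x[0]])   # model 1: slightly stiff spring
f2 = lambda x: np.array([x[1], -0.9 * x[0]])   # model 2: slightly soft spring
x = np.array([1.0, 0.0])
for _ in range(1000):
    x = weighted_sumo_step(x, [f1, f2], np.array([0.5, 0.5]), dt=0.01)
print(x)
```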
NASA Astrophysics Data System (ADS)
Ali, Mumtaz; Deo, Ravinesh C.; Downs, Nathan J.; Maraseni, Tek
2018-07-01
Forecasting drought by means of the World Meteorological Organization-approved Standardized Precipitation Index (SPI) is a fundamental task in supporting socio-economic initiatives and effectively mitigating climate risk. This study aims to develop a robust drought modelling strategy to forecast multi-scalar SPI in drought-prone regions of Pakistan, where statistically significant lagged combinations of antecedent SPI are used to forecast future SPI. With an ensemble Adaptive Neuro-Fuzzy Inference System ('ensemble-ANFIS') executed via a 10-fold cross-validation procedure, a model is constructed from randomly partitioned input-target data. This yields 10 ensemble-ANFIS members, judged by mean square error and correlation coefficient in the training period; the optimal forecasts are attained by averaging the member simulations, and the model is benchmarked against the M5 Model Tree and Minimax Probability Machine Regression (MPMR). The results show that the proposed ensemble-ANFIS model was notably more precise (in terms of the root mean square and mean absolute errors, and the Willmott, Nash-Sutcliffe and Legates-McCabe indices) for the 6- and 12-month than for the 3-month forecasts, as verified by the largest proportion of errors registering in the smallest error band. Applying the 10-member simulations, the ensemble-ANFIS model was validated for its ability to forecast the severity (S), duration (D) and intensity (I) of drought (including the error bound). This enabled uncertainty between multiple models to be rationalized more efficiently, reducing the forecast error caused by stochasticity in drought behaviour. Through cross-validation at diverse sites, a geographic signature in the modelled uncertainties was also calculated. Considering the superiority of the ensemble-ANFIS approach and its ability to generate uncertainty-based information, the study advocates the versatility of a multi-model approach for drought-risk forecasting and its importance for estimating drought properties over confidence intervals to generate better information for strategic decision-making.
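The 10-member construction itself is model-agnostic; a minimal scikit-learn sketch with an MLP regressor standing in for ANFIS (which scikit-learn does not provide):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor   # stand-in for ANFIS

def ensemble_cv_forecast(X, y, X_new, n_splits=10):
    """Train one member per cross-validation fold; average their forecasts."""
    preds = []
    for train_idx, _ in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        member = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000)
        member.fit(X[train_idx], y[train_idx])
        preds.append(member.predict(X_new))
    return np.mean(preds, axis=0)                 # ensemble-mean forecast
```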
Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen
2016-04-07
As one type of post-translational modification (PTM), protein lysine succinylation is important in regulating a variety of biological processes; however, it is also involved in some diseases. Consequently, from the angles of both basic research and drug development, we face a challenging problem: for an uncharacterized protein sequence containing many Lys residues, which ones can be succinylated and which ones cannot? To address this problem, we developed a predictor called pSuc-Lys by (1) incorporating sequence-coupled information into the general pseudo amino acid composition, (2) balancing out the skewed training dataset by random sampling, and (3) constructing an ensemble predictor by fusing a series of individual random forest classifiers. Rigorous cross-validations indicated that it remarkably outperforms existing methods. A user-friendly web server for pSuc-Lys has been established at http://www.jci-bioinfo.cn/pSuc-Lys, by which users can easily obtain their desired results without needing to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can also be used to analyze many other problems in computational proteomics. Copyright © 2016 Elsevier Ltd. All rights reserved.
Quantum probabilistic logic programming
NASA Astrophysics Data System (ADS)
Balu, Radhakrishnan
2015-05-01
We describe a quantum-mechanics-based logic programming language that supports Horn clauses, random variables, and covariance matrices to express and solve problems in probabilistic logic. The Horn clauses of the language wrap random variables, including infinite-valued ones, to express probability distributions and statistical correlations, a powerful feature for capturing relationships between distributions that are not independent. The expressive power of the language is based on a mechanism for implementing statistical ensembles and solving the underlying SAT instances using quantum mechanical machinery. We exploit the fact that classical random variables have quantum decompositions to build the Horn clauses. We establish the semantics of the language in a rigorous fashion by considering an existing probabilistic logic language called PRISM, with classical probability measures defined on the Herbrand base, and extending it to the quantum context. In the classical case, H-interpretations form the sample space, and probability measures defined on them lead to a consistent definition of probabilities for well-formed formulae. In the quantum counterpart, we define probability amplitudes on H-interpretations, facilitating model generation and verification via quantum mechanical superpositions and entanglements. We cast the well-formed formulae of the language as quantum mechanical observables, thus providing an elegant interpretation for their probabilities. We discuss several examples that combine statistical ensembles and predicates of first-order logic to reason about situations involving uncertainty.
Transmembrane protein CD93 diffuses by a continuous time random walk.
NASA Astrophysics Data System (ADS)
Goiko, Maria; de Bruyn, John; Heit, Bryan
Molecular motion within the cell membrane is a poorly defined process. In this study, we characterized the diffusion of the transmembrane protein CD93. By careful analysis of the dependence of the ensemble-averaged mean squared displacement (EA-MSD, ⟨r²⟩) on time t, and of the ensemble-averaged time-averaged MSD (EA-TAMSD, ⟨δ²⟩) on lag time τ and total measurement time T, we showed that the motion of CD93 is well described by a continuous-time random walk (CTRW). CD93 tracks were acquired using single-particle tracking. The tracks were classified as confined or free, and the behavior of the MSD analyzed. The EA-MSDs of both populations grew nonlinearly with t, indicative of anomalous diffusion. Their EA-TAMSDs were found to depend on both τ and T, indicating non-ergodicity. Free molecules had ⟨r²⟩ ~ t^α and ⟨δ²⟩ ~ τ/T^(1-α), with α ≈ 0.5, consistent with a CTRW. Mean maximal excursion analysis supported this result. Confined CD93 had ⟨r²⟩ ~ t⁰ and ⟨δ²⟩ ~ (τ/T)^α, with α ≈ 0.3, consistent with a confined CTRW. CTRWs are described by a series of random jumps interspersed with power-law distributed waiting times, and may arise from interactions of CD93 with the endocytic machinery. NSERC.
Near-optimal matrix recovery from random linear measurements.
Romanov, Elad; Gavish, Matan
2018-06-25
In matrix recovery from random linear measurements, one is interested in recovering an unknown M-by-N matrix X0 from n measurements y_i = Tr(A_i^T X0), where each A_i is an M-by-N measurement matrix with i.i.d. random entries, i = 1, ..., n. We present a matrix recovery algorithm, based on approximate message passing, which iteratively applies an optimal singular-value shrinker, a nonconvex nonlinearity tailored specifically for matrix estimation. Our algorithm typically converges exponentially fast, offering a significant speedup over previously suggested matrix recovery algorithms, such as iterative solvers for nuclear norm minimization (NNM). It is well known that there is a recovery tradeoff between the information content of the object X0 to be recovered (specifically, its matrix rank r) and the number of linear measurements n from which recovery is to be attempted. The precise tradeoff between r and n, beyond which recovery by a given algorithm becomes possible, traces the so-called phase transition curve of that algorithm in the (r, n) plane. The phase transition curve of our algorithm is noticeably better than that of NNM. Interestingly, it is close to the information-theoretic lower bound for the minimal number of measurements needed for matrix recovery, making it not only state of the art in terms of convergence rate, but also near optimal in terms of the matrices it successfully recovers. Copyright © 2018 the Author(s). Published by PNAS.
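A simplified numpy caricature of the iterate-and-shrink structure: a gradient step on the data fit followed by a rank-r singular-value projection. Hard thresholding stands in for the optimal shrinker, and the Onsager correction that distinguishes full approximate message passing is omitted:

```python
import numpy as np

def iterative_sv_shrinkage(A_list, y, shape, rank, iters=200):
    """A_list: list of M-by-N measurement matrices; y: measurements."""
    X = np.zeros(shape)
    step = 1.0 / len(A_list)
    for _ in range(iters):
        # Gradient of the data-fit term sum_i (<A_i, X> - y_i)^2 / 2.
        r = np.array([np.sum(A * X) for A in A_list]) - y
        Z = X - step * sum(ri * A for ri, A in zip(r, A_list))
        # Shrink: keep only the top-r singular values.
        U, s, Vh = np.linalg.svd(Z, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vh[:rank]
    return X
```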
Tang, Shuaiqi; Zhang, Minghua; Xie, Shaocheng
2016-01-05
Large-scale atmospheric forcing data can greatly impact the simulations of atmospheric process models, including large eddy simulations (LES), cloud resolving models (CRMs) and single-column models (SCMs), and impact the development of physical parameterizations in global climate models. This study describes the development of an ensemble variationally constrained objective analysis of atmospheric large-scale forcing data and its application to evaluate the cloud biases in the Community Atmosphere Model (CAM5). Sensitivities of the variational objective analysis to background data, error covariance matrix and constraint variables are described and used to quantify the uncertainties in the large-scale forcing data. Application of the ensemble forcing in the CAM5 SCM during the March 2000 intensive operational period (IOP) at the Southern Great Plains (SGP) site of the Atmospheric Radiation Measurement (ARM) program shows systematic biases in the model simulations that cannot be explained by the uncertainty of the large-scale forcing data, which points to deficiencies of the physical parameterizations. The SCM is shown to overestimate high clouds and underestimate low clouds. These biases are found to also exist in the global simulation of CAM5 when it is compared with satellite data.
Quantifying radar-rainfall uncertainties in urban drainage flow modelling
NASA Astrophysics Data System (ADS)
Rico-Ramirez, M. A.; Liguori, S.; Schellart, A. N. A.
2015-09-01
This work presents the results of the implementation of a probabilistic system to model the uncertainty associated with radar rainfall (RR) estimates and the way this uncertainty propagates through the sewer system of an urban area in the north of England. The spatial and temporal correlations of the RR errors, as well as the error covariance matrix, were computed to build an RR error model able to generate RR ensembles that reproduce the uncertainty associated with the measured rainfall. The RR ensembles provide important information about the uncertainty in the rainfall measurement that can be propagated through the urban sewer system, and the measured flow peaks and flow volumes are often bounded within the uncertainty band produced by the RR ensembles. In 55% of the simulated events, the uncertainties in the RR measurements can explain the uncertainties observed in the simulated flow volumes. However, there are also events where the RR uncertainty cannot explain the whole uncertainty observed in the simulated flow volumes, indicating that additional sources of uncertainty must be considered, such as the uncertainty in the urban drainage model structure, in the calibrated model parameters, and in the measured sewer flows.
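The ensemble generator at the heart of such a system reduces to drawing spatially correlated noise from the estimated error covariance; a minimal log-multiplicative sketch (our own modelling choices):

```python
import numpy as np

def rainfall_error_ensemble(radar_field, C, n_members=50, seed=None):
    """radar_field: (n_pixels,) RR estimate; C: (n_pixels, n_pixels) error
    covariance in log space. Returns one ensemble member per column."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(C)                          # C = L L^T
    eps = L @ rng.standard_normal((C.shape[0], n_members))
    return radar_field[:, None] * np.exp(eps)          # correlated perturbations

# Toy example: exponential spatial correlation over 100 pixels on a transect.
d = np.abs(np.subtract.outer(np.arange(100), np.arange(100)))
members = rainfall_error_ensemble(np.full(100, 2.0), 0.1 * np.exp(-d / 20.0))
print(members.shape)   # (100, 50)
```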
Vorontsov, Ivan I; Miyashita, Osamu
2011-04-30
Complexes of two Cyanovirin-N (CVN) mutants, m4-CVN and P51G-m4-CVN, with deoxy di-mannose analogs were employed as models to generate conformational ensembles using explicit-water molecular dynamics (MD) simulations in solution and in the crystal environment. The results were used to evaluate binding free energies with the molecular mechanics Poisson-Boltzmann (or generalized Born) surface area, MM/PB(GB)SA, methods. The calculations ranked the affinities of the deoxy di-mannose ligands in agreement with the available qualitative experimental evidence. This confirms the importance of the hydrogen-bond network between the di-mannose 3'- and 4'-hydroxyl groups and the protein binding site B(M) as a basis of CVN's activity as an effective HIV fusion inhibitor. Comparison of binding free energies averaged over snapshots from the solution and crystal simulations showed high promise in using the crystal matrix to accelerate conformational ensemble generation, the most time-consuming step in the MM/PB(GB)SA approach. The correlation between energy values based on solution versus crystal ensembles is 0.95 for both the MM/PBSA and MM/GBSA methods. Copyright © 2010 Wiley Periodicals, Inc.
The cubic ternary complex receptor-occupancy model. III. resurrecting efficacy.
Weiss, J M; Morgan, P H; Lutz, M W; Kenakin, T P
1996-08-21
Early work in pharmacology characterized the interaction of receptors and ligands in terms of two parameters, affinity and efficacy, an approach we term the bipartite view. A precise formulation of efficacy only exists for very simple pharmacological models. Here we extend the notion of efficacy to models that incorporate receptor activation and G-protein coupling. Using the cubic ternary complex model, we show that efficacy is not purely a property of the ligand-receptor interaction; it also depends upon the distributional details of the receptor species in the native receptor ensemble. This suggests a distinction between what we call potential efficacy (a vector) and realized efficacy (a scalar). To each receptor species in the native receptor ensemble we assign a part-worth utility; taken together these utilities comprise the potential efficacy vector. Realized efficacy is the expectation of these part-worth utilities with respect to the frequency distribution of receptor species in the native receptor ensemble. In the parlance of statistical decision theory, the binding of a ligand to a receptor ensemble is a random prospect and realized efficacy is the utility of this prospect. We explore the implications that our definition of efficacy has for understanding agonism and in assessing the legitimacy of the bipartite view in pharmacology.
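In symbols (our notation, not necessarily the paper's): if receptor species i occurs in the native ensemble with frequency p_i and carries part-worth utility u_i, then

```latex
\mathbf{u} = (u_1, \dots, u_k) \quad \text{(potential efficacy: a vector)},
\qquad
\varepsilon_{\mathrm{realized}} = \mathbb{E}_p[u] = \sum_{i=1}^{k} p_i\, u_i
\quad \text{(realized efficacy: a scalar)}.
```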
Stochastic dynamics and mechanosensitivity of myosin II minifilaments
NASA Astrophysics Data System (ADS)
Albert, Philipp J.; Erdmann, Thorsten; Schwarz, Ulrich S.
2014-09-01
Tissue cells are in a state of permanent mechanical tension that is maintained mainly by myosin II minifilaments, which are bipolar assemblies of tens of myosin II molecular motors contracting actin networks and bundles. Here we introduce a stochastic model for myosin II minifilaments as two small myosin II motor ensembles engaging in a stochastic tug-of-war. Each of the two ensembles is described by the parallel cluster model that allows us to use exact stochastic simulations and at the same time to keep important molecular details of the myosin II cross-bridge cycle. Our simulation and analytical results reveal a strong dependence of myosin II minifilament dynamics on environmental stiffness that is reminiscent of the cellular response to substrate stiffness. For small stiffness, minifilaments form transient crosslinks exerting short spikes of force with negligible mean. For large stiffness, minifilaments form near permanent crosslinks exerting a mean force which hardly depends on environmental elasticity. This functional switch arises because dissociation after the power stroke is suppressed by force (catch bonding) and because ensembles can no longer perform the power stroke at large forces. Symmetric myosin II minifilaments perform a random walk with an effective diffusion constant which decreases with increasing ensemble size, as demonstrated for rigid substrates with an analytical treatment.
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999-2001 and 2004-2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
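The comparison is easy to reproduce in outline with scikit-learn; a sketch on synthetic data standing in for the hospital cohorts (AUC as the comparison metric; the restricted cubic splines are omitted here):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
models = {
    "tree":     DecisionTreeClassifier(max_depth=5),
    "bagging":  BaggingClassifier(DecisionTreeClassifier(), n_estimators=100),
    "forest":   RandomForestClassifier(n_estimators=100),
    "boosting": GradientBoostingClassifier(),
    "logistic": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:9s} AUC = {auc:.3f}")
```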
NASA Astrophysics Data System (ADS)
Li, Jiafu; Xiang, Shuiying; Wang, Haoning; Gong, Junkai; Wen, Aijun
2018-03-01
In this paper, a novel image encryption algorithm based on the synchronization of physical random bits generated in a cascade-coupled semiconductor ring laser (CCSRL) system is proposed, and its security is analyzed. In both the transmitter and the receiver, the CCSRL system is a master-slave configuration consisting of a master semiconductor ring laser (M-SRL) with cross-feedback and a solitary SRL (S-SRL). The proposed image encryption algorithm includes image preprocessing based on conventional chaotic maps, pixel confusion based on a control matrix extracted from the physical random bits, and pixel diffusion based on a random bit stream extracted from the physical random bits. First, the preprocessing method is used to eliminate the correlation between adjacent pixels. Second, physical random bits with verified randomness are generated based on chaos in the CCSRL system and are used to generate both the control matrix and the random bit stream. Finally, the control matrix and the random bit stream are used in the encryption algorithm to change the positions and the values of pixels, respectively. Simulation results and security analysis demonstrate that the proposed algorithm is effective and able to resist various typical attacks, and is thus an excellent candidate for secure image communication applications.
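The confusion and diffusion stages themselves are conventional once the physical random bits are in hand; a numpy sketch with a seeded generator standing in for the laser entropy source:

```python
import numpy as np

def encrypt(image, perm, keystream):
    """Confusion (permutation from the control matrix), then diffusion
    (XOR with the physical random bit stream)."""
    return (image.ravel()[perm] ^ keystream).reshape(image.shape)

def decrypt(cipher, perm, keystream):
    flat = cipher.ravel() ^ keystream
    out = np.empty_like(flat)
    out[perm] = flat                     # invert the permutation
    return out.reshape(cipher.shape)

rng = np.random.default_rng(42)          # stand-in for the CCSRL bit source
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
perm = rng.permutation(img.size)
ks = rng.integers(0, 256, size=img.size, dtype=np.uint8)
assert np.array_equal(decrypt(encrypt(img, perm, ks), perm, ks), img)
```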
NASA Astrophysics Data System (ADS)
Mishchenko, Michael I.; Liu, Li; Mackowski, Daniel W.
2013-07-01
We use state-of-the-art public-domain Fortran codes based on the T-matrix method to calculate orientation and ensemble averaged scattering matrix elements for a variety of morphologically complex black carbon (BC) and BC-containing aerosol particles, with a special emphasis on the linear depolarization ratio (LDR). We explain theoretically the quasi-Rayleigh LDR peak at side-scattering angles typical of low-density soot fractals and conclude that the measurement of this feature enables one to evaluate the compactness state of BC clusters and trace the evolution of low-density fluffy fractals into densely packed aggregates. We show that small backscattering LDRs measured with ground-based, airborne, and spaceborne lidars for fresh smoke generally agree with the values predicted theoretically for fluffy BC fractals and densely packed near-spheroidal BC aggregates. To reproduce higher lidar LDRs observed for aged smoke, one needs alternative particle models such as shape mixtures of BC spheroids or cylinders.
Matrix quantum mechanics on S1/Z2
NASA Astrophysics Data System (ADS)
Betzios, P.; Gürsoy, U.; Papadoulaki, O.
2018-03-01
We study matrix quantum mechanics on the Euclidean time orbifold S1/Z2. Upon Wick rotation to Lorentzian time and taking the double-scaling limit, this theory provides a toy model for a big-bang/big-crunch universe in two-dimensional non-critical string theory, where the orbifold fixed points become cosmological singularities. We derive the MQM partition function in both the canonical and grand canonical ensembles in two different formulations and demonstrate agreement between them. We pinpoint the contribution of twisted states in both formulations, either in terms of bi-local operators acting at the end-points of time or in terms of branch cuts on the complex plane. We calculate, in the matrix model, the contribution of the twisted states to the torus-level partition function explicitly and show that it precisely matches the world-sheet result, providing a non-trivial test of the proposed duality. Finally, we discuss some interesting features of the partition function and the possibility of realizing it as a τ-function of an integrable hierarchy.
NASA Astrophysics Data System (ADS)
Batté, Lauriane; Déqué, Michel
2016-06-01
Stochastic methods are increasingly used in global coupled model climate forecasting systems to account for model uncertainties. In this paper, we describe in more detail the stochastic dynamics technique introduced by Batté and Déqué (2012) in the ARPEGE-Climate atmospheric model. We present new results with an updated version of CNRM-CM using ARPEGE-Climate v6.1, and show that the technique can be used both as a means of analyzing model error statistics and as a means of accounting for model inadequacies in a seasonal forecasting framework. The perturbations are designed as corrections of model drift errors estimated from a preliminary weakly nudged re-forecast run over an extended reference period of 34 boreal winter seasons. A detailed statistical analysis of these corrections is provided and shows that they are mainly made of intra-month variance, thereby justifying their use as in-run perturbations of the model in seasonal forecasts. However, the interannual and systematic error correction terms cannot be neglected. Time correlation of the errors is limited, but some consistency is found between the errors of up to 3 consecutive days. These findings encourage us to test several settings of the random draws of perturbations in seasonal forecast mode. Perturbations are drawn randomly but consistently for all three prognostic variables perturbed. We explore the impact of using monthly mean perturbations throughout a given forecast month in a first ensemble re-forecast (SMM, for stochastic monthly means), and test the use of 5-day sequences of perturbations in a second ensemble re-forecast (S5D, for stochastic 5-day sequences). Both experiments are compared against a REF reference ensemble with initial perturbations only. Results in terms of forecast quality are contrasted depending on the region and variable of interest, but very few areas exhibit a clear degradation of forecasting skill with the introduction of stochastic dynamics. We highlight some positive impacts of the method, mainly in the Northern Hemisphere extra-tropics. The 500 hPa geopotential height bias is reduced, and improvements project onto the representation of North Atlantic weather regimes. A modest impact on ensemble spread is found over most regions, which suggests that this method could be complemented by other stochastic perturbation techniques in seasonal forecasting mode.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Altmeyer, Michaela; Guterding, Daniel; Hirschfeld, P. J.
2016-12-21
In the framework of a multiorbital Hubbard model description of superconductivity, a widely used matrix formulation of the superconducting pairing interaction is designed to treat spin, charge, and orbital fluctuations within a random phase approximation (RPA). In terms of Feynman diagrams, this takes into account particle-hole ladder and bubble contributions, as expected. It turns out, however, that this matrix formulation also generates additional terms which have the diagrammatic structure of vertex corrections. We examine these terms and discuss the relationship between the matrix-RPA superconducting pairing interaction and the Feynman diagrams that it sums.
Using random forests for assistance in the curation of G-protein coupled receptor databases.
Shkurin, Aleksei; Vellido, Alfredo
2017-08-18
Biology is experiencing a gradual but fast transformation from a laboratory-centred science towards a data-centred one. As such, it requires robust data engineering and the use of quantitative data analysis methods as part of database curation. This paper focuses on G protein-coupled receptors, a large and heterogeneous super-family of cell membrane proteins of interest to biology in general. One of its families, Class C, is of particular interest to pharmacology and drug design. This family is quite heterogeneous on its own, and the discrimination of its several sub-families is a challenging problem. In the absence of a known crystal structure, such discrimination must rely on their primary amino acid sequences. We are interested not so much in achieving maximum sub-family discrimination accuracy using quantitative methods as in exploring sequence misclassification behavior. Specifically, we are interested in isolating those sequences showing consistent misclassification, that is, sequences that are very often misclassified and almost always to the same wrong sub-family. Random forests are used for this analysis due to their ensemble nature, which makes them naturally suited to gauging the consistency of misclassification. This consistency is defined here through the voting scheme of their base tree classifiers. Detailed consistency results for the random forest ensemble classification were obtained for all receptors and for all data transformations of their unaligned primary sequences. Shortlists of the most consistently misclassified receptors for each sub-family and transformation, as well as an overall shortlist of cases consistently misclassified across transformations, were obtained. The latter should be referred to experts for further investigation as a data curation task. The automatic discrimination of the Class C sub-families of G protein-coupled receptors from their unaligned primary sequences shows clear limits. This study has investigated in some detail the consistency of their misclassification using random forest ensemble classifiers. Different sub-families have been shown to display very different discrimination consistency behaviors. The individual identification of consistently misclassified sequences should provide a quality-control tool for GPCR database curators.
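Measured through the voting scheme, misclassification consistency can be computed directly from a fitted forest; a scikit-learn sketch (our names, and for brevity it ignores the in-bag/out-of-bag distinction that a careful analysis would respect):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def misclassification_consistency(X, y, n_trees=500):
    """For each sequence, the fraction of trees voting for its single most
    popular *wrong* class; high values flag candidates for curation."""
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=0).fit(X, y)
    # Trees vote in encoded class indices; map back to the original labels.
    votes = np.stack([rf.classes_[t.predict(X).astype(int)]
                      for t in rf.estimators_], axis=1)
    report = []
    for i, true_label in enumerate(y):
        wrong = votes[i][votes[i] != true_label]
        if wrong.size == 0:
            report.append((0.0, None))
        else:
            cls, counts = np.unique(wrong, return_counts=True)
            report.append((counts.max() / n_trees, cls[counts.argmax()]))
    return report
```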
Random walks with long-range steps generated by functions of Laplacian matrices
NASA Astrophysics Data System (ADS)
Riascos, A. P.; Michelitsch, T. M.; Collet, B. A.; Nowakowski, A. F.; Nicolleau, F. C. G. A.
2018-04-01
In this paper, we explore different Markovian random walk strategies on networks with transition probabilities between nodes defined in terms of functions of the Laplacian matrix. We generalize random walk strategies with local information, encoded in the Laplacian matrix that describes the connections of a network, to dynamics determined by functions of this matrix. The resulting processes are non-local, allowing transitions of the random walker from one node to nodes beyond its nearest neighbors. We find that only two types of Laplacian functions are admissible, with distinct behaviors for long-range steps in the infinite network limit: type (i) functions generate Brownian motions, type (ii) functions Lévy flights. For this asymptotic long-range step behavior only the lowest non-vanishing order of the Laplacian function is relevant: first order for type (i), and fractional order for type (ii) functions. In the first part, we discuss spectral properties of the Laplacian matrix and a series of relations maintained by a particular type of function, which allow random walks to be defined on any undirected connected network. Having described these general properties, we explore the characteristics of random walk strategies that emerge from particular cases with functions defined in terms of exponentials, logarithms and powers of the Laplacian, as well as the relations of these dynamics to non-local strategies like Lévy flights and fractional transport. Finally, we analyze the global capacity of these random walk strategies to explore networks such as lattices, trees, and different types of random and complex networks.
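The construction is compact in code; a numpy/networkx sketch for the fractional power L^gamma (a type (ii) function in the paper's classification), where rows of the resulting matrix sum to one because L annihilates the constant vector:

```python
import networkx as nx
import numpy as np

def laplacian_power_walk(G, gamma=0.5):
    """Transition matrix of the long-range walk generated by L^gamma."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    w, V = np.linalg.eigh(L)
    Lg = (V * np.maximum(w, 0.0) ** gamma) @ V.T   # L^gamma by spectral calculus
    k = np.diag(Lg)                                 # generalized degrees
    P = -Lg / k[:, None]                            # w_ij = -g(L)_ij for i != j
    np.fill_diagonal(P, 0.0)
    return P

P = laplacian_power_walk(nx.cycle_graph(50), gamma=0.5)
print(np.allclose(P.sum(axis=1), 1.0), P.min() > -1e-12)  # valid stochastic matrix
```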
Scaling of peak flows with constant flow velocity in random self-similar networks
Troutman, Brent M.; Mantilla, Ricardo; Gupta, Vijay K.
2011-01-01
A methodology is presented to understand the role of the statistical self-similar topology of real river networks on scaling, or power laws, in peak flows for rainfall-runoff events. We generated Monte Carlo ensembles of 1000 random self-similar networks (RSNs) with geometrically distributed interior and exterior generators having parameters pi and pe, respectively. The parameter values were chosen to replicate the observed topology of real river networks. We calculated flow hydrographs in each of these networks by numerically solving the link-based mass and momentum conservation equation under the assumption of constant flow velocity. From these simulated RSNs and hydrographs, the scaling exponents β and φ characterizing power laws with respect to drainage area, and corresponding to the width functions and flow hydrographs respectively, were estimated. We found that, in general, φ > β, which supports a similar finding first reported for simulations in the river network of the Walnut Gulch basin, Arizona. Theoretical estimation of β and φ in RSNs is a complex open problem. Therefore, using results for a simpler problem associated with the expected width function and expected hydrograph for an ensemble of RSNs, we give heuristic arguments for theoretical derivations of the scaling exponents β(E) and φ(E) that depend on the Horton ratios for stream lengths and areas. These ratios in turn have a known dependence on the parameters of the geometric distributions of RSN generators. Good agreement was found between the analytically conjectured values of β(E) and φ(E) and the values estimated from the simulated ensembles of RSNs and hydrographs. The independence of the scaling exponents φ(E) and φ with respect to the value of flow velocity and runoff intensity implies an interesting connection between unit hydrograph theory and flow dynamics. Our results provide a reference framework to study scaling exponents under more complex scenarios of flow dynamics and runoff generation processes using ensembles of RSNs.
Applications of Derandomization Theory in Coding
NASA Astrophysics Data System (ADS)
Cheraghchi, Mahdi
2011-07-01
Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and the construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem, which involves a communication system in which an intruder can eavesdrop on a limited portion of the transmissions, and construct efficient and information-theoretically optimal communication protocols for this model. Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives passes a certain threshold. Finally, we design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for the construction of explicit capacity-achieving codes. [This is a shortened version of the actual abstract in the thesis.]
Uniform Recovery Bounds for Structured Random Matrices in Corrupted Compressed Sensing
NASA Astrophysics Data System (ADS)
Zhang, Peng; Gan, Lu; Ling, Cong; Sun, Sumei
2018-04-01
We study the problem of recovering an $s$-sparse signal $\mathbf{x}^{\star}\in\mathbb{C}^n$ from corrupted measurements $\mathbf{y} = \mathbf{A}\mathbf{x}^{\star}+\mathbf{z}^{\star}+\mathbf{w}$, where $\mathbf{z}^{\star}\in\mathbb{C}^m$ is a $k$-sparse corruption vector whose nonzero entries may be arbitrarily large and $\mathbf{w}\in\mathbb{C}^m$ is a dense noise with bounded energy. The aim is to exactly and stably recover the sparse signal with tractable optimization programs. In this paper, we prove the uniform recovery guarantee of this problem for two classes of structured sensing matrices. The first class can be expressed as the product of a unit-norm tight frame (UTF), a random diagonal matrix and a bounded columnwise orthonormal matrix (e.g., a partial random circulant matrix). When the UTF is bounded (i.e. $\mu(\mathbf{U})\sim 1/\sqrt{m}$), we prove that with high probability, one can recover an $s$-sparse signal exactly and stably by $l_1$ minimization programs even if the measurements are corrupted by a sparse vector, provided $m = \mathcal{O}(s \log^2 s \log^2 n)$ and the sparsity level $k$ of the corruption is a constant fraction of the total number of measurements. The second class considers a randomly sub-sampled orthogonal matrix (e.g., a random Fourier matrix). We prove the uniform recovery guarantee provided that the corruption is sparse on a certain sparsifying domain. Numerous simulation results are also presented to verify and complement the theoretical results.
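A toy version of the recovery program is easy to state. The sketch below is a simplified real-valued stand-in for the paper's setting (an unstructured Gaussian matrix instead of the structured classes analyzed, cvxpy as a generic solver, and illustrative dimensions throughout).

```python
# Jointly penalize the l1 norms of the signal and the sparse corruption:
#   min ||x||_1 + lam*||z||_1  s.t.  ||A x + z - y||_2 <= eps.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m, s, k = 200, 100, 5, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)

x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
z_true = np.zeros(m)
z_true[rng.choice(m, k, replace=False)] = 10.0 * rng.standard_normal(k)
w = 0.01 * rng.standard_normal(m)
y = A @ x_true + z_true + w

x, z = cp.Variable(n), cp.Variable(m)
lam, eps = 1.0, 0.02 * np.sqrt(m)
problem = cp.Problem(cp.Minimize(cp.norm1(x) + lam * cp.norm1(z)),
                     [cp.norm2(A @ x + z - y) <= eps])
problem.solve()
print("signal recovery error:", np.linalg.norm(x.value - x_true).round(4))
```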
Holmes, John B; Dodds, Ken G; Lee, Michael A
2017-03-02
An important issue in genetic evaluation is the comparability of random effects (breeding values), particularly between pairs of animals in different contemporary groups. This is usually referred to as genetic connectedness. While various measures of connectedness have been proposed in the literature, there is general agreement that the most appropriate measure is some function of the prediction error variance-covariance matrix. However, obtaining the prediction error variance-covariance matrix is computationally demanding for large-scale genetic evaluations. Many alternative statistics have been proposed that avoid the computational cost of obtaining the prediction error variance-covariance matrix, such as counts of genetic links between contemporary groups, gene flow matrices, and functions of the variance-covariance matrix of estimated contemporary group fixed effects. In this paper, we show that a correction to the variance-covariance matrix of estimated contemporary group fixed effects will produce the exact prediction error variance-covariance matrix averaged by contemporary group for univariate models in the presence of single or multiple fixed effects and one random effect. We demonstrate the correction for a series of models and show that approximations to the prediction error matrix based solely on the variance-covariance matrix of estimated contemporary group fixed effects are inappropriate in certain circumstances. Our method allows for the calculation of a connectedness measure based on the prediction error variance-covariance matrix by calculating only the variance-covariance matrix of estimated fixed effects. Since the number of fixed effects in genetic evaluation is usually orders of magnitude smaller than the number of random effect levels, the computational requirements for our method should be reduced.
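For orientation, the quantity at stake can be computed directly in a toy setting. The sketch below is entirely illustrative (dimensions and variance components are ours, and it implements the standard mixed-model algebra, not the authors' correction): it builds Henderson's mixed model equations for y = Xb + Zu + e and reads off the prediction error variance-covariance matrix of the random effects.

```python
# PEV of random effects from the mixed model equations, with
# u ~ N(0, sigma_u^2 I) and e ~ N(0, sigma_e^2 I).
import numpy as np

rng = np.random.default_rng(1)
n, n_groups, n_animals = 60, 3, 12
X = np.zeros((n, n_groups))
X[np.arange(n), np.arange(n) % n_groups] = 1.0        # contemporary groups
Z = np.zeros((n, n_animals))
Z[np.arange(n), rng.integers(0, n_animals, n)] = 1.0  # animal incidence
sigma_e2, sigma_u2 = 1.0, 0.5
lam = sigma_e2 / sigma_u2

# Coefficient matrix of Henderson's equations; sigma_e^2 times the lower-right
# block of its inverse is the PEV matrix of the random effects.
C = np.block([[X.T @ X, X.T @ Z],
              [Z.T @ X, Z.T @ Z + lam * np.eye(n_animals)]])
PEV = sigma_e2 * np.linalg.inv(C)[n_groups:, n_groups:]
print("PEV matrix shape:", PEV.shape, "- leading diagonal:", np.diag(PEV).round(3)[:4])
```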
Expected distributions of root-mean-square positional deviations in proteins.
Pitera, Jed W
2014-06-19
The atom positional root-mean-square deviation (RMSD) is a standard tool for comparing the similarity of two molecular structures. It is used to characterize the quality of biomolecular simulations, to cluster conformations, and as a reaction coordinate for conformational changes. This work presents an approximate analytic form for the expected distribution of RMSD values for a protein or polymer fluctuating about a stable native structure. The mean and maximum of the expected distribution are independent of chain length for long chains and linearly proportional to the average atom positional root-mean-square fluctuations (RMSF). To approximate the RMSD distribution for random-coil or unfolded ensembles, numerical distributions of RMSD were generated for ensembles of self-avoiding and non-self-avoiding random walks. In both cases, for all reference structures tested for chains more than three monomers long, the distributions have a maximum distant from the origin with a power-law dependence on chain length. The purely entropic nature of this result implies that care must be taken when interpreting stable high-RMSD regions of the free-energy landscape as "intermediates" or well-defined stable states.
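The random-coil part of the analysis can be reproduced numerically in a few lines. The sketch below is our illustration (walk length and sample size are arbitrary): it samples RMSD values, after optimal Kabsch superposition, between a fixed reference and an ensemble of non-self-avoiding 3D random walks.

```python
import numpy as np

rng = np.random.default_rng(0)

def kabsch_rmsd(P, Q):
    """RMSD of Q onto P after optimal rotation (Kabsch algorithm)."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    S[-1] *= np.sign(np.linalg.det(U @ Vt))     # exclude improper rotations
    msd = (np.sum(P**2) + np.sum(Q**2) - 2.0 * S.sum()) / len(P)
    return np.sqrt(max(msd, 0.0))

def walk(n_monomers):
    """Non-self-avoiding random walk with unit-variance Gaussian steps."""
    return np.cumsum(rng.standard_normal((n_monomers, 3)), axis=0)

ref = walk(100)
rmsds = np.array([kabsch_rmsd(ref, walk(100)) for _ in range(2000)])
# The distribution peaks far from the origin, as expected for a random coil.
print("RMSD quartiles:", np.percentile(rmsds, [25, 50, 75]).round(2))
```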
On the Feynman-Hellmann theorem in quantum field theory and the calculation of matrix elements
Bouchard, Chris; Chang, Chia Cheng; Kurth, Thorsten; ...
2017-07-12
In this paper, the Feynman-Hellmann theorem is derived from the long Euclidean-time limit of correlation functions determined with functional derivatives of the partition function. Using this insight, we fully develop an improved method for computing matrix elements of external currents utilizing only two-point correlation functions. Our method applies to matrix elements of any external bilinear current, including nonzero momentum transfer, flavor-changing, and two or more current insertion matrix elements. The ability to identify and control all the systematic uncertainties in the analysis of the correlation functions stems from the unique time dependence of the ground-state matrix elements and the fact that all excited states and contact terms are Euclidean-time dependent. We demonstrate the utility of our method with a calculation of the nucleon axial charge using gradient-flowed domain-wall valence quarks on the $N_f=2+1+1$ MILC highly improved staggered quark ensemble, with a lattice spacing and pion mass of approximately 0.15 fm and 310 MeV, respectively. We show full control over excited-state systematics with the new method and obtain a value of $g_A = 1.213(26)$ with a quark-mass-dependent renormalization coefficient.
Palchesko, Rachelle N; Szymanski, John M; Sahu, Amrita; Feinberg, Adam W
2014-09-01
Cell-matrix interactions are important for the physical integration of cells into tissues and the function of insoluble, mechanosensitive signaling networks. Studying these interactions in vitro can be difficult because the extracellular matrix (ECM) proteins that adsorb to in vitro cell culture surfaces do not fully recapitulate the ECM-dense basement membranes to which cells such as cardiomyocytes and endothelial cells adhere in vivo. Towards addressing this limitation, we have developed a surface-initiated assembly process to engineer ECM proteins into nanostructured, microscale sheets that can be shrink-wrapped around single cells and small cell ensembles to provide a functional and instructive matrix niche. Unlike current cell encapsulation technology using alginate, fibrin or other hydrogels, our engineered ECM is similar in density and thickness to native basal lamina and can be tailored in structure and composition using the proteins fibronectin, laminin, fibrinogen, and/or collagen type IV. A range of cells including C2C12 myoblasts, bovine corneal endothelial cells and cardiomyocytes survive the shrink-wrapping process with high viability. Further, we demonstrate that, compared to non-encapsulated controls, the engineered ECM modulates cytoskeletal structure, stability of cell-matrix adhesions and cell behavior in 2D and 3D microenvironments.
On the equilibrium state of a small system with random matrix coupling to its environment
NASA Astrophysics Data System (ADS)
Lebowitz, J. L.; Pastur, L.
2015-07-01
We consider a random matrix model of interaction between a small $n$-level system, S, and its environment, an $N$-level heat reservoir, R. The interaction between S and R is modeled by a tensor product of a fixed $n\times n$ matrix and an $N\times N$ Hermitian random matrix. We show that under certain 'macroscopicity' conditions on R, the reduced density matrix of the system, $\rho_S = \mathrm{Tr}_R\,\rho^{(\mathrm{eq})}_{S\cup R}$, is given by $\rho_S^{(c)} \sim \exp\{-\beta H_S\}$, where $H_S$ is the Hamiltonian of the isolated system. This holds for all strengths of the interaction and thus gives some justification for using $\rho_S^{(c)}$ to describe some nano-systems, like biopolymers, in equilibrium with their environment (Seifert 2012 Rep. Prog. Phys. 75 126001). Our results extend those obtained previously in Lebowitz and Pastur (2004 J. Phys. A: Math. Gen. 37 1517-34) and Lebowitz et al (2007 Contemporary Mathematics (Providence, RI: American Mathematical Society) pp 199-218) for a special two-level system.
NASA Astrophysics Data System (ADS)
Yaremchuk, Max; Martin, Paul; Beattie, Christopher
2017-09-01
Development and maintenance of the linearized and adjoint code for advanced circulation models is a challenging issue, requiring a significant proportion of total effort in operational data assimilation (DA). The ensemble-based DA techniques provide a derivative-free alternative, which appears to be competitive with variational methods in many practical applications. This article proposes a hybrid scheme for generating the search subspaces in the adjoint-free 4-dimensional DA method (a4dVar) that does not use a predefined ensemble. The method resembles 4dVar in that the optimal solution is strongly constrained by model dynamics and search directions are supplied iteratively using information from the current and previous model trajectories generated in the process of optimization. In contrast to 4dVar, which produces a single search direction from exact gradient information, a4dVar employs an ensemble of directions to form a subspace in order to proceed. In the earlier versions of a4dVar, search subspaces were built using the leading EOFs of either the model trajectory or the projections of the model-data misfits onto the range of the background error covariance (BEC) matrix at the current iteration. In the present study, we blend both approaches and explore a hybrid scheme of ensemble generation in order to improve the performance and flexibility of the algorithm. In addition, we introduce balance constraints into the BEC structure and periodically augment the search ensemble with BEC eigenvectors to avoid repeating minimization over already explored subspaces. Performance of the proposed hybrid a4dVar (ha4dVar) method is compared with that of standard 4dVar in a realistic regional configuration assimilating real data into the Navy Coastal Ocean Model (NCOM). It is shown that the ha4dVar converges faster than a4dVar and can be potentially competitive with 4dVar both in terms of the required computational time and the forecast skill.
Random Matrix Approach for Primal-Dual Portfolio Optimization Problems
NASA Astrophysics Data System (ADS)
Tada, Daichi; Yamamoto, Hisashi; Shinzato, Takashi
2017-12-01
In this paper, we revisit the portfolio optimization problems of the minimization/maximization of investment risk under constraints of budget and investment concentration (primal problem) and the maximization/minimization of investment concentration under constraints of budget and investment risk (dual problem) for the case that the variances of the return rates of the assets are identical. We analyze both optimization problems by the Lagrange multiplier method and the random matrix approach. Thereafter, we compare the results obtained from our proposed approach with the results obtained in previous work. Moreover, we use numerical experiments to validate the results obtained from the replica approach and the random matrix approach as methods for analyzing both the primal and dual portfolio optimization problems.
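The simplest special case, the minimal-risk portfolio under the budget constraint alone, has a closed Lagrange-multiplier solution. The sketch below is our simplification (budget constraint only, with the normalization sum_i w_i = N used in this literature, and a synthetic return history; the concentration constraint of the paper is omitted).

```python
# Lagrange-multiplier solution of min_w w'Cw subject to 1'w = N:
#   w* = N C^{-1} 1 / (1' C^{-1} 1).
import numpy as np

rng = np.random.default_rng(2)
N, T = 50, 200
R = rng.standard_normal((T, N))               # synthetic return history
C = np.cov(R, rowvar=False)                   # sample covariance (Wishart-like)

ones = np.ones(N)
Cinv1 = np.linalg.solve(C, ones)
w = N * Cinv1 / (ones @ Cinv1)
print("budget check (should be N):", w.sum().round(6))
print("minimal risk per asset:", (0.5 * w @ C @ w / N).round(4))
```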
Note on coefficient matrices from stochastic Galerkin methods for random diffusion equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou Tao, E-mail: tzhou@lsec.cc.ac.c; Tang Tao, E-mail: ttang@hkbu.edu.h
2010-11-01
In a recent work by Xiu and Shen [D. Xiu, J. Shen, Efficient stochastic Galerkin methods for random diffusion equations, J. Comput. Phys. 228 (2009) 266-281], the Galerkin methods are used to solve stochastic diffusion equations in random media, where some properties for the coefficient matrix of the resulting system are provided. They also posed an open question on the properties of the coefficient matrix. In this work, we will provide some results related to the open question.
Ramírez, J; Górriz, J M; Ortiz, A; Martínez-Murcia, F J; Segovia, F; Salas-Gonzalez, D; Castillo-Barnes, D; Illán, I A; Puntonet, C G
2018-05-15
Alzheimer's disease (AD) is the most common cause of dementia in the elderly and affects approximately 30 million individuals worldwide. Mild cognitive impairment (MCI) is very frequently a prodromal phase of AD, and existing studies suggest that people with MCI tend to progress to AD at a rate of about 10-15% per year. However, the ability of clinicians and machine learning systems to predict AD based on MRI biomarkers at an early stage is still a challenging problem that can have a great impact in improving treatments. The proposed system, developed by the SiPBA-UGR team for this challenge, is based on feature standardization, ANOVA feature selection, partial least squares feature dimension reduction and an ensemble of One vs. Rest random forest classifiers. With the aim of improving its performance when discriminating healthy controls (HC) from MCI, a second binary classification level was introduced that reconsiders the HC and MCI predictions of the first level. The system was trained and evaluated on ADNI datasets consisting of T1-weighted MRI morphological measurements from HC, stable MCI, converter MCI and AD subjects. The proposed system yields a 56.25% classification score on the test subset, which consists of 160 real subjects. The classifier yielded the best performance when compared to: (i) One vs. One (OvO), One vs. Rest (OvR) and error-correcting output codes (ECOC) as strategies for reducing the multiclass classification task to multiple binary classification problems, (ii) support vector machines, gradient boosting classifiers and random forests as base binary classifiers, and (iii) bagging ensemble learning. A robust method has been proposed for the international challenge on MCI prediction based on MRI data. The system yielded the second-best performance during the competition, with an accuracy of 56.25% when evaluated on the real subjects of the test set.
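A rough scikit-learn analogue of the described pipeline is sketched below on synthetic stand-in data; the component sizes (50 selected features, 10 PLS components, 300 trees) are our assumptions rather than the team's settings, and the second-level HC/MCI reclassifier is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.multiclass import OneVsRestClassifier

X, y = make_classification(n_samples=400, n_features=200, n_informative=20,
                           n_classes=4, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(Xtr)                            # standardization
sel = SelectKBest(f_classif, k=50).fit(scaler.transform(Xtr), ytr)  # ANOVA
pls = PLSRegression(n_components=10).fit(                     # PLS on one-hot y
    sel.transform(scaler.transform(Xtr)), np.eye(4)[ytr])

def features(A):
    return pls.transform(sel.transform(scaler.transform(A)))

clf = OneVsRestClassifier(
    RandomForestClassifier(n_estimators=300, random_state=0))
clf.fit(features(Xtr), ytr)
print("held-out accuracy:", clf.score(features(Xte), yte))
```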
Reinforce: An Ensemble Approach for Inferring PPI Network from AP-MS Data.
Tian, Bo; Duan, Qiong; Zhao, Can; Teng, Ben; He, Zengyou
2017-05-17
Affinity Purification-Mass Spectrometry (AP-MS) is one of the most important technologies for constructing protein-protein interaction (PPI) networks. In this paper, we propose an ensemble method, Reinforce, for inferring a PPI network from an AP-MS data set. Reinforce is based on rank aggregation and false discovery rate control. Under the null hypothesis that the interaction scores from different scoring methods are randomly generated, Reinforce follows three steps to integrate multiple ranking results from different algorithms or different data sets. The experimental results show that Reinforce achieves more stable and accurate inference results than existing algorithms. The source code of Reinforce and the data sets used in the experiments are available at: https://sourceforge.net/projects/reinforce/.
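The rank-aggregation step can be illustrated with a Borda-style average of per-method ranks. The sketch below is a minimal stand-in of our own; the FDR-control stage of the actual tool is not reproduced.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
true_strength = np.concatenate([np.ones(5) * 3.0, np.zeros(45)])
# Three noisy "scoring methods" rating 50 candidate interactions.
scores = true_strength + rng.standard_normal((3, 50))

ranks = np.vstack([rankdata(-s) for s in scores])   # rank 1 = strongest
aggregate = ranks.mean(axis=0)                      # Borda: average rank
print("top-5 aggregated interactions:", np.argsort(aggregate)[:5])
```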
Size dependence of single-photon superradiance of cold and dilute atomic ensembles
NASA Astrophysics Data System (ADS)
Kuraptsev, A. S.; Sokolov, I. M.
2017-11-01
We report a theoretical investigation of the angular distribution of single-photon superradiance from cold and dilute atomic clouds. In the present work we focus our attention on the dependence of superradiance on the size and shape of the cloud. We analyze the dynamics of the afterglow of an atomic ensemble excited by pulsed radiation. Two theoretical approaches are used. The first is a quantum microscopic approach based on a coupled-dipole model. The second is a random walk approximation. We show that the results obtained with both approaches coincide to good accuracy for incoherent fluorescence excited by short resonant pulses. We also show that the superradiance decay rate changes with size differently for radiation emitted in different directions.
Spam comments prediction using stacking with ensemble learning
NASA Astrophysics Data System (ADS)
Mehmood, Arif; On, Byung-Won; Lee, Ingyu; Ashraf, Imran; Choi, Gyu Sang
2018-01-01
Deceptive comments about products or services are misleading for people in decision making. Current methodologies to predict deceptive comments are concerned with feature design using a single training model. Hand-crafted features are able to capture some linguistic phenomena but can hardly reveal the latent semantic meaning of the comments. We propose a prediction model on general features of documents using stacking with ensemble learning. Term Frequency/Inverse Document Frequency (TF/IDF) features are input to a stacking of Random Forest and Gradient Boosted Trees, and the outputs of the base learners are combined by a decision tree to produce the final trained model. The results show that our approach achieves an accuracy of 92.19%, which outperforms the state-of-the-art method.
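A compact scikit-learn rendering of the described stack is given below on toy documents; the hyperparameters and the tiny corpus are assumptions for illustration only.

```python
# TF/IDF features feed Random Forest and Gradient Boosted Trees base learners;
# a decision tree meta-learner combines their outputs.
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

docs = ["great product, works perfectly",
        "honest review after one month of use",
        "solid build quality and fast delivery",
        "does what it says, would buy again",
        "click here to win a free prize now",
        "free free free visit our site today",
        "best deal ever, limited offer, click now",
        "earn money fast, guaranteed results"]
labels = [0, 0, 0, 0, 1, 1, 1, 1]        # 1 = spam/deceptive (toy labels)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("gbt", GradientBoostingClassifier(random_state=0))],
    final_estimator=DecisionTreeClassifier(max_depth=3),
    cv=2)                                 # tiny corpus, so only two folds
model = make_pipeline(TfidfVectorizer(), stack)
model.fit(docs, labels)
print(model.predict(["win a free prize now", "careful review of the product"]))
```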
A strictly Markovian expansion for plasma turbulence theory
NASA Technical Reports Server (NTRS)
Jones, F. C.
1976-01-01
The collision operator that appears in the equation of motion for a particle distribution function that was averaged over an ensemble of random Hamiltonians is non-Markovian. It is non-Markovian in that it involves a propagated integral over the past history of the ensemble averaged distribution function. All formal expansions of this nonlinear collision operator to date preserve this non-Markovian character term by term yielding an integro-differential equation that must be converted to a diffusion equation by an additional approximation. An expansion is derived for the collision operator that is strictly Markovian to any finite order and yields a diffusion equation as the lowest nontrivial order. The validity of this expansion is seen to be the same as that of the standard quasilinear expansion.
Coupling strength assumption in statistical energy analysis
Lafont, T.; Totaro, N.
2017-01-01
This paper is a discussion of the hypothesis of weak coupling in statistical energy analysis (SEA). The examples of coupled oscillators and statistical ensembles of coupled plates excited by broadband random forces are discussed. In each case, a reference calculation is compared with the SEA calculation. First, it is shown that the main SEA relation, the coupling power proportionality, is always valid for two oscillators irrespective of the coupling strength. But the case of three subsystems, consisting of oscillators or ensembles of plates, indicates that the coupling power proportionality fails when the coupling is strong. Strong coupling leads to non-zero indirect coupling loss factors and, sometimes, even to a reversal of the energy flow direction from low to high vibrational temperature.
On the efficiency of a randomized mirror descent algorithm in online optimization problems
NASA Astrophysics Data System (ADS)
Gasnikov, A. V.; Nesterov, Yu. E.; Spokoiny, V. G.
2015-04-01
A randomized online version of the mirror descent method is proposed. It differs from the existing versions by the randomization method. Randomization is performed at the stage of the projection of a subgradient of the function being optimized onto the unit simplex rather than at the stage of the computation of a subgradient, which is common practice. As a result, a componentwise subgradient descent with a randomly chosen component is obtained, which admits an online interpretation. This observation, for example, has made it possible to uniformly interpret results on weighting expert decisions and propose the most efficient method for searching for an equilibrium in a zero-sum two-person matrix game with a sparse matrix.
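The flavour of randomization-at-projection can be illustrated with entropic mirror descent on the simplex, where each step uses one randomly sampled coordinate of the gradient with an unbiased importance weight. The sketch below is our illustration of the idea, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
c = rng.random(n)
grad = lambda x: x - c                    # gradient of f(x) = 0.5*||x - c||^2

x = np.full(n, 1.0 / n)                   # start at the simplex barycentre
for t in range(1, 5001):
    i = rng.integers(n)                   # randomization: one component only
    g = np.zeros(n)
    g[i] = n * grad(x)[i]                 # unbiased estimate of the gradient
    x = x * np.exp(-(0.5 / np.sqrt(t)) * g)   # multiplicative-weights update
    x /= x.sum()                          # entropic projection onto the simplex
print("approximate minimizer (first 5 coords):", x[:5].round(3))
```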
NASA Astrophysics Data System (ADS)
Rooper, Christopher N.; Zimmermann, Mark; Prescott, Megan M.
2017-08-01
Deep-sea coral and sponge ecosystems are widespread throughout most of Alaska's marine waters and are associated with many different species of fishes and invertebrates. These ecosystems are vulnerable to the effects of commercial fishing activities and climate change. We compared four commonly used species distribution models (general linear models, generalized additive models, boosted regression trees and random forest models) and an ensemble model to predict the presence or absence and abundance of six groups of benthic invertebrate taxa in the Gulf of Alaska. All four model types performed adequately on training data for predicting presence and absence, with random forest models having the best overall performance as measured by the area under the receiver-operating curve (AUC). The models also performed well on the test data for presence and absence, with average AUCs ranging from 0.66 to 0.82. For the test data, ensemble models performed the best. For abundance data, there was an obvious demarcation in performance between the two regression-based methods (general linear models and generalized additive models) and the tree-based models. The boosted regression tree and random forest models out-performed the other models by a wide margin on both the training and testing data. However, there was a significant drop-off in performance for all models of invertebrate abundance (∼50%) when moving from the training data to the testing data. Ensemble model performance was between the tree-based and regression-based methods. The maps of predictions from the models for both presence and abundance agreed very well across model types, with an increase in variability in predictions for the abundance data. We conclude that where data conform well to the modeled distribution (such as the presence-absence data and binomial distribution in this study), the four types of models will provide similar results, although the regression-type models may be more consistent with biological theory. For data with highly zero-inflated and non-normal distributions, such as the abundance data from this study, the tree-based methods performed better. Ensemble models that averaged predictions across the four model types performed better than the GLM or GAM models but slightly poorer than the tree-based methods, suggesting ensemble models might be more robust to overfitting than tree methods, while mitigating some of the disadvantages in predictive performance of regression methods.
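The ensemble step itself is simple to illustrate: average the presence probabilities of heterogeneous base models and score with AUC. The sketch below uses synthetic data and a logistic regression as a GLM stand-in, with none of the survey specifics.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

models = [LogisticRegression(max_iter=1000),
          GradientBoostingClassifier(random_state=0),
          RandomForestClassifier(n_estimators=300, random_state=0)]
probs = [m.fit(Xtr, ytr).predict_proba(Xte)[:, 1] for m in models]

# The "ensemble model" is the plain average of the base-model probabilities.
for name, p in zip(["GLM", "BRT", "RF", "ensemble"],
                   probs + [np.mean(probs, axis=0)]):
    print(name, "AUC = %.3f" % roc_auc_score(yte, p))
```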
Non-equilibrium many-body dynamics following a quantum quench
NASA Astrophysics Data System (ADS)
Vyas, Manan
2017-12-01
We study analytically and numerically the non-equilibrium dynamics of an isolated interacting many-body quantum system following a random quench. We model the system Hamiltonian by Embedded Gaussian Orthogonal Ensemble (EGOE) of random matrices with one plus few-body interactions for fermions. EGOE are paradigmatic models to study the crossover from integrability to chaos in interacting many-body quantum systems. We obtain a generic formulation, based on spectral variances, for describing relaxation dynamics of survival probabilities as a function of rank of interactions. Our analytical results are in good agreement with numerics.
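The survival-probability calculation is straightforward to mimic numerically. In the sketch below, a plain GOE perturbation stands in for the embedded (EGOE) ensembles of the paper, all scales are illustrative, and we check the short-time Gaussian decay set by the spectral variance of the local density of states.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400
H0 = np.diag(np.sort(rng.standard_normal(N)))    # dense unperturbed spectrum
A = rng.standard_normal((N, N))
V = (A + A.T) / (2.0 * np.sqrt(N))               # GOE perturbation
H = H0 + 0.5 * V

vals, vecs = np.linalg.eigh(H)
psi0 = np.zeros(N)
psi0[N // 2] = 1.0                               # mid-spectrum eigenstate of H0
c2 = (vecs.T @ psi0) ** 2                        # LDOS weights |c_k|^2

ts = np.linspace(0.0, 6.0, 200)
P = np.abs([(c2 * np.exp(-1j * vals * t)).sum() for t in ts]) ** 2

# Short times follow the Gaussian decay exp(-sigma^2 t^2) fixed by the
# spectral variance sigma^2 of the local density of states.
sigma2 = (c2 * vals**2).sum() - ((c2 * vals).sum()) ** 2
t0 = ts[20]
print("P(t0) = %.3f vs exp(-sigma^2 t0^2) = %.3f"
      % (P[20], np.exp(-sigma2 * t0**2)))
```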
Random waves in the brain: Symmetries and defect generation in the visual cortex
NASA Astrophysics Data System (ADS)
Schnabel, M.; Kaschube, M.; Löwel, S.; Wolf, F.
2007-06-01
How orientation maps in the visual cortex of the brain develop is a matter of long-standing debate. Experimental and theoretical evidence suggests that their development represents an activity-dependent self-organization process. Theoretical analysis [1] exploring this hypothesis predicted that maps at an early developmental stage are realizations of Gaussian random fields exhibiting a rigorous lower bound for their densities of topological defects, called pinwheels. As a consequence, lower pinwheel densities, if observed in adult animals, are predicted to develop through the motion and annihilation of pinwheel pairs. Although valid for a large class of developmental models, this result depends on the symmetries of the models and thus of the predicted random field ensembles. In [1], invariance of the orientation map's statistical properties under independent space rotations and orientation shifts was assumed. However, full rotation symmetry appears to be broken by interactions of cortical neurons, e.g., selective couplings between groups of neurons with collinear orientation preferences [2]. A recently proposed symmetry, called shift-twist symmetry [3], stating that spatial rotations have to occur together with orientation shifts in order to be an appropriate symmetry transformation, is more consistent with this organization. Here we generalize our random field approach to this important symmetry class. We propose a new class of shift-twist symmetric Gaussian random fields and derive the general correlation functions of this ensemble. It turns out that despite strong effects of the shift-twist symmetry on the structure of the correlation functions and on the map layout, the lower bound on the pinwheel densities remains unaffected, predicting pinwheel annihilation in systems with low pinwheel densities.
Finite-size analysis of the detectability limit of the stochastic block model
NASA Astrophysics Data System (ADS)
Young, Jean-Gabriel; Desrosiers, Patrick; Hébert-Dufresne, Laurent; Laurence, Edward; Dubé, Louis J.
2017-06-01
It has been shown in recent years that the stochastic block model is sometimes undetectable in the sparse limit, i.e., that no algorithm can identify a partition correlated with the partition used to generate an instance, if the instance is sparse enough and infinitely large. In this contribution, we treat the finite case explicitly, using arguments drawn from information theory and statistics. We give a necessary condition for finite-size detectability in the general SBM. We then distinguish the concept of average detectability from the concept of instance-by-instance detectability and give explicit formulas for both definitions. Using these formulas, we prove that there exist large equivalence classes of parameters, where widely different network ensembles are equally detectable with respect to our definitions of detectability. In an extensive case study, we investigate the finite-size detectability of a simplified variant of the SBM, which encompasses a number of important models as special cases. These models include the symmetric SBM, the planted coloring model, and more exotic SBMs not previously studied. We conclude with three appendices, where we study the interplay of noise and detectability, establish a connection between our information-theoretic approach and random matrix theory, and provide proofs of some of the more technical results.
Optical modeling of volcanic ash particles using ellipsoids
NASA Astrophysics Data System (ADS)
Merikallio, Sini; Muñoz, Olga; Sundström, Anu-Maija; Virtanen, Timo H.; Horttanainen, Matti; de Leeuw, Gerrit; Nousiainen, Timo
2015-05-01
The single-scattering properties of volcanic ash particles are modeled here by using ellipsoidal shapes. Ellipsoids are expected to improve the accuracy of the retrieval of aerosol properties using remote sensing techniques, which are currently often based on oversimplified assumptions of spherical ash particles. Measurements of the single-scattering optical properties of ash particles from several volcanoes across the globe, including previously unpublished measurements from the Eyjafjallajökull and Puyehue volcanoes, are used to assess the performance of the ellipsoidal particle models. These comparisons between the measurements and the ellipsoidal particle model include consideration of the whole scattering matrix, as well as sensitivity studies on the point of view of the Advanced Along Track Scanning Radiometer (AATSR) instrument. AATSR, which flew on the ENVISAT satellite, offers two viewing directions but no information on polarization, so usually only the phase function is relevant for interpreting its measurements. As expected, ensembles of ellipsoids are able to reproduce the observed scattering matrix more faithfully than spheres. Performance of ellipsoid ensembles depends on the distribution of particle shapes, which we tried to optimize. No single specific shape distribution could be found that would perform superiorly in all situations, but all of the best-fit ellipsoidal distributions, as well as the additionally tested equiprobable distribution, improved greatly over the performance of spheres. We conclude that an equiprobable shape distribution of ellipsoidal model particles is a relatively good, yet enticingly simple, approach for modeling volcanic ash single-scattering optical properties.
Group identification in Indonesian stock market
NASA Astrophysics Data System (ADS)
Nurriyadi Suparno, Ervano; Jo, Sung Kyun; Lim, Kyuseong; Purqon, Acep; Kim, Soo Yong
2016-08-01
The Indonesian stock market is particularly interesting because it represents a developing country. We investigate its dynamics and structure using Random Matrix Theory (RMT). Here, we analyze the cross-correlations of the fluctuations of the daily closing prices of stocks from the Indonesian Stock Exchange (IDX) between January 1, 2007, and October 28, 2014. The eigenvalue distribution of the correlation matrix contains noise, which is filtered out using the random matrix as a control. The bulk of the eigenvalue distribution conforms to the random matrix prediction, allowing random noise to be separated from the informative, deviating eigenvalues. From the deviating eigenvalues and the corresponding eigenvectors, we identify the intrinsic normal modes of the system and interpret their meaning based on qualitative and quantitative approaches. The results show that the largest eigenvector represents the market-wide effect, which has a predominantly common influence toward all stocks. The other eigenvectors represent highly correlated groups within the system. Furthermore, identification of the largest components of the eigenvectors shows the sector or background of the correlated groups. Interestingly, the results show that there are mainly two clusters within IDX: natural and non-natural resource companies. We then decompose the correlation matrix to investigate the contribution of the correlated groups to the total correlation, and we find that IDX is still driven mainly by the market-wide effect.
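The RMT workflow in this abstract can be sketched on synthetic data with one market mode and two planted sectors; the Marchenko-Pastur upper edge (1 + sqrt(N/T))^2 separates noise from the deviating, informative eigenvalues. All parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 60, 500
sector = np.repeat([0, 1], N // 2)
market = rng.standard_normal(T)
group = rng.standard_normal((T, 2))
R = 0.4 * market[:, None] + 0.3 * group[:, sector] + rng.standard_normal((T, N))
C = np.corrcoef(R, rowvar=False)

lam_max = (1 + np.sqrt(N / T)) ** 2              # Marchenko-Pastur upper edge
vals, vecs = np.linalg.eigh(C)
print("eigenvalues above the MP edge:", vals[vals > lam_max].round(2))

# Largest eigenvector ~ uniform market mode; the next one splits by sector.
v = vecs[:, -2]
print("sector means of the 2nd eigenvector: %.3f vs %.3f"
      % (v[sector == 0].mean(), v[sector == 1].mean()))
```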
Tensor manifold-based extreme learning machine for 2.5-D face recognition
NASA Astrophysics Data System (ADS)
Chong, Lee Ying; Ong, Thian Song; Teoh, Andrew Beng Jin
2018-01-01
We explore the use of the Gabor regional covariance matrix (GRCM), a flexible matrix-based descriptor that embeds the Gabor features in the covariance matrix, as a 2.5-D facial descriptor and an effective means of feature fusion for 2.5-D face recognition problems. Despite its promise, matching is not a trivial problem for GRCM, since it is a special instance of a symmetric positive definite (SPD) matrix that resides in a non-Euclidean space as a tensor manifold. This implies that GRCM is incompatible with the existing vector-based classifiers and distance matchers. Therefore, we bridge the gap between the GRCM and the extreme learning machine (ELM), a vector-based classifier, for the 2.5-D face recognition problem. We put forward a tensor manifold-compliant ELM and its two variants by embedding the SPD matrix randomly into a reproducing kernel Hilbert space (RKHS) via tensor kernel functions. To preserve the pair-wise distance of the embedded data, we orthogonalize the random-embedded SPD matrix. Hence, classification can be done using a simple ridge regressor, an integrated component of ELM, on the random orthogonal RKHS. Experimental results show that our proposed method improves the recognition performance and further enhances the computational efficiency.
NASA Astrophysics Data System (ADS)
Jiang, Fan; Zhu, Zhencai; Li, Wei; Zhou, Gongbo; Chen, Guoan
2014-07-01
Accurately identifying faults in rotor-bearing systems by analyzing vibration signals, which are nonlinear and nonstationary, is challenging. To address this issue, a new approach based on ensemble empirical mode decomposition (EEMD) and self-zero space projection analysis is proposed in this paper. This method seeks to identify faults appearing in a rotor-bearing system using simple algebraic calculations and projection analyses. First, EEMD is applied to decompose the collected vibration signals into a set of intrinsic mode functions (IMFs) for features. Second, these extracted features under various mechanical health conditions are used to design a self-zero space matrix according to space projection analysis. Finally, the so-called projection indicators are calculated to identify the rotor-bearing system's faults with simple decision logic. Experiments are implemented to test the reliability and effectiveness of the proposed approach. The results show that this approach can accurately identify faults in rotor-bearing systems.
Learning disordered topological phases by statistical recovery of symmetry
NASA Astrophysics Data System (ADS)
Yoshioka, Nobuyuki; Akagi, Yutaka; Katsura, Hosho
2018-05-01
We apply the artificial neural network in a supervised manner to map out the quantum phase diagram of disordered topological superconductors in class DIII. For disorder that preserves the discrete symmetries of the ensemble as a whole, translational symmetry, which is broken in each individual quasiparticle distribution, is recovered statistically by taking an ensemble average. Using this property, we classify the phases with an artificial neural network trained on the quasiparticle distribution in the clean limit, and show that the result is fully consistent with the calculation by the transfer matrix method or the noncommutative geometry approach. If all three phases, namely the Z2, trivial, and thermal metal phases, appear in the clean limit, the machine can classify them with high confidence over the entire phase diagram. If only the former two phases are present, we find that the machine remains confused in a certain region, leading to the detection of an unknown phase that is eventually identified as the thermal metal phase.
Filatov, Michael; Liu, Fang; Martínez, Todd J.
2017-07-21
The state-averaged (SA) spin-restricted ensemble-referenced Kohn-Sham (REKS) method and its state interaction (SI) extension, SI-SA-REKS, enable one to describe correctly the shape of the ground and excited potential energy surfaces of molecules undergoing bond breaking/bond formation reactions, including features such as conical intersections crucial for theoretical modeling of non-adiabatic reactions. Until recently, application of the SA-REKS and SI-SA-REKS methods to modeling the dynamics of such reactions was obstructed by the lack of analytical energy derivatives. Here, the analytical derivatives of the individual SA-REKS and SI-SA-REKS energies are derived. The final analytic gradient expressions are formulated entirely in terms of traces of matrix products and are presented in a form convenient for implementation in traditional quantum chemical codes employing basis set expansions of the molecular orbitals. The implementation and benchmarking of the derived formalism will be described in a subsequent article of this series.
NASA Astrophysics Data System (ADS)
Drzewiecki, Wojciech
2017-12-01
We evaluated the performance of nine machine learning regression algorithms and their ensembles for sub-pixel estimation of impervious area coverage from Landsat imagery. The accuracy of imperviousness mapping at individual time points was assessed based on RMSE, MAE and R2. These measures were also used for the assessment of imperviousness change intensity estimations. The applicability for detection of relevant changes in impervious area coverage at the sub-pixel level was evaluated using overall accuracy, F-measure and ROC area under curve. The results showed that the Cubist algorithm can be recommended for Landsat-based mapping of imperviousness for single dates. Stochastic gradient boosting of regression trees (GBM) may also be considered for this purpose. However, the random forest algorithm is endorsed for both imperviousness change detection and mapping of its intensity. In all applications the heterogeneous model ensembles performed at least as well as the best individual models. They may be recommended for improving the quality of sub-pixel imperviousness and imperviousness change mapping. The study also revealed limitations of the investigated methodology for detection of subtle changes of imperviousness within a pixel. None of the tested approaches was able to reliably classify changed and non-changed pixels if the relevant change threshold was set at one or three percent. Even for a five percent change threshold, most algorithms did not ensure that the accuracy of the change map was higher than that of a random classifier. For a relevant change threshold of ten percent, all approaches performed satisfactorily.
NASA Astrophysics Data System (ADS)
Chen, Jie; Brissette, François P.; Lucas-Picher, Philippe
2016-11-01
Given the ever increasing number of climate change simulations being carried out, it has become impractical to use all of them to cover the uncertainty of climate change impacts. Various methods have been proposed to optimally select subsets of a large ensemble of climate simulations for impact studies. However, the behaviour of optimally-selected subsets of climate simulations for climate change impacts is unknown, since the transfer process from climate projections to the impact study world is usually highly non-linear. Consequently, this study investigates the transferability of optimally-selected subsets of climate simulations in the case of hydrological impacts. Two different methods were used for the optimal selection of subsets of climate scenarios, and both were found to be capable of adequately representing the spread of selected climate model variables contained in the original large ensemble. However, in both cases, the optimal subsets had limited transferability to hydrological impacts. To capture a similar variability in the impact model world, many more simulations have to be used than those that are needed to simply cover variability from the climate model variables' perspective. Overall, both optimal subset selection methods were better than random selection when small subsets were selected from a large ensemble for impact studies. However, as the number of selected simulations increased, random selection often performed better than the two optimal methods. To ensure adequate uncertainty coverage, the results of this study imply that selecting as many climate change simulations as possible is the best avenue. Where this is not possible, the two optimal methods were found to perform adequately.
Nanostructured complex oxides as a route towards thermal behavior in artificial spin ice systems
NASA Astrophysics Data System (ADS)
Chopdekar, R. V.; Li, B.; Wynn, T. A.; Lee, M. S.; Jia, Y.; Liu, Z. Q.; Biegalski, M. D.; Retterer, S. T.; Young, A. T.; Scholl, A.; Takamura, Y.
2017-07-01
We have used soft x-ray photoemission electron microscopy to image the magnetization of single-domain La0.7Sr0.3MnO3 nanoislands arranged in geometrically frustrated configurations such as square ice and kagome ice geometries. Upon thermal randomization, ensembles of nanoislands with strong inter-island magnetic coupling relax towards low-energy configurations. Statistical analysis shows that the likelihood of ensembles falling into low-energy configurations depends strongly on the annealing temperature. Annealing to just below the Curie temperature of the ferromagnetic film ($T_C = 338$ K) allows for a much greater probability of achieving low-energy configurations as compared to annealing above the Curie temperature. At this thermally active temperature of 325 K, the ensemble of ferromagnetic nanoislands explores its energy landscape over time and eventually transitions to lower-energy states as compared to the frozen-in configurations obtained upon cooling from above the Curie temperature. Thus, this materials system allows for a facile method to systematically study the thermal evolution of artificial spin ice arrays of nanoislands at temperatures modestly above room temperature.
The Effects of Band Labels on Evaluators' Judgments of Musical Performance
ERIC Educational Resources Information Center
Silvey, Brian A.
2009-01-01
This study investigates the effects of band labels on evaluators' judgments of musical performance. High school concert band members (n = 72), wind ensemble members (n = 77), and band directors (n = 8) were randomly assigned to a band label or no label group. Only the band label group was given evaluation forms that specified the group playing…
The random energy model in a magnetic field and joint source channel coding
NASA Astrophysics Data System (ADS)
Merhav, Neri
2008-09-01
We demonstrate that there is an intimate relationship between the magnetic properties of Derrida’s random energy model (REM) of spin glasses and the problem of joint source-channel coding in Information Theory. In particular, typical patterns of erroneously decoded messages in the coding problem have “magnetization” properties that are analogous to those of the REM in certain phases, where the non-uniformity of the distribution of the source in the coding problem plays the role of an external magnetic field applied to the REM. We also relate the ensemble performance (random coding exponents) of joint source-channel codes to the free energy of the REM in its different phases.
Model of random center vortex lines in continuous 2+1-dimensional spacetime
NASA Astrophysics Data System (ADS)
Altarawneh, Derar; Engelhardt, Michael; Höllwieser, Roman
2016-12-01
A picture of confinement in QCD based on a condensate of thick vortices with fluxes in the center of the gauge group (center vortices) is studied. Previous concrete model realizations of this picture utilized a hypercubic space-time scaffolding, which, together with many advantages, also has some disadvantages, e.g., in the treatment of vortex topological charge. In the present work, we explore a center vortex model which does not rely on such a scaffolding. Vortices are represented by closed random lines in continuous 2+1-dimensional space-time. These random lines are modeled as being piecewise linear, and an ensemble is generated by Monte Carlo methods. The physical space in which the vortex lines are defined is a torus with periodic boundary conditions. In addition to moving, growing, and shrinking of the vortex configurations, reconnections are also allowed. Our ensemble therefore contains not a fixed but a variable number of closed vortex lines. This is expected to be important for realizing the deconfining phase transition. We study both vortex percolation and the potential V(R) between a quark and an antiquark as a function of distance R at different vortex densities, vortex segment lengths, reconnection conditions, and at different temperatures. We find three deconfinement phase transitions: as a function of density, as a function of vortex segment length, and as a function of temperature.
Transverse momentum-dependent parton distribution functions from lattice QCD
DOE Office of Scientific and Technical Information (OSTI.GOV)
Michael Engelhardt, Philipp Haegler, Bernhard Musch, John Negele, Andreas Schaefer
Transverse momentum-dependent parton distributions (TMDs) relevant for semi-inclusive deep inelastic scattering (SIDIS) and the Drell-Yan process can be defined in terms of matrix elements of a quark bilocal operator containing a staple-shaped Wilson connection. Starting from such a definition, a scheme to determine TMDs in lattice QCD is developed and explored. Parametrizing the aforementioned matrix elements in terms of invariant amplitudes permits a simple transformation of the problem to a Lorentz frame suited for the lattice calculation. Results for the Sivers and Boer-Mulders transverse momentum shifts are obtained using ensembles at pion masses of 369 MeV and 518 MeV, focusing in particular on the dependence of these shifts on the staple extent and a Collins-Soper-type evolution parameter quantifying the proximity of the staples to the light cone.
Multiple scattering in planetary regoliths using first-order incoherent interactions
NASA Astrophysics Data System (ADS)
Muinonen, Karri; Markkanen, Johannes; Väisänen, Timo; Penttilä, Antti
2017-10-01
We consider scattering of light by a planetary regolith modeled using discrete random media of spherical particles. The size of the random medium can range from microscopic sizes of a few wavelengths to macroscopic sizes approaching infinity. The size of the particles is assumed to be of the order of the wavelength. We extend the numerical Monte Carlo method of radiative transfer and coherent backscattering (RT-CB) to the case of dense packing of particles. We adopt the ensemble-averaged first-order incoherent extinction, scattering, and absorption characteristics of a volume element of particles as input for the RT-CB. The volume element must be larger than the wavelength but smaller than the mean free path length of incoherent extinction. In the radiative transfer part, at each absorption and scattering process, we account for absorption with the help of the single-scattering albedo and peel off the Stokes parameters of radiation emerging from the medium in predefined scattering angles. We then generate a new scattering direction using the joint probability density for the local polar and azimuthal scattering angles. In the coherent backscattering part, we utilize amplitude scattering matrices along the radiative-transfer path and the reciprocal path, and utilize the reciprocity of electromagnetic waves to verify the computation. We illustrate the incoherent volume-element scattering characteristics and compare the dense-medium RT-CB to asymptotically exact results computed using the Superposition T-matrix method (STMM). We show that the dense-medium RT-CB compares favorably to the STMM results for the current cases of sparse and dense discrete random media studied. The novel method can be applied in modeling light scattering by the surfaces of asteroids and other airless solar system objects, including UV-Vis-NIR spectroscopy, photometry, polarimetry, and radar scattering problems. Acknowledgments: Research supported by the European Research Council with Advanced Grant No. 320773 SAEMPL, Scattering and Absorption of ElectroMagnetic waves in ParticuLate media. Computational resources provided by CSC - IT Centre for Science Ltd, Finland.
NASA Astrophysics Data System (ADS)
Cugliandolo, Leticia F.; Lozano, Gustavo S.; Nessi, Nicolás; Picco, Marco; Tartaglia, Alessandro
2018-06-01
We study the Hamiltonian dynamics of the spherical spin model with fully-connected two-body random interactions. In the statistical physics framework, the potential energy is of the so-called p = 2 kind, closely linked to the scalar field theory. Most importantly for our setting, the energy-conserving dynamics are equivalent to those of the Neumann integrable model. We take initial conditions from the Boltzmann equilibrium measure at a temperature that can be above or below the static phase transition, typical of a disordered (paramagnetic) or of an ordered (disguised ferromagnetic) equilibrium phase. We subsequently evolve the configurations with Newton dynamics dictated by a different Hamiltonian, obtained from an instantaneous global rescaling of the elements in the interaction random matrix. In the limit of infinitely many degrees of freedom, N → ∞, we identify three dynamical phases depending on the parameters that characterise the initial state and the final Hamiltonian. We next set the analysis of the system with a finite number of degrees of freedom in terms of N non-linearly coupled modes. We argue that in the N → ∞ limit the modes decouple at long times. We evaluate the mode temperatures and relate them to the frequency-dependent effective temperature measured with the fluctuation-dissipation relation in the frequency domain, similarly to what was recently proposed for quantum integrable cases. Finally, we analyse the N − 1 integrals of motion, notably their scaling with N, and use them to show that the system is out of equilibrium in all phases, even for parameters that show an apparent Gibbs–Boltzmann behaviour of the global observables. We elaborate on the role played by these constants of motion after the quench and briefly discuss the possible description of the asymptotic dynamics in terms of a generalised Gibbs ensemble.
Meta-heuristic CRPS minimization for the calibration of short-range probabilistic forecasts
NASA Astrophysics Data System (ADS)
Mohammadi, Seyedeh Atefeh; Rahmani, Morteza; Azadi, Majid
2016-08-01
This paper deals with probabilistic short-range temperature forecasts over synoptic meteorological stations across Iran using non-homogeneous Gaussian regression (NGR). NGR creates a Gaussian forecast probability density function (PDF) from the ensemble output. The mean of the normal predictive PDF is a bias-corrected weighted average of the ensemble members, and its variance is a linear function of the raw ensemble variance. The coefficients for the mean and variance are estimated by minimizing the continuous ranked probability score (CRPS) over a training period. CRPS is a scoring rule for distributional forecasts. In Gneiting et al. (Mon Weather Rev 133:1098-1118, 2005), the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is used to minimize the CRPS. Since BFGS is a conventional optimization method with its own limitations, we suggest using particle swarm optimization (PSO), a robust meta-heuristic method, to minimize the CRPS. The ensemble prediction system used in this study consists of nine different configurations of the Weather Research and Forecasting model for 48-h forecasts of temperature during autumn and winter 2011 and 2012. The probabilistic forecasts were evaluated using several common verification scores, including the Brier score, attribute diagram and rank histogram. Results show that both BFGS and PSO find the optimal solution and give the same evaluation scores, but PSO can do this with a feasible random first guess and much less computational complexity.
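For a Gaussian predictive PDF the CRPS has a closed form, so the NGR training step reduces to minimizing its training-set average over the regression coefficients. The sketch below is illustrative (toy data; scipy's BFGS plays the role of the baseline optimizer the paper compares against, and a PSO routine would simply replace the `minimize` call).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n_days, n_members = 300, 9
ens = rng.standard_normal((n_days, n_members)) + 15.0       # toy ensemble
obs = ens.mean(axis=1) + 0.5 + 0.8 * rng.standard_normal(n_days)

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) against observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z)
                    - 1 / np.sqrt(np.pi))

def mean_crps(params):
    a, b, c, d = params
    mu = a + b * ens.mean(axis=1)                             # bias-corrected mean
    sigma = np.sqrt(np.abs(c) + np.abs(d) * ens.var(axis=1))  # NGR variance model
    return crps_gaussian(mu, sigma, obs).mean()

res = minimize(mean_crps, x0=[0.0, 1.0, 1.0, 1.0], method="BFGS")
print("fitted coefficients:", res.x.round(3), "training CRPS:", round(res.fun, 3))
```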
NASA Astrophysics Data System (ADS)
Fyodorov, Yan V.
2018-06-01
We suggest a method of studying the joint probability density (JPD) of an eigenvalue and the associated 'non-orthogonality overlap factor' (also known as the 'eigenvalue condition number') of the left and right eigenvectors for non-selfadjoint Gaussian random matrices of size $N\times N$. First we derive the general finite-$N$ expression for the JPD of a real eigenvalue $\lambda$ and the associated non-orthogonality factor in the real Ginibre ensemble, and then analyze its 'bulk' and 'edge' scaling limits. The ensuing distribution is maximally heavy-tailed, so that all integer moments beyond normalization are divergent. A similar calculation for a complex eigenvalue $z$ and the associated non-orthogonality factor in the complex Ginibre ensemble is presented as well and yields a distribution with a finite first moment. Its 'bulk' scaling limit yields a distribution whose first moment reproduces the well-known result of Chalker and Mehlig (Phys Rev Lett 81(16):3367-3370, 1998), and we provide the 'edge' scaling distribution for this case as well. Our method involves evaluating the ensemble average of products and ratios of integer and half-integer powers of characteristic polynomials for Ginibre matrices, which we perform in the framework of a supersymmetry approach. Our paper complements recent studies by Bourgade and Dubach (The distribution of overlaps between eigenvectors of Ginibre matrices, 2018. arXiv:1801.01219).
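The overlap factors themselves are cheap to sample numerically, which makes the heavy-tail statement easy to observe. Below is a Monte Carlo sketch of ours for the complex Ginibre ensemble (matrix size and seed are arbitrary choices).

```python
# Diagonal non-orthogonality factors O_nn = (l_n^+ l_n)(r_n^+ r_n) with
# bi-orthogonally normalized left/right eigenvectors, l_m^+ r_n = delta_mn.
import numpy as np

rng = np.random.default_rng(0)
N = 100
G = (rng.standard_normal((N, N))
     + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)

vals, Rv = np.linalg.eig(G)       # columns of Rv: right eigenvectors
Lv = np.linalg.inv(Rv)            # rows of Rv^{-1}: left eigenvectors (l_n^+)
O = (np.abs(Lv) ** 2).sum(axis=1) * (np.abs(Rv) ** 2).sum(axis=0)

print("min overlap (Cauchy-Schwarz bound is 1):", O.min().round(3))
print("mean overlap (grows with N, cf. Chalker-Mehlig):", O.mean().round(2))
```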
Random matrix theory and portfolio optimization in Moroccan stock exchange
NASA Astrophysics Data System (ADS)
El Alaoui, Marwane
2015-09-01
In this work, we use random matrix theory to analyze eigenvalues and detect the presence of pertinent information by using the Marčenko-Pastur distribution. Thus, we study the cross-correlations among stocks of the Casablanca Stock Exchange. Moreover, we clean the correlation matrix of noisy elements to see whether the gap between predicted and realized risk is reduced. We also analyze the distributions of eigenvector components and their degree of deviation by computing the inverse participation ratio. This analysis is a way to understand the correlation structure among stocks of the Casablanca Stock Exchange portfolio.
The Limits of Coding with Joint Constraints on Detected and Undetected Error Rates
NASA Technical Reports Server (NTRS)
Dolinar, Sam; Andrews, Kenneth; Pollara, Fabrizio; Divsalar, Dariush
2008-01-01
We develop a remarkably tight upper bound on the performance of a parameterized family of bounded-angle maximum-likelihood (BA-ML) incomplete decoders. The new bound for this class of incomplete decoders is calculated from the code's weight enumerator, and is an extension of Poltyrev-type bounds developed for complete ML decoders. This bound can also be applied to bound the average performance of random code ensembles in terms of an ensemble-average weight enumerator. We also formulate conditions defining a parameterized family of optimal incomplete decoders, designed to minimize both the total codeword error probability and the undetected error probability for any fixed capability of the decoder to detect errors. We illustrate the gap between optimal and BA-ML incomplete decoding via simulation of a small code.