Sample records for mixture model clustering

  1. Evaluating Mixture Modeling for Clustering: Recommendations and Cautions

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2011-01-01

    This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magidson,…

  2. Mixture modelling for cluster analysis.

    PubMed

    McLachlan, G J; Chang, S U

    2004-10-01

    Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
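
    The outright-clustering rule described here (fit a g-component mixture, then assign each observation to the component with the highest estimated posterior probability) can be illustrated with a short Python sketch. It uses scikit-learn and synthetic data, neither of which comes from the paper; it illustrates the assignment rule, not the authors' software.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(0)
      # Synthetic continuous data: two well-separated groups (illustrative only).
      X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
                     rng.normal(5.0, 1.0, size=(100, 2))])

      g = 2                                # number of components/clusters, specified in advance
      gmm = GaussianMixture(n_components=g, covariance_type="full", random_state=0).fit(X)

      posterior = gmm.predict_proba(X)     # n x g matrix of estimated posterior probabilities
      labels = posterior.argmax(axis=1)    # max-posterior rule: cluster i = observations assigned to component i
      print(np.bincount(labels))           # cluster sizes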

  3. Cluster kinetics model for mixtures of glassformers

    NASA Astrophysics Data System (ADS)

    Brenskelle, Lisa A.; McCoy, Benjamin J.

    2007-10-01

    For glassformers we propose a binary mixture relation for parameters in a cluster kinetics model previously shown to represent pure compound data for viscosity and dielectric relaxation as functions of either temperature or pressure. The model parameters are based on activation energies and activation volumes for cluster association-dissociation processes. With the mixture parameters, we calculated dielectric relaxation times and compared the results to experimental values for binary mixtures. Mixtures of sorbitol and glycerol (seven compositions), sorbitol and xylitol (three compositions), and polychloroepihydrin and polyvinylmethylether (three compositions) were studied.

  4. A modified procedure for mixture-model clustering of regional geochemical data

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David B.; Horton, John D.

    2014-01-01

    A modified procedure is proposed for mixture-model clustering of regional-scale geochemical data. The key modification is the robust principal component transformation of the isometric log-ratio transforms of the element concentrations. This principal component transformation and the associated dimension reduction are applied before the data are clustered. The principal advantage of this modification is that it significantly improves the stability of the clustering. The principal disadvantage is that it requires subjective selection of the number of clusters and the number of principal components. To evaluate the efficacy of this modified procedure, it is applied to soil geochemical data that comprise 959 samples from the state of Colorado (USA) for which the concentrations of 44 elements are measured. The distributions of element concentrations that are derived from the mixture model and from the field samples are similar, indicating that the mixture model is a suitable representation of the transformed geochemical data. Each cluster and the associated distributions of the element concentrations are related to specific geologic and anthropogenic features. In this way, mixture model clustering facilitates interpretation of the regional geochemical data.

  5. Poisson Mixture Regression Models for Heart Disease Prediction.

    PubMed

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary generalized linear Poisson regression model, owing to its lower Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction overall, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise within the available clusters. It is concluded that heart disease prediction can be done effectively by identifying the major risks componentwise using Poisson mixture regression models.

  6. Poisson Mixture Regression Models for Heart Disease Prediction

    PubMed Central

    Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary generalized linear Poisson regression model, owing to its lower Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction overall, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise within the available clusters. It is concluded that heart disease prediction can be done effectively by identifying the major risks componentwise using Poisson mixture regression models. PMID:27999611

  7. Combining Mixture Components for Clustering*

    PubMed Central

    Baudry, Jean-Patrick; Raftery, Adrian E.; Celeux, Gilles; Lo, Kenneth; Gottardo, Raphaël

    2010-01-01

    Model-based clustering consists of fitting a mixture model to data and identifying each cluster with one of its components. Multivariate normal distributions are typically used. The number of clusters is usually determined from the data, often using BIC. In practice, however, individual clusters can be poorly fitted by Gaussian distributions, and in that case model-based clustering tends to represent one non-Gaussian cluster by a mixture of two or more Gaussian distributions. If the number of mixture components is interpreted as the number of clusters, this can lead to overestimation of the number of clusters. This is because BIC selects the number of mixture components needed to provide a good approximation to the density, rather than the number of clusters as such. We propose first selecting the total number of Gaussian mixture components, K, using BIC and then combining them hierarchically according to an entropy criterion. This yields a unique soft clustering for each number of clusters less than or equal to K. These clusterings can be compared on substantive grounds, and we also describe an automatic way of selecting the number of clusters via a piecewise linear regression fit to the rescaled entropy plot. We illustrate the method with simulated data and a flow cytometry dataset. Supplemental Materials are available on the journal Web site and described at the end of the paper. PMID:20953302
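
    The two-stage idea in this record (select the number of Gaussian components K by BIC, then combine components hierarchically so that the entropy of the soft assignments stays low) can be outlined in a short sketch. This is a simplified, greedy Python illustration using scikit-learn, not the authors' implementation, and it omits the piecewise linear regression step for choosing the final number of clusters.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      def soft_entropy(tau, eps=1e-12):
          """Entropy of an n x K matrix of posterior (soft) assignments."""
          return -np.sum(tau * np.log(tau + eps))

      def bic_then_merge(X, k_max=10):
          # Stage 1: select the number of mixture components with BIC.
          fits = [GaussianMixture(n_components=k, random_state=0).fit(X) for k in range(1, k_max + 1)]
          best = min(fits, key=lambda m: m.bic(X))
          tau = best.predict_proba(X)

          # Stage 2: greedily merge the pair of components whose combination
          # yields the lowest entropy, keeping one soft clustering per level.
          solutions = {tau.shape[1]: tau}
          while tau.shape[1] > 1:
              K = tau.shape[1]
              candidates = []
              for j in range(K):
                  for k in range(j + 1, K):
                      merged = np.column_stack([np.delete(tau, [j, k], axis=1),
                                                tau[:, j] + tau[:, k]])
                      candidates.append((soft_entropy(merged), merged))
              tau = min(candidates, key=lambda c: c[0])[1]
              solutions[tau.shape[1]] = tau
          return solutions          # soft clusterings for every number of clusters <= K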

  8. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    EPA Science Inventory

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  9. Similarity measure and domain adaptation in multiple mixture model clustering: An application to image processing.

    PubMed

    Leong, Siow Hoo; Ong, Seng Huat

    2017-01-01

    This paper considers three crucial issues in processing scaled-down images: the representation of partial images, the similarity measure, and domain adaptation. Two Gaussian mixture model based algorithms are proposed to effectively preserve image details and avoid image degradation. Multiple partial images are clustered separately through Gaussian mixture model clustering with a scan-and-select procedure to enhance the inclusion of small image details. The local image features, represented by maximum likelihood estimates of the mixture components, are classified by using the modified Bayes factor (MBF) as a similarity measure. The detection of novel local features from the MBF suggests domain adaptation, that is, changing the number of components of the Gaussian mixture model. The performance of the proposed algorithms is evaluated with simulated data and real images, and they are shown to perform much better than existing Gaussian mixture model based algorithms in reproducing images with a higher structural similarity index.

  10. Similarity measure and domain adaptation in multiple mixture model clustering: An application to image processing

    PubMed Central

    Leong, Siow Hoo

    2017-01-01

    This paper considers three crucial issues in processing scaled-down images: the representation of partial images, the similarity measure, and domain adaptation. Two Gaussian mixture model based algorithms are proposed to effectively preserve image details and avoid image degradation. Multiple partial images are clustered separately through Gaussian mixture model clustering with a scan-and-select procedure to enhance the inclusion of small image details. The local image features, represented by maximum likelihood estimates of the mixture components, are classified by using the modified Bayes factor (MBF) as a similarity measure. The detection of novel local features from the MBF suggests domain adaptation, that is, changing the number of components of the Gaussian mixture model. The performance of the proposed algorithms is evaluated with simulated data and real images, and they are shown to perform much better than existing Gaussian mixture model based algorithms in reproducing images with a higher structural similarity index. PMID:28686634

  11. Assessing variation in life-history tactics within a population using mixture regression models: a practical guide for evolutionary ecologists.

    PubMed

    Hamel, Sandra; Yoccoz, Nigel G; Gaillard, Jean-Michel

    2017-05-01

    Mixed models are now well-established methods in ecology and evolution because they allow accounting for and quantifying within- and between-individual variation. However, the required normal distribution of the random effects can often be violated by the presence of clusters among subjects, which leads to multi-modal distributions. In such cases, using what is known as mixture regression models might offer a more appropriate approach. These models are widely used in psychology, sociology, and medicine to describe the diversity of trajectories occurring within a population over time (e.g. psychological development, growth). In ecology and evolution, however, these models are seldom used even though understanding changes in individual trajectories is an active area of research in life-history studies. Our aim is to demonstrate the value of using mixture models to describe variation in individual life-history tactics within a population, and hence to promote the use of these models by ecologists and evolutionary ecologists. We first ran a set of simulations to determine whether and when a mixture model allows teasing apart latent clustering, and to contrast the precision and accuracy of estimates obtained from mixture models versus mixed models under a wide range of ecological contexts. We then used empirical data from long-term studies of large mammals to illustrate the potential of using mixture models for assessing within-population variation in life-history tactics. Mixture models performed well in most cases, except for variables following a Bernoulli distribution and when sample size was small. The four selection criteria we evaluated [Akaike information criterion (AIC), Bayesian information criterion (BIC), and two bootstrap methods] performed similarly well, selecting the right number of clusters in most ecological situations. We then showed that the normality of random effects implicitly assumed by evolutionary ecologists when using mixed models was often violated in life-history data. Mixed models were quite robust to this violation in the sense that fixed effects were unbiased at the population level. However, fixed effects at the cluster level and random effects were better estimated using mixture models. Our empirical analyses demonstrated that using mixture models facilitates the identification of the diversity of growth and reproductive tactics occurring within a population. Therefore, using this modelling framework allows testing for the presence of clusters and, when clusters occur, provides reliable estimates of fixed and random effects for each cluster of the population. In the presence or expectation of clusters, using mixture models offers a suitable extension of mixed models, particularly when evolutionary ecologists aim at identifying how ecological and evolutionary processes change within a population. Mixture regression models therefore provide a valuable addition to the statistical toolbox of evolutionary ecologists. As these models are complex and have their own limitations, we provide recommendations to guide future users. © 2016 Cambridge Philosophical Society.
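
    To make the contrast between a mixed model and a mixture regression model concrete, the following toy Python sketch fits a two-component mixture of linear regressions by EM under Gaussian errors. It is only an illustration of the model class discussed in this record; the paper's analyses rely on dedicated software and richer model structures (random effects, non-Gaussian responses, several selection criteria).

      import numpy as np

      def mixture_of_regressions(x, y, n_iter=200, seed=0):
          """EM for a 2-component mixture of simple linear regressions (toy sketch)."""
          rng = np.random.default_rng(seed)
          X = np.column_stack([np.ones_like(x), x])            # design matrix: intercept + slope
          tau = rng.uniform(0.3, 0.7, size=len(y))             # random initial responsibilities
          tau = np.column_stack([tau, 1.0 - tau])
          for _ in range(n_iter):
              betas, sigmas, weights = [], [], []
              for k in range(2):                               # M-step: weighted least squares per component
                  w = tau[:, k]
                  WX = X * w[:, None]
                  beta = np.linalg.solve(X.T @ WX, WX.T @ y)
                  resid = y - X @ beta
                  betas.append(beta)
                  sigmas.append(np.sqrt(np.sum(w * resid**2) / np.sum(w)))
                  weights.append(w.mean())
              dens = np.column_stack([                         # E-step: unnormalized component densities
                  weights[k] * np.exp(-0.5 * ((y - X @ betas[k]) / sigmas[k])**2) / sigmas[k]
                  for k in range(2)
              ])
              tau = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
          return betas, sigmas, np.array(weights), tau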

  12. 2-Way k-Means as a Model for Microbiome Samples.

    PubMed

    Jackson, Weston J; Agarwal, Ipsita; Pe'er, Itsik

    2017-01-01

    Motivation. Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project.

  13. 2-Way k-Means as a Model for Microbiome Samples

    PubMed Central

    2017-01-01

    Motivation. Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project. PMID:29177026

  14. "K"-Means May Perform as well as Mixture Model Clustering but May Also Be Much Worse: Comment on Steinley and Brusco (2011)

    ERIC Educational Resources Information Center

    Vermunt, Jeroen K.

    2011-01-01

    Steinley and Brusco (2011) presented the results of a huge simulation study aimed at evaluating cluster recovery of mixture model clustering (MMC) both for the situation where the number of clusters is known and where it is unknown. They derived rather strong conclusions on the basis of this study, especially with regard to the good performance of…

  15. An incremental DPMM-based method for trajectory clustering, modeling, and retrieval.

    PubMed

    Hu, Weiming; Li, Xi; Tian, Guodong; Maybank, Stephen; Zhang, Zhongfei

    2013-05-01

    Trajectory analysis is the basis for many applications, such as indexing of motion events in videos, activity recognition, and surveillance. In this paper, the Dirichlet process mixture model (DPMM) is applied to trajectory clustering, modeling, and retrieval. We propose an incremental version of a DPMM-based clustering algorithm and apply it to cluster trajectories. An appropriate number of trajectory clusters is determined automatically. When trajectories belonging to new clusters arrive, the new clusters can be identified online and added to the model without any retraining using the previous data. A time-sensitive Dirichlet process mixture model (tDPMM) is applied to each trajectory cluster for learning the trajectory pattern which represents the time-series characteristics of the trajectories in the cluster. Then, a parameterized index is constructed for each cluster. A novel likelihood estimation algorithm for the tDPMM is proposed, and a trajectory-based video retrieval model is developed. The tDPMM-based probabilistic matching method and the DPMM-based model growing method are combined to make the retrieval model scalable and adaptable. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.
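
    The nonparametric flavor of the clustering step ("an appropriate number of trajectory clusters is determined automatically") can be approximated off the shelf with a truncated variational Dirichlet-process Gaussian mixture. The sketch below uses scikit-learn on generic fixed-length feature vectors; it is not the incremental DPMM or the time-sensitive tDPMM developed in the paper.

      import numpy as np
      from sklearn.mixture import BayesianGaussianMixture

      rng = np.random.default_rng(1)
      # Toy stand-in for fixed-length trajectory descriptors (three latent groups).
      X = np.vstack([rng.normal(m, 0.5, size=(80, 4)) for m in (0.0, 3.0, 6.0)])

      dpgmm = BayesianGaussianMixture(
          n_components=20,                                    # truncation level, not the final cluster count
          weight_concentration_prior_type="dirichlet_process",
          weight_concentration_prior=0.1,
          random_state=1,
      ).fit(X)

      labels = dpgmm.predict(X)
      print("active clusters:", np.unique(labels).size)       # unused components receive ~zero weight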

  16. Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution

    PubMed Central

    Lo, Kenneth

    2011-01-01

    Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375

  17. Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution.

    PubMed

    Lo, Kenneth; Gottardo, Raphael

    2012-01-01

    Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
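
    A heavily simplified Python stand-in for the transformation-plus-mixture idea in this record: apply a per-variable Box-Cox transform (SciPy) to skewed, positive data and then fit a Gaussian mixture. The paper itself estimates component-wise Box-Cox parameters jointly with a multivariate t mixture inside an EM algorithm, which this sketch does not attempt.

      import numpy as np
      from scipy import stats
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(2)
      X = rng.lognormal(mean=0.0, sigma=0.8, size=(300, 3))    # skewed, strictly positive toy data

      # Column-wise Box-Cox transform (lambda estimated by maximum likelihood per column).
      X_bc = np.column_stack([stats.boxcox(X[:, j])[0] for j in range(X.shape[1])])

      labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X_bc)
      print(np.bincount(labels))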

  18. Infinite von Mises-Fisher Mixture Modeling of Whole Brain fMRI Data.

    PubMed

    Røge, Rasmus E; Madsen, Kristoffer H; Schmidt, Mikkel N; Mørup, Morten

    2017-10-01

    Cluster analysis of functional magnetic resonance imaging (fMRI) data is often performed using gaussian mixture models, but when the time series are standardized such that the data reside on a hypersphere, this modeling assumption is questionable. The consequences of ignoring the underlying spherical manifold are rarely analyzed, in part due to the computational challenges imposed by directional statistics. In this letter, we discuss a Bayesian von Mises-Fisher (vMF) mixture model for data on the unit hypersphere and present an efficient inference procedure based on collapsed Markov chain Monte Carlo sampling. Comparing the vMF and gaussian mixture models on synthetic data, we demonstrate that the vMF model has a slight advantage inferring the true underlying clustering when compared to gaussian-based models on data generated from both a mixture of vMFs and a mixture of gaussians subsequently normalized. Thus, when performing model selection, the two models are not in agreement. Analyzing multisubject whole brain resting-state fMRI data from healthy adult subjects, we find that the vMF mixture model is considerably more reliable than the gaussian mixture model when comparing solutions across models trained on different groups of subjects, and again we find that the two models disagree on the optimal number of components. The analysis indicates that the fMRI data support more than a thousand clusters, and we confirm this is not a result of overfitting by demonstrating better prediction on data from held-out subjects. Our results highlight the utility of using directional statistics to model standardized fMRI data and demonstrate that whole brain segmentation of fMRI data requires a very large number of functional units in order to adequately account for the discernible statistical patterns in the data.

  19. Mixture Modeling: Applications in Educational Psychology

    ERIC Educational Resources Information Center

    Harring, Jeffrey R.; Hodis, Flaviu A.

    2016-01-01

    Model-based clustering methods, commonly referred to as finite mixture modeling, have been applied to a wide variety of cross-sectional and longitudinal data to account for heterogeneity in population characteristics. In this article, we elucidate 2 such approaches: growth mixture modeling and latent profile analysis. Both techniques are…

  20. Robust Bayesian clustering.

    PubMed

    Archambeau, Cédric; Verleysen, Michel

    2007-01-01

    A new variational Bayesian learning algorithm for Student-t mixture models is introduced. This algorithm leads to (i) robust density estimation, (ii) robust clustering and (iii) robust automatic model selection. Gaussian mixture models are learning machines which are based on a divide-and-conquer approach. They are commonly used for density estimation and clustering tasks, but are sensitive to outliers. The Student-t distribution has heavier tails than the Gaussian distribution and is therefore less sensitive to any departure of the empirical distribution from Gaussianity. As a consequence, the Student-t distribution is suitable for constructing robust mixture models. In this work, we formalize the Bayesian Student-t mixture model as a latent variable model in a different way from Svensén and Bishop [Svensén, M., & Bishop, C. M. (2005). Robust Bayesian mixture modelling. Neurocomputing, 64, 235-252]. The main difference resides in the fact that it is not necessary to assume a factorized approximation of the posterior distribution on the latent indicator variables and the latent scale variables in order to obtain a tractable solution. Not neglecting the correlations between these unobserved random variables leads to a Bayesian model having an increased robustness. Furthermore, it is expected that the lower bound on the log-evidence is tighter. Based on this bound, the model complexity, i.e. the number of components in the mixture, can be inferred with a higher confidence.

  1. A mixture model-based approach to the clustering of microarray expression data.

    PubMed

    McLachlan, G J; Bean, R W; Peel, D

    2002-03-01

    This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to effectively reduce the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes can be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
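
    The gene-screening step described here (rank genes by the likelihood ratio statistic for one versus two mixture components, then threshold) is easy to prototype. The sketch below uses univariate Gaussian mixtures from scikit-learn rather than the t mixtures of EMMIX-GENE, and the threshold value is an arbitrary placeholder, so treat it as an illustration of the idea rather than a reimplementation.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      def screen_genes(expr, threshold=8.0):
          """expr: tissues x genes matrix; returns indices of genes passing the screen."""
          keep = []
          for j in range(expr.shape[1]):
              x = expr[:, j].reshape(-1, 1)
              ll1 = GaussianMixture(1, random_state=0).fit(x).score(x) * len(x)          # one-component log-likelihood
              ll2 = GaussianMixture(2, random_state=0, n_init=5).fit(x).score(x) * len(x)
              if -2.0 * (ll1 - ll2) > threshold:               # likelihood ratio statistic for 1 vs 2 components
                  keep.append(j)
          return np.array(keep)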

  2. Joint model-based clustering of nonlinear longitudinal trajectories and associated time-to-event data analysis, linked by latent class membership: with application to AIDS clinical studies.

    PubMed

    Huang, Yangxin; Lu, Xiaosun; Chen, Jiaqing; Liang, Juan; Zangmeister, Miriam

    2017-10-27

    Longitudinal and time-to-event data are often observed together. Finite mixture models are currently used to analyze nonlinear heterogeneous longitudinal data, which, by releasing the homogeneity restriction of nonlinear mixed-effects (NLME) models, can cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance, and be associated with clinically important time-to-event data. This article develops a joint modeling approach to a finite mixture of NLME models for longitudinal data and proportional hazard Cox model for time-to-event data, linked by individual latent class indicators, under a Bayesian framework. The proposed joint models and method are applied to a real AIDS clinical trial data set, followed by simulation studies to assess the performance of the proposed joint model and a naive two-step model, in which finite mixture model and Cox model are fitted separately.

  3. Analyzing gene expression time-courses based on multi-resolution shape mixture model.

    PubMed

    Li, Ying; He, Ye; Zhang, Yu

    2016-11-01

    Biological processes are dynamic molecular processes that unfold over time. Time-course gene expression experiments provide opportunities to explore patterns of gene expression change over time and to understand the dynamic behavior of gene expression, which is crucial for studying the development and progression of biology and disease. Analysis of gene expression time-course profiles has not been fully exploited so far and remains a challenging problem. We propose a novel shape-based mixture model clustering method for gene expression time-course profiles to explore significant gene groups. Based on multi-resolution fractal features and a mixture clustering model, we propose a multi-resolution shape mixture model algorithm. The multi-resolution fractal features are computed by wavelet decomposition, which explores patterns of change in gene expression over time at different resolutions. Our proposed multi-resolution shape mixture model algorithm is a probabilistic framework that offers a more natural and robust way of clustering time-course gene expression. We assessed the performance of our proposed algorithm using yeast time-course gene expression profiles, compared with several popular clustering methods for gene expression profiles. The grouped genes identified by the different methods are evaluated by enrichment analysis of biological pathways and known protein-protein interactions from experimental evidence. The grouped genes identified by our proposed algorithm have stronger biological significance. A novel multi-resolution shape mixture model algorithm based on multi-resolution fractal features is proposed. Our proposed model provides a novel perspective and an alternative tool for visualization and analysis of time-course gene expression profiles. The R and Matlab programs are available upon request. Copyright © 2016 Elsevier Inc. All rights reserved.
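
    One way to picture the pipeline in this record is: decompose each expression profile with a wavelet transform, summarize it per resolution level, and cluster the resulting feature vectors with a mixture model. The Python sketch below does exactly that with simple per-level coefficient energies (PyWavelets and scikit-learn assumed available); the paper's multi-resolution fractal features and its own mixture algorithm are more elaborate.

      import numpy as np
      import pywt
      from sklearn.mixture import GaussianMixture

      def multiresolution_features(profile, wavelet="db2", level=3):
          """Wavelet-coefficient energy per resolution band of one time-course profile."""
          coeffs = pywt.wavedec(np.asarray(profile, dtype=float), wavelet, level=level)
          return np.array([np.sum(c**2) for c in coeffs])

      def cluster_profiles(profiles, n_clusters=4):
          """profiles: iterable of equal-length expression time courses."""
          feats = np.vstack([multiresolution_features(p) for p in profiles])
          return GaussianMixture(n_components=n_clusters, random_state=0).fit_predict(feats)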

  4. Regional SAR Image Segmentation Based on Fuzzy Clustering with Gamma Mixture Model

    NASA Astrophysics Data System (ADS)

    Li, X. L.; Zhao, Q. H.; Li, Y.

    2017-09-01

    Most stochastic fuzzy clustering algorithms are pixel-based and cannot effectively overcome the inherent speckle noise in SAR images. To deal with this problem, a regional SAR image segmentation algorithm based on fuzzy clustering with a Gamma mixture model is proposed in this paper. First, generating points are initialized randomly on the image, and the image domain is divided into many sub-regions using the Voronoi tessellation technique. Each sub-region is regarded as a homogeneous area in which the pixels share the same cluster label. Then, the probability of a pixel is assumed to follow a Gamma mixture model with parameters corresponding to the cluster to which the pixel belongs. The negative logarithm of the probability represents the dissimilarity measure between the pixel and the cluster, and the regional dissimilarity measure of a sub-region is defined as the sum of the measures of the pixels in that region. Furthermore, the Markov Random Field (MRF) model is extended from the pixel level to the Voronoi sub-regions, and the regional objective function is established under the framework of fuzzy clustering. The optimal segmentation results are obtained by solving for the model parameters and generating points. Finally, the effectiveness of the proposed algorithm is demonstrated by qualitative and quantitative analysis of the segmentation results for simulated and real SAR images.

  5. Finding Groups Using Model-Based Cluster Analysis: Heterogeneous Emotional Self-Regulatory Processes and Heavy Alcohol Use Risk

    ERIC Educational Resources Information Center

    Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

    2008-01-01

    Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…

  6. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.
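
    The hierarchy-building recipe in this record (split all samples into two clusters, then split each cluster again, and so on) can be imitated with a recursive two-component mixture fit. The Python sketch below uses maximum-likelihood Gaussian mixtures from scikit-learn in place of the Bayesian finite mixture, Hamiltonian Monte Carlo sampling, and expert checking of posterior modes used in the paper.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      def hierarchical_split(X, depth, min_size=20, prefix=""):
          """Return {cluster label: row indices into X}, splitting each cluster in two per level."""
          if depth == 0 or len(X) < 2 * min_size:
              return {prefix or "root": np.arange(len(X))}
          labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X)
          out = {}
          for side in (0, 1):
              idx = np.where(labels == side)[0]
              sub = hierarchical_split(X[idx], depth - 1, min_size, prefix + str(side))
              out.update({name: idx[rows] for name, rows in sub.items()})
          return out

      # Example: a three-level hierarchy (up to 8 leaf clusters) on a samples x features matrix X.
      # clusters = hierarchical_split(X, depth=3)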

  7. Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

    PubMed Central

    Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

    2017-01-01

    We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392

  8. Annotated Computer Output for Illustrative Examples of Clustering Using the Mixture Method and Two Comparable Methods from SAS.

    DTIC Science & Technology

    1987-06-26

    …introduction to the use of mixture models in clustering. Cornell University Biometrics Unit Technical Report BU-920-M and Mathematical Sciences Institute. …Clustering using the mixture method and two comparable methods from SAS. Cornell University Biometrics Unit Technical Report BU-921-M and Mathematical Sciences Institute. (DTIC accession number AD-A184 687.)

  9. A pattern-mixture model approach for handling missing continuous outcome data in longitudinal cluster randomized trials.

    PubMed

    Fiero, Mallorie H; Hsu, Chiu-Hsieh; Bell, Melanie L

    2017-11-20

    We extend the pattern-mixture approach to handle missing continuous outcome data in longitudinal cluster randomized trials, which randomize groups of individuals to treatment arms, rather than the individuals themselves. Individuals who drop out at the same time point are grouped into the same dropout pattern. We approach extrapolation of the pattern-mixture model by applying multilevel multiple imputation, which imputes missing values while appropriately accounting for the hierarchical data structure found in cluster randomized trials. To assess parameters of interest under various missing data assumptions, imputed values are multiplied by a sensitivity parameter, k, which increases or decreases imputed values. Using simulated data, we show that estimates of parameters of interest can vary widely under differing missing data assumptions. We conduct a sensitivity analysis using real data from a cluster randomized trial by increasing k until the treatment effect inference changes. By performing a sensitivity analysis for missing data, researchers can assess whether certain missing data assumptions are reasonable for their cluster randomized trial. Copyright © 2017 John Wiley & Sons, Ltd.
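
    The sensitivity-analysis mechanics described here (multiply imputed values by a sensitivity parameter k and re-estimate the treatment effect over a grid of k) can be sketched in a few lines. The toy Python version below uses a single-level regression imputation and a simple difference in arm means, so it ignores the multilevel multiple imputation and clustered standard errors required in cluster randomized trials; the function and variable names are illustrative only.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      def sensitivity_scan(baseline, arm, outcome, k_grid=np.linspace(0.5, 1.5, 11)):
          """baseline: covariate; arm: 0/1 treatment indicator; outcome: continuous, np.nan where missing."""
          X = np.column_stack([baseline, arm])
          observed = ~np.isnan(outcome)
          imputer = LinearRegression().fit(X[observed], outcome[observed])
          imputed = imputer.predict(X[~observed])
          effects = []
          for k in k_grid:
              y = outcome.copy()
              y[~observed] = k * imputed                       # scale the imputations by the sensitivity parameter
              effects.append(y[arm == 1].mean() - y[arm == 0].mean())
          return dict(zip(np.round(k_grid, 2), np.round(effects, 3)))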

  10. DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.

    PubMed

    Sun, Zhe; Wang, Ting; Deng, Ke; Wang, Xiao-Feng; Lafyatis, Robert; Ding, Ying; Hu, Ming; Chen, Wei

    2018-01-01

    Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifier (UMI). Despite the technology advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. Particularly, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods. DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available on www.pitt.edu/∼wec47/singlecell.html. wei.chen@chp.edu or hum@ccf.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
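
    The count-model intuition behind DIMM-SC can be conveyed with a much simpler relative: an EM algorithm for a mixture of multinomials over the UMI count vectors (no Dirichlet prior, a small pseudocount for numerical stability). The Python sketch below is that simplified relative, not the published method or its R package.

      import numpy as np

      def multinomial_mixture_em(counts, n_clusters, n_iter=100, seed=0, eps=1e-8):
          """counts: cells x genes matrix of UMI counts; returns (soft assignments, gene profiles)."""
          rng = np.random.default_rng(seed)
          n_cells, _ = counts.shape
          tau = rng.dirichlet(np.ones(n_clusters), size=n_cells)     # initial soft assignments
          for _ in range(n_iter):
              # M-step: cluster weights and per-cluster gene proportion profiles.
              pi = tau.mean(axis=0)
              prof = tau.T @ counts + eps
              prof /= prof.sum(axis=1, keepdims=True)
              # E-step: multinomial log-likelihood (up to a constant) plus log prior weight.
              logp = counts @ np.log(prof).T + np.log(pi)
              logp -= logp.max(axis=1, keepdims=True)
              tau = np.exp(logp)
              tau /= tau.sum(axis=1, keepdims=True)
          return tau, prof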

  11. Whole-Volume Clustering of Time Series Data from Zebrafish Brain Calcium Images via Mixture Modeling.

    PubMed

    Nguyen, Hien D; Ullmann, Jeremy F P; McLachlan, Geoffrey J; Voleti, Venkatakaushik; Li, Wenze; Hillman, Elizabeth M C; Reutens, David C; Janke, Andrew L

    2018-02-01

    Calcium is a ubiquitous messenger in neural signaling events. An increasing number of techniques are enabling visualization of neurological activity in animal models via luminescent proteins that bind to calcium ions. These techniques generate large volumes of spatially correlated time series. A model-based functional data analysis methodology via Gaussian mixtures is proposed for the clustering of data from such visualizations. The methodology is theoretically justified and a computationally efficient approach to estimation is suggested. An example analysis of a zebrafish imaging experiment is presented.

  12. Model selection for clustering of pharmacokinetic responses.

    PubMed

    Guerra, Rui P; Carvalho, Alexandra M; Mateus, Paulo

    2018-08-01

    Pharmacokinetics comprises the study of drug absorption, distribution, metabolism and excretion over time. Clinical pharmacokinetics, focusing on therapeutic management, offers important insights towards personalised medicine through the study of efficacy and toxicity of drug therapies. This study is hampered by subjects' high variability in drug blood concentration when starting a therapy with the same drug dosage. Clustering of pharmacokinetics responses has been addressed recently as a way to stratify subjects and provide different drug doses for each stratum. This clustering method, however, is not able to automatically determine the correct number of clusters; it relies on a user-defined parameter for collapsing clusters that are closer than a given heuristic threshold. We aim to use information-theoretical approaches to address parameter-free model selection. We propose two model selection criteria for clustering pharmacokinetics responses, founded on the Minimum Description Length and on the Normalised Maximum Likelihood. Experimental results show the ability of the model selection schemes to unveil the correct number of clusters underlying the mixture of pharmacokinetics responses. In this work we were able to devise two model selection criteria to determine the number of clusters in a mixture of pharmacokinetics curves, advancing over previous works. A cost-efficient parallel implementation in Java of the proposed method is publicly available for the community. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Clustered mixed nonhomogeneous Poisson process spline models for the analysis of recurrent event panel data.

    PubMed

    Nielsen, J D; Dean, C B

    2008-09-01

    A flexible semiparametric model for analyzing longitudinal panel count data arising from mixtures is presented. Panel count data refers here to count data on recurrent events collected as the number of events that have occurred within specific follow-up periods. The model assumes that the counts for each subject are generated by mixtures of nonhomogeneous Poisson processes with smooth intensity functions modeled with penalized splines. Time-dependent covariate effects are also incorporated into the process intensity using splines. Discrete mixtures of these nonhomogeneous Poisson process spline models extract functional information from underlying clusters representing hidden subpopulations. The motivating application is an experiment to test the effectiveness of pheromones in disrupting the mating pattern of the cherry bark tortrix moth. Mature moths arise from hidden, but distinct, subpopulations and monitoring the subpopulation responses was of interest. Within-cluster random effects are used to account for correlation structures and heterogeneity common to this type of data. An estimating equation approach to inference requiring only low moment assumptions is developed and the finite sample properties of the proposed estimating functions are investigated empirically by simulation.

  14. Evolution of Carbon Clusters in the Detonation Products of the Triaminotrinitrobenzene (TATB)-Based Explosive PBX 9502

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Watkins, Erik B.; Velizhanin, Kirill A.; Dattelbaum, Dana M.

    Here, the detonation of carbon-rich high explosives yields solid carbon as a major constituent of the product mixture and, depending on the thermodynamic conditions behind the shock front, a variety of carbon allotropes and morphologies may form and evolve. We applied time-resolved small angle x-ray scattering (TR-SAXS) to investigate the dynamics of carbon clustering during detonation of PBX 9502, an explosive composed of triaminotrinitrobenzene (TATB) and 5 wt% fluoropolymer binder. Solid carbon formation was probed from 0.1 to 2.0 μs behind the detonation front and revealed rapid carbon cluster growth which reached a maximum after ~200 ns. The late-time carbon clusters had a radius of gyration of 3.3 nm which is consistent with 8.4 nm diameter spherical particles and matched particle sizes of recovered products. Simulations using a clustering kinetics model were found to be in good agreement with the experimental measurements of cluster growth when invoking a freeze-out temperature, and temporal shift associated with the initial precipitation of solid carbon. Product densities from reactive flow models were compared to the electron density contrast obtained from TR-SAXS and used to approximate the carbon cluster composition as a mixture of 20% highly ordered (diamond-like) and 80% disordered carbon forms, which will inform future product equation of state models for solid carbon in PBX 9502 detonation product mixtures.
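
    (Worked conversion, assuming a homogeneous solid sphere, for which Rg^2 = (3/5)R^2: a radius of gyration Rg = 3.3 nm gives a diameter d = 2*Rg*sqrt(5/3) ≈ 2 × 3.3 nm × 1.29 ≈ 8.5 nm, consistent with the 8.4 nm spherical-particle diameter quoted above.)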

  15. Evolution of Carbon Clusters in the Detonation Products of the Triaminotrinitrobenzene (TATB)-Based Explosive PBX 9502

    DOE PAGES

    Watkins, Erik B.; Velizhanin, Kirill A.; Dattelbaum, Dana M.; ...

    2017-08-15

    Here, the detonation of carbon-rich high explosives yields solid carbon as a major constituent of the product mixture and, depending on the thermodynamic conditions behind the shock front, a variety of carbon allotropes and morphologies may form and evolve. We applied time-resolved small angle x-ray scattering (TR-SAXS) to investigate the dynamics of carbon clustering during detonation of PBX 9502, an explosive composed of triaminotrinitrobenzene (TATB) and 5 wt% fluoropolymer binder. Solid carbon formation was probed from 0.1 to 2.0 μs behind the detonation front and revealed rapid carbon cluster growth which reached a maximum after ~200 ns. The late-time carbon clusters had a radius of gyration of 3.3 nm which is consistent with 8.4 nm diameter spherical particles and matched particle sizes of recovered products. Simulations using a clustering kinetics model were found to be in good agreement with the experimental measurements of cluster growth when invoking a freeze-out temperature, and temporal shift associated with the initial precipitation of solid carbon. Product densities from reactive flow models were compared to the electron density contrast obtained from TR-SAXS and used to approximate the carbon cluster composition as a mixture of 20% highly ordered (diamond-like) and 80% disordered carbon forms, which will inform future product equation of state models for solid carbon in PBX 9502 detonation product mixtures.

  16. Evolution of Carbon Clusters in the Detonation Products of the Triaminotrinitrobenzene (TATB)-Based Explosive PBX 9502

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Watkins, Erik B.; Velizhanin, Kirill A.; Dattelbaum, Dana M.

    The detonation of carbon-rich high explosives yields solid carbon as a major constituent of the product mixture and, depending on the thermodynamic conditions behind the shock front, a variety of carbon allotropes and morphologies may form and evolve. We applied time-resolved small angle x-ray scattering (TR-SAXS) to investigate the dynamics of carbon clustering during detonation of PBX 9502, an explosive composed of triaminotrinitrobenzene (TATB) and 5 wt% fluoropolymer binder. Solid carbon formation was probed from 0.1 to 2.0 μs behind the detonation front and revealed rapid carbon cluster growth which reached a maximum after ~200 ns. The late-time carbon clusters had a radius of gyration of 3.3 nm which is consistent with 8.4 nm diameter spherical particles and matched particle sizes of recovered products. Simulations using a clustering kinetics model were found to be in good agreement with the experimental measurements of cluster growth when invoking a freeze-out temperature, and temporal shift associated with the initial precipitation of solid carbon. Product densities from reactive flow models were compared to the electron density contrast obtained from TR-SAXS and used to approximate the carbon cluster composition as a mixture of 20% highly ordered (diamond-like) and 80% disordered carbon forms, which will inform future product equation of state models for solid carbon in PBX 9502 detonation product mixtures.

  17. Self-organization in a bimotility mixture of model microswimmers

    NASA Astrophysics Data System (ADS)

    Agrawal, Adyant; Babu, Sujin B.

    2018-02-01

    We study the cooperation and segregation dynamics in a bimotility mixture of microorganisms which swim at low Reynolds numbers via periodic deformations along the body. We employ a multiparticle collision dynamics method to simulate a two component mixture of artificial swimmers, termed as Taylor lines, which differ from each other only in the propulsion speed. The analysis reveals that a contribution of slower swimmers towards clustering, on average, is much larger as compared to the faster ones. We notice distinctive self-organizing dynamics, depending on the percentage difference in the speed of the two kinds. If this difference is large, the faster ones fragment the clusters of the slower ones in order to reach the boundary and form segregated clusters. Contrarily, when it is small, both kinds mix together at first, the faster ones usually leading the cluster and then gradually the slower ones slide out thereby also leading to segregation.

  18. Transformation and model choice for RNA-seq co-expression analysis.

    PubMed

    Rau, Andrea; Maugis-Rabusseau, Cathy

    2018-05-01

    Although a large number of clustering algorithms have been proposed to identify groups of co-expressed genes from microarray data, the question of if and how such methods may be applied to RNA sequencing (RNA-seq) data remains unaddressed. In this work, we investigate the use of data transformations in conjunction with Gaussian mixture models for RNA-seq co-expression analyses, as well as a penalized model selection criterion to select both an appropriate transformation and number of clusters present in the data. This approach has the advantage of accounting for per-cluster correlation structures among samples, which can be strong in RNA-seq data. In addition, it provides a rigorous statistical framework for parameter estimation, an objective assessment of data transformations and number of clusters and the possibility of performing diagnostic checks on the quality and homogeneity of the identified clusters. We analyze four varied RNA-seq data sets to illustrate the use of transformations and model selection in conjunction with Gaussian mixture models. Finally, we propose a Bioconductor package coseq (co-expression of RNA-seq data) to facilitate implementation and visualization of the recommended RNA-seq co-expression analyses.
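
    The workflow described here (transform the expression profiles, fit Gaussian mixtures over a range of cluster numbers, and pick the number of clusters with a penalized criterion) can be outlined in Python as follows. The particular transformation (log of library-size-normalized profiles) and the ICL-style entropy-penalized criterion in this sketch are stand-ins for the choices studied in the paper and implemented in its coseq package.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      def cluster_coexpression(counts, k_range=range(2, 11)):
          """counts: genes x samples matrix of RNA-seq counts; returns (labels, best criterion value)."""
          profiles = counts / counts.sum(axis=0, keepdims=True)      # per-sample normalization
          Y = np.log(profiles + 1e-6)                                # simple transformation (assumed here)
          best_model, best_crit = None, np.inf
          for k in k_range:
              gmm = GaussianMixture(n_components=k, random_state=0).fit(Y)
              tau = gmm.predict_proba(Y)
              entropy = -np.sum(tau * np.log(tau + 1e-12))
              crit = gmm.bic(Y) + 2.0 * entropy                      # ICL-like: BIC plus an entropy penalty
              if crit < best_crit:
                  best_model, best_crit = gmm, crit
          return best_model.predict(Y), best_crit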

  19. Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

    PubMed Central

    2012-01-01

    Background: Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results: We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions: Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data. PMID:23151154

  20. The CLASSY clustering algorithm: Description, evaluation, and comparison with the iterative self-organizing clustering system (ISOCLS). [used for LACIE data

    NASA Technical Reports Server (NTRS)

    Lennington, R. K.; Malek, H.

    1978-01-01

    A clustering method, CLASSY, was developed, which alternates maximum likelihood iteration with a procedure for splitting, combining, and eliminating the resulting statistics. The method maximizes the fit of a mixture of normal distributions to the observed first through fourth central moments of the data and produces an estimate of the proportions, means, and covariances in this mixture. The mathematical model which is the basis for CLASSY and the actual operation of the algorithm are described. Data comparing the performances of CLASSY and ISOCLS on simulated and actual LACIE data are presented.

  1. Mixture-Tuned, Clutter Matched Filter for Remote Detection of Subpixel Spectral Signals

    NASA Technical Reports Server (NTRS)

    Thompson, David R.; Mandrake, Lukas; Green, Robert O.

    2013-01-01

    Mapping localized spectral features in large images demands sensitive and robust detection algorithms. Two aspects of large images that can harm matched-filter detection performance are addressed simultaneously. First, multimodal backgrounds may thwart the typical Gaussian model. Second, outlier features can trigger false detections from large projections onto the target vector. Two state-of-the-art approaches are combined that independently address outlier false positives and multimodal backgrounds. The background clustering models multimodal backgrounds, and the mixture tuned matched filter (MT-MF) addresses outliers. Combining the two methods captures significant additional performance benefits. The resulting mixture tuned clutter matched filter (MT-CMF) shows effective performance on simulated and airborne datasets. The classical MNF transform was applied, followed by k-means clustering. Then, each cluster's mean, covariance, and the corresponding eigenvalues were estimated. This yields a cluster-specific matched filter estimate as well as a cluster-specific feasibility score to flag outlier false positives. The technology described is a proof of concept that may be employed in future target detection and mapping applications for remote imaging spectrometers. It is of most direct relevance to JPL proposals for airborne and orbital hyperspectral instruments. Applications include subpixel target detection in hyperspectral scenes for military surveillance. Earth science applications include mineralogical mapping, species discrimination for ecosystem health monitoring, and land use classification.
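
    The cluster-conditional part of the detector can be sketched quickly: cluster the background spectra with k-means, estimate each cluster's mean and covariance, and score every pixel with the matched filter of its own cluster. The Python sketch below does only that; it omits the MNF transform, the mixture tuning, and the per-cluster feasibility score that distinguish the MT-CMF described in this record.

      import numpy as np
      from sklearn.cluster import KMeans

      def cluster_matched_filter(pixels, target, n_clusters=5, ridge=1e-6):
          """pixels: n x bands array of spectra; target: length-bands target signature."""
          labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(pixels)
          scores = np.empty(len(pixels))
          for k in range(n_clusters):
              idx = labels == k
              mu = pixels[idx].mean(axis=0)
              cov = np.cov(pixels[idx], rowvar=False) + ridge * np.eye(pixels.shape[1])
              d = np.linalg.solve(cov, target - mu)                        # Sigma^{-1} (t - mu)
              scores[idx] = (pixels[idx] - mu) @ d / ((target - mu) @ d)   # classic matched-filter score
          return scores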

  2. Modeling sports highlights using a time-series clustering framework and model interpretation

    NASA Astrophysics Data System (ADS)

    Radhakrishnan, Regunathan; Otsuka, Isao; Xiong, Ziyou; Divakaran, Ajay

    2005-01-01

    In our past work on sports highlights extraction, we have shown the utility of detecting audience reaction using an audio classification framework. The audio classes in the framework were chosen based on intuition. In this paper, we present a systematic way of identifying the key audio classes for sports highlights extraction using a time series clustering framework. We treat the low-level audio features as a time series and model the highlight segments as "unusual" events in a background of a "usual" process. The set of audio classes to characterize the sports domain is then identified by analyzing the consistent patterns in each of the clusters output from the time series clustering framework. The distribution of features from the training data so obtained for each of the key audio classes is parameterized by a Minimum Description Length Gaussian Mixture Model (MDL-GMM). We also interpret the meaning of each of the mixture components of the MDL-GMM for the key audio class (the "highlight" class) that is correlated with highlight moments. Our results show that the "highlight" class is a mixture of audience cheering and the commentator's excited speech. Furthermore, we show that the precision-recall performance for highlights extraction based on this "highlight" class is better than that of our previous approach, which uses only audience cheering as the key highlight class.

  3. Lifetime assessment by intermittent inspection under the mixture Weibull power law model with application to XLPE cables.

    PubMed

    Hirose, H

    1997-01-01

    This paper proposes a new treatment for electrical insulation degradation. Some types of insulation that have been used under various circumstances are considered to degrade at various rates in accordance with their stress circumstances. The cross-linked polyethylene (XLPE) insulated cables inspected by major Japanese electric companies clearly indicate such phenomena. By assuming that the inspected specimen is sampled from one of the clustered groups, a mixed degradation model can be constructed. Since the degradation of the insulation under common circumstances is considered to follow a Weibull distribution, a mixture model and a Weibull power law can be combined; this is called the mixture Weibull power law model. By applying maximum likelihood estimation of the newly proposed model to Japanese 22 and 33 kV insulation-class cables, the cables are clustered into a certain number of groups using the AIC and the generalized likelihood ratio test method. The reliability of the cables at specified years is assessed.
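
    As a hedged illustration of the clustering-by-AIC idea (without the stress-dependent Weibull power law itself), the sketch below fits a two-component Weibull mixture by direct numerical maximum likelihood and compares its AIC with a single Weibull fit on synthetic breakdown times; all names and starting values are illustrative.

      import numpy as np
      from scipy.stats import weibull_min
      from scipy.optimize import minimize
      from scipy.special import expit  # logistic, keeps the mixing weight in (0, 1)

      def neg_loglik(params, t):
          """Negative log-likelihood of a two-component Weibull mixture.

          params = [logit(w), log c1, log s1, log c2, log s2] with shape c and scale s.
          """
          w = expit(params[0])
          c1, s1, c2, s2 = np.exp(params[1:])
          pdf = (w * weibull_min.pdf(t, c1, scale=s1)
                 + (1 - w) * weibull_min.pdf(t, c2, scale=s2))
          return -np.sum(np.log(pdf + 1e-300))

      # Synthetic times to breakdown from two degradation groups.
      t = np.concatenate([weibull_min.rvs(1.5, scale=5.0, size=150, random_state=2),
                          weibull_min.rvs(3.0, scale=20.0, size=100, random_state=3)])

      res2 = minimize(neg_loglik, x0=[0.0, 0.3, 1.0, 1.0, 3.0], args=(t,),
                      method="Nelder-Mead", options={"maxiter": 5000})
      c, loc, s = weibull_min.fit(t, floc=0)            # single-Weibull fit for comparison
      ll1 = np.sum(weibull_min.logpdf(t, c, scale=s))
      aic1, aic2 = -2 * ll1 + 2 * 2, 2 * res2.fun + 2 * 5
      print("AIC, single Weibull:", round(aic1, 1), " AIC, 2-component mixture:", round(aic2, 1))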

  4. "K"-Means Clustering and Mixture Model Clustering: Reply to McLachlan (2011) and Vermunt (2011)

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2011-01-01

    McLachlan (2011) and Vermunt (2011) each provided thoughtful replies to our original article (Steinley & Brusco, 2011). This response serves to incorporate some of their comments while simultaneously clarifying our position. We argue that greater caution against overparameterization must be taken when assuming that clusters are highly elliptical…

  5. Biclustering Models for Two-Mode Ordinal Data.

    PubMed

    Matechou, Eleni; Liu, Ivy; Fernández, Daniel; Farias, Miguel; Gjelsvik, Bergljot

    2016-09-01

    The work in this paper introduces finite mixture models that can be used to simultaneously cluster the rows and columns of two-mode ordinal categorical response data, such as those resulting from Likert scale responses. We use the popular proportional odds parameterisation and propose models which provide insights into major patterns in the data. Model-fitting is performed using the EM algorithm, and a fuzzy allocation of rows and columns to corresponding clusters is obtained. The clustering ability of the models is evaluated in a simulation study and demonstrated using two real data sets.

  6. CLUSTERING SOUTH AFRICAN HOUSEHOLDS BASED ON THEIR ASSET STATUS USING LATENT VARIABLE MODELS

    PubMed Central

    McParland, Damien; Gormley, Isobel Claire; McCormick, Tyler H.; Clark, Samuel J.; Kabudula, Chodziwadziwa Whiteson; Collinson, Mark A.

    2014-01-01

    The Agincourt Health and Demographic Surveillance System has since 2001 conducted a biannual household asset survey in order to quantify household socio-economic status (SES) in a rural population living in northeast South Africa. The survey contains binary, ordinal and nominal items. In the absence of income or expenditure data, the SES landscape in the study population is explored and described by clustering the households into homogeneous groups based on their asset status. A model-based approach to clustering the Agincourt households, based on latent variable models, is proposed. In the case of modeling binary or ordinal items, item response theory models are employed. For nominal survey items, a factor analysis model, similar in nature to a multinomial probit model, is used. Both model types have an underlying latent variable structure—this similarity is exploited and the models are combined to produce a hybrid model capable of handling mixed data types. Further, a mixture of the hybrid models is considered to provide clustering capabilities within the context of mixed binary, ordinal and nominal response data. The proposed model is termed a mixture of factor analyzers for mixed data (MFA-MD). The MFA-MD model is applied to the survey data to cluster the Agincourt households into homogeneous groups. The model is estimated within the Bayesian paradigm, using a Markov chain Monte Carlo algorithm. Intuitive groupings result, providing insight to the different socio-economic strata within the Agincourt region. PMID:25485026

  7. Screening and clustering of sparse regressions with finite non-Gaussian mixtures.

    PubMed

    Zhang, Jian

    2017-06-01

    This article proposes a method to address the problem that can arise when covariates in a regression setting are not Gaussian, which may give rise to approximately mixture-distributed errors, or when a true mixture of regressions produced the data. The method begins with non-Gaussian mixture-based marginal variable screening, followed by fitting a full but relatively smaller mixture regression model to the selected data with the help of a new penalization scheme. Under certain regularity conditions, the new screening procedure is shown to possess a sure screening property even when the population is heterogeneous. We further prove that there exists an elbow point in the associated scree plot which results in a consistent estimator of the set of active covariates in the model. By simulations, we demonstrate that the new procedure can substantially improve the performance of the existing procedures in the context of variable screening and data clustering. By applying the proposed procedure to motif data analysis in molecular biology, we demonstrate that the new method holds promise in practice. © 2016, The International Biometric Society.

  8. A quantum mechanical strategy to investigate the structure of liquids: the cases of acetonitrile, formamide, and their mixture.

    PubMed

    Mennucci, Benedetta; da Silva, Clarissa O

    2008-06-05

    A computational strategy based on quantum mechanical (QM) calculations and continuum solvation models is used to investigate the structure of liquids (either neat liquids or mixtures). The strategy is based on the comparison of calculated and experimental spectroscopic properties (IR-Raman vibrational frequencies and Raman intensities). In particular, neat formamide, neat acetonitrile, and their equimolar mixture are studied by comparing isolated and solvated clusters of different nature and size. In all cases, the study seems to indicate that liquids, even when strongly associated, can be effectively modeled in terms of a shell-like system in which clusters of strongly interacting molecules (the microenvironments) are solvated by a polarizable macroenvironment represented by the rest of the molecules. Only by properly accounting for both of these effects can a correct picture of the liquid structure be achieved.

  9. Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering

    DTIC Science & Technology

    2005-08-04

    Only a truncated indexing snippet is available for this record. It mentions an example application to a four-band magnetic resonance image (MRI) of a brain with a tumor, consisting of 23,712 pixels, together with fragments of the report's reference list.

  10. Latent Class Detection and Class Assignment: A Comparison of the MAXEIG Taxometric Procedure and Factor Mixture Modeling Approaches

    ERIC Educational Resources Information Center

    Lubke, Gitta; Tueller, Stephen

    2010-01-01

    Taxometric procedures such as MAXEIG and factor mixture modeling (FMM) are used in latent class clustering, but they have very different sets of strengths and weaknesses. Taxometric procedures, popular in psychiatric and psychopathology applications, do not rely on distributional assumptions. Their sole purpose is to detect the presence of latent…

  11. A multi-Poisson dynamic mixture model to cluster developmental patterns of gene expression by RNA-seq.

    PubMed

    Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling

    2015-03-01

    Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes as per their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of the multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by the first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and obtain the estimates of Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing real data from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
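
    A hedged sketch of the core clustering step is given below: EM for a mixture of independent Poisson profiles over time points. The paper's AR(1) temporal dependence and its formal model selection for the number of clusters are deliberately omitted, and the synthetic counts are only illustrative.

      import numpy as np
      from scipy.stats import poisson
      from scipy.special import logsumexp

      def poisson_mixture_em(counts, n_clusters=3, n_iter=100, seed=0):
          """EM for clustering genes whose read counts follow cluster-specific Poisson means.

          counts : (n_genes, n_timepoints) array of non-negative integers.
          """
          rng = np.random.default_rng(seed)
          n, T = counts.shape
          lam = counts[rng.choice(n, n_clusters, replace=False)] + 1.0  # init from random genes
          pi = np.full(n_clusters, 1.0 / n_clusters)
          for _ in range(n_iter):
              # E-step: responsibilities from independent Poisson log-pmfs summed over time.
              log_r = np.log(pi) + np.stack(
                  [poisson.logpmf(counts, lam[k]).sum(axis=1) for k in range(n_clusters)], axis=1)
              log_r -= logsumexp(log_r, axis=1, keepdims=True)
              r = np.exp(log_r)
              # M-step: weighted mean profile per cluster.
              nk = r.sum(axis=0) + 1e-12
              pi = nk / n
              lam = (r.T @ counts) / nk[:, None] + 1e-9
          return pi, lam, r

      # Synthetic expression trajectories from three developmental patterns.
      rng = np.random.default_rng(3)
      means = np.array([[2, 5, 20, 40], [30, 20, 10, 5], [8, 8, 8, 8]], dtype=float)
      genes = np.vstack([rng.poisson(means[i], size=(100, 4)) for i in range(3)])
      pi, lam, post = poisson_mixture_em(genes)
      print("estimated cluster mean profiles:\n", np.round(lam, 1))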

  12. Gaussian mixture clustering and imputation of microarray data.

    PubMed

    Ouyang, Ming; Welsh, William J; Georgopoulos, Panos

    2004-04-12

    In microarray experiments, missing entries arise from blemishes on the chips. In large-scale studies, virtually every chip contains some missing entries and more than 90% of the genes are affected. Many analysis methods require a full set of data. Either those genes with missing entries are excluded, or the missing entries are filled with estimates prior to the analyses. This study compares methods of missing value estimation. Two evaluation metrics of imputation accuracy are employed. First, the root mean squared error measures the difference between the true values and the imputed values. Second, the number of mis-clustered genes measures the difference between clustering with true values and that with imputed values; it examines the bias introduced by imputation to clustering. The Gaussian mixture clustering with model averaging imputation is superior to all other imputation methods, according to both evaluation metrics, on both time-series (correlated) and non-time series (uncorrelated) data sets.
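
    A hedged sketch of the basic idea, without the model averaging described above: fit a Gaussian mixture to the complete rows, then fill each missing entry with the conditional mean under the component that best explains the row's observed entries. The data, component count, and function name are illustrative.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      def gmm_impute(X, n_components=3, random_state=0):
          """Impute NaN entries using a GMM fitted on the complete rows of X."""
          complete = ~np.isnan(X).any(axis=1)
          gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                                random_state=random_state).fit(X[complete])
          Xi = X.copy()
          for i in np.where(~complete)[0]:
              o = ~np.isnan(X[i])                      # observed dimensions
              m = ~o                                   # missing dimensions
              if not o.any():
                  continue                             # nothing observed to condition on
              best_ll, best_fill = -np.inf, None
              for k in range(n_components):
                  mu, S = gmm.means_[k], gmm.covariances_[k]
                  Soo = S[np.ix_(o, o)]
                  diff = X[i, o] - mu[o]
                  _, logdet = np.linalg.slogdet(Soo)
                  # Log joint of component k and the observed sub-vector.
                  ll = np.log(gmm.weights_[k]) - 0.5 * (
                      logdet + diff @ np.linalg.solve(Soo, diff) + o.sum() * np.log(2 * np.pi))
                  if ll > best_ll:
                      best_fill = mu[m] + S[np.ix_(m, o)] @ np.linalg.solve(Soo, diff)
                      best_ll = ll
              Xi[i, m] = best_fill
          return Xi

      # Toy expression matrix with 5% of entries knocked out at random.
      rng = np.random.default_rng(4)
      X_true = rng.multivariate_normal([0, 1, 2, 3], np.eye(4) * 0.5 + 0.5, size=400)
      X_miss = X_true.copy()
      mask = rng.random(X_miss.shape) < 0.05
      X_miss[mask] = np.nan
      X_hat = gmm_impute(X_miss)
      print("imputation RMSE:", round(float(np.sqrt(np.mean((X_hat[mask] - X_true[mask]) ** 2))), 3))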

  13. A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.

    PubMed

    Bruneau, Marine; Mottet, Thierry; Moulin, Serge; Kerbiriou, Maël; Chouly, Franz; Chretien, Stéphane; Guyeux, Christophe

    2018-02-01

    In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clusters is not required here. For the sake of illustration, this method is applied on a set of 100 DNA sequences taken from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene, extracted from a collection of Platyhelminthes and Nematoda species. The resulting clusters are tightly consistent with the phylogenetic tree computed using a maximum likelihood approach on gene alignment. They are coherent too with the NCBI taxonomy. Further test results based on synthesized data are then provided, showing that the proposed approach is better able to recover the clusters than the most widely used software, namely Cd-hit-est and BLASTClust. Copyright © 2017 Elsevier Ltd. All rights reserved.
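
    A hedged sketch of the same pipeline shape, not the package itself: k-mer frequency vectors as a stand-in featurisation, scikit-learn's SpectralEmbedding for the Laplacian eigenmap step, and BIC over Gaussian mixtures to choose the number of clusters. All parameter values and the toy sequences are illustrative.

      import numpy as np
      from itertools import product
      from sklearn.manifold import SpectralEmbedding
      from sklearn.mixture import GaussianMixture

      def kmer_profile(seq, k=3):
          """Normalised k-mer frequency vector of a DNA sequence over the alphabet ACGT."""
          kmers = ["".join(p) for p in product("ACGT", repeat=k)]
          index = {km: i for i, km in enumerate(kmers)}
          v = np.zeros(len(kmers))
          for i in range(len(seq) - k + 1):
              j = index.get(seq[i:i + k])
              if j is not None:
                  v[j] += 1
          return v / max(v.sum(), 1)

      def cluster_sequences(seqs, n_dims=3, max_clusters=6):
          X = np.array([kmer_profile(s) for s in seqs])
          emb = SpectralEmbedding(n_components=n_dims, affinity="rbf",
                                  random_state=0).fit_transform(X)
          models = [GaussianMixture(k, random_state=0).fit(emb) for k in range(1, max_clusters + 1)]
          best = min(models, key=lambda m: m.bic(emb))
          return best.predict(emb), best.n_components

      # Toy example: two families of random sequences with different base compositions.
      rng = np.random.default_rng(5)
      fam1 = ["".join(rng.choice(list("ACGT"), p=[0.4, 0.1, 0.1, 0.4], size=300)) for _ in range(20)]
      fam2 = ["".join(rng.choice(list("ACGT"), p=[0.1, 0.4, 0.4, 0.1], size=300)) for _ in range(20)]
      labels, n_found = cluster_sequences(fam1 + fam2)
      print("chosen number of clusters:", n_found)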

  14. On selecting a prior for the precision parameter of Dirichlet process mixture models

    USGS Publications Warehouse

    Dorazio, R.M.

    2009-01-01

    In hierarchical mixture models the Dirichlet process is used to specify latent patterns of heterogeneity, particularly when the distribution of latent parameters is thought to be clustered (multimodal). The parameters of a Dirichlet process include a precision parameter α and a base probability measure G0. In problems where α is unknown and must be estimated, inferences about the level of clustering can be sensitive to the choice of prior assumed for α. In this paper an approach is developed for computing a prior for the precision parameter α that can be used in the presence or absence of prior information about the level of clustering. This approach is illustrated in an analysis of counts of stream fishes. The results of this fully Bayesian analysis are compared with an empirical Bayes analysis of the same data and with a Bayesian analysis based on an alternative commonly used prior.
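
    A common ingredient in reasoning about this prior, not specific to the paper, is the relationship between the precision parameter and the prior expected number of distinct clusters among n observations, E[K] = sum_{i=1}^{n} alpha / (alpha + i - 1). The short sketch below tabulates this quantity so a prior on alpha can be matched to prior beliefs about clustering; the sample size is an arbitrary placeholder.

      import numpy as np

      def expected_clusters(alpha, n):
          """Prior expected number of distinct clusters among n observations
          under a Dirichlet process with precision alpha."""
          i = np.arange(1, n + 1)
          return float(np.sum(alpha / (alpha + i - 1)))

      n = 200  # placeholder sample size, e.g. number of sampled stream sites
      for alpha in (0.1, 0.5, 1.0, 2.0, 5.0):
          print(f"alpha = {alpha:4.1f}  ->  E[number of clusters] = {expected_clusters(alpha, n):5.2f}")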

  15. Young star clusters in nearby molecular clouds

    NASA Astrophysics Data System (ADS)

    Getman, K. V.; Kuhn, M. A.; Feigelson, E. D.; Broos, P. S.; Bate, M. R.; Garmire, G. P.

    2018-06-01

    The SFiNCs (Star Formation in Nearby Clouds) project is an X-ray/infrared study of the young stellar populations in 22 star-forming regions with distances ≲ 1 kpc designed to extend our earlier MYStIX (Massive Young Star-Forming Complex Study in Infrared and X-ray) survey of more distant clusters. Our central goal is to give empirical constraints on cluster formation mechanisms. Using parametric mixture models applied homogeneously to the catalogue of SFiNCs young stars, we identify 52 SFiNCs clusters and 19 unclustered stellar structures. The procedure gives cluster properties including location, population, morphology, association with molecular clouds, absorption, age (AgeJX), and infrared spectral energy distribution (SED) slope. Absorption, SED slope, and AgeJX are age indicators. SFiNCs clusters are examined individually, and collectively with MYStIX clusters, to give the following results. (1) SFiNCs is dominated by smaller, younger, and more heavily obscured clusters than MYStIX. (2) SFiNCs cloud-associated clusters have high ellipticities aligned with their host molecular filaments, indicating morphology inherited from their parental clouds. (3) The effect of cluster expansion is evident from the radius-age, radius-absorption, and radius-SED correlations. Core radii increase dramatically from ˜0.08 to ˜0.9 pc over the age range 1-3.5 Myr. Inferred gas removal time-scales are longer than 1 Myr. (4) Rich, spatially distributed stellar populations are present in SFiNCs clouds representing early generations of star formation. An appendix compares the performance of the mixture models and non-parametric minimum spanning tree to identify clusters. This work is a foundation for future SFiNCs/MYStIX studies including disc longevity, age gradients, and dynamical modelling.

  16. Characterization of deuterium clusters mixed with helium gas for an application in beam-target-fusion experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bang, W.; Quevedo, H. J.; Bernstein, A. C.

    We measured the average deuterium cluster size within a mixture of deuterium clusters and helium gas by detecting Rayleigh scattering signals. The average cluster size from the gas mixture was comparable to that from a pure deuterium gas when the total backing pressure and temperature of the gas mixture were the same as those of the pure deuterium gas. According to these measurements, the average size of deuterium clusters depends on the total pressure and not the partial pressure of deuterium in the gas mixture. To characterize the cluster source size further, a Faraday cup was used to measure the average kinetic energy of the ions resulting from Coulomb explosion of deuterium clusters upon irradiation by an intense ultrashort pulse. The deuterium ions indeed acquired a similar amount of energy from the mixture target, corroborating our measurements of the average cluster size. As the addition of helium atoms did not reduce the resulting ion kinetic energies, the reported results confirm the utility of using a known cluster source for beam-target-fusion experiments by introducing a secondary target gas.

  17. Characterization of deuterium clusters mixed with helium gas for an application in beam-target-fusion experiments

    DOE PAGES

    Bang, W.; Quevedo, H. J.; Bernstein, A. C.; ...

    2014-12-10

    We measured the average deuterium cluster size within a mixture of deuterium clusters and helium gas by detecting Rayleigh scattering signals. The average cluster size from the gas mixture was comparable to that from a pure deuterium gas when the total backing pressure and temperature of the gas mixture were the same as those of the pure deuterium gas. According to these measurements, the average size of deuterium clusters depends on the total pressure and not the partial pressure of deuterium in the gas mixture. To characterize the cluster source size further, a Faraday cup was used to measure the average kinetic energy of the ions resulting from Coulomb explosion of deuterium clusters upon irradiation by an intense ultrashort pulse. The deuterium ions indeed acquired a similar amount of energy from the mixture target, corroborating our measurements of the average cluster size. As the addition of helium atoms did not reduce the resulting ion kinetic energies, the reported results confirm the utility of using a known cluster source for beam-target-fusion experiments by introducing a secondary target gas.

  18. Model-Based Clustering of Regression Time Series Data via APECM -- An AECM Algorithm Sung to an Even Faster Beat

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Wei-Chen; Maitra, Ranjan

    2011-01-01

    We propose a model-based approach for clustering time series regression data in an unsupervised machine learning framework to identify groups under the assumption that each mixture component follows a Gaussian autoregressive regression model of order p. Given the number of groups, the traditional maximum likelihood approach of estimating the parameters using the expectation-maximization (EM) algorithm can be employed, although it is computationally demanding. The somewhat fast tune to the EM folk song provided by the Alternating Expectation Conditional Maximization (AECM) algorithm can alleviate the problem to some extent. In this article, we develop an alternative partial expectation conditional maximization algorithm (APECM) that uses an additional data augmentation storage step to efficiently implement AECM for finite mixture models. Results on our simulation experiments show improved performance in both fewer numbers of iterations and computation time. The methodology is applied to the problem of clustering mutual funds data on the basis of their average annual per cent returns and in the presence of economic indicators.
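
    For orientation, here is a hedged sketch of plain EM for a Gaussian mixture of linear regressions; it shows the E and M steps that AECM and the proposed APECM reorganise for speed, but it omits the autoregressive structure and the data-augmentation storage step of the article. Names and the synthetic data are illustrative.

      import numpy as np
      from scipy.stats import norm
      from scipy.special import logsumexp

      def mixture_of_regressions_em(X, y, n_clusters=2, n_iter=200, seed=0):
          """EM for y ~ Normal(X @ beta_k, sigma_k^2) mixed with weights pi_k."""
          rng = np.random.default_rng(seed)
          n, p = X.shape
          beta = rng.normal(size=(n_clusters, p))
          sigma = np.ones(n_clusters)
          pi = np.full(n_clusters, 1.0 / n_clusters)
          for _ in range(n_iter):
              # E-step: posterior component memberships.
              log_r = np.log(pi) + norm.logpdf(y[:, None], loc=X @ beta.T, scale=sigma)
              log_r -= logsumexp(log_r, axis=1, keepdims=True)
              r = np.exp(log_r)
              # M-step: weighted least squares per component.
              for k in range(n_clusters):
                  w = r[:, k]
                  W = X * w[:, None]
                  beta[k] = np.linalg.solve(X.T @ W + 1e-8 * np.eye(p), W.T @ y)
                  resid = y - X @ beta[k]
                  sigma[k] = np.sqrt(np.sum(w * resid ** 2) / w.sum() + 1e-12)
              pi = r.mean(axis=0)
          return beta, sigma, pi, r

      # Two groups of "funds" whose returns respond differently to one economic indicator.
      rng = np.random.default_rng(6)
      x = rng.normal(size=(300, 1))
      X = np.hstack([np.ones((300, 1)), x])
      y = np.where(rng.random(300) < 0.5, 2 + 3 * x[:, 0], -1 - 2 * x[:, 0]) + rng.normal(0, 0.3, 300)
      beta, sigma, pi, _ = mixture_of_regressions_em(X, y)
      print("estimated regression coefficients per cluster:\n", np.round(beta, 2))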

  19. Abnormal characteristics of binary molecular clusters in DMSO–ethanol mixtures under external electric fields

    NASA Astrophysics Data System (ADS)

    Wu, Zhiyan; Huang, Kama

    2018-05-01

    To explain the nonlinear behaviour of the dielectric properties of dimethyl sulfoxide (DMSO)-ethanol mixtures under a low-intensity microwave field, we propose the conjecture that some abnormal molecular clusters exist. To interpret the mechanism of the abnormal phenomena and confirm this conjecture, an in-depth investigation of the structure evolution of (DMSO)m(C2H5OH)n (m = 0-4; n = 0-4; m + n ≤ 4) molecular clusters induced by external electric fields was carried out using density functional theory. The results show that some binary molecular clusters with large cluster radii exist in the mixtures, and some of them are unstable under exposure to electric fields. This implies that the existence of certain abnormal molecular clusters in DMSO-ethanol mixtures accounts for the abnormality of their dielectric properties.

  20. Mixed-up trees: the structure of phylogenetic mixtures.

    PubMed

    Matsen, Frederick A; Mossel, Elchanan; Steel, Mike

    2008-05-01

    In this paper, we apply new geometric and combinatorial methods to the study of phylogenetic mixtures. The focus of the geometric approach is to describe the geometry of phylogenetic mixture distributions for the two state random cluster model, which is a generalization of the two state symmetric (CFN) model. In particular, we show that the set of mixture distributions forms a convex polytope and we calculate its dimension; corollaries include a simple criterion for when a mixture of branch lengths on the star tree can mimic the site pattern frequency vector of a resolved quartet tree. Furthermore, by computing volumes of polytopes we can clarify how "common" non-identifiable mixtures are under the CFN model. We also present a new combinatorial result which extends any identifiability result for a specific pair of trees of size six to arbitrary pairs of trees. Next we present a positive result showing identifiability of rates-across-sites models. Finally, we answer a question raised in a previous paper concerning "mixed branch repulsion" on trees larger than quartet trees under the CFN model.

  1. A Novel Information-Theoretic Approach for Variable Clustering and Predictive Modeling Using Dirichlet Process Mixtures

    PubMed Central

    Chen, Yun; Yang, Hui

    2016-01-01

    In the era of big data, there is increasing interest in clustering variables for the minimization of data redundancy and the maximization of variable relevancy. Existing clustering methods, however, depend on nontrivial assumptions about the data structure. Note that nonlinear interdependence among variables poses significant challenges to the traditional framework of predictive modeling. In the present work, we reformulate the problem of variable clustering from an information theoretic perspective that does not require the assumption of data structure for the identification of nonlinear interdependence among variables. Specifically, we propose the use of mutual information to characterize and measure nonlinear correlation structures among variables. Further, we develop Dirichlet process (DP) models to cluster variables based on the mutual-information measures among variables. Finally, orthonormalized variables in each cluster are integrated with a group elastic-net model to improve the performance of predictive modeling. Both simulation and real-world case studies showed that the proposed methodology not only effectively reveals the nonlinear interdependence structures among variables but also outperforms traditional variable clustering algorithms such as hierarchical clustering. PMID:27966581
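
    A hedged sketch of the first ingredient only, assuming equal-width binning as a crude mutual-information estimator: build the pairwise mutual-information matrix between variables that the Dirichlet process clustering would then operate on. The DP model and the group elastic-net step are not reproduced, and all names are illustrative.

      import numpy as np
      from sklearn.metrics import mutual_info_score

      def mutual_information_matrix(X, n_bins=10):
          """Pairwise mutual information between the columns of X, estimated on equal-width bins."""
          n, p = X.shape
          binned = np.column_stack(
              [np.digitize(X[:, j], np.histogram_bin_edges(X[:, j], n_bins)[1:-1]) for j in range(p)])
          M = np.zeros((p, p))
          for i in range(p):
              for j in range(i, p):
                  M[i, j] = M[j, i] = mutual_info_score(binned[:, i], binned[:, j])
          return M

      # x2 is a nonlinear function of x1 (near-zero linear correlation); x3 is independent noise.
      rng = np.random.default_rng(7)
      x1 = rng.normal(size=2000)
      X = np.column_stack([x1, np.sin(3 * x1) + 0.1 * rng.normal(size=2000), rng.normal(size=2000)])
      print(np.round(mutual_information_matrix(X), 2))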

  2. A Novel Information-Theoretic Approach for Variable Clustering and Predictive Modeling Using Dirichlet Process Mixtures.

    PubMed

    Chen, Yun; Yang, Hui

    2016-12-14

    In the era of big data, there is increasing interest in clustering variables for the minimization of data redundancy and the maximization of variable relevancy. Existing clustering methods, however, depend on nontrivial assumptions about the data structure. Note that nonlinear interdependence among variables poses significant challenges to the traditional framework of predictive modeling. In the present work, we reformulate the problem of variable clustering from an information theoretic perspective that does not require the assumption of data structure for the identification of nonlinear interdependence among variables. Specifically, we propose the use of mutual information to characterize and measure nonlinear correlation structures among variables. Further, we develop Dirichlet process (DP) models to cluster variables based on the mutual-information measures among variables. Finally, orthonormalized variables in each cluster are integrated with a group elastic-net model to improve the performance of predictive modeling. Both simulation and real-world case studies showed that the proposed methodology not only effectively reveals the nonlinear interdependence structures among variables but also outperforms traditional variable clustering algorithms such as hierarchical clustering.

  3. Copula based flexible modeling of associations between clustered event times.

    PubMed

    Geerdens, Candida; Claeskens, Gerda; Janssen, Paul

    2016-07-01

    Multivariate survival data are characterized by the presence of correlation between event times within the same cluster. First, we build multi-dimensional copulas with flexible and possibly symmetric dependence structures for such data. In particular, clustered right-censored survival data are modeled using mixtures of max-infinitely divisible bivariate copulas. Second, these copulas are fit by a likelihood approach where the vast amount of copula derivatives present in the likelihood is approximated by finite differences. Third, we formulate conditions for clustered right-censored survival data under which an information criterion for model selection is either weakly consistent or consistent. Several of the familiar selection criteria are included. A set of four-dimensional data on time-to-mastitis is used to demonstrate the developed methodology.

  4. Stability of fluctuating and transient aggregates of amphiphilic solutes in aqueous binary mixtures: Studies of dimethylsulfoxide, ethanol, and tert-butyl alcohol

    NASA Astrophysics Data System (ADS)

    Banerjee, Saikat; Bagchi, Biman

    2013-10-01

    In aqueous binary mixtures, amphiphilic solutes such as dimethylsulfoxide (DMSO), ethanol, tert-butyl alcohol (TBA), etc., are known to form aggregates (or large clusters) at small to intermediate solute concentrations. These aggregates are transient in nature. Although the system remains homogeneous on macroscopic length and time scales, the microheterogeneous aggregation may profoundly affect the properties of the mixture in several distinct ways, particularly if the survival times of the aggregates are longer than density relaxation times of the binary liquid. Here we propose a theoretical scheme to quantify the lifetime and thus the stability of these microheterogeneous clusters, and apply the scheme to calculate the same for water-ethanol, water-DMSO, and water-TBA mixtures. We show that the lifetime of these clusters can range from less than a picosecond (ps) for ethanol clusters to a few tens of ps for DMSO and TBA clusters. This helps explain the absence of a strong composition-dependent anomaly in water-ethanol mixtures but the presence of the same in water-DMSO and water-TBA mixtures.

  5. Patterning ecological risk of pesticide contamination at the river basin scale.

    PubMed

    Faggiano, Leslie; de Zwart, Dick; García-Berthou, Emili; Lek, Sovan; Gevrey, Muriel

    2010-05-01

    Ecological risk assessment was conducted to determine the risk posed by pesticide mixtures to the Adour-Garonne river basin (south-western France). The objectives of this study were to assess the general state of this basin with regard to pesticide contamination using a risk assessment procedure and to detect patterns in toxic mixture assemblages through a self-organizing map (SOM) methodology in order to identify the locations at risk. Exposure assessment, risk assessment with species sensitivity distribution, and mixture toxicity rules were used to compute six relative risk predictors for different toxic modes of action: the multi-substance potentially affected fraction of species depending on the toxic mode of action of compounds found in the mixture (msPAF CA(TMoA) values). Those predictors computed for the 131 sampling sites assessed in this study were then patterned through the SOM learning process. Four clusters of sampling sites exhibiting similar toxic assemblages were identified. In the first cluster, which comprised 83% of the sampling sites, the risk posed by pesticide mixtures to aquatic species was weak (mean msPAF value for those sites < 0.0036%), while in another cluster the risk was significant (mean msPAF < 1.09%). GIS mapping allowed an interesting spatial pattern of the distribution of sampling sites for each cluster to be highlighted, with a significant and highly localized risk in the French department of "Lot-et-Garonne". The combined use of the SOM methodology, mixture toxicity modelling and a clear geo-referenced representation of results not only revealed the general state of the Adour-Garonne basin with regard to contamination by pesticides but also made it possible to analyze the spatial pattern of toxic mixture assemblages in order to prioritize the locations at risk and to detect the group of compounds causing the greatest risk at the basin scale. Copyright 2010 Elsevier B.V. All rights reserved.

  6. A Preliminary Comparison of the Effectiveness of Cluster Analysis Weighting Procedures for Within-Group Covariance Structure.

    ERIC Educational Resources Information Center

    Donoghue, John R.

    A Monte Carlo study compared the usefulness of six variable weighting methods for cluster analysis. Data were 100 bivariate observations from 2 subgroups, generated according to a finite normal mixture model. Subgroup size, within-group correlation, within-group variance, and distance between subgroup centroids were manipulated. Of the clustering…

  7. Monte Carlo simulation of two-component bilayers: DMPC/DSPC mixtures.

    PubMed Central

    Sugár, I P; Thompson, T E; Biltonen, R L

    1999-01-01

    In this paper, we describe a relatively simple lattice model of a two-component, two-state phospholipid bilayer. Application of Monte Carlo methods to this model permits simulation of the observed excess heat capacity versus temperature curves of dimyristoylphosphatidylcholine (DMPC)/distearoylphosphatidylcholine (DSPC) mixtures as well as the lateral distributions of the components and properties related to these distributions. The analysis of the bilayer energy distribution functions reveals that the gel-fluid transition is a continuous transition for DMPC, DSPC, and all DMPC/DSPC mixtures. A comparison of the thermodynamic properties of DMPC/DSPC mixtures with the configurational properties shows that the characteristic temperatures of the configurational properties correlate well with the maxima in the excess heat capacity curves rather than with the onset and completion temperatures of the gel-fluid transition. In the gel-fluid coexistence region, we also found excellent agreement between the threshold temperatures at different system compositions detected in fluorescence recovery after photobleaching experiments and the temperatures at which the percolation probability of the gel clusters is 0.36. At every composition, the calculated mole fraction of gel state molecules at the fluorescence recovery after photobleaching threshold is 0.34 and, at the percolation threshold of gel clusters, it is 0.24. The percolation threshold mole fraction of gel or fluid lipid depends on the packing geometry of the molecules and the interchain interactions. However, it is independent of temperature, system composition, and state of the percolating cluster. PMID:10096905

  8. The nonlinear model for emergence of stable conditions in gas mixture in force field

    NASA Astrophysics Data System (ADS)

    Kalutskov, Oleg; Uvarova, Liudmila

    2016-06-01

    The case of M-component liquid evaporation from a straight cylindrical capillary into an N-component gas mixture in the presence of external forces was reviewed. It is assumed that the gas mixture is not ideal. Because the governing mass-transfer equations are nonlinear, stable states can form in the gas phase during the evaporation process for certain model parameter values. The critical concentrations of the resulting gas mixture components (the concentrations at which stable states occur in the mixture) were determined mathematically for the case of single-component fluid evaporation into a two-component atmosphere. It was concluded that this equilibrium concentration ratio of the mixture components can be achieved through the influence of external forces on the mass-transfer processes. This is one way to create stable gas clusters that can be used effectively in modern nanotechnology.

  9. Theory of anomalous critical-cluster content in high-pressure binary nucleation.

    PubMed

    Kalikmanov, V I; Labetski, D G

    2007-02-23

    Nucleation experiments in binary (a-b) mixtures, when component a is supersaturated and b (carrier gas) is undersaturated, reveal that for some mixtures at high pressures the a content of the critical cluster dramatically decreases with pressure contrary to expectations based on classical nucleation theory. We show that this phenomenon is a manifestation of the dominant role of the unlike interactions at high pressures resulting in the negative partial molar volume of component a in the vapor phase beyond the compensation pressure. The analysis is based on the pressure nucleation theorem for multicomponent systems which is invariant to a nucleation model.

  10. Mixture Hidden Markov Models in Finance Research

    NASA Astrophysics Data System (ADS)

    Dias, José G.; Vermunt, Jeroen K.; Ramos, Sofia

    Finite mixture models have proven to be a powerful framework whenever unobserved heterogeneity cannot be ignored. We introduce in finance research the Mixture Hidden Markov Model (MHMM) that takes into account time and space heterogeneity simultaneously. This approach is flexible in the sense that it can deal with the specific features of financial time series data, such as asymmetry, kurtosis, and unobserved heterogeneity. This methodology is applied to model simultaneously 12 time series of Asian stock markets indexes. Because we selected a heterogeneous sample of countries including both developed and emerging countries, we expect that heterogeneity in market returns due to country idiosyncrasies will show up in the results. The best fitting model was the one with two clusters at country level with different dynamics between the two regimes.

  11. Self-organising mixture autoregressive model for non-stationary time series modelling.

    PubMed

    Ni, He; Yin, Hujun

    2008-12-01

    Modelling non-stationary time series has been a difficult task for both parametric and nonparametric methods. One promising solution is to combine the flexibility of nonparametric models with the simplicity of parametric models. In this paper, the self-organising mixture autoregressive (SOMAR) network is adopted as such a mixture model. It breaks time series into underlying segments and at the same time fits local linear regressive models to the clusters of segments. In this way, a global non-stationary time series is represented by a dynamic set of local linear regressive models. Neural gas is used for a more flexible structure of the mixture model. Furthermore, a new similarity measure has been introduced in the self-organising network to better quantify the similarity of time series segments. The network can be used naturally in modelling and forecasting non-stationary time series. Experiments on artificial, benchmark time series (e.g. Mackey-Glass) and real-world data (e.g. numbers of sunspots and Forex rates) are presented and the results show that the proposed SOMAR network is effective and superior to other similar approaches.

  12. Spatial clustering of metal and metalloid mixtures in unregulated water sources on the Navajo Nation - Arizona, New Mexico, and Utah, USA.

    PubMed

    Hoover, Joseph H; Coker, Eric; Barney, Yolanda; Shuey, Chris; Lewis, Johnnye

    2018-08-15

    Contaminant mixtures are identified regularly in public and private drinking water supplies throughout the United States; however, the complex and often correlated nature of mixtures makes identification of relevant combinations challenging. This study employed a Bayesian clustering method to identify subgroups of water sources with similar metal and metalloid profiles. Additionally, a spatial scan statistic assessed spatial clustering of these subgroups and a human health metric was applied to investigate potential for human toxicity. These methods were applied to a dataset comprised of metal and metalloid measurements from unregulated water sources located on the Navajo Nation, in the southwest United States. Results indicated distinct subgroups of water sources with similar contaminant profiles and that some of these subgroups were spatially clustered. Several profiles had metal and metalloid concentrations that may have potential for human toxicity including arsenic, uranium, lead, manganese, and selenium. This approach may be useful for identifying mixtures in water sources, spatially evaluating the clusters, and help inform toxicological research investigating mixtures. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  13. Microheterogeneity in binary mixtures of water with CH3OH and CD3OH: ATR-IR spectroscopic, chemometric and DFT studies

    NASA Astrophysics Data System (ADS)

    Tomza, Paweł; Wrzeszcz, Władysław; Mazurek, Sylwester; Szostak, Roman; Czarnecki, Mirosław Antoni

    2018-05-01

    Here we report an ATR-IR spectroscopic study on the separation at a molecular level (microheterogeneity) and the degree of deviation of H2O/CH3OH and H2O/CD3OH mixtures from the ideal mixture. Of particular interest is the effect of isotopic substitution in the methyl group on molecular structure and interactions in both mixtures. To obtain comprehensive information from the multivariate data we applied the excess molar absorptivity spectra together with two-dimensional correlation analysis (2DCOS) and chemometric methods. In addition, the experimental results were compared and discussed with the structures of various model clusters obtained from theoretical (DFT) calculations. Our results evidence the presence of separation at a molecular level and deviation from the ideal mixture for both mixtures. The experimental and theoretical results show that the maximum of these deviations appears at the equimolar mixture. Both mixtures consist of three kinds of species: homoclusters of water and methanol and mixed clusters (heteroclusters). The heteroclusters exist in the whole range of mole fractions with the maximum close to the equimolar mixture. At this mixture composition near 55-60% of molecules are involved in heteroclusters. In contrast, the homoclusters of water occur in a limited range of mole fractions (XME < 0.85-0.9). Upon mixing the molecules of methanol form weaker hydrogen bonding as compared with the pure alcohol. In contrast, the molecules of water in the mixture are involved in stronger hydrogen bonding than those in bulk water. All these results indicate that both mixtures have a similar degree of deviation from the ideal mixture.

  14. Microheterogeneity in binary mixtures of water with CH3OH and CD3OH: ATR-IR spectroscopic, chemometric and DFT studies.

    PubMed

    Tomza, Paweł; Wrzeszcz, Władysław; Mazurek, Sylwester; Szostak, Roman; Czarnecki, Mirosław Antoni

    2018-05-15

    Here we report an ATR-IR spectroscopic study on the separation at a molecular level (microheterogeneity) and the degree of deviation of H2O/CH3OH and H2O/CD3OH mixtures from the ideal mixture. Of particular interest is the effect of isotopic substitution in the methyl group on molecular structure and interactions in both mixtures. To obtain comprehensive information from the multivariate data we applied the excess molar absorptivity spectra together with two-dimensional correlation analysis (2DCOS) and chemometric methods. In addition, the experimental results were compared and discussed with the structures of various model clusters obtained from theoretical (DFT) calculations. Our results evidence the presence of separation at a molecular level and deviation from the ideal mixture for both mixtures. The experimental and theoretical results show that the maximum of these deviations appears at the equimolar mixture. Both mixtures consist of three kinds of species: homoclusters of water and methanol and mixed clusters (heteroclusters). The heteroclusters exist in the whole range of mole fractions with the maximum close to the equimolar mixture. At this mixture composition near 55-60% of molecules are involved in heteroclusters. In contrast, the homoclusters of water occur in a limited range of mole fractions (XME < 0.85-0.9). Upon mixing the molecules of methanol form weaker hydrogen bonding as compared with the pure alcohol. In contrast, the molecules of water in the mixture are involved in stronger hydrogen bonding than those in bulk water. All these results indicate that both mixtures have a similar degree of deviation from the ideal mixture. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Structure and thermodynamics of a mixture of patchy and spherical colloids: A multi-body association theory with complete reference fluid information

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bansal, Artee; Asthagiri, D.; Cox, Kenneth R.

    A mixture of solvent particles with short-range, directional interactions and solute particles with short-range, isotropic interactions that can bond multiple times is of fundamental interest in understanding liquids and colloidal mixtures. Because of multi-body correlations, predicting the structure and thermodynamics of such systems remains a challenge. Earlier Marshall and Chapman [J. Chem. Phys. 139, 104904 (2013)] developed a theory wherein association effects due to interactions multiply the partition function for clustering of particles in a reference hard-sphere system. The multi-body effects are incorporated in the clustering process, which in their work was obtained in the absence of the bulk medium. The bulk solvent effects were then modeled approximately within a second order perturbation approach. However, their approach is inadequate at high densities and for large association strengths. Based on the idea that the clustering of solvent in a defined coordination volume around the solute is related to occupancy statistics in that defined coordination volume, we develop an approach to incorporate the complete information about hard-sphere clustering in a bulk solvent at the density of interest. The occupancy probabilities are obtained from enhanced sampling simulations but we also develop a concise parametric form to model these probabilities using the quasichemical theory of solutions. We show that incorporating the complete reference information results in an approach that can predict the bonding state and thermodynamics of the colloidal solute for a wide range of system conditions.

  16. Density-based clustering analyses to identify heterogeneous cellular sub-populations

    NASA Astrophysics Data System (ADS)

    Heaster, Tiffany M.; Walsh, Alex J.; Landman, Bennett A.; Skala, Melissa C.

    2017-02-01

    Autofluorescence microscopy of NAD(P)H and FAD provides functional metabolic measurements at the single-cell level. Here, density-based clustering algorithms were applied to metabolic autofluorescence measurements to identify cell-level heterogeneity in tumor cell cultures. The performance of the density-based clustering algorithm, DENCLUE, was tested in samples with known heterogeneity (co-cultures of breast carcinoma lines). DENCLUE was found to better represent the distribution of cell clusters compared to Gaussian mixture modeling. Overall, DENCLUE is a promising approach to quantify cell-level heterogeneity, and could be used to understand single cell population dynamics in cancer progression and treatment.
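
    DENCLUE itself is not part of the usual Python libraries, so the hedged sketch below uses DBSCAN as a generic density-based stand-in and contrasts it with a Gaussian mixture on a toy two-population "co-culture"; the feature values and algorithm settings are illustrative only.

      import numpy as np
      from sklearn.cluster import DBSCAN
      from sklearn.mixture import GaussianMixture
      from sklearn.metrics import adjusted_rand_score

      # Toy single-cell "metabolic" features from two sub-populations.
      rng = np.random.default_rng(8)
      pop_a = rng.normal([0.0, 0.0], 0.3, size=(200, 2))
      pop_b = rng.normal([1.5, 1.0], 0.3, size=(150, 2))
      X = np.vstack([pop_a, pop_b])
      truth = np.array([0] * 200 + [1] * 150)

      db_labels = DBSCAN(eps=0.3, min_samples=10).fit_predict(X)          # density-based
      gmm_labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X)

      print("ARI, density-based clustering:", round(adjusted_rand_score(truth, db_labels), 2))
      print("ARI, Gaussian mixture:", round(adjusted_rand_score(truth, gmm_labels), 2))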

  17. Bayesian Hierarchical Grouping: perceptual grouping as mixture estimation

    PubMed Central

    Froyen, Vicky; Feldman, Jacob; Singh, Manish

    2015-01-01

    We propose a novel framework for perceptual grouping based on the idea of mixture models, called Bayesian Hierarchical Grouping (BHG). In BHG we assume that the configuration of image elements is generated by a mixture of distinct objects, each of which generates image elements according to some generative assumptions. Grouping, in this framework, means estimating the number and the parameters of the mixture components that generated the image, including estimating which image elements are “owned” by which objects. We present a tractable implementation of the framework, based on the hierarchical clustering approach of Heller and Ghahramani (2005). We illustrate it with examples drawn from a number of classical perceptual grouping problems, including dot clustering, contour integration, and part decomposition. Our approach yields an intuitive hierarchical representation of image elements, giving an explicit decomposition of the image into mixture components, along with estimates of the probability of various candidate decompositions. We show that BHG accounts well for a diverse range of empirical data drawn from the literature. Because BHG provides a principled quantification of the plausibility of grouping interpretations over a wide range of grouping problems, we argue that it provides an appealing unifying account of the elusive Gestalt notion of Prägnanz. PMID:26322548

  18. The impact of catchment source group classification on the accuracy of sediment fingerprinting outputs.

    PubMed

    Pulley, Simon; Foster, Ian; Collins, Adrian L

    2017-06-01

    The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) were primarily attributed to the increased inter-group variability producing a far larger sediment source signal than the non-conservatism noise (1). Modified cluster analysis based classification methods have the potential to reduce composite uncertainty significantly in future source tracing studies. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Mixture models with entropy regularization for community detection in networks

    NASA Astrophysics Data System (ADS)

    Chang, Zhenhai; Yin, Xianjun; Jia, Caiyan; Wang, Xiaoyang

    2018-04-01

    Community detection is a key exploratory tool in network analysis and has received much attention in recent years. NMM (Newman's mixture model) is one of the best models for exploring a range of network structures including community structure, bipartite and core-periphery structures, etc. However, NMM needs to know the number of communities in advance. Therefore, in this study, we have proposed an entropy regularized mixture model (called EMM), which is capable of inferring the number of communities and identifying the network structure contained in a network simultaneously. In the model, by minimizing the entropy of the mixing coefficients of NMM using an EM (expectation-maximization) solution, small clusters containing little information can be discarded step by step. The empirical study on both synthetic networks and real networks has shown that the proposed model EMM is superior to the state-of-the-art methods.
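
    The EMM itself is not reimplemented here, but the practical effect it aims for, shrinking the weights of unneeded components so the number of groups need not be fixed in advance, can be illustrated in a hedged way with scikit-learn's BayesianGaussianMixture on toy point data; the component cap, concentration prior, and weight threshold are illustrative choices.

      import numpy as np
      from sklearn.mixture import BayesianGaussianMixture

      # Three well-separated groups, but up to ten components are allowed.
      rng = np.random.default_rng(9)
      X = np.vstack([rng.normal(c, 0.2, size=(150, 2)) for c in ([0, 0], [3, 0], [0, 3])])

      bgmm = BayesianGaussianMixture(n_components=10, weight_concentration_prior=1e-3,
                                     max_iter=500, random_state=0).fit(X)
      active = bgmm.weights_ > 0.02          # components that retained appreciable weight
      print("effective number of components:", int(active.sum()))
      print("mixing weights:", np.round(bgmm.weights_, 3))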

  20. Statistical mixture design selective extraction of compounds with antioxidant activity and total polyphenol content from Trichilia catigua.

    PubMed

    Lonni, Audrey Alesandra Stinghen Garcia; Longhini, Renata; Lopes, Gisely Cristiny; de Mello, João Carlos Palazzo; Scarminio, Ieda Spacino

    2012-03-16

    Statistical design mixtures of water, methanol, acetone and ethanol were used to extract material from Trichilia catigua (Meliaceae) barks to study the effects of different solvents and their mixtures on its yield, total polyphenol content and antioxidant activity. The experimental results and their response surface models showed that quaternary mixtures with approximately equal proportions of all four solvents provided the highest yields, total polyphenol contents and antioxidant activities of the crude extracts followed by ternary design mixtures. Principal component and hierarchical clustering analysis of the HPLC-DAD spectra of the chromatographic peaks of 1:1:1:1 water-methanol-acetone-ethanol mixture extracts indicate the presence of cinchonains, gallic acid derivatives, natural polyphenols, flavanoids, catechins, and epicatechins. Copyright © 2011 Elsevier B.V. All rights reserved.

  1. Mixture model-based clustering and logistic regression for automatic detection of microaneurysms in retinal images

    NASA Astrophysics Data System (ADS)

    Sánchez, Clara I.; Hornero, Roberto; Mayo, Agustín; García, María

    2009-02-01

    Diabetic Retinopathy is one of the leading causes of blindness and vision defects in developed countries. Early detection and diagnosis are crucial to avoid visual complications. Microaneurysms are the first ocular signs of the presence of this ocular disease. Their detection is of paramount importance for the development of a computer-aided diagnosis technique which permits a prompt diagnosis of the disease. However, the detection of microaneurysms in retinal images is a difficult task due to the wide variability that these images usually present in screening programs. We propose a statistical approach based on mixture model-based clustering and logistic regression which is robust to the changes in the appearance of retinal fundus images. The method is evaluated on the public database proposed by the Retinopathy Online Challenge in order to obtain an objective performance measure and to allow a comparative study with other proposed algorithms.
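
    A hedged sketch of the two-stage shape only (not the paper's features or preprocessing): an unsupervised Gaussian mixture is fitted to candidate-region features, and a logistic regression then scores candidates using the raw features augmented with the cluster posterior probabilities. The synthetic features and all settings are illustrative.

      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import roc_auc_score

      # Synthetic candidate-region features; label 1 = true microaneurysm, 0 = spurious candidate.
      rng = np.random.default_rng(10)
      X_pos = rng.normal([1.0, 2.0, 0.8], 0.4, size=(300, 3))
      X_neg = rng.normal([0.0, 0.5, 0.2], 0.6, size=(1200, 3))
      X = np.vstack([X_pos, X_neg])
      y = np.array([1] * 300 + [0] * 1200)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

      gmm = GaussianMixture(n_components=4, random_state=0).fit(X_tr)     # stage 1: clustering
      clf = LogisticRegression(max_iter=1000).fit(                        # stage 2: classification
          np.hstack([X_tr, gmm.predict_proba(X_tr)]), y_tr)
      scores = clf.predict_proba(np.hstack([X_te, gmm.predict_proba(X_te)]))[:, 1]
      print("AUC on held-out candidates:", round(roc_auc_score(y_te, scores), 3))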

  2. Modular analysis of the probabilistic genetic interaction network.

    PubMed

    Hou, Lin; Wang, Lin; Qian, Minping; Li, Dong; Tang, Chao; Zhu, Yunping; Deng, Minghua; Li, Fangting

    2011-03-15

    Epistatic Miniarray Profiles (EMAP) has enabled the mapping of large-scale genetic interaction networks; however, the quantitative information gained from EMAP cannot be fully exploited since the data are usually interpreted as a discrete network based on an arbitrary hard threshold. To address such limitations, we adopted a mixture modeling procedure to construct a probabilistic genetic interaction network and then implemented a Bayesian approach to identify densely interacting modules in the probabilistic network. Mixture modeling has been demonstrated as an effective soft-threshold technique for EMAP measures. The Bayesian approach was applied to an EMAP dataset studying the early secretory pathway in Saccharomyces cerevisiae. Twenty-seven modules were identified, and 14 of those were enriched by gold standard functional gene sets. We also conducted a detailed comparison with state-of-the-art algorithms, hierarchical clustering and Markov clustering. The experimental results show that the Bayesian approach outperforms the others in efficiently recovering biologically significant modules.
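
    The soft-thresholding step can be illustrated in a hedged way: fit a two-component mixture to interaction scores and use the posterior probability of the "interaction" component as a continuous edge weight instead of a hard cutoff. The assumption below that interactions correspond to the more negative score component, and all numbers, are illustrative rather than taken from the paper.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      # Synthetic EMAP-like scores: most pairs near zero, a minority with strong (negative) interactions.
      rng = np.random.default_rng(11)
      scores = np.concatenate([rng.normal(0.0, 1.0, 9000), rng.normal(-4.0, 1.5, 1000)])

      gmm = GaussianMixture(n_components=2, random_state=0).fit(scores.reshape(-1, 1))
      interaction_comp = int(np.argmin(gmm.means_.ravel()))   # assume the interaction component has the lower mean
      edge_weight = gmm.predict_proba(scores.reshape(-1, 1))[:, interaction_comp]
      print("pairs with posterior probability > 0.9:", int((edge_weight > 0.9).sum()))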

  3. Mixture Model and MDSDCA for Textual Data

    NASA Astrophysics Data System (ADS)

    Allouti, Faryel; Nadif, Mohamed; Hoai An, Le Thi; Otjacques, Benoît

    E-mailing has become an essential component of cooperation in business. Consequently, the large number of messages manually produced or automatically generated can rapidly cause information overflow for users. Many research projects have examined this issue but surprisingly few have tackled the problem of the files attached to e-mails that, in many cases, contain a substantial part of the semantics of the message. This paper considers this specific topic and focuses on the problem of clustering and visualization of attached files. Relying on the multinomial mixture model, we used the Classification EM algorithm (CEM) to cluster the set of files, and MDSDCA to visualize the obtained classes of documents. Like the Multidimensional Scaling method, the aim of the MDSDCA algorithm based on the Difference of Convex functions is to optimize the stress criterion. As MDSDCA is iterative, we propose an initialization approach to avoid starting with random values. Experiments are investigated using simulations and textual data.
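
    A hedged sketch of the Classification EM (CEM) loop for a multinomial mixture over document-term counts follows: the E-step computes component posteriors, the C-step hard-assigns each document to its most probable component, and the M-step re-estimates mixing weights and smoothed term probabilities. The MDSDCA visualisation step is not reproduced, and the toy corpus is illustrative.

      import numpy as np

      def multinomial_cem(counts, n_clusters=3, n_iter=50, seed=0):
          """Classification EM for a multinomial mixture over a document-term count matrix."""
          rng = np.random.default_rng(seed)
          n, v = counts.shape
          z = rng.integers(n_clusters, size=n)                 # random initial hard assignment
          for _ in range(n_iter):
              # M-step: mixing weights and Laplace-smoothed term probabilities per cluster.
              pi = np.array([(z == k).mean() for k in range(n_clusters)]) + 1e-12
              theta = np.vstack([counts[z == k].sum(axis=0) + 1.0 for k in range(n_clusters)])
              theta /= theta.sum(axis=1, keepdims=True)
              # E-step + C-step: hard-assign each document to its most probable component.
              log_post = np.log(pi) + counts @ np.log(theta).T
              z_new = log_post.argmax(axis=1)
              if np.array_equal(z_new, z):
                  break
              z = z_new
          return z, theta

      # Toy "attached files": three topics with different vocabularies over 30 terms.
      rng = np.random.default_rng(12)
      topics = rng.dirichlet(np.full(30, 0.2), size=3)
      docs = np.vstack([rng.multinomial(80, topics[k], size=40) for k in range(3)])
      labels, theta = multinomial_cem(docs)
      print("cluster sizes:", np.bincount(labels, minlength=3))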

  4. Galaxy formation through hierarchical clustering

    NASA Astrophysics Data System (ADS)

    White, Simon D. M.; Frenk, Carlos S.

    1991-09-01

    Analytic methods for studying the formation of galaxies by gas condensation within massive dark halos are presented. The present scheme applies to cosmogonies where structure grows through hierarchical clustering of a mixture of gas and dissipationless dark matter. The simplest models consistent with the current understanding of N-body work on dissipationless clustering, and that of numerical and analytic work on gas evolution and cooling are adopted. Standard models for the evolution of the stellar population are also employed, and new models for the way star formation heats and enriches the surrounding gas are constructed. Detailed results are presented for a cold dark matter universe with Omega = 1 and H(0) = 50 km/s/Mpc, but the present methods are applicable to other models. The present luminosity functions contain significantly more faint galaxies than are observed.

  5. Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications

    PubMed Central

    Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric

    2016-01-01

    Regression clustering is a mixture of unsupervised and supervised statistical learning and data mining method which is found in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model selection based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with analyzing a real data set on RGB cell marking in neuroscience to illustrate and interpret the method. PMID:27212939

  6. Sparse covariance estimation in heterogeneous samples

    PubMed Central

    Rodríguez, Abel; Lenkoski, Alex; Dobra, Adrian

    2015-01-01

    Standard Gaussian graphical models implicitly assume that the conditional independence among variables is common to all observations in the sample. However, in practice, observations are usually collected from heterogeneous populations where such an assumption is not satisfied, leading in turn to nonlinear relationships among variables. To address such situations we explore mixtures of Gaussian graphical models; in particular, we consider both infinite mixtures and infinite hidden Markov models where the emission distributions correspond to Gaussian graphical models. Such models allow us to divide a heterogeneous population into homogenous groups, with each cluster having its own conditional independence structure. As an illustration, we study the trends in foreign exchange rate fluctuations in the pre-Euro era. PMID:26925189

  7. SAR image segmentation using skeleton-based fuzzy clustering

    NASA Astrophysics Data System (ADS)

    Cao, Yun Yi; Chen, Yan Qiu

    2003-06-01

    SAR image segmentation can be converted to a clustering problem in which pixels or small patches are grouped together based on local feature information. In this paper, we present a novel framework for segmentation. The segmentation goal is achieved by unsupervised clustering of characteristic descriptors extracted from local patches. A mixture model of the characteristic descriptors, which combine intensity and texture features, is investigated. The unsupervised algorithm is derived from the recently proposed Skeleton-Based Data Labeling method. Skeletons are constructed as prototypes of clusters to represent arbitrary latent structures in image data. Segmentation using Skeleton-Based Fuzzy Clustering is able to automatically detect the types of surfaces that appear in SAR images without any user input.

  8. Effects of additional data on Bayesian clustering.

    PubMed

    Yamazaki, Keisuke

    2017-10-01

    Hierarchical probabilistic models, such as mixture models, are used for cluster analysis. These models have two types of variables: observable and latent. In cluster analysis, the latent variable is estimated, and it is expected that additional information will improve the accuracy of the estimation of the latent variable. Many proposed learning methods are able to use additional data; these include semi-supervised learning and transfer learning. However, from a statistical point of view, a complex probabilistic model that encompasses both the initial and additional data might be less accurate due to having a higher-dimensional parameter. The present paper presents a theoretical analysis of the accuracy of such a model and clarifies which factor has the greatest effect on its accuracy, the advantages of obtaining additional data, and the disadvantages of increasing the complexity. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Constrained Maximum Likelihood Estimation of Relative Abundances of Protein Conformation in a Heterogeneous Mixture from Small Angle X-Ray Scattering Intensity Measurements

    PubMed Central

    Onuk, A. Emre; Akcakaya, Murat; Bardhan, Jaydeep P.; Erdogmus, Deniz; Brooks, Dana H.; Makowski, Lee

    2015-01-01

    In this paper, we describe a model for maximum likelihood estimation (MLE) of the relative abundances of different conformations of a protein in a heterogeneous mixture from small angle X-ray scattering (SAXS) intensities. To consider cases where the solution includes intermediate or unknown conformations, we develop a subset selection method based on k-means clustering and the Cramér-Rao bound on the mixture coefficient estimation error to find a sparse basis set that represents the space spanned by the measured SAXS intensities of the known conformations of a protein. Then, using the selected basis set and the assumptions on the model for the intensity measurements, we show that the MLE model can be expressed as a constrained convex optimization problem. Employing the adenylate kinase (ADK) protein and its known conformations as an example, and using Monte Carlo simulations, we demonstrate the performance of the proposed estimation scheme. Here, although we use 45 crystallographically determined experimental structures and we could generate many more using, for instance, molecular dynamics calculations, the clustering technique indicates that the data cannot support the determination of relative abundances for more than 5 conformations. The estimation of this maximum number of conformations is intrinsic to the methodology we have used here. PMID:26924916
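
    A stripped-down version of the constrained estimation step can be written as a simplex-constrained least-squares fit, sketched below; the Gaussian-noise objective, the toy basis profiles, and every name here are assumptions standing in for the paper's full likelihood model and Cramér-Rao-based basis selection.

      # Hedged sketch: estimate abundances w (w >= 0, sum w = 1) that best explain a
      # measured SAXS-like profile as a convex combination of known basis profiles.
      import numpy as np
      from scipy.optimize import minimize

      def estimate_abundances(I_meas, I_basis):
          """I_meas: (n_q,) measured intensity; I_basis: (n_conf, n_q) basis intensities."""
          k = I_basis.shape[0]
          def objective(w):                               # least squares stands in for the NLL
              return np.sum((I_meas - w @ I_basis) ** 2)
          cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
          bounds = [(0.0, 1.0)] * k
          res = minimize(objective, x0=np.full(k, 1.0 / k), bounds=bounds, constraints=cons)
          return res.x

      # toy example: three invented basis profiles, true weights (0.6, 0.3, 0.1)
      q = np.linspace(0.01, 0.5, 100)
      basis = np.vstack([np.exp(-(q * s) ** 2) for s in (10, 20, 30)])
      true_w = np.array([0.6, 0.3, 0.1])
      I = true_w @ basis + np.random.default_rng(0).normal(0, 1e-3, q.size)
      print(estimate_abundances(I, basis))                # should be close to true_w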

  10. Model-based Clustering of Categorical Time Series with Multinomial Logit Classification

    NASA Astrophysics Data System (ADS)

    Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea

    2010-09-01

    A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance-measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership, we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using a Markov chain Monte Carlo (MCMC) sampling scheme representing a (full) Gibbs-type sampler which involves only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.
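
    To make the finite-mixture-of-Markov-chains idea concrete, the sketch below fits such a mixture by plain EM on a toy panel of categorical sequences; the paper itself works in a Bayesian MCMC framework with a multinomial logit model for group membership, so this frequentist sketch and its variable names are only illustrative.

      # Minimal sketch: EM for a finite mixture of first-order Markov chains over a
      # panel of categorical time series; smoothing constant and data are toy choices.
      import numpy as np

      def fit_markov_mixture(seqs, n_states, n_groups, n_iter=100, seed=0):
          rng = np.random.default_rng(seed)
          C = np.zeros((len(seqs), n_states, n_states))   # transition counts per sequence
          for i, s in enumerate(seqs):
              for a, b in zip(s[:-1], s[1:]):
                  C[i, a, b] += 1
          pi = np.full(n_groups, 1.0 / n_groups)
          P = rng.dirichlet(np.ones(n_states), size=(n_groups, n_states))  # (G, S, S)
          for _ in range(n_iter):
              # E-step: responsibilities from each sequence's log-likelihood per group
              loglik = np.einsum('iab,gab->ig', C, np.log(P))
              logpost = np.log(pi) + loglik
              logpost -= logpost.max(axis=1, keepdims=True)
              r = np.exp(logpost)
              r /= r.sum(axis=1, keepdims=True)
              # M-step: update mixing weights and group-specific transition matrices
              pi = r.mean(axis=0)
              P = np.einsum('ig,iab->gab', r, C) + 1e-3
              P /= P.sum(axis=2, keepdims=True)
          return pi, P, r

      # toy panel of three-state sequences
      seqs = [[0, 1, 1, 2, 1, 0, 1], [2, 2, 1, 2, 2, 2, 1], [0, 0, 1, 0, 1, 1, 0]]
      pi, P, r = fit_markov_mixture(seqs, n_states=3, n_groups=2)
      print(np.round(r, 2))       # posterior group membership of each sequence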

  11. The thermochemical characteristics of solution of phenol and benzoic acid in water-dimethylsulfoxide and water-acetonitrile mixtures

    NASA Astrophysics Data System (ADS)

    Zakharov, A. G.; Voronova, M. I.; Batov, D. V.; Smirnova, K. V.

    2011-03-01

    The solution of phenol and benzoic acid in water-dimethylsulfoxide (DMSO) and water-acetonitrile (AN) mixtures was studied. As distinct from benzoic acid, the thermodynamic characteristics of solution of phenol sharply change at concentrations corresponding to a change in the character of cluster formation in water-DMSO and water-AN mixtures. Differences in the solvation of phenol and benzoic acid are explained by different mechanisms of the interaction of the solutes with clusters existing in binary mixtures.

  12. Gene selection and cancer type classification of diffuse large-B-cell lymphoma using a bivariate mixture model for two-species data.

    PubMed

    Su, Yuhua; Nielsen, Dahlia; Zhu, Lei; Richards, Kristy; Suter, Steven; Breen, Matthew; Motsinger-Reif, Alison; Osborne, Jason

    2013-01-05

    A bivariate mixture model utilizing information across two species was proposed to solve the fundamental problem of identifying differentially expressed genes in microarray experiments. The model utility was illustrated using a dog and human lymphoma data set prepared by a group of scientists in the College of Veterinary Medicine at North Carolina State University. A small number of genes were identified as being differentially expressed in both species, and the human genes in this cluster serve as a good predictor for classifying diffuse large-B-cell lymphoma (DLBCL) patients into two subgroups, the germinal center B-cell-like diffuse large B-cell lymphoma and the activated B-cell-like diffuse large B-cell lymphoma. The number of human genes that were observed to be significantly differentially expressed (21) from the two-species analysis was very small compared to the number of human genes (190) identified with only one-species analysis (human data). The genes may be clinically relevant/important, as this small set achieved low misclassification rates of DLBCL subtypes. Additionally, the two subgroups defined by this cluster of human genes had significantly different survival functions, indicating that the stratification based on gene-expression profiling using the proposed mixture model provided improved insight into the clinical differences between the two cancer subtypes.

  13. Communication: Hydrogen bonding interactions in water-alcohol mixtures from X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lam, Royce K.; Smith, Jacob W.; Saykally, Richard J., E-mail: saykally@berkeley.edu

    While methanol and ethanol are macroscopically miscible with water, their mixtures exhibit negative excess entropies of mixing. Despite considerable effort in both experiment and theory, there remains significant disagreement regarding the origin of this effect. Different models for the liquid mixture structure have been proposed to address this behavior, including the enhancement of the water hydrogen bonding network around the alcohol hydrophobic groups and microscopic immiscibility or clustering. We have investigated mixtures of methanol, ethanol, and isopropanol with water by liquid microjet X-ray absorption spectroscopy on the oxygen K-edge, an atom-specific probe providing details of both inter- and intra-molecular structure. The measured spectra evidence a significant enhancement of hydrogen bonding originating from the methanol and ethanol hydroxyl groups upon the addition of water. These additional hydrogen bonding interactions would strengthen the liquid-liquid interactions, resulting in additional ordering in the liquid structures and leading to a reduction in entropy and a negative enthalpy of mixing, consistent with existing thermodynamic data. In contrast, the spectra of the isopropanol-water mixtures exhibit an increase in the number of broken alcohol hydrogen bonds for mixtures containing up to 0.5 water mole fraction, an observation consistent with existing enthalpy of mixing data, suggesting that the measured negative excess entropy is a result of clustering or micro-immiscibility.

  14. A segmentation/clustering model for the analysis of array CGH data.

    PubMed

    Picard, F; Robin, S; Lebarbier, E; Daudin, J-J

    2007-09-01

    Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.

  15. Structure and Energetics of Clusters Relevant to Thorium Tetrachloride Melts

    NASA Astrophysics Data System (ADS)

    Akdeniz, Z.; Tosi, M. P.

    2000-10-01

    We study within an ionic model the structure and energetics of neutral and charged molecular clusters which may be relevant to molten ThCl4 and to its liquid mixtures with alkali chlorides, with reference to Raman scattering experiments by Photiadis and Papatheodorou. As stressed by these authors, the most striking facts for ThCl4 in comparison to other tetrachloride compounds (and in particular to ZrCl4) are the appreciable ionic conductivity of the pure melt and the continuous structural changes which occur in the melt mixtures with varying composition. After adjusting our model to data on the isolated ThCl4 tetrahedral molecule, we evaluate (i) the Th2Cl8 dimer and the singly charged species obtained from it by chlorine-ion transfer between two such neutral dimers; (ii) the ThCl6 and ThCl7 clusters both as charged anions and as alkali-compensated species; and (iii) various oligomers carrying positive or negative double charges. Our study shows that the characteristic structural properties of the ThCl4 compound and of the alkali-Th chloride systems are the consequence of the relatively high ionic character of the binding, which is already evident in the isolated ThCl4 monomer.

  16. Finding Groups Using Model-based Cluster Analysis: Heterogeneous Emotional Self-regulatory Processes and Heavy Alcohol Use Risk

    PubMed Central

    Mun, Eun-Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

    2010-01-01

    Model-based cluster analysis is a new clustering procedure for investigating population heterogeneity using finite mixtures of multivariate normal densities. It is an inferentially based, statistically principled procedure that uses the Bayesian Information Criterion (BIC) to compare multiple, possibly non-nested models and identify the optimum number of clusters. The current study clustered 36 young men and women based on their baseline heart rate (HR) and HR variability (HRV), chronic alcohol use, and reasons for drinking. Two cluster groups were identified and labeled High Alcohol Risk and Normative groups. Compared to the Normative group, individuals in the High Alcohol Risk group had higher levels of alcohol use and more strongly endorsed disinhibition and suppression reasons for use. The High Alcohol Risk group showed significant HRV changes in response to positive and negative emotional and appetitive picture cues, compared to neutral cues. In contrast, the Normative group showed a significant HRV change only to negative cues. Findings suggest that individuals with autonomic self-regulatory difficulties may be more susceptible to heavy alcohol use and may use alcohol for emotional regulation. PMID:18331138
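
    A minimal scikit-learn sketch of this workflow, assuming simulated two-dimensional data in place of the HR/HRV and drinking measures: fit finite mixtures of multivariate normals over a range of component counts and keep the model with the best (lowest) BIC.

      # Hedged sketch of model-based clustering with BIC-based selection of the
      # number of clusters; the two simulated groups are illustrative only.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(0)
      X = np.vstack([rng.normal([0, 0], 1.0, (60, 2)),    # one simulated group
                     rng.normal([4, 3], 1.0, (40, 2))])   # a second, shifted group

      models = {g: GaussianMixture(n_components=g, covariance_type='full',
                                   random_state=0).fit(X) for g in range(1, 6)}
      bics = {g: m.bic(X) for g, m in models.items()}
      best_g = min(bics, key=bics.get)                    # lowest BIC wins
      labels = models[best_g].predict(X)
      print(best_g, np.bincount(labels))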

  17. Molecular crowding has no effect on the dilution thermodynamics of the biologically relevant cation mixtures.

    PubMed

    Głogocka, Daria; Przybyło, Magdalena; Langner, Marek

    2017-04-01

    The ionic composition of the intracellular space is rigorously maintained at the expense of high energy expenditure. It has recently been postulated that the cytoplasmic ionic composition is optimized so that the energy cost of fluctuations in calcium ion concentration is minimized. Specifically, thermodynamic arguments have been presented to show that the presence of potassium ions at concentrations higher than 100 mM reduces the extent of the energy dissipation required for the dilution of calcium cations. No such effect has been measured when sodium ions were present in the solution or when the other divalent cation, magnesium, was diluted. This experimental observation has been interpreted as an indication of the formation of ionic clusters composed of calcium, chloride and potassium. In order to test whether such clusters may be preserved in the biological milieu, the thermodynamics of ionic-mixture dilution in solutions containing albumins and model lipid bilayers was measured. The obtained thermograms clearly demonstrate that the energetics of the calcium/potassium mixture is qualitatively different from that of the calcium/sodium mixture, indicating that the presence of biologically relevant quantities of proteins and membrane hydrophilic surfaces does not interfere with the properties of the intracellular aqueous phase.

  18. A new approach for handling longitudinal count data with zero-inflation and overdispersion: poisson geometric process model.

    PubMed

    Wan, Wai-Yin; Chan, Jennifer S K

    2009-08-01

    For time series of count data, correlated measurements, clustering as well as excessive zeros occur simultaneously in biomedical applications. Ignoring such effects might contribute to misleading treatment outcomes. A generalized mixture Poisson geometric process (GMPGP) model and a zero-altered mixture Poisson geometric process (ZMPGP) model are developed from the geometric process model, which was originally developed for modelling positive continuous data and was extended to handle count data. These models are motivated by evaluating the trend development of new tumour counts for bladder cancer patients as well as by identifying useful covariates which affect the count level. The models are implemented using Bayesian method with Markov chain Monte Carlo (MCMC) algorithms and are assessed using deviance information criterion (DIC).

  19. Determining the Number of Component Clusters in the Standard Multivariate Normal Mixture Model Using Model-Selection Criteria.

    DTIC Science & Technology

    1983-06-16

    has been advocated by Gnanadesikan and Wilk (1969), and others in the literature. This suggests that, if we use the formal significance test type...American Statistical Assoc., 62, 1159-1178. Gnanadesikan, R., and Wilk, M. B. (1969). Data Analytic Methods in Multivariate Statistical Analysis. In

  20. Identification of natural metabolites in mixture: a pattern recognition strategy based on (13)C NMR.

    PubMed

    Hubert, Jane; Nuzillard, Jean-Marc; Purson, Sylvain; Hamzaoui, Mahmoud; Borie, Nicolas; Reynaud, Romain; Renault, Jean-Hugues

    2014-03-18

    Because of their highly complex metabolite profile, the chemical characterization of bioactive natural extracts usually requires time-consuming multistep purification procedures to achieve the structural elucidation of pure individual metabolites. The aim of the present work was to develop a dereplication strategy for the identification of natural metabolites directly within mixtures. Exploiting the polarity range of metabolites, the principle was to rapidly fractionate a multigram quantity of a crude extract by centrifugal partition extraction (CPE). The obtained fractions of simplified chemical composition were subsequently analyzed by (13)C NMR. After automatic collection and alignment of (13)C signals across spectra, hierarchical clustering analysis (HCA) was performed for pattern recognition. As a result, strong correlations between (13)C signals of a single structure within the mixtures of the fraction series were visualized as chemical shift clusters. Each cluster was finally assigned to a molecular structure with the help of a locally built (13)C NMR chemical shift database. The proof of principle of this strategy was achieved on a simple model mixture of commercially available plant secondary metabolites and then applied to a bark extract of the African tree Anogeissus leiocarpus Guill. & Perr. (Combretaceae). Starting from 5 g of this genuine extract, the fraction series was generated by CPE in only 95 min. (13)C NMR analyses of all fractions followed by pattern recognition of (13)C chemical shifts resulted in the unambiguous identification of seven major compounds, namely, sericoside, trachelosperogenin E, ellagic acid, an epimer mixture of (+)-gallocatechin and (-)-epigallocatechin, 3,3'-di-O-methylellagic acid 4'-O-xylopyranoside, and 3,4,3'-tri-O-methylflavellagic acid 4'-O-glucopyranoside.

  1. The application of Gaussian mixture models for signal quantification in MALDI-TOF mass spectrometry of peptides.

    PubMed

    Spainhour, John Christian G; Janech, Michael G; Schwacke, John H; Velez, Juan Carlos Q; Ramakrishnan, Viswanathan

    2014-01-01

    Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) coupled with stable isotope standards (SIS) has been used to quantify native peptides. This MALDI-TOF quantification approach has difficulties with samples containing peptides whose ion currents appear in overlapping spectra. In these overlapping spectra the currents sum together, which modifies the peak heights and makes normal SIS estimation problematic. An approach using Gaussian mixtures based on known physical constants to model the isotopic cluster of a known compound is proposed here. The characteristics of this approach are examined for single and overlapping compounds. The approach is compared to two commonly used SIS quantification methods for single compounds, namely the peak intensity method and the Riemann sum area under the curve (AUC) method. For studying the characteristics of the Gaussian mixture method, Angiotensin II, Angiotensin-2-10, and Angiotensin-1-9 and their associated SIS peptides were used. The findings suggest that the Gaussian mixture method has characteristics similar to those of the two comparison methods for estimating the quantity of isolated isotopic clusters for single compounds. All three methods were tested using MALDI-TOF mass spectra collected for peptides of the renin-angiotensin system. The Gaussian mixture method accurately estimated the native to labeled ratio of several isolated angiotensin peptides (5.2% error in ratio estimation) with similar estimation errors to those calculated using peak intensity and Riemann sum AUC methods (5.9% and 7.7%, respectively). For overlapping angiotensin peptides (where the other two methods are not applicable), the estimation error of the Gaussian mixture was 6.8%, which is within the acceptable range. In summary, for single compounds the Gaussian mixture method is equivalent or marginally superior to the existing methods of peptide quantification and is capable of quantifying overlapping (convolved) peptides within the acceptable margin of error.
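
    The sketch below illustrates the underlying idea on synthetic data: each isotopic cluster is modeled as a sum of Gaussians at a fixed ~1 Da spacing, and the native-to-labeled ratio of two overlapping clusters is recovered by fitting only their overall scales. The monoisotopic masses, isotope envelope, and peak width are invented for illustration, not the paper's physical constants.

      # Hedged sketch: fit two overlapping Gaussian-mixture isotopic clusters and
      # recover their intensity ratio; all numerical values are made up.
      import numpy as np
      from scipy.optimize import curve_fit

      def cluster_template(mz, mono_mz, envelope, sigma=0.05):
          """Sum of Gaussians at mono_mz, mono_mz + 1, ... weighted by the isotope envelope."""
          return sum(a * np.exp(-(mz - (mono_mz + i)) ** 2 / (2 * sigma ** 2))
                     for i, a in enumerate(envelope))

      envelope = [1.0, 0.55, 0.2, 0.05]                   # assumed relative isotopic abundances
      def two_clusters(mz, s_native, s_labeled):          # overlapping native + labeled clusters
          return (s_native * cluster_template(mz, 1046.5, envelope)
                  + s_labeled * cluster_template(mz, 1049.5, envelope))

      mz = np.linspace(1044, 1056, 2000)
      signal = two_clusters(mz, 2.0, 1.0) + np.random.default_rng(0).normal(0, 0.02, mz.size)

      (s_nat, s_lab), _ = curve_fit(two_clusters, mz, signal, p0=[1.0, 1.0])
      print(round(s_nat / s_lab, 3))                      # native/labeled ratio, expect ~2.0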

  2. The cluster model of a hot dense vapor

    NASA Astrophysics Data System (ADS)

    Zhukhovitskii, D. I.

    2015-04-01

    We explore thermodynamic properties of a vapor in the range of state parameters where the contribution to thermodynamic functions from bound states of atoms (clusters) dominates over the interaction between the components of the vapor in free states. The clusters are assumed to be light and sufficiently "hot" for the number of bonds to be minimized. We use the technique of calculation of the cluster partition function for the cluster with a minimum number of interatomic bonds to calculate the caloric properties (heat capacity and velocity of sound) for an ideal mixture of the lightest clusters. The problem proves to be exactly solvable and resulting formulas are functions solely of the equilibrium constant of the dimer formation. These formulas ensure a satisfactory correlation with the reference data for the vapors of cesium, mercury, and argon up to moderate densities in both the sub- and supercritical regions. For cesium, we extend the model to the densities close to the critical one by inclusion of the clusters of arbitrary size. Knowledge of the cluster composition of the cesium vapor makes it possible to treat nonequilibrium phenomena such as nucleation of the supersaturated vapor, for which the effect of the cluster structural transition is likely to be significant.

  3. Influence of birth cohort on age of onset cluster analysis in bipolar I disorder.

    PubMed

    Bauer, M; Glenn, T; Alda, M; Andreassen, O A; Angelopoulos, E; Ardau, R; Baethge, C; Bauer, R; Bellivier, F; Belmaker, R H; Berk, M; Bjella, T D; Bossini, L; Bersudsky, Y; Cheung, E Y W; Conell, J; Del Zompo, M; Dodd, S; Etain, B; Fagiolini, A; Frye, M A; Fountoulakis, K N; Garneau-Fournier, J; Gonzalez-Pinto, A; Harima, H; Hassel, S; Henry, C; Iacovides, A; Isometsä, E T; Kapczinski, F; Kliwicki, S; König, B; Krogh, R; Kunz, M; Lafer, B; Larsen, E R; Lewitzka, U; Lopez-Jaramillo, C; MacQueen, G; Manchia, M; Marsh, W; Martinez-Cengotitabengoa, M; Melle, I; Monteith, S; Morken, G; Munoz, R; Nery, F G; O'Donovan, C; Osher, Y; Pfennig, A; Quiroz, D; Ramesar, R; Rasgon, N; Reif, A; Ritter, P; Rybakowski, J K; Sagduyu, K; Scippa, A M; Severus, E; Simhandl, C; Stein, D J; Strejilevich, S; Hatim Sulaiman, A; Suominen, K; Tagata, H; Tatebayashi, Y; Torrent, C; Vieta, E; Viswanath, B; Wanchoo, M J; Zetin, M; Whybrow, P C

    2015-01-01

    Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset, using a large, international database. The database includes 4037 patients with a diagnosis of bipolar I disorder, previously collected at 36 collection sites in 23 countries. Generalized estimating equations (GEE) were used to adjust the data for country median age, and in some models, birth cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After adjusting for the birth cohort or when considering only those born after 1959, two subgroups were found. With results of either two or three subgroups, the youngest subgroup was more likely to have a family history of mood disorders and a first episode with depressed polarity. However, without adjusting for birth cohort (three subgroups), family history and polarity of the first episode could not be distinguished between the middle and oldest subgroups. These results using international data confirm prior findings using single country data, that there are subgroups of bipolar I disorder based on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more useful for research. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
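
    The adjust-then-cluster design can be mimicked as sketched below: regress the age of onset on cohort covariates, then run model-based clustering (a Gaussian mixture) on the residuals and choose the number of subgroups by BIC. The covariates, effect sizes, and data are simulated stand-ins, and ordinary least squares replaces the paper's GEE adjustment.

      # Hedged sketch: covariate adjustment followed by mixture clustering of the
      # residuals; all variables and effects are simulated placeholders.
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(0)
      n = 500
      birth_year = rng.integers(1930, 1990, n)
      country_median_age = rng.choice([25.0, 28.0, 31.0], n)
      onset = (rng.choice([20.0, 35.0], n, p=[0.6, 0.4])  # two latent onset subgroups
               - 0.2 * (birth_year - 1960)                # assumed birth cohort effect
               + 0.3 * (country_median_age - 28)
               + rng.normal(0, 3, n))

      X = np.column_stack([birth_year, country_median_age])
      resid = onset - LinearRegression().fit(X, onset).predict(X)   # adjustment step

      bic = {g: GaussianMixture(n_components=g, random_state=0)
                .fit(resid[:, None]).bic(resid[:, None]) for g in range(1, 5)}
      print(min(bic, key=bic.get), bic)                   # number of subgroups after adjustment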

  4. Nonideal mixing of phosphatidylserine and phosphatidylcholine in the fluid lamellar phase.

    PubMed Central

    Huang, J; Swanson, J E; Dibble, A R; Hinderliter, A K; Feigenson, G W

    1993-01-01

    The mixing of phosphatidylserine (PS) and phosphatidylcholine (PC) in fluid bilayer model membranes was studied by measuring binding of aqueous Ca2+ ions. The measured [Ca2+]aq was used to derive the activity coefficient for PS, gamma PS, in the lipid mixture. For (16:0, 18:1) PS in binary mixtures with either (16:0, 18:1)PC, (14:1, 14:1)PC, or (18:1, 18:1)PC, gamma PS > 1; i.e., mixing is nonideal, with PS and PC clustered rather than randomly distributed, despite the electrostatic repulsion between PS headgroups. To understand better this mixing behavior, Monte Carlo simulations of the PS/PC distributions were performed, using Kawasaki relaxation. The excess energy was divided into an electrostatic term Uel and one adjustable term including all other nonideal energy contributions, delta Em. Uel was calculated using a discrete charge theory. Kirkwood's coupling parameter method was used to calculate the excess free energy of mixing, delta GEmix, hence ln gamma PS,calc. The values of ln gamma PS,calc were equalized by adjusting delta Em in order to find the simulated PS/PC distribution that corresponded to the experimental results. We were thus able to compare the smeared charge calculation of [Ca2+]surf with a calculation ("masked evaluation method") that recognized clustering of the negatively charged PS: clustering was found to have a modest effect on [Ca2+]surf, relative to the smeared charge model. Even though both PS and PC tend to cluster, the long-range nature of the electrostatic repulsion reduces the extent of PS clustering at low PS mole fraction compared to PC clustering at an equivalent low PC mole fraction. PMID:8457667

  5. Nonideal mixing of phosphatidylserine and phosphatidylcholine in the fluid lamellar phase.

    PubMed

    Huang, J; Swanson, J E; Dibble, A R; Hinderliter, A K; Feigenson, G W

    1993-02-01

    The mixing of phosphatidylserine (PS) and phosphatidylcholine (PC) in fluid bilayer model membranes was studied by measuring binding of aqueous Ca2+ ions. The measured [Ca2+]aq was used to derive the activity coefficient for PS, gamma PS, in the lipid mixture. For (16:0, 18:1) PS in binary mixtures with either (16:0, 18:1)PC, (14:1, 14:1)PC, or (18:1, 18:1)PC, gamma PS > 1; i.e., mixing is nonideal, with PS and PC clustered rather than randomly distributed, despite the electrostatic repulsion between PS headgroups. To understand better this mixing behavior, Monte Carlo simulations of the PS/PC distributions were performed, using Kawasaki relaxation. The excess energy was divided into an electrostatic term Uel and one adjustable term including all other nonideal energy contributions, delta Em. Uel was calculated using a discrete charge theory. Kirkwood's coupling parameter method was used to calculate the excess free energy of mixing, delta GEmix, hence ln gamma PS,calc. The values of ln gamma PS,calc were equalized by adjusting delta Em in order to find the simulated PS/PC distribution that corresponded to the experimental results. We were thus able to compare the smeared charge calculation of [Ca2+]surf with a calculation ("masked evaluation method") that recognized clustering of the negatively charged PS: clustering was found to have a modest effect on [Ca2+]surf, relative to the smeared charge model. Even though both PS and PC tend to cluster, the long-range nature of the electrostatic repulsion reduces the extent of PS clustering at low PS mole fraction compared to PC clustering at an equivalent low PC mole fraction.

  6. Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data

    PubMed Central

    Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.

    2003-01-01

    Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise on clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. PMID:18629292
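
    In the same spirit, the sketch below compares a kernel-density-based clusterer (mean shift, which seeks modes of a kernel density estimate) with k-means and a Gaussian mixture on simulated expression-like profiles, scoring each against the true labels with the Adjusted Rand Index; it is a generic stand-in, not the paper's specific kernel density clustering algorithm or evaluation suite.

      # Hedged comparison sketch on simulated data; methods and metric are generic.
      import numpy as np
      from sklearn.cluster import MeanShift, KMeans
      from sklearn.mixture import GaussianMixture
      from sklearn.metrics import adjusted_rand_score

      rng = np.random.default_rng(0)
      centers = rng.normal(0, 4, (3, 10))                 # three groups of 10-dim profiles
      X = np.vstack([c + rng.normal(0, 1.0, (50, 10)) for c in centers])
      truth = np.repeat([0, 1, 2], 50)

      labels = {
          'mean shift': MeanShift().fit_predict(X),       # kernel-density-based
          'k-means': KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
          'gaussian mixture': GaussianMixture(n_components=3, random_state=0).fit_predict(X),
      }
      for name, lab in labels.items():
          print(name, round(adjusted_rand_score(truth, lab), 3))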

  7. The Impact of Multipollutant Clusters on the Association Between Fine Particulate Air Pollution and Microvascular Function.

    PubMed

    Ljungman, Petter L; Wilker, Elissa H; Rice, Mary B; Austin, Elena; Schwartz, Joel; Gold, Diane R; Koutrakis, Petros; Benjamin, Emelia J; Vita, Joseph A; Mitchell, Gary F; Vasan, Ramachandran S; Hamburg, Naomi M; Mittleman, Murray A

    2016-03-01

    Prior studies including the Framingham Heart Study have suggested associations between single components of air pollution and vascular function; however, underlying mixtures of air pollution may have distinct associations with vascular function. We used a k-means approach to construct five distinct pollution mixtures from elemental analyses of particle filters, air pollution monitoring data, and meteorology. Exposure was modeled as an interaction between fine particle mass (PM2.5) and concurrent pollution cluster. Outcome variables were two measures of microvascular function in the fingertip in the Framingham Offspring and Third Generation cohorts from 2003 to 2008. In 1,720 participants, associations between PM2.5 and baseline pulse amplitude tonometry differed by air pollution cluster (interaction P value 0.009). Higher PM2.5 on days with low mass concentrations but high proportion of ultrafine particles from traffic was associated with 18% (95% confidence interval: 4.6%, 33%) higher baseline pulse amplitude per 5 μg/m3 and days with high contributions of oil and wood combustion with 16% (95% confidence interval: 0.2%, 34%) higher baseline pulse amplitude. We observed no variation in associations of PM2.5 with hyperemic response to ischemia observed across air pollution clusters. PM2.5 exposure from air pollution mixtures with large contributions of local ultrafine particles from traffic, heating oil, and wood combustion was associated with higher baseline pulse amplitude but not hyperemic response. Our findings suggest little association between acute exposure to air pollution clusters reflective of select sources and hyperemic response to ischemia, but possible associations with excessive small artery pulsatility with potentially deleterious microvascular consequences.
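
    The analysis design (cluster days by their multipollutant profile, then test a PM2.5-by-cluster interaction on the outcome) can be sketched as below; the features, effect sizes, and outcome are simulated placeholders rather than Framingham data.

      # Hedged sketch: k-means pollution clusters followed by an interaction model;
      # every variable here is simulated for illustration.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from sklearn.cluster import KMeans
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(0)
      n = 1000
      features = rng.normal(size=(n, 6))                  # stand-in elemental/meteorology features
      cluster = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(
          StandardScaler().fit_transform(features))

      pm25 = rng.gamma(4, 2.5, n)
      outcome = 0.02 * pm25 * (cluster == 2) + rng.normal(0, 1, n)   # one cluster-specific slope

      df = pd.DataFrame({'y': outcome, 'pm25': pm25, 'cluster': cluster.astype(str)})
      model = smf.ols('y ~ pm25 * C(cluster)', data=df).fit()
      print(model.params.filter(like='pm25'))             # PM2.5 main effect and interactions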

  8. The Impact of Multi-pollutant Clusters on the Association between Fine Particulate Air Pollution and Microvascular Function

    PubMed Central

    Ljungman, Petter L.; Wilker, Elissa H.; Rice, Mary B.; Austin, Elena; Schwartz, Joel; Gold, Diane R.; Koutrakis, Petros; Benjamin, Emelia J.; Vita, Joseph A.; Mitchell, Gary F.; Vasan, Ramachandran S.

    2016-01-01

    Background Prior studies including the Framingham Heart Study have suggested associations between single components of air pollution and vascular function; however, underlying mixtures of air pollution may have distinct associations with vascular function. Methods We used a k-means approach to construct five distinct pollution mixtures from elemental analyses of particle filters, air pollution monitoring data, and meteorology. Exposure was modeled as an interaction between fine particle mass (PM2.5) and concurrent pollution cluster. Outcome variables were two measures of microvascular function in the fingertip in the Framingham Offspring and Third Generation cohorts from 2003 to 2008. Results In 1,720 participants, associations between PM2.5 and baseline pulse amplitude tonometry differed by air pollution cluster (interaction p value 0.009). Higher PM2.5 on days with low mass concentrations but high proportion of ultrafine particles from traffic was associated with 18% (95% CI 4.6%; 33%) higher baseline pulse amplitude per 5 μg/m3 and days with high contributions of oil and wood combustion with 16% (95% CI 0.2%; 34%) higher baseline pulse amplitude. We observed no variation in associations of PM2.5 with hyperemic response to ischemia observed across air pollution clusters. Conclusions PM2.5 exposure from air pollution mixtures with large contributions of local ultrafine particles from traffic, heating oil and wood combustion was associated with higher baseline pulse amplitude but not PAT ratio. Our findings suggest little association between acute exposure to air pollution clusters reflective of select sources and hyperemic response to ischemia, but possible associations with excessive small artery pulsatility with potentially deleterious microvascular consequences. PMID:26562062

  9. Cluster formation and percolation in ethanol-water mixtures

    NASA Astrophysics Data System (ADS)

    Gereben, Orsolya; Pusztai, László

    2017-10-01

    Results of systematic molecular dynamics studies of ethanol-water mixtures, over the entire concentration range, were reported previously that agree with experimental X-ray diffraction data. These simulated systems are analyzed in this work to examine cluster formation and percolation, using four different hydrogen bond definitions. Percolation analyses revealed that each mixture (even the one containing 80 mol% ethanol) is above the 3D percolation threshold, with fractal dimensions, df, between 2.6 and 2.9, depending on concentration. Monotype water cluster formation was also studied in the mixtures: 3D water percolation can be found in systems with less than 40 mol% ethanol, with fractal dimensions between 2.53 and 2.84. These observations can be put in parallel with experimental data on some thermodynamic quantities, such as the excess partial molar enthalpy and entropy.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perera, Aurélien; Mazighi, Redha

    Computer simulation studies of aqueous dimethyl sulfoxide (DMSO) mixtures show micro-heterogeneous structures, just like aqueous alcohol mixtures. However, there is a marked difference in the aggregate structure of water between the two types of systems. While water molecules form multiconnected globular clusters in alcohols, we report herein that the typical water aggregates in aqueous DMSO mixtures are linear, favouring a two-hydrogen-bond structure per water molecule, for all DMSO mole fractions ranging from 0.1 to 0.9. This linear-aggregate structure produces a particular signature in the water site-site structure factors, in the form of a pre-peak at k ≈ 0.2–0.8 Å⁻¹, depending on DMSO concentration. This pre-peak is either absent in other aqueous mixtures, such as aqueous methanol mixtures, or very difficult to see through computer simulations, such as in aqueous-t-butanol mixtures. This difference in the topology of the aggregates explains why the Kirkwood-Buff integrals of the aqueous-DMSO mixture look nearly ideal, in contrast with those of aqueous alcohol mixtures, suggesting a connection between the shape of the water aggregates, its fluctuations, and the concentration fluctuations. In order to further study this discrepancy between aqueous DMSO and aqueous alcohol mixtures, two models of pseudo-DMSO are introduced, where the size of the sulfur atom is increased by a factor 1.6 and 1.7, respectively, hence increasing the hydrophobicity of the molecule. The study shows that these mixtures become closer to the emulsion type seen in aqueous alcohol mixtures, with more globular clustering of the water molecules, long range domain oscillations in the water-water correlations and increased water-water Kirkwood-Buff integrals. It demonstrates that the local ordering of the water molecules is influenced by the nature of the solute molecules, with very different consequences for structural properties and related thermodynamic quantities. This study illustrates the unique plasticity of water in the presence of different types of solutes.

  11. Applications of modern statistical methods to analysis of data in physical science

    NASA Astrophysics Data System (ADS)

    Wicker, James Eric

    Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model, and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information-based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's, respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic-algorithm-based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.

  12. Image-driven Population Analysis through Mixture Modeling

    PubMed Central

    Sabuncu, Mert R.; Balci, Serdar K.; Shenton, Martha E.; Golland, Polina

    2009-01-01

    We present iCluster, a fast and efficient algorithm that clusters a set of images while co-registering them using a parameterized, nonlinear transformation model. The output of the algorithm is a small number of template images that represent different modes in a population. This is in contrast with traditional, hypothesis-driven computational anatomy approaches that assume a single template to construct an atlas. We derive the algorithm based on a generative model of an image population as a mixture of deformable template images. We validate and explore our method in four experiments. In the first experiment, we use synthetic data to explore the behavior of the algorithm and inform a design choice on parameter settings. In the second experiment, we demonstrate the utility of having multiple atlases for the application of localizing temporal lobe brain structures in a pool of subjects that contains healthy controls and schizophrenia patients. Next, we employ iCluster to partition a data set of 415 whole brain MR volumes of subjects aged 18 through 96 years into three anatomical subgroups. Our analysis suggests that these subgroups mainly correspond to age groups. The templates reveal significant structural differences across these age groups that confirm previous findings in aging research. In the final experiment, we run iCluster on a group of 15 patients with dementia and 15 age-matched healthy controls. The algorithm produces two modes, one of which contains dementia patients only. These results suggest that the algorithm can be used to discover sub-populations that correspond to interesting structural or functional “modes.” PMID:19336293

  13. Multilevel systems biology modeling characterized the atheroprotective efficiencies of modified dairy fats in a hamster model.

    PubMed

    Martin, Jean-Charles; Berton, Amélie; Ginies, Christian; Bott, Romain; Scheercousse, Pierre; Saddi, Alessandra; Gripois, Daniel; Landrier, Jean-François; Dalemans, Daniel; Alessi, Marie-Christine; Delplanque, Bernadette

    2015-09-01

    We assessed the atheroprotective efficiency of modified dairy fats in hyperlipidemic hamsters. A systems biology approach was implemented to reveal and quantify the dietary fat-related components of the disease. Three modified dairy fats (40% energy) were prepared from regular butter by mixing with a plant oil mixture, by removing cholesterol alone, or by removing cholesterol in combination with reducing saturated fatty acids. A plant oil mixture and a regular butter were used as control diets. The atherosclerosis severity (aortic cholesteryl-ester level) was higher in the regular butter-fed hamsters than in the other four groups (P < 0.05). Eighty-seven of the 1,666 variables measured from multiplatform analysis were found to be strongly associated with the disease. When aggregated into 10 biological clusters combined into a multivariate predictive equation, these 87 variables explained 81% of the disease variability. The biological cluster "regulation of lipid transport and metabolism" appeared central to atherogenic development relative to diets. The "vitamin E metabolism" cluster was the main driver of atheroprotection with the best performing transformed dairy fat. Under conditions that promote atherosclerosis, the impact of dairy fats on atherogenesis could be greatly ameliorated by technological modifications. Our modeling approach allowed for identifying and quantifying the contribution of complex factors to atherogenic development in each dietary setup. Copyright © 2015 the American Physiological Society.

  14. Commentary on Steinley and Brusco (2011): Recommendations and Cautions

    ERIC Educational Resources Information Center

    McLachlan, Geoffrey J.

    2011-01-01

    I discuss the recommendations and cautions in Steinley and Brusco's (2011) article on the use of finite models to cluster a data set. In their article, much use is made of comparison with the "K"-means procedure. As noted by researchers for over 30 years, the "K"-means procedure can be viewed as a special case of finite mixture modeling in which…

  15. Statistical Mechanical Theory of Coupled Slow Dynamics in Glassy Polymer-Molecule Mixtures

    NASA Astrophysics Data System (ADS)

    Zhang, Rui; Schweizer, Kenneth

    The microscopic Elastically Collective Nonlinear Langevin Equation theory of activated relaxation in one-component supercooled liquids and glasses is generalized to polymer-molecule mixtures. The key idea is to account for dynamic coupling between molecule and polymer segment motion. For describing the molecule hopping event, a temporal causality condition is formulated to self-consistently determine a dimensionless degree of matrix distortion relative to the molecule jump distance based on the concept of coupled dynamic free energies. Implementation for real materials employs an established Kuhn sphere model of the polymer liquid and a quantitative mapping to a hard particle reference system guided by the experimental equation-of-state. The theory makes predictions for the mixture dynamic shear modulus, activated relaxation time and diffusivity of both species, and mixture glass transition temperature as a function of molecule-Kuhn segment size ratio and attraction strength, composition and temperature. Model calculations illustrate the dynamical behavior in three distinct mixture regimes (fully miscible, bridging, clustering) controlled by the molecule-polymer interaction or chi-parameter. Applications to specific experimental systems will be discussed.

  16. Phylogenetic mixtures and linear invariants for equal input models.

    PubMed

    Casanellas, Marta; Steel, Mike

    2017-04-01

    The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Markov process supports linear phylogenetic invariants, then provided these are sufficiently informative, the identifiability of the tree topology can be restored. In this paper, we investigate a class of processes that support linear invariants once the stationary distribution is fixed, the 'equal input model'. This model generalizes the 'Felsenstein 1981' model (and thereby the Jukes-Cantor model) from four states to an arbitrary number of states (finite or infinite), and it can also be described by a 'random cluster' process. We describe the structure and dimension of the vector spaces of phylogenetic mixtures and of linear invariants for any fixed phylogenetic tree (and for all trees, the so-called 'model invariants'), on any number n of leaves. We also provide a precise description of the space of mixtures and linear invariants for the special case of [Formula: see text] leaves. By combining techniques from discrete random processes and (multi-) linear algebra, our results build on a classic result that was first established by James Lake (Mol Biol Evol 4:167-191, 1987).

  17. Mixture models in diagnostic meta-analyses--clustering summary receiver operating characteristic curves accounted for heterogeneity and correlation.

    PubMed

    Schlattmann, Peter; Verba, Maryna; Dewey, Marc; Walther, Mario

    2015-01-01

    Bivariate linear and generalized linear random effects are frequently used to perform a diagnostic meta-analysis. The objective of this article was to apply a finite mixture model of bivariate normal distributions that can be used for the construction of componentwise summary receiver operating characteristic (sROC) curves. Bivariate linear random effects and a bivariate finite mixture model are used. The latter model is developed as an extension of a univariate finite mixture model. Two examples, computed tomography (CT) angiography for ruling out coronary artery disease and procalcitonin as a diagnostic marker for sepsis, are used to estimate mean sensitivity and mean specificity and to construct sROC curves. The suggested approach of a bivariate finite mixture model identifies two latent classes of diagnostic accuracy for the CT angiography example. Both classes show high sensitivity but mainly two different levels of specificity. For the procalcitonin example, this approach identifies three latent classes of diagnostic accuracy. Here, sensitivities and specificities are quite different, such that sensitivity increases with decreasing specificity. Additionally, the model is used to construct componentwise sROC curves and to classify individual studies. The proposed method offers an alternative approach to model between-study heterogeneity in a diagnostic meta-analysis. Furthermore, it is possible to construct sROC curves even if a positive correlation between sensitivity and specificity is present. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Solving Coupled Gross-Pitaevskii Equations on a Cluster of PlayStation 3 Computers

    NASA Astrophysics Data System (ADS)

    Edwards, Mark; Heward, Jeffrey; Clark, C. W.

    2009-05-01

    At Georgia Southern University we have constructed an 8+1-node cluster of Sony PlayStation 3 (PS3) computers with the intention of using this computing resource to solve problems related to the behavior of ultra-cold atoms in general, with a particular emphasis on studying Bose-Bose and Bose-Fermi mixtures confined in optical lattices. As a first project that uses this computing resource, we have implemented a parallel solver of the coupled time-dependent, one-dimensional Gross-Pitaevskii (TDGP) equations. These equations govern the behavior of dual-species bosonic mixtures. We chose the split-operator/FFT method to solve the coupled 1D TDGP equations. The fast Fourier transform component of this solver can be readily parallelized on the PS3 CPU known as the Cell Broadband Engine (CellBE). Each CellBE chip contains a single 64-bit PowerPC Processor Element known as the PPE and eight "Synergistic Processor Elements" identified as the SPEs. We report on this algorithm and compare its performance to a non-parallel solver as applied to modeling evaporative cooling in dual-species bosonic mixtures.
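
    A serial, single-machine sketch of the split-operator/FFT scheme for two coupled 1D GPE equations is given below (hbar = m = 1, with an assumed harmonic trap and made-up interaction constants); it illustrates the Strang-split update only, not the parallel PS3/CellBE implementation described in the abstract.

      # Hedged sketch: Strang-split (half potential/nonlinear, full kinetic, half
      # potential/nonlinear) time stepping for two coupled 1D GPE equations.
      import numpy as np

      def coupled_gpe(n=512, L=20.0, dt=1e-3, steps=2000, g11=1.0, g22=1.0, g12=0.5):
          x = np.linspace(-L / 2, L / 2, n, endpoint=False)
          k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
          V = 0.5 * x ** 2                                # harmonic trap for both species
          psi1 = np.exp(-x ** 2).astype(complex)
          psi2 = np.exp(-(x - 1.0) ** 2).astype(complex)
          psi1 /= np.sqrt(np.trapz(abs(psi1) ** 2, x))
          psi2 /= np.sqrt(np.trapz(abs(psi2) ** 2, x))
          kinetic = np.exp(-0.5j * dt * k ** 2)
          for _ in range(steps):
              U1 = V + g11 * abs(psi1) ** 2 + g12 * abs(psi2) ** 2
              U2 = V + g22 * abs(psi2) ** 2 + g12 * abs(psi1) ** 2
              psi1 = psi1 * np.exp(-0.5j * dt * U1)       # half step in potential/nonlinearity
              psi2 = psi2 * np.exp(-0.5j * dt * U2)
              psi1 = np.fft.ifft(kinetic * np.fft.fft(psi1))   # full kinetic step in k-space
              psi2 = np.fft.ifft(kinetic * np.fft.fft(psi2))
              U1 = V + g11 * abs(psi1) ** 2 + g12 * abs(psi2) ** 2
              U2 = V + g22 * abs(psi2) ** 2 + g12 * abs(psi1) ** 2
              psi1 = psi1 * np.exp(-0.5j * dt * U1)       # second half step
              psi2 = psi2 * np.exp(-0.5j * dt * U2)
          return x, psi1, psi2

      x, psi1, psi2 = coupled_gpe()
      print(np.trapz(abs(psi1) ** 2, x), np.trapz(abs(psi2) ** 2, x))   # norms stay near 1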

  19. Scalable clustering algorithms for continuous environmental flow cytometry.

    PubMed

    Hyrkas, Jeremy; Clayton, Sophie; Ribalet, Francois; Halperin, Daniel; Armbrust, E Virginia; Howe, Bill

    2016-02-01

    Recent technological innovations in flow cytometry now allow oceanographers to collect high-frequency flow cytometry data from particles in aquatic environments on a scale far surpassing conventional flow cytometers. The SeaFlow cytometer continuously profiles microbial phytoplankton populations across thousands of kilometers of the surface ocean. The data streams produced by instruments such as SeaFlow challenge the traditional sample-by-sample approach in cytometric analysis and highlight the need for scalable clustering algorithms to extract population information from these large-scale, high-frequency flow cytometers. We explore how available algorithms commonly used for medical applications perform at classifying such large-scale, environmental flow cytometry data. We apply large-scale Gaussian mixture models to massive datasets using Hadoop. This approach outperforms current state-of-the-art cytometry classification algorithms in accuracy and can be coupled with manual or automatic partitioning of data into homogeneous sections for further classification gains. We propose the Gaussian mixture model with partitioning approach for classification of large-scale, high-frequency flow cytometry data. Source code available for download at https://github.com/jhyrkas/seaflow_cluster, implemented in Java for use with Hadoop. hyrkas@cs.washington.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  20. A clustering-based fuzzy wavelet neural network model for short-term load forecasting.

    PubMed

    Kodogiannis, Vassilis S; Amina, Mahdi; Petrounias, Ilias

    2013-10-01

    Load forecasting is a critical element of power system operation, involving prediction of the future level of demand to serve as the basis for supply and demand planning. This paper presents the development of a novel clustering-based fuzzy wavelet neural network (CB-FWNN) model and validates its prediction on the short-term electric load forecasting of the Power System of the Greek Island of Crete. The proposed model is obtained from the traditional Takagi-Sugeno-Kang fuzzy system by replacing the THEN part of fuzzy rules with a "multiplication" wavelet neural network (MWNN). Multidimensional Gaussian-type activation functions have been used in the IF part of the fuzzy rules. A Fuzzy Subtractive Clustering scheme is employed as a pre-processing technique to find the initial set and an adequate number of clusters, and ultimately the number of multiplication nodes in MWNN, while Gaussian Mixture Models with the Expectation Maximization algorithm are utilized for the definition of the multidimensional Gaussians. The results corresponding to the minimum and maximum power load indicate that the proposed load forecasting model provides significantly accurate forecasts, compared to conventional neural network models.

  1. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  2. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However, in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated datasets exemplify varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to the TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.

  3. Effective cluster model of dielectric enhancement in metal-insulator composites

    NASA Astrophysics Data System (ADS)

    Doyle, W. T.; Jacobs, I. S.

    1990-11-01

    The electrical permittivity of a suspension of conducting spheres at high volume loading exhibits a large enhancement above the value predicted by the Clausius-Mossotti approximation. The permittivity enhancement is a dielectric anomaly accompanying a metallization transition that occurs when conducting particles are close packed. In disordered suspensions, close encounters can cause a permittivity enhancement at any volume loading. We attribute the permittivity enhancements typically observed in monodisperse disordered suspensions of conducting spheres to local metallized regions of high density produced by density fluctuations. We model a disordered suspension as a mixture, or mesosuspension, of isolated spheres and random close-packed spherical clusters of arbitrary size. Multipole interactions within the clusters are treated exactly. External interactions between clusters and isolated spheres are treated in the dipole approximation. Model permittivities are compared with Guillien's experimental permittivity measurements [Ann. Phys. (Paris) Ser. 11, 16, 205 (1941)] on liquid suspensions of Hg droplets in oil and with Turner's conductivity measurements [Chem. Eng. Sci. 31, 487 (1976)] on fluidized bed suspensions of ion-exchange resin beads in aqueous solution. New permittivity measurements at 10 GHz on solid suspensions of monodisperse metal spheres in polyurethane are presented and compared with the model permittivities. The effective spherical cluster model is in excellent agreement with the experiments over the entire accessible range of volume loading.

  4. The cluster model of a hot dense vapor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhukhovitskii, D. I., E-mail: dmr@ihed.ras.ru

    2015-04-28

    We explore thermodynamic properties of a vapor in the range of state parameters where the contribution to thermodynamic functions from bound states of atoms (clusters) dominates over the interaction between the components of the vapor in free states. The clusters are assumed to be light and sufficiently “hot” for the number of bonds to be minimized. We use the technique of calculation of the cluster partition function for the cluster with a minimum number of interatomic bonds to calculate the caloric properties (heat capacity and velocity of sound) for an ideal mixture of the lightest clusters. The problem proves to be exactly solvable and resulting formulas are functions solely of the equilibrium constant of the dimer formation. These formulas ensure a satisfactory correlation with the reference data for the vapors of cesium, mercury, and argon up to moderate densities in both the sub- and supercritical regions. For cesium, we extend the model to the densities close to the critical one by inclusion of the clusters of arbitrary size. Knowledge of the cluster composition of the cesium vapor makes it possible to treat nonequilibrium phenomena such as nucleation of the supersaturated vapor, for which the effect of the cluster structural transition is likely to be significant.

  5. Coupling microscopic and mesoscopic scales to simulate chemical equilibrium between a nanometric carbon cluster and detonation products fluid.

    PubMed

    Bourasseau, Emeric; Maillet, Jean-Bernard

    2011-04-21

    This paper presents a new method to obtain chemical equilibrium properties of detonation products mixtures including a solid carbon phase. In this work, the solid phase is modelled through a mesoparticle immersed in the fluid, such that the heterogeneous character of the mixture is explicitly taken into account. Inner properties of the clusters are taken from an equation of state obtained in a previous work, and the interaction potential between the nanocluster and the fluid particles is derived from all-atom simulations using the LCBOPII potential (Long range Carbon Bond Order Potential II). It appears that differences in chemical equilibrium results obtained with this method and the "composite ensemble method" (A. Hervouet et al., J. Phys. Chem. B, 2008, 112), where fluid and solid phases are considered as non-interacting, are not significant, underlining the fact that considering the inhomogeneity of such a system is crucial.

  6. Generalized Wishart Mixtures for Unsupervised Classification of PolSAR Data

    NASA Astrophysics Data System (ADS)

    Li, Lan; Chen, Erxue; Li, Zengyuan

    2013-01-01

    This paper presents an unsupervised clustering algorithm based on the expectation maximization (EM) algorithm for finite mixture modelling, using the complex Wishart probability density function (PDF) for the class likelihoods. The mixture model makes it possible to represent heterogeneous thematic classes that are poorly fitted by a single unimodal Wishart distribution. To keep the computation fast and robust, we use the recently proposed generalized gamma distribution (GΓD) on the single-polarization intensity data to form the initial partition. We then use the Wishart probability density function of the corresponding sample covariance matrix to calculate the posterior class probabilities for each pixel. These posterior probabilities serve as the prior probability estimates of each class and as weights in all class parameter updates. The proposed method is evaluated and compared with the Wishart H-Alpha-A classification. Preliminary results show that the proposed method performs better.

  7. Vapor condensation onto a non-volatile liquid drop

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Inci, Levent; Bowles, Richard K., E-mail: richard.bowles@usask.ca

    2013-12-07

    Molecular dynamics simulations of miscible and partially miscible binary Lennard–Jones mixtures are used to study the dynamics and thermodynamics of vapor condensation onto a non-volatile liquid drop in the canonical ensemble. When the system volume is large, the driving force for condensation is low and only a submonolayer of the solvent is adsorbed onto the liquid drop. A small degree of mixing of the solvent phase into the core of the particles occurs for the miscible system. At smaller volumes, complete film formation is observed and the dynamics of film growth are dominated by cluster-cluster coalescence. Mixing into the core of the droplet is also observed for partially miscible systems below an onset volume suggesting the presence of a solubility transition. We also develop a non-volatile liquid drop model, based on the capillarity approximations, that exhibits a solubility transition between small and large drops for partially miscible mixtures and has a hysteresis loop similar to the one observed in the deliquescence of small soluble salt particles. The properties of the model are compared to our simulation results and the model is used to study the formulation of classical nucleation theory for systems with low free energy barriers.

  8. Micro-heterogeneity versus clustering in binary mixtures of ethanol with water or alkanes.

    PubMed

    Požar, Martina; Lovrinčević, Bernarda; Zoranić, Larisa; Primorać, Tomislav; Sokolić, Franjo; Perera, Aurélien

    2016-08-24

    Ethanol is a hydrogen bonding liquid. When mixed in small concentrations with water or alkanes, it forms aggregate structures reminiscent of, respectively, the direct and inverse micellar aggregates found in emulsions, albeit at much smaller sizes. At higher concentrations, micro-heterogeneous mixing with segregated domains is found. We examine how different statistical methods, namely correlation function analysis, structure factor analysis and cluster distribution analysis, can efficiently describe these morphological changes in the mixtures. In particular, we explain how the neat alcohol pre-peak of the structure factor evolves into the domain pre-peak under mixing conditions, and how this evolution differs depending on whether the co-solvent is water or an alkane. This study clearly establishes the heuristic superiority of correlation function/structure factor analysis for studying micro-heterogeneity, since cluster distribution analysis is insensitive to domain segregation. Correlation functions detect the domains, with a clear structure factor pre-peak signature, while the cluster techniques detect the cluster hierarchy within domains. The main conclusion is that, in micro-segregated mixtures, the domain structure is a more fundamental statistical entity than the underlying cluster structures. These findings could help in comparing radiation scattering experiments, which are sensitive to domains, with spectroscopy/NMR experiments, which are sensitive to clusters.

  9. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm.

    PubMed

    Raykov, Yordan P; Boukouvalas, Alexis; Baig, Fahd; Little, Max A

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.

  10. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm

    PubMed Central

    Baig, Fahd; Little, Max A.

    2016-01-01

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism. PMID:27669525
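
    MAP-DP itself is not packaged in standard libraries, but the same Dirichlet-process idea of letting the data determine the number of clusters can be tried off the shelf with scikit-learn's variational BayesianGaussianMixture, as in the sketch below on synthetic data; this is a related method, not the authors' algorithm.

      # Related sketch (not MAP-DP): a variational Dirichlet-process Gaussian mixture that
      # infers the effective number of clusters from the data.
      import numpy as np
      from sklearn.mixture import BayesianGaussianMixture

      rng = np.random.default_rng(0)
      X = np.vstack([rng.normal(loc=c, scale=0.5, size=(200, 2))
                     for c in ((0, 0), (4, 0), (0, 4))])

      dpgmm = BayesianGaussianMixture(
          n_components=20,                                  # upper bound, not the final K
          weight_concentration_prior_type="dirichlet_process",
          covariance_type="full", max_iter=500, random_state=0).fit(X)

      labels = dpgmm.predict(X)
      print("effective number of clusters:", np.unique(labels).size)   # typically 3 here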

  11. Development of advanced acreage estimation methods

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr. (Principal Investigator)

    1982-01-01

    The development of an accurate and efficient algorithm for analyzing the structure of MSS data, the application of the Akaike information criterion to mixture models, and a research plan to delineate some of the technical issues and associated tasks in the area of rice scene radiation characterization are discussed. The AMOEBA clustering algorithm is refined and documented.
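
    The use of an information criterion to choose the number of mixture components can be illustrated with a short scikit-learn sketch on synthetic data; this shows only the AIC-based selection step, not the MSS or AMOEBA processing itself.

      # Minimal sketch of AIC-based selection of the number of mixture components.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(42)
      X = np.vstack([rng.normal(m, 0.7, size=(300, 4)) for m in (0.0, 3.0, 6.0)])

      aic = {}
      for k in range(1, 8):
          gmm = GaussianMixture(n_components=k, n_init=3, random_state=0).fit(X)
          aic[k] = gmm.aic(X)                       # Akaike information criterion on the fit

      best_k = min(aic, key=aic.get)
      print("AIC-selected number of components:", best_k)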

  12. A Computational Algorithm for Functional Clustering of Proteome Dynamics During Development

    PubMed Central

    Wang, Yaqun; Wang, Ningtao; Hao, Han; Guo, Yunqian; Zhen, Yan; Shi, Jisen; Wu, Rongling

    2014-01-01

    Phenotypic traits, such as seed development, are a consequence of complex biochemical interactions among genes, proteins and metabolites, but the underlying mechanisms that operate in a coordinated and sequential manner remain elusive. Here, we address this issue by developing a computational algorithm to monitor proteome changes during the course of trait development. The algorithm is built within the mixture-model framework in which each mixture component is modeled by a specific group of proteins that display a similar temporal pattern of expression in trait development. A nonparametric approach based on Legendre orthogonal polynomials was used to fit dynamic changes of protein expression, increasing the power and flexibility of protein clustering. By analyzing a dataset of proteomic dynamics during early embryogenesis of the Chinese fir, the algorithm has successfully identified several distinct types of proteins that coordinate with each other to determine seed development in this forest tree, which is commercially and environmentally important to China. The algorithm should find immediate application in characterizing the mechanistic underpinnings of other biological processes in which protein abundance plays a key role. PMID:24955031
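
    A simplified stand-in for this idea is to fit a low-order Legendre basis to each protein's temporal profile and then cluster the coefficient vectors with a Gaussian mixture, as sketched below on invented data; the published algorithm estimates the curves and the mixture jointly inside EM rather than in two separate steps.

      # Simplified stand-in: per-protein Legendre fits, then mixture clustering of coefficients.
      # Data are invented; the real method fits curves and mixture jointly within EM.
      import numpy as np
      from numpy.polynomial import legendre
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(3)
      timepoints = np.linspace(-1.0, 1.0, 8)               # 8 stages rescaled to [-1, 1]
      profiles = rng.normal(size=(500, timepoints.size))   # hypothetical protein x time matrix

      order = 3                                            # cubic Legendre basis
      coeffs = np.array([legendre.legfit(timepoints, y, order) for y in profiles])

      gmm = GaussianMixture(n_components=4, random_state=0).fit(coeffs)
      cluster = gmm.predict(coeffs)                        # functional cluster per protein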

  13. Fuzzy jets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mackey, Lester; Nachman, Benjamin; Schwartzman, Ariel

    Collimated streams of particles produced in high energy physics experiments are organized using clustering algorithms to form jets. To construct jets, the experimental collaborations based at the Large Hadron Collider (LHC) primarily use agglomerative hierarchical clustering schemes known as sequential recombination. We propose a new class of algorithms for clustering jets that use infrared and collinear safe mixture models. These new algorithms, known as fuzzy jets, are clustered using maximum likelihood techniques and can dynamically determine various properties of jets like their size. We show that the fuzzy jet size adds additional information to conventional jet tagging variables in boosted topologies. Furthermore, we study the impact of pileup and show that with some slight modifications to the algorithm, fuzzy jets can be stable up to high pileup interaction multiplicities.

  14. Fuzzy jets

    DOE PAGES

    Mackey, Lester; Nachman, Benjamin; Schwartzman, Ariel; ...

    2016-06-01

    Collimated streams of particles produced in high energy physics experiments are organized using clustering algorithms to form jets. To construct jets, the experimental collaborations based at the Large Hadron Collider (LHC) primarily use agglomerative hierarchical clustering schemes known as sequential recombination. We propose a new class of algorithms for clustering jets that use infrared and collinear safe mixture models. These new algorithms, known as fuzzy jets, are clustered using maximum likelihood techniques and can dynamically determine various properties of jets like their size. We show that the fuzzy jet size adds additional information to conventional jet tagging variables in boosted topologies. Furthermore, we study the impact of pileup and show that with some slight modifications to the algorithm, fuzzy jets can be stable up to high pileup interaction multiplicities.
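
    The underlying idea, clustering particles with a mixture model and reading a jet size off the learned covariances, can be illustrated with the toy scikit-learn sketch below; it ignores transverse-momentum weighting and the infrared- and collinear-safe modifications that define the actual fuzzy-jets algorithm.

      # Toy illustration only: mixture-model clustering of particle positions in (rapidity, phi),
      # with a per-cluster "size" read from the learned covariance. Not the fuzzy-jets algorithm.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(7)
      particles = np.vstack([
          rng.normal((0.0, 0.0), 0.1, size=(60, 2)),              # first collimated spray
          rng.normal((1.5, 2.0), 0.1, size=(60, 2)),              # second collimated spray
          rng.uniform((-2.5, -np.pi), (2.5, np.pi), size=(80, 2)) # soft diffuse activity
      ])

      gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0).fit(particles)
      sizes = [np.sqrt(np.linalg.eigvalsh(cov)).mean() for cov in gmm.covariances_]
      print("learned per-cluster sizes:", np.round(sizes, 3))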

  15. Rigid-Cluster Models of Conformational Transitions in Macromolecular Machines and Assemblies

    PubMed Central

    Kim, Moon K.; Jernigan, Robert L.; Chirikjian, Gregory S.

    2005-01-01

    We present a rigid-body-based technique (called rigid-cluster elastic network interpolation) to generate feasible transition pathways between two distinct conformations of a macromolecular assembly. Many biological molecules and assemblies consist of domains which act more or less as rigid bodies during large conformational changes. These collective motions are thought to be strongly related with the functions of a system. This fact encourages us to simply model a macromolecule or assembly as a set of rigid bodies which are interconnected with distance constraints. In previous articles, we developed coarse-grained elastic network interpolation (ENI) in which, for example, only Cα atoms are selected as representatives in each residue of a protein. We interpolate distance differences of two conformations in ENI by using a simple quadratic cost function, and the feasible conformations are generated without steric conflicts. Rigid-cluster interpolation is an extension of the ENI method with rigid-clusters replacing point masses. Now the intermediate conformations in an anharmonic pathway can be determined by the translational and rotational displacements of large clusters in such a way that distance constraints are observed. We present the derivation of the rigid-cluster model and apply it to a variety of macromolecular assemblies. Rigid-cluster ENI is then modified for a hybrid model represented by a mixture of rigid clusters and point masses. Simulation results show that both rigid-cluster and hybrid ENI methods generate sterically feasible pathways of large systems in a very short time. For example, the HK97 virus capsid is an icosahedral symmetric assembly composed of 60 identical asymmetric units. Its original Hessian matrix size for a Cα coarse-grained model is >(300,000)^2. However, it reduces to (84)^2 when we apply the rigid-cluster model with icosahedral symmetry constraints. The computational cost of the interpolation no longer scales heavily with the size of structures; instead, it depends strongly on the minimal number of rigid clusters into which the system can be decomposed. PMID:15833998

  16. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    PubMed

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences for single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. K-means seems to be a good alternative, as it is easier to use and gives similar results when applied to real data.
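
    The kind of comparison described can be reproduced in outline with scikit-learn on simulated data where the true memberships are known, as in the sketch below; the adjusted Rand index is used here as the agreement measure, and the paper's own simulation design is not reproduced.

      # Minimal sketch: compare GMM, k-means and Ward clustering on simulated elliptical
      # clusters with known labels, scored by the adjusted Rand index.
      import numpy as np
      from sklearn.datasets import make_blobs
      from sklearn.mixture import GaussianMixture
      from sklearn.cluster import KMeans, AgglomerativeClustering
      from sklearn.metrics import adjusted_rand_score

      X, truth = make_blobs(n_samples=900, centers=3,
                            cluster_std=[1.0, 2.5, 0.5], random_state=5)
      X = X @ np.array([[1.0, 0.6], [0.0, 1.0]])        # shear to make the clusters elliptical

      results = {
          "GMM":     GaussianMixture(n_components=3, random_state=0).fit_predict(X),
          "k-means": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
          "Ward":    AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X),
      }
      for name, labels in results.items():
          print(f"{name:8s} ARI = {adjusted_rand_score(truth, labels):.2f}")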

  17. Bayesian Modeling of Temporal Coherence in Videos for Entity Discovery and Summarization.

    PubMed

    Mitra, Adway; Biswas, Soma; Bhattacharyya, Chiranjib

    2017-03-01

    A video is understood by users in terms of entities present in it. Entity Discovery is the task of building an appearance model for each entity (e.g., a person), and finding all its occurrences in the video. We represent a video as a sequence of tracklets, each spanning 10-20 frames, and associated with one entity. We pose Entity Discovery as tracklet clustering, and approach it by leveraging Temporal Coherence (TC): the property that temporally neighboring tracklets are likely to be associated with the same entity. Our major contributions are the first Bayesian nonparametric models for TC at the tracklet level. We extend the Chinese Restaurant Process (CRP) to TC-CRP, and further to the Temporally Coherent Chinese Restaurant Franchise (TC-CRF) to jointly model entities and temporal segments using mixture components and sparse distributions. For discovering persons in TV serial videos without meta-data like scripts, these methods show considerable improvement over state-of-the-art approaches to tracklet clustering in terms of clustering accuracy, cluster purity and entity coverage. The proposed methods can perform online tracklet clustering on streaming videos, unlike existing approaches, and can automatically reject false tracklets. Finally, we discuss entity-driven video summarization, where temporal segments of the video are selected based on the discovered entities to create a semantically meaningful summary.

  18. Microheterogeneity in CH3OH/CD3OH mixture

    NASA Astrophysics Data System (ADS)

    Wrzeszcz, Władysław; Mazurek, Sylwester; Szostak, Roman; Tomza, Paweł; Czarnecki, Mirosław A.

    2018-01-01

    Recently, we demonstrated the presence of microheterogeneity in binary mixtures of unlike alcohols [RSC Adv. 2016, 6, 37195-37202]. The aim of this work was to examine whether this phenomenon also occurs in a mixture of very similar alcohols such as CH3OH and CD3OH. Theoretical calculations suggest that isotopic substitution in the methyl group influences the properties of the OH group. Hence, one can expect that this effect may lead to partial separation of CH3OH and CD3OH at the molecular level and contributes to deviation from the ideal mixture. This work shows that the CH3OH/CD3OH mixture also deviates from ideality, but the extent of this deviation is much smaller than in mixtures of other alcohols. It is of particular note that this deviation results mainly from the difference between the CH3 and CD3 groups, while the contribution from the OH groups is small. The structure of the CH3OH/CD3OH mixture at the molecular level is similar to that of binary mixtures of other alcohols. The mixture is composed of homoclusters of both alcohols and mixed clusters. The homoclusters existing in the mixture are similar to those present in the bulk alcohols. The highest population of heteroclusters and the largest deviation from the ideal mixture were observed for the equimolar mixture. Both the experimental and theoretical results reveal that cyclic tetramers and larger clusters dominate in the CH3OH/CD3OH mixture, while the population of linear clusters is negligible. Though the extent and strength of hydrogen bonding in both alcohols are the same, the position and intensity of the 2ν(OH) band for CH3OH and CD3OH are different. We propose a possible explanation for this observation.

  19. Microheterogeneity in CH3OH/CD3OH mixture.

    PubMed

    Wrzeszcz, Władysław; Mazurek, Sylwester; Szostak, Roman; Tomza, Paweł; Czarnecki, Mirosław A

    2018-01-05

    Recently, we demonstrated the presence of microheterogeneity in binary mixtures of unlike alcohols [RSC Adv. 2016, 6, 37195-37202]. The aim of this work was to examine whether this phenomenon also occurs in a mixture of very similar alcohols such as CH3OH and CD3OH. Theoretical calculations suggest that isotopic substitution in the methyl group influences the properties of the OH group. Hence, one can expect that this effect may lead to partial separation of CH3OH and CD3OH at the molecular level and contributes to deviation from the ideal mixture. This work shows that the CH3OH/CD3OH mixture also deviates from ideality, but the extent of this deviation is much smaller than in mixtures of other alcohols. It is of particular note that this deviation results mainly from the difference between the CH3 and CD3 groups, while the contribution from the OH groups is small. The structure of the CH3OH/CD3OH mixture at the molecular level is similar to that of binary mixtures of other alcohols. The mixture is composed of homoclusters of both alcohols and mixed clusters. The homoclusters existing in the mixture are similar to those present in the bulk alcohols. The highest population of heteroclusters and the largest deviation from the ideal mixture were observed for the equimolar mixture. Both the experimental and theoretical results reveal that cyclic tetramers and larger clusters dominate in the CH3OH/CD3OH mixture, while the population of linear clusters is negligible. Though the extent and strength of hydrogen bonding in both alcohols are the same, the position and intensity of the 2ν(OH) band for CH3OH and CD3OH are different. We propose a possible explanation for this observation.

  20. Prediction of the size distributions of methanol-ethanol clusters detected in VUV laser/time-of-flight mass spectrometry.

    PubMed

    Liu, Yi; Consta, Styliani; Shi, Yujun; Lipson, R H; Goddard, William A

    2009-06-25

    The size distributions and geometries of vapor clusters equilibrated with methanol-ethanol (Me-Et) liquid mixtures were recently studied by vacuum ultraviolet (VUV) laser time-of-flight (TOF) mass spectrometry and density functional theory (DFT) calculations (Liu, Y.; Consta, S.; Ogeer, F.; Shi, Y. J.; Lipson, R. H. Can. J. Chem. 2007, 85, 843-852). On the basis of the mass spectra recorded, it was concluded that the formation of neutral tetramers is particularly prominent. Here we develop grand canonical Monte Carlo (GCMC) and molecular dynamics (MD) frameworks to compute cluster size distributions in vapor mixtures that allow a direct comparison with experimental mass spectra. Using the all-atom optimized potential for liquid simulations (OPLS-AA) force field, we systematically examined the neutral cluster size distributions as functions of pressure and temperature. These neutral cluster distributions were then used to derive ionized cluster distributions to compare directly with the experiments. The simulations suggest that supersaturation at 12 to 16 times the equilibrium vapor pressure at 298 K or supercooling at temperature 240 to 260 K at the equilibrium vapor pressure can lead to the relatively abundant tetramer population observed in the experiments. Our simulations capture the most distinct features observed in the experimental TOF mass spectra: Et(3)H(+) at m/z = 139 in the vapor corresponding to 10:90% Me-Et liquid mixture and Me(3)H(+) at m/z = 97 in the vapors corresponding to 50:50% and 90:10% Me-Et liquid mixtures. The hybrid GCMC scheme developed in this work extends the capability of studying the size distributions of neat clusters to mixed species and provides a useful tool for studying environmentally important systems such as atmospheric aerosols.

  1. Structure and Binding of Ionic Clusters in Th and Zr Chloride Melts

    NASA Astrophysics Data System (ADS)

    Akdeniz, Z.; Tosi, M. P.

    2001-11-01

    We discuss microscopic ionic models for the structure and the binding of small clusters which may exist as structural units in molten ThCl4 and ZrCl4 and in their mixtures with alkali halides according to Raman scattering studies of Photiadis and Papatheodorou. The models are adjusted to the two isolated tetrahedral molecules. Appreciably higher ionicity is found for ThCl4 than for ZrCl4, and this fact underlies the strikingly different behaviour of the two systems in the dense liquid state - in particular, a molecular-type structure for molten ZrCl4 against a structure including charged oligomers in molten ThCl4.

  2. Shear-induced clustering of Brownian colloids in associative polymer networks at moderate Péclet number

    NASA Astrophysics Data System (ADS)

    Kim, Juntae; Helgeson, Matthew E.

    2016-08-01

    We investigate shear-induced clustering and its impact on fluid rheology in polymer-colloid mixtures at moderate colloid volume fraction. By employing a thermoresponsive system that forms associative polymer-colloid networks, we present experiments of rheology and flow-induced microstructure on colloid-polymer mixtures in which the relative magnitudes of the time scales associated with relaxation of viscoelasticity and suspension microstructure are widely and controllably varied. In doing so, we explore several limits of relative magnitude of the relevant dimensionless shear rates, the Weissenberg number Wi and the Péclet number Pe. In all of these limits, we find that the fluid exhibits two distinct regimes of shear thinning at relatively low and high shear rates, in which the rheology collapses by scaling with Wi and Pe, respectively. Using three-dimensionally-resolved flow small-angle neutron scattering measurements, we observe clustering of the suspension above a critical shear rate corresponding to Pe ˜0.1 over a wide range of fluid conditions, having anisotropy with projected orientation along both the vorticity and compressional axes of shear. The degree of anisotropy is shown to scale with Pe. From this we formulate an empirical model for the shear stress and viscosity, in which the viscoelastic network stress is augmented by an asymptotic shear thickening contribution due to hydrodynamic clustering. Overall, our results elucidate the significant role of hydrodynamic interactions in contributing to shear-induced clustering of Brownian suspensions in viscoelastic liquids.

  3. Somatotyping using 3D anthropometry: a cluster analysis.

    PubMed

    Olds, Tim; Daniell, Nathan; Petkov, John; David Stewart, Arthur

    2013-01-01

    Somatotyping is the quantification of human body shape, independent of body size. Hitherto, somatotyping (including the most popular method, the Heath-Carter system) has been based on subjective visual ratings, sometimes supported by surface anthropometry. This study used data derived from three-dimensional (3D) whole-body scans as inputs for cluster analysis to objectively derive clusters of similar body shapes. Twenty-nine dimensions normalised for body size were measured on a purposive sample of 301 adults aged 17-56 years who had been scanned using a Vitus Smart laser scanner. K-means Cluster Analysis with v-fold cross-validation was used to determine shape clusters. Three male and three female clusters emerged, and were visualised using those scans closest to the cluster centroid and a caricature defined by doubling the difference between the average scan and the cluster centroid. The male clusters were decidedly endomorphic (high fatness), ectomorphic (high linearity), and endo-mesomorphic (a mixture of fatness and muscularity). The female clusters were clearly endomorphic, ectomorphic, and ecto-mesomorphic (a mixture of linearity and muscularity). An objective shape quantification procedure combining 3D scanning and cluster analysis yielded shape clusters strikingly similar to traditional somatotyping.

  4. Clustering of gamma-ray burst types in the Fermi GBM catalogue: indications of photosphere and synchrotron emissions during the prompt phase

    NASA Astrophysics Data System (ADS)

    Acuner, Zeynep; Ryde, Felix

    2018-04-01

    Many different physical processes have been suggested to explain the prompt gamma-ray emission in gamma-ray bursts (GRBs). Although there are examples of bursts with both photospheric and synchrotron emission origins, these distinct spectral appearances have not been generalized to large samples of GRBs. Here, we search for signatures of the different emission mechanisms in the full Fermi Gamma-ray Space Telescope/GBM (Gamma-ray Burst Monitor) catalogue. We use Gaussian Mixture Models to cluster bursts according to their parameters from the Band function (α, β, and Epk) as well as their fluence and T90. We find five distinct clusters. We further argue that these clusters can be divided into bursts of photospheric origin (2/3 of all bursts, divided into three clusters) and bursts of synchrotron origin (1/3 of all bursts, divided into two clusters). For instance, the cluster that contains predominantly short bursts is consistent with a photospheric emission origin. We discuss several factors that can determine which cluster a burst belongs to: the jet dissipation pattern and/or the jet content, or the viewing angle.
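
    A hedged sketch of this kind of clustering is shown below for a hypothetical GBM catalogue export; the file name, column names and preprocessing (logarithms of the strictly positive quantities, standardisation) are assumptions, and only the choice of five components follows the text.

      # Sketch of GMM clustering of Band-function parameters, fluence and T90.
      # The CSV name, column names and preprocessing are assumptions.
      import numpy as np
      import pandas as pd
      from sklearn.mixture import GaussianMixture
      from sklearn.preprocessing import StandardScaler

      cat = pd.read_csv("gbm_band_fits.csv")            # hypothetical catalogue export
      features = np.column_stack([
          cat["alpha"], cat["beta"],
          np.log10(cat["epeak"]), np.log10(cat["fluence"]), np.log10(cat["t90"]),
      ])

      Z = StandardScaler().fit_transform(features)
      gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0).fit(Z)
      cat["cluster"] = gmm.predict(Z)
      print(cat["cluster"].value_counts())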

  5. Verification of Bayesian Clustering in Travel Behaviour Research – First Step to Macroanalysis of Travel Behaviour

    NASA Astrophysics Data System (ADS)

    Satra, P.; Carsky, J.

    2018-04-01

    Our research examines travel behaviour from a macroscopic view, taking a municipality as the basic unit. The travel behaviour of each municipality as a whole becomes a single data point in the study of travel behaviour over a larger area, perhaps a country. Data pre-processing is used to cluster the municipalities into groups that show similarities in their travel behaviour. Such groups can then be examined for the reasons behind their prevailing pattern of travel behaviour, without distortion caused by municipalities with a different pattern. This paper deals with the actual settings of the clustering process, which is based on Bayesian statistics, particularly the mixture model. Optimising the setting parameters based on the correlation of pointer model parameters and the relative number of data points in each cluster is helpful, but it is not a fully reliable method. Thus, a method for the graphical representation of clusters needs to be developed in order to check their quality. Training the setting parameters in 2D has proven beneficial because it allows visual inspection of the produced clusters. The clustering is best applied to separate groups of municipalities in which only identical transport modes compete.

  6. Unsupervised Gaussian Mixture-Model With Expectation Maximization for Detecting Glaucomatous Progression in Standard Automated Perimetry Visual Fields.

    PubMed

    Yousefi, Siamak; Balasubramanian, Madhusudhanan; Goldbaum, Michael H; Medeiros, Felipe A; Zangwill, Linda M; Weinreb, Robert N; Liebmann, Jeffrey M; Girkin, Christopher A; Bowd, Christopher

    2016-05-01

    To validate Gaussian mixture-model with expectation maximization (GEM) and variational Bayesian independent component analysis mixture-models (VIM) for detecting glaucomatous progression along visual field (VF) defect patterns (GEM-progression of patterns (POP) and VIM-POP). To compare GEM-POP and VIM-POP with other methods. GEM and VIM models separated cross-sectional abnormal VFs from 859 eyes and normal VFs from 1117 eyes into abnormal and normal clusters. Clusters were decomposed into independent axes. The confidence limit (CL) of stability was established for each axis with a set of 84 stable eyes. Sensitivity for detecting progression was assessed in a sample of 83 eyes with known progressive glaucomatous optic neuropathy (PGON). Eyes were classified as progressed if any defect pattern progressed beyond the CL of stability. Performance of GEM-POP and VIM-POP was compared to point-wise linear regression (PLR), permutation analysis of PLR (PoPLR), and linear regression (LR) of mean deviation (MD) and visual field index (VFI). Sensitivity and specificity for detecting glaucomatous VFs were 89.9% and 93.8%, respectively, for GEM and 93.0% and 97.0%, respectively, for VIM. Receiver operating characteristic (ROC) curve areas for classifying progressed eyes were 0.82 for VIM-POP, 0.86 for GEM-POP, 0.81 for PoPLR, 0.69 for LR of MD, and 0.76 for LR of VFI. GEM-POP was significantly more sensitive to PGON than PoPLR and linear regression of MD and VFI in our sample, while providing localized progression information. Detection of glaucomatous progression can be improved by assessing longitudinal changes in localized patterns of glaucomatous defect identified by unsupervised machine learning.
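
    The overall recipe, clustering cross-sectional fields with a mixture model, extracting axes within each cluster, setting per-axis limits of stability from stable eyes, and flagging eyes whose longitudinal change exceeds those limits, is sketched below on placeholder arrays; plain PCA stands in for the axis decomposition, change in either direction is flagged, and none of the published study's data or exact criteria are reproduced.

      # Hedged sketch of the progression-of-patterns recipe on placeholder data.
      # PCA stands in for the axis decomposition; the real criteria differ in detail.
      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      fields = rng.normal(size=(1976, 52))     # toy stand-in for cross-sectional VF sensitivities
      stable = rng.normal(size=(84, 2, 52))    # stable eyes: (eye, visit, 52 test points)
      series = rng.normal(size=(83, 2, 52))    # eyes under test: baseline and follow-up

      gmm = GaussianMixture(n_components=2, random_state=0).fit(fields)
      labels = gmm.predict(fields)
      axes = {k: PCA(n_components=2).fit(fields[labels == k]) for k in range(2)}

      def axis_change(pair, pca):
          """Change between two visits projected onto each pattern axis."""
          scores = pca.transform(pair)         # (2 visits, n_axes)
          return scores[1] - scores[0]

      # 95th percentile of |change| across stable eyes defines a per-axis limit of stability
      limits = {k: np.percentile(np.abs([axis_change(e, axes[k]) for e in stable]), 95, axis=0)
                for k in axes}

      progressed = [any((np.abs(axis_change(e, axes[k])) > limits[k]).any() for k in axes)
                    for e in series]
      print("eyes flagged as progressed:", sum(progressed))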

  7. Amide-induced phase separation of hexafluoroisopropanol-water mixtures depending on the hydrophobicity of amides.

    PubMed

    Takamuku, Toshiyuki; Wada, Hiroshi; Kawatoko, Chiemi; Shimomura, Takuya; Kanzaki, Ryo; Takeuchi, Munetaka

    2012-06-21

    Amide-induced phase separation of hexafluoro-2-propanol (HFIP)-water mixtures has been investigated to elucidate solvation properties of the mixtures by means of small-angle neutron scattering (SANS), (1)H and (13)C NMR, and molecular dynamics (MD) simulation. The amides included N-methylformamide (NMF), N-methylacetamide (NMA), and N-methylpropionamide (NMP). The phase diagrams of amide-HFIP-water ternary systems at 298 K showed that phase separation occurs in a closed-loop area of compositions as well as an N,N-dimethylformamide (DMF) system previously reported. The phase separation area becomes wider as the hydrophobicity of amides increases in the order of NMF < NMA < DMF < NMP. Thus, the evolution of HFIP clusters around amides due to the hydrophobic interaction gives rise to phase separation of the mixtures. In contrast, the disruption of HFIP clusters causes the recovery of the homogeneity of the ternary systems. The present results showed that HFIP clusters are evolved with increasing amide content to the lower phase separation concentration in the same mechanism among the four amide systems. However, the disruption of HFIP clusters in the NMP and DMF systems with further increasing amide content to the upper phase separation concentration occurs in a different way from those in the NMF and NMA systems.

  8. Automated deconvolution of structured mixtures from heterogeneous tumor genomic data

    PubMed Central

    Roman, Theodore; Xie, Lu

    2017-01-01

    With increasing appreciation for the extent and importance of intratumor heterogeneity, much attention in cancer research has focused on profiling heterogeneity on a single patient level. Although true single-cell genomic technologies are rapidly improving, they remain too noisy and costly at present for population-level studies. Bulk sequencing remains the standard for population-scale tumor genomics, creating a need for computational tools to separate contributions of multiple tumor clones and assorted stromal and infiltrating cell populations to pooled genomic data. All such methods are limited to coarse approximations of only a few cell subpopulations, however. In prior work, we demonstrated the feasibility of improving cell type deconvolution by taking advantage of substructure in genomic mixtures via a strategy called simplicial complex unmixing. We improve on past work by introducing enhancements to automate learning of substructured genomic mixtures, with specific emphasis on genome-wide copy number variation (CNV) data, as well as the ability to process quantitative RNA expression data, and heterogeneous combinations of RNA and CNV data. We introduce methods for dimensionality estimation to better decompose mixture model substructure; fuzzy clustering to better identify substructure in sparse, noisy data; and automated model inference methods for other key model parameters. We further demonstrate their effectiveness in identifying mixture substructure in true breast cancer CNV data from the Cancer Genome Atlas (TCGA). Source code is available at https://github.com/tedroman/WSCUnmix PMID:29059177
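
    One of the ingredients mentioned, fuzzy clustering for identifying substructure in sparse, noisy mixture data, can be written compactly from scratch; the routine below is a generic fuzzy c-means sketch, not the authors' simplicial-complex unmixing code.

      # Generic fuzzy c-means sketch (soft memberships), not the authors' unmixing code.
      import numpy as np

      def fuzzy_cmeans(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
          """Return (centers, U) where U[i, k] is the membership of sample i in cluster k."""
          rng = np.random.default_rng(seed)
          U = rng.dirichlet(np.ones(n_clusters), size=len(X))     # rows sum to 1
          for _ in range(n_iter):
              W = U ** m
              centers = (W.T @ X) / W.sum(axis=0)[:, None]        # fuzzily weighted means
              d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
              U_new = 1.0 / (d ** (2.0 / (m - 1.0)))
              U_new /= U_new.sum(axis=1, keepdims=True)           # standard FCM update
              shift = np.abs(U_new - U).max()
              U = U_new
              if shift < tol:
                  break
          return centers, U

      centers, U = fuzzy_cmeans(np.random.default_rng(1).normal(size=(300, 5)), n_clusters=3)
      print(U.sum(axis=1)[:3])                                    # memberships sum to 1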

  9. Frequency-sensitive competitive learning for scalable balanced clustering on high-dimensional hyperspheres.

    PubMed

    Banerjee, Arindam; Ghosh, Joydeep

    2004-05-01

    Competitive learning mechanisms for clustering, in general, suffer from poor performance for very high-dimensional (>1000) data because of "curse of dimensionality" effects. In applications such as document clustering, it is customary to normalize the high-dimensional input vectors to unit length, and it is sometimes also desirable to obtain balanced clusters, i.e., clusters of comparable sizes. The spherical kmeans (spkmeans) algorithm, which normalizes the cluster centers as well as the inputs, has been successfully used to cluster normalized text documents in 2000+ dimensional space. Unfortunately, like regular kmeans and its soft expectation-maximization-based version, spkmeans tends to generate extremely imbalanced clusters in high-dimensional spaces when the desired number of clusters is large (tens or more). This paper first shows that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and in fact, it can be considered as a batch-mode version of (normalized) competitive learning. The proposed generative model is then adapted in a principled way to yield three frequency-sensitive competitive learning variants that are applicable to static data and produce high-quality, well-balanced clusters for high-dimensional data. Like kmeans, each iteration is linear in the number of data points and in the number of clusters for all three algorithms. A frequency-sensitive algorithm to cluster streaming data is also proposed. Experimental results on clustering of high-dimensional text data sets are provided to show the effectiveness and applicability of the proposed techniques. Index terms: balanced clustering, expectation maximization (EM), frequency-sensitive competitive learning (FSCL), high-dimensional clustering, kmeans, normalized data, scalable clustering, streaming data, text clustering.
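
    The spkmeans idea itself, unit-normalised inputs and centroids with assignment by cosine similarity, fits in a few lines of NumPy, as sketched below; the frequency-sensitive variants and the von Mises-Fisher mixture derivation are not shown.

      # Minimal spherical k-means sketch: unit-norm data and centroids, cosine assignment.
      import numpy as np

      def spherical_kmeans(X, k, n_iter=50, seed=0):
          rng = np.random.default_rng(seed)
          X = X / np.linalg.norm(X, axis=1, keepdims=True)        # project data to the unit sphere
          centers = X[rng.choice(len(X), size=k, replace=False)]
          for _ in range(n_iter):
              labels = np.argmax(X @ centers.T, axis=1)           # nearest centroid by cosine
              for j in range(k):
                  members = X[labels == j]
                  if len(members):
                      c = members.sum(axis=0)
                      centers[j] = c / np.linalg.norm(c)          # re-normalise the centroid
          return labels, centers

      docs = np.abs(np.random.default_rng(2).normal(size=(500, 2000)))   # toy document vectors
      labels, centers = spherical_kmeans(docs, k=10)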

  10. Construction of ground-state preserving sparse lattice models for predictive materials simulations

    NASA Astrophysics Data System (ADS)

    Huang, Wenxuan; Urban, Alexander; Rong, Ziqin; Ding, Zhiwei; Luo, Chuan; Ceder, Gerbrand

    2017-08-01

    First-principles based cluster expansion models are the dominant approach in ab initio thermodynamics of crystalline mixtures enabling the prediction of phase diagrams and novel ground states. However, despite recent advances, the construction of accurate models still requires a careful and time-consuming manual parameter tuning process for ground-state preservation, since this property is not guaranteed by default. In this paper, we present a systematic and mathematically sound method to obtain cluster expansion models that are guaranteed to preserve the ground states of their reference data. The method builds on the recently introduced compressive sensing paradigm for cluster expansion and employs quadratic programming to impose constraints on the model parameters. The robustness of our methodology is illustrated for two lithium transition metal oxides with relevance for Li-ion battery cathodes, i.e., Li2xFe2(1-x)O2 and Li2xTi2(1-x)O2, for which the construction of cluster expansion models with compressive sensing alone has proven to be challenging. We demonstrate that our method not only guarantees ground-state preservation on the set of reference structures used for the model construction, but also show that out-of-sample ground-state preservation up to relatively large supercell size is achievable through a rapidly converging iterative refinement. This method provides a general tool for building robust, compressed and constrained physical models with predictive power.

  11. Recognizing patterns of visual field loss using unsupervised machine learning

    NASA Astrophysics Data System (ADS)

    Yousefi, Siamak; Goldbaum, Michael H.; Zangwill, Linda M.; Medeiros, Felipe A.; Bowd, Christopher

    2014-03-01

    Glaucoma is a potentially blinding optic neuropathy that results in a decrease in visual sensitivity. Visual field abnormalities (decreased visual sensitivity on psychophysical tests) are the primary means of glaucoma diagnosis. One form of visual field testing is Frequency Doubling Technology (FDT), which tests sensitivity at 52 points within the visual field. Like other psychophysical tests used in clinical practice, FDT results yield specific patterns of defect indicative of the disease. We used a Gaussian Mixture Model with Expectation Maximization (GEM; EM is used to estimate the model parameters) to automatically separate FDT data into clusters of normal and abnormal eyes. Principal component analysis (PCA) was used to decompose each cluster into different axes (patterns). FDT measurements were obtained from 1,190 eyes with normal FDT results and 786 eyes with abnormal (i.e., glaucomatous) FDT results, recruited from a university-based, longitudinal, multi-center, clinical study on glaucoma. The GEM input was the 52-point FDT threshold sensitivities for all eyes. The optimal GEM model separated the FDT fields into 3 clusters. Cluster 1 contained 94% normal fields (94% specificity) and clusters 2 and 3 combined contained 77% abnormal fields (77% sensitivity). For clusters 1, 2 and 3, the optimal numbers of PCA-identified axes were 2, 2 and 5, respectively. GEM with PCA successfully separated FDT fields from healthy and glaucoma eyes and identified familiar glaucomatous patterns of loss.

  12. Inferring sources of polycyclic aromatic hydrocarbons (PAHs) in sediments from the western Taiwan Strait through end-member mixing analysis.

    PubMed

    Li, Tao; Sun, Guihua; Ma, Shengzhong; Liang, Kai; Yang, Chupeng; Li, Bo; Luo, Weidong

    2016-11-15

    Concentration, spatial distribution, composition and sources of polycyclic aromatic hydrocarbons (PAHs) were investigated based on measurements of 16 PAH compounds in surface sediments of the western Taiwan Strait. Total PAH concentrations ranged from 2.41 to 218.54 ng g-1. Cluster analysis identified three site clusters representing the northern, central and southern regions. Sedimentary PAHs mainly originated from a mixture of pyrolytic and petrogenic sources in the north, from pyrolytic sources in the central region, and from petrogenic sources in the south. An end-member mixing model was performed using PAH compound data to estimate mixing proportions for unknown end-members (i.e., extreme-value sample points) proposed by principal component analysis (PCA). The results showed that the analyzed samples can be expressed as mixtures of three end-members, and that the mixing of different end-members was strongly related to the transport pathway controlled by two currents, which alternately prevail in the Taiwan Strait during different seasons.
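
    Once end-member profiles have been chosen (for example the PCA-identified extreme samples), the mixing proportions of each sample can be estimated by non-negative least squares and renormalised to sum to one, as in the sketch below on synthetic profiles; the real end-member compositions are not reproduced here.

      # Sketch of the mixing-proportion step only, with synthetic end-member PAH profiles.
      import numpy as np
      from scipy.optimize import nnls

      rng = np.random.default_rng(4)
      end_members = np.abs(rng.normal(size=(16, 3)))          # 16 PAH compounds x 3 end-members
      samples = end_members @ rng.dirichlet(np.ones(3), size=40).T   # 16 x 40 synthetic mixtures

      proportions = []
      for s in samples.T:                                     # one sediment sample at a time
          p, _ = nnls(end_members, s)                         # non-negative mixing coefficients
          proportions.append(p / p.sum())                     # renormalise to sum to 1
      proportions = np.array(proportions)                     # 40 samples x 3 end-members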

  13. Structure, thermodynamics, and solubility in tetromino fluids.

    PubMed

    Barnes, Brian C; Siderius, Daniel W; Gelb, Lev D

    2009-06-16

    To better understand the self-assembly of small molecules and nanoparticles adsorbed at interfaces, we have performed extensive Monte Carlo simulations of a simple lattice model based on the seven hard "tetrominoes", connected shapes that occupy four lattice sites. The equations of state of the pure fluids and all of the binary mixtures are determined over a wide range of density, and a large selection of multicomponent mixtures are also studied at selected conditions. Calculations are performed in the grand canonical ensemble and are analogous to real systems in which molecules or nanoparticles reversibly adsorb to a surface or interface from a bulk reservoir. The model studied is athermal; objects in these simulations avoid overlap but otherwise do not interact. As a result, all of the behavior observed is entropically driven. The one-component fluids all exhibit marked self-ordering tendencies at higher densities, with quite complex structures formed in some cases. Significant clustering of objects with the same rotational state (orientation) is also observed in some of the pure fluids. In all of the binary mixtures, the two species are fully miscible at large scales, but exhibit strong species-specific clustering (segregation) at small scales. This behavior persists in multicomponent mixtures; even in seven-component mixtures of all the shapes there is significant association between objects of the same shape. To better understand these phenomena, we calculate the second virial coefficients of the tetrominoes and related quantities, extract thermodynamic volume of mixing data from the simulations of binary mixtures, and determine Henry's law solubilities for each shape in a variety of solvents. The overall picture obtained is one in which complementarity of both the shapes of individual objects and the characteristic structures of different fluids are important in determining the overall behavior of a fluid of a given composition, with sometimes counterintuitive results. Finally, we note that no sharp phase transitions are observed but that this appears to be due to the small size of the objects considered. It is likely that complex phase behavior may be found in systems of larger polyominoes.

  14. A Hierarchical Bayesian Procedure for Two-Mode Cluster Analysis

    ERIC Educational Resources Information Center

    DeSarbo, Wayne S.; Fong, Duncan K. H.; Liechty, John; Saxton, M. Kim

    2004-01-01

    This manuscript introduces a new Bayesian finite mixture methodology for the joint clustering of row and column stimuli/objects associated with two-mode asymmetric proximity, dominance, or profile data. That is, common clusters are derived which partition both the row and column stimuli/objects simultaneously into the same derived set of clusters.…

  15. Bromine isotope-selective control of CF3Br molecule clustering by IR laser radiation in gas-dynamic expansion of a CF3Br - Ar mixture

    NASA Astrophysics Data System (ADS)

    Apatin, V. M.; Lokhman, V. N.; Makarov, G. N.; Ogurok, N.-D. D.; Ryabov, E. A.

    2018-02-01

    We report experiments on controlling CF3Br molecule clustering during gas-dynamic expansion of a CF3Br - Ar mixture at a nozzle exit by means of IR laser radiation. A cw CO2 laser is used to excite molecules and clusters in the beam, and a time-of-flight mass spectrometer with UV laser ionisation of particles is used for detection. The gas parameters above the nozzle (composition and pressure) at which intensive molecule clustering occurs are determined. It is found that, for CF3Br gas without a carrier, molecular clusters are essentially not generated in the beam when the pressure P0 above the nozzle does not exceed 4 atm. If a mixture of CF3Br with argon is used at a pressure ratio of 1 : N with N >= 3 and a total pressure above the nozzle of P0 >= 2 atm, molecule clustering does occur. We study how the efficiency of suppressing molecule clustering depends on the parameters of the exciting pulse, the gas parameters above the nozzle, and the distance of the irradiation zone from the nozzle exit section. It is shown that resonant vibrational excitation of gas-dynamically cooled CF3Br molecules at the nozzle exit enables isotope-selective suppression of molecule clustering with respect to bromine isotopes. For CF3Br - Ar mixtures with pressure ratios of 1 : 3 and 1 : 15, the enrichment factors obtained with respect to bromine isotopes are kenr ≈ 1.05 ± 0.005 and kenr ≈ 1.06 ± 0.007, respectively, under jet irradiation by laser emission in the 9R(30) line (1084.635 cm-1). These results suggest that the method can be used to control the clustering of molecules containing heavy-element isotopes, which have a small isotopic shift in their IR absorption spectra.

  16. Discrete bivariate population balance modelling of heteroaggregation processes.

    PubMed

    Rollié, Sascha; Briesen, Heiko; Sundmacher, Kai

    2009-08-15

    Heteroaggregation in binary particle mixtures was simulated with a discrete population balance model in terms of two internal coordinates describing the particle properties. The considered particle species are of different size and zeta-potential. Property space is reduced with a semi-heuristic approach to enable an efficient solution. Aggregation rates are based on deterministic models for Brownian motion and stability, under consideration of DLVO interaction potentials. A charge-balance kernel is presented, relating the electrostatic surface potential to the property space by a simple charge balance. Parameter sensitivity with respect to the fractal dimension, aggregate size, hydrodynamic correction, ionic strength and absolute particle concentration was assessed. Results were compared to simulations with the literature kernel based on geometric coverage effects for clusters with heterogeneous surface properties. In both cases electrostatic phenomena, which dominate the aggregation process, show identical trends: impeded cluster-cluster aggregation at low particle mixing ratio (1:1), restabilisation at high mixing ratios (100:1) and formation of complex clusters for intermediate ratios (10:1). The particle mixing ratio controls the surface coverage extent of the larger particle species. Simulation results are compared to experimental flow cytometric data and show very satisfactory agreement.

  17. Direct Reconstruction of CT-Based Attenuation Correction Images for PET With Cluster-Based Penalties

    NASA Astrophysics Data System (ADS)

    Kim, Soo Mee; Alessio, Adam M.; De Man, Bruno; Kinahan, Paul E.

    2017-03-01

    Extremely low-dose (LD) CT acquisitions used for PET attenuation correction have high levels of noise and potential bias artifacts due to photon starvation. This paper explores the use of a priori knowledge for iterative image reconstruction of the CT-based attenuation map. We investigate a maximum a posteriori framework with a cluster-based multinomial penalty for direct iterative coordinate descent (dICD) reconstruction of the PET attenuation map. The objective function for direct iterative attenuation map reconstruction used a Poisson log-likelihood data fit term and evaluated two image penalty terms of spatial and mixture distributions. The spatial regularization is based on a quadratic penalty. For the mixture penalty, we assumed that the attenuation map may consist of four material clusters: air + background, lung, soft tissue, and bone. Using simulated noisy sinogram data, dICD reconstruction was performed with different strengths of the spatial and mixture penalties. The combined spatial and mixture penalties reduced the root mean squared error (RMSE) by roughly two times compared with a weighted least squares and filtered backprojection reconstruction of CT images. The combined spatial and mixture penalties resulted in only slightly lower RMSE compared with a spatial quadratic penalty alone. For direct PET attenuation map reconstruction from ultra-LD CT acquisitions, the combination of spatial and mixture penalties offers regularization of both variance and bias and is a potential method to reconstruct attenuation maps with negligible patient dose. The presented results, using a best-case histogram, suggest that the mixture penalty does not offer a substantive benefit over conventional quadratic regularization and diminish enthusiasm for exploring future application of the mixture penalty.
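
    As a rough illustration of the two penalty terms discussed above, the sketch below evaluates a four-class Gaussian-mixture penalty (a simplified stand-in for the paper's cluster-based penalty) and a quadratic neighbour penalty for a candidate attenuation map; the class means, widths, weights and penalty strengths are assumptions, and the dICD update itself is not shown.

    ```python
    # Illustrative evaluation of the two penalty terms for a candidate attenuation
    # map mu (2D array). Class parameters and weights are assumptions, loosely
    # mimicking air, lung, soft tissue and bone; they are not the paper's values.
    import numpy as np
    from scipy.stats import norm

    means  = np.array([0.0, 0.03, 0.096, 0.13])   # cm^-1, illustrative
    sigmas = np.array([0.005, 0.01, 0.01, 0.01])
    pis    = np.array([0.25, 0.25, 0.4, 0.1])

    def mixture_penalty(mu):
        """Negative log of a 4-component Gaussian mixture, summed over voxels."""
        dens = sum(p * norm.pdf(mu, m, s) for p, m, s in zip(pis, means, sigmas))
        return -np.log(dens + 1e-30).sum()

    def quadratic_penalty(mu):
        """Sum of squared differences between horizontal/vertical neighbours."""
        return ((mu[1:, :] - mu[:-1, :]) ** 2).sum() + ((mu[:, 1:] - mu[:, :-1]) ** 2).sum()

    mu = 0.096 + 0.01 * np.random.default_rng(0).standard_normal((64, 64))
    beta_spatial, beta_mix = 10.0, 0.1            # penalty strengths (assumptions)
    penalty = beta_spatial * quadratic_penalty(mu) + beta_mix * mixture_penalty(mu)
    print(penalty)
    ```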

  18. Field-theoretical approach to a dense polymer with an ideal binary mixture of clustering centers.

    PubMed

    Fantoni, Riccardo; Müller-Nedebock, Kristian K

    2011-07-01

    We propose a field-theoretical approach to a polymer system immersed in an ideal mixture of clustering centers. The system contains several species of these clustering centers with different functionality, each of which connects a fixed number of segments of the chain to each other. The field theory is solved using the saddle point approximation and evaluated for dense polymer melts using the random phase approximation. We find a short-ranged effective intersegment interaction with a strength dependent on the average segment density and discuss the structure factor within this approximation. We also determine the fractions of linkers of the different functionalities.

  19. Discrete Element Model for Suppression of Coffee-Ring Effect

    NASA Astrophysics Data System (ADS)

    Xu, Ting; Lam, Miu Ling; Chen, Ting-Hsuan

    2017-02-01

    When a sessile droplet evaporates, the coffee-ring effect drives the suspended particulate matter to the droplet edge, eventually forming a ring-shaped deposition. Because it causes a non-uniform distribution of solid contents, which is undesired in many applications, attempts have been made to eliminate the coffee-ring effect. Recent reports indicated that the coffee-ring effect can be suppressed by a mixture of spherical and non-spherical particles with enhanced particle-particle interaction at the air-water interface. However, a model to capture the inter-particle dynamics has been lacking. Here, we report a discrete element model (particle system) to investigate the phenomenon. The modeled dynamics include particle transport following the capillary flow with Brownian motion, and the resultant 3D hexagonal close packing of particles along the contact line. For particles adsorbed at the air-water interface, we modeled cluster growth, cluster deformation, and cluster combination. We found that the suppression of the coffee-ring effect does not require a circulatory flow driven by an inward Marangoni flow at the air-water interface. Instead, the rate of new cluster formation, which can be enhanced by increasing the ratio of non-spherical particles and the overall number of microspheres, is the dominant factor in the suppression process. Together, this model provides a useful platform for elucidating how to suppress the coffee-ring effect in practical applications in the future.
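
    A minimal sketch of the particle transport step described above (outward capillary drift plus Brownian motion within a circular droplet footprint) is given below; the drift law, diffusion constant and droplet radius are assumptions, and interface adsorption and cluster growth are not modelled.

    ```python
    # Minimal 2D sketch of suspended particles advected toward the contact line by an
    # outward capillary flow while undergoing Brownian motion. All parameters are
    # illustrative assumptions; adsorption at the air-water interface and the cluster
    # growth/combination steps of the full discrete element model are not included.
    import numpy as np

    rng = np.random.default_rng(1)
    R = 1.0                  # droplet footprint radius (arbitrary units)
    D = 1.0e-3               # particle diffusion constant (assumption)
    v0 = 0.05                # capillary drift speed scale (assumption)
    dt, steps = 0.01, 2000

    pos = rng.uniform(-0.5, 0.5, size=(500, 2))       # initial particle positions
    for _ in range(steps):
        r = np.linalg.norm(pos, axis=1, keepdims=True) + 1e-12
        drift = v0 * pos / r                          # radially outward capillary drift
        noise = np.sqrt(2.0 * D * dt) * rng.standard_normal(pos.shape)
        pos = pos + drift * dt + noise
        r_new = np.linalg.norm(pos, axis=1, keepdims=True)
        pos = np.where(r_new > R, pos * (R / r_new), pos)   # pin particles at the contact line

    edge_fraction = np.mean(np.linalg.norm(pos, axis=1) > 0.9 * R)
    print(f"fraction of particles within 10% of the contact line: {edge_fraction:.2f}")
    ```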

  20. Variable Screening for Cluster Analysis.

    ERIC Educational Resources Information Center

    Donoghue, John R.

    Inclusion of irrelevant variables in a cluster analysis adversely affects subgroup recovery. This paper examines using moment-based statistics to screen variables; only variables that pass the screening are then used in clustering. Normal mixtures are analytically shown often to possess negative kurtosis. Two related measures, "m" and…
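
    The record is truncated before the screening statistics are fully named, so the sketch below only illustrates the general idea of moment-based screening: retain variables whose negative excess kurtosis hints at mixture structure. The threshold and synthetic data are assumptions, not Donoghue's procedure.

    ```python
    # Generic sketch of moment-based variable screening prior to clustering: variables
    # whose excess kurtosis falls below a (user-chosen, assumed) threshold are retained,
    # exploiting the fact that well-separated normal mixtures tend to be platykurtic.
    # This is not the "m" statistic of the record above, which is truncated.
    import numpy as np
    from scipy.stats import kurtosis

    def screen_variables(X, threshold=-0.3):
        """Return column indices whose excess kurtosis is below `threshold`."""
        g2 = kurtosis(X, axis=0, fisher=True, bias=False)   # excess kurtosis per column
        return np.where(g2 < threshold)[0]

    rng = np.random.default_rng(0)
    informative = np.concatenate([rng.normal(-2, 1, 200), rng.normal(2, 1, 200)])  # bimodal
    noise = rng.normal(0, 1, 400)                                                  # irrelevant
    X = np.column_stack([informative, noise])
    print(screen_variables(X))   # expected to keep only column 0
    ```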

  1. Phase behaviour in complementary DNA-coated gold nanoparticles and fd-viruses mixtures: a numerical study.

    PubMed

    Chiappini, Massimiliano; Eiser, Erika; Sciortino, Francesco

    2017-01-01

    A new gel-forming colloidal system based on a binary mixture of fd-viruses and gold nanoparticles functionalized with complementary DNA single strands has been recently introduced. Upon quenching below the DNA melt temperature, such a system forms a highly porous gel state that may be developed into a new functional material of tunable porosity. In order to shed light on the gelation mechanism, we introduce a model closely mimicking the experimental system and explore its equilibrium phase diagram via Monte Carlo simulations. Specifically, we model the system as a binary mixture of hard rods and hard spheres mutually interacting via a short-range square-well attractive potential. Under the experimental conditions, we find evidence of a phase separation occurring either via nucleation-and-growth or via spinodal decomposition. The spinodal decomposition leads to the formation of small clusters of bonded rods and spheres whose further diffusion and aggregation leads to the formation of a percolating network in the system. Our results are consistent with the hypothesis that the mixture of DNA-coated fd-viruses and gold nanoparticles undergoes a non-equilibrium gelation via an arrested spinodal decomposition mechanism.

  2. Consensus sediment quality guidelines for polycyclic aromatic hydrocarbon mixtures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swartz, R.C.

    1999-04-01

    Sediment quality guidelines (SQGs) for polycyclic aromatic hydrocarbons (PAHs) have been derived from a variety of laboratory, field, and theoretical foundations. They include the screening level concentration, effects ranges-low and -median, equilibrium partitioning concentrations, apparent effects threshold, ΣPAH model, and threshold and probable effects levels. The resolution of controversial differences among the PAH SQGs lies in an understanding of the effects of mixtures. Polycyclic aromatic hydrocarbons virtually always occur in field-collected sediment as a complex mixture of covarying compounds. When expressed as a mixture concentration, that is, total PAH (TPAH), the guidelines form three clusters that were intended in their original derivations to represent threshold (TEC = 290 µg/g organic carbon [OC]), median (MEC = 1,800 µg/g OC), and extreme (EEC = 10,000 µg/g OC) effects concentrations. The TEC/MEC/EEC consensus guidelines provide a unifying synthesis of other SQGs, reflect causal rather than correlative effects, account for mixtures, and predict sediment toxicity and benthic community perturbations at sites of PAH contamination. The TEC offers the most useful SQG because PAH mixtures are unlikely to cause adverse effects on benthic ecosystems below the TEC.

  3. Weighted community detection and data clustering using message passing

    NASA Astrophysics Data System (ADS)

    Shi, Cheng; Liu, Yanchen; Zhang, Pan

    2018-03-01

    Grouping objects into clusters based on the similarities or weights between them is one of the most important problems in science and engineering. In this work, by extending message-passing and spectral algorithms proposed for the unweighted community detection problem, we develop a non-parametric method based on statistical physics: we map the problem to the Potts model at the critical temperature of the spin-glass transition and apply belief propagation to compute the marginals of the corresponding Boltzmann distribution. Our algorithm is robust to over-fitting and gives a principled way to determine whether there are significant clusters in the data and how many clusters there are. We apply our method to different clustering tasks. In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms. In the clustering problem, where the data were generated by mixture models in the sparse regime, we show that our method works all the way down to the theoretical limit of detectability and gives accuracy very close to that of the optimal Bayesian inference. In the semi-supervised clustering problem, our method needs only a few labels to work perfectly on classic datasets. Finally, we further develop Thouless-Anderson-Palmer equations, which greatly reduce the computational complexity in dense networks while giving almost the same performance as belief propagation.

  4. Polycyclic aromatic hydrocarbons as skin carcinogens: Comparison of benzo[a]pyrene, dibenzo[def,p]chrysene and three environmental mixtures in the FVB/N mouse

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Siddens, Lisbeth K.; Larkin, Andrew; Superfund Research Center, Oregon State University

    2012-11-01

    The polycyclic aromatic hydrocarbon (PAH), benzo[a]pyrene (BaP), was compared to dibenzo[def,p]chrysene (DBC) and combinations of three environmental PAH mixtures (coal tar, diesel particulate and cigarette smoke condensate) using a two stage, FVB/N mouse skin tumor model. DBC (4 nmol) was most potent, reaching 100% tumor incidence with a shorter latency to tumor formation, less than 20 weeks of 12-O-tetradecanoylphorbol-13-acetate (TPA) promotion compared to all other treatments. Multiplicity was 4 times greater than BaP (400 nmol). Both PAHs produced primarily papillomas followed by squamous cell carcinoma and carcinoma in situ. Diesel particulate extract (1 mg SRM 1650b; mix 1) did not differ from toluene controls and failed to elicit a carcinogenic response. Addition of coal tar extract (1 mg SRM 1597a; mix 2) produced a response similar to BaP. Further addition of 2 mg of cigarette smoke condensate (mix 3) did not alter the response with mix 2. PAH-DNA adducts measured in epidermis 12 h post initiation and analyzed by ³²P post-labeling did not correlate with tumor incidence. PAH-dependent alteration in the transcriptome of skin 12 h post initiation was assessed by microarray. Principal component analysis (sum of all treatments) of the 922 significantly altered genes (p < 0.05) showed DBC and BaP to cluster distinct from PAH mixtures and each other. BaP and mixtures up-regulated phase 1 and phase 2 metabolizing enzymes while DBC did not. The carcinogenicity with DBC and two of the mixtures was much greater than would be predicted based on published Relative Potency Factors (RPFs). -- Highlights: ► Dibenzo[def,p]chrysene (DBC), 3 PAH mixtures, benzo[a]pyrene (BaP) were compared. ► DBC and 2 PAH mixtures were more potent than Relative Potency Factor estimates. ► Transcriptome profiles 12 hours post initiation were analyzed by microarray. ► Principal component analysis of alterations revealed treatment-based clustering. ► DBC gave a unique pattern of gene alterations compared to BaP and PAH mixtures.

  5. Subspace K-means clustering.

    PubMed

    Timmerman, Marieke E; Ceulemans, Eva; De Roover, Kim; Van Leeuwen, Karla

    2013-12-01

    To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning approaches. To evaluate subspace K-means, we performed a comparative simulation study, in which we manipulated the overlap of subspaces, the between-cluster variance, and the error variance. The study shows that the subspace K-means algorithm is sensitive to local minima but that the problem can be reasonably dealt with by using partitions of various cluster procedures as a starting point for the algorithm. Subspace K-means performs very well in recovering the true clustering across all conditions considered and appears to be superior to its competitor methods: K-means, reduced K-means, factorial K-means, mixtures of factor analyzers (MFA), and MCLUST. The best competitor method, MFA, showed a performance similar to that of subspace K-means in easy conditions but deteriorated in more difficult ones. Using data from a study on parental behavior, we show that subspace K-means analysis provides a rich insight into the cluster characteristics, in terms of both the relative positions of the clusters (via the centroids) and the shape of the clusters (via the within-cluster residuals).
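
    Subspace K-means itself alternates between the clustering and the subspace estimation; the sketch below is only the simpler tandem baseline (PCA followed by ordinary K-means) that such joint methods are designed to improve upon, with synthetic data and illustrative parameter choices.

    ```python
    # Simple tandem baseline (PCA then K-means) for contrast with subspace K-means,
    # which instead models the centroids and within-cluster residuals in reduced
    # spaces jointly. Data, dimensionality and number of clusters are illustrative.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    centers = rng.normal(scale=4.0, size=(3, 10))            # 3 clusters in 10 dimensions
    X = np.vstack([c + rng.normal(size=(100, 10)) for c in centers])

    Z = PCA(n_components=2).fit_transform(X)                 # reduce to a 2-D subspace first
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)
    print(np.bincount(labels))                               # roughly 100 points per cluster
    ```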

  6. A latent class distance association model for cross-classified data with a categorical response variable.

    PubMed

    Vera, José Fernando; de Rooij, Mark; Heiser, Willem J

    2014-11-01

    In this paper we propose a latent class distance association model for clustering in the predictor space of large contingency tables with a categorical response variable. The rows of such a table are characterized as profiles of a set of explanatory variables, while the columns represent a single outcome variable. In many cases such tables are sparse, with many zero entries, which makes traditional models problematic. By clustering the row profiles into a few specific classes and representing these together with the categories of the response variable in a low-dimensional Euclidean space using a distance association model, a parsimonious prediction model can be obtained. A generalized EM algorithm is proposed to estimate the model parameters and the adjusted Bayesian information criterion statistic is employed to test the number of mixture components and the dimensionality of the representation. An empirical example highlighting the advantages of the new approach and comparing it with traditional approaches is presented. © 2014 The British Psychological Society.

  7. Cluster Analysis and Gaussian Mixture Estimation of Correlated Time-Series by Means of Multi-dimensional Scaling

    NASA Astrophysics Data System (ADS)

    Ibuki, Takero; Suzuki, Sei; Inoue, Jun-ichi

    We investigate cross-correlations between typical Japanese stocks collected through the Yahoo!Japan website ( http://finance.yahoo.co.jp/ ). Applying multi-dimensional scaling (MDS) to the cross-correlation matrices, we draw two-dimensional scatter plots in which each point corresponds to a stock. To cluster these data points, we fit the data set with a mixture of Gaussian densities. By minimizing the Akaike Information Criterion (AIC) with respect to the parameters of the mixture, we attempt to specify the best possible mixture of Gaussians. It might be naturally assumed that the two-dimensional data points of all stocks shrink into a single small region when an economic crisis takes place. The justification of this assumption is numerically checked for the empirical Japanese stock data, for instance, around 11 March 2011.
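
    A minimal sketch of the pipeline outlined above, with synthetic returns standing in for the stock data: map the correlation matrix to distances, embed with MDS, and choose the number of Gaussian components by AIC. The distance transform and candidate component range are assumptions.

    ```python
    # Sketch of the MDS + Gaussian-mixture + AIC pipeline on synthetic correlated
    # returns standing in for the stock data; all parameters are illustrative.
    import numpy as np
    from sklearn.manifold import MDS
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    returns = rng.standard_normal((250, 40))          # 250 days x 40 synthetic stocks
    corr = np.corrcoef(returns, rowvar=False)
    dist = np.sqrt(2.0 * (1.0 - corr))                # a common correlation-to-distance map

    coords = MDS(n_components=2, dissimilarity="precomputed",
                 random_state=0).fit_transform(dist)

    models = [GaussianMixture(n_components=k, random_state=0).fit(coords)
              for k in range(1, 7)]
    aics = [m.aic(coords) for m in models]
    best = models[int(np.argmin(aics))]
    print("AIC-selected number of Gaussian components:", best.n_components)
    ```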

  8. Ab initio study of the structural properties of acetonitrile-water mixtures

    NASA Astrophysics Data System (ADS)

    Chen, Jinfan; Sit, Patrick H.-L.

    2015-08-01

    Structural properties of acetonitrile and acetonitrile-water mixtures are studied using Density Functional Theory (DFT) and ab initio molecular dynamics simulations. Stable molecular clusters consisting of several water and acetonitrile molecules are identified to provide a microscopic understanding of the interactions among water and acetonitrile molecules. Ab initio molecular dynamics simulations are performed to study the liquid structure at finite temperature. Three mixing compositions, in which the mole fraction of acetonitrile equals 0.109, 0.5 and 0.891, are studied. These compositions correspond to three distinct structural regimes. At the 0.109 and 0.891 mole fractions of acetonitrile, the majority species are mostly connected among themselves and the minority species are either isolated or form small clusters without disrupting the network of the majority species. At the 0.5 mole fraction of acetonitrile, large water and acetonitrile clusters persist throughout the simulation, exhibiting the microheterogeneous behavior of acetonitrile-water mixtures in the mid-range mixing ratio.

  9. Preferential solvation of lysozyme in dimethyl sulfoxide/water binary mixture probed by terahertz spectroscopy.

    PubMed

    Das, Dipak Kumar; Patra, Animesh; Mitra, Rajib Kumar

    2016-09-01

    We report the changes in the hydration dynamics around the model protein hen egg white lysozyme (HEWL) in a water-dimethyl sulfoxide (DMSO) binary mixture using the THz time-domain spectroscopy (TTDS) technique. DMSO molecules preferentially solvate the protein surface, as indicated by circular dichroism (CD) and Fourier transform infrared (FTIR) studies in the mid-infrared region, resulting in a conformational change in the protein, which consequently modifies the associated hydration dynamics. As a control, we also study the collective hydration dynamics of the water-DMSO binary mixture and find that it follows non-ideal behavior owing to the formation of DMSO-water clusters. It is observed that the cooperative dynamics of water at the protein surface follows the DMSO-mediated conformational modulation of the protein. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Mutual diffusion of binary liquid mixtures containing methanol, ethanol, acetone, benzene, cyclohexane, toluene, and carbon tetrachloride

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guevara-Carrion, Gabriela; Janzen, Tatjana; Muñoz-Muñoz, Y. Mauricio

    Mutual diffusion coefficients of all 20 binary liquid mixtures that can be formed out of methanol, ethanol, acetone, benzene, cyclohexane, toluene, and carbon tetrachloride without a miscibility gap are studied at ambient conditions of temperature and pressure in the entire composition range. The considered mixtures show a varying mixing behavior from almost ideal to strongly non-ideal. Predictive molecular dynamics simulations employing the Green-Kubo formalism are carried out. Radial distribution functions are analyzed to gain an understanding of the liquid structure influencing the diffusion processes. It is shown that cluster formation in mixtures containing one alcoholic component has a significant impact on the diffusion process. The estimation of the thermodynamic factor from experimental vapor-liquid equilibrium data is investigated, considering three excess Gibbs energy models, i.e., Wilson, NRTL, and UNIQUAC. It is found that the Wilson model yields the thermodynamic factor that best suits the simulation results for the prediction of the Fick diffusion coefficient. Four semi-empirical methods for the prediction of the self-diffusion coefficients and nine predictive equations for the Fick diffusion coefficient are assessed and it is found that methods based on local composition models are more reliable. Finally, the shear viscosity and thermal conductivity are predicted and in most cases favorably compared with experimental literature values.

  11. Sequential updating of multimodal hydrogeologic parameter fields using localization and clustering techniques

    NASA Astrophysics Data System (ADS)

    Sun, Alexander Y.; Morris, Alan P.; Mohanty, Sitakanta

    2009-07-01

    Estimated parameter distributions in groundwater models may contain significant uncertainties because of data insufficiency. Therefore, adaptive uncertainty reduction strategies are needed to continuously improve model accuracy by fusing new observations. In recent years, various ensemble Kalman filters have been introduced as viable tools for updating high-dimensional model parameters. However, their usefulness is largely limited by the inherent assumption of Gaussian error statistics. Hydraulic conductivity distributions in alluvial aquifers, for example, are usually non-Gaussian as a result of complex depositional and diagenetic processes. In this study, we combine an ensemble Kalman filter with grid-based localization and Gaussian mixture model (GMM) clustering techniques for updating high-dimensional, multimodal parameter distributions via dynamic data assimilation. We introduce innovative strategies (e.g., block updating and dimension reduction) to effectively reduce the computational costs associated with these modified ensemble Kalman filter schemes. The developed data assimilation schemes are demonstrated numerically for identifying the multimodal heterogeneous hydraulic conductivity distributions in a binary facies alluvial aquifer. Our results show that localization and GMM clustering are very promising techniques for assimilating high-dimensional, multimodal parameter distributions, and they outperform the corresponding global ensemble Kalman filter analysis scheme in all scenarios considered.
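
    A minimal stochastic (perturbed-observation) ensemble Kalman filter analysis step is sketched below, without the grid-based localization or GMM clustering that the study adds; the state size, observation operator and error statistics are illustrative assumptions.

    ```python
    # Minimal stochastic (perturbed-observation) ensemble Kalman filter analysis step.
    # The localization and GMM-clustering extensions described in the abstract are not
    # included; all dimensions and error statistics are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n_state, n_obs, n_ens = 100, 10, 50

    H = np.zeros((n_obs, n_state))
    H[np.arange(n_obs), np.arange(0, n_state, n_state // n_obs)] = 1.0   # observe every 10th cell
    R = 0.1 * np.eye(n_obs)                                              # observation error covariance

    truth = np.sin(np.linspace(0, 4 * np.pi, n_state))
    ensemble = truth[:, None] + rng.standard_normal((n_state, n_ens))    # prior ensemble
    y = H @ truth + rng.multivariate_normal(np.zeros(n_obs), R)          # synthetic observations

    X_mean = ensemble.mean(axis=1, keepdims=True)
    A = ensemble - X_mean                                  # state anomalies
    HA = H @ A                                             # observation-space anomalies
    P_yy = HA @ HA.T / (n_ens - 1) + R
    P_xy = A @ HA.T / (n_ens - 1)
    K = P_xy @ np.linalg.inv(P_yy)                         # Kalman gain

    Y_pert = y[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens).T
    analysis = ensemble + K @ (Y_pert - H @ ensemble)      # updated (analysis) ensemble
    print("prior RMSE:", np.sqrt(((X_mean[:, 0] - truth) ** 2).mean()),
          "analysis RMSE:", np.sqrt(((analysis.mean(axis=1) - truth) ** 2).mean()))
    ```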

  12. Iron Catalyst Chemistry in High Pressure Carbon Monoxide Nanotube Reactor

    NASA Technical Reports Server (NTRS)

    Scott, Carl D.; Povitsky, Alexander; Dateo, Christopher; Gokcen, Tahir; Smalley, Richard E.

    2001-01-01

    The high-pressure carbon monoxide (HiPco) technique for producing single wall carbon nanotubes (SWNT) is analyzed using a chemical reaction model coupled with properties calculated along streamlines. Streamline properties for mixing jets are calculated by the FLUENT code using the k-e turbulence model for pure carbon monoxide. The HiPco process introduces cold iron pentacarbonyl diluted in CO, or alternatively nitrogen, at high pressure, ca. 30 atmospheres, into a conical mixing zone. Hot CO is also introduced via three jets at angles with respect to the axis of the reactor. Hot CO decomposes the Fe(CO)5 to release atomic Fe. Cluster reaction rates are from Krestinin et al., based on shock tube measurements. Another model is from classical cluster theory given by Girshick's team. The calculations are performed along streamlines, assuming that a cold mixture of Fe(CO)5 in CO is introduced along the reactor axis. Iron then forms clusters that catalyze the formation of SWNTs via the Boudouard reaction of CO on Fe-containing clusters. To simulate the chemical process along the streamlines calculated by the fluid dynamics code FLUENT, a time history of temperature and dilution is determined along each streamline. Alternative catalyst injection schemes are also evaluated.

  13. Density control of dodecamanganese clusters anchored on silicon(100).

    PubMed

    Condorelli, Guglielmo G; Motta, Alessandro; Favazza, Maria; Nativo, Paola; Fragalà, Ignazio L; Gatteschi, Dante

    2006-04-24

    A synthetic strategy to control the density of Mn12 clusters anchored on silicon(100) was investigated. Diluted monolayers suitable for Mn12 anchoring were prepared by Si-grafting mixtures of the methyl 10-undecylenoate precursor ligand with 1-decene spectator spacers. Different ratios of these mixtures were tested. The grafted surfaces were hydrolyzed to reveal the carboxylic groups available for the subsequent exchange with the [Mn12O12(OAc)16(H2O)4]·4H2O·2AcOH cluster. Modified surfaces were analyzed by attenuated total reflection (ATR)-FTIR spectroscopy, X-ray photoemission spectroscopy (XPS), and AFM imaging. Results of XPS and ATR-FTIR spectroscopy show that the surface mole ratio between grafted ester and decene is higher than in the source solution. The surface density of the Mn12 cluster is, in turn, strictly proportional to the ester mole fraction. Well-resolved and isolated clusters were observed by AFM, using a diluted ester/decene 1:1 solution.

  14. Multimodal brain-tumor segmentation based on Dirichlet process mixture model with anisotropic diffusion and Markov random field prior.

    PubMed

    Lu, Yisu; Jiang, Jun; Yang, Wei; Feng, Qianjin; Chen, Wufan

    2014-01-01

    Brain-tumor segmentation is an important clinical requirement for brain-tumor diagnosis and radiotherapy planning. It is well-known that the number of clusters is one of the most important parameters for automatic segmentation. However, it is difficult to define owing to the high diversity in appearance of tumor tissue among different patients and the ambiguous boundaries of lesions. In this study, a nonparametric mixture of Dirichlet process (MDP) model is applied to segment the tumor images, and the MDP segmentation can be performed without initialization of the number of clusters. Because the classical MDP segmentation cannot be applied for real-time diagnosis, a new nonparametric segmentation algorithm combined with anisotropic diffusion and a Markov random field (MRF) smoothness constraint is proposed in this study. Besides the segmentation of single-modal brain-tumor images, we extended the algorithm to segment multimodal brain-tumor images using the magnetic resonance (MR) multimodal features and to obtain the active tumor and edema at the same time. The proposed algorithm is evaluated using 32 multimodal MR glioma image sequences, and the segmentation results are compared with other approaches. The accuracy and computation time of our algorithm demonstrate very impressive performance and great potential for practical real-time clinical use.
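
    The MDP segmentation itself is not reproduced here; as a rough illustration of letting a Dirichlet-process-style prior determine the effective number of intensity clusters, scikit-learn's truncated variational BayesianGaussianMixture can be run on synthetic multimodal intensities. The anisotropic diffusion and MRF prior are omitted, and the data and component cap are assumptions.

    ```python
    # Rough illustration of Dirichlet-process-style selection of the number of intensity
    # clusters using scikit-learn's (truncated, variational) BayesianGaussianMixture.
    # The anisotropic diffusion and MRF smoothness prior of the paper are not included;
    # the synthetic intensities and the component cap are assumptions.
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(0)
    intensities = np.concatenate([rng.normal(0.2, 0.03, 4000),    # e.g. background
                                  rng.normal(0.55, 0.05, 3000),   # e.g. healthy tissue
                                  rng.normal(0.8, 0.04, 1000)])   # e.g. lesion
    X = intensities.reshape(-1, 1)

    dpgmm = BayesianGaussianMixture(n_components=10,              # upper bound, not a fixed K
                                    weight_concentration_prior_type="dirichlet_process",
                                    random_state=0, max_iter=500).fit(X)
    effective = (dpgmm.weights_ > 0.01).sum()                     # components that survive
    print("effective number of clusters:", effective)
    labels = dpgmm.predict(X)
    ```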

  15. Multimodal Brain-Tumor Segmentation Based on Dirichlet Process Mixture Model with Anisotropic Diffusion and Markov Random Field Prior

    PubMed Central

    Lu, Yisu; Jiang, Jun; Chen, Wufan

    2014-01-01

    Brain-tumor segmentation is an important clinical requirement for brain-tumor diagnosis and radiotherapy planning. It is well-known that the number of clusters is one of the most important parameters for automatic segmentation. However, it is difficult to define owing to the high diversity in appearance of tumor tissue among different patients and the ambiguous boundaries of lesions. In this study, a nonparametric mixture of Dirichlet process (MDP) model is applied to segment the tumor images, and the MDP segmentation can be performed without initialization of the number of clusters. Because the classical MDP segmentation cannot be applied for real-time diagnosis, a new nonparametric segmentation algorithm combined with anisotropic diffusion and a Markov random field (MRF) smoothness constraint is proposed in this study. Besides the segmentation of single-modal brain-tumor images, we extended the algorithm to segment multimodal brain-tumor images using the magnetic resonance (MR) multimodal features and to obtain the active tumor and edema at the same time. The proposed algorithm is evaluated using 32 multimodal MR glioma image sequences, and the segmentation results are compared with other approaches. The accuracy and computation time of our algorithm demonstrate very impressive performance and great potential for practical real-time clinical use. PMID:25254064

  16. Thermodynamics of mixtures of patchy and spherical colloids of different sizes: A multi-body association theory with complete reference fluid information.

    PubMed

    Bansal, Artee; Valiya Parambathu, Arjun; Asthagiri, D; Cox, Kenneth R; Chapman, Walter G

    2017-04-28

    We present a theory to predict the structure and thermodynamics of mixtures of colloids of different diameters, building on our earlier work [A. Bansal et al., J. Chem. Phys. 145, 074904 (2016)] that considered mixtures with all particles constrained to have the same size. The patchy, solvent particles have short-range directional interactions, while the solute particles have short-range isotropic interactions. The hard-sphere mixture without any association site forms the reference fluid. An important ingredient within the multi-body association theory is the description of clustering of the reference solvent around the reference solute. Here we account for the physical, multi-body clusters of the reference solvent around the reference solute in terms of occupancy statistics in a defined observation volume. These occupancy probabilities are obtained from enhanced sampling simulations, but we also present statistical mechanical models to estimate these probabilities with limited simulation data. Relative to an approach that describes only up to three-body correlations in the reference, incorporating the complete reference information better predicts the bonding state and thermodynamics of the physical solute for a wide range of system conditions. Importantly, analysis of the residual chemical potential of the infinitely dilute solute from molecular simulation and theory shows that whereas the chemical potential is somewhat insensitive to the description of the structure of the reference fluid, the energetic and entropic contributions are not, with the results from the complete reference approach being in better agreement with particle simulations.

  17. Thermodynamics of mixtures of patchy and spherical colloids of different sizes: A multi-body association theory with complete reference fluid information

    NASA Astrophysics Data System (ADS)

    Bansal, Artee; Valiya Parambathu, Arjun; Asthagiri, D.; Cox, Kenneth R.; Chapman, Walter G.

    2017-04-01

    We present a theory to predict the structure and thermodynamics of mixtures of colloids of different diameters, building on our earlier work [A. Bansal et al., J. Chem. Phys. 145, 074904 (2016)] that considered mixtures with all particles constrained to have the same size. The patchy, solvent particles have short-range directional interactions, while the solute particles have short-range isotropic interactions. The hard-sphere mixture without any association site forms the reference fluid. An important ingredient within the multi-body association theory is the description of clustering of the reference solvent around the reference solute. Here we account for the physical, multi-body clusters of the reference solvent around the reference solute in terms of occupancy statistics in a defined observation volume. These occupancy probabilities are obtained from enhanced sampling simulations, but we also present statistical mechanical models to estimate these probabilities with limited simulation data. Relative to an approach that describes only up to three-body correlations in the reference, incorporating the complete reference information better predicts the bonding state and thermodynamics of the physical solute for a wide range of system conditions. Importantly, analysis of the residual chemical potential of the infinitely dilute solute from molecular simulation and theory shows that whereas the chemical potential is somewhat insensitive to the description of the structure of the reference fluid, the energetic and entropic contributions are not, with the results from the complete reference approach being in better agreement with particle simulations.

  18. Relation of neural structure to persistently low academic achievement: a longitudinal study of children with differing birth weights.

    PubMed

    Clark, Caron A C; Fang, Hua; Espy, Kimberly Andrews; Filipek, Pauline A; Juranek, Jenifer; Bangert, Barbara; Hack, Maureen; Taylor, H Gerry

    2013-05-01

    This study examined the relation of cerebral tissue reductions associated with VLBW to patterns of growth in core academic domains. Children born <750 g, 750 to 1,499 g, or >2,500 g completed measures of calculation, mathematical problem solving, and word decoding at time points spanning middle childhood and adolescence. K. A. Espy, H. Fang, D. Charak, N. M. Minich, and H. G. Taylor (2009, Growth mixture modeling of academic achievement in children of varying birth weight risk, Neuropsychology, Vol. 23, pp. 460-474) used growth mixture modeling to identify two growth trajectories (clusters) for each academic domain: an average achievement trajectory and a persistently low trajectory. In this study, 97 of the same participants underwent magnetic resonance imaging (MRI) in late adolescence, and cerebral tissue volumes were used to predict the probability of low growth cluster membership for each domain. Adjusting for whole brain volume (wbv), each 1-cm³ reduction in caudate volume was associated with a 1.7- to 2.1-fold increase in the odds of low cluster membership for each domain. Each 1-mm² decrease in corpus callosum surface area increased these odds approximately 1.02-fold. Reduced cerebellar white matter volume was associated specifically with low calculation and decoding growth, and reduced cerebral white matter volume was associated with low calculation growth. Findings were similar when analyses were confined to the VLBW groups. Reduced volume of structures involved in connectivity, executive attention, and motor control may contribute to heterogeneous academic trajectories among children with VLBW.

  19. Efficient ensemble forecasting of marine ecology with clustered 1D models and statistical lateral exchange: application to the Red Sea

    NASA Astrophysics Data System (ADS)

    Dreano, Denis; Tsiaras, Kostas; Triantafyllou, George; Hoteit, Ibrahim

    2017-07-01

    Forecasting the state of large marine ecosystems is important for many economic and public health applications. However, advanced three-dimensional (3D) ecosystem models, such as the European Regional Seas Ecosystem Model (ERSEM), are computationally expensive, especially when implemented within an ensemble data assimilation system requiring several parallel integrations. As an alternative to 3D ecological forecasting systems, we propose to implement a set of regional one-dimensional (1D) water-column ecological models that run at a fraction of the computational cost. The 1D model domains are determined using a Gaussian mixture model (GMM)-based clustering method and satellite chlorophyll-a (Chl-a) data. Regionally averaged Chl-a data is assimilated into the 1D models using the singular evolutive interpolated Kalman (SEIK) filter. To laterally exchange information between subregions and improve the forecasting skills, we introduce a new correction step to the assimilation scheme, in which we assimilate a statistical forecast of future Chl-a observations based on information from neighbouring regions. We apply this approach to the Red Sea and show that the assimilative 1D ecological models can forecast surface Chl-a concentration with high accuracy. The statistical assimilation step further improves the forecasting skill by as much as 50%. This general approach of clustering large marine areas and running several interacting 1D ecological models is very flexible. It allows many combinations of clustering, filtering and regression techniques to be used and can be applied to build efficient forecasting systems in other large marine ecosystems.

  20. Mean-cluster approach indicates cell sorting time scales are determined by collective dynamics

    NASA Astrophysics Data System (ADS)

    Beatrici, Carine P.; de Almeida, Rita M. C.; Brunnet, Leonardo G.

    2017-03-01

    Cell migration is essential to cell segregation, playing a central role in tissue formation, wound healing, and tumor evolution. Considering random mixtures of two cell types, it is still not clear which cell characteristics define clustering time scales. The mass of diffusing clusters merging with one another is expected to grow as t^(d/(d+2)) when the diffusion constant scales with the inverse of the cluster mass. Cell segregation experiments deviate from that behavior. Explanations for that could arise from specific microscopic mechanisms or from collective effects typical of active matter. Here we consider a power law connecting the diffusion constant and cluster mass to propose an analytic approach to model cell segregation where we explicitly take into account finite-size corrections. The results are compared with active matter model simulations and experiments available in the literature. To investigate the role played by different mechanisms we considered different hypotheses describing cell-cell interaction: the differential adhesion hypothesis and the different velocities hypothesis. We find that the simulations yield normal diffusion for long time intervals. Analytic and simulation results show that (i) cluster evolution clearly tends to a scaling regime, disrupted only at finite-size limits; (ii) cluster diffusion is greatly enhanced by cell collective behavior, such that for a high enough tendency to follow the neighbors, cluster diffusion may become independent of cluster size; (iii) the scaling exponent for cluster growth depends only on the mass-diffusion relation, not on the detailed local segregation mechanism. These results apply to active matter systems in general and, in particular, the mechanisms found underlying the increase in cell sorting speed certainly have deep implications for biological evolution as a selection mechanism.

  1. Predicting Adolescents' Bullying Participation from Developmental Trajectories of Social Status and Behavior.

    PubMed

    Pouwels, J Loes; Salmivalli, Christina; Saarento, Silja; van den Berg, Yvonne H M; Lansu, Tessa A M; Cillessen, Antonius H N

    2017-03-28

    The aim of this study was to determine how trajectory clusters of social status (social preference and perceived popularity) and behavior (direct aggression and prosocial behavior) from age 9 to age 14 predicted adolescents' bullying participant roles at age 16 and 17 (n = 266). Clusters were identified with multivariate growth mixture modeling (GMM). The findings showed that participants' developmental trajectories of social status and social behavior across childhood and early adolescence predicted their bullying participant role involvement in adolescence. Practical implications and suggestions for further research are discussed. © 2017 The Authors. Child Development published by Wiley Periodicals, Inc. on behalf of Society for Research in Child Development.

  2. Photothermal microfluidic cantilever deflection spectroscopy reflecting clustering mechanism of ethanol water mixtures

    NASA Astrophysics Data System (ADS)

    Ghoraishi, Maryam; Hawk, John; Thundat, Thomas

    Aqueous mixtures of alcohols are a typical prototype system for biomolecules, micelle formation, and the structural stability of proteins. Short-chain alcohols such as EtOH have therefore been used as a simple model for understanding more complex aqueous biomolecules. Here we study the vibrational energy peaks of EtOH-water binary mixtures by micromechanical calorimetric spectroscopy using bimaterial microfluidic cantilevers (BMC). The IR spectra of EtOH-water are collected experimentally with a BMC as the EtOH concentration changes from 20 to 100 wt%. As the EtOH concentration varies in the mixture, considerable shifts in the wavenumber of the IR absorption peak maxima are reported. The experimentally measured shifts in the wavenumber of the IR absorption peak maxima are related to changes in the dipole moment (μ) of EtOH at different concentrations. The relationship between the IR absorption wavenumber for both the anti and gauche conformers of EtOH and the inverse dipole moment, 1/μ, of EtOH at different concentrations follows a power-law dependence. Our technique offers a platform to investigate the dipole effect on molecular vibrations of mixtures in confined picoliter volumes, previously unexplored with other analytical techniques due to limitations on the volume under study.

  3. Automated flow cytometric analysis across large numbers of samples and cell types.

    PubMed

    Chen, Xiaoyi; Hasan, Milena; Libri, Valentina; Urrutia, Alejandra; Beitz, Benoît; Rouilly, Vincent; Duffy, Darragh; Patin, Étienne; Chalmond, Bernard; Rogge, Lars; Quintana-Murci, Lluis; Albert, Matthew L; Schwikowski, Benno

    2015-04-01

    Multi-parametric flow cytometry is a key technology for characterization of immune cell phenotypes. However, robust high-dimensional post-analytic strategies for automated data analysis in large numbers of donors are still lacking. Here, we report a computational pipeline, called FlowGM, which minimizes operator input, is insensitive to compensation settings, and can be adapted to different analytic panels. A Gaussian Mixture Model (GMM)-based approach was utilized for initial clustering, with the number of clusters determined using Bayesian Information Criterion. Meta-clustering in a reference donor permitted automated identification of 24 cell types across four panels. Cluster labels were integrated into FCS files, thus permitting comparisons to manual gating. Cell numbers and coefficient of variation (CV) were similar between FlowGM and conventional gating for lymphocyte populations, but notably FlowGM provided improved discrimination of "hard-to-gate" monocyte and dendritic cell (DC) subsets. FlowGM thus provides rapid high-dimensional analysis of cell phenotypes and is amenable to cohort studies. Copyright © 2015. Published by Elsevier Inc.
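
    A minimal sketch of the initial clustering step described above (fit Gaussian mixtures over a range of component counts and keep the BIC-optimal one) is given below on synthetic two-marker data; the marker layout and candidate range are assumptions, and the meta-clustering and labeling stages are not shown.

    ```python
    # Sketch of GMM clustering with the number of components chosen by BIC, as in the
    # initial step described above. The two synthetic "markers" and the candidate
    # range of components are assumptions; meta-clustering across donors is not shown.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    cells = np.vstack([rng.normal([1.0, 4.0], 0.3, size=(2000, 2)),   # population A
                       rng.normal([4.0, 4.0], 0.4, size=(1500, 2)),   # population B
                       rng.normal([3.0, 1.0], 0.5, size=(1000, 2))])  # population C

    fits = {k: GaussianMixture(n_components=k, covariance_type="full",
                               random_state=0).fit(cells)
            for k in range(1, 9)}
    best_k = min(fits, key=lambda k: fits[k].bic(cells))
    print("BIC-selected number of clusters:", best_k)
    labels = fits[best_k].predict(cells)
    ```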

  4. Optical depth in particle-laden turbulent flows

    NASA Astrophysics Data System (ADS)

    Frankel, A.; Iaccarino, G.; Mani, A.

    2017-11-01

    Turbulent clustering of particles causes an increase in the radiation transmission through gas-particle mixtures. Attempts to capture the ensemble-averaged transmission lead to a closure problem called the turbulence-radiation interaction. A simple closure model based on the particle radial distribution function is proposed to capture the effect of turbulent fluctuations in the concentration on radiation intensity. The model is validated against a set of particle-resolved ray tracing experiments through particle fields from direct numerical simulations of particle-laden turbulence. The form of the closure model is generalizable to arbitrary stochastic media with known two-point correlation functions.

  5. Patterns of glaucomatous visual field loss in SITA fields automatically identified using independent component analysis.

    PubMed

    Goldbaum, Michael H; Jang, Gil-Jin; Bowd, Chris; Hao, Jiucang; Zangwill, Linda M; Liebmann, Jeffrey; Girkin, Christopher; Jung, Tzyy-Ping; Weinreb, Robert N; Sample, Pamela A

    2009-12-01

    To determine if the patterns uncovered with variational Bayesian-independent component analysis-mixture model (VIM) applied to a large set of normal and glaucomatous fields obtained with the Swedish Interactive Thresholding Algorithm (SITA) are distinct, recognizable, and useful for modeling the severity of the field loss. SITA fields were obtained with the Humphrey Visual Field Analyzer (Carl Zeiss Meditec, Inc, Dublin, California) on 1,146 normal eyes and 939 glaucoma eyes from subjects followed by the Diagnostic Innovations in Glaucoma Study and the African Descent and Glaucoma Evaluation Study. VIM modifies independent component analysis (ICA) to develop separate sets of ICA axes in the cluster of normal fields and the 2 clusters of abnormal fields. Of 360 models, the model with the best separation of normal and glaucomatous fields was chosen for creating the maximally independent axes. Grayscale displays of fields generated by VIM on each axis were compared. SITA fields most closely associated with each axis and displayed in grayscale were evaluated for consistency of pattern at all severities. The best VIM model had 3 clusters. Cluster 1 (1,193) was mostly normal (1,089, 95% specificity) and had 2 axes. Cluster 2 (596) contained mildly abnormal fields (513) and 2 axes; cluster 3 (323) held mostly moderately to severely abnormal fields (322) and 5 axes. Sensitivity for clusters 2 and 3 combined was 88.9%. The VIM-generated field patterns differed from each other and resembled glaucomatous defects (eg, nasal step, arcuate, temporal wedge). SITA fields assigned to an axis resembled each other and the VIM-generated patterns for that axis. Pattern severity increased in the positive direction of each axis by expansion or deepening of the axis pattern. VIM worked well on SITA fields, separating them into distinctly different yet recognizable patterns of glaucomatous field defects. The axis and pattern properties make VIM a good candidate as a preliminary process for detecting progression.

  6. Clustering and variable selection in the presence of mixed variable types and missing data.

    PubMed

    Storlie, C B; Myers, S M; Katusic, S K; Weaver, A L; Voigt, R G; Croarkin, P E; Stoeckel, R E; Port, J D

    2018-05-17

    We consider the problem of model-based clustering in the presence of many correlated, mixed continuous, and discrete variables, some of which may have missing values. Discrete variables are treated with a latent continuous variable approach, and the Dirichlet process is used to construct a mixture model with an unknown number of components. Variable selection is also performed to identify the variables that are most influential for determining cluster membership. The work is motivated by the need to cluster patients thought to potentially have autism spectrum disorder on the basis of many cognitive and/or behavioral test scores. There are a modest number of patients (486) in the data set along with many (55) test score variables (many of which are discrete valued and/or missing). The goal of the work is to (1) cluster these patients into similar groups to help identify those with similar clinical presentation and (2) identify a sparse subset of tests that inform the clusters in order to eliminate unnecessary testing. The proposed approach compares very favorably with other methods via simulation of problems of this type. The results of the autism spectrum disorder analysis suggested 3 clusters to be most likely, while only 4 test scores had high (>0.5) posterior probability of being informative. This will result in much more efficient and informative testing. The need to cluster observations on the basis of many correlated, continuous/discrete variables with missing values is a common problem in the health sciences as well as in many other disciplines. Copyright © 2018 John Wiley & Sons, Ltd.

  7. Mixedness determination of rare earth-doped ceramics

    NASA Astrophysics Data System (ADS)

    Czerepinski, Jennifer H.

    The lack of chemical uniformity in a powder mixture, such as clustering of a minor component, can lead to deterioration of materials properties. A method to determine powder mixture quality is to correlate the chemical homogeneity of a multi-component mixture with its particle size distribution and mixing method. This is applicable to rare earth-doped ceramics, which require at least 1-2 nm dopant ion spacing to optimize optical properties. Mixedness simulations were conducted for random heterogeneous mixtures of Nd-doped LaF3 mixtures using the Concentric Shell Model of Mixedness (CSMM). Results indicate that when the host to dopant particle size ratio is 100, multi-scale concentration variance is optimized. In order to verify results from the model, experimental methods that probe a mixture at the micro, meso, and macro scales are needed. To directly compare CSMM results experimentally, an image processing method was developed to calculate variance profiles from electron images. An in-lens (IL) secondary electron image is subtracted from the corresponding Everhart-Thornley (ET) secondary electron image in a Field-Emission Scanning Electron Microscope (FESEM) to produce two phases and pores that can be quantified with 50 nm spatial resolution. A macro was developed to quickly analyze multi-scale compositional variance from these images. Results for a 50:50 mixture of NdF3 and LaF3 agree with the computational model. The method has proven to be applicable only for mixtures with major components and specific particle morphologies, but the macro is useful for any type of imaging that produces excellent phase contrast, such as confocal microscopy. Fluorescence spectroscopy was used as an indirect method to confirm computational results for Nd-doped LaF3 mixtures. Fluorescence lifetime can be used as a quantitative method to indirectly measure chemical homogeneity when the limits of electron microscopy have been reached. Fluorescence lifetime represents the compositional fluctuations of a dopant on the nanoscale while accounting for billions of particles in a fast, non-destructive manner. The significance of this study will show how small-scale fluctuations in homogeneity limit the optimization of optical properties, which can be improved by the proper selection of particle size and mixing method.

  8. Partially supervised speaker clustering.

    PubMed

    Tang, Hao; Chu, Stephen Mingyu; Hasegawa-Johnson, Mark; Huang, Thomas S

    2012-05-01

    Content-based multimedia indexing, retrieval, and processing as well as multimedia databases demand the structuring of the media content (image, audio, video, text, etc.), one significant goal being to associate the identity of the content to the individual segments of the signals. In this paper, we specifically address the problem of speaker clustering, the task of assigning every speech utterance in an audio stream to its speaker. We offer a complete treatment to the idea of partially supervised speaker clustering, which refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. By means of an independent training data set, we encode the prior knowledge at the various stages of the speaker clustering pipeline via 1) learning a speaker-discriminative acoustic feature transformation, 2) learning a universal speaker prior model, and 3) learning a discriminative speaker subspace, or equivalently, a speaker-discriminative distance metric. We study the directional scattering property of the Gaussian mixture model (GMM) mean supervector representation of utterances in the high-dimensional space, and advocate exploiting this property by using the cosine distance metric instead of the euclidean distance metric for speaker clustering in the GMM mean supervector space. We propose to perform discriminant analysis based on the cosine distance metric, which leads to a novel distance metric learning algorithm—linear spherical discriminant analysis (LSDA). We show that the proposed LSDA formulation can be systematically solved within the elegant graph embedding general dimensionality reduction framework. Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the “bag of acoustic features” representation and statistical model-based distance metrics, 2) our advocated use of the cosine distance metric yields consistent increases in the speaker clustering performance as compared to the commonly used euclidean distance metric, 3) our partially supervised speaker clustering concept and strategies significantly improve the speaker clustering performance over the baselines, and 4) our proposed LSDA algorithm further leads to state-of-the-art speaker clustering performance.
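
    The sketch below contrasts cosine- and Euclidean-distance hierarchical clustering on mock supervectors whose cluster structure lies mainly in direction rather than magnitude, which is the situation the directional-scattering argument above targets; the data are synthetic and LSDA itself is not implemented.

    ```python
    # Contrast of cosine vs. Euclidean distance for clustering mock "supervectors"
    # whose speaker information lies mainly in direction, not magnitude. Synthetic
    # data; the LSDA metric learning step described above is not shown.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.metrics import adjusted_rand_score

    rng = np.random.default_rng(0)
    directions = rng.standard_normal((3, 200))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)      # 3 speaker directions
    true = np.repeat(np.arange(3), 50)
    scales = rng.uniform(0.5, 5.0, size=true.size)[:, None]              # nuisance magnitude
    X = scales * directions[true] + 0.05 * rng.standard_normal((true.size, 200))

    for metric in ("cosine", "euclidean"):
        Z = linkage(pdist(X, metric=metric), method="average")
        pred = fcluster(Z, t=3, criterion="maxclust")
        print(metric, "ARI:", round(adjusted_rand_score(true, pred), 3))
    ```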

  9. Differential Attenuation of NMR Signals by Complementary Ion-Exchange Resin Beads for De Novo Analysis of Complex Metabolomics Mixtures.

    PubMed

    Zhang, Bo; Yuan, Jiaqi; Brüschweiler, Rafael

    2017-07-12

    A primary goal of metabolomics is the characterization of a potentially very large number of metabolites that are part of complex mixtures. Application to biofluids and tissue samples offers insights into biochemical metabolic pathways and their role in health and disease. 1D ¹H and 2D ¹³C-¹H HSQC NMR spectra are most commonly used for this purpose. They yield quantitative information about each proton of the mixture, but do not tell which protons belong to the same molecule. Interpretation requires the use of NMR spectral databases, which naturally limits these investigations to known metabolites. Here, a new method is presented that uses complementary ion exchange resin beads to differentially attenuate 2D NMR cross-peaks that belong to different metabolites. Based on their characteristic attenuation patterns, cross-peaks could be clustered and assigned to individual molecules, including unknown metabolites with multiple spin systems, as demonstrated for a metabolite model mixture and E. coli cell lysate. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. A Climatology of Global Aerosol Mixtures to Support Sentinel-5P and Earthcare Mission Applications

    NASA Astrophysics Data System (ADS)

    Taylor, M.; Kazadzis, S.; Amaridis, V.; Kahn, R. A.

    2015-11-01

    Since constraining aerosol type with satellite remote sensing continues to be a challenge, we present a newly derived global climatology of aerosol mixtures to support atmospheric composition studies that are planned for Sentinel-5P and EarthCARE. The global climatology is obtained via application of iterative cluster analysis to gridded global decadal and seasonal mean values of the aerosol optical depth (AOD) of sulfate, biomass burning, mineral dust and marine aerosol as a proportion of the total AOD at 500 nm, output from the Goddard Chemistry Aerosol Radiation and Transport (GOCART) model. For both the decadal and seasonal means, the number of aerosol mixtures (clusters) identified is ≈10. Analysis of the percentage contribution of the component aerosol types to each mixture allowed development of a straightforward naming convention and taxonomy, and assignment of primary colours for the generation of true colour-mixing and easy-to-interpret maps of the spatial distribution of clusters across the global grid. To further help characterize the mixtures, Aerosol Robotic Network (AERONET) Level 2.0 Version 2 inversion products were extracted from each cluster's spatial domain and used to estimate climatological values of key optical and microphysical parameters. The aerosol type climatology represents current knowledge that would be enhanced, possibly corrected, and refined by high temporal and spectral resolution, cloud-free observations produced by Sentinel-5P and EarthCARE instruments. The global decadal mean and seasonal gridded partitions comprise a preliminary reference framework and global climatology that can help inform the choice of components and mixtures in aerosol retrieval algorithms used by instruments such as TROPOMI and ATLID, and to test retrieval results.

  11. MUSIC-Expected maximization gaussian mixture methodology for clustering and detection of task-related neuronal firing rates.

    PubMed

    Ortiz-Rosario, Alexis; Adeli, Hojjat; Buford, John A

    2017-01-15

    Researchers often rely on simple methods to identify involvement of neurons in a particular motor task. The historical approach has been to inspect large groups of neurons and subjectively separate neurons into groups based on the expertise of the investigator. In cases where neuron populations are small it is reasonable to inspect these neuronal recordings and their firing rates carefully to avoid data omissions. In this paper, a new methodology is presented for automatic objective classification of neurons recorded in association with behavioral tasks into groups. By identifying characteristics of neurons in a particular group, the investigator can then identify functional classes of neurons based on their relationship to the task. The methodology is based on integration of a multiple signal classification (MUSIC) algorithm to extract relevant features from the firing rate and an expectation-maximization Gaussian mixture algorithm (EM-GMM) to cluster the extracted features. The methodology is capable of identifying and clustering similar firing rate profiles automatically based on specific signal features. An empirical wavelet transform (EWT) was used to validate the features found in the MUSIC pseudospectrum and the resulting signal features captured by the methodology. Additionally, this methodology was used to inspect behavioral elements of neurons to physiologically validate the model. This methodology was tested using a set of data collected from awake behaving non-human primates. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Scattering Properties of Heterogeneous Mineral Particles with Absorbing Inclusions

    NASA Technical Reports Server (NTRS)

    Dlugach, Janna M.; Mishchenko, Michael I.

    2015-01-01

    We analyze the results of numerically exact computer modeling of scattering and absorption properties of randomly oriented polydisperse heterogeneous particles obtained by placing microscopic absorbing grains randomly on the surfaces of much larger spherical mineral hosts or by embedding them randomly inside the hosts. These computations are paralleled by those for heterogeneous particles obtained by fully encapsulating fractal-like absorbing clusters in the mineral hosts. All computations are performed using the superposition T-matrix method. In the case of randomly distributed inclusions, the results are compared with the outcome of Lorenz-Mie computations for an external mixture of the mineral hosts and absorbing grains. We conclude that internal aggregation can strongly affect both the integral radiometric and differential scattering characteristics of the heterogeneous particle mixtures.

  13. Revealing common disease mechanisms shared by tumors of different tissues of origin through semantic representation of genomic alterations and topic modeling.

    PubMed

    Chen, Vicky; Paisley, John; Lu, Xinghua

    2017-03-14

    Cancer is a complex disease driven by somatic genomic alterations (SGAs) that perturb signaling pathways and consequently cellular function. Identifying patterns of pathway perturbations would provide insights into common disease mechanisms shared among tumors, which is important for guiding treatment and predicting outcome. However, identifying perturbed pathways is challenging, because the same pathway can be perturbed by different SGAs in different tumors. Here, we designed novel semantic representations that capture the functional similarity of distinct SGAs perturbing a common pathway in different tumors. Combining this representation with topic modeling allows us to identify patterns in altered signaling pathways. We represented each gene with a vector of words describing its function, and we represented the SGAs of a tumor as a text document by pooling the words representing individual SGAs. We applied the nested hierarchical Dirichlet process (nHDP) model to a collection of tumors of 5 cancer types from TCGA. We identified topics (consisting of co-occurring words) representing the common functional themes of different SGAs. Tumors were clustered based on their topic associations, such that each cluster consists of tumors sharing common functional themes. The resulting clusters contained mixtures of cancer types, which indicates that different cancer types can share disease mechanisms. Survival analysis based on the clusters revealed significant differences in survival among the tumors of the same cancer type that were assigned to different clusters. The results indicate that applying topic modeling to semantic representations of tumors identifies patterns in the combinations of altered functional pathways in cancer.

  14. Modeling and clustering water demand patterns from real-world smart meter data

    NASA Astrophysics Data System (ADS)

    Cheifetz, Nicolas; Noumir, Zineb; Samé, Allou; Sandraz, Anne-Claire; Féliers, Cédric; Heim, Véronique

    2017-08-01

    Nowadays, drinking water utilities need an acute comprehension of the water demand on their distribution network, in order to optimize resources efficiently, manage billing and propose new customer services. With the emergence of smart grids based on automated meter reading (AMR), a more granular understanding of consumption patterns is now accessible for smart cities. In this context, this paper evaluates a novel methodology for identifying relevant usage profiles from the water consumption data produced by smart meters. The methodology is fully data-driven using the consumption time series which are seen as functions or curves observed with an hourly time step. First, a Fourier-based additive time series decomposition model is introduced to extract seasonal patterns from time series. These patterns are intended to represent the customer habits in terms of water consumption. Two functional clustering approaches are then used to classify the extracted seasonal patterns: the functional version of K-means, and the Fourier REgression Mixture (FReMix) model. The K-means approach produces a hard segmentation and K representative prototypes. On the other hand, the FReMix is a generative model and also produces K profiles as well as a soft segmentation based on the posterior probabilities. The proposed approach is applied to a smart grid deployed on the largest water distribution network (WDN) in France. The two clustering strategies are evaluated and compared. Finally, a realistic interpretation of the consumption habits is given for each cluster. The extensive experiments and the qualitative interpretation of the resulting clusters highlight the effectiveness of the proposed methodology.
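
    A rough sketch of the second half of this pipeline, under stated assumptions: each meter's hourly series is summarized by low-order Fourier coefficients of the daily cycle (a simplified stand-in for the paper's additive decomposition), and the resulting seasonal patterns are clustered with ordinary K-means; the FReMix model is not reproduced, and the `series` array and all sizes are illustrative.

    ```python
    # Minimal sketch, not the paper's exact pipeline: seasonal patterns are summarized
    # by a few Fourier coefficients of each meter's hourly series, then clustered with
    # K-means. `series` is a hypothetical (n_meters, n_hours) array.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    hours = np.arange(24 * 7)
    series = rng.poisson(5, size=(200, hours.size)).astype(float)  # placeholder AMR data

    def fourier_features(y, period=24, n_harmonics=3):
        """Least-squares fit of sine/cosine harmonics of a daily cycle."""
        t = np.arange(y.size)
        cols = [np.ones_like(t, dtype=float)]
        for h in range(1, n_harmonics + 1):
            cols += [np.cos(2 * np.pi * h * t / period), np.sin(2 * np.pi * h * t / period)]
        design = np.column_stack(cols)
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return coef

    patterns = np.array([fourier_features(y) for y in series])
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(patterns)
    print(np.bincount(km.labels_))                # meters per consumption profile
    ```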

  15. Monthly streamflow forecasting based on hidden Markov model and Gaussian Mixture Regression

    NASA Astrophysics Data System (ADS)

    Liu, Yongqi; Ye, Lei; Qin, Hui; Hong, Xiaofeng; Ye, Jiajun; Yin, Xingli

    2018-06-01

    Reliable streamflow forecasts can be highly valuable for water resources planning and management. In this study, we combined a hidden Markov model (HMM) and Gaussian Mixture Regression (GMR) for probabilistic monthly streamflow forecasting. The HMM is initialized using a kernelized K-medoids clustering method, and the Baum-Welch algorithm is then executed to learn the model parameters. GMR derives a conditional probability distribution for the predictand given covariate information, including the antecedent flow at a local station and two surrounding stations. The performance of HMM-GMR was verified based on the mean square error and continuous ranked probability score skill scores. The reliability of the forecasts was assessed by examining the uniformity of the probability integral transform values. The results show that HMM-GMR obtained reasonably high skill scores and the uncertainty spread was appropriate. Different HMM states were assumed to be different climate conditions, which would lead to different types of observed values. We demonstrated that the HMM-GMR approach can handle multimodal and heteroscedastic data.
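
    The Gaussian Mixture Regression step can be illustrated in isolation; the sketch below fits a GMM on the joint vector of predictand and covariates and forms the conditional mean and variance component by component. The HMM stage and the K-medoids initialization are omitted, and all variable names and data are invented for illustration.

    ```python
    # Minimal sketch of Gaussian Mixture Regression only (no HMM stage): a GMM is
    # fitted on the joint vector [predictand, covariates], and the conditional
    # distribution of the predictand given new covariates is formed per component.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 3))                      # antecedent flows (3 covariates)
    y = X @ np.array([0.6, 0.3, 0.1]) + 0.2 * rng.normal(size=500)
    joint = np.column_stack([y, X])                    # predictand first, then covariates

    gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0).fit(joint)

    def gmr_predict(x_new):
        """Conditional mean and variance of y given covariates x_new."""
        means, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
        cond_means, cond_vars, resp = [], [], []
        for k in range(gmm.n_components):
            mu_y, mu_x = means[k][0], means[k][1:]
            S_yy, S_yx = covs[k][0, 0], covs[k][0, 1:]
            S_xx = covs[k][1:, 1:]
            gain = S_yx @ np.linalg.inv(S_xx)
            cond_means.append(mu_y + gain @ (x_new - mu_x))
            cond_vars.append(S_yy - gain @ S_yx)
            resp.append(w[k] * multivariate_normal(mu_x, S_xx).pdf(x_new))
        resp = np.array(resp) / np.sum(resp)
        mean = np.dot(resp, cond_means)
        var = np.dot(resp, np.array(cond_vars) + (np.array(cond_means) - mean) ** 2)
        return mean, var

    print(gmr_predict(np.array([0.5, -0.2, 1.0])))
    ```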

  16. Multilocus sequence typing and pulsed-field gel electrophoresis analysis of Oenococcus oeni from different wine-producing regions of China.

    PubMed

    Wang, Tao; Li, Hua; Wang, Hua; Su, Jing

    2015-04-16

    The present study established a typing method with NotI-based pulsed-field gel electrophoresis (PFGE) and stress response gene schemed multilocus sequence typing (MLST) for 55 Oenococcus oeni strains isolated from six individual regions in China and two model strains PSU-1 (CP000411) and ATCC BAA-1163 (AAUV00000000). Seven stress response genes, cfa, clpL, clpP, ctsR, mleA, mleP and omrA, were selected for MLST testing, and positive selective pressure was detected for these genes. Furthermore, both methods separated the strains into two clusters. The PFGE clusters are correlated with the region, whereas the sequence types (STs) formed by the MLST confirm the two clusters identified by PFGE. In addition, the population structure was a mixture of evolutionary pathways, and the strains exhibited both clonal and panmictic characteristics. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Generation of spectral clusters in a mixture of noble and Raman-active gases.

    PubMed

    Hosseini, Pooria; Abdolvand, Amir; St J Russell, Philip

    2016-12-01

    We report a novel scheme for the generation of dense clusters of Raman sidebands. The scheme uses a broadband-guiding hollow-core photonic crystal fiber (HC-PCF) filled with a mixture of H2, D2, and Xe for efficient interaction between the gas mixture and a green laser pump pulse (532 nm, 1 ns) of only 5 μJ of energy. This results in the generation from noise of more than 135 rovibrational Raman sidebands covering the visible spectral region with an average spacing of only 2.2 THz. Such a spectrally dense and compact fiber-based source is ideal for applications where closely spaced narrow-band laser lines with high spectral power density are required, such as in spectroscopy and sensing. When the HC-PCF is filled with a H2-D2 mixture, the Raman comb spans the spectral region from the deep UV (280 nm) to the near infrared (1000 nm).

  18. Glaucomatous patterns in Frequency Doubling Technology (FDT) perimetry data identified by unsupervised machine learning classifiers.

    PubMed

    Bowd, Christopher; Weinreb, Robert N; Balasubramanian, Madhusudhanan; Lee, Intae; Jang, Giljin; Yousefi, Siamak; Zangwill, Linda M; Medeiros, Felipe A; Girkin, Christopher A; Liebmann, Jeffrey M; Goldbaum, Michael H

    2014-01-01

    The variational Bayesian independent component analysis-mixture model (VIM), an unsupervised machine-learning classifier, was used to automatically separate Matrix Frequency Doubling Technology (FDT) perimetry data into clusters of healthy and glaucomatous eyes, and to identify axes representing statistically independent patterns of defect in the glaucoma clusters. FDT measurements were obtained from 1,190 eyes with normal FDT results and 786 eyes with abnormal FDT results from the UCSD-based Diagnostic Innovations in Glaucoma Study (DIGS) and African Descent and Glaucoma Evaluation Study (ADAGES). For all eyes, VIM input was 52 threshold test points from the 24-2 test pattern, plus age. FDT mean deviation was -1.00 dB (S.D. = 2.80 dB) and -5.57 dB (S.D. = 5.09 dB) in FDT-normal eyes and FDT-abnormal eyes, respectively (p<0.001). VIM identified meaningful clusters of FDT data and positioned a set of statistically independent axes through the mean of each cluster. The optimal VIM model separated the FDT fields into 3 clusters. Cluster N contained primarily normal fields (1109/1190, specificity 93.1%) and clusters G1 and G2 combined contained primarily abnormal fields (651/786, sensitivity 82.8%). For clusters G1 and G2, the optimal numbers of axes were 2 and 5, respectively. Patterns automatically generated along axes within the glaucoma clusters were similar to those known to be indicative of glaucoma. Fields located farther from the normal mean on each glaucoma axis showed increasing field defect severity. VIM successfully separated FDT fields from healthy and glaucoma eyes without a priori information about class membership, and identified familiar glaucomatous patterns of loss.

  19. Laser ablation synthesis of arsenic-phosphide Asm Pn clusters from As-P mixtures. Laser desorption ionisation with quadrupole ion trap time-of-flight mass spectrometry: The mass spectrometer as a synthesizer.

    PubMed

    Kubáček, Pavel; Prokeš, Lubomír; Pamreddy, Annapurna; Peña-Méndez, Eladia María; Conde, José Elias; Alberti, Milan; Havel, Josef

    2018-05-30

    Only a few arsenic phosphides are known. A high potential for the generation of new compounds is offered by Laser Ablation Synthesis (LAS) and when Laser Desorption Ionization (LDI) is coupled with simultaneous Time-Of-Flight Mass Spectrometry (TOFMS), immediate identification of the clusters can be achieved. LAS was used for the generation of arsenic phosphides via laser ablation of phosphorus-arsenic mixtures while quadrupole ion trap time-of-flight mass spectrometry (QIT-TOFMS) was used to acquire the mass spectra. Many new AsmPn± clusters (479 binary and 369 mono-elemental) not yet described in the literature were generated in the gas phase and their stoichiometry determined. The likely structures of 20 arbitrarily selected clusters were computed by density functional theory (DFT) optimization. LAS is an advantageous approach for the generation of new AsmPn clusters, while mass spectrometry was found to be an efficient technique for the determination of cluster stoichiometry. The results achieved might inspire the synthesis of new materials. Copyright © 2018 John Wiley & Sons, Ltd.

  20. Surface anisotropy of iron oxide nanoparticles and slabs from first principles: Influence of coatings and ligands as a test of the Heisenberg model

    NASA Astrophysics Data System (ADS)

    Brymora, Katarzyna; Calvayrac, Florent

    2017-07-01

    We performed ab initio computations of the magnetic properties of simple iron oxide clusters and slabs. We considered an iron oxide cluster functionalized by a molecule or glued to a gold cluster of the same size. We also considered a magnetite slab coated by cobalt oxide or a mixture of iron oxide and cobalt oxide. The changes in magnetic behavior were explored using constrained magnetic calculations. A possible value for the surface anisotropy was estimated from the fit of a classical Heisenberg model on ab initio results. The value was found to be compatible with estimations obtained by other means, or inferred from experimental results. The addition of a ligand, a coating, or a metallic nanoparticle to the systems degraded the quality of the description by the Heisenberg Hamiltonian. By proposing a change in the anisotropies that accounts for the proportion of each transition-metal atom, we obtained a much better description of the magnetism of a series of hybrid cobalt and iron oxide systems.

  1. Identifying technical aliases in SELDI mass spectra of complex mixtures of proteins

    PubMed Central

    2013-01-01

    Background: Biomarker discovery datasets created using mass spectrum protein profiling of complex mixtures of proteins contain many peaks that represent the same protein with different charge states. Correlated variables such as these can confound the statistical analyses of proteomic data. Previously we developed an algorithm that clustered mass spectrum peaks that were biologically or technically correlated. Here we demonstrate an algorithm that clusters correlated technical aliases only. Results: In this paper, we propose a preprocessing algorithm that can be used for grouping technical aliases in mass spectrometry protein profiling data. The stringency of the variance allowed for clustering is customizable, thereby affecting the number of peaks that are clustered. Subsequent analysis of the clusters, instead of individual peaks, helps reduce difficulties associated with technically correlated data, and can aid more efficient biomarker identification. Conclusions: This software can be used to pre-process and thereby decrease the complexity of protein profiling proteomics data, thus simplifying the subsequent analysis of biomarkers by decreasing the number of tests. The software is also a practical tool for identifying which features to investigate further by purification, identification and confirmation. PMID:24010718
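
    The published algorithm is not reproduced here, but the underlying idea (grouping peaks whose intensities co-vary almost perfectly across spectra, with a user-set stringency) can be sketched with hierarchical clustering on a correlation-based distance; the `intensities` matrix and the threshold value are hypothetical.

    ```python
    # Rough sketch of the idea, not the published algorithm: peaks whose intensity
    # profiles across spectra are nearly perfectly correlated (candidate charge-state
    # aliases) are grouped by hierarchical clustering with a correlation-based
    # distance and a user-set stringency threshold.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    rng = np.random.default_rng(3)
    base = rng.normal(size=(60, 5))                      # 60 spectra, 5 "true" proteins
    intensities = np.column_stack([base + 0.05 * rng.normal(size=base.shape)
                                   for _ in range(3)])   # each protein appears as 3 aliased peaks

    corr = np.corrcoef(intensities, rowvar=False)        # peak-by-peak correlation
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)        # correlated peaks -> small distance
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")

    stringency = 0.1                                     # max allowed (1 - |r|) within a cluster
    alias_groups = fcluster(Z, t=stringency, criterion="distance")
    print(len(np.unique(alias_groups)), "peak clusters from", intensities.shape[1], "peaks")
    ```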

  2. Extension of mixture-of-experts networks for binary classification of hierarchical data.

    PubMed

    Ng, Shu-Kay; McLachlan, Geoffrey J

    2007-09-01

    For many applied problems in the context of medically relevant artificial intelligence, the data collected exhibit a hierarchical or clustered structure. Ignoring the interdependence between hierarchical data can result in misleading classification. In this paper, we extend the mixture-of-experts (ME) network framework to the binary classification of hierarchical data. A further extension quantifies cluster-specific information on the data hierarchy through random effects via the generalized linear mixed-effects model (GLMM). The extension of ME networks is implemented by allowing for correlation in the hierarchical data in both the gating and expert networks via the GLMM. The proposed model is illustrated using a real thyroid disease data set. In our study, we consider 7652 thyroid diagnosis records from 1984 to early 1987 with complete information on 20 attribute values. We obtain 10 independent random splits of the data into a training set and a test set in the proportions 85% and 15%. The test sets are used to assess the generalization performance of the proposed model, based on the percentage of misclassifications. For comparison, the results obtained from the ME network with independence assumption are also included. With the thyroid disease data, the misclassification rate on test sets for the extended ME network is 8.9%, compared to 13.9% for the ME network. In addition, based on model selection methods described in Section 2, a network with two experts is selected. These two expert networks can be considered as modeling two groups of patients with high and low incidence rates. Significant variation among the predicted cluster-specific random effects is detected in the patient group with low incidence rate. It is shown that the extended ME network outperforms the ME network for binary classification of hierarchical data. With the thyroid disease data, useful information on the relative log odds of patients with diagnosed conditions at different periods can be evaluated. This information can be taken into consideration for the assessment of treatment planning of the disease. The proposed extended ME network thus facilitates a more general approach to incorporating the data hierarchy in network modeling.
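
    As a point of reference, a minimal standard mixture-of-experts for binary classification can be written in a few lines; the GLMM random-effects extension that is the subject of the paper is not reproduced, and the synthetic data and network sizes below are illustrative only.

    ```python
    # Minimal sketch of a standard two-expert mixture-of-experts for binary
    # classification (no GLMM random-effects extension). Data are synthetic.
    import torch

    torch.manual_seed(0)
    n, d, K = 1000, 20, 2                      # samples, attributes, experts
    X = torch.randn(n, d)
    y = ((X[:, 0] + X[:, 1] * X[:, 2]) > 0).float()

    gate = torch.nn.Linear(d, K)               # gating network -> softmax mixing weights
    experts = torch.nn.ModuleList([torch.nn.Linear(d, 1) for _ in range(K)])
    params = list(gate.parameters()) + list(experts.parameters())
    opt = torch.optim.Adam(params, lr=0.05)

    for step in range(300):
        g = torch.softmax(gate(X), dim=1)                             # (n, K) gating weights
        p = torch.cat([torch.sigmoid(e(X)) for e in experts], dim=1)  # expert success probs
        lik = (p ** y[:, None]) * ((1 - p) ** (1 - y[:, None]))       # Bernoulli likelihood per expert
        loss = -torch.log((g * lik).sum(dim=1) + 1e-9).mean()         # negative mixture log-likelihood
        opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():
        g = torch.softmax(gate(X), dim=1)
        p = torch.cat([torch.sigmoid(e(X)) for e in experts], dim=1)
        pred = ((g * p).sum(dim=1) > 0.5).float()
        print("training misclassification rate:", (pred != y).float().mean().item())
    ```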

  3. A quasichemical approach for protein-cluster free energies in dilute solution

    NASA Astrophysics Data System (ADS)

    Young, Teresa M.; Roberts, Christopher J.

    2007-10-01

    Reversible formation of protein oligomers or small clusters is a key step in processes such as protein polymerization, fibril formation, and protein phase separation from dilute solution. A straightforward, statistical mechanical approach to accurately calculate cluster free energies in solution is presented using a cell-based, quasichemical (QC) approximation for the partition function of proteins in an implicit solvent. The inputs to the model are the protein potential of mean force (PMF) and the corresponding subcell degeneracies up to relatively low particle densities. The approach is tested using simple two- and three-dimensional lattice models in which proteins interact with either isotropic or anisotropic nearest-neighbor attractions. Comparison with direct Monte Carlo simulation shows that cluster probabilities and free energies of oligomer formation (ΔGᵢ⁰) are quantitatively predicted by the QC approach for protein volume fractions of ~10⁻² (weight/volume concentrations of ~10 g l⁻¹) and below. For small clusters, ΔGᵢ⁰ depends weakly on the strength of short-ranged attractive interactions for most experimentally relevant values of the normalized osmotic second virial coefficient (b₂*). For larger clusters (i ≫ 2), there is a small but non-negligible b₂* dependence. The results suggest that nonspecific, hydrophobic attractions may not significantly stabilize prenuclei in processes such as non-native aggregation. Biased Monte Carlo methods are shown to accurately provide subcell degeneracies that are intractable to obtain analytically or by direct enumeration, and so offer a means to generalize the approach to mixtures and proteins with more complex PMFs.

  4. Insulin Resistance: Regression and Clustering

    PubMed Central

    Yoon, Sangho; Assimes, Themistocles L.; Quertermous, Thomas; Hsiao, Chin-Fu; Chuang, Lee-Ming; Hwu, Chii-Min; Rajaratnam, Bala; Olshen, Richard A.

    2014-01-01

    In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. The defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with “main effects” is not satisfactory, but prediction that includes interactions may be. PMID:24887437
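
    A highly simplified sketch of the overall recipe, assuming ordinary least squares and a two-component Gaussian mixture in place of GMVQ and the family-aware bootstrap: outcomes are regressed on functions of age and BMI, and the residuals are clustered into two putative IR/non-IR groups. All variables and data below are invented for illustration.

    ```python
    # Highly simplified sketch: regress outcomes on functions of age and BMI, then
    # cluster the residuals into two groups with a Gaussian mixture (a stand-in for
    # GMVQ; the family bootstrap is omitted). Data are synthetic placeholders.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(4)
    n = 400
    age, bmi = rng.uniform(30, 70, n), rng.uniform(18, 35, n)
    covariates = np.column_stack([age, bmi, age ** 2, bmi ** 2])
    outcomes = rng.normal(size=(n, 3))                 # placeholder OGTT / lipid outcomes

    residuals = outcomes - LinearRegression().fit(covariates, outcomes).predict(covariates)
    gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(residuals)
    cluster = gm.predict(residuals)                    # putative IR vs non-IR grouping
    print(np.bincount(cluster))
    ```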

  5. Mixing and demixing of binary mixtures of polar chiral active particles.

    PubMed

    Ai, Bao-Quan; Shao, Zhi-Gang; Zhong, Wei-Rong

    2018-05-17

    We study a binary mixture of polar chiral (counterclockwise or clockwise) active particles in a two-dimensional box with periodic boundary conditions. Besides the excluded volume interactions between particles, the particles are also subjected to the polar velocity alignment. From the extensive Brownian dynamics simulations, it is found that the particle configuration (mixing or demixing) is determined by the competition between the chirality difference and the polar velocity alignment. When the chirality difference competes with the polar velocity alignment, the clockwise particles aggregate in one cluster and the counterclockwise particles aggregate in the other cluster; thus, the particles are demixed and can be separated. However, when the chirality difference or the polar velocity alignment is dominant, the particles are mixed. Our findings could be used for the experimental pursuit of the separation of binary mixtures of chiral active particles.

  6. Numerical trials of HISSE

    NASA Technical Reports Server (NTRS)

    Peters, C.; Kampe, F. (Principal Investigator)

    1980-01-01

    The mathematical description and implementation of the statistical estimation procedure known as the Houston integrated spatial/spectral estimator (HISSE) are discussed. HISSE is based on a normal mixture model and is designed to take advantage of spectral and spatial information of LANDSAT data pixels, utilizing the initial classification and clustering information provided by the AMOEBA algorithm. HISSE calculates parametric estimates of class proportions that reduce the error inherent in estimates derived from typical classify-and-count procedures common to nonparametric clustering algorithms. It also singles out spatial groupings of pixels that are most suitable for labeling classes. These calculations are designed to aid the analyst/interpreter in labeling patches with a crop class label. Finally, HISSE's initial performance on an actual LANDSAT agricultural ground truth data set is reported.

  7. Circular Mixture Modeling of Color Distribution for Blind Stain Separation in Pathology Images.

    PubMed

    Li, Xingyu; Plataniotis, Konstantinos N

    2017-01-01

    In digital pathology, to address color variation and histological component colocalization in pathology images, stain decomposition is usually performed preceding spectral normalization and tissue component segmentation. This paper examines the problem of stain decomposition, which is a naturally nonnegative matrix factorization (NMF) problem in algebra, and introduces a systematic and analytical solution consisting of a circular color analysis module and an NMF-based computation module. Unlike the paradigm of existing stain decomposition algorithms where stain proportions are computed from estimated stain spectra using a matrix inverse operation directly, the introduced solution estimates stain spectra and stain depths via probabilistic reasoning individually. Since the proposed method pays extra attention to achromatic pixels in color analysis and stain co-occurrence in pixel clustering, it achieves consistent and reliable stain decomposition with minimum decomposition residue. In particular, aware of the periodic and angular nature of hue, we propose the use of a circular von Mises mixture model to analyze the hue distribution, and provide a complete color-based pixel soft-clustering solution to address color mixing introduced by stain overlap. This innovation combined with saturation-weighted computation makes our study effective for weak stains and broad-spectrum stains. Extensive experimentation on multiple public pathology datasets suggests that our approach outperforms state-of-the-art blind stain separation methods in terms of decomposition effectiveness.
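
    The circular color analysis can be illustrated with a small EM fit of a two-component von Mises mixture to hue angles; this is a simplified stand-in for the paper's module (saturation weighting and the NMF stage are omitted), using the Best-Fisher approximation for the concentration update and synthetic angles.

    ```python
    # Illustrative EM for a two-component von Mises mixture on hue angles (radians);
    # saturation weighting and the NMF stage are omitted. Data are synthetic.
    import numpy as np
    from scipy.stats import vonmises

    rng = np.random.default_rng(5)
    theta = np.concatenate([vonmises.rvs(8, loc=0.5, size=300, random_state=1),
                            vonmises.rvs(6, loc=2.5, size=200, random_state=2)])

    def kappa_from_rbar(r):
        """Best & Fisher approximation for the von Mises concentration."""
        if r < 0.53:
            return 2 * r + r ** 3 + 5 * r ** 5 / 6
        if r < 0.85:
            return -0.4 + 1.39 * r + 0.43 / (1 - r)
        return 1.0 / (r ** 3 - 4 * r ** 2 + 3 * r)

    K = 2
    pi = np.full(K, 1.0 / K)
    mu = np.array([0.0, np.pi])
    kappa = np.array([1.0, 1.0])

    for _ in range(100):
        # E-step: responsibilities of each component for each pixel hue
        dens = np.stack([pi[k] * vonmises.pdf(theta, kappa[k], loc=mu[k]) for k in range(K)])
        resp = dens / dens.sum(axis=0, keepdims=True)
        # M-step: weights, circular means, and concentrations
        for k in range(K):
            w = resp[k]
            c, s = np.sum(w * np.cos(theta)), np.sum(w * np.sin(theta))
            mu[k] = np.arctan2(s, c)
            kappa[k] = kappa_from_rbar(np.sqrt(c ** 2 + s ** 2) / w.sum())
            pi[k] = w.mean()

    print(np.round(mu, 2), np.round(kappa, 1), np.round(pi, 2))
    ```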

  8. Induced liquid-crystalline ordering in solutions of stiff and flexible amphiphilic macromolecules: Effect of mixture composition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Glagolev, Mikhail K.; Vasilevskaya, Valentina V., E-mail: vvvas@polly.phys.msu.ru; Khokhlov, Alexei R.

    The impact of mixture composition on self-organization in concentrated solutions of stiff helical and flexible macromolecules was studied by means of molecular dynamics simulation. The macromolecules were composed of identical amphiphilic monomer units but a fraction f of macromolecules had stiff helical backbones and the remaining chains were flexible. In poor solvents the compacted flexible macromolecules coexist with bundles or filament clusters formed from a few intertwined stiff helical macromolecules. The increase of the relative content f of helical macromolecules leads to an increase of the length of the helical clusters, to alignment of clusters with each other, and then to liquid-crystalline-like ordering along a single direction. The formation of filament clusters causes segregation of helical and flexible macromolecules and the alignment of the filaments induces effective liquid-like ordering of flexible macromolecules. A visual analysis and the calculation of an order parameter reflecting the anisotropy of diffraction allow us to conclude that the transition from the disordered to the liquid-crystalline state proceeds sharply at relatively low content of stiff components.

  9. A method of using cluster analysis to study statistical dependence in multivariate data

    NASA Technical Reports Server (NTRS)

    Borucki, W. J.; Card, D. H.; Lyle, G. C.

    1975-01-01

    A technique is presented that uses both cluster analysis and a Monte Carlo significance test of clusters to discover associations between variables in multidimensional data. The method is applied to an example of a noisy function in three-dimensional space, to a sample from a mixture of three bivariate normal distributions, and to the well-known Fisher's Iris data.

  10. CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data.

    PubMed

    Fidaner, Işık Barış; Cankorur-Cetinkaya, Ayca; Dikicioglu, Duygu; Kirdar, Betul; Cemgil, Ali Taylan; Oliver, Stephen G

    2016-02-01

    Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2, which applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications. The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG. sgo24@cam.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  11. Real-Time EEG Signal Enhancement Using Canonical Correlation Analysis and Gaussian Mixture Clustering

    PubMed Central

    Huang, Chih-Sheng; Yang, Wen-Yu; Chuang, Chun-Hsiang; Wang, Yu-Kai

    2018-01-01

    Electroencephalogram (EEG) signals are usually contaminated with various artifacts, such as signals associated with muscle activity, eye movement, and body motion, which have a noncerebral origin. The amplitude of such artifacts is larger than that of the electrical activity of the brain, so they mask the cortical signals of interest, resulting in biased analysis and interpretation. Several blind source separation methods have been developed to remove artifacts from the EEG recordings. However, the iterative process for measuring separation within multichannel recordings is computationally intractable. Moreover, manually excluding the artifact components requires a time-consuming offline process. This work proposes a real-time artifact removal algorithm that is based on canonical correlation analysis (CCA), feature extraction, and the Gaussian mixture model (GMM) to improve the quality of EEG signals. CCA was used to decompose EEG signals into components, feature extraction was used to obtain representative features, and the GMM was used to cluster these features into groups to recognize and remove artifacts. The feasibility of the proposed algorithm was demonstrated by effectively removing artifacts caused by blinks, head/body movement, and chewing from EEG recordings while preserving the temporal and spectral characteristics of the signals that are important to cognitive research. PMID:29599950

  12. Molecular dynamics investigations of liquid-vapor interaction and adsorption of formaldehyde, oxocarbons, and water in graphitic slit pores.

    PubMed

    Huang, Pei-Hsing; Hung, Shang-Chao; Huang, Ming-Yueh

    2014-08-07

    Formaldehyde exposure has been associated with several human cancers, including leukemia and nasopharyngeal carcinoma, motivating the present investigation on the microscopic adsorption behaviors of formaldehyde in multi-component-mixture-filled micropores. Molecular dynamics (MD) simulation was used to investigate the liquid-vapor interaction and adsorption of formaldehyde, oxocarbons, and water in graphitic slit pores. The effects of the slit width, system temperature, concentration, and the constituent ratio of the mixture on the diffusion and adsorption properties are studied. As a result of interactions between the components, the z-directional self-diffusivity (D(z)) in the mixture substantially decreased by about one order of magnitude as compared with that of pure (single-constituent) adsorbates. When the concentration exceeds a certain threshold, the D(z) values dramatically decrease due to over-saturation inducing barriers to diffusion. The binding energy between the adsorbate and graphite at the first adsorption monolayer is calculated to be 3.99, 2.01, 3.49, and 2.67 kcal mol⁻¹ for CO2, CO, CH2O, and H2O, respectively. These values agree well with those calculated using the density functional theory coupled cluster method and experimental results. A low solubility of CO2 in water and water preferring to react with CH2O, forming hydrated methanediol clusters, are observed. Because the cohesion in a hydrated methanediol cluster is much higher than the adhesion between clusters and the graphitic surface, the hydrated methanediol clusters were hydrophobic, exhibiting a large contact angle on graphite.

  13. Viscosity of Associated Mixtures Approximated by the Grunberg-Nissan Model

    NASA Astrophysics Data System (ADS)

    Marczak, W.; Adamczyk, N.; Łężniak, M.

    2012-04-01

    Previous experiments demonstrated that microheterogeneities occur in liquid systems (2-methylpyridine or 2,6-dimethylpyridine) + water. They are most probably due to the association of the hydrates through hydrogen bonds between water molecules. Substitution of methanol for water causes the mixtures to become homogeneous. The results of viscometric studies reported in this study confirmed that the molecular clusters in aqueous solutions are much larger than the complexes occurring in the methanolic systems. Taking into consideration "kinetic entities" rather than monomeric molecules, the dependence of viscosity on concentration and temperature has been satisfactorily approximated by the Grunberg-Nissan relation with two adjustable coefficients. The kinetic entities were trimers of water, dimers of methanol, and monomeric amines. The same approach proved to be valid for the activation energy of viscous flow as well.
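
    For reference, the basic two-component Grunberg-Nissan mixing rule is ln η_mix = x₁ ln η₁ + x₂ ln η₂ + x₁x₂G₁₂; the sketch below implements only this textbook form with made-up viscosities, not the paper's kinetic-entity treatment with two adjustable coefficients.

    ```python
    # The basic two-component Grunberg-Nissan form (not the paper's kinetic-entity
    # treatment): ln(eta_mix) = x1*ln(eta1) + x2*ln(eta2) + x1*x2*G12.
    import numpy as np

    def grunberg_nissan(x1, eta1, eta2, g12):
        """Mixture viscosity from mole fraction x1, pure-component viscosities, and
        the adjustable interaction parameter G12."""
        x2 = 1.0 - x1
        return np.exp(x1 * np.log(eta1) + x2 * np.log(eta2) + x1 * x2 * g12)

    # Example with made-up values (mPa s): pure viscosities 1.0 and 0.55, G12 = 1.2.
    print(grunberg_nissan(np.linspace(0, 1, 5), eta1=1.0, eta2=0.55, g12=1.2))
    ```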

  14. Gene expression profiles in rainbow trout, Onchorynchus mykiss, exposed to a simple chemical mixture.

    PubMed

    Hook, Sharon E; Skillman, Ann D; Gopalan, Banu; Small, Jack A; Schultz, Irvin R

    2008-03-01

    Among proposed uses for microarrays in environmental toxicology is the identification of key contributors to toxicity within a mixture. However, it remains uncertain whether the transcriptomic profiles resulting from exposure to a mixture have patterns of altered gene expression that contain identifiable contributions from each toxicant component. We exposed isogenic rainbow trout, Onchorynchus mykiss, to sublethal levels of ethynylestradiol, 2,2,4,4-tetrabromodiphenyl ether, and chromium VI or to a mixture of all three toxicants. Fluorescently labeled complementary DNAs (cDNAs) were generated and hybridized against a commercially available Salmonid array spotted with 16,000 cDNAs. Data were analyzed using analysis of variance (p<0.05) with a Benjamini-Hochberg multiple test correction (Genespring [Agilent] software package) to identify up- and downregulated genes. Gene clustering patterns that can be used as "expression signatures" were determined using hierarchical cluster analysis. The gene ontology terms associated with significantly altered genes were also used to identify functional groups that were associated with toxicant exposure. A cross-ontological analytics approach was used to assign functional annotations to genes with "unknown" function. Our analysis indicates that transcriptomic profiles resulting from the mixture exposure resemble those of the individual contaminant exposures, but are not a simple additive list. However, patterns of altered genes representative of each component of the mixture are clearly discernible, and the functional classes of genes altered represent the individual components of the mixture. These findings indicate that the use of microarrays to identify transcriptomic profiles may aid in the identification of key stressors within a chemical mixture, ultimately improving environmental assessment.

  15. COVARIATE-ADAPTIVE CLUSTERING OF EXPOSURES FOR AIR POLLUTION EPIDEMIOLOGY COHORTS*

    PubMed Central

    Keller, Joshua P.; Drton, Mathias; Larson, Timothy; Kaufman, Joel D.; Sandler, Dale P.; Szpiro, Adam A.

    2017-01-01

    Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive k-means procedure identifies centers using a mixture model and is followed by multi-class spatial prediction. In simulations, we demonstrate that predictive k-means can reduce misclassification error by over 50% compared to ordinary k-means, with minimal loss in cluster representativeness. The improved prediction accuracy results in large gains of 30% or more in power for detecting effect modification by cluster in a simulated health analysis. In an analysis of the NIEHS Sister Study cohort using predictive k-means, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly between different clusters of PM2.5 component profiles. Our cluster-based analysis shows that for subjects assigned to a cluster located in the Midwestern U.S., a 10 μg/m3 difference in exposure is associated with 4.37 mmHg (95% CI, 2.38, 6.35) higher SBP. PMID:28572869
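
    A schematic two-step stand-in for predictive k-means, under stated assumptions: exposure profiles at monitor locations are clustered with a Gaussian mixture, and a classifier trained on geographic covariates then predicts cluster membership at cohort locations. The paper's joint estimation of centers and prediction is not reproduced; all data below are synthetic.

    ```python
    # Schematic stand-in for predictive k-means: cluster multi-pollutant profiles at
    # monitors, then predict cluster membership at cohort locations from covariates.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(6)
    n_monitors, n_cohort = 300, 1000
    geo_monitor = rng.normal(size=(n_monitors, 8))        # geographic covariates at monitors
    pollution = geo_monitor[:, :4] @ rng.normal(size=(4, 5)) + 0.3 * rng.normal(size=(n_monitors, 5))

    gm = GaussianMixture(n_components=4, random_state=0).fit(pollution)
    monitor_labels = gm.predict(pollution)                # exposure clusters at monitors

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(geo_monitor, monitor_labels)                  # covariate -> cluster mapping

    geo_cohort = rng.normal(size=(n_cohort, 8))           # covariates at cohort locations
    cohort_clusters = clf.predict(geo_cohort)             # predicted cluster membership
    print(np.bincount(cohort_clusters, minlength=4))
    ```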

  16. Hydration of alcohol clusters in 1-propanol-water mixture studied by quasielastic neutron scattering and an interpretation of anomalous excess partial molar volume.

    PubMed

    Misawa, M; Inamura, Y; Hosaka, D; Yamamuro, O

    2006-08-21

    Quasielastic neutron scattering measurements have been made for 1-propanol-water mixtures in a range of alcohol concentration from 0.0 to 0.167 in mole fraction at 25 degrees C. The fraction α of water molecules hydrated to the fractal surface of alcohol clusters in the 1-propanol-water mixture was obtained as a function of alcohol concentration. The average hydration number N(ws) of a 1-propanol molecule is derived from the value of α as a function of alcohol concentration. By extrapolating N(ws) to infinite dilution, we obtain values of 12-13 for the hydration number of an isolated 1-propanol molecule. A simple interpretation of the structural origin of the anomalous excess partial molar volume of water is proposed, and a simple equation for the excess partial molar volume is deduced in terms of α. Calculated values of the excess partial molar volumes of water and 1-propanol and the excess molar volume of the mixture are in good agreement with experimental values.

  17. Hydrogen bonding in a mixture of protic ionic liquids: a molecular dynamics simulation study.

    PubMed

    Paschek, Dietmar; Golub, Benjamin; Ludwig, Ralf

    2015-04-07

    We report results of molecular dynamics (MD) simulations characterising the hydrogen bonding in mixtures of two different protic ionic liquids sharing the same cation: triethylammonium-methylsulfonate (TEAMS) and triethylammonium-triflate (TEATF). The triethylammonium cation acts as a hydrogen-bond donor and can donate a single hydrogen bond. Both the methylsulfonate and triflate anions can act as hydrogen-bond acceptors and can accept multiple hydrogen bonds via their respective SO3 groups. In addition, replacing a methyl group in the methylsulfonate by a trifluoromethyl group in the triflate significantly weakens a hydrogen bond from an adjacent triethylammonium cation to the oxygen site in the SO3 group of the anion. Our MD simulations show that these subtle differences in hydrogen bond strength significantly affect the formation of differently sized hydrogen-bonded aggregates in these mixtures as a function of the mixture composition. Moreover, the reported hydrogen-bonded cluster sizes can be predicted and explained by a simple combinatorial lattice model, based on the approximate coordination number of the ions, and using statistical weights that mostly account for the fact that each anion can only accept three hydrogen bonds.

  18. Calculation of composition distribution of ultrafine ion-H2O-H2SO4 clusters using a modified binary ion nucleation theory

    NASA Technical Reports Server (NTRS)

    Singh, J. J.; Smith, A. S.; Chan, L. Y.; Yue, G. K.

    1982-01-01

    Thomson's ion nucleation theory was modified to include the effects of the curvature dependence of the microscopic surface tension; of field-dependent, nonlinear dielectric properties of the liquid; and of sulfuric acid hydrate formation in binary mixtures of water and sulfuric acid vapors. The modified theory leads to a broadening of the ion cluster spectrum, and shifts it towards larger numbers of H2O and H2SO4 molecules. Whether there is more shifting towards larger numbers of H2O or H2SO4 molecules depends on the relative humidity and relative acidity of the mixture. Usually, a broadening of the spectrum is accompanied by a lowering of the mean cluster intensity. For fixed values of relative humidity and relative acidity, a similar broadening pattern is observed when the temperature is lowered. These features of the modified theory illustrate that a trace of sulfuric acid can facilitate the formation of ultrafine, stable, prenucleation ion clusters as well as the growth of the prenucleation ion clusters towards the critical saddle point conditions, even with low values of relative humidity and relative acidity.

  19. BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation

    PubMed Central

    Heidelberg, John F.; Tully, Benjamin J.

    2017-01-01

    Metagenomics has become an integral part of defining microbial diversity in various environments. Many ecosystems have characteristically low biomass and few cultured representatives. Linking potential metabolisms to phylogeny in environmental microorganisms is important for interpreting microbial community functions and the impacts these communities have on geochemical cycles. However, with metagenomic studies there is the computational hurdle of ‘binning’ contigs into phylogenetically related units or putative genomes. Binning methods have been implemented with varying approaches such as k-means clustering, Gaussian mixture models, hierarchical clustering, neural networks, and two-way clustering; however, many of these suffer from biases against low coverage/abundance organisms and closely related taxa/strains. We introduce a new binning method, BinSanity, that utilizes the affinity propagation (AP) clustering algorithm to cluster assemblies using coverage, with composition-based refinement (tetranucleotide frequency and percent GC content) to optimize bins containing multiple source organisms. This separation of composition- and coverage-based clustering reduces bias for closely related taxa. BinSanity was developed and tested on artificial metagenomes varying in size and complexity. Results indicate that BinSanity has higher precision, recall, and Adjusted Rand Index compared to five commonly implemented methods. When tested on a previously published environmental metagenome, BinSanity generated high-completion and low-redundancy bins corresponding with the published metagenome-assembled genomes. PMID:28289564
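
    The coverage-based core of this approach can be sketched with scikit-learn's AffinityPropagation on a log-transformed contigs-by-samples coverage table; the compositional refinement (tetranucleotide frequency and percent GC) used by BinSanity is not reproduced, and the `coverage` data below are synthetic.

    ```python
    # Simplified sketch of coverage-based binning with affinity propagation; the
    # compositional refinement step of BinSanity is not reproduced.
    import numpy as np
    from sklearn.cluster import AffinityPropagation

    rng = np.random.default_rng(7)
    true_bins = rng.integers(0, 5, size=400)                       # 5 source genomes
    profiles = rng.lognormal(mean=2, sigma=0.5, size=(5, 10))      # per-genome coverage pattern
    coverage = profiles[true_bins] * rng.lognormal(0, 0.1, size=(400, 10))

    X = np.log1p(coverage)                                         # damp the dynamic range
    ap = AffinityPropagation(damping=0.9, random_state=0).fit(X)
    print("bins found:", len(ap.cluster_centers_indices_))
    ```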

  20. Real-Time Kinetic Probes Support Monothiol Glutaredoxins As Intermediate Carriers in Fe-S Cluster Biosynthetic Pathways.

    PubMed

    Vranish, James N; Das, Deepika; Barondeau, David P

    2016-11-18

    Iron-sulfur (Fe-S) clusters are protein cofactors that are required for many essential cellular functions. Fe-S clusters are synthesized and inserted into target proteins by an elaborate biosynthetic process. The insensitivity of most Fe-S assembly and transfer assays requires high concentrations for components and places major limits on reaction complexity. Recently, fluorophore labels were shown to be effective at reporting cluster content for Fe-S proteins. Here, the incorporation of this labeling approach allowed the design and interrogation of complex Fe-S cluster biosynthetic reactions that mimic in vivo conditions. A bacterial Fe-S assembly complex, composed of the cysteine desulfurase IscS and scaffold protein IscU, was used to generate [2Fe-2S] clusters for transfer to mixtures of putative intermediate carrier and acceptor proteins. The focus of this study was to test whether the monothiol glutaredoxin, Grx4, functions as an obligate [2Fe-2S] carrier protein in the Fe-S cluster distribution network. Interestingly, [2Fe-2S] clusters generated by the IscS-IscU complex transferred to Grx4 at rates comparable to previous assays using uncomplexed IscU as a cluster source in chaperone-assisted transfer reactions. Further, we provide evidence that [2Fe-2S]-Grx4 delivers clusters to multiple classes of Fe-S targets via direct ligand exchange in a process that is both dynamic and reversible. Global fits of cluster transfer kinetics support a model in which Grx4 outcompetes terminal target proteins for IscU-bound [2Fe-2S] clusters and functions as an intermediate cluster carrier. Overall, these studies demonstrate the power of chemically conjugated fluorophore reporters for unraveling mechanistic details of biological metal cofactor assembly and distribution networks.

  1. Paramagnetic Attraction of Impurity-Helium Solids

    NASA Technical Reports Server (NTRS)

    Bernard, E. P.; Boltnev, R. E.; Khmelenko, V. V.; Lee, D. M.

    2003-01-01

    Impurity-helium solids are formed when a mixture of impurity and helium gases enters a volume of superfluid helium. Typical choices of impurity gas are hydrogen deuteride, deuterium, nitrogen, neon and argon, or a mixture of these. These solids consist of individual impurity atoms and molecules as well as clusters of impurity atoms and molecules covered with layers of solidified helium. The clusters have an imperfect crystalline structure and diameters ranging up to 90 angstroms, depending somewhat on the choice of impurity. Immediately following formation the clusters aggregate into loosely connected porous solids that are submerged in and completely permeated by the liquid helium. Im-He solids are extremely effective at stabilizing high concentrations of free radicals, which can be introduced by applying a high-power RF discharge to the impurity gas mixture just before it strikes the superfluid helium. Average concentrations of 10¹⁹ nitrogen atoms/cc and 5 × 10¹⁸ deuterium atoms/cc can be achieved this way. Typical samples have been formed from mixtures of atomic and molecular hydrogen and deuterium, and from atomic and molecular nitrogen. Much of the stability of Im-He solids is attributed to their very large surface-area-to-volume ratio and their permeation by superfluid helium. Heat resulting from a chance meeting and recombination of free radicals is quickly dissipated by the superfluid helium instead of thermally promoting the diffusion of other nearby free radicals.

  2. Modification of Gaussian mixture models for data classification in high energy physics

    NASA Astrophysics Data System (ADS)

    Štěpánek, Michal; Franc, Jiří; Kůs, Václav

    2015-01-01

    In high energy physics, we deal with the demanding task of separating signal from background. The Model Based Clustering method involves the estimation of distribution mixture parameters via the Expectation-Maximization algorithm in the training phase and application of Bayes' rule in the testing phase. Modifications of the algorithm such as weighting, missing data processing, and overtraining avoidance will be discussed. Due to the strong dependence of the algorithm on initialization, genetic optimization techniques such as mutation, elitism, parasitism, and the rank selection of individuals will be mentioned. Data pre-processing plays a significant role in the subsequent combination of final discriminants to improve signal separation efficiency. Moreover, the results of the top quark separation from the Tevatron collider will be compared with those of standard multivariate techniques in high energy physics. Results from this study have been used in the measurement of the inclusive top pair production cross section employing the full DØ Tevatron Run II data set (9.7 fb⁻¹).
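
    A generic sketch of the model-based-clustering classifier described here, assuming none of the paper's refinements (weighting, missing-data handling, genetic initialization): one Gaussian mixture is fitted per class by EM, and test events are scored with Bayes' rule. Data and component counts are illustrative.

    ```python
    # Generic sketch: fit one Gaussian mixture per class (signal, background) by EM,
    # then classify test events with Bayes' rule. Data are synthetic.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(8)
    sig = rng.normal(loc=1.0, size=(2000, 4))
    bkg = np.concatenate([rng.normal(-1.0, 1.0, size=(3000, 4)),
                          rng.normal(2.5, 0.5, size=(1000, 4))])   # multimodal background

    gm_sig = GaussianMixture(n_components=2, random_state=0).fit(sig)
    gm_bkg = GaussianMixture(n_components=3, random_state=0).fit(bkg)
    prior_sig = len(sig) / (len(sig) + len(bkg))

    def p_signal(x):
        """Posterior probability of the signal class under Bayes' rule."""
        ls = prior_sig * np.exp(gm_sig.score_samples(x))
        lb = (1 - prior_sig) * np.exp(gm_bkg.score_samples(x))
        return ls / (ls + lb)

    test = rng.normal(loc=1.0, size=(5, 4))
    print(np.round(p_signal(test), 3))
    ```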

  3. Experimentally Derived Mechanical and Flow Properties of Fine-grained Soil Mixtures

    NASA Astrophysics Data System (ADS)

    Schneider, J.; Peets, C. S.; Flemings, P. B.; Day-Stirrat, R. J.; Germaine, J. T.

    2009-12-01

    As silt content in mudrocks increases, compressibility linearly decreases and permeability exponentially increases. We prepared mixtures of natural Boston Blue Clay (BBC) and synthetic silt in the ratios of 100:0, 86:14, 68:32, and 50:50, respectively. To recreate natural conditions yet remove variability and soil disturbance, we resedimented all mixtures to a total stress of 100 kPa. We then loaded them to approximately 2.3 MPa in a CRS (constant-rate-of-strain) uniaxial consolidation device. The analyses show that the higher the silt content in the mixture, the stiffer the material is. Compression index as well as liquid and plastic limits linearly decrease with increasing silt content. Vertical permeability increases exponentially with porosity as well as with silt content. Fabric alignment, determined through High Resolution X-ray Texture Goniometry (HRXTG) and expressed as maximum pole density (m.r.d.), decreases with silt content at a given stress. However, this relationship is not linear; instead, there are two clusters: the mixtures with higher clay contents (100:0, 84:16) have m.r.d. around 3.9 and mixtures with higher silt contents (68:32, 50:50) have m.r.d. around 2.5. Specific surface area (SSA) measurements show a positive correlation to the total clay content. The amount of silt added to the clay reduces specific surface area, grain orientation, and fabric alignment; thus, it affects compression and fluid flow behavior on a micro- and macroscale. Our results are comparable with those of previous studies of kaolinite/silt mixtures (Konrad & Samson [2000], Wagg & Konrad [1990]). We are studying this behavior to understand how fine-grained rocks consolidate. This problem is important to practical and fundamental programs. For example, these sediments can potentially act as either a tight gas reservoir or a seal for hydrocarbons or geologic storage of CO2. This study also provides a systematic approach for developing models of permeability and compressibility behavior needed as inputs for basin modeling.

  4. A New Cluster Analysis-Marker-Controlled Watershed Method for Separating Particles of Granular Soils.

    PubMed

    Alam, Md Ferdous; Haque, Asadul

    2017-10-18

    An accurate determination of the particle-level fabric of granular soils from tomography data requires the maximum correct separation of particles. The popular marker-controlled watershed separation method is widely used to separate particles. However, the watershed method alone is not capable of producing the maximum separation of particles when they have been subjected to boundary stresses that lead to crushing. In this paper, a new separation method, named the Monash Particle Separation Method (MPSM), has been introduced. The new method automatically determines the optimal contrast coefficient, based on a cluster evaluation framework, to produce the most accurate separation outcomes. Finally, the particles which could not be separated with the optimal contrast coefficient were separated by integrating cuboid markers, generated by Gaussian mixture model clustering, into the routine watershed method. The MPSM was validated on a uniformly graded sand volume subjected to one-dimensional compression loading up to 32 MPa. It was demonstrated that the MPSM is capable of producing the best possible separation of particles required for the fabric analysis.
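
    A schematic 2-D stand-in for GMM-seeded marker-controlled watershed (the optimal-contrast-coefficient search and the cuboid markers of MPSM are not reproduced): a Gaussian mixture fitted to foreground coordinates supplies marker seeds, and the watershed is run on the negated distance transform. It assumes scikit-image, SciPy and scikit-learn are available.

    ```python
    # Schematic 2-D stand-in for GMM-seeded marker-controlled watershed; the MPSM's
    # contrast-coefficient search and cuboid markers are not reproduced.
    import numpy as np
    from scipy import ndimage as ndi
    from skimage.segmentation import watershed
    from sklearn.mixture import GaussianMixture

    # Two touching "particles": overlapping discs on a binary image.
    yy, xx = np.mgrid[0:120, 0:120]
    binary = ((yy - 50) ** 2 + (xx - 45) ** 2 < 28 ** 2) | ((yy - 70) ** 2 + (xx - 80) ** 2 < 25 ** 2)

    # Fit a GMM to foreground coordinates; the component means become marker seeds.
    coords = np.column_stack(np.nonzero(binary))
    gm = GaussianMixture(n_components=2, random_state=0).fit(coords)
    markers = np.zeros(binary.shape, dtype=int)
    for i, (r, c) in enumerate(np.round(gm.means_).astype(int), start=1):
        markers[r, c] = i

    # Watershed on the negated distance transform, restricted to the foreground.
    distance = ndi.distance_transform_edt(binary)
    labels = watershed(-distance, markers=markers, mask=binary)
    print("separated particles:", labels.max())
    ```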

  5. Experimental study of fusion neutron and proton yields produced by petawatt-laser-irradiated D₂-³He or CD₄-³He clustering gases.

    PubMed

    Bang, W; Barbui, M; Bonasera, A; Quevedo, H J; Dyer, G; Bernstein, A C; Hagel, K; Schmidt, K; Gaul, E; Donovan, M E; Consoli, F; De Angelis, R; Andreoli, P; Barbarino, M; Kimura, S; Mazzocco, M; Natowitz, J B; Ditmire, T

    2013-09-01

    We report on experiments in which the Texas Petawatt laser irradiated a mixture of deuterium or deuterated methane clusters and helium-3 gas, generating three types of nuclear fusion reactions: D(d,³He)n, D(d,t)p, and ³He(d,p)⁴He. We measured the yields of fusion neutrons and protons from these reactions and found them to agree with yields based on a simple cylindrical plasma model using known cross sections and measured plasma parameters. Within our measurement errors, the fusion products were isotropically distributed. Plasma temperatures, important for the cross sections, were determined by two independent methods: (1) deuterium ion time of flight and (2) utilizing the ratio of neutron yield to proton yield from D(d,³He)n and ³He(d,p)⁴He reactions, respectively. This experiment produced the highest ion temperature ever achieved with laser-irradiated deuterium clusters.

  6. Contaminant source identification using semi-supervised machine learning

    NASA Astrophysics Data System (ADS)

    Vesselinov, Velimir V.; Alexandrov, Boian S.; O'Malley, Daniel

    2018-05-01

    Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. In this paper, we propose a new contaminant source identification approach that performs decomposition of the observation mixtures based on Non-negative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the unknown number of groundwater types and (b) the original geochemical concentration of the contaminant sources from measured geochemical mixtures with unknown mixing ratios without any additional site information. NMFk is tested on synthetic and real-world site data. The NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).
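
    The NMF step of this kind of blind source separation can be sketched with scikit-learn's NMF: the observed samples-by-constituents matrix is factored into mixing weights W and source signatures H while scanning a range of candidate source counts. The custom semi-supervised clustering that NMFk uses to select k is not reproduced, and the data below are synthetic.

    ```python
    # Bare-bones sketch of the NMF step of this blind source separation: factor the
    # observed mixtures into mixing weights W and source signatures H, scanning k.
    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.default_rng(9)
    true_sources = rng.uniform(0, 10, size=(3, 12))            # 3 end-members, 12 constituents
    mixing = rng.dirichlet(np.ones(3), size=150)               # unknown mixing ratios
    observed = mixing @ true_sources * rng.lognormal(0, 0.05, size=(150, 12))

    for k in range(2, 6):
        model = NMF(n_components=k, init="nndsvda", max_iter=2000, random_state=0)
        W = model.fit_transform(observed)                      # estimated mixing ratios
        H = model.components_                                  # estimated source signatures
        print(f"k={k}  reconstruction error={model.reconstruction_err_:.3f}")
    ```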

  7. Contaminant source identification using semi-supervised machine learning

    DOE PAGES

    Vesselinov, Velimir Valentinov; Alexandrov, Boian S.; O’Malley, Dan

    2017-11-08

    Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. In this paper, we propose a new contaminant source identification approach that performs decomposition of the observation mixtures based on Non-negative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the unknown number of groundwater types and (b) the original geochemical concentration of the contaminant sources from measured geochemical mixtures with unknown mixing ratios without any additional site information. NMFk is tested on synthetic and real-world site data. Finally, the NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).

  8. Contaminant source identification using semi-supervised machine learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vesselinov, Velimir Valentinov; Alexandrov, Boian S.; O’Malley, Dan

    Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. In this paper, we propose a new contaminant source identification approach that performs decomposition of the observation mixtures based on Non-negative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the unknown number of groundwater types and (b) the original geochemical concentration of the contaminant sources from measured geochemical mixtures with unknown mixing ratios without any additional site information. NMFk is tested on synthetic and real-world site data. Finally, the NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).

  9. Joint Clustering and Component Analysis of Correspondenceless Point Sets: Application to Cardiac Statistical Modeling.

    PubMed

    Gooya, Ali; Lekadir, Karim; Alba, Xenia; Swift, Andrew J; Wild, Jim M; Frangi, Alejandro F

    2015-01-01

    Construction of Statistical Shape Models (SSMs) from arbitrary point sets is a challenging problem due to significant shape variation and lack of explicit point correspondence across the training data set. In medical imaging, point sets can generally represent different shape classes that span healthy and pathological exemplars. In such cases, the constructed SSM may not generalize well, largely because the probability density function (pdf) of the point sets deviates from the underlying assumption of Gaussian statistics. To this end, we propose a generative model for unsupervised learning of the pdf of point sets as a mixture of distinctive classes. A Variational Bayesian (VB) method is proposed for making joint inferences on the labels of point sets, and the principal modes of variation in each cluster. The method provides a flexible framework to handle point sets with no explicit point-to-point correspondences. We also show that by maximizing the marginalized likelihood of the model, the optimal number of clusters of point sets can be determined. We illustrate this work in the context of understanding the anatomical phenotype of the left and right ventricles in the heart. To this end, we use a database containing hearts of healthy subjects, patients with Pulmonary Hypertension (PH), and patients with Hypertrophic Cardiomyopathy (HCM). We demonstrate that our method can outperform traditional PCA in both generalization and specificity measures.
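
    The record above selects the number of shape clusters by maximizing a marginalized likelihood under a Variational Bayesian mixture. A much simplified stand-in for that model-selection idea is sketched below using scikit-learn's variational Bayesian Gaussian mixture on plain feature vectors (not correspondenceless point sets); the data and the 0.05 weight threshold are illustrative assumptions.

    ```python
    # Fit a VB Gaussian mixture with a generous component budget; unused
    # components are pruned, hinting at the number of clusters.
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 4)) for c in (0.0, 3.0, 6.0)])

    vb = BayesianGaussianMixture(n_components=10, weight_concentration_prior=1e-2,
                                 covariance_type="full", max_iter=500, random_state=0)
    vb.fit(X)
    print(np.round(vb.weights_, 3))                    # most weights collapse toward zero
    print("effective clusters:", int(np.sum(vb.weights_ > 0.05)))
    ```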

  10. Crystallization of carbon-oxygen mixtures in white dwarf stars.

    PubMed

    Horowitz, C J; Schneider, A S; Berry, D K

    2010-06-11

    We determine the phase diagram for dense carbon-oxygen mixtures in white dwarf (WD) star interiors using molecular dynamics simulations involving liquid and solid phases. Our phase diagram agrees well with predictions from Ogata et al. and from Medin and Cumming and gives lower melting temperatures than Segretain et al. Observations of WD crystallization in the globular cluster NGC 6397 by Winget et al. suggest that the melting temperature of WD cores is close to that for pure carbon. If this is true, our phase diagram implies that the central oxygen abundance in these stars is less than about 60%. This constraint, along with assumptions about convection in stellar evolution models, limits the effective S factor for the ¹²C(α,γ)¹⁶O reaction to S(300) ≤ 170 keV b.

  11. Analysis of Spin Financial Market by GARCH Model

    NASA Astrophysics Data System (ADS)

    Takaishi, Tetsuya

    2013-08-01

    A spin model is used for simulations of financial markets. To determine return volatility in the spin financial market we use the GARCH model often used for volatility estimation in empirical finance. We apply the Bayesian inference performed by the Markov Chain Monte Carlo method to the parameter estimation of the GARCH model. It is found that volatility determined by the GARCH model exhibits "volatility clustering" also observed in the real financial markets. Using volatility determined by the GARCH model we examine the mixture-of-distribution hypothesis (MDH) suggested for the asset return dynamics. We find that the returns standardized by volatility are approximately standard normal random variables. Moreover we find that the absolute standardized returns show no significant autocorrelation. These findings are consistent with the view of the MDH for the return dynamics.
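
    A minimal sketch of the standardized-returns check described above, assuming a GARCH(1,1) recursion with illustrative parameters rather than the spin-market data and MCMC estimation used in the record.

    ```python
    # Simulate GARCH(1,1) returns, standardize by the conditional volatility, and
    # inspect normality and the lag-1 autocorrelation of |standardized returns|.
    import numpy as np

    rng = np.random.default_rng(2)
    omega, alpha, beta = 0.05, 0.10, 0.85              # assumed GARCH(1,1) parameters
    n = 5000
    h = np.empty(n)
    r = np.empty(n)
    h[0] = omega / (1.0 - alpha - beta)
    r[0] = np.sqrt(h[0]) * rng.standard_normal()
    for t in range(1, n):
        h[t] = omega + alpha * r[t - 1] ** 2 + beta * h[t - 1]
        r[t] = np.sqrt(h[t]) * rng.standard_normal()

    z = r / np.sqrt(h)                                 # standardized returns
    excess_kurt = ((z - z.mean()) ** 4).mean() / z.var() ** 2 - 3.0
    print("mean, std, excess kurtosis:", z.mean().round(3), z.std().round(3), round(excess_kurt, 3))

    a = np.abs(z) - np.abs(z).mean()
    print("lag-1 autocorrelation of |z|:", round((a[:-1] * a[1:]).mean() / a.var(), 3))
    ```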

  12. Automatic classification of unexploded ordnance applied to Spencer Range live site for 5x5 TEMTADS sensor

    NASA Astrophysics Data System (ADS)

    Sigman, John B.; Barrowes, Benjamin E.; O'Neill, Kevin; Shubitidze, Fridon

    2013-06-01

    This paper details methods for automatic classification of Unexploded Ordnance (UXO) as applied to sensor data from the Spencer Range live site. The Spencer Range is a former military weapons range in Spencer, Tennessee. Electromagnetic Induction (EMI) sensing is carried out using the 5x5 Time-domain Electromagnetic Multi-sensor Towed Array Detection System (5x5 TEMTADS), which has 25 receivers and 25 co-located transmitters. Each transmitter is activated sequentially, and after each activation the magnetic field is measured in all 25 receivers, from 100 microseconds to 25 milliseconds. From these data, target extrinsic and intrinsic parameters are extracted using the Differential Evolution (DE) algorithm and the Ortho-Normalized Volume Magnetic Source (ONVMS) algorithm, respectively. Namely, the inversion provides x, y, and z locations and a time series of the total ONVMS principal eigenvalues, which are intrinsic properties of the objects. The eigenvalues are fit to a power-decay empirical model, the Pasion-Oldenburg model, providing 3 coefficients (k, b, and g) for each object. The objects are grouped geometrically into variably sized clusters in the k-b-g space using clustering algorithms. Clusters matching a priori characteristics are identified as Targets of Interest (TOI), and larger clusters are automatically subclustered. Ground Truths (GT) at the center of each class are requested, and probability density functions are created for clusters that have centroid TOI using a Gaussian Mixture Model (GMM). The probability functions are applied to all remaining anomalies. All objects of UXO probability higher than a chosen threshold are placed in a ranked dig list. This prioritized list is scored and the results are demonstrated and analyzed.
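
    A hedged sketch of the feature-extraction and scoring steps summarized above: fit the Pasion-Oldenburg decay k·t^(-b)·exp(-t/g) to each anomaly's eigenvalue decay, then rank anomalies by their density under a Gaussian mixture fitted to labelled targets of interest in log(k, b, g) space. All data, labels, and settings below are synthetic assumptions, not the Spencer Range results.

    ```python
    # Fit Pasion-Oldenburg decays, then score unknowns against a GMM of known TOI.
    import numpy as np
    from scipy.optimize import curve_fit
    from sklearn.mixture import GaussianMixture

    def pasion_oldenburg(t, k, b, g):
        return k * t ** (-b) * np.exp(-t / g)

    rng = np.random.default_rng(3)
    t = np.geomspace(1e-4, 25e-3, 30)                  # 100 microseconds to 25 ms gates

    def fit_kbg(decay):
        p, _ = curve_fit(pasion_oldenburg, t, decay, p0=(1.0, 0.5, 5e-3), bounds=(0.0, np.inf))
        return p

    noise = lambda: 1.0 + 0.05 * rng.standard_normal(t.size)
    toi = np.array([fit_kbg(pasion_oldenburg(t, 2.0, 0.6, 4e-3) * noise()) for _ in range(40)])
    unknown_params = rng.uniform([0.5, 0.3, 1e-3], [3.0, 1.2, 8e-3], size=(200, 3))
    unknown = np.array([fit_kbg(pasion_oldenburg(t, *p) * noise()) for p in unknown_params])

    gmm = GaussianMixture(n_components=1, covariance_type="full").fit(np.log(toi))
    scores = gmm.score_samples(np.log(unknown))        # log-density under the TOI model
    print("prioritized dig list (top 10 indices):", np.argsort(scores)[::-1][:10])
    ```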

  13. A Bayesian hierarchical model for mortality data from cluster-sampling household surveys in humanitarian crises.

    PubMed

    Heudtlass, Peter; Guha-Sapir, Debarati; Speybroeck, Niko

    2018-05-31

    The crude death rate (CDR) is one of the defining indicators of humanitarian emergencies. When data from vital registration systems are not available, it is common practice to estimate the CDR from household surveys with cluster-sampling design. However, sample sizes are often too small to compare mortality estimates to emergency thresholds, at least in a frequentist framework. Several authors have proposed Bayesian methods for health surveys in humanitarian crises. Here, we develop an approach specifically for mortality data and cluster-sampling surveys. We describe a Bayesian hierarchical Poisson-Gamma mixture model with generic (weakly informative) priors that could be used as a default in the absence of any specific prior knowledge, and compare Bayesian and frequentist CDR estimates using five different mortality datasets. We provide an interpretation of the Bayesian estimates in the context of an emergency threshold and demonstrate how to interpret parameters at the cluster level and ways in which informative priors can be introduced. With the same set of weakly informative priors, Bayesian CDR estimates are equivalent to frequentist estimates, for all practical purposes. The probability that the CDR surpasses the emergency threshold can be derived directly from the posterior of the mean of the mixing distribution. All observations in the datasets contribute to the estimation of cluster-level estimates, through the hierarchical structure of the model. In a context of sparse data, Bayesian mortality assessments have advantages over frequentist ones even when using only weakly informative priors. More informative priors offer a formal and transparent way of combining new data with existing data and expert knowledge and can help to improve decision-making in humanitarian crises by complementing frequentist estimates.
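
    A minimal sketch of a Poisson-Gamma model of cluster-level death rates, in the spirit of the record above. It assumes the PyMC probabilistic-programming library; the cluster counts, exposures, priors, and the 1-death-per-10,000-person-days threshold below are illustrative, not the paper's datasets or exact specification.

    ```python
    # Hierarchical Poisson-Gamma sketch: cluster death counts share a Gamma mixing
    # distribution whose mean plays the role of the crude death rate (CDR).
    import numpy as np
    import pymc as pm

    deaths = np.array([2, 0, 1, 3, 1, 0, 2, 4])                  # deaths per sampled cluster
    persondays = np.array([3000, 2800, 3100, 2900, 3050, 2700, 2950, 3200])
    exposure = persondays / 10_000.0                             # units of 10,000 person-days

    with pm.Model():
        a = pm.Exponential("a", 1.0)                             # weakly informative hyperpriors
        b = pm.Exponential("b", 1.0)
        lam = pm.Gamma("lam", alpha=a, beta=b, shape=len(deaths))  # cluster-level rates
        pm.Poisson("obs", mu=lam * exposure, observed=deaths)
        cdr = pm.Deterministic("cdr", a / b)                     # mean of the mixing distribution
        idata = pm.sample(2000, tune=1000, target_accept=0.9, random_seed=0)

    post_cdr = idata.posterior["cdr"].values.ravel()
    print("P(CDR > 1 per 10,000 person-days) =", float(np.mean(post_cdr > 1.0)))
    ```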

  14. Clustering high-dimensional mixed data to uncover sub-phenotypes: joint analysis of phenotypic and genotypic data.

    PubMed

    McParland, D; Phillips, C M; Brennan, L; Roche, H M; Gormley, I C

    2017-12-10

    The LIPGENE-SU.VI.MAX study, like many others, recorded high-dimensional continuous phenotypic data and categorical genotypic data. LIPGENE-SU.VI.MAX focuses on the need to account for both phenotypic and genetic factors when studying the metabolic syndrome (MetS), a complex disorder that can lead to higher risk of type 2 diabetes and cardiovascular disease. Interest lies in clustering the LIPGENE-SU.VI.MAX participants into homogeneous groups or sub-phenotypes, by jointly considering their phenotypic and genotypic data, and in determining which variables are discriminatory. A novel latent variable model that elegantly accommodates high dimensional, mixed data is developed to cluster LIPGENE-SU.VI.MAX participants using a Bayesian finite mixture model. A computationally efficient variable selection algorithm is incorporated, estimation is via a Gibbs sampling algorithm and an approximate BIC-MCMC criterion is developed to select the optimal model. Two clusters or sub-phenotypes ('healthy' and 'at risk') are uncovered. A small subset of variables is deemed discriminatory, which notably includes phenotypic and genotypic variables, highlighting the need to jointly consider both factors. Further, 7 years after the LIPGENE-SU.VI.MAX data were collected, participants underwent further analysis to diagnose presence or absence of the MetS. The two uncovered sub-phenotypes strongly correspond to the 7-year follow-up disease classification, highlighting the role of phenotypic and genotypic factors in the MetS and emphasising the potential utility of the clustering approach in early screening. Additionally, the ability of the proposed approach to define the uncertainty in sub-phenotype membership at the participant level is synonymous with the concepts of precision medicine and nutrition. Copyright © 2017 John Wiley & Sons, Ltd.

  15. Sugar administration is an effective adjunctive therapy in the treatment of Pseudomonas aeruginosa pneumonia

    PubMed Central

    Bucior, Iwona; Abbott, Jason; Song, Yuanlin; Matthay, Michael A.

    2013-01-01

    Treatment of acute and chronic pulmonary infections caused by opportunistic pathogen Pseudomonas aeruginosa is limited by the increasing frequency of multidrug bacterial resistance. Here, we describe a novel adjunctive therapy in which administration of a mix of simple sugars—mannose, fucose, and galactose—inhibits bacterial attachment, limits lung damage, and potentiates conventional antibiotic therapy. The sugar mixture inhibits adhesion of nonmucoid and mucoid P. aeruginosa strains to bronchial epithelial cells in vitro. In a murine model of acute pneumonia, treatment with the sugar mixture alone diminishes lung damage, bacterial dissemination to the subpleural alveoli, and neutrophil- and IL-8-driven inflammatory responses. Remarkably, the sugars act synergistically with anti-Pseudomonas antibiotics, including β-lactams and quinolones, to further reduce bacterial lung colonization and damage. To probe the mechanism, we examined the effects of sugars in the presence or absence of antibiotics during growth in liquid culture and in an ex vivo infection model utilizing freshly dissected mouse tracheas and lungs. We demonstrate that the sugar mixture induces rapid but reversible formation of bacterial clusters that exhibited enhanced susceptibility to antibiotics compared with individual bacteria. Our findings reveal that sugar inhalation, an inexpensive and safe therapeutic, could be used in combination with conventional antibiotic therapy to more effectively treat P. aeruginosa lung infections. PMID:23792737

  16. Encoding the local connectivity patterns of fMRI for cognitive task and state classification.

    PubMed

    Onal Ertugrul, Itir; Ozay, Mete; Yarman Vural, Fatos T

    2018-06-15

    In this work, we propose a novel framework to encode the local connectivity patterns of the brain, using Fisher vectors (FV), vector of locally aggregated descriptors (VLAD) and bag-of-words (BoW) methods. We first obtain local descriptors, called mesh arc descriptors (MADs) from fMRI data, by forming local meshes around anatomical regions, and estimating their relationship within a neighborhood. Then, we extract a dictionary of relationships, called brain connectivity dictionary by fitting a generative Gaussian mixture model (GMM) to a set of MADs, and selecting codewords at the mean of each component of the mixture. Codewords represent connectivity patterns among anatomical regions. We also encode MADs by VLAD and BoW methods using k-Means clustering. We classify cognitive tasks using the Human Connectome Project (HCP) task fMRI dataset and cognitive states using the Emotional Memory Retrieval (EMR) dataset. We train support vector machines (SVMs) using the encoded MADs. Results demonstrate that FV encoding of MADs can be successfully employed for classification of cognitive tasks, and outperforms VLAD and BoW representations. Moreover, we identify the significant Gaussians in mixture models by computing energy of their corresponding FV parts, and analyze their effect on classification accuracy. Finally, we suggest a new method to visualize the codewords of the learned brain connectivity dictionary.
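
    As a simplified, non-authoritative sketch of the encoding-and-classification pipeline above, the code below implements the VLAD branch only (the Fisher-vector branch additionally needs GMM gradients): a k-means codebook is built over local descriptors, each scan is encoded by aggregating residuals to the nearest codeword, and a linear SVM is trained on the encodings. The random descriptors stand in for mesh arc descriptors, and all sizes are assumptions.

    ```python
    # VLAD encoding of bags of local descriptors, followed by a linear SVM.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(4)
    K, D = 8, 16                                       # codebook size, descriptor dimension

    pooled = rng.standard_normal((2000, D))            # pooled local descriptors (stand-ins)
    codebook = KMeans(n_clusters=K, n_init=10, random_state=0).fit(pooled)

    def vlad(descriptors):
        """Sum residuals to the nearest codeword per cluster, then L2-normalize."""
        assign = codebook.predict(descriptors)
        enc = np.zeros((K, descriptors.shape[1]))
        for k in range(K):
            members = descriptors[assign == k]
            if len(members):
                enc[k] = (members - codebook.cluster_centers_[k]).sum(axis=0)
        enc = enc.ravel()
        return enc / (np.linalg.norm(enc) + 1e-12)

    y = rng.integers(0, 2, size=60)                    # synthetic task labels
    scans = [rng.standard_normal((100, D)) + yi for yi in y]
    X = np.array([vlad(s) for s in scans])
    clf = LinearSVC(C=1.0).fit(X, y)
    print("training accuracy:", clf.score(X, y))
    ```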

  17. Fluctuating micro-heterogeneity in water-tert-butyl alcohol mixtures and lambda-type divergence of the mean cluster size with phase transition-like multiple anomalies

    NASA Astrophysics Data System (ADS)

    Banerjee, Saikat; Furtado, Jonathan; Bagchi, Biman

    2014-05-01

    Water-tert-butyl alcohol (TBA) binary mixture exhibits a large number of thermodynamic and dynamic anomalies. These anomalies are observed at surprisingly low TBA mole fraction, with xTBA ≈ 0.03-0.07. We demonstrate here that the origin of the anomalies lies in the local structural changes that occur due to self-aggregation of TBA molecules. We observe a percolation transition of the TBA molecules at xTBA ≈ 0.05. We note that "islands" of TBA clusters form even below this mole fraction, while a large spanning cluster emerges above that mole fraction. At this percolation threshold, we observe a lambda-type divergence in the fluctuation of the size of the largest TBA cluster, reminiscent of a critical point. Alongside, the structure of water is also perturbed, albeit weakly, by the aggregation of TBA molecules. There is a monotonic decrease in the tetrahedral order parameter of water, while the dipole moment correlation shows a weak nonlinearity. Interestingly, water molecules themselves exhibit a reverse percolation transition at higher TBA concentration, xTBA ≈ 0.45, where large spanning water clusters now break up into small clusters. This is accompanied by significant divergence of the fluctuations in the size of the largest water cluster. This second transition gives rise to another set of anomalies around this concentration. Both percolation transitions can be regarded as manifestations of the Janus effect at the molecular level.

  18. Fluctuating micro-heterogeneity in water-tert-butyl alcohol mixtures and lambda-type divergence of the mean cluster size with phase transition-like multiple anomalies.

    PubMed

    Banerjee, Saikat; Furtado, Jonathan; Bagchi, Biman

    2014-05-21

    Water-tert-butyl alcohol (TBA) binary mixture exhibits a large number of thermodynamic and dynamic anomalies. These anomalies are observed at surprisingly low TBA mole fraction, with x(TBA) ≈ 0.03-0.07. We demonstrate here that the origin of the anomalies lies in the local structural changes that occur due to self-aggregation of TBA molecules. We observe a percolation transition of the TBA molecules at x(TBA) ≈ 0.05. We note that "islands" of TBA clusters form even below this mole fraction, while a large spanning cluster emerges above that mole fraction. At this percolation threshold, we observe a lambda-type divergence in the fluctuation of the size of the largest TBA cluster, reminiscent of a critical point. Alongside, the structure of water is also perturbed, albeit weakly, by the aggregation of TBA molecules. There is a monotonic decrease in the tetrahedral order parameter of water, while the dipole moment correlation shows a weak nonlinearity. Interestingly, water molecules themselves exhibit a reverse percolation transition at higher TBA concentration, x(TBA) ≈ 0.45, where large spanning water clusters now break up into small clusters. This is accompanied by significant divergence of the fluctuations in the size of the largest water cluster. This second transition gives rise to another set of anomalies around this concentration. Both percolation transitions can be regarded as manifestations of the Janus effect at the molecular level.

  19. Speaker Clustering for a Mixture of Singing and Reading (Preprint)

    DTIC Science & Technology

    2012-03-01

    diarization [2, 3], which answers the question of "who spoke when?", is a combination of speaker segmentation and clustering. Although it is possible to... focuses on speaker clustering, the techniques developed here can be applied to speaker diarization. For the remainder of this paper, the term "speech... and retrieval," Proceedings of the IEEE, vol. 88, 2000. [2] S. Tranter and D. Reynolds, "An overview of automatic speaker diarization systems," IEEE

  20. [4Fe-4S]-cluster-depleted Azotobacter vinelandii ferredoxin I: a new 3Fe iron-sulfur protein.

    PubMed Central

    Stephens, P J; Morgan, T V; Devlin, F; Penner-Hahn, J E; Hodgson, K O; Scott, R A; Stout, C D; Burgess, B K

    1985-01-01

    Fe(CN)6(-3) oxidation of the aerobically isolated 7Fe Azotobacter vinelandii ferredoxin I, (7Fe)FdI, is a degradative reaction. Destruction of the [4Fe-4S] cluster occurs first, followed by destruction of the [3Fe-3S] cluster. At a Fe(CN)6(-3)/(7Fe)FdI concentration ratio of 20, the product is a mixture of apoprotein and protein containing only a [3Fe-3S] cluster, (3Fe)FdI. This protein mixture, after partial purification, has been characterized by absorption, CD, magnetic CD, and EPR and Fe x-ray absorption spectroscopies. EPR and magnetic CD spectra provide strong evidence that the [3Fe-3S] cluster in (3Fe)FdI is essentially identical in structure to that in (7Fe)FdI. Analysis of the extended x-ray absorption fine structure (EXAFS) of (3Fe)FdI finds Fe scattering at an average Fe...Fe distance of approximately equal to 2.7 A. The structure of the oxidized [3Fe-3S] cluster in solutions of oxidized (3Fe)FdI, and, by extension, of oxidized (7Fe)FdI, is thus different from that obtained by x-ray crystallography on oxidized (7Fe)FdI. Possible interpretations of this result are discussed. PMID:2994040

  1. Lithium-air batteries, method for making lithium-air batteries

    DOEpatents

    Vajda, Stefan; Curtiss, Larry A.; Lu, Jun; Amine, Khalil; Tyo, Eric C.

    2016-11-15

    The invention provides a method for generating Li2O2 or composites of it; the method involves mixing lithium ions with oxygen ions in the presence of a catalyst. The catalyst comprises a plurality of metal clusters, their alloys and mixtures, each cluster consisting of between 3 and 18 metal atoms. The invention also describes a lithium-air battery which uses a lithium metal anode, a cathode opposing the anode, and an electrolyte positioned between the anode and the cathode. The cathode supports size-selected metal clusters, each consisting of between approximately 3 and approximately 18 metal atoms.

  2. How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing

    NASA Astrophysics Data System (ADS)

    Decyk, V. K.; Dauger, D. E.

    We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.

  3. Strategy for good dispersion of well-defined tetrapods in semiconducting polymer matrices.

    PubMed

    Lim, Jaehoon; Borg, Lisa zur; Dolezel, Stefan; Schmid, Friederike; Char, Kookheon; Zentel, Rudolf

    2014-10-01

    Morphology and dispersion control is studied in inorganic/organic hybrid systems that consist of monodisperse CdSe tetrapods (TPs) grafted with semiconducting block copolymers and mixed with excess polymers of the same type. Tetrapod arm-length and amount of polymer loading are varied in order to find the ideal morphology for hybrid solar cells. Additionally, polymers without anchor groups are mixed with the TPs to study the effect of such anchor groups on the hybrid morphology. A numerical model is developed, and Monte Carlo simulations are performed to study the basis of compatibility or dispersibility of TPs in polymer matrices. The simulations show that bare TPs tend to form clusters in the matrix of excess polymers. The clustering is significantly reduced after grafting polymer chains to the TPs, which is confirmed experimentally. Transmission electron microscopy reveals that the block copolymer-TP mixtures ("hybrids") show much better film qualities and TP distributions within the films when compared with the homopolymer-TP mixtures ("blends"), which show massive aggregation and cracks in the films. This grafting-to approach for the modification of TPs significantly improves the dispersion of the TPs in matrices of "excess" polymers up to the arm length of 100 nm. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. The influence of particle shape on dielectric enhancement in metal-insulator composites

    NASA Astrophysics Data System (ADS)

    Doyle, W. T.; Jacobs, I. S.

    1992-04-01

    Disordered suspensions of conducting particles exhibit substantial permittivity enhancements beyond the predictions of the Clausius-Mossotti equation and other purely dipolar approximations. The magnitude of the enhancement depends upon the shape of the particles. A recently developed effective cluster model for spherical particles [Phys. Rev. B 42, 9319 (1990)] that treats a disordered suspension as a mixture, or mesosuspension, of isolated spheres and close-packed spherical clusters of arbitrary size is in excellent agreement with experiments on well-stirred suspensions of spheres over the entire accessible range of volume loading. In this paper, the effective cluster model is extended to be applicable to disordered suspensions of arbitrarily shaped conducting particles. Two physical parameters are used to characterize a general suspension: the angular average polarizability of an isolated particle, and the volume loading at closest packing of the suspension. Multipole interactions within the clusters are treated exactly. External particle-shape-dependent interactions between clusters and isolated particles are treated in the dipole approximation in two ways: explicitly, using the Clausius-Mossotti equation, and implicitly, using the Wiener equation. Both versions of the model are used to find the permittivity of a monodisperse suspension of conducting spheroids, for which the model parameters can be determined independently. The two versions are in good agreement when the axial ratio of the particles is not extreme. The Clausius-Mossotti version of the model yields a mesoscopic analogue of the dielectric virial expansion. It is limited to small volume loadings when the particles have an extremely nonspherical shape. The Wiener equation version of the model holds at all volume loadings for particles of arbitrary shape. Comparison of the two versions of the model leads to a simple physical interpretation of Wiener's equation. The models are compared with experiments of Kelly, Stenoien, and Isbell [J. Appl. Phys. 24, 258 (1953)] on aluminum and zinc particles in paraffin, with Nasuhoglu's experiments on iron particles in oil [Commun. Fac. Sci. Univ. Ankara 4, 108 (1952)], and with new X-band and Ka-band permittivity measurements on Ni-Cr alloy particles in a polyurethane binder.
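
    For orientation, the purely dipolar baseline mentioned above is commonly written in the Clausius-Mossotti (Maxwell-Garnett) form; for perfectly conducting inclusions it reduces to the limit on the right. These are standard textbook relations quoted for context, not equations reproduced from the paper.

    ```latex
    \frac{\varepsilon_{\mathrm{eff}}-\varepsilon_m}{\varepsilon_{\mathrm{eff}}+2\varepsilon_m}
      = f\,\frac{\varepsilon_p-\varepsilon_m}{\varepsilon_p+2\varepsilon_m}
      \;\xrightarrow{\;\varepsilon_p\to\infty\;}\;
      \varepsilon_{\mathrm{eff}} = \varepsilon_m\,\frac{1+2f}{1-f},
    \qquad f = \text{particle volume loading}.
    ```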

  5. Tracing catchment fine sediment sources using the new SIFT (SedIment Fingerprinting Tool) open source software.

    PubMed

    Pulley, S; Collins, A L

    2018-09-01

    The mitigation of diffuse sediment pollution requires reliable provenance information so that measures can be targeted. Sediment source fingerprinting represents one approach for supporting these needs, but recent methodological developments have resulted in an increasing complexity of data processing methods rendering the approach less accessible to non-specialists. A comprehensive new software programme (SIFT; SedIment Fingerprinting Tool) has therefore been developed which guides the user through critical data analysis decisions and automates all calculations. Multiple source group configurations and composite fingerprints are identified and tested using multiple methods of uncertainty analysis. This aims to explore the sediment provenance information provided by the tracers more comprehensively than a single model, and allows for model configurations with high uncertainties to be rejected. This paper provides an overview of its application to an agricultural catchment in the UK to determine if the approach used can provide a reduction in uncertainty and increase in precision. Five source group classifications were used; three formed using a k-means cluster analysis containing 2, 3 and 4 clusters, and two a-priori groups based upon catchment geology. Three different composite fingerprints were used for each classification and bi-plots, range tests, tracer variability ratios and virtual mixtures tested the reliability of each model configuration. Some model configurations performed poorly when apportioning the composition of virtual mixtures, and different model configurations could produce different sediment provenance results despite using composite fingerprints able to discriminate robustly between the source groups. Despite this uncertainty, dominant sediment sources were identified, and those in close proximity to each sediment sampling location were found to be of greatest importance. This new software, by integrating recent methodological developments in tracer data processing, guides users through key steps. Critically, by applying multiple model configurations and uncertainty assessment, it delivers more robust solutions for informing catchment management of the sediment problem than many previously used approaches. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
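
    A minimal sketch of the un-mixing step that fingerprinting tools of this kind automate: given mean tracer signatures for each source group, the proportions of a sediment sample are estimated by non-negative least squares and checked against a "virtual mixture" with known composition. The tracer values, noise level, and renormalization below are assumptions for illustration, not SIFT's algorithm.

    ```python
    # Recover source proportions of a virtual sediment mixture by non-negative
    # least squares on tracer concentrations.
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(5)
    n_tracers, n_sources = 12, 3
    S = rng.uniform(1.0, 10.0, size=(n_tracers, n_sources))   # source-group tracer means

    true_p = np.array([0.6, 0.3, 0.1])                         # known virtual-mixture proportions
    sample = S @ true_p + 0.05 * rng.standard_normal(n_tracers)

    p_hat, _ = nnls(S, sample)                                 # non-negative estimates
    p_hat /= p_hat.sum()                                       # renormalize to proportions
    print("recovered proportions:", np.round(p_hat, 2))
    ```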

  6. A Hierarchical Framework for State-Space Matrix Inference and Clustering.

    PubMed

    Zuo, Chandler; Chen, Kailei; Hewitt, Kyle J; Bresnick, Emery H; Keleş, Sündüz

    2016-09-01

    In recent years, a large number of genomic and epigenomic studies have been focusing on the integrative analysis of multiple experimental datasets measured over a large number of observational units. The objectives of such studies include not only inferring a hidden state of activity for each unit over individual experiments, but also detecting highly associated clusters of units based on their inferred states. Although there are a number of methods tailored for specific datasets, there is currently no state-of-the-art modeling framework for this general class of problems. In this paper, we develop the MBASIC (Matrix Based Analysis for State-space Inference and Clustering) framework. MBASIC consists of two parts: state-space mapping and state-space clustering. In state-space mapping, it maps observations onto a finite state-space, representing the activation states of units across conditions. In state-space clustering, MBASIC incorporates a finite mixture model to cluster the units based on their inferred state-space profiles across all conditions. Both the state-space mapping and clustering can be simultaneously estimated through an Expectation-Maximization algorithm. MBASIC flexibly adapts to a large number of parametric distributions for the observed data, as well as the heterogeneity in replicate experiments. It allows for imposing structural assumptions on each cluster, and enables model selection using an information criterion. In our data-driven simulation studies, MBASIC showed significant accuracy in recovering both the underlying state-space variables and clustering structures. We applied MBASIC to two genome research problems using large numbers of datasets from the ENCODE project. The first application grouped genes based on transcription factor occupancy profiles of their promoter regions in two different cell types. The second application focused on identifying groups of loci that are similar to a GATA2 binding site that is functional at its endogenous locus by utilizing transcription factor occupancy data and illustrated the applicability of MBASIC in a wide variety of problems. In both studies, MBASIC showed higher levels of raw data fidelity than analyzing these data with a two-step approach using ENCODE results on transcription factor occupancy data.
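
    As a simplified analogue of the state-space clustering step (MBASIC estimates the state mapping and the clustering jointly; here binary activity profiles are taken as given), the sketch below fits a Bernoulli mixture to unit-by-condition profiles with a hand-written EM loop. The synthetic data and the fixed number of components are assumptions.

    ```python
    # EM for a K-component Bernoulli mixture over binary activity profiles.
    import numpy as np

    rng = np.random.default_rng(6)
    n_units, n_cond, K = 300, 10, 3
    true_theta = rng.uniform(0.1, 0.9, size=(K, n_cond))
    z = rng.integers(0, K, size=n_units)
    X = (rng.uniform(size=(n_units, n_cond)) < true_theta[z]).astype(float)

    pi = np.full(K, 1.0 / K)
    theta = rng.uniform(0.25, 0.75, size=(K, n_cond))
    for _ in range(200):
        # E-step: responsibilities, computed in log space for stability
        log_resp = X @ np.log(theta).T + (1.0 - X) @ np.log(1.0 - theta).T + np.log(pi)
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and Bernoulli parameters
        Nk = resp.sum(axis=0)
        pi = Nk / n_units
        theta = np.clip((resp.T @ X) / Nk[:, None], 1e-6, 1.0 - 1e-6)

    print("recovered cluster sizes:", np.round(np.sort(Nk), 1))
    ```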

  7. Non-negative Matrix Factorization and Co-clustering: A Promising Tool for Multi-tasks Bearing Fault Diagnosis

    NASA Astrophysics Data System (ADS)

    Shen, Fei; Chen, Chao; Yan, Ruqiang

    2017-05-01

    Classical bearing fault diagnosis methods, being designed according to one specific task, always pay attention to the effectiveness of extracted features and the final diagnostic performance. However, most of these approaches suffer from inefficiency when multiple tasks exist, especially in a real-time diagnostic scenario. A fault diagnosis method based on Non-negative Matrix Factorization (NMF) and Co-clustering strategy is proposed to overcome this limitation. Firstly, some high-dimensional matrixes are constructed using the Short-Time Fourier Transform (STFT) features, where the dimension of each matrix equals the number of target tasks. Then, the NMF algorithm is carried out to obtain different components in each dimension direction through optimized matching, such as Euclidean distance and divergence distance. Finally, a Co-clustering technique based on information entropy is utilized to realize classification of each component. To verify the effectiveness of the proposed approach, a series of bearing data sets were analysed in this research. The tests indicated that although the diagnostic performance of single task is comparable to traditional clustering methods such as the K-means algorithm and the Gaussian Mixture Model, the accuracy and computational efficiency in multi-tasks fault diagnosis are improved.
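
    A rough sketch of the front end of such a pipeline under stated assumptions: an STFT magnitude spectrogram of a synthetic vibration signal is factorized with NMF, and the resulting activations are grouped with k-means as a simple stand-in for the entropy-based co-clustering step described in the record.

    ```python
    # STFT -> NMF -> clustering of component activations (k-means as a stand-in).
    import numpy as np
    from scipy.signal import stft
    from sklearn.decomposition import NMF
    from sklearn.cluster import KMeans

    fs = 12_000                                        # assumed sampling rate, Hz
    t = np.arange(0.0, 2.0, 1.0 / fs)
    rng = np.random.default_rng(7)
    x = (np.sin(2 * np.pi * 3000 * t) * (1 + np.sin(2 * np.pi * 120 * t))   # modulated tone
         + 0.5 * np.sin(2 * np.pi * 1800 * t)
         + 0.2 * rng.standard_normal(t.size))

    f, frames, Z = stft(x, fs=fs, nperseg=1024)
    V = np.abs(Z)                                      # non-negative spectrogram

    nmf = NMF(n_components=6, init="nndsvda", max_iter=500, random_state=0)
    W = nmf.fit_transform(V)                           # spectral patterns (freq x component)
    H = nmf.components_                                # activations over time

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(H)
    print("component cluster labels:", labels)
    ```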

  8. Fullerene formation and annealing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mintmire, J.W.

    1996-04-05

    Why does the highly symmetric carbon cluster C60 form in such profusion under the right conditions? This question was first asked in 1985, when Kroto suggested that the predominance of the C60 carbon clusters observed in the molecular beam experiments could be explained by the truncated icosahedral (or soccer ball) form. The name given to this cluster, buckminsterfullerene, led to the use of the term fullerenes for the family of hollow-cage carbon clusters made up of even numbers of triply coordinated carbons arranged with 12 pentagonal rings and an almost arbitrary number of hexagonal rings. More than a decade later, we still lack a completely satisfying understanding of the fundamental chemistry that takes place during fullerene formation. Most current models for fullerene formation require a facile mechanism for ring rearrangement in the fullerene structure, but the simplest proposed mechanisms are believed to have unrealistically high activation barriers. In recent research calculations have suggested that atomic carbon in the reaction mixture could act as a catalyst and allow substantially lower activation barriers for fullerene annealing. This article discusses the background for this research and other adjunct research. 14 refs.

  9. Topological mappings of video and audio data.

    PubMed

    Fyfe, Colin; Barbakh, Wesam; Ooi, Wei Chuan; Ko, Hanseok

    2008-12-01

    We review a new form of self-organizing map which is based on a nonlinear projection of latent points into data space, identical to that performed in the Generative Topographic Mapping (GTM).(1) But whereas the GTM is an extension of a mixture of experts, this model is an extension of a product of experts.(2) We show visualisation and clustering results on a data set composed of video data of lips uttering 5 Korean vowels. Finally we note that we may dispense with the probabilistic underpinnings of the product of experts and derive the same algorithm as a minimisation of mean squared error between the prototypes and the data. This leads us to suggest a new algorithm which incorporates local and global information in the clustering. Both of the new algorithms achieve better results than the standard Self-Organizing Map.

  10. Binary Mixtures of Particles with Different Diffusivities Demix.

    PubMed

    Weber, Simon N; Weber, Christoph A; Frey, Erwin

    2016-02-05

    The influence of size differences, shape, mass, and persistent motion on phase separation in binary mixtures has been intensively studied. Here we focus on the exclusive role of diffusivity differences in binary mixtures of equal-sized particles. We find an effective attraction between the less diffusive particles, which are essentially caged in the surrounding species with the higher diffusion constant. This effect leads to phase separation for systems above a critical size: A single close-packed cluster made up of the less diffusive species emerges. Experiments for testing our predictions are outlined.

  11. Experimental studies on ion mobility in xenon-trimethylamine mixtures

    NASA Astrophysics Data System (ADS)

    Trindade, A. M. F.; Encarnação, P. M. C. C.; Escada, J.; Cortez, A. F. V.; Neves, P. N. B.; Conde, C. A. N.; Borges, F. I. G. M.; Santos, F. P.

    2017-07-01

    In this paper we present experimental results for ion reduced mobilities (K0) in gaseous trimethylamine, TMA—(CH3)3N, and xenon-TMA mixtures for reduced electric fields E/N between 7.5 and 60 Td and in the pressure range from 0.5 to 10 Torr, at room temperature. Both in the mixtures and in pure TMA only one peak was observed in the time of arrival spectra, which is believed to be due to two TMA ions with similar mass, (CH3)3N+ (59 u) and (CH3)2CH2N+ (58 u), whose mobility is indistinguishable in our experimental system. The possibility of ion cluster formation is also discussed. In pure TMA, for the E/N range investigated, an average value of 0.56 cm2V-1s-1 was obtained for the reduced mobility of TMA ions. For the studied mixtures, it was observed that even a very small amount of gaseous TMA (~0.2%) in xenon leads to the production of the above referred TMA ions or clusters. The reduced mobility value of this ion or ions in Xe-TMA mixtures is higher than the value in pure TMA: around 0.8 cm2V-1s-1 for TMA concentrations from 0.2% to about 10%, decreasing for higher TMA percentages, eventually converging to the reduced mobility value in pure TMA.

  12. Discrete mixture modeling to address genetic heterogeneity in time-to-event regression

    PubMed Central

    Eng, Kevin H.; Hanlon, Bret M.

    2014-01-01

    Motivation: Time-to-event regression models are a critical tool for associating survival time outcomes with molecular data. Despite mounting evidence that genetic subgroups of the same clinical disease exist, little attention has been given to exploring how this heterogeneity affects time-to-event model building and how to accommodate it. Methods able to diagnose and model heterogeneity should be valuable additions to the biomarker discovery toolset. Results: We propose a mixture of survival functions that classifies subjects with similar relationships to a time-to-event response. This model incorporates multivariate regression and model selection and can be fit with an expectation-maximization algorithm, which we call Cox-assisted clustering. We illustrate a likely manifestation of genetic heterogeneity and demonstrate how it may affect survival models with little warning. An application to gene expression in ovarian cancer DNA repair pathways illustrates how the model may be used to learn new genetic subsets for risk stratification. We explore the implications of this model for censored observations and the effect on genomic predictors and diagnostic analysis. Availability and implementation: R implementation of CAC using standard packages is available at https://gist.github.com/programeng/8620b85146b14b6edf8f Data used in the analysis are publicly available. Contact: kevin.eng@roswellpark.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24532723
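
    A simplified parametric analogue of the Cox-assisted clustering idea, kept self-contained by using exponential hazards instead of Cox partial likelihoods: an EM loop alternates responsibilities with one Newton update per component of a weighted exponential regression. The data, the two-component setting, and the exponential model are assumptions made for this sketch, not the paper's method.

    ```python
    # EM for a two-component mixture of exponential survival regressions (censored).
    import numpy as np

    rng = np.random.default_rng(8)
    n, p, K = 400, 3, 2
    X = rng.standard_normal((n, p))
    true_beta = np.array([[0.8, 0.0, -0.5], [-0.6, 0.7, 0.0]])
    z = rng.integers(0, K, size=n)
    rate = np.exp(np.einsum("ij,ij->i", X, true_beta[z]))       # true hazards
    T = rng.exponential(1.0 / rate)
    C = rng.exponential(2.0, size=n)                            # censoring times
    time, event = np.minimum(T, C), (T <= C).astype(float)

    beta = 0.1 * rng.standard_normal((K, p))
    pi = np.full(K, 1.0 / K)
    for _ in range(100):
        # E-step: responsibilities from the censored exponential likelihood
        lin = X @ beta.T                                        # n x K log-hazards
        loglik = event[:, None] * lin - np.exp(lin) * time[:, None] + np.log(pi)
        loglik -= loglik.max(axis=1, keepdims=True)
        resp = np.exp(loglik)
        resp /= resp.sum(axis=1, keepdims=True)
        pi = resp.mean(axis=0)
        # M-step: one Newton step per component on the weighted log-likelihood
        for k in range(K):
            lam = np.exp(X @ beta[k])
            grad = X.T @ (resp[:, k] * (event - lam * time))
            hess = -(X * (resp[:, k] * lam * time)[:, None]).T @ X
            beta[k] -= np.linalg.solve(hess, grad)

    print("estimated coefficients per cluster:")
    print(np.round(beta, 2))
    ```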

  13. On the clustering of multidimensional pictorial data

    NASA Technical Reports Server (NTRS)

    Bryant, J. D. (Principal Investigator)

    1979-01-01

    Obvious approaches to reducing the cost (in computer resources) of applying current clustering techniques to the problem of remote sensing are discussed. The use of spatial information in finding fields and in classifying mixture pixels is examined, and the AMOEBA clustering program is described. Internally a pattern recognition program, AMOEBA appears from without to be an unsupervised clustering program. It is fast and automatic. No choices (such as arbitrary thresholds to set split/combine sequences) need be made. The problem of finding the number of clusters is solved automatically. At the conclusion of the program, all points in the scene are classified; however, a provision is included for a reject classification of some points which, within the theoretical framework, cannot rationally be assigned to any cluster.

  14. Graph-Theoretic Analysis of Monomethyl Phosphate Clustering in Ionic Solutions.

    PubMed

    Han, Kyungreem; Venable, Richard M; Bryant, Anne-Marie; Legacy, Christopher J; Shen, Rong; Li, Hui; Roux, Benoît; Gericke, Arne; Pastor, Richard W

    2018-02-01

    All-atom molecular dynamics simulations combined with graph-theoretic analysis reveal that clustering of monomethyl phosphate dianion (MMP²⁻) is strongly influenced by the types and combinations of cations in the aqueous solution. Although Ca²⁺ promotes the formation of stable and large MMP²⁻ clusters, K⁺ alone does not. Nonetheless, clusters are larger and their link lifetimes are longer in mixtures of K⁺ and Ca²⁺. This "synergistic" effect depends sensitively on the Lennard-Jones interaction parameters between Ca²⁺ and the phosphorus oxygen and correlates with the hydration of the clusters. The pronounced MMP²⁻ clustering effect of Ca²⁺ in the presence of K⁺ is confirmed by Fourier transform infrared spectroscopy. The characterization of the cation-dependent clustering of MMP²⁻ provides a starting point for understanding cation-dependent clustering of phosphoinositides in cell membranes.
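
    A small sketch of the graph-theoretic bookkeeping described above: treating each molecule as a node, connecting molecules closer than a cutoff, and reading off connected-component (cluster) sizes per frame with networkx. Coordinates are random stand-ins for trajectory frames, the 0.6 nm cutoff is an assumption, and periodic boundaries are ignored for brevity.

    ```python
    # Per-frame cluster sizes from a distance-cutoff contact graph.
    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(9)
    CUTOFF = 0.6                                       # nm, assumed contact distance

    def cluster_sizes(coords, cutoff):
        g = nx.Graph()
        g.add_nodes_from(range(len(coords)))
        d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        upper = np.triu(np.ones_like(d, dtype=bool), k=1)
        i, j = np.where((d < cutoff) & upper)
        g.add_edges_from(zip(i, j))
        return sorted((len(c) for c in nx.connected_components(g)), reverse=True)

    for frame in range(3):
        coords = rng.uniform(0.0, 4.0, size=(60, 3))   # 60 molecules in a 4 nm box
        print(f"frame {frame}: largest clusters", cluster_sizes(coords, CUTOFF)[:5])
    ```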

  15. Geometric comparison of popular mixture-model distances.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mitchell, Scott A.

    2010-09-01

    Statistical Latent Dirichlet Analysis produces mixture model data that are geometrically equivalent to points lying on a regular simplex in moderate to high dimensions. Numerous other statistical models and techniques also produce data in this geometric category, even though the meaning of the axes and coordinate values differs significantly. A distance function is used to further analyze these points, for example to cluster them. Several different distance functions are popular amongst statisticians; which distance function is chosen is usually driven by the historical preference of the application domain, information-theoretic considerations, or by the desirability of the clustering results. Relatively little consideration is usually given to how distance functions geometrically transform data, or to the distances' algebraic properties. Here we take a look at these issues, in the hope of providing complementary insight and inspiring further geometric thought. Several popular distances, χ², Jensen-Shannon divergence, and the square of the Hellinger distance, are shown to be nearly equivalent: in terms of functional forms after transformations, factorizations, and series expansions, and in terms of the shape and proximity of constant-value contours. This is somewhat surprising given that their original functional forms look quite different. Cosine similarity is the square of the Euclidean distance, and a similar geometric relationship is shown with Hellinger and another cosine. We suggest a geodesic variation of Hellinger. The square-root projection that arises in Hellinger distance is briefly compared to standard normalization for Euclidean distance. We include detailed derivations of some ratio and difference bounds for illustrative purposes. We provide some constructions that nearly achieve the worst-case ratios, relevant for contours.
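
    To make the comparison concrete, the short sketch below evaluates the chi-squared distance (one common normalization), the Jensen-Shannon divergence, and the squared Hellinger distance on random points of the probability simplex, so their relative behaviour on the same pairs can be inspected. The definitions are standard; the specific normalizations are assumptions of this sketch, not taken from the report.

    ```python
    # Compare chi-squared, Jensen-Shannon, and squared Hellinger on simplex points.
    import numpy as np

    rng = np.random.default_rng(10)

    def chi2(p, q):
        return 0.5 * np.sum((p - q) ** 2 / (p + q))

    def hellinger_sq(p, q):
        return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

    def js_divergence(p, q):
        m = 0.5 * (p + q)
        kl = lambda a, b: np.sum(a * np.log(a / b))
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    for _ in range(5):
        p, q = rng.dirichlet(np.ones(5), size=2)
        print(round(chi2(p, q), 4), round(js_divergence(p, q), 4), round(hellinger_sq(p, q), 4))
    ```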

  16. Model-based recursive partitioning to identify risk clusters for metabolic syndrome and its components: findings from the International Mobility in Aging Study

    PubMed Central

    Pirkle, Catherine M; Wu, Yan Yan; Zunzunegui, Maria-Victoria; Gómez, José Fernando

    2018-01-01

    Objective Conceptual models underpinning much epidemiological research on ageing acknowledge that environmental, social and biological systems interact to influence health outcomes. Recursive partitioning is a data-driven approach that allows for concurrent exploration of distinct mixtures, or clusters, of individuals that have a particular outcome. Our aim is to use recursive partitioning to examine risk clusters for metabolic syndrome (MetS) and its components, in order to identify vulnerable populations. Study design Cross-sectional analysis of baseline data from a prospective longitudinal cohort called the International Mobility in Aging Study (IMIAS). Setting IMIAS includes sites from three middle-income countries—Tirana (Albania), Natal (Brazil) and Manizales (Colombia)—and two from Canada—Kingston (Ontario) and Saint-Hyacinthe (Quebec). Participants Community-dwelling male and female adults, aged 64–75 years (n=2002). Primary and secondary outcome measures We apply recursive partitioning to investigate social and behavioural risk factors for MetS and its components. Model-based recursive partitioning (MOB) was used to cluster participants into age-adjusted risk groups based on variabilities in: study site, sex, education, living arrangements, childhood adversities, adult occupation, current employment status, income, perceived income sufficiency, smoking status and weekly minutes of physical activity. Results 43% of participants had MetS. Using MOB, the primary partitioning variable was participant sex. Among women from middle-income sites, the predicted proportion with MetS ranged from 58% to 68%. Canadian women with limited physical activity had elevated predicted proportions of MetS (49%, 95% CI 39% to 58%). Among men, MetS ranged from 26% to 41% depending on childhood social adversity and education. Clustering for MetS components differed from the syndrome and across components. Study site was a primary partitioning variable for all components except HDL cholesterol. Sex was important for most components. Conclusion MOB is a promising technique for identifying disease risk clusters (eg, vulnerable populations) in modestly sized samples. PMID:29500203

  17. ReaxFF molecular dynamics simulation of intermolecular structure formation in acetic acid-water mixtures at elevated temperatures and pressures

    NASA Astrophysics Data System (ADS)

    Sengul, Mert Y.; Randall, Clive A.; van Duin, Adri C. T.

    2018-04-01

    The intermolecular structure formation in liquid and supercritical acetic acid-water mixtures was investigated using ReaxFF-based molecular dynamics simulations. The microscopic structures of acetic acid-water mixtures with different acetic acid mole fractions (1.0 ≥ xHAc ≥ 0.2) at ambient and critical conditions were examined. The potential energy surface associated with the dissociation of acetic acid molecules was calculated using a metadynamics procedure to optimize the dissociation energy of ReaxFF potential. At ambient conditions, depending on the acetic acid concentration, either acetic acid clusters or water clusters are dominant in the liquid mixture. When acetic acid is dominant (0.4 ≤ xHAc), cyclic dimers and chain structures between acetic acid molecules are present in the mixture. Both structures disappear at increased water content of the mixture. It was found by simulations that the acetic acid molecules released from these dimer and chain structures tend to stay in a dipole-dipole interaction. These structural changes are in agreement with the experimental results. When switched to critical conditions, the long-range interactions (e.g., second or fourth neighbor) disappear and the water-water and acetic acid-acetic acid structural formations become disordered. The simulated radial distribution function for water-water interactions is in agreement with experimental and computational studies. The first neighbor interactions between acetic acid and water molecules are preserved at relatively lower temperatures of the critical region. As higher temperatures are reached in the critical region, these interactions were observed to weaken. These simulations indicate that ReaxFF molecular dynamics simulations are an appropriate tool for studying supercritical water/organic acid mixtures.

  18. Statistical uncertainty of extreme wind storms over Europe derived from a probabilistic clustering technique

    NASA Astrophysics Data System (ADS)

    Walz, Michael; Leckebusch, Gregor C.

    2016-04-01

    Extratropical wind storms pose one of the most dangerous and loss intensive natural hazards for Europe. However, due to only 50 years of high quality observational data, it is difficult to assess the statistical uncertainty of these sparse events just based on observations. Over the last decade seasonal ensemble forecasts have become indispensable in quantifying the uncertainty of weather prediction on seasonal timescales. In this study seasonal forecasts are used in a climatological context: By making use of the up to 51 ensemble members, a broad and physically consistent statistical base can be created. This base can then be used to assess the statistical uncertainty of extreme wind storm occurrence more accurately. In order to determine the statistical uncertainty of storms with different paths of progression, a probabilistic clustering approach using regression mixture models is used to objectively assign storm tracks (either based on core pressure or on extreme wind speeds) to different clusters. The advantage of this technique is that the entire lifetime of a storm is considered for the clustering algorithm. Quadratic curves are found to describe the storm tracks most accurately. Three main clusters (diagonal, horizontal or vertical progression of the storm track) can be identified, each of which has its own particular features. Basic storm features like average velocity and duration are calculated and compared for each cluster. The main benefit of this clustering technique, however, is to evaluate if the clusters show different degrees of uncertainty, e.g. more (less) spread for tracks approaching Europe horizontally (diagonally). This statistical uncertainty is compared for different seasonal forecast products.
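
    A much simplified stand-in for the regression-mixture clustering described above (the study fits the mixture of quadratic regressions directly to the tracks): each synthetic track is summarized by the coefficients of a quadratic fit, and the coefficient vectors are then clustered with a Gaussian mixture. Track shapes, noise, and the number of clusters are assumptions.

    ```python
    # Summarize each storm track by quadratic-fit coefficients, then cluster them.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(11)

    def synthetic_track(kind):
        x = np.linspace(0.0, 1.0, 20)
        shapes = {"diagonal": 0.8 * x, "horizontal": 0.1 * x, "vertical": 1.5 * x - 0.8 * x ** 2}
        return x, shapes[kind] + 0.03 * rng.standard_normal(x.size)

    kinds = rng.choice(["diagonal", "horizontal", "vertical"], size=90)
    tracks = [synthetic_track(k) for k in kinds]
    coeffs = np.array([np.polyfit(x, y, deg=2) for x, y in tracks])

    labels = GaussianMixture(n_components=3, random_state=0).fit_predict(coeffs)
    print("cluster sizes:", np.bincount(labels))
    ```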

  19. Characterization of Rhinitis According to the Asthma Status in Adults Using an Unsupervised Approach in the EGEA Study.

    PubMed

    Burte, Emilie; Bousquet, Jean; Varraso, Raphaëlle; Gormand, Frédéric; Just, Jocelyne; Matran, Régis; Pin, Isabelle; Siroux, Valérie; Jacquemin, Bénédicte; Nadif, Rachel

    2015-01-01

    The classification of rhinitis in adults is missing in epidemiological studies. To identify phenotypes of adult rhinitis using an unsupervised approach (data-driven) compared with a classical hypothesis-driven approach. 983 adults of the French Epidemiological Study on the Genetics and Environment of Asthma (EGEA) were studied. Self-reported symptoms related to rhinitis such as nasal symptoms, hay fever, sinusitis, conjunctivitis, and sensitivities to different triggers (dust, animals, hay/flowers, cold air…) were used. Allergic sensitization was defined by at least one positive skin prick test to 12 aeroallergens. A mixture model was used to cluster participants, independently in those without (Asthma-, n = 582) and with asthma (Asthma+, n = 401). Three clusters were identified in both groups: 1) Cluster A (55% in Asthma-, and 22% in Asthma+) mainly characterized by the absence of nasal symptoms, 2) Cluster B (23% in Asthma-, 36% in Asthma+) mainly characterized by nasal symptoms all over the year, sinusitis and a low prevalence of positive skin prick tests, and 3) Cluster C (22% in Asthma-, 42% in Asthma+) mainly characterized by a peak of nasal symptoms during spring, a high prevalence of positive skin prick tests and a high report of hay fever, allergic rhinitis and conjunctivitis. The highest rate of polysensitization (80%) was found in participants with comorbid asthma and allergic rhinitis. This cluster analysis highlighted three clusters of rhinitis with characteristics similar to those known by clinicians but differing according to allergic sensitization, whatever the asthma status. These clusters could be easily rebuilt using a small number of variables.

  20. The Manhattan Frame Model-Manhattan World Inference in the Space of Surface Normals.

    PubMed

    Straub, Julian; Freifeld, Oren; Rosman, Guy; Leonard, John J; Fisher, John W

    2018-01-01

    Objects and structures within man-made environments typically exhibit a high degree of organization in the form of orthogonal and parallel planes. Traditional approaches utilize these regularities via the restrictive, and rather local, Manhattan World (MW) assumption which posits that every plane is perpendicular to one of the axes of a single coordinate system. The aforementioned regularities are especially evident in the surface normal distribution of a scene where they manifest as orthogonally-coupled clusters. This motivates the introduction of the Manhattan-Frame (MF) model which captures the notion of an MW in the surface normals space, the unit sphere, and two probabilistic MF models over this space. First, for a single MF we propose novel real-time MAP inference algorithms, evaluate their performance and their use in drift-free rotation estimation. Second, to capture the complexity of real-world scenes at a global scale, we extend the MF model to a probabilistic mixture of Manhattan Frames (MMF). For MMF inference we propose a simple MAP inference algorithm and an adaptive Markov-Chain Monte-Carlo sampling algorithm with Metropolis-Hastings split/merge moves that let us infer the unknown number of mixture components. We demonstrate the versatility of the MMF model and inference algorithm across several scales of man-made environments.

  1. Exploring the possibility to store the mixed oxygen-hydrogen cluster in clathrate hydrate in molar ratio 1:2 (O2+2H2).

    PubMed

    Qin, Yan; Du, Qi-Shi; Xie, Neng-Zhong; Li, Jian-Xiu; Huang, Ri-Bo

    2017-05-01

    An interesting possibility is explored: storing the mixture of oxygen and hydrogen in clathrate hydrate in molar ratio 1:2. The interaction energies between oxygen, hydrogen, and clathrate hydrate are calculated using high level quantum chemical methods. The useful conclusion points from this study are summarized as follows. (1) The interaction energies of oxygen-hydrogen mixed cluster are larger than the energies of pure hydrogen molecular cluster. (2) The affinity of oxygen molecules with water molecules is larger than that of the hydrogen molecules with water molecules. (3) The dimension of O2-2H2 interaction structure is smaller than the dimension of CO2-2H2 interaction structure. (4) The escaping energy of oxygen molecules from the hydrate cell is larger than that of the hydrogen molecules. (5) The high affinity of the oxygen molecules with both the water molecules and the hydrogen molecules may promote the stability of oxygen-hydrogen mixture in the clathrate hydrate. Therefore it is possible to store the mixed (O2+2H2) cluster in clathrate hydrate. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. ASHEE: a compressible, Equilibrium-Eulerian model for volcanic ash plumes

    NASA Astrophysics Data System (ADS)

    Cerminara, M.; Esposti Ongaro, T.; Berselli, L. C.

    2015-10-01

    A new fluid-dynamic model is developed to numerically simulate the non-equilibrium dynamics of polydisperse gas-particle mixtures forming volcanic plumes. Starting from the three-dimensional N-phase Eulerian transport equations (Neri et al., 2003) for a mixture of gases and solid dispersed particles, we adopt an asymptotic expansion strategy to derive a compressible version of the first-order non-equilibrium model (Ferry and Balachandar, 2001), valid for low concentration regimes (particle volume fraction less than 10⁻³) and particle Stokes numbers (St, i.e., the ratio between their relaxation time and the flow characteristic time) not exceeding about 0.2. The new model, which is called ASHEE (ASH Equilibrium Eulerian), is significantly faster than the N-phase Eulerian model while retaining the capability to describe gas-particle non-equilibrium effects. Direct numerical simulation accurately reproduces the dynamics of isotropic, compressible turbulence in the subsonic regime. For gas-particle mixtures, it describes the main features of density fluctuations and the preferential concentration and clustering of particles by turbulence, thus verifying the model reliability and suitability for the numerical simulation of high-Reynolds number and high-temperature regimes in the presence of a dispersed phase. On the other hand, Large-Eddy Numerical Simulations of forced plumes are able to reproduce their observed averaged and instantaneous flow properties. In particular, the self-similar Gaussian radial profile and the development of large-scale coherent structures are reproduced, including the rate of turbulent mixing and entrainment of atmospheric air. Application to the Large-Eddy Simulation of the injection of the eruptive mixture in a stratified atmosphere describes some of the important features of turbulent volcanic plumes, including air entrainment, buoyancy reversal, and maximum plume height. For very fine particles (St → 0, when non-equilibrium effects are negligible) the model reduces to the so-called dusty-gas model. However, coarse particles partially decouple from the gas phase within eddies (thus modifying the turbulent structure) and preferentially concentrate at the eddy periphery, eventually being lost from the plume margins due to the concurrent effect of gravity. By these mechanisms, gas-particle non-equilibrium processes are able to influence the large-scale behavior of volcanic plumes.
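
    For orientation only, the Stokes number used above is the ratio of the particle relaxation time to a characteristic flow time; in the Stokes drag regime the relaxation time takes the standard form below. These are textbook definitions quoted for context, not expressions taken from the ASHEE paper.

    ```latex
    \mathrm{St} = \frac{\tau_p}{\tau_f},
    \qquad
    \tau_p = \frac{\rho_p d_p^{2}}{18\,\mu_g}.
    ```

    Here ρ_p and d_p denote the particle density and diameter, and μ_g the dynamic viscosity of the gas phase.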

  3. Mixtures of amino-acid based ionic liquids and water.

    PubMed

    Chaban, Vitaly V; Fileti, Eudes Eterno

    2015-09-01

    New ionic liquids (ILs) involving increasing numbers of organic and inorganic ions are continuously being reported. We recently developed a new force field; in the present work, we applied that force field to investigate the structural properties of a few novel imidazolium-based ILs in aqueous mixtures via molecular dynamics (MD) simulations. Using cluster analysis, radial distribution functions, and spatial distribution functions, we argue that organic ions (imidazolium, deprotonated alanine, deprotonated methionine, deprotonated tryptophan) are well dispersed in aqueous media, irrespective of the IL content. Aqueous dispersions exhibit desirable properties for chemical engineering. The ILs exist as ion pairs in relatively dilute aqueous mixtures (10 mol%), while more concentrated mixtures feature a certain amount of larger ionic aggregates.

  4. CFD Modelling of Particle Mixtures in a 2D CFB

    NASA Astrophysics Data System (ADS)

    Seppälä, M.; Kallio, S.

    The capability of Fluent 6.2.16 to simulate particle mixtures in a laboratory-scale 2D circulating fluidized bed (CFB) unit has been tested. In the simulations, the solids were described as one or two particle phases. The loading ratio of small to large particles, particle diameters and the gas inflow velocity were varied. The 40 cm wide and 3 m high 2D CFB was modeled using a grid with 31080 cells. The outflow of particles at the top of the CFB was monitored, and the escaping particles were fed back to the riser through a return duct. The paper presents the segregation patterns of the particle phases obtained from the simulations. When the fraction of large particles was 50% or larger, large particles segregated, as expected, to the wall regions and to the bottom part of the riser. However, when the fraction of large particles was 10%, an excess of large particles was found in the upper half of the riser. The explanation for this unexpected phenomenon was found in the distribution of the large particles between the slow clusters and the faster-moving lean suspension.

  5. Influence of Xe and Kr impurities on x-ray yield from debris-free plasma x-ray sources with an Ar supersonic gas jet irradiated by femtosecond near-infrared-wavelength laser pulses

    NASA Astrophysics Data System (ADS)

    Kantsyrev, V. L.; Schultz, K. A.; Shlyaptseva, V. V.; Petrov, G. M.; Safronova, A. S.; Petkov, E. E.; Moschella, J. J.; Shrestha, I.; Cline, W.; Wiewior, P.; Chalyy, O.

    2016-11-01

    Many aspects of physical phenomena occurring when an intense laser pulse with subpicosecond duration and an intensity of 10^18-10^19 W/cm^2 heats an underdense plasma in a supersonic clustered gas jet are studied to determine the relative contribution of thermal and nonthermal processes to soft- and hard-x-ray emission from debris-free plasmas. Experiments were performed at the University of Nevada, Reno (UNR) Leopard laser operated with a 15-J, 350-fs pulse and different pulse contrasts (10^7 or 10^5). The supersonic linear (elongated) nozzle generated Xe cluster-monomer gas jets as well as jets with Kr-Ar or Xe-Kr-Ar mixtures with densities of 10^18-10^19 cm^-3. Prior to laser heating experiments, all jets were probed with optical interferometry and Rayleigh scattering to measure jet density and cluster distribution parameters. The supersonic linear jet provides the capability to study the anisotropy of x-ray yield from laser plasma and also laser beam self-focusing in plasma, which leads to efficient x-ray generation. Plasma diagnostics included x-ray diodes, pinhole cameras, and spectrometers. Jet signatures of x-ray emission from pure Xe gas, as well as from a mixture with Ar and Kr, were found to be very different. The most intense x-ray emission in the 1-9 keV spectral region was observed from gas mixtures rather than pure Xe. Also, this x-ray emission was strongly anisotropic with respect to the direction of laser beam polarization. Non-local thermodynamic equilibrium (Non-LTE) models have been implemented to analyze the x-ray spectra to determine the plasma temperature and electron density. Evidence of electron beam generation in the supersonic jet plasma was found. The influence of the subpicosecond laser pulse contrast (a ratio between the laser peak intensity and pedestal pulse intensity) on the jets' x-ray emission characteristics is discussed. Surprisingly, it was found that the x-ray yield was not sensitive to the prepulse contrast ratio.

  6. The primordial and evolutionary abundance variations in globular-cluster stars: a problem with two unknowns

    NASA Astrophysics Data System (ADS)

    Denissenkov, P. A.; VandenBerg, D. A.; Hartwick, F. D. A.; Herwig, F.; Weiss, A.; Paxton, B.

    2015-04-01

    We demonstrate that among the potential sources of the primordial abundance variations of the proton-capture elements in globular-cluster stars proposed so far, such as the hot-bottom burning in massive asymptotic giant branch stars and H burning in the convective cores of supermassive and fast-rotating massive main-sequence (MS) stars, only the supermassive MS stars with M > 10^4 M⊙ can explain all the observed abundance correlations without any fine-tuning of model parameters. We use our assumed chemical composition for the pristine gas in M13 (NGC 6205) and its mixtures with 50 and 90 per cent of the material partially processed in H burning in the 6 × 10^4 M⊙ MS model star as the initial compositions for the normal, intermediate, and extreme populations of low-mass stars in this globular cluster, as suggested by its O-Na anticorrelation. We evolve these stars from the zero-age MS to the red giant branch (RGB) tip with the thermohaline and parametric prescriptions for the RGB extra mixing. We find that the ^3He-driven thermohaline convection cannot explain the evolutionary decline of [C/Fe] in M13 RGB stars, which, on the other hand, is well reproduced with the universal values for the mixing depth and rate calibrated using the observed decrease of [C/Fe] with M_V in the globular cluster NGC 5466 that does not have the primordial abundance variations.

  7. Convective Self-Sustained Motion in Mixtures of Chemically Active and Passive Particles.

    PubMed

    Shklyaev, Oleg E; Shum, Henry; Yashin, Victor V; Balazs, Anna C

    2017-08-15

    We develop a model to describe the behavior of a system of active and passive particles in solution that can undergo spontaneous self-organization and self-sustained motion. The active particles are uniformly coated with a catalyst that decomposes the reagent in the surrounding fluid. The resulting variations in the fluid density give rise to a convective flow around the active particles. The generated fluid flow, in turn, drives the self-organization of both the active and passive particles into clusters that undergo self-sustained propulsion along the bottom wall of a microchamber. This propulsion continues until the reagents in the solution are consumed. Depending on the number of active and passive particles and the structure of the self-organized cluster, these assemblies can translate, spin, or remain stationary. We also illustrate a scenario in which the geometry of the container is harnessed to direct the motion of a self-organized, self-propelled cluster. The findings provide guidelines for creating autonomously moving active particles, or chemical "motors" that can transport passive cargo in microfluidic devices.

  8. Brownian dynamics simulations of insulin microspheres formation

    NASA Astrophysics Data System (ADS)

    Li, Wei; Chakrabarti, Amit; Gunton, James

    2010-03-01

    Recent experiments have indicated a novel, aqueous process of microsphere insulin fabrication based on controlled phase separation of protein from water-soluble polymers. We investigate the insulin microsphere crystal formation from insulin-PEG-water systems via 3D Brownian Dynamics simulations. We use the two-component Asakura-Oosawa model to simulate the kinetics of this colloid-polymer mixture. We first perform a deep quench below the liquid-crystal boundary that leads to fractal formation. We next heat the system to obtain a break-up of the fractal clusters and subsequently cool the system to obtain a spherical aggregation of droplets with a relatively narrow size distribution. We analyze the structure factor S(q) to identify the cluster dimension. The power-law exponent of S(q) crosses over from 1.8 (in agreement with DLCA) to 4 as q increases, which shows the evolution from fractal to spherical clusters. By studying the bond-order parameters, we find the phase transition from liquid-like droplets to crystals which exhibit local HCP and FCC order. This work is supported by grants from the NSF and Mathers Foundation.

  9. A New Cluster Analysis-Marker-Controlled Watershed Method for Separating Particles of Granular Soils

    PubMed Central

    Alam, Md Ferdous

    2017-01-01

    An accurate determination of particle-level fabric of granular soils from tomography data requires a maximum correct separation of particles. The popular marker-controlled watershed separation method is widely used to separate particles. However, the watershed method alone is not capable of producing the maximum separation of particles when subjected to boundary stresses leading to crushing of particles. In this paper, a new separation method, named as Monash Particle Separation Method (MPSM), has been introduced. The new method automatically determines the optimal contrast coefficient based on cluster evaluation framework to produce the maximum accurate separation outcomes. Finally, the particles which could not be separated by the optimal contrast coefficient were separated by integrating cuboid markers generated from the clustering by Gaussian mixture models into the routine watershed method. The MPSM was validated on a uniformly graded sand volume subjected to one-dimensional compression loading up to 32 MPa. It was demonstrated that the MPSM is capable of producing the best possible separation of particles required for the fabric analysis. PMID:29057823
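
    The record above combines Gaussian mixture clustering with a marker-controlled watershed. The sketch below is a generic, simplified version of that idea, not the authors' MPSM: voxel intensities are clustered with a GMM, the brightest class seeds the markers, and a standard watershed is run on the distance transform. The function name and the assumption that particles are the brightest class are illustrative.

    ```python
    # Hedged sketch of GMM-seeded, marker-controlled watershed separation.
    import numpy as np
    from scipy import ndimage as ndi
    from skimage.segmentation import watershed
    from sklearn.mixture import GaussianMixture

    def separate_particles(volume, n_classes=3):
        # Cluster voxel intensities with a Gaussian mixture model.
        gmm = GaussianMixture(n_components=n_classes, random_state=0)
        labels = gmm.fit_predict(volume.reshape(-1, 1)).reshape(volume.shape)
        particle_mask = labels == int(np.argmax(gmm.means_.ravel()))  # assume particles are brightest
        # Seed the watershed with connected components of the particle class.
        markers, _ = ndi.label(particle_mask)
        distance = ndi.distance_transform_edt(particle_mask)
        return watershed(-distance, markers, mask=particle_mask)

    demo_volume = np.random.rand(32, 32, 32)      # placeholder tomography volume
    separated = separate_particles(demo_volume)
    ```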

  10. Metal mixtures in urban and rural populations in the US: The Multi-Ethnic Study of Atherosclerosis and the Strong Heart Study.

    PubMed

    Pang, Yuanjie; Peng, Roger D; Jones, Miranda R; Francesconi, Kevin A; Goessler, Walter; Howard, Barbara V; Umans, Jason G; Best, Lyle G; Guallar, Eliseo; Post, Wendy S; Kaufman, Joel D; Vaidya, Dhananjay; Navas-Acien, Ana

    2016-05-01

    Natural and anthropogenic sources of metal exposure differ for urban and rural residents. We sought to identify patterns of metal mixtures that could suggest common environmental sources and/or metabolic pathways of different urinary metals, and compared metal mixtures in two population-based studies from urban/sub-urban and rural/town areas in the US: the Multi-Ethnic Study of Atherosclerosis (MESA) and the Strong Heart Study (SHS). We studied a random sample of 308 White, Black, Chinese-American, and Hispanic participants in MESA (2000-2002) and 277 American Indian participants in SHS (1998-2003). We used principal component analysis (PCA), cluster analysis (CA), and linear discriminant analysis (LDA) to evaluate nine urinary metals (antimony [Sb], arsenic [As], cadmium [Cd], lead [Pb], molybdenum [Mo], selenium [Se], tungsten [W], uranium [U] and zinc [Zn]). For arsenic, we used the sum of inorganic and methylated species (∑As). All nine urinary metals were higher in SHS compared to MESA participants. PCA and CA revealed the same patterns in SHS, suggesting 4 distinct principal components (PC) or clusters (∑As-U-W, Pb-Sb, Cd-Zn, Mo-Se). In MESA, CA showed 2 large clusters (∑As-Mo-Sb-U-W, Cd-Pb-Se-Zn), while PCA showed 4 PCs (Sb-U-W, Pb-Se-Zn, Cd-Mo, ∑As). LDA indicated that ∑As, U, W, and Zn were the most discriminant variables distinguishing MESA and SHS participants. In SHS, the ∑As-U-W cluster and PC might reflect groundwater contamination in rural areas, and the Cd-Zn cluster and PC could reflect common sources from meat products or metabolic interactions. Among the metals assayed, ∑As, U, W and Zn differed the most between MESA and SHS, possibly reflecting disproportionate exposure from drinking water and perhaps food in rural Native communities compared to urban communities around the US. Copyright © 2016 Elsevier Inc. All rights reserved.
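
    As a rough illustration of the PCA / cluster-analysis / LDA workflow described above (not the authors' code), the sketch below runs the three steps on a placeholder matrix of log-transformed urinary metal concentrations; the data, the number of components, and the number of clusters are assumptions.

    ```python
    # Hedged sketch of the PCA + cluster analysis + LDA workflow on placeholder data.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    metals = rng.lognormal(size=(585, 9))        # placeholder: participants x 9 metals
    study = rng.integers(0, 2, size=585)         # placeholder labels: 0 = MESA, 1 = SHS

    Z = StandardScaler().fit_transform(np.log(metals))
    pca = PCA(n_components=4).fit(Z)                                  # patterns of co-varying metals
    metal_clusters = AgglomerativeClustering(n_clusters=4).fit_predict(Z.T)  # group the 9 metals
    lda = LinearDiscriminantAnalysis().fit(Z, study)                  # which metals separate the studies
    print(pca.explained_variance_ratio_.round(2), metal_clusters, lda.coef_.round(2))
    ```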

  11. TWave: High-Order Analysis of Functional MRI

    PubMed Central

    Barnathan, Michael; Megalooikonomou, Vasileios; Faloutsos, Christos; Faro, Scott; Mohamed, Feroze B.

    2011-01-01

    The traditional approach to functional image analysis models images as matrices of raw voxel intensity values. Although such a representation is widely utilized and heavily entrenched both within neuroimaging and in the wider data mining community, the strong interactions among space, time, and categorical modes such as subject and experimental task inherent in functional imaging yield a dataset with “high-order” structure, which matrix models are incapable of exploiting. Reasoning across all of these modes of data concurrently requires a high-order model capable of representing relationships between all modes of the data in tandem. We thus propose to model functional MRI data using tensors, which are high-order generalizations of matrices equivalent to multidimensional arrays or data cubes. However, several unique challenges exist in the high-order analysis of functional medical data: naïve tensor models are incapable of exploiting spatiotemporal locality patterns, standard tensor analysis techniques exhibit poor efficiency, and mixtures of numeric and categorical modes of data are very often present in neuroimaging experiments. Formulating the problem of image clustering as a form of Latent Semantic Analysis and using the WaveCluster algorithm as a baseline, we propose a comprehensive hybrid tensor and wavelet framework for clustering, concept discovery, and compression of functional medical images which successfully addresses these challenges. Our approach reduced runtime and dataset size on a 9.3 GB finger opposition motor task fMRI dataset by up to 98% while exhibiting improved spatiotemporal coherence relative to standard tensor, wavelet, and voxel-based approaches. Our clustering technique was capable of automatically differentiating between the frontal areas of the brain responsible for task-related habituation and the motor regions responsible for executing the motor task, in contrast to a widely used fMRI analysis program, SPM, which only detected the latter region. Furthermore, our approach discovered latent concepts suggestive of subject handedness nearly 100x faster than standard approaches. These results suggest that a high-order model is an integral component to accurate scalable functional neuroimaging. PMID:21729758

  12. Research of the multimodal brain-tumor segmentation algorithm

    NASA Astrophysics Data System (ADS)

    Lu, Yisu; Chen, Wufan

    2015-12-01

    It is well-known that the number of clusters is one of the most important parameters for automatic segmentation. However, it is difficult to determine owing to the high diversity in appearance of tumor tissue among different patients and the ambiguous boundaries of lesions. In this study, a nonparametric mixture of Dirichlet process (MDP) model is applied to segment the tumor images, and the MDP segmentation can be performed without the initialization of the number of clusters. A new nonparametric segmentation algorithm combined with anisotropic diffusion and a Markov random field (MRF) smooth constraint is proposed in this study. Besides the segmentation of single-modal brain tumor images, we developed the algorithm to segment multimodal brain tumor images using multimodal magnetic resonance (MR) features and to obtain the active tumor and the edema at the same time. The proposed algorithm is evaluated and compared with other approaches. The accuracy and computation time of our algorithm demonstrate very impressive performance.
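
    The appeal of the Dirichlet-process mixture above is that the number of clusters need not be fixed in advance. The sketch below shows the same general idea using scikit-learn's truncated variational approximation of a Dirichlet-process Gaussian mixture; it is not the authors' MDP + Markov-random-field algorithm, and the feature matrix is a placeholder.

    ```python
    # Hedged sketch: Dirichlet-process mixture that prunes unneeded components.
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    features = np.random.rand(5000, 4)   # placeholder multimodal voxel features
    dpgmm = BayesianGaussianMixture(
        n_components=10,                 # upper bound only; extra components get ~zero weight
        weight_concentration_prior_type="dirichlet_process",
        random_state=0,
    ).fit(features)
    labels = dpgmm.predict(features)
    print("effective number of clusters:", np.unique(labels).size)
    ```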

  13. Individual Human Brain Areas Can Be Identified from Their Characteristic Spectral Activation Fingerprints.

    PubMed

    Keitel, Anne; Gross, Joachim

    2016-06-01

    The human brain can be parcellated into diverse anatomical areas. We investigated whether rhythmic brain activity in these areas is characteristic and can be used for automatic classification. To this end, resting-state MEG data of 22 healthy adults was analysed. Power spectra of 1-s long data segments for atlas-defined brain areas were clustered into spectral profiles ("fingerprints"), using k-means and Gaussian mixture (GM) modelling. We demonstrate that individual areas can be identified from these spectral profiles with high accuracy. Our results suggest that each brain area engages in different spectral modes that are characteristic for individual areas. Clustering of brain areas according to similarity of spectral profiles reveals well-known brain networks. Furthermore, we demonstrate task-specific modulations of auditory spectral profiles during auditory processing. These findings have important implications for the classification of regional spectral activity and allow for novel approaches in neuroimaging and neurostimulation in health and disease.
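
    A minimal sketch of the clustering step described above, assuming the input is a matrix of power spectra (one row per 1-s segment); the number of profiles and the random data are placeholders, and this is not the authors' pipeline.

    ```python
    # Hedged sketch: cluster segment power spectra into spectral profiles.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.mixture import GaussianMixture

    spectra = np.random.rand(2000, 40)        # placeholder: segments x frequency bins
    k = 8                                     # assumed number of spectral profiles
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(spectra)
    gm = GaussianMixture(n_components=k, random_state=0).fit(spectra)
    fingerprints = gm.means_                  # one mean spectrum ("fingerprint") per cluster
    ```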

  14. On the streaming model for redshift-space distortions

    NASA Astrophysics Data System (ADS)

    Kuruvilla, Joseph; Porciani, Cristiano

    2018-06-01

    The streaming model describes the mapping between real and redshift space for 2-point clustering statistics. Its key element is the probability density function (PDF) of line-of-sight pairwise peculiar velocities. Following a kinetic-theory approach, we derive the fundamental equations of the streaming model for ordered and unordered pairs. In the first case, we recover the classic equation while we demonstrate that modifications are necessary for unordered pairs. We then discuss several statistical properties of the pairwise velocities for DM particles and haloes by using a suite of high-resolution N-body simulations. We test the often used Gaussian ansatz for the PDF of pairwise velocities and discuss its limitations. Finally, we introduce a mixture of Gaussians which is known in statistics as the generalised hyperbolic distribution and show that it provides an accurate fit to the PDF. Once inserted in the streaming equation, the fit yields an excellent description of redshift-space correlations at all scales that vastly outperforms the Gaussian and exponential approximations. Using a principal-component analysis, we reduce the complexity of our model for large redshift-space separations. Our results increase the robustness of studies of anisotropic galaxy clustering and are useful for extending them towards smaller scales in order to test theories of gravity and interacting dark-energy models.
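
    To illustrate the distributional comparison made above, the sketch below fits both a Gaussian and a generalised hyperbolic distribution to a sample of line-of-sight pairwise velocities and compares log-likelihoods. It assumes SciPy >= 1.8 (which provides scipy.stats.genhyperbolic); the heavy-tailed placeholder sample stands in for simulation data.

    ```python
    # Hedged sketch: Gaussian vs. generalised hyperbolic fit to pairwise velocities.
    import numpy as np
    from scipy import stats

    v = 300.0 * np.random.standard_t(df=5, size=20000)   # placeholder velocities [km/s]
    mu, sigma = stats.norm.fit(v)
    gh = stats.genhyperbolic.fit(v)                      # shape params (p, a, b), loc, scale
    print("Gaussian log-likelihood:       ", stats.norm.logpdf(v, mu, sigma).sum())
    print("Gen. hyperbolic log-likelihood:", stats.genhyperbolic.logpdf(v, *gh).sum())
    ```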

  15. The role of chemometrics in single and sequential extraction assays: a review. Part II. Cluster analysis, multiple linear regression, mixture resolution, experimental design and other techniques.

    PubMed

    Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo

    2011-03-04

    Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.

  16. Sloshing in its cD halo: MUSE kinematics of the central galaxy NGC 3311 in the Hydra I cluster

    NASA Astrophysics Data System (ADS)

    Barbosa, C. E.; Arnaboldi, M.; Coccato, L.; Gerhard, O.; Mendes de Oliveira, C.; Hilker, M.; Richtler, T.

    2018-01-01

    Context. Early-type galaxies (ETGs) show a strong size evolution with redshift. This evolution is explained by fast "in-situ" star formation at high-z followed by a late mass assembly mostly driven by minor mergers that deposit stars primarily in the outer halo. Aims: We aim to identify the main structural components of the Hydra I cD galaxy NGC 3311 to investigate the connection between the central galaxy and the surrounding stellar halo. Methods: We produce maps of the line-of-sight velocity distribution (LOSVD) moments from a mosaic of MUSE pointings covering NGC 3311 out to 25 kpc. Combining deep photometric and spectroscopic data, we model the LOSVD maps using a finite mixture distribution, including four non-concentric components that are nearly isothermal spheroids, with different line-of-sight systemic velocities V, velocity dispersions σ, and small (constant) values of the higher order Gauss-Hermite moments h3 and h4. Results: The kinemetry analysis indicates that NGC 3311 is classified as a slow rotator, although the galaxy shows a line-of-sight velocity gradient along the photometric major axis. The comparison of the correlations between h3 and h4 with V/σ with simulated galaxies indicates that NGC 3311 assembled mainly through dry mergers. The σ profile rises to ≃ 400 km s-1 at 20 kpc, a significant fraction (0.55) of the Hydra I cluster velocity dispersion, indicating that stars there were stripped from progenitors orbiting in the cluster core. The finite mixture distribution modeling supports three inner components related to the central galaxy and a fourth component with large effective radius (51 kpc) and velocity dispersion (327 km s-1) consistent with a cD envelope. We find that the cD envelope is offset from the center of NGC 3311 both spatially (8.6 kpc) and in velocity (ΔV = 204 km s-1), but coincides with the cluster core X-ray isophotes and the mean velocity of core galaxies. Also, the envelope contributes to the broad wings of the LOSVD measured by large h4 values within 10 kpc. Conclusions: The cD envelope of NGC 3311 is dynamically associated with the cluster core, which in Hydra I is in addition displaced from the cluster center, presumably due to a recent subcluster merger. The combined datacubes are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/609/A78

  17. Noise-robust unsupervised spike sorting based on discriminative subspace learning with outlier handling.

    PubMed

    Keshtkaran, Mohammad Reza; Yang, Zhi

    2017-06-01

    Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and most discriminative features from the spike waveforms and perform clustering with automatic detection of the number of the clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of the clusters. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. By providing more accurate information about the activity of a larger number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.

  18. Noise-robust unsupervised spike sorting based on discriminative subspace learning with outlier handling

    NASA Astrophysics Data System (ADS)

    Keshtkaran, Mohammad Reza; Yang, Zhi

    2017-06-01

    Objective. Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. Approach. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and most discriminative features from the spike waveforms and perform clustering with automatic detection of the number of the clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of the clusters. Main results. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. Significance. By providing more accurate information about the activity of a larger number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.
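
    Both records above describe the same iterative "discriminative subspace + GMM" idea. The sketch below is a deliberately simplified illustration of that loop, with no outlier model and no automatic choice of the number of clusters, unlike the published algorithm; the waveform matrix and cluster count are placeholders.

    ```python
    # Hedged sketch: alternate LDA subspace learning and GMM clustering.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.mixture import GaussianMixture

    def lda_gmm_sort(waveforms, n_clusters=3, n_iter=10, seed=0):
        labels = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=seed).fit_predict(waveforms)
        for _ in range(n_iter):
            lda = LinearDiscriminantAnalysis()                 # subspace from current labels
            feats = lda.fit_transform(waveforms, labels)
            labels = GaussianMixture(n_components=n_clusters,  # re-cluster in that subspace
                                     random_state=seed).fit_predict(feats)
        return labels

    spikes = np.random.randn(1000, 48)   # placeholder spike waveforms (spikes x time samples)
    print(np.bincount(lda_gmm_sort(spikes)))
    ```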

  19. Unsupervised classification of multivariate geostatistical data: Two algorithms

    NASA Astrophysics Data System (ADS)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in the mining and oil industries, spatial datasets are becoming increasingly large, include a growing number of variables, and cover ever wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables at hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on, e.g., Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to moderate sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model-free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinate space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
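
    A compact sketch of the spatially constrained agglomerative idea described above: merges are restricted to a k-nearest-neighbour graph built in the coordinate space, so the resulting classes stay spatially coherent. The coordinates, variables, neighbour count, and cluster count below are placeholders, and this is not the authors' implementation.

    ```python
    # Hedged sketch: hierarchical clustering constrained by a spatial neighbour graph.
    import numpy as np
    from sklearn.neighbors import kneighbors_graph
    from sklearn.cluster import AgglomerativeClustering

    rng = np.random.default_rng(0)
    coords = rng.uniform(0, 100, size=(5000, 2))      # sample locations (x, y)
    values = rng.normal(size=(5000, 3))               # multivariate measurements

    graph = kneighbors_graph(coords, n_neighbors=8, include_self=False)
    model = AgglomerativeClustering(n_clusters=5, connectivity=graph, linkage="ward")
    domains = model.fit_predict(values)               # spatially coherent classes
    ```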

  20. Iterative local Gaussian clustering for expressed genes identification linked to malignancy of human colorectal carcinoma

    PubMed Central

    Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri

    2007-01-01

    Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC) was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer tissues and 11 normal tissues. In this dataset, the ILGC finds three clusters (two large gene clusters and one small one), similar to the results of that study, which used Gaussian mixture clustering. The correlation of each gene cluster with clinical properties of malignancy of human colorectal cancer was analysed: tumor versus normal tissue, the existence of distant metastasis, and the existence of lymph node metastasis. PMID:18305825

  1. Iterative local Gaussian clustering for expressed genes identification linked to malignancy of human colorectal carcinoma.

    PubMed

    Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri

    2007-12-30

    Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC) was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer tissues and 11 normal tissues. In this dataset, the ILGC finds three clusters (two large gene clusters and one small one), similar to the results of that study, which used Gaussian mixture clustering. The correlation of each gene cluster with clinical properties of malignancy of human colorectal cancer was analysed: tumor versus normal tissue, the existence of distant metastasis, and the existence of lymph node metastasis.

  2. MC 2: A Deeper Look at ZwCl 2341.1+0000 with Bayesian Galaxy Clustering and Weak Lensing Analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Benson, B.; Wittman, D. M.; Golovich, N.

    ZwCl 2341.1+0000, a merging galaxy cluster with disturbed X-ray morphology and widely separated (~3 Mpc) double radio relics, was thought to be an extremely massive (10-30 × 10^14 M⊙) and complex system with little known about its merger history. We present JVLA 2-4 GHz observations of the cluster, along with new spectroscopy from our Keck/DEIMOS survey, and apply Gaussian Mixture Modeling to the three-dimensional distribution of 227 confirmed cluster galaxies. After adopting the Bayesian Information Criterion to avoid overfitting, which we discover can bias total dynamical mass estimates high, we find that a three-substructure model with a total dynamical mass estimate of 9.39 ± 0.81 × 10^14 M⊙ is favored. We also present deep Subaru imaging and perform the first weak lensing analysis on this system, obtaining a weak lensing mass estimate of 5.57 ± 2.47 × 10^14 M⊙. This is a more robust estimate because it does not depend on the dynamical state of the system, which is disturbed due to the merger. Our results indicate that ZwCl 2341.1+0000 is a multiple merger system comprised of at least three substructures, with the main merger that produced the radio relics occurring near to the plane of the sky, and a younger merger in the North occurring closer to the line of sight. Dynamical modeling of the main merger reproduces observed quantities (relic positions and polarizations, subcluster separation and radial velocity difference) if the merger axis angle is ~10 (+34/-6) degrees and the collision speed at pericenter is ~1900 (+300/-200) km/s.

  3. MC 2: A Deeper Look at ZwCl 2341.1+0000 with Bayesian Galaxy Clustering and Weak Lensing Analyses

    DOE PAGES

    Benson, B.; Wittman, D. M.; Golovich, N.; ...

    2017-05-16

    ZwCl 2341.1+0000, a merging galaxy cluster with disturbed X-ray morphology and widely separated (~3 Mpc) double radio relics, was thought to be an extremely massive (10-30 × 10^14 M⊙) and complex system with little known about its merger history. We present JVLA 2-4 GHz observations of the cluster, along with new spectroscopy from our Keck/DEIMOS survey, and apply Gaussian Mixture Modeling to the three-dimensional distribution of 227 confirmed cluster galaxies. After adopting the Bayesian Information Criterion to avoid overfitting, which we discover can bias total dynamical mass estimates high, we find that a three-substructure model with a total dynamical mass estimate of 9.39 ± 0.81 × 10^14 M⊙ is favored. We also present deep Subaru imaging and perform the first weak lensing analysis on this system, obtaining a weak lensing mass estimate of 5.57 ± 2.47 × 10^14 M⊙. This is a more robust estimate because it does not depend on the dynamical state of the system, which is disturbed due to the merger. Our results indicate that ZwCl 2341.1+0000 is a multiple merger system comprised of at least three substructures, with the main merger that produced the radio relics occurring near to the plane of the sky, and a younger merger in the North occurring closer to the line of sight. Dynamical modeling of the main merger reproduces observed quantities (relic positions and polarizations, subcluster separation and radial velocity difference) if the merger axis angle is ~10 (+34/-6) degrees and the collision speed at pericenter is ~1900 (+300/-200) km/s.
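
    Both copies of the record above select the number of substructures by combining Gaussian mixture modelling with the Bayesian Information Criterion. The sketch below shows that selection loop on a placeholder (RA, Dec, velocity) sample; it is a generic stand-in, not the analysis code of the paper.

    ```python
    # Hedged sketch: pick the number of substructures by minimizing the BIC.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    xyz = np.random.randn(227, 3)          # placeholder: RA, Dec, line-of-sight velocity
    models = [GaussianMixture(n_components=k, n_init=5, random_state=0).fit(xyz)
              for k in range(1, 7)]
    best = min(models, key=lambda m: m.bic(xyz))
    print("preferred number of substructures:", best.n_components)
    ```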

  4. Deterministic annealing for density estimation by multivariate normal mixtures

    NASA Astrophysics Data System (ADS)

    Kloppenburg, Martin; Tavan, Paul

    1997-03-01

    An approach to maximum-likelihood density estimation by mixtures of multivariate normal distributions for large high-dimensional data sets is presented. Conventionally that problem is tackled by notoriously unstable expectation-maximization (EM) algorithms. We remove these instabilities by the introduction of soft constraints, enabling deterministic annealing. Our developments are motivated by the proof that algorithmically stable fuzzy clustering methods that are derived from statistical physics analogs are special cases of EM procedures.
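
    As a toy illustration of the annealing idea (not the constrained algorithm of the paper), the sketch below runs soft EM-style updates for an isotropic mixture while a temperature parameter is gradually lowered; all parameter values are arbitrary assumptions.

    ```python
    # Illustrative sketch of deterministic annealing for a simple isotropic mixture:
    # soft assignments are computed at a temperature T that is gradually lowered.
    import numpy as np

    def annealed_em(X, k, T0=10.0, Tmin=0.1, cooling=0.9, seed=0):
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), k, replace=False)]
        T = T0
        while T > Tmin:
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # squared distances
            logits = -d2 / (2.0 * T)
            logits -= logits.max(axis=1, keepdims=True)
            resp = np.exp(logits)
            resp /= resp.sum(axis=1, keepdims=True)                     # E-step (soft assignments)
            centers = (resp.T @ X) / resp.sum(axis=0)[:, None]          # M-step (weighted means)
            T *= cooling
        return centers, resp

    X = np.vstack([np.random.randn(200, 2), np.random.randn(200, 2) + [4, 4]])
    centers, resp = annealed_em(X, k=2)
    ```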

  5. Closed-cage tungsten oxide clusters in the gas phase.

    PubMed

    Singh, D M David Jeba; Pradeep, T; Thirumoorthy, Krishnan; Balasubramanian, Krishnan

    2010-05-06

    During the course of a study on the clustering of W-Se and W-S mixtures in the gas phase using laser desorption ionization (LDI) mass spectrometry, we observed several anionic W-O clusters. Three distinct species, W(6)O(19)(-), W(13)O(29)(-), and W(14)O(32)(-), stand out as intense peaks in the regular mass spectral pattern of tungsten oxide clusters suggesting unusual stabilities for them. Moreover, these clusters do not fragment in the postsource decay analysis. While trying to understand the precursor material, which produced these clusters, we found the presence of nanoscale forms of tungsten oxide. The structure and thermodynamic parameters of tungsten clusters have been explored using relativistic quantum chemical methods. Our computed results of atomization energy are consistent with the observed LDI mass spectra. The computational results suggest that the clusters observed have closed-cage structure. These distinct W(13) and W(14) clusters were observed for the first time in the gas phase.

  6. An adaptive data-driven method for accurate prediction of remaining useful life of rolling bearings

    NASA Astrophysics Data System (ADS)

    Peng, Yanfeng; Cheng, Junsheng; Liu, Yanfei; Li, Xuejun; Peng, Zhihua

    2018-06-01

    A novel data-driven method based on Gaussian mixture model (GMM) and distance evaluation technique (DET) is proposed to predict the remaining useful life (RUL) of rolling bearings. The data sets are clustered by GMM to divide all data sets into several health states adaptively and reasonably. The number of clusters is determined by the minimum description length principle. Thus, either the health state of the data sets or the number of the states is obtained automatically. Meanwhile, the abnormal data sets can be recognized during the clustering process and removed from the training data sets. After obtaining the health states, appropriate features are selected by DET for increasing the classification and prediction accuracy. In the prediction process, each vibration signal is decomposed into several components by empirical mode decomposition. Some common statistical parameters of the components are calculated first and then the features are clustered using GMM to divide the data sets into several health states and remove the abnormal data sets. Thereafter, appropriate statistical parameters of the generated components are selected using DET. Finally, least squares support vector machine is utilized to predict the RUL of rolling bearings. Experimental results indicate that the proposed method reliably predicts the RUL of rolling bearings.
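
    To make the "distance evaluation technique" step above concrete, the sketch below scores each feature by the ratio of between-state to within-state mean absolute distance after a GMM has assigned health states. The exact form of the published DET criterion may differ; this is an assumed, simplified version, and the data are placeholders.

    ```python
    # Hedged sketch: GMM health-state assignment followed by a simple
    # distance-evaluation style feature score (assumed form, not the paper's exact criterion).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def det_scores(X, states):
        labels = np.unique(states)
        scores = []
        for j in range(X.shape[1]):
            col = X[:, j]
            within = np.mean([np.abs(np.subtract.outer(col[states == s], col[states == s])).mean()
                              for s in labels])
            centres = np.array([col[states == s].mean() for s in labels])
            between = np.abs(np.subtract.outer(centres, centres)).mean()
            scores.append(between / (within + 1e-12))
        return np.array(scores)

    features = np.random.rand(500, 12)        # placeholder degradation features
    states = GaussianMixture(n_components=3, random_state=0).fit_predict(features)
    print(det_scores(features, states).round(2))
    ```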

  7. Identification of Reliable Components in Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS): a Data-Driven Approach across Metabolic Processes.

    PubMed

    Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun

    2015-11-04

    There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance ((1)H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in the metabolomics datasets.

  8. Widom Lines in Binary Mixtures of Supercritical Fluids.

    PubMed

    Raju, Muralikrishna; Banuti, Daniel T; Ma, Peter C; Ihme, Matthias

    2017-06-08

    Recent experiments on pure fluids have identified distinct liquid-like and gas-like regimes even under supercritical conditions. The supercritical liquid-gas transition is marked by maxima in response functions that define a line emanating from the critical point, referred to as Widom line. However, the structure of analogous state transitions in mixtures of supercritical fluids has not been determined, and it is not clear whether a Widom line can be identified for binary mixtures. Here, we present first evidence for the existence of multiple Widom lines in binary mixtures from molecular dynamics simulations. By considering mixtures of noble gases, we show that, depending on the phase behavior, mixtures transition from a liquid-like to a gas-like regime via distinctly different pathways, leading to phase relationships of surprising complexity and variety. Specifically, we show that miscible binary mixtures have behavior analogous to a pure fluid and the supercritical state space is characterized by a single liquid-gas transition. In contrast, immiscible binary mixture undergo a phase separation in which the clusters transition separately at different temperatures, resulting in multiple distinct Widom lines. The presence of this unique transition behavior emphasizes the complexity of the supercritical state to be expected in high-order mixtures of practical relevance.

  9. Photoinduced nucleation: a novel tool for detecting molecules in air at ultra-low concentrations

    DOEpatents

    Katz, Joseph L.; Lihavainen, Heikki; Rudek, Markus M.; Salter, Brian C.

    2002-01-01

    A method and apparatus for determining the presence of molecules in a gas at concentrations of less than about 100 ppb. Light having wavelengths in the range from about 200 nm to about 350 nm is used to illuminate a flowing sample of the gas causing the molecules if present to form clusters. A mixture of the illuminated gas and a vapor is cooled until the vapor is supersaturated so that there is a small rate of homogeneous nucleation. The supersaturated vapor condenses on the clusters thus causing the clusters to grow to a size sufficient to be counted by light scattering and then the clusters are counted.

  10. The relationship between multilevel models and non-parametric multilevel mixture models: Discrete approximation of intraclass correlation, random coefficient distributions, and residual heteroscedasticity.

    PubMed

    Rights, Jason D; Sterba, Sonya K

    2016-11-01

    Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
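
    The central idea above, that a discrete latent-class model can mimic a continuous random-intercept model, can be sketched for the intraclass correlation as follows; the notation and exact expressions here are generic assumptions, not necessarily the article's own formulas. With K classes of proportion π_k and class intercepts β_0k, the between-cluster variance is approximated by the variance of the class intercepts:

    ```latex
    % Hedged sketch of the discrete approximation of the intraclass correlation.
    \[
    \rho_{\text{MLM}} = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}},
    \qquad
    \hat{\tau}_{00} \approx \sum_{k=1}^{K} \pi_k \left(\beta_{0k} - \bar{\beta}_0\right)^{2},
    \quad
    \bar{\beta}_0 = \sum_{k=1}^{K} \pi_k \beta_{0k},
    \qquad
    \hat{\rho}_{\text{NPMM}} = \frac{\hat{\tau}_{00}}{\hat{\tau}_{00} + \hat{\sigma}^{2}}.
    \]
    ```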

  11. Human recognition based on head-shoulder contour extraction and BP neural network

    NASA Astrophysics Data System (ADS)

    Kong, Xiao-fang; Wang, Xiu-qin; Gu, Guohua; Chen, Qian; Qian, Wei-xian

    2014-11-01

    In practical application scenarios such as video surveillance and human-computer interaction, human body movements are uncertain because the human body is a non-rigid object. Because the head-shoulder part of the body is less affected by movement and is seldom obscured by other objects, a head-shoulder model with stable characteristics can serve as a detection feature for describing the human body in detection and recognition tasks. To extract the head-shoulder contour accurately, this paper proposes a method for establishing a head-shoulder model that combines edge detection with mean-shift clustering. First, an adaptive Gaussian mixture background-update method is used to extract targets from the video sequence. Second, edge detection is used to extract the contour of moving objects, and the mean-shift algorithm is applied to cluster parts of the target's contour. Third, the head-shoulder model is established from the width-to-height ratio of the head-shoulder region combined with the projection histogram of the binary image, and the eigenvectors of the head-shoulder contour are acquired. Finally, a back-propagation (BP) neural network classifier is trained to relate the head-shoulder contour eigenvectors to the moving objects, so that the head-shoulder model can be used for human detection and recognition. Experiments show that the proposed combination of edge detection and the mean-shift algorithm extracts the complete head-shoulder contour with low computational complexity and high efficiency.
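
    A hedged sketch of the front end of the pipeline described above: adaptive Gaussian mixture background subtraction, edge detection on the foreground, and mean-shift clustering of the edge points. The head-shoulder ratio features and the BP network are not reproduced, and the thresholds and bandwidth are assumptions.

    ```python
    # Hedged sketch: MOG2 background subtraction + Canny edges + mean-shift clustering.
    import cv2
    import numpy as np
    from sklearn.cluster import MeanShift

    backsub = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

    def contour_clusters(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        fg = backsub.apply(frame_bgr)                      # adaptive Gaussian mixture background update
        edges = cv2.Canny(cv2.bitwise_and(gray, gray, mask=fg), 50, 150)
        pts = np.column_stack(np.nonzero(edges)).astype(float)
        if len(pts) < 10:
            return None
        return MeanShift(bandwidth=20).fit(pts).labels_    # clusters of contour points
    ```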

  12. Uncertainty quantification and experimental design based on unsupervised machine learning identification of contaminant sources and groundwater types using hydrogeochemical data

    NASA Astrophysics Data System (ADS)

    Vesselinov, V. V.

    2017-12-01

    Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical species. Numerous geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. As a result, these types of model analyses are typically extremely challenging. Here, we demonstrate a new contaminant source identification approach that performs decomposition of the observation mixtures based on Nonnegative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the number of groundwater types and (b) the original geochemical concentration of the contaminant sources from measured geochemical mixtures with unknown mixing ratios without any additional site information. We also demonstrate how NMFk can be extended to perform uncertainty quantification and experimental design related to real-world site characterization. The NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios). The NMFk algorithm has been extensively tested on synthetic datasets; NMFk analyses have been actively performed on real-world data collected at the Los Alamos National Laboratory (LANL) groundwater sites related to Chromium and RDX contamination.
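
    The blind-source-separation step described above can be illustrated, in a much simplified form that is not the full NMFk machinery, by factoring an observations-by-constituents concentration matrix into source signatures and mixing ratios with non-negative matrix factorization; the synthetic data and the assumed number of end-members below are placeholders.

    ```python
    # Hedged sketch: NMF unmixing of synthetic geochemical observations.
    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.default_rng(0)
    true_sources = rng.uniform(size=(3, 12))            # 3 end-members, 12 constituents
    mixing = rng.dirichlet(np.ones(3), size=50)         # unknown mixing ratios per well
    V = mixing @ true_sources                           # observed geochemical mixtures

    model = NMF(n_components=3, init="nndsvda", max_iter=2000, random_state=0)
    W = model.fit_transform(V)                          # estimated mixing ratios
    H = model.components_                               # estimated source signatures
    ```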

  13. The Red-giant Branch Bump Revisited: Constraints on Envelope Overshooting in a Wide Range of Masses and Metallicities

    NASA Astrophysics Data System (ADS)

    Khan, Saniya; Hall, Oliver J.; Miglio, Andrea; Davies, Guy R.; Mosser, Benoît; Girardi, Léo; Montalbán, Josefina

    2018-06-01

    The red-giant branch bump provides valuable information for the investigation of the internal structure of low-mass stars. Because current models are unable to accurately predict the occurrence and efficiency of mixing processes beyond convective boundaries, one can use the luminosity of the bump, a diagnostic of the maximum extension of the convective envelope during the first dredge-up, as a calibrator for such processes. By combining asteroseismic and spectroscopic constraints, we expand the analysis of the bump to masses and metallicities beyond those previously accessible using globular clusters. Our data set comprises nearly 3000 red-giant stars observed by Kepler and with APOGEE spectra. Using statistical mixture models, we are able to detect the bump in the average seismic parameters ν_max and ⟨Δν⟩, and show that its observed position reveals general trends with mass and metallicity in line with expectations from models. Moreover, our analysis indicates that standard stellar models underestimate the depth of efficiently mixed envelopes. The inclusion of significant overshooting from the base of the convective envelope, with an efficiency that increases with decreasing metallicity, allows us to reproduce the observed location of the bump. Interestingly, this trend was also reported in previous studies of globular clusters.

  14. Size exclusion chromatography for semipreparative scale separation of Au38(SR)24 and Au40(SR)24 and larger clusters.

    PubMed

    Knoppe, Stefan; Boudon, Julien; Dolamic, Igor; Dass, Amala; Bürgi, Thomas

    2011-07-01

    Size exclusion chromatography (SEC) on a semipreparative scale (10 mg and more) was used to size-select ultrasmall gold nanoclusters (<2 nm) from polydisperse mixtures. In particular, the ubiquitous byproducts of the etching process toward Au(38)(SR)(24) (SR, thiolate) clusters were separated and obtained with high monodispersity (based on mass spectrometry). The isolated fractions were characterized by UV-vis spectroscopy, MALDI mass spectrometry, HPLC, and electron microscopy. Most notably, the separation of Au(38)(SR)(24) and Au(40)(SR)(24) clusters is demonstrated.

  15. Dynamics, Chemical Abundances, and ages of Globular Clusters in the Virgo Cluster of Galaxies

    NASA Astrophysics Data System (ADS)

    Guhathakurta, Puragra; NGVS Collaboration

    2018-01-01

    We present a study of the dynamics, metallicities, and ages of globular clusters (GCs) in the Next Generation Virgo cluster Survey (NGVS), a deep, multi-band (u, g, r, i, z, and Ks), wide-field (104 deg^2) imaging survey carried out using the 3.6-m Canada-France-Hawaii Telescope and MegaCam imager. GC candidates were selected from the NGVS survey using photometric and image morphology criteria, and these were followed up with deep, medium-resolution, multi-object spectroscopy using the Keck II 10-m telescope and DEIMOS spectrograph. The primary spectroscopic targets were candidate GC satellites of dwarf elliptical (dE) and ultra-diffuse galaxies (UDGs) in the Virgo cluster. While many objects were confirmed as GC satellites of Virgo dEs and UDGs, many turned out to be non-satellites based on their radial velocity and/or positional mismatch with any identifiable Virgo cluster galaxy. We have used a combination of spectral characteristics (e.g., presence of absorption vs. emission lines), new Gaussian mixture modeling of radial velocity and sky position data, and a new extreme deconvolution analysis of ugrizKs photometry and image morphology, to classify all the objects in our sample into: (1) GC satellites of dE galaxies, (2) GC satellites of UDGs, (3) intra-cluster GCs (ICGCs) in the Virgo cluster, (4) GCs in the outer halo of the central cluster galaxy M87, (5) foreground Milky Way stars, and (6) distant background galaxies. We use these data to study the dynamics and dark matter content of dEs and UDGs in the Virgo cluster, place important constraints on the nature of dE nuclei, and study the origin of ICGCs versus GCs in the remote M87 halo. We are grateful for financial support from the NSF and NASA/STScI.

  16. Personal exposure to mixtures of volatile organic compounds: modeling and further analysis of the RIOPA data.

    PubMed

    Batterman, Stuart; Su, Feng-Chiao; Li, Shi; Mukherjee, Bhramar; Jia, Chunrong

    2014-06-01

    Emission sources of volatile organic compounds (VOCs*) are numerous and widespread in both indoor and outdoor environments. Concentrations of VOCs indoors typically exceed outdoor levels, and most people spend nearly 90% of their time indoors. Thus, indoor sources generally contribute the majority of VOC exposures for most people. VOC exposure has been associated with a wide range of acute and chronic health effects; for example, asthma, respiratory diseases, liver and kidney dysfunction, neurologic impairment, and cancer. Although exposures to most VOCs for most persons fall below health-based guidelines, and long-term trends show decreases in ambient emissions and concentrations, a subset of individuals experience much higher exposures that exceed guidelines. Thus, exposure to VOCs remains an important environmental health concern. The present understanding of VOC exposures is incomplete. With the exception of a few compounds, concentration and especially exposure data are limited; and like other environmental data, VOC exposure data can show multiple modes, low and high extreme values, and sometimes a large portion of data below method detection limits (MDLs). Field data also show considerable spatial or interpersonal variability, and although evidence is limited, temporal variability seems high. These characteristics can complicate modeling and other analyses aimed at risk assessment, policy actions, and exposure management. In addition to these analytic and statistical issues, exposure typically occurs as a mixture, and mixture components may interact or jointly contribute to adverse effects. However most pollutant regulations, guidelines, and studies remain focused on single compounds, and thus may underestimate cumulative exposures and risks arising from coexposures. In addition, the composition of VOC mixtures has not been thoroughly investigated, and mixture components show varying and complex dependencies. Finally, although many factors are known to affect VOC exposures, many personal, environmental, and socioeconomic determinants remain to be identified, and the significance and applicability of the determinants reported in the literature are uncertain. To help answer these unresolved questions and overcome limitations of previous analyses, this project used several novel and powerful statistical modeling and analysis techniques and two large data sets. The overall objectives of this project were (1) to identify and characterize exposure distributions (including extreme values), (2) evaluate mixtures (including dependencies), and (3) identify determinants of VOC exposure. METHODS VOC data were drawn from two large data sets: the Relationships of Indoor, Outdoor, and Personal Air (RIOPA) study (1999-2001) and the National Health and Nutrition Examination Survey (NHANES; 1999-2000). The RIOPA study used a convenience sample to collect outdoor, indoor, and personal exposure measurements in three cities (Elizabeth, NJ; Houston, TX; Los Angeles, CA). In each city, approximately 100 households with adults and children who did not smoke were sampled twice for 18 VOCs. In addition, information about 500 variables associated with exposure was collected. The NHANES used a nationally representative sample and included personal VOC measurements for 851 participants. NHANES sampled 10 VOCs in common with RIOPA. Both studies used similar sampling methods and study periods. Specific Aim 1. 
To estimate and model extreme value exposures, extreme value distribution models were fitted to the top 10% and 5% of VOC exposures. Health risks were estimated for individual VOCs and for three VOC mixtures. Simulated extreme value data sets, generated for each VOC and for fitted extreme value and lognormal distributions, were compared with measured concentrations (RIOPA observations) to evaluate each model's goodness of fit. Mixture distributions were fitted with the conventional finite mixture of normal distributions and the semi-parametric Dirichlet process mixture (DPM) of normal distributions for three individual VOCs (chloroform, 1,4-DCB, and styrene). Goodness of fit for these full distribution models was also evaluated using simulated data. Specific Aim 2. Mixtures in the RIOPA VOC data set were identified using positive matrix factorization (PMF) and by toxicologic mode of action. Dependency structures of a mixture's components were examined using mixture fractions and were modeled using copulas, which address correlations of multiple components across their entire distributions. Five candidate copulas (Gaussian, t, Gumbel, Clayton, and Frank) were evaluated, and the performance of fitted models was evaluated using simulation and mixture fractions. Cumulative cancer risks were calculated for mixtures, and results from copulas and multivariate lognormal models were compared with risks based on RIOPA observations. Specific Aim 3. Exposure determinants were identified using stepwise regressions and linear mixed-effects models (LMMs). Specific Aim 1. Extreme value exposures in RIOPA typically were best fitted by three-parameter generalized extreme value (GEV) distributions, and sometimes by the two-parameter Gumbel distribution. In contrast, lognormal distributions significantly underestimated both the level and likelihood of extreme values. Among the VOCs measured in RIOPA, 1,4-dichlorobenzene (1,4-DCB) was associated with the greatest cancer risks; for example, for the highest 10% of measurements of 1,4-DCB, all individuals had risk levels above 10(-4), and 13% of all participants had risk levels above 10(-2). Of the full-distribution models, the finite mixture of normal distributions with two to four clusters and the DPM of normal distributions had superior performance in comparison with the lognormal models. DPM distributions provided slightly better fit than the finite mixture distributions; the advantages of the DPM model were avoiding certain convergence issues associated with the finite mixture distributions, adaptively selecting the number of needed clusters, and providing uncertainty estimates. Although the results apply to the RIOPA data set, GEV distributions and mixture models appear more broadly applicable. These models can be used to simulate VOC distributions, which are neither normally nor lognormally distributed, and they accurately represent the highest exposures, which may have the greatest health significance. Specific Aim 2. Four VOC mixtures were identified and apportioned by PMF; they represented gasoline vapor, vehicle exhaust, chlorinated solvents and disinfection byproducts, and cleaning products and odorants. The last mixture (cleaning products and odorants) accounted for the largest fraction of an individual's total exposure (average of 42% across RIOPA participants). Often, a single compound dominated a mixture but the mixture fractions were heterogeneous; that is, the fractions of the compounds changed with the concentration of the mixture. 
Three VOC mixtures were identified by toxicologic mode of action and represented VOCs associated with hematopoietic, liver, and renal tumors. Estimated lifetime cumulative cancer risks exceeded 10(-3) for about 10% of RIOPA participants. The dependency structures of the VOC mixtures in the RIOPA data set fitted Gumbel (two mixtures) and t copulas (four mixtures). These copula types emphasize dependencies found in the upper and lower tails of a distribution. The copulas reproduced both risk predictions and exposure fractions with a high degree of accuracy and performed better than multivariate lognormal distributions. Specific Aim 3. In an analysis focused on the home environment and the outdoor (close to home) environment, home VOC concentrations dominated personal exposures (66% to 78% of the total exposure, depending on VOC); this was largely the result of the amount of time participants spent at home and the fact that indoor concentrations were much higher than outdoor concentrations for most VOCs. In a different analysis focused on the sources inside the home and outside (but close to the home), it was assumed that 100% of VOCs from outside sources would penetrate the home. Outdoor VOC sources accounted for 5% (d-limonene) to 81% (carbon tetrachloride [CTC]) of the total exposure. Personal exposure and indoor measurements had similar determinants depending on the VOC. Gasoline-related VOCs (e.g., benzene and methyl tert-butyl ether [MTBE]) were associated with city, residences with attached garages, pumping gas, wind speed, and home air exchange rate (AER). Odorant and cleaning-related VOCs (e.g., 1,4-DCB and chloroform) also were associated with city, and a residence's AER, size, and family members showering. Dry-cleaning and industry-related VOCs (e.g., tetrachloroethylene [or perchloroethylene, PERC] and trichloroethylene [TCE]) were associated with city, type of water supply to the home, and visits to the dry cleaner. These and other relationships were significant, they explained from 10% to 40% of the variance in the measurements, and are consistent with known emission sources and those reported in the literature. Outdoor concentrations of VOCs had only two determinants in common: city and wind speed. Overall, personal exposure was dominated by the home setting, although a large fraction of indoor VOC concentrations were due to outdoor sources. City of residence, personal activities, household characteristics, and meteorology were significant determinants. Concentrations in RIOPA were considerably lower than levels in the nationally representative NHANES for all VOCs except MTBE and 1,4-DCB. Differences between RIOPA and NHANES results can be explained by contrasts between the sampling designs and staging in the two studies, and by differences in the demographics, smoking, employment, occupations, and home locations. (ABSTRACT TRUNCATED)
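    As an illustration of the extreme-value step described above (not code from the study), the sketch below fits a three-parameter GEV to the top 10% of a synthetic exposure sample with scipy and compares its upper tail with a lognormal fit; the concentrations and parameter values are hypothetical.

```python
# Minimal sketch: fit a generalized extreme value (GEV) distribution to the
# top 10% of a set of exposure concentrations and compare its upper tail with
# a lognormal fit. Data here are synthetic, not RIOPA measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
conc = rng.lognormal(mean=0.5, sigma=1.2, size=2000)   # hypothetical exposures (ug/m^3)

threshold = np.quantile(conc, 0.90)
extremes = conc[conc >= threshold]                     # top 10% of measurements

# Three-parameter GEV (shape c, location, scale) fitted by maximum likelihood.
c, loc, scale = stats.genextreme.fit(extremes)

# Lognormal fitted to the full data set for comparison.
s, ln_loc, ln_scale = stats.lognorm.fit(conc, floc=0)

p99 = 0.99
print("GEV 99th percentile of extremes:", stats.genextreme.ppf(p99, c, loc, scale))
print("Lognormal 99th percentile:      ", stats.lognorm.ppf(p99, s, ln_loc, ln_scale))
```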

  17. Solvent effects on the polar network of ionic liquid solutions

    NASA Astrophysics Data System (ADS)

    Bernardes, Carlos E. S.; Shimizu, Karina; Canongia Lopes, José N.

    2015-05-01

    Molecular dynamics simulations were used to probe mixtures of ionic liquids (ILs) with common molecular solvents. Four types of systems were considered: (i) 1-ethyl-3-methylimidazolium bis(trifluoromethylsulfonyl)imide plus benzene, hexafluorobenzene or 1,2-difluorobenzene mixtures; (ii) choline-based ILs plus ether mixtures; (iii) choline-based ILs plus n-alkanol mixtures; and (iv) 1-butyl-3-methylimidazolium nitrate and 1-ethyl-3-methylimidazolium ethyl sulfate aqueous mixtures. The results produced a wealth of structural and aggregation information that highlights the resilience of the polar network of the ILs (formed by clusters of alternating ions and counter-ions) to the addition of different types of molecular solvent. The analysis of the MD data also shows that the intricate balance between different types of interaction (electrostatic, van der Waals, H-bond-like) between the different species present in the mixtures has a profound effect on the morphology of the mixtures at a mesoscopic scale. In the case of the IL aqueous solutions, the present results suggest an alternative interpretation for very recently published x-ray and neutron diffraction data on similar systems.

  18. Application of a fuzzy neural network model in predicting polycyclic aromatic hydrocarbon-mediated perturbations of the Cyp1b1 transcriptional regulatory network in mouse skin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Larkin, Andrew; Department of Statistics, Oregon State University; Superfund Research Center, Oregon State University

    2013-03-01

    Polycyclic aromatic hydrocarbons (PAHs) are present in the environment as complex mixtures with components that have diverse carcinogenic potencies and mostly unknown interactive effects. Non-additive PAH interactions have been observed in regulation of cytochrome P450 (CYP) gene expression in the CYP1 family. To better understand and predict biological effects of complex mixtures, such as environmental PAHs, an 11-gene-input, 1-gene-output fuzzy neural network (FNN) was developed for predicting PAH-mediated perturbations of dermal Cyp1b1 transcription in mice. Input values were generalized using fuzzy logic into low, medium, and high fuzzy subsets, and sorted using k-means clustering to create Mamdani logic functions for predicting Cyp1b1 mRNA expression. Model testing was performed with data from microarray analysis of skin samples from FVB/N mice treated with toluene (vehicle control), dibenzo[def,p]chrysene (DBC), benzo[a]pyrene (BaP), or 1 of 3 combinations of diesel particulate extract (DPE), coal tar extract (CTE) and cigarette smoke condensate (CSC) using leave-one-out cross-validation. Predictions were within 1 log2 fold change unit of microarray data, with the exception of the DBC treatment group, where the unexpected down-regulation of Cyp1b1 expression was predicted but did not reach statistical significance on the microarrays. Adding CTE to DPE was predicted to increase Cyp1b1 expression, whereas adding CSC to CTE and DPE was predicted to have no effect, in agreement with microarray results. The aryl hydrocarbon receptor repressor (Ahrr) was determined to be the most significant input variable for model predictions using back-propagation and normalization of FNN weights. - Highlights: ► Tested a model to predict PAH mixture-mediated changes in Cyp1b1 expression ► Quantitative predictions in agreement with microarrays for Cyp1b1 induction ► Unexpected difference in expression between DBC and other treatments predicted ► Model predictions for combining PAH mixtures in agreement with microarrays ► Predictions highly dependent on aryl hydrocarbon receptor repressor expression.
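    The fuzzification-plus-clustering step described above can be illustrated with a minimal sketch: each input is mapped to low/medium/high triangular memberships and the fuzzified features are grouped with k-means. This is only a schematic stand-in for the authors' FNN; the data, membership functions, and cluster count are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(12)
X = rng.normal(size=(60, 11))          # placeholder expression values for 11 input genes

def fuzzify(x):
    """Triangular low/medium/high memberships for values scaled to [0, 1]."""
    x = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))
    low = np.clip(1 - 2 * x, 0, 1)
    high = np.clip(2 * x - 1, 0, 1)
    medium = 1 - low - high
    return np.concatenate([low, medium, high], axis=1)

F = fuzzify(X)                         # 60 samples x 33 fuzzy features
km = KMeans(n_clusters=4, n_init=10, random_state=12).fit(F)
print("cluster sizes:", np.bincount(km.labels_))
```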

  19. Measurement and description of underlying dimensions of comorbid mental disorders using Factor Mixture Models: results of the ESEMeD project.

    PubMed

    Almansa, Josué; Vermunt, Jeroen K; Forero, Carlos G; Vilagut, Gemma; De Graaf, Ron; De Girolamo, Giovanni; Alonso, Jordi

    2011-06-01

    Epidemiological studies on mental health and mental comorbidity are usually based on prevalences and correlations between disorders, or some other form of bivariate clustering of disorders. In this paper, we propose a Factor Mixture Model (FMM) methodology based on conceptual models aiming to measure and summarize distinctive disorder information in the internalizing and externalizing dimensions. This methodology includes explicit modelling of subpopulations with and without 12 month disorders ("ill" and "healthy") by means of latent classes, as well as assessment of model invariance and estimation of dimensional scores. We applied this methodology with an internalizing/externalizing two-factor model, to a representative sample gathered in the European Study of the Epidemiology of Mental Disorders (ESEMeD) study -- which includes 8796 individuals from six countries, and used the CIDI 3.0 instrument for disorder assessment. Results revealed that southern European countries have significantly higher mental health levels concerning internalizing/externalizing disorders than central countries; males suffered more externalizing disorders than women did, and conversely, internalizing disorders were more frequent in women. Differences in mental-health level between socio-demographic groups were due to different proportions of healthy and ill individuals and, noticeably, to the ameliorating influence of marital status on severity. An advantage of latent model-based scores is that the inclusion of additional mental-health dimensional information -- other than diagnostic data -- allows for greater precision within a target range of scores. Copyright © 2011 John Wiley & Sons, Ltd.

  20. A Theoretical Investigation of the Plausibility of Reactions Between Ammonia and Carbonyl Species (Formaldehyde, Acetaldehyde, and Acetone) in Interstellar Ice Analogs at Ultracold Temperatures

    NASA Technical Reports Server (NTRS)

    Chen, Lina; Woon, David E.

    2011-01-01

    We have reexamined the reaction between formaldehyde and ammonia, which was previously studied by us and other workers in modestly sized cluster calculations. Larger model systems with up to 12H2O were employed, and reactions of two more carbonyl species, acetaldehyde and acetone, were also carried out. Calculations were performed at the B3LYP/6-31+G** level with bulk solvent effects treated with a polarizable continuum model; limited MP2/6-31+G** calculations were also performed. We found that while the barrier for the concerted proton relay mechanism described in previous work remains modest, it is still prohibitively high for the reaction to occur under the ultracold conditions that prevail in dense interstellar clouds. However, a new pathway emerged in more realistic clusters that involves at least one barrierless step for two of the carbonyl species considered here: ammonia reacts with formaldehyde and acetaldehyde to form a partial charge transfer species in small clusters (4H2O) and a protonated hydroxyamino intermediate species in large clusters (9H2O, 12H2O); modest barriers that decrease sharply with cluster size are found for the analogous processes for the acetone-NH3 reaction. Furthermore, if a second ammonia replaces one of the water molecules in calculations in the 9H2O clusters, deprotonation can occur to yield the same neutral hydroxyamino species that is formed via the original concerted proton relay mechanism. In at least one position, deprotonation is barrierless when zero-point energy is included. In addition to describing the structures and energetics of the reactions between formaldehyde, acetaldehyde, and acetone with ammonia, we report spectroscopic predictions of the observable vibrational features that are expected to be present in ice mixtures of different composition.

  1. Synthesis and characterization of Fe colloid catalysts in inverse micelle solutions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martino, A.; Stoker, M.; Hicks, M.

    1995-12-31

    Surfactant molecules, possessing a hydrophilic head group and a hydrophobic tail group, aggregate in various solvents to form structured solutions. In two-component mixtures of surfactant and organic solvents (e.g., toluene and alkanes), surfactants aggregate to form inverse micelles. Here, the hydrophilic head groups shield themselves by forming a polar core, and the hydrophobic tail groups are free to move about in the surrounding oleic phase. The formation of Fe clusters in inverse micelles was studied. Iron salts are solubilized within the polar interior of inverse micelles, and the addition of the reducing agent LiBH4 initiates a chemical reduction to produce monodisperse, nanometer-sized Fe-based particles. The reaction sequence is sustained by material exchange between inverse micelles. The surfactant interface provides a spatial constraint on the reaction volume, and reactions carried out in these micro-heterogeneous solutions produce colloid-sized particles (10-100 Å) stabilized in solution against flocculation by the surfactant. The clusters were characterized with respect to size with transmission electron microscopy (TEM) and with respect to chemical composition with Mössbauer spectroscopy, electron diffraction, and x-ray photoelectron spectroscopy (XPS). In addition, these iron-based clusters were tested for catalytic activity in a model hydrogenolysis reaction. The hydrogenolysis of naphthyl bibenzyl methane was used as a model for coal pyrolysis.

  2. Topic detection using paragraph vectors to support active learning in systematic reviews.

    PubMed

    Hashimoto, Kazuma; Kontonatsios, Georgios; Miwa, Makoto; Ananiadou, Sophia

    2016-08-01

    Systematic reviews require expert reviewers to manually screen thousands of citations in order to identify all relevant articles to the review. Active learning text classification is a supervised machine learning approach that has been shown to significantly reduce the manual annotation workload by semi-automating the citation screening process of systematic reviews. In this paper, we present a new topic detection method that induces an informative representation of studies, to improve the performance of the underlying active learner. Our proposed topic detection method uses a neural network-based vector space model to capture semantic similarities between documents. We firstly represent documents within the vector space, and cluster the documents into a predefined number of clusters. The centroids of the clusters are treated as latent topics. We then represent each document as a mixture of latent topics. For evaluation purposes, we employ the active learning strategy using both our novel topic detection method and a baseline topic model (i.e., Latent Dirichlet Allocation). Results obtained demonstrate that our method is able to achieve a high sensitivity of eligible studies and a significantly reduced manual annotation cost when compared to the baseline method. This observation is consistent across two clinical and three public health reviews. The tool introduced in this work is available from https://nactem.ac.uk/pvtopic/. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
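    A minimal sketch of the clustering step described above: document vectors (random placeholders standing in for paragraph-vector embeddings) are clustered with k-means, the centroids are treated as latent topics, and each document is re-expressed as a mixture over those topics. The softmax weighting is one simple choice, not necessarily the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
doc_vectors = rng.normal(size=(500, 50))   # placeholder for paragraph-vector embeddings

n_topics = 10
km = KMeans(n_clusters=n_topics, n_init=10, random_state=1).fit(doc_vectors)
centroids = km.cluster_centers_            # treated as latent topics

# Represent each document as a mixture of latent topics: softmax over
# negative distances to the centroids (one simple choice of weighting).
dists = np.linalg.norm(doc_vectors[:, None, :] - centroids[None, :, :], axis=2)
weights = np.exp(-dists)
topic_mixture = weights / weights.sum(axis=1, keepdims=True)

print(topic_mixture.shape)        # (500, 10); rows sum to 1
print(topic_mixture[0].round(3))
```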

  3. MYStIX: Dynamical evolution of young clusters

    NASA Astrophysics Data System (ADS)

    Kuhn, Michael A.

    2014-08-01

    The spatial structure of young stellar clusters in Galactic star-forming regions provides insight into these clusters' dynamical evolution, a topic with implications for open questions in star formation and cluster survival. The Massive Young Star-Forming Complex Study in Infrared and X-ray (MYStIX) provides a sample of >30,000 young stars in star-forming regions (d<3.6 kpc) that contain at least one O-type star. We use finite mixture model analysis to identify subclusters of stars and determine their properties, including subcluster radii, intrinsic numbers of stars, central density, ellipticity, obscuration, and age. In 17 MYStIX regions we find 142 subclusters, with diverse radii and densities and age spreads of up to ~1 Myr in a region. There is a strong negative correlation between subcluster radius and density, which indicates that embedded subclusters expand but also gain stars as they age. Subcluster expansion is also shown by a positive radius-age correlation, which indicates that subclusters are expanding at <1 km/s. The subcluster ellipticity distribution and number-density relation show signs of a hierarchical merger scenario, whereby young stellar clusters are built up through mergers of smaller clumps, causing evolution from a clumpy spatial distribution of stars (seen in some regions) to a simpler distribution of stars (seen in other regions). Many of the simple young stellar clusters show signs of dynamical relaxation, even though they are not old enough for this to have occurred through two-body interactions. However, this apparent contradiction might be explained if small subclusters, which have shorter dynamical relaxation times, can produce dynamically relaxed clusters through hierarchical mergers.
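    The subcluster identification can be illustrated with a simplified stand-in for the MYStIX mixture analysis: a plain Gaussian mixture fitted to synthetic sky positions, with the number of subclusters chosen by BIC and a crude ellipticity estimate taken from each component's covariance. All positions and parameters are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Synthetic sky positions (parsecs) for three overlapping subclusters.
xy = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(300, 2)),
    rng.normal([2.0, 1.0], 0.5, size=(200, 2)),
    rng.normal([1.0, -1.5], 0.2, size=(150, 2)),
])

# Choose the number of subclusters by BIC.
models = [GaussianMixture(n_components=k, covariance_type="full", random_state=2).fit(xy)
          for k in range(1, 8)]
best = min(models, key=lambda m: m.bic(xy))
labels = best.predict(xy)

print("subclusters found:", best.n_components)
for k, (mean, cov) in enumerate(zip(best.means_, best.covariances_)):
    # Ellipticity proxy from the covariance eigenvalues of each component.
    evals = np.linalg.eigvalsh(cov)
    print(f"subcluster {k}: centre={mean.round(2)}, axis ratio={np.sqrt(evals[0] / evals[1]):.2f}")
```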

  4. A comparative study on the interaction of acridine and synthetic bis-acridine with G-quadruplex structure.

    PubMed

    Nagesh, Narayana; Krishnaiah, Abburi

    2003-07-31

    DNA from the telomeres contains a stretch of simple tandemly repeated sequences in which clusters of G residues alternate with clusters of T/A sequences along one DNA strand. Model telomeric G-clusters form four-stranded structures in the presence of Na(I), K(I) and NH(4)(I) ions. Electrophoretic and spectroscopic studies were made with the telomere-related sequences d(T6G16) or d(G4T2G4T2G4T2G4). It was noticed earlier that G-quadruplexes may be either inter-molecular, intra-molecular, or a mixture of both. CD spectral characteristics of various G-quadruplex DNAs suggest that the CD maximum at 293 nm corresponds to that of an intra-molecular G-quadruplex structure or hairpin dimers. Fluorescence titration studies also show that acridine and bis-acridine interact with G-quadruplex DNA and destabilize the K(I)-quadruplex structure more efficiently than the quadruplex formed by the NH(4)(I) ion. Of the two drugs studied, acridine is more capable of breaking the G-quadruplex structure than bis-acridine. This result is further confirmed by the CD experiments.

  5. Designing a robust activity recognition framework for health and exergaming using wearable sensors.

    PubMed

    Alshurafa, Nabil; Xu, Wenyao; Liu, Jason J; Huang, Ming-Chun; Mortazavi, Bobak; Roberts, Christian K; Sarrafzadeh, Majid

    2014-09-01

    Detecting human activity independent of intensity is essential in many applications, primarily in calculating metabolic equivalent rates and extracting human context awareness. Many classifiers that train on an activity at a subset of intensity levels fail to recognize the same activity at other intensity levels. This demonstrates a weakness in the underlying classification method. Training a classifier for an activity at every intensity level is also not practical. In this paper, we tackle a novel intensity-independent activity recognition problem where the class labels exhibit large variability, the data are of high dimensionality, and clustering algorithms are necessary. We propose a new robust stochastic approximation framework for enhanced classification of such data. Experiments are reported using two clustering techniques, K-Means and Gaussian Mixture Models. The stochastic approximation algorithm consistently outperforms other well-known classification schemes, which validates the use of our proposed clustered data representation. We verify the motivation of our framework in two applications that benefit from intensity-independent activity recognition. The first application shows how our framework can be used to enhance energy expenditure calculations. The second application is a novel exergaming environment aimed at using games to reward physical activity performed throughout the day, to encourage a healthy lifestyle.

  6. Effector role reversal during evolution: the case of frataxin in Fe-S cluster biosynthesis†

    PubMed Central

    Bridwell-Rabb, Jennifer; Iannuzzi, Clara; Pastore, Annalisa; Barondeau, David P.

    2012-01-01

    Human frataxin (FXN) has been intensively studied since the discovery that the FXN gene is associated with the neurodegenerative disease Friedreich’s ataxia. Human FXN is a component of the NFS1-ISD11-ISCU2-FXN (SDUF) core Fe-S assembly complex and activates the cysteine desulfurase and Fe-S cluster biosynthesis reactions. In contrast, the Escherichia coli FXN homolog CyaY inhibits Fe-S cluster biosynthesis. To resolve this discrepancy, enzyme kinetic experiments were performed for the human and E. coli systems in which analogous cysteine desulfurase, Fe-S assembly scaffold, and frataxin components were interchanged. Surprisingly, our results reveal that activation or inhibition by the frataxin homolog is determined by which cysteine desulfurase is present and not by the identity of the frataxin homolog. These data are consistent with a model in which the frataxin-less Fe-S assembly complex exists as a mixture of functional and nonfunctional states, which are stabilized by binding of frataxin homologs. Intriguingly, this appears to be an unusual example in which modifications to an enzyme during evolution inverts or reverses the mode of control imparted by a regulatory molecule. PMID:22352884

  7. Effector role reversal during evolution: the case of frataxin in Fe-S cluster biosynthesis.

    PubMed

    Bridwell-Rabb, Jennifer; Iannuzzi, Clara; Pastore, Annalisa; Barondeau, David P

    2012-03-27

    Human frataxin (FXN) has been intensively studied since the discovery that the FXN gene is associated with the neurodegenerative disease Friedreich's ataxia. Human FXN is a component of the NFS1-ISD11-ISCU2-FXN (SDUF) core Fe-S assembly complex and activates the cysteine desulfurase and Fe-S cluster biosynthesis reactions. In contrast, the Escherichia coli FXN homologue CyaY inhibits Fe-S cluster biosynthesis. To resolve this discrepancy, enzyme kinetic experiments were performed for the human and E. coli systems in which analogous cysteine desulfurase, Fe-S assembly scaffold, and frataxin components were interchanged. Surprisingly, our results reveal that activation or inhibition by the frataxin homologue is determined by which cysteine desulfurase is present and not by the identity of the frataxin homologue. These data are consistent with a model in which the frataxin-less Fe-S assembly complex exists as a mixture of functional and nonfunctional states, which are stabilized by binding of frataxin homologues. Intriguingly, this appears to be an unusual example in which modifications to an enzyme during evolution inverts or reverses the mode of control imparted by a regulatory molecule.

  8. A ground truth based comparative study on clustering of gene expression data.

    PubMed

    Zhu, Yitan; Wang, Zuyi; Miller, David J; Clarke, Robert; Xuan, Jianhua; Hoffman, Eric P; Wang, Yue

    2008-05-01

    Given the variety of available clustering methods for gene expression data analysis, it is important to develop an appropriate and rigorous validation scheme to assess the performance and limitations of the most widely used clustering algorithms. In this paper, we present a ground truth based comparative study on the functionality, accuracy, and stability of five data clustering methods, namely hierarchical clustering, K-means clustering, self-organizing maps, standard finite normal mixture fitting, and a caBIG toolkit (VIsual Statistical Data Analyzer--VISDA), tested on sample clustering of seven published microarray gene expression datasets and one synthetic dataset. We examined the performance of these algorithms in both data-sufficient and data-insufficient cases using quantitative performance measures, including cluster number detection accuracy and mean and standard deviation of partition accuracy. The experimental results showed that VISDA, an interactive coarse-to-fine maximum likelihood fitting algorithm, is a solid performer on most of the datasets, while K-means clustering and self-organizing maps optimized by the mean squared compactness criterion generally produce more stable solutions than the other methods.

  9. Robust spike sorting of retinal ganglion cells tuned to spot stimuli.

    PubMed

    Ghahari, Alireza; Badea, Tudor C

    2016-08-01

    We propose an automatic spike sorting approach for the data recorded from a microelectrode array during visual stimulation of wild type retinas with tiled spot stimuli. The approach first detects individual spikes per electrode by their signature local minima. With the mixture probability distribution of the local minima estimated afterwards, it applies a minimum-squared-error clustering algorithm to sort the spikes into different clusters. A template waveform for each cluster per electrode is defined, and a number of reliability tests are performed on it and its corresponding spikes. Finally, a divisive hierarchical clustering algorithm is used to deal with the correlated templates per cluster type across all the electrodes. According to the measures of performance of the spike sorting approach, it is robust even in the cases of recordings with low signal-to-noise ratio.
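    A minimal sketch of the detect-then-cluster idea described above: spikes are located by their signature local minima on a synthetic trace, summarized by simple waveform features, and grouped with a Gaussian mixture (standing in here for the minimum-squared-error clustering step). Sampling rate, amplitudes, and thresholds are hypothetical.

```python
import numpy as np
from scipy.signal import find_peaks
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
fs = 20000                                  # sampling rate (Hz), hypothetical
trace = rng.normal(0, 1, size=fs * 5)       # 5 s of background noise
# Inject two spike "units" with different negative amplitudes.
for amp, n in [(-8.0, 60), (-15.0, 40)]:
    for t in rng.integers(50, trace.size - 50, size=n):
        trace[t - 5:t + 5] += amp * np.hanning(10)

# Detect spikes by their signature local minima (peaks of the inverted trace).
idx, _ = find_peaks(-trace, height=5.0, distance=20)

# Simple features: each waveform snippet summarized by its depth and spread.
snippets = np.array([trace[i - 10:i + 10] for i in idx if 10 <= i < trace.size - 10])
features = np.column_stack([snippets.min(axis=1), snippets.std(axis=1)])

gmm = GaussianMixture(n_components=2, random_state=3).fit(features)
labels = gmm.predict(features)
print("detected spikes:", len(idx), "| cluster sizes:", np.bincount(labels))
```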

  10. Combined Mössbauer spectroscopic, multi-edge X-ray absorption spectroscopic, and density functional theoretical study of the radical SAM enzyme spore photoproduct lyase.

    PubMed

    Silver, Sunshine C; Gardenghi, David J; Naik, Sunil G; Shepard, Eric M; Huynh, Boi Hanh; Szilagyi, Robert K; Broderick, Joan B

    2014-03-01

    Spore photoproduct lyase (SPL), a member of the radical S-adenosyl-L-methionine (SAM) superfamily, catalyzes the direct reversal of the spore photoproduct, a thymine dimer specific to bacterial spores, to two thymines. SPL requires SAM and a redox-active [4Fe-4S] cluster for catalysis. Mössbauer analysis of anaerobically purified SPL indicates the presence of a mixture of cluster states with the majority (40 %) as [2Fe-2S](2+) clusters and a smaller amount (15 %) as [4Fe-4S](2+) clusters. On reduction, the cluster content changes to primarily (60 %) [4Fe-4S](+). The speciation information from Mössbauer data allowed us to deconvolute iron and sulfur K-edge X-ray absorption spectra to uncover electronic (X-ray absorption near-edge structure, XANES) and geometric (extended X-ray absorption fine structure, EXAFS) structural features of the Fe-S clusters, and their interactions with SAM. The iron K-edge EXAFS data provide evidence for elongation of a [2Fe-2S] rhomb of the [4Fe-4S] cluster on binding SAM on the basis of an Fe···Fe scatterer at 3.0 Å. The XANES spectra of reduced SPL in the absence and presence of SAM overlay one another, indicating that SAM is not undergoing reductive cleavage. The X-ray absorption spectroscopy data for SPL samples and data for model complexes from the literature allowed the deconvolution of contributions from [2Fe-2S] and [4Fe-4S] clusters to the sulfur K-edge XANES spectra. The analysis of pre-edge features revealed electronic changes in the Fe-S clusters as a function of the presence of SAM. The spectroscopic findings were further corroborated by density functional theory calculations that provided insights into structural and electronic perturbations that can be correlated by considering the role of SAM as a catalyst or substrate.

  11. Modeling the coupled return-spread high frequency dynamics of large tick assets

    NASA Astrophysics Data System (ADS)

    Curato, Gianbiagio; Lillo, Fabrizio

    2015-01-01

    Large tick assets, i.e., assets for which a one-tick movement is a significant fraction of the price and the bid-ask spread is almost always equal to one tick, display dynamics in which price changes and spread are strongly coupled. We present an approach based on the hidden Markov model, also known in econometrics as the Markov switching model, for the dynamics of price changes, where the latent Markov process is described by the transitions between spreads. We then use a finite Markov mixture of logit regressions on past squared price changes to describe temporal dependencies in the dynamics of price changes. The model can thus be seen as a double chain Markov model. We show that the model describes the shape of the price change distribution at different time scales, volatility clustering, and the anomalous decrease of kurtosis. We calibrate our models on Nasdaq stocks and show that the model reproduces the statistical properties of real data remarkably well.
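    The model class can be illustrated by a minimal simulation (not the calibrated model from the paper): a two-state Markov chain for the spread, with the tick-level price-change distribution depending on the spread state; regime switching of this kind already produces volatility clustering. Transition probabilities and conditional distributions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two spread states (in ticks) and their transition matrix (hypothetical values).
P = np.array([[0.95, 0.05],
              [0.30, 0.70]])

# Price-change distribution (in ticks) conditional on the current spread state.
changes = np.array([-2, -1, 0, 1, 2])
probs = {0: [0.02, 0.18, 0.60, 0.18, 0.02],   # spread = 1 tick: mostly no change
         1: [0.10, 0.25, 0.30, 0.25, 0.10]}   # spread = 2 ticks: more volatile

n = 50_000
state = 0
dp = np.empty(n, dtype=int)
for t in range(n):
    dp[t] = rng.choice(changes, p=probs[state])
    state = rng.choice(2, p=P[state])

# Volatility clustering shows up as positive autocorrelation of squared changes.
sq = dp.astype(float) ** 2
acf1 = np.corrcoef(sq[:-1], sq[1:])[0, 1]
print("lag-1 autocorrelation of squared price changes:", round(acf1, 3))
```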

  12. Disordered multihyperuniformity derived from binary plasmas

    NASA Astrophysics Data System (ADS)

    Lomba, Enrique; Weis, Jean-Jacques; Torquato, Salvatore

    2018-01-01

    Disordered multihyperuniform many-particle systems are exotic amorphous states that allow exquisite color sensing capabilities due to their anomalous suppression of density fluctuations for distinct subsets of particles, as recently evidenced in photoreceptor mosaics in avian retina. Motivated by this biological finding, we present a statistical-mechanical model that rigorously achieves disordered multihyperuniform many-body systems by tuning interactions in binary mixtures of nonadditive hard-disk plasmas. We demonstrate that multihyperuniformity competes with phase separation and stabilizes a clustered phase. Our work provides a systematic means to generate disordered multihyperuniform solids, and hence lays the groundwork to explore their potentially unique photonic, phononic, electronic, and transport properties.

  13. Generation of monodisperse droplets by spontaneous condensation of flow in nozzles

    NASA Technical Reports Server (NTRS)

    Lai, Der-Shaiun; Kadambi, J. R.

    1993-01-01

    Submicron size monodisperse particles are of interest in many industrial and scientific applications. These include the manufacture of ceramic parts using fine ceramic particles, the production of thin films by deposition of ionized clusters, monodisperse seed particles for laser anemometry, and the study of size dependence of cluster chemical and physical properties. An inexpensive and relatively easy way to generate such particles is by utilizing the phenomenon of spontaneous condensation. The phenomenon occurs when the vapor or a mixture of a vapor and a noncondensing gas is expanded at a high expansion rate. The saturation line is crossed with the supercooled vapor behaving like a gas, until all of a sudden at the so called Wilson point, condensation occurs, resulting in a large number of relatively monodisperse droplets. The droplet size is a function of the expansion rate, inlet conditions, mass fraction of vapor, gas properties, etc. Spontaneous condensation of steam and water vapor and air mixture in a one dimensional nozzle was modeled and the resulting equations solved numerically. The droplet size distribution at the exit of various one dimensional nozzles and the flow characteristics such as pressure ratio, mean droplet radius, vapor and droplet temperatures, nucleation flux, supercooling, wetness, etc., along the axial distance were obtained. The numerical results compared very well with the available experimental data. The effect of inlet conditions, nozzle expansion rates, and vapor mass fractions on droplet mean radius, droplet size distribution, and pressure ratio were examined.

  14. Comparison of Intra-cluster and M87 Halo Orphan Globular Clusters in the Virgo Cluster

    NASA Astrophysics Data System (ADS)

    Louie, Tiffany Kaye; Tuan, Jin Zong; Martellini, Adhara; Guhathakurta, Puragra; Toloba, Elisa; Peng, Eric; Longobardi, Alessia; Lim, Sungsoon

    2018-01-01

    We present a study of “orphan” globular clusters (GCs) — GCs with no identifiable nearby host galaxy — discovered in NGVS, a 104 deg^2 CFHT/MegaCam imaging survey. At the distance of the Virgo cluster, GCs are bright enough to make good spectroscopic targets and many are barely resolved in good ground-based seeing. Our orphan GC sample is derived from a subset of NGVS-selected GC candidates that were followed up with Keck/DEIMOS spectroscopy. While our primary spectroscopic targets were candidate GC satellites of Virgo dwarf elliptical and ultra-diffuse galaxies, many objects turned out to be non-satellites based on a radial velocity mismatch with the Virgo galaxy they are projected close to. Using a combination of spectral characteristics (e.g., absorption vs. emission), Gaussian mixture modeling of radial velocity and positions, and extreme deconvolution analysis of ugrizk photometry and image morphology, these non-satellites were classified into: (1) intra-cluster GCs (ICGCs) in the Virgo cluster, (2) GCs in the outer halo of M87, (3) foreground Milky Way stars, and (4) background galaxies. The statistical distinction between ICGCs and M87 halo GCs is based on velocity distributions (mean of 1100 vs. 1300 km/s and dispersions of 700 vs. 400 km/s, respectively) and radial distribution (diffuse vs. centrally concentrated, respectively). We used coaddition to increase the spectral SNR for the two classes of orphan GCs and measured the equivalent widths (EWs) of the Mg b and H-beta absorption lines. These EWs were compared to single stellar population models to obtain mean age and metallicity estimates. The ICGCs and M87 halo GCs have <[Fe/H]> = -0.6 ± 0.3 and -0.4 ± 0.3 dex, respectively, and mean ages of ≳5 and ≳10 Gyr, respectively. This suggests the M87 halo GCs formed in relatively high-mass galaxies that avoided being tidally disrupted by M87 until they were close to the cluster center, while ICGCs formed in relatively low-mass galaxies that were tidally disrupted in the cluster outskirts. Most of this work was carried out by high school students working under the auspices of the Science Internship Program (SIP) at UC Santa Cruz. We are grateful for financial support from the NSF and NASA/STScI.
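    The velocity-based separation can be illustrated with a one-dimensional Gaussian mixture fitted to synthetic radial velocities drawn from the two populations quoted above (means near 1100 and 1300 km/s, dispersions near 700 and 400 km/s); the sample sizes are hypothetical and this is not the NGVS analysis itself.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Synthetic radial velocities (km/s) for the two populations quoted above.
icgc = rng.normal(1100, 700, size=120)   # intra-cluster GCs: broad distribution
halo = rng.normal(1300, 400, size=180)   # M87 halo GCs: narrower distribution
v = np.concatenate([icgc, halo]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=5).fit(v)
means = gmm.means_.ravel()
sigmas = np.sqrt(gmm.covariances_.ravel())
print("component means (km/s):      ", means.round(0))
print("component dispersions (km/s):", sigmas.round(0))

# Posterior membership probabilities for each object.
resp = gmm.predict_proba(v)
print("example posterior for first object:", resp[0].round(2))
```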

  15. Personal Exposure to Mixtures of Volatile Organic Compounds: Modeling and Further Analysis of the RIOPA Data

    PubMed Central

    Batterman, Stuart; Su, Feng-Chiao; Li, Shi; Mukherjee, Bhramar; Jia, Chunrong

    2015-01-01

    INTRODUCTION Emission sources of volatile organic compounds (VOCs) are numerous and widespread in both indoor and outdoor environments. Concentrations of VOCs indoors typically exceed outdoor levels, and most people spend nearly 90% of their time indoors. Thus, indoor sources generally contribute the majority of VOC exposures for most people. VOC exposure has been associated with a wide range of acute and chronic health effects; for example, asthma, respiratory diseases, liver and kidney dysfunction, neurologic impairment, and cancer. Although exposures to most VOCs for most persons fall below health-based guidelines, and long-term trends show decreases in ambient emissions and concentrations, a subset of individuals experience much higher exposures that exceed guidelines. Thus, exposure to VOCs remains an important environmental health concern. The present understanding of VOC exposures is incomplete. With the exception of a few compounds, concentration and especially exposure data are limited; and like other environmental data, VOC exposure data can show multiple modes, low and high extreme values, and sometimes a large portion of data below method detection limits (MDLs). Field data also show considerable spatial or interpersonal variability, and although evidence is limited, temporal variability seems high. These characteristics can complicate modeling and other analyses aimed at risk assessment, policy actions, and exposure management. In addition to these analytic and statistical issues, exposure typically occurs as a mixture, and mixture components may interact or jointly contribute to adverse effects. However most pollutant regulations, guidelines, and studies remain focused on single compounds, and thus may underestimate cumulative exposures and risks arising from coexposures. In addition, the composition of VOC mixtures has not been thoroughly investigated, and mixture components show varying and complex dependencies. Finally, although many factors are known to affect VOC exposures, many personal, environmental, and socioeconomic determinants remain to be identified, and the significance and applicability of the determinants reported in the literature are uncertain. To help answer these unresolved questions and overcome limitations of previous analyses, this project used several novel and powerful statistical modeling and analysis techniques and two large data sets. The overall objectives of this project were (1) to identify and characterize exposure distributions (including extreme values), (2) evaluate mixtures (including dependencies), and (3) identify determinants of VOC exposure. METHODS VOC data were drawn from two large data sets: the Relationships of Indoor, Outdoor, and Personal Air (RIOPA) study (1999–2001) and the National Health and Nutrition Examination Survey (NHANES; 1999–2000). The RIOPA study used a convenience sample to collect outdoor, indoor, and personal exposure measurements in three cities (Elizabeth, NJ; Houston, TX; Los Angeles, CA). In each city, approximately 100 households with adults and children who did not smoke were sampled twice for 18 VOCs. In addition, information about 500 variables associated with exposure was collected. The NHANES used a nationally representative sample and included personal VOC measurements for 851 participants. NHANES sampled 10 VOCs in common with RIOPA. Both studies used similar sampling methods and study periods. 
Specific Aim 1 To estimate and model extreme value exposures, extreme value distribution models were fitted to the top 10% and 5% of VOC exposures. Health risks were estimated for individual VOCs and for three VOC mixtures. Simulated extreme value data sets, generated for each VOC and for fitted extreme value and lognormal distributions, were compared with measured concentrations (RIOPA observations) to evaluate each model’s goodness of fit. Mixture distributions were fitted with the conventional finite mixture of normal distributions and the semi-parametric Dirichlet process mixture (DPM) of normal distributions for three individual VOCs (chloroform, 1,4-DCB, and styrene). Goodness of fit for these full distribution models was also evaluated using simulated data. Specific Aim 2 Mixtures in the RIOPA VOC data set were identified using positive matrix factorization (PMF) and by toxicologic mode of action. Dependency structures of a mixture’s components were examined using mixture fractions and were modeled using copulas, which address correlations of multiple components across their entire distributions. Five candidate copulas (Gaussian, t, Gumbel, Clayton, and Frank) were evaluated, and the performance of fitted models was evaluated using simulation and mixture fractions. Cumulative cancer risks were calculated for mixtures, and results from copulas and multivariate lognormal models were compared with risks based on RIOPA observations. Specific Aim 3 Exposure determinants were identified using stepwise regressions and linear mixed-effects models (LMMs). RESULTS Specific Aim 1 Extreme value exposures in RIOPA typically were best fitted by three-parameter generalized extreme value (GEV) distributions, and sometimes by the two-parameter Gumbel distribution. In contrast, lognormal distributions significantly underestimated both the level and likelihood of extreme values. Among the VOCs measured in RIOPA, 1,4-dichlorobenzene (1,4-DCB) was associated with the greatest cancer risks; for example, for the highest 10% of measurements of 1,4-DCB, all individuals had risk levels above 10−4, and 13% of all participants had risk levels above 10−2. Of the full-distribution models, the finite mixture of normal distributions with two to four clusters and the DPM of normal distributions had superior performance in comparison with the lognormal models. DPM distributions provided slightly better fit than the finite mixture distributions; the advantages of the DPM model were avoiding certain convergence issues associated with the finite mixture distributions, adaptively selecting the number of needed clusters, and providing uncertainty estimates. Although the results apply to the RIOPA data set, GEV distributions and mixture models appear more broadly applicable. These models can be used to simulate VOC distributions, which are neither normally nor lognormally distributed, and they accurately represent the highest exposures, which may have the greatest health significance. Specific Aim 2 Four VOC mixtures were identified and apportioned by PMF; they represented gasoline vapor, vehicle exhaust, chlorinated solvents and disinfection byproducts, and cleaning products and odorants. The last mixture (cleaning products and odorants) accounted for the largest fraction of an individual’s total exposure (average of 42% across RIOPA participants). 
Often, a single compound dominated a mixture but the mixture fractions were heterogeneous; that is, the fractions of the compounds changed with the concentration of the mixture. Three VOC mixtures were identified by toxicologic mode of action and represented VOCs associated with hematopoietic, liver, and renal tumors. Estimated lifetime cumulative cancer risks exceeded 10−3 for about 10% of RIOPA participants. The dependency structures of the VOC mixtures in the RIOPA data set fitted Gumbel (two mixtures) and t copulas (four mixtures). These copula types emphasize dependencies found in the upper and lower tails of a distribution. The copulas reproduced both risk predictions and exposure fractions with a high degree of accuracy and performed better than multivariate lognormal distributions. Specific Aim 3 In an analysis focused on the home environment and the outdoor (close to home) environment, home VOC concentrations dominated personal exposures (66% to 78% of the total exposure, depending on VOC); this was largely the result of the amount of time participants spent at home and the fact that indoor concentrations were much higher than outdoor concentrations for most VOCs. In a different analysis focused on the sources inside the home and outside (but close to the home), it was assumed that 100% of VOCs from outside sources would penetrate the home. Outdoor VOC sources accounted for 5% (d-limonene) to 81% (carbon tetrachloride [CTC]) of the total exposure. Personal exposure and indoor measurements had similar determinants depending on the VOC. Gasoline-related VOCs (e.g., benzene and methyl tert-butyl ether [MTBE]) were associated with city, residences with attached garages, pumping gas, wind speed, and home air exchange rate (AER). Odorant and cleaning-related VOCs (e.g., 1,4-DCB and chloroform) also were associated with city, and a residence’s AER, size, and family members showering. Dry-cleaning and industry-related VOCs (e.g., tetrachloroethylene [or perchloroethylene, PERC] and trichloroethylene [TCE]) were associated with city, type of water supply to the home, and visits to the dry cleaner. These and other relationships were significant, they explained from 10% to 40% of the variance in the measurements, and are consistent with known emission sources and those reported in the literature. Outdoor concentrations of VOCs had only two determinants in common: city and wind speed. Overall, personal exposure was dominated by the home setting, although a large fraction of indoor VOC concentrations were due to outdoor sources. City of residence, personal activities, household characteristics, and meteorology were significant determinants. Concentrations in RIOPA were considerably lower than levels in the nationally representative NHANES for all VOCs except MTBE and 1,4-DCB. Differences between RIOPA and NHANES results can be explained by contrasts between the sampling designs and staging in the two studies, and by differences in the demographics, smoking, employment, occupations, and home locations. A portion of these differences are due to the nature of the convenience (RIOPA) and representative (NHANES) sampling strategies used in the two studies. CONCLUSIONS Accurate models for exposure data, which can feature extreme values, multiple modes, data below the MDL, heterogeneous interpollutant dependency structures, and other complex characteristics, are needed to estimate exposures and risks and to develop control and management guidelines and policies. 
Conventional and novel statistical methods were applied to data drawn from two large studies to understand the nature and significance of VOC exposures. Both extreme value distributions and mixture models were found to provide excellent fit to single VOC compounds (univariate distributions), and copulas may be the method of choice for VOC mixtures (multivariate distributions), especially for the highest exposures, which fit parametric models poorly and which may represent the greatest health risk. The identification of exposure determinants, including the influence of both certain activities (e.g., pumping gas) and environments (e.g., residences), provides information that can be used to manage and reduce exposures. The results obtained using the RIOPA data set add to our understanding of VOC exposures and further investigations using a more representative population and a wider suite of VOCs are suggested to extend and generalize results. PMID:25145040
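    The copula idea can be illustrated with a minimal sketch: the study fitted Gumbel and t copulas, while the example below uses the simplest of the five candidates, a Gaussian copula, to couple two lognormal exposure marginals and compare joint upper-tail exceedance with the independent case. The correlation and marginal parameters are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Dependence structure: a Gaussian copula with correlation 0.7 between two VOCs.
rho = 0.7
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=10_000)
u = stats.norm.cdf(z)                      # uniform margins carrying the dependence

# Marginals: lognormal exposure distributions for each compound (hypothetical parameters).
voc1 = stats.lognorm.ppf(u[:, 0], s=1.0, scale=np.exp(0.2))
voc2 = stats.lognorm.ppf(u[:, 1], s=1.3, scale=np.exp(-0.5))

# Joint upper-tail behaviour: probability that both compounds exceed their 95th percentiles.
q1, q2 = np.quantile(voc1, 0.95), np.quantile(voc2, 0.95)
joint = np.mean((voc1 > q1) & (voc2 > q2))
print("P(both above own 95th pct) with copula:", round(joint, 4))
print("P under independence:                  ", 0.05 * 0.05)
```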

  16. Spatio-Temporal Regression Based Clustering of Precipitation Extremes in a Presence of Systematically Missing Covariates

    NASA Astrophysics Data System (ADS)

    Kaiser, Olga; Martius, Olivia; Horenko, Illia

    2017-04-01

    Regression-based Generalized Pareto Distribution (GPD) models are often used to describe the dynamics of hydrological threshold excesses, relying on the explicit availability of all of the relevant covariates. But in real applications the complete set of relevant covariates might not be available. In this context, it was shown that under weak assumptions the influence of systematically missing covariates can be reflected by nonstationary and nonhomogeneous dynamics. We present a data-driven, semiparametric, and adaptive approach for spatio-temporal regression-based clustering of threshold excesses in the presence of systematically missing covariates. The nonstationary and nonhomogeneous behavior of threshold excesses is described by a set of locally stationary GPD models, where the parameters are expressed as regression models, and a non-parametric spatio-temporal hidden switching process. Exploiting the nonparametric Finite Element time-series analysis Methodology (FEM) with Bounded Variation of the model parameters (BV) to resolve the spatio-temporal switching process, the approach goes beyond the strong a priori assumptions made in standard latent class models such as Mixture Models and Hidden Markov Models. Additionally, the presented FEM-BV-GPD provides a pragmatic description of the corresponding spatial dependence structure by grouping together all locations that exhibit similar behavior of the switching process. The performance of the framework is demonstrated on daily accumulated precipitation series over 17 different locations in Switzerland from 1981 to 2013, showing that the introduced approach allows for a better description of the historical data.
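    A minimal sketch of the GPD building block used above: a stationary generalized Pareto distribution fitted with scipy to excesses over a high threshold of a synthetic precipitation series. The regression structure, the switching process, and the FEM-BV machinery of the full framework are not reproduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Synthetic daily precipitation (mm): many dry or light days, occasional heavy events.
precip = rng.gamma(shape=0.4, scale=8.0, size=12_000)

threshold = np.quantile(precip, 0.95)
excess = precip[precip > threshold] - threshold

# Generalized Pareto fit to the excesses (location fixed at 0).
shape, loc, scale = stats.genpareto.fit(excess, floc=0)
print(f"GPD shape={shape:.3f}, scale={scale:.3f}, n_excesses={excess.size}")

# Return level: value exceeded on average once every 100 exceedances.
ret = threshold + stats.genpareto.ppf(1 - 1 / 100, shape, loc=0, scale=scale)
print("approximate 100-exceedance return level (mm):", round(ret, 1))
```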

  17. A Poisson nonnegative matrix factorization method with parameter subspace clustering constraint for endmember extraction in hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Sun, Weiwei; Ma, Jun; Yang, Gang; Du, Bo; Zhang, Liangpei

    2017-06-01

    A new Bayesian method named Poisson Nonnegative Matrix Factorization with Parameter Subspace Clustering Constraint (PNMF-PSCC) is presented to extract endmembers from Hyperspectral Imagery (HSI). First, the method integrates the linear spectral mixture model with the Bayesian framework and formulates endmember extraction as a Bayesian inference problem. Second, the Parameter Subspace Clustering Constraint (PSCC) is incorporated into the statistical program to account for the clustering of all pixels in the parameter subspace. The PSCC can enlarge differences among ground objects and helps find endmembers with smaller spectral divergences. Meanwhile, the PNMF-PSCC method uses the Poisson distribution as the prior knowledge of spectral signals to better explain the quantum nature of light in the imaging spectrometer. Third, the optimization problem of PNMF-PSCC is formulated as maximizing the joint density via the Maximum A Posteriori (MAP) estimator. The problem is finally solved by iteratively optimizing two sub-problems within the Alternating Direction Method of Multipliers (ADMM) framework, with the FURTHESTSUM initialization scheme. Five state-of-the-art methods are implemented for comparison with PNMF-PSCC on both synthetic and real HSI datasets. Experimental results show that PNMF-PSCC outperforms all five methods in Spectral Angle Distance (SAD) and Root-Mean-Square Error (RMSE), and in particular it identifies good endmembers for ground objects with smaller spectral divergences.
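    The core factorization (without the Bayesian prior, the PSCC term, or the ADMM solver) can be sketched as nonnegative matrix factorization under a Kullback-Leibler loss, which is the loss implied by a Poisson observation model; the example below uses scikit-learn on a synthetic mixed-pixel matrix with hypothetical dimensions.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(8)
n_bands, n_endmembers, n_pixels = 50, 3, 400

# Synthetic endmember spectra and abundances, mixed linearly with Poisson noise.
endmembers = rng.uniform(0.1, 1.0, size=(n_bands, n_endmembers))
abundances = rng.dirichlet(np.ones(n_endmembers), size=n_pixels).T   # columns sum to 1
counts = rng.poisson(1000 * endmembers @ abundances)                 # photon-count-like data

# NMF with a Kullback-Leibler loss (the loss implied by a Poisson observation model).
model = NMF(n_components=n_endmembers, beta_loss="kullback-leibler",
            solver="mu", init="nndsvda", max_iter=500, random_state=8)
W = model.fit_transform(counts)   # estimated endmember spectra (n_bands x n_endmembers)
H = model.components_             # estimated abundances (n_endmembers x n_pixels)

print("reconstruction error (KL):", round(model.reconstruction_err_, 1))
print("W shape:", W.shape, "H shape:", H.shape)
```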

  18. Detection of smoothly distributed spatial outliers, with applications to identifying the distribution of parenchymal hyperinflation following an airway challenge in asthmatics.

    PubMed

    Thurman, Andrew L; Choi, Jiwoong; Choi, Sanghun; Lin, Ching-Long; Hoffman, Eric A; Lee, Chang Hyun; Chan, Kung-Sik

    2017-05-10

    Methacholine challenge tests are used to measure changes in pulmonary function that indicate symptoms of asthma. In addition to pulmonary function tests, which measure global changes in pulmonary function, computed tomography images taken at full inspiration before and after administration of methacholine provide local air volume changes (hyper-inflation post methacholine) at individual acinar units, indicating local airway hyperresponsiveness. Some of the acini may have extreme air volume changes relative to the global average, indicating hyperresponsiveness, and those extreme values may occur in clusters. We propose a Gaussian mixture model with a spatial smoothness penalty to improve prediction of hyperresponsive locations that occur in spatial clusters. A simulation study provides evidence that the spatial smoothness penalty improves prediction under different data-generating mechanisms. We apply this method to computed tomography data from Seoul National University Hospital on five healthy and ten asthmatic subjects. Copyright © 2017 John Wiley & Sons, Ltd.
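    Omitting the spatial smoothness penalty that is the paper's contribution, the basic mixture step can be sketched as a two-component Gaussian mixture fitted to local air-volume changes, with acini assigned to the high-mean component flagged as hyperresponsive. The data and cutoff below are synthetic and hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(9)
# Synthetic local air-volume changes: most acini near the global mean,
# a minority with extreme (hyperresponsive) values.
normal_acini = rng.normal(0.0, 0.1, size=900)
extreme_acini = rng.normal(0.6, 0.15, size=100)
dav = np.concatenate([normal_acini, extreme_acini]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=9).fit(dav)
resp = gmm.predict_proba(dav)

# Flag acini whose posterior probability of belonging to the high-mean
# component exceeds 0.5 (no spatial smoothing applied here).
hi = int(np.argmax(gmm.means_.ravel()))
flagged = resp[:, hi] > 0.5
print("flagged as hyperresponsive:", int(flagged.sum()), "of", dav.size)
```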

  19. Individual Human Brain Areas Can Be Identified from Their Characteristic Spectral Activation Fingerprints

    PubMed Central

    Keitel, Anne; Gross, Joachim

    2016-01-01

    The human brain can be parcellated into diverse anatomical areas. We investigated whether rhythmic brain activity in these areas is characteristic and can be used for automatic classification. To this end, resting-state MEG data of 22 healthy adults was analysed. Power spectra of 1-s long data segments for atlas-defined brain areas were clustered into spectral profiles (“fingerprints”), using k-means and Gaussian mixture (GM) modelling. We demonstrate that individual areas can be identified from these spectral profiles with high accuracy. Our results suggest that each brain area engages in different spectral modes that are characteristic for individual areas. Clustering of brain areas according to similarity of spectral profiles reveals well-known brain networks. Furthermore, we demonstrate task-specific modulations of auditory spectral profiles during auditory processing. These findings have important implications for the classification of regional spectral activity and allow for novel approaches in neuroimaging and neurostimulation in health and disease. PMID:27355236
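    A minimal sketch of the fingerprinting idea: per-segment power spectra (random placeholders here, not MEG data) are clustered into a small set of spectral profiles with k-means, and each area is summarized by how often its segments fall into each profile. The segment, frequency, and profile counts are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(10)
n_segments, n_freqs, n_profiles = 2000, 40, 6

# Placeholder power spectra of 1-s segments (rows), e.g. 1-40 Hz bins (columns).
spectra = rng.gamma(shape=2.0, scale=1.0, size=(n_segments, n_freqs))
log_power = np.log(spectra)

# Cluster segments into a small set of spectral profiles ("fingerprints").
km = KMeans(n_clusters=n_profiles, n_init=10, random_state=10).fit(log_power)

# Each brain area (here: a random grouping of segments) is summarized by how
# often its segments fall into each profile.
areas = rng.integers(0, 5, size=n_segments)     # hypothetical area label per segment
for a in range(5):
    counts = np.bincount(km.labels_[areas == a], minlength=n_profiles)
    print(f"area {a} profile histogram:", counts)
```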

  20. The XXL Survey. II. The bright cluster sample: catalogue and luminosity function

    NASA Astrophysics Data System (ADS)

    Pacaud, F.; Clerc, N.; Giles, P. A.; Adami, C.; Sadibekova, T.; Pierre, M.; Maughan, B. J.; Lieu, M.; Le Fèvre, J. P.; Alis, S.; Altieri, B.; Ardila, F.; Baldry, I.; Benoist, C.; Birkinshaw, M.; Chiappetti, L.; Démoclès, J.; Eckert, D.; Evrard, A. E.; Faccioli, L.; Gastaldello, F.; Guennou, L.; Horellou, C.; Iovino, A.; Koulouridis, E.; Le Brun, V.; Lidman, C.; Liske, J.; Maurogordato, S.; Menanteau, F.; Owers, M.; Poggianti, B.; Pomarède, D.; Pompei, E.; Ponman, T. J.; Rapetti, D.; Reiprich, T. H.; Smith, G. P.; Tuffs, R.; Valageas, P.; Valtchanov, I.; Willis, J. P.; Ziparo, F.

    2016-06-01

    Context. The XXL Survey is the largest survey carried out by the XMM-Newton satellite and covers a total area of 50 square degrees distributed over two fields. It primarily aims at investigating the large-scale structures of the Universe using the distribution of galaxy clusters and active galactic nuclei as tracers of the matter distribution. The survey will ultimately uncover several hundreds of galaxy clusters out to a redshift of ~2 at a sensitivity of ~10^-14 erg s^-1 cm^-2 in the [0.5-2] keV band. Aims: This article presents the XXL bright cluster sample, a subsample of 100 galaxy clusters selected from the full XXL catalogue by setting a lower limit of 3 × 10^-14 erg s^-1 cm^-2 on the source flux within a 1' aperture. Methods: The selection function was estimated using a mixture of Monte Carlo simulations and analytical recipes that closely reproduce the source selection process. An extensive spectroscopic follow-up provided redshifts for 97 of the 100 clusters. We derived accurate X-ray parameters for all the sources. Scaling relations were self-consistently derived from the same sample in other publications of the series. On this basis, we study the number density, luminosity function, and spatial distribution of the sample. Results: The bright cluster sample consists of systems with masses between M_500 = 7 × 10^13 and 3 × 10^14 M⊙, mostly located between z = 0.1 and 0.5. The observed sky density of clusters is slightly below the predictions from the WMAP9 model, and significantly below the prediction from the Planck 2015 cosmology. In general, within the current uncertainties of the cluster mass calibration, models with higher values of σ_8 and/or Ω_M appear more difficult to accommodate. We provide tight constraints on the cluster differential luminosity function and find no hint of evolution out to z ~ 1. We also find strong evidence for the presence of large-scale structures in the XXL bright cluster sample and identify five new superclusters. Based on observations obtained with XMM-Newton, an ESA science mission with instruments and contributions directly funded by ESA Member States and NASA. Based on observations made with ESO Telescopes at the La Silla and Paranal Observatories under programme ID 089.A-0666 and LP191.A-0268. The Master Catalogue is available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/592/A2

  1. What Can Be Learned from Inverse Statistics?

    NASA Astrophysics Data System (ADS)

    Ahlgren, Peter Toke Heden; Dahl, Henrik; Jensen, Mogens Høgh; Simonsen, Ingve

    One stylized fact of financial markets is an asymmetry between the most likely time to profit and to loss. This gain-loss asymmetry is revealed by inverse statistics, a method closely related to empirically finding first passage times. Many papers have presented evidence about the asymmetry, where it appears and where it does not. Also, various interpretations and explanations for the results have been suggested. In this chapter, we review the published results and explanations. We also examine the results and show that some are at best fragile. Similarly, we discuss the suggested explanations and propose a new model based on Gaussian mixtures. Apart from explaining the gain-loss asymmetry, this model also has the potential to explain other stylized facts such as volatility clustering, fat tails, and power law behavior of returns.
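
    The following toy sketch illustrates the two ingredients in this chapter summary: returns drawn from a two-component Gaussian mixture and the inverse-statistics (first-passage) calculation that reveals gain-loss asymmetry. All parameters (mixture weights, means, volatilities, the return level rho) are illustrative assumptions, not values from the chapter.

```python
# Illustrative sketch: inverse statistics (first time the cumulative return crosses
# +rho or -rho) for log-returns drawn from a two-component Gaussian mixture.
import numpy as np

rng = np.random.default_rng(1)

def sample_mixture_returns(n, w=0.8, mu=(0.0005, -0.001), sigma=(0.005, 0.02)):
    """Draw n returns from w*N(mu1, s1^2) + (1-w)*N(mu2, s2^2) (toy parameters)."""
    comp = rng.random(n) < w
    return np.where(comp,
                    rng.normal(mu[0], sigma[0], n),
                    rng.normal(mu[1], sigma[1], n))

def first_passage_times(returns, rho, n_starts=2000):
    """Waiting time until the cumulative return from a random start point first
    exceeds +rho (gain) or falls below -rho (loss)."""
    cum = np.concatenate(([0.0], np.cumsum(returns)))
    gains, losses = [], []
    starts = rng.integers(0, len(returns) - 1000, size=n_starts)
    for s in starts:
        path = cum[s:s + 1000] - cum[s]
        up = np.argmax(path >= rho) if np.any(path >= rho) else None
        dn = np.argmax(path <= -rho) if np.any(path <= -rho) else None
        if up is not None and (dn is None or up < dn):
            gains.append(up)
        elif dn is not None:
            losses.append(dn)
    return np.array(gains), np.array(losses)

r = sample_mixture_returns(200_000)
gains, losses = first_passage_times(r, rho=0.05)
print("most likely gain time :", np.bincount(gains).argmax())
print("most likely loss time :", np.bincount(losses).argmax())
```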

  2. The Reactivity and Structure of Size Selected VxOy Clusters on a TiO2 (110)-(1 × 1) Surface of Variable Oxidation State

    NASA Astrophysics Data System (ADS)

    Neilson, Hunter L.

    The selective oxidative dehydrogenation of methanol by vanadium oxide/TiO2 model systems has received a great deal of interest in the surface science community. Previous studies using temperature programmed desorption and reaction (TPD/R) to probe the oxidation of methanol to formaldehyde by vanadia/TiO2 model catalysts have shown that the activity of these systems varies considerably based on the way in which the model system is prepared, with formaldehyde desorption temperatures observed anywhere from room temperature to 660 K. The principal reason for this variation is that the preparation of sub-monolayer films of vanadia on TiO2 produces clusters with a multitude of VxOy structures and a mixture of vanadium oxidation states. As a result, the stoichiometry of the active vanadium oxide catalyst as well as the oxidation state of vanadium in the active catalyst remain unknown. To better understand this system, our group has probed the reactivity and structure of size-selected Vx, VOy and VxOy clusters on a reduced TiO2 (110) support in ultra-high vacuum (UHV) via TPD/R and scanning tunneling microscopy (STM). Ex situ preparation of these clusters in the gas phase prior to deposition has allowed us to systematically vary the stoichiometry of the vanadia clusters; a level of control not available via the usual routes to vanadium oxide. The most active catalysts are shown to have (VO3)n stoichiometry, in agreement with the theoretical models of the Metiu group. We have shown that both the activity and selectivity of V2O6 and V3O9 cluster catalysts depend sensitively on the oxidation state of the TiO2 (110) support. For example, V2O6 on a reduced surface is selective for the oxidation of methanol to formaldehyde, while the selectivity shifts to favor methyl formate as the surface becomes increasingly oxidized. STM studies show that the structure of size-selected V2O6 clusters, upon adsorption to the surface, varies considerably with the oxidation state of the support, in good agreement with our reactivity studies. V3O9 was shown to catalyze the oxidation of methanol to both formaldehyde and methyl formate on a reduced surface, while STM suggests that, unlike V2O6, these clusters are prone to decomposition upon adsorption to the surface. Furthermore, TPD/R of size-selected V2O5 and V2O7 on TiO2 suggests that altering the stoichiometry of the (VO3)n clusters by a single oxygen atom significantly inhibits the activity of these catalysts.

  3. Markov chains at the interface of combinatorics, computing, and statistical physics

    NASA Astrophysics Data System (ADS)

    Streib, Amanda Pascoe

    The fields of statistical physics, discrete probability, combinatorics, and theoretical computer science have converged around efforts to understand random structures and algorithms. Recent activity in the interface of these fields has enabled tremendous breakthroughs in each domain and has supplied a new set of techniques for researchers approaching related problems. This thesis makes progress on several problems in this interface whose solutions all build on insights from multiple disciplinary perspectives. First, we consider a dynamic growth process arising in the context of DNA-based self-assembly. The assembly process can be modeled as a simple Markov chain. We prove that the chain is rapidly mixing for large enough bias in regions of Zd. The proof uses a geometric distance function and a variant of path coupling in order to handle distances that can be exponentially large. We also provide the first results in the case of fluctuating bias, where the bias can vary depending on the location of the tile, which arises in the nanotechnology application. Moreover, we use intuition from statistical physics to construct a choice of the biases for which the Markov chain Mmon requires exponential time to converge. Second, we consider a related problem regarding the convergence rate of biased permutations that arises in the context of self-organizing lists. The Markov chain Mnn in this case is a nearest-neighbor chain that allows adjacent transpositions, and the rate of these exchanges is governed by various input parameters. It was conjectured that the chain is always rapidly mixing when the inversion probabilities are positively biased, i.e., we put nearest neighbor pair x < y in order with bias 1/2 ≤ pxy ≤ 1 and out of order with bias 1 - pxy. The Markov chain Mmon was known to have connections to a simplified version of this biased card-shuffling. We provide new connections between Mnn and Mmon by using simple combinatorial bijections, and we prove that Mnn is always rapidly mixing for two general classes of positively biased { pxy}. More significantly, we also prove that the general conjecture is false by exhibiting values for the pxy, with 1/2 ≤ pxy ≤ 1 for all x < y, but for which the transposition chain will require exponential time to converge. Finally, we consider a model of colloids, which are binary mixtures of molecules with one type of molecule suspended in another. It is believed that at low density typical configurations will be well-mixed throughout, while at high density they will separate into clusters. This clustering has proved elusive to verify, since all local sampling algorithms are known to be inefficient at high density, and in fact a new nonlocal algorithm was recently shown to require exponential time in some cases. We characterize the high and low density phases for a general family of discrete interfering binary mixtures by showing that they exhibit a "clustering property" at high density and not at low density. The clustering property states that there will be a region that has very high area, very small perimeter, and high density of one type of molecule. Special cases of interfering binary mixtures include the Ising model at fixed magnetization and independent sets.

  4. Disease Mapping of Zero-excessive Mesothelioma Data in Flanders

    PubMed Central

    Neyens, Thomas; Lawson, Andrew B.; Kirby, Russell S.; Nuyts, Valerie; Watjou, Kevin; Aregay, Mehreteab; Carroll, Rachel; Nawrot, Tim S.; Faes, Christel

    2016-01-01

    Purpose: To investigate the distribution of mesothelioma in Flanders using Bayesian disease mapping models that account for both an excess of zeros and overdispersion. Methods: The numbers of newly diagnosed mesothelioma cases within all Flemish municipalities between 1999 and 2008 were obtained from the Belgian Cancer Registry. To deal with overdispersion, zero-inflation and geographical association, the hurdle combined model was proposed, which has three components: a Bernoulli zero-inflation mixture component to account for excess zeros, a gamma random effect to adjust for overdispersion and a normal conditional autoregressive random effect to attribute spatial association. This model was compared with other existing methods in literature. Results: The results indicate that hurdle models with a random effects term accounting for extra-variance in the Bernoulli zero-inflation component fit the data better than hurdle models that do not take overdispersion in the occurrence of zeros into account. Furthermore, traditional models that do not take into account excessive zeros but contain at least one random effects term that models extra-variance in the counts have better fits compared to their hurdle counterparts. In other words, the extra-variability, due to an excess of zeros, can be accommodated by spatially structured and/or unstructured random effects in a Poisson model such that the hurdle mixture model is not necessary. Conclusions: Models taking into account zero-inflation do not always provide better fits to data with excessive zeros than less complex models. In this study, a simple conditional autoregressive model identified a cluster in mesothelioma cases near a former asbestos processing plant (Kapelle-op-den-Bos). This observation is likely linked with historical local asbestos exposures. Future research will clarify this. PMID:27908590

  5. Disease mapping of zero-excessive mesothelioma data in Flanders.

    PubMed

    Neyens, Thomas; Lawson, Andrew B; Kirby, Russell S; Nuyts, Valerie; Watjou, Kevin; Aregay, Mehreteab; Carroll, Rachel; Nawrot, Tim S; Faes, Christel

    2017-01-01

    To investigate the distribution of mesothelioma in Flanders using Bayesian disease mapping models that account for both an excess of zeros and overdispersion. The numbers of newly diagnosed mesothelioma cases within all Flemish municipalities between 1999 and 2008 were obtained from the Belgian Cancer Registry. To deal with overdispersion, zero inflation, and geographical association, the hurdle combined model was proposed, which has three components: a Bernoulli zero-inflation mixture component to account for excess zeros, a gamma random effect to adjust for overdispersion, and a normal conditional autoregressive random effect to attribute spatial association. This model was compared with other existing methods in literature. The results indicate that hurdle models with a random effects term accounting for extra variance in the Bernoulli zero-inflation component fit the data better than hurdle models that do not take overdispersion in the occurrence of zeros into account. Furthermore, traditional models that do not take into account excessive zeros but contain at least one random effects term that models extra variance in the counts have better fits compared to their hurdle counterparts. In other words, the extra variability, due to an excess of zeros, can be accommodated by spatially structured and/or unstructured random effects in a Poisson model such that the hurdle mixture model is not necessary. Models taking into account zero inflation do not always provide better fits to data with excessive zeros than less complex models. In this study, a simple conditional autoregressive model identified a cluster in mesothelioma cases near a former asbestos processing plant (Kapelle-op-den-Bos). This observation is likely linked with historical local asbestos exposures. Future research will clarify this. Copyright © 2016 Elsevier Inc. All rights reserved.
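
    As a minimal illustration of the hurdle idea in these two records, the sketch below fits a non-spatial hurdle count model (a Bernoulli part for zero versus positive counts and a zero-truncated Poisson for the positive counts) by maximum likelihood. The CAR and gamma random effects of the actual analysis are omitted, and the data are synthetic.

```python
# Minimal non-spatial sketch of a hurdle count model: a Bernoulli "any case?" part
# plus a zero-truncated Poisson for positive counts (the paper's CAR and gamma
# random effects are omitted here).
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

rng = np.random.default_rng(2)

# Toy data: municipality-level counts with excess zeros.
n = 400
true_pi, true_lam = 0.6, 3.0           # P(positive count), Poisson mean of positives
positive = rng.random(n) < true_pi
y = np.zeros(n, dtype=int)
k = rng.poisson(true_lam, size=positive.sum())
while np.any(k == 0):                  # rejection sampling for the zero-truncated part
    k[k == 0] = rng.poisson(true_lam, size=(k == 0).sum())
y[positive] = k

def negloglik(params, y):
    """Hurdle likelihood with logit(pi) and log(lambda) as free parameters."""
    pi = expit(params[0])
    lam = np.exp(params[1])
    zero = (y == 0)
    ll_zero = np.log(1 - pi) * zero.sum()
    yp = y[~zero]
    # zero-truncated Poisson: P(Y=y) = Pois(y; lam) / (1 - exp(-lam))
    ll_pos = (np.log(pi) * yp.size
              + np.sum(yp * np.log(lam) - lam - gammaln(yp + 1))
              - yp.size * np.log1p(-np.exp(-lam)))
    return -(ll_zero + ll_pos)

fit = minimize(negloglik, x0=np.zeros(2), args=(y,), method="BFGS")
print("estimated pi     :", round(float(expit(fit.x[0])), 3))
print("estimated lambda :", round(float(np.exp(fit.x[1])), 3))
```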

  6. Improving estimation of kinetic parameters in dynamic force spectroscopy using cluster analysis

    NASA Astrophysics Data System (ADS)

    Yen, Chi-Fu; Sivasankar, Sanjeevi

    2018-03-01

    Dynamic Force Spectroscopy (DFS) is a widely used technique to characterize the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and substrate, are ruptured at different stress rates by varying the speed at which the AFM-tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (koff) and the width of the potential energy barrier (xβ) are extracted. However, due to large uncertainties in determining mean forces and loading rates of the groups, errors in the estimated koff and xβ can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of koff and xβ, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, particularly K-means clustering, is very effective in improving the accuracy of parameter estimation, especially when the number of unbinding events is limited and the events are not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
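
    The sketch below illustrates the clustering-based grouping on simulated DFS data: rupture events are grouped by K-means in (log loading rate, force) space, and the group means are fitted with the standard Bell-Evans relation F* = (kBT/xβ) ln(r xβ / (koff kBT)). The simulation parameters and the use of a simple linear fit are assumptions for illustration, not the paper's Monte Carlo setup.

```python
# Hedged sketch: sort simulated rupture events into groups with K-means and fit
# the Bell-Evans relation F* = (kBT/x_beta) * ln(r * x_beta / (koff * kBT))
# to the group means. SI units throughout; all parameters are illustrative.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
kBT = 4.11e-21            # J at ~298 K
x_beta = 0.3e-9           # m
koff = 1.0                # 1/s

# Simulated rupture forces (Bell-Evans most probable force + noise) for four
# nominal loading rates r (N/s), all events pooled together.
rates = np.repeat([1e2, 1e3, 1e4, 1e5], 300) * rng.lognormal(0, 0.2, 1200)
forces = kBT / x_beta * np.log(rates * x_beta / (koff * kBT))
forces += rng.normal(0, 5e-12, forces.size)          # ~5 pN measurement noise

# Cluster events in (log10 rate, force in pN) space instead of grouping by speed.
X = np.column_stack([np.log10(rates), forces / 1e-12])
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Mean force and mean log loading rate per cluster, then a linear fit F vs ln(r).
F_mean = np.array([forces[labels == c].mean() for c in range(4)])
lnr_mean = np.array([np.log(rates[labels == c]).mean() for c in range(4)])
slope, intercept = np.polyfit(lnr_mean, F_mean, 1)

x_beta_hat = kBT / slope
koff_hat = (1.0 / slope) * np.exp(-intercept / slope)
print(f"x_beta: true {x_beta:.2e} m, estimated {x_beta_hat:.2e} m")
print(f"koff  : true {koff:.2e} 1/s, estimated {koff_hat:.2e} 1/s")
```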

  7. Underdetermined blind separation of three-way fluorescence spectra of PAHs in water

    NASA Astrophysics Data System (ADS)

    Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing

    2018-06-01

    In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatter points corresponding to local energy maxima in the time-frequency domain, and the spectra of the object components are recovered by a pseudo-inverse technique. As an example, three and four pure-component spectra can be blindly extracted with this method from two samples of their mixtures, with similarities between the resolved and reference spectra all above 0.80. This work opens a new and effective path toward monitoring PAHs in water by the three-way fluorescence spectroscopy technique.
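
    A rough sketch of the general sparse-component-analysis recipe (clustering of sample directions to estimate the mixing matrix, followed by pseudo-inverse recovery) is given below. It is not the authors' algorithm: the data are synthetic, plain k-means stands in for the fuzzy clustering, and no time-frequency transform is applied.

```python
# Rough sketch of sparse component analysis (not the authors' exact algorithm):
# estimate mixing-matrix columns by clustering the directions of mixture samples
# in which a single sparse source dominates, then recover sources via pseudo-inverse.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)

# 3 sparse "spectra" (sources), 2 observed mixtures -> underdetermined case.
n = 3000
S = rng.exponential(1.0, size=(3, n)) * (rng.random((3, n)) < 0.1)  # mostly zeros
A = np.array([[0.9, 0.5, 0.1],
              [0.2, 0.6, 0.95]])                                    # true mixing matrix
X = A @ S

# Keep samples with enough energy; normalise them to unit vectors (directions).
keep = np.linalg.norm(X, axis=0) > 0.5
D = X[:, keep] / np.linalg.norm(X[:, keep], axis=0)

# Cluster directions: the cluster centres approximate the (normalised) mixing columns.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(D.T)
A_hat = km.cluster_centers_.T
A_hat /= np.linalg.norm(A_hat, axis=0)

# Minimum-norm recovery of the sources (adequate only because the sources are sparse).
S_hat = np.linalg.pinv(A_hat) @ X

print("estimated mixing columns (up to order/scale):\n", A_hat.round(2))
print("true mixing columns (normalised):\n", (A / np.linalg.norm(A, axis=0)).round(2))
```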

  8. Carbon solids in oxygen-deficient explosives (LA-UR-13-21151)

    NASA Astrophysics Data System (ADS)

    Peery, Travis

    2013-06-01

    The phase behavior of excess carbon in oxygen-deficient explosives has a significant effect on detonation properties and product equations of state. Mixtures of fuel oil in ammonium nitrate (ANFO) above a stoichiometric ratio demonstrate that even small amounts of graphite, on the order of 5% by mole fraction, can substantially alter the Chapman-Jouguet (CJ) state properties, a central ingredient in modeling the product equation of state. Similar effects can be seen for Composition B, which borders the carbon phase boundary between graphite and diamond. Nano-diamond formation adds complexity to the product modeling because of surface adsorption effects. I will discuss these carbon phase issues in our equation of state modeling of detonation products, including our statistical mechanics description of carbon clustering and surface chemistry to properly treat solid carbon formation. This work is supported by the Advanced Simulation and Computing Program, under the NNSA.

  9. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides a variable selection criterion and an interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872

  10. Demixing in symmetric supersolid mixtures

    NASA Astrophysics Data System (ADS)

    Jain, Piyush; Moroni, Saverio; Boninsegni, Massimo; Pollet, Lode

    2013-09-01

    The droplet crystal phase of a symmetric binary mixture of soft-core bosons is studied by computer simulation. At high temperature each droplet comprises on average equal numbers of particles of either component, but the two components demix below the supersolid transition temperature, i.e., droplets mostly consist of particles of one component. Clustering of droplets of the same component is also observed. Demixing is driven by quantum tunneling of particles across droplets over the system and does not take place in an insulating crystal. This effect provides an unambiguous experimental signature of supersolidity.

  11. MRI-alone radiation therapy planning for prostate cancer: Automatic fiducial marker detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghose, Soumya, E-mail: soumya.ghose@case.edu; Mitra, Jhimli; Rivest-Hénault, David

    Purpose: The feasibility of radiation therapy treatment planning using substitute computed tomography (sCT) generated from magnetic resonance images (MRIs) has been demonstrated by a number of research groups. One challenge with an MRI-alone workflow is the accurate identification of intraprostatic gold fiducial markers, which are frequently used for prostate localization prior to each dose delivery fraction. This paper investigates a template-matching approach for the detection of these seeds in MRI. Methods: Two different gradient echo T1 and T2* weighted MRI sequences were acquired from fifteen prostate cancer patients and evaluated for seed detection. For training, seed templates from manual contours were selected in a spectral clustering manifold learning framework. This aids in clustering “similar” gold fiducial markers together. The marker with the minimum distance to a cluster centroid was selected as the representative template of that cluster during training. During testing, Gaussian mixture modeling followed by a Markovian model was used in automatic detection of the probable candidates. The probable candidates were rigidly registered to the templates identified from spectral clustering, and a similarity metric is computed for ranking and detection. Results: A fiducial detection accuracy of 95% was obtained compared to manual observations. Expert radiation therapist observers were able to correctly identify all three implanted seeds on 11 of the 15 scans (the proposed method correctly identified all seeds on 10 of the 15). Conclusions: A novel automatic framework for gold fiducial marker detection in MRI is proposed and evaluated with detection accuracies comparable to manual detection. When radiation therapists are unable to determine the seed location in MRI, they refer back to the planning CT (only available in the existing clinical framework); similarly, an automatic quality control is built into the automatic software to ensure that all gold seeds are either correctly detected or a warning is raised for further manual intervention.

  12. MRI-alone radiation therapy planning for prostate cancer: Automatic fiducial marker detection.

    PubMed

    Ghose, Soumya; Mitra, Jhimli; Rivest-Hénault, David; Fazlollahi, Amir; Stanwell, Peter; Pichler, Peter; Sun, Jidi; Fripp, Jurgen; Greer, Peter B; Dowling, Jason A

    2016-05-01

    The feasibility of radiation therapy treatment planning using substitute computed tomography (sCT) generated from magnetic resonance images (MRIs) has been demonstrated by a number of research groups. One challenge with an MRI-alone workflow is the accurate identification of intraprostatic gold fiducial markers, which are frequently used for prostate localization prior to each dose delivery fraction. This paper investigates a template-matching approach for the detection of these seeds in MRI. Two different gradient echo T1 and T2* weighted MRI sequences were acquired from fifteen prostate cancer patients and evaluated for seed detection. For training, seed templates from manual contours were selected in a spectral clustering manifold learning framework. This aids in clustering "similar" gold fiducial markers together. The marker with the minimum distance to a cluster centroid was selected as the representative template of that cluster during training. During testing, Gaussian mixture modeling followed by a Markovian model was used in automatic detection of the probable candidates. The probable candidates were rigidly registered to the templates identified from spectral clustering, and a similarity metric is computed for ranking and detection. A fiducial detection accuracy of 95% was obtained compared to manual observations. Expert radiation therapist observers were able to correctly identify all three implanted seeds on 11 of the 15 scans (the proposed method correctly identified all seeds on 10 of the 15). A novel automatic framework for gold fiducial marker detection in MRI is proposed and evaluated with detection accuracies comparable to manual detection. When radiation therapists are unable to determine the seed location in MRI, they refer back to the planning CT (only available in the existing clinical framework); similarly, an automatic quality control is built into the automatic software to ensure that all gold seeds are either correctly detected or a warning is raised for further manual intervention.

  13. Translational and Rotational Diffusion of Two Differently Charged Solutes in Ethylammonium Nitrate-Methanol Mixture: Does the Nanostructure of the Amphiphiles Influence the Motion of the Solute?

    PubMed

    Kundu, Niloy; Roy, Arpita; Dutta, Rupam; Sarkar, Nilmoni

    2016-06-23

    In this Article, we have investigated the translational and rotational diffusion of two structurally similar but differently charged solutes (rhodamine 6G perchlorate and fluorescein sodium salt) in ethylammonium nitrate (EAN)-methanol (CH3OH) mixture to understand the effect of added ionic liquid on the motion of the solutes. EAN and CH3OH both are amphiphilic molecules and characterized by an extended hydrogen bonding network. Recently, Russina et al. found that a wide distribution of clusters exist in the CH3OH rich region (0.10 ≤ χEAN ≤ 0.15) and EAN molecules preserve their bulk-sponge-like morphology (Russina, O.; Sferrazza, A.; Caminiti, R.; Triolo, A. J. Phys. Chem. Lett. 2014, 5, 1738-1742). The effect of this microheterogeneous mixture on the solute's motion shows some interesting results compared to other PIL (protic ionic liquid)-cosolvent mixtures. Analysis of the time-resolved anisotropy data with the aid of Stokes-Einstein-Debye (SED) hydrodynamic theory predicts that the reorientation time of both of the solutes appears close to the stick hydrodynamic line in the methanol rich region. The hydrogen bond accepting solutes experience specific interaction with CH3OH, and with increasing concentration of EAN, the specific interaction between the solute and solvent molecules is decreased while the decrease is more prominent in the low mole fraction of EAN due to the large size of cluster formation. The temperature dependent anisotropy measurements show that the hydrogen bonding interaction between EAN and CH3OH is increased with increasing temperature. Moreover, fluorescence correlation spectroscopy (FCS) shows the dynamic heterogeneity of the mixture which is due to the segregation of the alkyl chain of the PIL. Formation of a large cluster at a low mole fraction of IL (0.10 ≤ χEAN ≤ 0.15) can be proved by the insensitivity of the translational diffusion and rotational activation energy of the solutes to the concentration of EAN. Thus, the result of the work suggests that the addition of EAN to the CH3OH affects the specific interaction between solute and solvent and, as a consequence, the translational motion as well as the rotational motion of the solutes are modulated.

  14. Constraints on the formation history of the elliptical galaxy NGC 3923 from the colors of its globular clusters

    NASA Technical Reports Server (NTRS)

    Zepf, Stephen E.; Ashman, Keith M.; Geisler, Doug

    1995-01-01

    We present a study of the colors of globular clusters associated with the elliptical galaxy NGC 3923. Our final sample consists of Washington system C and T_1 photometry for 143 globular cluster candidates with an expected contamination of no more than 10%. We find that the color distribution of the NGC 3923 globular cluster system (GCS) is broad and appears to have at least two peaks. A mixture modeling analysis of the color distribution indicates that a two-component model is favored over a single-component one at a high level of confidence (greater than 99%). This evidence for more than one population in the GCS of NGC 3923 is similar to that previously noted for the four other elliptical galaxies for which similar data have been published. Furthermore, we find that the NGC 3923 GCS is redder than the GCSs of previously studied elliptical galaxies of similar luminosity. The median metallicity inferred from our (C-T_1)_0 colors is [Fe/H]_med = -0.56, with an uncertainty of 0.14 dex arising from all sources of uncertainty in the mean color. This is more metal-rich than the median metallicity found for the GCS of M87 using the same method, [Fe/H]_med = -0.94. Since M87 is more luminous than NGC 3923, this result points to significant scatter about any trend of higher GCS metallicity with increasing galaxy luminosity. We also show that there is a color gradient in the NGC 3923 GCS corresponding to about -0.5 dex in Δ[Fe/H]/Δ(log r). We conclude that the shape of the color distribution of individual GCSs and the variation in mean color among the GCSs of ellipticals are difficult to understand if elliptical galaxies are formed in a single protogalactic collapse. Models in which ellipticals and their globular clusters are formed in more than one event, such as a merger scenario, are more successful in accounting for these observations.
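
    A minimal sketch of the kind of mixture-model test referred to above: fit one- and two-component Gaussian mixtures to a (synthetic) colour distribution and compare them by BIC. The colour values and component parameters are invented for illustration; the paper's actual analysis used its own mixture-modeling procedure and confidence test.

```python
# Illustrative sketch: is a globular-cluster colour distribution better described
# by two Gaussian components than by one? Synthetic colours, BIC comparison.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)

# Synthetic C - T1 colours: a blue and a red subpopulation (143 candidates in total).
colours = np.concatenate([rng.normal(1.30, 0.12, 80),
                          rng.normal(1.75, 0.15, 63)]).reshape(-1, 1)

fits = {k: GaussianMixture(n_components=k, n_init=5, random_state=0).fit(colours)
        for k in (1, 2)}

for k, gm in fits.items():
    print(f"{k} component(s): logL = {gm.score(colours) * len(colours):8.2f}, "
          f"BIC = {gm.bic(colours):8.2f}")

best = min(fits, key=lambda k: fits[k].bic(colours))
print("preferred model:", best, "component(s)")
if best == 2:
    print("component means (colour peaks):", np.sort(fits[2].means_.ravel()).round(2))
```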

  15. Estimating the concrete compressive strength using hard clustering and fuzzy clustering based regression techniques.

    PubMed

    Nagwani, Naresh Kumar; Deo, Shirish V

    2014-01-01

    Understanding the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, proportioning of new mixtures, and quality assurance. Regression techniques are the most widely used methods for prediction tasks in which the relationship between the independent variables and the dependent (prediction) variable is identified. The accuracy of regression techniques for prediction can be improved if clustering is used along with regression, since clustering ensures a more accurate curve fit between the dependent and independent variables. In this work a cluster regression technique is applied for estimating the compressive strength of concrete, and a novel state-of-the-art approach is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression yields smaller prediction errors when estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group concrete data with similar characteristics, and in the second stage regression techniques are applied to these clusters (groups) to predict the compressive strength within each cluster. Experiments show that clustering combined with regression gives the smallest errors for predicting the compressive strength of concrete, and that the fuzzy C-means clustering algorithm performs better than the K-means algorithm.
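
    A small sketch of the two-stage cluster-then-regress idea on synthetic mix-design data is shown below (K-means in the first stage, one linear regression per cluster in the second). The features, the strength formula, and the number of clusters are assumptions; the study itself also evaluates fuzzy C-means.

```python
# Sketch of the two-stage cluster-then-regress idea on synthetic data:
# 1) group concrete mixes with K-means, 2) fit one regression per cluster,
# 3) predict new mixes with the regressor of their nearest cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)

# Synthetic mix designs: cement, water, age (days) -> compressive strength (MPa).
n = 600
X = np.column_stack([rng.uniform(150, 500, n),     # cement kg/m3
                     rng.uniform(120, 250, n),     # water  kg/m3
                     rng.uniform(3, 90, n)])       # age    days
strength = 0.12 * X[:, 0] - 0.15 * X[:, 1] + 8 * np.log(X[:, 2]) + rng.normal(0, 3, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, strength, random_state=0)

# Stage 1: cluster the training mixes.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_tr)

# Stage 2: one regression model per cluster.
models = {c: LinearRegression().fit(X_tr[km.labels_ == c], y_tr[km.labels_ == c])
          for c in range(km.n_clusters)}

# Predict: route each test mix to its cluster's regressor.
test_clusters = km.predict(X_te)
y_hat = np.array([models[c].predict(x.reshape(1, -1))[0]
                  for c, x in zip(test_clusters, X_te)])

baseline = LinearRegression().fit(X_tr, y_tr).predict(X_te)
print(f"MAE, single regression   : {mean_absolute_error(y_te, baseline):.2f}")
print(f"MAE, cluster + regression: {mean_absolute_error(y_te, y_hat):.2f}")
```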

  16. Estimating the Concrete Compressive Strength Using Hard Clustering and Fuzzy Clustering Based Regression Techniques

    PubMed Central

    Nagwani, Naresh Kumar; Deo, Shirish V.

    2014-01-01

    Understanding the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, proportioning of new mixtures, and quality assurance. Regression techniques are the most widely used methods for prediction tasks in which the relationship between the independent variables and the dependent (prediction) variable is identified. The accuracy of regression techniques for prediction can be improved if clustering is used along with regression, since clustering ensures a more accurate curve fit between the dependent and independent variables. In this work a cluster regression technique is applied for estimating the compressive strength of concrete, and a novel state-of-the-art approach is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression yields smaller prediction errors when estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group concrete data with similar characteristics, and in the second stage regression techniques are applied to these clusters (groups) to predict the compressive strength within each cluster. Experiments show that clustering combined with regression gives the smallest errors for predicting the compressive strength of concrete, and that the fuzzy C-means clustering algorithm performs better than the K-means algorithm. PMID:25374939

  17. Quantum molecular dynamics study on the proton exchange, ionic structures, and transport properties of warm dense hydrogen-deuterium mixtures

    NASA Astrophysics Data System (ADS)

    Liu, Lei; Li, Zhi-Guo; Dai, Jia-Yu; Chen, Qi-Feng; Chen, Xiang-Rong

    2018-06-01

    Comprehensive knowledge of physical properties such as equation of state (EOS), proton exchange, dynamic structures, diffusion coefficients, and viscosities of hydrogen-deuterium mixtures with densities from 0.1 to 5 g/cm^3 and temperatures from 1 to 50 kK has been presented via quantum molecular dynamics (QMD) simulations. The existing multi-shock experimental EOS provides an important benchmark to evaluate exchange-correlation functionals. The comparison of simulations with experiments indicates that a nonlocal van der Waals density functional (vdW-DF1) produces excellent results. Fraction analysis of molecules using a weighted integral over pair distribution functions was performed. A dissociation diagram together with a boundary where the proton exchange (H2 + D2 ⇌ 2HD) occurs was generated, which shows evidence that the HD molecules form as the H2 and D2 molecules are almost 50% dissociated. The mechanism of proton exchange can be interpreted as a process of dissociation followed by recombination. The ionic structures at extreme conditions were analyzed by the effective coordination number model. High-order cluster, circle, and chain structures can be found in the strongly coupled warm dense regime. The present QMD diffusion coefficients and viscosities can be used to benchmark two analytical one-component plasma (OCP) models: the Coulomb and Yukawa OCP models.

  18. The effect of binary mixtures of zinc, copper, cadmium, and nickel on the growth of the freshwater diatom Navicula pelliculosa and comparison with mixture toxicity model predictions.

    PubMed

    Nagai, Takashi; De Schamphelaere, Karel A C

    2016-11-01

    The authors investigated the effect of binary mixtures of zinc (Zn), copper (Cu), cadmium (Cd), and nickel (Ni) on the growth of a freshwater diatom, Navicula pelliculosa. A 7 × 7 full factorial experimental design (49 combinations in total) was used to test each binary metal mixture. A 3-d fluorescence microplate toxicity assay was used to test each combination. Mixture effects were predicted by concentration addition and independent action models based on a single-metal concentration-response relationship between the relative growth rate and the calculated free metal ion activity. Although the concentration addition model predicted the observed mixture toxicity significantly better than the independent action model for the Zn-Cu mixture, the independent action model predicted the observed mixture toxicity significantly better than the concentration addition model for the Cd-Zn, Cd-Ni, and Cd-Cu mixtures. For the Zn-Ni and Cu-Ni mixtures, it was unclear which of the 2 models was better. Statistical analysis concerning antagonistic/synergistic interactions showed that the concentration addition model is generally conservative (with the Zn-Ni mixture being the sole exception), indicating that the concentration addition model would be useful as a method for a conservative first-tier screening-level risk analysis of metal mixtures. Environ Toxicol Chem 2016;35:2765-2773. © 2016 SETAC.
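
    The two reference models compared in this study can be written down compactly. The sketch below combines made-up single-metal log-logistic concentration-response curves under concentration addition (toxic units summing to one) and independent action (multiplication of unaffected fractions); the EC50s, slopes, and test concentrations are hypothetical, not the paper's fitted values.

```python
# Sketch of the two reference models for binary metal mixtures, using made-up
# log-logistic concentration-response curves (fraction affected vs concentration).
import numpy as np
from scipy.optimize import brentq

def effect(c, ec50, slope):
    """Log-logistic fraction affected for a single metal."""
    return 1.0 / (1.0 + (ec50 / np.maximum(c, 1e-12)) ** slope)

def inverse_effect(E, ec50, slope):
    """Concentration producing fraction affected E (inverse of `effect`)."""
    return ec50 * (E / (1.0 - E)) ** (1.0 / slope)

def concentration_addition(conc, params):
    """Solve sum_i c_i / EC_{E,i} = 1 for the mixture effect E."""
    def toxic_units(E):
        return sum(c / inverse_effect(E, *p) for c, p in zip(conc, params)) - 1.0
    return brentq(toxic_units, 1e-9, 1 - 1e-9)

def independent_action(conc, params):
    """E_mix = 1 - prod_i (1 - E_i(c_i))."""
    return 1.0 - np.prod([1.0 - effect(c, *p) for c, p in zip(conc, params)])

# Hypothetical single-metal parameters (EC50 in ug/L, slope) for Cu and Zn.
params = [(20.0, 2.0), (300.0, 1.5)]
mixture = (10.0, 150.0)          # test concentrations of Cu and Zn

print(f"CA-predicted fraction affected: {concentration_addition(mixture, params):.3f}")
print(f"IA-predicted fraction affected: {independent_action(mixture, params):.3f}")
```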

  19. Surfactant-controlled polymerization of semiconductor clusters to quantum dots through competing step-growth and living chain-growth mechanisms.

    PubMed

    Evans, Christopher M; Love, Alyssa M; Weiss, Emily A

    2012-10-17

    This article reports control of the competition between step-growth and living chain-growth polymerization mechanisms in the formation of cadmium chalcogenide colloidal quantum dots (QDs) from CdSe(S) clusters by varying the concentration of anionic surfactant in the synthetic reaction mixture. The growth of the particles proceeds by step-addition from initially nucleated clusters in the absence of excess phosphinic or carboxylic acids, which adsorb as their anionic conjugate bases, and proceeds indirectly by dissolution of clusters, and subsequent chain-addition of monomers to stable clusters (Ostwald ripening) in the presence of excess phosphinic or carboxylic acid. Fusion of clusters by step-growth polymerization is an explanation for the consistent observation of so-called "magic-sized" clusters in QD growth reactions. Living chain-addition (chain addition with no explicit termination step) produces QDs over a larger range of sizes with better size dispersity than step-addition. Tuning the molar ratio of surfactant to Se(2-)(S(2-)), the limiting ionic reagent, within the living chain-addition polymerization allows for stoichiometric control of QD radius without relying on reaction time.

  20. A fast learning method for large scale and multi-class samples of SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class support vector machine (SVM) classification based on a binary tree is presented to address the low learning efficiency of SVMs when processing large-scale multi-class samples. A bottom-up procedure is used to build the binary-tree hierarchy, and, following this hierarchy, a sub-classifier at each node learns from the samples corresponding to that node. During learning, several class clusters are generated by a first clustering of the training samples. Central points are first extracted from those class clusters that contain only one type of sample. For clusters that contain two types of samples, the numbers of clusters for their positive and negative samples are set according to their degree of mixture, a secondary clustering is then performed, and central points are extracted from the resulting sub-class clusters. Sub-classifiers are obtained by learning from the reduced sample set formed by combining the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, maintains high classification accuracy while greatly reducing the number of training samples and effectively improving learning efficiency.
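
    The core sample-reduction step can be sketched as follows: each class's training samples are replaced by K-means cluster centres, and the SVM is trained on the much smaller reduced set. The binary-tree arrangement of sub-classifiers and the mixture-degree heuristic are omitted; the dataset and all settings are illustrative.

```python
# Sketch of the sample-reduction idea: replace each class's training samples by
# K-means cluster centres and train the SVM on the (much smaller) reduced set.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_classes=5, n_clusters_per_class=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Reduce each class to a handful of cluster centres.
centres_per_class = 30
Xr, yr = [], []
for c in np.unique(y_tr):
    km = KMeans(n_clusters=centres_per_class, n_init=4, random_state=0)
    km.fit(X_tr[y_tr == c])
    Xr.append(km.cluster_centers_)
    yr.append(np.full(centres_per_class, c))
Xr, yr = np.vstack(Xr), np.concatenate(yr)

svm_full = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
svm_reduced = SVC(kernel="rbf", gamma="scale").fit(Xr, yr)

print(f"training samples: full {len(X_tr)}, reduced {len(Xr)}")
print(f"accuracy full   : {svm_full.score(X_te, y_te):.3f}")
print(f"accuracy reduced: {svm_reduced.score(X_te, y_te):.3f}")
```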

  1. Mixture Rasch Models with Joint Maximum Likelihood Estimation

    ERIC Educational Resources Information Center

    Willse, John T.

    2011-01-01

    This research provides a demonstration of the utility of mixture Rasch models. Specifically, a model capable of estimating a mixture partial credit model using joint maximum likelihood is presented. Like the partial credit model, the mixture partial credit model has the beneficial feature of being appropriate for analysis of assessment data…

  2. Signal Partitioning Algorithm for Highly Efficient Gaussian Mixture Modeling in Mass Spectrometry

    PubMed Central

    Polanski, Andrzej; Marczyk, Michal; Pietrowska, Monika; Widlak, Piotr; Polanska, Joanna

    2015-01-01

    Mixture modeling of mass spectra is an approach with many potential applications, including peak detection and quantification, smoothing, de-noising, feature extraction, and spectral signal compression. However, existing algorithms do not allow for automated analyses of whole spectra. Therefore, despite the potential advantages of mixture modeling of mass spectra of peptide/protein mixtures highlighted by preliminary results presented in several papers, the mixture modeling approach had so far not been developed to a stage enabling systematic comparisons with existing software packages for proteomic mass spectra analyses. In this paper we present an efficient algorithm for Gaussian mixture modeling of proteomic mass spectra of different types (e.g., MALDI-ToF profiling, MALDI-IMS). The main idea is automated partitioning of the protein mass spectral signal into fragments. The obtained fragments are separately decomposed into Gaussian mixture models. The parameters of the mixture models of the fragments are then aggregated to form the mixture model of the whole spectrum. We compare the proposed algorithm to existing algorithms for peak detection and demonstrate the improvements in peak detection efficiency obtained by using Gaussian mixture modeling. We also show applications of the algorithm to real proteomic datasets of low and high resolution. PMID:26230717
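
    A minimal sketch of the fragment-wise idea is given below: the spectrum is split at low-intensity stretches, each fragment is decomposed into a small Gaussian mixture (with the number of components chosen by BIC), and the components are pooled with weights proportional to the fragment's share of the total signal. The synthetic spectrum, the intensity threshold, and the BIC selection are assumptions, not the published algorithm.

```python
# Sketch of fragment-wise Gaussian mixture modeling of a mass spectrum:
# split at low-intensity valleys, fit a small GMM per fragment (treating the
# normalised intensity as a density over m/z), then pool the components.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)

# Synthetic spectrum: Gaussian peaks (centre, width, height) on a noisy baseline.
mz = np.linspace(1000, 1100, 4000)
peaks = [(1012.0, 0.6, 5.0), (1015.0, 0.8, 3.0), (1050.0, 1.2, 9.0), (1078.0, 0.7, 4.0)]
intensity = sum(h * np.exp(-0.5 * ((mz - c) / w) ** 2) for c, w, h in peaks)
intensity += np.abs(rng.normal(0, 0.05, mz.size))

# 1) Partition the signal into fragments separated by low-intensity stretches.
active = intensity > 0.5
edges = np.flatnonzero(np.diff(active.astype(int)))
fragments = [(edges[i] + 1, edges[i + 1] + 1) for i in range(0, len(edges) - 1, 2)]

# 2) Fit a small Gaussian mixture to each fragment and 3) pool the components.
components = []
total = intensity.sum()
for lo, hi in fragments:
    w_frag = intensity[lo:hi].sum() / total
    # resample m/z values proportional to intensity so a GMM can be fitted
    sample = rng.choice(mz[lo:hi], size=2000, p=intensity[lo:hi] / intensity[lo:hi].sum())
    best = min((GaussianMixture(k, n_init=3, random_state=0).fit(sample.reshape(-1, 1))
                for k in (1, 2, 3)),
               key=lambda g: g.bic(sample.reshape(-1, 1)))
    for wk, mu, var in zip(best.weights_, best.means_.ravel(), best.covariances_.ravel()):
        components.append((mu, np.sqrt(var), wk * w_frag))

for mu, sigma, w in sorted(components):
    print(f"peak at m/z {mu:7.2f}  width {sigma:4.2f}  relative weight {w:5.3f}")
```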

  3. A Holarctic Biogeographical Analysis of the Collembola (Arthropoda, Hexapoda) Unravels Recent Post-Glacial Colonization Patterns

    PubMed Central

    Ávila-Jiménez, María Luisa; Coulson, Stephen James

    2011-01-01

    We aimed to describe the main Arctic biogeographical patterns of the Collembola, and analyze historical factors and current climatic regimes determining Arctic collembolan species distribution. Furthermore, we aimed to identify possible dispersal routes, colonization sources and glacial refugia for Arctic collembola. We implemented a Gaussian Mixture Clustering method on species distribution ranges and applied a distance- based parametric bootstrap test on presence-absence collembolan species distribution data. Additionally, multivariate analysis was performed considering species distributions, biodiversity, cluster distribution and environmental factors (temperature and precipitation). No clear relation was found between current climatic regimes and species distribution in the Arctic. Gaussian Mixture Clustering found common elements within Siberian areas, Atlantic areas, the Canadian Arctic, a mid-Siberian cluster and specific Beringian elements, following the same pattern previously described, using a variety of molecular methods, for Arctic plants. Species distribution hence indicate the influence of recent glacial history, as LGM glacial refugia (mid-Siberia, and Beringia) and major dispersal routes to high Arctic island groups can be identified. Endemic species are found in the high Arctic, but no specific biogeographical pattern can be clearly identified as a sign of high Arctic glacial refugia. Ocean currents patterns are suggested as being an important factor shaping the distribution of Arctic Collembola, which is consistent with Antarctic studies in collembolan biogeography. The clear relations between cluster distribution and geographical areas considering their recent glacial history, lack of relationship of species distribution with current climatic regimes, and consistency with previously described Arctic patterns in a series of organisms inferred using a variety of methods, suggest that historical phenomena shaping contemporary collembolan distribution can be inferred through biogeographical analysis. PMID:26467728

  4. MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.

    PubMed

    Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi

    2018-03-05

    Currently a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for the CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, at each genomic position. MSeq-CNV is applied on simulated data and also on real data of six HapMap individuals with high-coverage sequencing, in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.

  5. Infalling groups and galaxy transformations in the cluster A2142

    NASA Astrophysics Data System (ADS)

    Einasto, Maret; Deshev, Boris; Lietzen, Heidi; Kipper, Rain; Tempel, Elmo; Park, Changbom; Gramann, Mirt; Heinämäki, Pekka; Saar, Enn; Einasto, Jaan

    2018-03-01

    Context. Superclusters of galaxies provide dynamical environments for the study of the formation and evolution of structures in the cosmic web from galaxies, to the richest galaxy clusters, and superclusters themselves. Aims: We study galaxy populations and search for possible merging substructures in the rich galaxy cluster A2142 in the collapsing core of the supercluster SCl A2142, which may give rise to radio and X-ray structures in the cluster, and affect galaxy properties of this cluster. Methods: We used normal mixture modelling to select substructure of the cluster A2142. We compared alignments of the cluster, its brightest galaxies (hereafter BCGs), subclusters, and supercluster axes. The projected phase space (PPS) diagram and clustercentric distributions are used to analyse the dynamics of the cluster and study the distribution of various galaxy populations in the cluster and subclusters. Results: We find several infalling galaxy groups and subclusters. The cluster, supercluster, BCGs, and one infalling subcluster are all aligned. Their orientation is correlated with the alignment of the radio and X-ray haloes of the cluster. Galaxy populations in the main cluster and in the outskirts subclusters are different. Galaxies in the centre of the main cluster at clustercentric distances Dc < 0.5 h^-1 Mpc (Dc/Rvir < 0.5, Rvir = 0.9 h^-1 Mpc) have older stellar populations (with the median age of 10-11 Gyr) than galaxies at larger clustercentric distances. Star-forming and recently quenched galaxies are located mostly at the clustercentric distances Dc ≈ 1.8 h^-1 Mpc, where subclusters fall into the cluster and the properties of galaxies change rapidly. In this region the median age of stellar populations of galaxies is about 2 Gyr. Galaxies in A2142 on average have higher stellar masses, lower star formation rates, and redder colours than galaxies in rich groups. The total mass in infalling groups and subclusters is M ≈ 6 × 10^14 h^-1 M⊙, that is, approximately half of the mass of the cluster. This mass is sufficient for the mass growth of the cluster from redshift z = 0.5 (half-mass epoch) to the present. Conclusions: Our analysis suggests that the cluster A2142 has formed as a result of past and present mergers and infallen groups, predominantly along the supercluster axis. Mergers cause the complex radio and X-ray structure of the cluster and affect the properties of galaxies in the cluster, especially at the boundaries of the cluster in the infall region. Explaining the differences between galaxy populations, mass, and richness of A2142, and other groups and clusters may lead to better insight about the formation and evolution of rich galaxy clusters.

  6. Metal mixtures in urban and rural populations in the US: The Multi-Ethnic Study of Atherosclerosis and the Strong Heart Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pang, Yuanjie, E-mail: yuanjie.p@gmail.com

    Background: Natural and anthropogenic sources of metal exposure differ for urban and rural residents. We sought to identify patterns of metal mixtures which could suggest common environmental sources and/or metabolic pathways of different urinary metals, and compared metal mixtures in two population-based studies from urban/sub-urban and rural/town areas in the US: the Multi-Ethnic Study of Atherosclerosis (MESA) and the Strong Heart Study (SHS). Methods: We studied a random sample of 308 White, Black, Chinese-American, and Hispanic participants in MESA (2000–2002) and 277 American Indian participants in SHS (1998–2003). We used principal component analysis (PCA), cluster analysis (CA), and linear discriminant analysis (LDA) to evaluate nine urinary metals (antimony [Sb], arsenic [As], cadmium [Cd], lead [Pb], molybdenum [Mo], selenium [Se], tungsten [W], uranium [U] and zinc [Zn]). For arsenic, we used the sum of inorganic and methylated species (∑As). Results: All nine urinary metals were higher in SHS compared to MESA participants. PCA and CA revealed the same patterns in SHS, suggesting 4 distinct principal components (PC) or clusters (∑As-U-W, Pb-Sb, Cd-Zn, Mo-Se). In MESA, CA showed 2 large clusters (∑As-Mo-Sb-U-W, Cd-Pb-Se-Zn), while PCA showed 4 PCs (Sb-U-W, Pb-Se-Zn, Cd-Mo, ∑As). LDA indicated that ∑As, U, W, and Zn were the most discriminant variables distinguishing MESA and SHS participants. Conclusions: In SHS, the ∑As-U-W cluster and PC might reflect groundwater contamination in rural areas, and the Cd-Zn cluster and PC could reflect common sources from meat products or metabolic interactions. Among the metals assayed, ∑As, U, W and Zn differed the most between MESA and SHS, possibly reflecting disproportionate exposure from drinking water and perhaps food in rural Native communities compared to urban communities around the US. - Highlights: • We identified and compared environmental sources of urinary metals in MESA and SHS. • ∑As-U-W in SHS may reflect groundwater contamination in rural areas. • Cd-Zn in SHS may reflect common sources from meat products or metabolic interaction. • ∑As, U, W, and Zn differed the most between MESA and SHS participants.

  7. Identifiability in N-mixture models: a large-scale screening test with bird data.

    PubMed

    Kéry, Marc

    2018-02-01

    Binomial N-mixture models have proven very useful in ecology, conservation, and monitoring: they allow estimation and modeling of abundance separately from detection probability using simple counts. Recently, doubts about parameter identifiability have been voiced. I conducted a large-scale screening test with 137 bird data sets from 2,037 sites. I found virtually no identifiability problems for Poisson and zero-inflated Poisson (ZIP) binomial N-mixture models, but negative-binomial (NB) models had problems in 25% of all data sets. The corresponding multinomial N-mixture models had no problems. Parameter estimates under Poisson and ZIP binomial and multinomial N-mixture models were extremely similar. Identifiability problems became a little more frequent with smaller sample sizes (267 and 50 sites), but were unaffected by whether the models did or did not include covariates. Hence, binomial N-mixture model parameters with Poisson and ZIP mixtures typically appeared identifiable. In contrast, NB mixtures were often unidentifiable, which is worrying since these were often selected by Akaike's information criterion. Identifiability of binomial N-mixture models should always be checked. If problems are found, simpler models, integrated models that combine different observation models or the use of external information via informative priors or penalized likelihoods, may help. © 2017 by the Ecological Society of America.
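
    For readers unfamiliar with the model class, the sketch below writes out the basic Poisson binomial N-mixture likelihood (latent abundance marginalised over a truncated range) and maximises it for simulated counts. It is a bare-bones illustration without covariates, zero inflation, or the negative-binomial variant discussed above.

```python
# Bare-bones sketch of a Poisson binomial N-mixture model: repeated counts y[i,t]
# at R sites, latent abundance N_i ~ Poisson(lambda), detection y[i,t] ~ Bin(N_i, p).
# The latent N_i is marginalised over 0..K and (lambda, p) estimated by ML.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import binom, poisson

rng = np.random.default_rng(8)

R, T = 150, 3                       # sites, repeat visits
lam_true, p_true = 4.0, 0.5
N = rng.poisson(lam_true, R)
y = rng.binomial(N[:, None], p_true, size=(R, T))

K = 50                              # truncation point for the latent abundance

def negloglik(params):
    lam, p = np.exp(params[0]), expit(params[1])
    Ns = np.arange(K + 1)
    prior = poisson.pmf(Ns, lam)                              # P(N = n)
    # P(y_i | N = n) for every site i and every candidate n
    lik = np.prod(binom.pmf(y[:, :, None], Ns[None, None, :], p), axis=1)
    site_lik = lik @ prior                                    # marginalise over N
    return -np.sum(np.log(site_lik + 1e-300))

fit = minimize(negloglik, x0=[np.log(2.0), 0.0], method="Nelder-Mead")
print(f"lambda: true {lam_true}, estimated {np.exp(fit.x[0]):.2f}")
print(f"p     : true {p_true}, estimated {expit(fit.x[1]):.2f}")
```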

  8. Quantitative characterization of the viscosity of a microemulsion

    NASA Technical Reports Server (NTRS)

    Berg, Robert F.; Moldover, Michael R.; Huang, John S.

    1987-01-01

    The viscosity of the three-component microemulsion water/decane/AOT has been measured as a function of temperature and droplet volume fraction. At temperatures well below the phase-separation temperature the viscosity is described by treating the droplets as hard spheres suspended in decane. Upon approaching the two-phase region from low temperature, there is a large (as much as a factor of four) smooth increase of the viscosity which may be related to the percolation-like transition observed in the electrical conductivity. This increase in viscosity is not completely consistent with either a naive electroviscous model or a simple clustering model. The divergence of the viscosity near the critical point (39 C) is superimposed upon the smooth increase. The magnitude and temperature dependence of the critical divergence are similar to that seen near the critical points of binary liquid mixtures.

  9. Modeling abundance using multinomial N-mixture models

    USGS Publications Warehouse

    Royle, Andy

    2016-01-01

    Multinomial N-mixture models are a generalization of the binomial N-mixture models described in Chapter 6 to allow for more complex and informative sampling protocols beyond simple counts. Many commonly used protocols such as multiple observer sampling, removal sampling, and capture-recapture produce a multivariate count frequency that has a multinomial distribution and for which multinomial N-mixture models can be developed. Such protocols typically result in more precise estimates than binomial mixture models because they provide direct information about parameters of the observation process. We demonstrate the analysis of these models in BUGS using several distinct formulations that afford great flexibility in the types of models that can be developed, and we demonstrate likelihood analysis using the unmarked package. Spatially stratified capture-recapture models are one class of models that fall into the multinomial N-mixture framework, and we discuss analysis of stratified versions of classical models such as model Mb, Mh and other classes of models that are only possible to describe within the multinomial N-mixture framework.

  10. Social network types among older Korean adults: Associations with subjective health.

    PubMed

    Sohn, Sung Yun; Joo, Won-Tak; Kim, Woo Jung; Kim, Se Joo; Youm, Yoosik; Kim, Hyeon Chang; Park, Yeong-Ran; Lee, Eun

    2017-01-01

    With population aging now a global phenomenon, the health of older adults is becoming an increasingly important issue. Because the Korean population is aging at an unprecedented rate, preparing for public health problems associated with old age is particularly salient in this country. As the physical and mental health of older adults is related to their social relationships, investigating the social networks of older adults and their relationship to health status is important for establishing public health policies. The aims of this study were to identify social network types among older adults in South Korea and to examine the relationship of these social network types with self-rated health and depression. Data from the Korean Social Life, Health, and Aging Project were analyzed. Model-based clustering using finite normal mixture modeling was conducted to identify the social network types based on ten criterion variables of social relationships and activities: marital status, number of children, number of close relatives, number of friends, frequency of attendance at religious services, attendance at organized group meetings, in-degree centrality, out-degree centrality, closeness centrality, and betweenness centrality. Multivariate regression analysis was conducted to examine associations between the identified social network types and self-rated health and depression. The model-based clustering analysis revealed that social networks clustered into five types: diverse, family, congregant, congregant-restricted, and restricted. Diverse or family social network types were significantly associated with more favorable subjective mental health, whereas the restricted network type was significantly associated with poorer ratings of mental and physical health. In addition, our analysis identified unique social network types related to religious activities. In summary, we developed a comprehensive social network typology for older Korean adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
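
    The model-based clustering step can be sketched with a finite normal mixture whose number of components is selected by BIC, as below. The data are a synthetic stand-in for the ten standardised criterion variables, and the diagonal covariance structure is an assumption; the study's analysis may have used different covariance models.

```python
# Sketch of model-based clustering with a finite normal mixture, choosing the
# number of network types by BIC (synthetic stand-in for the ten criterion variables).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(9)

# Synthetic data: 3 latent profiles x 180 respondents, 10 standardised variables.
centres = rng.normal(0, 1.5, size=(3, 10))
X = np.vstack([rng.normal(c, 1.0, size=(180, 10)) for c in centres])

bics = {}
for k in range(1, 8):
    gm = GaussianMixture(n_components=k, covariance_type="diag",
                         n_init=5, random_state=0).fit(X)
    bics[k] = gm.bic(X)

best_k = min(bics, key=bics.get)
print("BIC by number of clusters:", {k: round(v) for k, v in bics.items()})
print("selected number of network types:", best_k)

# Posterior cluster membership for each respondent under the selected model.
best = GaussianMixture(n_components=best_k, covariance_type="diag",
                       n_init=5, random_state=0).fit(X)
print("cluster sizes:", np.bincount(best.predict(X)))
```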

  11. Meta-analysis of Diagnostic Accuracy and ROC Curves with Covariate Adjusted Semiparametric Mixtures.

    PubMed

    Doebler, Philipp; Holling, Heinz

    2015-12-01

    Many screening tests dichotomize a measurement to classify subjects. Typically a cut-off value is chosen in a way that allows identification of an acceptable number of cases relative to a reference procedure, but does not produce too many false positives at the same time. Thus for the same sample many pairs of sensitivities and false positive rates result as the cut-off is varied. The curve of these points is called the receiver operating characteristic (ROC) curve. One goal of diagnostic meta-analysis is to integrate ROC curves and arrive at a summary ROC (SROC) curve. Holling, Böhning, and Böhning (Psychometrika 77:106-126, 2012a) demonstrated that finite semiparametric mixtures can describe the heterogeneity in a sample of Lehmann ROC curves well; this approach leads to clusters of SROC curves of a particular shape. We extend this work with the help of a flexible family of transformations for proportions. A collection of SROC curves is constructed that approximately contains the Lehmann family but in addition allows the modeling of shapes beyond the Lehmann ROC curves. We introduce two rationales for determining the shape from the data. Using the fact that each curve corresponds to a natural univariate measure of diagnostic accuracy, we show how covariate adjusted mixtures lead to a meta-regression on SROC curves. Three worked examples illustrate the method.

  12. An outbreak of Salmonella Typhimurium phage type 42 associated with the consumption of raw flour.

    PubMed

    McCallum, Lisa; Paine, Shevaun; Sexton, Kerry; Dufour, Muriel; Dyet, Kristin; Wilson, Maurice; Campbell, Donald; Bandaranayake, Don; Hope, Virginia

    2013-02-01

    A cluster of salmonellosis cases caused by Salmonella Typhimurium phage type 42 (STM42) emerged in New Zealand in October 2008. STM42 isolates from a wheat-based poultry feed raw material (broll; i.e., product containing wheat flour and particles of grain) had been identified in the 2 months prior to this cluster. Initial investigations indicated that eating uncooked baking mixture was associated with illness. A case-control study was conducted to test the hypothesis that there was an association between STM42 cases and consumption of raw flour or other baking ingredients. Salmonella isolates from human and non-human sources were compared using pulsed-field gel electrophoresis (PFGE) and multiple-locus variable number tandem repeat analysis (MLVA). Environmental investigations included testing flour and other baking ingredients from case homes, unopened bags of flour purchased from retail stores, and inspection of an implicated flour mill. A case-control study of 39 cases and 66 controls found cases had 4.5 times the odds of consuming uncooked baking mixture as controls (95% confidence interval [CI] 1.6-12.5, p-value 0.001). Examination of individual baking ingredients found that, after adjusting for eggs, flour had an odds ratio (OR) of 5.7 (95% CI 1.1-29.1, p-value 0.035). After adjusting for flour, eggs had an OR of 0.8 (95% CI 0.2-3.4, p-value 0.762). PFGE patterns were identical for all STM42 isolates tested; however, MLVA distinguished isolates that were epidemiologically linked to the cluster. STM42 was recovered from flour taken from four cases' homes, two unopened packs purchased from retail stores and packs from three batches of retrieved (recalled) product. This outbreak was associated with the consumption of uncooked baking mixture containing flour contaminated with STM42. The implicated flour mill initiated a voluntary withdrawal from sale of all batches of flour thought to be contaminated. Media releases informed the public about implicated flour brands and the risks of consuming uncooked baking mixture.

  13. Underdetermined blind separation of three-way fluorescence spectra of PAHs in water.

    PubMed

    Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing

    2018-06-15

    In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatters corresponding to local energy maxima in the time-frequency domain, and the spectra of the object components are recovered by the pseudo-inverse technique. As an example, this method blindly extracts three and four pure-component spectra from two mixture samples, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to monitoring PAHs in water by the three-way fluorescence spectroscopy technique. Copyright © 2018 Elsevier B.V. All rights reserved.
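
    A toy numpy sketch of the recovery step only, assuming the mixing matrix has already been estimated (here it is simply taken as known); estimating the mixing matrix itself by sparse component analysis and fuzzy clustering, as described above, is not shown, and exact recovery in the underdetermined case additionally relies on sparsity of the sources.

      # Toy sketch of the recovery step only: given mixture spectra X = A @ S and an
      # estimate of the mixing matrix A (assumed known here), candidate source spectra
      # are obtained with the Moore-Penrose pseudo-inverse. In the underdetermined
      # case this is the minimum-norm solution; faithful recovery additionally relies
      # on sparsity assumptions, which are not modeled in this toy example.
      import numpy as np

      rng = np.random.default_rng(0)
      n_sources, n_mixtures, n_wavelengths = 3, 2, 200   # underdetermined: 2 mixtures, 3 sources
      S = np.abs(rng.normal(size=(n_sources, n_wavelengths)))   # stand-in "pure" spectra
      A = np.abs(rng.normal(size=(n_mixtures, n_sources)))      # mixing matrix
      X = A @ S                                                  # observed mixture spectra

      S_hat = np.linalg.pinv(A) @ X        # pseudo-inverse recovery
      # Cosine similarity between recovered and reference spectra (cf. >0.80 in the abstract)
      sim = [np.dot(S[i], S_hat[i]) / (np.linalg.norm(S[i]) * np.linalg.norm(S_hat[i]))
             for i in range(n_sources)]
      print(np.round(sim, 3))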

  14. The Kirkwood-Buff theory of solutions and the local composition of liquid mixtures.

    PubMed

    Shulgin, Ivan L; Ruckenstein, Eli

    2006-06-29

    The present paper is devoted to the local composition of liquid mixtures calculated in the framework of the Kirkwood-Buff theory of solutions. A new method is suggested to calculate the excess (or deficit) number of various molecules around a selected (central) molecule in binary and multicomponent liquid mixtures in terms of measurable macroscopic thermodynamic quantities, such as the derivatives of the chemical potentials with respect to concentrations, the isothermal compressibility, and the partial molar volumes. This method accounts for an inaccessible volume due to the presence of a central molecule and is applied to binary and ternary mixtures. For the ideal binary mixture it is shown that because of the difference in the volumes of the pure components there is an excess (or deficit) number of different molecules around a central molecule. The excess (or deficit) becomes zero when the components of the ideal binary mixture have the same volume. The new method is also applied to methanol + water and 2-propanol + water mixtures. In the case of the 2-propanol + water mixture, the new method, in contrast to the other ones, indicates that clusters dominated by 2-propanol disappear at high alcohol mole fractions, in agreement with experimental observations. Finally, it is shown that the application of the new procedure to the ternary mixture water/protein/cosolvent at infinite dilution of the protein led to almost the same results as the methods involving a reference state.
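
    For reference, the conventional Kirkwood-Buff quantities behind such calculations can be written as follows; this is the textbook form only, not the inaccessible-volume-corrected expressions introduced in the paper.

      % Standard Kirkwood-Buff quantities (not the paper's corrected expressions):
      % the KB integral G_{ij} over the pair correlation function g_{ij}(r), and the
      % conventional excess (or deficit) number of j molecules around a central i
      % molecule, with c_j the bulk number density of species j.
      \[
        G_{ij} \;=\; 4\pi \int_{0}^{\infty} \bigl[g_{ij}(r) - 1\bigr]\, r^{2}\, dr,
        \qquad
        \Delta N_{ij} \;=\; c_{j}\, G_{ij}.
      \]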

  15. Concentration addition and independent action model: Which is better in predicting the toxicity for metal mixtures on zebrafish larvae.

    PubMed

    Gao, Yongfei; Feng, Jianfeng; Kang, Lili; Xu, Xin; Zhu, Lin

    2018-01-01

    The joint toxicity of chemical mixtures has emerged as a popular topic, particularly regarding the additive and potentially synergistic actions of environmental mixtures. We investigated the 24-h toxicity of Cu-Zn, Cu-Cd, and Cu-Pb binary mixtures and the 96-h toxicity of Cd-Pb binary mixtures on the survival of zebrafish larvae. Joint toxicity was predicted and compared using the concentration addition (CA) and independent action (IA) models, which make different assumptions about the mode of toxic action in toxicodynamic processes, through single and binary metal mixture tests. Results showed that the CA and IA models presented varying predictive abilities for different metal combinations. For the Cu-Cd and Cd-Pb mixtures, the CA model simulated the observed survival rates better than the IA model. By contrast, the IA model simulated the observed survival rates better than the CA model for the Cu-Zn and Cu-Pb mixtures. These findings revealed that the toxic action mode may depend on the combinations and concentrations of the tested metal mixtures. Statistical analysis of the antagonistic or synergistic interactions indicated that synergistic interactions were observed for the Cu-Cd and Cu-Pb mixtures, no interactions were observed for the Cd-Pb mixtures, and slight antagonistic interactions were observed for the Cu-Zn mixtures. These results illustrated that the CA and IA models are consistent in specifying the interaction patterns of binary metal mixtures. Copyright © 2017 Elsevier B.V. All rights reserved.
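
    For orientation, the two reference models are conventionally written as below (standard textbook forms; the toxicodynamic implementation used in the study may differ).

      % Conventional forms of the two reference models for a mixture of n components.
      % Concentration addition (CA): the mixture producing effect level x satisfies a
      % toxic-unit sum of one, with c_i the concentration of component i in the mixture
      % and ECx_i the concentration of component i producing the same effect x alone.
      \[
        \sum_{i=1}^{n} \frac{c_i}{\mathrm{EC}x_i} \;=\; 1 .
      \]
      % Independent action (IA): fractional effects combine as independent probabilities,
      % with E(c_i) the fractional effect of component i applied singly.
      \[
        E(c_{\mathrm{mix}}) \;=\; 1 - \prod_{i=1}^{n} \bigl[\,1 - E(c_i)\,\bigr].
      \]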

  16. Concentration Addition, Independent Action and Generalized Concentration Addition Models for Mixture Effect Prediction of Sex Hormone Synthesis In Vitro

    PubMed Central

    Hadrup, Niels; Taxvig, Camilla; Pedersen, Mikael; Nellemann, Christine; Hass, Ulla; Vinggaard, Anne Marie

    2013-01-01

    Humans are concomitantly exposed to numerous chemicals. An infinite number of combinations and doses thereof can be imagined. For toxicological risk assessment, the mathematical prediction of mixture effects, using knowledge of single chemicals, is therefore desirable. We investigated pros and cons of the concentration addition (CA), independent action (IA) and generalized concentration addition (GCA) models. First we measured effects of single chemicals and mixtures thereof on steroid synthesis in H295R cells. Then single chemical data were applied to the models; predictions of mixture effects were calculated and compared to the experimental mixture data. Mixture 1 contained environmental chemicals adjusted in ratio according to human exposure levels. Mixture 2 was a potency adjusted mixture containing five pesticides. Prediction of testosterone effects coincided with the experimental Mixture 1 data. In contrast, antagonism was observed for effects of Mixture 2 on this hormone. The mixtures contained chemicals exerting only limited maximal effects. This hampered prediction by the CA and IA models, whereas the GCA model could be used to predict a full dose response curve. Regarding effects on progesterone and estradiol, some chemicals had stimulatory effects whereas others had inhibitory effects. The three models were not applicable in this situation and no predictions could be performed. Finally, the expected contributions of single chemicals to the mixture effects were calculated. Prochloraz was the predominant but not sole driver of the mixtures, suggesting that one chemical alone was not responsible for the mixture effects. In conclusion, the GCA model seemed to be superior to the CA and IA models for the prediction of testosterone effects. A situation with chemicals exerting opposing effects, for which the models could not be applied, was identified. In addition, the data indicate that in non-potency adjusted mixtures the effects cannot always be accounted for by single chemicals. PMID:23990906

  17. Analysing the health effects of simultaneous exposure to physical and chemical properties of airborne particles

    PubMed Central

    Pirani, Monica; Best, Nicky; Blangiardo, Marta; Liverani, Silvia; Atkinson, Richard W.; Fuller, Gary W.

    2015-01-01

    Background: Airborne particles are a complex mix of organic and inorganic compounds, with a range of physical and chemical properties. Estimation of how simultaneous exposure to air particles affects the risk of adverse health response represents a challenge for scientific research and air quality management. In this paper, we present a Bayesian approach that can tackle this problem within the framework of time series analysis. Methods: We used Dirichlet process mixture models to cluster time points with similar multipollutant and response profiles, while adjusting for seasonal cycles, trends and temporal components. Inference was carried out via Markov Chain Monte Carlo methods. We illustrated our approach using daily data of a range of particle metrics and respiratory mortality for London (UK) 2002–2005. To better quantify the average health impact of these particles, we measured the same set of metrics in 2012, and we computed and compared the posterior predictive distributions of mortality under the exposure scenario in 2012 vs 2005. Results: The model resulted in a partition of the days into three clusters. We found a relative risk of 1.02 (95% credible intervals (CI): 1.00, 1.04) for respiratory mortality associated with days characterised by high posterior estimates of non-primary particles, especially nitrate and sulphate. We found a consistent reduction in the airborne particles in 2012 vs 2005 and the analysis of the posterior predictive distributions of respiratory mortality suggested an average annual decrease of −3.5% (95% CI: −0.12%, −5.74%). Conclusions: We proposed an effective approach that enabled the better understanding of hidden structures in multipollutant health effects within time series analysis. It allowed the identification of exposure metrics associated with respiratory mortality and provided a tool to assess the changes in health effects from various policies to control the ambient particle matter mixtures. PMID:25795926

  18. Analysing the health effects of simultaneous exposure to physical and chemical properties of airborne particles.

    PubMed

    Pirani, Monica; Best, Nicky; Blangiardo, Marta; Liverani, Silvia; Atkinson, Richard W; Fuller, Gary W

    2015-06-01

    Airborne particles are a complex mix of organic and inorganic compounds, with a range of physical and chemical properties. Estimation of how simultaneous exposure to air particles affects the risk of adverse health response represents a challenge for scientific research and air quality management. In this paper, we present a Bayesian approach that can tackle this problem within the framework of time series analysis. We used Dirichlet process mixture models to cluster time points with similar multipollutant and response profiles, while adjusting for seasonal cycles, trends and temporal components. Inference was carried out via Markov Chain Monte Carlo methods. We illustrated our approach using daily data of a range of particle metrics and respiratory mortality for London (UK) 2002-2005. To better quantify the average health impact of these particles, we measured the same set of metrics in 2012, and we computed and compared the posterior predictive distributions of mortality under the exposure scenario in 2012 vs 2005. The model resulted in a partition of the days into three clusters. We found a relative risk of 1.02 (95% credible intervals (CI): 1.00, 1.04) for respiratory mortality associated with days characterised by high posterior estimates of non-primary particles, especially nitrate and sulphate. We found a consistent reduction in the airborne particles in 2012 vs 2005 and the analysis of the posterior predictive distributions of respiratory mortality suggested an average annual decrease of -3.5% (95% CI: -0.12%, -5.74%). We proposed an effective approach that enabled the better understanding of hidden structures in multipollutant health effects within time series analysis. It allowed the identification of exposure metrics associated with respiratory mortality and provided a tool to assess the changes in health effects from various policies to control the ambient particle matter mixtures. Copyright © 2015. Published by Elsevier Ltd.
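
    As a rough, non-equivalent illustration of the clustering idea described above, a truncated variational Dirichlet-process Gaussian mixture (scikit-learn) can partition synthetic day-level multipollutant profiles; this is not the authors' MCMC-fitted model and omits the seasonal adjustment and health-response components. All data and variable names below are invented.

      # Rough illustration only: a truncated variational Dirichlet-process Gaussian
      # mixture (scikit-learn) applied to synthetic day-level multipollutant profiles.
      # This is not the authors' MCMC-based model, and it ignores the seasonal/trend
      # adjustment and the mortality component described in the abstract.
      import numpy as np
      from sklearn.mixture import BayesianGaussianMixture

      rng = np.random.default_rng(1)
      days = np.vstack([rng.normal(loc, 0.5, size=(300, 4))          # 3 synthetic "regimes"
                        for loc in ([0, 0, 0, 0], [2, 1, 0, 2], [4, 3, 1, 1])])

      dpgmm = BayesianGaussianMixture(
          n_components=10,                                  # truncation level
          weight_concentration_prior_type="dirichlet_process",
          covariance_type="full",
          random_state=0,
      ).fit(days)

      labels = dpgmm.predict(days)
      print("effective clusters:", np.unique(labels).size)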

  19. Detecting Mixtures from Structural Model Differences Using Latent Variable Mixture Modeling: A Comparison of Relative Model Fit Statistics

    ERIC Educational Resources Information Center

    Henson, James M.; Reise, Steven P.; Kim, Kevin H.

    2007-01-01

    The accuracy of structural model parameter estimates in latent variable mixture modeling was explored with a 3 (sample size) [times] 3 (exogenous latent mean difference) [times] 3 (endogenous latent mean difference) [times] 3 (correlation between factors) [times] 3 (mixture proportions) factorial design. In addition, the efficacy of several…

  20. THE RED SEQUENCE AT BIRTH IN THE GALAXY CLUSTER Cl J1449+0856 AT z = 2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strazzullo, V.; Pannella, M.; Daddi, E.

    We use Hubble Space Telescope/WFC3 imaging to study the red population in the IR-selected, X-ray detected, low-mass cluster Cl J1449+0856 at z = 2, one of the few bona fide established clusters discovered at this redshift, and likely a typical progenitor of an average massive cluster today. This study explores the presence and significance of an early red sequence in the core of this structure, investigating the nature of red-sequence galaxies, highlighting environmental effects on cluster galaxy populations at high redshift, and at the same time underlining similarities and differences with other distant dense environments. Our results suggest that the red population in the core of Cl J1449+0856 is made of a mixture of quiescent and dusty star-forming galaxies, with a seedling of the future red sequence already growing in the very central cluster region, and already characterizing the inner cluster core with respect to lower-density environments. On the other hand, the color–magnitude diagram of this cluster is definitely different from that of lower-redshift z ≲ 1 clusters, as well as of some rare particularly evolved massive clusters at similar redshift, and it is suggestive of a transition phase between active star formation and passive evolution occurring in the protocluster and established lower-redshift cluster regimes.

  1. Ice/water slurry blocking phenomenon at a tube orifice.

    PubMed

    Hirochi, Takero; Yamada, Shuichi; Shintate, Tuyoshi; Shirakashi, Masataka

    2002-10-01

    The phenomenon of ice-particle/water mixture blocking flow through a pipeline is a problem that needs to be solved before mixture flow can be applied for practical use in cold energy transportation in a district cooling system. In this work, the blocking mechanism of ice-particle slurry at a tube orifice is investigated and a criterion for blocking is presented. The cohesive nature of ice particles is shown to cause compressed plug type blocking and the compressive yield stress of a particle cluster is presented as a measure for the cohesion strength of ice particles.

  2. Highly efficient classification and identification of human pathogenic bacteria by MALDI-TOF MS.

    PubMed

    Hsieh, Sen-Yung; Tseng, Chiao-Li; Lee, Yun-Shien; Kuo, An-Jing; Sun, Chien-Feng; Lin, Yen-Hsiu; Chen, Jen-Kun

    2008-02-01

    Accurate and rapid identification of pathogenic microorganisms is of critical importance in disease treatment and public health. Conventional work flows are time-consuming, and procedures are multifaceted. MS can be an alternative but is limited by low efficiency for amino acid sequencing as well as low reproducibility for spectrum fingerprinting. We systematically analyzed the feasibility of applying MS for rapid and accurate bacterial identification. Directly applying bacterial colonies without further protein extraction to MALDI-TOF MS analysis revealed rich peak contents and high reproducibility. The MS spectra derived from 57 isolates comprising six human pathogenic bacterial species were analyzed using both unsupervised hierarchical clustering and supervised model construction via the Genetic Algorithm. Hierarchical clustering analysis categorized the spectra into six groups precisely corresponding to the six bacterial species. Precise classification was also maintained in an independently prepared set of bacteria even when the numbers of m/z values were reduced to six. In parallel, classification models were constructed via Genetic Algorithm analysis. A model containing 18 m/z values accurately classified independently prepared bacteria and identified those species originally not used for model construction. Moreover, bacteria at fewer than 10^4 cells and different species in bacterial mixtures were identified using the classification model approach. In conclusion, the application of MALDI-TOF MS in combination with a suitable model construction provides a highly accurate method for bacterial classification and identification. The approach can identify bacteria with low abundance even in mixed flora, suggesting that rapid and accurate bacterial identification using MS techniques, even before culture, can be attained in the near future.

  3. Maximum likelihood estimation of finite mixture model for economic data

    NASA Astrophysics Data System (ADS)

    Phoong, Seuk-Yen; Ismail, Mohd Tahir

    2014-06-01

    A finite mixture model is a mixture model with a finite number of components. These models provide a natural representation of heterogeneity across a finite number of latent classes. Finite mixture models are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that provides consistent estimates as the sample size increases to infinity. In the present paper, maximum likelihood estimation is therefore used to fit a finite mixture model in order to explore the relationship between nonlinear economic data. Specifically, a two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship between stock market prices and rubber prices for the sampled countries. The results show that there is a negative relationship between the rubber price and the stock market price for Malaysia, Thailand, the Philippines and Indonesia.
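
    A minimal sketch of the core step, fitting a two-component normal mixture by maximum likelihood (EM) with scikit-learn; synthetic data stand in for the rubber-price and stock-price series used in the paper.

      # Minimal sketch: maximum likelihood fit of a two-component normal mixture via
      # EM (scikit-learn's GaussianMixture), on synthetic data standing in for the
      # paper's rubber-price / stock-price series.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(2)
      x = np.concatenate([rng.normal(-1.0, 0.5, 400),
                          rng.normal( 1.5, 1.0, 600)]).reshape(-1, 1)

      gm = GaussianMixture(n_components=2, random_state=0).fit(x)
      print("weights:", np.round(gm.weights_, 3))
      print("means:  ", np.round(gm.means_.ravel(), 3))
      print("stddev: ", np.round(np.sqrt(gm.covariances_).ravel(), 3))
      print("total log-likelihood:", round(gm.score(x) * len(x), 1))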

  4. A Large C+N+O Abundance Spread in Giant Stars of the Globular Cluster NGC 1851

    NASA Astrophysics Data System (ADS)

    Yong, David; Grundahl, Frank; D'Antona, Francesca; Karakas, Amanda I.; Lattanzio, John C.; Norris, John E.

    2009-04-01

    Abundances of C, N, and O are determined in four bright red giants that span the known abundance range for light (Na and Al) and s-process (Zr and La) elements in the globular cluster NGC 1851. The abundance sum C+N+O exhibits a range of 0.6 dex, a factor of 4, in contrast to other clusters in which no significant C+N+O spread is found. Such an abundance range offers support for the Cassisi et al. scenario in which the double subgiant branch populations are coeval but with different mixtures of C+N+O abundances. Further, the Na, Al, Zr, and La abundances are correlated with C+N+O, and therefore NGC 1851 is the first cluster to provide strong support for the scenario in which asymptotic giant branch stars are responsible for the globular cluster light element abundance variations. This paper includes data gathered with the 6.5 meter Magellan Telescopes located at Las Campanas Observatory, Chile.

  5. Fault Network Reconstruction using Agglomerative Clustering: Applications to South Californian Seismicity

    NASA Astrophysics Data System (ADS)

    Kamer, Yavor; Ouillon, Guy; Sornette, Didier; Wössner, Jochen

    2014-05-01

    We present applications of a new clustering method for fault network reconstruction based on the spatial distribution of seismicity. Unlike common approaches that start from the simplest large scale and gradually increase the complexity trying to explain the small scales, our method uses a bottom-up approach, by an initial sampling of the small scales and then reducing the complexity. The new approach also exploits the location uncertainty associated with each event in order to obtain a more accurate representation of the spatial probability distribution of the seismicity. For a given dataset, we first construct an agglomerative hierarchical cluster (AHC) tree based on Ward's minimum variance linkage. Such a tree starts out with one cluster and progressively branches out into an increasing number of clusters. To atomize the structure into its constitutive protoclusters, we initialize a Gaussian Mixture Modeling (GMM) at a given level of the hierarchical clustering tree. We then let the GMM converge using an Expectation Maximization (EM) algorithm. The kernels that become ill defined (less than 4 points) at the end of the EM are discarded. By incrementing the number of initialization clusters (by atomizing at increasingly populated levels of the AHC tree) and repeating the procedure above, we are able to determine the maximum number of Gaussian kernels the structure can hold. The kernels in this configuration constitute our protoclusters. In this setting, merging of any pair will lessen the likelihood (calculated over the pdf of the kernels) but in turn will reduce the model's complexity. The information loss/gain of any possible merging can thus be quantified based on the Minimum Description Length (MDL) principle. Similar to an inter-distance matrix, where the matrix element di,j gives the distance between points i and j, we can construct a MDL gain/loss matrix where mi,j gives the information gain/loss resulting from the merging of kernels i and j. Based on this matrix, merging events resulting in MDL gain are performed in descending order until no gainful merging is possible anymore. We envision that the results of this study could lead to a better understanding of the complex interactions within the Californian fault system and hopefully use the acquired insights for earthquake forecasting.
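
    A compact sketch of the generic two-stage idea, cutting a Ward agglomerative tree and using the resulting groups to initialize a Gaussian mixture refined by EM; the location-uncertainty kernels, ill-defined-kernel pruning and MDL-based merging described above are not reproduced, and the point cloud is synthetic.

      # Compact sketch of the generic two-stage idea: cut a Ward agglomerative tree at
      # a chosen level and use the resulting groups to initialize a Gaussian mixture
      # refined by EM. The location-uncertainty weighting, kernel pruning, and
      # MDL-based merging described in the abstract are not reproduced here.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(3)
      # Synthetic "epicentres": three elongated clouds standing in for fault segments
      pts = np.vstack([rng.normal([0, 0], [1.5, 0.2], (200, 2)),
                       rng.normal([4, 2], [0.2, 1.5], (200, 2)),
                       rng.normal([8, 0], [1.0, 0.3], (200, 2))])

      Z = linkage(pts, method="ward")                         # agglomerative hierarchical tree
      init_labels = fcluster(Z, t=6, criterion="maxclust")    # atomize at 6 clusters
      means0 = np.array([pts[init_labels == k].mean(axis=0)
                         for k in np.unique(init_labels)])

      gmm = GaussianMixture(n_components=len(means0), means_init=means0,
                            covariance_type="full", random_state=0).fit(pts)
      print("mixture weights:", np.round(gmm.weights_, 3))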

  6. Optimization of self-interstitial clusters in 3C-SiC with genetic algorithm

    NASA Astrophysics Data System (ADS)

    Ko, Hyunseok; Kaczmarowski, Amy; Szlufarska, Izabela; Morgan, Dane

    2017-08-01

    Under irradiation, SiC develops damage commonly referred to as black spot defects, which are speculated to be self-interstitial atom clusters. To understand the evolution of these defect clusters and their impacts (e.g., through radiation induced swelling) on the performance of SiC in nuclear applications, it is important to identify the cluster composition, structure, and shape. In this work the genetic algorithm code StructOpt was utilized to identify groundstate cluster structures in 3C-SiC. The genetic algorithm was used to explore clusters of up to ∼30 interstitials of C-only, Si-only, and Si-C mixtures embedded in the SiC lattice. We performed the structure search using Hamiltonians from both density functional theory and empirical potentials. The thermodynamic stability of clusters was investigated in terms of their composition (with a focus on Si-only, C-only, and stoichiometric) and shape (spherical vs. planar), as a function of the cluster size (n). Our results suggest that large Si-only clusters are likely unstable, and clusters are predominantly C-only for n ≤ 10 and stoichiometric for n > 10. The results imply that there is an evolution of the shape of the most stable clusters, where small clusters are stable in more spherical geometries while larger clusters are stable in more planar configurations. We also provide an estimated energy vs. size relationship, E(n), for use in future analysis.

  7. The Next Generation Fornax Survey (NGFS). IV. Mass and Age Bimodality of Nuclear Clusters in the Fornax Core Region

    NASA Astrophysics Data System (ADS)

    Ordenes-Briceño, Yasna; Puzia, Thomas H.; Eigenthaler, Paul; Taylor, Matthew A.; Muñoz, Roberto P.; Zhang, Hongxin; Alamo-Martínez, Karla; Ribbeck, Karen X.; Grebel, Eva K.; Ángel, Simón; Côté, Patrick; Ferrarese, Laura; Hilker, Michael; Lançon, Ariane; Mieske, Steffen; Miller, Bryan W.; Rong, Yu; Sánchez-Janssen, Ruben

    2018-06-01

    We present the analysis of 61 nucleated dwarf galaxies in the central regions (≲ R_vir/4) of the Fornax galaxy cluster. The galaxies and their nuclei are studied as part of the Next Generation Fornax Survey using optical imaging obtained with the Dark Energy Camera mounted at Blanco/Cerro Tololo Inter-American Observatory and near-infrared data obtained with VIRCam at VISTA/ESO. We decompose the nucleated dwarfs into nucleus and spheroid, subtracting the surface brightness profile of the spheroid component and studying the nucleus using point source photometry. In general, nuclei are consistent with colors of confirmed metal-poor globular clusters, but with significantly smaller dispersion than other confirmed compact stellar systems in Fornax. We find a bimodal nucleus mass distribution with peaks located at log(M*/M_⊙) ≃ 5.4 and ∼6.3. These two nucleus subpopulations have different stellar population properties: the more massive nuclei are older than ∼2 Gyr and have metal-poor stellar populations (Z ≤ 0.02 Z_⊙), while the less massive nuclei are younger than ∼2 Gyr with metallicities in the range 0.02 < Z/Z_⊙ ≤ 1. We find that the nucleus mass (M_nuc) versus galaxy mass (M_gal) relation becomes shallower for less massive galaxies starting around 10^8 M_⊙, and the mass ratio η_n = M_nuc/M_gal shows a clear anticorrelation with M_gal for the lowest masses, reaching 10%. We test current theoretical models of nuclear cluster formation and find that they cannot fully reproduce the observed trends. A likely mixture of in situ star formation and star cluster mergers seems to be acting during nucleus growth over cosmic time.

  8. Cancer Tissue Engineering: A Novel 3D Polystyrene Scaffold for In Vitro Isolation and Amplification of Lymphoma Cancer Cells from Heterogeneous Cell Mixtures

    PubMed Central

    Caicedo-Carvajal, Carlos E.; Liu, Qing; Remache, Yvonne; Goy, Andre; Suh, K. Stephen

    2011-01-01

    Isolation and amplification of primary lymphoma cells in an in vitro setting is a technically and biologically challenging task. To optimize the culture environment and mimic in vivo conditions, lymphoma cell lines were used as a test case and were grown in three dimensions (3D) using a novel 3D tissue culture polystyrene scaffold with neonatal stromal cells to represent a lymphoma microenvironment. In this model, cell proliferation was enhanced more than 200-fold (a 20,000% neoplastic surplus) in 7 days when less than 1% lymphoma cells were cocultured with a 100-fold excess of neonatal stromal cells, representing a 3.2-fold higher proliferative rate than the 2D coculture model. The lymphoma cells grew and aggregated to form clusters during 3D coculture and did not maintain the parental phenotype of growing in single-cell suspension. The cluster size was over 5-fold bigger in the 3D coculture by day 4 than in the 2D coculture system and contained less than 0.00001% neonatal fibroblast trace. These preliminary data indicate that the novel 3D scaffold geometry and coculturing environment can be customized to amplify primary cancer cells from blood or tissues related to hematological cancer and subsequently used for personalized drug screening procedures. PMID:22073378

  9. Multi-target detection and positioning in crowds using multiple camera surveillance

    NASA Astrophysics Data System (ADS)

    Huang, Jiahu; Zhu, Qiuyu; Xing, Yufeng

    2018-04-01

    In this study, we propose a pixel correspondence algorithm for positioning in crowds based on constraints on the distance between lines of sight, grayscale differences, and height in a world coordinates system. First, a Gaussian mixture model is used to obtain the background and foreground from multi-camera videos. Second, the hair and skin regions are extracted as regions of interest. Finally, the correspondences between each pixel in the region of interest are found under multiple constraints and the targets are positioned by pixel clustering. The algorithm can provide appropriate redundancy information for each target, which decreases the risk of losing targets due to a large viewing angle and wide baseline. To address the correspondence problem for multiple pixels, we construct a pixel-based correspondence model based on a similar permutation matrix, which converts the correspondence problem into a linear programming problem where a similar permutation matrix is found by minimizing an objective function. The correct pixel correspondences can be obtained by determining the optimal solution of this linear programming problem and the three-dimensional position of the targets can also be obtained by pixel clustering. Finally, we verified the algorithm with multiple cameras in experiments, which showed that the algorithm has high accuracy and robustness.
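
    A minimal sketch of the first step only, Gaussian-mixture background subtraction per camera view with OpenCV's MOG2, run here on synthetic frames; the multi-view pixel correspondence and 3D positioning stages described above are not shown.

      # Minimal sketch of the first step only: Gaussian-mixture background subtraction
      # per camera view (OpenCV's MOG2), run here on synthetic frames. The multi-view
      # pixel correspondence and 3D positioning described in the abstract are not shown.
      import numpy as np
      import cv2

      mog2 = cv2.createBackgroundSubtractorMOG2(history=50, varThreshold=16,
                                                detectShadows=False)

      rng = np.random.default_rng(4)
      for t in range(60):
          frame = rng.integers(100, 120, size=(120, 160), dtype=np.uint8)  # noisy static background
          if t >= 30:                                    # a moving bright "target" appears
              x = 10 + 2 * (t - 30)
              frame[40:60, x:x + 20] = 220
          fg_mask = mog2.apply(frame)                    # foreground mask for this frame

      print("foreground pixels in last frame:", int((fg_mask > 0).sum()))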

  10. Structure and component dynamics in binary mixtures of poly(2-(dimethylamino)ethyl methacrylate) with water and tetrahydrofuran: A diffraction, calorimetric, and dielectric spectroscopy study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goracci, G., E-mail: sckgorag@ehu.es; Arbe, A.; Alegría, A.

    2016-04-21

    We have combined X-ray diffraction, neutron diffraction with polarization analysis, small angle neutron scattering, differential scanning calorimetry, and broad band dielectric spectroscopy to investigate the structure and dynamics of binary mixtures of poly(2-(dimethylamino)ethyl methacrylate) with either water or tetrahydrofuran (THF) at different concentrations. Aqueous mixtures are characterized by a highly heterogeneous structure where water clusters coexist with an underlying nano-segregation of main chains and side groups of the polymeric matrix. THF molecules are homogeneously distributed among the polymeric nano-domains for concentrations of one THF molecule/monomer or lower. A more heterogeneous situation is found for higher THF amounts, but without evidence of solvent clusters. In THF mixtures, we observe a remarkable reduction of the glass-transition temperature, which is enhanced with increasing amount of solvent but seems to reach saturation at high THF concentrations. Adding THF markedly reduces the activation energy of the polymer β-relaxation. The presence of THF molecules seemingly hinders a slow component of this process which is active in the dry state. The aqueous mixtures present a strikingly broad glass-transition feature, revealing a highly heterogeneous behavior in agreement with the structural study. Regarding the solvent dynamics, deep in the glassy state all data can be described by an Arrhenius temperature dependence with a rather similar activation energy. However, the values of the characteristic times are about three orders of magnitude smaller for THF than for water. Water dynamics display a crossover toward increasingly higher apparent activation energies in the region of the onset of the glass transition, supporting its interpretation as a consequence of the freezing of the structural relaxation of the surrounding matrix. The absence of such a crossover (at least in the wide dynamic window here accessed) in THF is attributed to the lack of cooperativity effects in the relaxation of these molecules within the polymeric matrix.

  11. iHelp: an intelligent online helpdesk system.

    PubMed

    Wang, Dingding; Li, Tao; Zhu, Shenghuo; Gong, Yihong

    2011-02-01

    Due to the importance of high-quality customer service, many companies use intelligent helpdesk systems (e.g., case-based systems) to improve customer service quality. However, these systems face two challenges: 1) Case retrieval measures: most case-based systems use traditional keyword-matching-based ranking schemes for case retrieval and have difficulty capturing the semantic meanings of cases; and 2) Result representation: most case-based systems return a list of past cases ranked by their relevance to a new request, and customers have to go through the list and examine the cases one by one to identify their desired cases. To address these challenges, we develop iHelp, an intelligent online helpdesk system, to automatically find problem-solution patterns from past customer-representative interactions. When a new customer request arrives, iHelp searches and ranks the past cases based on their semantic relevance to the request, groups the relevant cases into different clusters using a mixture language model and symmetric matrix factorization, and summarizes each case cluster to generate recommended solutions. Case and user studies have been conducted to show the full functionality and the effectiveness of iHelp.

  12. Steganalysis feature improvement using expectation maximization

    NASA Astrophysics Data System (ADS)

    Rodriguez, Benjamin M.; Peterson, Gilbert L.; Agaian, Sos S.

    2007-04-01

    Images and data files provide an excellent opportunity for concealing illegal or clandestine material. Currently, there are over 250 different tools which embed data into an image without causing noticeable changes to the image. From a forensics perspective, when a system is confiscated or an image of a system is generated, the investigator needs a tool that can scan and accurately identify files suspected of containing malicious information. The identification process is termed the steganalysis problem, which focuses on both blind identification, in which only normal images are available for training, and multi-class identification, in which both the clean and stego images at several embedding rates are available for training. In this paper, a clustering and classification technique (Expectation Maximization with mixture models) is investigated to determine whether a digital image contains hidden information. The steganalysis problem is treated as both anomaly detection and multi-class detection. The various clusters represent clean images and stego images with between 1% and 10% embedding percentage. Based on the results it is concluded that the EM classification technique is highly suitable for both blind detection and the multi-class problem.
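
    A toy sketch of the blind (anomaly-detection) variant, assuming a Gaussian mixture fitted to features of clean images and flagging low-likelihood images as suspect; the features here are synthetic, not actual steganalysis features.

      # Toy sketch of the blind (anomaly) variant: fit a Gaussian mixture to features
      # extracted from clean images only, then flag images whose log-likelihood under
      # that model is unusually low. Features here are synthetic stand-ins, not real
      # steganalysis features.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(5)
      clean = rng.normal(0.0, 1.0, size=(500, 8))          # features of clean images
      stego = rng.normal(0.6, 1.3, size=(50, 8))           # features shifted by embedding

      gm = GaussianMixture(n_components=3, covariance_type="full",
                           random_state=0).fit(clean)

      threshold = np.percentile(gm.score_samples(clean), 1)   # 1st percentile of clean scores
      flagged = gm.score_samples(stego) < threshold
      print("flagged as suspect:", int(flagged.sum()), "of", len(stego))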

  13. 7 CFR 52.1850 - Sizes of raisins with seeds-except layer or cluster.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... MARKETING SERVICE (Standards, Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE REGULATIONS AND STANDARDS UNDER THE AGRICULTURAL MARKETING ACT OF 1946 PROCESSED FRUITS AND VEGETABLES, PROCESSED PRODUCTS... perforations 22/64-inch in diameter. (3) Mixed size raisins means a mixture which does not meet either the...

  14. Optical narrow band frequency analysis of polystyrene bead mixtures

    NASA Astrophysics Data System (ADS)

    Popov, Kaloyan A.; Kurzweg, Timothy P.

    2010-02-01

    Early pre-cancerous conditions in tissue can be studied as a mixture of cancerous and healthy cells. White light spectroscopy is a promising technique for determining the size of scattering elements, which, in cells, are the nuclei. However, in a mixture of different-sized scatterers, such as a mix of healthy and cancerous cells, the white light spectroscopy spatial data are not easily analyzed, making it difficult to determine the individual components that comprise the mixture. We have previously found that by obtaining spatially limited data with an optical filter and converting these data into the Fourier domain, we can determine characteristic signature frequencies for individual scatterers. In this paper, we present an analysis of phantom tissues representing esophagus tissue. We examine phantom tissue representing pre-cancerous conditions, in which some of the cell nuclei increase in size. We also experimentally show a relationship between the particle concentration and the amplitude of the Fourier signature peak. In addition, we discuss the dependency of the frequency peak amplitude on the Tyndall effect, which describes particles aggregating into clusters.

  15. Predicting herbicide mixture effects on multiple algal species using mixture toxicity models.

    PubMed

    Nagai, Takashi

    2017-10-01

    The validity of the application of mixture toxicity models, concentration addition and independent action, to a species sensitivity distribution (SSD) for calculation of a multisubstance potentially affected fraction was examined in laboratory experiments. Toxicity assays of herbicide mixtures using 5 species of periphytic algae were conducted. Two mixture experiments were designed: a mixture of 5 herbicides with similar modes of action and a mixture of 5 herbicides with dissimilar modes of action, corresponding to the assumptions of the concentration addition and independent action models, respectively. Experimentally obtained mixture effects on 5 algal species were converted to the fraction of affected (>50% effect on growth rate) species. The predictive ability of the concentration addition and independent action models with direct application to SSD depended on the mode of action of chemicals. That is, prediction was better for the concentration addition model than the independent action model for the mixture of herbicides with similar modes of action. In contrast, prediction was better for the independent action model than the concentration addition model for the mixture of herbicides with dissimilar modes of action. Thus, the concentration addition and independent action models could be applied to SSD in the same manner as for a single-species effect. The present study to validate the application of the concentration addition and independent action models to SSD supports the usefulness of the multisubstance potentially affected fraction as the index of ecological risk. Environ Toxicol Chem 2017;36:2624-2630. © 2017 SETAC.
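
    For reference, commonly used aggregation formulas for a multisubstance potentially affected fraction (msPAF) from log-normal SSDs are sketched below; these are generic forms, not necessarily the exact expressions used in the study, and the concentration-addition version assumes a common SSD slope.

      % Commonly used msPAF aggregation formulas for log-normal SSDs (a sketch, not
      % necessarily the paper's exact expressions). For chemical i at concentration
      % c_i, with SSD mean \mu_i and slope \sigma_i (log10 scale), the single-chemical
      % potentially affected fraction is
      % PAF_i = \Phi\!\left( (\log_{10} c_i - \mu_i)/\sigma_i \right).
      %
      % Independent action (dissimilar modes of action):
      \[
        \mathrm{msPAF}_{\mathrm{IA}} \;=\; 1 - \prod_{i} \bigl(1 - \mathrm{PAF}_i\bigr).
      \]
      % Concentration addition (similar mode of action), assuming a common SSD slope
      % \sigma and hazard units HU_i = c_i / 10^{\mu_i}:
      \[
        \mathrm{msPAF}_{\mathrm{CA}} \;=\; \Phi\!\left(\frac{\log_{10} \sum_{i} HU_i}{\sigma}\right).
      \]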

  16. The single scattering properties of the aerosol particles as aggregated spheres

    NASA Astrophysics Data System (ADS)

    Wu, Y.; Gu, X.; Cheng, T.; Xie, D.; Yu, T.; Chen, H.; Guo, J.

    2012-08-01

    The light scattering and absorption properties of anthropogenic aerosol particles such as soot aggregates are complex and vary in their temporal and spatial distribution, which introduces uncertainty into radiative forcing estimates for global climate change. To study the single scattering properties of anthropogenic aerosol particles, the structures of these aerosols, such as soot particles and soot-containing mixtures with sulfate or organic matter, are simulated using a parallel diffusion limited aggregation (DLA) algorithm based on transmission electron microscope (TEM) images. Then, the single scattering properties of randomly oriented aerosols, such as the scattering matrix, single scattering albedo (SSA), and asymmetry parameter (AP), are computed using the superposition T-matrix method. Comparisons of the single scattering properties of these specific types of clusters with different morphological and chemical factors, such as fractal parameters, aspect ratio, monomer radius, mixture mode and refractive index, indicate that each of these factors can significantly influence the single scattering properties of these aerosols. The results show that the aspect ratio of the circumscribed shape has a relatively small effect on single scattering properties, as the differences in both SSA and AP are less than 0.1. However, mixture modes of soot clusters with larger sulfate particles have remarkably important effects on the scattering and absorption properties of aggregated spheres, and the SSA of those soot-containing mixtures increases in proportion to the fraction of larger, weakly absorbing attachments. Therefore, these complex aerosols from man-made pollution cannot be neglected in aerosol retrievals. The study of the single scattering properties of these kinds of aggregated spheres is important and helpful for remote sensing observations and atmospheric radiation balance computations.

  17. Recommender system based on scarce information mining.

    PubMed

    Lu, Wei; Chung, Fu-Lai; Lai, Kunfeng; Zhang, Liang

    2017-09-01

    Guessing what a user may like is now a typical interface for video recommendation. Nowadays, highly popular user-generated content sites provide various sources of information, such as tags, for recommendation tasks. Motivated by a real-world online video recommendation problem, this work targets the long-tail phenomenon of user behavior and the sparsity of item features. A personalized compound recommendation framework for online video recommendation, called the Dirichlet mixture probit model for information scarcity (DPIS), is hence proposed. Assuming that each clicking sample is generated from a representation of user preferences, DPIS models the sample-level topic proportions as a multinomial item vector and utilizes topical clustering on the user side for recommendation through a probit classifier. As demonstrated by the real-world application, the proposed DPIS achieves better performance in accuracy, perplexity, and diversity in coverage than traditional methods. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. The ergot alkaloid gene cluster: functional analyses and evolutionary aspects.

    PubMed

    Lorenz, Nicole; Haarmann, Thomas; Pazoutová, Sylvie; Jung, Manfred; Tudzynski, Paul

    2009-01-01

    Ergot alkaloids and their derivatives have traditionally been used as therapeutic agents for migraine and blood pressure regulation and as aids in childbirth and abortion. Their production in submerged culture is a long-established biotechnological process. Ergot alkaloids are produced mainly by members of the genus Claviceps, with Claviceps purpurea being the best-investigated species with respect to the biochemistry of ergot alkaloid synthesis (EAS). Genes encoding enzymes involved in EAS have been shown to be clustered; functional analyses of EAS cluster genes have made it possible to assign specific functions to several gene products. Various Claviceps species differ with respect to their host specificity and their alkaloid content; comparison of the ergot alkaloid clusters in these species (and of clavine alkaloid clusters in other genera) yields interesting insights into the evolution of cluster structure. This review focuses on recently published and as yet unpublished data on the structure and evolution of the EAS gene cluster and on the function and regulation of cluster genes. These analyses also have significant biotechnological implications: the characterization of the non-ribosomal peptide synthetases (NRPS) involved in the synthesis of the peptide moiety of ergopeptines has opened interesting perspectives for the synthesis of ergot alkaloids; on the other hand, defined mutants could be generated that produce interesting intermediates or only single peptide alkaloids (instead of the alkaloid mixtures usually produced by industrial strains).

  19. Measurement and Structural Model Class Separation in Mixture CFA: ML/EM versus MCMC

    ERIC Educational Resources Information Center

    Depaoli, Sarah

    2012-01-01

    Parameter recovery was assessed within mixture confirmatory factor analysis across multiple estimator conditions under different simulated levels of mixture class separation. Mixture class separation was defined in the measurement model (through factor loadings) and the structural model (through factor variances). Maximum likelihood (ML) via the…

  20. ODE constrained mixture modelling: a method for unraveling subpopulation structures and dynamics.

    PubMed

    Hasenauer, Jan; Hasenauer, Christine; Hucho, Tim; Theis, Fabian J

    2014-07-01

    Functional cell-to-cell variability is ubiquitous in multicellular organisms as well as bacterial populations. Even genetically identical cells of the same cell type can respond differently to identical stimuli. Methods have been developed to analyse heterogeneous populations, e.g., mixture models and stochastic population models. The available methods are, however, either incapable of simultaneously analysing different experimental conditions or are computationally demanding and difficult to apply. Furthermore, they do not account for biological information available in the literature. To overcome disadvantages of existing methods, we combine mixture models and ordinary differential equation (ODE) models. The ODE models provide a mechanistic description of the underlying processes while mixture models provide an easy way to capture variability. In a simulation study, we show that the class of ODE constrained mixture models can unravel the subpopulation structure and determine the sources of cell-to-cell variability. In addition, the method provides reliable estimates for kinetic rates and subpopulation characteristics. We use ODE constrained mixture modelling to study NGF-induced Erk1/2 phosphorylation in primary sensory neurones, a process relevant in inflammatory and neuropathic pain. We propose a mechanistic pathway model for this process and reconstructed static and dynamical subpopulation characteristics across experimental conditions. We validate the model predictions experimentally, which verifies the capabilities of ODE constrained mixture models. These results illustrate that ODE constrained mixture models can reveal novel mechanistic insights and possess a high sensitivity.
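
    A toy illustration of the general idea rather than the authors' method: two subpopulations obey the same ODE with different rate constants, snapshot single-cell readouts are simulated with multiplicative noise, and a two-component Gaussian mixture on the log scale recovers the subpopulation fractions while the ODE solution maps component means back to rate constants. All values below are invented.

      # Toy illustration (not the authors' method): two subpopulations follow
      # dx/dt = -k*x with different rate constants k. Snapshot data at time T are
      # noisy single-cell readouts; a two-component Gaussian mixture on the
      # log-measurements recovers the subpopulation fractions, and inverting the ODE
      # solution x(T) = x0*exp(-k*T) maps the component means back to rate constants.
      import numpy as np
      from scipy.integrate import solve_ivp
      from sklearn.mixture import GaussianMixture

      k_true, frac_true, x0, T = (0.2, 1.0), (0.3, 0.7), 10.0, 5.0

      def decay(t, x, k):                       # dx/dt = -k*x
          return -k * x

      rng = np.random.default_rng(6)
      cells = []
      for k, frac in zip(k_true, frac_true):
          xT = solve_ivp(decay, (0.0, T), [x0], args=(k,)).y[0, -1]
          cells.append(xT * rng.lognormal(0.0, 0.15, int(1000 * frac)))  # measurement noise
      data = np.log(np.concatenate(cells)).reshape(-1, 1)

      gm = GaussianMixture(n_components=2, random_state=0).fit(data)
      order = np.argsort(-gm.means_.ravel())              # larger mean = slower decay
      k_est = (np.log(x0) - gm.means_.ravel()[order]) / T
      print("estimated fractions:     ", np.round(gm.weights_[order], 2))
      print("estimated rate constants:", np.round(k_est, 3))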

  1. Confocal Imaging of Confined Quiescent and Flowing Colloid-polymer Mixtures

    PubMed Central

    Conrad, Jacinta C.

    2014-01-01

    The behavior of confined colloidal suspensions with attractive interparticle interactions is critical to the rational design of materials for directed assembly [1-3], drug delivery [4], improved hydrocarbon recovery [5-7], and flowable electrodes for energy storage [8]. Suspensions containing fluorescent colloids and non-adsorbing polymers are appealing model systems, as the ratio of the polymer radius of gyration to the particle radius and concentration of polymer control the range and strength of the interparticle attraction, respectively. By tuning the polymer properties and the volume fraction of the colloids, colloid fluids, fluids of clusters, gels, crystals, and glasses can be obtained [9]. Confocal microscopy, a variant of fluorescence microscopy, allows an optically transparent and fluorescent sample to be imaged with high spatial and temporal resolution in three dimensions. In this technique, a small pinhole or slit blocks the emitted fluorescent light from regions of the sample that are outside the focal volume of the microscope optical system. As a result, only a thin section of the sample in the focal plane is imaged. This technique is particularly well suited to probe the structure and dynamics in dense colloidal suspensions at the single-particle scale: the particles are large enough to be resolved using visible light and diffuse slowly enough to be captured at typical scan speeds of commercial confocal systems [10]. Improvements in scan speeds and analysis algorithms have also enabled quantitative confocal imaging of flowing suspensions [11-16,37]. In this paper, we demonstrate confocal microscopy experiments to probe the confined phase behavior and flow properties of colloid-polymer mixtures. We first prepare colloid-polymer mixtures that are density- and refractive-index matched. Next, we report a standard protocol for imaging quiescent dense colloid-polymer mixtures under varying confinement in thin wedge-shaped cells. Finally, we demonstrate a protocol for imaging colloid-polymer mixtures during microchannel flow. PMID:24894062

  2. Spectral Modeling of the 0.4-2.5 μm Phobos CRISM dataset

    NASA Astrophysics Data System (ADS)

    Pajola, Maurizio; Roush, Ted; Dalle Ore, Cristina; Marzo, Giuseppe A.; Simioni, Emanuele

    2017-04-01

    We present the spectral modeling of the 0.4-2.5 μm MRO/CRISM Phobos dataset. After applying a statistical clustering technique, based on a K-means partitioning algorithm, we identified eight separate clusters in the Phobos CRISM data, extending the surface coverage beyond the previous analyses of Fraeman et al. (2012, 2014). Each resulting cluster is characterized by an average and its associated variability. We modeled these different spectra using a radiative transfer code based on the approach of Shkuratov et al. (1999). We used the optical constants of the model proposed by Pajola et al. (2013) in our effort, i.e. the Tagish Lake meteorite (TL) and the Mg-rich pyroxene glass (PM80). The Shkuratov model is used in an algorithm that iteratively, and simultaneously changes the relative abundance and grain sizes of the selected components to minimize the differences between the model and observations using a chi-squared criterion. The best-fitting models were achieved with a simple intimate mixture showing that the relative percentages of TL and PM80 vary between 80-20% and 95-5%, respectively, and grain sizes for TL are 12-14 μm and 20-22 μm for PM80. This work aims to return a detailed picture of the surface properties of Phobos identifying specific areas that may be of interest for future planetary exploration, as the proposed Japanese Mars Moon eXploration (MMX) sample return mission. Acknowledgements: We make use of the public NASA-Planetary Data System MRO-CRISM spectral data of Phobos. M.P. was supported for this research by an appointment to the National Aeronautics and Space Administration (NASA) Post-doctoral Program at the Ames Research Center administered by Universities Space Research Association (USRA) through a contract with NASA. References: Fraeman et al. 2012, J. Geophy. Res, E00J15, 10.1029/2012JE004137; Fraeman et al., 2014, Icarus, 229, 196-205, 10.1016/icarus.2013.11.021; Shkuratov, Y. et al. (1999), Icarus, 137, 235. Pajola et al., 2013, The Astrophysical Journal, 777:127, 10.1088/0004-637X/777/2/127.
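
    A sketch of the clustering step only, K-means partitioning of synthetic reflectance spectra into groups whose mean spectra could then be modeled individually; the Shkuratov-type radiative transfer fitting is not reproduced, and the spectra below are invented.

      # Sketch of the clustering step only: K-means partitioning of synthetic
      # reflectance spectra into clusters whose mean spectra could then be modeled
      # individually. The Shkuratov-type radiative transfer fitting is not shown.
      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(7)
      wavelengths = np.linspace(0.4, 2.5, 100)                  # microns

      def fake_spectrum(slope, depth):                          # red slope + a 2-micron band
          band = depth * np.exp(-0.5 * ((wavelengths - 2.0) / 0.15) ** 2)
          return 0.02 + slope * (wavelengths - 0.4) - band

      spectra = np.vstack([fake_spectrum(s, d) + rng.normal(0, 5e-4, wavelengths.size)
                           for s, d in rng.uniform([0.005, 0.0], [0.02, 0.01], size=(500, 2))])

      km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(spectra)
      cluster_means = km.cluster_centers_                       # average spectrum per cluster
      print("cluster sizes:", np.bincount(km.labels_))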

  3. The importance of botellas and other plant mixtures in Dominican traditional medicine

    PubMed Central

    Vandebroek, Ina; Balick, Michael J.; Ososki, Andreana; Kronenberg, Fredi; Yukes, Jolene; Wade, Christine; Jiménez, Francisco; Peguero, Brígido; Castillo, Daisy

    2010-01-01

    Ethnopharmacological relevance: Plant mixtures are understudied in ethnobotanical research. Aim of the study: To investigate the importance of plant mixtures (remedies consisting of at least two plants) in Dominican traditional medicine. Materials and Methods: A Spanish language questionnaire was administered to 174 Dominicans living in New York City (NYC) and 145 Dominicans living in the Dominican Republic (DR), including lay persons (who self-medicate with plants) and specialists (traditional healers). Plants were identified through specimens purchased in NYC botánica shops and Latino grocery shops, and from voucher collections. Results: The percentage of mixtures as compared to single plants in plant use reports varied between 32 and 41%, depending on the geographic location (NYC or DR) and participant status (lay person or specialist). Respiratory conditions, reproductive health and genitourinary conditions were the main categories for which Dominicans use plant mixtures. Lay persons reported significantly more mixtures prepared as teas, mainly used in NYC to treat respiratory conditions. Specialists mentioned significantly more botellas (bottled herbal mixtures), used most frequently in the DR to treat reproductive health and genitourinary conditions. Cluster analysis demonstrated that different plant species are used to treat respiratory conditions as compared to reproductive health and genitourinary conditions. Interview participants believed that combining plants in mixtures increases their potency and versatility as medicines. Conclusions: The present study demonstrates the importance and complexity of plant mixtures in Dominican traditional medicine and the variation in its practices influenced by migration from the DR to NYC, shedding new light on the foundations of a particular ethnomedical system. PMID:20006697

  4. The importance of botellas and other plant mixtures in Dominican traditional medicine.

    PubMed

    Vandebroek, Ina; Balick, Michael J; Ososki, Andreana; Kronenberg, Fredi; Yukes, Jolene; Wade, Christine; Jiménez, Francisco; Peguero, Brígido; Castillo, Daisy

    2010-03-02

    Plant mixtures are understudied in ethnobotanical research. To investigate the importance of plant mixtures (remedies consisting of at least two plants) in Dominican traditional medicine. A Spanish language questionnaire was administered to 174 Dominicans living in New York City (NYC) and 145 Dominicans living in the Dominican Republic (DR), including lay persons (who self-medicate with plants) and specialists (traditional healers). Plants were identified through specimens purchased in NYC botánica shops and Latino grocery shops, and from voucher collections. The percentage of mixtures as compared to single plants in plant use reports varied between 32 and 41%, depending on the geographic location (NYC or DR) and participant status (lay person or specialist). Respiratory conditions, reproductive health and genitourinary conditions were the main categories for which Dominicans use plant mixtures. Lay persons reported significantly more mixtures prepared as teas, mainly used in NYC to treat respiratory conditions. Specialists mentioned significantly more botellas (bottled herbal mixtures), used most frequently in the DR to treat reproductive health and genitourinary conditions. Cluster analysis demonstrated that different plant species are used to treat respiratory conditions as compared to reproductive health and genitourinary conditions. Interview participants believed that combining plants in mixtures increases their potency and versatility as medicines. The present study demonstrates the importance and complexity of plant mixtures in Dominican traditional medicine and the variation in its practices influenced by migration from the DR to NYC, shedding new light on the foundations of a particular ethnomedical system. Copyright (c) 2009 Elsevier Ireland Ltd. All rights reserved.

  5. A study of finite mixture model: Bayesian approach on financial time series data

    NASA Astrophysics Data System (ADS)

    Phoong, Seuk-Yen; Ismail, Mohd Tahir

    2014-07-01

    Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, while the Bayesian method is a statistical approach used to fit the mixture model. The Bayesian method is widely used because its asymptotic properties provide remarkable results; in addition, it shows a consistency characteristic, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is selected using the Bayesian Information Criterion. Identifying the number of components is important because an incorrect choice may lead to invalid results. The Bayesian method is then utilized to fit the k-component mixture model in order to explore the relationship between the rubber price and the stock market price for Malaysia, Thailand, the Philippines and Indonesia. Lastly, the results show that there is a negative relationship between the rubber price and the stock market price for all selected countries.
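
    A minimal sketch of the component-selection step described above: fit k-component Gaussian mixtures for several k and keep the one with the lowest Bayesian Information Criterion; synthetic data stand in for the price series.

      # Minimal sketch of the component-selection step: fit k-component Gaussian
      # mixtures for several k and keep the one with the lowest Bayesian Information
      # Criterion (BIC). Synthetic data stand in for the paper's price series.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(8)
      x = np.concatenate([rng.normal(-2, 0.4, 300),
                          rng.normal( 0, 0.5, 300),
                          rng.normal( 3, 0.8, 400)]).reshape(-1, 1)

      bics = {k: GaussianMixture(n_components=k, random_state=0).fit(x).bic(x)
              for k in range(1, 6)}
      best_k = min(bics, key=bics.get)
      print({k: round(v, 1) for k, v in bics.items()})
      print("selected number of components:", best_k)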

  6. Accounting for non-independent detection when estimating abundance of organisms with a Bayesian approach

    USGS Publications Warehouse

    Martin, Julien; Royle, J. Andrew; MacKenzie, Darryl I.; Edwards, Holly H.; Kery, Marc; Gardner, Beth

    2011-01-01

    Summary 1. Binomial mixture models use repeated count data to estimate abundance. They are becoming increasingly popular because they provide a simple and cost-effective way to account for imperfect detection. However, these models assume that individuals are detected independently of each other. This assumption may often be violated in the field. For instance, manatees (Trichechus manatus latirostris) may surface in turbid water (i.e. become available for detection during aerial surveys) in a correlated manner (i.e. in groups). However, correlated behaviour, affecting the non-independence of individual detections, may also be relevant in other systems (e.g. correlated patterns of singing in birds and amphibians). 2. We extend binomial mixture models to account for correlated behaviour and therefore to account for non-independent detection of individuals. We simulated correlated behaviour using beta-binomial random variables. Our approach can be used to simultaneously estimate abundance, detection probability and a correlation parameter. 3. Fitting binomial mixture models to data that followed a beta-binomial distribution resulted in an overestimation of abundance even for moderate levels of correlation. In contrast, the beta-binomial mixture model performed considerably better in our simulation scenarios. We also present a goodness-of-fit procedure to evaluate the fit of beta-binomial mixture models. 4. We illustrate our approach by fitting both binomial and beta-binomial mixture models to aerial survey data of manatees in Florida. We found that the binomial mixture model did not fit the data, whereas there was no evidence of lack of fit for the beta-binomial mixture model. This example helps illustrate the importance of using simulations and assessing goodness-of-fit when analysing ecological data with N-mixture models. Indeed, both the simulations and the goodness-of-fit procedure highlighted the limitations of the standard binomial mixture model for aerial manatee surveys. 5. Overestimation of abundance by binomial mixture models owing to non-independent detections is problematic for ecological studies, but also for conservation. For example, in the case of endangered species, it could lead to inappropriate management decisions, such as downlisting. These issues will be increasingly relevant as more ecologists apply flexible N-mixture models to ecological data.
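
    A small simulation of the detection process only (no N-mixture fitting), assuming correlated availability can be mimicked by drawing a survey-level detection probability from a Beta distribution; the resulting beta-binomial counts are overdispersed relative to plain binomial counts with the same mean detection probability, which is the pattern that biases the standard model.

      # Small simulation of the detection process only (no N-mixture fitting):
      # correlated availability is mimicked by drawing a shared, survey-level
      # detection probability from a Beta distribution, which yields beta-binomial
      # counts that are overdispersed relative to plain binomial counts with the
      # same mean detection probability.
      import numpy as np

      rng = np.random.default_rng(9)
      N, p_mean, surveys = 100, 0.4, 5000          # true abundance, mean detection prob.
      a, b = 2.0, 3.0                              # Beta(2, 3) has mean 0.4

      binom_counts = rng.binomial(N, p_mean, size=surveys)
      p_corr = rng.beta(a, b, size=surveys)        # shared "availability" per survey
      betabinom_counts = rng.binomial(N, p_corr)

      for name, c in [("binomial", binom_counts), ("beta-binomial", betabinom_counts)]:
          print(f"{name:14s} mean={c.mean():5.1f}  variance={c.var():6.1f}")
      # Same mean count, much larger variance under correlated detection -- the extra
      # spread is what a standard binomial mixture model misreads as higher abundance.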

  7. A competitive binding model predicts the response of mammalian olfactory receptors to mixtures

    NASA Astrophysics Data System (ADS)

    Singh, Vijay; Murphy, Nicolle; Mainland, Joel; Balasubramanian, Vijay

    Most natural odors are complex mixtures of many odorants, but due to the large number of possible mixtures only a small fraction can be studied experimentally. To get a realistic understanding of the olfactory system we need methods to predict responses to complex mixtures from single odorant responses. Focusing on mammalian olfactory receptors (ORs in mouse and human), we propose a simple biophysical model for odor-receptor interactions where only one odor molecule can bind to a receptor at a time. The resulting competition for occupancy of the receptor accounts for the experimentally observed nonlinear mixture responses. We first fit a dose-response relationship to individual odor responses and then use those parameters in a competitive binding model to predict mixture responses. With no additional parameters, the model predicts responses of 15 (of 18 tested) receptors to within 10 - 30 % of the observed values, for mixtures with 2, 3 and 12 odorants chosen from a panel of 30. Extensions of our basic model with odorant interactions lead to additional nonlinearities observed in mixture response like suppression, cooperativity, and overshadowing. Our model provides a systematic framework for characterizing and parameterizing such mixing nonlinearities from mixture response data.
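
    The competition for a single binding site can be written as occupancy_i = (c_i/K_i) / (1 + sum_j c_j/K_j), with the receptor response taken as an efficacy-weighted sum of occupancies. The sketch below uses this generic textbook form with made-up constants (not the authors' fitted dose-response parameters) to predict a two-odorant mixture response from single-odorant parameters.

      import numpy as np

      def mixture_response(conc, K, efficacy, r_max=1.0):
          """Competitive binding: only one odorant molecule occupies the receptor at a time."""
          conc, K, efficacy = map(np.asarray, (conc, K, efficacy))
          weighted = conc / K
          occupancy = weighted / (1.0 + weighted.sum())   # fractional occupancy per odorant
          return r_max * float((efficacy * occupancy).sum())

      # Hypothetical single-odorant parameters (dissociation constants and efficacies).
      K = [2.0, 10.0]          # odorant A binds more tightly than odorant B
      efficacy = [1.0, 0.3]    # odorant B acts as a weak partial agonist

      for cA, cB in [(1.0, 0.0), (0.0, 20.0), (1.0, 20.0)]:
          r = mixture_response([cA, cB], K, efficacy)
          print(f"cA={cA:4.1f}, cB={cB:4.1f} -> response {r:.3f}")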

  8. Presence of Li Clusters in Molten LiCl-Li

    PubMed Central

    Merwin, Augustus; Phillips, William C.; Williamson, Mark A.; Willit, James L.; Motsegood, Perry N.; Chidambaram, Dev

    2016-01-01

    Molten mixtures of lithium chloride and metallic lithium are of significant interest in various metal oxide reduction processes. These solutions have been reported to exhibit seemingly anomalous physical characteristics that lack a comprehensive explanation. In the current work, the physical chemistry of molten solutions of lithium chloride and metallic lithium, with and without lithium oxide, was investigated using in situ Raman spectroscopy. The Raman spectra obtained from these solutions were in agreement with the previously reported spectrum of the lithium cluster, Li8. This observation is indicative of a nanofluid type colloidal suspension of Li8 in a molten salt matrix. It is suggested that the formation and suspension of lithium clusters in lithium chloride is the cause of various phenomena exhibited by these solutions that were previously unexplainable. PMID:27145895

  9. Application of neuroanatomical features to tractography clustering.

    PubMed

    Wang, Qian; Yap, Pew-Thian; Wu, Guorong; Shen, Dinggang

    2013-09-01

    Diffusion tensor imaging allows unprecedented insight into brain neural connectivity in vivo by allowing reconstruction of neuronal tracts via captured patterns of water diffusion in white matter microstructures. However, tractography algorithms often output hundreds of thousands of fibers, rendering subsequent data analysis intractable. As a remedy, fiber clustering techniques are able to group fibers into dozens of bundles and thus facilitate analyses. Most existing fiber clustering methods rely on geometrical information of fibers, by viewing them as curves in 3D Euclidean space. The important neuroanatomical aspect of fibers, however, is ignored. In this article, the neuroanatomical information of each fiber is encapsulated in the associativity vector, which functions as the unique "fingerprint" of the fiber. Specifically, each entry in the associativity vector describes the relationship between the fiber and a certain anatomical ROI in a fuzzy manner. The value of the entry approaches 1 if the fiber is spatially related to the ROI at high confidence; on the contrary, the value drops closer to 0. The confidence of the ROI is calculated by diffusing the ROI according to the underlying fibers from tractography. In particular, we have adopted the fast marching method for simulation of ROI diffusion. Using the associativity vectors of fibers, we further model fibers as observations sampled from multivariate Gaussian mixtures in the feature space. To group all fibers into relevant major bundles, an expectation-maximization clustering approach is employed. Experimental results indicate that our method results in anatomically meaningful bundles that are highly consistent across subjects. Copyright © 2012 Wiley Periodicals, Inc., a Wiley company.
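
    A minimal sketch of the clustering step, assuming each fiber is already summarized by an associativity vector with entries in [0, 1] (synthetic vectors below, not real tractography output): an EM-fitted Gaussian mixture groups the fibers into bundles.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(2)
      n_rois = 8

      # Synthetic associativity vectors for three bundles, each tied to a different pair of ROIs.
      def make_bundle(hot_rois, n_fibers):
          vec = np.full((n_fibers, n_rois), 0.05)
          vec[:, hot_rois] = 0.9
          return np.clip(vec + rng.normal(0, 0.05, vec.shape), 0.0, 1.0)

      fibers = np.vstack([make_bundle([0, 1], 400),
                          make_bundle([2, 5], 300),
                          make_bundle([6, 7], 300)])

      gmm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0)
      labels = gmm.fit_predict(fibers)          # expectation-maximization under the hood
      print("fibers per recovered bundle:", np.bincount(labels))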

  10. Hydrophobic fluorine mediated switching of the hydrogen bonding site as well as orientation of water molecules in the aqueous mixture of monofluoroethanol: IR, molecular dynamics and quantum chemical studies.

    PubMed

    Mondal, Saptarsi; Biswas, Biswajit; Nandy, Tonima; Singh, Prashant Chandra

    2017-09-20

    The local structures between water-water, alcohol-water and alcohol-alcohol have been investigated for aqueous mixtures of ethanol (ETH) and monofluoroethanol (MFE) by the deconvolution of IR bands in the OH stretching region, molecular dynamics simulation and quantum chemical calculations. It has been found that the addition of a small amount of ETH into the aqueous medium increases the strength of the hydrogen bonds between water molecules. In an aqueous mixture of MFE, the substitution of a single fluorine induces a change in the orientation as well as the hydrogen bonding site of water molecules from the oxygen to the fluorine terminal of MFE. The switching of the hydrogen bonding site of water in the aqueous mixture of MFE results in comparatively strong hydrogen bonds between MFE and water molecules as well as less clustering of water molecules, unlike the case of the aqueous mixture of ETH. These findings about the modification of a hydrogen bond network by the hydrophobic fluorine group probably make fluorinated molecules useful for pharmaceutical as well as biological applications.

  11. Revisiting the Aqueous Solutions of Dimethyl Sulfoxide by Spectroscopy in the Mid- and Near-Infrared: Experiments and Car-Parrinello Simulations.

    PubMed

    Wallace, Victoria M; Dhumal, Nilesh R; Zehentbauer, Florian M; Kim, Hyung J; Kiefer, Johannes

    2015-11-19

    The infrared and near-infrared spectra of the aqueous solutions of dimethyl sulfoxide are revisited. Experimental and computational vibrational spectra are analyzed and compared. The latter are determined as the Fourier transformation of the velocity autocorrelation function of data obtained from Car-Parrinello molecular dynamics simulations. The experimental absorption spectra are deconvolved, and the excess spectra are determined. The two-dimensional excess contour plot provides a means of visualizing and identifying spectral regions and concentration ranges exhibiting nonideal behavior. In the binary mixtures, the analysis of the SO stretching band provides a semiquantitative picture of the formation and dissociation of hydrogen-bonded DMSO-water complexes. A maximum concentration of these clusters is found in the equimolar mixture. At high DMSO concentration, the formation of rather stable 3DMSO:1water complexes is suggested. The formation of 1DMSO:2water clusters, in which the water oxygen atoms interact with the sulfoxide methyl groups, is proposed as a possible reason for the marked depression of the freezing temperature at the eutectic point.

  12. A scoring metric for multivariate data for reproducibility analysis using chemometric methods

    PubMed Central

    Sheen, David A.; de Carvalho Rocha, Werickson Fortunato; Lippa, Katrice A.; Bearden, Daniel W.

    2017-01-01

    Process quality control and reproducibility in emerging measurement fields such as metabolomics is normally assured by interlaboratory comparison testing. As a part of this testing process, spectral features from a spectroscopic method such as nuclear magnetic resonance (NMR) spectroscopy are attributed to particular analytes within a mixture, and it is the metabolite concentrations that are returned for comparison between laboratories. However, data quality may also be assessed directly by using binned spectral data before the time-consuming identification and quantification. Use of the binned spectra has some advantages, including preserving information about trace constituents and enabling identification of process difficulties. In this paper, we demonstrate the use of binned NMR spectra to conduct a detailed interlaboratory comparison and composition analysis. Spectra of synthetic and biologically-obtained metabolite mixtures, taken from a previous interlaboratory study, are compared with cluster analysis using a variety of distance and entropy metrics. The individual measurements are then evaluated based on where they fall within their clusters, and a laboratory-level scoring metric is developed, which provides an assessment of each laboratory’s individual performance. PMID:28694553
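
    The scoring idea can be sketched as follows (synthetic binned spectra and an ad hoc centroid-distance score, not the metric developed in the paper): cluster the spectra with a chosen distance, then score each measurement by how far it sits from its own cluster's centre relative to the cluster's spread.

      import numpy as np
      from scipy.spatial.distance import pdist
      from scipy.cluster.hierarchy import linkage, fcluster

      rng = np.random.default_rng(3)
      # Synthetic "binned spectra": 5 labs x 4 replicates, 60 bins, with the fifth lab slightly biased.
      base = rng.gamma(2.0, 1.0, 60)
      spectra, lab_ids = [], []
      for lab in range(5):
          offset = 0.4 if lab == 4 else 0.0
          for _ in range(4):
              spectra.append(base + offset + rng.normal(0, 0.1, 60))
              lab_ids.append(lab)
      spectra, lab_ids = np.array(spectra), np.array(lab_ids)

      # Hierarchical clustering with a correlation distance (one of several possible metrics).
      clusters = fcluster(linkage(pdist(spectra, metric="correlation"), method="average"),
                          t=2, criterion="maxclust")

      # Score each measurement: distance to its cluster centroid, scaled by the cluster spread.
      scores = np.empty(len(spectra))
      for c in np.unique(clusters):
          members = spectra[clusters == c]
          d = np.linalg.norm(members - members.mean(axis=0), axis=1)
          scores[clusters == c] = d / (d.mean() + 1e-12)
      print("per-lab mean score:", np.round([scores[lab_ids == l].mean() for l in range(5)], 2))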

  13. Novel algorithm for simultaneous component detection and pseudo-molecular ion characterization in liquid chromatography-mass spectrometry.

    PubMed

    Zhang, Yufeng; Wang, Xiaoan; Wo, Siukwan; Ho, Hingman; Han, Quanbin; Fan, Xiaohui; Zuo, Zhong

    2015-01-01

    Resolving components and determining their pseudo-molecular ions (PMIs) are crucial steps in identifying complex herbal mixtures by liquid chromatography-mass spectrometry. To tackle such labor-intensive steps, we present here a novel algorithm for simultaneous detection of components and their PMIs. Our method consists of three steps: (1) obtaining a simplified dataset containing only mono-isotopic masses by removal of background noise and isotopic cluster ions based on the isotopic distribution model derived from all the reported natural compounds in dictionary of natural products; (2) stepwise resolving and removing all features of the highest abundant component from current simplified dataset and calculating PMI of each component according to an adduct-ion model, in which all non-fragment ions in a mass spectrum are considered as PMI plus one or several neutral species; (3) visual classification of detected components by principal component analysis (PCA) to exclude possible non-natural compounds (such as pharmaceutical excipients). This algorithm has been successfully applied to a standard mixture and three herbal extract/preparations. It indicated that our algorithm could detect components' features as a whole and report their PMI with an accuracy of more than 98%. Furthermore, components originated from excipients/contaminants could be easily separated from those natural components in the bi-plots of PCA. Copyright © 2014 Elsevier B.V. All rights reserved.
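
    The adduct-based reasoning behind PMI assignment can be illustrated in isolation (hypothetical ion masses and a reduced set of common positive-mode adducts; the paper's adduct-ion model is more general and also handles neutral species and isotopic filtering): each observed ion implies a neutral mass for each assumed adduct, and a neutral mass supported by several ions is a plausible pseudo-molecular ion.

      # Common positive-mode adduct offsets in Da (assumed set, not the paper's full model).
      ADDUCTS = {"[M+H]+": 1.00728, "[M+Na]+": 22.98922, "[M+K]+": 38.96316, "[M+NH4]+": 18.03383}

      def candidate_neutral_masses(mz_values):
          """For each observed ion, list the neutral masses implied by each adduct;
          a mass supported by two or more ions is a plausible pseudo-molecular ion (PMI)."""
          candidates = {}
          for mz in mz_values:
              for name, offset in ADDUCTS.items():
                  key = round(mz - offset, 2)
                  candidates.setdefault(key, []).append((mz, name))
          return {M: hits for M, hits in candidates.items() if len(hits) >= 2}

      ions = [181.071, 203.053, 219.027, 198.097]   # hypothetical co-eluting ions of one component
      for M, hits in candidate_neutral_masses(ions).items():
          print(f"neutral mass ~{M}: supported by {hits}")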

  14. Testing job typologies and identifying at-risk subpopulations using factor mixture models.

    PubMed

    Keller, Anita C; Igic, Ivana; Meier, Laurenz L; Semmer, Norbert K; Schaubroeck, John M; Brunner, Beatrice; Elfering, Achim

    2017-10-01

    Research in occupational health psychology has tended to focus on the effects of single job characteristics or various job characteristics combined into 1 factor. However, such a variable-centered approach does not account for the clustering of job attributes among groups of employees. We addressed this issue by using a person-centered approach to (a) investigate the occurrence of different empirical constellations of perceived job stressors and resources and (b) validate the meaningfulness of profiles by analyzing their association with employee well-being and performance. We applied factor mixture modeling to identify profiles in 4 large samples consisting of employees in Switzerland (Studies 1 and 2) and the United States (Studies 3 and 4). We identified 2 profiles that spanned the 4 samples, with 1 reflecting a combination of relatively low stressors and high resources (P1) and the other relatively high stressors and low resources (P3). The profiles differed mainly in terms of their organizational and social aspects. Employees in P1 reported significantly higher mean levels of job satisfaction, performance, and general health, and lower means in exhaustion compared with P3. Additional analyses showed differential relationships between job attributes and outcomes depending on profile membership. These findings may benefit organizational interventions as they show that perceived work stressors and resources more strongly influence satisfaction and well-being in particular profiles. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  15. Estimation of value at risk and conditional value at risk using normal mixture distributions model

    NASA Astrophysics Data System (ADS)

    Kamaruzzaman, Zetty Ain; Isa, Zaidi

    2013-04-01

    Normal mixture distributions model has been successfully applied in financial time series analysis. In this paper, we estimate the return distribution, value at risk (VaR) and conditional value at risk (CVaR) for monthly and weekly rates of returns for FTSE Bursa Malaysia Kuala Lumpur Composite Index (FBMKLCI) from July 1990 until July 2010 using the two component univariate normal mixture distributions model. First, we present the application of normal mixture distributions model in empirical finance where we fit our real data. Second, we present the application of normal mixture distributions model in risk analysis where we apply the normal mixture distributions model to evaluate the value at risk (VaR) and conditional value at risk (CVaR) with model validation for both risk measures. The empirical results provide evidence that using the two components normal mixture distributions model can fit the data well and can perform better in estimating value at risk (VaR) and conditional value at risk (CVaR) where it can capture the stylized facts of non-normality and leptokurtosis in returns distribution.
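
    A minimal sketch of how a two-component normal mixture yields VaR and CVaR (simulated returns and simple Monte Carlo tail estimates, not the FBMKLCI data or the validation procedure used by the authors).

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(4)
      # Synthetic returns: a calm regime plus a volatile regime (produces fat tails).
      returns = np.concatenate([rng.normal(0.005, 0.02, 1800),
                                rng.normal(-0.01, 0.06, 200)]).reshape(-1, 1)

      gm = GaussianMixture(n_components=2, random_state=0).fit(returns)

      # Monte Carlo from the fitted mixture to estimate 95% VaR and CVaR of losses.
      sims, _ = gm.sample(200_000)
      losses = -sims.ravel()
      var_95 = np.quantile(losses, 0.95)
      cvar_95 = losses[losses >= var_95].mean()
      print(f"VaR(95%) = {var_95:.4f}, CVaR(95%) = {cvar_95:.4f}")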

  16. Achiral-to-chiral transition in benzil solidification: analogies with racemic conglomerates systems showing deracemization.

    PubMed

    El-Hachemi, Zoubir; Arteaga, Oriol; Canillas, Adolf; Crusats, Joaquim; Sorrenti, Alessandro; Veintemillas-Verdaguer, Sabino; Ribo, Josep M

    2013-07-01

    Experimental results show that benzil (1,2-diphenyl-1,2-ethanedione), an achiral compound that crystallizes as a racemic conglomerate, yields by solidification polycrystalline scalemic mixtures of high enantiomeric excesses. These results are related to those previously reported in this type of compounds on deracemizations of racemic mixtures of crystal enantiomorphs obtained by wet grinding. However, the present results strongly suggest that these experiments cannot be explained without taking into account chiral recognition interactions at the level of precritical clusters. The conditions that would define a general thermodynamic scenario for such deracemizations are discussed. © 2013 Wiley Periodicals, Inc.

  17. Single-step generation of metal-plasma polymer multicore@shell nanoparticles from the gas phase.

    PubMed

    Solař, Pavel; Polonskyi, Oleksandr; Olbricht, Ansgar; Hinz, Alexander; Shelemin, Artem; Kylián, Ondřej; Choukourov, Andrei; Faupel, Franz; Biederman, Hynek

    2017-08-17

    Nanoparticles composed of multiple silver cores and a plasma polymer shell (multicore@shell) were prepared in a single step with a gas aggregation cluster source operating with Ar/hexamethyldisiloxane mixtures and optionally oxygen. The size distribution of the metal inclusions as well as the chemical composition and the thickness of the shells were found to be controlled by the composition of the working gas mixture. Shell matrices ranging from organosilicon plasma polymer to nearly stoichiometric SiO2 were obtained. The method allows facile fabrication of multicore@shell nanoparticles with tailored functional properties, as demonstrated here with the optical response.

  18. Reduced chemical kinetic model of detonation combustion of one- and multi-fuel gaseous mixtures with air

    NASA Astrophysics Data System (ADS)

    Fomin, P. A.

    2018-03-01

    Two-step approximate models of chemical kinetics of detonation combustion of (i) one hydrocarbon fuel CnHm (for example, methane, propane, cyclohexane etc.) and (ii) multi-fuel gaseous mixtures (∑aiCniHmi) (for example, mixture of methane and propane, synthesis gas, benzene and kerosene) are presented for the first time. The models can be used for any stoichiometry, including fuel/fuels-rich mixtures, when reaction products contain molecules of carbon. Owing to their simplicity and high accuracy, the models can be used in multi-dimensional numerical calculations of detonation waves in the corresponding gaseous mixtures. The models are consistent with the second law of thermodynamics and Le Chatelier's principle. Constants of the models have a clear physical meaning. The models can also be used for calculating the thermodynamic parameters of the mixture in a state of chemical equilibrium.

  19. ODE Constrained Mixture Modelling: A Method for Unraveling Subpopulation Structures and Dynamics

    PubMed Central

    Hasenauer, Jan; Hasenauer, Christine; Hucho, Tim; Theis, Fabian J.

    2014-01-01

    Functional cell-to-cell variability is ubiquitous in multicellular organisms as well as bacterial populations. Even genetically identical cells of the same cell type can respond differently to identical stimuli. Methods have been developed to analyse heterogeneous populations, e.g., mixture models and stochastic population models. The available methods are, however, either incapable of simultaneously analysing different experimental conditions or are computationally demanding and difficult to apply. Furthermore, they do not account for biological information available in the literature. To overcome disadvantages of existing methods, we combine mixture models and ordinary differential equation (ODE) models. The ODE models provide a mechanistic description of the underlying processes while mixture models provide an easy way to capture variability. In a simulation study, we show that the class of ODE constrained mixture models can unravel the subpopulation structure and determine the sources of cell-to-cell variability. In addition, the method provides reliable estimates for kinetic rates and subpopulation characteristics. We use ODE constrained mixture modelling to study NGF-induced Erk1/2 phosphorylation in primary sensory neurones, a process relevant in inflammatory and neuropathic pain. We propose a mechanistic pathway model for this process and reconstructed static and dynamical subpopulation characteristics across experimental conditions. We validate the model predictions experimentally, which verifies the capabilities of ODE constrained mixture models. These results illustrate that ODE constrained mixture models can reveal novel mechanistic insights and possess a high sensitivity. PMID:24992156
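
    The core idea can be illustrated with a toy model (hypothetical kinetics and data, far simpler than the NGF/Erk pathway model of the paper): each subpopulation follows the same ODE with a different rate constant, and a Gaussian mixture around the ODE solutions is fitted to single-cell readouts by maximum likelihood.

      import numpy as np
      from scipy.integrate import solve_ivp
      from scipy.optimize import minimize
      from scipy.stats import norm

      rng = np.random.default_rng(5)
      T_OBS = 2.0

      def ode_output(k, t_obs=T_OBS):
          """Toy activation kinetics dx/dt = k * (1 - x), x(0) = 0, evaluated at t_obs."""
          sol = solve_ivp(lambda t, x: k * (1.0 - x), (0.0, t_obs), [0.0], t_eval=[t_obs])
          return sol.y[0, -1]

      # Simulate single-cell measurements from two subpopulations (60% slow, 40% fast).
      y = np.concatenate([rng.normal(ode_output(0.3), 0.05, 600),
                          rng.normal(ode_output(2.0), 0.05, 400)])

      def neg_log_lik(theta):
          k1, k2, w_logit, log_sigma = theta
          w, sigma = 1.0 / (1.0 + np.exp(-w_logit)), np.exp(log_sigma)
          m1, m2 = ode_output(k1), ode_output(k2)
          mix = w * norm.pdf(y, m1, sigma) + (1.0 - w) * norm.pdf(y, m2, sigma)
          return -np.log(mix + 1e-300).sum()

      res = minimize(neg_log_lik, x0=[0.5, 1.5, 0.0, np.log(0.1)], method="Nelder-Mead")
      k1, k2, w_logit, log_sigma = res.x
      print("estimated rates:", round(k1, 2), round(k2, 2),
            " weight:", round(1 / (1 + np.exp(-w_logit)), 2), " sigma:", round(np.exp(log_sigma), 3))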

  20. Identification and Control of Aircrafts using Multiple Models and Adaptive Critics

    NASA Technical Reports Server (NTRS)

    Principe, Jose C.

    2007-01-01

    We compared two possible implementations of local linear models for control: one approach is based on a self-organizing map (SOM) to cluster the dynamics followed by a set of linear models operating at each cluster. Therefore the gating function is hard (a single local model will represent the regional dynamics). This simplifies the controller design since there is a one-to-one mapping between controllers and local models. The second approach uses a soft gate using a probabilistic framework based on a Gaussian Mixture Model (also called a dynamic mixture of experts). In this approach several models may be active at a given time; we can expect a smaller number of models, but the controller design is more involved, with potentially better noise rejection characteristics. Our experiments showed that the SOM provides overall best performance in high SNRs, but the performance degrades faster than with the GMM for the same noise conditions. The SOM approach required about an order of magnitude more models than the GMM, so in terms of implementation cost, the GMM is preferable. The design of the SOM is straightforward, while the design of the GMM controllers, although still reasonable, is more involved and needs more care in the selection of the parameters. Either one of these locally linear approaches outperforms global nonlinear controllers based on neural networks, such as the time delay neural network (TDNN). Therefore, in essence the local model approach warrants practical implementations. In order to call the attention of the control community to this design methodology we successfully extended the multiple model approach to PID controllers (still today the most widely used control scheme in the industry), and wrote a paper on this subject. The echo state network (ESN) is a recurrent neural network with the special characteristic that only the output parameters are trained. The recurrent connections are preset according to the problem domain and are fixed. In a nutshell, the states of the reservoir of recurrent processing elements implement a projection space, where the desired response is optimally projected. This architecture gains training efficiency at the cost of a large increase in the dimension of the recurrent layer. However, the power of the recurrent neural networks can be brought to bear on difficult practical problems. Our goal was to implement an adaptive critic architecture implementing Bellman's approach to optimal control. However, we could only characterize the ESN performance as a critic in value function evaluation, which is just one of the pieces of the overall adaptive critic controller. The results were very convincing, and the simplicity of the implementation was unparalleled.
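
    A sketch of the soft-gated variant described above (a dynamic mixture of experts reduced to its static regression form, fitted to a synthetic piecewise-linear system; not the aircraft models or the controller design): a Gaussian mixture over the input space provides the gate, and one linear model per component is blended by the posterior responsibilities.

      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(6)
      # Piecewise-linear "dynamics": a different local slope on each side of x = 0.
      x = rng.uniform(-3, 3, (1000, 1))
      y = np.where(x[:, 0] < 0, 2.0 * x[:, 0] + 1.0, -0.5 * x[:, 0] + 1.0) + rng.normal(0, 0.1, 1000)

      # Soft gate: a GMM over the input space.
      gate = GaussianMixture(n_components=2, random_state=0).fit(x)
      resp = gate.predict_proba(x)                  # responsibilities = soft region membership

      # One local linear expert per component, fitted with responsibility weights.
      experts = [LinearRegression().fit(x, y, sample_weight=resp[:, k]) for k in range(2)]

      def predict(x_new):
          x_new = np.atleast_2d(x_new)
          w = gate.predict_proba(x_new)
          preds = np.column_stack([m.predict(x_new) for m in experts])
          return (w * preds).sum(axis=1)

      print("predictions at x = -2, 0, 2:", np.round(predict([[-2.0], [0.0], [2.0]]), 2))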

  1. Applicability study of classical and contemporary models for effective complex permittivity of metal powders.

    PubMed

    Kiley, Erin M; Yakovlev, Vadim V; Ishizaki, Kotaro; Vaucher, Sebastien

    2012-01-01

    Microwave thermal processing of metal powders has recently been a topic of a substantial interest; however, experimental data on the physical properties of mixtures involving metal particles are often unavailable. In this paper, we perform a systematic analysis of classical and contemporary models of complex permittivity of mixtures and discuss the use of these models for determining effective permittivity of dielectric matrices with metal inclusions. Results from various mixture and core-shell mixture models are compared to experimental data for a titanium/stearic acid mixture and a boron nitride/graphite mixture (both obtained through the original measurements), and for a tungsten/Teflon mixture (from literature). We find that for certain experiments, the average error in determining the effective complex permittivity using Lichtenecker's, Maxwell Garnett's, Bruggeman's, Buchelnikov's, and Ignatenko's models is about 10%. This suggests that, for multiphysics computer models describing the processing of metal powder in the full temperature range, input data on effective complex permittivity obtained from direct measurement has, up to now, no substitute.
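
    Two of the classical rules mentioned above have simple closed forms. The sketch below evaluates the Maxwell Garnett and Lichtenecker estimates for a hypothetical metal-in-dielectric mixture (illustrative permittivity values, not the measured titanium/stearic acid or tungsten/Teflon data).

      import cmath

      def maxwell_garnett(eps_m, eps_i, f):
          """Maxwell Garnett rule: (e_eff - e_m)/(e_eff + 2 e_m) = f (e_i - e_m)/(e_i + 2 e_m)."""
          num = eps_i + 2 * eps_m + 2 * f * (eps_i - eps_m)
          den = eps_i + 2 * eps_m - f * (eps_i - eps_m)
          return eps_m * num / den

      def lichtenecker(eps_m, eps_i, f):
          """Logarithmic (Lichtenecker) mixing rule: ln(e_eff) = f ln(e_i) + (1 - f) ln(e_m)."""
          return cmath.exp(f * cmath.log(eps_i) + (1 - f) * cmath.log(eps_m))

      eps_matrix = 2.6 - 0.01j   # hypothetical low-loss organic matrix
      eps_metal = -5.0 - 30.0j   # hypothetical metal permittivity at microwave frequency
      for f in (0.05, 0.15, 0.30):
          print(f"f={f:.2f}  Maxwell Garnett: {maxwell_garnett(eps_matrix, eps_metal, f):.3f}"
                f"  Lichtenecker: {lichtenecker(eps_matrix, eps_metal, f):.3f}")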

  2. Modeling and analysis of personal exposures to VOC mixtures using copulas

    PubMed Central

    Su, Feng-Chiao; Mukherjee, Bhramar; Batterman, Stuart

    2014-01-01

    Environmental exposures typically involve mixtures of pollutants, which must be understood to evaluate cumulative risks, that is, the likelihood of adverse health effects arising from two or more chemicals. This study uses several powerful techniques to characterize dependency structures of mixture components in personal exposure measurements of volatile organic compounds (VOCs) with aims of advancing the understanding of environmental mixtures, improving the ability to model mixture components in a statistically valid manner, and demonstrating broadly applicable techniques. We first describe characteristics of mixtures and introduce several terms, including the mixture fraction which represents a mixture component's share of the total concentration of the mixture. Next, using VOC exposure data collected in the Relationship of Indoor Outdoor and Personal Air (RIOPA) study, mixtures are identified using positive matrix factorization (PMF) and by toxicological mode of action. Dependency structures of mixture components are examined using mixture fractions and modeled using copulas, which address dependencies of multiple variables across the entire distribution. Five candidate copulas (Gaussian, t, Gumbel, Clayton, and Frank) are evaluated, and the performance of fitted models was evaluated using simulation and mixture fractions. Cumulative cancer risks are calculated for mixtures, and results from copulas and multivariate lognormal models are compared to risks calculated using the observed data. Results obtained using the RIOPA dataset showed four VOC mixtures, representing gasoline vapor, vehicle exhaust, chlorinated solvents and disinfection by-products, and cleaning products and odorants. Often, a single compound dominated the mixture, however, mixture fractions were generally heterogeneous in that the VOC composition of the mixture changed with concentration. Three mixtures were identified by mode of action, representing VOCs associated with hematopoietic, liver and renal tumors. Estimated lifetime cumulative cancer risks exceeded 10−3 for about 10% of RIOPA participants. Factors affecting the likelihood of high concentration mixtures included city, participant ethnicity, and house air exchange rates. The dependency structures of the VOC mixtures fitted Gumbel (two mixtures) and t (four mixtures) copulas, types that emphasize tail dependencies. Significantly, the copulas reproduced both risk predictions and exposure fractions with a high degree of accuracy, and performed better than multivariate lognormal distributions. Copulas may be the method of choice for VOC mixtures, particularly for the highest exposures or extreme events, cases that poorly fit lognormal distributions and that represent the greatest risks. PMID:24333991
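
    A compact sketch of the copula idea on synthetic "exposure" data (a Gaussian copula fitted via normal scores; not the RIOPA data and not the Gumbel/t fits reported above): the marginals are separated from the dependence structure, the dependence is estimated on normal scores, and new joint samples are generated by mapping simulated scores back through the empirical marginals.

      import numpy as np
      from scipy.stats import norm, rankdata

      rng = np.random.default_rng(7)
      # Synthetic skewed, correlated "concentrations" of two mixture components.
      latent = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=2000)
      data = np.exp(0.8 * latent + 0.2)             # lognormal-like marginals

      # 1) Pseudo-observations and normal scores (the Gaussian copula fit).
      u = rankdata(data, axis=0) / (data.shape[0] + 1)
      z = norm.ppf(u)
      R = np.corrcoef(z, rowvar=False)
      print("estimated copula correlation:\n", np.round(R, 2))

      # 2) Simulate from the fitted copula and map back through the empirical marginals.
      z_new = rng.multivariate_normal(np.zeros(2), R, size=5000)
      u_new = norm.cdf(z_new)
      sim = np.column_stack([np.quantile(data[:, j], u_new[:, j]) for j in range(2)])
      rank_corr = np.corrcoef(rankdata(sim, axis=0), rowvar=False)[0, 1]
      print("simulated rank correlation:", round(rank_corr, 2))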

  3. East Greenland and Barents Sea polar bears (Ursus maritimus): adaptive variation between two populations using skull morphometrics as an indicator of environmental and genetic differences.

    PubMed

    Pertoldi, Cino; Sonne, Christian; Wiig, Øystein; Baagøe, Hans J; Loeschcke, Volker; Bechshøft, Thea Østergaard

    2012-06-01

    A morphometric study was conducted on four skull traits of 37 male and 18 female adult East Greenland polar bears (Ursus maritimus) collected 1892-1968, and on 54 male and 44 female adult Barents Sea polar bears collected 1950-1969. The aim was to compare differences in size and shape of the bear skulls using a multivariate approach, characterizing the variation between the two populations using morphometric traits as an indicator of environmental and genetic differences. Mixture analysis testing for geographic differentiation within each population revealed three clusters for Barents Sea males and three clusters for Barents Sea females. East Greenland consisted of one female and one male cluster. A principal component analysis (PCA) conducted on the clusters defined by the mixture analysis, showed that East Greenland and Barents Sea polar bear populations overlapped to a large degree, especially with regards to females. Multivariate analyses of variance (MANOVA) showed no significant differences in morphometric means between the two populations, but differences were detected between clusters from each respective geographic locality. To estimate the importance of genetics and environment in the morphometric differences between the bears, a PCA was performed on the covariance matrix derived from the skull measurements. Skull trait size (PC1) explained approx. 80% of the morphometric variation, whereas shape (PC2) defined approx. 15%, indicating some genetic differentiation. Hence, both environmental and genetic factors seem to have contributed to the observed skull differences between the two populations. Overall, results indicate that many Barents Sea polar bears are morphometrically similar to the East Greenland ones, suggesting an exchange of individuals between the two populations. Furthermore, a subpopulation structure in the Barents Sea population was also indicated from the present analyses, which should be considered with regards to future management decisions. © 2012 The Authors.

  4. An initial perspective of S-asteroid subtypes within asteroid families

    NASA Technical Reports Server (NTRS)

    Kelley, M. S.; Gaffey, M. J.

    1993-01-01

    Many main belt asteroids cluster around certain values of semi-major axis (a), inclination (i), and eccentricity (e). Hirayama was the first to notice these concentrations which he interpreted as evidence of disruptions of larger parent bodies. He called these clusters 'asteroid families'. The term 'families' is increasingly reserved for genetic associations to distinguish them from clusters of unknown or purely dynamical origin (e.g. the Phocaea cluster). Members of a genetic asteroid family represent fragments derived from various depths within the original parent planetesimal. Thus, family members offer the potential for direct examination of the interiors of parent bodies which have undergone metamorphism and differentiation similar to that occurring in the inaccessible interiors of terrestrial planets. The condition that genetic family members represent the fragments of a parent object provides a critical test of whether an association (cluster in proper element space) is a genetic family. Compositions (types and relative abundances of materials) of family members must permit the reconstruction of a compositionally plausible parent body. The compositions of proposed family members can be utilized to test the genetic reality of the family and to determine the type and degree of internal differentiation within the parent planetesimal. The interpretation of the S-class mineralogy provides a preliminary evaluation of family memberships. Detailed mineralogical and petrological analysis was done based on the reflectance spectra of 39 S-type asteroids. The result is a division of the S-asteroid class into seven subtypes based on compositional differences. These subtypes, designated S(I) to S(VII), correspond to surface silicate assemblages ranging from monomineralic olivine (dunites) through olivine-pyroxene mixtures to pure pyroxene or pyroxene-feldspar mixtures (basalts). The most general conclusion is that the S-asteroids cannot be treated as a single group of objects without greatly oversimplifying their properties. Each S-subtype needs to be treated as an independent group with a distinct evolutionary history.

  5. New PARSEC data base of α-enhanced stellar evolutionary tracks and isochrones - I. Calibration with 47 Tuc (NGC 104) and the improvement on RGB bump

    NASA Astrophysics Data System (ADS)

    Fu, Xiaoting; Bressan, Alessandro; Marigo, Paola; Girardi, Léo; Montalbán, Josefina; Chen, Yang; Nanni, Ambra

    2018-05-01

    Precise studies on the Galactic bulge, globular cluster, Galactic halo, and Galactic thick disc require stellar models with α enhancement and various values of helium content. These models are also important for extra-Galactic population synthesis studies. For this purpose, we complement the existing PARSEC models, which are based on the solar partition of heavy elements, with α-enhanced partitions. We collect detailed measurements on the metal mixture and helium abundance for the two populations of 47 Tuc (NGC 104) from the literature, and calculate stellar tracks and isochrones with these α-enhanced compositions. By fitting the precise colour-magnitude diagram with HST ACS/WFC data, from low main sequence till horizontal branch (HB), we calibrate some free parameters that are important for the evolution of low mass stars like the mixing at the bottom of the convective envelope. This new calibration significantly improves the prediction of the red giant branch bump (RGBB) brightness. Comparison with the observed RGB and HB luminosity functions also shows that the evolutionary lifetimes are correctly predicted. As a further result of this calibration process, we derive the age, distance modulus, reddening, and the RGB mass-loss for 47 Tuc. We apply the new calibration and α-enhanced mixtures of the two 47 Tuc populations ([α/Fe] ˜ 0.4 and 0.2) to other metallicities. The new models reproduce the RGB bump observations much better than previous models. This new PARSEC data base, with the newly updated α-enhanced stellar evolutionary tracks and isochrones, will also be a part of the new stellar products for Gaia.

  6. Segmentation of 3D microPET images of the rat brain via the hybrid gaussian mixture method with kernel density estimation.

    PubMed

    Chen, Tai-Been; Chen, Jyh-Cheng; Lu, Henry Horng-Shing

    2012-01-01

    Segmentation of positron emission tomography (PET) is typically achieved using the K-Means method or other approaches. In preclinical and clinical applications, the K-Means method needs a prior estimation of parameters such as the number of clusters and appropriate initialized values. This work segments microPET images using a hybrid method combining the Gaussian mixture model (GMM) with kernel density estimation. Segmentation is crucial to registration of disordered 2-deoxy-2-fluoro-D-glucose (FDG) accumulation locations with functional diagnosis and to estimate standardized uptake values (SUVs) of region of interests (ROIs) in PET images. Therefore, simulation studies are conducted to apply spherical targets to evaluate segmentation accuracy based on Tanimoto's definition of similarity. The proposed method generates a higher degree of similarity than the K-Means method. The PET images of a rat brain are used to compare the segmented shape and area of the cerebral cortex by the K-Means method and the proposed method by volume rendering. The proposed method provides clearer and more detailed activity structures of an FDG accumulation location in the cerebral cortex than those by the K-Means method.
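
    A one-dimensional sketch of the comparison (synthetic voxel intensities and a plain Gaussian mixture rather than the authors' hybrid GMM/kernel-density method): both K-Means and a GMM label each voxel, and Tanimoto (Jaccard) similarity against the known target mask serves as the accuracy measure.

      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(8)
      # Synthetic voxel intensities: background vs. a smaller, brighter, more variable target.
      truth = np.concatenate([np.zeros(4000, dtype=int), np.ones(800, dtype=int)])
      intensity = np.concatenate([rng.normal(100, 10, 4000),
                                  rng.normal(160, 30, 800)]).reshape(-1, 1)

      def tanimoto(a, b):
          a, b = a.astype(bool), b.astype(bool)
          return (a & b).sum() / (a | b).sum()

      def as_target_mask(labels):
          # Call the cluster with the higher mean intensity the "target".
          target = int(intensity[labels == 1].mean() > intensity[labels == 0].mean())
          return (labels == target).astype(int)

      km = as_target_mask(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(intensity))
      gm = as_target_mask(GaussianMixture(n_components=2, random_state=0).fit_predict(intensity))
      print("Tanimoto, KMeans:", round(tanimoto(km, truth), 3),
            " GMM:", round(tanimoto(gm, truth), 3))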

  7. Inference of the phase-to-mechanical property link via coupled X-ray spectrometry and indentation analysis: Application to cement-based materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krakowiak, Konrad J.; Wilson, William; James, Simon

    2015-01-15

    A novel approach for the chemo-mechanical characterization of cement-based materials is presented, which combines the classical grid indentation technique with elemental mapping by scanning electron microscopy-energy dispersive X-ray spectrometry (SEM-EDS). It is illustrated through application to an oil-well cement system with siliceous filler. The characteristic X-rays of major elements (silicon, calcium and aluminum) are measured over the indentation region and mapped back on the indentation points. Measured intensities together with indentation hardness and modulus are considered in a clustering analysis within the framework of Finite Mixture Models with Gaussian component density function. The method is able to successfully isolate the calcium-silica-hydrate gel at the indentation scale from its mixtures with other products of cement hydration and anhydrous phases; thus providing a convenient means to link mechanical response to the calcium-to-silicon ratio quantified independently via X-ray wavelength dispersive spectroscopy. A discussion of uncertainty quantification of the estimated chemo-mechanical properties and phase volume fractions, as well as the effect of chemical observables on phase assessment is also included.

  8. Brimstone chemistry under laser light assists mass spectrometric detection and imaging the distribution of arsenic in minerals.

    PubMed

    Lal, Swapnil; Zheng, Zhaoyu; Pavlov, Julius; Attygalle, Athula B

    2018-05-23

    Singly charged As2n+1 ion clusters (n = 2-11) were generated from elemental arsenic by negative-ion laser-ablation mass spectrometry. The overall abundance of the gaseous As ions generated upon laser irradiation was enhanced nearly a hundred times when As-bearing samples were admixed with sulfur. However, sulfur does not act purely as an inert matrix: irradiating arsenic-sulfur mixtures revealed a novel pathway to generate and detect a series of [AsSn]- clusters (n = 2-6). Intriguingly, the spectra recorded from As2O3, NaAsO2, Na3AsO4, cacodylic acid and 3-amino-4-hydroxyphenylarsonic acid together with sulfur as the matrix were remarkably similar to that acquired from an elemental arsenic and sulfur mixture. This result indicated that arsenic sulfide cluster-ions are generated directly from arsenic compounds by a hitherto unknown pathway. The mechanism of elemental sulfur extracting chemically bound arsenic from compounds and forming [AsSn]- clusters is enigmatic; however, this discovery has a practical value as a general detection method for arsenic compounds. For example, the method was employed for the detection of As in its minerals, and for the imaging of arsenic distribution in minerals such as domeykite. LDI-MS data recorded from a latent image imprinted on a piece of paper from a flat mineral surface, and wetting the paper with a solution of sulfur, enabled the localization of arsenic in the mineral. The distribution of As was visualized as false-color images by extracting from acquired data the relative intensities of m/z 139 (AsS2-) and m/z 171 (AsS3-) ions.

  9. Estimation and Model Selection for Finite Mixtures of Latent Interaction Models

    ERIC Educational Resources Information Center

    Hsu, Jui-Chen

    2011-01-01

    Latent interaction models and mixture models have received considerable attention in social science research recently, but little is known about how to handle if unobserved population heterogeneity exists in the endogenous latent variables of the nonlinear structural equation models. The current study estimates a mixture of latent interaction…

  10. Scale Mixture Models with Applications to Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Qin, Zhaohui S.; Damien, Paul; Walker, Stephen

    2003-11-01

    Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.

  11. Automatic NMR-Based Identification of Chemical Reaction Types in Mixtures of Co-Occurring Reactions

    PubMed Central

    Latino, Diogo A. R. S.; Aires-de-Sousa, João

    2014-01-01

    The combination of chemoinformatics approaches with NMR techniques and the increasing availability of data allow the resolution of problems far beyond the original application of NMR in structure elucidation/verification. The diversity of applications can range from process monitoring, metabolic profiling, authentication of products, to quality control. An application related to the automatic analysis of complex mixtures concerns mixtures of chemical reactions. We encoded mixtures of chemical reactions with the difference between the 1H NMR spectra of the products and the reactants. All the signals arising from all the reactants of the co-occurring reactions were taken together (a simulated spectrum of the mixture of reactants) and the same was done for products. The difference spectrum is taken as the representation of the mixture of chemical reactions. A data set of 181 chemical reactions was used, each reaction manually assigned to one of 6 types. From this dataset, we simulated mixtures where two reactions of different types would occur simultaneously. Automatic learning methods were trained to classify the reactions occurring in a mixture from the 1H NMR-based descriptor of the mixture. Unsupervised learning methods (self-organizing maps) produced a reasonable clustering of the mixtures by reaction type, and allowed the correct classification of 80% and 63% of the mixtures in two independent test sets of different similarity to the training set. With random forests (RF), the percentage of correct classifications was increased to 99% and 80% for the same test sets. The RF probability associated to the predictions yielded a robust indication of their reliability. This study demonstrates the possibility of applying machine learning methods to automatically identify types of co-occurring chemical reactions from NMR data. Using no explicit structural information about the reactions participants, reaction elucidation is performed without structure elucidation of the molecules in the mixtures. PMID:24551112
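
    A toy version of the classification step (synthetic "difference spectra" built from made-up reaction-type templates, not real 1H NMR data): a random forest is trained on the difference descriptors, and its class probabilities serve as a reliability indicator, in the spirit of the study.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(9)
      n_bins, n_types = 64, 4
      templates = rng.normal(0, 1, (n_types, n_bins))   # one spectral signature per reaction type

      # Each sample mixes two co-occurring reaction types plus noise; for simplicity the
      # sketch predicts only the dominant type (the paper handles the full multi-label case).
      X, y = [], []
      for _ in range(600):
          t1, t2 = rng.choice(n_types, size=2, replace=False)
          X.append(templates[t1] + 0.4 * templates[t2] + rng.normal(0, 0.3, n_bins))
          y.append(t1)
      X, y = np.array(X), np.array(y)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
      rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
      proba = rf.predict_proba(X_te)
      print("test accuracy:", round(rf.score(X_te, y_te), 2))
      print("mean top-class probability (reliability proxy):", round(proba.max(axis=1).mean(), 2))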

  12. Automatic NMR-based identification of chemical reaction types in mixtures of co-occurring reactions.

    PubMed

    Latino, Diogo A R S; Aires-de-Sousa, João

    2014-01-01

    The combination of chemoinformatics approaches with NMR techniques and the increasing availability of data allow the resolution of problems far beyond the original application of NMR in structure elucidation/verification. The diversity of applications can range from process monitoring, metabolic profiling, authentication of products, to quality control. An application related to the automatic analysis of complex mixtures concerns mixtures of chemical reactions. We encoded mixtures of chemical reactions with the difference between the (1)H NMR spectra of the products and the reactants. All the signals arising from all the reactants of the co-occurring reactions were taken together (a simulated spectrum of the mixture of reactants) and the same was done for products. The difference spectrum is taken as the representation of the mixture of chemical reactions. A data set of 181 chemical reactions was used, each reaction manually assigned to one of 6 types. From this dataset, we simulated mixtures where two reactions of different types would occur simultaneously. Automatic learning methods were trained to classify the reactions occurring in a mixture from the (1)H NMR-based descriptor of the mixture. Unsupervised learning methods (self-organizing maps) produced a reasonable clustering of the mixtures by reaction type, and allowed the correct classification of 80% and 63% of the mixtures in two independent test sets of different similarity to the training set. With random forests (RF), the percentage of correct classifications was increased to 99% and 80% for the same test sets. The RF probability associated to the predictions yielded a robust indication of their reliability. This study demonstrates the possibility of applying machine learning methods to automatically identify types of co-occurring chemical reactions from NMR data. Using no explicit structural information about the reactions participants, reaction elucidation is performed without structure elucidation of the molecules in the mixtures.

  13. Accurate calibration of a molecular beam time-of-flight mass spectrometer for on-line analysis of high molecular weight species.

    PubMed

    Apicella, B; Wang, X; Passaro, M; Ciajolo, A; Russo, C

    2016-10-15

    Time-of-Flight (TOF) Mass Spectrometry is a powerful analytical technique, provided that an accurate calibration by standard molecules in the same m/z range as the analytes is performed. Calibration in a very large m/z range is a difficult task, particularly in studies focusing on the detection of high molecular weight clusters of different molecules or high molecular weight species. External calibration is the most common procedure used for TOF mass spectrometric analysis in the gas phase and, generally, the only available standards are made up of mixtures of noble gases, covering a small mass range for calibration, up to m/z 136 (the highest-mass isotope of xenon). In this work, an accurate calibration of a Molecular Beam Time-of-Flight Mass Spectrometer (MB-TOFMS) is presented, based on the use of water clusters up to m/z 3000. The advantages of calibrating a MB-TOFMS with water clusters for the detection of analytes with masses above those of the traditional calibrants such as noble gases were quantitatively shown by statistical calculations. A comparison of the water cluster and noble gases calibration procedures in attributing the masses to a test mixture extending up to m/z 800 is also reported. In the case of the analysis of combustion products, another important feature of water cluster calibration was shown, that is the possibility of using them as "internal standard" directly formed from the combustion water, under suitable experimental conditions. The water clusters calibration of a MB-TOFMS gives rise to a ten-fold reduction in error compared to the traditional calibration with noble gases. The consequent improvement in mass accuracy in the calibration of a MB-TOFMS has important implications in various fields where detection of high molecular mass species is required. In combustion products analysis, it is also possible to obtain a new calibration spectrum before the acquisition of each spectrum, only modifying some operative conditions. Copyright © 2016 John Wiley & Sons, Ltd.

  14. Characterization of Mixtures. Part 2: QSPR Models for Prediction of Excess Molar Volume and Liquid Density Using Neural Networks.

    PubMed

    Ajmani, Subhash; Rogers, Stephen C; Barley, Mark H; Burgess, Andrew N; Livingstone, David J

    2010-09-17

    In our earlier work, we have demonstrated that it is possible to characterize binary mixtures using single component descriptors by applying various mixing rules. We also showed that these methods were successful in building predictive QSPR models to study various mixture properties of interest. Herein, we developed a QSPR model of an excess thermodynamic property of binary mixtures, i.e. excess molar volume (V(E)). In the present study, we use a set of mixture descriptors which we earlier designed to specifically account for intermolecular interactions between the components of a mixture and applied successfully to the prediction of infinite-dilution activity coefficients using neural networks (part 1 of this series). We obtain a significant QSPR model for the prediction of excess molar volume (V(E)) using consensus neural networks and five mixture descriptors. We find that hydrogen bond and thermodynamic descriptors are the most important in determining excess molar volume (V(E)), which is in line with the theory of intermolecular forces governing excess mixture properties. The results also suggest that the mixture descriptors utilized herein may be sufficient to model a wide variety of properties of binary and possibly even more complex mixtures. Copyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
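
    A minimal sketch of the modelling idea (fully synthetic descriptors, mixing rules, and excess-volume data; not the descriptors or consensus networks of the paper): mixture descriptors are built from component descriptors and mole fractions, and a small neural network regresses the excess molar volume on them.

      import numpy as np
      from sklearn.neural_network import MLPRegressor
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(11)
      n_pairs = 400
      d1 = rng.normal(0, 1, (n_pairs, 2))       # hypothetical descriptors of component 1
      d2 = rng.normal(0, 1, (n_pairs, 2))       # hypothetical descriptors of component 2
      x1 = rng.uniform(0.05, 0.95, n_pairs)     # mole fraction of component 1

      # Simple mixing rules: mole-fraction-weighted means, descriptor differences, x1*(1-x1).
      X = np.column_stack([x1 * d1[:, 0] + (1 - x1) * d2[:, 0],
                           x1 * d1[:, 1] + (1 - x1) * d2[:, 1],
                           np.abs(d1 - d2),
                           x1 * (1 - x1)])
      # Synthetic excess molar volume: an interaction term modulated by descriptor mismatch.
      VE = -x1 * (1 - x1) * (1.0 + 0.5 * np.abs(d1[:, 0] - d2[:, 0])) + rng.normal(0, 0.02, n_pairs)

      X_tr, X_te, y_tr, y_te = train_test_split(X, VE, test_size=0.25, random_state=0)
      net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0).fit(X_tr, y_tr)
      print("test R^2:", round(net.score(X_te, y_te), 2))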

  15. Development of reversible jump Markov Chain Monte Carlo algorithm in the Bayesian mixture modeling for microarray data in Indonesia

    NASA Astrophysics Data System (ADS)

    Astuti, Ani Budi; Iriawan, Nur; Irhamah, Kuswanto, Heri

    2017-12-01

    Bayesian mixture modeling requires a stage that identifies the most appropriate number of mixture components, so that the resulting mixture model fits the data in a data-driven way. Reversible Jump Markov Chain Monte Carlo (RJMCMC) combines the reversible jump (RJ) concept with the Markov Chain Monte Carlo (MCMC) concept and has been used by several researchers to identify the number of mixture components when that number is not known with certainty. In its application, RJMCMC uses birth/death and split-merge concepts with six types of moves: w updating, θ updating, z updating, hyperparameter β updating, split-merge of components, and birth/death of empty components. The RJMCMC algorithm needs to be developed according to the case under study. The purpose of this study is to assess the performance of the developed RJMCMC algorithm in identifying the unknown number of mixture components in Bayesian mixture modeling for microarray data in Indonesia. The results show that the developed RJMCMC algorithm is able to properly identify the number of mixture components in the Bayesian normal mixture model for the Indonesian microarray data, where the number of components is not known with certainty.

  16. Amplification of the entire kanamycin biosynthetic gene cluster during empirical strain improvement of Streptomyces kanamyceticus.

    PubMed

    Yanai, Koji; Murakami, Takeshi; Bibb, Mervyn

    2006-06-20

    Streptomyces kanamyceticus 12-6 is a derivative of the wild-type strain developed for industrial kanamycin (Km) production. Southern analysis and DNA sequencing revealed amplification of a large genomic segment including the entire Km biosynthetic gene cluster in the chromosome of strain 12-6. At 145 kb, the amplifiable unit of DNA (AUD) is the largest AUD reported in Streptomyces. Striking repetitive DNA sequences belonging to the clustered regularly interspaced short palindromic repeats family were found in the AUD and may play a role in its amplification. Strain 12-6 contains a mixture of different chromosomes with varying numbers of AUDs, sometimes exceeding 36 copies and producing an amplified region >5.7 Mb. The level of Km production depended on the copy number of the Km biosynthetic gene cluster, suggesting that DNA amplification occurred during strain improvement as a consequence of selection for increased Km resistance. Amplification of DNA segments including entire antibiotic biosynthetic gene clusters might be a common mechanism leading to increased antibiotic production in industrial strains.

  17. Non-linear clustering in the cold plus hot dark matter model

    NASA Astrophysics Data System (ADS)

    Bonometto, Silvio A.; Borgani, Stefano; Ghigna, Sebastiano; Klypin, Anatoly; Primack, Joel R.

    1995-03-01

    The main aim of this work is to find out if hierarchical scaling, observed in galaxy clustering, can be dynamically explained by studying N-body simulations. Previous analyses of dark matter (DM) particle distributions indicated heavy distortions with respect to the hierarchical pattern. Here, we shall describe how such distortions are to be interpreted and why they can be fully reconciled with the observed galaxy clustering. This aim is achieved by using high-resolution (512^3 grid-points) particle-mesh (PM) N-body simulations to follow the development of non-linear clustering in an Omega=1 universe, dominated either by cold dark matter (CDM) or by a mixture of cold+hot dark matter (CHDM) with Omega_cold=0.6, Omega_hot=0.3 and Omega_baryon=0.1; a simulation box of side 100 Mpc (h=0.5) is used. We analyse two CHDM realizations with biasing factor b=1.5 (COBE normalization), starting from different initial random numbers, and compare them with CDM simulations with b=1 (COBE-compatible) and b=1.5. We evaluate high-order correlation functions and the void probability function (VPF). Correlation functions are obtained from both counts in cells and counts of neighbours. The analysis is carried out for DM particles and for galaxies identified as massive haloes of the evolved density field. We confirm that clustering of DM particles systematically exhibits deviations from hierarchical scaling, although the deviation increases somewhat in redshift space. Deviations from the hierarchical scaling of DM particles are found to be related to the spectrum shape, in a way that indicates that such distortions arise from finite sampling effects. We identify galaxy positions in the simulations and show that, quite differently from the DM particle background, galaxies follow hierarchical scaling (S_q = xi_q / xi_2^(q-1) = constant) far more closely, with reduced skewness and kurtosis coefficients S_3 ~ 2.5 and S_4 ~ 7.5, in general agreement with observational results. Unlike DM, the scaling of galaxy clustering is only marginally affected by redshift distortions and is obtained for both CDM and CHDM models. Hierarchical scaling in simulations is confirmed by VPF analysis. Also in this case, we find substantial agreement with observational findings.

  18. QSAR prediction of additive and non-additive mixture toxicities of antibiotics and pesticide.

    PubMed

    Qin, Li-Tang; Chen, Yu-Han; Zhang, Xin; Mo, Ling-Yun; Zeng, Hong-Hu; Liang, Yan-Peng

    2018-05-01

    Antibiotics and pesticides may exist as mixtures in the real environment. The combined effect of a mixture can be either additive or non-additive (synergism and antagonism). However, no effective approach exists for predicting the synergistic and antagonistic toxicities of mixtures. In this study, we developed a quantitative structure-activity relationship (QSAR) model for the toxicities (half-effect concentration, EC50) of 45 binary and multi-component mixtures composed of two antibiotics and four pesticides. The acute toxicities of single compounds and mixtures toward Aliivibrio fischeri were tested. A genetic algorithm was used to obtain the optimized model with three theoretical descriptors. Various internal and external validation techniques yielded a coefficient of determination of 0.9366 and a root mean square error of 0.1345 for the QSAR model, which predicted the toxicities of the 45 mixtures exhibiting additive, synergistic, and antagonistic effects. Compared with the traditional concentration addition and independent action models, the QSAR model exhibited an advantage in predicting mixture toxicity. Thus, the presented approach may be able to fill the gaps in predicting non-additive toxicities of binary and multi-component mixtures. Copyright © 2018 Elsevier Ltd. All rights reserved.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fu, H.B.; Hu, Y.J.; Bernstein, E.R.

    Small methanol clusters are formed by expanding a mixture of methanol vapor seeded in helium and are detected using a vacuum UV (vuv, 118 nm) single-photon ionization/linear time-of-flight mass spectrometer (TOFMS). Protonated cluster ions, (CH3OH)n-1H+ (n=2-8), formed through intracluster ion-molecule reactions following ionization, essentially correlate to the neutral clusters, (CH3OH)n, in the present study using 118 nm light as the ionization source. Both experimental and Born-Haber calculational results clarify that not enough excess energy is released into the protonated cluster ions to initiate further fragmentation on the time scale appropriate for linear TOFMS. Size-specific spectra for (CH3OH)n (n=4 to 8) clusters in the OH stretch fundamental region are recorded by IR+vuv (118 nm) nonresonant ion-dip spectroscopy through the detection chain of IR multiphoton predissociation and subsequent vuv single-photon ionization. The general structures and gross features of these cluster spectra are consistent with previous theoretical calculations. The lowest-energy peak in each cluster spectrum is redshifted with increasing cluster size from n=4 to 8, and approaches a limit near ~3220 cm^-1 in the heptamer and octamer. Moreover, IR+vuv nonresonant ionization-detected spectroscopy is employed to study the OH stretch first overtone of the methanol monomer. The rotational temperature of the clusters is estimated to be at least 50 K based on the simulation of the monomer rotational envelope under clustering conditions.

  20. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    NASA Astrophysics Data System (ADS)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidean and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.
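
    The workflow above can be reproduced in outline with a few lines (a random stand-in for the 17-station precipitation matrix, not the Bangladesh data): pairwise distances under several metrics, alternative linkage algorithms, and a metric MDS embedding for the geometric view.

      import numpy as np
      from scipy.spatial.distance import pdist
      from scipy.cluster.hierarchy import linkage, fcluster
      from sklearn.manifold import MDS

      rng = np.random.default_rng(10)
      # Hypothetical seasonal precipitation matrix: 17 stations x 12 monthly means.
      stations = [f"station_{i:02d}" for i in range(17)]
      precip = rng.gamma(shape=2.0, scale=150.0, size=(17, 12))

      # Hierarchical clustering under different distance/linkage combinations.
      metrics = {"euclidean": {}, "cityblock": {}, "minkowski": {"p": 3}}
      for metric, kw in metrics.items():
          d = pdist(precip, metric=metric, **kw)
          for method in ("single", "average", "ward"):
              if method == "ward" and metric != "euclidean":
                  continue            # Ward's linkage is defined for Euclidean distances
              labels = fcluster(linkage(d, method=method), t=5, criterion="maxclust")
              print(f"{metric:>9s}/{method:<7s} cluster sizes:", np.bincount(labels)[1:])

      # Metric multidimensional scaling of the stations into two dimensions.
      coords = MDS(n_components=2, dissimilarity="euclidean", random_state=0).fit_transform(precip)
      print("embedding of", stations[0], "->", np.round(coords[0], 1))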

  1. Structures and interactions in N-methylacetamide-water mixtures studied by IR spectra and density functional theory

    NASA Astrophysics Data System (ADS)

    Zhang, Rong; Li, Haoran; Lei, Yi; Han, Shijun

    2004-05-01

    IR spectra were measured to study the structures and interactions in N-methylacetamide (NMA)-water mixtures. Because of the competition between acceptor and donor roles in the strong hydrogen bonds, interesting red shifts and blue shifts are observed in νCO and νN-H. Owing to the blue-shifting C-H⋯O hydrogen bond, νC-H shows a more pronounced blue shift. Representative cluster structures are then suggested and further investigated with a density functional theory (DFT) method. The changes in bond length and frequency shift of these structures account well for the observed red and blue shifts, in excellent agreement with the IR experiment. The IR spectra and DFT calculations reveal that the weak C-H⋯O interactions play different roles compared with the classical strong hydrogen bonds in the NMA-water mixtures.

  2. Robust nonlinear system identification: Bayesian mixture of experts using the t-distribution

    NASA Astrophysics Data System (ADS)

    Baldacchino, Tara; Worden, Keith; Rowson, Jennifer

    2017-02-01

    A novel variational Bayesian mixture of experts model for robust regression of bifurcating and piece-wise continuous processes is introduced. The mixture of experts model is a powerful model which probabilistically splits the input space, allowing different models to operate in separate regions. However, current methods have no fail-safe against outliers. In this paper, a robust mixture of experts model is proposed which consists of Student-t mixture models at the gates and Student-t distributed experts, trained via Bayesian inference. The Student-t distribution has heavier tails than the Gaussian distribution, and so it is more robust to outliers, noise and non-normality in the data. Using both simulated data and real data obtained from the Z24 bridge, this robust mixture of experts model performs better than its Gaussian counterpart when outliers are present. In particular, it provides robustness to outliers in two forms: unbiased parameter estimates in the regression models, and robustness to overfitting/overly complex models.
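
    The robustness argument above can be illustrated with a toy comparison (a sketch, not the paper's variational mixture-of-experts implementation): fitting Gaussian and Student-t location-scale models to the same contaminated sample shows how the heavy tails of the t-distribution keep the location estimate from being dragged by outliers.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    clean = rng.normal(loc=0.0, scale=1.0, size=200)
    data = np.concatenate([clean, np.full(10, 25.0)])        # ~5% gross outliers

    mu_gauss, sigma_gauss = stats.norm.fit(data)             # location pulled towards the outliers
    df_t, mu_t, sigma_t = stats.t.fit(data)                  # heavy tails absorb the outliers

    print(f"Gaussian location estimate: {mu_gauss:.2f}")
    print(f"Student-t location estimate: {mu_t:.2f}")
    ```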

  3. Identification of crystalline structures in jet-cooled acetylene large clusters studied by two-dimensional correlation infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Matsumoto, Yoshiteru; Yoshiura, Ryuto; Honma, Kenji

    2017-07-01

    We investigated the crystalline structures of jet-cooled acetylene (C2H2) large clusters by laser spectroscopy and chemometrics. The CH stretching vibrations of the C2H2 large clusters were observed by infrared (IR) cavity ringdown spectroscopy. The IR spectra of C2H2 clusters were measured under the conditions of various concentrations of C2H2/He mixture gas for supersonic jets. Upon increasing the gas concentration from 1% to 10%, we observed a rapid intensity enhancement for a band in the IR spectra. The strong dependence of the intensity on the gas concentration indicates that the band was assigned to CH stretching vibrations of the large clusters. An analysis of the IR spectra by two-dimensional correlation spectroscopy revealed that the IR absorption due to the C2H2 large cluster is decomposed into two CH stretching vibrations. The vibrational frequencies of the two bands are almost equivalent to the IR absorption of the pure- and poly-crystalline orthorhombic structures in the aerosol particles. The characteristic temperature behavior of the IR spectra implies the existence of the other large cluster, which is discussed in terms of the phase transition of a bulk crystal.

  4. Development and validation of a metal mixture bioavailability model (MMBM) to predict chronic toxicity of Ni-Zn-Pb mixtures to Ceriodaphnia dubia.

    PubMed

    Nys, Charlotte; Janssen, Colin R; De Schamphelaere, Karel A C

    2017-01-01

    Recently, several bioavailability-based models have been shown to predict acute metal mixture toxicity with reasonable accuracy. However, the application of such models to chronic mixture toxicity is less well established. Therefore, in the present study we developed a chronic metal mixture bioavailability model (MMBM) by combining the existing chronic daphnid bioavailability models for Ni, Zn, and Pb with the independent action (IA) model, assuming strict non-interaction between the metals for binding at the metal-specific biotic ligand sites. To evaluate the predictive capacity of the MMBM, chronic (7-d) reproductive toxicity of Ni-Zn-Pb mixtures to Ceriodaphnia dubia was investigated in four different natural waters (pH range: 7-8; Ca range: 1-2 mM; dissolved organic carbon range: 5-12 mg/L). In each water, mixture toxicity was investigated at equitoxic metal concentration ratios as well as at environmental (i.e. realistic) metal concentration ratios. Statistical analysis of mixture effects revealed that observed interactive effects depended on the metal concentration ratio investigated when evaluated relative to the concentration addition (CA) model, but not when evaluated relative to the IA model. This indicates that interactive effects observed in an equitoxic experimental design cannot always be simply extrapolated to environmentally realistic exposure situations. Generally, the IA model predicted Ni-Zn-Pb mixture toxicity more accurately than the CA model. Overall, the MMBM predicted Ni-Zn-Pb mixture toxicity (expressed as % reproductive inhibition relative to a control) in 85% of the treatments with less than 20% error. Moreover, the MMBM predicted chronic toxicity of the ternary Ni-Zn-Pb mixture at least as accurately as the toxicity of the individual metal treatments (RMSE(mix) = 16; RMSE(Zn only) = 18; RMSE(Ni only) = 17; RMSE(Pb only) = 23). Based on the present study, we believe MMBMs can be a promising tool to account for the effects of water chemistry on metal mixture toxicity during chronic exposure and could be used in metal risk assessment frameworks. Copyright © 2016 Elsevier Ltd. All rights reserved.
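
    The two reference models used above can be summarised with a small numerical sketch; the effect levels, concentrations, and EC50s below are hypothetical illustrative values, not data from the study. Independent action (IA) combines the single-metal fractional effects multiplicatively, whereas concentration addition (CA) sums toxic units relative to single-metal effect concentrations.

    ```python
    import numpy as np

    effects = {"Ni": 0.20, "Zn": 0.30, "Pb": 0.10}     # single-metal fractional effects (hypothetical)
    conc    = {"Ni": 5.0, "Zn": 60.0, "Pb": 2.0}       # exposure concentrations, ug/L (hypothetical)
    ec50    = {"Ni": 20.0, "Zn": 150.0, "Pb": 15.0}    # single-metal EC50s, ug/L (hypothetical)

    # IA: assumes dissimilar modes of action; "no effect" probabilities combine multiplicatively
    ia_effect = 1.0 - np.prod([1.0 - e for e in effects.values()])

    # CA: assumes a shared mode of action; the mixture acts like a dilution of one toxicant
    toxic_units = sum(conc[m] / ec50[m] for m in conc)  # TU sum of 1 corresponds to a 50% effect level

    print(f"IA-predicted mixture effect: {ia_effect:.2f}")
    print(f"CA toxic-unit sum: {toxic_units:.2f} (>= 1 implies at least an EC50-level effect under CA)")
    ```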

  5. Analysis of Personal and Home Characteristics Associated with the Elemental Composition of PM2.5 in Indoor, Outdoor, and Personal Air in the RIOPA Study.

    PubMed

    Ryan, Patrick H; Brokamp, Cole; Fan, Zhi-Hua; Rao, M B

    2015-12-01

    The complex mixture of chemicals and elements that constitute particulate matter (PM*) varies by season and geographic location because source contributors differ over time and place. The composition of PM having an aerodynamic diameter < 2.5 μm (PM2.5) is hypothesized to be responsible, in part, for its toxicity. Epidemiologic studies have identified specific components and sources of PM2.5 that are associated with adverse health outcomes. The majority of these studies use measures of outdoor concentrations obtained from one or a few central monitoring sites as a surrogate for measures of personal exposure. Personal PM2.5 (and its elemental composition), however, may be different from the PM2.5 measured at stationary outdoor sites. The objectives of this study were (1) to describe the relationships between the concentrations of various elements in indoor, outdoor, and personal PM2.5 samples, (2) to identify groups of individuals with similar exposures to mixtures of elements in personal PM2.5 and to examine personal and home characteristics of these groups, and (3) to evaluate whether concentrations of elements from outdoor PM2.5 samples are appropriate surrogates for personal exposure to PM2.5 and its elements and whether indoor PM2.5 concentrations and information about home characteristics improve the prediction of personal exposure. The objectives of the study were addressed using data collected as part of the Relationships of Indoor, Outdoor, and Personal Air (RIOPA) study. The RIOPA study has previously measured the mass concentrations of PM2.5 and its elemental constituents during 48-hour concurrent indoor, outdoor (directly outside the home), and personal samplings in three urban areas (Los Angeles, California; Houston, Texas; and Elizabeth, New Jersey). The resulting data and information about personal and home characteristics (including air-conditioning use, nearby emission sources, time spent indoors, census-tract geography, air-exchange rates, and other information) for each RIOPA participant were downloaded from the RIOPA study database. We performed three sets of analyses to address the study aims. First, we conducted descriptive analyses to describe the relationships between elemental concentrations in the concurrently gathered indoor, outdoor, and personal air samples. We assessed the correlation between personal exposure and indoor concentrations as well as personal exposure and outdoor concentrations of each element and calculated ratios between them. In addition, we performed principal component analysis (PCA) and calculated principal component scores (PCSs) to examine the heterogeneity of the elemental composition and then tested whether the mixture of elements in indoor, outdoor, and personal PM2.5 was significantly different within each study site and across study sites. Secondly, we performed model-based clustering analysis to group RIOPA participants with similar exposures to mixtures of elements in personal PM2.5. We examined the association between cluster membership and the concentrations of elements in indoor and outdoor PM2.5 samples and personal and home characteristics. Finally, we developed a series of linear regression models and random forest models to examine the association between personal exposure to elements in PM2.5 and (1) outdoor measurements, (2) outdoor and indoor measurements, and (3) outdoor and indoor measurements and home characteristics. 
As we developed each model, the improvement in prediction of personal exposure when including additional information was assessed. Personal exposures to PM2.5 and to most elements were significantly correlated with both indoor and outdoor concentrations, although concentrations in personal samples frequently exceeded those of indoor and outdoor samples. In general, for most PM2.5 elements indoor concentrations were more highly correlated with personal exposure than were outdoor concentrations. PCA showed that the mixture of elements in indoor, outdoor, and personal PM2.5 varied significantly across sample types within each study site and also across study sites within each sample type. Using model-based clustering, we identified seven clusters of RIOPA participants whose personal PM2.5 samples had similar patterns of elemental composition. Using this approach, subsets of RIOPA participants were identified whose personal exposures to PM2.5 (and its elements) were significantly higher than their indoor and outdoor concentrations (and vice versa). The results of linear and random forest regression models were consistent with our correlation analyses and demonstrated that (1) indoor concentrations were more significantly associated with personal exposure than were outdoor concentrations and (2) participant reports of time spent at their home significantly modified many of the associations between indoor and personal concentrations. In linear regression models, the inclusion of indoor concentrations significantly improved the prediction of personal exposures to Ba, Ca, Cl, Cu, K, Sn, Sr, V, and Zn compared with the use of outdoor elemental concentrations alone. Including additional information on personal and home characteristics improved the prediction for only one element, Pb. Our results support the use of outdoor monitoring sites as surrogates of personal exposure for a limited number of individual elements associated with long-range transport and with a few local or indoor sources. Based on our PCA and clustering analyses, we concluded that the overall elemental composition of PM2.5 obtained at outdoor monitoring sites may not accurately represent the elemental composition of personal PM2.5. Although the data used in these analyses compared outdoor PM2.5 composition collected at the home with indoor and personal samples, our results imply that studies examining the complete elemental composition of PM2.5 should be cautious about using data from central outdoor monitoring sites because of the potential for exposure misclassification. The inclusion of personal and home characteristics only marginally improved the prediction of personal exposure for a small number of elements in PM2.5. We concluded that the additional cost and burden of indoor and personal sampling may be justified for studies examining elements because neither outdoor monitoring nor questionnaire data on home and personal characteristics were able to represent adequately the overall elemental composition of personal PM2.5.
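
    The three analysis steps outlined above (principal component scores, model-based clustering of personal profiles, and regression of personal exposure on outdoor and then outdoor-plus-indoor measurements) can be sketched roughly as follows; the simulated arrays, element count, and seven-cluster choice are stand-ins for illustration, not the RIOPA data or analysis code.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.mixture import GaussianMixture
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(2)
    n, p = 200, 15                                    # hypothetical: 200 participants, 15 elements
    outdoor = rng.lognormal(size=(n, p))
    indoor = outdoor * rng.lognormal(sigma=0.3, size=(n, p))
    personal = 0.5 * indoor + 0.3 * outdoor + rng.lognormal(sigma=0.3, size=(n, p))

    # Step 1: principal component scores summarising the elemental mixture in personal samples
    scores = PCA(n_components=3).fit_transform(np.log(personal))

    # Step 2: model-based clustering of participants by personal elemental profile
    labels = GaussianMixture(n_components=7, random_state=0).fit_predict(np.log(personal))

    # Step 3: predict personal exposure to one element from outdoor, then outdoor + indoor measurements
    y = np.log(personal[:, 0])
    for X in (np.log(outdoor[:, [0]]), np.log(np.column_stack([outdoor[:, 0], indoor[:, 0]]))):
        print("linear in-sample R^2:", LinearRegression().fit(X, y).score(X, y))
        print("forest in-sample R^2:", RandomForestRegressor(random_state=0).fit(X, y).score(X, y))
    ```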

  6. Ion-water wires in imidazolium-based ionic liquid/water solutions induce unique trends in density.

    PubMed

    Ghoshdastidar, Debostuti; Senapati, Sanjib

    2016-03-28

    Ionic liquid/water binary mixtures are rapidly gaining popularity as solvents for dissolution of cellulose, nucleobases, and other poorly water-soluble biomolecules. Hence, several studies have focused on measuring the thermophysical properties of these versatile mixtures. Among these, 1-ethyl-3-methylimidazolium ([emim]) cation-based ILs containing different anions exhibit unique density behaviours upon addition of water. While [emim][acetate]/water binary mixtures display an unusual rise in density with the addition of low-to-moderate amounts of water, those containing the [trifluoroacetate] ([Tfa]) anion display a sluggish decrease in density. The density of [emim][tetrafluoroborate] ([emim][BF4])/water mixtures, on the other hand, declines rapidly, in close accordance with the experimental reports. Here, we unravel the structural basis underlying this unique density behaviour of [emim]-based IL/water mixtures using all-atom molecular dynamics (MD) simulations. The results revealed that the distinct nature of the anion-water hydrogen-bonded networks in the three systems was key to modulating the observed density behaviour. Vast expanses of uninterrupted anion-water-anion H-bonded stretches, denoted here as anion-water wires, induced significant structuring in [emim][Ac]/water mixtures that resulted in the density rise. Conversely, the presence of intermittent large water clusters disintegrated the anion-water wires in [emim][Tfa]/water and [emim][BF4]/water mixtures, causing a monotonic density decrease. The differential nanostructuring affected the dynamics of the solutions proportionately, with the H-bond making and breaking dynamics found to be greatly retarded in [emim][Ac]/water mixtures, while exhibiting faster relaxation in the other two binary solutions.

  7. Investigating Mixture Interactions of Astringent Stimuli Using the Isobole Approach

    PubMed Central

    Fleming, Erin E.; Ziegler, Gregory R.

    2016-01-01

    Astringents (alum, malic acid, tannic acid) representing 3 broad classes (multivalent salts, organic acids, and polyphenols) were characterized alone, and as 2- and 3-component mixtures, using isoboles. In experiment 1, participants rated 7 attributes ("astringency," the sub-qualities "drying," "roughing," and "puckering," and the side tastes "bitterness," "sourness," and "sweetness") using direct scaling. Quality-specific power functions were calculated for each stimulus. In experiment 2, the same participants characterized 2- and 3-component mixtures. Multiple factor analysis (MFA) and hierarchical clustering on attribute ratings across stimuli indicate "astringency" is highly related to "bitterness" as well as "puckering," and the subqualities "drying" and "roughing" are somewhat redundant. Moreover, power functions were used to calculate indices of interaction (I) for each attribute/mixture combination. For "astringency," there was evidence of antagonism, regardless of the type of mixture. Conversely, for subqualities, the pattern of interaction depended on the mixture type. Alum/tannic acid and tannic acid/malic acid mixtures showed evidence of synergy for "drying" and "roughing"; alum/malic acid mixtures showed evidence of antagonism for "drying," "roughing," and "puckering." Collectively, these data clarify some semantic ambiguity regarding astringency and its subqualities, as well as the nature of interactions among different types of astringents. The present data are not inconsistent with the idea that astringency arises from multiple mechanisms, although it remains to be determined whether the synergy observed here might reflect simultaneous activation of these multiple mechanisms. PMID:27252355

  8. Rasch Mixture Models for DIF Detection

    PubMed Central

    Strobl, Carolin; Zeileis, Achim

    2014-01-01

    Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch mixture models is sensitive to the specification of the ability distribution even when the conditional maximum likelihood approach is used. It is demonstrated in a simulation study how differences in ability can influence the latent classes of a Rasch mixture model. If the aim is only DIF detection, it is not of interest to uncover such ability differences as one is only interested in a latent group structure regarding the item difficulties. To avoid any confounding effect of ability differences (or impact), a new score distribution for the Rasch mixture model is introduced here. It ensures the estimation of the Rasch mixture model to be independent of the ability distribution and thus restricts the mixture to be sensitive to latent structure in the item difficulties only. Its usefulness is demonstrated in a simulation study, and its application is illustrated in a study of verbal aggression. PMID:29795819

  9. Investigating Stage-Sequential Growth Mixture Models with Multiphase Longitudinal Data

    ERIC Educational Resources Information Center

    Kim, Su-Young; Kim, Jee-Seon

    2012-01-01

    This article investigates three types of stage-sequential growth mixture models in the structural equation modeling framework for the analysis of multiple-phase longitudinal data. These models can be important tools for situations in which a single-phase growth mixture model produces distorted results and can allow researchers to better understand…

  10. Experimental study of cluster formation in binary mixture of H2O and H2SO4 vapors in the presence of an ionizing radiation source

    NASA Technical Reports Server (NTRS)

    Singh, J. J.; Smith, A. C.; Yue, G. K.

    1980-01-01

    Molecular clusters formed in pure nitrogen containing H2O and H2SO4 vapors and exposed to a 3 mCi Ni63 beta source were studied in the mass range 50 to 780 amu using a quadrupole mass spectrometer. Measurements were made under several combinations of relative humidity and relative acidity ranging from 0.7 to 7.5 percent and 0.00047 to 0.06333 percent, respectively. The number of H2SO4 molecules in the clusters observed ranged from 1 to 7 whereas the number of H2O molecules ranged from 1 to 16. The experimental cluster spectra differ considerably from those calculated using the classical nucleation theory. First order calculations using modified surface tension values and including the effects of multipole moments of the nucleating molecules indicate that these effects may be enough to explain the difference between the measured and the calculated spectra.

  11. Latent trajectory studies: the basics, how to interpret the results, and what to report.

    PubMed

    van de Schoot, Rens

    2015-01-01

    In statistics, tools have been developed to estimate individual change over time. Also, the existence of latent trajectories, where individuals are captured by trajectories that are unobserved (latent), can be evaluated (Muthén & Muthén, 2000). The methods used to evaluate such trajectories are called Latent Growth Mixture Modeling (LGMM) and Latent Class Growth Analysis (LCGA). The difference between the two models is whether variance within latent classes is allowed for (Jung & Wickrama, 2008). The default approach most often used when estimating such models begins with estimating a single-cluster model, where only a single underlying group is presumed. Next, several additional models are estimated with an increasing number of clusters (latent groups or classes). For each of these models, the software is allowed to estimate all parameters without any restrictions. A final model is chosen based on model comparison tools, for example, the BIC, the bootstrapped chi-square test, or the Lo-Mendell-Rubin test. To ease the step-by-step use of LGMM/LCGA, guidelines are presented in this symposium (Van de Schoot, 2015) that can be used by researchers applying the methods to longitudinal data, for example, on the development of posttraumatic stress disorder (PTSD) after trauma (Depaoli, van de Schoot, van Loey, & Sijbrandij, 2015; Galatzer-Levy, 2015). The guidelines include how to use the software Mplus (Muthén & Muthén, 1998-2012) to run the set of models needed to answer the research question: how many latent classes exist in the data? The next step described in the guidelines is how to add covariates/predictors to predict class membership using the three-step approach (Vermunt, 2010). Lastly, the guidelines describe which essentials to report in the paper. When applying LGMM/LCGA models for the first time, the presented guidelines can be used to decide which models to run and what to report.
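
    The enumeration step described above (fit models with an increasing number of latent classes and compare them with information criteria) can be illustrated with a cross-sectional analogue in scikit-learn; this is only a sketch of the BIC-based comparison, not the longitudinal LGMM/LCGA workflow in Mplus.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)
    data = np.concatenate([rng.normal(0, 1, (150, 2)), rng.normal(4, 1, (100, 2))])  # two simulated groups

    bics = {}
    for k in range(1, 7):
        gm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(data)
        bics[k] = gm.bic(data)

    best_k = min(bics, key=bics.get)  # lower BIC balances fit against model complexity
    print(bics, "-> selected number of classes:", best_k)
    ```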

  12. Accuracy of latent-variable estimation in Bayesian semi-supervised learning.

    PubMed

    Yamazaki, Keisuke

    2015-09-01

    Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are regarded as estimations of latent variables based on the observable ones. The estimation of latent variables in semi-supervised learning, where some labels are observed, will be more precise than in unsupervised learning, and one concern is to clarify the effect of the labeled data. However, there has not been sufficient theoretical analysis of the accuracy of the estimation of latent variables. In a previous study, a distribution-based error function was formulated, and its asymptotic form was calculated for unsupervised learning with generative models. It has been shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper reveals the asymptotic forms of the error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data, performs better when the model is well specified. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Local Solutions in the Estimation of Growth Mixture Models

    ERIC Educational Resources Information Center

    Hipp, John R.; Bauer, Daniel J.

    2006-01-01

    Finite mixture models are well known to have poorly behaved likelihood functions featuring singularities and multiple optima. Growth mixture models may suffer from fewer of these problems, potentially benefiting from the structure imposed on the estimated class means and covariances by the specified growth model. As demonstrated here, however,…

  14. Passive colloids work together to become Active

    NASA Astrophysics Data System (ADS)

    Kandula, Hima Nagamanasa; Wang, Wei; Zhang, Jie; Wu, Huanxin; Han, Ming; Luijten, Erik; Granick, Steve

    In recent years there has been a growing body of research on designing self-propelled colloids to gain insights into non-equilibrium systems, including living matter. While most active colloids developed hitherto entail prefabrication of Janus colloids and possess a single fixed active site, we present a simple system in which active colloids form in situ naturally with multiple active sites and are both reversible and reconfigurable. A binary mixture of Brownian colloids with opposite polarizations when subjected to an AC electric field spontaneously assembles into clusters that are propelled by asymmetric induced-charge electro-osmosis. We find that tuning the relative sizes of the two species allows control over the number of active sites. More interestingly, the patches are dynamic, enabling reconfiguration of the active cluster. Consequently, the clusters are active not only in their motion but also in their structure.

  15. Presence of Li clusters in molten LiCl-Li

    DOE PAGES

    Merwin, Augustus; Phillips, William C.; Williamson, Mark A.; ...

    2016-05-05

    Molten mixtures of lithium chloride and metallic lithium are of significant interest in various metal oxide reduction processes. These solutions have been reported to exhibit seemingly anomalous physical characteristics that lack a comprehensive explanation. In the current work, the physical chemistry of molten solutions of lithium chloride and metallic lithium, with and without lithium oxide, was investigated using in situ Raman spectroscopy. The Raman spectra obtained from these solutions were in agreement with the previously reported spectrum of the lithium cluster Li8. Furthermore, this observation is indicative of a nanofluid-type colloidal suspension of Li8 in a molten salt matrix. It is suggested that the formation and suspension of lithium clusters in lithium chloride is the cause of various phenomena exhibited by these solutions that were previously unexplainable.

  16. Delineating high-density areas in spatial Poisson fields from strip-transect sampling using indicator geostatistics: application to unexploded ordnance removal.

    PubMed

    Saito, Hirotaka; McKenna, Sean A

    2007-07-01

    An approach for delineating high anomaly density areas within a mixture of two or more spatial Poisson fields based on limited sample data collected along strip transects was developed. All sampled anomalies were transformed to anomaly count data and indicator kriging was used to estimate the probability of exceeding a threshold value derived from the cdf of the background homogeneous Poisson field. The threshold value was determined so that the delineation of high-density areas was optimized. Additionally, a low-pass filter was applied to the transect data to enhance such segmentation. Example calculations were completed using a controlled military model site, in which accurate delineation of clusters of unexploded ordnance (UXO) was required for site cleanup.
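
    A minimal sketch of the thresholding idea, with hypothetical numbers: the count threshold is taken from the cdf of the assumed background Poisson field, and transect counts are turned into the indicator values that feed the kriging step.

    ```python
    import numpy as np
    from scipy.stats import poisson

    background_rate = 2.0                                 # hypothetical mean anomaly count per transect segment
    threshold = poisson.ppf(0.95, mu=background_rate)     # counts above this are unlikely to be background

    counts = np.array([0, 1, 3, 7, 2, 9, 1])              # hypothetical anomaly counts along transect segments
    indicators = (counts > threshold).astype(int)         # 1 = candidate high-density segment for indicator kriging
    print(threshold, indicators)
    ```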

  17. Particle connectedness and cluster formation in sequential depositions of particles: integral-equation theory.

    PubMed

    Danwanichakul, Panu; Glandt, Eduardo D

    2004-11-15

    We applied the integral-equation theory to the connectedness problem. The method originally applied to the study of continuum percolation in various equilibrium systems was modified for our sequential quenching model, a particular limit of an irreversible adsorption. The development of the theory based on the (quenched-annealed) binary-mixture approximation includes the Ornstein-Zernike equation, the Percus-Yevick closure, and an additional term involving the three-body connectedness function. This function is simplified by introducing a Kirkwood-like superposition approximation. We studied the three-dimensional (3D) system of randomly placed spheres and 2D systems of square-well particles, both with a narrow and with a wide well. The results from our integral-equation theory are in good accordance with simulation results within a certain range of densities.

  18. Particle connectedness and cluster formation in sequential depositions of particles: Integral-equation theory

    NASA Astrophysics Data System (ADS)

    Danwanichakul, Panu; Glandt, Eduardo D.

    2004-11-01

    We applied the integral-equation theory to the connectedness problem. The method originally applied to the study of continuum percolation in various equilibrium systems was modified for our sequential quenching model, a particular limit of an irreversible adsorption. The development of the theory based on the (quenched-annealed) binary-mixture approximation includes the Ornstein-Zernike equation, the Percus-Yevick closure, and an additional term involving the three-body connectedness function. This function is simplified by introducing a Kirkwood-like superposition approximation. We studied the three-dimensional (3D) system of randomly placed spheres and 2D systems of square-well particles, both with a narrow and with a wide well. The results from our integral-equation theory are in good accordance with simulation results within a certain range of densities.

  19. Interpolation of orientation distribution functions in diffusion weighted imaging using multi-tensor model.

    PubMed

    Afzali, Maryam; Fatemizadeh, Emad; Soltanian-Zadeh, Hamid

    2015-09-30

    Diffusion weighted imaging (DWI) is a non-invasive method for investigating brain white matter structure and can be used to evaluate fiber bundles. However, due to practical constraints, DWI data acquired in clinics are of low resolution. This paper proposes a method for interpolation of orientation distribution functions (ODFs). To this end, fuzzy clustering is applied to segment ODFs based on the principal diffusion directions (PDDs). Next, each cluster is modeled by a tensor so that an ODF is represented by a mixture of tensors. For interpolation, each tensor is rotated separately. The method is applied to synthetic and real DWI data of control and epileptic subjects. Both experiments illustrate the capability of the method to properly increase the spatial resolution of the data in the ODF field. The real dataset shows that the method is capable of reliably identifying differences between temporal lobe epilepsy (TLE) patients and normal subjects. The method is compared to existing methods; comparison studies show that the proposed method generates smaller angular errors than the existing methods. Another advantage of the method is that it does not require an iterative algorithm to find the tensors. The proposed method is appropriate for increasing resolution in the ODF field and can be applied to clinical data to improve evaluation of white matter fibers in the brain. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Rutile-Deposited Pt–Pd clusters: A Hypothesis Regarding the Stability at 50/50 Ratio

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ha, Mai-Anh; Dadras, Mostafa J.; Alexandrova, Anastassia N.

    2014-10-03

    Mixed Pt–Pd clusters deposited on oxides have been of great interest in catalysis. Clusters containing Pt and Pd in roughly equal proportions were found to be unusually stable against sintering, one of the major mechanisms of catalyst deactivation. After aging of such catalysts, the 50/50 Pt–Pd and Pd–O clusters appeared to be the two most prevalent phases. The reason for the enhanced stability of these equally proportioned clusters has remained unclear. In the following, sintering of mixed Pt–Pd clusters on TiO2(110) was simulated for various initial atomic concentrations of Pt and Pd and at a range of catalytically relevant temperatures. It is confirmed that equally mixed clusters have the highest relative survival rate. Surprisingly, subnanoclusters containing Pt and Pd in all proportions have very similar geometries and chemical bonding, revealing no apparent explanation for favoring the 1:1 Pt/Pd ratio. However, it was discovered that at high temperatures the 50/50 clusters have considerably more thermally accessible isomers than clusters containing Pt and Pd in other proportions. Hence, one of the reasons for their stability is entropic stabilization. Electrostatics also plays a key role: a subtle charge redistribution and a shift of electron density to the slightly more electronegative Pt result in partially charged atoms that are further stabilized by intracluster Coulomb attraction; this effect is greatest for 1:1 mixtures.

  1. A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities

    NASA Astrophysics Data System (ADS)

    Muscoloni, Alessandro; Vittorio Cannistraci, Carlo

    2018-05-01

    The investigation of the hidden metric space behind complex network topologies is a fervid topic in current network science, and the hyperbolic space is one of the most studied because it seems associated with the structural organization of many real complex systems. The popularity-similarity-optimization (PSO) model simulates how random geometric graphs grow in the hyperbolic space, generating realistic networks with clustering, small-worldness, scale-freeness and rich-clubness. However, it fails to reproduce an important feature of real complex networks, which is the community organization. The geometrical-preferential-attachment (GPA) model was recently developed to confer on the PSO a soft community structure as well, obtained by forcing different angular regions of the hyperbolic disk to have a variable level of attractiveness. However, the number and size of the communities cannot be explicitly controlled in the GPA, which is a clear limitation for real applications. Here, we introduce the nonuniform PSO (nPSO) model. Differently from GPA, the nPSO generates synthetic networks in the hyperbolic space where heterogeneous angular node attractiveness is forced by sampling the angular coordinates from a tailored nonuniform probability distribution (for instance a mixture of Gaussians). The nPSO differs from GPA in three other aspects: it allows one to explicitly fix the number and size of communities; it allows one to tune their mixing property by means of the network temperature; and it is efficient at generating networks with high clustering. Several tests on the detectability of the community structure in nPSO synthetic networks and wide investigations of their structural properties confirm that the nPSO is a valid and efficient model for generating realistic complex networks with communities.
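
    The nPSO's key ingredient, sampling angular coordinates from a nonuniform mixture of Gaussians so that communities appear as dense angular regions, can be sketched as follows; the node count, equal community sizes, and the spread parameter are illustrative assumptions rather than the authors' implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n_nodes, n_communities = 1000, 4
    centers = np.linspace(0.0, 2.0 * np.pi, n_communities, endpoint=False)  # equally spaced community centres
    sigma = 2.0 * np.pi / (6.0 * n_communities)                             # assumed angular spread; tunable

    component = rng.integers(0, n_communities, size=n_nodes)                # equal-sized communities in this sketch
    theta = rng.normal(loc=centers[component], scale=sigma) % (2.0 * np.pi) # angular coordinates in [0, 2*pi)
    ```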

  2. Reducing the Matrix Effect in Organic Cluster SIMS Using Dynamic Reactive Ionization

    NASA Astrophysics Data System (ADS)

    Tian, Hua; Wucher, Andreas; Winograd, Nicholas

    2016-12-01

    Dynamic reactive ionization (DRI) utilizes a reactive molecule, HCl, which is doped into an Ar cluster projectile and activated to produce protons at the bombardment site on the cold sample surface in the presence of water. The methodology has been shown to enhance the ionization of protonated molecular ions and to reduce salt suppression in complex biomatrices. In this study, we further examine the possibility of obtaining improved quantitation with DRI during depth profiling of thin films. Using a trehalose film as a model system, we are able to define optimal DRI conditions for depth profiling. Next, the strategy is applied to a multilayer system consisting of the polymer antioxidants Irganox 1098 and 1010. These binary mixtures have demonstrated large matrix effects, making quantitative SIMS measurements infeasible. Systematic comparisons of depth profiling of this multilayer film using the GCIB directly and under DRI conditions show that the latter enhances protonated ions for both components by 4- to 15-fold, resulting in uniform depth profiling in positive ion mode and almost no matrix effect in negative ion mode. The methodology offers a new strategy to tackle the matrix effect and should lead to improved quantitative measurement using SIMS.

  3. Low-cost multispectral imaging for remote sensing of lettuce health

    NASA Astrophysics Data System (ADS)

    Ren, David D. W.; Tripathi, Siddhant; Li, Larry K. B.

    2017-01-01

    In agricultural remote sensing, unmanned aerial vehicle (UAV) platforms offer many advantages over conventional satellite and full-scale airborne platforms. One of the most important advantages is their ability to capture high spatial resolution images (1-10 cm) on-demand and at different viewing angles. However, UAV platforms typically rely on the use of multiple cameras, which can be costly and difficult to operate. We present the development of a simple low-cost imaging system for remote sensing of crop health and demonstrate it on lettuce (Lactuca sativa) grown in Hong Kong. To identify the optimal vegetation index, we recorded images of both healthy and unhealthy lettuce, and used them as input in an expectation maximization cluster analysis with a Gaussian mixture model. Results from unsupervised and supervised clustering show that, among four widely used vegetation indices, the blue wide-dynamic range vegetation index is the most accurate. This study shows that it is readily possible to design and build a remote sensing system capable of determining the health status of lettuce at a reasonably low cost (
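
    A minimal sketch of the clustering step described above: expectation-maximisation with a two-component Gaussian mixture applied to per-pixel vegetation-index values, separating healthier from stressed pixels without labels. The index values below are simulated stand-ins, not the study's imagery.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(5)
    healthy = rng.normal(0.6, 0.05, 500)        # hypothetical index values for healthy lettuce pixels
    unhealthy = rng.normal(0.3, 0.08, 300)      # hypothetical index values for stressed pixels
    index_values = np.concatenate([healthy, unhealthy]).reshape(-1, 1)

    gm = GaussianMixture(n_components=2, random_state=0).fit(index_values)
    labels = gm.predict(index_values)           # unsupervised split into two health classes
    print(gm.means_.ravel())                    # component means approximate the two health states
    ```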

  4. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to those obtained using an interaction term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described, and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and to increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects, whereas regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  5. Nonlinear Structured Growth Mixture Models in M"plus" and OpenMx

    ERIC Educational Resources Information Center

    Grimm, Kevin J.; Ram, Nilam; Estabrook, Ryne

    2010-01-01

    Growth mixture models (GMMs; B. O. Muthen & Muthen, 2000; B. O. Muthen & Shedden, 1999) are a combination of latent curve models (LCMs) and finite mixture models to examine the existence of latent classes that follow distinct developmental patterns. GMMs are often fit with linear, latent basis, multiphase, or polynomial change models…

  6. The Potential of Growth Mixture Modelling

    ERIC Educational Resources Information Center

    Muthen, Bengt

    2006-01-01

    The authors of the paper on growth mixture modelling (GMM) give a description of GMM and related techniques as applied to antisocial behaviour. They bring up the important issue of choice of model within the general framework of mixture modelling, especially the choice between latent class growth analysis (LCGA) techniques developed by Nagin and…

  7. Detection of Single Standing Dead Trees from Aerial Color Infrared Imagery by Segmentation with Shape and Intensity Priors

    NASA Astrophysics Data System (ADS)

    Polewski, P.; Yao, W.; Heurich, M.; Krzystek, P.; Stilla, U.

    2015-03-01

    Standing dead trees, known as snags, are an essential factor in maintaining biodiversity in forest ecosystems. Combined with their role as carbon sinks, this makes for a compelling reason to study their spatial distribution. This paper presents an integrated method to detect and delineate individual dead tree crowns from color infrared aerial imagery. Our approach consists of two steps which incorporate statistical information about prior distributions of both the image intensities and the shapes of the target objects. In the first step, we perform a Gaussian Mixture Model clustering in the pixel color space with priors on the cluster means, obtaining up to 3 components corresponding to dead trees, living trees, and shadows. We then refine the dead tree regions using a level set segmentation method enriched with a generative model of the dead trees' shape distribution as well as a discriminative model of their pixel intensity distribution. The iterative application of the statistical shape template yields the set of delineated dead crowns. The prior information enforces the consistency of the template's shape variation with the shape manifold defined by manually labeled training examples, which makes it possible to separate crowns located in close proximity and prevents the formation of large crown clusters. Also, the statistical information built into the segmentation gives rise to an implicit detection scheme, because the shape template evolves towards an empty contour if not enough evidence for the object is present in the image. We test our method on 3 sample plots from the Bavarian Forest National Park with reference data obtained by manually marking individual dead tree polygons in the images. Our results are scenario-dependent and range from a correctness/completeness of 0.71/0.81 up to 0.77/1, with an average center-of-gravity displacement of 3-5 pixels between the detected and reference polygons.
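
    The first step described above, a three-component Gaussian mixture in pixel colour space with prior knowledge about the cluster means, can be approximated in scikit-learn by initialising the component means from assumed typical colours; this is a simplification of the paper's formulation, and the pixel values and prior means below are hypothetical.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(6)
    pixels = rng.uniform(0.0, 1.0, size=(5000, 3))       # hypothetical CIR pixel values scaled to [0, 1]

    prior_means = np.array([
        [0.8, 0.7, 0.6],   # assumed typical dead-tree colour
        [0.9, 0.3, 0.2],   # assumed typical living-tree colour (strong near-infrared response)
        [0.1, 0.1, 0.1],   # assumed typical shadow colour
    ])

    gm = GaussianMixture(n_components=3, means_init=prior_means, random_state=0).fit(pixels)
    segment_labels = gm.predict(pixels)                  # 0/1/2 per pixel: dead tree, living tree, shadow
    ```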

  8. Equivalence of truncated count mixture distributions and mixtures of truncated count distributions.

    PubMed

    Böhning, Dankmar; Kuhnert, Ronny

    2006-12-01

    This article is about modeling count data with zero truncation. A parametric count density family is considered. The truncated mixture of densities from this family is different from the mixture of truncated densities from the same family. Whereas the former model is more natural to formulate and to interpret, the latter model is theoretically easier to treat. It is shown that for any mixing distribution leading to a truncated mixture, a (usually different) mixing distribution can be found so that the associated mixture of truncated densities equals the truncated mixture, and vice versa. This implies that the likelihood surfaces for both situations agree, and in this sense both models are equivalent. Zero-truncated count data models are used frequently in the capture-recapture setting to estimate population size, and it can be shown that the two Horvitz-Thompson estimators, associated with the two models, agree. In particular, it is possible to achieve strong results for mixtures of truncated Poisson densities, including reliable, global construction of the unique NPMLE (nonparametric maximum likelihood estimator) of the mixing distribution, implying a unique estimator for the population size. The benefit of these results lies in the fact that it is valid to work with the mixture of truncated count densities, which is less appealing for the practitioner but theoretically easier. Mixtures of truncated count densities form a convex linear model, for which a developed theory exists, including global maximum likelihood theory as well as algorithmic approaches. Once the problem has been solved in this class, it might readily be transformed back to the original problem by means of an explicitly given mapping. Applications of these ideas are given, particularly in the case of the truncated Poisson family.
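
    The capture-recapture use mentioned above can be illustrated numerically (hypothetical counts, a sketch rather than the article's derivation): fit a zero-truncated Poisson to the observed positive counts and plug the implied zero probability into the Horvitz-Thompson estimator of population size.

    ```python
    import numpy as np
    from scipy.optimize import brentq

    counts = np.array([1, 1, 2, 1, 3, 2, 1, 1, 4, 2, 1, 1])  # hypothetical observed (zero-truncated) counts
    n, mean_obs = len(counts), counts.mean()

    # ML estimate of lambda for a zero-truncated Poisson:
    # mean of the truncated distribution, lambda / (1 - exp(-lambda)), equals the observed mean
    lam = brentq(lambda l: l / (1.0 - np.exp(-l)) - mean_obs, 1e-6, 50.0)

    p0 = np.exp(-lam)        # estimated probability of a zero count (an unseen unit)
    N_hat = n / (1.0 - p0)   # Horvitz-Thompson estimate of the total population size
    print(lam, N_hat)
    ```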

  9. Development of PBPK Models for Gasoline in Adult and ...

    EPA Pesticide Factsheets

    Concern for potential developmental effects of exposure to gasoline-ethanol blends has grown along with their increased use in the US fuel supply. Physiologically-based pharmacokinetic (PBPK) models for these complex mixtures were developed to address dosimetric issues related to selection of exposure concentrations for in vivo toxicity studies. Sub-models for individual hydrocarbon (HC) constituents were first developed and calibrated with published literature or QSAR-derived data where available. Successfully calibrated sub-models for individual HCs were combined, assuming competitive metabolic inhibition in the liver, and a priori simulations of mixture interactions were performed. Blood HC concentration data were collected from exposed adult non-pregnant (NP) rats (9K ppm total HC vapor, 6h/day) to evaluate performance of the NP mixture model. This model was then converted to a pregnant (PG) rat mixture model using gestational growth equations that enabled a priori estimation of life-stage specific kinetic differences. To address the impact of changing relevant physiological parameters from NP to PG, the PG mixture model was first calibrated against the NP data. The PG mixture model was then evaluated against data from PG rats that were subsequently exposed (9K ppm/6.33h gestation days (GD) 9-20). Overall, the mixture models adequately simulated concentrations of HCs in blood from single (NP) or repeated (PG) exposures (within ~2-3 fold of measured values of

  10. Mixture-mixture design for the fingerprint optimization of chromatographic mobile phases and extraction solutions for Camellia sinensis.

    PubMed

    Borges, Cleber N; Bruns, Roy E; Almeida, Aline A; Scarminio, Ieda S

    2007-07-09

    A composite simplex centroid-simplex centroid mixture design is proposed for simultaneously optimizing two mixture systems. The complementary model is formed by multiplying special cubic models for the two systems. The design was applied to the simultaneous optimization of both mobile phase chromatographic mixtures and extraction mixtures for the Camellia sinensis Chinese tea plant. The extraction mixtures investigated contained varying proportions of ethyl acetate, ethanol and dichloromethane while the mobile phase was made up of varying proportions of methanol, acetonitrile and a methanol-acetonitrile-water (MAW) 15%:15%:70% mixture. The experiments were block randomized corresponding to a split-plot error structure to minimize laboratory work and reduce environmental impact. Coefficients of an initial saturated model were obtained using Scheffe-type equations. A cumulative probability graph was used to determine an approximate reduced model. The split-plot error structure was then introduced into the reduced model by applying generalized least square equations with variance components calculated using the restricted maximum likelihood approach. A model was developed to calculate the number of peaks observed with the chromatographic detector at 210 nm. A 20-term model contained essentially all the statistical information of the initial model and had a root mean square calibration error of 1.38. The model was used to predict the number of peaks eluted in chromatograms obtained from extraction solutions that correspond to axial points of the simplex centroid design. The significant model coefficients are interpreted in terms of interacting linear, quadratic and cubic effects of the mobile phase and extraction solution components.

  11. Reduced detonation kinetics and detonation structure in one- and multi-fuel gaseous mixtures

    NASA Astrophysics Data System (ADS)

    Fomin, P. A.; Trotsyuk, A. V.; Vasil'ev, A. A.

    2017-10-01

    Two-step approximate models of the chemical kinetics of detonation combustion of (i) a one-fuel gaseous mixture (CH4/air) and (ii) multi-fuel gaseous mixtures (CH4/H2/air and CH4/CO/air) are developed; the models for the multi-fuel mixtures are proposed for the first time. Owing to their simplicity and high accuracy, the models can be used in multi-dimensional numerical calculations of detonation waves in the corresponding gaseous mixtures. The models are consistent with the second law of thermodynamics and Le Chatelier's principle, and their constants have a clear physical meaning. The advantages of the kinetic model for detonation combustion of methane have been demonstrated via numerical calculations of the two-dimensional structure of the detonation wave in stoichiometric and fuel-rich methane-air mixtures and a stoichiometric methane-oxygen mixture. The dominant detonation cell size determined in the calculations is in good agreement with all known experimental data.

  12. Fitting a Mixture Item Response Theory Model to Personality Questionnaire Data: Characterizing Latent Classes and Investigating Possibilities for Improving Prediction

    ERIC Educational Resources Information Center

    Maij-de Meij, Annette M.; Kelderman, Henk; van der Flier, Henk

    2008-01-01

    Mixture item response theory (IRT) models aid the interpretation of response behavior on personality tests and may provide possibilities for improving prediction. Heterogeneity in the population is modeled by identifying homogeneous subgroups that conform to different measurement models. In this study, mixture IRT models were applied to the…

  13. Classifying GABAergic interneurons with semi-supervised projected model-based clustering.

    PubMed

    Mihaljević, Bojan; Benavides-Piccione, Ruth; Guerra, Luis; DeFelipe, Javier; Larrañaga, Pedro; Bielza, Concha

    2015-09-01

    A recently introduced pragmatic scheme promises to be a useful catalog of interneuron names. We sought to automatically classify digitally reconstructed interneuronal morphologies according to this scheme. Simultaneously, we sought to discover possible subtypes of these types that might emerge during automatic classification (clustering). We also investigated which morphometric properties were most relevant for this classification. We used a set of 118 digitally reconstructed interneuronal morphologies classified into the common basket (CB), horse-tail (HT), large basket (LB), and Martinotti (MA) interneuron types by 42 of the world's leading neuroscientists, quantified by five simple morphometric properties of the axon and four of the dendrites. We labeled each neuron with the type most commonly assigned to it by the experts. We then removed this class information for each type separately, and applied semi-supervised clustering to those cells (keeping the others' cluster membership fixed), to assess separation from other types and look for the formation of new groups (subtypes). We performed this same experiment unlabeling the cells of two types at a time, and of half the cells of a single type at a time. The clustering model is a finite mixture of Gaussians which we adapted for the estimation of local (per-cluster) feature relevance. We performed the described experiments on three different subsets of the data, formed according to how many experts agreed on type membership: at least 18 experts (the full data set), at least 21 (73 neurons), and at least 26 (47 neurons). Interneurons with more reliable type labels were classified more accurately. We classified HT cells with 100% accuracy, MA cells with 73% accuracy, and CB and LB cells with 56% and 58% accuracy, respectively. We identified three subtypes of the MA type, one subtype each of the CB and LB types, and no subtypes of HT (it was a single, homogeneous type). We obtained maximum (adapted) Silhouette width and ARI values of 1, 0.83, 0.79, and 0.42 when unlabeling the HT, CB, LB, and MA types, respectively, confirming the quality of the formed cluster solutions. The subtypes identified when unlabeling a single type also emerged when unlabeling two types at a time, confirming their validity. Axonal morphometric properties were more relevant than dendritic ones, with the axonal polar histogram length in the [π, 2π) angle interval being particularly useful. The applied semi-supervised clustering method can accurately discriminate among CB, HT, LB, and MA interneuron types while discovering potential subtypes, and is therefore useful for neuronal classification. The discovery of potential subtypes suggests that some of these types are more heterogeneous than previously thought. Finally, axonal variables seem to be more relevant than dendritic ones for distinguishing among the CB, HT, LB, and MA interneuron types. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Investigation on Constrained Matrix Factorization for Hyperspectral Image Analysis

    DTIC Science & Technology

    2005-07-25

    Keywords: matrix factorization; nonnegative matrix factorization; linear mixture model; unsupervised linear unmixing; hyperspectral imagery. ... The spatial resolution permits different materials to be present in the area covered by a single pixel. Under the linear mixture model, the pixel reflectance r is considered as the linear mixture of the endmember spectra m1, m2, ..., mP, i.e. r = Mα + n (1), where n is included to account for noise.
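
    A minimal sketch of linear unmixing under the stated model r = Mα + n using nonnegative matrix factorization; the simulated spectra, endmember count, and the scikit-learn NMF call are illustrative assumptions, not the report's constrained algorithm.

    ```python
    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.default_rng(7)
    M_true = rng.uniform(0.0, 1.0, size=(50, 3))                    # hypothetical endmember spectra (50 bands, 3 materials)
    A_true = rng.dirichlet(np.ones(3), size=400)                    # hypothetical abundances (sum to 1 per pixel)
    R = A_true @ M_true.T + 0.01 * rng.standard_normal((400, 50))   # noisy mixed pixel spectra

    nmf = NMF(n_components=3, init="nndsvda", max_iter=500)
    A_est = nmf.fit_transform(np.clip(R, 0.0, None))                # estimated abundances (unconstrained scale)
    M_est = nmf.components_                                         # estimated endmember spectra
    ```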

  15. Microstructure and hydrogen bonding in water-acetonitrile mixtures.

    PubMed

    Mountain, Raymond D

    2010-12-16

    The connection of hydrogen bonding between water and acetonitrile in determining the microheterogeneity of the liquid mixture is examined using NPT molecular dynamics simulations. Mixtures for six, rigid, three-site models for acetonitrile and one water model (SPC/E) were simulated to determine the amount of water-acetonitrile hydrogen bonding. Only one of the six acetonitrile models (TraPPE-UA) was able to reproduce both the liquid density and the experimental estimates of hydrogen bonding derived from Raman scattering of the CN stretch band or from NMR quadrupole relaxation measurements. A simple modification of the acetonitrile model parameters for the models that provided poor estimates produced hydrogen-bonding results consistent with experiments for two of the models. Of these, only one of the modified models also accurately determined the density of the mixtures. The self-diffusion coefficient of liquid acetonitrile provided a final winnowing of the modified model and the successful, unmodified model. The unmodified model is provisionally recommended for simulations of water-acetonitrile mixtures.

  16. Hierarchical modeling of cluster size in wildlife surveys

    USGS Publications Warehouse

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).

  17. General mixture item response models with different item response structures: Exposition with an application to Likert scales.

    PubMed

    Tijmstra, Jesper; Bolsinova, Maria; Jeon, Minjeong

    2018-01-10

    This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample. If researchers are able to provide competing measurement models, this mixture IRT framework may help them deal with some violations of measurement invariance. To illustrate this approach, we consider a two-class mixture model, where a person's responses to Likert-scale items containing a neutral middle category are either modeled using a generalized partial credit model, or through an IRTree model. In the first model, the middle category ("neither agree nor disagree") is taken to be qualitatively similar to the other categories, and is taken to provide information about the person's endorsement. In the second model, the middle category is taken to be qualitatively different and to reflect a nonresponse choice, which is modeled using an additional latent variable that captures a person's willingness to respond. The mixture model is studied using simulation studies and is applied to an empirical example.

  18. Applications of the Simple Multi-Fluid Model to Correlations of the Vapor-Liquid Equilibrium of Refrigerant Mixtures Containing Carbon Dioxide

    NASA Astrophysics Data System (ADS)

    Akasaka, Ryo

    This study presents a simple multi-fluid model for Helmholtz energy equations of state. The model contains only three parameters, whereas rigorous multi-fluid models developed for several industrially important mixtures usually have more than 10 parameters and coefficients. Therefore, the model can be applied to mixtures for which experimental data are limited. The vapor-liquid equilibrium (VLE) of the following seven mixtures has been successfully correlated with the model: CO2 + difluoromethane (R-32), CO2 + trifluoromethane (R-23), CO2 + fluoromethane (R-41), CO2 + 1,1,1,2-tetrafluoroethane (R-134a), CO2 + pentafluoroethane (R-125), CO2 + 1,1-difluoroethane (R-152a), and CO2 + dimethyl ether (DME). The best currently available equations of state for the pure refrigerants were used for the correlations. For all mixtures, average deviations of calculated bubble-point pressures from experimental values are within 2%. The simple multi-fluid model will be helpful for the design and simulation of heat pumps and refrigeration systems using these mixtures as working fluids.

  19. Influence of cluster–support interactions on reactivity of size-selected Nb xO y clusters

    DOE PAGES

    Nakayama, Miki; Xue, Meng; An, Wei; ...

    2015-04-17

    Size-selected niobium oxide nanoclusters (Nb 3O 5, Nb 3O 7, Nb 4O 7, and Nb 4O 10) were deposited at room temperature onto a Cu(111) surface and a thin film of Cu 2O on Cu(111), and their interfacial electronic interactions and reactivity toward water dissociation were examined. These clusters were specifically chosen to elucidate the effects of the oxidation state of the metal centers; Nb 3O 5 and Nb 4O 7 are the reduced counterparts of Nb 3O 7 and Nb 4O 10, respectively. From two-photon photoemission spectroscopy (2PPE) measurements, we found that the work function increases upon cluster adsorption in all cases, indicating a negative interfacial dipole moment with the positive end pointing into the surface. The amount of increase was greater for the clusters with more metal centers and higher oxidation state. Additional analysis with DFT calculations of the clusters on Cu(111) indicated that the reduced clusters donate electrons to the substrate, indicating that the intrinsic cluster dipole moment makes a larger contribution to the overall interfacial dipole moment than charge transfer. X-ray photoelectron spectroscopy (XPS) measurements showed that the Nb atoms of Nb 3O 7 and Nb 4O 10 are primarily Nb 5+ on Cu(111), while for the reduced Nb 3O 5 and Nb 4O 7 clusters, a mixture of oxidation states was observed on Cu(111). Temperature-programmed desorption (TPD) experiments with D 2O showed that water dissociation occurred on all systems except for the oxidized Nb 3O 7 and Nb 4O 10 clusters on the Cu 2O film. A comparison of our XPS and TPD results suggests that Nb 5+ cations associated with Nb=O terminal groups act as Lewis acid sites which are key for water binding and subsequent dissociation. TPD measurements of 2-propanol dehydration also show that the clusters active toward water dissociation are indeed acidic. DFT calculations of water dissociation on Nb 3O 7 support our TPD results, but the use of bulk Cu 2O(111) as a model for the Cu 2O film merits future scrutiny in terms of interfacial charge transfer. The combination of our experimental and theoretical results suggests that both Lewis acidity and metal reducibility are important for water dissociation.

  20. Quantifying tree mortality in a mixed species woodland using multitemporal high spatial resolution satellite imagery

    USGS Publications Warehouse

    Garrity, Steven R.; Allen, Craig D.; Brumby, Steven P.; Gangodagamage, Chandana; McDowell, Nate G.; Cai, D. Michael

    2013-01-01

    Widespread tree mortality events have recently been observed in several biomes. To effectively quantify the severity and extent of these events, tools that allow for rapid assessment at the landscape scale are required. Past studies using high spatial resolution satellite imagery have primarily focused on detecting green, red, and gray tree canopies during and shortly after tree damage or mortality has occurred. However, detecting trees in various stages of death is not always possible due to limited availability of archived satellite imagery. Here we assess the capability of high spatial resolution satellite imagery for tree mortality detection in a southwestern U.S. mixed species woodland using archived satellite images acquired prior to mortality and well after dead trees had dropped their leaves. We developed a multistep classification approach that uses: supervised masking of non-tree image elements; bi-temporal (pre- and post-mortality) differencing of normalized difference vegetation index (NDVI) and red:green ratio (RGI); and unsupervised multivariate clustering of pixels into live and dead tree classes using a Gaussian mixture model. Classification accuracies were improved in a final step by tuning the rules of pixel classification using the posterior probabilities of class membership obtained from the Gaussian mixture model. Classifications were produced for two images acquired post-mortality with overall accuracies of 97.9% and 98.5%, respectively. Classified images were combined with land cover data to characterize the spatiotemporal characteristics of tree mortality across areas with differences in tree species composition. We found that 38% of tree crown area was lost during the drought period between 2002 and 2006. The majority of tree mortality during this period was concentrated in piñon-juniper (Pinus edulis-Juniperus monosperma) woodlands. An additional 20% of the tree canopy died or was removed between 2006 and 2011, primarily in areas experiencing wildfire and management activity. Our results demonstrate that unsupervised clustering of bi-temporal NDVI and RGI differences can be used to detect tree mortality resulting from numerous causes and in several forest cover types.
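
    A hedged sketch of the final unsupervised step described above: a two-component Gaussian mixture over per-pixel bi-temporal (NDVI, RGI) differences, with posterior probabilities of class membership available for tuning the classification rule. The synthetic arrays and the 0.9 posterior threshold are assumptions, not values from the paper.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        # d_ndvi, d_rgi: per-pixel bi-temporal differences (post- minus pre-mortality),
        # already masked to tree pixels; synthetic placeholder values here.
        rng = np.random.default_rng(2)
        d_ndvi = np.concatenate([rng.normal(-0.25, 0.05, 500),   # dead-tree pixels
                                 rng.normal(0.00, 0.05, 1500)])  # live-tree pixels
        d_rgi = np.concatenate([rng.normal(0.15, 0.05, 500),
                                rng.normal(0.00, 0.05, 1500)])
        X = np.column_stack([d_ndvi, d_rgi])

        gmm = GaussianMixture(n_components=2, covariance_type="full",
                              random_state=0).fit(X)
        post = gmm.predict_proba(X)       # posterior class-membership probabilities

        # Tune the decision rule on the posterior rather than using plain argmax,
        # e.g. require high confidence before labelling a pixel "dead".
        dead_component = np.argmin(gmm.means_[:, 0])   # component with lower delta-NDVI
        is_dead = post[:, dead_component] > 0.9
        print("pixels labelled dead:", int(is_dead.sum()))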

  1. Discovering motion primitives for unsupervised grouping and one-shot learning of human actions, gestures, and expressions.

    PubMed

    Yang, Yang; Saleemi, Imran; Shah, Mubarak

    2013-07-01

    This paper proposes a novel representation of articulated human actions and gestures and facial expressions. The main goals of the proposed approach are: 1) to enable recognition using very few examples, i.e., one or k-shot learning, and 2) meaningful organization of unlabeled datasets by unsupervised clustering. Our proposed representation is obtained by automatically discovering high-level subactions or motion primitives, by hierarchical clustering of observed optical flow in the four-dimensional spatial and motion flow space. The completely unsupervised proposed method, in contrast to state-of-the-art representations like bag of video words, provides a meaningful representation conducive to visual interpretation and textual labeling. Each primitive action depicts an atomic subaction, like directional motion of limb or torso, and is represented by a mixture of four-dimensional Gaussian distributions. For one-shot and k-shot learning, the sequence of primitive labels discovered in a test video is determined using KL divergence, and can then be represented as a string and matched against similar strings of training videos. The same sequence can also be collapsed into a histogram of primitives or be used to learn a hidden Markov model to represent classes. We have performed extensive experiments on recognition by one- and k-shot learning as well as unsupervised action clustering on six human actions and gesture datasets, a composite dataset, and a database of facial expressions. These experiments confirm the validity and discriminative nature of the proposed representation.

  2. Data-driven mapping of hypoxia-related tumor heterogeneity using DCE-MRI and OE-MRI.

    PubMed

    Featherstone, Adam K; O'Connor, James P B; Little, Ross A; Watson, Yvonne; Cheung, Sue; Babur, Muhammad; Williams, Kaye J; Matthews, Julian C; Parker, Geoff J M

    2018-04-01

    Previous work has shown that combining dynamic contrast-enhanced (DCE)-MRI and oxygen-enhanced (OE)-MRI binary enhancement maps can identify tumor hypoxia. The current work proposes a novel, data-driven method for mapping tissue oxygenation and perfusion heterogeneity, based on clustering DCE/OE-MRI data. DCE-MRI and OE-MRI were performed on nine U87 (glioblastoma) and seven Calu6 (non-small cell lung cancer) murine xenograft tumors. Area under the curve and principal component analysis features were calculated and clustered separately using Gaussian mixture modelling. Evaluation metrics were calculated to determine the optimum feature set and cluster number. Outputs were quantitatively compared with a previous non-data-driven approach. The optimum method located six robustly identifiable clusters in the data, yielding tumor region maps with spatially contiguous regions in a rim-core structure, suggesting a biological basis. Mean within-cluster enhancement curves showed physiologically distinct, intuitive kinetics of enhancement. Regions of DCE/OE-MRI enhancement mismatch were located, and voxel categorization agreed well with the previous non-data-driven approach (Cohen's kappa = 0.61, proportional agreement = 0.75). The proposed method locates similar regions to the previously published method of binarization of DCE/OE-MRI enhancement, but renders a finer segmentation of intra-tumoral oxygenation and perfusion. This could aid in understanding the tumor microenvironment and its heterogeneity. Magn Reson Med 79:2236-2245, 2018. © 2017 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
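
    A rough sketch, on invented data, of the pipeline this record describes: reduce voxelwise DCE/OE-MRI enhancement curves to principal-component features and cluster them with Gaussian mixture models, choosing the number of clusters with an evaluation metric (BIC is used here as a stand-in for the paper's own criteria, which arrived at six clusters).

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.mixture import GaussianMixture

        # curves: voxels x timepoints matrix of concatenated DCE/OE enhancement
        # time courses (synthetic placeholder here).
        rng = np.random.default_rng(3)
        curves = rng.normal(size=(2000, 60))

        features = PCA(n_components=5, random_state=0).fit_transform(curves)

        best_k, best_bic, best_model = None, np.inf, None
        for k in range(2, 9):
            gmm = GaussianMixture(n_components=k, random_state=0).fit(features)
            bic = gmm.bic(features)
            if bic < best_bic:
                best_k, best_bic, best_model = k, bic, gmm

        labels = best_model.predict(features)   # tumour-region map after reshaping
        print("selected number of clusters:", best_k)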

  3. Different Approaches to Covariate Inclusion in the Mixture Rasch Model

    ERIC Educational Resources Information Center

    Li, Tongyun; Jiao, Hong; Macready, George B.

    2016-01-01

    The present study investigates different approaches to adding covariates and the impact in fitting mixture item response theory models. Mixture item response theory models serve as an important methodology for tackling several psychometric issues in test development, including the detection of latent differential item functioning. A Monte Carlo…

  4. A compressibility based model for predicting the tensile strength of directly compressed pharmaceutical powder mixtures.

    PubMed

    Reynolds, Gavin K; Campbell, Jacqueline I; Roberts, Ron J

    2017-10-05

    A new model to predict the compressibility and compactability of mixtures of pharmaceutical powders has been developed. The key aspect of the model is consideration of the volumetric occupancy of each powder under an applied compaction pressure and the respective contribution it then makes to the mixture properties. The compressibility and compactability of three pharmaceutical powders: microcrystalline cellulose, mannitol and anhydrous dicalcium phosphate have been characterised. Binary and ternary mixtures of these excipients have been tested and used to demonstrate the predictive capability of the model. Furthermore, the model is shown to be uniquely able to capture a broad range of mixture behaviours, including neutral, negative and positive deviations, illustrating its utility for formulation design. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Kinetics of binary nucleation of vapors in size and composition space.

    PubMed

    Fisenko, Sergey P; Wilemski, Gerald

    2004-11-01

    We reformulate the kinetic description of binary nucleation in the gas phase using two natural independent variables: the total number of molecules g and the molar composition x of the cluster. The resulting kinetic equation can be viewed as a two-dimensional Fokker-Planck equation describing the simultaneous Brownian motion of the clusters in size and composition space. Explicit expressions for the Brownian diffusion coefficients in cluster size and composition space are obtained. For characterization of binary nucleation in gases three criteria are established. These criteria establish the relative importance of the rate processes in cluster size and composition space for different gas phase conditions and types of liquid mixtures. The equilibrium distribution function of the clusters is determined in terms of the variables g and x. We obtain an approximate analytical solution for the steady-state binary nucleation rate that has the correct limit in the transition to unary nucleation. To further illustrate our description, the nonequilibrium steady-state cluster concentrations are found by numerically solving the reformulated kinetic equation. For the reformulated transient problem, the relaxation or induction time for binary nucleation was calculated using Galerkin's method. This relaxation time is affected by processes in both size and composition space, but the contributions from each process can be separated only approximately.

  6. Thermal and log-normal distributions of plasma in laser driven Coulomb explosions of deuterium clusters

    NASA Astrophysics Data System (ADS)

    Barbarino, M.; Warrens, M.; Bonasera, A.; Lattuada, D.; Bang, W.; Quevedo, H. J.; Consoli, F.; de Angelis, R.; Andreoli, P.; Kimura, S.; Dyer, G.; Bernstein, A. C.; Hagel, K.; Barbui, M.; Schmidt, K.; Gaul, E.; Donovan, M. E.; Natowitz, J. B.; Ditmire, T.

    2016-08-01

    In this work, we explore the possibility that the motion of the deuterium ions emitted from Coulomb cluster explosions is sufficiently disordered to resemble thermalization. We analyze the process of nuclear fusion reactions driven by laser-cluster interactions in experiments conducted at the Texas Petawatt laser facility using a mixture of D2+3He and CD4+3He cluster targets. When clusters explode by Coulomb repulsion, the emission of the energetic ions is “nearly” isotropic. In the framework of cluster Coulomb explosions, we analyze the energy distributions of the ions using a Maxwell-Boltzmann (MB) distribution, a shifted MB distribution (sMB), and the energy distribution derived from a log-normal (LN) size distribution of clusters. We show that the first two distributions reproduce well the experimentally measured ion energy distributions and the number of fusions from d-d and d-3He reactions. The LN distribution is a good representation of the ion kinetic energy distribution up to high momenta, where the noise becomes dominant, but overestimates both the neutron and the proton yields. If the parameters of the LN distributions are chosen to reproduce the fusion yields correctly, the experimentally measured high energy ion spectrum is not well represented. We conclude that the ion kinetic energy distribution is highly disordered and practically not distinguishable from a thermalized one.

  7. Classification-based quantitative analysis of stable isotope labeling by amino acids in cell culture (SILAC) data.

    PubMed

    Kim, Seongho; Carruthers, Nicholas; Lee, Joohyoung; Chinni, Sreenivasa; Stemmer, Paul

    2016-12-01

    Stable isotope labeling by amino acids in cell culture (SILAC) is a practical and powerful approach for quantitative proteomic analysis. A key advantage of SILAC is the ability to simultaneously detect the isotopically labeled peptides in a single instrument run and so guarantee relative quantitation for a large number of peptides without introducing any variation caused by separate experiments. However, only a few approaches are available for assessing protein ratios, and none of the existing algorithms pays considerable attention to proteins having only one peptide hit. We introduce new quantitative approaches to SILAC protein-level summaries using classification-based methodologies, such as Gaussian mixture models fitted with the EM algorithm or its Bayesian counterpart, as well as K-means clustering. In addition, a new approach is developed that combines a Gaussian mixture model with a stochastic, metaheuristic global optimization algorithm, particle swarm optimization (PSO), to avoid premature convergence or becoming stuck in a local optimum. Our simulation studies show that the newly developed PSO-based method performs best in terms of F1 score, and the proposed methods further demonstrate the ability to detect potential markers in real SILAC experimental data. The developed approach is applicable no matter how many peptide hits a protein has, rescuing many proteins that would otherwise be removed. Furthermore, no additional correction for multiple comparisons is necessary for the developed methods, enabling direct interpretation of the analysis outcomes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
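
    A minimal sketch of one of the classification-based approaches named above: a two-component Gaussian mixture fitted by EM to per-protein log-ratios, with posterior probabilities used directly for classification. The data and the 0.95 posterior cut-off are invented; the PSO-based variant is not shown.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        # log_ratios: one log2(heavy/light) ratio per protein, including proteins
        # quantified by a single peptide (synthetic placeholder values).
        rng = np.random.default_rng(4)
        log_ratios = np.concatenate([rng.normal(0.0, 0.3, 900),    # unchanged
                                     rng.normal(1.5, 0.4, 100)])   # up-regulated
        X = log_ratios.reshape(-1, 1)

        # Two-component Gaussian mixture fitted by EM; the component with the
        # larger mean is interpreted as the "changed" class.
        gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
        changed = int(np.argmax(gmm.means_.ravel()))
        post_changed = gmm.predict_proba(X)[:, changed]

        # Posterior probabilities give a direct classification, so no separate
        # multiple-testing correction step is applied in this scheme.
        print("proteins called changed:", int((post_changed > 0.95).sum()))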

  8. Extracting Spurious Latent Classes in Growth Mixture Modeling with Nonnormal Errors

    ERIC Educational Resources Information Center

    Guerra-Peña, Kiero; Steinley, Douglas

    2016-01-01

    Growth mixture modeling is generally used for two purposes: (1) to identify mixtures of normal subgroups and (2) to approximate oddly shaped distributions by a mixture of normal components. Often in applied research this methodology is applied to both of these situations indistinctly: using the same fit statistics and likelihood ratio tests. This…

  9. Mercury-induced fragmentation of n-decane and n-undecane in positive mode ion mobility spectrometry.

    PubMed

    Gunzer, F

    2015-09-21

    Ion mobility spectrometry is a well-known technique for trace gas analysis. Using soft ionization techniques, fragmentation of analytes is normally not observed, with the consequence that analyte spectra of single substances are quite simple, i.e. showing in general only one peak. If the concentration is high enough, an extra cluster peak involving two analyte molecules can often be observed. When investigating n-alkanes, different results regarding the number of peaks in the spectra have been obtained in the past using this spectrometric technique. Here we present results obtained when analyzing n-alkanes (n-hexane to n-undecane) with a pulsed electron source, which show no fragmentation or clustering at all. However, when investigating a mixture of mercury and an n-alkane, a situation quite typical in the oil and gas industry, a strong fragmentation and cluster formation involving these fragments has been observed exclusively for n-decane and n-undecane.

  10. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

    PubMed

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.

  11. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

    PubMed Central

    Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

    2016-01-01

    This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability. PMID:26941699

  12. Single-Photon Ionization Soft-X-Ray Laser Mass Spectrometry of Potential Hydrogen Storage Materials

    NASA Astrophysics Data System (ADS)

    Dong, F.; Bernstein, E. R.; Rocca, J. J.

    A desktop-size capillary discharge 46.9 nm laser is applied to the gas-phase study of nanoclusters. The high photon energy allows for single-photon ionization mass spectrometry with reduced cluster fragmentation. In the present studies, neutral Al m C n and Al m C n H x clusters are investigated for the first time. Single-photon ionization with 46.9 nm, 118 nm, and 193 nm lasers is used to detect neutral cluster distributions through time-of-flight mass spectrometry. Al m C n clusters are generated through laser ablation of a mixture of Al and C powders pressed into a disk. An oscillation of the vertical ionization energies (VIEs) of Al m C n clusters is observed in the experiments. The VIEs of Al m C n clusters change as a function of the numbers of Al and C atoms in the clusters. Al m C n H x clusters are generated through an Al ablation plasma-hydrocarbon reaction, an Al-C ablation plasma reacting with H2 gas, or through cold Al m C n clusters reacting with H2 gas in a fast flow reactor. DFT and ab initio calculations are carried out to explore the structures, IEs, and electronic structures of Al m C n H x clusters. C=C bonds are favored in the lowest energy structures for Al m C n clusters. Be m C n H x clusters are generated through a beryllium ablation plasma-hydrocarbon reaction and detected by single-photon ionization with a 193 nm laser. Both Al m C n H x and Be m C n H x are considered as potential hydrogen storage materials.

  13. Templated Atom-Precise Galvanic Synthesis and Structure Elucidation of a [Ag24Au(SR)18](-) Nanocluster.

    PubMed

    Bootharaju, Megalamane S; Joshi, Chakra P; Parida, Manas R; Mohammed, Omar F; Bakr, Osman M

    2016-01-18

    Synthesis of atom-precise alloy nanoclusters with uniform composition is challenging when the alloying atoms are similar in size (for example, Ag and Au). A galvanic exchange strategy has been devised to produce a compositionally uniform [Ag24Au(SR)18](-) cluster (SR: thiolate) using a pure [Ag25(SR)18](-) cluster as a template. Conversely, the direct synthesis of the Ag24Au cluster leads to a mixture of [Ag(25-x)Au(x)(SR)18](-), x=1-8. Mass spectrometry and crystallography of [Ag24Au(SR)18](-) reveal the presence of the Au heteroatom at the Ag25 center, forming Ag24Au. The successful exchange of the central Ag of Ag25 with Au causes perturbations in the Ag25 crystal structure, which are reflected in the absorption, luminescence, and ambient stability of the particle. These properties are compared with those of Ag25 and Ag24Pd clusters with the same ligand and structural framework, providing new insights into the modulation of cluster properties with dopants at the single-atom level. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Solubility modeling of refrigerant/lubricant mixtures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Michels, H.H.; Sienel, T.H.

    1996-12-31

    A general model for predicting the solubility properties of refrigerant/lubricant mixtures has been developed based on applicable theory for the excess Gibbs energy of non-ideal solutions. In our approach, flexible thermodynamic forms are chosen to describe the properties of both the gas and liquid phases of refrigerant/lubricant mixtures. After an extensive study of models for describing non-ideal liquid effects, the Wohl-suffix equations, which have been extensively utilized in the analysis of hydrocarbon mixtures, have been developed into a general form applicable to mixtures where one component is a POE lubricant. In the present study we have analyzed several POEs where structural and thermophysical property data were available. Data were also collected from several sources on the solubility of refrigerant/lubricant binary pairs. We have developed a computer code (NISC), based on the Wohl model, that predicts dew point or bubble point conditions over a wide range of composition and temperature. Our present analysis covers mixtures containing up to three refrigerant molecules and one lubricant. The present code can be used to analyze the properties of R-410a and R-407c in mixtures with a POE lubricant. Comparisons with other models, such as the Wilson or modified Wilson equations, indicate that the Wohl-suffix equations yield more reliable predictions for HFC/POE mixtures.

  15. Adaptation of ammonia-oxidizing microorganisms to environment shift of paddy field soil.

    PubMed

    Ke, Xiubin; Lu, Yahai

    2012-04-01

    Adaptation of microorganisms to the environment is a central theme in microbial ecology. The objective of this study was to investigate the response of ammonia-oxidizing bacteria (AOB) and ammonia-oxidizing archaea (AOA) to a soil medium shift. We employed two rice field soils collected from Beijing and Hangzhou, China. These soils contained distinct AOB communities dominated by Nitrosomonas in Beijing rice soil and Nitrosospira in Hangzhou rice soil. Three mixtures were generated by mixing equal quantities of Beijing soil and Hangzhou soil (BH), Beijing soil with sterilized Hangzhou soil (BSH), and Hangzhou soil with sterilized Beijing soil (HSB). Pure and mixed soils were permanently flooded, and the surface-layer soil where ammonia oxidation occurred was collected to determine the response of AOB and AOA to the soil medium shift. AOB populations increased during the incubation, and the rates were initially faster in Beijing soil than in Hangzhou soil. Nitrosospira (cluster 3a) and Nitrosomonas (communis cluster) increased with time in correspondence with ammonia oxidation in the Hangzhou and Beijing soils, respectively. The 'BH' mixture exhibited a shift from Nitrosomonas at day 0 to Nitrosospira at days 21 and 60 when ammonia oxidation became most active. In 'HSB' and 'BSH' mixtures, Nitrosospira showed greater stimulation than Nitrosomonas, both with and without N amendment. These results suggest that Nitrosospira spp. were better adapted to soil environment shifts than Nitrosomonas. Analysis of the AOA community revealed that the composition of AOA community was not responsive to the soil environment shifts or to nitrogen amendment. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  16. An evaluation of the Bayesian approach to fitting the N-mixture model for use with pseudo-replicated count data

    USGS Publications Warehouse

    Toribo, S.G.; Gray, B.R.; Liang, S.

    2011-01-01

    The N-mixture model proposed by Royle in 2004 may be used to approximate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of using a Bayesian approach in modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication on sampling sites, an assumption that may be violated when the site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we studied the robustness of a Bayesian approach to fitting the N-mixture model for pseudo-replicated count data. Our simulation results showed that the Bayesian estimates for abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.
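
    A sketch of the binomial N-mixture likelihood for replicated counts, fitted here by maximum likelihood with the latent abundance marginalised over a finite grid; this is a simplification for illustration, whereas the study above fits the model in a Bayesian framework (e.g., via MCMC). All simulated values are arbitrary.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.stats import poisson, binom

        rng = np.random.default_rng(5)

        # Simulate R sites with T replicate counts: N_i ~ Poisson(lam), y_it ~ Bin(N_i, p)
        R, T, lam_true, p_true = 150, 3, 4.0, 0.5
        N = rng.poisson(lam_true, size=R)
        y = rng.binomial(N[:, None], p_true, size=(R, T))

        def neg_log_lik(theta, y, n_max=60):
            lam, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
            n_grid = np.arange(n_max + 1)                  # marginalise over latent N
            log_prior = poisson.logpmf(n_grid, lam)        # shape (n_max+1,)
            # log P(y_i | N = n) summed over replicates, shape (R, n_max+1)
            log_obs = binom.logpmf(y[:, :, None], n_grid[None, None, :], p).sum(axis=1)
            site_ll = np.logaddexp.reduce(log_prior[None, :] + log_obs, axis=1)
            return -site_ll.sum()

        fit = minimize(neg_log_lik, x0=[np.log(2.0), 0.0], args=(y,), method="Nelder-Mead")
        lam_hat, p_hat = np.exp(fit.x[0]), 1 / (1 + np.exp(-fit.x[1]))
        print("lambda_hat, p_hat:", round(lam_hat, 2), round(p_hat, 2))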

  17. Percolation in education and application in the 21st century

    NASA Astrophysics Data System (ADS)

    Adler, Joan; Elfenbaum, Shaked; Sharir, Liran

    2017-03-01

    Percolation, "so simple you could teach it to your wife" (Chuck Newman, last century) is an ideal system to introduce young students to phase transitions. Two recent projects in the Computational Physics group at the Technion make this easy. One is a set of analog models to be mounted on our walls and enable visitors to switch between samples to see which mixtures of glass and metal objects have a percolating current. The second is a website enabling the creation of stereo samples of two and three dimensional clusters (suited for viewing with Oculus rift) on desktops, tablets and smartphones. Although there have been many physical applications for regular percolation in the past, for Bootstrap Percolation, where only sites with sufficient occupied neighbours remain active, there have not been a surfeit of condensed matter applications. We have found that the creation of diamond membranes for quantum computers can be modeled with a bootstrap process of graphitization in diamond, enabling prediction of optimal processing procedures.

  18. T-matrix modeling of linear depolarization by morphologically complex soot and soot-containing aerosols

    NASA Astrophysics Data System (ADS)

    Mishchenko, Michael I.; Liu, Li; Mackowski, Daniel W.

    2013-07-01

    We use state-of-the-art public-domain Fortran codes based on the T-matrix method to calculate orientation and ensemble averaged scattering matrix elements for a variety of morphologically complex black carbon (BC) and BC-containing aerosol particles, with a special emphasis on the linear depolarization ratio (LDR). We explain theoretically the quasi-Rayleigh LDR peak at side-scattering angles typical of low-density soot fractals and conclude that the measurement of this feature enables one to evaluate the compactness state of BC clusters and trace the evolution of low-density fluffy fractals into densely packed aggregates. We show that small backscattering LDRs measured with ground-based, airborne, and spaceborne lidars for fresh smoke generally agree with the values predicted theoretically for fluffy BC fractals and densely packed near-spheroidal BC aggregates. To reproduce higher lidar LDRs observed for aged smoke, one needs alternative particle models such as shape mixtures of BC spheroids or cylinders.

  19. Generalized species sampling priors with latent Beta reinforcements

    PubMed Central

    Airoldi, Edoardo M.; Costa, Thiago; Bassetti, Federico; Leisen, Fabrizio; Guindani, Michele

    2014-01-01

    Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of independent Beta random variables. We compare their theoretical clustering properties with those of the Dirichlet process and the two-parameter Poisson-Dirichlet process. The proposed construction provides a complete characterization of the joint process, in contrast with existing work. We then propose the use of such a process as a prior distribution in a hierarchical Bayes modeling framework, and we describe a Markov chain Monte Carlo sampler for posterior inference. We evaluate the performance of the prior and the robustness of the resulting inference in a simulation study, providing a comparison with popular Dirichlet process mixtures and hidden Markov models. Finally, we develop an application to the detection of chromosomal aberrations in breast cancer by leveraging array CGH data. PMID:25870462

  20. T-Matrix Modeling of Linear Depolarization by Morphologically Complex Soot and Soot-Containing Aerosols

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Liu, Li; Mackowski, Daniel W.

    2013-01-01

    We use state-of-the-art public-domain Fortran codes based on the T-matrix method to calculate orientation and ensemble averaged scattering matrix elements for a variety of morphologically complex black carbon (BC) and BC-containing aerosol particles, with a special emphasis on the linear depolarization ratio (LDR). We explain theoretically the quasi-Rayleigh LDR peak at side-scattering angles typical of low-density soot fractals and conclude that the measurement of this feature enables one to evaluate the compactness state of BC clusters and trace the evolution of low-density fluffy fractals into densely packed aggregates. We show that small backscattering LDRs measured with ground-based, airborne, and spaceborne lidars for fresh smoke generally agree with the values predicted theoretically for fluffy BC fractals and densely packed near-spheroidal BC aggregates. To reproduce higher lidar LDRs observed for aged smoke, one needs alternative particle models such as shape mixtures of BC spheroids or cylinders.

  1. Process Dissociation and Mixture Signal Detection Theory

    ERIC Educational Resources Information Center

    DeCarlo, Lawrence T.

    2008-01-01

    The process dissociation procedure was developed in an attempt to separate different processes involved in memory tasks. The procedure naturally lends itself to a formulation within a class of mixture signal detection models. The dual process model is shown to be a special case. The mixture signal detection model is applied to data from a widely…

  2. Investigating Approaches to Estimating Covariate Effects in Growth Mixture Modeling: A Simulation Study

    ERIC Educational Resources Information Center

    Li, Ming; Harring, Jeffrey R.

    2017-01-01

    Researchers continue to be interested in efficient, accurate methods of estimating coefficients of covariates in mixture modeling. Including covariates related to the latent class analysis not only may improve the ability of the mixture model to clearly differentiate between subjects but also makes interpretation of latent group membership more…

  3. Finite Mixture Multilevel Multidimensional Ordinal IRT Models for Large Scale Cross-Cultural Research

    ERIC Educational Resources Information Center

    de Jong, Martijn G.; Steenkamp, Jan-Benedict E. M.

    2010-01-01

    We present a class of finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Our model is proposed for confirmatory research settings. Our prior for item parameters is a mixture distribution to accommodate situations where different groups of countries have different measurement operations, while…

  4. Approximation of the breast height diameter distribution of two-cohort stands by mixture models I Parameter estimation

    Treesearch

    Rafal Podlaski; Francis A. Roesch

    2013-01-01

    Study assessed the usefulness of various methods for choosing the initial values for the numerical procedures for estimating the parameters of mixture distributions and analysed variety of mixture models to approximate empirical diameter at breast height (dbh) distributions. Two-component mixtures of either the Weibull distribution or the gamma distribution were...

  5. Detection of mastitis in dairy cattle by use of mixture models for repeated somatic cell scores: a Bayesian approach via Gibbs sampling.

    PubMed

    Odegård, J; Jensen, J; Madsen, P; Gianola, D; Klemetsdal, G; Heringstad, B

    2003-11-01

    The distribution of somatic cell scores could be regarded as a mixture of at least two components depending on a cow's udder health status. A heteroscedastic two-component Bayesian normal mixture model with random effects was developed and implemented via Gibbs sampling. The model was evaluated using datasets consisting of simulated somatic cell score records. Somatic cell score was simulated as a mixture representing two alternative udder health statuses ("healthy" or "diseased"). Animals were assigned randomly to the two components according to the probability of group membership (Pm). Random effects (additive genetic and permanent environment), when included, had identical distributions across mixture components. Posterior probabilities of putative mastitis were estimated for all observations, and model adequacy was evaluated using measures of sensitivity, specificity, and posterior probability of misclassification. Fitting different residual variances in the two mixture components caused some bias in estimation of parameters. When the components were difficult to disentangle, so were their residual variances, causing bias in estimation of Pm and of location parameters of the two underlying distributions. When all variance components were identical across mixture components, the mixture model analyses returned parameter estimates essentially without bias and with a high degree of precision. Including random effects in the model increased the probability of correct classification substantially. No sizable differences in probability of correct classification were found between models in which a single cow effect (ignoring relationships) was fitted and models where this effect was split into genetic and permanent environmental components, utilizing relationship information. When genetic and permanent environmental effects were fitted, the between-replicate variance of estimates of posterior means was smaller because the model accounted for random genetic drift.
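
    For illustration only, a two-component heteroscedastic normal mixture fitted by EM to synthetic somatic cell scores; the posterior probability of the second component plays the role of the posterior probability of putative mastitis. The study above instead fits a hierarchical Bayesian mixture with random effects via Gibbs sampling, which this sketch does not attempt.

        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(6)
        # Synthetic somatic cell scores: "healthy" and "diseased" components
        scs = np.concatenate([rng.normal(2.5, 0.8, 800), rng.normal(5.0, 1.2, 200)])

        # EM for a two-component heteroscedastic normal mixture
        pm, mu, sd = 0.2, np.array([2.0, 5.5]), np.array([1.0, 1.0])
        for _ in range(200):
            # E-step: posterior probability of the "diseased" component
            d0 = (1 - pm) * norm.pdf(scs, mu[0], sd[0])
            d1 = pm * norm.pdf(scs, mu[1], sd[1])
            w = d1 / (d0 + d1)
            # M-step: update mixing proportion, means, and component-specific variances
            pm = w.mean()
            mu = np.array([np.average(scs, weights=1 - w), np.average(scs, weights=w)])
            sd = np.sqrt([np.average((scs - mu[0])**2, weights=1 - w),
                          np.average((scs - mu[1])**2, weights=w)])
        print("P(diseased) =", round(pm, 3), " component means =", np.round(mu, 2))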

  6. Associations between dairy cow inter-service interval and probability of conception.

    PubMed

    Remnant, J G; Green, M J; Huxley, J N; Hudson, C D

    2018-07-01

    Recent research has indicated that the interval between inseminations in modern dairy cattle is often longer than the commonly accepted cycle length of 18-24 days. This study analysed 257,396 inseminations in 75,745 cows from 312 herds in England and Wales. Intervals between subsequent inseminations in the same cow in the same lactation (inter-service intervals, ISIs) were calculated, and inseminations were categorised as successful or unsuccessful depending on whether there was a corresponding calving event. Conception risk was calculated for each individual ISI between 16 and 28 days. A random effects logistic regression model was fitted to the data with pregnancy as the outcome variable and ISI (in days) included in the model as a categorical variable. The modal ISI was 22 days and the peak conception risk was 44% for ISIs of 21 days, rising from 27% at 16 days. The logistic regression model revealed significant associations of conception risk with ISI as well as 305-day milk yield, insemination number, parity and days in milk. Predicted conception risk was lower for ISIs of 16, 17 and 18 days and higher for ISIs of 20, 21 and 22 days compared to 25-day ISIs. A mixture model was specified to identify clusters in insemination frequency and conception risk for ISIs between 3 and 50 days. A "high conception risk, high insemination frequency" cluster was identified between 19 and 26 days, which indicated that this time period corresponded to the true latent ISI distribution with the optimal reproductive outcome. These findings suggest that the period of increased numbers of inseminations around 22 days identified in existing work coincides with the period of increased probability of conception and therefore likely represents true return estrus events. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. Identification of treatment responders based on multiple longitudinal outcomes with applications to multiple sclerosis patients.

    PubMed

    Kondo, Yumi; Zhao, Yinshan; Petkau, John

    2017-05-30

    Identification of treatment responders is a challenge in comparative studies where treatment efficacy is measured by multiple longitudinally collected continuous and count outcomes. Existing procedures often identify responders on the basis of only a single outcome. We propose a novel multiple longitudinal outcome mixture model that assumes that, conditionally on a cluster label, each longitudinal outcome is from a generalized linear mixed effect model. We utilize a Monte Carlo expectation-maximization algorithm to obtain the maximum likelihood estimates of our high-dimensional model and classify patients according to their estimated posterior probability of being a responder. We demonstrate the flexibility of our novel procedure on two multiple sclerosis clinical trial datasets with distinct data structures. Our simulation study shows that incorporating multiple outcomes improves the responder identification performance; this can occur even if some of the outcomes are ineffective. Our general procedure facilitates the identification of responders who are comprehensively defined by multiple outcomes from various distributions. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Toxicity of an α-Pore-forming Toxin Depends on the Assembly Mechanism on the Target Membrane as Revealed by Single Molecule Imaging*

    PubMed Central

    Subburaj, Yamunadevi; Ros, Uris; Hermann, Eduard; Tong, Rudi; García-Sáez, Ana J.

    2015-01-01

    α-Pore-forming toxins (α-PFTs) are ubiquitous defense tools that kill cells by opening pores in the target cell membrane. Despite their relevance in host/pathogen interactions, very little is known about the pore stoichiometry and assembly pathway leading to membrane permeabilization. Equinatoxin II (EqtII) is a model α-PFT from sea anemone that oligomerizes and forms pores in sphingomyelin-containing membranes. Here, we determined the spatiotemporal organization of EqtII in living cells by single molecule imaging. Surprisingly, we found that on the cell surface EqtII did not organize into a unique oligomeric form. Instead, it existed as a mixture of oligomeric species mostly including monomers, dimers, tetramers, and hexamers. Mathematical modeling based on our data supported a new model in which toxin clustering happened in seconds and proceeded via condensation of EqtII dimer units formed upon monomer association. Furthermore, altering the pathway of EqtII assembly strongly affected its toxic activity, which highlights the relevance of the assembly mechanism on toxicity. PMID:25525270

  9. Axelrod Model of Social Influence with Cultural Hybridization

    NASA Astrophysics Data System (ADS)

    Radillo-Díaz, Alejandro; Pérez, Luis A.; Del Castillo-Mussot, Marcelo

    2012-10-01

    Since cultural interactions between a pair of social agents involve changes in both individuals, we present simulations of a new model based on Axelrod's homogenization mechanism that includes hybridization or mixture of the agents' features. In this new hybridization model, once a cultural feature of a pair of agents has been chosen for the interaction, the average of the values for this feature is reassigned as the new value for both agents after interaction. Moreover, a parameter representing social tolerance is implemented in order to quantify whether agents are similar enough to engage in interaction, as well as to determine whether they belong to the same cluster of similar agents after the system has reached the frozen state. The transitions from a homogeneous state to a fragmented one decrease in abruptness as tolerance is increased. Additionally, the entropy associated to the system presents a maximum within the transition, the width of which increases as tolerance does. Moreover, a plateau was found inside the transition for a low-tolerance system of agents with only two cultural features.

  10. Discrimination of complex mixtures by a colorimetric sensor array: coffee aromas.

    PubMed

    Suslick, Benjamin A; Feng, Liang; Suslick, Kenneth S

    2010-03-01

    The analysis of complex mixtures presents a difficult challenge even for modern analytical techniques, and the ability to discriminate among closely similar such mixtures often remains problematic. Coffee provides a readily available archetype of such highly multicomponent systems. The use of a low-cost, sensitive colorimetric sensor array for the detection and identification of coffee aromas is reported. The color changes of the sensor array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA). PCA revealed that the sensor array has exceptionally high dimensionality with 18 dimensions required to define 90% of the total variance. In quintuplicate runs of 10 commercial coffees and controls, no confusions or errors in classification by HCA were observed in 55 trials. In addition, the effects of temperature and time in the roasting of green coffee beans were readily observed and distinguishable with a resolution better than 10 degrees C and 5 min, respectively. Colorimetric sensor arrays demonstrate excellent potential for complex systems analysis in real-world applications and provide a novel method for discrimination among closely similar complex mixtures.
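
    A sketch of the standard chemometric workflow named in this record, PCA followed by hierarchical clustering analysis (HCA), applied to a hypothetical array-response matrix; the actual color-difference data are not reproduced, so the variance and cluster structure here are synthetic.

        import numpy as np
        from sklearn.decomposition import PCA
        from scipy.cluster.hierarchy import linkage, fcluster

        # responses: trials x features matrix of color changes (delta-R, delta-G,
        # delta-B per array spot); synthetic stand-in values here.
        rng = np.random.default_rng(7)
        responses = np.vstack([rng.normal(loc=m, scale=0.05, size=(5, 108))
                               for m in (0.0, 0.5, 1.0)])   # 3 "coffees", 5 trials each

        # PCA: how many components are needed to reach 90% of the total variance?
        pca = PCA().fit(responses)
        n90 = int(np.searchsorted(np.cumsum(pca.explained_variance_ratio_), 0.90)) + 1
        print("components for 90% variance:", n90)

        # Hierarchical clustering analysis (Ward linkage) of the trials
        Z = linkage(responses, method="ward")
        labels = fcluster(Z, t=3, criterion="maxclust")
        print("cluster labels:", labels)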

  11. Discrimination of Complex Mixtures by a Colorimetric Sensor Array: Coffee Aromas

    PubMed Central

    Suslick, Benjamin A.; Feng, Liang; Suslick, Kenneth S.

    2010-01-01

    The analysis of complex mixtures presents a difficult challenge even for modern analytical techniques, and the ability to discriminate among closely similar such mixtures often remains problematic. Coffee provides a readily available archetype of such highly multicomponent systems. The use of a low-cost, sensitive colorimetric sensor array for the detection and identification of coffee aromas is reported. The color changes of the sensor array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA). PCA revealed that the sensor array has exceptionally high dimensionality with 18 dimensions required to define 90% of the total variance. In quintuplicate runs of 10 commercial coffees and controls, no confusions or errors in classification by HCA were observed in 55 trials. In addition, the effects of temperature and time in the roasting of green coffee beans were readily observed and distinguishable with a resolution better than 10 °C and 5 min, respectively. Colorimetric sensor arrays demonstrate excellent potential for complex systems analysis in real-world applications and provide a novel method for discrimination among closely similar complex mixtures. PMID:20143838

  12. Directional Dark Matter Detector Prototype (Time Projection Chamber)

    NASA Astrophysics Data System (ADS)

    Oliver-Mallory, Kelsey; Garcia-Sciveres, Maurice; Kadyk, John; Lopex-Thibodeaux, Mayra

    2013-04-01

    The time projection chamber is a mature technology that has emerged as a promising candidate for the directional detection of the WIMP particle. In order to utilize this technology in WIMP detection, the operational parameters must be chosen in the non-ideal regime. A prototype WIMP detector with a 10 cm field cage, double GEM amplification, and ATLAS FEI3 pixel chip readout was constructed for the purpose of investigating effects of varying gas pressure in different gas mixtures. The rms radii of ionization clusters of photoelectrons caused by X-rays from an Fe-55 source were measured for several gas pressures between 760 torr and 99 torr in Ar(70)/CO2(30), CF4, He(80)/Isobutane(20), and He(80)/CF4(20) mixtures. Average radii were determined from distributions of the data for each gas mixture and pressure, and revealed a negative correlation between pressure and radius in Ar(70)/CO2(30) and He(80)/Isobutane(20) mixtures. Investigation of the pressure-radius measurements is in progress using distributions of photoelectron and Auger electron practical ranges (Univ. of Pisa) and diffusion, using the Garfield Monte Carlo program.

  13. Enhanced gel formation in binary mixtures of nanocolloids with short-range attraction

    NASA Astrophysics Data System (ADS)

    Harden, James L.; Guo, Hongyu; Bertrand, Martine; Shendruk, Tyler N.; Ramakrishnan, Subramanian; Leheny, Robert L.

    2018-01-01

    Colloidal suspensions transform between fluid and disordered solid states as parameters such as the colloid volume fraction and the strength and nature of the colloidal interactions are varied. Seemingly subtle changes in the characteristics of the colloids can markedly alter the mechanical rigidity and flow behavior of these soft composite materials. This sensitivity creates both a scientific challenge and an opportunity for designing suspensions for specific applications. In this paper, we report a novel mechanism of gel formation in mixtures of weakly attractive nanocolloids with modest size ratio. Employing a combination of x-ray photon correlation spectroscopy, rheometry, and molecular dynamics simulations, we find that gels are stable at remarkably weaker attraction in mixtures with size ratio near two than in the corresponding monodisperse suspensions. In contrast with depletion-driven gelation at larger size ratio, gel formation in the mixtures is triggered by microphase demixing of the species into dense regions of immobile smaller colloids surrounded by clusters of mobile larger colloids that is not predicted by mean-field thermodynamic considerations. These results point to a new route for tailoring nanostructured colloidal solids through judicious combination of interparticle interaction and size distribution.

  14. Modelling diameter distributions of two-cohort forest stands with various proportions of dominant species: a two-component mixture model approach.

    Treesearch

    Rafal Podlaski; Francis Roesch

    2014-01-01

    In recent years finite-mixture models have been employed to approximate and model empirical diameter at breast height (DBH) distributions. We used two-component mixtures of either the Weibull distribution or the gamma distribution for describing the DBH distributions of mixed-species, two-cohort forest stands, to analyse the relationships between the DBH components,...

  15. A general mixture model and its application to coastal sandbar migration simulation

    NASA Astrophysics Data System (ADS)

    Liang, Lixin; Yu, Xiping

    2017-04-01

    A mixture model for the general description of sediment-laden flows is developed and then applied to coastal sandbar migration simulation. First, the mixture model is derived based on the Eulerian-Eulerian approach of the complete two-phase flow theory. The basic equations of the model include the mass and momentum conservation equations for the water-sediment mixture and the continuity equation for sediment concentration. The turbulent motion of the mixture is formulated for the fluid and the particles respectively. A modified k-ɛ model is used to describe the fluid turbulence while an algebraic model is adopted for the particles. A general formulation for the relative velocity between the two phases in sediment-laden flows, which is derived by manipulating the momentum equations of the enhanced two-phase flow model, is incorporated into the mixture model. A finite difference method based on the SMAC scheme is utilized for numerical solutions. The model is validated against suspended sediment motion in steady open channel flows, in both equilibrium and non-equilibrium states, as well as in oscillatory flows. The computed sediment concentrations, horizontal velocity and turbulence kinetic energy of the mixture are all shown to be in good agreement with experimental data. The mixture model is then applied to the study of sediment suspension and sandbar migration in surf zones under a vertical 2D framework. The VOF method for the description of the water-air free surface and a topography response model are coupled. The bed load transport rate and suspended load entrainment rate are both determined by the sea bed shear stress, which is obtained from the boundary-layer-resolved mixture model. The simulation results indicated that, under small-amplitude regular waves, erosion occurred on the sandbar slope against the wave propagation direction, while deposition dominated on the slope towards wave propagation, indicating an onshore migration tendency. The computed results also show that the suspended load makes a large contribution to the topography change in the surf zone, a contribution that has usually been neglected in previous research.

  16. Modeling mixtures of thyroid gland function disruptors in a vertebrate alternative model, the zebrafish eleutheroembryo

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thienpont, Benedicte; Barata, Carlos; Raldúa, Demetrio, E-mail: drpqam@cid.csic.es

    2013-06-01

    Maternal thyroxine (T4) plays an essential role in fetal brain development, and even mild and transitory deficits in free-T4 in pregnant women can produce irreversible neurological effects in their offspring. Women of childbearing age are daily exposed to mixtures of chemicals disrupting the thyroid gland function (TGFDs) through the diet, drinking water, air and pharmaceuticals, which has raised the highest concern for the potential additive or synergic effects on the development of mild hypothyroxinemia during early pregnancy. Recently we demonstrated that zebrafish eleutheroembryos provide a suitable alternative model for screening chemicals impairing the thyroid hormone synthesis. The present study used the intrafollicular T4-content (IT4C) of zebrafish eleutheroembryos as an integrative endpoint for testing the hypotheses that the effect of mixtures of TGFDs with a similar mode of action [inhibition of thyroid peroxidase (TPO)] was well predicted by a concentration addition concept (CA) model, whereas the response addition concept (RA) model better predicted the effect of dissimilarly acting binary mixtures of TGFDs [TPO inhibitors and sodium-iodide symporter (NIS) inhibitors]. However, the CA model provided better predictions of joint effects than RA in five out of the six tested mixtures. The exception was the mixture MMI (TPO inhibitor) + KClO4 (NIS inhibitor) dosed at a fixed ratio of EC10, which provided similar CA and RA predictions, so it was difficult to reach a conclusive result. These results support the phenomenological similarity criterion stating that the concept of concentration addition can be extended to mixture constituents having common apical endpoints or common adverse outcomes. - Highlights: • Potential synergic or additive effects of mixtures of chemicals on thyroid function. • Zebrafish as an alternative model for testing the effect of mixtures of goitrogens. • Concentration addition seems to better predict the effect of mixtures of goitrogens.
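
    A worked sketch of the two reference models compared in this study, using hypothetical log-logistic concentration-response curves: concentration addition (CA) finds the effect level at which the toxic-unit sum equals one, while response addition (RA) combines the individual effects as independent probabilities. The EC50 and slope values are invented for illustration.

        import numpy as np
        from scipy.optimize import brentq

        # Hypothetical log-logistic concentration-response curves for two goitrogens
        def effect(c, ec50, slope):
            return 1.0 / (1.0 + (ec50 / c) ** slope) if c > 0 else 0.0

        chems = [dict(ec50=10.0, slope=2.0),   # e.g. a TPO inhibitor (illustrative)
                 dict(ec50=50.0, slope=1.5)]   # e.g. a NIS inhibitor (illustrative)
        conc = [4.0, 20.0]                     # mixture concentrations of each chemical

        # Response addition (independent action): E_mix = 1 - prod(1 - E_i)
        ra = 1.0 - np.prod([1.0 - effect(c, **p) for c, p in zip(conc, chems)])

        # Concentration addition: find E such that sum_i c_i / EC_E,i = 1,
        # where EC_E,i is the concentration of chemical i alone producing effect E.
        def ec(E, ec50, slope):                # inverse of the log-logistic curve
            return ec50 * (E / (1.0 - E)) ** (1.0 / slope)

        ca = brentq(lambda E: sum(c / ec(E, **p) for c, p in zip(conc, chems)) - 1.0,
                    1e-9, 1 - 1e-9)
        print("CA-predicted effect:", round(ca, 3), " RA-predicted effect:", round(ra, 3))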

  17. Dissecting psychiatric spectrum disorders by generative embedding

    PubMed Central

    Brodersen, Kay H.; Deserno, Lorenz; Schlagenhauf, Florian; Lin, Zhihao; Penny, Will D.; Buhmann, Joachim M.; Stephan, Klaas E.

    2013-01-01

    This proof-of-concept study examines the feasibility of defining subgroups in psychiatric spectrum disorders by generative embedding, using dynamical system models which infer neuronal circuit mechanisms from neuroimaging data. To this end, we re-analysed an fMRI dataset of 41 patients diagnosed with schizophrenia and 42 healthy controls performing a numerical n-back working-memory task. In our generative-embedding approach, we used parameter estimates from a dynamic causal model (DCM) of a visual–parietal–prefrontal network to define a model-based feature space for the subsequent application of supervised and unsupervised learning techniques. First, using a linear support vector machine for classification, we were able to predict individual diagnostic labels significantly more accurately (78%) from DCM-based effective connectivity estimates than from functional connectivity between (62%) or local activity within the same regions (55%). Second, an unsupervised approach based on variational Bayesian Gaussian mixture modelling provided evidence for two clusters which mapped onto patients and controls with nearly the same accuracy (71%) as the supervised approach. Finally, when restricting the analysis only to the patients, Gaussian mixture modelling suggested the existence of three patient subgroups, each of which was characterised by a different architecture of the visual–parietal–prefrontal working-memory network. Critically, even though this analysis did not have access to information about the patients' clinical symptoms, the three neurophysiologically defined subgroups mapped onto three clinically distinct subgroups, distinguished by significant differences in negative symptom severity, as assessed on the Positive and Negative Syndrome Scale (PANSS). In summary, this study provides a concrete example of how psychiatric spectrum diseases may be split into subgroups that are defined in terms of neurophysiological mechanisms specified by a generative model of network dynamics such as DCM. The results corroborate our previous findings in stroke patients that generative embedding, compared to analyses of more conventional measures such as functional connectivity or regional activity, can significantly enhance both the interpretability and performance of computational approaches to clinical classification. PMID:24363992
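
    A rough sketch of the generative-embedding workflow is given below, using scikit-learn in place of the authors' variational machinery: model-based parameter estimates serve as features for a linear SVM and for a variational Bayesian Gaussian mixture. The random feature matrix is a placeholder standing in for real DCM connectivity estimates; only the group sizes follow the study.

      import numpy as np
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC
      from sklearn.mixture import BayesianGaussianMixture

      rng = np.random.default_rng(0)
      n_subjects, n_dcm_params = 83, 12
      X = rng.normal(size=(n_subjects, n_dcm_params))   # placeholder connectivity estimates
      y = np.array([0] * 42 + [1] * 41)                 # 0 = control, 1 = patient (group sizes as in the study)

      # Supervised step: linear SVM with cross-validated accuracy on the model-based features.
      clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
      acc = cross_val_score(clf, X, y, cv=5).mean()
      print(f"cross-validated accuracy on toy features: {acc:.2f}")

      # Unsupervised step: variational Bayesian Gaussian mixture to look for patient
      # subgroups without using the diagnostic labels.
      gmm = BayesianGaussianMixture(n_components=5, weight_concentration_prior=0.1,
                                    covariance_type="full", random_state=0)
      labels = gmm.fit_predict(X[y == 1])               # cluster only the patients
      print("effective number of patient subgroups:", int(np.sum(gmm.weights_ > 0.05)))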

  18. Bayesian spatiotemporal crash frequency models with mixture components for space-time interactions.

    PubMed

    Cheng, Wen; Gill, Gurdiljot Singh; Zhang, Yongping; Cao, Zhong

    2018-03-01

    Traffic safety research has developed spatiotemporal models to explore variations in the spatial pattern of crash risk over time. Many studies have observed notable benefits from including spatial and temporal correlation and their interactions. However, the safety literature lacks sufficient research comparing different temporal treatments and their interaction with the spatial component. This study developed four spatiotemporal models of varying complexity resulting from different temporal treatments: (I) a linear time trend; (II) a quadratic time trend; (III) an autoregressive-1 (AR-1) structure; and (IV) time adjacency. Moreover, the study introduced a flexible two-component mixture for the space-time interaction, which allows greater flexibility than the traditional linear space-time interaction. The mixture component accommodates the global space-time interaction as well as departures from the overall spatial and temporal risk patterns. The study performed a comprehensive assessment of the mixture models based on diverse criteria pertaining to goodness-of-fit, cross-validation, and in-sample evaluation of the predictive accuracy of crash estimates. The assessment of model performance in terms of goodness-of-fit clearly established the superiority of the time-adjacency specification, which was more complex because it borrows information from neighbouring years, but this addition of parameters yielded a significant advantage in posterior deviance and thus benefited the overall fit to the crash data. Base models were also developed to compare the proposed mixture and traditional space-time components for each temporal model. The mixture models consistently outperformed the corresponding Base models, with much lower deviance. For the cross-validation comparison of predictive accuracy, the linear time trend model was judged best, recording the highest log pseudo marginal likelihood (LPML). Four other evaluation criteria were considered for conventional validation using the same data employed for model development. Under each criterion, observed crash counts were compared with three types of estimates: Bayesian estimated, normal predicted, and model-replicated counts. The linear model again performed best in most scenarios, except in one case using model-replicated data and two cases involving prediction without random effects. These findings indicate the mediocre performance of the linear trend when random effects are excluded from evaluation, which may be due to the flexible mixture space-time interaction efficiently absorbing residual variability that escapes the predictable part of the model. The comparison of Base and mixture models in terms of prediction accuracy further supported the superiority of the mixture models, which generated more precise estimated crash counts across all four temporal specifications, suggesting that the advantages of the mixture component in model fit carry over to prediction accuracy. Finally, the residual analysis demonstrated the consistently superior performance of the random effect models, validating the importance of incorporating correlation structures to account for unobserved heterogeneity.
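
    The LPML criterion used for the cross-validation comparison can be sketched directly from posterior log-likelihood draws, since each conditional predictive ordinate (CPO) is the harmonic mean of the per-observation likelihoods over posterior samples and LPML is the sum of the log CPOs. The array of log-likelihood values below is simulated; in practice it would come from a fitted crash-frequency model.

      import numpy as np

      def lpml(loglik):
          """loglik: array (n_posterior_samples, n_sites) of log-likelihood values.
          Returns LPML = sum_i log CPO_i, with CPO_i the harmonic mean of the
          per-observation likelihoods, computed in a numerically stable way."""
          neg = -loglik
          m = neg.max(axis=0)
          log_inv_cpo = m + np.log(np.mean(np.exp(neg - m), axis=0))  # log of mean(1/L_i)
          return np.sum(-log_inv_cpo)

      # Hypothetical usage with simulated log-likelihood draws:
      rng = np.random.default_rng(1)
      loglik = rng.normal(loc=-2.0, scale=0.3, size=(4000, 250))
      print("LPML:", lpml(loglik))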

  19. Trapping of Li(+) Ions by [ThFn](4-n) Clusters Leading to Oscillating Maxwell-Stefan Diffusivity in the Molten Salt LiF-ThF4.

    PubMed

    Chakraborty, Brahmananda; Kidwai, Sharif; Ramaniah, Lavanya M

    2016-08-18

    A molten salt mixture of lithium fluoride and thorium fluoride (LiF-ThF4) serves as both fuel and coolant in advanced molten salt reactors (MSRs). Here, we report for the first time the dynamic correlations, Onsager coefficients, Maxwell-Stefan (MS) diffusivities, and concentration dependence of the density and enthalpy of the molten salt mixture LiF-ThF4 at 1200 K over the composition range 2-45% ThF4, and also at the eutectic composition over the temperature range 1123-1600 K, using the Green-Kubo formalism and equilibrium molecular dynamics simulations. We observe an interesting oscillating pattern in the MS diffusivity of the cation-cation pair, in which ĐLi-Th oscillates between positive and negative values, with the amplitude of the oscillation decreasing as the system becomes richer in ThF4. Through the velocity autocorrelation function, the vibrational density of states, radial distribution function analysis, and structural snapshots, we establish an interplay between the local structure and the multicomponent dynamics and predict that the formation of negatively charged [ThFn](4-n) clusters at higher ThF4 mole % makes positively charged Li(+) ions oscillate between different clusters, with their range of motion shrinking as the number of [ThFn](4-n) clusters increases, until the Li(+) ions become almost trapped at higher ThF4% when the electrostatic forces exerted on Li(+) by the surrounding clusters balance. Although reports on the variation of density and enthalpy with temperature exist in the literature, we report for the first time the variation of the density and enthalpy of LiF-ThF4 with ThF4 concentration (mole %) and fit them with a square-root function of ThF4 concentration; the fitted formulas should be useful to experimentalists for obtaining data over a range of concentrations for design purposes. The formation of [ThFn](4-n) clusters and the reduction in ion diffusivity at higher ThF4% may limit the percentage of ThF4 that can be used in an MSR to optimize the neutron economy.
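
    As a minimal illustration of the Green-Kubo route used in the study, the sketch below estimates a self-diffusion coefficient from a velocity autocorrelation function; the synthetic velocity array stands in for real MD output, and the full Onsager/Maxwell-Stefan analysis of the paper is not reproduced.

      import numpy as np

      def self_diffusivity(velocities, dt):
          """velocities: array (n_frames, n_atoms, 3).
          Returns D = (1/3) * integral over time of the velocity autocorrelation function."""
          n_frames = velocities.shape[0]
          max_lag = n_frames // 2
          vacf = np.empty(max_lag)
          for lag in range(max_lag):
              dots = np.sum(velocities[: n_frames - lag] * velocities[lag:], axis=-1)
              vacf[lag] = dots.mean()              # average over time origins and atoms
          return np.trapz(vacf, dx=dt) / 3.0

      rng = np.random.default_rng(2)
      v = rng.normal(scale=0.1, size=(1000, 32, 3))  # toy, uncorrelated "velocities"
      print("D (arbitrary units):", self_diffusivity(v, dt=1.0e-3))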

  20. Classification Comparisons Between Compact Polarimetric and Quad-Pol SAR Imagery

    NASA Astrophysics Data System (ADS)

    Souissi, Boularbah; Doulgeris, Anthony P.; Eltoft, Torbjørn

    2015-04-01

    Recent interest in dual-pol SAR systems has led to a novel approach, the so-called compact polarimetric (CP) imaging mode, which attempts to reconstruct fully polarimetric information based on a few simple assumptions. In this work, the CP image is simulated from the full quad-pol (QP) image. We present an initial comparison of the polarimetric information content of the QP and CP imaging modes. The analysis of multi-look polarimetric covariance matrix data uses an automated statistical clustering method based on the expectation maximization (EM) algorithm for finite mixture modelling with the complex Wishart probability density function. Our results show that the QP and CP modes exhibit somewhat different characteristics. The classification is demonstrated using E-SAR and Radarsat-2 polarimetric SAR images acquired over DLR Oberpfaffenhofen in Germany and Algiers in Algeria, respectively.
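
    A simplified, hard-assignment sketch of Wishart-based clustering of polarimetric covariance matrices is given below: it alternates assignment by the Wishart distance d(Z, Sigma) = ln|Sigma| + tr(Sigma^-1 Z) with re-estimation of the cluster centres, a classification-EM style surrogate for the full finite-mixture EM used in the paper. The 3x3 complex covariances are synthetic toy data, not SAR measurements.

      import numpy as np

      def wishart_distance(Z, sigma):
          _, logdet = np.linalg.slogdet(sigma)
          return logdet + np.trace(np.linalg.solve(sigma, Z)).real

      def cluster(covs, n_classes, n_iter=10, seed=0):
          rng = np.random.default_rng(seed)
          centres = [covs[i] for i in rng.choice(len(covs), n_classes, replace=False)]
          labels = np.zeros(len(covs), dtype=int)
          for _ in range(n_iter):
              for i, Z in enumerate(covs):                       # assign to nearest centre
                  labels[i] = int(np.argmin([wishart_distance(Z, S) for S in centres]))
              for k in range(n_classes):                         # update centres as class means
                  members = covs[labels == k]
                  if len(members) > 0:
                      centres[k] = members.mean(axis=0)
          return labels, centres

      # Toy data: Hermitian positive-definite 3x3 matrices drawn around two intensity levels.
      rng = np.random.default_rng(3)
      def random_cov(scale):
          A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
          return scale * (A @ A.conj().T) / 3.0
      covs = np.array([random_cov(1.0) for _ in range(50)] + [random_cov(5.0) for _ in range(50)])
      labels, _ = cluster(covs, n_classes=2)
      print("cluster sizes:", np.bincount(labels))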
