Sample records for techniques including clustering

  1. GalWeight: A New and Effective Weighting Technique for Determining Galaxy Cluster and Group Membership

    NASA Astrophysics Data System (ADS)

    Abdullah, Mohamed H.; Wilson, Gillian; Klypin, Anatoly

    2018-07-01

    We introduce GalWeight, a new technique for assigning galaxy cluster membership. This technique is specifically designed to simultaneously maximize the number of bona fide cluster members while minimizing the number of contaminating interlopers. The GalWeight technique can be applied to both massive galaxy clusters and poor galaxy groups. Moreover, it is effective in identifying members in both the virial and infall regions with high efficiency. We apply the GalWeight technique to MDPL2 and Bolshoi N-body simulations, and find that it is >98% accurate in correctly assigning cluster membership. We show that GalWeight compares very favorably against four well-known existing cluster membership techniques (shifting gapper, den Hartog, caustic, SIM). We also apply the GalWeight technique to a sample of 12 Abell clusters (including the Coma cluster) using observations from the Sloan Digital Sky Survey. We conclude by discussing GalWeight’s potential for other astrophysical applications.

  2. Towards Effective Clustering Techniques for the Analysis of Electric Power Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogan, Emilie A.; Cotilla Sanchez, Jose E.; Halappanavar, Mahantesh

    2013-11-30

    Clustering is an important data analysis technique with numerous applications in the analysis of electric power grids. Standard clustering techniques are oblivious to the rich structural and dynamic information available for power grids. Therefore, by exploiting the inherent topological and electrical structure in the power grid data, we propose new methods for clustering with applications to model reduction, locational marginal pricing, phasor measurement unit (PMU or synchrophasor) placement, and power system protection. We focus our attention on model reduction for analysis based on time-series information from synchrophasor measurement devices, and spectral techniques for clustering. By comparing different clustering techniques on two instances of realistic power grids we show that the solutions are related and therefore one could leverage that relationship for a computational advantage. Thus, by contrasting different clustering techniques we make a case for exploiting structure inherent in the data with implications for several domains including power systems.

  3. Clustering cancer gene expression data by projective clustering ensemble

    PubMed Central

    Yu, Xianxue; Yu, Guoxian

    2017-01-01

    Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool to analyze gene expression data. Gene expression data is often characterized by a large number of genes but limited samples, thus various projective clustering techniques and ensemble techniques have been suggested to combat these challenges. However, it is rather challenging to combine these two kinds of techniques to avoid the curse of dimensionality problem and to boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) compared with other related techniques, including dimensionality reduction based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data. PMID:28234920

  4. Assessment and application of clustering techniques to atmospheric particle number size distribution for the purpose of source apportionment

    NASA Astrophysics Data System (ADS)

    Salimi, F.; Ristovski, Z.; Mazaheri, M.; Laiman, R.; Crilley, L. R.; He, C.; Clifford, S.; Morawska, L.

    2014-06-01

    Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods which have been recently employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K-means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of four clustering techniques were compared using the Dunn index and silhouette width validation values and the K-means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K-means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source and the GAM was suitable to parameterise the PNSD data. These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.
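
    As an illustration of one of the validation measures named in this abstract, the following is a minimal sketch of the Dunn index in its common textbook form (smallest distance between points in different clusters divided by the largest within-cluster diameter). The function name `dunn_index` and the toy points are assumptions made for this example; the record does not say which variant or software the authors used.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn index for a partition given as a list of clusters (lists of points).

    Higher values indicate compact, well-separated clusters.
    """
    # Largest distance between two points of the same cluster (diameter).
    max_diameter = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster is a single point; index undefined")
    # Smallest distance between two points of different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter


# Hypothetical toy data: the same four points partitioned two ways.
tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 1.0), (10.0, 1.0)]]
print(dunn_index(tight), dunn_index(loose))
```

    Comparing the index across candidate partitions (for example, the output of different algorithms or different cluster counts) is one way such a measure can be used to rank them; the partition with the larger value is the better separated one.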

  5. Assessment and application of clustering techniques to atmospheric particle number size distribution for the purpose of source apportionment

    NASA Astrophysics Data System (ADS)

    Salimi, F.; Ristovski, Z.; Mazaheri, M.; Laiman, R.; Crilley, L. R.; He, C.; Clifford, S.; Morawska, L.

    2014-11-01

    Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods that have been recently employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of four clustering techniques were compared using the Dunn index and Silhouette width validation values and the K means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source and the GAM was suitable to parameterise the PNSD data. These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.

  6. Unsupervised color image segmentation using a lattice algebra clustering technique

    NASA Astrophysics Data System (ADS)

    Urcid, Gonzalo; Ritter, Gerhard X.

    2011-08-01

    In this paper we introduce a lattice algebra clustering technique for segmenting digital images in the Red-Green-Blue (RGB) color space. The proposed technique is a two-step procedure. Given an input color image, the first step determines the finite set of its extreme pixel vectors within the color cube by means of the scaled min-W and max-M lattice auto-associative memory matrices, including the minimum and maximum vector bounds. In the second step, maximal rectangular boxes enclosing each extreme color pixel are found using the Chebychev distance between color pixels; afterwards, clustering is performed by assigning each image pixel to its corresponding maximal box. The two steps in our proposed method are completely unsupervised or autonomous. Illustrative examples are provided to demonstrate the color segmentation results including a brief numerical comparison with two other non-maximal variations of the same clustering technique.

  7. Determining the Optimal Number of Clusters with the Clustergram

    NASA Technical Reports Server (NTRS)

    Fluegemann, Joseph K.; Davies, Misty D.; Aguirre, Nathan D.

    2011-01-01

    Cluster analysis aids research in many different fields, from business to biology to aerospace. It consists of using statistical techniques to group objects in large sets of data into meaningful classes. However, this process of ordering data points presents much uncertainty because it involves several steps, many of which are subject to researcher judgment as well as inconsistencies depending on the specific data type and research goals. These steps include the method used to cluster the data, the variables on which the cluster analysis will be operating, the number of resulting clusters, and parts of the interpretation process. In most cases, the number of clusters must be guessed or estimated before employing the clustering method. Many remedies have been proposed, but none is unassailable and certainly not for all data types. Thus, the aim of current research for better techniques of determining the number of clusters is generally confined to demonstrating that the new technique excels other methods in performance for several disparate data types. Our research makes use of a new cluster-number-determination technique based on the clustergram: a graph that shows how the number of objects in the cluster and the cluster mean (the ordinate) change with the number of clusters (the abscissa). We use the features of the clustergram to make the best determination of the cluster-number.

  8. LENR BEC Clusters on and below Wires through Cavitation and Related Techniques

    NASA Astrophysics Data System (ADS)

    Stringham, Roger; Stringham, Julie

    2011-03-01

    During the last two years I have been working on BEC cluster densities deposited just under the surface of wires, using cavitation, and other techniques. If I get the concentration high enough before the clusters dissipate, in addition to cold fusion related excess heat (and other effects, including helium-4 formation) I anticipate that it may be possible to initiate transient forms of superconductivity at room temperature.

  9. Cluster analysis and subgrouping to investigate inter-individual variability to non-invasive brain stimulation: a systematic review.

    PubMed

    Pellegrini, Michael; Zoghi, Maryam; Jaberzadeh, Shapour

    2018-01-12

    Cluster analysis and other subgrouping techniques have risen in popularity in recent years in non-invasive brain stimulation research in the attempt to investigate the issue of inter-individual variability - the issue of why some individuals respond, as traditionally expected, to non-invasive brain stimulation protocols and others do not. Cluster analysis and subgrouping techniques have been used to categorise individuals, based on their response patterns, as responders or non-responders. There is, however, a lack of consensus and consistency on the most appropriate technique to use. This systematic review aimed to provide a systematic summary of the cluster analysis and subgrouping techniques used to date and suggest recommendations moving forward. Twenty studies were included that utilised subgrouping techniques, while seven of these additionally utilised cluster analysis techniques. The results of this systematic review appear to indicate that statistical cluster analysis techniques are effective in identifying subgroups of individuals based on response patterns to non-invasive brain stimulation. This systematic review also reports a lack of consensus amongst researchers on the most effective subgrouping technique and the criteria used to determine whether an individual is categorised as a responder or a non-responder. This systematic review provides a step-by-step guide to carrying out statistical cluster analyses and subgrouping techniques to provide a framework for analysis when developing further insights into the contributing factors of inter-individual variability in response to non-invasive brain stimulation.

  10. Key-Node-Separated Graph Clustering and Layouts for Human Relationship Graph Visualization.

    PubMed

    Itoh, Takayuki; Klein, Karsten

    2015-01-01

    Many graph-drawing methods apply node-clustering techniques based on the density of edges to find tightly connected subgraphs and then hierarchically visualize the clustered graphs. However, users may want to focus on important nodes and their connections to groups of other nodes for some applications. For this purpose, it is effective to separately visualize the key nodes detected based on adjacency and attributes of the nodes. This article presents a graph visualization technique for attribute-embedded graphs that applies a graph-clustering algorithm that accounts for the combination of connections and attributes. The graph clustering step divides the nodes according to the commonality of connected nodes and similarity of feature value vectors. It then calculates the distances between arbitrary pairs of clusters according to the number of connecting edges and the similarity of feature value vectors and finally places the clusters based on the distances. Consequently, the technique separates important nodes that have connections to multiple large clusters and improves the visibility of such nodes' connections. To test this technique, this article presents examples with human relationship graph datasets, including a coauthorship and Twitter communication network dataset.

  11. Chemodynamical Clustering Applied to APOGEE Data: Rediscovering Globular Clusters

    NASA Astrophysics Data System (ADS)

    Chen, Boquan; D’Onghia, Elena; Pardy, Stephen A.; Pasquali, Anna; Bertelli Motta, Clio; Hanlon, Bret; Grebel, Eva K.

    2018-06-01

    We have developed a novel technique based on a clustering algorithm that searches for kinematically and chemically clustered stars in the APOGEE DR12 Cannon data. As compared to classical chemical tagging, the kinematic information included in our methodology allows us to identify stars that are members of known globular clusters with greater confidence. We apply our algorithm to the entire APOGEE catalog of 150,615 stars whose chemical abundances are derived by the Cannon. Our methodology found anticorrelations between the elements Al and Mg, Na and O, and C and N previously identified in the optical spectra in globular clusters, even though we omit these elements in our algorithm. Our algorithm identifies globular clusters without a priori knowledge of their locations in the sky. Thus, not only does this technique promise to discover new globular clusters, but it also allows us to identify candidate streams of kinematically and chemically clustered stars in the Milky Way.

  12. Cluster Analysis in Sociometric Research: A Pattern-Oriented Approach to Identifying Temporally Stable Peer Status Groups of Girls

    ERIC Educational Resources Information Center

    Zettergren, Peter

    2007-01-01

    A modern clustering technique was applied to age-10 and age-13 sociometric data with the purpose of identifying longitudinally stable peer status clusters. The study included 445 girls from a Swedish longitudinal study. The identified temporally stable clusters of rejected, popular, and average girls were essentially larger than corresponding…

  13. Cluster analysis based on dimensional information with applications to feature selection and classification

    NASA Technical Reports Server (NTRS)

    Eigen, D. J.; Fromm, F. R.; Northouse, R. A.

    1974-01-01

    A new clustering algorithm is presented that is based on dimensional information. The algorithm includes an inherent feature selection criterion, which is discussed. Further, a heuristic method for choosing the proper number of intervals for a frequency distribution histogram, a feature necessary for the algorithm, is presented. The algorithm, although usable as a stand-alone clustering technique, is then utilized as a global approximator. Local clustering techniques and configuration of a global-local scheme are discussed, and finally the complete global-local and feature selector configuration is shown in application to a real-time adaptive classification scheme for the analysis of remote sensed multispectral scanner data.

  14. The Measurement of Sulfur Oxidation Products and Their Role in Homogeneous Nucleation

    NASA Technical Reports Server (NTRS)

    Eisele, F. L.

    1999-01-01

    An improved version of a transverse ion source was developed which uses selected ion chemical ionization mass spectrometry techniques inside of a particle nucleation flow tube. These new techniques are unique, in that the chemical ionization is done inside of the flow tube rather than by having to remove the compounds and clusters of interest, which are lost on first contact with any surfaces. The transverse source is also unique because it allows the ion reaction time to be varied over more than an order of magnitude, which in turn makes possible the separation of ion induced cluster growth from the charging of preexisting molecular clusters. As a result of combining these unique capabilities, the first ever measurements of prenucleation molecular clusters were performed. These clusters are the intermediate stage of growth in the gas-to-particle conversion process. This new technique provides a means of observing clusters containing 2, 3, 4, ... and up to about 8 sulfuric acid molecules, where the critical cluster size under these measurement conditions was about 4 or 5. Thus, the nucleation process can now be directly observed and even growth beyond the critical cluster size can be investigated. The details of this investigation are discussed in a recently submitted paper, which is included as Appendix A. Measurements of the diffusion coefficient of sulfuric acid and sulfuric acid clustered with a water molecule have also been performed. The measurements are also discussed in more detail in another recently submitted paper which is included as Appendix B. The empirical results discussed in both of these papers provide a critical test of present nucleation theories. They also provide new hope for resolving many of the huge discrepancies between field observation and model prediction of particle nucleation. The second part of the research conducted under this project was directed towards the development of new chemical ionization techniques for measuring sulfur oxidation products.

  15. The JCMT Gould Belt Survey: Dense Core Clusters in Orion B

    NASA Astrophysics Data System (ADS)

    Kirk, H.; Johnstone, D.; Di Francesco, J.; Lane, J.; Buckle, J.; Berry, D. S.; Broekhoven-Fiene, H.; Currie, M. J.; Fich, M.; Hatchell, J.; Jenness, T.; Mottram, J. C.; Nutter, D.; Pattle, K.; Pineda, J. E.; Quinn, C.; Salji, C.; Tisi, S.; Hogerheijde, M. R.; Ward-Thompson, D.; The JCMT Gould Belt Survey Team

    2016-04-01

    The James Clerk Maxwell Telescope Gould Belt Legacy Survey obtained SCUBA-2 observations of dense cores within three sub-regions of Orion B: LDN 1622, NGC 2023/2024, and NGC 2068/2071, all of which contain clusters of cores. We present an analysis of the clustering properties of these cores, including the two-point correlation function and Cartwright’s Q parameter. We identify individual clusters of dense cores across all three regions using a minimal spanning tree technique, and find that in each cluster, the most massive cores tend to be centrally located. We also apply the independent M-Σ technique and find a strong correlation between core mass and the local surface density of cores. These two lines of evidence jointly suggest that some amount of mass segregation in clusters has happened already at the dense core stage.
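
    The minimal spanning tree approach mentioned in this abstract can be sketched in general terms as follows: build the tree over the core positions, remove branches longer than a cutoff, and treat each remaining connected group as a cluster. This is only a generic illustration; the cutoff parameter `max_branch`, the function names, and the toy coordinates are assumptions for the example, and the record does not state how the authors chose their branch length or implemented the method.

```python
from math import dist


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (length, i, j) tree edges between point indices.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the node that is currently closest to the growing tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_clusters(points, max_branch):
    """Keep only MST branches no longer than max_branch; return index groups."""
    root = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for length, i, j in mst_edges(points):
        if length <= max_branch:
            root[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical positions (arbitrary units), not survey data.
cores = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (30, 30)]
print(mst_clusters(cores, max_branch=2.0))
```

    With the toy positions above, the three nearby points form one group, the pair near (10, 10) forms a second, and the isolated point is left on its own.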

  16. Clustering of financial time series with application to index and enhanced index tracking portfolio

    NASA Astrophysics Data System (ADS)

    Dose, Christian; Cincotti, Silvano

    2005-09-01

    A stochastic-optimization technique based on time series cluster analysis is described for index tracking and enhanced index tracking problems. Our methodology solves the problem in two steps, i.e., by first selecting a subset of stocks and then setting the weight of each stock as a result of an optimization process (asset allocation). The present formulation takes into account constraints on the number of stocks and on the fraction of capital invested in each of them, whilst not including transaction costs. Computational results based on clustering selection are compared to those of random techniques and show the importance of clustering in noise reduction and robust forecasting applications, in particular for enhanced index tracking.

  17. The Effect of Cluster-Based Instruction on Mathematic Achievement in Inclusive Schools

    ERIC Educational Resources Information Center

    Gunarhadi, Sunardi; Anwar, Mohammad; Andayani, Tri Rejeki; Shaari, Abdull Sukor

    2016-01-01

    The research aimed to investigate the effect of Cluster-Based Instruction (CBI) on the academic achievement of Mathematics in inclusive schools. The sample was 68 students in two intact classes, including those with learning disabilities, selected using a cluster random technique among 17 inclusive schools in the regency of Surakarta. The two…

  18. Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2008-01-01

    Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e., random noise) are included in the model. Furthermore, the distribution of the random noise greatly impacts the…

  19. Factor Analysis and Counseling Research

    ERIC Educational Resources Information Center

    Weiss, David J.

    1970-01-01

    Topics discussed include factor analysis versus cluster analysis, analysis of Q correlation matrices, ipsativity and factor analysis, and tests for the significance of a correlation matrix prior to application of factor analytic techniques. Techniques for factor extraction discussed include principal components, canonical factor analysis, alpha…

  20. Mathematical description and program documentation for CLASSY, an adaptive maximum likelihood clustering method

    NASA Technical Reports Server (NTRS)

    Lennington, R. K.; Rassbach, M. E.

    1979-01-01

    Discussed in this report is the clustering algorithm CLASSY, including detailed descriptions of its general structure and mathematical background and of the various major subroutines. The report provides a development of the logic and equations used with specific reference to program variables. Some comments on timing and proposed optimization techniques are included.

  1. A data mining approach to dinoflagellate clustering according to sterol composition: Correlations with evolutionary history.

    USDA-ARS's Scientific Manuscript database

    This study examined the sterol compositions of 102 dinoflagellates (including several previously unexamined species) using clustering techniques as a means of determining the relatedness of the organisms. In addition, dinoflagellate sterol-based relationships were compared statistically to dinoflag...

  2. Application of Artificial Intelligence For Euler Solutions Clustering

    NASA Astrophysics Data System (ADS)

    Mikhailov, V.; Galdeano, A.; Diament, M.; Gvishiani, A.; Agayan, S.; Bogoutdinov, Sh.; Graeva, E.; Sailhac, P.

    Results of Euler deconvolution strongly depend on the selection of viable solutions. Synthetic calculations using multiple causative sources show that Euler solutions cluster in the vicinity of causative bodies even when they do not group densely about the perimeter of the bodies. We have developed a clustering technique to serve as a tool for selecting appropriate solutions. The method RODIN, employed in this study, is based on artificial intelligence and was originally designed for problems of classification of large data sets. It is based on a geometrical approach to study object concentration in a finite metric space of any dimension. The method uses a formal definition of cluster and includes free parameters that facilitate the search for clusters of given properties. Tests on synthetic and real data showed that the clustering technique successfully outlines causative bodies more accurately than other methods of discriminating Euler solutions. In complicated field cases such as the magnetic field in the Gulf of Saint Malo region (Brittany, France), the method provides geologically insightful solutions. Other advantages of applying the clustering method are:
    - Clusters provide solutions associated with particular bodies or parts of bodies, permitting the analysis of different clusters of Euler solutions separately. This may allow computation of average parameters for individual causative bodies.
    - Those measurements of the anomalous field that yield clusters also form dense clusters themselves. The application of the clustering technique thus outlines areas where the influence of different causative sources is more prominent. This allows one to focus on areas for reinterpretation, using different window sizes, structural indices and so on.

  3. Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles.

    PubMed

    Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F; Perez, Danny

    2017-10-21

    Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.

  4. Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles

    NASA Astrophysics Data System (ADS)

    Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F.; Perez, Danny

    2017-10-01

    Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.

  5. The composite sequential clustering technique for analysis of multispectral scanner data

    NASA Technical Reports Server (NTRS)

    Su, M. Y.

    1972-01-01

    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.
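
    The two-stage structure described in this abstract (a sequential pass that produces initial clusters, followed by iterative K-means refinement) can be sketched as below. This is a loose illustration only: the record does not give the details of the sequential variance analysis or of the generalized K-means, so the first stage here is replaced by a simple distance-threshold pass and the second by ordinary K-means updates. The names `sequential_seeds`, `kmeans_refine`, `radius`, and the toy data are all assumptions made for the example.

```python
from math import dist


def sequential_seeds(points, radius):
    """Stage 1 stand-in: one pass over the data in order.

    A point becomes a new seed when it lies farther than `radius`
    from every seed found so far.
    """
    seeds = []
    for p in points:
        if all(dist(p, s) > radius for s in seeds):
            seeds.append(p)
    return seeds


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: dist(p, centers[i]))


def kmeans_refine(points, centers, iterations=20):
    """Stage 2: ordinary K-means updates starting from the stage 1 seeds."""
    centers = list(centers)
    for _ in range(iterations):
        members = [[] for _ in centers]
        for p in points:
            members[nearest(p, centers)].append(p)
        # New center is the mean of its members; an empty cluster keeps its center.
        updated = [
            tuple(sum(x) / len(m) for x in zip(*m)) if m else c
            for m, c in zip(members, centers)
        ]
        if updated == centers:
            break  # assignments have stabilised
        centers = updated
    labels = [nearest(p, centers) for p in points]
    return centers, labels


# Hypothetical two-band measurements, not real scanner data.
pixels = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0)]
seeds = sequential_seeds(pixels, radius=5.0)
centers, labels = kmeans_refine(pixels, seeds)
print(seeds, labels)
```

    The point of the composite design is that the first pass fixes both the number of clusters and their starting positions, so the iterative stage does not depend on a random initialisation.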

  6. Technique for fast and efficient hierarchical clustering

    DOEpatents

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.

  7. Globular Cluster Abundances from High-resolution, Integrated-light Spectroscopy. II. Expanding the Metallicity Range for Old Clusters and Updated Analysis Techniques

    NASA Astrophysics Data System (ADS)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew

    2017-01-01

    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = -0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na I, Mg I, Al I, Si I, Ca I, Ti I, Ti II, Sc II, V I, Cr I, Mn I, Co I, Ni I, Cu I, Y II, Zr I, Ba II, La II, Nd II, and Eu II. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe I, Ca I, Si I, Ni I, and Ba II. The elements that show the greatest differences include Mg I and Zr I. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements. This paper includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile.

  8. Observing Globular Cluster RR Lyrae Variables with the BYU West Mountain Observatory

    NASA Astrophysics Data System (ADS)

    Jeffery, E. J.; Joner, M. D.

    2016-06-01

    We have utilized the 0.9-meter telescope of the Brigham Young University West Mountain Observatory to secure data on six northern hemisphere globular clusters. Here we present representative observations of RR Lyrae stars located in these clusters, including light curves. We compare light curves produced using both DAOPHOT and ISIS software packages. Light curve fitting is done with FITLC. We find that for well-separated stars, DAOPHOT and ISIS provide comparable results. However, for stars within the cluster core, ISIS provides superior results. These improved techniques will allow us to better measure the properties of cluster variable stars.

  9. Bottom-up strategies for the assembling of magnetic systems using nanoclusters

    NASA Astrophysics Data System (ADS)

    Dupuis, V.; Hillion, A.; Robert, A.; Loiselet, O.; Khadra, G.; Capiod, P.; Albin, C.; Boisron, O.; Le Roy, D.; Bardotti, L.; Tournus, F.; Tamion, A.

    2018-05-01

    In the frame of the 20th Anniversary of the Journal of Nanoparticle Research (JNR), our aim is to start from the historical context 20 years ago and to give some recent results and perspectives concerning nanomagnets prepared from clusters preformed in the gas phase using the low-energy cluster beam deposition (LECBD) technique. In this paper, we focus our attention on the typical case of Co clusters embedded in various matrices to study interface magnetic anisotropy and magnetic interactions as a function of volume concentrations, and on still current and perspectives through two examples of binary metallic 3d-5d TM (namely CoPt and FeAu) cluster assemblies to illustrate size-related and nanoalloy phenomena on magnetic properties in well-defined mass-selected clusters. The structural and magnetic properties of these cluster assemblies were investigated using various experimental techniques that include high-resolution transmission electron microscopy (HRTEM), superconducting quantum interference device (SQUID) magnetometry, and synchrotron techniques such as extended X-ray absorption fine structure (EXAFS) and X-ray magnetic circular dichroism (XMCD). Depending on the chemical nature of both NPs and matrix, we observe different magnetic responses compared to their bulk counterparts. In particular, we show how finite size effects (size reduction) enhance their magnetic moment and how specific relaxation in nanoalloys can impact their magnetic anisotropy.

  10. An unsupervised classification technique for multispectral remote sensing data.

    NASA Technical Reports Server (NTRS)

    Su, M. Y.; Cummings, R. E.

    1973-01-01

    Description of a two-part clustering technique consisting of (a) a sequential statistical clustering, which is essentially a sequential variance analysis, and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum-likelihood classification techniques.

  11. Estimating the concrete compressive strength using hard clustering and fuzzy clustering based regression techniques.

    PubMed

    Nagwani, Naresh Kumar; Deo, Shirish V

    2014-01-01

    Understanding the compressive strength of concrete is important for activities such as construction arrangement, prestressing operations, proportioning new mixtures, and quality assurance. Regression techniques are the most widely used for prediction tasks in which the relationship between the independent variables and the dependent (prediction) variable is identified. The accuracy of regression techniques for prediction can be improved if clustering is used along with regression. Clustering along with regression ensures more accurate curve fitting between the dependent and independent variables. In this work a cluster regression technique is applied for estimating the compressive strength of concrete, and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression yields fewer prediction errors when estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group concrete data with similar characteristics, and in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. Experiments show that clustering along with regression techniques gives the minimum errors for predicting the compressive strength of concrete; in addition, the fuzzy C-means clustering algorithm performs better than the K-means algorithm.
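
    The two-stage idea in this record (cluster first, then fit a regression inside each cluster) can be illustrated with a small sketch. Everything here is an assumption made for illustration: a single scalar feature, hard clustering with K = 2, ordinary least squares, and made-up numbers that are not concrete-strength data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b

def sse(xs, ys, a, b):
    """Sum of squared residuals of the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def two_means_1d(xs, iters=50):
    """Stage 1: hard clustering of scalars into two groups (K-means, K = 2),
    initialised at the smallest and largest value."""
    c = [min(xs), max(xs)]
    labels = []
    for _ in range(iters):
        labels = [0 if abs(x - c[0]) <= abs(x - c[1]) else 1 for x in xs]
        for k in (0, 1):
            grp = [x for x, lab in zip(xs, labels) if lab == k]
            if grp:
                c[k] = sum(grp) / len(grp)
    return labels

# Toy data with two regimes that follow different linear trends.
xs = [1, 2, 3, 4, 11, 12, 13, 14]
ys = [2, 4, 6, 8, 39, 38, 37, 36]

# Baseline: one regression over all the data.
global_sse = sse(xs, ys, *fit_line(xs, ys))

# Stage 2: one regression per cluster, errors summed over clusters.
labels = two_means_1d(xs)
clustered_sse = 0.0
for k in (0, 1):
    gx = [x for x, lab in zip(xs, labels) if lab == k]
    gy = [y for y, lab in zip(ys, labels) if lab == k]
    clustered_sse += sse(gx, gy, *fit_line(gx, gy))
```

    On this toy set the per-cluster fits reproduce the data exactly while the single global line cannot, which is the effect the abstract attributes to combining clustering with regression.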

  12. Estimating the Concrete Compressive Strength Using Hard Clustering and Fuzzy Clustering Based Regression Techniques

    PubMed Central

    Nagwani, Naresh Kumar; Deo, Shirish V.

    2014-01-01

    Understanding the compressive strength of concrete is important for activities such as construction arrangement, prestressing operations, proportioning new mixtures, and quality assurance. Regression techniques are the most widely used for prediction tasks in which the relationship between the independent variables and the dependent (prediction) variable is identified. The accuracy of regression techniques for prediction can be improved if clustering is used along with regression. Clustering along with regression ensures more accurate curve fitting between the dependent and independent variables. In this work a cluster regression technique is applied for estimating the compressive strength of concrete, and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression yields fewer prediction errors when estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group concrete data with similar characteristics, and in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. Experiments show that clustering along with regression techniques gives the minimum errors for predicting the compressive strength of concrete; in addition, the fuzzy C-means clustering algorithm performs better than the K-means algorithm. PMID:25374939

  13. Unsupervised classification of earth resources data.

    NASA Technical Reports Server (NTRS)

    Su, M. Y.; Jayroe, R. R., Jr.; Cummings, R. E.

    1972-01-01

    A new clustering technique is presented. It consists of two parts: (a) a sequential statistical clustering, which is essentially a sequential variance analysis, and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by the existing supervised maximum likelihood classification technique.

  14. A nonparametric clustering technique which estimates the number of clusters

    NASA Technical Reports Server (NTRS)

    Ramey, D. B.

    1983-01-01

    In applications of cluster analysis, one usually needs to determine the number of clusters, K, and the assignment of observations to each cluster. A clustering technique based on recursive application of a multivariate test of bimodality which automatically estimates both K and the cluster assignments is presented.

  15. Mapping of terrain by computer clustering techniques using multispectral scanner data and using color aerial film

    NASA Technical Reports Server (NTRS)

    Smedes, H. W.; Linnerud, H. J.; Woolaver, L. B.; Su, M. Y.; Jayroe, R. R.

    1972-01-01

    Two clustering techniques were used for terrain mapping by computer of test sites in Yellowstone National Park. One test was made with multispectral scanner data using a composite technique which consists of (1) a strictly sequential statistical clustering which is a sequential variance analysis, and (2) a generalized K-means clustering. In this composite technique, the output of (1) is a first approximation of the cluster centers. This is the input to (2) which consists of steps to improve the determination of cluster centers by iterative procedures. Another test was made using the three emulsion layers of color-infrared aerial film as a three-band spectrometer. Relative film densities were analyzed using a simple clustering technique in three-color space. Important advantages of the clustering technique over conventional supervised computer programs are that (1) human intervention, preparation time, and manipulation of data are reduced, (2) the computer map gives an unbiased indication of where best to select the reference ground control data, (3) easily obtained, inexpensive film can be used, and (4) the geometric distortions can be easily rectified by simple standard photogrammetric techniques.

  16. Performance analysis of clustering techniques over microarray data: A case study

    NASA Astrophysics Data System (ADS)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in the field of statistical data analysis. In such investigation cluster analysis plays a vital role to deal with the large scale data. There are many clustering techniques with different cluster analysis approach. But which approach suits a particular dataset is difficult to predict. To deal with this problem a grading approach is introduced over many clustering techniques to identify a stable technique. But the grading approach depends on the characteristic of dataset as well as on the validity indices. So a two stage grading approach is implemented. In this study the grading approach is implemented over five clustering techniques like hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of grading approach that a cluster technique is significant is also established by Nemenyi post-hoc hypothetical test.

  17. Identification of piecewise affine systems based on fuzzy PCA-guided robust clustering technique

    NASA Astrophysics Data System (ADS)

    Khanmirza, Esmaeel; Nazarahari, Milad; Mousavi, Alireza

    2016-12-01

    Hybrid systems are a class of dynamical systems whose behaviors are based on the interaction between discrete and continuous dynamical behaviors. Since a general method for the analysis of hybrid systems is not available, some researchers have focused on specific types of hybrid systems. Piecewise affine (PWA) systems are one of the subsets of hybrid systems. The identification of PWA systems includes the estimation of the parameters of affine subsystems and the coefficients of the hyperplanes defining the partition of the state-input domain. In this paper, we have proposed a PWA identification approach based on a modified clustering technique. By using a fuzzy PCA-guided robust k-means clustering algorithm along with neighborhood outlier detection, the two main drawbacks of the well-known clustering algorithms, i.e., the poor initialization and the presence of outliers, are eliminated. Furthermore, this modified clustering technique enables us to determine the number of subsystems without any prior knowledge about system. In addition, applying the structure of the state-input domain, that is, considering the time sequence of input-output pairs, provides a more efficient clustering algorithm, which is the other novelty of this work. Finally, the proposed algorithm has been evaluated by parameter identification of an IGV servo actuator. Simulation together with experiment analysis has proved the effectiveness of the proposed method.

  18. The use of the temporal scan statistic to detect methicillin-resistant Staphylococcus aureus clusters in a community hospital.

    PubMed

    Faires, Meredith C; Pearl, David L; Ciccotelli, William A; Berke, Olaf; Reid-Smith, Richard J; Weese, J Scott

    2014-07-08

    In healthcare facilities, conventional surveillance techniques using rule-based guidelines may result in under- or over-reporting of methicillin-resistant Staphylococcus aureus (MRSA) outbreaks, as these guidelines are generally unvalidated. The objectives of this study were to investigate the utility of the temporal scan statistic for detecting MRSA clusters, validate clusters using molecular techniques and hospital records, and determine significant differences in the rate of MRSA cases using regression models. Patients admitted to a community hospital between August 2006 and February 2011, and identified with MRSA >48 hours following hospital admission, were included in this study. Between March 2010 and February 2011, MRSA specimens were obtained for spa typing. MRSA clusters were investigated using a retrospective temporal scan statistic. Tests were conducted on a monthly scale and significant clusters were compared to MRSA outbreaks identified by hospital personnel. Associations between the rate of MRSA cases and the variables year, month, and season were investigated using a negative binomial regression model. During the study period, 735 MRSA cases were identified and 167 MRSA isolates were spa typed. Nine different spa types were identified with spa type 2/t002 (88.6%) the most prevalent. The temporal scan statistic identified significant MRSA clusters at the hospital (n=2), service (n=16), and ward (n=10) levels (P ≤ 0.05). Seven clusters were concordant with nine MRSA outbreaks identified by hospital staff. For the remaining clusters, seven events may have been equivalent to true outbreaks and six clusters demonstrated possible transmission events. The regression analysis indicated years 2009-2011, compared to 2006, and months March and April, compared to January, were associated with an increase in the rate of MRSA cases (P ≤ 0.05). The application of the temporal scan statistic identified several MRSA clusters that were not detected by hospital personnel. The identification of specific years and months with increased MRSA rates may be attributable to several hospital level factors including the presence of other pathogens. Within hospitals, the incorporation of the temporal scan statistic to standard surveillance techniques is a valuable tool for healthcare workers to evaluate surveillance strategies and aid in the identification of MRSA clusters.
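
    A retrospective temporal scan of the kind named in this record can be sketched as below. The sketch assumes a Poisson model with a constant baseline rate and scores each window of consecutive periods with the usual log-likelihood ratio; the monthly counts are invented, the function names are hypothetical, and the Monte Carlo replication normally used to attach a P value to the best window is omitted.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed and e expected cases
    under a Poisson model; zero unless the window shows an excess."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    rest = total - c
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return inside + outside

def most_likely_cluster(counts, max_len):
    """Scan every window of 1..max_len consecutive periods and return
    (llr, start_index, end_index) for the highest-scoring window."""
    total, n = sum(counts), len(counts)
    best = (0.0, None, None)
    for start in range(n):
        for length in range(1, min(max_len, n - start) + 1):
            c = sum(counts[start:start + length])
            e = total * length / n  # expected cases if the rate were constant
            llr = poisson_llr(c, e, total)
            if llr > best[0]:
                best = (llr, start, start + length - 1)
    return best

# Twelve months of made-up case counts with an excess in months 4 and 5.
counts = [3, 2, 4, 3, 12, 14, 3, 2, 3, 4, 2, 3]
llr, start, end = most_likely_cluster(counts, max_len=6)
```

    Capping the window length (here at half the series) is a common choice so that the "cluster" cannot swallow most of the study period.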

  19. Clustering P-Wave Receiver Functions To Constrain Subsurface Seismic Structure

    NASA Astrophysics Data System (ADS)

    Chai, C.; Larmat, C. S.; Maceira, M.; Ammon, C. J.; He, R.; Zhang, H.

    2017-12-01

    The acquisition of high-quality data from permanent and temporary dense seismic networks provides the opportunity to apply statistical and machine learning techniques to a broad range of geophysical observations. Lekic and Romanowicz (2011) used clustering analysis on tomographic velocity models of the western United States to perform tectonic regionalization and the velocity-profile clusters agree well with known geomorphic provinces. A complementary and somewhat less restrictive approach is to apply cluster analysis directly to geophysical observations. In this presentation, we apply clustering analysis to teleseismic P-wave receiver functions (RFs) continuing efforts of Larmat et al. (2015) and Maceira et al. (2015). These earlier studies validated the approach with surface waves and stacked EARS RFs from the USArray stations. In this study, we experiment with both the K-means and hierarchical clustering algorithms. We also test different distance metrics defined in the vector space of RFs following Lekic and Romanowicz (2011). We cluster data from two distinct data sets. The first, corresponding to the western US, was obtained by smoothing/interpolation of the receiver-function wavefield (Chai et al. 2015). Spatial coherence and agreement with geologic region increase with this simpler, spatially smoothed set of observations. The second data set is composed of RFs for more than 800 stations of the China Digital Seismic Network (CSN). Preliminary results show a first order agreement between clusters and tectonic region and each region cluster includes a distinct Ps arrival, which probably reflects differences in crustal thickness. Regionalization remains an important step to characterize a model prior to application of full waveform and/or stochastic imaging techniques because of the computational expense of these types of studies. Machine learning techniques can provide valuable information that can be used to design and characterize formal geophysical inversion, providing information on spatial variability in the subsurface geology.

  20. Lithium cluster anions: photoelectron spectroscopy and ab initio calculations.

    PubMed

    Alexandrova, Anastassia N; Boldyrev, Alexander I; Li, Xiang; Sarkas, Harry W; Hendricks, Jay H; Arnold, Susan T; Bowen, Kit H

    2011-01-28

    Structural and energetic properties of small, deceptively simple anionic clusters of lithium, Li_n^-, n = 3-7, were determined using a combination of anion photoelectron spectroscopy and ab initio calculations. The most stable isomers of each of these anions, the ones most likely to contribute to the photoelectron spectra, were found using the gradient embedded genetic algorithm program. Subsequently, state-of-the-art ab initio techniques, including time-dependent density functional theory, coupled cluster, and multireference configurational interactions methods, were employed to interpret the experimental spectra.

  1. On the clustering of multidimensional pictorial data

    NASA Technical Reports Server (NTRS)

    Bryant, J. D. (Principal Investigator)

    1979-01-01

    Obvious approaches to reducing the cost (in computer resources) of applying current clustering techniques to the problem of remote sensing are discussed. The use of spatial information in finding fields and in classifying mixture pixels is examined, and the AMOEBA clustering program is described. Internally a pattern recognition program, AMOEBA appears from without to be an unsupervised clustering program. It is fast and automatic. No choices (such as arbitrary thresholds to set split/combine sequences) need be made. The problem of finding the number of clusters is solved automatically. At the conclusion of the program, all points in the scene are classified; however, a provision is included for a reject classification of some points which, within the theoretical framework, cannot rationally be assigned to any cluster.

  2. GLOBULAR CLUSTER ABUNDANCES FROM HIGH-RESOLUTION, INTEGRATED-LIGHT SPECTROSCOPY. II. EXPANDING THE METALLICITY RANGE FOR OLD CLUSTERS AND UPDATED ANALYSIS TECHNIQUES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew

    2017-01-10

    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = −0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na I, Mg I, Al I, Si I, Ca I, Ti I, Ti II, Sc II, V I, Cr I, Mn I, Co I, Ni I, Cu I, Y II, Zr I, Ba II, La II, Nd II, and Eu II. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe I, Ca I, Si I, Ni I, and Ba II. The elements that show the greatest differences include Mg I and Zr I. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements.

  3. Introduction to the JASIST Special Topic Issue on Web Retrieval and Mining: A Machine Learning Perspective.

    ERIC Educational Resources Information Center

    Chen, Hsinchun

    2003-01-01

    Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)

  4. Which modifiable health risk behaviours are related? A systematic review of the clustering of Smoking, Nutrition, Alcohol and Physical activity ('SNAP') health risk factors.

    PubMed

    Noble, Natasha; Paul, Christine; Turon, Heidi; Oldmeadow, Christopher

    2015-12-01

    There is a growing body of literature examining the clustering of health risk behaviours, but little consensus about which risk factors can be expected to cluster for which sub groups of people. This systematic review aimed to examine the international literature on the clustering of smoking, poor nutrition, excess alcohol and physical inactivity (SNAP) health behaviours among adults, including associated socio-demographic variables. A literature search was conducted in May 2014. Studies examining at least two SNAP risk factors, and using a cluster or factor analysis technique, or comparing observed to expected prevalence of risk factor combinations, were included. Fifty-six relevant studies were identified. A majority of studies (81%) reported a 'healthy' cluster characterised by the absence of any SNAP risk factors. More than half of the studies reported a clustering of alcohol with smoking, and half reported clustering of all four SNAP risk factors. The methodological quality of included studies was generally weak to moderate. Males and those with greater social disadvantage showed riskier patterns of behaviours; younger age was less clearly associated with riskier behaviours. Clustering patterns reported here reinforce the need for health promotion interventions to target multiple behaviours, and for such efforts to be specifically designed and accessible for males and those who are socially disadvantaged. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Discrete Wavelet Transform-Based Whole-Spectral and Subspectral Analysis for Improved Brain Tumor Clustering Using Single Voxel MR Spectroscopy.

    PubMed

    Yang, Guang; Nawaz, Tahir; Barrick, Thomas R; Howe, Franklyn A; Slabaugh, Greg

    2015-12-01

    Many approaches have been considered for automatic grading of brain tumors by means of pattern recognition with magnetic resonance spectroscopy (MRS). Providing an improved technique which can assist clinicians in accurately identifying brain tumor grades is our main objective. The proposed technique, which is based on the discrete wavelet transform (DWT) of whole-spectral or subspectral information of key metabolites, combined with unsupervised learning, inspects the separability of the extracted wavelet features from the MRS signal to aid the clustering. In total, we included 134 short echo time single voxel MRS spectra (SV MRS) in our study that cover normal controls, low grade and high grade tumors. The combination of DWT-based whole-spectral or subspectral analysis and unsupervised clustering achieved an overall clustering accuracy of 94.8% and a balanced error rate of 7.8%. To the best of our knowledge, it is the first study using DWT combined with unsupervised learning to cluster brain SV MRS. Instead of dimensionality reduction on SV MRS or feature selection using model fitting, our study provides an alternative method of extracting features to obtain promising clustering results.

  6. Study of atmospheric dynamics and pollution in the coastal area of English Channel using clustering technique

    NASA Astrophysics Data System (ADS)

    Sokolov, Anton; Dmitriev, Egor; Delbarre, Hervé; Augustin, Patrick; Gengembre, Cyril; Fourmenten, Marc

    2016-04-01

    The problem of atmospheric contamination by principal air pollutants was considered in the industrialized coastal region of English Channel in Dunkirk influenced by north European metropolitan areas. MESO-NH nested models were used for the simulation of the local atmospheric dynamics and the online calculation of Lagrangian backward trajectories with 15-minute temporal resolution and the horizontal resolution down to 500 m. The one-month mesoscale numerical simulation was coupled with local pollution measurements of volatile organic components, particulate matter, ozone, sulphur dioxide and nitrogen oxides. Principal atmospheric pathways were determined by clustering technique applied to backward trajectories simulated. Six clusters were obtained which describe local atmospheric dynamics, four winds blowing through the English Channel, one coming from the south, and the biggest cluster with small wind speeds. This last cluster includes mostly sea breeze events. The analysis of meteorological data and pollution measurements allows relating the principal atmospheric pathways with local air contamination events. It was shown that contamination events are mostly connected with a channelling of pollution from local sources and low-turbulent states of the local atmosphere.

  7. Rotation Period of Blanco 1 Members from KELT Light Curves: Comparing Rotation-Ages to Various Stellar Chronometers at 100 Myr

    NASA Astrophysics Data System (ADS)

    Cargile, Phillip; James, D. J.; Pepper, J.; Kuhn, R.; Siverd, R. J.; Stassun, K. G.

    2012-01-01

    The age of a star is one of its most fundamental properties, and yet tragically it is also the one property that is not directly measurable in observations. We must therefore rely on age estimates based on mostly model-dependent or empirical methods. Moreover, there remains a critical need for direct comparison of different age-dating techniques using the same stars analyzed in a consistent fashion. One chronometer commonly being employed is using stellar rotation rates to measure stellar ages, i.e., gyrochronology. Although this technique is one of the better-understood chronometers, its calibration relies heavily on the solar datum, as well as benchmark open clusters with reliable ages, and also lacks a comprehensive comparative analysis to other stellar chronometers. The age of the nearby (? pc) open cluster Blanco 1 has been estimated using various techniques, including being one of only 7 clusters with an LDB age measurement, making it a unique and powerful comparative laboratory for stellar chronometry, including gyrochronology. Here, we present preliminary results from our light-curve analysis of solar-type stars in Blanco 1 in order to identify and measure rotation periods of cluster members. The light-curve data were obtained during the engineering and calibration phase of the KELT-South survey. The large area on the sky and low number of contaminating field stars makes Blanco 1 an ideal target for the extremely wide field and large pixel scale of the KELT telescope. We apply a period-finding technique using the Lomb-Scargle periodogram and FAP statistics to measure significant rotation periods in the KELT-South light curves for confirmed Blanco 1 members. These new rotation periods allow us to test and inform rotation evolution models for stellar ages at ? Myr, determining a rotation-age for Blanco 1 using gyrochronology, and compare this rotation-age to other age measurements for this cluster.
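
    The period search mentioned in this record can be illustrated with a bare-bones Lomb-Scargle periodogram. This is a sketch under stated assumptions, not the authors' pipeline: it uses the classical variance-normalised form, a noiseless synthetic sinusoid with an assumed 3-day period, a hypothetical uneven sampling pattern, and it omits the false-alarm probability (FAP) step.

```python
import math

def lomb_scargle(t, y, omega):
    """Classical variance-normalised Lomb-Scargle power at angular frequency omega."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    # The time offset tau makes the sine and cosine terms orthogonal.
    tau = math.atan2(sum(math.sin(2 * omega * ti) for ti in t),
                     sum(math.cos(2 * omega * ti) for ti in t)) / (2 * omega)
    c = [math.cos(omega * (ti - tau)) for ti in t]
    s = [math.sin(omega * (ti - tau)) for ti in t]
    yc = sum((v - mean) * ci for v, ci in zip(y, c))
    ys = sum((v - mean) * si for v, si in zip(y, s))
    return (yc ** 2 / sum(ci ** 2 for ci in c)
            + ys ** 2 / sum(si ** 2 for si in s)) / (2 * var)

TRUE_PERIOD = 3.0  # days; assumed value for the synthetic light curve

# Unevenly spaced observation times (hypothetical cadence with small jitter).
t = [0.73 * i + 0.2 * ((i * i) % 5) / 5 for i in range(60)]
y = [math.sin(2 * math.pi * ti / TRUE_PERIOD) for ti in t]

# Evaluate the periodogram on a grid of trial periods and keep the peak.
periods = [1.5 + 0.05 * k for k in range(91)]
powers = [lomb_scargle(t, y, 2 * math.pi / p) for p in periods]
best_period = periods[powers.index(max(powers))]
```

    With real light curves the peak would still need a significance test before being reported as a rotation period.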

  8. Clustering techniques: measuring the performance of contract service providers.

    PubMed

    Cruz, Antonio Miguel; Perilla, Sandra Patricia Usaquén; Pabón, Nidia Nelly Vanegas

    2010-01-01

    This paper investigates the use of a clustering technique to characterize the providers of maintenance services in a health-care institution according to their performance. A characterization of the inventory of equipment from seven pilot areas was carried out first (including 264 medical devices). The characterization study concluded that the inventory as a whole is old [exploitation time (ET)/useful life (UL) average is 0.78] and has high maintenance service costs relative to the original cost of acquisition (service cost/acquisition cost average 8.61%). A monitoring of the performance of maintenance service providers was then conducted. The variables monitored were response time (RT), service time (ST), availability, and turnaround time (TAT). Finally, the study grouped maintenance service providers into the following clusters according to performance. Cluster 0: Identified with the best performance, the lowest values of TAT, RT, and ST, with an average TAT value of 1.46 days; Clusters 1 and 2: Identified with the poorest performance, highest values of TAT, RT, and ST, and an average TAT value of 9.79 days; and Cluster 3: Identified by medium-quality performance, intermediate values of TAT, RT, and ST, and an average TAT value of 2.56 days.

  9. Internal Cluster Validation on Earthquake Data in the Province of Bengkulu

    NASA Astrophysics Data System (ADS)

    Rini, D. S.; Novianti, P.; Fransiska, H.

    2018-04-01

    The k-means method is an algorithm for clustering n objects, based on their attributes, into k partitions, where k < n. A weakness of the algorithm is that the k initial points are chosen randomly before it runs, so the resulting clustering can differ between runs. If the random initialization is poor, the clustering is less than optimal. Cluster validation is a technique for determining the optimum number of clusters without prior information about the data. There are two types of cluster validation: internal and external. This study examines and applies several internal cluster validation indices, including the Calinski-Harabasz (CH) Index, Silhouette (S) Index, Davies-Bouldin (DB) Index, Dunn Index (D), and S-Dbw Index, to earthquake data from Bengkulu Province. Based on internal cluster validation, the CH, S, and S-Dbw indices yield an optimum of k = 2, the DB Index yields k = 6, and the Dunn Index yields k = 15. The optimum clustering (k = 6) based on the DB Index gives good results for clustering earthquakes in Bengkulu Province.
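
    The idea of choosing k with an internal validation index can be sketched as follows. This is an illustration only, not the study's code: the data are three synthetic 2-D blobs rather than earthquake records, only the silhouette index is computed, and the number of restarts (25) is an assumption used to soften the random-initialization problem the abstract mentions.

```python
import math
import random


def kmeans(points, k, rng, n_iter=50):
    """Plain k-means from a random start; returns (labels, inertia)."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members)
                                         for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(math.dist(p, centers[lab]) ** 2
                  for p, lab in zip(points, labels))
    return labels, inertia


def best_kmeans(points, k, rng, n_restarts=25):
    """Repeat k-means and keep the labelling with the lowest inertia."""
    runs = (kmeans(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[1])[0]


def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute zero."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical data: three well-separated Gaussian blobs of 30 points each.
rng = random.Random(1)
blob_centers = [(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

scores = {k: silhouette(points, best_kmeans(points, k, rng))
          for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, "-> best k:", best_k)
```

    Different indices can disagree, as the abstract reports (k = 2, 6, and 15 from different indices), so a single index should be read as one piece of evidence rather than a verdict.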

  10. IDENTIFICATION OF MEMBERS IN THE CENTRAL AND OUTER REGIONS OF GALAXY CLUSTERS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Serra, Ana Laura; Diaferio, Antonaldo, E-mail: serra@ph.unito.it

    2013-05-10

    The caustic technique measures the mass of galaxy clusters in both their virial and infall regions and, as a byproduct, yields the list of cluster galaxy members. Here we use 100 galaxy clusters with mass $M_{200} \geq 10^{14}\,h^{-1}\,M_\odot$ extracted from a cosmological N-body simulation of a ΛCDM universe to test the ability of the caustic technique to identify the cluster galaxy members. We identify the true three-dimensional members as the gravitationally bound galaxies. The caustic technique uses the caustic location in the redshift diagram to separate the cluster members from the interlopers. We apply the technique to mock catalogs containing 1000 galaxies in the field of view of $12\,h^{-1}$ Mpc on a side at the cluster location. On average, this sample size roughly corresponds to 180 real galaxy members within $3r_{200}$, similar to recent redshift surveys of cluster regions. The caustic technique yields a completeness, the fraction of identified true members, $f_c = 0.95 \pm 0.03$, within $3r_{200}$. The contamination, the fraction of interlopers in the observed catalog of members, increases from $f_i = 0.020^{+0.046}_{-0.015}$ at $r_{200}$ to $f_i = 0.08^{+0.11}_{-0.05}$ at $3r_{200}$. No other technique for the identification of the members of a galaxy cluster provides such large completeness and small contamination at these large radii. The caustic technique assumes spherical symmetry and the asphericity of the cluster is responsible for most of the spread of the completeness and the contamination. By applying the technique to an approximately spherical system obtained by stacking the individual clusters, the spreads decrease by at least a factor of two. We finally estimate the cluster mass within $3r_{200}$ after removing the interlopers: for individual clusters, the mass estimated with the virial theorem is unbiased and within 30% of the actual mass; this spread decreases to less than 10% for the spherically symmetric stacked cluster.
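
    The two quality measures defined in this abstract, completeness (the fraction of true members that are identified) and contamination (the fraction of the identified catalog made up of interlopers), reduce to simple set arithmetic. The galaxy IDs below are hypothetical and serve only to make the definitions concrete.

```python
def completeness(identified, true_members):
    """Fraction of the true members that the technique identified."""
    return len(identified & true_members) / len(true_members)


def contamination(identified, true_members):
    """Fraction of the identified catalog that consists of interlopers."""
    return len(identified - true_members) / len(identified)


# Hypothetical catalog: galaxies 1..10 are true members; the technique
# recovers nine of them and also admits one interloper (ID 11).
true_members = set(range(1, 11))
identified = set(range(1, 10)) | {11}

f_c = completeness(identified, true_members)
f_i = contamination(identified, true_members)
print(f"completeness = {f_c:.2f}, contamination = {f_i:.2f}")
```

    Note that the two fractions have different denominators, so a method can score well on one while doing poorly on the other.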

  11. A density-based clustering model for community detection in complex networks

    NASA Astrophysics Data System (ADS)

    Zhao, Xiang; Li, Yantao; Qu, Zehui

    2018-04-01

    Network clustering (or graph partitioning) is an important technique for uncovering the underlying community structures in complex networks, which has been widely applied in various fields including astronomy, bioinformatics, sociology, and bibliometric. In this paper, we propose a density-based clustering model for community detection in complex networks (DCCN). The key idea is to find group centers with a higher density than their neighbors and a relatively large integrated-distance from nodes with higher density. The experimental results indicate that our approach is efficient and effective for community detection of complex networks.
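
    The key idea stated in this abstract, centers that are denser than their neighbors and far from any denser node, can be sketched on a toy graph. This is not the DCCN model itself: the abstract does not define its density or "integrated-distance", so the sketch assumes node degree as the density and shortest-path hop count as the distance, and it assumes a connected graph and a known number of communities (two).

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical network: two small communities joined by the bridge (4, 9).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4), (0, 10),
         (5, 6), (5, 7), (5, 8), (5, 9), (6, 7), (8, 9),
         (4, 9)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

dist = {u: hop_distances(adj, u) for u in adj}
rho = {u: len(adj[u]) for u in adj}  # assumed density: node degree

# delta: distance to the nearest strictly denser node; the globally densest
# node gets its largest distance to any node instead.
delta = {}
for u in adj:
    denser = [v for v in adj if rho[v] > rho[u]]
    delta[u] = (min(dist[u][v] for v in denser) if denser
                else max(dist[u].values()))

# Centers: the nodes with the largest density-times-distance score.
centers = sorted(adj, key=lambda u: rho[u] * delta[u], reverse=True)[:2]
# Every node joins the community of its nearest center.
community = {u: min(centers, key=lambda c: dist[u][c]) for u in adj}
print("centers:", centers)
print("community of each node:", community)
```

    Ties in degree are common in real networks, which is one reason a richer density definition than plain degree is likely needed in practice.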

  12. A redshift survey of the strong-lensing cluster ABELL 383

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Geller, Margaret J.; Hwang, Ho Seong; Kurtz, Michael J.

    2014-03-01

    Abell 383 is a famous rich cluster (z = 0.1887) imaged extensively as a basis for intensive strong- and weak-lensing studies. Nonetheless, there are few spectroscopic observations. We enable dynamical analyses by measuring 2360 new redshifts for galaxies with $r_{\rm Petro} \leq 20.5$ and within 50' of the Brightest Cluster Galaxy (BCG; R.A.$_{2000}$ = 42.°014125, decl.$_{2000}$ = –03.°529228). We apply the caustic technique to identify 275 cluster members within $7\,h^{-1}$ Mpc of the hierarchical cluster center. The BCG lies within –11 ± 110 km s$^{-1}$ and 21 ± 56 $h^{-1}$ kpc of the hierarchical cluster center; the velocity dispersion profile of the BCG appears to be an extension of the velocity dispersion profile based on cluster members. The distribution of cluster members on the sky corresponds impressively with the weak-lensing contours of Okabe et al., especially when the impact of foreground and background structure is included. The values of $R_{200} = 1.22 \pm 0.01\,h^{-1}$ Mpc and $M_{200} = (5.07 \pm 0.09) \times 10^{14}\,h^{-1}\,M_\odot$ obtained by application of the caustic technique agree well with recent completely independent lensing measures. The caustic estimate extends direct measurement of the cluster mass profile to a radius of $\sim 5\,h^{-1}$ Mpc.

  13. On the Performance of an Algebraic Multigrid Solver on Multicore Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker, A H; Schulz, M; Yang, U M

    2010-04-29

    Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.

  14. Uncertainties in the cluster-cluster correlation function

    NASA Astrophysics Data System (ADS)

    Ling, E. N.; Frenk, C. S.; Barrow, J. D.

    1986-12-01

    The bootstrap resampling technique is applied to estimate sampling errors and significance levels of the two-point correlation functions determined for a subset of the CfA redshift survey of galaxies and a redshift sample of 104 Abell clusters. The angular correlation function for a sample of 1664 Abell clusters is also calculated. The standard errors in $\xi(r)$ for the Abell data are found to be considerably larger than quoted 'Poisson errors'. The best estimate for the ratio of the correlation length of Abell clusters (richness class $R \geq 1$, distance class $D \leq 4$) to that of CfA galaxies is $4.2^{+1.4}_{-1.0}$ (68 percentile error). The enhancement of cluster clustering over galaxy clustering is statistically significant in the presence of resampling errors. The uncertainties found do not include the effects of possible systematic biases in the galaxy and cluster catalogs and could be regarded as lower bounds on the true uncertainty range.
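
    Bootstrap resampling, as used in this abstract, estimates the sampling error of a statistic by recomputing it on many resamples drawn with replacement from the data. The sketch below is a generic illustration under stated assumptions: the statistic is a plain sample mean standing in for a correlation-function estimate, the data are synthetic, and the 16th to 84th percentile range is used as a 68 per cent interval.

```python
import math
import random
import statistics


def bootstrap(data, statistic, n_boot, rng):
    """Recompute `statistic` on n_boot resamples drawn with replacement."""
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return replicates


# Hypothetical sample: 200 draws from a Gaussian with mean 10 and sigma 2.
rng = random.Random(42)
data = [rng.gauss(10.0, 2.0) for _ in range(200)]

n_boot = 1000
replicates = sorted(bootstrap(data, statistics.fmean, n_boot, rng))

se_boot = statistics.stdev(replicates)                       # bootstrap standard error
se_analytic = statistics.stdev(data) / math.sqrt(len(data))  # textbook value for a mean
low = replicates[int(0.16 * n_boot)]                         # 16th percentile
high = replicates[int(0.84 * n_boot)]                        # 84th percentile
print(f"bootstrap SE = {se_boot:.4f}, analytic SE = {se_analytic:.4f}")
print(f"68% percentile interval: [{low:.3f}, {high:.3f}]")
```

    For a mean the analytic formula is available as a cross-check; the appeal of the bootstrap is that the same loop works for statistics, such as correlation functions, that have no simple error formula.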

  15. The search for structure - Object classification in large data sets. [for astronomers]

    NASA Technical Reports Server (NTRS)

    Kurtz, Michael J.

    1988-01-01

    Research concerning object classification schemes is reviewed, focusing on large data sets. Classification techniques are discussed, including syntactic and decision-theoretic methods, fuzzy techniques, and stochastic and fuzzy grammars. Consideration is given to the automation of MK classification (Morgan and Keenan, 1973) and other problems associated with the classification of spectra. In addition, the classification of galaxies is examined, including the problems of systematic errors, blended objects, galaxy types, and galaxy clusters.

  16. Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis.

    PubMed

    Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T

    2010-01-01

    Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. 
These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
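
    Hierarchical (agglomerative) clustering of the kind described in this abstract can be sketched in a few lines. This is not the study's system: the observations are six hypothetical (heart rate, mean arterial pressure) pairs, average linkage and Euclidean distance are assumptions, and the tree is cut at three clusters rather than the ten patient states the authors report.

```python
import math


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return (sum(math.dist(points[i], points[j]) for i in a for j in b)
                / (len(a) * len(b)))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical one-minute observations: (heart rate, mean arterial pressure).
observations = [(70, 90), (72, 88),      # one physiologic pattern
                (120, 60), (118, 62),    # a second pattern
                (95, 110), (97, 112)]    # a third pattern

states = agglomerative(observations, n_clusters=3)
print(states)
```

    Real multivariate monitoring data would need each variable scaled to comparable units before distances are computed, a step omitted here for brevity.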

  17. Validating clustering of molecular dynamics simulations using polymer models.

    PubMed

    Phillips, Joshua L; Colvin, Michael E; Newsam, Shawn

    2011-11-14

    Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers.

  18. Validating clustering of molecular dynamics simulations using polymer models

    PubMed Central

    2011-01-01

    Background: Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results: We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions: We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers. PMID:22082218

  19. Predicting the points of interaction of small molecules in the NF-κB pathway

    PubMed Central

    2011-01-01

    Background: The similarity property principle has been used extensively in drug discovery to identify small compounds that interact with specific drug targets. Here we show it can be applied to identify the interactions of small molecules within the NF-κB signalling pathway. Results: Clusters that contain compounds with a predominant interaction within the pathway were created, which were then used to predict the interaction of compounds not included in the clustering analysis. Conclusions: The technique successfully predicted the points of interactions of compounds that are known to interact with the NF-κB pathway. The method was also shown to be successful when compounds for which the interaction points were unknown were included in the clustering analysis. PMID:21342508

  20. [Applying the clustering technique for characterising maintenance outsourcing].

    PubMed

    Cruz, Antonio M; Usaquén-Perilla, Sandra P; Vanegas-Pabón, Nidia N; Lopera, Carolina

    2010-06-01

    Using clustering techniques for characterising companies providing health institutions with maintenance services. The study analysed seven pilot areas' equipment inventory (264 medical devices). Clustering techniques were applied using 26 variables. Response time (RT), operation duration (OD), availability and turnaround time (TAT) were amongst the most significant ones. Average biomedical equipment obsolescence value was 0.78. Four service provider clusters were identified: clusters 1 and 3 had better performance, lower TAT, RT and DR values (56 % of the providers coded O, L, C, B, I, S, H, F and G, had 1 to 4 day TAT values:

  1. [Analysis of syndrome discipline of generalized anxiety disorder using data mining techniques].

    PubMed

    Tang, Qi-sheng; Sun, Wen-jun; Qu, Miao; Guo, Dong-fang

    2012-09-01

    To study the use of data mining techniques in analyzing the syndrome discipline of generalized anxiety disorder (GAD). From August 1, 2009 to July 31, 2010, 705 patients with GAD in 10 hospitals of Beijing were investigated over one year. Data mining techniques, such as Bayes net and cluster analysis, were used to analyze the syndrome discipline of GAD. A total of 61 symptoms of GAD were screened out. By using Bayes net, nine syndromes of GAD were abstracted based on the symptoms. Eight syndromes were abstracted by cluster analysis. After screening for duplicate syndromes and combining the experts' experience and traditional Chinese medicine theory, six syndromes of GAD were defined. These included depressed liver qi transforming into fire, phlegm-heat harassing the heart, liver depression and spleen deficiency, heart-kidney non-interaction, dual deficiency of the heart and spleen, and kidney deficiency and liver yang hyperactivity. Based on the results, the draft of Syndrome Diagnostic Criteria for Generalized Anxiety Disorder was developed. Data mining techniques such as Bayes net and cluster analysis have certain future potential for establishing syndrome models and analyzing syndrome discipline, thus they are suitable for the research of syndrome differentiation.

  2. Using Machine Learning Techniques in the Analysis of Oceanographic Data

    NASA Astrophysics Data System (ADS)

    Falcinelli, K. E.; Abuomar, S.

    2017-12-01

    Acoustic Doppler Current Profilers (ADCPs) are oceanographic tools capable of collecting large amounts of current profile data. Using unsupervised machine learning techniques such as principal component analysis, fuzzy c-means clustering, and self-organizing maps, patterns and trends in an ADCP dataset are found. Cluster validity algorithms such as visual assessment of cluster tendency and clustering index are used to determine the optimal number of clusters in the ADCP dataset. These techniques prove to be useful in analysis of ADCP data and demonstrate potential for future use in other oceanographic applications.

  3. Visual cues for data mining

    NASA Astrophysics Data System (ADS)

    Rogowitz, Bernice E.; Rabenhorst, David A.; Gerth, John A.; Kalin, Edward B.

    1996-04-01

    This paper describes a set of visual techniques, based on principles of human perception and cognition, which can help users analyze and develop intuitions about tabular data. Collections of tabular data are widely available, including, for example, multivariate time series data, customer satisfaction data, stock market performance data, multivariate profiles of companies and individuals, and scientific measurements. In our approach, we show how visual cues can help users perform a number of data mining tasks, including identifying correlations and interaction effects, finding clusters and understanding the semantics of cluster membership, identifying anomalies and outliers, and discovering multivariate relationships among variables. These cues are derived from psychological studies on perceptual organization, visual search, perceptual scaling, and color perception. These visual techniques are presented as a complement to the statistical and algorithmic methods more commonly associated with these tasks, and provide an interactive interface for the human analyst.

  4. Cluster analysis and quality assessment of logged water at an irrigation project, eastern Saudi Arabia.

    PubMed

    Hussain, Mahbub; Ahmed, Syed Munaf; Abderrahman, Walid

    2008-01-01

    A multivariate statistical technique, cluster analysis, was used to assess the logged surface water quality at an irrigation project at Al-Fadhley, Eastern Province, Saudi Arabia. The principal idea behind using the technique was to utilize all available hydrochemical variables in the quality assessment including trace elements and other ions which are not considered in conventional techniques for water quality assessments like Stiff and Piper diagrams. Furthermore, the area belongs to an irrigation project where water contamination associated with the use of fertilizers, insecticides and pesticides is expected. This quality assessment study was carried out on a total of 34 surface/logged water samples. To gain a greater insight in terms of the seasonal variation of water quality, 17 samples were collected from both summer and winter seasons. The collected samples were analyzed for a total of 23 water quality parameters including pH, TDS, conductivity, alkalinity, sulfate, chloride, bicarbonate, nitrate, phosphate, bromide, fluoride, calcium, magnesium, sodium, potassium, arsenic, boron, copper, cobalt, iron, lithium, manganese, molybdenum, nickel, selenium, mercury and zinc. Cluster analysis in both Q and R modes was used. Q-mode analysis resulted in three distinct water types for both the summer and winter seasons. Q-mode analysis also showed the spatial as well as temporal variation in water quality. R-mode cluster analysis led to the conclusion that there are two major sources of contamination for the surface/shallow groundwater in the area: fertilizers, micronutrients, pesticides, and insecticides used in agricultural activities, and non-point natural sources.

  5. Evaluation of primary immunization coverage of infants under universal immunization programme in an urban area of Bangalore City using cluster sampling and lot quality assurance sampling techniques.

    PubMed

    K, Punith; K, Lalitha; G, Suman; Bs, Pradeep; Kumar K, Jayanth

    2008-07-01

    Is LQAS technique better than cluster sampling technique in terms of resources to evaluate the immunization coverage in an urban area? To assess and compare the lot quality assurance sampling against cluster sampling in the evaluation of primary immunization coverage. Population-based cross-sectional study. Areas under Mathikere Urban Health Center. Children aged 12 months to 23 months. 220 in cluster sampling, 76 in lot quality assurance sampling. Percentages and Proportions, Chi square Test. (1) Using cluster sampling, the percentage of completely immunized, partially immunized and unimmunized children were 84.09%, 14.09% and 1.82%, respectively. With lot quality assurance sampling, it was 92.11%, 6.58% and 1.31%, respectively. (2) Immunization coverage levels as evaluated by cluster sampling technique were not statistically different from the coverage value as obtained by lot quality assurance sampling techniques. Considering the time and resources required, it was found that lot quality assurance sampling is a better technique in evaluating the primary immunization coverage in urban area.

  6. Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications

    PubMed Central

    Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric

    2016-01-01

    Regression clustering is a statistical learning and data mining method that mixes unsupervised and supervised learning and is found in a wide range of applications, including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model-selection-based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with an analysis of a real data set on RGB cell marking in neuroscience to illustrate and interpret the method. PMID:27212939

  7. The WAGGS project - I. The WiFeS Atlas of Galactic Globular cluster Spectra

    NASA Astrophysics Data System (ADS)

    Usher, Christopher; Pastorello, Nicola; Bellstedt, Sabine; Alabi, Adebusola; Cerulo, Pierluigi; Chevalier, Leonie; Fraser-McKelvie, Amelia; Penny, Samantha; Foster, Caroline; McDermid, Richard M.; Schiavon, Ricardo P.; Villaume, Alexa

    2017-07-01

    We present the WiFeS Atlas of Galactic Globular cluster Spectra, a library of integrated spectra of Milky Way and Local Group globular clusters. We used the WiFeS integral field spectrograph on the Australian National University 2.3 m telescope to observe the central regions of 64 Milky Way globular clusters and 22 globular clusters hosted by the Milky Way's low-mass satellite galaxies. The spectra have wider wavelength coverage (3300-9050 Å) and higher spectral resolution (R = 6800) than existing spectral libraries of Milky Way globular clusters. By including Large and Small Magellanic Cloud star clusters, we extend the coverage of parameter space of existing libraries towards young and intermediate ages. While testing stellar population synthesis models and analysis techniques is the main aim of this library, the observations may also further our understanding of the stellar populations of Local Group globular clusters and make possible the direct comparison of extragalactic globular cluster integrated light observations with well-understood globular clusters in the Milky Way. The integrated spectra are publicly available via the project website.

  8. Cluster-cluster clustering

    NASA Technical Reports Server (NTRS)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C. S.

    1985-01-01

    The cluster correlation function $\xi_c(r)$ is compared with the particle correlation function, $\xi(r)$, in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, $\xi_c$ and $\xi$ are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of $\xi_c$ increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), $\xi_c$ is steeper than $\xi$, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of $\xi_c$ found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales.

  9. The relative impact of baryons and cluster shape on weak lensing mass estimates of galaxy clusters

    NASA Astrophysics Data System (ADS)

    Lee, B. E.; Le Brun, A. M. C.; Haq, M. E.; Deering, N. J.; King, L. J.; Applegate, D.; McCarthy, I. G.

    2018-05-01

    Weak gravitational lensing depends on the integrated mass along the line of sight. Baryons contribute to the mass distribution of galaxy clusters and the resulting mass estimates from lensing analysis. We use the cosmo-OWLS suite of hydrodynamic simulations to investigate the impact of baryonic processes on the bias and scatter of weak lensing mass estimates of clusters. These estimates are obtained by fitting NFW profiles to mock data using MCMC techniques. In particular, we examine the difference in estimates between dark matter-only runs and those including various prescriptions for baryonic physics. We find no significant difference in the mass bias when baryonic physics is included, though the overall mass estimates are suppressed when feedback from AGN is included. For lowest-mass systems for which a reliable mass can be obtained ($M_{200} \approx 2 \times 10^{14}\,M_\odot$), we find a bias of ≈ −10 per cent. The magnitude of the bias tends to decrease for higher mass clusters, consistent with no bias for the most massive clusters which have masses comparable to those found in the CLASH and HFF samples. For the lowest mass clusters, the mass bias is particularly sensitive to the fit radii and the limits placed on the concentration prior, rendering reliable mass estimates difficult. The scatter in mass estimates between the dark matter-only and the various baryonic runs is less than between different projections of individual clusters, highlighting the importance of triaxiality.

  10. RRW: repeated random walks on genome-scale protein networks for local cluster discovery

    PubMed Central

    Macropol, Kathy; Can, Tolga; Singh, Ambuj K

    2009-01-01

    Background: We propose an efficient and biologically sensitive algorithm based on repeated random walks (RRW) for discovering functional modules, e.g., complexes and pathways, within large-scale protein networks. Compared to existing cluster identification techniques, RRW implicitly makes use of network topology, edge weights, and long range interactions between proteins. Results: We apply the proposed technique on a functional network of yeast genes and accurately identify statistically significant clusters of proteins. We validate the biological significance of the results using known complexes in the MIPS complex catalogue database and well-characterized biological processes. We find that 90% of the created clusters have the majority of their catalogued proteins belonging to the same MIPS complex, and about 80% have the majority of their proteins involved in the same biological process. We compare our method to various other clustering techniques, such as the Markov Clustering Algorithm (MCL), and find a significant improvement in the RRW clusters' precision and accuracy values. Conclusion: RRW, which is a technique that exploits the topology of the network, is more precise and robust in finding local clusters. In addition, it has the added flexibility of being able to find multi-functional proteins by allowing overlapping clusters. PMID:19740439
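
    A random walk that exploits topology and edge weights, the building block this abstract refers to, can be sketched as a random walk with restart computed by power iteration. This is not the RRW algorithm itself, whose expansion and stopping rules the abstract does not give: the six-node weighted network, the restart probability of 0.3, and the choice to report the three highest-affinity nodes are all assumptions made for illustration.

```python
def random_walk_with_restart(weights, seed, restart=0.3, n_iter=100):
    """Affinity of every node to `seed` under a weighted walk with restart."""
    nodes = sorted(weights)
    p = {v: (1.0 if v == seed else 0.0) for v in nodes}
    for _ in range(n_iter):
        # With probability `restart` the walker jumps back to the seed.
        nxt = {v: (restart if v == seed else 0.0) for v in nodes}
        for u in nodes:
            total = sum(weights[u].values())
            for v, w in weights[u].items():
                # Otherwise it follows an edge with probability proportional to its weight.
                nxt[v] += (1.0 - restart) * p[u] * w / total
        p = nxt
    return p


# Hypothetical protein network: two tight triangles joined by a weak edge c-d.
weights = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 0.1},
    "d": {"c": 0.1, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}

affinity = random_walk_with_restart(weights, seed="a")
local_cluster = sorted(affinity, key=affinity.get, reverse=True)[:3]
print(affinity)
print("local cluster around 'a':", sorted(local_cluster))
```

    Because the walk is started from each seed separately, the same node can rank highly for more than one seed, which is consistent with the overlapping clusters the abstract mentions.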

  11. Adaptive coding of MSS imagery. [Multi Spectral band Scanners]

    NASA Technical Reports Server (NTRS)

    Habibi, A.; Samulon, A. S.; Fultz, G. L.; Lumb, D.

    1977-01-01

    A number of adaptive data compression techniques are considered for reducing the bandwidth of multispectral data. They include adaptive transform coding, adaptive DPCM, adaptive cluster coding, and a hybrid method. The techniques are simulated and their performance in compressing the bandwidth of Landsat multispectral images is evaluated and compared using signal-to-noise ratio and classification consistency as fidelity criteria.

  12. Multiscale visual quality assessment for cluster analysis with self-organizing maps

    NASA Astrophysics Data System (ADS)

    Bernard, Jürgen; von Landesberger, Tatiana; Bremm, Sebastian; Schreck, Tobias

    2011-01-01

    Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better understanding the characteristics and relationships among the found clusters. While promising approaches to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained clustering results. However, due to the nature of the clustering process, quality plays an important role, as for most practical data sets, typically many different clusterings are possible. Being aware of clustering quality is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with refined parameters, among others. In this work, we present an encompassing suite of visual tools for quality assessment of an important visual cluster algorithm, namely, the Self-Organizing Map (SOM) technique. We define, measure, and visualize the notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with output of additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated system, apply it on experimental data sets, and show its applicability.

  13. Evaluation of Primary Immunization Coverage of Infants Under Universal Immunization Programme in an Urban Area of Bangalore City Using Cluster Sampling and Lot Quality Assurance Sampling Techniques

    PubMed Central

    K, Punith; K, Lalitha; G, Suman; BS, Pradeep; Kumar K, Jayanth

    2008-01-01

    Research Question: Is LQAS technique better than cluster sampling technique in terms of resources to evaluate the immunization coverage in an urban area? Objective: To assess and compare the lot quality assurance sampling against cluster sampling in the evaluation of primary immunization coverage. Study Design: Population-based cross-sectional study. Study Setting: Areas under Mathikere Urban Health Center. Study Subjects: Children aged 12 months to 23 months. Sample Size: 220 in cluster sampling, 76 in lot quality assurance sampling. Statistical Analysis: Percentages and Proportions, Chi square Test. Results: (1) Using cluster sampling, the percentage of completely immunized, partially immunized and unimmunized children were 84.09%, 14.09% and 1.82%, respectively. With lot quality assurance sampling, it was 92.11%, 6.58% and 1.31%, respectively. (2) Immunization coverage levels as evaluated by cluster sampling technique were not statistically different from the coverage value as obtained by lot quality assurance sampling techniques. Considering the time and resources required, it was found that lot quality assurance sampling is a better technique in evaluating the primary immunization coverage in urban area. PMID:19876474

  14. Unsupervised classification of remote multispectral sensing data

    NASA Technical Reports Server (NTRS)

    Su, M. Y.

    1972-01-01

    The new unsupervised classification technique for classifying multispectral remote sensing data, which can be either from the multispectral scanner or digitized color-separation aerial photographs, consists of two parts: (a) a sequential statistical clustering which is a one-pass sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. Applications of the technique using an IBM-7094 computer on multispectral data sets over Purdue's Flight Line C-1 and the Yellowstone National Park test site have been accomplished. Comparisons between the classification maps by the unsupervised technique and the supervised maximum likelihood technique indicate that the classification accuracies are in agreement.
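
    The two-stage idea in this record (a single sequential pass that produces initial clusters, followed by iterative K-means refinement) can be sketched as follows. This is only an illustration under stated assumptions: the distance threshold used in the first pass is a stand-in for the record's sequential variance analysis, whose details the abstract does not give, and the function names, threshold value, and toy points are all hypothetical.

```python
import math


def sequential_pass(points, threshold):
    """One pass over the data: join the nearest running mean if it is within
    `threshold`, otherwise start a new cluster (assumed stand-in criterion)."""
    means, counts = [], []
    for p in points:
        if means:
            j = min(range(len(means)), key=lambda i: math.dist(p, means[i]))
            if math.dist(p, means[j]) <= threshold:
                counts[j] += 1
                # Incremental update of the running mean of cluster j.
                means[j] = tuple(m + (x - m) / counts[j] for m, x in zip(means[j], p))
                continue
        means.append(tuple(p))
        counts.append(1)
    return means


def kmeans(points, centers, iterations=20):
    """Plain K-means refinement starting from the supplied initial centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[j].append(p)
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [
        min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        for p in points
    ]
    return centers, labels


# Toy two-band "pixels" (hypothetical values).
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
seeds = sequential_pass(points, threshold=2.0)
centers, labels = kmeans(points, seeds)
```

    Seeding K-means from a cheap first pass removes the need to fix the number of clusters in advance; here the threshold decides it instead.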

  15. Characterization of Glutaredoxin Fe-S Cluster-Binding Interactions Using Circular Dichroism Spectroscopy.

    PubMed

    Albetel, Angela-Nadia; Outten, Caryn E

    2018-01-01

    Monothiol glutaredoxins (Grxs) with a conserved Cys-Gly-Phe-Ser (CGFS) active site are iron-sulfur (Fe-S) cluster-binding proteins that interact with a variety of partner proteins and perform crucial roles in iron metabolism including Fe-S cluster transfer, Fe-S cluster repair, and iron signaling. Various analytical and spectroscopic methods are currently being used to monitor and characterize glutaredoxin Fe-S cluster-dependent interactions at the molecular level. The electronic, magnetic, and vibrational properties of the protein-bound Fe-S cluster provide a convenient handle to probe the structure, function, and coordination chemistry of Grx complexes. However, some limitations arise from sample preparation requirements, complexity of individual techniques, or the necessity for combining multiple methods in order to achieve a complete investigation. In this chapter, we focus on the use of UV-visible circular dichroism spectroscopy as a fast and simple initial approach for investigating glutaredoxin Fe-S cluster-dependent interactions. © 2018 Elsevier Inc. All rights reserved.

  16. Structures and stabilities of Al(n) (+), Al(n), and Al(n) (-) (n=13-34) clusters.

    PubMed

    Aguado, Andrés; López, José M

    2009-02-14

    Putative global minima of neutral (Al(n)) and singly charged (Al(n) (+) and Al(n) (-)) aluminum clusters with n=13-34 have been located from first-principles density functional theory structural optimizations. The calculations include spin polarization and employ the generalized gradient approximation of Perdew, Burke, and Ernzerhof to describe exchange-correlation electronic effects. Our results show that icosahedral growth dominates the structures of aluminum clusters for n=13-22. For n=23-34, there is a strong competition between decahedral structures, relaxed fragments of a fcc crystalline lattice (some of them including stacking faults), and hexagonal prismatic structures. For such small cluster sizes, there is no evidence yet for a clear establishment of the fcc atomic packing prevalent in bulk aluminum. The global minimum structure for a given number of atoms depends significantly on the cluster charge for most cluster sizes. An explicit comparison is made with previous theoretical results in the range n=13-30: for n=19, 22, 24, 25, 26, 29, 30 we locate a lower energy structure than previously reported. Sizes n=32, 33 are studied here for the first time by an ab initio technique.

  17. A Multivariate Analysis of Galaxy Cluster Properties

    NASA Astrophysics Data System (ADS)

    Ogle, P. M.; Djorgovski, S.

    1993-05-01

    We have assembled from the literature a data base on 394 clusters of galaxies, with up to 16 parameters per cluster. They include optical and x-ray luminosities, x-ray temperatures, galaxy velocity dispersions, central galaxy and particle densities, optical and x-ray core radii and ellipticities, etc. In addition, derived quantities, such as the mass-to-light ratios and x-ray gas masses are included. Doubtful measurements have been identified, and deleted from the data base. Our goal is to explore the correlations between these parameters, and interpret them in the framework of our understanding of evolution of clusters and large-scale structure, such as the Gott-Rees scaling hierarchy. Among the simple, monovariate correlations we found, the most significant include those between the optical and x-ray luminosities, x-ray temperatures, cluster velocity dispersions, and central galaxy densities, in various mutual combinations. While some of these correlations have been discussed previously in the literature, generally smaller samples of objects have been used. We will also present the results of a multivariate statistical analysis of the data, including a principal component analysis (PCA). Such an approach has not been used previously for studies of cluster properties, even though it is much more powerful and complete than the simple monovariate techniques which are commonly employed. The observed correlations may lead to powerful constraints for theoretical models of formation and evolution of galaxy clusters. P.M.O. was supported by a Caltech graduate fellowship. S.D. acknowledges a partial support from the NASA contract NAS5-31348 and the NSF PYI award AST-9157412.

  18. Network-based spatial clustering technique for exploring features in regional industry

    NASA Astrophysics Data System (ADS)

    Chou, Tien-Yin; Huang, Pi-Hui; Yang, Lung-Shih; Lin, Wen-Tzu

    2008-10-01

    Past research on industrial clusters has mainly focused on a single or particular industry and less on spatial industrial structure and mutual relations. Industrial clusters can generate three kinds of spillover effects: knowledge, labor market pooling, and input sharing. In addition, industrial clusters benefit industry development. A full grasp of the status and characteristics of district industrial clusters can help improve the competitive advantage of district industry. Research on industrial spatial clusters is therefore of great significance for setting industrial policies and promoting district economic development. In this study, an improved model, GeoSOM, that combines DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and SOM (Self-Organizing Map) was developed for analyzing industrial clusters. Unlike former distance-based algorithms for industrial clusters, the proposed GeoSOM model can calculate spatial characteristics between firms based on the DBSCAN algorithm and evaluate the similarity between firms based on SOM clustering analysis. Demonstrative data sets of manufacturers around Taichung County in Taiwan were analyzed to verify the practicability of the proposed model. The results indicate that GeoSOM is suitable for evaluating spatial industrial clusters.

  19. Electrical Load Profile Analysis Using Clustering Techniques

    NASA Astrophysics Data System (ADS)

    Damayanti, R.; Abdullah, A. G.; Purnama, W.; Nandiyanto, A. B. D.

    2017-03-01

    Data mining is a data processing technique for extracting information from stored data. Electricity load consumption is recorded every day by the electrical company, usually at intervals of 15 or 30 minutes. This paper uses clustering, a data mining technique, to analyse electrical load profiles recorded during 2014. Three clustering methods were compared, namely K-Means (KM), Fuzzy C-Means (FCM), and K-Means Harmonics (KHM). The results show that KHM is the most appropriate method for classifying the electrical load profiles. The optimum number of clusters is determined using the Davies-Bouldin Index. Grouping the load profiles makes it possible to analyse demand variation and to estimate energy loss for each group of load profiles with a similar pattern. From each group of load profiles, the cluster load factor and a range of cluster loss factors can be obtained, which helps to find the range of coefficient values for estimating energy loss without performing load flow studies.
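
    The Davies-Bouldin Index used in this record scores a partition by comparing the scatter inside each cluster with the separation between cluster centroids; lower values indicate more compact, better separated clusters. The sketch below is a minimal illustration, assuming Euclidean distance and toy two-dimensional points in place of real load profiles; the function names and data are hypothetical and are not taken from the paper.

```python
import math


def centroid(cluster):
    """Component-wise mean of the points in one cluster."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of point lists."""
    cents = [centroid(c) for c in clusters]
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c) for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Two candidate partitions of the same four points (toy data).
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

db_tight = davies_bouldin(tight)
db_loose = davies_bouldin(loose)
best = min([("tight", db_tight), ("loose", db_loose)], key=lambda item: item[1])[0]
```

    To choose a number of clusters in this way, one would compute the index for the partition obtained at each candidate value and keep the one with the lowest score.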

  20. Collision-induced dissociation of protonated water clusters

    NASA Astrophysics Data System (ADS)

    Berthias, F.; Buridon, V.; Abdoul-Carime, H.; Farizon, B.; Farizon, M.; Dinh, P. M.; Reinhard, P.-G.; Suraud, E.; Märk, T. D.

    2014-06-01

    Collision-induced dissociation (CID) has been studied for protonated water clusters H+(H2O)n, with n = 2-8, colliding with argon atoms at a laboratory energy of 8 keV. The experimental data have been taken with an apparatus (Device for Irradiation of Molecular Clusters, `Dispositif d'Irradiation d'Agrégats Moléculaire,' DIAM) that has been recently constructed at the Institut de Physique Nucléaire de Lyon. It includes an event-by-event mass spectrometry detection technique, COINTOF (correlated ion and neutral fragment time of flight). The latter device allows, for each collision event, to detect and identify in a correlated manner all produced neutral and charged fragments. For all the studied cluster ions, it has allowed us to identify branching ratios for the loss of i = 1 to i = n water molecules, leading to fragment ions ranging from H+(H2O)i=n-1 all the way down to the production of protons. Using a corresponding calibration technique we determine total charged fragment production cross sections for incident protonated water clusters H+(H2O)n, with n = 2-7. Observed trends for branching ratios and cross sections, and a comparison with earlier data on measured attenuation cross sections for water clusters colliding with other noble gases (He and Xe), give insight into the underlying dissociation mechanisms.

  1. Approaches to answering critical CER questions.

    PubMed

    Kinnier, Christine V; Chung, Jeanette W; Bilimoria, Karl Y

    2015-01-01

    While randomized controlled trials (RCTs) are the gold standard for research, many research questions cannot be ethically and practically answered using an RCT. Comparative effectiveness research (CER) techniques are often better suited than RCTs to address the effects of an intervention under routine care conditions, an outcome otherwise known as effectiveness. CER research techniques covered in this section include: effectiveness-oriented experimental studies such as pragmatic trials and cluster randomized trials, treatment response heterogeneity, observational and database studies including adjustment techniques such as sensitivity analysis and propensity score analysis, systematic reviews and meta-analysis, decision analysis, and cost effectiveness analysis. Each section describes the technique and covers the strengths and weaknesses of the approach.

  2. A robust multilevel simultaneous eigenvalue solver

    NASA Technical Reports Server (NTRS)

    Costiner, Sorin; Taasan, Shlomo

    1993-01-01

    Multilevel (ML) algorithms for eigenvalue problems are often faced with several types of difficulties such as: the mixing of approximated eigenvectors by the solution process, the approximation of incomplete clusters of eigenvectors, the poor representation of solution on coarse levels, and the existence of close or equal eigenvalues. Algorithms that do not treat appropriately these difficulties usually fail, or their performance degrades when facing them. These issues motivated the development of a robust adaptive ML algorithm which treats these difficulties, for the calculation of a few eigenvectors and their corresponding eigenvalues. The main techniques used in the new algorithm include: the adaptive completion and separation of the relevant clusters on different levels, the simultaneous treatment of solutions within each cluster, and the robustness tests which monitor the algorithm's efficiency and convergence. The eigenvectors' separation efficiency is based on a new ML projection technique generalizing the Rayleigh Ritz projection, combined with a technique, the backrotations. These separation techniques, when combined with an FMG formulation, in many cases lead to algorithms of O(qN) complexity, for q eigenvectors of size N on the finest level. Previously developed ML algorithms are less focused on the mentioned difficulties. Moreover, algorithms which employ fine level separation techniques are of O(q^2 N) complexity and usually do not overcome all these difficulties. Computational examples are presented where Schrodinger type eigenvalue problems in 2-D and 3-D, having equal and closely clustered eigenvalues, are solved with the efficiency of the Poisson multigrid solver. A second order approximation is obtained in O(qN) work, where the total computational work is equivalent to only a few fine level relaxations per eigenvector.

  3. Mapping patient safety: a large-scale literature review using bibliometric visualisation techniques.

    PubMed

    Rodrigues, S P; van Eck, N J; Waltman, L; Jansen, F W

    2014-03-13

    The amount of scientific literature available is often overwhelming, making it difficult for researchers to have a good overview of the literature and to see relations between different developments. Visualisation techniques based on bibliometric data are helpful in obtaining an overview of the literature on complex research topics, and have been applied here to the topic of patient safety (PS). On the basis of title words and citation relations, publications in the period 2000-2010 related to PS were identified in the Scopus bibliographic database. A visualisation of the most frequently cited PS publications was produced based on direct and indirect citation relations between publications. Terms were extracted from titles and abstracts of the publications, and a visualisation of the most important terms was created. The main PS-related topics studied in the literature were identified using a technique for clustering publications and terms. A total of 8480 publications were identified, of which the 1462 most frequently cited ones were included in the visualisation. The publications were clustered into 19 clusters, which were grouped into three categories: (1) magnitude of PS problems (42% of all included publications); (2) PS risk factors (31%) and (3) implementation of solutions (19%). In the visualisation of PS-related terms, five clusters were identified: (1) medication; (2) measuring harm; (3) PS culture; (4) physician; (5) training, education and communication. Both analysis at publication and term level indicate an increasing focus on risk factors. A bibliometric visualisation approach makes it possible to analyse large amounts of literature. This approach is very useful for improving one's understanding of a complex research topic such as PS and for suggesting new research directions or alternative research priorities. For PS research, the approach suggests that more research on implementing PS improvement initiatives might be needed.

  4. Thermal wake/vessel detection technique

    DOEpatents

    Roskovensky, John K [Albuquerque, NM; Nandy, Prabal [Albuquerque, NM; Post, Brian N [Albuquerque, NM

    2012-01-10

    A computer-automated method for detecting a vessel in water based on an image of a portion of Earth includes generating a thermal anomaly mask. The thermal anomaly mask flags each pixel of the image initially deemed to be a wake pixel based on a comparison of a thermal value of each pixel against other thermal values of other pixels localized about each pixel. Contiguous pixels flagged by the thermal anomaly mask are grouped into pixel clusters. A shape of each of the pixel clusters is analyzed to determine whether each of the pixel clusters represents a possible vessel detection event. The possible vessel detection events are represented visually within the image.
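
    The pipeline in this record (flag thermally anomalous pixels against their local surroundings, group contiguous flagged pixels into clusters, then examine each cluster's shape) can be sketched as follows. Everything specific here is an assumption for illustration: the comparison against the median of the eight neighbouring pixels, the threshold `delta`, the 8-connected grouping, and the bounding-box length test are hypothetical choices, not the criteria claimed in the patent.

```python
from collections import deque
from statistics import median


def anomaly_mask(image, delta):
    """Flag pixels whose value differs from the median of their neighbours
    by more than `delta` (assumed local comparison rule)."""
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            mask[r][c] = abs(image[r][c] - median(neighbours)) > delta
    return mask


def pixel_clusters(mask):
    """Group contiguous flagged pixels (8-connectivity) by breadth-first search."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            seen.add((r, c))
            queue = deque([(r, c)])
            cluster = []
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for yy in range(max(0, y - 1), min(rows, y + 2)):
                    for xx in range(max(0, x - 1), min(cols, x + 2)):
                        if mask[yy][xx] and (yy, xx) not in seen:
                            seen.add((yy, xx))
                            queue.append((yy, xx))
            clusters.append(sorted(cluster))
    return clusters


def looks_elongated(cluster, min_length=3):
    """Hypothetical shape test: the bounding box spans at least `min_length` pixels."""
    ys = [y for y, _ in cluster]
    xs = [x for _, x in cluster]
    return max(max(ys) - min(ys), max(xs) - min(xs)) + 1 >= min_length


# Toy thermal image: uniform water with a warm streak and one isolated warm pixel.
image = [[10.0] * 8 for _ in range(6)]
for col in (2, 3, 4):
    image[2][col] = 14.0
image[4][6] = 14.0

clusters = pixel_clusters(anomaly_mask(image, delta=2.0))
candidates = [c for c in clusters if looks_elongated(c)]
```

    In this toy image the three-pixel streak survives the shape test while the isolated pixel is discarded.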

  5. Quantum annealing for combinatorial clustering

    NASA Astrophysics Data System (ADS)

    Kumar, Vaibhaw; Bass, Gideon; Tomlin, Casey; Dulny, Joseph

    2018-02-01

    Clustering is a powerful machine learning technique that groups "similar" data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of within-the-cluster distances between points. The straightforward approach involves examining all the possible assignments of points to each of the clusters. This approach guarantees the solution will be a global minimum; however, the number of possible assignments scales quickly with the number of data points and becomes computationally intractable even for very small datasets. In order to circumvent this issue, cost function minima are found using popular local search-based heuristic approaches such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better quality results but can be too time-consuming to implement. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms which are then implemented on commercially available quantum annealing hardware, as well as on a purely classical solver "qbsolv." The first algorithm assigns N data points to K clusters, and the second one can be used to perform binary clustering in a hierarchical manner. We present our results in the form of benchmarks against well-known k-means clustering and discuss the advantages and disadvantages of the proposed techniques.
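
    The objective described in this record, the sum of within-cluster distances, and the "straightforward approach" of examining every assignment can be sketched as below. This is the exhaustive classical baseline only, not the quantum annealing formulation or the paper's code; Euclidean distance, the function names, and the toy points are assumptions. With N points and K clusters the loop visits K**N assignments, which is why the approach stops being practical almost immediately.

```python
import itertools
import math


def within_cluster_cost(points, labels):
    """Sum of pairwise distances between points that share a cluster label."""
    cost = 0.0
    for i, j in itertools.combinations(range(len(points)), 2):
        if labels[i] == labels[j]:
            cost += math.dist(points[i], points[j])
    return cost


def exhaustive_clustering(points, k):
    """Try every assignment of points to k non-empty clusters; keep the cheapest."""
    best = None
    for labels in itertools.product(range(k), repeat=len(points)):
        if len(set(labels)) < k:
            continue  # skip assignments that leave a cluster empty
        cost = within_cluster_cost(points, labels)
        if best is None or cost < best[0]:
            best = (cost, labels)
    return best


# Toy data: two well separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
best_cost, best_labels = exhaustive_clustering(points, k=2)
n_assignments = 2 ** len(points)
```

    Because every assignment is checked, the result is a global minimum of this objective, which is the guarantee that greedy heuristics such as k-means give up in exchange for speed.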

  6. The relative vertex clustering value - a new criterion for the fast discovery of functional modules in protein interaction networks

    PubMed Central

    2015-01-01

    Background Cellular processes are known to be modular and are realized by groups of proteins implicated in common biological functions. Such groups of proteins are called functional modules, and many community detection methods have been devised for their discovery from protein interaction networks (PINs) data. In current agglomerative clustering approaches, vertices with just a very few neighbors are often classified as separate clusters, which does not make sense biologically. Also, a major limitation of agglomerative techniques is that their computational efficiency does not scale well to large PINs. Finally, PIN data obtained from large scale experiments generally contain many false positives, and this makes it hard for agglomerative clustering methods to find the correct clusters, since they are known to be sensitive to noisy data. Results We propose a local similarity premetric, the relative vertex clustering value, as a new criterion for deciding when a node can be added to a given node's cluster, and which addresses the above three issues. Based on this criterion, we introduce a novel and very fast agglomerative clustering technique, FAC-PIN, for discovering functional modules and protein complexes from PIN data. Conclusions Our proposed FAC-PIN algorithm is applied to nine PIN data sets from eight different species including the yeast PIN, and the identified functional modules are validated using Gene Ontology (GO) annotations from DAVID Bioinformatics Resources. Identified protein complexes are also validated using experimentally verified complexes. Computational results show that FAC-PIN can discover functional modules or protein complexes from PINs more accurately and more efficiently than HC-PIN and CNM, the current state-of-the-art approaches for clustering PINs in an agglomerative manner. PMID:25734691

  7. Interpretative Communities in Conflict: A Master Syllabus for Political Communication.

    ERIC Educational Resources Information Center

    Smith, Craig Allen

    1992-01-01

    Advocates the interpretive communities approach to teaching political communication. Discusses philosophical issues in the teaching of political communication courses, and pedagogical techniques (including concepts versus cases, clustering examples, C-SPAN video examples, and simulations and games). (SR)

  8. Towards the use of computationally inserted lesions for mammographic CAD assessment

    NASA Astrophysics Data System (ADS)

    Ghanian, Zahra; Pezeshk, Aria; Petrick, Nicholas; Sahiner, Berkman

    2018-03-01

    Computer-aided detection (CADe) devices used for breast cancer detection on mammograms are typically first developed and assessed for a specific "original" acquisition system, e.g., a specific image detector. When CADe developers are ready to apply their CADe device to a new mammographic acquisition system, they typically assess the CADe device with images acquired using the new system. Collecting large repositories of clinical images containing verified cancer locations and acquired by the new image acquisition system is costly and time consuming. Our goal is to develop a methodology to reduce the clinical data burden in the assessment of a CADe device for use with a different image acquisition system. We are developing an image blending technique that allows users to seamlessly insert lesions imaged using an original acquisition system into normal images or regions acquired with a new system. In this study, we investigated the insertion of microcalcification clusters imaged using an original acquisition system into normal images acquired with that same system utilizing our previously-developed image blending technique. We first performed a reader study to assess whether experienced observers could distinguish between computationally inserted and native clusters. For this purpose, we applied our insertion technique to clinical cases taken from the University of South Florida Digital Database for Screening Mammography (DDSM) and the Breast Cancer Digital Repository (BCDR). Regions of interest containing microcalcification clusters from one breast of a patient were inserted into the contralateral breast of the same patient. The reader study included 55 native clusters and their 55 inserted counterparts. Analysis of the reader ratings using receiver operating characteristic (ROC) methodology indicated that inserted clusters cannot be reliably distinguished from native clusters (area under the ROC curve, AUC=0.58±0.04). 
    Furthermore, CADe sensitivity was evaluated on mammograms with native and inserted microcalcification clusters using a commercial CADe system. For this purpose, we used full field digital mammograms (FFDMs) from 68 clinical cases, acquired at the University of Michigan Health System. The average sensitivities for native and inserted clusters were equal, 85.3% (58/68). These results demonstrate the feasibility of using the inserted microcalcification clusters for assessing mammographic CAD devices.

  9. Model-based Clustering of High-Dimensional Data in Astrophysics

    NASA Astrophysics Data System (ADS)

    Bouveyron, C.

    2016-05-01

    The nature of data in Astrophysics has changed, as in other scientific fields, in the past decades due to the increase of the measurement capabilities. As a consequence, data are nowadays frequently of high dimensionality and available in mass or stream. Model-based techniques for clustering are popular tools which are renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques show a disappointing behavior in high-dimensional spaces which is mainly due to their dramatical over-parametrization. The recent developments in model-based classification overcome these drawbacks and allow to efficiently classify high-dimensional data, even in the "small n / large p" situation. This work presents a comprehensive review of these recent approaches, including regularization-based techniques, parsimonious modeling, subspace classification methods and classification methods based on variable selection. The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.

  10. Image processing for x-ray inspection of pistachio nuts

    NASA Astrophysics Data System (ADS)

    Casasent, David P.

    2001-03-01

    A review is provided of image processing techniques that have been applied to the inspection of pistachio nuts using X-ray images. X-ray sensors provide non-destructive internal product detail not available from other sensors. The primary concern in this data is detecting the presence of worm infestations in nuts, since they have been linked to the presence of aflatoxin. We describe new techniques for segmentation, feature selection, selection of product categories (clusters), classifier design, etc. Specific novel results include: a new segmentation algorithm to produce images of isolated product items; preferable classifier operation (the classifier with the best probability of correct recognition Pc is not best); higher-order discrimination information is present in standard features (thus, high-order features appear useful); classifiers that use new cluster categories of samples achieve improved performance. Results are presented for X-ray images of pistachio nuts; however, all techniques have use in other product inspection applications.

  11. Understanding carbohydrate-carbohydrate interactions by means of glyconanotechnology.

    PubMed

    de la Fuente, Jesus M; Penadés, Soledad

    2004-01-01

    Carbohydrate-carbohydrate interaction is a reliable and versatile mechanism for cell adhesion and recognition. Glycosphingolipid (GSL) clusters at the cell membrane are mainly involved in this interaction. To investigate carbohydrate-carbohydrate interaction an integrated strategy (Glyconanotechnology) was developed. This strategy includes polyvalent tools (gold glyconanoparticles) mimicking GSL clustering at the cell membrane as well as analytical techniques such as AFM, TEM, and SPR to evaluate the interactions. The results obtained by means of this strategy and current status are presented.

  12. A hybrid algorithm for clustering of time series data based on affinity search technique.

    PubMed

    Aghabozorgi, Saeed; Ying Wah, Teh; Herawan, Tutut; Jalab, Hamid A; Shaygan, Mohammad Amin; Jalali, Alireza

    2014-01-01

    Time series clustering is an important solution to various problems in numerous fields of research, including business, medical science, and finance. However, conventional clustering algorithms are not practical for time series data because they are essentially designed for static data. This impracticality results in poor clustering accuracy in several systems. In this paper, a new hybrid clustering algorithm is proposed based on the similarity in shape of time series data. Time series data are first grouped as subclusters based on similarity in time. The subclusters are then merged using the k-Medoids algorithm based on similarity in shape. This model has two contributions: (1) it is more accurate than other conventional and hybrid approaches and (2) it determines the similarity in shape among time series data with a low complexity. To evaluate the accuracy of the proposed model, the model is tested extensively using syntactic and real-world time series datasets.

  13. A Hybrid Algorithm for Clustering of Time Series Data Based on Affinity Search Technique

    PubMed Central

    Aghabozorgi, Saeed; Ying Wah, Teh; Herawan, Tutut; Jalab, Hamid A.; Shaygan, Mohammad Amin; Jalali, Alireza

    2014-01-01

    Time series clustering is an important solution to various problems in numerous fields of research, including business, medical science, and finance. However, conventional clustering algorithms are not practical for time series data because they are essentially designed for static data. This impracticality results in poor clustering accuracy in several systems. In this paper, a new hybrid clustering algorithm is proposed based on the similarity in shape of time series data. Time series data are first grouped as subclusters based on similarity in time. The subclusters are then merged using the k-Medoids algorithm based on similarity in shape. This model has two contributions: (1) it is more accurate than other conventional and hybrid approaches and (2) it determines the similarity in shape among time series data with a low complexity. To evaluate the accuracy of the proposed model, the model is tested extensively using syntactic and real-world time series datasets. PMID:24982966

  14. Users matter : multi-agent systems model of high performance computing cluster users.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    North, M. J.; Hood, C. S.; Decision and Information Sciences

    2005-01-01

    High performance computing clusters have been a critical resource for computational science for over a decade and have more recently become integral to large-scale industrial analysis. Despite their well-specified components, the aggregate behavior of clusters is poorly understood. The difficulties arise from complicated interactions between cluster components during operation. These interactions have been studied by many researchers, some of whom have identified the need for holistic multi-scale modeling that simultaneously includes network level, operating system level, process level, and user level behaviors. Each of these levels presents its own modeling challenges, but the user level is the most complex due to the adaptability of human beings. In this vein, there are several major user modeling goals, namely descriptive modeling, predictive modeling and automated weakness discovery. This study shows how multi-agent techniques were used to simulate a large-scale computing cluster at each of these levels.

  15. Finding clusters of similar events within clinical incident reports: a novel methodology combining case based reasoning and information retrieval

    PubMed Central

    Tsatsoulis, C; Amthauer, H

    2003-01-01

    A novel methodological approach for identifying clusters of similar medical incidents by analyzing large databases of incident reports is described. The discovery of similar events allows the identification of patterns and trends, and makes possible the prediction of future events and the establishment of barriers and best practices. Two techniques from the fields of information science and artificial intelligence have been integrated—namely, case based reasoning and information retrieval—and very good clustering accuracies have been achieved on a test data set of incident reports from transfusion medicine. This work suggests that clustering should integrate the features of an incident captured in traditional form based records together with the detailed information found in the narrative included in event reports. PMID:14645892

  16. XCluSim: a visual analytics tool for interactively comparing multiple clustering results of bioinformatics data

    PubMed Central

    2015-01-01

    Background Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious. Results In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician. Conclusions Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results. PMID:26328893

  17. Technical support for creating an artificial intelligence system for feature extraction and experimental design

    NASA Technical Reports Server (NTRS)

    Glick, B. J.

    1985-01-01

    Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.

  18. Spot detection and image segmentation in DNA microarray data.

    PubMed

    Qin, Li; Rueda, Luis; Ali, Adnan; Ngom, Alioune

    2005-01-01

    Following the invention of microarrays in 1994, the development and applications of this technology have grown exponentially. The numerous applications of microarray technology include clinical diagnosis and treatment, drug design and discovery, tumour detection, and environmental health research. One of the key issues in the experimental approaches utilising microarrays is to extract quantitative information from the spots, which represent genes in a given experiment. For this process, the initial stages are important and they influence future steps in the analysis. Identifying the spots and separating the background from the foreground is a fundamental problem in DNA microarray data analysis. In this review, we present an overview of state-of-the-art methods for microarray image segmentation. We discuss the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques. We analytically show that clustering-based techniques are equivalent to the one-dimensional, standard k-means clustering algorithm that utilises the Euclidean distance.
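
    The closing claim of this abstract, that clustering-based segmentation reduces to one-dimensional k-means with Euclidean distance, can be illustrated with a minimal sketch. It is not the authors' code: the function name `kmeans_1d`, the evenly spaced initial centers, and the toy intensity values are all assumptions made for illustration.

```python
def kmeans_1d(values, k=2, iters=100):
    """Plain 1-D k-means (k >= 2) on scalar intensities; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: centers spread evenly across the intensity range.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign(cs):
        # Nearest center by squared Euclidean distance.
        return [min(range(k), key=lambda j: (v - cs[j]) ** 2) for v in values]

    for _ in range(iters):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(centers)


# Hypothetical spot window: three dim background pixels, three bright foreground pixels.
pixels = [10, 12, 11, 200, 210, 205]
centers, labels = kmeans_1d(pixels, k=2)
```

    With k = 2, the cluster with the higher center would be read as spot foreground and the other as background; a real microarray image would need this applied per spot region.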

  19. Molecular Clusters: Nanoscale Building Blocks for Solid-State Materials.

    PubMed

    Pinkard, Andrew; Champsaur, Anouck M; Roy, Xavier

    2018-04-17

    The programmed assembly of nanoscale building blocks into multicomponent hierarchical structures is a powerful strategy for the bottom-up construction of functional materials. To develop this concept, our team has explored the use of molecular clusters as superatomic building blocks to fabricate new classes of materials. The library of molecular clusters is rich with exciting properties, including diverse functionalization, redox activity, and magnetic ordering, so the resulting cluster-assembled solids, which we term superatomic crystals (SACs), hold the promise of high tunability, atomic precision, and robust architectures among a diverse range of other material properties. Molecular clusters have only seldom been used as precursors for functional materials. Our team has been at the forefront of new developments in this exciting research area, and this Account focuses on our progress toward designing materials from cluster-based precursors. In particular, this Account discusses (1) the design and synthesis of molecular cluster superatomic building blocks, (2) their self-assembly into SACs, and (3) their resulting collective properties. The set of molecular clusters discussed herein is diverse, with different cluster cores and ligand arrangements to create an impressive array of solids. The cluster cores include octahedral M6E8 and cubane M4E4 (M = metal; E = chalcogen), which are typically passivated by a shell of supporting ligands, a feature we have expanded upon by designing and synthesizing more exotic ligands that can be used to direct solid-state assembly. Building from this library, we have designed whole families of binary SACs where the building blocks are held together through electrostatic, covalent, or van der Waals interactions. Using single-crystal X-ray diffraction (SCXRD) to determine the atomic structure, a remarkable range of compositional variability is accessible. We can also use this technique, in tandem with vibrational spectroscopy, to ascertain features about the constituent superatomic building blocks, such as the charge of the cluster cores, by analysis of bond distances from the SCXRD data. The combination of atomic precision and intercluster interactions in these SACs produces novel collective properties, including tunable electrical transport, crystalline thermal conductivity, and ferromagnetism. In addition, we have developed a synthetic strategy to insert redox-active guests into the superstructure of SACs via single-crystal-to-single-crystal intercalation. This intercalation process allows us to tune the optical and electrical transport properties of the superatomic crystal host. These properties are explored using a host of techniques, including Raman spectroscopy, SQUID magnetometry, electrical transport measurements, electronic absorption spectroscopy, differential scanning calorimetry, and frequency-domain thermoreflectance. Superatomic crystals have proven to be both robust and tunable, representing a new method of materials design and architecture. This Account demonstrates how precisely controlling the structure and properties of nanoscale building blocks is key in developing the next generation of functional materials; several examples are discussed and detailed herein.

  20. The Unexamined Student Is Not Worth Teaching: Preparation, the Zone of Proximal Development, and the Socratic Model of Scaffolded Learning

    ERIC Educational Resources Information Center

    Colter, Robert; Ulatowski, Joseph

    2017-01-01

    "Scaffolded learning" describes a cluster of instructional techniques designed to move students from a novice position toward greater understanding, such that they become independent learners. Our Socratic Model of Scaffolded Learning ("SMSL") includes two phases not normally included in discussions of scaffolded learning, the…

  1. Locality-Aware CTA Clustering For Modern GPUs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Ang; Song, Shuaiwen; Liu, Weifeng

    2017-04-08

    In this paper, we proposed a novel clustering technique for tapping into the performance potential of a largely ignored type of locality: inter-CTA locality. We first demonstrated the capability of the existing GPU hardware to exploit such locality, both spatially and temporally, on L1 or L1/Tex unified cache. To verify the potential of this locality, we quantified its existence in a broad spectrum of applications and discussed its sources of origin. Based on these insights, we proposed the concept of CTA-Clustering and its associated software techniques. Finally, we evaluated these techniques on all modern generations of NVIDIA GPU architectures. The experimental results showed that our proposed clustering techniques could significantly improve on-chip cache performance.

  2. Multivariate time series clustering on geophysical data recorded at Mt. Etna from 1996 to 2003

    NASA Astrophysics Data System (ADS)

    Di Salvo, Roberto; Montalto, Placido; Nunnari, Giuseppe; Neri, Marco; Puglisi, Giuseppe

    2013-02-01

    Time series clustering is an important task in data analysis issues in order to extract implicit, previously unknown, and potentially useful information from a large collection of data. Finding useful similar trends in multivariate time series represents a challenge in several areas including geophysics environment research. While traditional time series analysis methods deal only with univariate time series, multivariate time series analysis is a more suitable approach in the field of research where different kinds of data are available. Moreover, the conventional time series clustering techniques do not provide desired results for geophysical datasets due to the huge amount of data whose sampling rate is different according to the nature of signal. In this paper, a novel approach concerning geophysical multivariate time series clustering is proposed using dynamic time series segmentation and Self Organizing Maps techniques. This method allows finding coupling among trends of different geophysical data recorded from monitoring networks at Mt. Etna spanning from 1996 to 2003, when the transition from summit eruptions to flank eruptions occurred. This information can be used to carry out a more careful evaluation of the state of volcano and to define potential hazard assessment at Mt. Etna.

  3. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    PubMed

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  4. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches

    PubMed Central

    Bolin, Jocelyn H.; Edwards, Julianne M.; Finch, W. Holmes; Cassady, Jerrell C.

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering. PMID:24795683
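
    The contrast drawn in this abstract between hard K-means assignment and fuzzy membership can be sketched as follows. This is only an illustration using the standard fuzzy c-means membership formula with fixed, distinct centers; the function names, the scalar scores, and the fuzzifier values are assumptions, not the study's data or code.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each center (fuzzifier m > 1)."""
    dists = [abs(x - c) for c in centers]
    # A point sitting exactly on a center belongs fully to that center.
    if 0.0 in dists:
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)); memberships sum to 1.
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]


def hard_assignment(x, centers):
    """K-means style assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))


centers = [0.0, 4.0]                              # hypothetical cluster centers
crisp = hard_assignment(1.0, centers)             # one cluster only
soft = fuzzy_memberships(1.0, centers, m=2.0)     # graded membership in both
softer = fuzzy_memberships(1.0, centers, m=3.0)   # larger m allows more overlap
```

    Raising the fuzzifier `m` spreads membership more evenly across clusters, which is one way to read the abstract's remark that agreement with K-means depends on the amount of overlap allowed.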

  5. Cluster-based analysis improves predictive validity of spike-triggered receptive field estimates

    PubMed Central

    Malone, Brian J.

    2017-01-01

    Spectrotemporal receptive field (STRF) characterization is a central goal of auditory physiology. STRFs are often approximated by the spike-triggered average (STA), which reflects the average stimulus preceding a spike. In many cases, the raw STA is subjected to a threshold defined by gain values expected by chance. However, such correction methods have not been universally adopted, and the consequences of specific gain-thresholding approaches have not been investigated systematically. Here, we evaluate two classes of statistical correction techniques, using the resulting STRF estimates to predict responses to a novel validation stimulus. The first, more traditional technique eliminated STRF pixels (time-frequency bins) with gain values expected by chance. This correction method yielded significant increases in prediction accuracy, including when the threshold setting was optimized for each unit. The second technique was a two-step thresholding procedure wherein clusters of contiguous pixels surviving an initial gain threshold were then subjected to a cluster mass threshold based on summed pixel values. This approach significantly improved upon even the best gain-thresholding techniques. Additional analyses suggested that allowing threshold settings to vary independently for excitatory and inhibitory subfields of the STRF resulted in only marginal additional gains, at best. In summary, augmenting reverse correlation techniques with principled statistical correction choices increased prediction accuracy by over 80% for multi-unit STRFs and by over 40% for single-unit STRFs, furthering the interpretational relevance of the recovered spectrotemporal filters for auditory systems analysis. PMID:28877194

  6. New Statistical Methodology for Determining Cancer Clusters

    Cancer.gov

    The development of an innovative statistical technique that shows that women living in a broad stretch of the metropolitan northeastern United States, which includes Long Island, are slightly more likely to die from breast cancer than women in other parts of the Northeast.

  7. Clustervision: Visual Supervision of Unsupervised Clustering.

    PubMed

    Kwon, Bum Chul; Eysenbach, Ben; Verma, Janu; Ng, Kenney; De Filippi, Christopher; Stewart, Walter F; Perer, Adam

    2018-01-01

    Clustering, the process of grouping together similar items into distinct partitions, is a common type of unsupervised machine learning that can be useful for summarizing and aggregating complex multi-dimensional data. However, data can be clustered in many ways, and there exists a large body of algorithms designed to reveal different patterns. While having access to a wide variety of algorithms is helpful, in practice, it is quite difficult for data scientists to choose and parameterize algorithms to get the clustering results relevant for their dataset and analytical tasks. To alleviate this problem, we built Clustervision, a visual analytics tool that helps ensure data scientists find the right clustering among the large number of techniques and parameters available. Our system clusters data using a variety of clustering techniques and parameters and then ranks clustering results utilizing five quality metrics. In addition, users can guide the system to produce more relevant results by providing task-relevant constraints on the data. Our visual user interface allows users to find high quality clustering results, explore the clusters using several coordinated visualization techniques, and select the cluster result that best suits their task. We demonstrate this novel approach using a case study with a team of researchers in the medical domain and showcase that our system empowers users to choose an effective representation of their complex data.

  8. Identification and characterization of earthquake clusters: a comparative analysis for selected sequences in Italy

    NASA Astrophysics Data System (ADS)

    Peresan, Antonella; Gentili, Stefania

    2017-04-01

    Identification and statistical characterization of seismic clusters may provide useful insights about the features of seismic energy release and their relation to physical properties of the crust within a given region. Moreover, a number of studies based on spatio-temporal analysis of main-shocks occurrence require preliminary declustering of the earthquake catalogs. Since various methods, relying on different physical/statistical assumptions, may lead to diverse classifications of earthquakes into main events and related events, we aim to investigate the classification differences among different declustering techniques. Accordingly, a formal selection and comparative analysis of earthquake clusters is carried out for the most relevant earthquakes in North-Eastern Italy, as reported in the local OGS-CRS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. The comparison is then extended to selected earthquake sequences associated with a different seismotectonic setting, namely to events that occurred in the region struck by the recent Central Italy destructive earthquakes, making use of INGV data. Various techniques, ranging from classical space-time windows methods to ad hoc manual identification of aftershocks, are applied for detection of earthquake clusters. In particular, a statistical method based on nearest-neighbor distances of events in space-time-energy domain, is considered. Results from clusters identification by the nearest-neighbor method turn out quite robust with respect to the time span of the input catalogue, as well as to minimum magnitude cutoff. The identified clusters for the largest events reported in North-Eastern Italy since 1977 are well consistent with those reported in earlier studies, which were aimed at detailed manual aftershocks identification. 
The study shows that the data-driven approach, based on the nearest-neighbor distances, can be satisfactorily applied to decompose the seismic catalog into background seismicity and individual sequences of earthquake clusters, also in areas characterized by moderate seismic activity, where the standard declustering techniques may turn out rather gross approximations. With these results acquired, the main statistical features of seismic clusters are explored, including complex interdependence of related events, with the aim to characterize the space-time patterns of earthquakes occurrence in North-Eastern Italy and capture their basic differences with Central Italy sequences.

  9. Clustering analysis of moving target signatures

    NASA Astrophysics Data System (ADS)

    Martone, Anthony; Ranney, Kenneth; Innocenti, Roberto

    2010-04-01

    Previously, we developed a moving target indication (MTI) processing approach to detect and track slow-moving targets inside buildings, which successfully detected moving targets (MTs) from data collected by a low-frequency, ultra-wideband radar. Our MTI algorithms include change detection, automatic target detection (ATD), clustering, and tracking. The MTI algorithms can be implemented in a real-time or near-real-time system; however, a person-in-the-loop is needed to select input parameters for the clustering algorithm. Specifically, the number of clusters to input into the cluster algorithm is unknown and requires manual selection. A critical need exists to automate all aspects of the MTI processing formulation. In this paper, we investigate two techniques that automatically determine the number of clusters: the adaptive knee-point (KP) algorithm and the recursive pixel finding (RPF) algorithm. The KP algorithm is based on a well-known heuristic approach for determining the number of clusters. The RPF algorithm is analogous to the image processing, pixel labeling procedure. Both algorithms are used to analyze the false alarm and detection rates of three operational scenarios of personnel walking inside wood and cinderblock buildings.
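
    The abstract does not spell out the adaptive knee-point algorithm, so the sketch below shows only the generic heuristic it says the algorithm builds on: pick the number of clusters at the "knee" of a cost-versus-k curve. The function name `knee_point`, the chord-distance criterion, and the cost values are assumptions for illustration.

```python
def knee_point(ks, costs):
    """Return the k whose (k, cost) point lies farthest from the chord
    joining the first and last points of the curve."""
    x0, y0 = ks[0], costs[0]
    x1, y1 = ks[-1], costs[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5
    best_k, best_dist = ks[0], -1.0
    for k, cost in zip(ks, costs):
        # Perpendicular distance from (k, cost) to the chord.
        dist = abs(dy * (k - x0) - dx * (cost - y0)) / norm
        if dist > best_dist:
            best_k, best_dist = k, dist
    return best_k


# Hypothetical within-cluster cost for k = 1..6 candidate clusters.
ks = [1, 2, 3, 4, 5, 6]
costs = [100.0, 40.0, 10.0, 8.0, 7.0, 6.0]
chosen_k = knee_point(ks, costs)
```

    The distance is computed on the raw axes, so in practice one would probably normalise k and cost to comparable ranges before applying it.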

  10. Analysis of Spectral-type A/B Stars in Five Open Clusters

    NASA Astrophysics Data System (ADS)

    Wilhelm, Ronald J.; Rafuil Islam, M.

    2014-01-01

    We have obtained low resolution (R = 1000) spectroscopy of N=68, spectral-type A/B stars in five nearby open star clusters using the McDonald Observatory, 2.1m telescope. The sample of blue stars in various clusters were selected to test our new technique for determining interstellar reddening and distances in areas where interstellar reddening is high. We use a Bayesian approach to find the posterior distribution for Teff, Logg and [Fe/H] from a combination of reddened, photometric colors and spectroscopic line strengths. We will present calibration results for this technique using open cluster star data with known reddening and distances. Preliminary results suggest our technique can produce both reddening and distance determinations to within 10% of cluster values. Our technique opens the possibility of determining distances for blue stars at low Galactic latitudes where extinction can be large and differential. We will also compare our stellar parameter determinations to previously reported MK spectral classifications and discuss the probability that some of our stars are not members of their reported clusters.

  11. Optical Materials with a Genome: Nanophotonics with DNA-Stabilized Silver Clusters

    NASA Astrophysics Data System (ADS)

    Copp, Stacy M.

    Fluorescent silver clusters with unique rod-like geometries are stabilized by DNA. The sizes and colors of these clusters, or AgN-DNA, are selected by DNA base sequence, which can tune peak emission from blue-green into the near-infrared. Combined with DNA nanostructures, AgN-DNA promise exciting applications in nanophotonics and sensing. Until recently, however, a lack of understanding of the mechanisms controlling AgN-DNA fluorescence has challenged such applications. This dissertation discusses progress toward understanding the role of DNA as a "genome" for silver clusters and toward using DNA to achieve atomic-scale precision of silver cluster size and nanometer-scale precision of silver cluster position on a DNA breadboard. We also investigate sensitivity of AgN-DNA to local solvent environment, with an eye toward applications in chemical and biochemical sensing. Using robotic techniques to generate large data sets, we show that fluorescent silver clusters are templated by certain DNA base motifs that select "magic-sized" cluster cores of enhanced stabilities. The linear arrangement of bases on the phosphate backbone imposes a unique rod-like geometry on the clusters. Harnessing machine learning and bioinformatics techniques, we also demonstrate that sequences of DNA templates can be selected to stabilize silver clusters with desired optical properties, including high fluorescence intensity and specific fluorescence wavelengths, with much higher rates of success as compared to current strategies. The discovered base motifs can be also used to design modular DNA host strands that enable individual silver clusters with atomically precise sizes to bind at specific programmed locations on a DNA nanostructure. We show that DNA-mediated nanoscale arrangement enables near-field coupling of distinct clusters, demonstrated by dual-color cluster assemblies exhibiting resonant energy transfer. 
These results demonstrate a new degree of control over the optical properties and relative positions of nanoparticles, selected almost solely by the sequence of DNA. AgN-DNA are promising chemical and biochemical sensors due to the sensitivity of their fluorescence to local environment. However, the mechanisms behind many sensing schemes are not understood, and the nature of the excited state of the silver cluster itself remains unknown. To probe the fluorescence mechanisms of AgN-DNA, we investigate the behavior of purified solutions of these clusters in various solvents. We find that standard models for fluorophore solvatochromism, including the Lippert-Mataga model, do not describe AgN-DNA fluorescence because such models neglect specific interactions between the cluster and surrounding solvent molecules. Fluorescence colors are well-modeled by Mie-Gans theory, suggesting that the local dielectric environment of the cluster does play a role in fluorescence, although additional specific solvent interactions and cluster shape changes may also determine fluorescence color and intensity. These results suggest that AgN-DNA may be sensitive to changes in local dielectric environment on nanometer length scales and may also act as sensors for small molecules with affinity for DNA.

  12. Distributed cluster management techniques for unattended ground sensor networks

    NASA Astrophysics Data System (ADS)

    Essawy, Magdi A.; Stelzig, Chad A.; Bevington, James E.; Minor, Sharon

    2005-05-01

    Smart Sensor Networks are becoming important target detection and tracking tools. The challenging problems in such networks include the sensor fusion, data management and communication schemes. This work discusses techniques used to distribute sensor management and multi-target tracking responsibilities across an ad hoc, self-healing cluster of sensor nodes. Although miniaturized computing resources possess the ability to host complex tracking and data fusion algorithms, there still exist inherent bandwidth constraints on the RF channel. Therefore, special attention is placed on the reduction of node-to-node communications within the cluster by minimizing unsolicited messaging, and distributing the sensor fusion and tracking tasks onto local portions of the network. Several challenging problems are addressed in this work including track initialization and conflict resolution, track ownership handling, and communication control optimization. Emphasis is also placed on increasing the overall robustness of the sensor cluster through independent decision capabilities on all sensor nodes. Track initiation is performed using collaborative sensing within a neighborhood of sensor nodes, allowing each node to independently determine if initial track ownership should be assumed. This autonomous track initiation prevents the formation of duplicate tracks while eliminating the need for a central "management" node to assign tracking responsibilities. Track update is performed as an ownership node requests sensor reports from neighboring nodes based on track error covariance and the neighboring nodes' geo-positional location. Track ownership is periodically recomputed using propagated track states to determine which sensing node provides the desired coverage characteristics. High fidelity multi-target simulation results are presented, indicating that distributing sensor management and tracking capabilities not only reduces communication bandwidth consumption but also simplifies multi-target tracking within the cluster.

  13. Prediction model for peninsular Indian summer monsoon rainfall using data mining and statistical approaches

    NASA Astrophysics Data System (ADS)

    Vathsala, H.; Koolagudi, Shashidhar G.

    2017-01-01

    In this paper we discuss a data mining application for predicting peninsular Indian summer monsoon rainfall, and propose an algorithm that combines data mining and statistical techniques. We select likely predictors based on association rules that have the highest confidence levels. We then cluster the selected predictors to reduce their dimensions and use cluster membership values for classification. We derive the predictors from local conditions in southern India, including mean sea level pressure, wind speed, and maximum and minimum temperatures. The global condition variables include southern oscillation and Indian Ocean dipole conditions. The algorithm predicts rainfall in five categories: Flood, Excess, Normal, Deficit and Drought. We use closed itemset mining, cluster membership calculations and a multilayer perceptron function in the algorithm to predict monsoon rainfall in peninsular India. Using Indian Institute of Tropical Meteorology data, we found the prediction accuracy of our proposed approach to be exceptionally good.

  14. Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches

    PubMed Central

    Boyack, Kevin W.; Newman, David; Duhon, Russell J.; Klavans, Richard; Patek, Michael; Biberstine, Joseph R.; Schijvenaars, Bob; Skupin, André; Ma, Nianli; Börner, Katy

    2011-01-01

    Background We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches were comprised of five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models – BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. Conclusions PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts. PMID:21437291
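
    The tf-idf cosine approach and the top-n filtering step named in this abstract can be sketched as follows. This is a minimal illustration on three toy documents, assuming a plain log(N/df) idf weighting and whitespace tokenization; the study's actual weighting, preprocessing, and scale are not reproduced, and all function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n_docs / doc_freq[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_n_similar(vectors, n):
    """For each document, keep only its n most similar other documents."""
    kept = {}
    for i, u in enumerate(vectors):
        sims = sorted(
            ((cosine(u, v), j) for j, v in enumerate(vectors) if j != i),
            reverse=True,
        )
        kept[i] = [(j, s) for s, j in sims[:n]]
    return kept


docs = [
    "gene expression clustering".split(),
    "gene expression profiling".split(),
    "galaxy cluster membership".split(),
]
vectors = tfidf_vectors(docs)
neighbors = top_n_similar(vectors, n=1)
```

    Keeping only the top-n entries per document turns a dense similarity matrix into a sparse graph, which is what makes clustering at the scale of millions of documents tractable.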

  15. Focus-based filtering + clustering technique for power-law networks with small world phenomenon

    NASA Astrophysics Data System (ADS)

    Boutin, François; Thièvre, Jérôme; Hascoët, Mountaz

    2006-01-01

    Realistic interaction networks usually present two main properties: a power-law degree distribution and a small world behavior. Few nodes are linked to many nodes and adjacent nodes are likely to share common neighbors. Moreover, graph structure usually presents a dense core that is difficult to explore with classical filtering and clustering techniques. In this paper, we propose a new filtering technique accounting for a user-focus. This technique extracts a tree-like graph that also has a power-law degree distribution and small world behavior. The resulting structure is easily drawn with classical force-directed drawing algorithms. It is also quickly clustered and displayed into a multi-level silhouette tree (MuSi-Tree) from any user-focus. We built a new graph filtering + clustering + drawing API and report a case study.

  16. Scalable Prediction of Energy Consumption using Incremental Time Series Clustering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simmhan, Yogesh; Noor, Muhammad Usman

    2013-10-09

    Time series datasets are a canonical form of high velocity Big Data, and often generated by pervasive sensors, such as found in smart infrastructure. Performing predictive analytics on time series data can be computationally complex, and requires approximation techniques. In this paper, we motivate this problem using a real application from the smart grid domain. We propose an incremental clustering technique, along with a novel affinity score for determining cluster similarity, which help reduce the prediction error for cumulative time series within a cluster. We evaluate this technique, along with optimizations, using real datasets from smart meters, totaling ~700,000 data points, and show the efficacy of our techniques in improving the prediction error of time series data within polynomial time.

  17. Finite temperature properties of clusters by replica exchange metadynamics: the water nonamer.

    PubMed

    Zhai, Yingteng; Laio, Alessandro; Tosatti, Erio; Gong, Xin-Gao

    2011-03-02

    We introduce an approach for the accurate calculation of thermal properties of classical nanoclusters. On the basis of a recently developed enhanced sampling technique, replica exchange metadynamics, the method yields the true free energy of each relevant cluster structure, directly sampling its basin and measuring its occupancy in full equilibrium. All entropy sources, whether vibrational, rotational anharmonic, or especially configurational, the latter often forgotten in many cluster studies, are automatically included. For the present demonstration, we choose the water nonamer (H2O)9, an extremely simple cluster, which nonetheless displays a sufficient complexity and interesting physics in its relevant structure spectrum. Within a standard TIP4P potential description of water, we find that the nonamer second relevant structure possesses a higher configurational entropy than the first, so that the two free energies surprisingly cross for increasing temperature.

  18. Finite Temperature Properties of Clusters by Replica Exchange Metadynamics: The Water Nonamer

    NASA Astrophysics Data System (ADS)

    Zhai, Yingteng; Laio, Alessandro; Tosatti, Erio; Gong, Xingao

    2012-02-01

    We introduce an approach for the accurate calculation of thermal properties of classical nanoclusters. Based on a recently developed enhanced sampling technique, replica exchange metadynamics, the method yields the true free energy of each relevant cluster structure, directly sampling its basin and measuring its occupancy in full equilibrium. All entropy sources, whether vibrational, rotational anharmonic and especially configurational -- the latter often forgotten in many cluster studies -- are automatically included. For the present demonstration we choose the water nonamer (H2O)9, an extremely simple cluster which nonetheless displays a sufficient complexity and interesting physics in its relevant structure spectrum. Within a standard TIP4P potential description of water, we find that the nonamer second relevant structure possesses a higher configurational entropy than the first, so that the two free energies surprisingly cross for increasing temperature.

  19. OGLE II Eclipsing Binaries In The LMC: Analysis With Class

    NASA Astrophysics Data System (ADS)

    Devinney, Edward J.; Prsa, A.; Guinan, E. F.; DeGeorge, M.

    2011-01-01

    The Eclipsing Binaries (EBs) via Artificial Intelligence (EBAI) Project is applying machine learning techniques to elucidate the nature of EBs. Previously, Prsa, et al. applied artificial neural networks (ANNs) trained on physically-realistic Wilson-Devinney models to solve the light curves of the 1882 detached EBs in the LMC discovered by the OGLE II Project (Wyrzykowski, et al.) fully automatically, bypassing the need for manually-derived starting solutions. A curious result is the non-monotonic distribution of the temperature ratio parameter T2/T1, featuring a subsidiary peak noted previously by Mazeh, et al. in an independent analysis using the EBOP EB solution code (Tamuz, et al.). To explore this and to gain a fuller understanding of the multivariate EBAI LMC observational plus solutions data, we have employed automatic clustering and advanced visualization (CAV) techniques. Clustering the OGLE II data aggregates objects that are similar with respect to many parameter dimensions. Measures of similarity could include, for example, the multidimensional Euclidean distance between data objects, although other measures may be appropriate. Applying clustering, we find good evidence that the T2/T1 subsidiary peak is due to evolved binaries, in support of Mazeh et al.'s speculation. Further, clustering suggests that the LMC detached EBs occupying the main sequence region belong to two distinct classes. Also identified as a separate cluster in the multivariate data are stars having a Period-I band relation. Derekas et al. had previously found a Period-K band relation for LMC EBs discovered by the MACHO Project (Alcock, et al.). We suggest such CAV techniques will prove increasingly useful for understanding the large, multivariate datasets increasingly being produced in astronomy. We are grateful for the support of this research from NSF/RUI Grant AST-05-75042.

  20. Characterization of electrically-active defects in ultraviolet light-emitting diodes with laser-based failure analysis techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, Mary A.; Tangyunyong, Paiboon; Cole, Edward I.

    2016-01-14

    Laser-based failure analysis techniques demonstrate the ability to quickly and non-intrusively screen deep ultraviolet light-emitting diodes (LEDs) for electrically-active defects. In particular, two laser-based techniques, light-induced voltage alteration and thermally-induced voltage alteration, generate applied voltage maps (AVMs) that provide information on electrically-active defect behavior including turn-on bias, density, and spatial location. Here, multiple commercial LEDs were examined and found to have dark defect signals in the AVM indicating a site of reduced resistance or leakage through the diode. The existence of the dark defect signals in the AVM correlates strongly with an increased forward-bias leakage current. This increased leakage is not present in devices without AVM signals. Transmission electron microscopy analysis of a dark defect signal site revealed a dislocation cluster through the pn junction. The cluster included an open core dislocation. Even though LEDs with few dark AVM defect signals did not correlate strongly with power loss, direct association between increased open core dislocation densities and reduced LED device performance has been presented elsewhere [M. W. Moseley et al., J. Appl. Phys. 117, 095301 (2015)].

  1. Characterization of electrically-active defects in ultraviolet light-emitting diodes with laser-based failure analysis techniques

    DOE PAGES

    Miller, Mary A.; Tangyunyong, Paiboon; Edward I. Cole, Jr.

    2016-01-12

    In this study, laser-based failure analysis techniques demonstrate the ability to quickly and non-intrusively screen deep ultraviolet light-emitting diodes (LEDs) for electrically-active defects. In particular, two laser-based techniques, light-induced voltage alteration and thermally-induced voltage alteration, generate applied voltage maps (AVMs) that provide information on electrically-active defect behavior including turn-on bias, density, and spatial location. Here, multiple commercial LEDs were examined and found to have dark defect signals in the AVM indicating a site of reduced resistance or leakage through the diode. The existence of the dark defect signals in the AVM correlates strongly with an increased forward-bias leakage current. This increased leakage is not present in devices without AVM signals. Transmission electron microscopy analysis of a dark defect signal site revealed a dislocation cluster through the pn junction. The cluster included an open core dislocation. Even though LEDs with few dark AVM defect signals did not correlate strongly with power loss, direct association between increased open core dislocation densities and reduced LED device performance has been presented elsewhere [M. W. Moseley et al., J. Appl. Phys. 117, 095301 (2015)].

  2. A framework for graph-based synthesis, analysis, and visualization of HPC cluster job data.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mayo, Jackson R.; Kegelmeyer, W. Philip, Jr.; Wong, Matthew H.

    The monitoring and system analysis of high performance computing (HPC) clusters is of increasing importance to the HPC community. Analysis of HPC job data can be used to characterize system usage and diagnose and examine failure modes and their effects. This analysis is not straightforward, however, due to the complex relationships that exist between jobs. These relationships are based on a number of factors, including shared compute nodes between jobs, proximity of jobs in time, etc. Graph-based techniques represent an approach that is particularly well suited to this problem, and provide an effective technique for discovering important relationships in job queuing and execution data. The efficacy of these techniques is rooted in the use of a semantic graph as a knowledge representation tool. In a semantic graph, job data, represented in a combination of numerical and textual forms, can be flexibly processed into edges, with corresponding weights, expressing relationships between jobs, nodes, users, and other relevant entities. This graph-based representation permits formal manipulation by a number of analysis algorithms. This report presents a methodology and software implementation that leverages semantic graph-based techniques for the system-level monitoring and analysis of HPC clusters based on job queuing and execution data. Ontology development and graph synthesis are discussed with respect to the domain of HPC job data. The framework developed automates the synthesis of graphs from a database of job information. It also provides a front end, enabling visualization of the synthesized graphs. Additionally, an analysis engine is incorporated that provides performance analysis, graph-based clustering, and failure prediction capabilities for HPC systems.

  3. Data Mining Methods for Recommender Systems

    NASA Astrophysics Data System (ADS)

    Amatriain, Xavier; Jaimes, Alejandro; Oliver, Nuria; Pujol, Josep M.

    In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.
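
    For readers unfamiliar with the k-means algorithm this chapter describes, the following is a minimal sketch of the standard Lloyd iteration (assign each point to its nearest center, then move each center to the mean of its points). It is a generic textbook version, not code from the chapter; the initial centers are passed in explicitly to keep the example deterministic, and the sample points are made up.

```python
import math


def nearest(point, centers):
    """Index of the center closest to the point in Euclidean distance."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, initial_centers, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and center-update steps."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iterations):
        # Assignment step: group points by their nearest center.
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Update step: each center becomes the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else centers[i]
            for i, group in enumerate(groups)
        ]
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(points, initial_centers=[points[0], points[3]])
```

    The result depends on the initial centers, which is one reason alternatives and multiple restarts are commonly discussed alongside k-means.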

  4. OPEN CLUSTERS AS PROBES OF THE GALACTIC MAGNETIC FIELD. I. CLUSTER PROPERTIES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoq, Sadia; Clemens, D. P.

    2015-10-15

    Stars in open clusters are powerful probes of the intervening Galactic magnetic field via background starlight polarimetry because they provide constraints on the magnetic field distances. We use 2MASS photometric data for a sample of 31 clusters in the outer Galaxy for which near-IR polarimetric data were obtained to determine the cluster distances, ages, and reddenings via fitting theoretical isochrones to cluster color–magnitude diagrams. The fitting approach uses an objective χ² minimization technique to derive the cluster properties and their uncertainties. We found the ages, distances, and reddenings for 24 of the clusters, and the distances and reddenings for 6 additional clusters that were either sparse or faint in the near-IR. The derived ranges of log(age), distance, and E(B−V) were 7.25–9.63, ∼670–6160 pc, and 0.02–1.46 mag, respectively. The distance uncertainties ranged from ∼8% to 20%. The derived parameters were compared to previous studies, and most cluster parameters agree within our uncertainties. To test the accuracy of the fitting technique, synthetic clusters with 50, 100, or 200 cluster members and a wide range of ages were fit. These tests recovered the input parameters within their uncertainties for more than 90% of the individual synthetic cluster parameters. These results indicate that the fitting technique likely provides reliable estimates of cluster properties. The distances derived will be used in an upcoming study of the Galactic magnetic field in the outer Galaxy.
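
    The general idea of a χ² minimization fit can be illustrated with a deliberately simplified one-parameter case: a grid search for the additive magnitude offset that best aligns a template sequence with observed magnitudes. This is only a sketch of the principle, not the authors' isochrone-fitting procedure; the template values, observations, uncertainties, and search grid are invented for illustration.

```python
def chi_square(observed, model, sigma):
    """Sum of squared, uncertainty-weighted residuals between data and model."""
    return sum(((o - m) / s) ** 2 for o, m, s in zip(observed, model, sigma))


def fit_offset(observed, template, sigma, candidates):
    """Grid search for the additive offset that minimizes chi-square."""
    return min(
        candidates,
        key=lambda d: chi_square(observed, [t + d for t in template], sigma),
    )


# Hypothetical template magnitudes and observations with equal uncertainties.
template = [2.0, 3.0, 4.5, 6.0]
observed = [12.1, 12.9, 14.6, 15.9]
sigma = [0.1, 0.1, 0.1, 0.1]
candidates = [9.0 + 0.05 * k for k in range(41)]  # offsets from 9.0 to 11.0

best_offset = fit_offset(observed, template, sigma, candidates)
best_chi2 = chi_square(observed, [t + best_offset for t in template], sigma)
```

    A real fit would search over several parameters at once (for example age, distance, and reddening) and use the shape of the χ² surface around the minimum to estimate uncertainties.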

  5. The applicability and effectiveness of cluster analysis

    NASA Technical Reports Server (NTRS)

    Ingram, D. S.; Actkinson, A. L.

    1973-01-01

    An insight into the characteristics which determine the performance of a clustering algorithm is presented. In order for the techniques which are examined to accurately cluster data, two conditions must be simultaneously satisfied. First the data must have a particular structure, and second the parameters chosen for the clustering algorithm must be correct. By examining the structure of the data from the Cl flight line, it is clear that no single set of parameters can be used to accurately cluster all the different crops. The effectiveness of either a noniterative or iterative clustering algorithm to accurately cluster data representative of the Cl flight line is questionable. Thus extensive a priori knowledge is required in order to use cluster analysis in its present form for applications like assisting in the definition of field boundaries and evaluating the homogeneity of a field. New or modified techniques are necessary for clustering to be a reliable tool.

  6. Profiling Local Optima in K-Means Clustering: Developing a Diagnostic Technique

    ERIC Educational Resources Information Center

    Steinley, Douglas

    2006-01-01

    Using the cluster generation procedure proposed by D. Steinley and R. Henson (2005), the author investigated the performance of K-means clustering under the following scenarios: (a) different probabilities of cluster overlap; (b) different types of cluster overlap; (c) varying sample sizes, clusters, and dimensions; (d) different multivariate…

  7. Principal component analysis vs. self-organizing maps combined with hierarchical clustering for pattern recognition in volcano seismic spectra

    NASA Astrophysics Data System (ADS)

    Unglert, K.; Radić, V.; Jellinek, A. M.

    2016-06-01

    Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Kīlauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.
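
    The hierarchical clustering step mentioned in this abstract can be sketched with a small agglomerative procedure using average linkage, one common variant. The abstract does not state which linkage or distance the authors used, so both are assumptions here, and the PCA stage is omitted; the six three-component "spectra" are invented to stand in for data already reduced to a few components.

```python
import math


def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters,
    where closeness is the mean pairwise Euclidean distance between members."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected by the deletion
    return clusters


# Hypothetical reduced spectra: three pairs of similar patterns.
spectra = [
    (1.0, 0.1, 0.0), (0.9, 0.2, 0.0),
    (0.0, 1.0, 0.1), (0.1, 0.9, 0.0),
    (0.0, 0.1, 1.0), (0.0, 0.0, 0.9),
]
groups = average_linkage(spectra, n_clusters=3)
```

    Testing such a procedure on synthetic data with a known answer, as done here on a toy scale, is the kind of control the abstract recommends before applying a method to real data.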

  8. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning

    ERIC Educational Resources Information Center

    Firdausiah Mansur, Andi Besse; Yusof, Norazah

    2013-01-01

    Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on an e-learning system. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…

  9. Optimizing Instruction Scheduling and Register Allocation for Register-File-Connected Clustered VLIW Architectures

    PubMed Central

    Tang, Haijing; Wang, Siye; Zhang, Yanjun

    2013-01-01

    Clustering has become a common trend in very long instruction word (VLIW) architectures to address the problems of area, energy consumption, and design complexity. Register-file-connected clustered (RFCC) VLIW architecture uses a global register file to accomplish inter-cluster data communications, thus eliminating the performance and energy consumption penalty caused by explicit inter-cluster data move operations in traditional bus-connected clustered (BCC) VLIW architecture. However, the limited number of access ports to the global register file has become an issue which must be well addressed; otherwise the performance and energy consumption would be harmed. In this paper, we present compiler optimization techniques for an RFCC VLIW architecture called Lily, which is designed for encryption systems. These techniques aim at optimizing performance and energy consumption for the Lily architecture, through appropriate manipulation of the code generation process to maintain a better management of the accesses to the global register file. All the techniques have been implemented and evaluated. The result shows that our techniques can significantly reduce the penalty of performance and energy consumption due to the access port limitation of the global register file. PMID:23970841

  10. Active learning for semi-supervised clustering based on locally linear propagation reconstruction.

    PubMed

    Chang, Chin-Chun; Lin, Po-Yi

    2015-03-01

    The success of semi-supervised clustering relies on the effectiveness of side information. To get effective side information, a new active learner learning pairwise constraints known as must-link and cannot-link constraints is proposed in this paper. Three novel techniques are developed for learning effective pairwise constraints. The first technique is used to identify samples less important to cluster structures. This technique makes use of a kernel version of locally linear embedding for manifold learning. Samples neither important to locally linear propagation reconstructions of other samples nor on flat patches in the learned manifold are regarded as unimportant samples. The second is a novel criterion for query selection. This criterion considers not only the importance of a sample to expanding the space coverage of the learned samples but also the expected number of queries needed to learn the sample. To facilitate semi-supervised clustering, the third technique yields inferred must-links for passing information about flat patches in the learned manifold to semi-supervised clustering algorithms. Experimental results have shown that the learned pairwise constraints can capture the underlying cluster structures and proven the feasibility of the proposed approach. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Data mining to support simulation modeling of patient flow in hospitals.

    PubMed

    Isken, Mark W; Rajagopalan, Balaji

    2002-04-01

    Spiraling health care costs in the United States are driving institutions to continually address the challenge of optimizing the use of scarce resources. One of the first steps towards optimizing resources is to utilize capacity effectively. For hospital capacity planning problems such as allocation of inpatient beds, computer simulation is often the method of choice. One of the more difficult aspects of using simulation models for such studies is the creation of a manageable set of patient types to include in the model. The objective of this paper is to demonstrate the potential of using data mining techniques, specifically clustering techniques such as K-means, to help guide the development of patient type definitions for purposes of building computer simulation or analytical models of patient flow in hospitals. Using data from a hospital in the Midwest this study brings forth several important issues that researchers need to address when applying clustering techniques in general and specifically to hospital data.

  12. Computational Studies on the Anharmonic Dynamics of Molecular Clusters

    NASA Astrophysics Data System (ADS)

    Mancini, John S.

    Molecular nanoclusters present ideal systems to probe the physical forces and dynamics that drive the behavior of larger bulk systems. At the nanocluster limit the first instances of several phenomena can be observed including the breaking of hydrogen and molecular bonds. Advancements in experimental and theoretical techniques have made it possible to explore these phenomena in great detail. The most fruitful of these studies have involved the use of both experimental and theoretical techniques to leverage the strengths of the two approaches. This dissertation seeks to explore several important phenomena of molecular clusters using new and existing theoretical methodologies. Three specific systems are considered: hydrogen chloride clusters, mixed water and hydrogen chloride clusters, and the first cluster where hydrogen chloride autoionization occurs. The focus of these studies remains as close as possible to experimentally observable phenomena with the intention of validating, simulating and expanding on experimental work. Specifically, the properties of interest are those related to the vibrational ground and excited state dynamics of these systems. Studies are performed using full and reduced dimensional potential energy surfaces alongside advanced quantum mechanical methods including diffusion Monte Carlo, vibrational configuration interaction theory and quasi-classical molecular dynamics. The insights gained from these studies are great and varied. A new on-the-fly ab initio method for studying molecular clusters is validated for (HCl)1–6. A landmark study of the dissociation energy and predissociation mechanism of (HCl)3 is reported. The ground states of mixed (HCl)n(H2O)m are found to be highly delocalized across multiple stationary point configurations. Furthermore, it is identified that the consideration of this delocalization is required in vibrational excited state calculations to achieve agreement with experimental measurements. Finally, the theoretical infrared spectrum for the first case of HCl ionization in (H2O)m is reported, H+(H2O)3Cl−. The calculation indicates that the ionized cluster's spectrum is much more complex than any previous harmonic predictions, with a large number of the system's infrared active peaks resulting from overtones of lower frequency molecular motions.

  13. Classification of high-resolution multi-swath hyperspectral data using Landsat 8 surface reflectance data as a calibration target and a novel histogram based unsupervised classification technique to determine natural classes from biophysically relevant fit parameters

    NASA Astrophysics Data System (ADS)

    McCann, C.; Repasky, K. S.; Morin, M.; Lawrence, R. L.; Powell, S. L.

    2016-12-01

    Compact, cost-effective, flight-based hyperspectral imaging systems can provide scientifically relevant data over large areas for a variety of applications such as ecosystem studies, precision agriculture, and land management. To fully realize this capability, unsupervised classification techniques based on radiometrically-calibrated data that cluster based on biophysical similarity rather than simply spectral similarity are needed. An automated technique to produce high-resolution, large-area, radiometrically-calibrated hyperspectral data sets based on the Landsat surface reflectance data product as a calibration target was developed and applied to three subsequent years of data covering approximately 1850 hectares. The radiometrically-calibrated data allow inter-comparison of the temporal series. Advantages of the radiometric calibration technique include the need for minimal site access, no ancillary instrumentation, and automated processing. Fitting the reflectance spectra of each pixel using a set of biophysically relevant basis functions reduces the data from 80 spectral bands to 9 parameters providing noise reduction and data compression. Examination of histograms of these parameters allows for determination of natural splitting into biophysically similar clusters. This method creates clusters that are similar in terms of biophysical parameters, not simply spectral proximity. Furthermore, this method can be applied to other data sets, such as urban scenes, by developing other physically meaningful basis functions. The ability to use hyperspectral imaging for a variety of important applications requires the development of data processing techniques that can be automated. The radiometric calibration combined with the histogram-based unsupervised classification technique presented here provides one potential avenue for managing big data associated with hyperspectral imaging.

  14. Focusing cosmic telescopes: systematics of strong lens modeling

    NASA Astrophysics Data System (ADS)

    Johnson, Traci Lin; Sharon, Keren q.

    2018-01-01

    The use of strong gravitational lensing by galaxy clusters has become a popular method for studying the high redshift universe. While diverse in computational methods, lens modeling techniques have grasped the means for determining statistical errors on cluster masses and magnifications. However, the systematic errors have yet to be quantified, arising from the number of constraints, availability of spectroscopic redshifts, and various types of image configurations. I will be presenting my dissertation work on quantifying systematic errors in parametric strong lensing techniques. I have participated in the Hubble Frontier Fields lens model comparison project, using simulated clusters to compare the accuracy of various modeling techniques. I have extended this project to understanding how changing the quantity of constraints affects the mass and magnification. I will also present my recent work extending these studies to clusters in the Outer Rim Simulation. These clusters are typical of the clusters found in wide-field surveys, in mass and lensing cross-section. These clusters have fewer constraints than the HFF clusters and thus, are more susceptible to systematic errors. With the wealth of strong lensing clusters discovered in surveys such as SDSS, SPT, DES, and in the future, LSST, this work will be influential in guiding the lens modeling efforts and follow-up spectroscopic campaigns.

  15. Comparison of Clustering Techniques for Residential Energy Behavior using Smart Meter Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Ling; Lee, Doris; Sim, Alex

    Current practice in whole time series clustering of residential meter data focuses on aggregated or subsampled load data at the customer level, which ignores day-to-day differences within customers. This information is critical to determine each customer’s suitability to various demand side management strategies that support intelligent power grids and smart energy management. Clustering daily load shapes provides fine-grained information on customer attributes and sources of variation for subsequent models and customer segmentation. In this paper, we apply 11 clustering methods to daily residential meter data. We evaluate their parameter settings and suitability based on 6 generic performance metrics and post-checking of resulting clusters. Finally, we recommend suitable techniques and parameters based on the goal of discovering diverse daily load patterns among residential customers. To the authors’ knowledge, this paper is the first robust comparative review of clustering techniques applied to daily residential load shape time series in the power systems’ literature.

  16. Grain Cluster Microstructure and Grain Boundary Character Distribution in Alloy 690

    NASA Astrophysics Data System (ADS)

    Xia, Shuang; Zhou, Bangxin; Chen, Wenjue

    2009-12-01

    The effects of thermal-mechanical processing (TMP) on microstructure evolution during recrystallization and grain boundary character distribution (GBCD) in aged Alloy 690 were investigated by the electron backscatter diffraction (EBSD) technique and optical microscopy. The original grain boundaries of the deformed microstructure did not play an important role in the manipulation of the proportion of the Σ3^n (n = 1, 2, 3, …) type boundaries. Instead, the grain cluster formed by multiple twinning starting from a single nucleus during recrystallization was the key microstructural feature affecting the GBCD. All of the grains in this kind of cluster had Σ3^n mutual misorientations regardless of whether they were adjacent. A large grain cluster containing 91 grains was found in the sample after a small-strain (5 pct) and a high-temperature (1100 °C) recrystallization anneal, and twin relationships up to the ninth generation (Σ3^9) were found in this cluster. The ratio of cluster size over grain size (including all types of boundaries as defining individual grains) dictated the proportion of Σ3^n boundaries.

  17. Condensed Matter Cluster Reactions in LENR Power Cells for a Radical New Type of Space Power Source

    NASA Astrophysics Data System (ADS)

    Yang, Xiaoling; Miley, George H.; Hora, Heinz

    2009-03-01

    This paper reviews previous theoretical and experimental study on the possibility of nuclear events in multilayer thin film electrodes (Lipson et al., 2004 and 2005; Miley et al., 2007), including the correlation between excess heat and transmutations (Miley and Shrestha, 2003) and the cluster theory that predicts it. As a result of this added understanding of cluster reactions, a new class of electrodes is under development at the University of Illinois. These electrodes are designed to enhance cluster formation and subsequent reactions. Two approaches are under development. The first employs improved loading-unloading techniques, intending to obtain a higher volumetric density of sites favoring cluster formation. The second is designed to create nanostructures on the electrode where the cluster state is formed by electroless deposition of palladium on nickel micro structures. Power units employing these electrodes should offer unique advantages for space applications. This is a fundamental new nuclear energy source that is environmentally compatible with a minimum of radiation involvement, high specific power, very long lifetime, and scalable from micro power to kilowatts.

  18. An algol program for dissimilarity analysis: a divisive-omnithetic clustering technique

    USGS Publications Warehouse

    Tipper, J.C.

    1979-01-01

    Clustering techniques are used properly to generate hypotheses about patterns in data. Of the hierarchical techniques, those which are divisive and omnithetic possess many theoretically optimal properties. One such method, dissimilarity analysis, is implemented here in ALGOL 60, and determined to be competitive computationally with most other methods. © 1979.

  19. Mapping patient safety: a large-scale literature review using bibliometric visualisation techniques

    PubMed Central

    Rodrigues, S P; van Eck, N J; Waltman, L; Jansen, F W

    2014-01-01

    Background The amount of scientific literature available is often overwhelming, making it difficult for researchers to have a good overview of the literature and to see relations between different developments. Visualisation techniques based on bibliometric data are helpful in obtaining an overview of the literature on complex research topics, and have been applied here to the topic of patient safety (PS). Methods On the basis of title words and citation relations, publications in the period 2000–2010 related to PS were identified in the Scopus bibliographic database. A visualisation of the most frequently cited PS publications was produced based on direct and indirect citation relations between publications. Terms were extracted from titles and abstracts of the publications, and a visualisation of the most important terms was created. The main PS-related topics studied in the literature were identified using a technique for clustering publications and terms. Results A total of 8480 publications were identified, of which the 1462 most frequently cited ones were included in the visualisation. The publications were clustered into 19 clusters, which were grouped into three categories: (1) magnitude of PS problems (42% of all included publications); (2) PS risk factors (31%) and (3) implementation of solutions (19%). In the visualisation of PS-related terms, five clusters were identified: (1) medication; (2) measuring harm; (3) PS culture; (4) physician; (5) training, education and communication. Both analysis at publication and term level indicate an increasing focus on risk factors. Conclusions A bibliometric visualisation approach makes it possible to analyse large amounts of literature. This approach is very useful for improving one's understanding of a complex research topic such as PS and for suggesting new research directions or alternative research priorities. 
    For PS research, the approach suggests that more research on implementing PS improvement initiatives might be needed. PMID:24625640

  20. Identifying and reducing error in cluster-expansion approximations of protein energies.

    PubMed

    Hahn, Seungsoo; Ashenberg, Orr; Grigoryan, Gevorg; Keating, Amy E

    2010-12-01

    Protein design involves searching a vast space for sequences that are compatible with a defined structure. This can pose significant computational challenges. Cluster expansion is a technique that can accelerate the evaluation of protein energies by generating a simple functional relationship between sequence and energy. The method consists of several steps. First, for a given protein structure, a training set of sequences with known energies is generated. Next, this training set is used to expand energy as a function of clusters consisting of single residues, residue pairs, and higher order terms, if required. The accuracy of the sequence-based expansion is monitored and improved using cross-validation testing and iterative inclusion of additional clusters. As a trade-off for evaluation speed, the cluster-expansion approximation causes prediction errors, which can be reduced by including more training sequences, including higher order terms in the expansion, and/or reducing the sequence space described by the cluster expansion. This article analyzes the sources of error and introduces a method whereby accuracy can be improved by judiciously reducing the described sequence space. The method is applied to describe the sequence-stability relationship for several protein structures: coiled-coil dimers and trimers, a PDZ domain, and T4 lysozyme as examples with computationally derived energies, and SH3 domains in amphiphysin-1 and endophilin-1 as examples where the expanded pseudo-energies are obtained from experiments. Our open-source software package Cluster Expansion Version 1.0 allows users to expand their own energy function of interest and thereby apply cluster expansion to custom problems in protein design. © 2010 Wiley Periodicals, Inc.

  1. Untangling Magmatic Processes and Hydrothermal Alteration of in situ Superfast Spreading Ocean Crust at ODP/IODP Site 1256 with Fuzzy c-means Cluster Analysis of Rock Magnetic Properties

    NASA Astrophysics Data System (ADS)

    Dekkers, M. J.; Heslop, D.; Herrero-Bervera, E.; Acton, G.; Krasa, D.

    2014-12-01

    Ocean Drilling Program (ODP)/Integrated ODP (IODP) Hole 1256D (6°44.1' N, 91°56.1' W) on the Cocos Plate occurs in 15.2 Ma oceanic crust generated by superfast seafloor spreading. Presently, it is the only drill hole that has sampled all three oceanic crust layers in a tectonically undisturbed setting. Here we interpret down-hole trends in several rock-magnetic parameters with fuzzy c-means cluster analysis, a multivariate statistical technique. The parameters include the magnetization ratio, the coercivity ratio, the coercive force, the low-field susceptibility, and the Curie temperature. Through their combined multivariate analysis, the effects of magmatic and hydrothermal processes can be evaluated. The optimal number of clusters - a key point in the analysis because there is no a priori information on this - was determined through a combination of approaches: by calculation of several cluster validity indices, by testing for coherent cluster distributions on non-linear-map plots, and importantly by testing for stability of the cluster solution from all possible starting points. Here, we consider a solution robust if the cluster allocation is independent of the starting configuration. The five-cluster solution appeared to be robust. Three clusters are distinguished in the extrusive segment of the Hole that express increasing hydrothermal alteration of the lavas. The sheeted dike and gabbro portions are characterized by two clusters, both with higher coercivities than in lava samples. Extensive alteration, however, can obliterate magnetic property differences between lavas, dikes, and gabbros. The imprint of thermochemical alteration on the iron-titanium oxides is only partially related to the porosity of the rocks. All clusters display rock magnetic characteristics in line with a stable NRM. This implies that the entire sampled sequence of ocean crust can contribute to marine magnetic anomalies. Determination of the absolute paleointensity with thermal techniques is not straightforward because of the propensity of oxyexsolution during laboratory heating and/or the presence of intergrowths. The upper part of the extrusive sequence, the granoblastic portion of the dikes, and moderately altered gabbros may contain a comparatively uncontaminated thermoremanent magnetization.
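
    The stability criterion described in this abstract (a solution counts as robust when the cluster allocation does not depend on the starting configuration) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic two-group data, the fuzziness exponent m = 2, and the use of a handful of random seeds in place of "all possible starting points" are assumptions made purely for illustration.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, n_iter=300, tol=1e-9, seed=0):
    """Basic fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random starting memberships; each row is normalized to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(n_iter):
        # Centers are membership-weighted means (weights u**m).
        centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            wsum = sum(w)
            centers.append([sum(w[i] * points[i][j] for i in range(n)) / wsum
                            for j in range(dim)])
        # Memberships follow inverse distance to each center.
        shift = 0.0
        for i in range(n):
            inv = [max(math.dist(points[i], ck), 1e-12) ** (-2.0 / (m - 1.0))
                   for ck in centers]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        if shift < tol:
            break
    return centers, U

def hard_partition(U):
    """Co-membership pattern, which does not depend on how clusters are numbered."""
    labels = [row.index(max(row)) for row in U]
    return [[a == b for b in labels] for a in labels]

def is_stable(points, c, seeds=range(5)):
    """True if every starting configuration yields the same hard allocation."""
    partitions = [hard_partition(fuzzy_c_means(points, c, seed=s)[1]) for s in seeds]
    return all(p == partitions[0] for p in partitions)

# Hypothetical data: two well-separated groups in a 2-parameter space.
rng = random.Random(42)
group_a = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(40)]
group_b = [[rng.gauss(10.0, 0.5), rng.gauss(10.0, 0.5)] for _ in range(40)]
samples = group_a + group_b
print(is_stable(samples, 2))
```

    Comparing co-membership patterns rather than raw labels avoids having to match cluster numbers between runs. In practice such a check would be repeated for each candidate number of clusters and combined with validity indices, as the abstract describes.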

  2. Mathematical Intelligence and Mathematical Creativity: A Causal Relationship

    ERIC Educational Resources Information Center

    Tyagi, Tarun Kumar

    2017-01-01

    This study investigated the causal relationship between mathematical creativity and mathematical intelligence. Four hundred thirty-nine 8th-grade students, aged 11 to 14 years, were included in the sample of this study by random cluster technique on which mathematical creativity and Hindi adaptation of mathematical intelligence test…

  3. Automatic Thesaurus Generation for an Electronic Community System.

    ERIC Educational Resources Information Center

    Chen, Hsinchun; And Others

    1995-01-01

    This research reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used include term filtering, automatic indexing, and cluster analysis. The Worm Community System, used by molecular biologists studying the nematode worm C. elegans, was used as the testbed for this research.…

  4. Collected Notes on the Workshop for Pattern Discovery in Large Databases

    NASA Technical Reports Server (NTRS)

    Buntine, Wray (Editor); Delalto, Martha (Editor)

    1991-01-01

    These collected notes are a record of material presented at the Workshop. The notes address core data analysis tasks that have traditionally required statistical or pattern recognition techniques. Some of the core tasks include classification, discrimination, clustering, supervised and unsupervised learning, discovery and diagnosis, i.e., general pattern discovery.

  5. Evaluating Mixture Modeling for Clustering: Recommendations and Cautions

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2011-01-01

    This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magidson,…

  6. Discovery, biosynthesis, and rational engineering of novel enterocin and wailupemycin polyketide analogues.

    PubMed

    Kalaitzis, John A

    2013-01-01

    The marine actinomycete Streptomyces maritimus produces a structurally diverse set of unusual polyketide natural products including the major metabolite enterocin. Investigations of enterocin biosynthesis revealed that the unique carbon skeleton is derived from an aromatic polyketide pathway which is genetically coded by the 21.3 kb enc gene cluster in S. maritimus. Characterization of the enc biosynthesis gene cluster and subsequent manipulation of it via heterologous expression and/or mutagenesis enabled the discovery of other enc-based metabolites that were produced in only very minor amounts in the wild type. Also described are techniques used to harness the enterocin biosynthetic machinery in order to generate unnatural enc-derived polyketide analogues. This review focuses upon the molecular methods used in combination with classical natural products detection and isolation techniques to access minor metabolites of the S. maritimus secondary metabolome.

  7. Gold nanoparticles for cancer detection and treatment: The role of adhesion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oni, Y.; Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey 08544; Hao, K.

    2014-02-28

    This paper presents the results of an experimental study of the effects of adhesion between gold nanoparticles and surfaces that are relevant to the potential applications in cancer detection and treatment. Adhesion is measured using a dip coating/atomic force microscopy (DC/AFM) technique. The adhesion forces are obtained for dip-coated gold nanoparticles that interact with peptide or antibody-based molecular recognition units (MRUs) that attach specifically to breast cancer cells. They include MRUs that attach specifically to receptors on breast cancer cells. Adhesion forces between anti-cancer drugs such as paclitaxel, and the constituents of MRU-conjugated Au nanoparticle clusters, are measured using force microscopy techniques. The implications of the results are then discussed for the design of robust gold nanoparticle clusters and for potential applications in localized drug delivery and hyperthermia.

  8. Clustering: An Interactive Technique to Enhance Learning in Biology.

    ERIC Educational Resources Information Center

    Ambron, Joanna

    1988-01-01

    Explains an interdisciplinary approach to biology and writing which increases students' mastery of vocabulary, scientific concepts, creativity, and expression. Describes modifications of the clustering technique used to summarize lectures, integrate reading and understand textbook material. (RT)

  9. Low-energy collisions of helium clusters with size-selected cobalt cluster ions

    NASA Astrophysics Data System (ADS)

    Odaka, Hideho; Ichihashi, Masahiko

    2017-04-01

    Collisions of helium clusters with size-selected cobalt cluster ions, Co_m^+ (m ≤ 5), were studied experimentally by using a merging beam technique. The product ions, Co_m^+He_n (cluster complexes), were mass-analyzed, and this result indicates that more than 20 helium atoms can be attached onto Co_m^+ at the relative velocities of 10^3 m/s. The measured size distributions of the cluster complexes indicate that there are relatively stable complexes: Co_2^+He_n (n = 2, 4, 6, and 12), Co_3^+He_n (n = 3, 6), Co_4^+He_4, and Co_5^+He_n (n = 3, 6, 8, and 10). These stabilities are explained in terms of their geometric structures. The yields of the cluster complexes were also measured as a function of the relative velocity (1 × 10^2 to 4 × 10^3 m/s), and this result demonstrates that the main interaction in the collision process changes with the increase of the collision energy from the electrostatic interaction, which includes the induced deformation of He_N, to the hard-sphere interaction. Supplementary material in the form of one pdf file available from the Journal web page at https://doi.org/10.1140/epjd/e2017-80015-0

  10. I. Excluded volume effects in Ising cluster distributions and nuclear multifragmentation. II. Multiple-chance effects in alpha-particle evaporation

    NASA Astrophysics Data System (ADS)

    Breus, Dimitry Eugene

    In Part I, geometric clusters of the Ising model are studied as possible model clusters for nuclear multifragmentation. These clusters may not be considered as non-interacting (ideal gas) due to the excluded volume effect, which is predominantly an artifact of the cluster's finite size. Interaction significantly complicates the use of clusters in the analysis of thermodynamic systems. Stillinger's theory is used as a basis for the analysis, which within the RFL (Reiss, Frisch, Lebowitz) fluid-of-spheres approximation produces a prediction for cluster concentrations well obeyed by geometric clusters of the Ising model. If the thermodynamic condition of phase coexistence is met, these concentrations can be incorporated into a differential equation procedure of moderate complexity to elucidate the liquid-vapor phase diagram of the system with cluster interaction included. The drawback of increased complexity is outweighed by the reward of greater accuracy of the phase diagram, as is demonstrated by the Ising model. A novel nuclear-cluster analysis procedure is developed by modifying Fisher's model to contain cluster interaction and employing the differential equation procedure to obtain thermodynamic variables. With this procedure applied to geometric clusters, guidelines are developed for detecting the excluded volume effect in nuclear multifragmentation. In Part II, an explanation is offered for the recently observed oscillations in the energy spectra of alpha-particles emitted from hot compound nuclei. Contrary to what was previously expected, the oscillations are assumed to be caused by the multiple-chance nature of alpha-evaporation. In a semi-empirical fashion this assumption is successfully confirmed by a technique of two-spectra decomposition which treats experimental alpha-spectra as having contributions from at least two independent emitters. Building upon the success of the multiple-chance explanation of the oscillations, Moretto's single-chance evaporation theory is augmented to include multiple-chance emission and tested on experimental data to yield positive results.

  11. K-means-clustering-based fiber nonlinearity equalization techniques for 64-QAM coherent optical communication system.

    PubMed

    Zhang, Junfeng; Chen, Wei; Gao, Mingyi; Shen, Gangxiang

    2017-10-30

    In this work, we proposed two k-means-clustering-based algorithms to mitigate the fiber nonlinearity for 64-quadrature amplitude modulation (64-QAM) signal, the training-sequence assisted k-means algorithm and the blind k-means algorithm. We experimentally demonstrated the proposed k-means-clustering-based fiber nonlinearity mitigation techniques in 75-Gb/s 64-QAM coherent optical communication system. The proposed algorithms have reduced clustering complexity and low data redundancy and they are able to quickly find appropriate initial centroids and select correctly the centroids of the clusters to obtain the global optimal solutions for large k value. We measured the bit-error-ratio (BER) performance of 64-QAM signal with different launched powers into the 50-km single mode fiber and the proposed techniques can greatly mitigate the signal impairments caused by the amplified spontaneous emission noise and the fiber Kerr nonlinearity and improve the BER performance.
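
    As a rough illustration of the training-sequence-assisted idea mentioned in this abstract, the sketch below seeds k-means centroids from known training symbols and then refines them on unlabeled payload samples. It is not the authors' algorithm: the toy channel (a power-dependent phase rotation plus Gaussian noise, standing in for Kerr nonlinearity and ASE noise), the parameter values, and all function names are assumptions.

```python
import cmath
import random

LEVELS = [-7, -5, -3, -1, 1, 3, 5, 7]
CONSTELLATION = [complex(i, q) for i in LEVELS for q in LEVELS]  # ideal 64-QAM grid

def channel(symbol, rng, gamma=0.002, sigma=0.15):
    """Toy distortion (an assumption): power-dependent phase rotation plus noise."""
    rotated = symbol * cmath.exp(1j * gamma * abs(symbol) ** 2)
    return rotated + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))

def nearest(sample, centroids):
    """Index of the centroid closest to the received sample."""
    return min(range(len(centroids)), key=lambda k: abs(sample - centroids[k]))

def training_centroids(tx_idx, rx):
    """Initial centroids: mean received sample per known training symbol.

    Assumes every constellation point appears at least once in the training data.
    """
    sums = [0j] * len(CONSTELLATION)
    counts = [0] * len(CONSTELLATION)
    for k, r in zip(tx_idx, rx):
        sums[k] += r
        counts[k] += 1
    return [s / n for s, n in zip(sums, counts)]

def kmeans_refine(rx, centroids, n_iter=5):
    """Lloyd iterations on unlabeled payload samples, starting from given centroids."""
    centroids = list(centroids)
    for _ in range(n_iter):
        sums = [0j] * len(centroids)
        counts = [0] * len(centroids)
        for r in rx:
            k = nearest(r, centroids)
            sums[k] += r
            counts[k] += 1
        # Keep the old centroid when a cluster receives no samples.
        centroids = [s / n if n else c for s, n, c in zip(sums, counts, centroids)]
    return centroids

rng = random.Random(1)
# Hypothetical training sequence: every symbol sent 8 times.
train_idx = [k for k in range(64) for _ in range(8)]
train_rx = [channel(CONSTELLATION[k], rng) for k in train_idx]
# Hypothetical payload: 1000 random symbols.
data_idx = [rng.randrange(64) for _ in range(1000)]
data_rx = [channel(CONSTELLATION[k], rng) for k in data_idx]

centroids = kmeans_refine(data_rx, training_centroids(train_idx, train_rx))
grid_errors = sum(nearest(r, CONSTELLATION) != k for r, k in zip(data_rx, data_idx))
kmeans_errors = sum(nearest(r, centroids) != k for r, k in zip(data_rx, data_idx))
print("symbol errors, ideal grid:", grid_errors, "k-means centroids:", kmeans_errors)
```

    In this toy setup the outer constellation points are rotated past the fixed decision boundaries of the ideal grid, so decisions based on the learned centroids should produce fewer symbol errors. A blind variant would have to obtain its initial centroids without the labeled training symbols.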

  12. Ages of Extragalactic Intermediate-Age Star Clusters

    NASA Technical Reports Server (NTRS)

    Flower, P. J.

    1983-01-01

    A dating technique for faint, distant star clusters observable in the local group of galaxies with the space telescope is discussed. Color-magnitude diagrams of Magellanic Cloud clusters are mentioned along with the metallicity of star clusters.

  13. Genetic variability of citrinin-producing Penicillium citrinum strains as occupational health hazards in northern Iran.

    PubMed

    Khosravi, Ali Reza; Sheikhkarami, Mojgan; Shokri, Hojjatollah; Sabokbar, Azar

    2012-12-01

    We evaluated the ability of randomly amplified polymorphic DNA (RAPD) to type citrinin-producing Penicillium citrinum (P. citrinum) strains recovered from the forest's air in northern Iran. A total of 12 P. citrinum strains (P1-P12) were characterised by citrinin production and random amplification of polymorphic DNA (RAPD) technique. All the strains produced citrinin with levels ranging from 1.5 μg mL(-1) to 39.6 μg mL(-1) (average value: 12.68 μg mL(-1)). Of 11 primers tested, eight primers produced polymorphic amplification patterns. These primers generated a total of 105 reproducible RAPD bands, averaging to 13.1 bands per primer. Dendrogram for each primer indicating the distance of the strains to each other was constructed. RAPD results showed that the collected strains constituted four different clusters. The first cluster included two isolates (P1 and P3). The second cluster included seven isolates (P2, P4, P5, P6, P7, P8, and P10). The third and fourth clusters included one isolate (P9) and two isolates (P11 and P12), respectively. We concluded that RAPD analysis might be used in providing genotypic characters for toxigenic P. citrinum strains typing in epidemiological investigations and public health related risk assessment.

  14. Constraining the mass–richness relationship of redMaPPer clusters with angular clustering

    DOE PAGES

    Baxter, Eric J.; Rozo, Eduardo; Jain, Bhuvnesh; ...

    2016-08-04

    The potential of using cluster clustering for calibrating the mass–richness relation of galaxy clusters has been recognized theoretically for over a decade. In this paper, we demonstrate the feasibility of this technique to achieve high-precision mass calibration using redMaPPer clusters in the Sloan Digital Sky Survey North Galactic Cap. By including cross-correlations between several richness bins in our analysis, we significantly improve the statistical precision of our mass constraints. The amplitude of the mass–richness relation is constrained to 7 per cent statistical precision by our analysis. However, the error budget is systematics dominated, reaching a 19 per cent total error that is dominated by theoretical uncertainty in the bias–mass relation for dark matter haloes. We confirm the result from Miyatake et al. that the clustering amplitude of redMaPPer clusters depends on galaxy concentration as defined therein, and we provide additional evidence that this dependence cannot be sourced by mass dependences: some other effect must account for the observed variation in clustering amplitude with galaxy concentration. Assuming that the observed dependence of redMaPPer clustering on galaxy concentration is a form of assembly bias, we find that such effects introduce a systematic error on the amplitude of the mass–richness relation that is comparable to the error bar from statistical noise. Finally, the results presented here demonstrate the power of cluster clustering for mass calibration and cosmology provided the current theoretical systematics can be ameliorated.

  15. Validation of the (GTG)(5)-rep-PCR fingerprinting technique for rapid classification and identification of acetic acid bacteria, with a focus on isolates from Ghanaian fermented cocoa beans.

    PubMed

    De Vuyst, Luc; Camu, Nicholas; De Winter, Tom; Vandemeulebroecke, Katrien; Van de Perre, Vincent; Vancanneyt, Marc; De Vos, Paul; Cleenwerck, Ilse

    2008-06-30

    Amplification of repetitive bacterial DNA elements through the polymerase chain reaction (rep-PCR fingerprinting) using the (GTG)(5) primer, referred to as (GTG)(5)-PCR fingerprinting, was found a promising genotypic tool for rapid and reliable speciation of acetic acid bacteria (AAB). The method was evaluated with 64 AAB reference strains, including 31 type strains, and 132 isolates from Ghanaian, fermented cocoa beans, and was validated with DNA:DNA hybridization data. Most reference strains, except for example all Acetobacter indonesiensis strains and Gluconacetobacter liquefaciens LMG 1509, grouped according to their species designation, indicating the usefulness of this technique for identification to the species level. Moreover, exclusive patterns were obtained for most strains, suggesting that the technique can also be used for characterization below species level or typing of AAB strains. The (GTG)(5)-PCR fingerprinting allowed us to differentiate four major clusters among the fermented cocoa bean isolates, namely A. pasteurianus (cluster I, 100 isolates), A. syzygii- or A. lovaniensis-like (cluster II, 23 isolates), and A. tropicalis-like (clusters III and IV containing 4 and 5 isolates, respectively). A. syzygii-like and A. tropicalis-like strains from cocoa bean fermentations were reported for the first time. Validation of the method and indications for reclassifications of AAB species and existence of new Acetobacter species were obtained through 16S rRNA sequencing analyses and DNA:DNA hybridizations. Reclassifications refer to A. aceti LMG 1531, Ga. xylinus LMG 1518, and Ga. xylinus subsp. sucrofermentans LMG 18788(T).

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bonamigo, M.; Grillo, C.; Ettori, S.

    We present a novel approach for a combined analysis of X-ray and gravitational lensing data and apply this technique to the merging galaxy cluster MACS J0416.1–2403. The method exploits the information on the intracluster gas distribution that comes from a fit of the X-ray surface brightness and then includes the hot gas as a fixed mass component in the strong-lensing analysis. With our new technique, we can separate the collisional from the collision-less diffuse mass components, thus obtaining a more accurate reconstruction of the dark matter distribution in the core of a cluster. We introduce an analytical description of the X-ray emission coming from a set of dual pseudo-isothermal elliptical mass distributions, which can be directly used in most lensing softwares. By combining Chandra observations with Hubble Frontier Fields imaging and Multi Unit Spectroscopic Explorer spectroscopy in MACS J0416.1–2403, we measure a projected gas-to-total mass fraction of approximately 10% at 350 kpc from the cluster center. Compared to the results of a more traditional cluster mass model (diffuse halos plus member galaxies), we find a significant difference in the cumulative projected mass profile of the dark matter component and that the dark matter over total mass fraction is almost constant, out to more than 350 kpc. In the coming era of large surveys, these results show the need of multiprobe analyses for detailed dark matter studies in galaxy clusters.

  17. Joining X-Ray to Lensing: An Accurate Combined Analysis of MACS J0416.1-2403

    NASA Astrophysics Data System (ADS)

    Bonamigo, M.; Grillo, C.; Ettori, S.; Caminha, G. B.; Rosati, P.; Mercurio, A.; Annunziatella, M.; Balestra, I.; Lombardi, M.

    2017-06-01

    We present a novel approach for a combined analysis of X-ray and gravitational lensing data and apply this technique to the merging galaxy cluster MACS J0416.1-2403. The method exploits the information on the intracluster gas distribution that comes from a fit of the X-ray surface brightness and then includes the hot gas as a fixed mass component in the strong-lensing analysis. With our new technique, we can separate the collisional from the collision-less diffuse mass components, thus obtaining a more accurate reconstruction of the dark matter distribution in the core of a cluster. We introduce an analytical description of the X-ray emission coming from a set of dual pseudo-isothermal elliptical mass distributions, which can be directly used in most lensing softwares. By combining Chandra observations with Hubble Frontier Fields imaging and Multi Unit Spectroscopic Explorer spectroscopy in MACS J0416.1-2403, we measure a projected gas-to-total mass fraction of approximately 10% at 350 kpc from the cluster center. Compared to the results of a more traditional cluster mass model (diffuse halos plus member galaxies), we find a significant difference in the cumulative projected mass profile of the dark matter component and that the dark matter over total mass fraction is almost constant, out to more than 350 kpc. In the coming era of large surveys, these results show the need of multiprobe analyses for detailed dark matter studies in galaxy clusters.

  18. Solid state and aqueous behavior of uranyl peroxide cage clusters

    NASA Astrophysics Data System (ADS)

    Pellegrini, Kristi Lynn

    Uranyl peroxide cage clusters include a large family of more than 50 published clusters of a variety of sizes, which can incorporate various ligands including pyrophosphate and oxalate. Previous studies have reported that uranyl clusters can be used as a method to separate uranium from a solid matrix, with potential applications in reprocessing of irradiated nuclear fuel. Because of the potential applications of these novel structures in an advanced nuclear fuel cycle and their likely presence in areas of contamination, it is important to understand their behavior in both solid state and aqueous systems, including complex environments where other ions are present. In this thesis, I examine the aqueous behavior of U24Pp12, as well as aqueous cluster systems with added mono-, di-, and trivalent cations. The resulting solutions were analyzed using dynamic light scattering and ultra-small angle X-ray scattering to evaluate the species in solution. Precipitates of these systems were analyzed using powder X-ray diffraction, X-ray fluorescence spectrometry, and Raman spectroscopy. The results of these analyses demonstrate the importance of cation size, charge, and concentration of added cations on the aqueous behavior of uranium macroions. Specifically, aggregates of various sizes and shapes form rapidly upon addition of cations, and in some cases these aggregates appear to precipitate into an X-ray amorphous material that still contains U24Pp12 clusters. In addition, I probe aggregation of U24Pp12 and U60, another uranyl peroxide cage cluster, in mixed solvent water-alcohol systems. The aggregation of uranyl clusters in water-alcohol systems is a result of hydrogen bonding with polar organic molecules and the reduction of the dielectric constant of the system. Studies of aggregation of uranyl clusters also allow for comparison between the newer uranyl polyoxometalate family and century-old transition metal polyoxometalates. To complement the solution studies of uranyl cage clusters, solid state analyses of U24Pp12 are presented, including single crystal X-ray diffraction and preliminary single crystal neutron diffraction. Solid state analyses are used to probe the complicated bonding environments between U24Pp12 and crystallized counterions, giving further insight into the importance of cluster protonation and counterions in uranyl cluster systems. The combination of solid state and solution techniques provides information about the complicated nature of uranyl peroxide nanoclusters, and insight towards future applications of clusters in the advanced nuclear fuel cycle and the environment.

  19. The role of nickel in radiation damage of ferritic alloys

    DOE PAGES

    Osetsky, Y.; Anento, Napoleon; Serra, Anna; ...

    2014-11-26

    According to modern theory, damage evolution under neutron irradiation depends on the fraction of self-interstitial atoms (SIAs) produced in the form of one-dimensional glissile clusters. These clusters, having a low interaction cross-section with other defects, are absorbed mainly by grain boundaries and dislocations, creating the so-called production bias. It is known empirically that the addition of certain alloying elements influences many radiation effects, including swelling; however, the mechanisms are unknown in many cases. In this study, we report the results of an extensive multi-technique atomistic level modeling study of SIA cluster mobility in body-centered cubic Fe–Ni alloys. We have found that Ni interacts strongly with the periphery of clusters, affecting their mobility. The total effect is defined by the number of Ni atoms interacting with the cluster at the same time and can be significant, even in low-Ni alloys. Thus a 1 nm (37 SIAs) cluster is practically immobile at T < 500 K in the Fe–0.8 at.% Ni alloy. Increasing cluster size and Ni content enhances cluster immobilization. Finally, this effect should have quite broad consequences in void swelling, matrix damage accumulation and radiation induced hardening and the results obtained help to better understand and predict the effects of radiation in Fe–Ni ferritic alloys.

  20. Is It Feasible to Identify Natural Clusters of TSC-Associated Neuropsychiatric Disorders (TAND)?

    PubMed

    Leclezio, Loren; Gardner-Lubbe, Sugnet; de Vries, Petrus J

    2018-04-01

    Tuberous sclerosis complex (TSC) is a genetic disorder with multisystem involvement. The lifetime prevalence of TSC-Associated Neuropsychiatric Disorders (TAND) is in the region of 90% in an apparently unique, individual pattern. This "uniqueness" poses significant challenges for diagnosis, psycho-education, and intervention planning. To date, no studies have explored whether there may be natural clusters of TAND. The purpose of this feasibility study was (1) to investigate the practicability of identifying natural TAND clusters, and (2) to identify appropriate multivariate data analysis techniques for larger-scale studies. TAND Checklist data were collected from 56 individuals with a clinical diagnosis of TSC (n = 20 from South Africa; n = 36 from Australia). Using R, the open-source statistical platform, mean squared contingency coefficients were calculated to produce a correlation matrix, and various cluster analyses and exploratory factor analysis were examined. Ward's method rendered six TAND clusters with good face validity and significant convergence with a six-factor exploratory factor analysis solution. The "bottom-up" data-driven strategies identified a "scholastic" cluster of TAND manifestations, an "autism spectrum disorder-like" cluster, a "dysregulated behavior" cluster, a "neuropsychological" cluster, a "hyperactive/impulsive" cluster, and a "mixed/mood" cluster. These feasibility results suggest that a combination of cluster analysis and exploratory factor analysis methods may be able to identify clinically meaningful natural TAND clusters. Findings require replication and expansion in larger datasets, and could include quantification of cluster or factor scores at an individual level. Copyright © 2018 Elsevier Inc. All rights reserved.

  1. Spatial characterization of dissolved trace elements and heavy metals in the upper Han River (China) using multivariate statistical techniques.

    PubMed

    Li, Siyue; Zhang, Quanfa

    2010-04-15

    A data matrix (4032 observations), obtained during a 2-year monitoring period (2005-2006) from 42 sites in the upper Han River, is subjected to various multivariate statistical techniques including cluster analysis, principal component analysis (PCA), factor analysis (FA), correlation analysis and analysis of variance to determine the spatial characterization of dissolved trace elements and heavy metals. Our results indicate that waters in the upper Han River are primarily polluted by Al, As, Cd, Pb, Sb and Se, and the potential pollutants include Ba, Cr, Hg, Mn and Ni. The spatial distribution of trace metals indicates that the polluted sections are mainly concentrated in the Danjiang, Danjiangkou Reservoir catchment and Hanzhong Plain, and the most contaminated river is in the Hanzhong Plain. Q-model clustering depends on the geographical location of sampling sites and groups the 42 sampling sites into four clusters, i.e., Danjiang, Danjiangkou Reservoir region (lower catchment), upper catchment and one river in headwaters pertaining to water quality. The headwaters, Danjiang and lower catchment, and upper catchment correspond to very highly polluted, moderately polluted and relatively low polluted regions, respectively. Additionally, PCA/FA and correlation analysis demonstrate that Al, Cd, Mn, Ni, Fe, Si and Sr are controlled by natural sources, whereas the other metals appear to be primarily controlled by anthropogenic origins, although geogenic sources also contribute to them. 2009 Elsevier B.V. All rights reserved.

  2. DICON: interactive visual analysis of multidimensional clusters.

    PubMed

    Cao, Nan; Gotz, David; Sun, Jimeng; Qu, Huamin

    2011-12-01

    Clustering as a fundamental data analysis technique has been widely used in many analytic applications. However, it is often difficult for users to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, high-level statistical information about the clusters is often needed for users to evaluate cluster quality while a detailed display of multidimensional attributes of the data is necessary to understand the meaning of clusters. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a treemap-like icon to represent a multidimensional cluster, and the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm which can generate similar icons for similar clusters, making comparisons of clusters easier. User interaction and clutter reduction are integrated into the system to help users more effectively analyze and refine clustering results for large datasets. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis. © 2011 IEEE

  3. The Search for Bright Variable Stars in Open Cluster NGC 6819.

    NASA Astrophysics Data System (ADS)

    Talamantes, Antonio; Sandquist, E. L.

    2009-01-01

    During this research period, data was taken for seven nights at the 1 m telescope at Mt. Laguna Observatory for the open cluster NGC 6819. For four of the nights, data was taken using a V-band filter. For the remaining three nights, the data was taken using an R-band filter. Photometry was done using the ISIS image subtraction package. Six new variable stars were located using these techniques. These variable types include a pulsating variable and five detached eclipsing binaries. Of the detached eclipsing binaries, three are near the cluster turnoff and two are in the blue straggler region (and one of these has total eclipses). Nine previously known variables (six contact binaries, two detached eclipsing binaries and one near-contact binary) were also studied.

  4. An unsupervised classification approach for analysis of Landsat data to monitor land reclamation in Belmont county, Ohio

    NASA Technical Reports Server (NTRS)

    Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.

    1981-01-01

    Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.

  5. "I Keep That Hush-Hush": Male Survivors of Sexual Abuse and the Challenges of Disclosure

    ERIC Educational Resources Information Center

    Sorsoli, Lynn; Kia-Keating, Maryam; Grossman, Frances K.

    2008-01-01

    Disclosure is a prominent variable in child sexual abuse research, but little research has examined male disclosure experiences. Sixteen male survivors of childhood sexual abuse were interviewed regarding experiences of disclosure. Analytic techniques included a grounded theory approach to coding and the use of conceptually clustered matrices.…

  6. Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm

    NASA Astrophysics Data System (ADS)

    Frisca, Bustamam, Alhadi; Siswantining, Titin

    2017-03-01

    Clustering is a data analysis method that aims to place data with similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, the spectral clustering method emerged from the concepts of spectral graph theory. The spectral clustering method needs a partitioning algorithm. There are several partitioning methods, including PAM, SOM, fuzzy c-means, and k-means. Based on the research done by Capital and Choudhury in 2013, when using Euclidean distance the k-means algorithm provides better accuracy than the PAM algorithm. So in this paper we use k-means as our partitioning algorithm. The major advantage of spectral clustering is in reducing data dimension, especially in this case to reduce the dimension of a large microarray dataset. A microarray is a small chip made of a glass plate containing thousands and even tens of thousands of kinds of genes in DNA fragments derived from doubling cDNA. Microarray data are widely used to detect cancer, for example carcinoma, in which cancer cells express abnormalities in their genes. The purpose of this research is to place data with high similarity in the same group and data with low similarity in different groups. In this research, the carcinoma microarray data comprise 7457 genes. Partitioning with the k-means algorithm yields two clusters.

  7. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    PubMed

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.

  8. The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions

    NASA Astrophysics Data System (ADS)

    Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.

    2018-04-01

    The gas cluster ion beam (GCIB) technique was used to smooth silicon carbide crystal surfaces. The effect of processing by two inert cluster ions, argon and xenon, was quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters have not yet been reported. Scanning probe microscopy and high resolution transmission electron microscopy techniques were used for the analysis of the surface roughness and surface crystal layer quality. The gas cluster ion beam processing results in surface relief smoothing down to an average roughness of about 1 nm for both elements. It was shown that xenon as the working gas is more effective: the sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters, respectively.

  9. Comprehensive Genomic Analyses of the OM43 Clade, Including a Novel Species from the Red Sea, Indicate Ecotype Differentiation among Marine Methylotrophs

    PubMed Central

    Jimenez-Infante, Francy; Ngugi, David Kamanda; Vinu, Manikandan; Alam, Intikhab; Kamau, Allan Anthony; Blom, Jochen; Bajic, Vladimir B.

    2015-01-01

    The OM43 clade within the family Methylophilaceae of Betaproteobacteria represents a group of methylotrophs that play important roles in the metabolism of C1 compounds in marine environments and other aquatic environments around the globe. Using dilution-to-extinction cultivation techniques, we successfully isolated a novel species of this clade (here designated MBRS-H7) from the ultraoligotrophic open ocean waters of the central Red Sea. Phylogenomic analyses indicate that MBRS-H7 is a novel species that forms a distinct cluster together with isolate KB13 from Hawaii (Hawaii-Red Sea [H-RS] cluster) that is separate from the cluster represented by strain HTCC2181 (from the Oregon coast). Phylogenetic analyses using the robust 16S-23S internal transcribed spacer revealed a potential ecotype separation of the marine OM43 clade members, which was further confirmed by metagenomic fragment recruitment analyses that showed trends of higher abundance in low-chlorophyll and/or high-temperature provinces for the H-RS cluster but a preference for colder, highly productive waters for the HTCC2181 cluster. This potential environmentally driven niche differentiation is also reflected in the metabolic gene inventories, which in the case of the H-RS cluster include those conferring resistance to high levels of UV irradiation, temperature, and salinity. Interestingly, we also found different energy conservation modules between these OM43 subclades, namely, the existence of the NADH:quinone oxidoreductase complex I (NUO) system in the H-RS cluster and the nonhomologous NADH:quinone oxidoreductase (NQR) system in the HTCC2181 cluster, which might have implications for their overall energetic yields. PMID:26655752

  10. Chirped Pulse Rotational Spectroscopy of a Single THUJONE+WATER Sample

    NASA Astrophysics Data System (ADS)

    Kisiel, Zbigniew; Perez, Cristobal; Schnell, Melanie

    2016-06-01

    Rotational spectroscopy of natural products dates back over 35 years, to when six different species including thujone were investigated. Nevertheless, the technique of low-resolution microwave spectroscopy employed therein allowed determination of only a single conformational parameter. Advances in sensitivity and resolution possible with supersonic expansion techniques of rotational spectroscopy made possible much more detailed studies such that, for example, the structures of first camphor, and then of multiple clusters of camphor with water were determined. We revisited the rotational spectrum of the well known thujone molecule by using the chirped pulse spectrometer in Hamburg. The spectrum of a single thujone sample was recorded with an admixture of ¹⁸O-enriched water and was successively analysed using an array of techniques, including the AUTOFIT program, the AABS package and the STRFIT program. We have, so far, been able to assign rotational transitions of α-thujone, β-thujone, another thujone isomer, fenchone, and several thujone-water clusters in the spectrum of this single sample. Natural abundance molecular populations were sufficient to determine precise heavy atom backbones of thujone and fenchone, and H₂¹⁸O enrichment delivered water molecule orientations in the hydrated clusters. An overview of these results will be presented. Z. Kisiel, A.C. Legon, JACS 100, 8166 (1978); Z. Kisiel, O. Desyatnyk, E. Białkowska-Jaworska, L. Pszczółkowski, PCCP 5, 820 (2003); C. Pérez, A. Krin, A.L. Steber, J.C. López, Z. Kisiel, M. Schnell, J. Phys. Chem. Lett. 7, 154 (2016); N.A. Seifert, I.A. Finneran, C. Perez, et al., J. Mol. Spectrosc. 312, 12 (2015); Z. Kisiel, L. Pszczółkowski, B.J. Drouin, et al., J. Mol. Spectrosc. 280, 134 (2012); Z. Kisiel, J. Mol. Spectrosc. 218, 58 (2003).

  11. A rapid ATR-FTIR spectroscopic method for detection of sibutramine adulteration in tea and coffee based on hierarchical cluster and principal component analyses.

    PubMed

    Cebi, Nur; Yilmaz, Mustafa Tahsin; Sagdic, Osman

    2017-08-15

    Sibutramine may be illicitly included in herbal slimming foods and supplements marketed as "100% natural" to enhance weight loss. Considering public health and legal regulations, there is an urgent need for effective, rapid and reliable techniques to detect sibutramine in dietetic herbal foods, teas and dietary supplements. This research comprehensively explored, for the first time, detection of sibutramine in green tea, green coffee and mixed herbal tea using the ATR-FTIR spectroscopic technique combined with chemometrics. Hierarchical cluster analysis and principal component analysis (PCA) techniques were employed in the spectral range 2746-2656 cm⁻¹ for classification and discrimination through Euclidean distance and Ward's algorithm. Unadulterated and adulterated samples were classified and discriminated with respect to their sibutramine contents with perfect accuracy without any false prediction. The results suggest that the presence of the active substance could be successfully determined at levels in the range of 0.375-12 mg in a total of 1.75 g of green tea, green coffee and mixed herbal tea by using the FTIR-ATR technique combined with chemometrics. Copyright © 2017 Elsevier Ltd. All rights reserved.
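
    The hierarchical clustering step named in this record (Euclidean distance with Ward's algorithm) can be sketched as follows. This is a minimal illustration and not the authors' chemometric pipeline: the function name `ward_clusters` and the two-value "spectra" are hypothetical, and real spectra would have one value per wavenumber in the chosen range.

```python
def ward_clusters(rows, n_clusters):
    """Agglomerative clustering with Ward's criterion on Euclidean data."""
    # Each cluster is tracked as (member indices, centroid).
    clusters = [([i], list(row)) for i, row in enumerate(rows)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        (ma, ca), (mb, cb) = a, b
        sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(ma) * len(mb) / (len(ma) + len(mb)) * sq

    while len(clusters) > n_clusters:
        # Pick the pair whose merge increases the total variance the least.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        (ma, ca), (mb, cb) = clusters[i], clusters[j]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (ma + mb, centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical absorbance vectors: three "unadulterated" and two "adulterated".
spectra = [(0.10, 0.20), (0.12, 0.19), (0.11, 0.22), (0.80, 0.90), (0.82, 0.88)]
groups = ward_clusters(spectra, 2)
print(groups)
```

    Ward's criterion merges the two clusters whose union adds the least within-cluster variance, which is why it tends to produce compact groups of similar size.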

  12. Intrinsic magnetic properties of bimetallic nanoparticles elaborated by cluster beam deposition.

    PubMed

    Dupuis, V; Khadra, G; Hillion, A; Tamion, A; Tuaillon-Combes, J; Bardotti, L; Tournus, F

    2015-11-14

    In this paper, we present some specific chemical and magnetic order obtained very recently on characteristic bimetallic nanoalloys prepared by mass-selected Low Energy Cluster Beam Deposition (LECBD). We study how the competition between d-atom hybridization, complex structure, morphology and chemical affinity affects their intrinsic magnetic properties at the nanoscale. The structural and magnetic properties of these nanoalloys were investigated using various experimental techniques that include High Resolution Transmission Electron Microscopy (HRTEM), Superconducting Quantum Interference Device (SQUID) magnetometry, as well as synchrotron techniques such as Extended X-ray Absorption Fine Structure (EXAFS) and X-ray Magnetic Circular Dichroism (XMCD). Depending on the chemical nature of the nanoalloys we observe different magnetic responses compared to their bulk counterparts. In particular, we show how specific relaxation in nanoalloys impacts their magnetic anisotropy; and how finite size effects (size reduction) inversely enhance their magnetic moment.

  13. Slow Photoelectron Velocity-Map Imaging of Cryogenically Cooled Anions

    NASA Astrophysics Data System (ADS)

    Weichman, Marissa L.; Neumark, Daniel M.

    2018-04-01

    Slow photoelectron velocity-map imaging spectroscopy of cryogenically cooled anions (cryo-SEVI) is a powerful technique for elucidating the vibrational and electronic structure of neutral radicals, clusters, and reaction transition states. SEVI is a high-resolution variant of anion photoelectron spectroscopy based on photoelectron imaging that yields spectra with energy resolution as high as 1-2 cm⁻¹. The preparation of cryogenically cold anions largely eliminates hot bands and dramatically narrows the rotational envelopes of spectral features, enabling the acquisition of well-resolved photoelectron spectra for complex and spectroscopically challenging species. We review the basis and history of the SEVI method, including recent experimental developments that have improved its resolution and versatility. We then survey recent SEVI studies to demonstrate the utility of this technique in the spectroscopy of aromatic radicals, metal and metal oxide clusters, nonadiabatic interactions between excited states of small molecules, and transition states of benchmark bimolecular reactions.

  14. Glioma grading using cell nuclei morphologic features in digital pathology images

    NASA Astrophysics Data System (ADS)

    Reza, Syed M. S.; Iftekharuddin, Khan M.

    2016-03-01

    This work proposes a computationally efficient cell nuclei morphologic feature analysis technique to characterize brain gliomas in tissue slide images. In this work, our contributions are two-fold: 1) obtain an optimized cell nuclei segmentation method based on the pros and cons of the existing techniques in the literature, 2) extract representative features by k-means clustering of nuclei morphologic features, including area, perimeter, eccentricity, and major axis length. This clustering-based representative feature extraction avoids the shortcomings of extensive tile [1] [2] and nuclear score [3] based methods for brain glioma grading in pathology images. A multilayer perceptron (MLP) is used to classify extracted features into two tumor types: glioblastoma multiforme (GBM) and low grade glioma (LGG). Quantitative scores such as precision, recall, and accuracy are obtained using 66 clinical patients' images from The Cancer Genome Atlas (TCGA) [4] dataset. On average, ~94% accuracy from 10-fold cross-validation confirms the efficacy of the proposed method.
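
    As a rough sketch of the k-means step on nuclei features described in this record, the code below runs Lloyd's algorithm on four-value feature vectors (area, perimeter, eccentricity, major axis length). Several things here are assumptions and not taken from the paper: the feature values are invented, the farthest-point initialisation is chosen only to keep the run deterministic, and treating the cluster centroids as the "representative features" is one plausible reading of the abstract.

```python
def euclid(p, q):
    """Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(euclid(r, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclid(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical nuclei: (area, perimeter, eccentricity, major axis length).
nuclei = [
    (40.0, 23.0, 0.30, 8.0),
    (42.0, 24.0, 0.35, 8.5),
    (41.0, 23.5, 0.32, 8.2),
    (95.0, 40.0, 0.80, 16.0),
    (98.0, 41.0, 0.78, 16.5),
]
centroids, labels = kmeans(nuclei, 2)
print(labels)
```

    In practice the features would normally be standardised first, since area and eccentricity live on very different scales and the larger one would otherwise dominate the Euclidean distance.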

  15. Formation and Assembly of Massive Star Clusters

    NASA Astrophysics Data System (ADS)

    McMillan, Stephen

    The formation of stars and star clusters is a major unresolved problem in astrophysics. It is central to modeling stellar populations and understanding galaxy luminosity distributions in cosmological models. Young massive clusters are major components of starburst galaxies, while globular clusters are cornerstones of the cosmic distance scale and represent vital laboratories for studies of stellar dynamics and stellar evolution. Yet how these clusters form and how rapidly and efficiently they expel their natal gas remain unclear, as do the consequences of this gas expulsion for cluster structure and survival. Also unclear is how the properties of low-mass clusters, which form from small-scale instabilities in galactic disks and inform much of our understanding of cluster formation and star-formation efficiency, differ from those of more massive clusters, which probably formed in starburst events driven by fast accretion at high redshift, or colliding gas flows in merging galaxies. Modeling cluster formation requires simulating many simultaneous physical processes, placing stringent demands on both software and hardware. Simulations of galaxies evolving in cosmological contexts usually lack the numerical resolution to simulate star formation in detail. They do not include detailed treatments of important physical effects such as magnetic fields, radiation pressure, ionization, and supernova feedback. Simulations of smaller clusters include these effects, but fall far short of the mass of even single young globular clusters. With major advances in computing power and software, we can now directly address this problem. We propose to model the formation of massive star clusters by integrating the FLASH adaptive mesh refinement magnetohydrodynamics (MHD) code into the Astrophysical Multi-purpose Software Environment (AMUSE) framework, to work with existing stellar-dynamical and stellar evolution modules in AMUSE. 
All software will be freely distributed on-line, allowing open access to state-of-the-art simulation techniques within a modern, modular software environment. We will follow the gravitational collapse of 0.1-10 million solar-mass gas clouds through star formation and coalescence into a star cluster, modeling in detail the coupling of the gas and the newborn stars. We will study the effects of star formation by detecting accreting regions of gas in self-gravitating, turbulent, MHD, FLASH models that we will translate into collisional dynamical systems of stars modeled with an N-body code, coupled together in the AMUSE framework. Our FLASH models will include treatments of radiative transfer from the newly formed stars, including heating and radiative acceleration of the surrounding gas. Specific questions to be addressed are: (1) How efficiently does the gas in a star forming region form stars, how does this depend on mass, metallicity, and other parameters, and what terminates star formation? What observational predictions can be made to constrain our models? (2) How important are different mechanisms for driving turbulence and removing gas from a cluster: accretion, radiative feedback, and mechanical feedback? (3) How does the infant mortality rate of young clusters depend on the initial properties of the parent cloud? (4) What are the characteristic formation timescales of massive star clusters, and what observable imprints does the assembly process leave on their structure at an age of 10-20 Myr, when formation is essentially complete and many clusters can be observed? These studies are directly relevant to NASA missions at many electromagnetic wavelengths, including Chandra, GALEX, Hubble, and Spitzer. Each traces different aspects of cluster formation and evolution: X-rays trace supernovae, ultraviolet traces young stars, visible colors can distinguish between young blue stars and older red stars, and the infrared directly shows young embedded star clusters.

  16. Min-max hyperellipsoidal clustering for anomaly detection in network security.

    PubMed

    Sarasamma, Suseela T; Zhu, Qiuming A

    2006-08-01

    A novel hyperellipsoidal clustering technique is presented for an intrusion-detection system in network security. Hyperellipsoidal clusters toward maximum intracluster similarity and minimum intercluster similarity are generated from training data sets. The novelty of the technique lies in the fact that the parameters needed to construct higher order data models in general multivariate Gaussian functions are incrementally derived from the data sets using accretive processes. The technique is implemented in a feedforward neural network that uses a Gaussian radial basis function as the model generator. An evaluation based on the inclusiveness and exclusiveness of samples with respect to specific criteria is applied to accretively learn the output clusters of the neural network. One significant advantage of this is its ability to detect individual anomaly types that are hard to detect with other anomaly-detection schemes. Applying this technique, several feature subsets of the tcptrace network-connection records that give above 95% detection at false-positive rates below 5% were identified.

  17. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering.

    PubMed

    Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar

    2015-01-01

    Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to be applied asynchronously, without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted and, relying on the concept of event-related synchronization (ERS) and desynchronization (ERD), a feature vector is formed.

  18. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering

    PubMed Central

    Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar

    2015-01-01

    Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to be applied asynchronously, without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted and, relying on the concept of event-related synchronization (ERS) and desynchronization (ERD), a feature vector is formed. PMID:25972896

  19. Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.

    PubMed

    Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi

    2018-01-01

    Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genomic editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of these genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review will not only summarize the current uses of genomic engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but will also provide perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.

  20. Scalable Parallel Density-based Clustering and Applications

    NASA Astrophysics Data System (ADS)

    Patwary, Mostofa Ali

    2014-04-01

    Recently, density-based clustering algorithms (DBSCAN and OPTICS) have received significant attention from the scientific community due to their unique capability of discovering arbitrarily shaped clusters and eliminating noise data. These algorithms have several applications that require high performance computing, including finding halos and subhalos (clusters) from massive cosmology data in astrophysics, analyzing satellite images, X-ray crystallography, and anomaly detection. However, parallelizing these algorithms is extremely challenging, as they exhibit an inherently sequential data access order and unbalanced workloads, resulting in low parallel efficiency. To break the data access sequentiality and to achieve high parallelism, we develop new parallel algorithms, both for DBSCAN and OPTICS, designed using graph algorithmic techniques. For example, our parallel DBSCAN algorithm exploits the similarities between DBSCAN and computing connected components. Using datasets containing up to a billion floating point numbers, we show that our parallel density-based clustering algorithms significantly outperform the existing algorithms, achieving speedups up to 27.5 on 40 cores on shared memory architecture and speedups up to 5,765 using 8,192 cores on distributed memory architecture. In our experiments, we found that while achieving the scalability, our algorithms produce clustering results with comparable quality to the classical algorithms.
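
    The link between DBSCAN and connected components mentioned in this abstract can be illustrated with a small sequential sketch. It is not the author's parallel algorithm: it is a brute-force, single-threaded illustration in which core points within eps of each other are merged with a union-find structure, and border points are attached afterwards. The names dbscan_union_find and DisjointSet and the sample points are assumptions made for this example.

```python
from math import dist


class DisjointSet:
    """Union-find structure used to merge core points into components."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def dbscan_union_find(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    # Brute-force eps-neighborhoods; each point counts itself as a neighbor.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]

    # Core points within eps of each other fall in one connected component.
    ds = DisjointSet(n)
    for i in range(n):
        if is_core[i]:
            for j in neighbors[i]:
                if is_core[j]:
                    ds.union(i, j)

    labels = [-1] * n
    cluster_ids = {}
    for i in range(n):
        if is_core[i]:
            root = ds.find(i)
            labels[i] = cluster_ids.setdefault(root, len(cluster_ids))
    # Border points take the label of the first neighboring core point.
    for i in range(n):
        if not is_core[i]:
            for j in neighbors[i]:
                if is_core[j]:
                    labels[i] = labels[j]
                    break
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan_union_find(points, eps=1.5, min_pts=3)
print(labels)
```

    Because the union operations can be applied in any order, this formulation avoids the fixed point-visiting order of classical DBSCAN, which is presumably what makes a connected-components view attractive for parallelization.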

  1. Combining self-organizing mapping and supervised affinity propagation clustering approach to investigate functional brain networks involved in motor imagery and execution with fMRI measurements.

    PubMed

    Zhang, Jiang; Liu, Qi; Chen, Huafu; Yuan, Zhen; Huang, Jin; Deng, Lihua; Lu, Fengmei; Zhang, Junpeng; Wang, Yuqing; Wang, Mingwen; Chen, Liangyin

    2015-01-01

    Clustering analysis methods have been widely applied to identifying the functional brain networks of a multitask paradigm. However, the previously used clustering analysis techniques are computationally expensive and thus impractical for clinical applications. In this study a novel method, called SOM-SAPC, that combines self-organizing mapping (SOM) and supervised affinity propagation clustering (SAPC) is proposed and implemented to identify the motor execution (ME) and motor imagery (MI) networks. In SOM-SAPC, SOM is first applied to process the fMRI data, and SAPC is then used to cluster the patterns of functional networks. As a result, SOM-SAPC is able to significantly reduce the computational cost of brain network analysis. Simulation and clinical tests involving ME and MI were conducted based on SOM-SAPC, and the analysis results indicated that functional brain networks were clearly identified with different response patterns and reduced computational cost. In particular, three activation clusters were clearly revealed, which include parts of the visual, ME and MI functional networks. These findings validated that SOM-SAPC is an effective and robust method for analyzing multitask fMRI data.

  2. CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms.

    PubMed

    Kohlhoff, Kai J; Sosnick, Marc H; Hsu, William T; Pande, Vijay S; Altman, Russ B

    2011-08-15

    Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intense and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library and the layout of it is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL from https://simtk.org/home/campaign. Source code can also be obtained through anonymous subversion access as described on https://simtk.org/scm/?group_id=453. kjk33@cantab.net.

  3. Improvements in Ionized Cluster-Beam Deposition

    NASA Technical Reports Server (NTRS)

    Fitzgerald, D. J.; Compton, L. E.; Pawlik, E. V.

    1986-01-01

    Lower temperatures result in higher purity and fewer equipment problems. In cluster-beam deposition, clusters of atoms are formed by an adiabatic expansion nozzle; with proper nozzle design, the expanding vapor cools sufficiently to become supersaturated and form clusters of the material to be deposited. The clusters are ionized, accelerated in an electric field, and then impacted on a substrate, where films form. The improved cluster-beam technique is useful for deposition of refractory metals.

  4. Applications of modern statistical methods to analysis of data in physical science

    NASA Astrophysics Data System (ADS)

    Wicker, James Eric

    Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model, and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information-based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.

  5. Using Geographic Information Systems and Spatial Analysis Methods to Assess Household Water Access and Sanitation Coverage in the SHINE Trial.

    PubMed

    Ntozini, Robert; Marks, Sara J; Mangwadu, Goldberg; Mbuya, Mduduzi N N; Gerema, Grace; Mutasa, Batsirai; Julian, Timothy R; Schwab, Kellogg J; Humphrey, Jean H; Zungu, Lindiwe I

    2015-12-15

    Access to water and sanitation are important determinants of behavioral responses to hygiene and sanitation interventions. We estimated cluster-specific water access and sanitation coverage to inform a constrained randomization technique in the SHINE trial. Technicians and engineers inspected all public access water sources to ascertain seasonality, function, and geospatial coordinates. Households and water sources were mapped using open-source geospatial software. The distance from each household to the nearest perennial, functional, protected water source was calculated, and for each cluster, the median distance and the proportions of households <500 m and >1500 m from such a water source were determined. Cluster-specific sanitation coverage was ascertained using a random sample of 13 households per cluster. These parameters were included as covariates in randomization to optimize balance in water and sanitation access across treatment arms at the start of the trial. The observed high variability between clusters in both parameters suggests that constraining on these factors was needed to reduce risk of bias. © The Author 2015. Published by Oxford University Press for the Infectious Diseases Society of America.
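
    The per-cluster summary described in this abstract (distance to the nearest qualifying water source, then the median and the shares of households below 500 m and above 1500 m) can be sketched as follows. This is only an illustration, not the trial's actual workflow: it assumes coordinates already projected to metres on a flat plane, and the function names and sample coordinates are hypothetical.

```python
from math import hypot
from statistics import median


def nearest_source_distance(household, sources):
    """Distance in metres from one household to its closest water source."""
    hx, hy = household
    return min(hypot(hx - sx, hy - sy) for sx, sy in sources)


def cluster_water_access(households, sources, near_m=500, far_m=1500):
    """Summarize water access for the households of a single cluster."""
    distances = [nearest_source_distance(h, sources) for h in households]
    n = len(distances)
    return {
        "median_m": median(distances),
        "prop_near": sum(d < near_m for d in distances) / n,  # share < 500 m
        "prop_far": sum(d > far_m for d in distances) / n,    # share > 1500 m
    }


# Hypothetical projected coordinates (metres) for one cluster.
sources = [(0, 0), (3000, 0)]
households = [(300, 0), (0, 1000), (3000, 2000), (1500, 0)]
summary = cluster_water_access(households, sources)
print(summary)
```

    With real latitude and longitude values, the planar hypot call would need to be replaced by a geodesic distance or preceded by a map projection.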

  6. Traveling-cluster approximation for uncorrelated amorphous systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sen, A.K.; Mills, R.; Kaplan, T.

    1984-11-15

    We have developed a formalism for including cluster effects in the one-electron Green's function for a positionally disordered (liquid or amorphous) system without any correlation among the scattering sites. This method is an extension of the technique known as the traveling-cluster approximation (TCA) originally obtained and applied to a substitutional alloy by Mills and Ratanavararaksa. We have also proved the appropriate fixed-point theorem, which guarantees, for a bounded local potential, that the self-consistent equations always converge upon iteration to a unique, Herglotz solution. To our knowledge, this is the only analytic theory for considering cluster effects. Furthermore, we have performed some computer calculations in the pair TCA, for the model case of delta-function potentials on a one-dimensional random chain. These results have been compared with "exact calculations" (which, in principle, take into account all cluster effects) and with the coherent-potential approximation (CPA), which is the single-site TCA. The density of states for the pair TCA clearly shows some improvement over the CPA and yet, apparently, the pair approximation distorts some of the features of the exact results.

  7. Active constrained clustering by examining spectral Eigenvectors

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri L.; desJardins, Marie; Xu, Qianjun

    2005-01-01

    This work focuses on the active selection of pairwise constraints for spectral clustering. We develop and analyze a technique for Active Constrained Clustering by Examining Spectral eigenvectorS (ACCESS) derived from a similarity matrix.

  8. Distant Cluster Hunting. II; A Comparison of X-Ray and Optical Cluster Detection Techniques and Catalogs from the ROSAT Optical X-Ray Survey

    NASA Technical Reports Server (NTRS)

    Donahue, Megan; Scharf, Caleb A.; Mack, Jennifer; Lee, Y. Paul; Postman, Marc; Rosati, Piero; Dickinson, Mark; Voit, G. Mark; Stocke, John T.

    2002-01-01

    We present and analyze the optical and X-ray catalogs of moderate-redshift cluster candidates from the ROSAT Optical X-Ray Survey, or ROXS. The survey covers the sky area contained in the fields of view of 23 deep archival ROSAT PSPC pointings, 4.8 square degrees. The cross-correlated cluster catalogs were constructed by comparing two independent catalogs extracted from the optical and X-ray bandpasses, using a matched-filter technique for the optical data and a wavelet technique for the X-ray data. We cross-identified cluster candidates in each catalog. As reported in Paper I, the matched-filter technique found optical counterparts for at least 60% (26 out of 43) of the X-ray cluster candidates; the estimated redshifts from the matched-filter algorithm agree with at least 7 of 11 spectroscopic confirmations (Δz ≤ 0.10). The matched-filter technique, with an imaging sensitivity of m_I ~ 23, identified approximately 3 times the number of candidates (155 candidates, 142 with a detection confidence >3σ) found in the X-ray survey of nearly the same area. There are 57 X-ray candidates, 43 of which are unobscured by scattered light or bright stars in the optical images. Twenty-six of these have fairly secure optical counterparts. We find that the matched-filter algorithm, when applied to images with galaxy flux sensitivities of m_I ~ 23, is fairly well matched to discovering z ≤ 1 clusters detected by wavelets in ROSAT PSPC exposures of 8000-60,000 s. The difference in the spurious fractions between the optical and X-ray (30% and 10%, respectively) cannot account for the difference in source number. In Paper I, we compared the optical and X-ray cluster luminosity functions and we found that the luminosity functions are consistent if the relationship between X-ray and optical luminosities is steep (L_X ∝ L_opt^…). Here, in Paper II, we present the cluster catalogs and a numerical simulation of the ROXS. We also present color-magnitude plots for several of the cluster candidates, and examine the prominence of the red sequence in each. We find that the X-ray clusters in our survey do not all have a prominent red sequence. We conclude that while the red sequence may be a distinct feature in the color-magnitude plots for virialized massive clusters, it may be less distinct in lower mass clusters of galaxies at even moderate redshifts. Multiple, complementary methods of selecting and defining clusters may be essential, particularly at high redshift where all methods start to run into completeness limits, incomplete understanding of physical evolution, and projection effects.

  9. Network module detection: Affinity search technique with the multi-node topological overlap measure

    PubMed Central

    Li, Ai; Horvath, Steve

    2009-01-01

    Background: Many clustering procedures only allow the user to input a pairwise dissimilarity or distance measure between objects. We propose a clustering method that can input a multi-point dissimilarity measure d(i1, i2, ..., iP) where the number of points P can be larger than 2. The work is motivated by gene network analysis where clusters correspond to modules of highly interconnected nodes. Here, we define modules as clusters of network nodes with high multi-node topological overlap. The topological overlap measure is a robust measure of interconnectedness which is based on shared network neighbors. In previous work, we have shown that the multi-node topological overlap measure yields biologically meaningful results when used as input of network neighborhood analysis. Findings: We adapt network neighborhood analysis for the use of module detection. We propose the Module Affinity Search Technique (MAST), which is a generalized version of the Cluster Affinity Search Technique (CAST). MAST can accommodate a multi-node dissimilarity measure. Clusters grow around user-defined or automatically chosen seeds (e.g. hub nodes). We propose both local and global cluster growth stopping rules. We use several simulations and a gene co-expression network application to argue that the MAST approach leads to biologically meaningful results. We compare MAST with hierarchical clustering and partitioning around medoid clustering. Conclusion: Our flexible module detection method is implemented in the MTOM software which can be downloaded from the following webpage: PMID:19619323

  10. Network module detection: Affinity search technique with the multi-node topological overlap measure.

    PubMed

    Li, Ai; Horvath, Steve

    2009-07-20

    Many clustering procedures only allow the user to input a pairwise dissimilarity or distance measure between objects. We propose a clustering method that can input a multi-point dissimilarity measure d(i1, i2, ..., iP) where the number of points P can be larger than 2. The work is motivated by gene network analysis where clusters correspond to modules of highly interconnected nodes. Here, we define modules as clusters of network nodes with high multi-node topological overlap. The topological overlap measure is a robust measure of interconnectedness which is based on shared network neighbors. In previous work, we have shown that the multi-node topological overlap measure yields biologically meaningful results when used as input of network neighborhood analysis. We adapt network neighborhood analysis for the use of module detection. We propose the Module Affinity Search Technique (MAST), which is a generalized version of the Cluster Affinity Search Technique (CAST). MAST can accommodate a multi-node dissimilarity measure. Clusters grow around user-defined or automatically chosen seeds (e.g. hub nodes). We propose both local and global cluster growth stopping rules. We use several simulations and a gene co-expression network application to argue that the MAST approach leads to biologically meaningful results. We compare MAST with hierarchical clustering and partitioning around medoid clustering. Our flexible module detection method is implemented in the MTOM software which can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/MTOM/
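
    To make "interconnectedness based on shared network neighbors" concrete, the sketch below computes the standard pairwise (two-node) topological overlap for an unweighted network. It does not implement the multi-node measure or MAST from this record; it shows only the pairwise special case, assuming a symmetric 0/1 adjacency matrix with a zero diagonal. The function name and the toy network are hypothetical.

```python
def topological_overlap(adj):
    """Pairwise topological overlap for a symmetric 0/1 adjacency matrix.

    TOM(i, j) = (shared neighbors + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where k_i is the degree of node i. The diagonal is set to 1.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        tom[i][i] = 1.0
        for j in range(i + 1, n):
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j)
            )
            denom = min(degree[i], degree[j]) + 1 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Toy network: a triangle (0, 1, 2) with node 3 attached to node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tom = topological_overlap(adj)
dissimilarity_0_3 = 1 - tom[0][3]  # a value usable by a clustering routine
print(tom[0][1], tom[0][3])
```

    Nodes 0 and 3 are not directly linked but share neighbor 2, so their overlap is nonzero; a plain adjacency-based distance would treat them as unrelated.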

  11. Mississippi State University Center for Air Sea Technology. FY93 and FY 94 Research Program in Navy Ocean Modeling and Prediction

    DTIC Science & Technology

    1994-09-30

    relational versus object oriented DBMS, knowledge discovery, data models, metadata, data filtering, clustering techniques, and synthetic data. A secondary...The first was the investigation of AI/ES applications (knowledge discovery, data mining, and clustering). Here CAST collaborated with Dr. Fred Petry...knowledge discovery system based on clustering techniques; implemented an on-line data browser to the DBMS; completed preliminary efforts to apply object

  12. Spitzer Imaging of Planck-Herschel Dusty Proto-Clusters at z=2-3

    NASA Astrophysics Data System (ADS)

    Cooray, Asantha; Ma, Jingzhe; Greenslade, Joshua; Kubo, Mariko; Nayyeri, Hooshang; Clements, David; Cheng, Tai-An

    2018-05-01

    We have recently introduced a new proto-cluster selection technique by combining Herschel/SPIRE imaging data and the Planck/HFI all-sky survey point source catalog. These sources are identified as Planck point sources with clumps of Herschel source over-densities with far-IR colors comparable to z=0 ULIRGs redshifted to z=2 to 3. The selection is sensitive to dusty starbursts and obscured QSOs, and we have recovered a couple of the known proto-clusters and close to 30 new proto-clusters. The candidate proto-clusters selected with this technique have far-IR flux densities several times higher than those that are optically selected, such as using LBG selection, implying that the member galaxies are in a special phase of heightened dusty starburst and dusty QSO activity. This far-IR luminous phase may be short but is likely to be a necessary piece for understanding the whole stellar mass assembly history of clusters. Moreover, our proto-clusters are missed in optical selections, suggesting that optically selected proto-clusters alone do not provide adequate statistics and that a comparison of the far-IR and optically selected clusters may reveal the importance of the dusty stellar mass assembly. Here, we propose IRAC observations of six of the highest priority new proto-clusters, to establish the validity of the technique and to determine the total stellar mass through SED models. For a modest observing time the science program will have a substantial impact on an upcoming science topic in cosmology with implications for observations with JWST and WFIRST to understand the mass assembly in the universe.

  13. Preparation of gold nanocluster bioconjugates for electron microscopy.

    PubMed

    Heinecke, Christine L; Ackerson, Christopher J

    2013-01-01

    In this chapter, we describe types of gold nanoparticle-biomolecule conjugates and their use in electron microscopy. Included are two detailed protocols for labeling an IgG antibody with gold monolayer protected clusters. The first approach is a direct bonding approach that utilizes the ligand place exchange reaction. The second approach describes NHS-EDC coupling of Au(144)(pMBA)(60) with IgG. Also included are various characterization techniques for determining labeling efficiency.

  14. Hierarchical clustering of EMD based interest points for road sign detection

    NASA Astrophysics Data System (ADS)

    Khan, Jesmin; Bhuiyan, Sharif; Adhami, Reza

    2014-04-01

    This paper presents an automatic road traffic sign detection and recognition system based on hierarchical clustering of interest points and joint transform correlation. The proposed algorithm consists of the following three stages: interest point detection, clustering of those points, and similarity search. At the first stage, highly discriminative, rotation- and scale-invariant interest points are selected from the image edges based on the 1-D empirical mode decomposition (EMD). We propose a two-step unsupervised clustering technique, which is adaptive and based on two criteria. In this context, the detected points are initially clustered based on the stable local features related to brightness and color, which are extracted using a Gabor filter. Then the points belonging to each partition are reclustered depending on the dispersion of the points in the initial cluster, using the position feature. This two-step hierarchical clustering yields the candidate road signs or regions of interest (ROIs). Finally, a fringe-adjusted joint transform correlation (JTC) technique is used for matching the unknown signs with the known reference road signs stored in the database. The presented framework provides a novel way to detect a road sign in natural scenes, and the results demonstrate the efficacy of the proposed technique, which yields a very low false hit rate.

  15. Progress toward Synthesis and Characterization of Rare-Earth Nanoparticles

    NASA Astrophysics Data System (ADS)

    Romero, Dulce G.; Ho, Pei-Chun; Attar, Saeed; Margosan, Dennis

    2010-03-01

    Magnetic nanoparticles exhibit interesting phenomena, such as enhanced magnetization and reduced magnetic ordering temperature (i.e. superparamagnetism), which have technical applications in industry, including magnetic storage, magnetic imaging, and magnetic refrigeration. We used the inverse micelle technique to synthesize Gd and Nd nanoparticles given its potential to control the cluster size and the amount of aggregation, and to prevent oxidation of the rare-earth elements. Gd and Nd were reduced by NaBH4 from the chloride salt. The produced clusters were characterized by X-ray diffraction (XRD), scanning electron microscopy (SEM), and energy dispersive X-ray spectroscopy (EDX). The results from the XRD show that the majority of the peaks match those of the surfactant, DDAB. No peaks of Gd were observed due to excess surfactant or amorphous clusters. However, the results from the SEM and EDX indicate the presence of Gd and Nd in our clusters microscopically, and the currently synthesized samples contain impurities. We are using a liquid-liquid extraction method to purify the sample, and the results will be discussed.

  16. The electronic structure of Au25 clusters: between discrete and continuous

    NASA Astrophysics Data System (ADS)

    Katsiev, Khabiboulakh; Lozova, Nataliya; Wang, Lu; Sai Krishna, Katla; Li, Ruipeng; Mei, Wai-Ning; Skrabalak, Sara E.; Kumar, Challa S. S. R.; Losovyj, Yaroslav

    2016-08-01

    Here, an approach based on synchrotron resonant photoemission is employed to explore the transition between quantization and hybridization of the electronic structure in atomically precise ligand-stabilized nanoparticles. While the presence of ligands maintains quantization in Au25 clusters, their removal renders increased hybridization of the electronic states in the vicinity of the Fermi level. These observations are supported by DFT studies. Electronic supplementary information (ESI) available: Experimental details including chemicals, sample preparation, and characterization methods. Computation techniques, SV-AUC, GIWAXS, XPS, UPS, MALDI-TOF, ESI data of Au25 clusters. See DOI: 10.1039/c6nr02374f

  17. Mining the National Career Assessment Examination Result Using Clustering Algorithm

    NASA Astrophysics Data System (ADS)

    Pagudpud, M. V.; Palaoag, T. T.; Padirayon, L. M.

    2018-03-01

    Education is an essential process today which elicits authorities to discover and establish innovative strategies for educational improvement. This study applied data mining using clustering technique for knowledge extraction from the National Career Assessment Examination (NCAE) result in the Division of Quirino. The NCAE is an examination given to all grade 9 students in the Philippines to assess their aptitudes in the different domains. Clustering the students is helpful in identifying students’ learning considerations. With the use of the RapidMiner tool, clustering algorithms such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN), k-means, k-medoid, expectation maximization clustering, and support vector clustering algorithms were analyzed. The silhouette indexes of the said clustering algorithms were compared, and the result showed that the k-means algorithm with k = 3 and silhouette index equal to 0.196 is the most appropriate clustering algorithm to group the students. Three groups were formed having 477 students in the determined group (cluster 0), 310 proficient students (cluster 1) and 396 developing students (cluster 2). The data mining technique used in this study is essential in extracting useful information from the NCAE result to better understand the abilities of students which in turn is a good basis for adopting teaching strategies.
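
    The silhouette index used in this record to compare clusterings can be computed directly from its definition. The sketch below is a generic pure-Python version using Euclidean distance, not the RapidMiner implementation; the function name, the toy points, and the two labelings are assumptions for illustration.

```python
from math import dist


def silhouette_index(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    def mean_distance(i, members):
        return sum(dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # a singleton cluster contributes a width of 0
        a = mean_distance(i, own)  # cohesion: mean distance within own cluster
        b = min(                   # separation: nearest other cluster
            mean_distance(i, members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_index(points, [0, 0, 1, 1])  # compact, well separated
bad = silhouette_index(points, [0, 1, 0, 1])   # clusters straddle the gap
print(round(good, 3), round(bad, 3))
```

    Values near 1 indicate compact, well-separated clusters and negative values indicate likely misassignment, so a best-case score of 0.196 as reported above would correspond to fairly weak cluster structure.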

  18. Use of density functional theory method to calculate structures of neutral carbon clusters C_n (3 ≤ n ≤ 24) and study their variability of structural forms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yen, T. W.; Lai, S. K.

    2015-02-28

    In this work, we present modifications to the well-known basin hopping (BH) optimization algorithm [D. J. Wales and J. P. Doye, J. Phys. Chem. A 101, 5111 (1997)] by incorporating in it the unique and specific nature of interactions among valence electrons and ions in carbon atoms through calculating the cluster’s total energy by the density functional tight-binding (DFTB) theory, using it to find the lowest energy structures of carbon clusters and, from these optimized atomic and electronic structures, studying their varied forms of topological transitions, which include a linear chain, a monocyclic to a polycyclic ring, and a fullerene/cage-like geometry. In this modified BH (MBH) algorithm, we define a spatial volume within which the cluster’s lowest energy structure is to be searched, and introduce in addition a cut-and-splice genetic operator to improve the search for the energy minimum relative to the original BH technique. The present MBH/DFTB algorithm is, therefore, characteristically distinguishable from the original BH technique commonly applied to nonmetallic and metallic clusters, technically more thorough and natural in describing the intricate couplings between valence electrons and ions in a carbon cluster, and thus theoretically sound in putting these two charged components on an equal footing. The proposed modified minimization algorithm should be more appropriate, accurate, and precise in the description of a carbon cluster. We evaluate the present algorithm, its energy-minimum searching in particular, by its optimization robustness. Specifically, we first check the MBH/DFTB technique for two representative carbon clusters of larger size, i.e., C_60 and C_72, against the popular cut-and-splice approach [D. M. Deaven and K. M. Ho, Phys. Rev. Lett. 75, 288 (1995)] that normally is combined with the genetic algorithm method for finding the cluster’s energy minimum, before employing it to investigate carbon clusters in the size range C_3-C_24 and study their topological transitions. An effort was also made to compare our MBH/DFTB results, and their re-optimization by full density functional theory (DFT) calculations, with some early DFT-based studies.

  19. Evaluating gyrochronology on the zero-age-main-sequence: rotation periods in the southern open cluster Blanco 1 from the Kelt-South survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cargile, P. A.; Pepper, J.; Siverd, R.

    2014-02-10

    We report periods for 33 members of Blanco 1 as measured from Kilodegree Extremely Little Telescope-South light curves, the first reported rotation periods for this benchmark zero-age-main-sequence open cluster. The distribution of these stars spans from late-A or early-F dwarfs to mid-K with periods ranging from less than a day to ∼8 days. The rotation period distribution has a morphology similar to the coeval Pleiades cluster, suggesting the universal nature of stellar rotation distributions. Employing two different gyrochronology methods, we find an age of 146 (+13, −14) Myr for the cluster. Using the same techniques, we infer an age of 134 (+9, −10) Myr for the Pleiades measured from existing literature rotation periods. These rotation-derived ages agree with independently determined cluster ages based on the lithium depletion boundary technique. Additionally, we evaluate different gyrochronology models and quantify levels of agreement between the models and the Blanco 1/Pleiades rotation period distributions, including incorporating the rotation distributions of clusters at ages up to 1.1 Gyr. We find the Skumanich-like spin-down rate sufficiently describes the rotation evolution of stars hotter than the Sun; however, we find cooler stars rotating faster than predicted by a Skumanich law, suggesting a mass dependence in the efficiency of stellar angular momentum loss rate. Finally, we compare the Blanco 1 and Pleiades rotation period distributions to available nonlinear angular momentum evolution models. We find they require a significant mass dependence on the initial rotation rate of solar-type stars to reproduce the observed range of rotation periods at a given stellar mass and are furthermore unable to predict the observed over-density of stars along the upper envelope of the clusters' rotation distributions.

  20. Within-Cluster and Across-Cluster Matching with Observational Multilevel Data

    ERIC Educational Resources Information Center

    Kim, Jee-Seon; Steiner, Peter M.; Hall, Courtney; Thoemmes, Felix

    2013-01-01

    When randomized experiments cannot be conducted in practice, propensity score (PS) techniques for matching treated and control units are frequently used for estimating causal treatment effects from observational data. Despite the popularity of PS techniques, they are not yet well studied for matching multilevel data where selection into treatment…

  1. An Intercomparison Between Radar Reflectivity and the IR Cloud Classification Technique for the TOGA-COARE Area

    NASA Technical Reports Server (NTRS)

    Carvalho, L. M. V.; Rickenbach, T.

    1999-01-01

    Satellite infrared (IR) and visible (VIS) images from the Tropical Ocean Global Atmosphere - Coupled Ocean Atmosphere Response Experiment (TOGA-COARE) experiment are investigated through the use of Clustering Analysis. The clusters are obtained from the values of IR and VIS counts and the local variance for both channels. The clustering procedure is based on the standardized histogram of each variable obtained from 179 pairs of images. A new approach to classify high clouds using only IR and the clustering technique is proposed. This method allows the separation of the enhanced convection in two main classes: convective tops, more closely related to the most active core of the storm, and convective systems, which produce regions of merged, thick anvil clouds. The resulting classification of different portions of cloudiness is compared to the radar reflectivity field for intensive events. Convective Systems and Convective Tops are followed during their life cycle using the IR clustering method. The areal coverage of precipitation and features related to convective and stratiform rain is obtained from the radar for each stage of the evolving Mesoscale Convective Systems (MCS). In order to compare the IR clustering method with a simple threshold technique, two IR thresholds (Tir) were used to identify different portions of cloudiness, Tir=240K which roughly defines the extent of all cloudiness associated with the MCS, and Tir=220K which indicates the presence of deep convection. It is shown that the IR clustering technique can be used as a simple alternative to identify the actual portion of convective and stratiform rainfall.
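
    The simple threshold technique used as the comparison baseline above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and class labels are hypothetical, and treating the thresholds as inclusive upper bounds is an assumption. Only the two threshold values (240 K and 220 K) come from the abstract.

```python
# Hypothetical sketch of a two-threshold IR classification.
# Only the threshold values come from the abstract; names and labels are illustrative.
DEEP_CONVECTION_K = 220.0  # Tir = 220 K: indicates the presence of deep convection
MCS_CLOUD_K = 240.0        # Tir = 240 K: rough extent of all cloudiness tied to the MCS


def classify_ir_pixel(brightness_temp_k):
    """Label one IR brightness temperature (in kelvin) with a threshold class."""
    # Colder cloud tops mean higher, deeper clouds, so test the colder threshold first.
    if brightness_temp_k <= DEEP_CONVECTION_K:
        return "deep_convection"
    if brightness_temp_k <= MCS_CLOUD_K:
        return "mcs_cloud"
    return "other"


def classify_ir_image(image):
    """Apply the per-pixel rule to a 2-D list of brightness temperatures."""
    return [[classify_ir_pixel(t) for t in row] for row in image]
```

    For example, `classify_ir_image([[210.0, 230.0], [250.0, 220.0]])` labels each pixel independently, which is the key contrast with a clustering method that also uses local variance.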

  2. REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Glascoe, L G; Glaser, R E; Chin, H S

    2004-06-17

    The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.

  3. Infrared Multiple Photon Dissociation Spectroscopy Of Metal Cluster-Adducts

    NASA Astrophysics Data System (ADS)

    Cox, D. M.; Kaldor, A.; Zakin, M. R.

    1987-01-01

    Recent development of the laser vaporization technique combined with mass-selective detection has made possible new studies of the fundamental chemical and physical properties of unsupported transition metal clusters as a function of the number of constituent atoms. A variety of experimental techniques have been developed in our laboratory to measure ionization threshold energies, magnetic moments, and gas phase reactivity of clusters. However, studies have so far been unable to determine the cluster structure or the chemical state of chemisorbed species on gas phase clusters. The application of infrared multiple photon dissociation (IRMPD) to obtain the IR absorption properties of metal cluster-adsorbate species in a molecular beam is described here. Specifically, using a high power, pulsed CO2 laser as the infrared source, the IRMPD spectrum for methanol chemisorbed on small iron clusters is measured as a function of the number of both iron atoms and methanols in the complex for different methanol isotopes. Both the feasibility and potential utility of IRMPD for characterizing metal cluster-adsorbate interactions are demonstrated. The method is generally applicable to any cluster or cluster-adsorbate system, dependent only upon the availability of appropriate high power infrared sources.

  4. Evaluation of the procedure 1A component of the 1980 US/Canada wheat and barley exploratory experiment

    NASA Technical Reports Server (NTRS)

    Chapman, G. M. (Principal Investigator); Carnes, J. G.

    1981-01-01

    Several techniques which use clusters generated by a new clustering algorithm, CLASSY, are proposed as alternatives to random sampling to obtain greater precision in crop proportion estimation: (1) Proportional Allocation/Relative Count Estimator (PA/RCE) uses proportional allocation of dots to clusters on the basis of cluster size and a relative count cluster-level estimate; (2) Proportional Allocation/Bayes Estimator (PA/BE) uses proportional allocation of dots to clusters and a Bayesian cluster-level estimate; and (3) Bayes Sequential Allocation/Bayesian Estimator (BSA/BE) uses sequential allocation of dots to clusters and a Bayesian cluster-level estimate. Clustering is an effective method in making proportion estimates. It is estimated that, to obtain the same precision with random sampling as obtained by the proportional sampling of 50 dots with an unbiased estimator, samples of 85 or 166 would need to be taken if dot sets with AI labels (integrated procedure) or ground truth labels, respectively, were input. Dot reallocation provides dot sets that are unbiased. It is recommended that these proportion estimation techniques be maintained, particularly the PA/BE because it provides the greatest precision.

  5. The WIYN Open Cluster Study: A 15-Year Report

    NASA Astrophysics Data System (ADS)

    Mathieu, Robert D.; WOCS Collaboration

    2013-06-01

    The WIYN 3.5m telescope combines large aperture, wide field of view and superb image quality. The WIYN consortium includes investigators in numerous areas of open cluster research. The combination spawned the WIYN Open Cluster Study (WOCS) over a decade ago, with the goals of 1) producing comprehensive photometric, astrometric and spectroscopic data for new fundamental open clusters and 2) addressing key astrophysical problems with these data. The set of core WOCS open clusters spans age and metallicity. Low reddening, solar proximity and richness were also desirable features in selecting core open clusters. More than 50 WIYN Open Cluster Study papers have been published in refereed journals. Highlights include: deep and wide-field photometry of NGC 188, NGC 2168 (M35), and NGC 6819 (WOCS I, II, XI and LII); deep and wide-field proper-motion studies of the old open clusters NGC 188, NGC 2682 (M67) and NGC 6791 (WOCS XVII, XXXIII and XLVI); comprehensive radial-velocity surveys of NGC 188, NGC 2168 and NGC 6819 (WOCS XXXII, XXIV, and XXXVIII); metallicity and lithium abundances in NGC 2168 (WOCS V); comprehensive definition of the hard-binary populations of NGC 188 and NGC 2168 (WOCS XXII and XLVIII); rotation period distributions in NGC 1039 (M34) and NGC 2168 (WOCS XXXV, XLIII, and XLV); study of chromospheric activity in NGC 2682 (WOCS XVIII); photometric variability surveys in NGC 188 and NGC 2682 (IX and XV); new Bayesian techniques for determination of cluster parameters (WOCS XXIII); a new infrared age-diagnostic for open clusters (WOCS XL); theoretical studies of stellar rotation (WOCS XIII and XIV); sophisticated N-body simulations of NGC 188 (WOCS LI); and the discovery of a high binary frequency and white dwarf companions among NGC 188 blue stragglers. While the WIYN 3.5m telescope remains at its heart, today the WIYN Open Cluster Study collaboration extends beyond both the WIYN observatory and consortium, and continues as a vital and productive exploration into these fundamental stellar systems. The publication list can be found at http://www.astro.ufl.edu ata/wocs/pubs.html. The WIYN Open Cluster Study has been continuously supported by grants from the National Science Foundation.

  6. Testing prediction methods: Earthquake clustering versus the Poisson model

    USGS Publications Warehouse

    Michael, A.J.

    1997-01-01

    Testing earthquake prediction methods requires statistical techniques that compare observed success to random chance. One technique is to produce simulated earthquake catalogs and measure the relative success of predicting real and simulated earthquakes. The accuracy of these tests depends on the validity of the statistical model used to simulate the earthquakes. This study tests the effect of clustering in the statistical earthquake model on the results. Three simulation models were used to produce significance levels for a VLF earthquake prediction method. As the degree of simulated clustering increases, the statistical significance drops. Hence, the use of a seismicity model with insufficient clustering can lead to overly optimistic results. A successful method must pass the statistical tests with a model that fully replicates the observed clustering. However, a method can be rejected based on tests with a model that contains insufficient clustering. U.S. copyright. Published in 1997 by the American Geophysical Union.
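
    The simulation-based test described above can be sketched as a Monte Carlo significance estimate. This is a minimal sketch, not the study's procedure: all function names and parameters are hypothetical, and the null model shown is the homogeneous Poisson case with no clustering, which is exactly the kind of model the abstract warns can give overly optimistic results.

```python
import random


def simulate_poisson_catalog(rate_per_day, duration_days, rng):
    """Return sorted event times (days) from a homogeneous Poisson process."""
    times = []
    t = 0.0
    while True:
        # Inter-event times of a Poisson process are exponentially distributed.
        t += rng.expovariate(rate_per_day)
        if t >= duration_days:
            return times
        times.append(t)


def count_hits(event_times, alarm_windows):
    """Count events that fall inside any (start, end) alarm window."""
    return sum(
        any(start <= t < end for start, end in alarm_windows)
        for t in event_times
    )


def significance(observed_hits, alarm_windows, rate_per_day, duration_days,
                 n_sim=1000, seed=0):
    """Fraction of simulated catalogs predicted at least as well as the real one."""
    rng = random.Random(seed)
    at_least_as_good = 0
    for _ in range(n_sim):
        catalog = simulate_poisson_catalog(rate_per_day, duration_days, rng)
        if count_hits(catalog, alarm_windows) >= observed_hits:
            at_least_as_good += 1
    return at_least_as_good / n_sim
```

    Replacing `simulate_poisson_catalog` with a generator that reproduces the observed clustering would, per the abstract, be expected to raise the returned fraction, that is, to lower the apparent significance of the prediction method.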

  7. Dynamic multifactor clustering of financial networks

    NASA Astrophysics Data System (ADS)

    Ross, Gordon J.

    2014-02-01

    We investigate the tendency for financial instruments to form clusters when there are multiple factors influencing the correlation structure. Specifically, we consider a stock portfolio which contains companies from different industrial sectors, located in several different countries. Both sector membership and geography combine to create a complex clustering structure where companies seem to first be divided based on sector, with geographical subclusters emerging within each industrial sector. We argue that standard techniques for detecting overlapping clusters and communities are not able to capture this type of structure and show how robust regression techniques can instead be used to remove the influence of both sector and geography from the correlation matrix separately. Our analysis reveals that prior to the 2008 financial crisis, companies did not tend to form clusters based on geography. This changed immediately following the crisis, with geography becoming a more important determinant of clustering structure.

  8. Damage detection methodology under variable load conditions based on strain field pattern recognition using FBGs, nonlinear principal component analysis, and clustering techniques

    NASA Astrophysics Data System (ADS)

    Sierra-Pérez, Julián; Torres-Arredondo, M.-A.; Alvarez-Montoya, Joham

    2018-01-01

    Structural health monitoring consists of using sensors integrated within structures together with algorithms to perform load monitoring, damage detection, damage location, damage size and severity, and prognosis. One possibility is to use strain sensors to infer structural integrity by comparing patterns in the strain field between the pristine and damaged conditions. In previous works, the authors have demonstrated that it is possible to detect small defects based on strain field pattern recognition by using robust machine learning techniques. They have focused on methodologies based on principal component analysis (PCA) and on the development of several unfolding and standardization techniques, which allow dealing with multiple load conditions. However, before a real implementation of this approach in engineering structures, changes in the strain field due to conditions different from damage occurrence need to be isolated. Since load conditions may vary in most engineering structures and promote significant changes in the strain field, it is necessary to implement novel techniques for uncoupling such changes from those produced by damage occurrence. A damage detection methodology based on optimal baseline selection (OBS) by means of clustering techniques is presented. The methodology includes the use of hierarchical nonlinear PCA as a nonlinear modeling technique in conjunction with Q and nonlinear-T² damage indices. The methodology is experimentally validated using strain measurements obtained by 32 fiber Bragg grating sensors bonded to an aluminum beam under dynamic bending loads and simultaneously submitted to variations in its pitch angle. The results demonstrated the capability of the methodology for clustering data according to 13 different load conditions (pitch angles), performing the OBS and detecting six different damages induced in a cumulative way. The proposed methodology showed a true positive rate of 100% and a false positive rate of 1.28% at a 99% confidence level.

  9. Computational intelligence techniques for biological data mining: An overview

    NASA Astrophysics Data System (ADS)

    Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari

    2014-10-01

    Computational techniques have been successfully utilized for a highly accurate analysis and modeling of multifaceted and raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective at overcoming the limitations of the traditional in-vitro experiments on the constantly increasing sequence data. However, the most critical problems that have caught the attention of researchers include, but are not limited to: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, analysis of microarray gene expression data, etc. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, Rough set classifiers, decision tree and HMM based algorithms. Major difficulties in applying the above algorithms include the limitations found in the previous feature encoding and selection methods while extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. The application of this research would be potentially useful in drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.

  10. Clustering Categorical Data Using Community Detection Techniques

    PubMed Central

    2017-01-01

    With the advent of the k-modes algorithm, the toolbox for clustering categorical data has an efficient tool that scales linearly in the number of data items. However, random initialization of cluster centers in k-modes makes it hard to reach a good clustering without resorting to many trials. Recently proposed methods for better initialization are deterministic and reduce the clustering cost considerably. A variety of initialization methods differ in how the heuristic chooses the set of initial centers. In this paper, we address the clustering problem for categorical data from the perspective of community detection. Instead of initializing k modes and running several iterations, our scheme, CD-Clustering, builds an unweighted graph and detects highly cohesive groups of nodes using a fast community detection technique. The top-k detected communities by size will define the k modes. Evaluation on ten real categorical datasets shows that our method outperforms the existing initialization methods for k-modes in terms of accuracy, precision, and recall in most of the cases. PMID:29430249
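
    The final step described above, in which the k largest communities define the k modes, can be sketched as follows. This is a minimal illustration under stated assumptions: the graph construction and the community detection step are not shown, the communities are assumed to be given as lists of record indices, and the function names `modes_from_communities` and `assign_to_modes` are hypothetical rather than taken from the paper.

```python
from collections import Counter


def modes_from_communities(records, communities, k):
    """Derive k modes from the k largest communities.

    records: list of equal-length tuples of categorical values.
    communities: list of lists of record indices (assumed output of some
    community detection step that is not implemented here).
    """
    top_k = sorted(communities, key=len, reverse=True)[:k]
    modes = []
    for members in top_k:
        # Transpose the community's records to iterate attribute by attribute.
        columns = zip(*(records[i] for i in members))
        # The mode of each attribute is its most frequent category.
        modes.append(tuple(Counter(col).most_common(1)[0][0] for col in columns))
    return modes


def assign_to_modes(records, modes):
    """Assign each record to the mode with the fewest mismatching attributes."""
    def mismatches(a, b):
        return sum(x != y for x, y in zip(a, b))
    return [
        min(range(len(modes)), key=lambda j: mismatches(record, modes[j]))
        for record in records
    ]
```

    The simple matching dissimilarity used in `assign_to_modes` is the standard k-modes distance; whether the paper applies exactly this assignment rule is not stated in the abstract.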

  11. Clustering methods for the optimization of atomic cluster structure

    NASA Astrophysics Data System (ADS)

    Bagattini, Francesco; Schoen, Fabio; Tigli, Luca

    2018-04-01

    In this paper, we propose a revised global optimization method and apply it to large scale cluster conformation problems. In the 1990s, the so-called clustering methods were considered among the most efficient general purpose global optimization techniques; however, their usage has quickly declined in recent years, mainly due to the inherent difficulties of clustering approaches in large dimensional spaces. Inspired by the machine learning literature, we redesigned clustering methods in order to deal with molecular structures in a reduced feature space. Our aim is to show that by suitably choosing a good set of geometrical features coupled with a very efficient descent method, an effective optimization tool is obtained which is capable of finding, with a very high success rate, all known putative optima for medium size clusters without any prior information, both for Lennard-Jones and Morse potentials. The main result is that, beyond being a reliable approach, the proposed method, based on the idea of starting a computationally expensive deep local search only when it seems worth doing so, is capable of saving a huge amount of searches with respect to an analogous algorithm which does not employ a clustering phase. In this paper, we are not claiming the superiority of the proposed method compared to specific, refined, state-of-the-art procedures, but rather indicating a quite straightforward way to save local searches by means of a clustering scheme working in a reduced variable space, which might prove useful when included in many modern methods.

  12. Sensitivity evaluation of dynamic speckle activity measurements using clustering methods.

    PubMed

    Etchepareborda, Pablo; Federico, Alejandro; Kaufmann, Guillermo H

    2010-07-01

    We evaluate and compare the use of competitive neural networks, self-organizing maps, the expectation-maximization algorithm, K-means, and fuzzy C-means techniques as partitional clustering methods, when the sensitivity of the activity measurement of dynamic speckle images needs to be improved. The temporal history of the acquired intensity generated by each pixel is analyzed in a wavelet decomposition framework, and it is shown that the mean energy of its corresponding wavelet coefficients provides a suitable feature space for clustering purposes. The sensitivity obtained by using the evaluated clustering techniques is also compared with the well-known methods of Konishi-Fujii, weighted generalized differences, and wavelet entropy. The performance of the partitional clustering approach is evaluated using simulated dynamic speckle patterns and also experimental data.

  13. Photometry Using Kepler "Superstamps" of Open Clusters NGC 6791 & NGC 6819

    NASA Astrophysics Data System (ADS)

    Kuehn, Charles A.; Drury, Jason A.; Bellamy, Beau R.; Stello, Dennis; Bedding, Timothy R.; Reed, Mike; Quick, Breanna

    2015-09-01

    The Kepler space telescope has proven to be a gold mine for the study of variable stars. Usually, Kepler only reads out a handful of pixels around each pre-selected target star, omitting a large number of stars in the Kepler field. Fortunately, for the open clusters NGC 6791 and NGC 6819, Kepler also read out larger "superstamps" which contained complete images of the central region of each cluster. These cluster images can be used to study additional stars in the open clusters that were not originally on Kepler's target list. We discuss our work on using two photometric techniques to analyze these superstamps and present sample results from this project to demonstrate the value of this technique for a wide variety of variable stars.

  14. Low Altitude AVIRIS Data for Mapping Land Cover in Yellowstone National Park: Use of Isodata Clustering Techniques

    NASA Technical Reports Server (NTRS)

    Spruce, Joe

    2001-01-01

    Yellowstone National Park (YNP) contains a diversity of land cover. YNP managers need site-specific land cover maps, which may be produced more effectively using high-resolution hyperspectral imagery. ISODATA clustering techniques have aided operational multispectral image classification and may benefit certain hyperspectral data applications if optimally applied. In response, a study was performed for an area in northeast YNP using 11 select bands of low-altitude AVIRIS data calibrated to ground reflectance. These data were subjected to ISODATA clustering and Maximum Likelihood Classification techniques to produce a moderately detailed land cover map. The latter has good apparent overall agreement with field surveys and aerial photo interpretation.

  15. Chapter 7. Cloning and analysis of natural product pathways.

    PubMed

    Gust, Bertolt

    2009-01-01

    The identification of gene clusters of natural products has led to an enormous wealth of information about their biosynthesis and its regulation, and about self-resistance mechanisms. Well-established routine techniques are now available for the cloning and sequencing of gene clusters. The subsequent functional analysis of the complex biosynthetic machinery requires efficient genetic tools for manipulation. Until recently, techniques for the introduction of defined changes into Streptomyces chromosomes were very time-consuming. In particular, manipulation of large DNA fragments has been challenging due to the absence of suitable restriction sites for restriction- and ligation-based techniques. The homologous recombination approach called recombineering (referred to as Red/ET-mediated recombination in this chapter) has greatly facilitated targeted genetic modifications of complex biosynthetic pathways from actinomycetes by eliminating many of the time-consuming and labor-intensive steps. This chapter describes techniques for the cloning and identification of biosynthetic gene clusters, for the generation of gene replacements within such clusters, for the construction of integrative library clones and their expression in heterologous hosts, and for the assembly of entire biosynthetic gene clusters from the inserts of individual library clones. A systematic approach toward insertional mutation of a complete Streptomyces genome is shown by the use of an in vitro transposon mutagenesis procedure.

  16. ClusterTAD: an unsupervised machine learning approach to detecting topologically associated domains of chromosomes from Hi-C data.

    PubMed

    Oluwadare, Oluwatosin; Cheng, Jianlin

    2017-11-14

    With the development of chromosomal conformation capturing techniques, particularly the Hi-C technique, the study of the spatial conformation of a genome is becoming an important topic in bioinformatics and computational biology. The Hi-C technique can generate genome-wide chromosomal interaction (contact) data, which can be used to investigate the higher-level organization of chromosomes, such as Topologically Associated Domains (TAD), i.e., locally packed chromosome regions bound together by intra-chromosomal contacts. The identification of the TADs for a genome is useful for studying gene regulation, genomic interaction, and genome function. Here, we formulate the TAD identification problem as an unsupervised machine learning (clustering) problem, and develop a new TAD identification method called ClusterTAD. We introduce a novel method to represent chromosomal contacts as features to be used by the clustering algorithm. Our results show that ClusterTAD can accurately predict the TADs on simulated Hi-C data. Our method is also largely complementary to and consistent with existing methods on the real Hi-C datasets of two mouse cells. The validation with the chromatin immunoprecipitation (ChIP) sequencing (ChIP-Seq) data shows that the domain boundaries identified by ClusterTAD have a high enrichment of CTCF binding sites, promoter-related marks, and enhancer-related histone modifications. As ClusterTAD is based on a proven clustering approach, it opens a new avenue to apply a large array of clustering methods developed in the machine learning field to the TAD identification problem. The source code, the results, and the TADs generated for the simulated and real Hi-C datasets are available here: https://github.com/BDM-Lab/ClusterTAD.

  17. The Use of Cluster Analysis in Typological Research on Community College Students

    ERIC Educational Resources Information Center

    Bahr, Peter Riley; Bielby, Rob; House, Emily

    2011-01-01

    One useful and increasingly popular method of classifying students is known commonly as cluster analysis. The variety of techniques that comprise the cluster analytic family are intended to sort observations (for example, students) within a data set into subsets (clusters) that share similar characteristics and differ in meaningful ways from other…

  18. A Comparison of Alternative Distributed Dynamic Cluster Formation Techniques for Industrial Wireless Sensor Networks.

    PubMed

    Gholami, Mohammad; Brennan, Robert W

    2016-01-06

    In this paper, we investigate alternative distributed clustering techniques for wireless sensor node tracking in an industrial environment. The research builds on extant work on wireless sensor node clustering by reporting on: (1) the development of a novel distributed management approach for tracking mobile nodes in an industrial wireless sensor network; and (2) an objective comparison of alternative cluster management approaches for wireless sensor networks. To perform this comparison, we focus on two main clustering approaches proposed in the literature: pre-defined clusters and ad hoc clusters. These approaches are compared in the context of their reconfigurability: more specifically, we investigate the trade-off between the cost and the effectiveness of competing strategies aimed at adapting to changes in the sensing environment. To support this work, we introduce three new metrics: a cost/efficiency measure, a performance measure, and a resource consumption measure. The results of our experiments show that ad hoc clusters adapt more readily to changes in the sensing environment, but this higher level of adaptability is at the cost of overall efficiency.

  19. A Comparison of Alternative Distributed Dynamic Cluster Formation Techniques for Industrial Wireless Sensor Networks

    PubMed Central

    Gholami, Mohammad; Brennan, Robert W.

    2016-01-01

    In this paper, we investigate alternative distributed clustering techniques for wireless sensor node tracking in an industrial environment. The research builds on extant work on wireless sensor node clustering by reporting on: (1) the development of a novel distributed management approach for tracking mobile nodes in an industrial wireless sensor network; and (2) an objective comparison of alternative cluster management approaches for wireless sensor networks. To perform this comparison, we focus on two main clustering approaches proposed in the literature: pre-defined clusters and ad hoc clusters. These approaches are compared in the context of their reconfigurability: more specifically, we investigate the trade-off between the cost and the effectiveness of competing strategies aimed at adapting to changes in the sensing environment. To support this work, we introduce three new metrics: a cost/efficiency measure, a performance measure, and a resource consumption measure. The results of our experiments show that ad hoc clusters adapt more readily to changes in the sensing environment, but this higher level of adaptability is at the cost of overall efficiency. PMID:26751447

  20. Electron molecular ion recombination: product excitation and fragmentation.

    PubMed

    Adams, Nigel G; Poterya, Viktoriya; Babcock, Lucia M

    2006-01-01

    Electron-ion dissociative recombination is an important ionization loss process in any ionized gas containing molecular ions. This includes the interstellar medium, circumstellar shells, cometary comae, planetary ionospheres, fusion plasma boundaries, combustion flames, laser plasmas and chemical deposition and etching plasmas. In addition to controlling the ionization density, the process generates many radical species, which can contribute to a parallel neutral chemistry. Techniques used to obtain rate data and product information (flowing afterglows and storage rings) are discussed and recent data are reviewed including diatomic to polyatomic ions and cluster ions. The data are divided into rate coefficients and cross sections, including their temperature/energy dependencies, and quantitative identification of neutral reaction products. The latter involve both ground and electronically excited states, including vibrational excitation. The data from the different techniques are compared and trends in the data are examined. The reactions are considered in terms of the basic mechanisms (direct and indirect processes including tunneling) and recent theoretical developments are discussed. Finally, new techniques are mentioned (for product identification; electrostatic storage rings, including single and double rings; Coulomb explosion) and new ways forward are suggested.

  1. Commodity cluster and hardware-based massively parallel implementations of hyperspectral imaging algorithms

    NASA Astrophysics Data System (ADS)

    Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David

    2006-05-01

    The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has quickly introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques. Techniques include four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in selection of parallel hyperspectral algorithms for specific applications.

  2. Analysis of Tropical Cyclone Tracks in the North Indian Ocean

    NASA Astrophysics Data System (ADS)

    Patwardhan, A.; Paliwal, M.; Mohapatra, M.

    2011-12-01

    Cyclones are regarded as one of the most dangerous meteorological phenomena of the tropical region. The probability of landfall of a tropical cyclone depends on its movement (trajectory). Analysis of trajectories of tropical cyclones could be useful for identifying potentially predictable characteristics. There is a long history of analysis of tropical cyclone tracks. A common approach is using different clustering techniques to group the cyclone tracks on the basis of certain characteristics. Various clustering methods have been used to study tropical cyclones in different ocean basins, such as the western North Pacific Ocean (Elsner and Liu, 2003; Camargo et al., 2007) and the North Atlantic Ocean (Elsner, 2003; Gaffney et al., 2007; Nakamura et al., 2009). In this study, tropical cyclone tracks in the North Indian Ocean basin for the period 1961-2010 have been analyzed and grouped into clusters based on their spatial characteristics. A tropical cyclone trajectory is approximated as an open curve and described by its first two moments. The resulting clusters have different centroid locations and also differently shaped variance ellipses. These track characteristics are then used in the standard clustering algorithms, which allow the whole track shape, length, and location to be incorporated into the clustering methodology. The resulting clusters have different genesis locations and trajectory shapes. We have also examined characteristics such as life span, maximum sustained wind speed, landfall, and seasonality, many of which are significantly different across the identified clusters. The clustering approach groups cyclones with higher maximum wind speed and longest life span into one cluster. Another cluster includes short duration cyclonic events that are mostly deep depressions and significant for rainfall over Eastern and Central India. The clustering approach is likely to prove useful for analysis of events of significance with regard to impacts.
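
    The track description above, in which each trajectory is summarized by its first two moments, can be sketched as a small feature-extraction step. This is a minimal sketch under stated assumptions: the function name `track_moments` is hypothetical, track points are weighted equally, and longitude and latitude are treated as planar coordinates, none of which is confirmed by the abstract.

```python
def track_moments(track):
    """Summarize a track by its centroid (first moment) and variances (second moment).

    track: list of (longitude, latitude) points in degrees.
    Returns (mean_lon, mean_lat, var_lon, var_lat, cov_lon_lat).
    """
    n = len(track)
    mean_lon = sum(lon for lon, _ in track) / n
    mean_lat = sum(lat for _, lat in track) / n
    # Second central moments describe the size and orientation of the variance ellipse.
    var_lon = sum((lon - mean_lon) ** 2 for lon, _ in track) / n
    var_lat = sum((lat - mean_lat) ** 2 for _, lat in track) / n
    cov = sum((lon - mean_lon) * (lat - mean_lat) for lon, lat in track) / n
    return (mean_lon, mean_lat, var_lon, var_lat, cov)
```

    Each track then becomes a fixed-length five-number vector regardless of how many points it contains, which is what lets tracks of different lengths be passed to a standard clustering algorithm.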

  3. Big Bangs in Galaxy Clusters: Using X-ray Temperature Maps to Trace Merger Histories in Clusters with Radio Halos/Relics

    NASA Astrophysics Data System (ADS)

    Burns, Jack O.; Datta, Abhirup; Hallman, Eric J.

    2016-06-01

    Galaxy clusters are assembled through large and small mergers which are the most energetic events ("bangs") since the Big Bang. Cluster mergers "stir" the intracluster medium (ICM) creating shocks and turbulence which are illuminated by ~Mpc-sized radio features called relics and halos. These shocks heat the ICM and are detected in x-rays via thermal emission. Disturbed morphologies in x-ray surface brightness and temperatures are direct evidence for cluster mergers. In the radio, relics (in the outskirts of the clusters) and halos (located near the cluster core) are also clear signposts of recent mergers. Our recent ENZO cosmological simulations suggest that around a merger event, radio emission peaks very sharply (and briefly) while the x-ray emission rises and decays slowly. Hence, galaxy clusters that show both luminous x-ray emission and radio relics/halos are good candidates for very recent mergers. We are in the early stages of analyzing a unique sample of 48 galaxy clusters with (i) known radio relics and/or halos and (ii) significant archival x-ray observations (>50 ksec) from Chandra and/or XMM. We have developed a new x-ray data analysis pipeline, implemented on parallel processor supercomputers, to create x-ray surface brightness, high fidelity temperature, and pressure maps of these clusters in order to study merging activity. The temperature maps are made using three different map-making techniques: Weighted Voronoi Tessellation, Adaptive Circular Binning, and Contour Binning. In this talk, we will show preliminary results for several clusters, including Abell 2744 and the Bullet cluster. This work is supported by NASA ADAP grant NNX15AE17G.

  4. Cosmological Constraints from Galaxy Clustering and the Mass-to-number Ratio of Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Tinker, Jeremy L.; Sheldon, Erin S.; Wechsler, Risa H.; Becker, Matthew R.; Rozo, Eduardo; Zu, Ying; Weinberg, David H.; Zehavi, Idit; Blanton, Michael R.; Busha, Michael T.; Koester, Benjamin P.

    2012-01-01

    We place constraints on the average density (Ω_m) and clustering amplitude (σ_8) of matter using a combination of two measurements from the Sloan Digital Sky Survey: the galaxy two-point correlation function, w_p(r_p), and the mass-to-galaxy-number ratio within galaxy clusters, M/N, analogous to cluster M/L ratios. Our w_p(r_p) measurements are obtained from DR7 while the sample of clusters is the maxBCG sample, with cluster masses derived from weak gravitational lensing. We construct nonlinear galaxy bias models using the Halo Occupation Distribution (HOD) to fit both w_p(r_p) and M/N for different cosmological parameters. HOD models that match the same two-point clustering predict different numbers of galaxies in massive halos when Ω_m or σ_8 is varied, thereby breaking the degeneracy between cosmology and bias. We demonstrate that this technique yields constraints that are consistent and competitive with current results from cluster abundance studies, without the use of abundance information. Using w_p(r_p) and M/N alone, we find Ω_m^0.5 σ_8 = 0.465 ± 0.026, with individual constraints of Ω_m = 0.29 ± 0.03 and σ_8 = 0.85 ± 0.06. Combined with current cosmic microwave background data, these constraints are Ω_m = 0.290 ± 0.016 and σ_8 = 0.826 ± 0.020. All errors are 1σ. The systematic uncertainties that the M/N technique is most sensitive to are the amplitude of the bias function of dark matter halos and the possibility of redshift evolution between the SDSS Main sample and the maxBCG cluster sample. Our derived constraints are insensitive to the current level of uncertainties in the halo mass function and in the mass-richness relation of clusters and its scatter, making the M/N technique complementary to cluster abundances as a method for constraining cosmology with future galaxy surveys.
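The quoted combination can be checked against the individual values with a few lines of arithmetic. This is only a consistency sketch: the helper name `s_combination` is an assumption, and central values of separately marginalized parameters need not reproduce the quoted combination exactly, so agreement within the stated 1σ error is the most one should expect.

```python
import math


def s_combination(omega_m: float, sigma_8: float) -> float:
    """Return omega_m**0.5 * sigma_8, the combination quoted in the abstract."""
    return math.sqrt(omega_m) * sigma_8


quoted, quoted_err = 0.465, 0.026

# Central values quoted for clustering + M/N alone, and combined with CMB data.
from_individual = s_combination(0.29, 0.85)
from_cmb_combined = s_combination(0.290, 0.826)

# Both sets of central values fall within the quoted 1-sigma range.
individual_ok = abs(from_individual - quoted) <= quoted_err
cmb_ok = abs(from_cmb_combined - quoted) <= quoted_err
```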

  5. Intelligent Traffic Quantification System

    NASA Astrophysics Data System (ADS)

    Mohanty, Anita; Bhanja, Urmila; Mahapatra, Sudipta

    2017-08-01

    Currently, city traffic monitoring and controlling is a big issue in almost all cities worldwide. Vehicular ad-hoc Network (VANET) technique is an efficient tool to minimize this problem. Usually, different types of on board sensors are installed in vehicles to generate messages characterized by different vehicle parameters. In this work, an intelligent system based on fuzzy clustering technique is developed to reduce the number of individual messages by extracting important features from the messages of a vehicle. Therefore, the proposed fuzzy clustering technique reduces the traffic load of the network. The technique also reduces congestion and quantifies congestion.

  6. Hybrid Clustering-GWO-NARX neural network technique in predicting stock price

    NASA Astrophysics Data System (ADS)

    Das, Debashish; Safa Sadiq, Ali; Mirjalili, Seyedali; Noraziah, A.

    2017-09-01

    Prediction of stock price is one of the most challenging tasks due to the nonlinear nature of stock data. Though numerous attempts have been made to predict the stock price by applying various techniques, the predicted price is not always accurate and the error rate is high to some extent. Consequently, this paper endeavours to determine an efficient stock prediction strategy by implementing a combinatorial method of Grey Wolf Optimizer (GWO), Clustering and Non Linear Autoregressive Exogenous (NARX) Technique. The study uses stock data from prominent stock markets, i.e. the New York Stock Exchange (NYSE) and NASDAQ, and from emerging stock markets, i.e. the Malaysian Stock Market (Bursa Malaysia) and the Dhaka Stock Exchange (DSE). It applies the K-means clustering algorithm to determine the most promising cluster, then MGWO is used to determine the classification rate, and finally the stock price is predicted by applying the NARX neural network algorithm. The prediction performance gained through experimentation is compared and assessed to guide investors in making investment decisions. The result of this technique is indeed promising, as it has shown almost precise prediction and an improved error rate. We have applied the hybrid Clustering-GWO-NARX neural network technique in predicting stock price. We intend to work on the effect of various factors in stock price movement and the selection of parameters. We will further investigate the influence of company news, either positive or negative, on stock price movement. We would also be interested in predicting stock indices.

  7. Goal Profiles, Mental Toughness and its Influence on Performance Outcomes among Wushu Athletes

    PubMed Central

    Roy, Jolly

    2007-01-01

    This study examined the association between goal orientations and mental toughness and its influence on performance outcomes in competition. Wushu athletes (n = 40) competing in Intervarsity championships in Malaysia completed the Task and Ego Orientations in Sport Questionnaire (TEOSQ) and the Psychological Performance Inventory (PPI). Using cluster analysis techniques including hierarchical methods and the non-hierarchical method (k-means cluster) to examine goal profiles, a three-cluster solution emerged, viz. cluster 1 - high task and moderate ego (HT/ME), cluster 2 - moderate task and low ego (MT/LE), and cluster 3 - moderate task and moderate ego (MT/ME). Analysis of the fundamental areas of mental toughness based on goal profiles revealed that athletes in cluster 1 scored significantly higher on negative energy control than athletes in cluster 2. Further, athletes in cluster 1 also scored significantly higher on positive energy control than athletes in cluster 3. A chi-square (χ2) test revealed no significant differences among athletes with different goal profiles on performance outcomes in the competition. However, significant differences were observed between athletes (medallists and non-medallists) in self-confidence (p = 0.001) and negative energy control (p = 0.042). Medallists scored significantly higher on self-confidence (mean = 21.82 ± 2.72) and negative energy control (mean = 19.59 ± 2.32) than the non-medallists (self-confidence mean = 18.76 ± 2.49; negative energy control mean = 18.14 ± 1.91). Key points: Mental toughness can be influenced by certain goal profile combinations. Athletes with successful outcomes in performance (medallists) displayed greater mental toughness. PMID:24198700

  8. Suicide in the oldest old: an observational study and cluster analysis.

    PubMed

    Sinyor, Mark; Tan, Lynnette Pei Lin; Schaffer, Ayal; Gallagher, Damien; Shulman, Kenneth

    2016-01-01

    The older population are at a high risk for suicide. This study sought to learn more about the characteristics of suicide in the oldest-old and to use a cluster analysis to determine if oldest-old suicide victims assort into clinically meaningful subgroups. Data were collected from a coroner's chart review of suicide victims in Toronto from 1998 to 2011. We compared two age groups (65-79 year olds, n = 335, and 80+ year olds, n = 191) and then conducted a hierarchical agglomerative cluster analysis using Ward's method to identify distinct clusters in the 80+ group. The younger and older age groups differed according to marital status, living circumstances and pattern of stressors. The cluster analysis identified three distinct clusters in the 80+ group. Cluster 1 was the largest (n = 124) and included people who were either married or widowed who had significantly more depression and somewhat more medical health stressors. In contrast, cluster 2 (n = 50) comprised people who were almost all single and living alone with significantly less identified depression and slightly fewer medical health stressors. All members of cluster 3 (n = 17) lived in a retirement residence or nursing home, and this group had the highest rates of depression, dementia, other mental illness and past suicide attempts. This is the first study to use the cluster analysis technique to identify meaningful subgroups among suicide victims in the oldest-old. The results reveal different patterns of suicide in the older population that may be relevant for clinical care. Copyright © 2015 John Wiley & Sons, Ltd.
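For readers unfamiliar with Ward's method, the following is a minimal pure-Python sketch of hierarchical agglomerative clustering with Ward's merge criterion. The function name `ward_clusters` and the toy numeric data are assumptions; the study's actual variables and software are not described in the abstract, and a library routine would normally be used in practice.

```python
from typing import List, Sequence


def ward_clusters(points: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Merge clusters bottom-up with Ward's criterion until k clusters remain.

    Returns a list of clusters, each a sorted list of indices into `points`.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    members = [[i] for i in range(len(points))]
    centroids = [[float(x) for x in p] for p in points]

    def merge_cost(a: int, b: int) -> float:
        # Increase in within-cluster sum of squares if clusters a and b merge.
        na, nb = len(members[a]), len(members[b])
        d2 = sum((x - y) ** 2 for x, y in zip(centroids[a], centroids[b]))
        return na * nb / (na + nb) * d2

    while len(members) > k:
        pairs = (
            (i, j) for i in range(len(members)) for j in range(i + 1, len(members))
        )
        a, b = min(pairs, key=lambda pair: merge_cost(*pair))
        na, nb = len(members[a]), len(members[b])
        # The merged centroid is the size-weighted mean of the two centroids.
        centroids[a] = [
            (na * x + nb * y) / (na + nb) for x, y in zip(centroids[a], centroids[b])
        ]
        members[a] = sorted(members[a] + members[b])
        del members[b], centroids[b]  # b > a, so index a stays valid
    return members


# Hypothetical two-feature data with three visually obvious groups.
toy = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
groups = ward_clusters(toy, k=3)
```

At each step the pair of clusters whose merger adds the least within-cluster variance is joined, which tends to produce compact groups of similar size.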

  9. Integrin Clustering Matters: A Review of Biomaterials Functionalized with Multivalent Integrin-Binding Ligands to Improve Cell Adhesion, Migration, Differentiation, Angiogenesis, and Biomedical Device Integration.

    PubMed

    Karimi, Fatemeh; O'Connor, Andrea J; Qiao, Greg G; Heath, Daniel E

    2018-03-25

    Material systems that exhibit tailored interactions with cells are a cornerstone of biomaterial and tissue engineering technologies. One method of achieving these tailored interactions is to biofunctionalize materials with peptide ligands that bind integrin receptors present on the cell surface. However, cell biology research has illustrated that both integrin binding and integrin clustering are required to achieve a full adhesion response. This biophysical knowledge has motivated researchers to develop material systems biofunctionalized with nanoscale clusters of ligands that promote both integrin occupancy and clustering of the receptors. These materials have improved a wide variety of biological interactions in vitro including cell adhesion, proliferation, migration speed, gene expression, and stem cell differentiation; and improved in vivo outcomes including increased angiogenesis, tissue healing, and biomedical device integration. This review first introduces the techniques that enable the fabrication of these nanopatterned materials, describes the improved biological effects that have been achieved, and lastly discusses the current limitations of the technology and where future advances may occur. Although this technology is still in its nascency, it will undoubtedly play an important role in the future development of biomaterials and tissue engineering scaffolds for both in vitro and in vivo applications. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification.

    PubMed

    Li, Jinyan; Fong, Simon; Sung, Yunsick; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2016-01-01

    An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Although the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class. In this paper, we use a novel class-balancing method named adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique (ASCB_DmSMOTE) to solve this imbalanced dataset problem, which is common in biomedical applications. The proposed method combines under-sampling and over-sampling into a swarm optimisation algorithm. It adaptively selects suitable parameters for the rebalancing algorithm to find the best solution. Compared with the other versions of the SMOTE algorithm, significant improvements, which include higher accuracy and credibility, are observed with ASCB_DmSMOTE. Our proposed method tactfully combines two rebalancing techniques together. It reasonably re-allocates the majority class in the details and dynamically optimises the two parameters of SMOTE to synthesise a reasonable scale of minority class for each clustered sub-imbalanced dataset. The proposed method ultimately outperforms other conventional methods and attains higher credibility with even greater accuracy of the classification model.
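The oversampling step that SMOTE-family methods build on can be sketched in a few lines. Note that this is plain SMOTE-style interpolation only, not the ASCB_DmSMOTE method of the abstract: it has no swarm optimisation, clustering, or under-sampling. The function name `smote`, its parameters, and the sample points are assumptions for illustration.

```python
import random
from typing import List, Sequence


def smote(minority: Sequence[Sequence[float]], n_new: int, k: int = 2,
          seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic minority samples by neighbor interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base sample (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Hypothetical minority-class samples with two features each.
minority_samples = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.0)]
new_samples = smote(minority_samples, n_new=5, seed=1)
```

Each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbors. The neighbor count `k` and the number of samples to generate are the kind of parameters a method like the one described above would tune adaptively.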

  11. Security and Correctness Analysis on Privacy-Preserving k-Means Clustering Schemes

    NASA Astrophysics Data System (ADS)

    Su, Chunhua; Bao, Feng; Zhou, Jianying; Takagi, Tsuyoshi; Sakurai, Kouichi

    Due to the fast development of the Internet and related IT technologies, it has become increasingly easy to access large amounts of data. k-means clustering is a powerful and frequently used technique in data mining. Many research papers on privacy-preserving k-means clustering have been published. In this paper, we analyze the existing privacy-preserving k-means clustering schemes based on cryptographic techniques. We show that those schemes cause privacy breaches and cannot output correct results due to faults in the protocol construction. Furthermore, we analyze our proposal as an option to mitigate such problems, but with an intermediate information breach during the computation.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shin, Jaejin; Woo, Jong-Hak; Mulchaey, John S.

    We perform a comprehensive study of X-ray cavities using a large sample of X-ray targets selected from the Chandra archive. The sample is selected to cover a large dynamic range including galaxy clusters, groups, and individual galaxies. Using β-modeling and unsharp masking techniques, we investigate the presence of X-ray cavities for 133 targets that have sufficient X-ray photons for analysis. We detect 148 X-ray cavities from 69 targets and measure their properties, including cavity size, angle, and distance from the center of the diffuse X-ray gas. We confirm the strong correlation between cavity size and distance from the X-ray center similar to previous studies. We find that the detection rates of X-ray cavities are similar among galaxy clusters, groups and individual galaxies, suggesting that the formation mechanism of X-ray cavities is independent of environment.

  13. The Effect of Buzz Group Technique and Clustering Technique in Teaching Writing at the First Class of SMA HKBP I Tarutung

    ERIC Educational Resources Information Center

    Pangaribuan, Tagor; Manik, Sondang

    2018-01-01

    This research was held at SMA HKBP 1 Tarutung, North Sumatra, on the research result of test XI[superscript 2] and XI[superscript 2] students, after they got treatment in teaching writing in recount text by using buzz group and clustering technique. The average score (X) was 67.7 and the total score buzz group the average score (X) was 77.2 and in…

  14. Monitoring of dispersed smoke-plume layers by determining locations of the data-point clusters

    NASA Astrophysics Data System (ADS)

    Kovalev, Vladimir; Wold, Cyle; Petkov, Alexander; Min Hao, Wei

    2018-04-01

    A modified data-processing technique of the signals recorded by zenith-directed lidar, which operates in smoke-polluted atmosphere, is discussed. The technique is based on simple transformations of the lidar backscatter signal and the determination of the spatial location of the data point clusters. The technique allows more reliable detection of the location of dispersed smoke layering. Examples of typical results obtained with lidar in a smoke-polluted atmosphere are presented.

  15. Constrained spectral clustering under a local proximity structure assumption

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri; Xu, Qianjun; des Jardins, Marie

    2005-01-01

    This work focuses on incorporating pairwise constraints into a spectral clustering algorithm. A new constrained spectral clustering method is proposed, as well as an active constraint acquisition technique and a heuristic for parameter selection. We demonstrate that our constrained spectral clustering method, CSC, works well when the data exhibits what we term local proximity structure.

  16. Learner Typologies Development Using OIndex and Data Mining Based Clustering Techniques

    ERIC Educational Resources Information Center

    Luan, Jing

    2004-01-01

    This explorative data mining project used distance based clustering algorithm to study 3 indicators, called OIndex, of student behavioral data and stabilized at a 6-cluster scenario following an exhaustive explorative study of 4, 5, and 6 cluster scenarios produced by K-Means and TwoStep algorithms. Using principles in data mining, the study…

  17. A Comprehensive Careers Cluster Curriculum Model. Health Occupations Cluster Curriculum Project and Health-Care Aide Curriculum Project.

    ERIC Educational Resources Information Center

    Bortz, Richard F.

    To prepare learning materials for health careers programs at the secondary level, the developmental phase of two curriculum projects--the Health Occupations Cluster Curriculum Project and Health-Care Aide Curriculum Project--utilized a model which incorporated a key factor analysis technique. Entitled "A Comprehensive Careers Cluster Curriculum…

  18. Evidence that Clouds of keV Hydrogen Ion Clusters Bounce Elastically from a Solid Surface

    NASA Technical Reports Server (NTRS)

    Lewis, R. A.; Martin, James J.; Chakrabarti, Suman; Rodgers, Stephen L. (Technical Monitor)

    2002-01-01

    The behavior of hydrogen ion clusters is tested by an inject/hold/extract technique in a Penning-Malmberg trap. The timing pattern of the extraction signals is consistent with the clusters bouncing elastically from a detector several times. The ion clusters behave more like an elastic fluid than a beam of ions.

  19. An ensemble framework for clustering protein-protein interaction networks.

    PubMed

    Asur, Sitaram; Ucar, Duygu; Parthasarathy, Srinivasan

    2007-07-01

    Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network. In this article, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can (a) produce improved biologically significant functional groupings; and (b) facilitate soft clustering by discovering multiple functional associations for proteins. Supplementary data are available at Bioinformatics online.

  20. Million-body star cluster simulations: comparisons between Monte Carlo and direct N-body

    NASA Astrophysics Data System (ADS)

    Rodriguez, Carl L.; Morscher, Meagan; Wang, Long; Chatterjee, Sourav; Rasio, Frederic A.; Spurzem, Rainer

    2016-12-01

    We present the first detailed comparison between million-body globular cluster simulations computed with a Hénon-type Monte Carlo code, CMC, and a direct N-body code, NBODY6++GPU. Both simulations start from an identical cluster model with 10^6 particles, and include all of the relevant physics needed to treat the system in a highly realistic way. With the two codes 'frozen' (no fine-tuning of any free parameters or internal algorithms of the codes) we find good agreement in the overall evolution of the two models. Furthermore, we find that in both models, large numbers of stellar-mass black holes (>1000) are retained for 12 Gyr. Thus, the very accurate direct N-body approach confirms recent predictions that black holes can be retained in present-day, old globular clusters. We find only minor disagreements between the two models and attribute these to the small-N dynamics driving the evolution of the cluster core for which the Monte Carlo assumptions are less ideal. Based on the overwhelming general agreement between the two models computed using these vastly different techniques, we conclude that our Monte Carlo approach, which is more approximate, but dramatically faster compared to the direct N-body, is capable of producing an accurate description of the long-term evolution of massive globular clusters even when the clusters contain large populations of stellar-mass black holes.

  1. Rain volume estimation over areas using satellite and radar data

    NASA Technical Reports Server (NTRS)

    Doneaud, A. A.; Vonderhaar, T. H.

    1985-01-01

    The feasibility of rain volume estimation over fixed and floating areas was investigated using rapid scan satellite data following a technique recently developed with radar data, called the Area Time Integral (ATI) technique. The radar and rapid scan GOES satellite data were collected during the Cooperative Convective Precipitation Experiment (CCOPE) and North Dakota Cloud Modification Project (NDCMP). Six multicell clusters and cells were analyzed to the present time. A two-cycle oscillation emphasizing the multicell character of the clusters is demonstrated. Three clusters were selected on each day, 12 June and 2 July. The 12 June clusters occurred during the daytime, while the 2 July clusters during the nighttime. A total of 86 time steps of radar and 79 time steps of satellite images were analyzed. There were approximately 12-min time intervals between radar scans on the average.
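The abstract names the Area Time Integral (ATI) technique without defining it. The sketch below assumes the usual reading of the term: echo (or cloud) area summed over successive scans and multiplied by the scan interval, with rain volume taken as proportional to the ATI. The function names, the proportionality constant, and all numbers are hypothetical and are not values from the study.

```python
from typing import Sequence


def area_time_integral(echo_areas_km2: Sequence[float], dt_hours: float) -> float:
    """Sum echo area times scan interval over all scans (units: km^2 h)."""
    return sum(area * dt_hours for area in echo_areas_km2)


def rain_volume(ati_km2_h: float, mean_rate_mm_per_h: float) -> float:
    """Rain volume in km^2 mm from an ATI and an assumed mean rain rate."""
    return ati_km2_h * mean_rate_mm_per_h


# Hypothetical echo areas from five scans taken 12 minutes (0.2 h) apart.
ati = area_time_integral([100.0, 250.0, 400.0, 300.0, 150.0], dt_hours=0.2)
volume = rain_volume(ati, mean_rate_mm_per_h=4.0)
```

The 0.2 h step mirrors the roughly 12-minute interval between radar scans mentioned in the abstract; the areas and the mean rain rate are placeholders.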

  2. Spectroscopic characterization of galaxy clusters in RCS-1: spectroscopic confirmation, redshift accuracy, and dynamical mass-richness relation

    NASA Astrophysics Data System (ADS)

    Gilbank, David G.; Barrientos, L. Felipe; Ellingson, Erica; Blindert, Kris; Yee, H. K. C.; Anguita, T.; Gladders, M. D.; Hall, P. B.; Hertling, G.; Infante, L.; Yan, R.; Carrasco, M.; Garcia-Vergara, Cristina; Dawson, K. S.; Lidman, C.; Morokuma, T.

    2018-05-01

    We present follow-up spectroscopic observations of galaxy clusters from the first Red-sequence Cluster Survey (RCS-1). This work focuses on two samples: a lower redshift sample of ~30 clusters ranging in redshift from z ~ 0.2-0.6, observed with multiobject spectroscopy (MOS) on 4-6.5-m class telescopes, and a z ~ 1 sample of ~10 clusters with 8-m class telescope observations. We examine the detection efficiency and redshift accuracy of the now widely used red-sequence technique for selecting clusters via overdensities of red-sequence galaxies. Using both these data and extended samples including previously published RCS-1 spectroscopy and spectroscopic redshifts from SDSS, we find that the red-sequence redshift using simple two-filter cluster photometric redshifts is accurate to σ_z ≈ 0.035(1 + z) in RCS-1. This accuracy can potentially be improved with better survey photometric calibration. For the lower redshift sample, ~5 per cent of clusters show some (minor) contamination from secondary systems with the same red-sequence intruding into the measurement aperture of the original cluster. At z ~ 1, the rate rises to ~20 per cent. Approximately ten per cent of projections are expected to be serious, where the two components contribute significant numbers of their red-sequence galaxies to another cluster. Finally, we present a preliminary study of the mass-richness calibration using velocity dispersions to probe the dynamical masses of the clusters. We find a relation broadly consistent with that seen in the local universe from the WINGS sample at z ~ 0.05.
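To make the quoted redshift accuracy concrete, the short sketch below evaluates σ_z ≈ 0.035(1 + z) at a few redshifts. The function name `red_sequence_z_error` is an assumption; the coefficient is the one quoted in the abstract.

```python
def red_sequence_z_error(z: float, coeff: float = 0.035) -> float:
    """1-sigma red-sequence redshift uncertainty, coeff * (1 + z)."""
    return coeff * (1.0 + z)


# Evaluate across the redshift range covered by the two samples.
errors = {z: red_sequence_z_error(z) for z in (0.2, 0.6, 1.0)}
```

Under this scaling the uncertainty grows from about 0.04 at z = 0.2 to 0.07 at z = 1.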

  3. Improvements on GPS Location Cluster Analysis for the Prediction of Large Carnivore Feeding Activities: Ground-Truth Detection Probability and Inclusion of Activity Sensor Measures

    PubMed Central

    Blecha, Kevin A.; Alldredge, Mat W.

    2015-01-01

    Animal space use studies using GPS collar technology are increasingly incorporating behavior based analysis of spatio-temporal data in order to expand inferences of resource use. GPS location cluster analysis is one such technique applied to large carnivores to identify the timing and location of feeding events. For logistical and financial reasons, researchers often implement predictive models for identifying these events. We present two separate improvements for predictive models that future practitioners can implement. Thus far, feeding prediction models have incorporated a small range of covariates, usually limited to spatio-temporal characteristics of the GPS data. Using GPS collared cougar (Puma concolor) we include activity sensor data as an additional covariate to increase prediction performance of feeding presence/absence. Integral to the predictive modeling of feeding events is a ground-truthing component, in which GPS location clusters are visited by human observers to confirm the presence or absence of feeding remains. Failing to account for sources of ground-truthing false-absences can bias the number of predicted feeding events to be low. Thus we account for some ground-truthing error sources directly in the model with covariates and when applying model predictions. Accounting for these errors resulted in a 10% increase in the number of clusters predicted to be feeding events. Using a double-observer design, we show that the ground-truthing false-absence rate is relatively low (4%) using a search delay of 2–60 days. Overall, we provide two separate improvements to the GPS cluster analysis techniques that can be expanded upon and implemented in future studies interested in identifying feeding behaviors of large carnivores. PMID:26398546

  4. Multivariate statistical analysis: Principles and applications to coorbital streams of meteorite falls

    NASA Technical Reports Server (NTRS)

    Wolf, S. F.; Lipschutz, M. E.

    1993-01-01

    Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that, by a totally different criterion, labile trace element contents (and hence thermal histories) of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.

  5. Secure and Fair Cluster Head Selection Protocol for Enhancing Security in Mobile Ad Hoc Networks

    PubMed Central

    Paramasivan, B.; Kaliappan, M.

    2014-01-01

    Mobile ad hoc networks (MANETs) are wireless networks consisting of a number of autonomous mobile devices temporarily interconnected into a network by wireless media. MANETs have become one of the most prevalent areas of research in recent years. Resource limitations, energy efficiency, scalability, and security are the most challenging issues in MANETs. Due to their deployment nature, MANETs are more vulnerable to malicious attack. The secure routing protocols perform very basic security-related functions which are not sufficient to protect the network. In this paper, a secure and fair cluster head selection protocol (SFCP) is proposed which integrates security factors into the clustering approach for achieving attacker identification and classification. A Byzantine agreement based cooperative technique is used for attacker identification and classification to make the network more attack resistant. SFCP is used to solve this issue by making the nodes that are totally surrounded by malicious neighbors dynamically adjust their belief and disbelief thresholds. The proposed protocol selects a secure and energy-efficient cluster head which acts as a local detector without imposing overhead on the clustering performance. SFCP is simulated in network simulator 2 and compared with two protocols, AODV and CBRP. PMID:25143986

  6. Secure and fair cluster head selection protocol for enhancing security in mobile ad hoc networks.

    PubMed

    Paramasivan, B; Kaliappan, M

    2014-01-01

    Mobile ad hoc networks (MANETs) are wireless networks consisting of a number of autonomous mobile devices temporarily interconnected into a network by wireless media. MANETs have become one of the most prevalent areas of research in recent years. Resource limitations, energy efficiency, scalability, and security are the most challenging issues in MANETs. Due to their deployment nature, MANETs are more vulnerable to malicious attack. The secure routing protocols perform very basic security-related functions which are not sufficient to protect the network. In this paper, a secure and fair cluster head selection protocol (SFCP) is proposed which integrates security factors into the clustering approach for achieving attacker identification and classification. A Byzantine agreement based cooperative technique is used for attacker identification and classification to make the network more attack resistant. SFCP is used to solve this issue by making the nodes that are totally surrounded by malicious neighbors dynamically adjust their belief and disbelief thresholds. The proposed protocol selects a secure and energy-efficient cluster head which acts as a local detector without imposing overhead on the clustering performance. SFCP is simulated in network simulator 2 and compared with two protocols, AODV and CBRP.

  7. Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of LANDSAT digital data

    NASA Technical Reports Server (NTRS)

    Ballew, G.

    1977-01-01

    The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.

  8. Galaxy Cluster Mass Reconstruction Project – III. The impact of dynamical substructure on cluster mass estimates

    DOE PAGES

    Old, L.; Wojtak, R.; Pearce, F. R.; ...

    2017-12-20

    With the advent of wide-field cosmological surveys, we are approaching samples of hundreds of thousands of galaxy clusters. While such large numbers will help reduce statistical uncertainties, the control of systematics in cluster masses is crucial. Here we examine the effects of an important source of systematic uncertainty in galaxy-based cluster mass estimation techniques: the presence of significant dynamical substructure. Dynamical substructure manifests as dynamically distinct subgroups in phase-space, indicating an ‘unrelaxed’ state. This issue affects around a quarter of clusters in a generally selected sample. We employ a set of mock clusters whose masses have been measured homogeneously with commonly used galaxy-based mass estimation techniques (kinematic, richness, caustic, radial methods). We use these to study how the relation between observationally estimated and true cluster mass depends on the presence of substructure, as identified by various popular diagnostics. We find that the scatter for an ensemble of clusters does not increase dramatically for clusters with dynamical substructure. However, we find a systematic bias for all methods, such that clusters with significant substructure have higher measured masses than their relaxed counterparts. This bias depends on cluster mass: the most massive clusters are largely unaffected by the presence of significant substructure, but masses are significantly overestimated for lower mass clusters, by ~10 percent at 10^14 and ≳20 percent for ≲10^13.5. Finally, the use of cluster samples with different levels of substructure can therefore bias certain cosmological parameters up to a level comparable to the typical uncertainties in current cosmological studies.

  9. Galaxy Cluster Mass Reconstruction Project – III. The impact of dynamical substructure on cluster mass estimates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Old, L.; Wojtak, R.; Pearce, F. R.

    With the advent of wide-field cosmological surveys, we are approaching samples of hundreds of thousands of galaxy clusters. While such large numbers will help reduce statistical uncertainties, the control of systematics in cluster masses is crucial. Here we examine the effects of an important source of systematic uncertainty in galaxy-based cluster mass estimation techniques: the presence of significant dynamical substructure. Dynamical substructure manifests as dynamically distinct subgroups in phase-space, indicating an ‘unrelaxed’ state. This issue affects around a quarter of clusters in a generally selected sample. We employ a set of mock clusters whose masses have been measured homogeneously with commonly used galaxy-based mass estimation techniques (kinematic, richness, caustic, radial methods). We use these to study how the relation between observationally estimated and true cluster mass depends on the presence of substructure, as identified by various popular diagnostics. We find that the scatter for an ensemble of clusters does not increase dramatically for clusters with dynamical substructure. However, we find a systematic bias for all methods, such that clusters with significant substructure have higher measured masses than their relaxed counterparts. This bias depends on cluster mass: the most massive clusters are largely unaffected by the presence of significant substructure, but masses are significantly overestimated for lower mass clusters, by ~10 percent at 10^14 and ≳20 percent for ≲10^13.5. Finally, the use of cluster samples with different levels of substructure can therefore bias certain cosmological parameters up to a level comparable to the typical uncertainties in current cosmological studies.

  10. Visualizing statistical significance of disease clusters using cartograms.

    PubMed

    Kronenfeld, Barry J; Wong, David W S

    2017-05-15

    Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, we do not have existing guidelines for visual assessment of statistical uncertainty. To address this shortcoming, we develop techniques for visual determination of statistical significance of clusters spanning one or more districts on a cartogram. We developed the techniques within a geovisual analytics framework that does not rely on automated significance testing, and can therefore facilitate visual analysis to detect clusters that automated techniques might miss. On a cartogram of the at-risk population, the statistical significance of a disease cluster is determinate from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference of aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence analysis in California demonstrates the ability to visually distinguish between statistically significant and insignificant regions. The proposed geovisual analytics approach enables intuitive visual assessment of statistical significance of arbitrarily defined regions on a cartogram. 
    Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.

  11. Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.

    PubMed

    Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V

    2007-01-01

    The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and causal relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.

  12. Phenotyping asthma, rhinitis and eczema in MeDALL population-based birth cohorts: an allergic comorbidity cluster.

    PubMed

    Garcia-Aymerich, J; Benet, M; Saeys, Y; Pinart, M; Basagaña, X; Smit, H A; Siroux, V; Just, J; Momas, I; Rancière, F; Keil, T; Hohmann, C; Lau, S; Wahn, U; Heinrich, J; Tischer, C G; Fantini, M P; Lenzi, J; Porta, D; Koppelman, G H; Postma, D S; Berdel, D; Koletzko, S; Kerkhof, M; Gehring, U; Wickman, M; Melén, E; Hallberg, J; Bindslev-Jensen, C; Eller, E; Kull, I; Lødrup Carlsen, K C; Carlsen, K-H; Lambrecht, B N; Kogevinas, M; Sunyer, J; Kauffmann, F; Bousquet, J; Antó, J M

    2015-08-01

    Asthma, rhinitis and eczema often co-occur in children, but their interrelationships at the population level have been poorly addressed. We assessed co-occurrence of childhood asthma, rhinitis and eczema using unsupervised statistical techniques. We included 17 209 children at 4 years and 14 585 at 8 years from seven European population-based birth cohorts (MeDALL project). At each age period, children were grouped, using partitioning cluster analysis, according to the distribution of 23 variables covering symptoms 'ever' and 'in the last 12 months', doctor diagnosis, age of onset and treatments of asthma, rhinitis and eczema; immunoglobulin E sensitization; weight; and height. We tested the sensitivity of our estimates to subject and variable selections, and to different statistical approaches, including latent class analysis and self-organizing maps. Two groups were identified as the optimal way to cluster the data at both age periods and in all sensitivity analyses. The first (reference) group at 4 and 8 years (including 70% and 79% of children, respectively) was characterized by a low prevalence of symptoms and sensitization, whereas the second (symptomatic) group exhibited more frequent symptoms and sensitization. Ninety-nine percent of children with comorbidities (co-occurrence of asthma, rhinitis and/or eczema) were included in the symptomatic group at both ages. The children's characteristics in both groups were consistent in all sensitivity analyses. At 4 and 8 years, at the population level, asthma, rhinitis and eczema can be classified together as an allergic comorbidity cluster. Future research including time-repeated assessments and biological data will help in understanding the interrelationships between these diseases. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Statistical uncertainty of extreme wind storms over Europe derived from a probabilistic clustering technique

    NASA Astrophysics Data System (ADS)

    Walz, Michael; Leckebusch, Gregor C.

    2016-04-01

    Extratropical wind storms are among the most dangerous and loss-intensive natural hazards for Europe. However, because only 50 years of high quality observational data are available, it is difficult to assess the statistical uncertainty of these sparse events based on observations alone. Over the last decade seasonal ensemble forecasts have become indispensable in quantifying the uncertainty of weather prediction on seasonal timescales. In this study seasonal forecasts are used in a climatological context: by making use of the up to 51 ensemble members, a broad and physically consistent statistical base can be created. This base can then be used to assess the statistical uncertainty of extreme wind storm occurrence more accurately. In order to determine the statistical uncertainty of storms with different paths of progression, a probabilistic clustering approach using regression mixture models is used to objectively assign storm tracks (either based on core pressure or on extreme wind speeds) to different clusters. The advantage of this technique is that the entire lifetime of a storm is considered by the clustering algorithm. Quadratic curves are found to describe the storm tracks most accurately. Three main clusters (diagonal, horizontal or vertical progression of the storm track) can be identified, each of which has its own particular features. Basic storm features like average velocity and duration are calculated and compared for each cluster. The main benefit of this clustering technique, however, is to evaluate whether the clusters show different degrees of uncertainty, e.g. more (less) spread for tracks approaching Europe horizontally (diagonally). This statistical uncertainty is compared for different seasonal forecast products.

  14. Quality Evaluation of Agricultural Distillates Using an Electronic Nose

    PubMed Central

    Dymerski, Tomasz; Gębicki, Jacek; Wardencki, Waldemar; Namieśnik, Jacek

    2013-01-01

    The paper presents the application of an electronic nose instrument to fast evaluation of agricultural distillates differing in quality. The investigations were carried out using a prototype of electronic nose equipped with a set of six semiconductor sensors by FIGARO Co., an electronic circuit converting signal into digital form and a set of thermostats able to provide gradient temperature characteristics to a gas mixture. A volatile fraction of the agricultural distillate samples differing in quality was obtained by barbotage. Interpretation of the results involved three data analysis techniques: principal component analysis, single-linkage cluster analysis and cluster analysis with spheres method. The investigations prove the usefulness of the presented technique in the quality control of agricultural distillates. Optimum measurements conditions were also defined, including volumetric flow rate of carrier gas (15 L/h), thermostat temperature during the barbotage process (15 °C) and time of sensor signal acquisition from the onset of the barbotage process (60 s). PMID:24287525

  15. Moisture structure of tropical cloud systems as inferred from SSM/I

    NASA Technical Reports Server (NTRS)

    Robertson, Franklin R.

    1989-01-01

    The structure of tropical cloud systems was examined using data obtained by the Special Sensor Microwave/Imager on vertically-integrated vapor, ice, and liquid water (including precipitable water) in a cloud cluster associated with a Pacific easterly wave. The cloud cluster provided a sample of the varying signatures of bulk microphysical processes in organized tropical convection. Composition techniques were used to interpret this variability and its significance in terms of the response of convection to its thermodynamic environment. The relative intensities of the ice and liquid-water signatures should provide insight on the relative contribution of stratiform vs convective rain and the characteristics of the water budgets of mesoscale convective systems.

  16. Spectral-element simulation of two-dimensional elastic wave propagation in fully heterogeneous media on a GPU cluster

    NASA Astrophysics Data System (ADS)

    Rudianto, Indra; Sudarmaji

    2018-04-01

    We present an implementation of the spectral-element method for simulation of two-dimensional elastic wave propagation in fully heterogeneous media. We have incorporated most realistic geological features in the model, including surface topography, curved layer interfaces, and 2-D wave-speed heterogeneity. To accommodate such complexity, we use an unstructured quadrilateral meshing technique. Simulation was performed on a GPU cluster, which consists of 24 core processors (Intel Xeon CPU) and 4 NVIDIA Quadro graphics cards, using a CUDA and MPI implementation. We speed up the computation by a factor of about 5 compared to the MPI-only implementation, and by a factor of about 40 compared to the serial implementation.

  17. Standard Giant Branches in the Washington Photometric System

    NASA Technical Reports Server (NTRS)

    Geisler, Doug; Sarajedini, Ata

    1998-01-01

    We have obtained CCD photometry in the Washington system C, T_1 filters for some 850,000 objects associated with 10 Galactic globular clusters and 2 old open clusters. These clusters have well-known metal abundances, spanning a metallicity range of 2.5 dex from [Fe/H] approx. -2.25 to +0.25 at a spacing of approx. 0.2 dex. Two independent observations were obtained for each cluster and internal checks, as well as external comparisons with existing photoelectric photometry, indicate that the final colors and magnitudes have overall uncertainties of 0.03 mag. Analogous to the method employed by Da Costa and Armandroff for V, I photometry, we then proceed to construct standard (M_{T_1}, (C - T_1)_0) giant branches for these clusters adopting the Lee et al. distance scale, using some 350 stars per globular cluster to define the giant branch. We then determine the metallicity sensitivity of the (C - T_1)_0 color at a given M_{T_1} value. The Washington system technique is found to have three times the metallicity sensitivity of the V, I technique. At M_{T_1} = -2 (about a magnitude below the tip of the giant branch, roughly equivalent to M_I = -3), the giant branches of 47 Tuc and M15 are separated by 1.16 magnitudes in (C - T_1)_0 and only 0.38 magnitudes in (V - I)_0. Thus, for a given photometric accuracy, metallicities can be determined three times more precisely with the Washington technique. We find that a linear relationship between (C - T_1)_0 (at M_{T_1} = -2) and metallicity exists over the full metallicity range, with an rms of only 0.04 dex. We also derive metallicity calibrations for M_{T_1} = -2.5 and -1.5, as well as for two other metallicity scales. The Washington technique retains almost the same metallicity sensitivity at faint magnitudes, and indeed the standard giant branches are still well separated even below the horizontal branch. The photometry is used to set upper limits in the range 0.03 - 0.09 dex for any intrinsic metallicity dispersion in the calibrating clusters. The calibrations are applicable to objects with ages approx. greater than 5 Gyr - any age effects are small or negligible for such objects. This new technique is found to have many advantages over the old two-color diagram technique for deriving metallicities from Washington photometry. In addition to only requiring 2 filters instead of 3 or 4, the new technique is generally much less sensitive to reddening and photometric errors, and the metallicity sensitivity is many times higher. The new technique is especially advantageous for metal-poor objects. The five metal-poor clusters determined by Geisler et al., using the old technique, to be much more metal-poor than previous indications, yield metallicities using the new technique which are in excellent agreement with the Zinn scale.

  18. CMOS: Efficient Clustered Data Monitoring in Sensor Networks

    PubMed Central

    2013-01-01

    Tiny and smart sensors enable applications that access a network of hundreds or thousands of sensors. Thus, recently, many researchers have paid attention to wireless sensor networks (WSNs). The limitation of energy is critical since most sensors are battery-powered and it is very difficult to replace batteries when sensor networks are deployed outdoors. Data transmission between sensor nodes needs more energy than computation in a sensor node. In order to reduce the energy consumption of sensors, we present an approximate data gathering technique, called CMOS, based on the Kalman filter. The goal of CMOS is to efficiently obtain the sensor readings within a certain error bound. In our approach, spatially close sensors are grouped as a cluster. Since a cluster header generates approximate readings of member nodes, a user query can be answered efficiently using the cluster headers. In addition, we suggest an energy efficient clustering method to distribute the energy consumption of cluster headers. Our simulation results with synthetic data demonstrate the efficiency and accuracy of our proposed technique. PMID:24459444

  19. CMOS: efficient clustered data monitoring in sensor networks.

    PubMed

    Min, Jun-Ki

    2013-01-01

    Tiny and smart sensors enable applications that access a network of hundreds or thousands of sensors. Thus, recently, many researchers have paid attention to wireless sensor networks (WSNs). The limitation of energy is critical since most sensors are battery-powered and it is very difficult to replace batteries when sensor networks are deployed outdoors. Data transmission between sensor nodes needs more energy than computation in a sensor node. In order to reduce the energy consumption of sensors, we present an approximate data gathering technique, called CMOS, based on the Kalman filter. The goal of CMOS is to efficiently obtain the sensor readings within a certain error bound. In our approach, spatially close sensors are grouped as a cluster. Since a cluster header generates approximate readings of member nodes, a user query can be answered efficiently using the cluster headers. In addition, we suggest an energy efficient clustering method to distribute the energy consumption of cluster headers. Our simulation results with synthetic data demonstrate the efficiency and accuracy of our proposed technique.
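
    The abstract above does not spell out the CMOS protocol, so the following is only a minimal sketch of the general idea of Kalman-filter-based approximate gathering within an error bound. It assumes a scalar random-walk model and assumes that the sensor and its cluster header run identical filters, so a reading is transmitted only when the shared prediction misses it by more than the bound. The names `ScalarKalman` and `gather`, and the noise parameters `q` and `r`, are hypothetical.

```python
class ScalarKalman:
    """1-D Kalman filter for a random-walk state (illustrative model only)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.1):
        self.x = x0  # state estimate
        self.p = p0  # estimate variance
        self.q = q   # process noise variance (assumed value)
        self.r = r   # measurement noise variance (assumed value)

    def predict(self):
        # Random walk: the state prediction is unchanged, uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Standard scalar Kalman measurement update.
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain


def gather(readings, error_bound):
    """Return (values seen by the cluster header, number of transmissions)."""
    kf = ScalarKalman(readings[0])
    estimates = [readings[0]]  # the first reading is always transmitted
    sent = 1
    for z in readings[1:]:
        predicted = kf.predict()
        if abs(z - predicted) > error_bound:
            # Prediction too far off: transmit, and both sides update the filter.
            kf.update(z)
            estimates.append(z)
            sent += 1
        else:
            # Prediction is within the bound: nothing is transmitted.
            estimates.append(predicted)
    return estimates, sent
```

    By construction every value held by the header is either the exact reading or a prediction within `error_bound` of it, which is the sense in which such a scheme trades a bounded error for fewer transmissions.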

  20. Parallel k-means++

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
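
    For orientation, the seed selection step that the record above parallelizes can be sketched serially as follows. This is not the record's C++ code; it is a plain Python illustration of the usual k-means++ rule, in which each new seed is drawn with probability proportional to its squared distance from the nearest seed chosen so far. The function names are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng=None):
    """Pick k initial seeds from `points` using D^2-weighted sampling."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # First seed: chosen uniformly at random.
    seeds = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest seed.
    d2 = [squared_distance(p, seeds[0]) for p in points]
    while len(seeds) < k:
        if sum(d2) == 0:
            # Every point coincides with an existing seed; fall back to uniform.
            new_seed = rng.choice(points)
        else:
            new_seed = rng.choices(points, weights=d2, k=1)[0]
        seeds.append(new_seed)
        # Refresh nearest-seed distances; this per-point loop is the
        # data-parallel part of the computation.
        d2 = [min(old, squared_distance(p, new_seed)) for old, p in zip(d2, points)]
    return seeds
```

    The distance refresh touches every point independently, which suggests why the step maps naturally onto GPU threads or OpenMP loops.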

  1. Using cluster analysis for medical resource decision making.

    PubMed

    Dilts, D; Khamalah, J; Plotkin, A

    1995-01-01

    Escalating costs of health care delivery have in the recent past often made the health care industry investigate, adapt, and apply those management techniques relating to budgeting, resource control, and forecasting that have long been used in the manufacturing sector. A strategy that has contributed much in this direction is the definition and classification of a hospital's output into "products" or groups of patients that impose similar resource or cost demands on the hospital. Existing classification schemes have frequently employed cluster analysis in generating these groupings. Unfortunately, the myriad articles and books on clustering and classification contain few formalized selection methodologies for choosing a technique for solving a particular problem, hence they often leave the novice investigator at a loss. This paper reviews the literature on clustering, particularly as it has been applied in the medical resource-utilization domain, addresses the critical choices facing an investigator in the medical field using cluster analysis, and offers suggestions (using the example of clustering low-vision patients) for how such choices can be made.

  2. Identifying irregularly shaped crime hot-spots using a multiobjective evolutionary algorithm

    NASA Astrophysics Data System (ADS)

    Wu, Xiaolan; Grubesic, Tony H.

    2010-12-01

    Spatial cluster detection techniques are widely used in criminology, geography, epidemiology, and other fields. In particular, spatial scan statistics are popular and efficient techniques for detecting areas of elevated crime or disease events. The majority of spatial scan approaches attempt to delineate geographic zones by evaluating the significance of clusters using likelihood ratio statistics tested with the Poisson distribution. While this can be effective, many scan statistics give preference to circular clusters, diminishing their ability to identify elongated and/or irregular shaped clusters. Although adjusting the shape of the scan window can mitigate some of these problems, both the significance of irregular clusters and their spatial structure must be accounted for in a meaningful way. This paper utilizes a multiobjective evolutionary algorithm to find clusters with maximum significance while quantitatively tracking their geographic structure. Crime data for the city of Cincinnati are utilized to demonstrate the advantages of the new approach and highlight its benefits versus more traditional scan statistics.
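
    As a point of reference for the likelihood ratio statistics mentioned above, the sketch below computes the Poisson log-likelihood ratio commonly used in spatial scan statistics for a single candidate zone. It is not the paper's multiobjective algorithm. It assumes expected counts are scaled so that they sum to the total observed count, and the function name is hypothetical.

```python
import math


def poisson_llr(cases, expected, total_cases):
    """Poisson scan log-likelihood ratio for one candidate zone.

    cases       -- observed events inside the zone
    expected    -- expected events inside the zone under the null hypothesis
    total_cases -- observed events in the whole study area
    """
    if not 0 < expected < total_cases:
        raise ValueError("expected must lie strictly between 0 and total_cases")
    if not 0 <= cases <= total_cases:
        raise ValueError("cases must lie between 0 and total_cases")
    if cases <= expected:
        # Only zones with an elevated rate count as candidate clusters.
        return 0.0
    inside = cases * math.log(cases / expected)
    outside_cases = total_cases - cases
    if outside_cases == 0:
        outside = 0.0  # limit of x * log(x) as x -> 0
    else:
        outside = outside_cases * math.log(outside_cases / (total_cases - expected))
    return inside + outside
```

    A scan evaluates this statistic over many candidate zones and keeps the maximum; significance is then usually judged by Monte Carlo replication, which is omitted here.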

  3. A formal concept analysis approach to consensus clustering of multi-experiment expression data

    PubMed Central

    2014-01-01

    Background: Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results: We propose a novel generic consensus clustering technique that applies a Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from a multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological signals. Conclusions: The proposed FCA-enhanced consensus clustering technique is a general approach to the combination of clustering algorithms with FCA for deriving clustering solutions from multiple gene expression matrices. The experimental results presented herein demonstrate that it is a robust data integration technique able to produce a good quality clustering solution that is representative of the whole set of expression matrices. PMID:24885407

  4. Clustering by soft-constraint affinity propagation: applications to gene-expression data.

    PubMed

    Leone, Michele; Sumedha; Weigt, Martin

    2007-10-15

    Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP) based on message-passing techniques was proposed by Frey and Dueck (2007a). In AP, each cluster is identified by a common exemplar that all other data points of the same cluster refer to, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, e.g. in analyzing gene expression data. This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows interpolation between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) becomes more informative and accurate, and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows the extraction of sparse gene expression signatures for each cluster.

  5. Task shifting of frontline community health workers for cardiovascular risk reduction: design and rationale of a cluster randomised controlled trial (DISHA study) in India.

    PubMed

    Jeemon, Panniyammakal; Narayanan, Gitanjali; Kondal, Dimple; Kahol, Kashvi; Bharadwaj, Ashok; Purty, Anil; Negi, Prakash; Ladhani, Sulaiman; Sanghvi, Jyoti; Singh, Kuldeep; Kapoor, Deksha; Sobti, Nidhi; Lall, Dorothy; Manimunda, Sathyaprakash; Dwivedi, Supriya; Toteja, Gurudyal; Prabhakaran, Dorairaj

    2016-03-15

    Effective task-shifting interventions targeted at reducing the global cardiovascular disease (CVD) epidemic in low and middle-income countries (LMICs) are urgently needed. DISHA is a cluster randomised controlled trial conducted across 10 sites (5 in phase 1 and 5 in phase 2) in India in 120 clusters. At each site, 12 clusters were randomly selected from a district. A cluster is defined as a small village with 250-300 households and well defined geographical boundaries. The clusters were then randomly allocated to intervention and control arms in a 1:1 allocation sequence. If any of the intervention and control clusters were <10 km apart, one was dropped and replaced with another randomly selected cluster from the same district. The study included a representative baseline cross-sectional survey, development of a structured intervention model, delivery of intervention for a minimum period of 18 months by trained frontline health workers (mainly Anganwadi workers and ASHA workers) and a post intervention survey in a representative sample. The study staff had no information on intervention allocation until the completion of the baseline survey. In order to ensure comparability of data across sites, the DISHA study follows a common protocol and manual of operation with standardized measurement techniques. Our study is the largest community based cluster randomised trial in low and middle-income country settings designed to test the effectiveness of 'task shifting' interventions involving frontline health workers for cardiovascular risk reduction. CTRI/2013/10/004049. Registered 7 October 2013.
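
    The 1:1 allocation of a site's clusters described above can be illustrated with a short sketch. It assumes simple shuffling of cluster identifiers and deliberately leaves out the <10 km replacement rule, since the abstract does not say how distances were checked. The function name and the arm labels are hypothetical.

```python
import random


def allocate_clusters(cluster_ids, rng=None):
    """Randomly split an even number of clusters 1:1 into two trial arms."""
    rng = rng or random.Random()
    ids = list(cluster_ids)
    if len(ids) % 2 != 0:
        raise ValueError("1:1 allocation needs an even number of clusters")
    rng.shuffle(ids)  # random permutation of the clusters
    half = len(ids) // 2
    return {
        "intervention": sorted(ids[:half]),
        "control": sorted(ids[half:]),
    }
```

    With 12 clusters per site this yields 6 intervention and 6 control clusters, which across 10 sites matches the 120 clusters quoted in the abstract.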

  6. Detection and tracking of gas plumes in LWIR hyperspectral video sequence data

    NASA Astrophysics Data System (ADS)

    Gerhart, Torin; Sunu, Justin; Lieu, Lauren; Merkurjev, Ekaterina; Chang, Jen-Mei; Gilles, Jérôme; Bertozzi, Andrea L.

    2013-05-01

    Automated detection of chemical plumes presents a segmentation challenge. The segmentation problem for gas plumes is difficult due to the diffusive nature of the cloud. The advantage of considering hyperspectral images in the gas plume detection problem over the conventional RGB imagery is the presence of non-visual data, allowing for a richer representation of information. In this paper we present an effective method of visualizing hyperspectral video sequences containing chemical plumes and investigate the effectiveness of segmentation techniques on these post-processed videos. Our approach uses a combination of dimension reduction and histogram equalization to prepare the hyperspectral videos for segmentation. First, Principal Components Analysis (PCA) is used to reduce the dimension of the entire video sequence. This is done by projecting each pixel onto the first few Principal Components resulting in a type of spectral filter. Next, a Midway method for histogram equalization is used. These methods redistribute the intensity values in order to reduce flicker between frames. This properly prepares these high-dimensional video sequences for more traditional segmentation techniques. We compare the ability of various clustering techniques to properly segment the chemical plume. These include K-means, spectral clustering, and the Ginzburg-Landau functional.
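
    Of the clustering methods compared above, K-means is the simplest to illustrate. The sketch below is a plain Lloyd's-algorithm implementation on made-up two-component pixel features (standing in for PCA-reduced spectra); it is not the authors' code, and the data, the initial centres and the function name are assumptions chosen for illustration.

```python
def kmeans(points, centers, iterations=10):
    """Plain Lloyd's algorithm on tuples of floats; returns (labels, centers)."""
    def nearest(p, cs):
        # Index of the centre with the smallest squared Euclidean distance.
        return min(range(len(cs)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(p, cs[i])))

    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # New centre is the per-coordinate mean of the members.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's centre
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return labels, centers

# Hypothetical reduced pixel features: three background, three plume pixels.
pixels = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(pixels, centers=[pixels[0], pixels[3]])
print(labels)
```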

  7. Isofunctional Protein Subfamily Detection Using Data Integration and Spectral Clustering.

    PubMed

    Boari de Lima, Elisa; Meira, Wagner; Melo-Minardi, Raquel Cardoso de

    2016-06-01

    As increasingly more genomes are sequenced, the vast majority of proteins may only be annotated computationally, given experimental investigation is extremely costly. This highlights the need for computational methods to determine protein functions quickly and reliably. We believe dividing a protein family into subtypes which share specific functions uncommon to the whole family reduces the function annotation problem's complexity. Hence, this work's purpose is to detect isofunctional subfamilies inside a family of unknown function, while identifying differentiating residues. Similarity between protein pairs according to various properties is interpreted as functional similarity evidence. Data are integrated using genetic programming and provided to a spectral clustering algorithm, which creates clusters of similar proteins. The proposed framework was applied to well-known protein families and to a family of unknown function, then compared to ASMC. Results showed our fully automated technique obtained better clusters than ASMC for two families, besides equivalent results for other two, including one whose clusters were manually defined. Clusters produced by our framework showed great correspondence with the known subfamilies, besides being more contrasting than those produced by ASMC. Additionally, for the families whose specificity determining positions are known, such residues were among those our technique considered most important to differentiate a given group. When run with the crotonase and enolase SFLD superfamilies, the results showed great agreement with this gold-standard. Best results consistently involved multiple data types, thus confirming our hypothesis that similarities according to different knowledge domains may be used as functional similarity evidence. 
    Our main contributions are the proposed strategy for selecting and integrating data types, along with the ability to work with noisy and incomplete data; domain knowledge usage for detecting subfamilies in a family with different specificities, thus reducing the complexity of the experimental function characterization problem; and the identification of residues responsible for specificity.

  8. Isofunctional Protein Subfamily Detection Using Data Integration and Spectral Clustering

    PubMed Central

    Boari de Lima, Elisa; Meira, Wagner; de Melo-Minardi, Raquel Cardoso

    2016-01-01

    As increasingly more genomes are sequenced, the vast majority of proteins may only be annotated computationally, given experimental investigation is extremely costly. This highlights the need for computational methods to determine protein functions quickly and reliably. We believe dividing a protein family into subtypes which share specific functions uncommon to the whole family reduces the function annotation problem’s complexity. Hence, this work’s purpose is to detect isofunctional subfamilies inside a family of unknown function, while identifying differentiating residues. Similarity between protein pairs according to various properties is interpreted as functional similarity evidence. Data are integrated using genetic programming and provided to a spectral clustering algorithm, which creates clusters of similar proteins. The proposed framework was applied to well-known protein families and to a family of unknown function, then compared to ASMC. Results showed our fully automated technique obtained better clusters than ASMC for two families, besides equivalent results for other two, including one whose clusters were manually defined. Clusters produced by our framework showed great correspondence with the known subfamilies, besides being more contrasting than those produced by ASMC. Additionally, for the families whose specificity determining positions are known, such residues were among those our technique considered most important to differentiate a given group. When run with the crotonase and enolase SFLD superfamilies, the results showed great agreement with this gold-standard. Best results consistently involved multiple data types, thus confirming our hypothesis that similarities according to different knowledge domains may be used as functional similarity evidence. 
    Our main contributions are the proposed strategy for selecting and integrating data types, along with the ability to work with noisy and incomplete data; domain knowledge usage for detecting subfamilies in a family with different specificities, thus reducing the complexity of the experimental function characterization problem; and the identification of residues responsible for specificity. PMID:27348631

  9. Clustering analysis for muon tomography data elaboration in the Muon Portal project

    NASA Astrophysics Data System (ADS)

    Bandieramonte, M.; Antonuccio-Delogu, V.; Becciani, U.; Costa, A.; La Rocca, P.; Massimino, P.; Petta, C.; Pistagna, C.; Riggi, F.; Riggi, S.; Sciacca, E.; Vitello, F.

    2015-05-01

    Clustering analysis is a multivariate data analysis technique that gathers statistical data units into groups so as to minimize the logical distance within each group and maximize the distance between different groups. In these proceedings, the authors present a novel approach to muon tomography data analysis based on clustering algorithms. As a case study we present the Muon Portal project, which aims to build and operate a dedicated particle detector for the inspection of harbor containers to hinder the smuggling of nuclear materials. Clustering techniques, working directly on scattering points, help to detect the presence of suspicious items inside the container, acting, as will be shown, as a filter for a preliminary analysis of the data.

  10. The Global Optimization of Pt13 Cluster Using the First-Principle Molecular Dynamics with the Quenching Technique

    NASA Astrophysics Data System (ADS)

    Chen, Xiangping; Duan, Haiming; Cao, Biaobing; Long, Mengqiu

    2018-03-01

    The high-temperature first-principle molecular dynamics method used to obtain the low energy configurations of clusters [L. L. Wang and D. D. Johnson, PRB 75, 235405 (2007)] is extended to a considerably large temperature range by combination with the quenching technique. Our results show that there are strong correlations between the possibilities for obtaining the ground-state structure and the temperatures. Larger possibilities can be obtained at relatively low temperatures (as corresponds to the pre-melting temperature range). Details of the structural correlation with the temperature are investigated by taking the Pt13 cluster as an example, which suggests a quite efficient method to obtain the lowest-energy geometries of metal clusters.

  11. Using reflectron time-of-flight mass spectrometer techniques to investigate cluster dynamics and bonding

    NASA Astrophysics Data System (ADS)

    Wei, Shiqing; Castleman, A. W., Jr.

    1994-02-01

    Laser-based time-of-flight mass spectrometer systems affixed with reflectrons are valuable tools for investigating cluster dynamics and reactions, spectroscopy and structures. Utilizing the reflectron time-of-flight mass spectrometer techniques, both decay fractions and kinetic energy releases of metastable cluster ions can be measured with high precision. By applying related theoretical models, the desired thermochemical values of metastable species can be deduced, which are otherwise very difficult to obtain. Several examples are discussed with attention focused on ammonia as a test case for hydrogen bond systems, and xenon for weaker van der Waals clusters. A brief overview of applications to investigating solvation effects on reactions and structures, delayed electron transfer and ionization through intracluster Penning ionization is also given.

  12. Automatic identification of the number of food items in a meal using clustering techniques based on the monitoring of swallowing and chewing.

    PubMed

    Lopez-Meyer, Paulo; Schuckers, Stephanie; Makeyev, Oleksandr; Fontana, Juan M; Sazonov, Edward

    2012-09-01

    The number of distinct foods consumed in a meal is of significant clinical concern in the study of obesity and other eating disorders. This paper proposes the use of information contained in chewing and swallowing sequences for meal segmentation by food types. Data collected from experiments of 17 volunteers were analyzed using two different clustering techniques. First, an unsupervised clustering technique, Affinity Propagation (AP), was used to automatically identify the number of segments within a meal. Second, performance of the unsupervised AP method was compared to a supervised learning approach based on Agglomerative Hierarchical Clustering (AHC). While the AP method was able to obtain 90% accuracy in predicting the number of food items, the AHC achieved an accuracy >95%. Experimental results suggest that the proposed models of automatic meal segmentation may be utilized as part of an integral application for objective Monitoring of Ingestive Behavior in free living conditions.

  13. Optimal Cluster Mill Pass Scheduling With an Accurate and Rapid New Strip Crown Model

    NASA Astrophysics Data System (ADS)

    Malik, Arif S.; Grandhi, Ramana V.; Zipf, Mark E.

    2007-05-01

    Besides the requirement to roll coiled sheet at high levels of productivity, the optimal pass scheduling of cluster-type reversing cold mills presents the added challenge of assigning mill parameters that facilitate the best possible strip flatness. The pressures of intense global competition, and the requirements for increasingly thinner, higher quality specialty sheet products that are more difficult to roll, continue to force metal producers to commission innovative flatness-control technologies. This means that during the on-line computerized set-up of rolling mills, the mathematical model should not only determine the minimum total number of passes and maximum rolling speed, it should simultaneously optimize the pass-schedule so that desired flatness is assured, either by manual or automated means. In many cases today, however, on-line prediction of strip crown and corresponding flatness for the complex cluster-type rolling mills is typically addressed either by trial and error, by approximate deflection models for equivalent vertical roll-stacks, or by non-physical pattern recognition style models. The abundance of the aforementioned methods is largely due to the complexity of cluster-type mill configurations and the lack of deflection models with sufficient accuracy and speed for on-line use. Without adequate assignment of the pass-schedule set-up parameters, it may be difficult or impossible to achieve the required strip flatness. In this paper, we demonstrate optimization of cluster mill pass-schedules using a new accurate and rapid strip crown model. This pass-schedule optimization includes computations of the predicted strip thickness profile to validate mathematical constraints. In contrast to many of the existing methods for on-line prediction of strip crown and flatness on cluster mills, the demonstrated method requires minimal prior tuning and no extensive training with collected mill data. 
    To rapidly and accurately solve the multi-contact problem and predict the strip crown, a new customized semi-analytical modeling technique that couples the Finite Element Method (FEM) with classical solid mechanics was developed to model the deflection of the rolls and strip while under load. The technique employed offers several important advantages over traditional methods to calculate strip crown, including continuity of elastic foundations, non-iterative solution when using predetermined foundation moduli, continuous third-order displacement fields, simple stress-field determination, and a comparatively faster solution time.

  14. Skilled delivery service utilization and its association with the establishment of Women's Health Development Army in Yeky district, South West Ethiopia: a multilevel analysis.

    PubMed

    Negero, Melese Girmaye; Mitike, Yifru Berhan; Worku, Abebaw Gebeyehu; Abota, Tafesse Lamaro

    2018-01-30

    Because of the unacceptably high maternal and perinatal morbidity and mortality, the government of Ethiopia has established health extension program with a community-based network involving health extension workers (HEWs) and a community level women organization which is known as "Women's Health Development Army" (WHDA). Currently, the HEWs and WHDA network is the approach preferred by the government to register pregnant women and encourage them to link in the healthcare system. However, its association with skilled delivery service utilization is not well known. A community-based cross-sectional study was conducted from January to February 2015. Within 380 clusters of WHDA, a total of 748 reproductive-age women who gave birth in 1 year preceding the study, were included using multistage sampling technique. The data were entered into EPI info version 7 statistical software and exported to STATA version 11 for analysis. Multilevel analysis technique was applied to check for an association of selected variables with a utilization of skilled delivery service. About 45% of women have received skilled delivery care. A significant heterogeneity was observed between "Women's Health Development Teams (clusters)" for skilled delivery care service utilization which explains about 62% of the total variation. Individual-level predictors including urban residence [AOR (95% CI) 35.10 (4.62, 266.52)], previous exposure of complications [AOR (95% CI) 3.81 (1.60, 9.08)], at least four ANC visits [AOR (95% CI) 7.44 (1.48, 37.42)] and preference of skilled personnel [AOR (95% CI) 8.11 (2.61, 25.15)] were significantly associated with skilled delivery service use. Among cluster level variables, the distance of clusters within 2 km radius from the nearest health facility was significantly associated [AOR (95% CI) 6.03 (1.92, 18.93)] with skilled delivery service utilization. In this study, significant variation among clusters of WHDA was observed. 
    Both individual and cluster level variables were identified to predict skilled delivery service utilization. Encouraging women to have frequent ANC visits (4 and above), enhancing awareness creation towards the delivery care attendance, constructing more health facilities and roads in hard to reach areas and establishing telemedicine services are recommended.

  15. Effect of denoising on supervised lung parenchymal clusters

    NASA Astrophysics Data System (ADS)

    Jayamani, Padmapriya; Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.

    2012-03-01

    Denoising is a critical preconditioning step for quantitative analysis of medical images. Despite promises for more consistent diagnosis, denoising techniques are seldom explored in clinical settings. While this may be attributed to the esoteric nature of the parameter sensitive algorithms, lack of quantitative measures on their efficacy to enhance the clinical decision making is a primary cause of physician apathy. This paper addresses this issue by exploring the effect of denoising on the integrity of supervised lung parenchymal clusters. Multiple Volumes of Interests (VOIs) were selected across multiple high resolution CT scans to represent samples of different patterns (normal, emphysema, ground glass, honey combing and reticular). The VOIs were labeled through consensus of four radiologists. The original datasets were filtered by multiple denoising techniques (median filtering, anisotropic diffusion, bilateral filtering and non-local means) and the corresponding filtered VOIs were extracted. Plurality of cluster indices based on multiple histogram-based pair-wise similarity measures were used to assess the quality of supervised clusters in the original and filtered space. The resultant rank orders were analyzed using the Borda criteria to find the denoising-similarity measure combination that has the best cluster quality. Our exhaustive analysis reveals (a) for a number of similarity measures, the cluster quality is inferior in the filtered space; and (b) for measures that benefit from denoising, a simple median filtering outperforms non-local means and bilateral filtering. Our study suggests the need to judiciously choose, if required, a denoising technique that does not deteriorate the integrity of supervised clusters.

  16. The Application of Clustering Techniques to Citation Data. Research Reports Series B No. 6.

    ERIC Educational Resources Information Center

    Arms, William Y.; Arms, Caroline

    This report describes research carried out as part of the Design of Information Systems in the Social Sciences (DISISS) project. Cluster analysis techniques were applied to a machine readable file of bibliographic data in the form of cited journal titles in order to identify groupings which could be used to structure bibliographic files. Practical…

  17. Use of joint two-view information for computerized lesion detection on mammograms: improvement of microcalcification detection accuracy

    NASA Astrophysics Data System (ADS)

    Sahiner, Berkman; Gurcan, Metin N.; Chan, Heang-Ping; Hadjiiski, Lubomir M.; Petrick, Nicholas; Helvie, Mark A.

    2002-05-01

    We are developing new techniques to improve the accuracy of computerized microcalcification detection by using the joint two-view information on craniocaudal (CC) and mediolateral-oblique (MLO) views. After cluster candidates were detected using a single-view detection technique, candidates on CC and MLO views were paired using their radial distances from the nipple. Object pairs were classified with a joint two-view classifier that used the similarity of objects in a pair. Each cluster candidate was also classified as a true microcalcification cluster or a false-positive (FP) using its single-view features. The outputs of these two classifiers were fused. A data set of 38 pairs of mammograms from our database was used to train the new detection technique. The independent test set consisted of 77 pairs of mammograms from the University of South Florida public database. At a per-film sensitivity of 70%, the FP rates were 0.17 and 0.27 with the fusion and single-view detection methods, respectively. Our results indicate that correspondence of cluster candidates on two different views provides valuable additional information for distinguishing false from true microcalcification clusters.
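
    The pairing step described above (matching candidates on the two views by their radial distance from the nipple) might look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' method: the tolerance value, the function name and the example distances are all hypothetical, and the joint two-view classifier and fusion stages are not shown.

```python
def pair_candidates(cc_radii, mlo_radii, tolerance_mm=10.0):
    """Pair CC and MLO candidates whose nipple distances roughly agree.

    cc_radii / mlo_radii: radial distances (mm) of each candidate from the
    nipple on the CC and MLO views. tolerance_mm is a made-up threshold.
    Returns a list of (cc_index, mlo_index) pairs.
    """
    pairs = []
    for i, r_cc in enumerate(cc_radii):
        for j, r_mlo in enumerate(mlo_radii):
            if abs(r_cc - r_mlo) <= tolerance_mm:
                pairs.append((i, j))
    return pairs

# Hypothetical candidate distances on each view.
cc = [32.0, 75.5]
mlo = [30.0, 58.0, 80.0]
pairs = pair_candidates(cc, mlo)
print(pairs)
```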

  18. Modulation aware cluster size optimisation in wireless sensor networks

    NASA Astrophysics Data System (ADS)

    Sriram Naik, M.; Kumar, Vinay

    2017-07-01

    Wireless sensor networks (WSNs) play an important role because of their numerous benefits to mankind. The main challenge with WSNs is energy efficiency. In this paper, we focus on energy minimisation through cluster size optimisation, taking into account the effect of modulation when the nodes are not able to communicate using a baseband communication technique. Cluster size optimisation is an important technique for improving the performance of WSNs. It provides improvements in energy efficiency, network scalability, network lifetime and latency. We propose an analytical expression for cluster size optimisation using the traditional sensing model of nodes for a square sensing field, with consideration of modulation effects. Energy minimisation can be achieved by changing the modulation scheme (BPSK, 16-QAM, QPSK, 64-QAM, etc.), so we consider the effect of different modulation techniques on cluster formation. The nodes in the sensing field are randomly and uniformly deployed. We also observe that placing the base station at the centre of the field allows very few modulation schemes to work in an energy-efficient manner, whereas placing it at the corner of the sensing field allows a large number of modulation schemes to do so.

  19. Density-cluster NMA: A new protein decomposition technique for coarse-grained normal mode analysis.

    PubMed

    Demerdash, Omar N A; Mitchell, Julie C

    2012-07-01

    Normal mode analysis has emerged as a useful technique for investigating protein motions on long time scales. This is largely due to the advent of coarse-graining techniques, particularly Hooke's Law-based potentials and the rotational-translational blocking (RTB) method for reducing the size of the force-constant matrix, the Hessian. Here we present a new method for domain decomposition for use in RTB that is based on hierarchical clustering of atomic density gradients, which we call Density-Cluster RTB (DCRTB). The method reduces the number of degrees of freedom by 85-90% compared with the standard blocking approaches. We compared the normal modes from DCRTB against standard RTB using 1-4 residues in sequence in a single block, with good agreement between the two methods. We also show that Density-Cluster RTB and standard RTB perform well in capturing the experimentally determined direction of conformational change. Significantly, we report superior correlation of DCRTB with B-factors compared with 1-4 residue per block RTB. Finally, we show significant reduction in computational cost for Density-Cluster RTB that is nearly 100-fold for many examples. Copyright © 2012 Wiley Periodicals, Inc.

  20. Collaboration patterns in the German political science co-authorship network.

    PubMed

    Leifeld, Philip; Wankmüller, Sandra; Berger, Valentin T Z; Ingold, Karin; Steiner, Christiane

    2017-01-01

    Research on social processes in the production of scientific output suggests that the collective research agenda of a discipline is influenced by its structural features, such as "invisible colleges" or "groups of collaborators" as well as academic "stars" that are embedded in, or connect, these research groups. Based on an encompassing dataset that takes into account multiple publication types including journals and chapters in edited volumes, we analyze the complete co-authorship network of all 1,339 researchers in German political science. Through the use of consensus graph clustering techniques and descriptive centrality measures, we identify the ten largest research clusters, their research topics, and the most central researchers who act as bridges and connect these clusters. We also aggregate the findings at the level of research organizations and consider the inter-university co-authorship network. The findings indicate that German political science is structured by multiple overlapping research clusters with a dominance of the subfields of international relations, comparative politics and political sociology. A small set of well-connected universities takes leading roles in these informal research groups.

  1. Collaboration patterns in the German political science co-authorship network

    PubMed Central

    Leifeld, Philip; Wankmüller, Sandra; Berger, Valentin T. Z.; Ingold, Karin; Steiner, Christiane

    2017-01-01

    Research on social processes in the production of scientific output suggests that the collective research agenda of a discipline is influenced by its structural features, such as “invisible colleges” or “groups of collaborators” as well as academic “stars” that are embedded in, or connect, these research groups. Based on an encompassing dataset that takes into account multiple publication types including journals and chapters in edited volumes, we analyze the complete co-authorship network of all 1,339 researchers in German political science. Through the use of consensus graph clustering techniques and descriptive centrality measures, we identify the ten largest research clusters, their research topics, and the most central researchers who act as bridges and connect these clusters. We also aggregate the findings at the level of research organizations and consider the inter-university co-authorship network. The findings indicate that German political science is structured by multiple overlapping research clusters with a dominance of the subfields of international relations, comparative politics and political sociology. A small set of well-connected universities takes leading roles in these informal research groups. PMID:28388621

  2. Close proximity electrostatic effect from small clusters of emitters

    NASA Astrophysics Data System (ADS)

    Dall'Agnol, Fernando F.; de Assis, Thiago A.

    2017-10-01

    Using a numerical simulation based on the finite-element technique, this work investigates the field emission properties from clusters of a few emitters at close proximity, by analyzing the properties of the maximum local field enhancement factor (γm) and the corresponding emission current. At short distances between the emitters, we show the existence of a nonintuitive behavior, which consists of an increase of γm as the distance c between the emitters decreases. Here we investigate this phenomenon for clusters with 2, 3, 4 and 7 identical emitters and study the influence of the proximity effect in the emission current, considering the role of the aspect ratio of the individual emitters. Importantly, our results show that peripheral emitters with high aspect-ratios in large clusters can, in principle, significantly increase the emitted current as a consequence only of the close proximity electrostatic effect (CPEE). This phenomenon can be seen as a physical mechanism to produce self-oscillations of individual emitters. We discuss new insights for understanding the nature of self-oscillations in emitters based on the CPEE, including applications to nanometric oscillators.

  3. Gene Discovery in Bladder Cancer Progression using cDNA Microarrays

    PubMed Central

    Sanchez-Carbayo, Marta; Socci, Nicholas D.; Lozano, Juan Jose; Li, Wentian; Charytonowicz, Elizabeth; Belbin, Thomas J.; Prystowsky, Michael B.; Ortiz, Angel R.; Childs, Geoffrey; Cordon-Cardo, Carlos

    2003-01-01

    To identify gene expression changes along progression of bladder cancer, we compared the expression profiles of early-stage and advanced bladder tumors using cDNA microarrays containing 17,842 known genes and expressed sequence tags. The application of bootstrapping techniques to hierarchical clustering segregated early-stage and invasive transitional carcinomas into two main clusters. Multidimensional analysis confirmed these clusters and more importantly, it separated carcinoma in situ from papillary superficial lesions and subgroups within early-stage and invasive tumors displaying different overall survival. Additionally, it recognized early-stage tumors showing gene profiles similar to invasive disease. Different techniques including standard t-test, single-gene logistic regression, and support vector machine algorithms were applied to identify relevant genes involved in bladder cancer progression. Cytokeratin 20, neuropilin-2, p21, and p33ING1 were selected among the top ranked molecular targets differentially expressed and validated by immunohistochemistry using tissue microarrays (n = 173). Their expression patterns were significantly associated with pathological stage, tumor grade, and altered retinoblastoma (RB) expression. Moreover, p33ING1 expression levels were significantly associated with overall survival. Analysis of the annotation of the most significant genes revealed the relevance of critical genes and pathways during bladder cancer progression, including the overexpression of oncogenic genes such as DEK in superficial tumors or immune response genes such as Cd86 antigen in invasive disease. Gene profiling successfully classified bladder tumors based on their progression and clinical outcome. The present study has identified molecular biomarkers of potential clinical significance and critical molecular targets associated with bladder cancer progression. PMID:12875971

  4. Microseismic Monitoring of Stimulating Shale Gas Reservoir in SW China: 2. Spatial Clustering Controlled by the Preexisting Faults and Fractures

    NASA Astrophysics Data System (ADS)

    Chen, Haichao; Meng, Xiaobo; Niu, Fenglin; Tang, Youcai; Yin, Chen; Wu, Furong

    2018-02-01

    Microseismic monitoring is crucial to improving stimulation efficiency of hydraulic fracturing treatment, as well as to mitigating potential induced seismic hazard. We applied an improved matching and locating technique to the downhole microseismic data set during one treatment stage along a horizontal well within the Weiyuan shale gas play inside Sichuan Basin in SW China, resulting in 3,052 well-located microseismic events. We employed this expanded catalog to investigate the spatiotemporal evolution of the microseismicity in order to constrain migration of the injected fluids and the associated dynamic processes. The microseismicity is generally characterized by two distinctly different clusters, both of which are highly correlated with the injection activity spatially and temporally. The distant and well-confined cluster (cluster A) is featured by relatively large-magnitude events, with 40 events of M -1 or greater, whereas the cluster in the immediate vicinity of the wellbore (cluster B) includes two apparent lineations of seismicity with a NE-SW trend, consistent with the predominant orientation of natural fractures. We calculated the b-value and D-value, an index of fracture complexity, and found significant differences between the two seismicity clusters. Particularly, the distant cluster showed an extremely low b-value (~0.47) and D-value (~1.35). We speculate that the distant cluster is triggered by reactivation of a preexisting critically stressed fault, whereas the two lineations are induced by shear failures of optimally oriented natural fractures associated with fluid diffusion. In both cases, the spatially clustered microseismicity related to hydraulic stimulation is strongly controlled by the preexisting faults and fractures.
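
    The abstract does not say how the b-value was estimated. One widely used option is the Aki maximum-likelihood estimator, b = log10(e) / (mean(M) - Mc), with an optional half-bin correction to Mc. The sketch below assumes that estimator; the magnitudes, the completeness magnitude and the function name are invented for illustration and are not taken from the study.

```python
import math

def b_value_aki(magnitudes, completeness_mag, bin_width=0.0):
    """Aki maximum-likelihood b-value for events at or above completeness_mag.

    bin_width applies the usual half-bin correction for binned magnitudes;
    leave it at 0.0 for continuous magnitudes. Assumes the mean magnitude
    of the usable events is strictly above the corrected threshold.
    """
    usable = [m for m in magnitudes if m >= completeness_mag]
    mean_mag = sum(usable) / len(usable)
    threshold = completeness_mag - bin_width / 2.0
    return math.log10(math.e) / (mean_mag - threshold)

# Made-up microseismic magnitudes with an assumed completeness of M -1.0.
mags = [-1.0, -0.5, 0.0, -0.5]
b = b_value_aki(mags, completeness_mag=-1.0)
print(round(b, 3))
```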

  5. Optimization Techniques for Clustering, Connectivity, and Flow Problems in Complex Networks

    DTIC Science & Technology

    2012-10-01

    discrete optimization and for analysis of performance of algorithm portfolios; introducing a metaheuristic framework of variable objective search that...The results of empirical evaluation of the proposed algorithm are also included. 1.3 Theoretical analysis of heuristics and designing new metaheuristic ...analysis of heuristics for inapproximable problems and designing new metaheuristic approaches for the problems of interest; (IV) Developing new models

  6. GRC RBCC Concept Multidisciplinary Analysis

    NASA Technical Reports Server (NTRS)

    Suresh, Ambady

    2001-01-01

    This report outlines the GRC RBCC Concept for Multidisciplinary Analysis. The multidisciplinary coupling procedure is presented, along with technique validations and axisymmetric multidisciplinary inlet and structural results. The NPSS (Numerical Propulsion System Simulation) test bed developments and code parallelization are also presented. These include milestones and accomplishments, a discussion of running R4 fan application on the PII cluster as compared to other platforms, and the National Combustor Code speedup.

  7. Machine Learning for Biological Trajectory Classification Applications

    NASA Technical Reports Server (NTRS)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms are compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.

  8. Photobiomolecular deposition of metallic particles and films

    DOEpatents

    Hu, Zhong-Cheng

    2005-02-08

    The method of the invention is based on the unique electron-carrying function of a photocatalytic unit such as the photosynthesis system I (PSI) reaction center of the protein-chlorophyll complex isolated from chloroplasts. The method employs a photo-biomolecular metal deposition technique for precisely controlled nucleation and growth of metallic clusters/particles, e.g., platinum, palladium, and their alloys, etc., as well as for thin-film formation above the surface of a solid substrate. The photochemically mediated technique offers numerous advantages over traditional deposition methods including quantitative atom deposition control, high energy efficiency, and mild operating condition requirements.

  9. Photobiomolecular metallic particles and films

    DOEpatents

    Hu, Zhong-Cheng

    2003-05-06

    The method of the invention is based on the unique electron-carrying function of a photocatalytic unit such as the photosynthesis system I (PSI) reaction center of the protein-chlorophyll complex isolated from chloroplasts. The method employs a photo-biomolecular metal deposition technique for precisely controlled nucleation and growth of metallic clusters/particles, e.g., platinum, palladium, and their alloys, etc., as well as for thin-film formation above the surface of a solid substrate. The photochemically mediated technique offers numerous advantages over traditional deposition methods including quantitative atom deposition control, high energy efficiency, and mild operating condition requirements.

  10. Logo image clustering based on advanced statistics

    NASA Astrophysics Data System (ADS)

    Wei, Yi; Kamel, Mohamed; He, Yiwei

    2007-11-01

    In recent years, there has been a growing interest in the research of image content description techniques. Among those, image clustering is one of the most frequently discussed topics. Similar to image recognition, image clustering is also a high-level representation technique. However it focuses on the coarse categorization rather than the accurate recognition. Based on wavelet transform (WT) and advanced statistics, the authors propose a novel approach that divides various shaped logo images into groups according to the external boundary of each logo image. Experimental results show that the presented method is accurate, fast and insensitive to defects.

  11. Reconstruction of cluster masses using particle based lensing

    NASA Astrophysics Data System (ADS)

    Deb, Sanghamitra

    Clusters of galaxies are among the richest astrophysical data systems, but to truly understand these systems, we need a detailed study of the relationship between observables and the underlying cluster dark matter distribution. Gravitational lensing is the most direct probe of dark matter, but many mass reconstruction techniques assume that cluster light traces mass, or combine different lensing signals in an ad hoc way. In this talk, we will describe "Particle Based Lensing" (PBL), a new method for cluster mass reconstruction, that avoids many of the pitfalls of previous techniques. PBL optimally combines lensing information of varying signal-to-noise, and makes no assumptions about the relationship between mass and light. We will describe mass reconstructions in three very different, but very illuminating cluster systems: the "Bullet Cluster" (1E 0657-56), A901/902 and A1689. The "Bullet Cluster" is a system of merging clusters made famous by the first unambiguous lensing detection of dark matter. A901/902 is a multi-cluster system with four peaks, and provides an ideal laboratory for studying cluster interaction. We are particularly interested in measuring and correlating the dark matter clump ellipticities. A1689 is one of the richest clusters known, and has significant substructure at the core. It is also my first exercise in optimally combining weak and strong gravitational lensing in a cluster reconstruction. We find that the dark matter distribution is significantly clumpier than indicated by X-ray maps of the gas. We conclude by discussing various potential applications of PBL to existing and future data.

  12. Into the Bowels of Depression: Unravelling Medical Symptoms Associated with Depression by Applying Machine-Learning Techniques to a Community Based Population Sample.

    PubMed

    Dipnall, Joanna F; Pasco, Julie A; Berk, Michael; Williams, Lana J; Dodd, Seetal; Jacka, Felice N; Meyer, Denny

    2016-01-01

    Depression is commonly comorbid with many other somatic diseases and symptoms. Identification of individuals in clusters with comorbid symptoms may reveal new pathophysiological mechanisms and treatment targets. The aim of this research was to combine machine-learning (ML) algorithms with traditional regression techniques by utilising self-reported medical symptoms to identify and describe clusters of individuals with increased rates of depression from a large cross-sectional community based population epidemiological study. A multi-staged methodology utilising ML and traditional statistical techniques was performed using the community based population National Health and Nutrition Examination Study (2009-2010) (N = 3,922). A Self-organised Mapping (SOM) ML algorithm, combined with hierarchical clustering, was performed to create participant clusters based on 68 medical symptoms. Binary logistic regression, controlling for sociodemographic confounders, was used to then identify the key clusters of participants with higher levels of depression (PHQ-9≥10, n = 377). Finally, a Multiple Additive Regression Tree boosted ML algorithm was run to identify the important medical symptoms for each key cluster within 17 broad categories: heart, liver, thyroid, respiratory, diabetes, arthritis, fractures and osteoporosis, skeletal pain, blood pressure, blood transfusion, cholesterol, vision, hearing, psoriasis, weight, bowels and urinary. Five clusters of participants, based on medical symptoms, were identified to have significantly increased rates of depression compared to the cluster with the lowest rate: odds ratios ranged from 2.24 (95% CI 1.56, 3.24) to 6.33 (95% CI 1.67, 24.02). The ML boosted regression algorithm identified three key medical condition categories as being significantly more common in these clusters: bowel, pain and urinary symptoms. Bowel-related symptoms were found to dominate the relative importance of symptoms within the five key clusters. This methodology shows promise for the identification of conditions in general populations and supports the current focus on the potential importance of bowel symptoms and the gut in mental health research.
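
    The odds ratios quoted above come from logistic regression adjusted for sociodemographic confounders. As a simpler illustration of how an odds ratio and its 95% confidence interval relate, the sketch below computes the unadjusted odds ratio of a 2x2 table with the standard log-odds (Woolf) interval. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  reference_cases, reference_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald interval on the log-odds scale.

    Assumption: a plain 2x2 table with all four cells non-zero; the record
    above used covariate-adjusted logistic regression instead.
    """
    a, b, c, d = exposed_cases, exposed_noncases, reference_cases, reference_noncases
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: depressed / not depressed in one cluster vs the reference cluster.
ratio, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

    Small cell counts widen the interval sharply, which is one plausible reading of the broad interval reported for the largest odds ratio.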

  13. Detection of Anomalies in Hydrometric Data Using Artificial Intelligence Techniques

    NASA Astrophysics Data System (ADS)

    Lauzon, N.; Lence, B. J.

    2002-12-01

    This work focuses on the detection of anomalies in hydrometric data sequences, such as 1) outliers, which are individual data having statistical properties that differ from those of the overall population; 2) shifts, which are sudden changes over time in the statistical properties of the historical records of data; and 3) trends, which are systematic changes over time in the statistical properties. For the purpose of the design and management of water resources systems, it is important to be aware of these anomalies in hydrometric data, for they can induce a bias in the estimation of water quantity and quality parameters. These anomalies may be viewed as specific patterns affecting the data, and therefore pattern recognition techniques can be used for identifying them. However, the number of possible patterns is very large for each type of anomaly and consequently large computing capacities are required to account for all possibilities using the standard statistical techniques, such as cluster analysis. Artificial intelligence techniques, such as the Kohonen neural network and fuzzy c-means, are clustering techniques commonly used for pattern recognition in several areas of engineering and have recently begun to be used for the analysis of natural systems. They require much less computing capacity than the standard statistical techniques, and therefore are well suited for the identification of outliers, shifts and trends in hydrometric data. This work constitutes a preliminary study, using synthetic data representing hydrometric data that can be found in Canada. The analysis of the results obtained shows that the Kohonen neural network and fuzzy c-means are reasonably successful in identifying anomalies. This work also addresses the problem of uncertainties inherent to the calibration procedures that fit the clusters to the possible patterns for both the Kohonen neural network and fuzzy c-means. 
    Indeed, for the same database, different sets of clusters can be established with these calibration procedures. A simple method for analyzing uncertainties associated with the Kohonen neural network and fuzzy c-means is developed here. The method combines the results from several sets of clusters, either from the Kohonen neural network or fuzzy c-means, so as to provide an overall diagnosis as to the identification of outliers, shifts and trends. The results indicate an improvement in the performance for identifying anomalies when the method of combining cluster sets is used, compared with when only one cluster set is used.
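
    The abstract does not spell out how the cluster sets are combined. One simple possibility, assumed here purely for illustration, is a vote: each calibration run flags the data points it considers anomalous, and a point is reported only if more than a chosen fraction of runs agree. The function name, the threshold and the flagged indices are hypothetical.

```python
def combine_anomaly_flags(runs, n_points, min_fraction=0.5):
    """Combine anomaly flags from several clustering runs by voting.

    Assumption: voting is a stand-in for the combination method of the
    record above, which is not described in the abstract.
    runs is a list of collections of flagged point indices, one per run.
    """
    if not runs:
        raise ValueError("at least one run is required")
    votes = [0] * n_points
    for flagged in runs:
        for index in set(flagged):  # count each point once per run
            votes[index] += 1
    # Keep points flagged by strictly more than min_fraction of the runs.
    return [i for i, v in enumerate(votes) if v / len(runs) > min_fraction]


# Four hypothetical calibration runs over ten data points.
consensus = combine_anomaly_flags([{3, 7}, {3}, {3, 9}, {7, 3}], n_points=10)
```

    Raising min_fraction trades sensitivity for fewer false alarms, since a point must then be flagged consistently across differently calibrated cluster sets.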

  14. Site-Specific Biomolecule Labeling with Gold Clusters

    PubMed Central

    Ackerson, Christopher J.; Powell, Richard D.; Hainfeld, James F.

    2013-01-01

    Site-specific labeling of biomolecules in vitro with gold clusters can enhance the information content of electron cryomicroscopy experiments. This chapter provides a practical overview of well-established techniques for forming biomolecule/gold cluster conjugates. Three bioconjugation chemistries are covered: Linker-mediated bioconjugation, direct gold–biomolecule bonding, and coordination-mediated bonding of nickel(II) nitrilotriacetic acid (NTA)-derivatized gold clusters to polyhistidine (His)-tagged proteins. PMID:20887859

  15. Could the clinical interpretability of subgroups detected using clustering methods be improved by using a novel two-stage approach?

    PubMed

    Kent, Peter; Stochkendahl, Mette Jensen; Christensen, Henrik Wulff; Kongsted, Alice

    2015-01-01

    Recognition of homogeneous subgroups of patients can usefully improve prediction of their outcomes and the targeting of treatment. There are a number of research approaches that have been used to recognise homogeneity in such subgroups and to test their implications. One approach is to use statistical clustering techniques, such as Cluster Analysis or Latent Class Analysis, to detect latent relationships between patient characteristics. Influential patient characteristics can come from diverse domains of health, such as pain, activity limitation, physical impairment, social role participation, psychological factors, biomarkers and imaging. However, such 'whole person' research may result in data-driven subgroups that are complex, difficult to interpret and challenging to recognise clinically. This paper describes a novel approach to applying statistical clustering techniques that may improve the clinical interpretability of derived subgroups and reduce sample size requirements. This approach involves clustering in two sequential stages. The first stage involves clustering within health domains and therefore requires creating as many clustering models as there are health domains in the available data. This first stage produces scoring patterns within each domain. The second stage involves clustering using the scoring patterns from each health domain (from the first stage) to identify subgroups across all domains. We illustrate this using chest pain data from the baseline presentation of 580 patients. The new two-stage clustering resulted in two subgroups that approximated the classic textbook descriptions of musculoskeletal chest pain and atypical angina chest pain. The traditional single-stage clustering resulted in five clusters that were also clinically recognisable but displayed less distinct differences. In this paper, a new approach to using clustering techniques to identify clinically useful subgroups of patients is suggested. 
    This approach has potential benefits but requires broad testing, in multiple patient samples, to determine its clinical value. Research designs, statistical methods and outcome metrics suitable for performing that testing are also described. The usefulness of the approach is likely to be context-specific, depending on the characteristics of the available data and the research question being asked of it.

  16. A Comparison of Two Approaches to Beta-Flexible Clustering.

    ERIC Educational Resources Information Center

    Belbin, Lee; And Others

    1992-01-01

    A method for hierarchical agglomerative polythetic (multivariate) clustering, based on unweighted pair group using arithmetic averages (UPGMA) is compared with the original beta-flexible technique, a weighted average method. Reasons the flexible UPGMA strategy is recommended are discussed, focusing on the ability to recover cluster structure over…

  17. Hyperspectral remote sensing of paddy crop using insitu measurement and clustering technique

    NASA Astrophysics Data System (ADS)

    Moharana, S.; Dutta, S.

    2014-11-01

    Rice agriculture, practised mainly in South Asia, is monitored with both optical and microwave remote sensing to extract crop parameters, crop area, crop growth profiles and crop yield. Hyperspectral data provide more detailed information on rice agriculture. The present study was carried out at the experimental station of the Regional Rainfed Lowland Rice Research Station, Assam, India (26.1400° N, 91.7700° E); the study area falls within the Lower Brahmaputra Valley (LBV) Agro Climatic Zone. Hyperspectral measurements were made in 2009 on 72 plots comprising eight rice varieties and three levels of nitrogen treatment (50, 100, 150 kg/ha), covering the period from rice transplanting to harvest. With an emphasis on varieties, further hyperspectral measurements were taken in 2014 on 24 plots holding 24 rice genotypes at different crop developmental ages. All measurements were performed using a spectroradiometer with a spectral range of 350-1050 nm under direct sunlight, a cloud-free sky and stable atmospheric conditions, covering more than 95 % of the canopy. In this study, the reflectance collected from the rice canopy was expressed in terms of waveforms, which were analysed for all combinations of nitrogen application and variety. A hierarchical clustering technique was employed to classify these waveforms into groups. Using an agglomerative clustering algorithm, a small number of clusters was finalized for the different rice varieties and nitrogen treatments. This clustering approach also nullified observational error in the spectroradiometer reflectance. From this hierarchical clustering, a spectral signature library for the rice canopy was identified, which will help to create rice crop classification maps and so improve information on rice agriculture at both local and regional scales. Critical wavebands such as green (519, 559 nm), red (649 nm), red edge (729 nm) and NIR (779, 819 nm) were marked as sensitive to nitrogen, which will further help in nitrogen mapping of paddy agriculture and therefore offers the prospect of better-informed decisions.

  18. Spectroscopic Confirmation of a Massive Red-sequence Selected Galaxy Cluster at Z=1.34 in the SpARCS-South Cluster Survey

    NASA Technical Reports Server (NTRS)

    Wilson, Gillian; Demarco, Ricardo; Muzzin, Adam; Yee, H.K.C.; Lacy, Mark; Surace, Jason; Gilbank, David; Blindert, Kris; Hoekstra, Henk; Majumdar, Subhabrata

    2008-01-01

    The Spitzer Adaptation of the Red-sequence Cluster Survey (SpARCS) is a z'-passband imaging survey, consisting of deep (z' approx. 24 AB) observations made from both hemispheres using the CFHT 3.6m and CTIO 4m telescopes. The survey was designed with the primary aim of detecting galaxy clusters at z > 1. In tandem with pre-existing 3.6 micron observations from the Spitzer Space Telescope SWIRE Legacy Survey, SpARCS detects clusters using an infrared adaptation of the two-filter red-sequence cluster technique. The total effective area of the SpARCS cluster survey is 41.9 sq deg. In this paper, we provide an overview of the 13.6 sq deg Southern CTIO/MOSAICII observations. The 28.3 sq deg Northern CFHT/MegaCam observations are summarized in a companion paper by Muzzin et al. (2008a). In this paper, we also report spectroscopic confirmation of SpARCS J003550-431224, a very rich galaxy cluster at z = 1.335, discovered in the ELAIS-S1 field. To date, this is the highest spectroscopically confirmed redshift for a galaxy cluster discovered using the red-sequence technique. Based on nine confirmed members, SpARCS J003550-431224 has a preliminary velocity dispersion of 1050+/-230 km/s. With its proven capability for efficient cluster detection, SpARCS is a demonstration that we have entered an era of large, homogeneously-selected z > 1 cluster surveys.

  19. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.

    PubMed

    Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela

    2015-05-01

    Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  20. Development of Pattern Recognition Techniques for the Evaluation of Toxicant Impacts to Multispecies Systems

    DTIC Science & Technology

    1993-06-18

    the exception. In the Standardized Aquatic Microcosm and the Mixed Flask Culture (MFC) microcosms, multivariate analysis and clustering methods...rule rather than the exception. In the Standardized Aquatic Microcosm and the Mixed Flask Culture (MFC) microcosms, multivariate analysis and...experiments using two microcosm protocols. We use nonmetric clustering, a multivariate pattern recognition technique developed by Matthews and Hearne (1991

  1. The Effect of Roundtable and Clustering Teaching Techniques and Students' Personal Traits on Students' Achievement in Descriptive Writing

    ERIC Educational Resources Information Center

    Sinaga, Megawati

    2017-01-01

    The objective of this experimental study was to investigate the effect of Roundtable and Clustering teaching techniques and students' personal traits on students' achievement in descriptive writing. The students in grade IX of SMP Negeri 2 Pancurbatu in the 2016/2017 academic year were chosen as the population of this research. The…

  2. Electronic levels and charge distribution near the interface of nickel

    NASA Technical Reports Server (NTRS)

    Waber, J. T.

    1982-01-01

    The energy levels in clusters of nickel atoms were investigated by means of a series of cluster calculations using both the multiple scattering technique and a computational technique (designated SSO) that avoids the muffin-tin approximation. The point group symmetry of the cluster has a significant effect on the energy of levels that are nominally unoccupied. This influences the electron transfer process during chemisorption. The SSO technique permits the approaching atom or molecule plus a small number of nickel atoms to be treated as a cluster. Specifically, molecular levels become more negative in the O atom, as well as in a CO molecule, as the metal atoms are approached. Thus, electron transfer from the nickel and bond formation are facilitated. This result is of importance in understanding chemisorption and catalytic processes.

  3. RELICS: Strong-lensing Analysis of the Massive Clusters MACS J0308.9+2645 and PLCK G171.9‑40.7

    NASA Astrophysics Data System (ADS)

    Acebron, Ana; Cibirka, Nathália; Zitrin, Adi; Coe, Dan; Agulli, Irene; Sharon, Keren; Bradač, Maruša; Frye, Brenda; Livermore, Rachael C.; Mahler, Guillaume; Salmon, Brett; Umetsu, Keiichi; Bradley, Larry; Andrade-Santos, Felipe; Avila, Roberto; Carrasco, Daniela; Cerny, Catherine; Czakon, Nicole G.; Dawson, William A.; Hoag, Austin T.; Huang, Kuang-Han; Johnson, Traci L.; Jones, Christine; Kikuchihara, Shotaro; Lam, Daniel; Lovisari, Lorenzo; Mainali, Ramesh; Oesch, Pascal A.; Ogaz, Sara; Ouchi, Masami; Past, Matthew; Paterno-Mahler, Rachel; Peterson, Avery; Ryan, Russell E.; Sendra-Server, Irene; Stark, Daniel P.; Strait, Victoria; Toft, Sune; Trenti, Michele; Vulcani, Benedetta

    2018-05-01

    Strong gravitational lensing by galaxy clusters has become a powerful tool for probing the high-redshift universe, magnifying distant and faint background galaxies. Reliable strong-lensing (SL) models are crucial for determining the intrinsic properties of distant, magnified sources and for constructing their luminosity function. We present here the first SL analysis of MACS J0308.9+2645 and PLCK G171.9‑40.7, two massive galaxy clusters imaged with the Hubble Space Telescope, in the framework of the Reionization Lensing Cluster Survey (RELICS). We use the light-traces-mass modeling technique to uncover sets of multiply imaged galaxies and constrain the mass distribution of the clusters. Our SL analysis reveals that both clusters have particularly large Einstein radii (θ_E > 30″ for a source redshift of z_s = 2), providing fairly large areas with high magnifications, useful for high-redshift galaxy searches (∼2 arcmin² with μ > 5 to ∼1 arcmin² with μ > 10, similar to a typical Hubble Frontier Fields cluster). We also find that MACS J0308.9+2645 hosts a promising, apparently bright (J ∼ 23.2–24.6 AB), multiply imaged high-redshift candidate at z ∼ 6.4. These images are among the brightest high-redshift candidates found in RELICS. Our mass models, including magnification maps, are made publicly available for the community through the Mikulski Archive for Space Telescopes.

  4. Waveforms clustering of small magnitude earthquakes recorded in the Northern Sicilian offshore: evidence of multiplets

    NASA Astrophysics Data System (ADS)

    D'Alessandro, A.; Mangano, G.; D'Anna, G.; Luzio, D.; Selvaggi, G.

    2011-12-01

    On September 6th 2002, northern Sicily was hit by a strong earthquake (MW 5.9). In the following six months, over a thousand aftershocks were located in the same area. On December 7th 2009, the INGV OBSLab deployed an OBS/H near the epicentral area of the main shock at a depth of 1500 m. The submarine station was recovered after 233 days. During the eight months of the experiment, the OBS/H recorded about 250 small-magnitude events of clear local origin. In order to identify seismic events generated by the same tectonic structure, we applied a clustering technique based on the similarity of the waveforms. The similarity matrix was constructed using the maximum of the normalized cross-covariance function. To identify the multiplets, we used an agglomerative hierarchical clustering algorithm with the nearest-neighbor strategy. The results are summarized in the dendrogram of Fig. 1. The partitions were obtained by "cutting" the dendrogram at a distance level of 0.3. In this way we identified 9 multiplets and some doublets and triplets. Fig. 2 shows multiplet 1 as an example. The events of this cluster have a high level of similarity; 25 of the 31 micro-events are characterized by a similarity greater than 0.9. Because the micro-earthquakes were recorded by the OBS/H only, a single-station location technique was implemented and applied to locate them. Some multiplets have clouds of hypocenters overlapping each other. These clusters, indistinguishable without the application of a waveform clustering technique, show differences in the waveforms that must be attributed to differences in the focal mechanisms that generated them.
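
    The procedure described above can be sketched in a few lines. Similarity is the maximum of the normalized cross-covariance over all lags, and cutting a nearest-neighbor (single-linkage) dendrogram at a given distance is equivalent to taking the connected components of the graph that links every pair closer than that distance. The sketch assumes distance = 1 - similarity, which the abstract does not state; the function names and the toy waveforms are hypothetical.

```python
import math


def max_norm_xcov(a, b):
    """Maximum over all lags of the normalized cross-covariance of a and b."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    a0 = [x - mean_a for x in a]
    b0 = [x - mean_b for x in b]
    norm = math.sqrt(sum(x * x for x in a0) * sum(x * x for x in b0))
    if norm == 0:
        return 0.0
    best = -1.0
    for lag in range(-(len(b0) - 1), len(a0)):
        # a0[i] is paired with b0[i - lag] wherever both indices are valid.
        start, stop = max(0, lag), min(len(a0), lag + len(b0))
        total = sum(a0[i] * b0[i - lag] for i in range(start, stop))
        best = max(best, total / norm)
    return best


def find_multiplets(waveforms, cut_distance=0.3):
    """Single-linkage clusters obtained by cutting at cut_distance.

    Assumption: distance is defined as 1 - similarity.
    """
    n = len(waveforms)
    parent = list(range(n))

    def root(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if 1.0 - max_norm_xcov(waveforms[i], waveforms[j]) <= cut_distance:
                parent[root(j)] = root(i)
    groups = {}
    for i in range(n):
        groups.setdefault(root(i), []).append(i)
    return sorted(groups.values())


# Toy traces: the second is a delayed copy of the first, the third is unrelated.
trace_a = [0, 0, 1, -1, 0.5, 0, 0, 0]
trace_b = [0, 0, 0, 0, 1, -1, 0.5, 0]
trace_c = [0, 1, 1, 1, 1, 0, 0, 0]
clusters = find_multiplets([trace_a, trace_b, trace_c])
```

    Because nearest-neighbor linkage chains events together, two waveforms can share a multiplet without being directly similar, provided a sequence of intermediate events connects them.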

  5. Net-zero Building Cluster Simulations and On-line Energy Forecasting for Adaptive and Real-Time Control and Decisions

    NASA Astrophysics Data System (ADS)

    Li, Xiwang

    Buildings consume about 41.1% of primary energy and 74% of the electricity in the U.S. Moreover, it is estimated by the National Energy Technology Laboratory that more than 1/4 of the 713 GW of U.S. electricity demand in 2010 could be dispatchable if only buildings could respond to that dispatch through advanced building energy control and operation strategies and smart grid infrastructure. In this study, it is envisioned that neighboring buildings will have the tendency to form a cluster, an open cyber-physical system to exploit the economic opportunities provided by a smart grid, distributed power generation, and storage devices. Through optimized demand management, these building clusters will then reduce overall primary energy consumption and peak time electricity consumption, and be more resilient to power disruptions. Therefore, this project seeks to develop a Net-zero building cluster simulation testbed and high fidelity energy forecasting models for adaptive and real-time control and decision making strategy development that can be used in a Net-zero building cluster. The following research activities are summarized in this thesis: 1) Development of a building cluster emulator for building cluster control and operation strategy assessment. 2) Development of a novel building energy forecasting methodology using active system identification and data fusion techniques. In this methodology, a systematic approach for building energy system characteristic evaluation, system excitation and model adaptation is included. The developed methodology is compared with other literature-reported building energy forecasting methods. 3) Development of high fidelity on-line building cluster energy forecasting models, which include energy forecasting models for buildings, PV panels, batteries and ice tank thermal storage systems. 4) A small-scale real building validation study to verify the performance of the developed building energy forecasting methodology. The outcomes of this thesis can be used for building cluster energy forecasting model development and model based control and operation optimization. The thesis concludes with a summary of the key outcomes of this research, as well as a list of recommendations for future work.

  6. Fluid-structure interaction modeling of clusters of spacecraft parachutes with modified geometric porosity

    NASA Astrophysics Data System (ADS)

    Takizawa, Kenji; Tezduyar, Tayfun E.; Boben, Joseph; Kostov, Nikolay; Boswell, Cody; Buscher, Austin

    2013-12-01

    To increase aerodynamic performance, the geometric porosity of a ringsail spacecraft parachute canopy is sometimes increased, beyond the "rings" and "sails" with hundreds of "ring gaps" and "sail slits." This creates extra computational challenges for fluid-structure interaction (FSI) modeling of clusters of such parachutes, beyond those created by the lightness of the canopy structure, geometric complexities of hundreds of gaps and slits, and the contact between the parachutes of the cluster. In FSI computation of parachutes with such "modified geometric porosity," the flow through the "windows" created by the removal of the panels and the wider gaps created by the removal of the sails cannot be accurately modeled with the Homogenized Modeling of Geometric Porosity (HMGP), which was introduced to deal with the hundreds of gaps and slits. The flow needs to be actually resolved. All these computational challenges need to be addressed simultaneously in FSI modeling of clusters of spacecraft parachutes with modified geometric porosity. The core numerical technology is the Stabilized Space-Time FSI (SSTFSI) technique, and the contact between the parachutes is handled with the Surface-Edge-Node Contact Tracking (SENCT) technique. In the computations reported here, in addition to the SSTFSI and SENCT techniques and HMGP, we use the special techniques we have developed for removing the numerical spinning component of the parachute motion and for restoring the mesh integrity without a remesh. We present results for 2- and 3-parachute clusters with two different payload models.

  7. Person mobility in the design and analysis of cluster-randomized cohort prevention trials.

    PubMed

    Vuchinich, Sam; Flay, Brian R; Aber, Lawrence; Bickman, Leonard

    2012-06-01

    Person mobility is an inescapable fact of life for most cluster-randomized (e.g., schools, hospitals, clinics, cities, states) cohort prevention trials. Mobility rates are an important substantive consideration in estimating the effects of an intervention. In cluster-randomized trials, mobility rates are often correlated with ethnicity, poverty and other variables associated with disparity. This raises the possibility that estimated intervention effects may generalize to only the least mobile segments of a population and, thus, create a threat to external validity. Such mobility can also create threats to the internal validity of conclusions from randomized trials. Researchers must decide how to deal with persons who leave study clusters during a trial (dropouts), persons and clusters that do not comply with an assigned intervention, and persons who enter clusters during a trial (late entrants), in addition to the persons who remain for the duration of a trial (stayers). Statistical techniques alone cannot solve the key issues of internal and external validity raised by the phenomenon of person mobility. This commentary presents a systematic, Campbellian-type analysis of person mobility in cluster-randomized cohort prevention trials. It describes four approaches for dealing with dropouts, late entrants and stayers with respect to data collection, analysis and generalizability. The questions at issue are: 1) From whom should data be collected at each wave of data collection? 2) Which cases should be included in the analyses of an intervention effect? and 3) To what populations can trial results be generalized? The conclusions lead to recommendations for the design and analysis of future cluster-randomized cohort prevention trials.

  8. SOTXTSTREAM: Density-based self-organizing clustering of text streams.

    PubMed

    Bryant, Avory C; Cios, Krzysztof J

    2017-01-01

    A streaming data clustering algorithm is presented building upon the density-based self-organizing stream clustering algorithm SOSTREAM. Many density-based clustering algorithms are limited by their inability to identify clusters with heterogeneous density. SOSTREAM addresses this limitation through the use of local (nearest neighbor-based) density determinations. Additionally, many stream clustering algorithms use a two-phase clustering approach. In the first phase, a micro-clustering solution is maintained online, while in the second phase, the micro-clustering solution is clustered offline to produce a macro solution. By performing self-organization techniques on micro-clusters in the online phase, SOSTREAM is able to maintain a macro clustering solution in a single phase. Leveraging concepts from SOSTREAM, a new density-based self-organizing text stream clustering algorithm, SOTXTSTREAM, is presented that addresses several shortcomings of SOSTREAM. Gains in clustering performance of this new algorithm are demonstrated on several real-world text stream datasets.

  9. Comparison of Bayesian clustering and edge detection methods for inferring boundaries in landscape genetics

    USGS Publications Warehouse

    Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.

    2011-01-01

    Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.

  10. AN ASTEROSEISMIC MEMBERSHIP STUDY OF THE RED GIANTS IN THREE OPEN CLUSTERS OBSERVED BY KEPLER: NGC 6791, NGC 6819, AND NGC 6811

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stello, Dennis; Huber, Daniel; Bedding, Timothy R.

    Studying star clusters offers significant advances in stellar astrophysics due to the combined power of having many stars with essentially the same distance, age, and initial composition. This makes clusters excellent test benches for verification of stellar evolution theory. To fully exploit this potential, it is vital that the star sample is uncontaminated by stars that are not members of the cluster. Techniques for determining cluster membership therefore play a key role in the investigation of clusters. We present results on three clusters in the Kepler field of view based on a newly established technique that uses asteroseismology to identify fore- or background stars in the field, which demonstrates advantages over classical methods such as kinematic and photometry measurements. Four previously identified seismic non-members in NGC 6819 are confirmed in this study, and three additional non-members are found: two in NGC 6819 and one in NGC 6791. We further highlight which stars are, or might be, affected by blending, which needs to be taken into account when analyzing these Kepler data.

  11. Structure of overheated metal clusters: MD simulation study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vorontsov, Alexander

    2015-08-17

    The structure of overheated metal clusters that appear during the condensation process was studied by computer simulation techniques. It was found that clusters larger than several tens of atoms have three layers: a core, an intermediate densely packed layer, and a gas-like shell of low density. The change in the size and structure of these layers with variation of the internal energy and the cluster size is discussed.

  12. Cancer detection based on Raman spectra super-paramagnetic clustering

    NASA Astrophysics Data System (ADS)

    González-Solís, José Luis; Guizar-Ruiz, Juan Ignacio; Martínez-Espinosa, Juan Carlos; Martínez-Zerega, Brenda Esmeralda; Juárez-López, Héctor Alfonso; Vargas-Rodríguez, Héctor; Gallegos-Infante, Luis Armando; González-Silva, Ricardo Armando; Espinoza-Padilla, Pedro Basilio; Palomares-Anda, Pascual

    2016-08-01

    The clustering of Raman spectra of serum samples is analyzed using the super-paramagnetic clustering technique based on the Potts spin model. We investigated the clustering of biochemical networks by using Raman data that define edge lengths in the network, and where the interactions are functions of the Raman spectra's individual band intensities. For this study, we used two groups of 58 and 102 control Raman spectra and the intensities of 160, 150 and 42 Raman spectra of serum samples from breast and cervical cancer and leukemia patients, respectively. The spectra were collected from patients at different hospitals in Mexico. Using the super-paramagnetic clustering technique, we identified the most natural and compact clusters, allowing us to discriminate between control and cancer patients. Of special interest was the leukemia case, where the nearly hierarchical observed structure allowed the identification of the patients' leukemia type. The goal of this study is to apply a model from statistical physics, super-paramagnetic clustering, to find these natural clusters, which allow us to design a cancer detection method. To the best of our knowledge, this is the first report of preliminary results evaluating the usefulness of super-paramagnetic clustering in the discipline of spectroscopy, where it is used for classification of spectra.

  13. Convalescing Cluster Configuration Using a Superlative Framework

    PubMed Central

    Sabitha, R.; Karthik, S.

    2015-01-01

    Competent data mining methods are vital to discover knowledge from databases which are built as a result of enormous growth of data. Various techniques of data mining are applied to obtain knowledge from these databases. Data clustering is one such descriptive data mining technique which guides in partitioning data objects into disjoint segments. K-means algorithm is a versatile algorithm among the various approaches used in data clustering. The algorithm and its diverse adaptation methods suffer certain problems in their performance. To overcome these issues a superlative algorithm has been proposed in this paper to perform data clustering. The specific feature of the proposed algorithm is discretizing the dataset, thereby improving the accuracy of clustering, and also adopting the binary search initialization method to generate cluster centroids. The generated centroids are fed as input to K-means approach which iteratively segments the data objects into respective clusters. The clustered results are measured for accuracy and validity. Experiments conducted by testing the approach on datasets from the UC Irvine Machine Learning Repository evidently show that the accuracy and validity measure is higher than the other two approaches, namely, simple K-means and Binary Search method. Thus, the proposed approach proves that discretization process will improve the efficacy of descriptive data mining tasks. PMID:26543895

  14. Cluster ensemble based on Random Forests for genetic data.

    PubMed

    Alhusain, Luluah; Hafez, Alaaeldin M

    2017-01-01

    Clustering plays a crucial role in several application domains, such as bioinformatics. In bioinformatics, clustering has been extensively used as an approach for detecting interesting patterns in genetic data. One application is population structure analysis, which aims to group individuals into subpopulations based on shared genetic variations, such as single nucleotide polymorphisms. Advances in DNA sequencing technology have facilitated the obtainment of genetic datasets with exceptional sizes. Genetic data usually contain hundreds of thousands of genetic markers genotyped for thousands of individuals, making an efficient means for handling such data desirable. Random Forests (RFs) has emerged as an efficient algorithm capable of handling high-dimensional data. RFs provides a proximity measure that can capture different levels of co-occurring relationships between variables. RFs has been widely considered a supervised learning method, although it can be converted into an unsupervised learning method. Therefore, an RF-derived proximity measure combined with a clustering technique may be well suited for determining the underlying structure of unlabeled data. This paper proposes RFcluE, a cluster ensemble approach for determining the underlying structure of genetic data based on RFs. The approach comprises a cluster ensemble framework to combine multiple runs of RF clustering. Experiments were conducted on a high-dimensional, real genetic dataset to evaluate the proposed approach. The experiments included an examination of the impact of parameter changes, a comparison of RFcluE performance against other clustering methods, and an assessment of the relationship between the diversity and quality of the ensemble and its effect on RFcluE performance. This paper proposes RFcluE, a cluster ensemble approach based on RF clustering, to address the problem of population structure analysis and demonstrates the effectiveness of the approach. The paper also illustrates that applying a cluster ensemble approach, combining multiple RF clusterings, produces more robust and higher-quality results as a consequence of feeding the ensemble with diverse views of high-dimensional genetic data obtained through bagging and random subspace, the two key features of the RF algorithm.

  15. MOCCA code for star cluster simulation: comparison with optical observations using COCOA

    NASA Astrophysics Data System (ADS)

    Askar, Abbas; Giersz, Mirek; Pych, Wojciech; Olech, Arkadiusz; Hypki, Arkadiusz

    2016-02-01

    We introduce and present preliminary results from COCOA (Cluster simulatiOn Comparison with ObservAtions) code for a star cluster after 12 Gyr of evolution simulated using the MOCCA code. The COCOA code is being developed to quickly compare results of numerical simulations of star clusters with observational data. We use COCOA to obtain parameters of the projected cluster model. For comparison, a FITS file of the projected cluster was provided to observers so that they could use their observational methods and techniques to obtain cluster parameters. The results show that the similarity of cluster parameters obtained through numerical simulations and observations depends significantly on the quality of observational data and photometric accuracy.

  16. Using Multilevel Factor Analysis with Clustered Data: Investigating the Factor Structure of the Positive Values Scale

    ERIC Educational Resources Information Center

    Huang, Francis L.; Cornell, Dewey G.

    2016-01-01

    Advances in multilevel modeling techniques now make it possible to investigate the psychometric properties of instruments using clustered data. Factor models that overlook the clustering effect can lead to underestimated standard errors, incorrect parameter estimates, and model fit indices. In addition, factor structures may differ depending on…

  17. Hierarchical Spatio-temporal Visual Analysis of Cluster Evolution in Electrocorticography Data

    DOE PAGES

    Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; ...

    2016-10-02

    Here, we present ECoG ClusterFlow, a novel interactive visual analysis tool for the exploration of high-resolution Electrocorticography (ECoG) data. Our system detects and visualizes dynamic high-level structures, such as communities, using the time-varying spatial connectivity network derived from the high-resolution ECoG data. ECoG ClusterFlow provides a multi-scale visualization of the spatio-temporal patterns underlying the time-varying communities using two views: 1) an overview summarizing the evolution of clusters over time and 2) a hierarchical glyph-based technique that uses data aggregation and small multiples techniques to visualize the propagation of clusters in their spatial domain. ECoG ClusterFlow makes it possible 1) to compare the spatio-temporal evolution patterns across various time intervals, 2) to compare the temporal information at varying levels of granularity, and 3) to investigate the evolution of spatial patterns without occluding the spatial context information. Lastly, we present case studies done in collaboration with neuroscientists on our team for both simulated and real epileptic seizure data aimed at evaluating the effectiveness of our approach.

  18. Machine learning approaches for estimation of prediction interval for the model output.

    PubMed

    Shrestha, Durga L; Solomatine, Dimitri P

    2006-03-01

    A novel method for estimating prediction uncertainty using machine learning techniques is presented. Uncertainty is expressed in the form of the two quantiles (constituting the prediction interval) of the underlying distribution of prediction errors. The idea is to partition the input space into different zones or clusters having similar model errors using fuzzy c-means clustering. The prediction interval is constructed for each cluster on the basis of empirical distributions of the errors associated with all instances belonging to the cluster under consideration and propagated from each cluster to the examples according to their membership grades in each cluster. Then a regression model is built for in-sample data using computed prediction limits as targets, and finally, this model is applied to estimate the prediction intervals (limits) for out-of-sample data. The method was tested on artificial and real hydrologic data sets using various machine learning techniques. Preliminary results show that the method is superior to other methods estimating the prediction interval. A new method for evaluating performance for estimating prediction interval is proposed as well.
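
    The interval construction described in this abstract can be sketched as follows. This is a minimal illustration, assuming the fuzzy c-means membership grades have already been computed and are supplied as a matrix; the function names `weighted_quantile` and `prediction_limits`, and the choice of membership-weighted empirical quantiles, are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Empirical quantile of `values`, each value counted with its weight."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum = np.cumsum(w) / np.sum(w)           # weighted empirical CDF
    idx = np.searchsorted(cum, q, side="left")
    return v[min(idx, len(v) - 1)]

def prediction_limits(errors, memberships, alpha=0.1):
    """Per-example lower/upper error limits.

    errors      : (n,) model errors on in-sample data
    memberships : (n, c) membership grades, each row summing to 1 (assumed given)
    alpha       : 1 - alpha is the nominal coverage of the interval
    """
    errors = np.asarray(errors, dtype=float)
    memberships = np.asarray(memberships, dtype=float)
    n_clusters = memberships.shape[1]
    # Step 1: one pair of quantiles per cluster, weighted by membership grade.
    lower_c = np.array([weighted_quantile(errors, memberships[:, c], alpha / 2)
                        for c in range(n_clusters)])
    upper_c = np.array([weighted_quantile(errors, memberships[:, c], 1 - alpha / 2)
                        for c in range(n_clusters)])
    # Step 2: propagate cluster limits to examples via their membership grades.
    return memberships @ lower_c, memberships @ upper_c

errors = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 8.0, 9.0, 10.0, 11.0, 12.0])
memberships = np.vstack([np.tile([1.0, 0.0], (5, 1)), np.tile([0.0, 1.0], (5, 1))])
lower, upper = prediction_limits(errors, memberships, alpha=0.2)
```

    In the method as described, these per-example limits would then serve as regression targets, so that a trained model can estimate limits for out-of-sample inputs; that regression step is omitted here.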

  19. Galaxy Distribution in Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Okamoto, T.; Yachi, S.; Habe, A.

    The beta-discrepancy has been pointed out from comparisons of optical and X-ray observations of clusters of galaxies. To examine the physical reason for the beta-discrepancy, we use an N-body simulation that contains two components: dark particles, and galaxies identified by an adaptive-linking friends-of-friends technique at a certain redshift. The gas component is not included here, since the gas distribution follows the dark matter distribution in dark halos (Julio F. Navarro, Carlos S. Frenk and Simon D. M. White 1995). We find that the galaxy distribution follows the dark matter distribution, so the beta-discrepancy does not exist. This result is consistent with the interpretation of the beta-discrepancy by Bahcall and Lubin (1994), which was based on recent observation.

  20. Cluster Lensing with the BTC

    NASA Astrophysics Data System (ADS)

    Fischer, P.

    1997-12-01

    Weak distortions of background galaxies are rapidly emerging as a powerful tool for the measurement of galaxy cluster mass distributions. Lensing based studies have the advantage of being direct measurements of mass and are not model-dependent as are other techniques (X-ray, radial velocities). To date studies have been limited by CCD field size, meaning that full coverage of the clusters out to the virial radii and beyond has not been possible. Probing this large radius region is essential for testing models of large scale structure formation. New wide field CCD mosaics, for the first time, allow mass measurements out to very large radius. We have obtained images for a sample of clusters with the "Big Throughput Camera" (BTC) on the CTIO 4m. This camera comprises four thinned SITE 2048 × 2048 CCDs, each 15 arcmin on a side for a total area of one quarter of a square degree. We have developed an automated reduction pipeline which: 1) corrects for spatial distortions, 2) corrects for PSF anisotropy, 3) determines relative scaling and background levels, and 4) combines multiple exposures. In this poster we will present some preliminary results of our cluster lensing study. This will include radial mass and light profiles and 2-d mass and galaxy density maps.

  1. Cluster headache: clinical features and therapeutic options.

    PubMed

    Gaul, Charly; Diener, Hans-Christoph; Müller, Oliver M

    2011-08-01

    Cluster headache is the most common type of trigemino-autonomic headache, affecting ca. 120 000 persons in Germany alone. The attacks of pain are in the periorbital area on one side, last 90 minutes on average, and are accompanied by trigemino-autonomic manifestations and restlessness. Most patients have episodic cluster headache; about 15% have chronic cluster headache, with greater impairment of their quality of life. The attacks often possess a circadian and seasonal rhythm. Selective literature review. Oxygen inhalation and triptans are effective acute treatment for cluster attacks. First-line drugs for attack prophylaxis include verapamil and cortisone; alternatively, lithium and topiramate can be given. Short-term relief can be obtained by the subcutaneous infiltration of local anesthetics and steroids along the course of the greater occipital nerve, although most of the evidence in favor of this is not derived from randomized clinical trials. Patients whose pain is inadequately relieved by drug treatment can be offered newer, invasive treatments, such as deep brain stimulation in the hypothalamus (DBS) and bilateral occipital nerve stimulation (ONS). Pharmacotherapy for the treatment of acute attacks and for attack prophylaxis is effective in most patients. For the minority who do not gain adequate relief, newer invasive techniques are available in some referral centers. Definitive conclusions as to their value cannot yet be drawn from the available data.

  2. [Ag115S34(SCH2C6H4tBu)47(dpph)6]: synthesis, crystal structure and NMR investigations of a soluble silver chalcogenide nanocluster.

    PubMed

    Bestgen, Sebastian; Fuhr, Olaf; Breitung, Ben; Kiran Chakravadhanula, Venkata Sei; Guthausen, Gisela; Hennrich, Frank; Yu, Wen; Kappes, Manfred M; Roesky, Peter W; Fenske, Dieter

    2017-03-01

    With the aim to synthesize soluble cluster molecules, the silver salt of (4-(tert-butyl)phenyl)methanethiol [AgSCH2C6H4tBu] was applied as a suitable precursor for the formation of a nanoscale silver sulfide cluster. In the presence of 1,6-(diphenylphosphino)hexane (dpph), the 115 nuclear silver cluster [Ag115S34(SCH2C6H4tBu)47(dpph)6] was obtained. The molecular structure of this compound was elucidated by single crystal X-ray analysis and fully characterized by spectroscopic techniques. In contrast to most of the previously published cluster compounds with more than a hundred heavy atoms, this nanoscale inorganic molecule is soluble in organic solvents, which allowed a comprehensive investigation in solution by UV-Vis spectroscopy and one- and two-dimensional NMR spectroscopy including 31P/109Ag-HSQC and DOSY experiments. These are the first heteronuclear NMR investigations on coinage metal chalcogenides. They give some first insight into the behavior of nanoscale silver sulfide clusters in solution. Additionally, molecular weight determinations were performed by 2D analytical ultracentrifugation, and HR-TEM investigations confirm the presence of size-homogeneous nanoparticles in solution.

  3. Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of Landsat digital data. [mapping of hydrothermally altered volcanic rocks

    NASA Technical Reports Server (NTRS)

    Ballew, G.

    1977-01-01

    The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the four-dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed using Johnson's HICLUS program. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.
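
    For readers unfamiliar with the distance used in the hierarchical clustering step, the Mahalanobis distance between two group means can be sketched as below. This is a generic illustration in Python, not the HICLUS program; the function name `mahalanobis_between_means` and the use of a single pooled covariance matrix shared by both groups are assumptions made for the example.

```python
import numpy as np

def mahalanobis_between_means(mean_a, mean_b, pooled_cov):
    """sqrt((a - b)^T S^{-1} (a - b)) for group means a, b and covariance S."""
    diff = np.asarray(mean_a, dtype=float) - np.asarray(mean_b, dtype=float)
    cov = np.asarray(pooled_cov, dtype=float)
    # Solve S x = diff instead of forming the explicit inverse of S.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# With an identity covariance the result reduces to the Euclidean distance.
d_identity = mahalanobis_between_means([0.0, 0.0], [3.0, 4.0], np.eye(2))
# Larger variances along an axis shrink that axis's contribution.
d_scaled = mahalanobis_between_means([0.0, 0.0], [3.0, 4.0], np.diag([9.0, 16.0]))
```

    For four Landsat channels the mean vectors would have four components and the covariance matrix would be 4 × 4; the two-dimensional inputs above are only for brevity.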

  4. Imbalance of community structures in epilepsy

    NASA Astrophysics Data System (ADS)

    Ortega, G. J.; Herrera Peco, I.; García de Sola, R.; Pastor, J.

    2010-09-01

    Epilepsy is commonly associated with synchronous activity in the form of spikes and also in developed seizures. Desynchronised activity also seems to play an important role in the seizure process, favouring the initiation of seizures. The aim of the present work is to explore synchronization activity in the inner areas of the temporal lobe of epileptic patients by a novel approach. Records from two temporal lobe epilepsy (TLE) patients have been analyzed through a cluster analysis. Electrical activity in the inner part of the temporal lobe has been recorded by using Foramen Ovale Electrodes (FOE), a semi-invasive technique frequently used in drug resistant epileptic patients. Instead of tracking synchronized activity, we give here special attention to desynchronized activity, mainly in those areas which are not included in synchronization clusters. Our results show that electrical activity in the epileptic side behaves in a less cohesive fashion than in the contra-lateral side. There exists a clear tendency in the epileptic side to be organized as isolated clusters of electrical activity, as compared with the contra-lateral side, which is organized in the form of large clusters of synchronous activity. In particular, we give special attention to cluster desynchronization during the seizures. As we show, our results can help in understanding several characteristics of seizure dynamics.

  5. Dynamical Modeling of NGC 6397: Simulated HST Imaging

    NASA Astrophysics Data System (ADS)

    Dull, J. D.; Cohn, H. N.; Lugger, P. M.; Slavin, S. D.; Murphy, B. W.

    1994-12-01

    The proximity of NGC 6397 (2.2 kpc) provides an ideal opportunity to test current dynamical models for globular clusters with the HST Wide-Field/Planetary Camera (WFPC2). We have used a Monte Carlo algorithm to generate ensembles of simulated Planetary Camera (PC) U-band images of NGC 6397 from evolving, multi-mass Fokker-Planck models. These images, which are based on the post-repair HST-PC point-spread function, are used to develop and test analysis methods for recovering structural information from actual HST imaging. We have considered a range of exposure times up to 2.4 × 10^4 s, based on our proposed HST Cycle 5 observations. Our Fokker-Planck models include energy input from dynamically-formed binaries. We have adopted a 20-group mass spectrum extending from 0.16 to 1.4 solar masses. We use theoretical luminosity functions for red giants and main sequence stars. Horizontal branch stars, blue stragglers, white dwarfs, and cataclysmic variables are also included. Simulated images are generated for cluster models at both maximal core collapse and at a post-collapse bounce. We are carrying out stellar photometry on these images using "DAOPHOT-assisted aperture photometry" software that we have developed. We are testing several techniques for analyzing the resulting star counts, to determine the underlying cluster structure, including parametric model fits and the nonparametric density estimation methods. Our simulated images also allow us to investigate the accuracy and completeness of methods for carrying out stellar photometry in HST Planetary Camera images of dense cluster cores.

  6. Cluster-based analysis of multi-model climate ensembles

    NASA Astrophysics Data System (ADS)

    Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.

    2018-06-01

    Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) have yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ~20% in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ~62% of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9% of all locations and increases at 29%, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and useful framework in which to assess and visualise model spread, offering insight into geographical areas of agreement among models and a measure of diversity across an ensemble. Finally, we discuss caveats of the clustering techniques and note that while we have focused on tropospheric ozone, the principles underlying the cluster-based MMMs are applicable to other prognostic variables from climate models.
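
    The idea of a cluster-based multi-model mean can be illustrated with a short Python sketch. It does not implement the DDC algorithm named in the abstract; as a loudly labelled stand-in, model values at each location are grouped by a simple one-dimensional gap rule controlled by a hypothetical `radius` parameter, and the function names `largest_cluster_mean` and `cluster_mmm` are likewise illustrative.

```python
import numpy as np

def largest_cluster_mean(values, radius):
    """Mean of the most populous group of model values at one location.

    Stand-in clustering (NOT the DDC algorithm): sorted values are split
    wherever the gap between neighbours exceeds `radius`.
    """
    v = np.sort(np.asarray(values, dtype=float))
    breaks = np.where(np.diff(v) > radius)[0] + 1
    groups = np.split(v, breaks)
    largest = max(groups, key=len)   # ties resolve to the lowest-valued group
    return float(largest.mean())

def cluster_mmm(ensemble, radius):
    """Cluster-based multi-model mean for an array of shape (n_models, n_locations)."""
    ensemble = np.asarray(ensemble, dtype=float)
    return np.array([largest_cluster_mean(ensemble[:, j], radius)
                     for j in range(ensemble.shape[1])])

# Four hypothetical models at two locations; one outlier model at each location.
ensemble = np.array([[30.0, 20.0],
                     [31.0, 21.0],
                     [29.0, 40.0],
                     [50.0, 22.0]])
simple_mmm = ensemble.mean(axis=0)          # all-model mean
clustered_mmm = cluster_mmm(ensemble, 5.0)  # mean over the largest cluster only
```

    As the abstract cautions, dropping outlying models in this way is not guaranteed to move the mean closer to observations at every location.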

  7. A nonperturbative light-front coupled-cluster method

    NASA Astrophysics Data System (ADS)

    Hiller, J. R.

    2012-10-01

    The nonperturbative Hamiltonian eigenvalue problem for bound states of a quantum field theory is formulated in terms of Dirac's light-front coordinates and then approximated by the exponential-operator technique of the many-body coupled-cluster method. This approximation eliminates any need for the usual approximation of Fock-space truncation. Instead, the exponentiated operator is truncated, and the terms retained are determined by a set of nonlinear integral equations. These equations are solved simultaneously with an effective eigenvalue problem in the valence sector, where the number of constituents is small. Matrix elements can be calculated, with extensions of techniques from standard coupled-cluster theory, to obtain form factors and other observables.

  8. Revealing the ecological content of long-duration audio-recordings of the environment through clustering and visualisation.

    PubMed

    Phillips, Yvonne F; Towsey, Michael; Roe, Paul

    2018-01-01

    Audio recordings of the environment are an increasingly important technique to monitor biodiversity and ecosystem function. While the acquisition of long-duration recordings is becoming easier and cheaper, the analysis and interpretation of that audio remains a significant research area. The issue addressed in this paper is the automated reduction of environmental audio data to facilitate ecological investigations. We describe a method that first reduces environmental audio to vectors of acoustic indices, which are then clustered. This can reduce the audio data by six to eight orders of magnitude yet retain useful ecological information. We describe techniques to visualise sequences of cluster occurrence (using for example, diel plots, rose plots) that assist interpretation of environmental audio. Colour coding acoustic clusters allows months and years of audio data to be visualised in a single image. These techniques are useful in identifying and indexing the contents of long-duration audio recordings. They could also play an important role in monitoring long-term changes in species abundance brought about by habitat degradation and/or restoration.

  9. Revealing the ecological content of long-duration audio-recordings of the environment through clustering and visualisation

    PubMed Central

    Towsey, Michael; Roe, Paul

    2018-01-01

    Audio recordings of the environment are an increasingly important technique to monitor biodiversity and ecosystem function. While the acquisition of long-duration recordings is becoming easier and cheaper, the analysis and interpretation of that audio remains a significant research area. The issue addressed in this paper is the automated reduction of environmental audio data to facilitate ecological investigations. We describe a method that first reduces environmental audio to vectors of acoustic indices, which are then clustered. This can reduce the audio data by six to eight orders of magnitude yet retain useful ecological information. We describe techniques to visualise sequences of cluster occurrence (using for example, diel plots, rose plots) that assist interpretation of environmental audio. Colour coding acoustic clusters allows months and years of audio data to be visualised in a single image. These techniques are useful in identifying and indexing the contents of long-duration audio recordings. They could also play an important role in monitoring long-term changes in species abundance brought about by habitat degradation and/or restoration. PMID:29494629

  10. Spatial analysis for the epidemiological study of cardiovascular diseases: A systematic literature search.

    PubMed

    Mena, Carlos; Sepúlveda, Cesar; Fuentes, Eduardo; Ormazábal, Yony; Palomo, Iván

    2018-05-07

    Cardiovascular diseases (CVDs) are the primary cause of death and disability in the world, and the detection of populations at risk as well as localization of vulnerable areas is essential for adequate epidemiological management. Techniques developed for spatial analysis, among them geographical information systems and spatial statistics, such as cluster detection and spatial correlation, are useful for the study of the distribution of CVDs. These techniques, enabling recognition of events at different geographical levels of study (e.g., rural, deprived neighbourhoods, etc.), make it possible to relate CVDs to factors present in the immediate environment. The systematic literature search presented here shows that this group of diseases is clustered with regard to incidence, mortality and hospitalization as well as obesity, smoking, increased glycated haemoglobin levels, hypertension, physical activity and age. In addition, acquired variables such as income, residency (rural or urban) and education contribute to CVD clustering. Both local cluster detection and spatial regression techniques give statistical weight to the findings, providing valuable information that can influence response mechanisms in the health services by indicating locations in need of intervention and assignment of available resources.

  11. Testing Gravity and Cosmic Acceleration with Galaxy Clustering

    NASA Astrophysics Data System (ADS)

    Kazin, Eyal; Tinker, J.; Sanchez, A. G.; Blanton, M.

    2012-01-01

    The large-scale structure contains vast amounts of cosmological information that can help understand the accelerating nature of the Universe and test gravity on large scales. Ongoing and future sky surveys are designed to test these using various techniques applied on clustering measurements of galaxies. We present redshift distortion measurements of the Sloan Digital Sky Survey II Luminous Red Galaxy sample. We find that when combining the normalized quadrupole Q with the projected correlation function wp(rp) along with cluster counts (Rapetti et al. 2010), results are consistent with General Relativity. The advantage of combining Q and wp is the addition of the bias information, when using the Halo Occupation Distribution framework. We also present improvements to the standard technique of measuring Hubble expansion rates H(z) and angular diameter distances DA(z) when using the baryonic acoustic feature as a standard ruler. We introduce clustering wedges as an alternative basis to the multipole expansion and show that it yields similar constraints. This alternative basis serves as a useful technique to test for systematics, and ultimately improve measurements of the cosmic acceleration.

  12. Development of the Electron Drift Instrument (EDI) for Cluster

    NASA Technical Reports Server (NTRS)

    Quinn, Jack; Christensen, John L. (Technical Monitor)

    2001-01-01

    The Electron Drift Instrument (EDI) is a new technique for measuring electric fields in space by detecting the effect on weak beams of test electrons. The U.S. portions of the technique, flight hardware, and flight software were developed for the Cluster mission under this contract. Dr. Goetz Paschmann of the Max Planck Institute in Garching, Germany, was the Principal Investigator for Cluster EDI. Hardware for Cluster was developed in the U.S. at the University of New Hampshire, Lockheed Palo Alto Research Laboratory, and University of California, San Diego. The Cluster satellites carrying the original EDI instruments were lost in the catastrophic launch failure of the first flight of the Ariane-V rocket in 1996. Following that loss, NASA and ESA approved a rebuild of the Cluster mission, for which all four satellites were successfully launched in the summer of 2000. Limited operations of EDI were also obtained on the Equator-S satellite, which was launched in December 1997. A satellite failure caused the loss of the Equator-S mission after only 5 months, but these operations were extremely valuable in learning about the characteristics and operations of the complex EDI instrument. The Cluster mission, satellites, and instruments underwent an extensive on-orbit commissioning phase in the fall of 2000, carrying over through January 2001. During this period all elements of the instruments were checked and careful measurements of inter-experiment interference were made. EDI is currently working exceptionally well in orbit. Initial results verify that all aspects of the instrument are working as planned and returning highly valuable scientific information. The first two papers describing EDI on-orbit results were submitted for publication in April 2001. The principles of the EDI technique and its implementation on Cluster are described in two papers by Paschmann et al., attached as Appendices A and B. The EDI presentation at the formal Cluster Commissioning Review, held at ESA Headquarters in Paris, is attached as Appendix C.

  13. Comprehensive T-Matrix Reference Database: A 2007-2009 Update

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Zakharova, Nadia T.; Videen, Gorden; Khlebtsov, Nikolai G.; Wriedt, Thomas

    2010-01-01

    The T-matrix method is among the most versatile, efficient, and widely used theoretical techniques for the numerically exact computation of electromagnetic scattering by homogeneous and composite particles, clusters of particles, discrete random media, and particles in the vicinity of an interface separating two half-spaces with different refractive indices. This paper presents an update to the comprehensive database of T-matrix publications compiled by us previously and includes the publications that appeared since 2007. It also lists several earlier publications not included in the original database.

  14. Convective and stratiform components of a Winter Monsoon Cloud Cluster determined from geosynchronous infrared satellite data

    NASA Technical Reports Server (NTRS)

    Goldenberg, Stanley B.; Houze, Robert A., Jr.; Churchill, Dean D.

    1990-01-01

    The horizontal precipitation structure of cloud clusters observed over the South China Sea during the Winter Monsoon Experiment (WMONEX) is analyzed using a convective-stratiform technique (CST) developed by Adler and Negri (1988). The technique was modified by altering the method for identifying convective cells in the satellite data, accounting for the extremely cold cloud tops characteristic of the WMONEX region, and modifying the threshold infrared temperature for the boundary of the stratiform rain area. The precipitation analysis was extended to the entire history of the cloud cluster by applying the modified CST to IR imagery from geosynchronous-satellite observations. The ship and aircraft data from the later period of the cluster's lifetime make it possible to check the locations of convective and stratiform precipitation identified by the CST using in situ observations. The extended CST is considered to be effective for determining the climatology of the convective-stratiform structure of tropical cloud clusters.

  15. Site-specific biomolecule labeling with gold clusters.

    PubMed

    Ackerson, Christopher J; Powell, Richard D; Hainfeld, James F

    2010-01-01

    Site-specific labeling of biomolecules in vitro with gold clusters can enhance the information content of electron cryomicroscopy experiments. This chapter provides a practical overview of well-established techniques for forming biomolecule/gold cluster conjugates. Three bioconjugation chemistries are covered: linker-mediated bioconjugation, direct gold-biomolecule bonding, and coordination-mediated bonding of nickel(II) nitrilotriacetic acid (NTA)-derivatized gold clusters to polyhistidine (His)-tagged proteins. Copyright © 2010 Elsevier Inc. All rights reserved.

  16. Evolutionary models of rotating dense stellar systems: challenges in software and hardware

    NASA Astrophysics Data System (ADS)

    Fiestas, Jose

    2016-02-01

    We present evolutionary models of rotating self-gravitating systems (e.g. globular clusters, galaxy cores). These models are characterized by the presence of initial axisymmetry due to rotation. Central black hole seeds are alternatively included in our models, and black hole growth due to consumption of stellar matter is simulated until the central potential dominates the kinematics in the core. The goal is to study the long-term evolution (~ Gyr) of relaxed dense stellar systems that deviate from spherical symmetry, together with their morphology and final kinematics. For this purpose, we developed a 2D Fokker-Planck analytical code, whose results we confirm with detailed N-body techniques, applying a high-performance code developed for GPU machines. We compare our models to available observations of galactic rotating globular clusters, and conclude that initial rotation significantly modifies the shape and lifetime of these systems, and cannot be neglected in studying the evolution of globular clusters and of the galaxy itself.

  17. Epidemiological characteristics of reported sporadic and outbreak cases of E. coli O157 in people from Alberta, Canada (2000-2002): methodological challenges of comparing clustered to unclustered data.

    PubMed

    Pearl, D L; Louie, M; Chui, L; Doré, K; Grimsrud, K M; Martin, S W; Michel, P; Svenson, L W; McEwen, S A

    2008-04-01

    Using multivariable models, we compared whether there were significant differences between reported outbreak and sporadic cases in terms of their sex, age, and mode and site of disease transmission. We also determined the potential role of administrative, temporal, and spatial factors within these models. We compared a variety of approaches to account for clustering of cases in outbreaks including weighted logistic regression, random effects models, general estimating equations, robust variance estimates, and the random selection of one case from each outbreak. Age and mode of transmission were the only epidemiologically and statistically significant covariates in our final models using the above approaches. Weighing observations in a logistic regression model by the inverse of their outbreak size appeared to be a relatively robust and valid means for modelling these data. Some analytical techniques, designed to account for clustering, had difficulty converging or producing realistic measures of association.
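
    The inverse-outbreak-size weighting described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `inverse_outbreak_weights`, the use of `None` to mark a sporadic case, and the sample labels are all assumptions made for the example.

```python
from collections import Counter


def inverse_outbreak_weights(outbreak_ids):
    """Return one weight per case.

    outbreak_ids: list with an outbreak label per case, or None for a
    sporadic case (hypothetical encoding chosen for this sketch).
    A case in an outbreak of size k gets weight 1/k, so each outbreak
    contributes a total weight of 1; sporadic cases keep weight 1.
    """
    sizes = Counter(oid for oid in outbreak_ids if oid is not None)
    return [1.0 if oid is None else 1.0 / sizes[oid] for oid in outbreak_ids]


# Three cases from outbreak "A", two from outbreak "B", two sporadic cases.
cases = ["A", "A", "A", None, "B", "B", None]
weights = inverse_outbreak_weights(cases)
```

    Weights of this kind would then be passed to a weighted logistic regression routine, so that a large outbreak does not dominate the fit simply because it contributes many correlated cases.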

  18. Overview of Silica-Related Clusters in the United States: Will Fracking Operations Become the Next Cluster?

    PubMed

    Quail, M Thomas

    2017-01-01

    Silicosis is the oldest known occupational pulmonary disease. It is a progressive disease and any level of exposure to respirable crystalline silica particles or dust has the potential to develop into silicosis. Silicosis is caused by silica particles or dust entering the lungs and damaging healthy lung tissue. The damage restricts the ability to breathe. Exposure to silica increases a worker’s risk of developing cancer or tuberculosis. This special report will provide background history of silicosis in the U.S., including the number of workers affected and their common industries. Over the years, these industries have impeded government oversight, resulting in silicosis exposure clusters. The risk of acquiring silicosis is diminished when industry implements safety measures with oversight by governmental agencies. Reputable authorities believe that the current innovative drilling techniques such as fracking will generate future cases of silicosis in the U.S. if safety measures to protect workers are ignored.

  19. Estimating Treatment Effects via Multilevel Matching within Homogenous Groups of Clusters

    ERIC Educational Resources Information Center

    Steiner, Peter M.; Kim, Jee-Seon

    2015-01-01

    Despite the popularity of propensity score (PS) techniques they are not yet well studied for matching multilevel data where selection into treatment takes place among level-one units within clusters. This paper suggests a PS matching strategy that tries to avoid the disadvantages of within- and across-cluster matching. The idea is to first…

  20. Techniques and computations for mapping plot clusters that straddle stand boundaries

    Treesearch

    Charles T. Scott; William A. Bechtold

    1995-01-01

    Many regional (extensive) forest surveys use clusters of subplots or prism points to reduce survey costs. Two common methods of handling clusters that straddle stand boundaries entail: (1) moving all subplots into a single forest cover type, or (2) "averaging" data across multiple conditions without regard to the boundaries. These methods result in biased...

  1. Clustering Binary Data in the Presence of Masking Variables

    ERIC Educational Resources Information Center

    Brusco, Michael J.

    2004-01-01

    A number of important applications require the clustering of binary data sets. Traditional nonhierarchical cluster analysis techniques, such as the popular K-means algorithm, can often be successfully applied to these data sets. However, the presence of masking variables in a data set can impede the ability of the K-means algorithm to recover the…

  2. Applying Sequential Analytic Methods to Self-Reported Information to Anticipate Care Needs.

    PubMed

    Bayliss, Elizabeth A; Powers, J David; Ellis, Jennifer L; Barrow, Jennifer C; Strobel, MaryJo; Beck, Arne

    2016-01-01

    Identifying care needs for newly enrolled or newly insured individuals is important under the Affordable Care Act. Systematically collected patient-reported information can potentially identify subgroups with specific care needs prior to service use. We conducted a retrospective cohort investigation of 6,047 individuals who completed a 10-question needs assessment upon initial enrollment in Kaiser Permanente Colorado (KPCO), a not-for-profit integrated delivery system, through the Colorado State Individual Exchange. We used responses from the Brief Health Questionnaire (BHQ), to develop a predictive model for cost for receiving care in the top 25 percent, then applied cluster analytic techniques to identify different high-cost subpopulations. Per-member, per-month cost was measured from 6 to 12 months following BHQ response. BHQ responses significantly predictive of high-cost care included self-reported health status, functional limitations, medication use, presence of 0-4 chronic conditions, self-reported emergency department (ED) use during the prior year, and lack of prior insurance. Age, gender, and deductible-based insurance product were also predictive. The largest possible range of predicted probabilities of being in the top 25 percent of cost was 3.5 percent to 96.4 percent. Within the top cost quartile, examples of potentially actionable clusters of patients included those with high morbidity, prior utilization, depression risk and financial constraints; those with high morbidity, previously uninsured individuals with few financial constraints; and relatively healthy, previously insured individuals with medication needs. Applying sequential predictive modeling and cluster analytic techniques to patient-reported information can identify subgroups of individuals within heterogeneous populations who may benefit from specific interventions to optimize initial care delivery.

  3. Scalable cluster administration - Chiba City I approach and lessons learned.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Navarro, J. P.; Evard, R.; Nurmi, D.

    2002-07-01

    Systems administrators of large clusters often need to perform the same administrative activity hundreds or thousands of times. Often such activities are time-consuming, especially the tasks of installing and maintaining software. By combining network services such as DHCP, TFTP, FTP, HTTP, and NFS with remote hardware control, cluster administrators can automate all administrative tasks. Scalable cluster administration addresses the following challenge: What systems design techniques can cluster builders use to automate cluster administration on very large clusters? We describe the approach used in the Mathematics and Computer Science Division of Argonne National Laboratory on Chiba City I, a 314-node Linux cluster; and we analyze the scalability, flexibility, and reliability benefits and limitations from that approach.

  4. AHIMSA - Ad hoc histogram information measure sensing algorithm for feature selection in the context of histogram inspired clustering techniques

    NASA Technical Reports Server (NTRS)

    Dasarathy, B. V.

    1976-01-01

    An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.

  5. Survey of adaptive image coding techniques

    NASA Technical Reports Server (NTRS)

    Habibi, A.

    1977-01-01

    The general problem of image data compression is discussed briefly with attention given to the use of Karhunen-Loeve transforms, suboptimal systems, and block quantization. A survey is then conducted encompassing the four categories of adaptive systems: (1) adaptive transform coding (adaptive sampling, adaptive quantization, etc.), (2) adaptive predictive coding (adaptive delta modulation, adaptive DPCM encoding, etc.), (3) adaptive cluster coding (blob algorithms and the multispectral cluster coding technique), and (4) adaptive entropy coding.

  6. Mercury-induced fragmentation of n-decane and n-undecane in positive mode ion mobility spectrometry.

    PubMed

    Gunzer, F

    2015-09-21

    Ion mobility spectrometry is a well-known technique for trace gas analysis. Using soft ionization techniques, fragmentation of analytes is normally not observed, with the consequence that analyte spectra of single substances are quite simple, i.e. showing in general only one peak. If the concentration is high enough, an extra cluster peak involving two analyte molecules can often be observed. When investigating n-alkanes, different results regarding the number of peaks in the spectra have been obtained in the past using this spectrometric technique. Here we present results obtained when analyzing n-alkanes (n-hexane to n-undecane) with a pulsed electron source, which show no fragmentation or clustering at all. However, when investigating a mixture of mercury and an n-alkane, a situation quite typical in the oil and gas industry, a strong fragmentation and cluster formation involving these fragments has been observed exclusively for n-decane and n-undecane.

  7. Honey bee-inspired algorithms for SNP haplotype reconstruction problem

    NASA Astrophysics Data System (ADS)

    PourkamaliAnaraki, Maryam; Sadeghi, Mehdi

    2016-03-01

    Reconstructing haplotypes from SNP fragments is an important problem in computational biology. There has been considerable interest in this field because haplotypes have been shown to contain promising data for disease association research. Haplotype reconstruction in the Minimum Error Correction model has been proved to be an NP-hard problem. Therefore, several methods such as clustering techniques, evolutionary algorithms, neural networks and swarm intelligence approaches have been proposed in order to solve this problem in reasonable time. In this paper, we focus on various evolutionary clustering techniques and try to find an efficient technique for solving the haplotype reconstruction problem. It can be inferred from our experiments that the clustering methods relying on the behaviour of honey bee colonies in nature, specifically the bees algorithm and artificial bee colony methods, are expected to result in more efficient solutions. An application program of the methods is available at the following link. http://www.bioinf.cs.ipm.ir/software/haprs/

  8. Fluid-Structure Interaction Modeling of Modified-Porosity Parachutes and Parachute Clusters

    NASA Astrophysics Data System (ADS)

    Boben, Joseph J.

    To increase aerodynamic performance, the geometric porosity of a ringsail spacecraft parachute canopy is sometimes increased, beyond the "rings" and "sails" with hundreds of "ring gaps" and "sail slits." This creates extra computational challenges for fluid-structure interaction (FSI) modeling of clusters of such parachutes, beyond those created by the lightness of the canopy structure, geometric complexities of hundreds of gaps and slits, and the contact between the parachutes of the cluster. In FSI computation of parachutes with such "modified geometric porosity," the flow through the "windows" created by the removal of the panels and the wider gaps created by the removal of the sails cannot be accurately modeled with the Homogenized Modeling of Geometric Porosity (HMGP), which was introduced to deal with the hundreds of gaps and slits. The flow needs to be actually resolved. All these computational challenges need to be addressed simultaneously in FSI modeling of clusters of spacecraft parachutes with modified geometric porosity. The core numerical technology is the Stabilized Space-Time FSI (SSTFSI) technique, and the contact between the parachutes is handled with the Surface-Edge-Node Contact Tracking (SENCT) technique. In the computations reported here, in addition to the SSTFSI and SENCT techniques and HMGP, we use the special techniques we have developed for removing the numerical spinning component of the parachute motion and for restoring the mesh integrity without a remesh. We present results for 2- and 3-parachute clusters with two different payload models. We also present the FSI computations we carried out for a single, subscale modified-porosity parachute.

  9. Clustering of Multivariate Geostatistical Data

    NASA Astrophysics Data System (ADS)

    Fouedjio, Francky

    2017-04-01

    Multivariate data indexed by geographical coordinates have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations belonging to the same cluster have a certain degree of homogeneity while data locations in different clusters are as different as possible. However, groups of data locations created through classical clustering techniques turn out to show poor spatial contiguity, a feature obviously inconvenient for many geoscience applications. In this work, we develop a clustering method that overcomes this problem by accounting for the spatial dependence structure of the data, thus reinforcing the spatial contiguity of the resulting clusters. The capability of the proposed clustering method to provide spatially contiguous and meaningful clusters of data locations is assessed using both synthetic and real datasets. Keywords: clustering, geostatistics, spatial contiguity, spatial dependence.

  10. Writing with a Personal Voice.

    ERIC Educational Resources Information Center

    Rico, Gabriele Lusser

    1985-01-01

    Clustering is a nonlinear brainstorming technique that can encourage children's natural writing ability by helping them draw on their need to make patterns out of their experience. Tips for introducing cluster writing into the classroom are offered. (MT)

  11. Visual cluster analysis and pattern recognition methods

    DOEpatents

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    2001-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable to, and improve, pattern recognition techniques.

  12. A Fast Projection-Based Algorithm for Clustering Big Data.

    PubMed

    Wu, Yun; He, Zhiquan; Lin, Hao; Zheng, Yufei; Zhang, Jingfen; Xu, Dong

    2018-06-07

    With the rapid development of various techniques, more and more data have been accumulated with the unique properties of large size (tall) and high dimension (wide). The era of big data is coming. How to understand and discover new knowledge from these data has attracted more and more scholars' attention and has become the most important task in data mining. As one of the most important techniques in data mining, clustering analysis, a kind of unsupervised learning, can group a set of data into groups (clusters) that are meaningful, useful, or both. Thus, the technique has played a very important role in knowledge discovery in big data. However, when facing large-sized and high-dimensional data, most current clustering methods exhibit poor computational efficiency and a high demand for computational resources, which prevents us from clarifying the intrinsic properties and discovering the new knowledge behind the data. Based on this consideration, we developed a powerful clustering method, called MUFOLD-CL. The principle of the method is to project the data points to the centroid, and then to measure the similarity between any two points by calculating their projections on the centroid. The proposed method can achieve linear time complexity with respect to the sample size. Comparison with the K-means method on very large data showed that our method could produce better accuracy and require less computational time, demonstrating that MUFOLD-CL can serve as a valuable tool, or at least play a complementary role to other existing methods, for big data clustering. Further comparisons with state-of-the-art clustering methods on smaller datasets showed that our method was fastest and achieved comparable accuracy. For the convenience of most scholars, a free software package was constructed.

  13. Segmentation of dermatoscopic images by frequency domain filtering and k-means clustering algorithms.

    PubMed

    Rajab, Maher I

    2011-11-01

    Since the introduction of epiluminescence microscopy (ELM), image analysis tools have been extended to the field of dermatology, in an attempt to algorithmically reproduce clinical evaluation. Accurate image segmentation of skin lesions is one of the key steps for useful, early and non-invasive diagnosis of cutaneous melanomas. This paper proposes two image segmentation algorithms based on frequency domain processing and k-means clustering/fuzzy k-means clustering. The two methods are capable of segmenting and extracting the true border that reveals the global structure irregularity (indentations and protrusions), which may suggest excessive cell growth or regression of a melanoma. As a pre-processing step, Fourier low-pass filtering is applied to reduce the surrounding noise in a skin lesion image. A quantitative comparison of the techniques is enabled by the use of synthetic skin lesion images that model lesions covered with hair to which Gaussian noise is added. The proposed techniques are also compared with an established optimal-based thresholding skin-segmentation method. It is demonstrated that for lesions with a range of different border irregularity properties, the k-means clustering and fuzzy k-means clustering segmentation methods provide the best performance over a range of signal to noise ratios. The proposed segmentation techniques are also demonstrated to have similar performance when tested on real skin lesions representing high-resolution ELM images. This study suggests that the segmentation results obtained using a combination of low-pass frequency filtering and k-means or fuzzy k-means clustering are superior to the result that would be obtained by using k-means or fuzzy k-means clustering segmentation methods alone. © 2011 John Wiley & Sons A/S.

  14. The clustering of galaxies in the completed SDSS-III Baryon Oscillation Spectroscopic Survey: combining correlated Gaussian posterior distributions

    DOE PAGES

    Sánchez, Ariel G.; Grieb, Jan Niklas; Salazar-Albornoz, Salvador; ...

    2016-09-30

    The cosmological information contained in anisotropic galaxy clustering measurements can often be compressed into a small number of parameters whose posterior distribution is well described by a Gaussian. Here, we present a general methodology to combine these estimates into a single set of consensus constraints that encode the total information of the individual measurements, taking into account the full covariance between the different methods. We also illustrate this technique by applying it to combine the results obtained from different clustering analyses, including measurements of the signature of baryon acoustic oscillations and redshift-space distortions, based on a set of mock catalogues of the final SDSS-III Baryon Oscillation Spectroscopic Survey (BOSS). Our results show that the region of the parameter space allowed by the consensus constraints is smaller than that of the individual methods, highlighting the importance of performing multiple analyses on galaxy surveys even when the measurements are highly correlated. Our paper is part of a set that analyses the final galaxy clustering data set from BOSS. The methodology presented here is used in Alam et al. to produce the final cosmological constraints from BOSS.
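
    The idea of combining correlated Gaussian estimates into one consensus constraint can be illustrated in the simplest setting: two measurements of a single parameter. The sketch below uses the standard generalized least squares (inverse-covariance weighted) combination; it is not the paper's multi-parameter methodology, and the function name `combine_correlated` and the input values are assumptions chosen for illustration.

```python
def combine_correlated(x1, sigma1, x2, sigma2, rho):
    """Combine two correlated Gaussian estimates of the same scalar.

    x1, x2: the two estimates; sigma1, sigma2: their standard deviations;
    rho: their correlation coefficient (all hypothetical inputs).
    Returns (consensus mean, consensus variance) from the inverse of the
    2x2 covariance matrix [[v1, c], [c, v2]].
    """
    v1, v2 = sigma1 ** 2, sigma2 ** 2
    c = rho * sigma1 * sigma2          # covariance between the estimates
    denom = v1 + v2 - 2.0 * c          # proportional to the summed precision
    if denom <= 0.0:
        raise ValueError("estimates are degenerate; no unique combination")
    w1 = (v2 - c) / denom              # weight of the first estimate
    w2 = (v1 - c) / denom              # weight of the second estimate
    mean = w1 * x1 + w2 * x2
    variance = (v1 * v2 - c * c) / denom
    return mean, variance


# Two unit-variance estimates with correlation 0.5.
mean, variance = combine_correlated(1.0, 1.0, 3.0, 1.0, 0.5)
```

    In this toy case the consensus variance is smaller than either input variance even though the two estimates are correlated, which is the qualitative behaviour the abstract describes.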

  15. Crowded Cluster Cores. Algorithms for Deblending in Dark Energy Survey Images

    DOE PAGES

    Zhang, Yuanyuan; McKay, Timothy A.; Bertin, Emmanuel; ...

    2015-10-26

    Deep optical images are often crowded with overlapping objects. We found that this is especially true in the cores of galaxy clusters, where images of dozens of galaxies may lie atop one another. Accurate measurements of cluster properties require deblending algorithms designed to automatically extract a list of individual objects and decide what fraction of the light in each pixel comes from each object. In this article, we introduce a new software tool called the Gradient And Interpolation based (GAIN) deblender. GAIN is used as a secondary deblender to improve the separation of overlapping objects in galaxy cluster cores in Dark Energy Survey images. It uses image intensity gradients and an interpolation technique originally developed to correct flawed digital images. Our paper is dedicated to describing the algorithm of the GAIN deblender and its applications, but we additionally include modest tests of the software based on real Dark Energy Survey co-add images. GAIN helps to extract an unbiased photometry measurement for blended sources and improve detection completeness, while introducing few spurious detections. When applied to processed Dark Energy Survey data, GAIN serves as a useful quick fix when a high level of deblending is desired.

  16. Ensemble Clustering using Semidefinite Programming with Applications

    PubMed Central

    Singh, Vikas; Mukherjee, Lopamudra; Peng, Jiming; Xu, Jinhui

    2011-01-01

    In this paper, we study the ensemble clustering problem, where the input is in the form of multiple clustering solutions. The goal of ensemble clustering algorithms is to aggregate the solutions into one solution that maximizes the agreement in the input ensemble. We obtain several new results for this problem. Specifically, we show that the notion of agreement under such circumstances can be better captured using a 2D string encoding rather than a voting strategy, which is common among existing approaches. Our optimization proceeds by first constructing a non-linear objective function which is then transformed into a 0–1 Semidefinite program (SDP) using novel convexification techniques. This model can be subsequently relaxed to a polynomial time solvable SDP. In addition to the theoretical contributions, our experimental results on standard machine learning and synthetic datasets show that this approach leads to improvements not only in terms of the proposed agreement measure but also the existing agreement measures based on voting strategies. In addition, we identify several new application scenarios for this problem. These include combining multiple image segmentations and generating tissue maps from multiple-channel Diffusion Tensor brain images to identify the underlying structure of the brain. PMID:21927539

  17. Ensemble Clustering using Semidefinite Programming with Applications.

    PubMed

    Singh, Vikas; Mukherjee, Lopamudra; Peng, Jiming; Xu, Jinhui

    2010-05-01

    In this paper, we study the ensemble clustering problem, where the input is in the form of multiple clustering solutions. The goal of ensemble clustering algorithms is to aggregate the solutions into one solution that maximizes the agreement in the input ensemble. We obtain several new results for this problem. Specifically, we show that the notion of agreement under such circumstances can be better captured using a 2D string encoding rather than a voting strategy, which is common among existing approaches. Our optimization proceeds by first constructing a non-linear objective function which is then transformed into a 0-1 Semidefinite program (SDP) using novel convexification techniques. This model can be subsequently relaxed to a polynomial time solvable SDP. In addition to the theoretical contributions, our experimental results on standard machine learning and synthetic datasets show that this approach leads to improvements not only in terms of the proposed agreement measure but also the existing agreement measures based on voting strategies. In addition, we identify several new application scenarios for this problem. These include combining multiple image segmentations and generating tissue maps from multiple-channel Diffusion Tensor brain images to identify the underlying structure of the brain.

  18. Self-Learning Off-Lattice Kinetic Monte Carlo method as applied to growth on metal surfaces

    NASA Astrophysics Data System (ADS)

    Trushin, Oleg; Kara, Abdelkader; Rahman, Talat

    2007-03-01

    We propose a new development in the Self-Learning Kinetic Monte Carlo (SLKMC) method with the goal of improving the accuracy with which atomic mechanisms controlling diffusive processes on metal surfaces may be identified. This is important for diffusion of small clusters (2-20 atoms) in which atoms may occupy off-lattice positions. Such a procedure is also necessary for consideration of heteroepitaxial growth. The new technique combines an earlier version of SLKMC [1] with the inclusion of off-lattice occupancy. This allows us to include arbitrary positions of adatoms in the modeling and makes the simulations more realistic and reliable. We have tested this new approach for the case of the diffusion of small 2D Cu clusters on Cu(111) and found good performance and satisfactory agreement with results obtained from the previous version of SLKMC. The new method also helped reveal a novel atomic mechanism contributing to cluster migration. We have also applied this method to study the diffusion of Cu clusters on Ag(111), and find that Cu atoms generally prefer to occupy off-lattice sites. [1] O. Trushin, A. Kara, A. Karim, T.S. Rahman Phys. Rev B 2005

  19. Phylogeny of kemenyan (Styrax sp.) from North Sumatra based on morphological characters

    NASA Astrophysics Data System (ADS)

    Susilowati, A.; Kholibrina, C. R.; Rachmat, H. H.; Munthe, M. A.

    2018-02-01

    Kemenyan is the best-known local tree species of North Sumatra. It is known as a producer of resin that is very valuable for pharmaceuticals, cosmetics, food preservatives and varnish. Historically, only two species of kemenyan were recognized, kemenyan durame and kemenyan toba, but within its natural distribution we also found other species whose characteristics differ from those of the previously known ones. The objectives of this research were (1) to determine the morphological diversity of kemenyan in North Sumatra and (2) to determine phylogenetic clustering based on morphological characters. Data were collected by direct observation and morphological characterization of sample trees selected by purposive sampling at Pakpak Bharat, North Sumatra. Morphological characters were examined using descriptive analysis, phenotypic variability using standard deviation, and cluster analysis. The results showed differences among four kemenyan species (batak, minyak, durame and toba) across 75 observed characters, including flower, fruit, leaf, stem, bark, crown type, wood and resin. Analysis of both quantitative and qualitative characters clustered kemenyan into two groups, with kemenyan toba separated from the others.

  20. Cluster Analysis of Atmospheric Dynamics and Pollution Transport in a Coastal Area

    NASA Astrophysics Data System (ADS)

    Sokolov, Anton; Dmitriev, Egor; Maksimovich, Elena; Delbarre, Hervé; Augustin, Patrick; Gengembre, Cyril; Fourmentin, Marc; Locoge, Nadine

    2016-11-01

    Summertime atmospheric dynamics in the coastal zone of the industrialized Dunkerque agglomeration in northern France was characterized by a cluster analysis of back trajectories in the context of pollution transport. The MESO-NH atmospheric model was used to simulate the local dynamics at multiple scales with horizontal resolution down to 500 m, and for the online calculation of the Lagrangian backward trajectories with 30-min temporal resolution. Airmass transport was performed along six principal pathways obtained by the weighted k-means clustering technique. Four of these centroids corresponded to a range of wind speeds over the English Channel: two for wind directions from the north-east and two from the south-west. Another pathway corresponded to a south-westerly continental transport. The backward trajectories of the largest and most dispersed sixth cluster contained low wind speeds, including sea-breeze circulations. Based on analyses of meteorological data and pollution measurements, the principal atmospheric pathways were related to local air-contamination events. Continuous air quality and meteorological data were collected during the Benzene-Toluene-Ethylbenzene-Xylene 2006 campaign. The sites of the pollution measurements served as the endpoints for the backward trajectories. Pollutant transport pathways corresponding to the highest air contamination were defined.
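
    A weighted k-means step of the kind mentioned above can be sketched as follows. This is a generic Lloyd-style iteration in which each point carries a weight that enters the centroid update; how the study defined its weights and trajectory distance is not stated in the abstract, so the inputs and the function name here are purely illustrative.

```python
def weighted_kmeans(points, weights, centroids, n_iter=50):
    """Lloyd's algorithm with per-point weights.

    points: list of coordinate tuples; weights: one non-negative weight per point;
    centroids: initial centroid tuples. Returns (centroids, labels).
    """
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])))
            for pt in points
        ]
        # Update step: weighted mean of the members of each cluster.
        updated = []
        for k, old in enumerate(centroids):
            members = [(pt, w) for pt, w, lab in zip(points, weights, labels) if lab == k]
            total = sum(w for _, w in members)
            if total == 0:
                updated.append(old)  # keep an empty cluster's centroid unchanged
                continue
            updated.append(tuple(
                sum(w * pt[d] for pt, w in members) / total for d in range(len(old))
            ))
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


# Toy data (made up): the heavier point pulls the first centroid toward x = 1.
cents, labs = weighted_kmeans(
    [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)],
    [1.0, 3.0, 1.0, 1.0],
    [(0.0, 0.0), (10.0, 0.0)],
)
```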

  1. Cluster analysis of stress corrosion mechanisms for steel wires used in bridge cables through acoustic emission particle swarm optimization.

    PubMed

    Li, Dongsheng; Yang, Wei; Zhang, Wenyao

    2017-05-01

    Stress corrosion is the major failure type of bridge cable damage. The acoustic emission (AE) technique was applied to monitor the stress corrosion process of steel wires used in bridge cable structures. The damage evolution of stress corrosion in bridge cables was obtained according to the AE characteristic parameter figure. A particle swarm optimization cluster method was developed to determine the relationship between the AE signal and stress corrosion mechanisms. Results indicate that the main AE sources of stress corrosion in bridge cables included four types: passive film breakdown and detachment of the corrosion product, crack initiation, crack extension, and cable fracture. By analyzing different types of clustering data, the mean value of each damage pattern's AE characteristic parameters was determined. Different corrosion damage source AE waveforms and the peak frequency were extracted. AE particle swarm optimization cluster analysis based on principal component analysis was also proposed. This method can completely distinguish the four types of damage sources and simplifies the determination of the evolution process of corrosion damage and broken wire signals. Copyright © 2017. Published by Elsevier B.V.

  2. Self-consistent semi-analytic models of the first stars

    NASA Astrophysics Data System (ADS)

    Visbal, Eli; Haiman, Zoltán; Bryan, Greg L.

    2018-04-01

    We have developed a semi-analytic framework to model the large-scale evolution of the first Population III (Pop III) stars and the transition to metal-enriched star formation. Our model follows dark matter haloes from cosmological N-body simulations, utilizing their individual merger histories and three-dimensional positions, and applies physically motivated prescriptions for star formation and feedback from Lyman-Werner (LW) radiation, hydrogen ionizing radiation, and external metal enrichment due to supernovae winds. This method is intended to complement analytic studies, which do not include clustering or individual merger histories, and hydrodynamical cosmological simulations, which include detailed physics, but are computationally expensive and have limited dynamic range. Utilizing this technique, we compute the cumulative Pop III and metal-enriched star formation rate density (SFRD) as a function of redshift at z ≥ 20. We find that varying the model parameters leads to significant qualitative changes in the global star formation history. The Pop III star formation efficiency and the delay time between Pop III and subsequent metal-enriched star formation are found to have the largest impact. The effect of clustering (i.e. including the three-dimensional positions of individual haloes) on various feedback mechanisms is also investigated. The impact of clustering on LW and ionization feedback is found to be relatively mild in our fiducial model, but can be larger if external metal enrichment can promote metal-enriched star formation over large distances.

  3. Clustering of Farsi sub-word images for whole-book recognition

    NASA Astrophysics Data System (ADS)

    Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier

    2015-01-01

    Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. We also show that the number of newly created clusters in a page can be used as a criterion for assessing the quality of print and evaluating preprocessing phases.

  4. Quantification and clustering of phenotypic screening data using time-series analysis for chemotherapy of schistosomiasis.

    PubMed

    Lee, Hyokyeong; Moody-Davis, Asher; Saha, Utsab; Suzuki, Brian M; Asarnow, Daniel; Chen, Steven; Arkin, Michelle; Caffrey, Conor R; Singh, Rahul

    2012-01-01

    Neglected tropical diseases, especially those caused by helminths, constitute some of the most common infections of the world's poorest people. Development of techniques for automated, high-throughput drug screening against these diseases, especially in whole-organism settings, constitutes one of the great challenges of modern drug discovery. We present a method for enabling high-throughput phenotypic drug screening against diseases caused by helminths with a focus on schistosomiasis. The proposed method allows for a quantitative analysis of the systemic impact of a drug molecule on the pathogen as exhibited by the complex continuum of its phenotypic responses. This method consists of two key parts: first, biological image analysis is employed to automatically monitor and quantify shape-, appearance-, and motion-based phenotypes of the parasites. Next, we represent these phenotypes as time-series and show how to compare, cluster, and quantitatively reason about them using techniques of time-series analysis. We present results on a number of algorithmic issues pertinent to the time-series representation of phenotypes. These include results on appropriate representation of phenotypic time-series, analysis of different time-series similarity measures for comparing phenotypic responses over time, and techniques for clustering such responses by similarity. Finally, we show how these algorithmic techniques can be used for quantifying the complex continuum of phenotypic responses of parasites. An important corollary is the ability of our method to recognize and rigorously group parasites based on the variability of their phenotypic response to different drugs. The methods and results presented in this paper enable automatic and quantitative scoring of high-throughput phenotypic screens focused on helmintic diseases. Furthermore, these methods allow us to analyze and stratify parasites based on their phenotypic response to drugs. Together, these advancements represent a significant breakthrough for the process of drug discovery against schistosomiasis in particular and can be extended to other helmintic diseases which together afflict a large part of humankind.

  5. Quantification and clustering of phenotypic screening data using time-series analysis for chemotherapy of schistosomiasis

    PubMed Central

    2012-01-01

    Background: Neglected tropical diseases, especially those caused by helminths, constitute some of the most common infections of the world's poorest people. Development of techniques for automated, high-throughput drug screening against these diseases, especially in whole-organism settings, constitutes one of the great challenges of modern drug discovery. Method: We present a method for enabling high-throughput phenotypic drug screening against diseases caused by helminths with a focus on schistosomiasis. The proposed method allows for a quantitative analysis of the systemic impact of a drug molecule on the pathogen as exhibited by the complex continuum of its phenotypic responses. This method consists of two key parts: first, biological image analysis is employed to automatically monitor and quantify shape-, appearance-, and motion-based phenotypes of the parasites. Next, we represent these phenotypes as time-series and show how to compare, cluster, and quantitatively reason about them using techniques of time-series analysis. Results: We present results on a number of algorithmic issues pertinent to the time-series representation of phenotypes. These include results on appropriate representation of phenotypic time-series, analysis of different time-series similarity measures for comparing phenotypic responses over time, and techniques for clustering such responses by similarity. Finally, we show how these algorithmic techniques can be used for quantifying the complex continuum of phenotypic responses of parasites. An important corollary is the ability of our method to recognize and rigorously group parasites based on the variability of their phenotypic response to different drugs. Conclusions: The methods and results presented in this paper enable automatic and quantitative scoring of high-throughput phenotypic screens focused on helmintic diseases. Furthermore, these methods allow us to analyze and stratify parasites based on their phenotypic response to drugs. Together, these advancements represent a significant breakthrough for the process of drug discovery against schistosomiasis in particular and can be extended to other helmintic diseases which together afflict a large part of humankind. PMID:22369037

  6. A Spatial Division Clustering Method and Low Dimensional Feature Extraction Technique Based Indoor Positioning System

    PubMed Central

    Mo, Yun; Zhang, Zhongzhao; Meng, Weixiao; Ma, Lin; Wang, Yao

    2014-01-01

    Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins; clustering methods are therefore widely applied as a solution. Traditional clustering methods in positioning systems, however, can only measure the similarity of the Received Signal Strength without regard to the continuity of physical coordinates. In addition, outage of access points can result in asymmetric matching problems that severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing the online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability with respect to the asymmetric matching problem. PMID:24451470

  7. Applications of Nanoparticle-Containing Plasmas for High-Order Harmonic Generation of Laser Radiation

    NASA Astrophysics Data System (ADS)

    Ganeev, Rashid A.

    The use of nanoparticles for efficient conversion of the wavelength of ultrashort laser pulses toward the deep UV spectral range through harmonic generation is an attractive application of cluster-containing plasmas. Earlier observations of high-order harmonic generation (HHG) in nanoparticles were limited to the exotic gas clusters formed during fast cooling of the atomic flow from gas jets [1-4]. For such clusters it is difficult to define the structure and the ratio between nanoparticles and atoms/ions in the gas flow. The characterization of gas-phase cluster production has recently been improved using sophisticated techniques (e.g., control of nanoparticle mass and spatial distribution; see the review [5]). In the meantime, plasma nanoparticle HHG has demonstrated some advantages over gas cluster HHG [6]. The application of commercially available nanopowders allowed the sizes and structure of these clusters in the plume to be defined precisely. The laser ablation technique made possible the predictable manipulation of plasma characteristics, which led to the creation of laser plumes containing mainly nanoparticles with known spatial structure. The latter allows the application of such plumes in nonlinear optics, X-ray emission of clusters, deposition of nanoparticles with fixed parameters on substrates for the semiconductor industry, production of nanostructured and nanocomposite films, etc.

  8. Elements concentration analysis in groundwater from the North Serra Geral aquifer in Santa Helena-Brazil using SR-TXRF spectrometer.

    PubMed

    Justen, Gisele C; Espinoza-Quiñones, Fernando R; Módenes, Aparecido Nivaldo; Bergamasco, Rosangela

    2012-01-01

    In this work, element concentrations in groundwater were analysed using the synchrotron radiation total-reflection X-ray fluorescence (SR-TXRF) technique. A set of nine tube-wells with serious risk of contamination was chosen to monitor the mean concentration of elements in groundwater from the North Serra Geral aquifer in Santa Helena, Brazil, over 1 year. Element concentrations were determined by applying an SR-TXRF methodology. The accuracy of the SR-TXRF technique was validated by analysis of a certified reference material. As the groundwater composition in the North Serra Geral aquifer showed heterogeneity in the spatial distribution of eight major elements, hierarchical clustering was applied to the data. Based on the similarity of their compositions, two of the nine wells were grouped in a first cluster, while the other seven were grouped in a second cluster. Calcium was the major element in all wells, with higher Ca concentration in the second cluster than in the first cluster. However, concentrations of Ti, V and Cr in the first cluster are slightly higher than those in the second cluster. The findings of this study, within a monitoring program of tube-wells, could provide a useful assessment of controls over groundwater composition and support management at the regional level.
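
    Grouping wells by the similarity of their compositions can be sketched with a small agglomerative (bottom-up) hierarchical clustering. The linkage rule and distance used in the study are not given in the abstract, so average linkage with Euclidean distance is assumed here, and the concentration values below are made-up toy numbers rather than measurements.

```python
def agglomerate(rows, n_clusters):
    """Average-linkage agglomerative clustering.

    rows: one feature tuple per sample (e.g. element concentrations per well).
    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def dist(a, b):
        # Euclidean distance between two samples.
        return sum((x - y) ** 2 for x, y in zip(rows[a], rows[b])) ** 0.5

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy compositions (hypothetical): two low-value wells and three high-value wells.
wells = [(1.0, 0.2), (1.1, 0.25), (5.0, 0.1), (5.2, 0.12), (4.9, 0.09)]
groups = agglomerate(wells, 2)
```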

  9. Syndrome Surveillance Using Parametric Space-Time Clustering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    KOCH, MARK W.; MCKENNA, SEAN A.; BILISOLY, ROGER L.

    2002-11-01

    As demonstrated by the anthrax attack through the United States mail, people infected by the biological agent itself will give the first indication of a bioterror attack. Thus, a distributed information system that can rapidly and efficiently gather and analyze public health data would aid epidemiologists in detecting and characterizing emerging diseases, including bioterror attacks. We propose using clusters of adverse health events in space and time to detect possible bioterror attacks. Space-time clusters can indicate exposure to infectious diseases or localized exposure to toxins. Most space-time clustering approaches require individual patient data. To protect the patient's privacy, we have extended these approaches to aggregated data and have embedded this extension in a sequential probability ratio test (SPRT) framework. The real-time and sequential nature of health data makes the SPRT an ideal candidate. The result of space-time clustering gives the statistical significance of a cluster at every location in the surveillance area and can be thought of as a "health-index" of the people living in this area. As a surrogate to bioterrorism data, we have experimented with two flu data sets. For both databases, we show that space-time clustering can detect a flu epidemic up to 21 to 28 days earlier than a conventional periodic regression technique. We have also tested using simulated anthrax attack data on top of a respiratory illness diagnostic category. Results show we do very well at detecting an attack as early as the second or third day after infected people start becoming severely symptomatic.
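
    The sequential probability ratio test mentioned above can be illustrated with a minimal sketch of Wald's classical SPRT on daily case counts. It assumes Poisson-distributed counts with a known baseline rate and a hypothesised outbreak rate, and it ignores the spatial component entirely; it is not the report's space-time formulation, and all names and numbers are hypothetical.

```python
import math


def sprt_poisson(counts, rate0, rate1, alpha=0.01, beta=0.01):
    """Wald's SPRT for Poisson counts.

    H0: mean daily count is rate0 (baseline); H1: mean daily count is rate1 (outbreak).
    alpha and beta are the target false-alarm and missed-detection probabilities.
    Returns (decision, day), where decision is "alarm", "baseline" or "undecided".
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this value
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this value
    llr = 0.0
    for day, x in enumerate(counts, start=1):
        # Log-likelihood ratio contribution of one Poisson observation.
        llr += x * math.log(rate1 / rate0) - (rate1 - rate0)
        if llr >= upper:
            return "alarm", day
        if llr <= lower:
            return "baseline", day
    return "undecided", len(counts)


# Toy counts (made up): a baseline of 5 cases/day against an outbreak rate of 10.
print(sprt_poisson([12, 13, 11], rate0=5.0, rate1=10.0))
```

    Because the test is evaluated after every new observation, it can stop as soon as the evidence is strong enough, which is the property that suits it to real-time surveillance data.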

  10. Clustering multilayer omics data using MuNCut.

    PubMed

    Teran Hidalgo, Sebastian J; Ma, Shuangge

    2018-03-14

    Omics profiling is now a routine component of biomedical studies. In the analysis of omics data, clustering is an essential step and serves multiple purposes including, for example, revealing the unknown functionalities of omics units and assisting dimension reduction in outcome model building. In the most recent omics studies, a prominent trend is to conduct multilayer profiling, which collects multiple types of genetic, genomic, epigenetic and other measurements on the same subjects. In the literature, clustering methods tailored to multilayer omics data are still limited. Directly applying the existing clustering methods to multilayer omics data, and clustering each layer first and then combining across layers, are both "suboptimal" in that they do not accommodate the interconnections within layers and across layers in an informative way. In this study, we develop the MuNCut (Multilayer NCut) clustering approach. It is tailored to multilayer omics data and sufficiently accounts for both across- and within-layer connections. It is based on the novel NCut technique and also takes advantage of regularized sparse estimation. It has an intuitive formulation and is computationally very feasible. To facilitate implementation, we develop the function muncut in the R package NcutYX. Under a wide spectrum of simulation settings, it outperforms competitors. The analysis of TCGA (The Cancer Genome Atlas) data on breast cancer and cervical cancer shows that MuNCut generates biologically meaningful results which differ from those using the alternatives. We propose a more effective clustering analysis of multiple omics data. It provides a new avenue for jointly analyzing genetic, genomic, epigenetic and other measurements.

  11. Structure and stability of small Li$_2^+$(X$^2\Sigma_g^+$)-Xe$_n$ ($n$ = 1-6) clusters

    NASA Astrophysics Data System (ADS)

    Saidi, Sameh; Ghanmi, Chedli; Berriche, Hamid

    2014-04-01

    We have studied the structure and stability of the Li$_2^+$(X$^2\Sigma_g^+$)Xe$_n$ ($n$ = 1-6) clusters for special symmetry groups. The potential energy surfaces of these clusters are described using an accurate ab initio approach based on a non-empirical pseudopotential, a parameterized l-dependent polarization potential, and analytic potential forms for the Li$^+$Xe and Xe-Xe interactions. The pseudopotential technique reduces the number of active electrons of the Li$_2^+$(X$^2\Sigma_g^+$)-Xe$_n$ ($n$ = 1-6) clusters to only one, the Li valence electron. The core-core interactions for Li$^+$Xe are included using an accurate CCSD(T) potential fitted with the analytical form of Tang and Toennies. For the Xe-Xe interactions we have used the Lennard-Jones (LJ 6-12) analytical form. The potential energy surfaces of the Li$_2^+$(X$^2\Sigma_g^+$)Xe$_n$ ($n$ = 1-6) clusters are computed with the Li$_2^+$(X$^2\Sigma_g^+$) alkali dimer fixed at its equilibrium distance. They are used to extract information on the stability of the Li$_2^+$(X$^2\Sigma_g^+$)Xe$_n$ ($n$ = 1-6) clusters. For each $n$, the stability of the different isomers is examined by comparing their potential energy surfaces. Moreover, we have determined the quantum energies ($D_0$), the zero-point energies (ZPE) and the ZPE%. To the best of our knowledge, no experimental or theoretical work has been reported for the Li$_2^+$(X$^2\Sigma_g^+$)Xe$_n$ ($n$ = 1-6) clusters; our results are presented here for the first time.
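
    For reference, the Lennard-Jones 6-12 form named above for the Xe-Xe interaction is conventionally written as follows, where the well depth $\varepsilon$ and the length scale $\sigma$ are the usual symbols for the two fitted parameters (the values used in the paper are not given in the abstract):

```latex
\[
  V_{\mathrm{LJ}}(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right],
  \qquad
  r_{\min} = 2^{1/6}\,\sigma, \quad V_{\mathrm{LJ}}(r_{\min}) = -\varepsilon .
\]
```

    The $r^{-12}$ term models short-range repulsion and the $r^{-6}$ term the long-range dispersion attraction; the minimum follows from setting $dV_{\mathrm{LJ}}/dr = 0$.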

  12. Development of advanced acreage estimation methods

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr. (Principal Investigator)

    1980-01-01

    The use of the AMOEBA clustering/classification algorithm was investigated as a basis for both a color display generation technique and maximum likelihood proportion estimation procedure. An approach to analyzing large data reduction systems was formulated and an exploratory empirical study of spatial correlation in LANDSAT data was also carried out. Topics addressed include: (1) development of multiimage color images; (2) spectral spatial classification algorithm development; (3) spatial correlation studies; and (4) evaluation of data systems.

  13. Correlation Functions in Two-Dimensional Critical Systems with Conformal Symmetry

    NASA Astrophysics Data System (ADS)

    Flores, Steven Miguel

    This thesis presents a study of certain conformal field theory (CFT) correlation functions that describe physical observables in conformally invariant two-dimensional critical systems. These are typically continuum limits of critical lattice models in a domain within the complex plane and with a boundary. Certain clusters, called boundary clusters, anchor to the boundary of the domain, and many of their features are governed by a conformally invariant probability measure. For example, percolation is a critical lattice model, and when it is confined to a domain with a boundary, connected clusters of activated bonds that touch that boundary are the boundary clusters. This thesis is concerned with how the boundary clusters interact with each other according to that measure. One question that it considers is "how likely are these clusters to repel each other or to connect with one another in a certain topological configuration?" Chapter one non-rigorously derives an already well-known elliptic system of differential equations closely tied to this matter by using standard techniques of CFT, chapters two and three rigorously infer certain properties concerning the solution space of this system, and chapter four uses some of those results to predict an answer to this question. This thesis also considers local variations of this question such as "what regions of the domain do the perimeters of the boundary clusters explore," and "how often will several boundary clusters connect at just a single, specified point in the domain?" Chapter five predicts precise answers to these questions. All of these answers are quantitative predictions that we verify via high-precision computer simulation. Chapters four and five also present these simulation results. Further material that supplements chapter one is included in two appendices.

  14. Towards a realistic population of simulated galaxy groups and clusters

    NASA Astrophysics Data System (ADS)

    Le Brun, Amandine M. C.; McCarthy, Ian G.; Schaye, Joop; Ponman, Trevor J.

    2014-06-01

    We present a new suite of large-volume cosmological hydrodynamical simulations called cosmo-OWLS. They form an extension to the OverWhelmingly Large Simulations (OWLS) project, and have been designed to help improve our understanding of cluster astrophysics and non-linear structure formation, which are now the limiting systematic errors when using clusters as cosmological probes. Starting from identical initial conditions in either the Planck or WMAP7 cosmologies, we systematically vary the most important 'sub-grid' physics, including feedback from supernovae and active galactic nuclei (AGN). We compare the properties of the simulated galaxy groups and clusters to a wide range of observational data, such as X-ray luminosity and temperature, gas mass fractions, entropy and density profiles, Sunyaev-Zel'dovich flux, I-band mass-to-light ratio, dominance of the brightest cluster galaxy and central massive black hole (BH) masses, by producing synthetic observations and mimicking observational analysis techniques. These comparisons demonstrate that some AGN feedback models can produce a realistic population of galaxy groups and clusters, broadly reproducing both the median trend and, for the first time, the scatter in physical properties over approximately two decades in mass (10^13 M⊙ ≲ M500 ≲ 10^15 M⊙) and 1.5 decades in radius (0.05 ≲ r/r500 ≲ 1.5). However, in other models, the AGN feedback is too violent (even though they reproduce the observed BH scaling relations), implying that calibration of the models is required. The production of realistic populations of simulated groups and clusters, as well as models that bracket the observations, opens the door to the creation of synthetic surveys for assisting the astrophysical and cosmological interpretation of cluster surveys, as well as quantifying the impact of selection effects.

  15. Binaries in globular clusters

    NASA Technical Reports Server (NTRS)

    Hut, Piet; McMillan, Steve; Goodman, Jeremy; Mateo, Mario; Phinney, E. S.; Pryor, Carlton; Richer, Harvey B.; Verbunt, Frank; Weinberg, Martin

    1992-01-01

    Recent observations have shown that globular clusters contain a substantial number of binaries, most of which are believed to be primordial. We discuss different successful optical search techniques, based on radial-velocity variables, photometric variables, and the positions of stars in the color-magnitude diagram. In addition, we review searches at other wavelengths, which have turned up low-mass X-ray binaries and, more recently, a variety of radio pulsars. On the theoretical side, we give an overview of the different physical mechanisms through which individual binaries evolve. We discuss the various simulation techniques which have recently been employed to study the effects of a primordial binary population, and the fascinating interplay between stellar evolution and stellar dynamics which drives globular-cluster evolution.

  16. Semiconductor nanocrystals covalently bound to solid inorganic surfaces using self-assembled monolayers

    DOEpatents

    Alivisatos, A. Paul; Colvin, Vicki L.

    1998-01-01

    Methods are described for attaching semiconductor nanocrystals to solid inorganic surfaces, using self-assembled bifunctional organic monolayers as bridge compounds. Two different techniques are presented. One relies on the formation of self-assembled monolayers on these surfaces. When exposed to solutions of nanocrystals, these bridge compounds bind the crystals and anchor them to the surface. The second technique attaches nanocrystals already coated with bridge compounds to the surfaces. Analyses indicate the presence of quantum confined clusters on the surfaces at the nanolayer level. These materials allow electron spectroscopies to be completed on condensed phase clusters, and represent a first step towards synthesis of an organized assembly of clusters. These new products are also disclosed.

  17. Amplified Fragment Length Polymorphism Fingerprinting of Pseudomonas Strains from a Poultry Processing Plant

    PubMed Central

    Geornaras, Ifigenia; Kunene, Nokuthula F.; von Holy, Alexander; Hastings, John W.

    1999-01-01

    Molecular typing has been used previously to identify and trace dissemination of pathogenic and spoilage bacteria associated with food processing. Amplified fragment length polymorphism (AFLP) is a novel DNA fingerprinting technique which is considered highly reproducible and has high discriminatory power. This technique was used to fingerprint 88 Pseudomonas fluorescens and Pseudomonas putida strains that were previously isolated from plate counts of carcasses at six processing stages and various equipment surfaces and environmental sources of a poultry abattoir. Clustering of the AFLP patterns revealed a high level of diversity among the strains. Six clusters (clusters I through VI) were delineated at an arbitrary Dice coefficient level of 0.65; clusters III (31 strains) and IV (28 strains) were the largest clusters. More than one-half (52.3%) of the strains obtained from carcass samples, which may have represented the resident carcass population, grouped together in cluster III. By contrast, 43.2% of the strains from most of the equipment surfaces and environmental sources grouped together in cluster IV. In most cases, the clusters in which carcass strains from processing stages grouped corresponded to the clusters in which strains from the associated equipment surfaces and/or environmental sources were found. This provided evidence that there was cross-contamination between carcasses and the abattoir environment at the DNA level. The AFLP data also showed that strains were being disseminated from the beginning to the end of the poultry processing operation, since many strains associated with carcasses at the packaging stage were members of the same clusters as strains obtained from carcasses after the defeathering stage. PMID:10473382
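
    To illustrate how fingerprint patterns can be grouped at a Dice coefficient cut-off such as the 0.65 level quoted above, here is a minimal sketch. It assumes each fingerprint is represented as a set of band positions and uses simple single-linkage grouping; the abstract does not say which linkage method the authors used, and the strain names and band values below are hypothetical.

```python
from itertools import combinations


def dice(a, b):
    """Dice coefficient between two sets of band positions: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def group_by_dice(fingerprints, threshold=0.65):
    """Single-linkage grouping: strains linked by any pair with Dice >= threshold."""
    parent = {name: name for name in fingerprints}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for s, t in combinations(fingerprints, 2):
        if dice(fingerprints[s], fingerprints[t]) >= threshold:
            parent[find(s)] = find(t)

    clusters = {}
    for name in fingerprints:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical band positions (fragment sizes) for four strains.
strains = {
    "carcass_1": {100, 150, 200, 250},
    "carcass_2": {100, 150, 200, 300},
    "equipment_1": {400, 450, 500},
    "equipment_2": {400, 450, 550},
}
groups = group_by_dice(strains)
```

    With these made-up fingerprints, the two carcass strains share three of four bands and fall into one group, while the two equipment strains form another. Average-linkage methods such as UPGMA are a common alternative and could give different groupings on real data.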

  18. Amplified fragment length polymorphism fingerprinting of Pseudomonas strains from a poultry processing plant.

    PubMed

    Geornaras, I; Kunene, N F; von Holy, A; Hastings, J W

    1999-09-01

    Molecular typing has been used previously to identify and trace dissemination of pathogenic and spoilage bacteria associated with food processing. Amplified fragment length polymorphism (AFLP) is a novel DNA fingerprinting technique which is considered highly reproducible and has high discriminatory power. This technique was used to fingerprint 88 Pseudomonas fluorescens and Pseudomonas putida strains that were previously isolated from plate counts of carcasses at six processing stages and various equipment surfaces and environmental sources of a poultry abattoir. Clustering of the AFLP patterns revealed a high level of diversity among the strains. Six clusters (clusters I through VI) were delineated at an arbitrary Dice coefficient level of 0.65; clusters III (31 strains) and IV (28 strains) were the largest clusters. More than one-half (52.3%) of the strains obtained from carcass samples, which may have represented the resident carcass population, grouped together in cluster III. By contrast, 43.2% of the strains from most of the equipment surfaces and environmental sources grouped together in cluster IV. In most cases, the clusters in which carcass strains from processing stages grouped corresponded to the clusters in which strains from the associated equipment surfaces and/or environmental sources were found. This provided evidence that there was cross-contamination between carcasses and the abattoir environment at the DNA level. The AFLP data also showed that strains were being disseminated from the beginning to the end of the poultry processing operation, since many strains associated with carcasses at the packaging stage were members of the same clusters as strains obtained from carcasses after the defeathering stage.

  19. Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.

    PubMed

    Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio

    Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  20. Comparing Dark Energy Survey and HST-CLASH observations of the galaxy cluster RXC J2248.7-4431: implications for stellar mass versus dark matter

    DOE PAGES

    Palmese, A.; Lahav, O.; Banerji, M.; ...

    2016-08-20

    We derive the stellar mass fraction in the galaxy cluster RXC J2248.7-4431 observed with the Dark Energy Survey (DES) during the Science Verification period. We compare the stellar mass results from DES (5 filters) with those from the Hubble Space Telescope CLASH (17 filters). When the cluster spectroscopic redshift is assumed, we show that stellar masses from DES can be estimated within 25% of CLASH values. We compute the stellar mass contribution coming from red and blue galaxies, and study the relation between stellar mass and the underlying dark matter using weak lensing studies with DES and CLASH. An analysis of the radial profiles of the DES total and stellar mass yields a stellar-to-total fraction of f* = (7.0 ± 2.2) × 10^-3 within a radius of r_200c ~ 3 Mpc. Our analysis also includes a comparison of photometric redshifts and star/galaxy separation efficiency for both datasets. We conclude that space-based small field imaging can be used to calibrate the galaxy properties in DES for the much wider field of view. The technique developed to derive the stellar mass fraction in galaxy clusters can be applied to the ~100 000 clusters that will be observed within this survey. The stacking of all the DES clusters would reduce the errors on f* estimates and deduce important information about galaxy evolution.

  1. Comparing Dark Energy Survey and HST-CLASH observations of the galaxy cluster RXC J2248.7-4431: implications for stellar mass versus dark matter

    NASA Astrophysics Data System (ADS)

    Palmese, A.; Lahav, O.; Banerji, M.; Gruen, D.; Jouvel, S.; Melchior, P.; Aleksić, J.; Annis, J.; Diehl, H. T.; Hartley, W. G.; Jeltema, T.; Romer, A. K.; Rozo, E.; Rykoff, E. S.; Seitz, S.; Suchyta, E.; Zhang, Y.; Abbott, T. M. C.; Abdalla, F. B.; Allam, S.; Benoit-Lévy, A.; Bertin, E.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Capozzi, D.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Crocce, M.; Cunha, C. E.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Dietrich, J. P.; Doel, P.; Estrada, J.; Evrard, A. E.; Flaugher, B.; Frieman, J.; Gerdes, D. W.; Goldstein, D. A.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; James, D. J.; Kuehn, K.; Kuropatkin, N.; Li, T. S.; Lima, M.; Maia, M. A. G.; Marshall, J. L.; Miller, C. J.; Miquel, R.; Nord, B.; Ogando, R.; Plazas, A. A.; Roodman, A.; Sanchez, E.; Scarpine, V.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Tucker, D.; Vikram, V.

    2016-12-01

    We derive the stellar mass fraction in the galaxy cluster RXC J2248.7-4431 observed with the Dark Energy Survey (DES) during the Science Verification period. We compare the stellar mass results from DES (five filters) with those from the Hubble Space Telescope Cluster Lensing And Supernova Survey (CLASH; 17 filters). When the cluster spectroscopic redshift is assumed, we show that stellar masses from DES can be estimated within 25 per cent of CLASH values. We compute the stellar mass contribution coming from red and blue galaxies, and study the relation between stellar mass and the underlying dark matter using weak lensing studies with DES and CLASH. An analysis of the radial profiles of the DES total and stellar mass yields a stellar-to-total fraction of f⋆ = (6.8 ± 1.7) × 10^-3 within a radius of r_200c ≃ 2 Mpc. Our analysis also includes a comparison of photometric redshifts and star/galaxy separation efficiency for both data sets. We conclude that space-based small field imaging can be used to calibrate the galaxy properties in DES for the much wider field of view. The technique developed to derive the stellar mass fraction in galaxy clusters can be applied to the ~100 000 clusters that will be observed within this survey and yield important information about galaxy evolution.

  2. Comparing Dark Energy Survey and HST-CLASH observations of the galaxy cluster RXC J2248.7-4431: implications for stellar mass versus dark matter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Palmese, A.; Lahav, O.; Banerji, M.

    We derive the stellar mass fraction in the galaxy cluster RXC J2248.7-4431 observed with the Dark Energy Survey (DES) during the Science Verification period. We compare the stellar mass results from DES (5 filters) with those from the Hubble Space Telescope CLASH (17 filters). When the cluster spectroscopic redshift is assumed, we show that stellar masses from DES can be estimated within 25% of CLASH values. We compute the stellar mass contribution coming from red and blue galaxies, and study the relation between stellar mass and the underlying dark matter using weak lensing studies with DES and CLASH. An analysis of the radial profiles of the DES total and stellar mass yields a stellar-to-total fraction of f* = (7.0 ± 2.2) × 10^-3 within a radius of r_200c ~ 3 Mpc. Our analysis also includes a comparison of photometric redshifts and star/galaxy separation efficiency for both datasets. We conclude that space-based small field imaging can be used to calibrate the galaxy properties in DES for the much wider field of view. The technique developed to derive the stellar mass fraction in galaxy clusters can be applied to the ~100 000 clusters that will be observed within this survey. The stacking of all the DES clusters would reduce the errors on f* estimates and deduce important information about galaxy evolution.

  3. Comparing Dark Energy Survey and HST-CLASH observations of the galaxy cluster RXC J2248.7-4431: implications for stellar mass versus dark matter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Palmese, A.; Lahav, O.; Banerji, M.

    We derive the stellar mass fraction in the galaxy cluster RXC J2248.7-4431 observed with the Dark Energy Survey (DES) during the Science Verification period. We compare the stellar mass results from DES (five filters) with those from the Hubble Space Telescope Cluster Lensing And Supernova Survey (CLASH; 17 filters). When the cluster spectroscopic redshift is assumed, we show that stellar masses from DES can be estimated within 25 per cent of CLASH values. We compute the stellar mass contribution coming from red and blue galaxies, and study the relation between stellar mass and the underlying dark matter using weak lensing studies with DES and CLASH. An analysis of the radial profiles of the DES total and stellar mass yields a stellar-to-total fraction of f⋆ = (6.8 ± 1.7) × 10^-3 within a radius of r_200c ≃ 2 Mpc. Our analysis also includes a comparison of photometric redshifts and star/galaxy separation efficiency for both data sets. We conclude that space-based small field imaging can be used to calibrate the galaxy properties in DES for the much wider field of view. The technique developed to derive the stellar mass fraction in galaxy clusters can be applied to the ~100 000 clusters that will be observed within this survey and yield important information about galaxy evolution.

  4. Visual cluster analysis and pattern recognition template and methods

    DOEpatents

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    1999-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable to, and improve, pattern recognition techniques.

  5. Rain volume estimation over areas using satellite and radar data

    NASA Technical Reports Server (NTRS)

    Doneaud, A. A.; Vonderhaar, T. H.

    1985-01-01

    An investigation of the feasibility of rain volume estimation using satellite data, following a technique recently developed with radar data called the Area Time Integral, was undertaken. Case studies were selected on the basis of existing radar and satellite data sets which match in space and time. Four multicell clusters were analyzed. Routines for navigation, remapping, and smoothing of satellite images were performed. Visible counts were normalized for solar zenith angle. A radar sector of interest was defined to delineate specific radar echo clusters for each radar time throughout the radar echo cluster lifetime. A satellite sector of interest was defined by applying small adjustments to the radar sector using a manual processing technique. The radar echo area, the IR maximum counts and the IR counts matching radar echo areas were found to evolve similarly, except for the decaying phase of the cluster where the cirrus debris keeps the IR counts high.

  6. Automated detection of microcalcification clusters in mammograms

    NASA Astrophysics Data System (ADS)

    Karale, Vikrant A.; Mukhopadhyay, Sudipta; Singh, Tulika; Khandelwal, Niranjan; Sadhu, Anup

    2017-03-01

    Mammography is the most efficient modality for detection of breast cancer at an early stage. Microcalcifications are tiny bright spots in mammograms and can often be missed by the radiologist during diagnosis. The presence of microcalcification clusters in mammograms can act as an early sign of breast cancer. This paper presents a completely automated computer-aided detection (CAD) system for detection of microcalcification clusters in mammograms. Unsharp masking is used as a preprocessing step which enhances the contrast between microcalcifications and the background. The preprocessed image is thresholded and various shape- and intensity-based features are extracted. A support vector machine (SVM) classifier is used to reduce the false positives while preserving the true microcalcification clusters. The proposed technique is applied to two different databases, i.e., DDSM and a private database. The proposed technique shows good sensitivity with moderate false positives (FPs) per image on both databases.
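
    The preprocessing idea described above (unsharp masking followed by thresholding) can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives no kernel, gain, or threshold, so the 3x3 box blur, the `amount` parameter, the threshold level, and the toy image are all hypothetical, and the feature extraction and SVM stages are not shown.

```python
def box_blur(image):
    """3x3 mean filter; border pixels are handled by replicating the edge."""
    rows, cols = len(image), len(image[0])
    blurred = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += image[rr][cc]
            blurred[r][c] = total / 9.0
    return blurred


def unsharp_mask(image, amount=1.0):
    """Sharpened = original + amount * (original - blurred)."""
    blurred = box_blur(image)
    return [
        [pix + amount * (pix - blur) for pix, blur in zip(row, blur_row)]
        for row, blur_row in zip(image, blurred)
    ]


def threshold(image, level):
    """Binary mask marking pixels strictly brighter than the given level."""
    return [[1 if pix > level else 0 for pix in row] for row in image]


# Hypothetical 5x5 patch: flat background of 10 with one bright spot of 20.
patch = [[10.0] * 5 for _ in range(5)]
patch[2][2] = 20.0
enhanced = unsharp_mask(patch)
mask = threshold(enhanced, level=20.0)
```

    In this toy patch the bright spot is pushed further above the background while flat regions are left unchanged, so a threshold that the original spot would not exceed now isolates it. A Gaussian blur is a common alternative to the box filter assumed here.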

  7. Significance of clustering and classification applications in digital and physical libraries

    NASA Astrophysics Data System (ADS)

    Triantafyllou, Ioannis; Koulouris, Alexandros; Zervos, Spiros; Dendrinos, Markos; Giannakopoulos, Georgios

    2015-02-01

    Applications of clustering and classification techniques can prove very significant in both digital and physical (paper-based) libraries. The most essential application, document classification and clustering, is crucial for the content that is produced and maintained in digital libraries, repositories, databases, social media, blogs etc., based on various tags and ontology elements, transcending the traditional library-oriented classification schemes. Other applications with a very useful and beneficial role in the new digital library environment involve document routing, summarization and query expansion. Paper-based libraries can benefit as well, since classification combined with advanced material characterization techniques such as FTIR (Fourier Transform InfraRed spectroscopy) can be vital for the study and prevention of material deterioration. An improved two-level self-organizing clustering architecture is proposed in order to enhance the discrimination capacity of the learning space, prior to classification, yielding promising results when applied to the above-mentioned library tasks.

  8. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    ERIC Educational Resources Information Center

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  9. A Technique of Two-Stage Clustering Applied to Environmental and Civil Engineering and Related Methods of Citation Analysis.

    ERIC Educational Resources Information Center

    Miyamoto, S.; Nakayama, K.

    1983-01-01

    A method of two-stage clustering of literature based on citation frequency is applied to 5,065 articles from 57 journals in environmental and civil engineering. Results of related methods of citation analysis (hierarchical graph, clustering of journals, multidimensional scaling) applied to same set of articles are compared. Ten references are…

  10. Ten-year results of a ponderosa pine progeny test in the Black Hills

    Treesearch

    Wayne D. Shepperd; Sue E. McElderry

    1986-01-01

    Ten-year survival and growth of seedlings from 77 parent trees from throughout the Black Hills were compared, using a cluster-analysis technique. Five clusters were identified that account for most of the variability in survival and growth of the open-pollinated families. One cluster, containing 6 families, exhibited exceptional survival and growth. Another, containing...

  11. Overlapping Community Detection based on Network Decomposition

    NASA Astrophysics Data System (ADS)

    Ding, Zhuanlian; Zhang, Xingyi; Sun, Dengdi; Luo, Bin

    2016-04-01

    Community detection in complex networks has become a vital step to understand the structure and dynamics of networks in various fields. However, traditional node clustering and the more recently proposed link clustering methods have inherent drawbacks in discovering overlapping communities. Node clustering is inadequate to capture the pervasive overlaps, while link clustering is often criticized due to the high computational cost and ambiguous definition of communities. So, overlapping community detection is still a formidable challenge. In this work, we propose a new overlapping community detection algorithm based on network decomposition, called NDOCD. Specifically, NDOCD iteratively splits the network by removing all links in derived link communities, which are identified by utilizing a node clustering technique. The network decomposition contributes to reducing the computation time, and the elimination of noise links helps improve the quality of the obtained communities. Besides, we employ a node clustering technique rather than a link similarity measure to discover link communities; thus NDOCD avoids an ambiguous definition of community and becomes less time-consuming. We test our approach on both synthetic and real-world networks. Results demonstrate the superior performance of our approach both in computation time and accuracy compared to state-of-the-art algorithms.

  12. Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques

    NASA Astrophysics Data System (ADS)

    Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein

    2017-10-01

    Groundwater samples from the Rapur area were collected from different sites to evaluate the major ion chemistry. The large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness to classify and identify geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of water quality of a study area. From PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.

  13. Real-time analysis of self-assembled nucleobases by Venturi easy ambient sonic-spray ionization mass spectrometry.

    PubMed

    Na, Na; Shi, Ruixia; Long, Zi; Lu, Xin; Jiang, Fubin; Ouyang, Jin

    2014-10-01

    In this study, the real-time analysis of self-assembled nucleobases was performed by Venturi easy ambient sonic-spray ionization mass spectrometry (V-EASI-MS). With the analysis of three nucleobases including 6-methyluracil (6MU), uracil (U) and thymine (T) as examples, different orders of clusters centered with different metal ions were recorded in both positive and negative modes. Compared with the results obtained by traditional electrospray ionization mass spectrometry (ESI-MS) under the same conditions, more clusters with high orders, such as [6MU7+Na](+), [6MU15+2NH4](2+), [6MU10+Na](+), [T7+Na](+), and [T15+2NH4](2+) were detected by V-EASI-MS, which demonstrated the soft ionization ability of V-EASI for studying the non-covalent interaction in a self-assembly process. Furthermore, with the injection of K(+) into the system by syringe pumping, the real-time monitoring of the formation of nucleobase clusters was achieved by the direct extraction of samples from the system under the Venturi effect. Therefore, the effect of cations on the formation of clusters during self-assembly of nucleobases was demonstrated, which was in accordance with the reports. Free of high voltage, heating or radiation during the ionization, this technique is very soft and suitable for obtaining real-time information on the self-assembly system, which also makes it quite convenient for extracting samples from the reaction system. This "easy and soft" ionization technique has provided a potential pathway for monitoring and controlling the self-assembly processes. Copyright © 2014 Elsevier B.V. All rights reserved.

  14. Superhydrophilic nanostructure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mao, Samuel S; Zormpa, Vasileia; Chen, Xiaobo

    2015-05-12

    An embodiment of a superhydrophilic nanostructure includes nanoparticles. The nanoparticles are formed into porous clusters. The porous clusters are formed into aggregate clusters. An embodiment of an article of manufacture includes the superhydrophilic nanostructure on a substrate. An embodiment of a method of fabricating a superhydrophilic nanostructure includes applying a solution that includes nanoparticles to a substrate. The substrate is heated to form aggregate clusters of porous clusters of the nanoparticles.

  15. The impact of clustering of extreme European windstorm events on (re)insurance market portfolios

    NASA Astrophysics Data System (ADS)

    Mitchell-Wallace, Kirsten; Alvarez-Diaz, Teresa

    2010-05-01

    Traditionally the occurrence of windstorm loss events in Europe has been considered as independent. However, a number of significant losses close in space and time indicates that this assumption may need to be revised. Under particular atmospheric conditions multiple loss-causing cyclones can occur in succession, affecting similar geographic regions and, therefore, insurance markets. A notable example is that of Lothar and Martin in France in December 1999. Although the existence of cyclone families is well known to meteorologists, there has been limited research into the occurrence of serial windstorms. However, climate modelling research is now providing the ability to explore the physical drivers of clustering, and to improve understanding of the hazard aspect of catastrophe modelling. While analytics tools, including catastrophe models, may incorporate assumptions regarding the influence of dependency through statistical means, the most recent research outputs provide a new strand of information with the potential to re-assess the probabilistic loss potential in light of clustering and to provide an additional view on probable maximum losses to windstorm-exposed portfolios across regions such as Northwest Europe. There is, however, a need for the testing of these new techniques within operational (re)insurance applications, and this paper provides an overview of the most current clustering research, including the 2009 paper by Vitolo et al., in relation to reinsurance risk modelling, and assesses the potential impact of such additional information on the overall risk assessment process. We examine the consequences of the serial clustering of extra-tropical cyclones demonstrated by Vitolo et al. (2009) from the perspective of a large European reinsurer, examining potential implications for pricing, accumulation, and capital adequacy.

  16. Structure determination in 55-atom Li-Na and Na-K nanoalloys.

    PubMed

    Aguado, Andrés; López, José M

    2010-09-07

    The structure of 55-atom Li-Na and Na-K nanoalloys is determined through combined empirical potential (EP) and density functional theory (DFT) calculations. The potential energy surface generated by the EP model is extensively sampled by using the basin hopping technique, and a wide diversity of structural motifs is reoptimized at the DFT level. A composition comparison technique is applied at the DFT level in order to make a final refinement of the global minimum structures. For dilute concentrations of one of the alkali atoms, the structure of the pure metal cluster, namely, a perfect Mackay icosahedron, remains stable, with the minority component atoms entering the host cluster as substitutional impurities. At intermediate concentrations, the nanoalloys adopt instead a core-shell polyicosahedral (p-Ih) packing, where the element with smaller atomic size and larger cohesive energy segregates to the cluster core. The p-Ih structures show a marked prolate deformation, in agreement with the predictions of jelliumlike models. The electronic preference for a prolate cluster shape, which is frustrated in the 55-atom pure clusters due to the icosahedral geometrical shell closing, is therefore realized only in the 55-atom nanoalloys. An analysis of the electronic densities of states suggests that photoelectron spectroscopy would be a sufficiently sensitive technique to assess the structures of nanoalloys with fixed size and varying compositions.

  17. MACS J0416.1-2403: Impact of line-of-sight structures on strong gravitational lensing modelling of galaxy clusters

    NASA Astrophysics Data System (ADS)

    Chirivì, G.; Suyu, S. H.; Grillo, C.; Halkola, A.; Balestra, I.; Caminha, G. B.; Mercurio, A.; Rosati, P.

    2018-06-01

    Exploiting the powerful tool of strong gravitational lensing by galaxy clusters to study the highest-redshift Universe and cluster mass distributions relies on precise lens mass modelling. In this work, we aim to present the first attempt at modelling line-of-sight (LOS) mass distribution in addition to that of the cluster, extending previous modelling techniques that assume mass distributions to be on a single lens plane. We have focussed on the Hubble Frontier Field cluster MACS J0416.1-2403, and our multi-plane model reproduces the observed image positions with a rms offset of 0.53″. Starting from this best-fitting model, we simulated a mock cluster that resembles MACS J0416.1-2403 in order to explore the effects of LOS structures on cluster mass modelling. By systematically analysing the mock cluster under different model assumptions, we find that neglecting the lensing environment has a significant impact on the reconstruction of image positions (rms 0.3″); accounting for LOS galaxies as if they were at the cluster redshift can partially reduce this offset. Moreover, foreground galaxies are more important to include into the model than the background ones. While the magnification factors of the lensed multiple images are recovered within 10% for 95% of them, those 5% that lie near critical curves can be significantly affected by the exclusion of the lensing environment in the models. In addition, LOS galaxies cannot explain the apparent discrepancy in the properties of massive sub-halos between MACS J0416.1-2403 and N-body simulated clusters. Since our model of MACS J0416.1-2403 with LOS galaxies only reduced modestly the rms offset in the image positions, we conclude that additional complexities would be needed in future models of MACS J0416.1-2403.

  18. Overview of the 1997 Dirac High-Magnetic Series at Los Alamos

    NASA Astrophysics Data System (ADS)

    Clark, D. A.; Campbell, L. J.; Forman, K. C.; Fowler, C. M.; Goettee, J. D.; Mielke, C. H.; Rickel, D. G.; Marshall, B. R.

    2004-11-01

    During the summer of 1997, a series of high magnetic field experiments was conducted at Los Alamos National Laboratory. Four experiments utilizing Russian built MC-1 generators, which can reach fields as high as 10 Megagauss, and four smaller strip generator experiments at fields near 1.5 Megagauss were conducted. Experiments mounted on the devices included magnetoresistance of high temperature superconductors and semiconductors, optical reflectivity (conductivity) of semiconductors, magnetization of a magnetic cluster material and a semiconductor, Faraday rotation in a semiconductor and a magnetic cluster material, and transmission spectroscopy of molecules. Brief descriptions of the experimental setups, magnetic field measurement techniques, field results and various experiments are presented. Magnetic field data and other information on Dirac `97 can be found at .

  19. Identifying synonymy between relational phrases using word embeddings.

    PubMed

    Nguyen, Nhung T H; Miwa, Makoto; Tsuruoka, Yoshimasa; Tojo, Satoshi

    2015-08-01

    Many text mining applications in the biomedical domain benefit from automatic clustering of relational phrases into synonymous groups, since it alleviates the problem of spurious mismatches caused by the diversity of natural language expressions. Most of the previous work that has addressed this task of synonymy resolution uses similarity metrics between relational phrases based on textual strings or dependency paths, which, for the most part, ignore the context around the relations. To overcome this shortcoming, we employ a word embedding technique to encode relational phrases. We then apply the k-means algorithm on top of the distributional representations to cluster the phrases. Our experimental results show that this approach outperforms state-of-the-art statistical models including latent Dirichlet allocation and Markov logic networks. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. A New Technique using Electron Velocity Data from the Four Cluster Spacecraft to Explore Magnetofluid Turbulence in the Solar Wind

    NASA Technical Reports Server (NTRS)

    Goldstein, Melvyn L.; Gurgiolo, C.; Fazakerley, A.; Lahiff, A.

    2008-01-01

    It is now possible in certain circumstances to use velocity moments computed from the Plasma Electron and Current Experiment (PEACE) on the four Cluster spacecraft to determine a number of turbulence properties of the solar wind, including direct measurements of the vorticity and compressibility. Assuming that the four spacecraft are not co-planar and that there is only a linear variation of the plasma variables across the volume defined by the four satellites, one can estimate the curl of the fluid velocity, i.e., the vorticity. From the vorticity it is possible to explore directly intermittent regions in the solar wind where dissipation is likely to be enhanced. In addition, one can estimate directly the Taylor microscale.
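    The linear-variation assumption in this record can be illustrated with a short sketch. It is not the authors' code: the function names (`det3`, `solve3`, `vorticity`) and the sample positions and velocities are hypothetical, and the only physics assumed is that the velocity differences between spacecraft equal the position differences multiplied by a constant velocity-gradient tensor.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        # Co-planar spacecraft give a singular system, as the abstract notes.
        raise ValueError("spacecraft are (nearly) co-planar")
    x = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        x.append(det3(m) / d)
    return x


def vorticity(positions, velocities):
    """Estimate curl(v) from four (position, velocity) samples.

    Assumes v varies linearly in space, so that
    v_i - v_0 = (r_i - r_0) . grad(v) for spacecraft i = 1, 2, 3.
    """
    r0, v0 = positions[0], velocities[0]
    dr = [[p[j] - r0[j] for j in range(3)] for p in positions[1:]]
    dv = [[v[k] - v0[k] for k in range(3)] for v in velocities[1:]]
    # grad[k][j] = d v_k / d x_j, one linear solve per velocity component.
    grad = [solve3(dr, [dv[i][k] for i in range(3)]) for k in range(3)]
    return (grad[2][1] - grad[1][2],
            grad[0][2] - grad[2][0],
            grad[1][0] - grad[0][1])


# Hypothetical test case: rigid rotation v = (-y, x, 0), whose curl is (0, 0, 2).
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
velocities = [(-y, x, 0) for x, y, z in positions]
curl = vorticity(positions, velocities)
```

    With positions in km and velocities in km/s, the result would be in units of 1/s. A real analysis would also need to propagate the measurement uncertainties, which this sketch ignores.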

  1. The association between optimal lifestyle-related health behaviors and employee productivity.

    PubMed

    Katz, Abigail S; Pronk, Nicolaas P; Lowry, Marcia

    2014-07-01

    To investigate the association between lifestyle-related health behaviors including sleep and the cluster of physical activity, no tobacco use, fruits and vegetables intake, and alcohol consumption termed the "Optimal Lifestyle Metric" (OLM), and employee productivity. Data were obtained from employee health assessments (N = 18,079). Regression techniques were used to study the association between OLM and employee productivity, sleep and employee productivity, and the interaction of both OLM and sleep on employee productivity. Employees who slept less or more than 7 or 8 hours per night experienced significantly more productivity loss. Employees who adhered to all four OLM behaviors simultaneously experienced less productivity loss compared with those who did not. Adequate sleep and adherence to the OLM cluster of behaviors are associated with significantly less productivity loss.

  2. Localized analysis of paint-coat drying using dynamic speckle interferometry

    NASA Astrophysics Data System (ADS)

    Sierra-Sosa, Daniel; Tebaldi, Myrian; Grumel, Eduardo; Rabal, Hector; Elmaghraby, Adel

    2018-07-01

    Paint coating is part of several industrial processes, including those of the automotive industry, architectural coatings, machinery and appliances. These paint coatings must comply with high quality standards; for this reason, evaluation techniques for paint coatings are in constant development. One important factor in the paint-coating process is drying, as it influences the quality of the final result. In this work we present an assessment technique based on optical dynamic speckle interferometry. This technique allows the temporal activity of the paint-coating drying process to be evaluated, providing localized information about drying. This localized information is relevant for addressing drying homogeneity, optimal drying, and quality control. The technique relies on the definition of a new temporal history of the speckle patterns to obtain the local activity; this information is then clustered to provide a convenient indicator of the different stages of the drying process. The experimental results presented were validated using gravimetric drying curves.

  3. Determining the Number of Clusters in a Data Set Without Graphical Interpretation

    NASA Technical Reports Server (NTRS)

    Aguirre, Nathan S.; Davies, Misty D.

    2011-01-01

    Cluster analysis is a data mining technique that is meant to simplify the process of classifying data points. The basic clustering process requires an input of data points and the number of clusters wanted. The clustering algorithm will then pick starting C points for the clusters, which can be either random spatial points or random data points. It then assigns each data point to the nearest C point, where "nearest" usually means Euclidean distance, but some algorithms use another criterion. The next step is determining whether the clustering arrangement thus found is within a certain tolerance. If it falls within this tolerance, the process ends. Otherwise the C points are adjusted based on how many data points are in each cluster, and the steps repeat until the algorithm converges.
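    The loop described in this record can be sketched as follows. This is a minimal illustration of the standard Lloyd-style k-means procedure, not the authors' implementation: the function name `kmeans`, its parameters, and the sample points are assumptions, and the update step moves each C point to the mean of its assigned points, which is the usual choice.

```python
import math
import random


def kmeans(points, k, tol=1e-6, max_iter=100, seed=0):
    """Lloyd-style k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Starting C points: k distinct data points chosen at random.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members)
                                         for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        # Tolerance test: stop once no center moved more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
```

    The number of clusters `k` is still an input here; choosing it without graphical inspection is the problem the record addresses, and this sketch does not attempt it.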

  4. New gas phase inorganic ion cluster species and their atmospheric implications

    NASA Technical Reports Server (NTRS)

    Maerk, T. D.; Peterson, K. I.; Castleman, A. W., Jr.

    1980-01-01

    Recent experimental laboratory observations, with high-pressure mass spectroscopy, have revealed the existence of previously unreported species involving water clustered to sodium dimer ions, and alkali metal hydroxides clustered to alkali metal ions. The important implications of these results concerning the existence of such species are here discussed, as well as how from a practical aspect they confirm the stability of certain cluster species proposed by Ferguson (1978) to explain masses recently detected at upper altitudes using mass spectrometric techniques.

  5. Some approaches to optimal cluster labeling of aerospace imagery

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1980-01-01

    Some approaches are presented to the problem of labeling clusters using information from a given set of labeled and unlabeled aerospace imagery patterns. The assignment of class labels to the clusters is formulated as the determination of the best assignment over all possible ones with respect to some criterion. Cluster labeling is also viewed as the probability of correct labeling with a maximization of likelihood function. Results of the application of these techniques in the processing of remotely sensed multispectral scanner imagery data are presented.

  6. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  7. Surface enhanced Raman spectroscopy (SERS) from a molecule adsorbed on a nanoscale silver particle cluster in a holographic plate

    NASA Astrophysics Data System (ADS)

    Jusinski, Leonard E.; Bahuguna, Ramen; Das, Amrita; Arya, Karamjeet

    2006-02-01

    Surface enhanced Raman spectroscopy has become a viable technique for the detection of single molecules. This highly sensitive technique is due to the very large (up to 14 orders in magnitude) enhancement in the Raman cross section when the molecule is adsorbed on a metal nanoparticle cluster. We report here SERS (Surface Enhanced Raman Spectroscopy) experiments performed by adsorbing analyte molecules on nanoscale silver particle clusters within the gelatin layer of commercially available holographic plates which have been developed and fixed. The Ag particles range in size between 5 - 30 nanometers (nm). Sample preparation was performed by immersing the prepared holographic plate in an analyte solution for a few minutes. We report here the production of SERS signals from Rhodamine 6G (R6G) molecules of nanomolar concentration. These measurements demonstrate a fast, low cost, reproducible technique of producing SERS substrates in a matter of minutes compared to the conventional procedure of preparing Ag clusters from colloidal solutions. SERS active colloidal solutions require up to a full day to prepare. In addition, the preparations of colloidal aggregates are not consistent in shape, contain additional interfering chemicals, and do not generate consistent SERS enhancement. Colloidal solutions require the addition of KCl or NaCl to increase the ionic strength to allow aggregation and cluster formation. We find no need to add KCl or NaCl to create SERS active clusters in the holographic gelatin matrix. These holographic plates, prepared using simple, conventional procedures, can be stored in an inert environment and preserve SERS activity after several weeks subsequent to preparation.

  8. Multiscale Embedded Gene Co-expression Network Analysis.

    PubMed

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  9. Optimising Regionalisation Techniques: Identifying Centres of Endemism in the Extraordinarily Endemic-Rich Cape Floristic Region

    PubMed Central

    Bradshaw, Peter L.; Colville, Jonathan F.; Linder, H. Peter

    2015-01-01

    We used a very large dataset (>40% of all species) from the endemic-rich Cape Floristic Region (CFR) to explore the impact of different weighting techniques, coefficients to calculate similarity among the cells, and clustering approaches on biogeographical regionalisation. The results were used to revise the biogeographical subdivision of the CFR. We show that weighted data (down-weighting widespread species), similarity calculated using Kulczinsky’s second measure, and clustering using UPGMA resulted in the optimal classification. This maximized the number of endemic species, the number of centres recognized, and operational geographic units assigned to centres of endemism (CoEs). We developed a dendrogram branch order cut-off (BOC) method to locate the optimal cut-off points on the dendrogram to define candidate clusters. Kulczinsky’s second measure dendrograms were combined using consensus, identifying areas of conflict which could be due to biotic element overlap or transitional areas. Post-clustering GIS manipulation substantially enhanced the endemic composition and geographic size of candidate CoEs. Although there was broad spatial congruence with previous phytogeographic studies, our techniques allowed for the recovery of additional phytogeographic detail not previously described for the CFR. PMID:26147438

  10. Self-organizing neural networks--an alternative way of cluster analysis in clinical chemistry.

    PubMed

    Reibnegger, G; Wachter, H

    1996-04-15

    Supervised learning schemes have been employed by several workers for training neural networks designed to solve clinical problems. We demonstrate that unsupervised techniques can also produce interesting and meaningful results. Using a data set on the chemical composition of milk from 22 different mammals, we demonstrate that self-organizing feature maps (Kohonen networks) as well as a modified version of error backpropagation technique yield results mimicking conventional cluster analysis. Both techniques are able to project a potentially multi-dimensional input vector onto a two-dimensional space whereby neighborhood relationships remain conserved. Thus, these techniques can be used for reducing dimensionality of complicated data sets and for enhancing comprehensibility of features hidden in the data matrix.

  11. Cluster-state quantum computing enhanced by high-fidelity generalized measurements.

    PubMed

    Biggerstaff, D N; Kaltenbaek, R; Hamel, D R; Weihs, G; Rudolph, T; Resch, K J

    2009-12-11

    We introduce and implement a technique to extend the quantum computational power of cluster states by replacing some projective measurements with generalized quantum measurements (POVMs). As an experimental demonstration we fully realize an arbitrary three-qubit cluster computation by implementing a tunable linear-optical POVM, as well as fast active feedforward, on a two-qubit photonic cluster state. Over 206 different computations, the average output fidelity is 0.9832 ± 0.0002; furthermore the error contribution from our POVM device and feedforward is only of O(10^-3), less than some recent thresholds for fault-tolerant cluster computing.

  12. Semantic Clustering of Search Engine Results

    PubMed Central

    Soliman, Sara Saad; El-Sayed, Maged F.; Hassan, Yasser F.

    2015-01-01

    This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantic similarities among documents and applies an activation spreading technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the proposed solution. The results of the experiments confirmed that the proposed solution achieves remarkable results in terms of precision. PMID:26933673

  13. A hybrid monkey search algorithm for clustering analysis.

    PubMed

    Chen, Xin; Zhou, Yongquan; Luo, Qifang

    2014-01-01

    Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis; experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.

  14. Magnetization reversal mechanism of magnetic tunnel junctions

    NASA Astrophysics Data System (ADS)

    Liu, Cun-Ye; Li, Jian; Wang, Yue; Chen, Jian-Yong; Xu, Qing-Yu; Ni, Gang; Sang, Hai; Du, You-Wei

    2002-01-01

    Using the ion-beam-sputtering technique, we have fabricated Fe/Al2O3/Fe magnetic tunnelling junctions (MTJs). We have observed double-peaked shapes of curves, which have a level summit and a symmetrical feature, showing the magnetoresistance of the junction as a function of applied field. We have measured the tunnel conductance of MTJs which have insulating layers of different thicknesses. We have studied the dependence of the magnetoresistance of MTJs on tunnel conductance. The microstructures of hard- and soft-magnetic layers and interfaces of ferromagnets and insulators were probed. Analysing the influence of MTJ microstructures, including those having clusters and/or granules in magnetic and non-magnetic films, a magnetization reversal mechanism (MRM) is proposed, which suggests that the MRM of tunnelling junctions may be explained by using a group-by-group reversal model of magnetic moments of the mesoscopic particles. We discuss the influence of MTJ microstructures, including those with clusters and/or granules in the ferromagnetic and non-magnetic films, on the MRM.

  15. Connectionist Interaction Information Retrieval.

    ERIC Educational Resources Information Center

    Dominich, Sandor

    2003-01-01

    Discussion of connectionist views for adaptive clustering in information retrieval focuses on a connectionist clustering technique and activation spreading-based information retrieval model using the interaction information retrieval method. Presents theoretical as well as simulation results as regards computational complexity and includes…

  16. Proper motions in the VVV Survey: Results for more than 15 million stars across NGC 6544

    NASA Astrophysics Data System (ADS)

    Contreras Ramos, R.; Zoccali, M.; Rojas, F.; Rojas-Arriagada, A.; Gárate, M.; Huijse, P.; Gran, F.; Soto, M.; Valcarce, A. A. R.; Estévez, P. A.; Minniti, D.

    2017-12-01

    Context. In the last six years, the VISTA Variables in the Vía Láctea (VVV) survey mapped 562 sq. deg. across the bulge and southern disk of the Galaxy. However, a detailed study of these regions, which include 36 globular clusters (GCs) and thousands of open clusters, is by no means an easy challenge. High differential reddening and severe crowding along the line of sight make it very hard to reliably distinguish stars belonging to different populations and/or systems. Aims: The aim of this study is to separate stars that likely belong to the Galactic GC NGC 6544 from its surrounding field by means of proper motion (PM) techniques. Methods: This work was based upon a new astrometric reduction method optimized for images of the VVV survey. Results: PSF-fitting photometry over the six-year baseline of the survey allowed us to obtain a mean precision of 0.51 mas yr^-1, in each PM coordinate, for stars with Ks < 15 mag. In the area studied here, cluster stars separate very well from field stars, down to the main sequence turnoff and below, allowing us to derive for the first time the absolute PM of NGC 6544. Isochrone fitting on the clean and differential reddening corrected cluster color magnitude diagram yields an age of 11-13 Gyr, and metallicity [Fe/H] = -1.5 dex, in agreement with previous studies restricted to the cluster core. We were able to derive the cluster orbit assuming an axisymmetric model of the Galaxy and conclude that NGC 6544 is likely a halo GC. We have not detected tidal tail signatures associated with the cluster, but a remarkable elongation in the galactic center direction has been found. The precision achieved in the PM determination also allows us to separate bulge stars from foreground disk stars, enabling the kinematical selection of bona fide bulge stars across the whole survey area. Conclusions: Kinematical techniques are a fundamental step toward disentangling different stellar populations that overlap in a studied field. Our results show that VVV data is perfectly suitable for this kind of analysis. Based on observations taken with ESO telescopes at Paranal Observatory under programme IDs 179.B-2002.

  17. New Target for an Old Method: Hubble Measures Globular Cluster Parallax

    NASA Astrophysics Data System (ADS)

    Hensley, Kerry

    2018-05-01

    Measuring precise distances to faraway objects has long been a challenge in astrophysics. Now, one of the earliest techniques used to measure the distance to astrophysical objects has been applied to a metal-poor globular cluster for the first time.

    A Classic Technique

    [Figure: An artist's impression of the European Space Agency's Gaia spacecraft. Gaia is on track to map the positions and motions of a billion stars. Credit: ESA]

    Distances to nearby stars are often measured using the parallax technique: tracing the tiny apparent motion of a target star against the background of more distant stars as Earth orbits the Sun. This technique has come a long way since it was first used in the 1800s to measure the distance to stars a few tens of light-years away; with the advent of space observatories like Hipparcos and Gaia, parallax can now be used to map the positions of stars out to thousands of light-years.

    Precise distance measurements aren't only important for setting the scale of the universe, however; they can also help us better understand stellar evolution over the course of cosmic history. Stellar evolution models are often anchored to a reference star cluster, the properties of which must be known precisely. These precise properties can be readily determined for young, nearby open clusters using parallax measurements. But stellar evolution models that anchor on the more-distant, ancient, metal-poor globular clusters have been hampered by the less-precise indirect methods used to measure distance to these faraway clusters, until now.

    [Figure: Top: An image of NGC 6397 overlaid with the area scanned by Hubble (dashed green) and the footprint of the camera (solid green). The blue ellipse represents the parallax motion of a star in the cluster, exaggerated by a factor of ten thousand. Bottom: An example scan from this field. Adapted from Brown et al. 2018]

    New Measurement to an Old Cluster

    Thomas Brown (Space Telescope Science Institute) and collaborators used the Hubble Space Telescope to determine the distance to NGC 6397, one of the nearest metal-poor globular clusters and anchor for one stellar population model. Brown and coauthors used a technique called spatial scanning to greatly broaden the reach of the parallax method.

    Spatial scanning was initially developed as a way to increase the signal-to-noise of exoplanet transit observations, but it has also greatly improved the prospects of astrometry, that is, precisely determining the separations between astronomical objects. In spatial scanning, the telescope moves while the exposure is being taken, spreading the light out across many pixels.

    Unprecedented Precision

    This technique allowed the authors to achieve a precision of 20-100 microarcseconds. From the observed parallax angle of just 0.418 milliarcseconds (for reference, the moon's angular size is about 5 million times larger on the sky!), Brown and collaborators refined the distance to NGC 6397 to 7,795 light-years, with a measurement error of only a few percent.

    Using spatial scanning, Hubble can make parallax measurements of nearby globular clusters, while Gaia has the potential to reach even farther. Looking ahead, the measurement made by Brown and collaborators can be combined with the recently released Gaia data to trim the uncertainty down to just 1%. This highlights the power of space telescopes to make extremely precise measurements of astoundingly large distances, informing our models and helping us measure the universe.

    Citation

    Thomas Brown et al 2018 ApJL 856 L6. doi:10.3847/2041-8213/aab55a
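    The arithmetic behind the quoted distance can be checked with a few lines. This is only a sketch of the textbook relation d [pc] = 1 / p [arcsec]; the function name `parallax_to_distance` and the conversion constant are assumptions introduced here, not taken from the article.

```python
LY_PER_PARSEC = 3.26156  # approximate number of light-years in one parsec


def parallax_to_distance(parallax_mas):
    """Return (parsecs, light_years) for a parallax given in milliarcseconds."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    parsecs = 1000.0 / parallax_mas  # d [pc] = 1 / p [arcsec]
    return parsecs, parsecs * LY_PER_PARSEC


# Parallax quoted in the record for NGC 6397.
pc, ly = parallax_to_distance(0.418)
print(f"{pc:.0f} pc, {ly:.0f} ly")
```

    The unrounded result is roughly 2,392 pc, or about 7,800 light-years. Rounding the distance to 2.39 kpc before converting gives about 7,795 light-years, which is consistent with the figure quoted above.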

  18. Efficient clustering aggregation based on data fragments.

    PubMed

    Wu, Ou; Hu, Weiming; Maybank, Stephen J; Zhu, Mingliang; Li, Bing

    2012-06-01

    Clustering aggregation, known as clustering ensembles, has emerged as a powerful technique for combining different clustering results to obtain a single better clustering. Existing clustering aggregation algorithms are applied directly to data points, in what is referred to as the point-based approach. The algorithms are inefficient if the number of data points is large. We define an efficient approach for clustering aggregation based on data fragments. In this fragment-based approach, a data fragment is any subset of the data that is not split by any of the clustering results. To establish the theoretical bases of the proposed approach, we prove that clustering aggregation can be performed directly on data fragments under two widely used goodness measures for clustering aggregation taken from the literature. Three new clustering aggregation algorithms are described. The experimental results obtained using several public data sets show that the new algorithms have lower computational complexity than three well-known existing point-based clustering aggregation algorithms (Agglomerative, Furthest, and LocalSearch); nevertheless, the new algorithms do not sacrifice the accuracy.
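    The fragment definition in this record lends itself to a small illustration. The sketch below is not the authors' algorithm: it only computes the largest fragments, that is, the groups of points that receive identical labels in every input clustering. The function name `data_fragments` and the sample label lists are hypothetical.

```python
from collections import defaultdict


def data_fragments(clusterings):
    """Group point indices that no input clustering separates.

    clusterings: list of label lists, one per clustering, all the same length.
    Returns the maximal fragments as lists of point indices.
    """
    groups = defaultdict(list)
    for idx, signature in enumerate(zip(*clusterings)):
        # Points with the same label in every clustering share a signature.
        groups[signature].append(idx)
    return list(groups.values())


clusterings = [
    [0, 0, 0, 1, 1, 1],  # hypothetical clustering A
    [0, 0, 1, 1, 2, 2],  # hypothetical clustering B
]
fragments = data_fragments(clusterings)
```

    Here six points collapse into four fragments, so an aggregation step that works on fragments handles four items instead of six. The saving grows when the input clusterings agree on large groups of points.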

  19. Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data

    PubMed Central

    Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.

    2003-01-01

    Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely-used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise on clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. PMID:18629292
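    Of the agreement measures named in this record, the Adjusted Rand Index is simple enough to sketch directly. The code below follows the standard pair-counting formula; the function name `adjusted_rand_index` and the example label lists are assumptions, and nothing here reproduces the authors' evaluation pipeline.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same points."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on
    pairs_total = comb(n, 2)
    # Pairs placed together by both labelings (contingency-table cells).
    sum_cells = sum(comb(c, 2)
                    for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together by each labeling on its own.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / pairs_total  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])     # relabeled copy
partial = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])  # one cluster split
```

    The index is 1 for identical partitions regardless of how the clusters are named, and it is close to 0 when the agreement is no better than chance.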

  20. Membership determination of open clusters based on a spectral clustering method

    NASA Astrophysics Data System (ADS)

    Gao, Xin-Hua

    2018-06-01

    We present a spectral clustering (SC) method aimed at segregating reliable members of open clusters in multi-dimensional space. The SC method is a non-parametric clustering technique that performs cluster division using eigenvectors of the similarity matrix; no prior knowledge of the clusters is required. This method is more flexible in dealing with multi-dimensional data compared to other methods of membership determination. We use this method to segregate the cluster members of five open clusters (Hyades, Coma Ber, Pleiades, Praesepe, and NGC 188) in five-dimensional space; fairly clean cluster members are obtained. We find that the SC method can capture a small number of cluster members (weak signal) from a large number of field stars (heavy noise). Based on these cluster members, we compute the mean proper motions and distances for the Hyades, Coma Ber, Pleiades, and Praesepe clusters, and our results are in general quite consistent with the results derived by other authors. The test results indicate that the SC method is highly suitable for segregating cluster members of open clusters based on high-precision multi-dimensional astrometric data such as Gaia data.
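    The idea of dividing clusters with eigenvectors of a similarity matrix can be sketched in a few lines. This is a simplified two-way split, not the method of the record: it assumes a Gaussian similarity with a hand-picked width `sigma`, uses the symmetrically normalized similarity matrix, and finds its second eigenvector by power iteration. The function name `spectral_bipartition` and the toy two-dimensional points are hypothetical.

```python
import math


def spectral_bipartition(points, sigma=1.0, iters=200):
    """Split points into two groups using the second eigenvector of the
    normalized Gaussian similarity matrix (pure-Python power iteration)."""
    n = len(points)
    # Gaussian similarity matrix W.
    w = [[math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2)) for q in points]
         for p in points]
    deg = [sum(row) for row in w]
    # Symmetric normalization M = D^-1/2 W D^-1/2.
    m = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of M is proportional to sqrt(deg), eigenvalue 1.
    norm = math.sqrt(sum(deg))
    u1 = [math.sqrt(d) / norm for d in deg]
    # Power iteration with u1 projected out converges to the second eigenvector.
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        x = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        proj = sum(a * b for a, b in zip(x, u1))
        x = [a - proj * b for a, b in zip(x, u1)]
        length = math.sqrt(sum(a * a for a in x))
        if length == 0.0:
            raise ValueError("no second eigenvector found from this start")
        x = [a / length for a in x]
    # The sign of each component assigns the point to one of two groups.
    return [0 if a < 0 else 1 for a in x]


# Hypothetical data: two compact groups of four points each.
group_a = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)]
group_b = [(3.0, 3.0), (3.2, 3.0), (3.0, 3.2), (3.2, 3.2)]
labels = spectral_bipartition(group_a + group_b)
```

    Real membership work would use more dimensions (for example positions, proper motions, and parallaxes), more than two groups, and a library eigen-solver; the sketch only shows why no prior model of the cluster is needed.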

  1. Simulating The Dynamical Evolution Of Galaxies In Group And Cluster Environments

    NASA Astrophysics Data System (ADS)

    Vijayaraghavan, Rukmani

    2015-07-01

    Galaxy clusters are harsh environments for their constituent galaxies. A variety of physical processes effective in these dense environments transform gas-rich, spiral, star-forming galaxies to elliptical or spheroidal galaxies with very little gas and therefore minimal star formation. The consequences of these processes are well understood observationally. Galaxies in progressively denser environments have systematically declining star formation rates and gas content. However, a theoretical understanding of where, when, and how these processes act, and the interplay between the various galaxy transformation mechanisms in clusters remains elusive. In this dissertation, I use numerical simulations of cluster mergers as well as galaxies evolving in quiescent environments to develop a theoretical framework to understand some of the physics of galaxy transformation in cluster environments. Galaxies can be transformed in smaller groups before they are accreted by their eventual massive cluster environments, an effect termed `pre-processing'. Galaxy cluster mergers themselves can accelerate many galaxy transformation mechanisms, including tidal and ram pressure stripping of galaxies and galaxy-galaxy collisions and mergers that result in reassemblies of galaxies' stars and gas. Observationally, cluster mergers have distinct velocity and phase-space signatures depending on the observer's line of sight with respect to the merger direction. Using dark matter only as well as hydrodynamic simulations of cluster mergers with random ensembles of particles tagged with galaxy models, I quantify the effects of cluster mergers on galaxy evolution before, during, and after the mergers. Based on my theoretical predictions of the dynamical signatures of these mergers in combination with galaxy transformation signatures, one can observationally identify remnants of mergers and quantify the effect of the environment on galaxies in dense group and cluster environments.
The presence of long-lived, hot X-ray emitting coronae observed in a large fraction of group and cluster galaxies is not well understood. These coronae are not fully stripped by ram pressure and tidal forces that are efficient in these environments. Theoretically, this is a fascinating and challenging problem that involves understanding and simulating the multitude of physical processes in these dense environments that can remove or replenish galaxies' hot coronae. To solve this problem, I have developed and implemented a robust simulation technique where I simulate the evolution of a realistic cluster environment with a population of galaxies and their gas. With this technique, it is possible to isolate and quantify the importance of the various cluster physical processes for coronal survival. To date, I have performed hydrodynamic simulations of galaxies being ram pressure stripped in quiescent group and cluster environments. Using these simulations, I have characterized the physics of ram pressure stripping and investigated the survival of these coronae in the presence of tidal and ram pressure stripping. I have also generated synthetic X-ray observations of these simulated systems to compare with observed coronae. I have also performed magnetohydrodynamic simulations of galaxies evolving in a magnetized intracluster medium plasma to isolate the effect of magnetic fields on coronal evolution, as well as the effect of orbiting galaxies in amplifying magnetic fields. This work is an important step towards understanding the effect of cluster environments on galactic gas, and consequently, their long term evolution and impact on star formation rates.

  2. Insulin Resistance: Regression and Clustering

    PubMed Central

    Yoon, Sangho; Assimes, Themistocles L.; Quertermous, Thomas; Hsiao, Chin-Fu; Chuang, Lee-Ming; Hwu, Chii-Min; Rajaratnam, Bala; Olshen, Richard A.

    2014-01-01

    In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with “main effects” is not satisfactory, but prediction that includes interactions may be. PMID:24887437

  3. Critical Analysis of Cluster Models and Exchange-Correlation Functionals for Calculating Magnetic Shielding in Molecular Solids.

    PubMed

    Holmes, Sean T; Iuliucci, Robbie J; Mueller, Karl T; Dybowski, Cecil

    2015-11-10

    Calculations of the principal components of magnetic-shielding tensors in crystalline solids require the inclusion of the effects of lattice structure on the local electronic environment to obtain significant agreement with experimental NMR measurements. We assess periodic (GIPAW) and GIAO/symmetry-adapted cluster (SAC) models for computing magnetic-shielding tensors by calculations on a test set containing 72 insulating molecular solids, with a total of 393 principal components of chemical-shift tensors from 13C, 15N, 19F, and 31P sites. When clusters are carefully designed to represent the local solid-state environment and when periodic calculations include sufficient variability, both methods predict magnetic-shielding tensors that agree well with experimental chemical-shift values, demonstrating the correspondence of the two computational techniques. At the basis-set limit, we find that the small differences in the computed values have no statistical significance for three of the four nuclides considered. Subsequently, we explore the effects of additional DFT methods available only with the GIAO/cluster approach, particularly the use of hybrid-GGA functionals, meta-GGA functionals, and hybrid meta-GGA functionals that demonstrate improved agreement in calculations on symmetry-adapted clusters. We demonstrate that meta-GGA functionals improve computed NMR parameters over those obtained by GGA functionals in all cases, and that hybrid functionals improve computed results over the respective pure DFT functional for all nuclides except 15N.

  4. Are clusters important in understanding the mechanisms in atmospheric pressure ionization? Part 1: Reagent ion generation and chemical control of ion populations.

    PubMed

    Klee, Sonja; Derpmann, Valerie; Wißdorf, Walter; Klopotowski, Sebastian; Kersten, Hendrik; Brockmann, Klaus J; Benter, Thorsten; Albrecht, Sascha; Bruins, Andries P; Dousty, Faezeh; Kauppila, Tiina J; Kostiainen, Risto; O'Brien, Rob; Robb, Damon B; Syage, Jack A

    2014-08-01

    It is well documented since the early days of the development of atmospheric pressure ionization methods, which operate in the gas phase, that cluster ions are ubiquitous. This holds true for atmospheric pressure chemical ionization, as well as for more recent techniques, such as atmospheric pressure photoionization, direct analysis in real time, and many more. In fact, it is well established that cluster ions are the primary carriers of the net charge generated. Nevertheless, cluster ion chemistry has only been sporadically included in the numerous proposed ionization mechanisms leading to charged target analytes, which are often protonated molecules. This paper series, consisting of two parts, attempts to highlight the role of cluster ion chemistry with regard to the generation of analyte ions. In addition, the impact of the changing reaction matrix and the non-thermal collisions of ions en route from the atmospheric pressure ion source to the high vacuum analyzer region are discussed. This work addresses such issues as extent of protonation versus deuteration, the extent of analyte fragmentation, as well as highly variable ionization efficiencies, among others. In Part 1, the nature of the reagent ion generation is examined, as well as the extent of thermodynamic versus kinetic control of the resulting ion population entering the analyzer region.

  5. Clustering of Cyclodextrin-Functionalized Microbeads by an Amphiphilic Biopolymer: Real-Time Observation of Structures Resembling Blood Clots.

    PubMed

    Arya, Chandamany; Saez Cabesas, Camila A; Huang, Hubert; Raghavan, Srinivasa R

    2017-10-25

    Colloidal particles can be induced to cluster by adding polymers in a process called bridging flocculation. For bridging to occur, the polymer must bind strongly to the surfaces of adjacent particles, such as via electrostatic interactions. Here, we introduce a new system where bridging occurs due to specific interactions between the side chains of an amphiphilic polymer and supramolecules on the particle surface. The polymer is a hydrophobically modified chitosan (hmC) while the particles are uniform polymeric microbeads (∼160 μm in diameter) made by a microfluidic technique and functionalized on their surface by α-cyclodextrins (CDs). The CDs have hydrophobic binding pockets that can capture the n-alkyl hydrophobes present along the hmC chains. Clustering of CD-coated microbeads in water by hmC is visualized in real time using optical microscopy. Interestingly, the clustering follows two distinct stages: first, the microbeads are bridged into clusters by hmC chains, which occurs by the interaction of individual chains with the CDs on adjacent particles. Thereafter, additional hmC from the solution adsorbs onto the surfaces of the microbeads and an hmC "mesh" grows around the clusters. This growing nanostructured mesh can trap surrounding microsized objects and sequester them within the overall cluster. Such clustering is reminiscent of blood clotting where blood platelets initially cluster at a wound site, whereupon they induce growth of a protein (fibrin) mesh around the clusters, which entraps other passive cells. Clustering does not occur with the native chitosan (lacking hydrophobes) or with the bare particles (lacking CDs); these results confirm that the clustering is indeed due to hydrophobic interactions between the hmC and the CDs. Microbead clustering via amphiphilic biopolymers could be applicable in embolization, which is a surgical technique used to block blood flow to a particular area of the body, or in agglutination assays.

  6. Managing Transmission of Carbapenem-Resistant Enterobacteriaceae in Healthcare Settings: A View From the Trenches

    PubMed Central

    Palmore, Tara N.; Henderson, David K.

    2013-01-01

    In 2011, the National Institutes of Health Clinical Center experienced a cluster of infection and colonization caused by carbapenem-resistant Klebsiella pneumoniae among profoundly immunocompromised inpatients. This manuscript describes the approach and interventions that were implemented in an attempt to curtail the cluster. Interventions employed included engagement of all stakeholders involved in care of at-risk patients; detailed and frequent communication with hospital staff about issues relating to the outbreak; aggressive microbial surveillance; use of techniques that facilitate rapid identification of resistant organisms; rapid characterization of resistance mechanisms; whole-genome sequencing of outbreak isolates to characterize the spread and to investigate mechanisms of healthcare-associated spread; implementation of enhanced contact precautions for all infected or colonized patients; geographic and personnel cohorting; daily chlorhexidine gluconate baths; dedicating equipment to be used solely for cohorted patients and aggressive decontamination of equipment that had to be reused on uncohorted patients; monitoring adherence to infection control precautions, including unwavering attention to adherence to appropriate hand hygiene procedures; and attention to the details of environmental decontamination. In addition, the manuscript discusses some of the challenges associated with managing such an event, as well as a few of the unanticipated consequences associated with the aftermath of the case cluster. PMID:23934166

  7. Clustering-Based Ensemble Learning for Activity Recognition in Smart Homes

    PubMed Central

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-01-01

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks. PMID:25014095
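
    The classification step described in this abstract (one collection of clusters per feature subset, with a new instance assigned to its closest cluster in each collection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: each collection holds a single centroid per activity label rather than a real clustering, the per-collection decisions are combined by majority vote, and the sensor data, labels, and function names are hypothetical.

```python
from collections import Counter
import math

def centroids_by_label(X, y, features):
    """One centroid per activity label, using only the given feature indices.
    (Assumption: a single centroid stands in for the clusters of each activity.)"""
    sums, counts = {}, Counter()
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += row[f]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def fit_ensemble(X, y, feature_subsets):
    """Build one collection of centroids per feature subset."""
    return [(fs, centroids_by_label(X, y, fs)) for fs in feature_subsets]

def predict(ensemble, x):
    """Assign x to its closest centroid in each collection, then majority-vote."""
    votes = Counter()
    for features, cents in ensemble:
        proj = [x[f] for f in features]
        nearest = min(cents, key=lambda lab: math.dist(proj, cents[lab]))
        votes[nearest] += 1
    return votes.most_common(1)[0][0]

# Hypothetical binary sensor readings (three sensors) for two activities.
X = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]
y = ["cook", "cook", "sleep", "sleep"]
ensemble = fit_ensemble(X, y, feature_subsets=[(0, 1), (1, 2), (0, 2)])
print(predict(ensemble, [1, 0, 0]))
print(predict(ensemble, [0, 1, 1]))
```

    Building each collection on a different feature subset is what gives the ensemble its diversity; the vote then smooths over subsets in which a given activity is poorly separated.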

  8. Clustering-based ensemble learning for activity recognition in smart homes.

    PubMed

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-07-10

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.

  9. Prediction of strontium bromide laser efficiency using cluster and decision tree analysis

    NASA Astrophysics Data System (ADS)

    Iliev, Iliycho; Gocheva-Ilieva, Snezhana; Kulin, Chavdar

    2018-01-01

    The subject of investigation is a new high-powered strontium bromide (SrBr2) vapor laser emitting in a multiline region of wavelengths. The laser is an alternative to atomic strontium lasers and free-electron lasers, especially at the 6.45 μm line, which is used in surgery for medical processing of biological tissues and bones with minimal damage. In this paper, the experimental data from measurements of operational and output characteristics of the laser are statistically processed by means of cluster analysis and tree-based regression techniques. The aim is to extract from the available data the more important relationships and dependences that influence the increase of the overall laser efficiency. A set of cluster models is constructed and analyzed. Using different cluster methods, it is shown that the seven investigated operational characteristics (laser tube diameter, length, supplied electrical power, and others) and laser efficiency are combined in 2 clusters. From regression tree models built with the Classification and Regression Trees (CART) technique, dependences are obtained that predict the values of efficiency, and especially the maximum efficiency, with over 95% accuracy.

  10. Analysis of the effects of the global financial crisis on the Turkish economy, using hierarchical methods

    NASA Astrophysics Data System (ADS)

    Kantar, Ersin; Keskin, Mustafa; Deviren, Bayram

    2012-04-01

    We have analyzed the topology of 50 important Turkish companies for the period 2006-2010 using the concept of hierarchical methods (the minimal spanning tree (MST) and hierarchical tree (HT)). We investigated the statistical reliability of links between companies in the MST by using the bootstrap technique. We also used the average linkage cluster analysis (ALCA) technique to observe the cluster structures much better. The MST and HT are known as useful tools to perceive and detect global structure, taxonomy, and hierarchy in financial data. We obtained four clusters of companies according to their proximity. We also observed that the Banks and Holdings cluster always forms in the centre of the MSTs for the periods 2006-2007, 2008, and 2009-2010. The clusters match nicely with their common production activities or their strong interrelationship. The effects of the Automobile sector increased after the global financial crisis due to the temporary incentives provided by the Turkish government. We find that Turkish companies were not very affected by the global financial crisis.
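
    A minimal sketch of how a minimal spanning tree can be built from pairwise correlations between companies follows. The abstract does not state its distance measure; the mapping d = sqrt(2(1 - rho)) used here is a common choice in this literature and is an assumption, as are the company names and the correlation values, which are invented for illustration. Kruskal's algorithm is used only because it is short to write down.

```python
import math

def correlation_distance(rho):
    """Map a correlation coefficient to a distance (assumed metric)."""
    return math.sqrt(2.0 * (1.0 - rho))

def minimal_spanning_tree(names, corr):
    """Kruskal's algorithm with a small union-find over company indices."""
    n = len(names)
    edges = sorted(
        (correlation_distance(corr[i][j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not close a cycle
            parent[ri] = rj
            tree.append((names[i], names[j], round(d, 3)))
    return tree

# Hypothetical companies and correlations, not data from the study.
names = ["BankA", "BankB", "AutoC", "AutoD"]
corr = [
    [1.0, 0.9, 0.3, 0.2],
    [0.9, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
tree = minimal_spanning_tree(names, corr)
for a, b, d in tree:
    print(a, b, d)
```

    Strongly correlated pairs receive short distances, so they are linked first and sector groupings show up as branches of the tree.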

  11. System and method for merging clusters of wireless nodes in a wireless network

    DOEpatents

    Budampati, Ramakrishna S [Maple Grove, MN; Gonia, Patrick S [Maplewood, MN; Kolavennu, Soumitri N [Blaine, MN; Mahasenan, Arun V [Kerala, IN

    2012-05-29

    A system includes a first cluster having multiple first wireless nodes. One first node is configured to act as a first cluster master, and other first nodes are configured to receive time synchronization information provided by the first cluster master. The system also includes a second cluster having one or more second wireless nodes. One second node is configured to act as a second cluster master, and any other second nodes are configured to receive time synchronization information provided by the second cluster master. The system further includes a manager configured to merge the clusters into a combined cluster. One of the nodes is configured to act as a single cluster master for the combined cluster, and the other nodes are configured to receive time synchronization information provided by the single cluster master.

  12. Understanding the Support Needs of People with Intellectual and Related Developmental Disabilities through Cluster Analysis and Factor Analysis of Statewide Data

    ERIC Educational Resources Information Center

    Viriyangkura, Yuwadee

    2014-01-01

    Through a secondary analysis of statewide data from Colorado, people with intellectual and related developmental disabilities (ID/DD) were classified into five clusters based on their support needs characteristics using cluster analysis techniques. Prior latent factor models of support needs in the field of ID/DD were examined to investigate the…

  13. A method of using cluster analysis to study statistical dependence in multivariate data

    NASA Technical Reports Server (NTRS)

    Borucki, W. J.; Card, D. H.; Lyle, G. C.

    1975-01-01

    A technique is presented that uses both cluster analysis and a Monte Carlo significance test of clusters to discover associations between variables in multidimensional data. The method is applied to an example of a noisy function in three-dimensional space, to a sample from a mixture of three bivariate normal distributions, and to the well-known Fisher's Iris data.
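
    The idea of testing clusters for significance by Monte Carlo simulation can be sketched as follows. This is not the authors' procedure, which the abstract does not spell out. It swaps in a simple permutation test: one variable is shuffled to destroy any dependence, and the mean nearest-neighbour distance serves as a stand-in clustering statistic. The data, the statistic, and the function names are all assumptions for illustration.

```python
import math
import random

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour (smaller = tighter)."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

def monte_carlo_p_value(xs, ys, n_sim=199, seed=0):
    """Fraction of shuffled data sets that look at least as clustered as the data."""
    rng = random.Random(seed)
    observed = mean_nn_distance(list(zip(xs, ys)))
    hits = 0
    for _ in range(n_sim):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # breaks any dependence between x and y
        if mean_nn_distance(list(zip(xs, shuffled))) <= observed:
            hits += 1
    return (1 + hits) / (1 + n_sim)

# Hypothetical, perfectly dependent variables: y equals x.
xs = [i / 19 for i in range(20)]
ys = xs[:]
p_value = monte_carlo_p_value(xs, ys)
print(p_value)
```

    A small p-value means that independent variables rarely produce points as tightly grouped as the observed ones, which is the sense in which clustering reveals an association between the variables.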

  14. Biostatistics Series Module 10: Brief Overview of Multivariate Methods.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2017-01-01

    Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation-intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with the wider availability and increasing sophistication of statistical software, and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.

  15. Collaborative Filtering Based on Sequential Extraction of User-Item Clusters

    NASA Astrophysics Data System (ADS)

    Honda, Katsuhiro; Notsu, Akira; Ichihashi, Hidetomo

    Collaborative filtering is a computational realization of “word-of-mouth” in a network community, in which the items preferred by “neighbors” are recommended. This paper proposes a new item-selection model for extracting user-item clusters from rectangular relation matrices, in which mutual relations between users and items are denoted in an alternative process of “liking or not”. A technique for sequential co-cluster extraction from rectangular relational data is given by combining the structural balancing-based user-item clustering method with a sequential fuzzy cluster extraction approach. Then, the technique is applied to the collaborative filtering problem, in which some items may be shared by several user clusters.

  16. Quantum chemical calculation of the equilibrium structures of small metal atom clusters

    NASA Technical Reports Server (NTRS)

    Kahn, L. R.

    1982-01-01

    Metal atom clusters are studied based on the application of ab initio quantum mechanical approaches. Because these large 'molecular' systems pose special practical computational problems in the application of the quantum mechanical methods, there is a special need to find simplifying techniques that do not compromise the reliability of the calculations. Research is therefore directed towards various aspects of the implementation of the effective core potential technique for the removal of the metal atom core electrons from the calculations.

  17. MSFC Skylab program engineering and integration

    NASA Technical Reports Server (NTRS)

    1974-01-01

    A technical history and managerial critique of the MSFC role in the Skylab program is presented. The George C. Marshall Space Flight Center had primary hardware development responsibility for the Saturn Workshop Modules and many of the designated experiments in addition to the system integration responsibility for the entire Skylab Orbital Cluster. The report also includes recommendations and conclusions applicable to hardware design, test program philosophy and performance, and program management techniques with potential application to future programs.

  18. Semiconductor nanocrystals covalently bound to solid inorganic surfaces using self-assembled monolayers

    DOEpatents

    Alivisatos, A.P.; Colvin, V.L.

    1998-05-12

    Methods are described for attaching semiconductor nanocrystals to solid inorganic surfaces, using self-assembled bifunctional organic monolayers as bridge compounds. Two different techniques are presented. One relies on the formation of self-assembled monolayers on these surfaces. When exposed to solutions of nanocrystals, these bridge compounds bind the crystals and anchor them to the surface. The second technique attaches nanocrystals already coated with bridge compounds to the surfaces. Analyses indicate the presence of quantum confined clusters on the surfaces at the nanolayer level. These materials allow electron spectroscopies to be completed on condensed phase clusters, and represent a first step towards synthesis of an organized assembly of clusters. These new products are also disclosed. 10 figs.

  19. Formation of multiply charged ions from large molecules using massive-cluster impact.

    PubMed

    Mahoney, J F; Cornett, D S; Lee, T D

    1994-05-01

    Massive-cluster impact is demonstrated to be an effective ionization technique for the mass analysis of proteins as large as 17 kDa. The design of the cluster source permits coupling to both magnetic-sector and quadrupole mass spectrometers. Mass spectra are characterized by the almost total absence of chemical background and a predominance of multiply charged ions formed from 100% glycerol matrix. The number of charge states produced by the technique is observed to range from +3 to +9 for chicken egg lysozyme (14,310 Da). The lower m/z values provided by higher charge states increase the effective mass range of analyses performed with conventional ionization by fast-atom bombardment or liquid secondary ion mass spectrometry.

  20. Visual cluster analysis and pattern recognition template and methods

    DOEpatents

    Osbourn, G.C.; Martinez, R.F.

    1999-05-04

    A method of clustering using a novel template to define a region of influence is disclosed. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques. 30 figs.

  1. Genome engineering for microbial natural product discovery.

    PubMed

    Choi, Si-Sun; Katsuyama, Yohei; Bai, Linquan; Deng, Zixin; Ohnishi, Yasuo; Kim, Eung-Soo

    2018-03-03

    The discovery and development of microbial natural products (MNPs) have played pivotal roles in the fields of human medicine and its related biotechnology sectors over the past several decades. The post-genomic era has witnessed the development of microbial genome mining approaches to isolate previously unsuspected MNP biosynthetic gene clusters (BGCs) hidden in the genome, followed by various BGC awakening techniques to visualize compound production. Additional microbial genome engineering techniques have allowed higher MNP production titers, which could complement a traditional culture-based MNP chasing approach. Here, we describe recent developments in the MNP research paradigm, including microbial genome mining, NP BGC activation, and NP overproducing cell factory design. Copyright © 2018 Elsevier Ltd. All rights reserved.

  2. A Multi-Hop Energy Neutral Clustering Algorithm for Maximizing Network Information Gathering in Energy Harvesting Wireless Sensor Networks.

    PubMed

    Yang, Liu; Lu, Yinzhi; Zhong, Yuanchang; Wu, Xuegang; Yang, Simon X

    2015-12-26

    Energy resource limitation is a severe problem in traditional wireless sensor networks (WSNs) because it restricts the lifetime of the network. Recently, the emergence of energy harvesting techniques has brought with it the expectation of overcoming this problem. In particular, it is possible for a sensor node with energy harvesting abilities to work perpetually in an Energy Neutral state. In this paper, a Multi-hop Energy Neutral Clustering (MENC) algorithm is proposed to construct the optimal multi-hop clustering architecture in energy harvesting WSNs, with the goal of achieving perpetual network operation. All cluster heads (CHs) in the network act as routers to transmit data to the base station (BS) cooperatively by a multi-hop communication method. In addition, by analyzing the energy consumption of intra- and inter-cluster data transmission, we give the energy neutrality constraints. Under these constraints, every sensor node can work in an energy neutral state, which in turn provides perpetual network operation. Furthermore, the minimum network data transmission cycle is mathematically derived using convex optimization techniques while the network information gathering is maximal. Simulation results show that our protocol can achieve perpetual network operation, so that consistent data delivery is guaranteed. In addition, substantial improvements in network throughput are also achieved as compared to the famous traditional clustering protocol LEACH and recent energy harvesting aware clustering protocols.

  3. Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

    NASA Astrophysics Data System (ADS)

    Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G.; Hummer, Gerhard

    2014-09-01

    Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space.
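
    The step of describing the dynamics as a discrete-time random walk between clusters can be illustrated by estimating a transition matrix from a trajectory of cluster labels. This is only a minimal sketch of that one step: it counts transitions at a fixed lag and row-normalizes, and it does not include the diffusion-map reduction, the fuzzy C-means clustering, or the transition-based state assignment. The trajectory and the function name are hypothetical.

```python
def transition_matrix(labels, n_states, lag=1):
    """Row-normalized transition counts between cluster labels at the given lag."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(labels, labels[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States never visited as a starting point get an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical trajectory of cluster labels (e.g. 0 = coil-like, 1 = helix-like).
trajectory = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
T = transition_matrix(trajectory, n_states=2)
for row in T:
    print(row)
```

    Each row of the matrix gives the estimated probabilities of moving from one cluster to every cluster after one lag time, so each visited row sums to one.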

  4. PuReD-MCL: a graph-based PubMed document clustering methodology.

    PubMed

    Theodosiou, T; Darzentas, N; Angelis, L; Ouzounis, C A

    2008-09-01

    Biomedical literature is the principal repository of biomedical knowledge, with PubMed being the most complete database collecting, organizing and analyzing such textual knowledge. There are numerous efforts that attempt to exploit this information by using text mining and machine learning techniques. We developed a novel approach, called PuReD-MCL (Pubmed Related Documents-MCL), which is based on the graph clustering algorithm MCL and relevant resources from PubMed. PuReD-MCL avoids using natural language processing (NLP) techniques directly; instead, it takes advantage of existing resources, available from PubMed. PuReD-MCL then clusters documents efficiently using the MCL graph clustering algorithm, which is based on graph flow simulation. This process allows users to analyse the results by highlighting important clues, and finally to visualize the clusters and all relevant information using an interactive graph layout algorithm, for instance BioLayout Express 3D. The methodology was applied to two different datasets, previously used for the validation of the document clustering tool TextQuest. The first dataset involves the organisms Escherichia coli and yeast, whereas the second is related to Drosophila development. PuReD-MCL successfully reproduces the annotated results obtained from TextQuest, while at the same time providing additional insights into the clusters and the corresponding documents. Source code in Perl and R is available from http://tartara.csd.auth.gr/~theodos/
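
    The flow simulation behind MCL alternates two operations on a column-stochastic matrix: expansion (squaring the matrix) and inflation (raising entries to a power and renormalizing each column). The sketch below is a toy, dense-matrix version written for illustration; it is not the PuReD-MCL code or the reference MCL implementation. The example graph, the inflation value of 2, and the convergence threshold are assumptions.

```python
def normalize_columns(m):
    """Scale every column so that it sums to one."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl(adjacency, inflation=2.0, max_iter=100, eps=1e-6):
    """Toy Markov clustering: returns clusters as sorted lists of node indices."""
    n = len(adjacency)
    # Add self-loops, then turn the adjacency matrix into a column-stochastic one.
    m = normalize_columns(
        [[1.0 if i == j else float(adjacency[i][j]) for j in range(n)] for i in range(n)]
    )
    for _ in range(max_iter):
        # Expansion: flow spreads along paths of length two.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: strong flows are boosted, weak flows are damped.
        inflated = normalize_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < eps:
            break
    # Each row that still carries flow is an attractor; its columns form a cluster.
    found = {frozenset(j for j in range(n) if m[i][j] > eps) for i in range(n)}
    return sorted(sorted(c) for c in found if c)

# Hypothetical document-similarity graph: two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adjacency = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adjacency[a][b] = adjacency[b][a] = 1
clusters = mcl(adjacency)
print(clusters)
```

    Higher inflation values tend to give more and smaller clusters, which is the main tuning knob of the algorithm.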

  5. Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G. (yannis@princeton.edu); Hummer, Gerhard (gerhard.hummer@biophys.mpg.de)

    Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small but nontrivial differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space.

  6. Applying Machine Learning to Star Cluster Classification

    NASA Astrophysics Data System (ADS)

    Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar

    2016-01-01

    Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by three to five human classifiers and assigned a category. Classification by humans is labor-intensive and time-consuming; it therefore creates a bottleneck and substantially slows down progress in star cluster research. We seek to automate the process of labeling star clusters (the second step) by applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data are HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate the performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results, and suggest several directions for further improvement.

  7. A Multi-Hop Energy Neutral Clustering Algorithm for Maximizing Network Information Gathering in Energy Harvesting Wireless Sensor Networks

    PubMed Central

    Yang, Liu; Lu, Yinzhi; Zhong, Yuanchang; Wu, Xuegang; Yang, Simon X.

    2015-01-01

    Energy resource limitation is a severe problem in traditional wireless sensor networks (WSNs) because it restricts the lifetime of the network. Recently, the emergence of energy harvesting techniques has raised the expectation of overcoming this problem. In particular, it is possible for a sensor node with energy harvesting abilities to work perpetually in an Energy Neutral state. In this paper, a Multi-hop Energy Neutral Clustering (MENC) algorithm is proposed to construct the optimal multi-hop clustering architecture in energy harvesting WSNs, with the goal of achieving perpetual network operation. All cluster heads (CHs) in the network act as routers to transmit data to the base station (BS) cooperatively by a multi-hop communication method. In addition, by analyzing the energy consumption of intra- and inter-cluster data transmission, we give the energy neutrality constraints. Under these constraints, every sensor node can work in an energy neutral state, which in turn provides perpetual network operation. Furthermore, the minimum network data transmission cycle is mathematically derived using convex optimization techniques while the network information gathering is maximal. Simulation results show that our protocol can achieve perpetual network operation, so that consistent data delivery is guaranteed. In addition, substantial improvements in network throughput are also achieved as compared to the well-known traditional clustering protocol LEACH and recent energy harvesting aware clustering protocols. PMID:26712764

  8. Receptive field optimisation and supervision of a fuzzy spiking neural network.

    PubMed

    Glackin, Cornelius; Maguire, Liam; McDaid, Liam; Sayers, Heather

    2011-04-01

    This paper presents a supervised training algorithm that implements fuzzy reasoning on a spiking neural network. Neuron selectivity is facilitated using receptive fields that enable individual neurons to be responsive to certain spike train firing rates and behave in a similar manner as fuzzy membership functions. The connectivity of the hidden and output layers in the fuzzy spiking neural network (FSNN) is representative of a fuzzy rule base. Fuzzy C-Means clustering is utilised to produce clusters that represent the antecedent part of the fuzzy rule base that aid classification of the feature data. Suitable cluster widths are determined using two strategies; subjective thresholding and evolutionary thresholding respectively. The former technique typically results in compact solutions in terms of the number of neurons, and is shown to be particularly suited to small data sets. In the latter technique a pool of cluster candidates is generated using Fuzzy C-Means clustering and then a genetic algorithm is employed to select the most suitable clusters and to specify cluster widths. In both scenarios, the network is supervised but learning only occurs locally as in the biological case. The advantages and disadvantages of the network topology for the Fisher Iris and Wisconsin Breast Cancer benchmark classification tasks are demonstrated and directions of current and future work are discussed. Copyright © 2010 Elsevier Ltd. All rights reserved.
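
    The Fuzzy C-Means step mentioned in the abstract above can be illustrated with a minimal sketch. This is the textbook algorithm only, not the authors' FSNN code; the function name, the fuzzifier m = 2, the iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Textbook fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships: one row per sample, each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Sample-to-center distances; the epsilon avoids division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# Hypothetical toy data: two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, n_clusters=2)
```

    Each row of U holds graded memberships rather than a hard label, which is the property the abstract relates to fuzzy membership functions.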

  9. Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

    PubMed Central

    Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G.; Hummer, Gerhard

    2014-01-01

    Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space. PMID:25240340
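
    The Markov state model described above, a discrete-time random walk between clusters, can be sketched by counting transitions in a trajectory of cluster labels. This sketch assumes hard labels and a hypothetical toy trajectory; it does not reproduce the fuzzy, transition-based state assignment that the abstract reports.

```python
import numpy as np

def transition_matrix(labels, n_states, lag=1):
    """Row-stochastic transition matrix estimated from a discrete trajectory."""
    labels = np.asarray(labels)
    counts = np.zeros((n_states, n_states))
    # Count transitions i -> j between frames separated by `lag` steps.
    np.add.at(counts, (labels[:-lag], labels[lag:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # A state that never appears as a starting point keeps an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical cluster label for each trajectory frame.
traj = [0, 0, 1, 1, 1, 0, 2, 2, 0, 0]
T = transition_matrix(traj, n_states=3)
```

    Increasing `lag` is the usual way to probe how Markovian the projected dynamics are, since short-time memory effects show up as a lag dependence of the estimated matrix.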

  10. Mapping Emission from Clusters of CdSe/ZnS Nanoparticles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ryan, Duncan P.; Goodwin, Peter M.; Sheehan, Chris J.

    In this paper, we have carried out correlated super-resolution and SEM imaging studies of clusters of CdSe/ZnS nanoparticles containing up to ten particles to explore how the fluorescence behavior of these clusters depends on the number of particles, the specific cluster geometry, the shell thickness, and the technique used to produce the clusters. The total emission yield was less than proportional to the number of particles in the clusters for both thick and thin shells. With super-resolution imaging, the emission center of the cluster could be spatially resolved at distance scales on the order of the cluster size. The intrinsic fluorescence intermittency of the nanoparticles altered the emission distribution across the cluster, which enabled the identification of relative emission intensities of individual particles or small groups of particles within the cluster. Finally, for clusters undergoing interparticle energy transfer, donor/acceptor pairs and regions where energy was funneled could be identified.

  11. Mapping Emission from Clusters of CdSe/ZnS Nanoparticles

    DOE PAGES

    Ryan, Duncan P.; Goodwin, Peter M.; Sheehan, Chris J.; ...

    2018-01-24

    In this paper, we have carried out correlated super-resolution and SEM imaging studies of clusters of CdSe/ZnS nanoparticles containing up to ten particles to explore how the fluorescence behavior of these clusters depends on the number of particles, the specific cluster geometry, the shell thickness, and the technique used to produce the clusters. The total emission yield was less than proportional to the number of particles in the clusters for both thick and thin shells. With super-resolution imaging, the emission center of the cluster could be spatially resolved at distance scales on the order of the cluster size. The intrinsic fluorescence intermittency of the nanoparticles altered the emission distribution across the cluster, which enabled the identification of relative emission intensities of individual particles or small groups of particles within the cluster. Finally, for clusters undergoing interparticle energy transfer, donor/acceptor pairs and regions where energy was funneled could be identified.

  12. Physicochemical study of natural fractionated biocolloid by asymmetric flow field-flow fractionation in tandem with various complementary techniques using biologically synthesized silver nanocomposites.

    PubMed

    Railean-Plugaru, Viorica; Pomastowski, Pawel; Kowalkowski, Tomasz; Sprynskyy, Myroslav; Buszewski, Boguslaw

    2018-04-01

    Asymmetric flow field-flow fractionation coupled with use of ultraviolet-visible, multiangle light scattering (MALLS), and dynamic light scattering (DLS) detectors was used for separation and characterization of biologically synthesized silver composites in two liquid compositions. Moreover, to supplement the DLS/MALLS information, various complementary techniques such as transmission electron spectroscopy, Fourier transform infrared spectroscopy, and matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) were used. The hydrodynamic diameter and the radius of gyration of silver composites were slightly larger than the sizes obtained by transmission electron microscopy (TEM). Moreover, the TEM results revealed the presence of silver clusters and even several morphologies, including multitwinned. Additionally, MALDI-TOF MS examination showed that the particles have an uncommon cluster structure. It can be described as being composed of two or more silver clusters. The organic surface of the nanoparticles can modify their dispersion. We demonstrated that the variation of the silver surface coating directly influenced the migration rate of biologically synthesized silver composites. Moreover, this study proves that the fractionation mechanism of silver biocolloids relies not only on the particle size but also on the type and mass of the surface coatings. Because silver nanoparticles typically have size-dependent cytotoxicity, this behavior is particularly relevant for biomedical applications. [Graphical abstract: workflow for asymmetric flow field-flow fractionation of natural biologically synthesized silver nanocomposites.]

  13. Particle dynamics during nanoparticle synthesis by laser ablation in a background gas

    NASA Astrophysics Data System (ADS)

    Nakata, Yoshiki; Muramoto, Junichi; Okada, Tatsuo; Maeda, Mitsuo

    2002-02-01

    Particle dynamics during Si nanoparticle synthesis in a laser-ablation plume in different background gases were investigated by laser-spectroscopic imaging techniques. Two-dimensional laser-induced fluorescence and ultraviolet Rayleigh scattering techniques were used to visualize the spatial distribution of the Si atoms and of the grown nanoparticles, respectively. We have developed a visualization technique called re-decomposition laser-induced fluorescence to observe small nanoparticles (hereafter called clusters), which are difficult to observe by conventional imaging techniques. In this article, the whole process of nanoparticle synthesis in different background gases of He, Ne, Ar, N2 and O2 was investigated by these techniques. In He, Ne, Ar and N2 background gases at 10 Torr, the clustering of the Si atoms started 200, 250, 300 and 800 μs after ablation, respectively. The growth rate of the clusters in He background gas was much larger than that in the other gases. The spatial distributions of the Si nanoparticles were mushroom-like in He, N2 and O2, and column-like in Ne and Ar. It is thought that the difference in distribution was caused by differences in the flow characteristics of the background gases, which would imply that the viscosity of the background gas is one of the main governing parameters.

  14. Ab initio Bogoliubov coupled cluster theory for open-shell nuclei

    DOE PAGES

    Signoracci, Angelo J.; Duguet, Thomas; Hagen, Gaute; ...

    2015-06-29

    Background: Ab initio many-body methods have been developed over the past 10 yr to address closed-shell nuclei up to mass A≈130 on the basis of realistic two- and three-nucleon interactions. A current frontier relates to the extension of those many-body methods to the description of open-shell nuclei. Several routes to address open-shell nuclei are currently under investigation, including ideas that exploit spontaneous symmetry breaking. Purpose: Singly open-shell nuclei can be efficiently described via the sole breaking of U(1) gauge symmetry associated with particle-number conservation as a way to account for their superfluid character. While this route was recently followed within the framework of self-consistent Green's function theory, the goal of the present work is to formulate a similar extension within the framework of coupled cluster theory. Methods: We formulate and apply Bogoliubov coupled cluster (BCC) theory, which consists of representing the exact ground-state wave function of the system as the exponential of a quasiparticle excitation cluster operator acting on a Bogoliubov reference state. Equations for the ground-state energy and the cluster amplitudes are derived at the singles and doubles level (BCCSD) both algebraically and diagrammatically. The formalism includes three-nucleon forces at the normal-ordered two-body level. The first BCC code is implemented in m scheme, which will permit the treatment of doubly open-shell nuclei via the further breaking of SU(2) symmetry associated with angular momentum conservation. Results: Proof-of-principle calculations in an N_max = 6 spherical harmonic oscillator basis for ¹⁶O, ¹⁸O and ¹⁸Ne in the BCCD approximation are in good agreement with standard coupled cluster results with the same chiral two-nucleon interaction, while ²⁰O and ²⁰Mg display underbinding relative to experiment. The breaking of U(1) symmetry, monitored by computing the variance associated with the particle-number operator, is relatively constant for all five nuclei, in both the Hartree-Fock-Bogoliubov and BCCD approximations. Conclusions: The newly developed many-body formalism increases the potential span of ab initio calculations based on single-reference coupled cluster techniques tremendously, i.e., potentially to reach several hundred additional midmass nuclei. The new formalism offers a wealth of potential applications and further extensions dedicated to the description of ground and excited states of open-shell nuclei. Short-term goals include the implementation of three-nucleon forces at the normal-ordered two-body level. Midterm extensions include the approximate treatment of triples corrections and the development of the equation-of-motion methodology to treat both excited states and odd nuclei. Long-term extensions include exact restoration of U(1) and SU(2) symmetries.

  15. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    NASA Astrophysics Data System (ADS)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
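
    The combination recommended above, a centered log-ratio transformation (CLT) followed by k-means, can be sketched as follows. This is a minimal illustration under assumptions: the compositions are hypothetical and strictly positive (zeros would need a replacement strategy before taking logarithms), and the small k-means routine is a plain Lloyd iteration rather than the software used in the study.

```python
import numpy as np

def clr(X):
    """Centered log-ratio transform of strictly positive compositions (one per row)."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iteration; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Keep the previous center if a cluster happens to be empty.
        centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels, centers

# Hypothetical three-part compositions (rows sum to 1).
comps = np.array([
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.10, 0.20, 0.70],
    [0.12, 0.18, 0.70],
])
Z = clr(comps)
labels, centers = kmeans(Z, k=2)
```

    Because the transform works on log-ratios, rescaling a sample (for example, changing concentration units) leaves its transformed coordinates unchanged, which is one reason log-ratio methods suit compositional data.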

  16. Using Fuzzy Clustering for Real-time Space Flight Safety

    NASA Technical Reports Server (NTRS)

    Lee, Charles; Haskell, Richard E.; Hanna, Darrin; Alena, Richard L.

    2004-01-01

    To ensure space flight safety, it is necessary to monitor myriad sensor readings on the ground and in flight. Since a space shuttle has many sensors, monitoring data and drawing conclusions from information contained within the data in real time is challenging. The nature of the information can be critical to the success of the mission and the safety of the crew, and it therefore must be processed with minimal data-processing time. Data analysis algorithms could be used to synthesize sensor readings and to compare data associated with normal operation against data that contain fault patterns in order to draw conclusions. Detecting abnormal operation during early stages in the transition from safe to unsafe operation requires a large amount of historical data that can be categorized into different classes (non-risk, risk). Even though the 40 years of the shuttle flight program have accumulated volumes of historical data, these data do not comprehensively represent all possible fault patterns, since fault patterns are usually unknown before the fault occurs. This paper presents a method that uses a similarity measure between fuzzy clusters to detect possible faults in real time. A clustering technique based on a fuzzy equivalence relation is used to characterize temporal data. Data collected during an initial time period are separated into clusters. These clusters are characterized by their centroids. Clusters formed during subsequent time periods are either merged with an existing cluster or added to the cluster list. The resulting list of cluster centroids, called a cluster group, characterizes the behavior of a particular set of temporal data. The degree to which new clusters formed in a subsequent time period are similar to the cluster group is characterized by a similarity measure, q. This method is applied to downlink data from Columbia flights. The results show that this technique can detect an unexpected fault that was not present in the training data set.
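
    The merge-or-add bookkeeping of the cluster group described above can be sketched as below. The abstract does not define the fuzzy similarity measure q, so this sketch substitutes a plain Euclidean distance threshold; the function name, the averaging merge rule, and the sample values are assumptions for illustration only.

```python
import numpy as np

def update_cluster_group(group, new_centroids, threshold):
    """Merge each new centroid into the nearest stored one, or append it.

    Returns the centroids that were appended, i.e. candidates for
    behavior not seen in earlier time periods.
    """
    novel = []
    for c in new_centroids:
        c = np.asarray(c, dtype=float)
        if group:
            dists = [np.linalg.norm(c - g) for g in group]
            i = int(np.argmin(dists))
            if dists[i] <= threshold:
                # Stand-in merge rule: average the stored and new centroid.
                group[i] = (group[i] + c) / 2.0
                continue
        group.append(c)
        novel.append(c)
    return novel

# Hypothetical centroids from an initial time period.
group = [np.array([0.0, 0.0]), np.array([10.0, 10.0])]
# Centroids from a later period: one familiar, one far from everything seen.
novel = update_cluster_group(group, [[0.2, 0.1], [50.0, 50.0]], threshold=1.0)
```

    In this toy run the first new centroid is absorbed into an existing cluster, while the second is added to the group and reported as novel, which is the kind of event a fault monitor would flag.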

  17. Unsupervised, Robust Estimation-based Clustering for Multispectral Images

    NASA Technical Reports Server (NTRS)

    Netanyahu, Nathan S.

    1997-01-01

    To prepare for the challenge of handling the archiving and querying of terabyte-sized scientific spatial databases, the NASA Goddard Space Flight Center's Applied Information Sciences Branch (AISB, Code 935) developed a number of characterization algorithms that rely on supervised clustering techniques. The research reported here has been aimed at continuing the evolution of some of these supervised techniques, namely the neural network and decision tree-based classifiers, and at extending the approach to incorporate unsupervised clustering algorithms, such as those based on robust estimation (RE) techniques. The algorithms developed under this task should be suited for use by the Intelligent Information Fusion System (IIFS) metadata extraction modules, and as such these algorithms must be fast, robust, and anytime in nature. Finally, so that the planner/scheduler module of the IIFS can oversee the use and execution of these algorithms, all information required by the planner/scheduler must be provided to the IIFS development team to ensure the timely integration of these algorithms into the overall system.

  18. Assembly and features of secondary metabolite biosynthetic gene clusters in Streptomyces ansochromogenes.

    PubMed

    Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong

    2013-07-01

    A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in at least one of the three media tested, whereas the other 15 gene clusters were silent in all three media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.

  19. Whole Genome Sequence and Phylogenetic Analysis Show Helicobacter pylori Strains from Latin America Have Followed a Unique Evolution Pathway

    PubMed Central

    Muñoz-Ramírez, Zilia Y.; Mendez-Tenorio, Alfonso; Kato, Ikuko; Bravo, Maria M.; Rizzato, Cosmeri; Thorell, Kaisa; Torres, Roberto; Aviles-Jimenez, Francisco; Camorlinga, Margarita; Canzian, Federico; Torres, Javier

    2017-01-01

    Helicobacter pylori (HP) genetics may determine its clinical outcomes. Despite the high prevalence of HP infection in Latin America (LA), there have been no phylogenetic studies in the region. We aimed to understand the structure of HP populations in LA mestizo individuals, where gastric cancer incidence remains high. The genomes of 107 HP strains from Mexico, Nicaragua and Colombia were analyzed together with 59 publicly available worldwide genomes. To study bacterial relationships at the whole-genome level, we propose a virtual hybridization technique using thousands of high-entropy 13 bp DNA probes to generate fingerprints. Phylogenetic virtual genome fingerprint (VGF) was compared with Multi Locus Sequence Analysis (MLST) and with phylogenetic analyses of cagPAI virulence island sequences. With MLST, some Nicaraguan and Mexican strains clustered close to African isolates, whereas European isolates were spread without clustering and intermingled with LA isolates. VGF analysis resulted in increased resolution of populations, separating European from LA strains. Furthermore, clusters with exclusively Colombian, Mexican, or Nicaraguan strains were observed, where the Colombian cluster separated from Europe, Asia, and Africa, while Nicaraguan and Mexican clades grouped close to Africa. In addition, a large mixed LA cluster including Mexican, Colombian, Nicaraguan, Peruvian, and Salvadorian strains was observed; all LA clusters separated from the Amerind clade. With cagPAI sequence analyses, LA clades clearly separated from Europe, Asia and Amerind, and Colombian strains formed a single cluster. NeighborNet analyses suggested frequent and recent recombination events, particularly among LA strains. The results suggest that in the New World, H. pylori has evolved to fit mestizo LA populations, already 500 years after the Spanish colonization. This co-adaptation may account for regional variability in gastric cancer risk. PMID:28293542

  20. Cluster analysis and its application to healthcare claims data: a study of end-stage renal disease patients who initiated hemodialysis.

    PubMed

    Liao, Minlei; Li, Yunfeng; Kianifard, Farid; Obi, Engels; Arcona, Stephen

    2016-03-02

    Cluster analysis (CA) is a frequently used applied statistical technique that helps to reveal hidden structures and "clusters" found in large data sets. However, this method has not been widely used in large healthcare claims databases where the distribution of expenditure data is commonly severely skewed. The purpose of this study was to identify cost change patterns of patients with end-stage renal disease (ESRD) who initiated hemodialysis (HD) by applying different clustering methods. A retrospective, cross-sectional, observational study was conducted using the Truven Health MarketScan® Research Databases. Patients aged ≥18 years with ≥2 ESRD diagnoses who initiated HD between 2008 and 2010 were included. The K-means CA method and hierarchical CA with various linkage methods were applied to all-cause costs within baseline (12-months pre-HD) and follow-up periods (12-months post-HD) to identify clusters. Demographic, clinical, and cost information was extracted from both periods, and then examined by cluster. A total of 18,380 patients were identified. Meaningful all-cause cost clusters were generated using K-means CA and hierarchical CA with either flexible beta or Ward's methods. Based on cluster sample sizes and change of cost patterns, the K-means CA method and 4 clusters were selected: Cluster 1: Average to High (n = 113); Cluster 2: Very High to High (n = 89); Cluster 3: Average to Average (n = 16,624); and Cluster 4: Increasing Costs, High at Both Points (n = 1554). Between the 12-month pre-HD and post-HD periods, median costs increased from $185,070 to $884,605 for Cluster 1 (Average to High), decreased from $910,930 to $157,997 for Cluster 2 (Very High to High), were relatively stable and remained low, going from $15,168 to $13,026, for Cluster 3 (Average to Average), and increased from $57,909 to $193,140 for Cluster 4 (Increasing Costs, High at Both Points). Relatively stable costs after starting HD were associated with more stable comorbidity index scores between the pre- and post-HD periods, while increasing costs were associated with more sharply increasing comorbidity scores. The K-means CA method appeared to be the most appropriate in healthcare claims data with highly skewed cost information when taking into account both change of cost patterns and sample size in the smallest cluster.

  1. X-ray absorption and Mössbauer spectroscopies characterization of iron nanoclusters prepared by the gas aggregation technique.

    PubMed

    Sánchez-Marcos, J; Laguna-Marco, M A; Martínez-Morillas, R; Céspedes, E; Menéndez, N; Jiménez-Villacorta, F; Prieto, C

    2012-11-01

    Partially oxidized iron nanoclusters with typical sizes of 2-3 nm have been prepared by the gas-phase aggregation technique. This preparation technique has been reported to yield clusters with interesting magnetic properties such as very large exchange bias. In this paper, a sample composition study carried out by Mössbauer and X-ray absorption spectroscopies is reported. The information provided by these techniques, which is based on the iron short-range order, proves to be an ideal way to characterize the whole sample, since the obtained data are an average over a very large number of clusters. In addition, our results indicate the presence of ferrihydrite, a compound typically ignored when studying this type of system.

  2. Discussion of CoSA: Clustering of Sparse Approximations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Armstrong, Derek Elswick

    2017-03-07

    The purpose of this talk is to discuss the possible applications of CoSA (Clustering of Sparse Approximations) to the exploitation of HSI (HyperSpectral Imagery) data. CoSA is presented by Moody et al. in the Journal of Applied Remote Sensing (“Land cover classification in multispectral imagery using clustering of sparse approximations over learned feature dictionaries”, Vol. 8, 2014) and is based on machine learning techniques.

  3. Presentation on systems cluster research

    NASA Technical Reports Server (NTRS)

    Morgenthaler, George W.

    1989-01-01

    This viewgraph presentation gives an overview of systems cluster research performed by the Center for Space Construction. The goals of the research are to develop concepts, insights, and models for space construction and to develop systems engineering/analysis curricula for training future aerospace engineers. The following topics are covered: CSC systems analysis/systems engineering (SIMCON) model, CSC systems cluster schedule, system life-cycle, model optimization techniques, publications, cooperative efforts, and sponsored research.

  4. Clustering Algorithms: Their Application to Gene Expression Data

    PubMed Central

    Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

    2016-01-01

    Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure. PMID:27932867

  5. Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach

    PubMed Central

    Kudisthalert, Wasu

    2018-01-01

    Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is Extreme Learning Machine (ELM) which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM) which is based on a single layer feed-forward neural network in a conjunction of 16 different similarity coefficients as activation function in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets–Maximum Unbiased Validation Dataset–which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with Sokal/Sneath(1) coefficient. Furthermore, ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6. PMID:29652912

  6. On the Connection between Turbulent Motions and Particle Acceleration in Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Eckert, D.; Gaspari, M.; Vazza, F.; Gastaldello, F.; Tramacere, A.; Zimmer, S.; Ettori, S.; Paltani, S.

    2017-07-01

    Giant radio halos are megaparsec-scale diffuse radio sources associated with the central regions of galaxy clusters. The most promising scenario to explain the origin of these sources is that of turbulent re-acceleration, in which MeV electrons injected throughout the formation history of galaxy clusters are accelerated to higher energies by turbulent motions mostly induced by cluster mergers. In this Letter, we use the amplitude of density fluctuations in the intracluster medium as a proxy for the turbulent velocity and apply this technique to a sample of 51 clusters with available radio data. Our results indicate a segregation in the turbulent velocity of radio halo and radio quiet clusters, with the turbulent velocity of the former being on average higher by about a factor of two. The velocity dispersion recovered with this technique correlates with the measured radio power through the relation $P_{\rm radio} \propto \sigma_v^{3.3 \pm 0.7}$, which implies that the radio power is nearly proportional to the turbulent energy rate. In case turbulence cascades without being dissipated down to the particle acceleration scales, our results provide an observational confirmation of a key prediction of the turbulent re-acceleration model and possibly shed light on the origin of radio halos.
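
    As a rough reading of the scaling above (a sketch under the usual Kolmogorov-cascade assumption, which the abstract does not spell out; the gas density $\rho$ and injection scale $L$ are symbols introduced here for illustration only), the turbulent energy flux per unit volume goes as the cube of the velocity dispersion, so an exponent of $3.3 \pm 0.7$ is compatible with radio power tracking that flux almost linearly:

```latex
\dot{E}_{\rm turb} \sim \frac{\rho\,\sigma_v^{3}}{L},
\qquad
P_{\rm radio} \propto \sigma_v^{3.3 \pm 0.7}
\;\Longrightarrow\;
P_{\rm radio} \propto \dot{E}_{\rm turb}^{\,1.1 \pm 0.2}
\quad (\text{at fixed } \rho \text{ and } L).
```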

  7. Investigation of Metal and Metal Oxide Clusters Small Enough to Constitute the Critical Size for Gas Phase Nucleation in Combustion Processes.

    DTIC Science & Technology

    1980-11-01

    AD-A093 950 NORTHWESTERN UNIV EVANSTON IL DEPT OF MECHANICAL ND-ETC F/S 7/4 INVESTIGATION OF METAL AND METAL OXIDE CLUSTERS SMALL ENOUGH TO--ETC(U...34 " 18. SUPPLEMENTARY NOTES 19. KEY WORDS (Continue on reverse side if necessary and identify by block number) Clusters, Nucleation, Molecular Beam, Free...contract a variety of techniques have been employed to study the properties of small atomic and molecular clusters formed in the gas phase via

  8. Combination of automated high throughput platforms, flow cytometry, and hierarchical clustering to detect cell state.

    PubMed

    Kitsos, Christine M; Bhamidipati, Phani; Melnikova, Irena; Cash, Ethan P; McNulty, Chris; Furman, Julia; Cima, Michael J; Levinson, Douglas

    2007-01-01

    This study examined whether hierarchical clustering could be used to detect cell states induced by treatment combinations that were generated through automation and high-throughput (HT) technology. Data-mining techniques were used to analyze the large experimental data sets to determine whether nonlinear, non-obvious responses could be extracted from the data. Unary, binary, and ternary combinations of pharmacological factors (examples of stimuli) were used to induce differentiation of HL-60 cells using a HT automated approach. Cell profiles were analyzed by incorporating hierarchical clustering methods on data collected by flow cytometry. Data-mining techniques were used to explore the combinatorial space for nonlinear, unexpected events. Additional small-scale, follow-up experiments were performed on cellular profiles of interest. Multiple, distinct cellular profiles were detected using hierarchical clustering of expressed cell-surface antigens. Data-mining of this large, complex data set retrieved cases of both factor dominance and cooperativity, as well as atypical cellular profiles. Follow-up experiments found that treatment combinations producing "atypical cell types" made those cells more susceptible to apoptosis. CONCLUSIONS Hierarchical clustering and other data-mining techniques were applied to analyze large data sets from HT flow cytometry. From each sample, the data set was filtered and used to define discrete, usable states that were then related back to their original formulations. Analysis of resultant cell populations induced by a multitude of treatments identified unexpected phenotypes and nonlinear response profiles.
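
    To make the clustering step concrete, here is a minimal sketch of agglomerative hierarchical clustering on marker-expression profiles. The abstract does not state the linkage rule or distance metric, so average linkage with Euclidean distance is an assumption, and the toy profiles and the function name `average_linkage` are hypothetical.

```python
import math


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles: list of equal-length tuples (e.g. antigen expression levels).
    Returns a list of clusters, each a sorted list of profile indices.
    Assumption: average linkage with Euclidean distance.
    """
    clusters = [[i] for i in range(len(profiles))]

    def distance(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(math.dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical two-marker profiles for five treatment combinations.
toy_profiles = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.2), (0.5, 0.5)]
groups = average_linkage(toy_profiles, n_clusters=3)
```

    In this toy input the fifth profile sits between the two tight pairs and stays in its own group, which is loosely analogous to the "atypical" profiles the study reports.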

  9. A method to detect progression of glaucoma using the multifocal visual evoked potential technique

    PubMed Central

    Wangsupadilok, Boonchai; Kanadani, Fabio N.; Grippo, Tomas M.; Liebmann, Jeffrey M.; Ritch, Robert; Hood, Donald C.

    2010-01-01

    Purpose To describe a method for monitoring progression of glaucoma using the multifocal visual evoked potential (mfVEP) technique. Methods Eighty-seven patients diagnosed with open-angle glaucoma were divided into two groups. Group I, comprised 43 patients who had a repeat mfVEP test within 50 days (mean 0.9 ± 0.5 months), and group II, 44 patients who had a repeat test after at least 6 months (mean 20.7 ± 9.7 months). Monocular mfVEPs were obtained using a 60-sector pattern reversal dartboard display. Monocular and interocular analyses were performed. Data from the two visits were compared. The total number of abnormal test points with P < 5% within the visual field (total scores) and number of abnormal test points within a cluster (cluster size) were calculated. Data for group I provided a measure of test–retest variability independent of disease progression. Data for group II provided a possible measure of progression. Results The difference in the total scores for group II between visit 1 and visit 2 for the interocular and monocular comparison was significant (P < 0.05) as was the difference in cluster size for the interocular comparison (P < 0.05). Group I did not show a significant change in either total score or cluster size. Conclusion The change in the total score and cluster size over time provides a possible method for assessing progression of glaucoma with the mfVEP technique. PMID:18830654

  10. Identification of temporal variations in mental workload using locally-linear-embedding-based EEG feature reduction and support-vector-machine-based clustering and classification techniques.

    PubMed

    Yin, Zhong; Zhang, Jianhua

    2014-07-01

    Identifying the abnormal changes of mental workload (MWL) over time is quite crucial for preventing the accidents due to cognitive overload and inattention of human operators in safety-critical human-machine systems. It is known that various neuroimaging technologies can be used to identify the MWL variations. In order to classify MWL into a few discrete levels using representative MWL indicators and small-sized training samples, a novel EEG-based approach by combining locally linear embedding (LLE), support vector clustering (SVC) and support vector data description (SVDD) techniques is proposed and evaluated by using the experimentally measured data. The MWL indicators from different cortical regions are first elicited by using the LLE technique. Then, the SVC approach is used to find the clusters of these MWL indicators and thereby to detect MWL variations. It is shown that the clusters can be interpreted as the binary class MWL. Furthermore, a trained binary SVDD classifier is shown to be capable of detecting slight variations of those indicators. By combining the two schemes, a SVC-SVDD framework is proposed, where the clear-cut (smaller) cluster is detected by SVC first and then a subsequent SVDD model is utilized to divide the overlapped (larger) cluster into two classes. Finally, three-class MWL levels (low, normal and high) can be identified automatically. The experimental data analysis results are compared with those of several existing methods. It has been demonstrated that the proposed framework can lead to acceptable computational accuracy and has the advantages of both unsupervised and supervised training strategies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  11. Temporal and spatial assessment of river surface water quality using multivariate statistical techniques: a study in Can Tho City, a Mekong Delta area, Vietnam.

    PubMed

    Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep

    2015-05-01

    The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.

  12. MEASURING LENSING MAGNIFICATION OF QUASARS BY LARGE SCALE STRUCTURE USING THE VARIABILITY-LUMINOSITY RELATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bauer, Anne H.; Seitz, Stella; Jerke, Jonathan

    2011-05-10

    We introduce a technique to measure gravitational lensing magnification using the variability of type I quasars. Quasars' variability amplitudes and luminosities are tightly correlated, on average. Magnification due to gravitational lensing increases the quasars' apparent luminosity, while leaving the variability amplitude unchanged. Therefore, the mean magnification of an ensemble of quasars can be measured through the mean shift in the variability-luminosity relation. As a proof of principle, we use this technique to measure the magnification of quasars spectroscopically identified in the Sloan Digital Sky Survey (SDSS), due to gravitational lensing by galaxy clusters in the SDSS MaxBCG catalog. The Palomar-QUEST Variability Survey, reduced using the DeepSky pipeline, provides variability data for the sources. We measure the average quasar magnification as a function of scaled distance ($r/R_{200}$) from the nearest cluster; our measurements are consistent with expectations assuming Navarro-Frenk-White cluster profiles, particularly after accounting for the known uncertainty in the clusters' centers. Variability-based lensing measurements are a valuable complement to shape-based techniques because their systematic errors are very different, and also because the variability measurements are amenable to photometric errors of a few percent and to depths seen in current wide-field surveys. Given the data volume expected from current and upcoming surveys, this new technique has the potential to be competitive with weak lensing shear measurements of large-scale structure.

  13. Mitigation of time-varying distortions in Nyquist-WDM systems using machine learning

    NASA Astrophysics Data System (ADS)

    Granada Torres, Jhon J.; Varughese, Siddharth; Thomas, Varghese A.; Chiuchiarelli, Andrea; Ralph, Stephen E.; Cárdenas Soto, Ana M.; Guerrero González, Neil

    2017-11-01

    We propose a machine learning-based nonsymmetrical demodulation technique relying on clustering to mitigate time-varying distortions derived from several impairments such as IQ imbalance, bias drift, phase noise and interchannel interference. Experimental results show that those impairments cause centroid movements in the received constellations seen in time-windows of 10k symbols in controlled scenarios. In our demodulation technique, the k-means algorithm iteratively identifies the cluster centroids in the constellation of the received symbols in short time windows by means of the optimization of decision thresholds for a minimum BER. We experimentally verified the effectiveness of this computationally efficient technique in multicarrier 16QAM Nyquist-WDM systems over 270 km links. Our nonsymmetrical demodulation technique outperforms the conventional QAM demodulation technique, reducing the OSNR requirement up to ∼0.8 dB at a BER of 1 × 10⁻² for signals affected by interchannel interference.
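
    The centroid-tracking idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the distortion model (a fixed IQ gain imbalance plus DC offset with Gaussian noise), the nearest-centroid decision rule, and all function names are hypothetical, and the authors' threshold optimization for minimum BER is not reproduced here.

```python
import random


def ideal_16qam():
    """Ideal 16QAM constellation on the {-3, -1, 1, 3} grid."""
    levels = (-3, -1, 1, 3)
    return [complex(i, q) for i in levels for q in levels]


def nearest(symbol, centroids):
    """Index of the centroid closest to a received symbol."""
    return min(range(len(centroids)), key=lambda k: abs(symbol - centroids[k]))


def track_centroids(window, centroids, iterations=5):
    """Run a few k-means (Lloyd) iterations on one time window of symbols."""
    centroids = list(centroids)
    for _ in range(iterations):
        sums = [0j] * len(centroids)
        counts = [0] * len(centroids)
        for s in window:
            k = nearest(s, centroids)
            sums[k] += s
            counts[k] += 1
        # Keep the previous centroid if a cluster received no symbols.
        centroids = [sums[k] / counts[k] if counts[k] else centroids[k]
                     for k in range(len(centroids))]
    return centroids


def demo(seed=0, n=4000):
    """Return (symbol errors with fixed grid, symbol errors with tracking)."""
    rng = random.Random(seed)
    ideal = ideal_16qam()
    sent = [rng.randrange(16) for _ in range(n)]

    def channel(p):
        # Hypothetical impairment: gain imbalance, DC offset, additive noise.
        return complex(1.1 * p.real + 0.4 + rng.gauss(0, 0.15),
                       0.9 * p.imag - 0.4 + rng.gauss(0, 0.15))

    received = [channel(ideal[k]) for k in sent]
    tracked = track_centroids(received, ideal)
    errors_fixed = sum(nearest(r, ideal) != k for r, k in zip(received, sent))
    errors_tracked = sum(nearest(r, tracked) != k for r, k in zip(received, sent))
    return errors_fixed, errors_tracked
```

    Seeding k-means with the ideal grid keeps each cluster index tied to its original symbol label, so no relabeling step is needed after the centroids drift.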

  14. Detection of Clostridium difficile infection clusters, using the temporal scan statistic, in a community hospital in southern Ontario, Canada, 2006-2011.

    PubMed

    Faires, Meredith C; Pearl, David L; Ciccotelli, William A; Berke, Olaf; Reid-Smith, Richard J; Weese, J Scott

    2014-05-12

    In hospitals, Clostridium difficile infection (CDI) surveillance relies on unvalidated guidelines or threshold criteria to identify outbreaks. This can result in false-positive and -negative cluster alarms. The application of statistical methods to identify and understand CDI clusters may be a useful alternative or complement to standard surveillance techniques. The objectives of this study were to investigate the utility of the temporal scan statistic for detecting CDI clusters and determine if there are significant differences in the rate of CDI cases by month, season, and year in a community hospital. Bacteriology reports of patients identified with a CDI from August 2006 to February 2011 were collected. For patients detected with CDI from March 2010 to February 2011, stool specimens were obtained. Clostridium difficile isolates were characterized by ribotyping and investigated for the presence of toxin genes by PCR. CDI clusters were investigated using a retrospective temporal scan test statistic. Statistically significant clusters were compared to known CDI outbreaks within the hospital. A negative binomial regression model was used to identify associations between year, season, month and the rate of CDI cases. Overall, 86 CDI cases were identified. Eighteen specimens were analyzed and nine ribotypes were classified with ribotype 027 (n = 6) the most prevalent. The temporal scan statistic identified significant CDI clusters at the hospital (n = 5), service (n = 6), and ward (n = 4) levels (P ≤ 0.05). Three clusters were concordant with the one C. difficile outbreak identified by hospital personnel. Two clusters were identified as potential outbreaks. The negative binomial model indicated years 2007-2010 (P ≤ 0.05) had decreased CDI rates compared to 2006 and spring had an increased CDI rate compared to the fall (P = 0.023). 
    Application of the temporal scan statistic identified several clusters, including potential outbreaks not detected by hospital personnel. The identification of time periods with decreased or increased CDI rates may have been a result of specific hospital events. Understanding the clustering of CDIs can aid in the interpretation of surveillance data and lead to the development of better early detection systems.
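
    For readers unfamiliar with the method, the following is a minimal sketch of a retrospective, purely temporal scan statistic. It assumes a Poisson-style likelihood ratio with a uniform expected rate across periods and Monte Carlo significance testing; the abstract does not specify the probability model or software used, and the monthly counts below are invented for illustration.

```python
import math
import random


def log_likelihood_ratio(cases, expected, total):
    """Poisson-type LLR for an excess of cases inside a window."""
    if cases <= expected:
        return 0.0
    inside = cases * math.log(cases / expected)
    outside = 0.0
    if cases < total:
        outside = (total - cases) * math.log((total - cases) / (total - expected))
    return inside + outside


def scan(counts, max_len):
    """Return (best LLR, (start, end)) over all windows up to max_len periods."""
    total, periods = sum(counts), len(counts)
    best_score, best_window = 0.0, None
    for start in range(periods):
        cases = 0
        for end in range(start, min(periods, start + max_len)):
            cases += counts[end]
            expected = total * (end - start + 1) / periods
            score = log_likelihood_ratio(cases, expected, total)
            if score > best_score:
                best_score, best_window = score, (start, end)
    return best_score, best_window


def scan_p_value(counts, max_len, replicates=199, seed=0):
    """Monte Carlo p-value: redistribute cases uniformly and rescan."""
    rng = random.Random(seed)
    observed, window = scan(counts, max_len)
    total, periods = sum(counts), len(counts)
    exceed = 0
    for _ in range(replicates):
        simulated = [0] * periods
        for _ in range(total):
            simulated[rng.randrange(periods)] += 1
        if scan(simulated, max_len)[0] >= observed:
            exceed += 1
    return window, observed, (exceed + 1) / (replicates + 1)


# Hypothetical monthly case counts with an excess in months 6 and 7.
monthly_cases = [1, 2, 1, 0, 2, 1, 9, 8, 1, 2, 1, 0]
window, score, p = scan_p_value(monthly_cases, max_len=3)
```

    Capping the window length (`max_len`) keeps the scan from flagging very long periods that are better described as trends than as clusters.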

  15. Using Self-Organizing Map (SOM) Clusters of Ozonesonde Profiles to Evaluate Climatologies and Create Linkages between Meteorology and Pollution

    NASA Astrophysics Data System (ADS)

    Stauffer, R. M.; Thompson, A. M.; Young, G. S.; Oltmans, S. J.; Johnson, B.

    2016-12-01

    Ozone (O3) climatologies are typically created by averaging ozonesonde profiles on a monthly or seasonal basis, either for specific regions or zonally. We demonstrate the advantages of using a statistical clustering technique, self-organizing maps (SOM), over this simple averaging, through analysis of more than 4500 sonde profiles taken from the long-term US sites at Boulder, CO; Huntsville, AL; Trinidad Head, CA; and Wallops Island, VA. First, we apply SOM to O3 mixing ratios from surface to 12 km amsl. At all four sites, profiles in SOM clusters exhibit similar tropopause height, 500 hPa height and temperature, and total and tropospheric column O3. Second, when profiles from each SOM cluster are compared to monthly O3 means, near-tropopause O3 in three of the clusters is double (over +100 ppbv) the climatological O3 mixing ratio. The three clusters include 13-16% of all profiles, mostly from winter and spring. Large mid-tropospheric deviations from monthly means are found in two highly-populated clusters that represent either distinctly polluted (summer) or clean O3 (fall-winter, high tropopause) profiles. Thus, SOM indeed appear to represent US O3 profile statistics better than conventional climatologies. In the case of Trinidad Head, SOM clusters of O3 profile data from the lower troposphere (surface-6 km amsl) can discriminate background vs polluted O3 and the meteorology associated with each. Two of nine O3 clusters exhibit thin layers (~100s of m thick) of high O3, typically between 1 and 4 km. Comparisons between clusters and downwind, high-altitude surface O3 measurements display a marked impact of the elevated tropospheric O3. Days corresponding to the high O3 clusters exhibit hourly surface O3 anomalies at surface sites of +5-10 ppbv compared to a climatology; the anomalies can last up to four days. We also explore applications of SOM to tropical ozonesonde profiles, where tropospheric O3 variability is generally smaller.

  16. Macula segmentation and fovea localization employing image processing and heuristic based clustering for automated retinal screening.

    PubMed

    R, GeethaRamani; Balasubramanian, Lakshmi

    2018-07-01

    Macula segmentation and fovea localization are among the primary tasks in retinal analysis as they are responsible for detailed vision. Existing approaches required segmentation of retinal structures viz. optic disc and blood vessels for this purpose. This work avoids knowledge of other retinal structures and attempts data mining techniques to segment macula. Unsupervised clustering algorithm is exploited for this purpose. Selection of initial cluster centres has a great impact on performance of clustering algorithms. A heuristic based clustering in which initial centres are selected based on measures defining statistical distribution of data is incorporated in the proposed methodology. The initial phase of proposed framework includes image cropping, green channel extraction, contrast enhancement and application of mathematical closing. Then, the pre-processed image is subjected to heuristic based clustering yielding a binary map. The binary image is post-processed to eliminate unwanted components. Finally, the component which possessed the minimum intensity is finalized as macula and its centre constitutes the fovea. The proposed approach outperforms existing works by reporting that 100% of HRF, 100% of DRIVE, 96.92% of DIARETDB0, 97.75% of DIARETDB1, 98.81% of HEI-MED, 90% of STARE and 99.33% of MESSIDOR images satisfy the 1R criterion, a standard adopted for evaluating performance of macula and fovea identification. The proposed system thus helps the ophthalmologists in identifying the macula thereby facilitating to identify if any abnormality is present within the macula region. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. In vitro expansion and differentiation of rat pancreatic duct-derived stem cells into insulin secreting cells using a dynamic three-dimensional cell culture system.

    PubMed

    Chen, X C; Liu, H; Li, H; Cheng, Y; Yang, L; Liu, Y F

    2016-06-27

    In this study, a dynamic three-dimensional cell culture technology was used to expand and differentiate rat pancreatic duct-derived stem cells (PDSCs) into islet-like cell clusters that can secrete insulin. PDSCs were isolated from rat pancreatic tissues by in situ collagenase digestion and density gradient centrifugation. Using a dynamic three-dimensional culture technique, the cells were expanded and differentiated into functional islet-like cell clusters, which were characterized by morphological and phenotype analyses. After maintaining 1 × 10⁸ isolated rat PDSCs in a dynamic three-dimensional cell culture for 7 days, 1.5 × 10⁹ cells could be harvested. Passaged PDSCs expressed markers of pancreatic endocrine progenitors, including CD29 (86.17%), CD73 (90.73%), CD90 (84.13%), CD105 (78.28%), and Pdx-1. Following 14 additional days of culture in serum-free medium with nicotinamide, keratinocyte growth factor (KGF), and b fibroblast growth factor (FGF), the cells were differentiated into islet-like cell clusters (ICCs). The ICC morphology reflected that of fused cell clusters. During the late stage of differentiation, representative clusters were non-adherent and expressed insulin indicated by dithizone (DTZ)-positive staining. Insulin was detected in the extracellular fluid and cytoplasm of ICCs after 14 days of differentiation. Additionally, insulin levels were significantly higher at this time compared with the levels exhibited by PDSCs before differentiation (P < 0.01). By using a dynamic three-dimensional cell culture system, PDSCs can be expanded in vitro and can differentiate into functional islet-like cell clusters.

  18. Pattern Activity Clustering and Evaluation (PACE)

    NASA Astrophysics Data System (ADS)

    Blasch, Erik; Banas, Christopher; Paul, Michael; Bussjager, Becky; Seetharaman, Guna

    2012-06-01

    With the vast amount of network information available on activities of people (i.e. motions, transportation routes, and site visits) there is a need to explore the salient properties of data that detect and discriminate the behavior of individuals. Recent machine learning approaches include methods of data mining, statistical analysis, clustering, and estimation that support activity-based intelligence. We seek to explore contemporary methods in activity analysis using machine learning techniques that discover and characterize behaviors that enable grouping, anomaly detection, and adversarial intent prediction. To evaluate these methods, we describe the mathematics and potential information theory metrics to characterize behavior. A scenario is presented to demonstrate the concept and metrics that could be useful for layered sensing behavior pattern learning and analysis. We leverage work on group tracking, learning and clustering approaches; as well as utilize information theoretical metrics for classification, behavioral and event pattern recognition, and activity and entity analysis. The performance evaluation of activity analysis supports high-level information fusion of user alerts, data queries and sensor management for data extraction, relations discovery, and situation analysis of existing data.

  19. Unsupervised Decoding of Long-Term, Naturalistic Human Neural Recordings with Automated Video and Audio Annotations

    PubMed Central

    Wang, Nancy X. R.; Olson, Jared D.; Ojemann, Jeffrey G.; Rao, Rajesh P. N.; Brunton, Bingni W.

    2016-01-01

    Fully automated decoding of human activities and intentions from direct neural recordings is a tantalizing challenge in brain-computer interfacing. Implementing Brain Computer Interfaces (BCIs) outside carefully controlled experiments in laboratory settings requires adaptive and scalable strategies with minimal supervision. Here we describe an unsupervised approach to decoding neural states from naturalistic human brain recordings. We analyzed continuous, long-term electrocorticography (ECoG) data recorded over many days from the brain of subjects in a hospital room, with simultaneous audio and video recordings. We discovered coherent clusters in high-dimensional ECoG recordings using hierarchical clustering and automatically annotated them using speech and movement labels extracted from audio and video. To our knowledge, this represents the first time techniques from computer vision and speech processing have been used for natural ECoG decoding. Interpretable behaviors were decoded from ECoG data, including moving, speaking and resting; the results were assessed by comparison with manual annotation. Discovered clusters were projected back onto the brain revealing features consistent with known functional areas, opening the door to automated functional brain mapping in natural settings. PMID:27148018

  20. DCE: A Distributed Energy-Efficient Clustering Protocol for Wireless Sensor Network Based on Double-Phase Cluster-Head Election.

    PubMed

    Han, Ruisong; Yang, Wei; Wang, Yipeng; You, Kaiming

    2017-05-01

    Clustering is an effective technique used to reduce energy consumption and extend the lifetime of wireless sensor network (WSN). The characteristic of energy heterogeneity of WSNs should be considered when designing clustering protocols. We propose and evaluate a novel distributed energy-efficient clustering protocol called DCE for heterogeneous wireless sensor networks, based on a Double-phase Cluster-head Election scheme. In DCE, the procedure of cluster head election is divided into two phases. In the first phase, tentative cluster heads are elected with the probabilities which are decided by the relative levels of initial and residual energy. Then, in the second phase, the tentative cluster heads are replaced by their cluster members to form the final set of cluster heads if any member in their cluster has more residual energy. Employing two phases for cluster-head election ensures that the nodes with more energy have a higher chance to be cluster heads. Energy consumption is well-distributed in the proposed protocol, and the simulation results show that DCE achieves longer stability periods than other typical clustering protocols in heterogeneous scenarios.
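
    The two election phases described above can be sketched as follows. This is an illustrative reading, not the protocol's published formulas: the phase-one probability (a base rate `p_opt` scaled by initial energy relative to the network mean and by the residual-to-initial ratio), the nearest-head cluster formation, and all names are assumptions.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Node:
    ident: int
    x: float
    y: float
    initial_energy: float
    residual_energy: float


def tentative_heads(nodes, p_opt, rng):
    """Phase 1: probabilistic election weighted by energy (assumed weighting)."""
    mean_initial = sum(n.initial_energy for n in nodes) / len(nodes)
    heads = []
    for n in nodes:
        p = p_opt * (n.initial_energy / mean_initial) \
                  * (n.residual_energy / n.initial_energy)
        if rng.random() < min(1.0, p):
            heads.append(n)
    if not heads:  # Fallback so the round always has at least one head.
        heads.append(max(nodes, key=lambda n: n.residual_energy))
    return heads


def form_clusters(nodes, heads):
    """Each non-head node joins its nearest tentative head."""
    clusters = {h.ident: [h] for h in heads}
    for n in nodes:
        if n.ident in clusters:
            continue
        closest = min(heads, key=lambda h: math.hypot(n.x - h.x, n.y - h.y))
        clusters[closest.ident].append(n)
    return clusters


def final_heads(clusters):
    """Phase 2: hand over to a member only if it has more residual energy."""
    # max() keeps the first maximum, and the tentative head is listed first,
    # so it is retained on ties.
    return [max(members, key=lambda n: n.residual_energy)
            for members in clusters.values()]


def make_nodes(count, seed):
    """Hypothetical heterogeneous network on a 100 x 100 field."""
    rng = random.Random(seed)
    nodes = []
    for i in range(count):
        initial = rng.uniform(1.0, 2.0)
        nodes.append(Node(i, rng.uniform(0, 100), rng.uniform(0, 100),
                          initial, initial * rng.uniform(0.2, 1.0)))
    return nodes


def elect(nodes, p_opt, seed):
    rng = random.Random(seed)
    clusters = form_clusters(nodes, tentative_heads(nodes, p_opt, rng))
    return clusters, final_heads(clusters)
```

    After phase two, every cluster is led by its most energetic member, which is the property the abstract credits for the more even energy consumption.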

  1. Dynamic Trajectory Extraction from Stereo Vision Using Fuzzy Clustering

    NASA Astrophysics Data System (ADS)

    Onishi, Masaki; Yoda, Ikushi

    In recent years, many human tracking methods have been proposed to analyze human dynamic trajectories. These methods are general technologies applicable to various fields, such as customer purchase analysis in a shopping environment and safety control at a (railroad) crossing. In this paper, we present a new approach for tracking human positions from stereo images. We use a two-step clustering framework with the k-means method and fuzzy clustering to detect human regions. In the initial clustering, the k-means method forms middle clusters at high speed from objective features extracted by stereo vision. In the final clustering, the fuzzy c-means method groups the middle clusters into human regions based on their attributes. Because fuzzy clustering expresses ambiguity, the proposed method clusters correctly even when many people are close to each other. The validity of our technique was evaluated in an experiment extracting the trajectories of doctors and nurses in a hospital emergency room.
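
    The second, fuzzy stage can be sketched with the standard fuzzy c-means updates. This is a generic textbook formulation rather than the authors' implementation: the input points stand in for middle-cluster centroids from the k-means stage, and the fuzzifier `m = 2`, the initial centers, and the function names are assumptions.

```python
import math


def memberships(points, centers, m=2.0):
    """Fuzzy membership of each point in each cluster (rows sum to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # A point sitting exactly on a center belongs fully to it.
            hits = [x == 0.0 for x in d]
            rows.append([1.0 / sum(hits) if h else 0.0 for h in hits])
        else:
            rows.append([1.0 / sum((dj / dk) ** power for dk in d) for dj in d])
    return rows


def update_centers(points, u, m=2.0):
    """Centers are means of the points weighted by membership**m."""
    dims = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[a] for w, p in zip(weights, points)) / total
            for a in range(dims)))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Alternate membership and center updates for a fixed iteration count."""
    for _ in range(iterations):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)


# Hypothetical middle-cluster centroids: two people plus one ambiguous blob.
middle = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 9), (9, 10), (5, 5)]
centers, u = fuzzy_c_means(middle, centers=[(0.0, 0.0), (10.0, 10.0)])
```

    The blob at (5, 5) ends up with roughly equal membership in both regions, which is the kind of ambiguity a hard k-means assignment would hide.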

  2. Patterns of victimization between and within peer clusters in a high school social network.

    PubMed

    Swartz, Kristin; Reyns, Bradford W; Wilcox, Pamela; Dunham, Jessica R

    2012-01-01

    This study presents a descriptive analysis of patterns of violent victimization between and within the various cohesive clusters of peers comprising a sample of more than 500 9th-12th grade students from one high school. Social network analysis techniques provide a visualization of the overall friendship network structure and allow for the examination of variation in victimization across the various peer clusters within the larger network. Social relationships among clusters with varying levels of victimization are also illustrated so as to provide a sense of possible spatial clustering or diffusion of victimization across proximal peer clusters. Additionally, to provide a sense of the sorts of peer clusters that support (or do not support) victimization, characteristics of clusters at both the high and low ends of the victimization scale are discussed. Finally, several of the peer clusters at both the high and low ends of the victimization continuum are "unpacked", allowing examination of within-network individual-level differences in victimization for these select clusters.

  3. Fractal Clustering and Knowledge-driven Validation Assessment for Gene Expression Profiling.

    PubMed

    Wang, Lu-Yong; Balasubramanian, Ammaiappan; Chakraborty, Amit; Comaniciu, Dorin

    2005-01-01

    DNA microarray experiments generate a substantial amount of information about global gene expression. Gene expression profiles can be represented as points in multi-dimensional space. It is essential to identify relevant groups of genes in biomedical research. Clustering is helpful in pattern recognition in gene expression profiles. A number of clustering techniques have been introduced. However, these traditional methods mainly utilize shape-based assumptions or some distance metric to cluster the points in multi-dimensional linear Euclidean space. Their results show poor consistency with the functional annotation of genes in previous validation studies. From a novel, different perspective, we propose a fractal clustering method to cluster genes using intrinsic (fractal) dimension from modern geometry. This method clusters points in such a way that points in the same cluster are more self-affine among themselves than to the points in other clusters. We assess this method using annotation-based validation assessment for gene clusters. It shows that this method is superior to other traditional methods in identifying functionally related gene groups.

  4. Exploring the Internal Dynamics of Globular Clusters

    NASA Astrophysics Data System (ADS)

    Watkins, Laura L.; van der Marel, Roeland; Bellini, Andrea; Luetzgendorf, Nora; HSTPROMO Collaboration

    2018-01-01

    The formation histories and structural properties of globular clusters are imprinted on their internal dynamics. Energy equipartition results in velocity differences for stars of different mass, and leads to mass segregation, which results in different spatial distributions for stars of different mass. Intermediate-mass black holes significantly increase the velocity dispersions at the centres of clusters. By combining accurate measurements of their internal kinematics with state-of-the-art dynamical models, we can characterise both the velocity dispersion and mass profiles of clusters, tease apart the different effects, and understand how clusters may have formed and evolved. Using proper motions from the Hubble Space Telescope Proper Motion (HSTPROMO) Collaboration for a set of 22 Milky Way globular clusters, and our discrete dynamical modelling techniques designed to work with large, high-quality datasets, we are studying a variety of internal cluster properties. We will present the results of theoretical work on simulated clusters that demonstrates the efficacy of our approach, and preliminary results from application to real clusters.

  5. Clustering methods applied in the detection of Ki67 hot-spots in whole tumor slide images: an efficient way to characterize heterogeneous tissue-based biomarkers.

    PubMed

    Lopez, Xavier Moles; Debeir, Olivier; Maris, Calliope; Rorive, Sandrine; Roland, Isabelle; Saerens, Marco; Salmon, Isabelle; Decaestecker, Christine

    2012-09-01

    Whole-slide scanners allow the digitization of an entire histological slide at very high resolution. This new acquisition technique opens a wide range of possibilities for addressing challenging image analysis problems, including the identification of tissue-based biomarkers. In this study, we use whole-slide scanner technology for imaging the proliferating activity patterns in tumor slides based on Ki67 immunohistochemistry. Faced with large images, pathologists require tools that can help them identify tumor regions that exhibit high proliferating activity, called "hot-spots" (HSs). Pathologists need tools that can quantitatively characterize these HS patterns. To respond to this clinical need, the present study investigates various clustering methods with the aim of identifying Ki67 HSs in whole tumor slide images. This task requires a method capable of identifying an unknown number of clusters, which may be highly variable in terms of shape, size, and density. We developed a hybrid clustering method, referred to as Seedlink. Compared to manual HS selections by three pathologists, we show that Seedlink provides an efficient way of detecting Ki67 HSs and improves the agreement among pathologists when identifying HSs. Copyright © 2012 International Society for Advancement of Cytometry.

  6. Eye-gaze determination of user intent at the computer interface

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldberg, J.H.; Schryver, J.C.

    1993-12-31

    Determination of user intent at the computer interface through eye-gaze monitoring can significantly aid applications for the disabled, as well as telerobotics and process control interfaces. Whereas current eye-gaze control applications are limited to object selection and x/y gazepoint tracking, a methodology was developed here to discriminate a more abstract interface operation: zooming in or out. This methodology first collects samples of eye-gaze location looking at controlled stimuli, at 30 Hz, just prior to a user's decision to zoom. The sample is broken into data frames, or temporal snapshots. Within a data frame, all spatial samples are connected into a minimum spanning tree, then clustered according to user-defined parameters. Each cluster is mapped to one in the prior data frame, and statistics are computed from each cluster. These characteristics include cluster size, position, and pupil size. A multiple discriminant analysis uses these statistics both within and between data frames to formulate optimal rules for assigning the observations into zoom-in, zoom-out, or no-zoom conditions. The statistical procedure effectively generates heuristics for future assignments, based upon these variables. Future work will enhance the accuracy and precision of the modeling technique, and will empirically test users in controlled experiments.
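
    The per-frame step described above (connect the gaze samples into a minimum spanning tree, then cluster) might look roughly like the sketch below. The abstract does not say which user-defined parameters were used, so the single edge-length threshold max_edge is an assumption, as are all function names; pupil size is omitted from the statistics.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns the tree as a list of (length, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known link from the tree to each point
    parent = [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        i = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[i] = True
        if parent[i] >= 0:
            edges.append((best[i], parent[i], i))
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[i], points[j])
                if d < best[j]:
                    best[j], parent[j] = d, i
    return edges


def mst_clusters(points, max_edge):
    """Cut every tree edge longer than max_edge (a hypothetical user-defined
    parameter); the remaining connected pieces are the clusters."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for length, i, j in mst_edges(points):
        if length <= max_edge:
            parent[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


def cluster_stats(points, labels):
    """Size and centroid per cluster, two of the statistics the abstract lists."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {
        lab: {"size": len(g), "centroid": tuple(sum(a) / len(g) for a in zip(*g))}
        for lab, g in groups.items()
    }
```

    For one data frame, labels = mst_clusters(samples, max_edge) followed by cluster_stats(samples, labels) yields the per-cluster features that a later discriminant analysis could consume.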

  7. Conformation, structure and molecular solvation: a spectroscopic and computational study of 2-phenoxy ethanol and its singly and multiply hydrated clusters

    NASA Astrophysics Data System (ADS)

    Macleod, Neil A.; Simons, John P.

    2002-10-01

    The conformational landscapes of 2-phenoxy ethanol (POX) and its hydrated clusters have been studied in the gas-phase, providing a model for pharmaceutical β-blockers. A combination of experimental techniques, including resonant two-photon ionisation (R2PI), laser-induced-fluorescence (LIF) and resonant ion-dip infra-red spectroscopy (RIDIRS), coupled with high-level ab initio calculations has allowed the assignment of the individually resolved spectral features to discrete conformational and supra-molecular structures. Assignments were made by comparison of experimental vibrational spectra and partially resolved ultra-violet rotational band contours with those predicted from quantum chemical calculations. The isolated molecule displays a solitary structure with an extended geometry of the side-chain which is stabilised by an intramolecular hydrogen-bond between the alcohol (proton donor) and the ether (proton acceptor) groups of the side-chain. In singly hydrated clusters the water molecule is accommodated by insertion into the intramolecular hydrogen-bond. In the doubly hydrated and higher clusters cyclic structures are generated which incorporate both the water molecules and the terminal OH group of the side-chain; additional (weak) hydrogen bonded interactions with the phenoxy group provide a degree of selectivity but essentially, the water 'droplet' forms on the end of the alcohol side-chain.

  8. Molecular epidemiological characteristics of Salmonella enterica serovars Enteritidis, Typhimurium and Livingstone strains isolated in a Tunisian university hospital.

    PubMed

    Ktari, Sonia; Ksibi, Boutheina; Gharsallah, Houda; Mnif, Basma; Maalej, Sonda; Rhimi, Fouzia; Hammami, Adnene

    2016-03-01

    Enteritidis, Typhimurium and Livingstone are the main Salmonella enterica serovars recovered in Tunisia. Here, we aimed to assess the genetic diversity of fifty-seven Salmonella enterica strains from different sampling periods, origins and settings using pulsed-field gel electrophoresis (PFGE), multi-locus sequence typing (MLST) and multi-locus variable-number tandem repeat analysis (MLVA). Salmonella Enteritidis strains, isolated from human and food sources from two regions in Sfax in 2007, were grouped into one cluster using PFGE. However, using MLVA these strains were divided into two clusters. Salmonella Typhimurium strains, recovered in 2012 and representing sporadic cases of human clinical isolates, were included in one PFGE cluster. Nevertheless, the MLVA technique divided Salmonella Typhimurium isolates into six clusters, with a diversity index (DI) reaching 0.757. For Salmonella Livingstone, which was responsible for two nosocomial outbreaks during 2000-2003, the PFGE and MLVA methods showed that these strains were genetically closely related. Salmonella Enteritidis and Salmonella Livingstone populations each showed a single ST lineage, ST11 and ST543 respectively. For Salmonella Typhimurium, two MLST sequence types, ST19 and ST328, were defined. Salmonella Enteritidis and Salmonella Typhimurium strains were clearly differentiated by MLVA, which was not the case using PFGE. © 2015 APMIS. Published by John Wiley & Sons Ltd.

  9. Building a new predictor for multiple linear regression technique-based corrective maintenance turnaround time.

    PubMed

    Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa

    2008-01-01

    This research's main goals were to build a predictor for estimating the values of a turnaround time (TAT) indicator and to use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. A multiple linear regression predictor for the TAT indicator and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to this model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.

  10. A stereoscopic system for viewing the temporal evolution of brain activity clusters in response to linguistic stimuli

    NASA Astrophysics Data System (ADS)

    Forbes, Angus; Villegas, Javier; Almryde, Kyle R.; Plante, Elena

    2014-03-01

    In this paper, we present a novel application, 3D+Time Brain View, for the stereoscopic visualization of functional Magnetic Resonance Imaging (fMRI) data gathered from participants exposed to unfamiliar spoken languages. An analysis technique based on Independent Component Analysis (ICA) is used to identify statistically significant clusters of brain activity and their changes over time during different testing sessions. That is, our system illustrates the temporal evolution of participants' brain activity as they are introduced to a foreign language through displaying these clusters as they change over time. The raw fMRI data is presented as a stereoscopic pair in an immersive environment utilizing passive stereo rendering. The clusters are presented using a ray casting technique for volume rendering. Our system incorporates the temporal information and the results of the ICA into the stereoscopic 3D rendering, making it easier for domain experts to explore and analyze the data.

  11. Automated clustering of probe molecules from solvent mapping of protein surfaces: new algorithms applied to hot-spot mapping and structure-based drug design

    NASA Astrophysics Data System (ADS)

    Lerner, Michael G.; Meagher, Kristin L.; Carlson, Heather A.

    2008-10-01

    Use of solvent mapping, based on multiple-copy minimization (MCM) techniques, is common in structure-based drug discovery. The minima of small-molecule probes define locations for complementary interactions within a binding pocket. Here, we present improved methods for MCM. In particular, a Jarvis-Patrick (JP) method is outlined for grouping the final locations of minimized probes into physical clusters. This algorithm has been tested through a study of protein-protein interfaces, showing the process to be robust, deterministic, and fast in the mapping of protein "hot spots." Improvements in the initial placement of probe molecules are also described. A final application to HIV-1 protease shows how our automated technique can be used to partition data too complicated to analyze by hand. These new automated methods may be easily and quickly extended to other protein systems, and our clustering methodology may be readily incorporated into other clustering packages.
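
    For readers unfamiliar with the Jarvis-Patrick (JP) method named above, a minimal sketch of one common variant follows. It is not the authors' code: the parameter names k and k_min, the mutual-neighbour requirement, and the use of bare 3-D coordinates for probe positions are assumptions for illustration.

```python
import math


def jarvis_patrick(points, k, k_min):
    """Jarvis-Patrick clustering (mutual-neighbour variant).

    Two points join the same cluster when each appears in the other's list
    of k nearest neighbours and the two lists share at least k_min members.
    Returns one integer cluster label per point.
    """
    n = len(points)
    neighbors = []
    for i in range(n):
        # Sort by distance, breaking ties by index so the result is deterministic.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        neighbors.append(set(order[:k]))

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            mutual = i in neighbors[j] and j in neighbors[i]
            if mutual and len(neighbors[i] & neighbors[j]) >= k_min:
                parent[find(i)] = find(j)

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]
```

    Because the procedure involves no random initialisation, repeated runs on the same probe positions give the same partition, which is consistent with the abstract's description of the process as deterministic.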

  12. Characteristics of voxel prediction power in full-brain Granger causality analysis of fMRI data

    NASA Astrophysics Data System (ADS)

    Garg, Rahul; Cecchi, Guillermo A.; Rao, A. Ravishankar

    2011-03-01

    Functional neuroimaging research is moving from the study of "activations" to the study of "interactions" among brain regions. Granger causality analysis provides a powerful technique to model spatio-temporal interactions among brain regions. We apply this technique to full-brain fMRI data without aggregating any voxel data into regions of interest (ROIs). We circumvent the problem of dimensionality using sparse regression from machine learning. On a simple finger-tapping experiment we found that (1) a small number of voxels in the brain have very high prediction power, explaining the future time course of other voxels in the brain; (2) these voxels occur in small sized clusters (of size 1-4 voxels) distributed throughout the brain; (3) albeit small, these clusters overlap with most of the clusters identified with the non-temporal General Linear Model (GLM); and (4) the method identifies clusters which, while not determined by the task and not detectable by GLM, still influence brain activity.

  13. Visualizing nD Point Clouds as Topological Landscape Profiles to Guide Local Data Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oesterling, Patrick; Heine, Christian; Weber, Gunther H.

    2012-05-04

    Analyzing high-dimensional point clouds is a classical challenge in visual analytics. Traditional techniques, such as projections or axis-based techniques, suffer from projection artifacts, occlusion, and visual complexity. We propose to split data analysis into two parts to address these shortcomings. First, a structural overview phase abstracts data by its density distribution. This phase performs topological analysis to support accurate and non-overlapping presentation of the high-dimensional cluster structure as a topological landscape profile. Utilizing a landscape metaphor, it presents clusters and their nesting as hills whose height, width, and shape reflect cluster coherence, size, and stability, respectively. A second local analysis phase utilizes this global structural knowledge to select individual clusters or point sets for further, localized data analysis. Focusing on structural entities significantly reduces visual clutter in established geometric visualizations and permits a clearer, more thorough data analysis. In conclusion, this analysis complements the global topological perspective and enables the user to study subspaces or geometric properties, such as shape.

  14. Parallel and Scalable Clustering and Classification for Big Data in Geosciences

    NASA Astrophysics Data System (ADS)

    Riedel, M.

    2015-12-01

    Machine learning, data mining, and statistical computing are common techniques to perform analysis in earth sciences. This contribution will focus on two concrete and widely used data analytics methods suitable to analyse 'big data' in the context of geoscience use cases: clustering and classification. From the broad class of available clustering methods we focus on the density-based spatial clustering of applications with noise (DBSCAN) algorithm that enables the identification of outliers or interesting anomalies. A new open source parallel and scalable DBSCAN implementation will be discussed in the light of a scientific use case that detects water mixing events in the Koljoefjords. The second technique we cover is classification, with a focus set on the support vector machine (SVM) algorithm, as one of the best out-of-the-box classification algorithms. A parallel and scalable SVM implementation will be discussed in the light of a scientific use case in the field of remote sensing with 52 different classes of land cover types.
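
    As a reminder of how DBSCAN separates clusters from outliers, here is a small serial sketch. It is not the parallel implementation the abstract refers to; the function name, the brute-force neighbour search, and the convention that -1 marks noise are assumptions chosen for brevity.

```python
import math
from collections import deque

NOISE = -1


def dbscan(points, eps, min_pts):
    """Serial DBSCAN. Returns one label per point: 0, 1, ... for clusters,
    NOISE for points that belong to no cluster (outliers)."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def region(i):
        # Brute-force eps-neighbourhood; it includes point i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels
```

    The points left with the NOISE label are the outliers or anomalies the abstract mentions. The brute-force region query makes this sketch quadratic in the number of points, which is one reason large datasets call for indexed or parallel implementations.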

  15. Oxygen diffusion in alpha-Al2O3. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Cawley, J. D.; Halloran, J. W.; Cooper, A. R.

    1984-01-01

    Oxygen self-diffusion coefficients were determined in single crystal alpha-Al2O3 using the gas exchange technique. The samples were semi-infinite slabs cut from five different boules with varying background impurities. The diffusion direction was parallel to the c-axis. The tracer profiles were determined by two techniques, single spectrum proton activation and secondary ion mass spectrometry (SIMS). SIMS proved to be the more useful tool. The determined diffusion coefficients, which were insensitive to impurity levels and oxygen partial pressure, could be described by D = 0.00151 exp(-572 kJ/RT) m^2/s. The insensitivities are discussed in terms of point defect clustering. Two independent models are consistent with the findings. The first considers the clusters as immobile point defect traps which buffer changes in the defect chemistry. The second considers clusters to be mobile and oxygen diffusion to be intrinsic behavior, the mechanism for oxygen transport involving neutral clusters of Schottky quintuplets.
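
    The Arrhenius expression quoted in this abstract can be evaluated numerically as sketched below. The pre-exponential factor and activation energy are taken from the abstract; reading the 572 kJ as a molar quantity (kJ/mol) is an assumption, as are the function names and any temperatures a reader chooses to plug in.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)
D0 = 0.00151     # pre-exponential factor from the abstract, m^2/s
Q = 572e3        # activation energy from the abstract, assumed to be J/mol


def diffusion_coefficient(temp_k, d0=D0, q=Q):
    """Arrhenius form D = D0 * exp(-Q / (R * T)), with T in kelvin."""
    if temp_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return d0 * math.exp(-q / (R * temp_k))


def temperature_ratio(t_hot, t_cold):
    """How many times faster diffusion is at t_hot than at t_cold."""
    return diffusion_coefficient(t_hot) / diffusion_coefficient(t_cold)
```

    With an activation energy this large, D changes by orders of magnitude over a few hundred kelvin, so temperature_ratio is often more informative than the absolute values.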

  16. SLUG - stochastically lighting up galaxies - III. A suite of tools for simulated photometry, spectroscopy, and Bayesian inference with stochastic stellar populations

    NASA Astrophysics Data System (ADS)

    Krumholz, Mark R.; Fumagalli, Michele; da Silva, Robert L.; Rendahl, Theodore; Parra, Jonathan

    2015-09-01

    Stellar population synthesis techniques for predicting the observable light emitted by a stellar population have extensive applications in numerous areas of astronomy. However, accurate predictions for small populations of young stars, such as those found in individual star clusters, star-forming dwarf galaxies, and small segments of spiral galaxies, require that the population be treated stochastically. Conversely, accurate deductions of the properties of such objects also require consideration of stochasticity. Here we describe a comprehensive suite of modular, open-source software tools for tackling these related problems. These include the following: a greatly-enhanced version of the SLUG code introduced by da Silva et al., which computes spectra and photometry for stochastically or deterministically sampled stellar populations with nearly arbitrary star formation histories, clustering properties, and initial mass functions; CLOUDY_SLUG, a tool that automatically couples SLUG-computed spectra with the CLOUDY radiative transfer code in order to predict stochastic nebular emission; BAYESPHOT, a general-purpose tool for performing Bayesian inference on the physical properties of stellar systems based on unresolved photometry; and CLUSTER_SLUG and SFR_SLUG, a pair of tools that use BAYESPHOT on a library of SLUG models to compute the mass, age, and extinction of mono-age star clusters, and the star formation rate of galaxies, respectively. The latter two tools make use of an extensive library of pre-computed stellar population models, which are included in the software. The complete package is available at http://www.slugsps.com.

  17. Fast distributed large-pixel-count hologram computation using a GPU cluster.

    PubMed

    Pan, Yuechao; Xu, Xuewu; Liang, Xinan

    2013-09-10

    Large-pixel-count holograms are an essential part of large-size holographic three-dimensional (3D) displays, but the generation of such holograms is computationally demanding. In order to address this issue, we have built a graphics processing unit (GPU) cluster with 32.5 Tflop/s computing power and implemented distributed hologram computation on it with speed improvement techniques, such as shared memory on GPU, GPU level adaptive load balancing, and node level load distribution. Using these speed improvement techniques on the GPU cluster, we have achieved a 71.4-fold increase in computation speed for 186M-pixel holograms. Furthermore, we have used the approaches of diffraction limits and subdivision of holograms to overcome the GPU memory limit in computing large-pixel-count holograms. 745M-pixel and 1.80G-pixel holograms were computed in 343 and 3326 s, respectively, for more than 2 million object points with RGB colors. Color 3D objects with 1.02M points were successfully reconstructed from a 186M-pixel hologram computed in 8.82 s with all three of the above speed improvement techniques. It is shown that distributed hologram computation using a GPU cluster is a promising approach to increase the computation speed of large-pixel-count holograms for large-size holographic displays.

  18. Analyzing simulation-based PRA data through traditional and topological clustering: A BWR station blackout case study

    DOE PAGES

    Maljovec, D.; Liu, S.; Wang, B.; ...

    2015-07-14

    Here, dynamic probabilistic risk assessment (DPRA) methodologies couple system simulator codes (e.g., RELAP and MELCOR) with simulation controller codes (e.g., RAVEN and ADAPT). Whereas system simulator codes model system dynamics deterministically, simulation controller codes introduce both deterministic (e.g., system control logic and operating procedures) and stochastic (e.g., component failures and parameter uncertainties) elements into the simulation. Typically, a DPRA is performed by sampling values of a set of parameters and simulating the system behavior for that specific set of parameter values. For complex systems, a major challenge in using DPRA methodologies is to analyze the large number of scenarios generated, where clustering techniques are typically employed to better organize and interpret the data. In this paper, we focus on the analysis of two nuclear simulation datasets that are part of the risk-informed safety margin characterization (RISMC) boiling water reactor (BWR) station blackout (SBO) case study. We provide the domain experts a software tool that encodes traditional and topological clustering techniques within an interactive analysis and visualization environment, for understanding the structures of such high-dimensional nuclear simulation datasets. We demonstrate through our case study that both types of clustering techniques complement each other for enhanced structural understanding of the data.

  19. Regularization with numerical extrapolation for finite and UV-divergent multi-loop integrals

    NASA Astrophysics Data System (ADS)

    de Doncker, E.; Yuasa, F.; Kato, K.; Ishikawa, T.; Kapenga, J.; Olagbemi, O.

    2018-03-01

    We give numerical integration results for Feynman loop diagrams such as those covered by Laporta (2000) and by Baikov and Chetyrkin (2010), and which may give rise to loop integrals with UV singularities. We explore automatic adaptive integration using multivariate techniques from the PARINT package for multivariate integration, as well as iterated integration with programs from the QUADPACK package, and a trapezoidal method based on a double exponential transformation. PARINT is layered over MPI (Message Passing Interface), and incorporates advanced parallel/distributed techniques including load balancing among processes that may be distributed over a cluster or a network/grid of nodes. Results are included for 2-loop vertex and box diagrams and for sets of 2-, 3- and 4-loop self-energy diagrams with or without UV terms. Numerical regularization of integrals with singular terms is achieved by linear and non-linear extrapolation methods.

  20. Investigation of correlation classification techniques

    NASA Technical Reports Server (NTRS)

    Haskell, R. E.

    1975-01-01

    A two-step classification algorithm for processing multispectral scanner data was developed and tested. The first step is a single pass clustering algorithm that assigns each pixel, based on its spectral signature, to a particular cluster. The output of that step is a cluster tape in which a single integer is associated with each pixel. The cluster tape is used as the input to the second step, where ground truth information is used to classify each cluster using an iterative method of potentials. Once the clusters have been assigned to classes the cluster tape is read pixel-by-pixel and an output tape is produced in which each pixel is assigned to its proper class. In addition to the digital classification programs, a method of using correlation clustering to process multispectral scanner data in real time by means of an interactive color video display is also described.
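
    The two-step pipeline in this abstract (single-pass clustering into a cluster map, then class assignment per cluster) can be illustrated with the sketch below. It is a simplified stand-in, not the report's algorithm: the distance threshold radius and all function names are hypothetical, and the second step uses a plain majority vote over ground-truth pixels in place of the iterative method of potentials, which the abstract does not specify in detail.

```python
import math
from collections import Counter


def single_pass_cluster(pixels, radius):
    """Step 1: visit each pixel once. Assign it to the nearest existing
    cluster center within `radius` (hypothetical threshold), otherwise start
    a new cluster. Returns (cluster_map, centers); cluster_map holds one
    integer per pixel, playing the role of the 'cluster tape'."""
    centers, counts, cluster_map = [], [], []
    for px in pixels:
        best, best_d = None, radius
        for idx, c in enumerate(centers):
            d = math.dist(px, c)
            if d <= best_d:
                best, best_d = idx, d
        if best is None:
            centers.append(list(px))
            counts.append(1)
            best = len(centers) - 1
        else:
            counts[best] += 1
            # Running mean keeps the center at the average of its members.
            centers[best] = [c + (x - c) / counts[best] for c, x in zip(centers[best], px)]
        cluster_map.append(best)
    return cluster_map, centers


def label_clusters(cluster_map, ground_truth):
    """Step 2 (majority-vote substitute): ground_truth maps a pixel index to
    its known class; each cluster takes the most common class among them."""
    votes = {}
    for idx, cls in ground_truth.items():
        votes.setdefault(cluster_map[idx], Counter())[cls] += 1
    return {cid: tally.most_common(1)[0][0] for cid, tally in votes.items()}


def classify(cluster_map, cluster_classes, unknown="unclassified"):
    """Read the cluster map pixel by pixel and emit the class of each pixel."""
    return [cluster_classes.get(cid, unknown) for cid in cluster_map]
```

    Clusters that contain no ground-truth pixel are reported as "unclassified" here; how the original system handled such clusters is not stated in the abstract.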
